The cost of large numbers of hypothesis tests on power, effect size and sample size.
Lazzeroni, L C; Ray, A
2012-01-01
Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands that can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level, with comparatively small increases in the effect size or sample size. For example at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
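The sample-size figures quoted in this abstract can be approximated with a short calculation. The sketch below is not the authors' Excel calculator; it assumes a two-sided z-test, a Bonferroni correction (significance level divided by the number of tests), and a required sample size proportional to the squared sum of the critical value and the power quantile. The function name `relative_sample_size` is hypothetical.

```python
from statistics import NormalDist


def relative_sample_size(m_new, m_old, alpha=0.05, power=0.80):
    """Ratio of sample sizes needed to hold `power` fixed when the number
    of tests changes from m_old to m_new.

    Assumptions (not taken from the paper): two-sided z-test, Bonferroni
    correction alpha / m, and n proportional to (z_alpha + z_power) ** 2.
    """
    std = NormalDist()
    z_power = std.inv_cdf(power)

    def scale(m):
        # Upper-tail critical value for a two-sided test at level alpha / m.
        z_alpha = -std.inv_cdf(alpha / (2 * m))
        return (z_alpha + z_power) ** 2

    return scale(m_new) / scale(m_old)


# 10 tests versus a single test, and ten million versus one million tests.
print(round(relative_sample_size(10, 1), 2))
print(round(relative_sample_size(10_000_000, 1_000_000), 2))
```

Under these assumptions the two ratios come out at roughly 1.70 and 1.13, in line with the 70% and 13% increases reported above.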
Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.
You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary
2011-02-01
The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the number of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have a significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using a t-test, use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes, and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for a trial with unequal cluster sizes to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that we have defined a relative efficiency that is greater than the relative efficiency in the literature under some conditions.
Our measure of relative efficiency might be less than the measure in the literature under some conditions, underestimating the relative efficiency. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.
NASA Technical Reports Server (NTRS)
Rao, R. G. S.; Ulaby, F. T.
1977-01-01
The paper examines optimal sampling techniques for obtaining accurate spatial averages of soil moisture, at various depths and for cell sizes in the range 2.5-40 acres, with a minimum number of samples. Both simple random sampling and stratified sampling procedures are used to reach a set of recommended sample sizes for each depth and for each cell size. Major conclusions from statistical sampling test results are that (1) the number of samples required decreases with increasing depth; (2) when the total number of samples cannot be prespecified or the moisture in only a single layer is of interest, then a simple random sample procedure should be used which is based on the observed mean and SD for data from a single field; (3) when the total number of samples can be prespecified and the objective is to measure the soil moisture profile with depth, then stratified random sampling based on optimal allocation should be used; and (4) decreasing the sensor resolution cell size leads to fairly large decreases in sample sizes with stratified sampling procedures, whereas only a moderate decrease is obtained in simple random sampling procedures.
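The abstract does not say which optimal allocation rule was used. A common textbook choice is Neyman allocation, in which each stratum receives a share of the total sample proportional to its size multiplied by its standard deviation. The sketch below illustrates that rule only as an assumption; the function name and the numbers in the usage line are hypothetical and are not taken from the report.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Split total_n samples across strata in proportion to N_h * S_h.

    This is the standard Neyman allocation, shown as an assumed stand-in
    for the "optimal allocation" mentioned in the abstract. It returns
    unrounded (float) allocations.
    """
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]


# Hypothetical strata: the smaller but more variable stratum gets an equal share.
print(neyman_allocation(40, [100, 200], [2.0, 1.0]))
```

Strata that are larger or more variable receive more samples, which is why a highly variable layer can call for more samples than a uniform one.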
Phylogenetic effective sample size.
Bartoszek, Krzysztof
2016-10-21
In this paper I address the question: how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes: the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an already present concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations. Copyright © 2016 Elsevier Ltd. All rights reserved.
Improving the accuracy of livestock distribution estimates through spatial interpolation.
Bryssinckx, Ward; Ducheyne, Els; Muhwezi, Bernard; Godfrey, Sunday; Mintiens, Koen; Leirs, Herwig; Hendrickx, Guy
2012-11-01
Animal distribution maps serve many purposes such as estimating transmission risk of zoonotic pathogens to both animals and humans. The reliability and usability of such maps is highly dependent on the quality of the input data. However, decisions on how to perform livestock surveys are often based on previous work without considering possible consequences. A better understanding of the impact of using different sample designs and processing steps on the accuracy of livestock distribution estimates was acquired through iterative experiments using detailed survey data. The importance of sample size, sample design and aggregation is demonstrated and spatial interpolation is presented as a potential way to improve cattle number estimates. As expected, results show that increasing the sample size increased the precision of cattle number estimates but these improvements were mainly seen when the initial sample size was relatively low (e.g. a median relative error decrease of 0.04% per sampled parish for sample sizes below 500 parishes). For higher sample sizes, the added value of further increasing the number of samples declined rapidly (e.g. a median relative error decrease of 0.01% per sampled parish for sample sizes above 500 parishes). When a two-stage stratified sample design was applied to yield more evenly distributed samples, accuracy levels were higher for low sample densities and stabilised at lower sample sizes compared to one-stage stratified sampling. Aggregating the resulting cattle number estimates yielded significantly more accurate results because of averaging under- and over-estimates (e.g. when aggregating cattle number estimates from subcounty to district level, P < 0.009 based on a sample of 2,077 parishes using one-stage stratified samples). During aggregation, area-weighted mean values were assigned to higher administrative unit levels.
However, when this step is preceded by a spatial interpolation to fill in missing values in non-sampled areas, accuracy is improved remarkably. This holds especially for low sample sizes and spatially evenly distributed samples (e.g. P < 0.001 for a sample of 170 parishes using one-stage stratified sampling and aggregation at district level). Whether the same observations apply on a lower spatial scale should be further investigated.
Sampling for area estimation: A comparison of full-frame sampling with the sample segment approach
NASA Technical Reports Server (NTRS)
Hixson, M.; Bauer, M. E.; Davis, B. J. (Principal Investigator)
1979-01-01
The author has identified the following significant results. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Evaluation of four sampling schemes involving different numbers of samples and different size sampling units shows that the precision of the wheat estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling unit size.
Estimating numbers of females with cubs-of-the-year in the Yellowstone grizzly bear population
Keating, K.A.; Schwartz, C.C.; Haroldson, M.A.; Moody, D.
2001-01-01
For grizzly bears (Ursus arctos horribilis) in the Greater Yellowstone Ecosystem (GYE), minimum population size and allowable numbers of human-caused mortalities have been calculated as a function of the number of unique females with cubs-of-the-year (FCUB) seen during a 3-year period. This approach underestimates the total number of FCUB, thereby biasing estimates of population size and sustainable mortality. Also, it does not permit calculation of valid confidence bounds. Many statistical methods can resolve or mitigate these problems, but there is no universal best method. Instead, relative performances of different methods can vary with population size, sample size, and degree of heterogeneity among sighting probabilities for individual animals. We compared 7 nonparametric estimators, using Monte Carlo techniques to assess performances over the range of sampling conditions deemed plausible for the Yellowstone population. Our goal was to estimate the number of FCUB present in the population each year. Our evaluation differed from previous comparisons of such estimators by including sample coverage methods and by treating individual sightings, rather than sample periods, as the sample unit. Consequently, our conclusions also differ from earlier studies. Recommendations regarding estimators and necessary sample sizes are presented, together with estimates of annual numbers of FCUB in the Yellowstone population with bootstrap confidence bounds.
Generating Random Samples of a Given Size Using Social Security Numbers.
ERIC Educational Resources Information Center
Erickson, Richard C.; Brauchle, Paul E.
1984-01-01
The purposes of this article are (1) to present a method by which social security numbers may be used to draw cluster samples of a predetermined size and (2) to describe procedures used to validate this method of drawing random samples. (JOW)
Shen, You-xin; Liu, Wei-li; Li, Yu-hui; Guan, Hui-lin
2014-01-01
A large number of small-sized samples invariably shows that woody species are absent from forest soil seed banks, leading to a large discrepancy with the seedling bank on the forest floor. We ask: 1) Does this conventional sampling strategy limit the detection of seeds of woody species? 2) Are large sample areas and sample sizes needed for higher recovery of seeds of woody species? We collected 100 samples that were 10 cm (length) × 10 cm (width) × 10 cm (depth), referred to as a large number of small-sized samples (LNSS), in a 1 ha forest plot and placed them to germinate in a greenhouse. We also collected 30 samples that were 1 m × 1 m × 10 cm, referred to as a small number of large-sized samples (SNLS), and placed them (10 each) in a nearby secondary forest, shrub land and grassland. Only 15.7% of woody plant species of the forest stand were detected by the 100 LNSS, contrasting with 22.9%, 37.3% and 20.5% of woody plant species being detected by SNLS in the secondary forest, shrub land and grassland, respectively. The increased number of species vs. sampled areas confirmed power-law relationships for the forest stand, the LNSS and SNLS at all three recipient sites. Our results, although based on one forest, indicate that conventional LNSS did not yield a high percentage of detection for woody species, but the SNLS strategy yielded a higher percentage of detection for woody species in the seed bank if samples were exposed to a better field germination environment. A 4 m² minimum sample area derived from power equations is larger than the sampled area in most studies in the literature. Increased sample size also is needed to obtain an increased sample area if the number of samples is to remain relatively low.
Optimal number of features as a function of sample size for various classification rules.
Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R
2005-04-15
Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as a resource for those working with small-sample classification.
For the companion website, please visit http://public.tgen.org/tamu/ofs/. Contact: e-dougherty@ee.tamu.edu.
Winston Paul Smith; Daniel J. Twedt; David A. Wiedenfeld; Paul B. Hamel; Robert P. Ford; Robert J. Cooper
1993-01-01
To compare efficacy of point count sampling in bottomland hardwood forests, duration of point count, number of point counts, number of visits to each point during a breeding season, and minimum sample size are examined.
NASA Technical Reports Server (NTRS)
Hixson, M. M.; Bauer, M. E.; Davis, B. J.
1979-01-01
The effect of sampling on the accuracy (precision and bias) of crop area estimates made from classifications of LANDSAT MSS data was investigated. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Four sampling schemes involving different numbers of samples and different size sampling units were evaluated. The precision of the wheat area estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling unit size.
Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R
2017-09-14
While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. We aimed to guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys so as to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. ©Elizabeth Fearon, Sungai T Chabata, Jennifer A Thompson, Frances M Cowan, James R Hargreaves. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.09.2017.
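The estimator described in this abstract, M divided by P, and its sensitivity to a small P can be illustrated with a few lines of code. The sketch below is a simplification rather than the authors' method: it treats M as known without error, inflates the binomial variance of P by an assumed design effect, and uses a first-order (delta method) approximation for the standard error. The function name, parameters and input values are hypothetical.

```python
from math import sqrt


def multiplier_estimate(M, P, n, design_effect=1.0, z=1.96):
    """Population size estimate N = M / P with an approximate interval.

    Assumptions (not from the paper): M is treated as fixed, Var(P) is the
    binomial variance times design_effect, and the standard error of N
    comes from the delta method: SE(N) ~= M * SE(P) / P**2.
    """
    if not 0 < P <= 1:
        raise ValueError("P must be in (0, 1]")
    N = M / P
    se_P = sqrt(design_effect * P * (1 - P) / n)
    se_N = M * se_P / P ** 2
    return N, se_N, (N - z * se_N, N + z * se_N)


# Hypothetical inputs: 1000 objects distributed, 25% of 400 respondents report receipt.
N, se, interval = multiplier_estimate(M=1000, P=0.25, n=400, design_effect=2.0)
print(N, round(se, 1), interval)
```

Because the standard error scales with 1 / P², halving P roughly quadruples the uncertainty for the same survey size, which matches the abstract's advice to favour designs that yield a higher P.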
Kanık, Emine Arzu; Temel, Gülhan Orekici; Erdoğan, Semra; Kaya, İrem Ersöz
2013-01-01
Objective: The aim of this study is to introduce the Soft Independent Modeling of Class Analogy (SIMCA) method and to assess whether it is affected by the number of independent variables, the relationship between variables, and sample size. Study Design: Simulation study. Material and Methods: The SIMCA model is fitted in two stages. Simulations were carried out to determine whether the method is influenced by the number of independent variables, the relationship between variables, and sample size. The conditions considered had equal sample sizes in both groups, with 30, 100 and 1000 samples; 2, 3, 5, 10, 50 and 100 variables; and a high, medium or low degree of relationship between variables. Results: The average classification accuracy from simulations repeated 1000 times for each condition of the trial plan is given in tables. Conclusion: Diagnostic accuracy increases as the number of independent variables increases. SIMCA is a method that can be used when the relationship between variables is quite high, the number of independent variables is large, and the data contain outliers. PMID:25207065
Kanık, Emine Arzu; Temel, Gülhan Orekici; Erdoğan, Semra; Kaya, Irem Ersöz
2013-03-01
The aim of this study is to introduce the Soft Independent Modeling of Class Analogy (SIMCA) method and to assess whether it is affected by the number of independent variables, the relationship between variables, and sample size. Simulation study. The SIMCA model is fitted in two stages. Simulations were carried out to determine whether the method is influenced by the number of independent variables, the relationship between variables, and sample size. The conditions considered had equal sample sizes in both groups, with 30, 100 and 1000 samples; 2, 3, 5, 10, 50 and 100 variables; and a high, medium or low degree of relationship between variables. The average classification accuracy from simulations repeated 1000 times for each condition of the trial plan is given in tables. Diagnostic accuracy increases as the number of independent variables increases. SIMCA is a method that can be used when the relationship between variables is quite high, the number of independent variables is large, and the data contain outliers.
Hammerstrom, Kamille K; Ranasinghe, J Ananda; Weisberg, Stephen B; Oliver, John S; Fairey, W Russell; Slattery, Peter N; Oakden, James M
2012-10-01
Benthic macrofauna are used extensively for environmental assessment, but the area sampled and sieve sizes used to capture animals often differ among studies. Here, we sampled 80 sites using 3 different sized sampling areas (0.1, 0.05, 0.0071 m²) and sieved those sediments through each of 2 screen sizes (0.5, 1 mm) to evaluate their effect on number of individuals, number of species, dominance, nonmetric multidimensional scaling (MDS) ordination, and benthic community condition indices that are used to assess sediment quality in California. Sample area had little effect on abundance but substantially affected numbers of species, which are not easily scaled to a standard area. Sieve size had a substantial effect on both measures, with the 1-mm screen capturing only 74% of the species and 68% of the individuals collected in the 0.5-mm screen. These differences, though, had little effect on the ability to differentiate samples along gradients in ordination space. Benthic indices generally ranked sample condition in the same order regardless of gear, although the absolute scoring of condition was affected by gear type. The largest differences in condition assessment were observed for the 0.0071-m² gear. Benthic indices based on numbers of species were more affected than those based on relative abundance, primarily because we were unable to scale species number to a common area as we did for abundance. Copyright © 2010 SETAC.
HYPERSAMP - HYPERGEOMETRIC ATTRIBUTE SAMPLING SYSTEM BASED ON RISK AND FRACTION DEFECTIVE
NASA Technical Reports Server (NTRS)
De, Salvo L. J.
1994-01-01
HYPERSAMP is a demonstration of an attribute sampling system developed to determine the minimum sample size required for any preselected value for consumer's risk and fraction nonconforming. This statistical method can be used in place of MIL-STD-105E sampling plans when a minimum sample size is desirable, such as when tests are destructive or expensive. HYPERSAMP utilizes the Hypergeometric Distribution and can be used for any fraction nonconforming. The program employs an iterative technique that circumvents the obstacle presented by the factorial of a non-whole number. HYPERSAMP provides the required Hypergeometric sample size for any equivalent real number of nonconformances in the lot or batch under evaluation. Many currently used sampling systems, such as the MIL-STD-105E, utilize the Binomial or the Poisson equations as an estimate of the Hypergeometric when performing inspection by attributes. This is primarily because of the difficulty in calculation of the factorials required by the Hypergeometric. Sampling plans based on the Binomial or Poisson equations will result in the maximum sample size possible with the Hypergeometric. The difference in the sample sizes between the Poisson or Binomial and the Hypergeometric can be significant. For example, a lot size of 400 devices with an error rate of 1.0% and a confidence of 99% would require a sample size of 400 (all units would need to be inspected) for the Binomial sampling plan and only 273 for a Hypergeometric sampling plan. The Hypergeometric results in a savings of 127 units, a significant reduction in the required sample size. HYPERSAMP is a demonstration program and is limited to sampling plans with zero defectives in the sample (acceptance number of zero). Since it is only a demonstration program, the sample size determination is limited to sample sizes of 1500 or less.
The Hypergeometric Attribute Sampling System demonstration code is a spreadsheet program written for IBM PC compatible computers running DOS and Lotus 1-2-3 or Quattro Pro. This program is distributed on a 5.25 inch 360K MS-DOS format diskette, and the program price includes documentation. This statistical method was developed in 1992.
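The worked example in this abstract (lot of 400, 1.0% nonconforming, 99% confidence) can be reproduced with exact integer arithmetic. The sketch below is not the HYPERSAMP spreadsheet and does not use its iterative technique for non-whole numbers of nonconformances; it assumes a whole number of nonconforming units and an acceptance number of zero. The function names are hypothetical.

```python
from math import ceil, comb, log


def hypergeometric_min_n(lot_size, defectives, consumer_risk):
    """Smallest sample size n for which the chance of drawing zero
    nonconforming units (and so accepting the lot) is <= consumer_risk.

    Assumes sampling without replacement and an integer number of
    nonconforming units in the lot.
    """
    for n in range(1, lot_size + 1):
        p_accept = comb(lot_size - defectives, n) / comb(lot_size, n)
        if p_accept <= consumer_risk:
            return n
    return lot_size


def binomial_min_n(fraction_nonconforming, consumer_risk):
    """Zero-acceptance sample size under the Binomial approximation:
    smallest n with (1 - p) ** n <= consumer_risk."""
    return ceil(log(consumer_risk) / log(1 - fraction_nonconforming))


lot, risk = 400, 0.01                         # 99% confidence
print(hypergeometric_min_n(lot, 4, risk))     # 4 units = 1.0% of 400
print(min(binomial_min_n(0.01, risk), lot))   # capped at the lot size
```

Under these assumptions the Hypergeometric plan needs 273 units, while the Binomial plan asks for 459, which exceeds the lot and therefore means inspecting all 400, matching the figures quoted above.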
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2012 CFR
2012-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2011 CFR
2011-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2013 CFR
2013-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2010 CFR
2010-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2014 CFR
2014-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
Forcino, Frank L; Leighton, Lindsey R; Twerdy, Pamela; Cahill, James F
2015-01-01
Community ecologists commonly perform multivariate techniques (e.g., ordination, cluster analysis) to assess patterns and gradients of taxonomic variation. A critical requirement for a meaningful statistical analysis is accurate information on the taxa found within an ecological sample. However, oversampling (too many individuals counted per sample) also comes at a cost, particularly for ecological systems in which identification and quantification is substantially more resource consuming than the field expedition itself. In such systems, an increasingly larger sample size will eventually result in diminishing returns in improving any pattern or gradient revealed by the data, but will also lead to continually increasing costs. Here, we examine 396 datasets: 44 previously published and 352 created datasets. Using meta-analytic and simulation-based approaches, the research within the present paper seeks (1) to determine minimal sample sizes required to produce robust multivariate statistical results when conducting abundance-based, community ecology research. Furthermore, we seek (2) to determine the dataset parameters (i.e., evenness, number of taxa, number of samples) that require larger sample sizes, regardless of resource availability. We found that in the 44 previously published and the 220 created datasets with randomly chosen abundances, a conservative estimate of a sample size of 58 produced the same multivariate results as all larger sample sizes. However, this minimal number varies as a function of evenness, where increased evenness resulted in increased minimal sample sizes. Sample sizes as small as 58 individuals are sufficient for a broad range of multivariate abundance-based research. In cases when resource availability is the limiting factor for conducting a project (e.g., small university, time to conduct the research project), statistically viable results can still be obtained with less of an investment.
Mudalige, Thilak K; Qu, Haiou; Linder, Sean W
2015-11-13
Engineered nanoparticles are available in large numbers of commercial products claiming various health benefits. Nanoparticle absorption, distribution, metabolism, excretion, and toxicity in a biological system are dependent on particle size, thus the determination of size and size distribution is essential for full characterization. Number based average size and size distribution is a major parameter for full characterization of the nanoparticle. In the case of polydispersed samples, large numbers of particles are needed to obtain accurate size distribution data. Herein, we report a rapid methodology, demonstrating improved nanoparticle recovery and excellent size resolution, for the characterization of gold nanoparticles in dietary supplements using asymmetric flow field flow fractionation coupled with visible absorption spectrometry and inductively coupled plasma mass spectrometry. A linear relationship between gold nanoparticle size and retention times was observed, and used for characterization of unknown samples. The particle size results from unknown samples were compared to results from traditional size analysis by transmission electron microscopy, and found to have less than a 5% deviation in size for unknown product over the size range from 7 to 30 nm. Published by Elsevier B.V.
Sample sizes to control error estimates in determining soil bulk density in California forest soils
Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber
2016-01-01
Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observations (n), for predicting the soil bulk density with a...
Wareham, K J; Hyde, R M; Grindlay, D; Brennan, M L; Dean, R S
2017-10-04
Randomised controlled trials (RCTs) are a key component of the veterinary evidence base. Sample sizes and defined outcome measures are crucial components of RCTs. This study aimed to describe the sample size and number of outcome measures of veterinary RCTs published in 2011, comparing those funded by the pharmaceutical industry with those that were not. A structured search of PubMed identified RCTs examining the efficacy of pharmaceutical interventions. Number of outcome measures, number of animals enrolled per trial, whether a primary outcome was identified, and the presence of a sample size calculation were extracted from the RCTs. The source of funding was identified for each trial and groups compared on the above parameters. Literature searches returned 972 papers; 86 papers comprising 126 individual trials were analysed. The median number of outcomes per trial was 5.0; there were no significant differences across funding groups (p = 0.133). The median number of animals enrolled per trial was 30.0; this was similar across funding groups (p = 0.302). A primary outcome was identified in 40.5% of trials and was significantly more likely to be stated in trials funded by a pharmaceutical company. A very low percentage of trials reported a sample size calculation (14.3%). Failure to report primary outcomes, failure to justify sample sizes, and the reporting of multiple outcome measures were common features of the clinical trials examined in this study. It is possible some of these factors may be affected by the source of funding of the studies, but the influence of funding needs to be explored with a larger number of trials. Some veterinary RCTs provide a weak evidence base and targeted strategies are required to improve the quality of veterinary RCTs to ensure there is reliable evidence on which to base clinical decisions.
Reporting of sample size calculations in analgesic clinical trials: ACTTION systematic review.
McKeown, Andrew; Gewandter, Jennifer S; McDermott, Michael P; Pawlowski, Joseph R; Poli, Joseph J; Rothstein, Daniel; Farrar, John T; Gilron, Ian; Katz, Nathaniel P; Lin, Allison H; Rappaport, Bob A; Rowbotham, Michael C; Turk, Dennis C; Dworkin, Robert H; Smith, Shannon M
2015-03-01
Sample size calculations determine the number of participants required to have sufficiently high power to detect a given treatment effect. In this review, we examined the reporting quality of sample size calculations in 172 publications of double-blind randomized controlled trials of noninvasive pharmacologic or interventional (ie, invasive) pain treatments published in European Journal of Pain, Journal of Pain, and Pain from January 2006 through June 2013. Sixty-five percent of publications reported a sample size calculation but only 38% provided all elements required to replicate the calculated sample size. In publications reporting at least 1 element, 54% provided a justification for the treatment effect used to calculate sample size, and 24% of studies with continuous outcome variables justified the variability estimate. Publications of clinical pain condition trials reported a sample size calculation more frequently than experimental pain model trials (77% vs 33%, P < .001) but did not differ in the frequency of reporting all required elements. No significant differences in reporting of any or all elements were detected between publications of trials with industry and nonindustry sponsorship. Twenty-eight percent included a discrepancy between the reported number of planned and randomized participants. This study suggests that sample size calculation reporting in analgesic trial publications is usually incomplete. Investigators should provide detailed accounts of sample size calculations in publications of clinical trials of pain treatments, which is necessary for reporting transparency and communication of pre-trial design decisions. In this systematic review of analgesic clinical trials, sample size calculations and the required elements (eg, treatment effect to be detected; power level) were incompletely reported. A lack of transparency regarding sample size calculations may raise questions about the appropriateness of the calculated sample size. 
Copyright © 2015 American Pain Society. All rights reserved.
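The elements the review looks for (treatment effect, variability estimate, significance level, power) are the inputs of a standard sample size formula. As a minimal sketch, assuming a two-arm parallel trial with a continuous outcome, equal allocation, and a two-sided z-test approximation (none of which is specified by the review itself), the per-group size is roughly n = 2(z_{1-α/2} + z_{power})² σ² / δ². The function name and the example values are illustrative only.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-sided, two-sample z-test.

    delta : treatment effect to be detected (difference in means)
    sd    : assumed common standard deviation of the outcome
    alpha : two-sided significance level
    power : desired probability of detecting delta
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile matching the power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical inputs: detect a 1-point difference when the SD is 2 points.
print(n_per_group(delta=1.0, sd=2.0))
```

Leaving out any one of the four inputs makes the calculation impossible to replicate, which is why the review counts them individually. A t-test based calculation would give slightly larger numbers than this normal approximation.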
Sequential sampling: a novel method in farm animal welfare assessment.
Heath, C A E; Main, D C J; Mullan, S; Haskell, M J; Browne, W J
2016-02-01
Lameness in dairy cows is an important welfare issue. As part of a welfare assessment, herd level lameness prevalence can be estimated from scoring a sample of animals, where higher levels of accuracy are associated with larger sample sizes. As the financial cost is related to the number of cows sampled, smaller samples are preferred. Sequential sampling schemes have been used for informing decision making in clinical trials. Sequential sampling involves taking samples in stages, where sampling can stop early depending on the estimated lameness prevalence. When welfare assessment is used for a pass/fail decision, a similar approach could be applied to reduce the overall sample size. The sampling schemes proposed here apply the principles of sequential sampling within a diagnostic testing framework. This study develops three sequential sampling schemes of increasing complexity to classify 80 fully assessed UK dairy farms, each with known lameness prevalence. Using the Welfare Quality herd-size-based sampling scheme, the first 'basic' scheme involves two sampling events. At the first sampling event half the Welfare Quality sample size is drawn, and then depending on the outcome, sampling either stops or is continued and the same number of animals is sampled again. In the second 'cautious' scheme, an adaptation is made to ensure that correctly classifying a farm as 'bad' is done with greater certainty. The third scheme is the only scheme to go beyond lameness as a binary measure and investigates the potential for increasing accuracy by incorporating the number of severely lame cows into the decision. The three schemes are evaluated with respect to accuracy and average sample size by running 100 000 simulations for each scheme, and a comparison is made with the fixed size Welfare Quality herd-size-based sampling scheme. All three schemes performed almost as well as the fixed size scheme but with much smaller average sample sizes. 
For the third scheme, an overall association between lameness prevalence and the proportion of lame cows that were severely lame on a farm was found. However, as this association was found to not be consistent across all farms, the sampling scheme did not prove to be as useful as expected. The preferred scheme was therefore the 'cautious' scheme for which a sampling protocol has also been developed.
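The 'basic' two-stage scheme described above can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' implementation: the pass/fail threshold, the early-stopping margin, the full sample size and the function name are all hypothetical, and cows are drawn as independent Bernoulli trials rather than without replacement from a finite herd.

```python
import random


def two_stage_classify(prevalence, full_n, threshold, margin, rng):
    """Classify a farm as failing (True) or passing (False) in up to two stages.

    Stage 1 scores half of the fixed-scheme sample. If the estimated
    prevalence is further than `margin` from `threshold`, sampling stops.
    Otherwise the same number of cows is scored again and the pooled
    estimate decides. Returns (fails, cows_scored).
    """
    half = full_n // 2
    lame = sum(rng.random() < prevalence for _ in range(half))
    estimate = lame / half
    if abs(estimate - threshold) > margin:
        return estimate > threshold, half        # clear result: stop early
    lame += sum(rng.random() < prevalence for _ in range(half))
    return lame / (2 * half) > threshold, 2 * half


# Hypothetical farm: 25% true prevalence, fixed-scheme sample of 60 cows.
rng = random.Random(42)
runs = [two_stage_classify(0.25, 60, threshold=0.20, margin=0.10, rng=rng)
        for _ in range(10_000)]
average_n = sum(n for _, n in runs) / len(runs)
print(f"average cows scored: {average_n:.1f} (fixed scheme: 60)")
```

The average sample size falls below the fixed-scheme size whenever some runs stop after the first stage; how much accuracy is given up depends on the chosen margin.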
Monitoring Species of Concern Using Noninvasive Genetic Sampling and Capture-Recapture Methods
2016-11-01
ABBREVIATIONS AICc Akaike’s Information Criterion with small sample size correction AZGFD Arizona Game and Fish Department BMGR Barry M. Goldwater...MNKA Minimum Number Known Alive N Abundance Ne Effective Population Size NGS Noninvasive Genetic Sampling NGS-CR Noninvasive Genetic...parameter estimates from capture-recapture models require sufficient sample sizes, capture probabilities and low capture biases. For NGS-CR, sample
The endothelial sample size analysis in corneal specular microscopy clinical examinations.
Abib, Fernando C; Holzchuh, Ricardo; Schaefer, Artur; Schaefer, Tania; Godois, Ronialci
2012-05-01
To evaluate endothelial cell sample size and statistical error in corneal specular microscopy (CSM) examinations. One hundred twenty examinations were conducted with 4 types of corneal specular microscopes: 30 with each of the Bio-Optics, CSO, Konan, and Topcon corneal specular microscopes. All endothelial image data were analyzed by respective instrument software and also by the Cells Analyzer software with a method developed in our lab. A reliability degree (RD) of 95% and a relative error (RE) of 0.05 were used as cut-off values to analyze images of the counted endothelial cells (the samples). The sample size mean was the number of cells evaluated on the images obtained with each device. Only examinations with RE < 0.05 were considered statistically correct and suitable for comparisons with future examinations. The Cells Analyzer software was used to calculate the RE and customized sample size for all examinations. Bio-Optics: sample size, 97 ± 22 cells; RE, 6.52 ± 0.86; only 10% of the examinations had sufficient endothelial cell quantity (RE < 0.05); customized sample size, 162 ± 34 cells. CSO: sample size, 110 ± 20 cells; RE, 5.98 ± 0.98; only 16.6% of the examinations had sufficient endothelial cell quantity (RE < 0.05); customized sample size, 157 ± 45 cells. Konan: sample size, 80 ± 27 cells; RE, 10.6 ± 3.67; none of the examinations had sufficient endothelial cell quantity (RE > 0.05); customized sample size, 336 ± 131 cells. Topcon: sample size, 87 ± 17 cells; RE, 10.1 ± 2.52; none of the examinations had sufficient endothelial cell quantity (RE > 0.05); customized sample size, 382 ± 159 cells. A very high number of CSM examinations had sample errors based on Cells Analyzer software. The endothelial sample size (examinations) needs to include more cells to be reliable and reproducible. The Cells Analyzer tutorial routine will be useful for CSM examination reliability and reproducibility.
What is the optimum sample size for the study of peatland testate amoeba assemblages?
Mazei, Yuri A; Tsyganov, Andrey N; Esaulov, Anton S; Tychkov, Alexander Yu; Payne, Richard J
2017-10-01
Testate amoebae are widely used in ecological and palaeoecological studies of peatlands, particularly as indicators of surface wetness. To ensure data are robust and comparable it is important to consider methodological factors which may affect results. One significant question which has not been directly addressed in previous studies is how sample size (expressed here as number of Sphagnum stems) affects data quality. In three contrasting locations in a Russian peatland we extracted samples of differing size, analysed testate amoebae and calculated a number of widely-used indices: species richness, Simpson diversity, compositional dissimilarity from the largest sample and transfer function predictions of water table depth. We found that there was a trend for larger samples to contain more species across the range of commonly-used sample sizes in ecological studies. Smaller samples sometimes failed to produce counts of testate amoebae often considered minimally adequate. It seems likely that analyses based on samples of different sizes may not produce consistent data. Decisions about sample size need to reflect trade-offs between logistics, data quality, spatial resolution and the disturbance involved in sample extraction. For most common ecological applications we suggest that samples of more than eight Sphagnum stems are likely to be desirable. Copyright © 2017 Elsevier GmbH. All rights reserved.
Simulation analyses of space use: Home range estimates, variability, and sample size
Bekoff, Marc; Mech, L. David
1984-01-01
Simulations of space use by animals were run to determine the relationship among home range area estimates, variability, and sample size (number of locations). As sample size increased, home range size increased asymptotically, whereas variability decreased among mean home range area estimates generated by multiple simulations for the same sample size. Our results suggest that field workers should ascertain between 100 and 200 locations in order to estimate reliably home range area. In some cases, this suggested guideline is higher than values found in the few published studies in which the relationship between home range area and number of locations is addressed. Sampling differences for small species occupying relatively small home ranges indicate that fewer locations may be sufficient to allow for a reliable estimate of home range. Intraspecific variability in social status (group member, loner, resident, transient), age, sex, reproductive condition, and food resources also have to be considered, as do season, habitat, and differences in sampling and analytical methods. Comparative data still are needed.
(I Can't Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research.
van Rijnsoever, Frank J
2017-01-01
I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: "random chance," which is based on probability sampling, "minimal information," which yields at least one new code per sampling step, and "maximum information," which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario.
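The "random chance" scenario lends itself to a short simulation. The sketch below is a simplified stand-in for the author's simulations, not a reproduction of them: it assumes a single population (no sub-populations), that each information source holds each code independently with a fixed probability, and that saturation means every code has been seen at least once. The function name and all probabilities are hypothetical.

```python
import random


def steps_to_saturation(code_probs, rng, max_steps=100_000):
    """Count randomly sampled sources needed until every code is observed.

    code_probs[k] is the assumed probability that any one source holds code k.
    Returns None if saturation is not reached within max_steps.
    """
    unseen = set(range(len(code_probs)))
    for step in range(1, max_steps + 1):
        # A code stays unseen only if this source does not hold it.
        unseen = {k for k in unseen if rng.random() >= code_probs[k]}
        if not unseen:
            return step
    return None


def mean_steps(code_probs, runs=200, seed=0):
    """Average the saturation point over repeated simulated studies."""
    rng = random.Random(seed)
    return sum(steps_to_saturation(code_probs, rng) for _ in range(runs)) / runs


# Same number of codes, different mean probability of observing a code.
print(mean_steps([0.5] * 10), mean_steps([0.1] * 10))
```

Under these assumptions, lowering the probability of observing a code raises the required sample size sharply, which is consistent with the abstract's point that the mean probability matters more than the number of codes.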
Sample Size and Item Parameter Estimation Precision When Utilizing the One-Parameter "Rasch" Model
ERIC Educational Resources Information Center
Custer, Michael
2015-01-01
This study examines the relationship between sample size and item parameter estimation precision when utilizing the one-parameter model. Item parameter estimates are examined relative to "true" values by evaluating the decline in root mean squared deviation (RMSD) and the number of outliers as sample size increases. This occurs across…
Qualitative Meta-Analysis on the Hospital Task: Implications for Research
ERIC Educational Resources Information Center
Noll, Jennifer; Sharma, Sashi
2014-01-01
The "law of large numbers" indicates that as sample size increases, sample statistics become less variable and more closely estimate their corresponding population parameters. Different research studies investigating how people consider sample size when evaluating the reliability of a sample statistic have found a wide range of…
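The effect of sample size on the variability of a sample statistic can be made concrete with an exact binomial calculation. The sketch assumes the commonly quoted version of the hospital task (a small hospital with 15 births a day, a large one with 45, and days on which more than 60% of babies are boys); those numbers are an assumption here, since the abstract does not state them, and the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def prob_share_above(n, share=Fraction(3, 5), p=0.5):
    """Exact probability that more than `share` of n independent births are boys.

    `share` is a Fraction so the cut-off comparison is exact; p is the
    assumed probability that any single birth is a boy.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if k > share * n
    )


small, large = prob_share_above(15), prob_share_above(45)
print(f"15 births/day: {small:.3f}   45 births/day: {large:.3f}")
```

The smaller hospital exceeds the 60% mark on a larger share of days because its daily proportion is more variable, which is the reasoning the law of large numbers supports.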
Sampling strategies for estimating brook trout effective population size
Andrew R. Whiteley; Jason A. Coombs; Mark Hudy; Zachary Robinson; Keith H. Nislow; Benjamin H. Letcher
2012-01-01
The influence of sampling strategy on estimates of effective population size (Ne) from single-sample genetic methods has not been rigorously examined, though these methods are increasingly used. For headwater salmonids, spatially close kin association among age-0 individuals suggests that sampling strategy (number of individuals and location from...
Sample size and allocation of effort in point count sampling of birds in bottomland hardwood forests
Smith, W.P.; Twedt, D.J.; Cooper, R.J.; Wiedenfeld, D.A.; Hamel, P.B.; Ford, R.P.; Ralph, C. John; Sauer, John R.; Droege, Sam
1995-01-01
To examine sample size requirements and optimum allocation of effort in point count sampling of bottomland hardwood forests, we computed minimum sample sizes from variation recorded during 82 point counts (May 7-May 16, 1992) from three localities containing three habitat types across three regions of the Mississippi Alluvial Valley (MAV). Also, we estimated the effect of increasing the number of points or visits by comparing results of 150 four-minute point counts obtained from each of four stands on Delta Experimental Forest (DEF) during May 8-May 21, 1991 and May 30-June 12, 1992. For each stand, we obtained bootstrap estimates of mean cumulative number of species each year from all possible combinations of six points and six visits. ANOVA was used to model cumulative species as a function of number of points visited, number of visits to each point, and interaction of points and visits. There was significant variation in numbers of birds and species between regions and localities (nested within region); neither habitat, nor the interaction between region and habitat, was significant. For α = 0.05 and α = 0.10, minimum sample size estimates (per factor level) varied by orders of magnitude depending upon the observed or specified range of desired detectable difference. For observed regional variation, 20 and 40 point counts were required to accommodate variability in total individuals (MSE = 9.28) and species (MSE = 3.79), respectively, whereas ± 25 percent of the mean could be achieved with five counts per factor level. Sample size sufficient to detect actual differences of Wood Thrush (Hylocichla mustelina) was >200, whereas the Prothonotary Warbler (Protonotaria citrea) required <10 counts. Differences in mean cumulative species were detected among number of points visited and among number of visits to a point. In the lower MAV, mean cumulative species increased with each added point through five points and with each additional visit through four visits.
Although no interaction was detected between number of points and number of visits, when paired reciprocals were compared, more points invariably yielded a significantly greater cumulative number of species than more visits to a point. Still, 36 point counts per stand during each of two breeding seasons detected only 52 percent of the known available species pool in DEF.
Performance of a Line Loss Correction Method for Gas Turbine Emission Measurements
NASA Astrophysics Data System (ADS)
Hagen, D. E.; Whitefield, P. D.; Lobo, P.
2015-12-01
International concern for the environmental impact of jet engine exhaust emissions in the atmosphere has led to increased attention on gas turbine engine emission testing. The Society of Automotive Engineers Aircraft Exhaust Emissions Measurement Committee (E-31) has published an Aerospace Information Report (AIR) 6241 detailing the sampling system for the measurement of non-volatile particulate matter from aircraft engines, and is developing an Aerospace Recommended Practice (ARP) for methodology and system specification. The Missouri University of Science and Technology (MST) Center for Excellence for Aerospace Particulate Emissions Reduction Research has led numerous jet engine exhaust sampling campaigns to characterize emissions at different locations in the expanding exhaust plume. Particle loss, due to various mechanisms, occurs in the sampling train that transports the exhaust sample from the engine exit plane to the measurement instruments. To account for the losses, both the size dependent penetration functions and the size distribution of the emitted particles need to be known. However in the proposed ARP, particle number and mass are measured, but size is not. Here we present a methodology to generate number and mass correction factors for line loss, without using direct size measurement. A lognormal size distribution is used to represent the exhaust aerosol at the engine exit plane and is defined by the measured number and mass at the downstream end of the sample train. The performance of this line loss correction is compared to corrections based on direct size measurements using data taken by MST during numerous engine test campaigns. The experimental uncertainty in these correction factors is estimated. Average differences between the line loss correction method and size based corrections are found to be on the order of 10% for number and 2.5% for mass.
Hierarchical modeling of cluster size in wildlife surveys
Royle, J. Andrew
2008-01-01
Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
Technical note: Alternatives to reduce adipose tissue sampling bias.
Cruz, G D; Wang, Y; Fadel, J G
2014-10-01
Understanding the mechanisms by which nutritional and pharmaceutical factors can manipulate adipose tissue growth and development in production animals has direct and indirect effects in the profitability of an enterprise. Adipocyte cellularity (number and size) is a key biological response that is commonly measured in animal science research. The variability and sampling of adipocyte cellularity within a muscle has been addressed in previous studies, but no attempt to critically investigate these issues has been proposed in the literature. The present study evaluated 2 sampling techniques (random and systematic) in an attempt to minimize sampling bias and to determine the minimum number of samples from 1 to 15 needed to represent the overall adipose tissue in the muscle. Both sampling procedures were applied on adipose tissue samples dissected from 30 longissimus muscles from cattle finished either on grass or grain. Briefly, adipose tissue samples were fixed with osmium tetroxide, and size and number of adipocytes were determined by a Coulter Counter. These results were then fit in a finite mixture model to obtain distribution parameters of each sample. To evaluate the benefits of increasing number of samples and the advantage of the new sampling technique, the concept of acceptance ratio was used; simply stated, the higher the acceptance ratio, the better the representation of the overall population. As expected, a great improvement on the estimation of the overall adipocyte cellularity parameters was observed using both sampling techniques when sample size number increased from 1 to 15 samples, considering both techniques' acceptance ratio increased from approximately 3 to 25%. When comparing sampling techniques, the systematic procedure slightly improved parameters estimation. The results suggest that more detailed research using other sampling techniques may provide better estimates for minimum sampling.
Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu
2013-01-01
The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920
Albasan, Hasan; Lulich, Jody P; Osborne, Carl A; Lekcharoensuk, Chalermpol; Ulrich, Lisa K; Carpenter, Kathleen A
2003-01-15
To determine effects of storage temperature and time on pH and specific gravity of and number and size of crystals in urine samples from dogs and cats. Randomized complete block design. 31 dogs and 8 cats. Aliquots of each urine sample were analyzed within 60 minutes of collection or after storage at room or refrigeration temperatures (20 vs 6 degrees C [68 vs 43 degrees F]) for 6 or 24 hours. Crystals formed in samples from 11 of 39 (28%) animals. Calcium oxalate (CaOx) crystals formed in vitro in samples from 1 cat and 8 dogs. Magnesium ammonium phosphate (MAP) crystals formed in vitro in samples from 2 dogs. Compared with aliquots stored at room temperature, refrigeration increased the number and size of crystals that formed in vitro; however, the increase in number and size of MAP crystals in stored urine samples was not significant. Increased storage time and decreased storage temperature were associated with a significant increase in number of CaOx crystals formed. Greater numbers of crystals formed in urine aliquots stored for 24 hours than in aliquots stored for 6 hours. Storage time and temperature did not have a significant effect on pH or specific gravity. Urine samples should be analyzed within 60 minutes of collection to minimize temperature- and time-dependent effects on in vitro crystal formation. Presence of crystals observed in stored samples should be validated by reevaluation of fresh urine.
Keiter, David A.; Cunningham, Fred L.; Rhodes, Olin E.; Irwin, Brian J.; Beasley, James
2016-01-01
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. 
Knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.; ...
2016-05-25
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates.
In conclusion, knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS
Huang, Jian; Horowitz, Joel L.; Wei, Fengrong
2010-01-01
We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method. PMID:21127739
Considerations for throughfall chemistry sample-size determination
Pamela J. Edwards; Paul Mohai; Howard G. Halverson; David R. DeWalle
1989-01-01
Both the number of trees sampled per species and the number of sampling points under each tree are important throughfall sampling considerations. Chemical loadings obtained from an urban throughfall study were used to evaluate the relative importance of both of these sampling factors in tests for determining species' differences. Power curves for detecting...
Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas
2014-01-01
Background: The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods: We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results: We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion: The negative correlation between effect size and sample size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology. PMID:25192357
Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas
2014-01-01
The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. We found a negative correlation of r = -.45 [95% CI: -.53; -.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. The negative correlation between effect size and sample size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology.
Hydroxyapatite coatings containing Zn and Si on Ti-6Al-4V alloy by plasma electrolytic oxidation
NASA Astrophysics Data System (ADS)
Hwang, In-Jo; Choe, Han-Cheol
2018-02-01
In this study, hydroxyapatite coatings containing Zn and Si, formed on Ti-6Al-4V alloy by plasma electrolytic oxidation, were examined using various experimental instruments. The pore size depended on the electrolyte concentration, and the particle size and the number of pores increased on both the surface part and the pore part. In the Zn/Si samples, the pore size was larger than in the Zn samples. The maximum pore size decreased and the minimum pore size increased up to 10Zn/Si, and Zn and Si affected the formation of pore shapes. As the Zn ion concentration increased, the particle size tended to increase and the number of particles on the surface part decreased, whereas the size and number of particles on the pore part increased. Zn was mainly detected at the pore part, and Si was mainly detected at the surface part. The crystallite size of anatase increased with the Zn ion concentration, whereas it decreased when Si ions were added.
Code of Federal Regulations, 2010 CFR
2010-07-01
... tests. (m) Test sample means the collection of compressors from the same category or configuration which is randomly drawn from the batch sample and which will receive emissions tests. (n) Batch size means... category or configuration in a batch. (o) Test sample size means the number of compressors of the same...
The Impact of Sample Size and Other Factors When Estimating Multilevel Logistic Models
ERIC Educational Resources Information Center
Schoeneberger, Jason A.
2016-01-01
The design of research studies utilizing binary multilevel models must necessarily incorporate knowledge of multiple factors, including estimation method, variance component size, or number of predictors, in addition to sample sizes. This Monte Carlo study examined the performance of random effect binary outcome multilevel models under varying…
Code of Federal Regulations, 2012 CFR
2012-10-01
... Correction (FPC). The State agency must increase the resulting number by 30 percent to allow for attrition... 30 percent to allow for attrition, but the sample size must not be larger than the number of youth...
Code of Federal Regulations, 2013 CFR
2013-10-01
... Correction (FPC). The State agency must increase the resulting number by 30 percent to allow for attrition... 30 percent to allow for attrition, but the sample size must not be larger than the number of youth...
Code of Federal Regulations, 2014 CFR
2014-10-01
... Correction (FPC). The State agency must increase the resulting number by 30 percent to allow for attrition... 30 percent to allow for attrition, but the sample size must not be larger than the number of youth...
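The regulation excerpts above name two adjustments: a Finite Population Correction (FPC) and a 30 percent increase for attrition, capped at the number of youth. The arithmetic might look like the minimal sketch below. It assumes the textbook FPC formula n = n0 / (1 + (n0 - 1) / N); the elided regulation text may define the base sample size or the correction differently, and the function and parameter names (`fpc_sample_size`, `base_n`, `population`) are hypothetical.

```python
import math


def fpc_sample_size(base_n: float, population: int, attrition: float = 0.30) -> int:
    """Correct a base sample size for a finite population, then allow for attrition.

    base_n     -- sample size computed as if the population were infinite (assumed input)
    population -- number of units (e.g. youth) in the finite population
    attrition  -- fractional increase to allow for attrition (0.30 in the excerpt)
    """
    # Textbook finite population correction (an assumption; the source text is elided).
    corrected = base_n / (1 + (base_n - 1) / population)
    # Inflate the corrected size by the attrition allowance and round up.
    inflated = math.ceil(corrected * (1 + attrition))
    # The sample size must not be larger than the population itself.
    return min(inflated, population)
```

For a small population the cap dominates: with `base_n=384` and `population=50` the function simply returns 50, that is, every unit is sampled.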
Potential Reporting Bias in Neuroimaging Studies of Sex Differences.
David, Sean P; Naudet, Florian; Laude, Jennifer; Radua, Joaquim; Fusar-Poli, Paolo; Chu, Isabella; Stefanick, Marcia L; Ioannidis, John P A
2018-04-17
Numerous functional magnetic resonance imaging (fMRI) studies have reported sex differences. To empirically evaluate for evidence of excessive significance bias in this literature, we searched for published fMRI studies of human brain to evaluate sex differences, regardless of the topic investigated, in Medline and Scopus over 10 years. We analyzed the prevalence of conclusions in favor of sex differences and the correlation between study sample sizes and number of significant foci identified. In the absence of bias, larger studies (better powered) should identify a larger number of significant foci. Across 179 papers, median sample size was n = 32 (interquartile range 23-47.5). A median of 5 foci related to sex differences were reported (interquartile range, 2-9.5). Few articles (n = 2) had titles focused on no differences or on similarities (n = 3) between sexes. Overall, 158 papers (88%) reached "positive" conclusions in their abstract and presented some foci related to sex differences. There was no statistically significant relationship between sample size and the number of foci (-0.048% increase for every 10 participants, p = 0.63). The extremely high prevalence of "positive" results and the lack of the expected relationship between sample size and the number of discovered foci reflect probable reporting bias and excess significance bias in this literature.
On the repeated measures designs and sample sizes for randomized controlled trials.
Tango, Toshiro
2016-04-01
For the analysis of longitudinal or repeated measures data, generalized linear mixed-effects models provide a flexible and powerful tool to deal with heterogeneity among subject response profiles. However, the typical statistical design adopted in usual randomized controlled trials is an analysis of covariance type analysis using a pre-defined pair of "pre-post" data, in which pre-(baseline) data are used as a covariate for adjustment together with other covariates. Then, the major design issue is to calculate the sample size or the number of subjects allocated to each treatment group. In this paper, we propose a new repeated measures design and sample size calculations combined with generalized linear mixed-effects models that depend not only on the number of subjects but on the number of repeated measures before and after randomization per subject used for the analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are (1) it can easily handle missing data by applying the likelihood-based ignorable analyses under the missing at random assumption and (2) it may lead to a reduction in sample size, compared with the simple pre-post design. The proposed designs and the sample size calculations are illustrated with real data arising from randomized controlled trials. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Klinkenberg, Don; Thomas, Ekelijn; Artavia, Francisco F Calvo; Bouma, Annemarie
2011-08-01
Design of surveillance programs to detect infections could benefit from more insight into sampling schemes. We address the effect of sampling schemes for Salmonella Enteritidis surveillance in laying hens. Based on experimental estimates for the transmission rate in flocks, and the characteristics of an egg immunological test, we have simulated outbreaks with various sampling schemes, and with the current boot swab program with a 15-week sampling interval. Declaring a flock infected based on a single positive egg was not possible because test specificity was too low. Thus, a threshold number of positive eggs was defined to declare a flock infected, and, for small sample sizes, eggs from previous samplings had to be included in a cumulative sample to guarantee a minimum flock level specificity. Effectiveness of surveillance was measured by the proportion of outbreaks detected, and by the number of contaminated table eggs brought on the market. The boot swab program detected 90% of the outbreaks, with 75% fewer contaminated eggs compared to no surveillance, whereas the baseline egg program (30 eggs each 15 weeks) detected 86%, with 73% fewer contaminated eggs. We conclude that a larger sample size results in more detected outbreaks, whereas a smaller sampling interval decreases the number of contaminated eggs. Decreasing sample size and interval simultaneously reduces the number of contaminated eggs, but not indefinitely: the advantage of more frequent sampling is counterbalanced by the cumulative sample including less recently laid eggs. Apparently, optimizing surveillance has its limits when test specificity is taken into account. © 2011 Society for Risk Analysis.
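The abstract above notes that a single positive egg could not be used to declare a flock infected, so a threshold number of positive eggs was chosen to guarantee a minimum flock-level specificity. One way to reason about such a threshold is sketched below, under the simplifying assumption (not stated in the abstract) that eggs from an uninfected flock test independently with the same per-egg specificity. The function names and the example values are hypothetical, not the study's.

```python
from math import comb


def flock_specificity(n_eggs: int, egg_specificity: float, threshold: int) -> float:
    """Probability that an uninfected flock yields fewer than `threshold` positive eggs.

    Assumes independent eggs, so false positives follow Binomial(n_eggs, 1 - specificity).
    """
    fp = 1.0 - egg_specificity  # per-egg false-positive probability
    return sum(
        comb(n_eggs, k) * fp**k * (1.0 - fp) ** (n_eggs - k)
        for k in range(threshold)
    )


def min_threshold(n_eggs: int, egg_specificity: float, target: float) -> int:
    """Smallest number of positive eggs that keeps flock-level specificity >= target."""
    for threshold in range(1, n_eggs + 2):
        if flock_specificity(n_eggs, egg_specificity, threshold) >= target:
            return threshold
    raise ValueError("target specificity cannot be reached with this sample size")
```

With 30 eggs and an illustrative per-egg specificity of 0.99, requiring a single positive egg gives a flock-level specificity of only about 0.74, which is why a higher threshold is needed.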
Recommended protocols for sampling macrofungi
Gregory M. Mueller; John Paul Schmit; Sabine M. Huhndorf; Leif Ryvarden; Thomas E. O'Dell; D. Jean Lodge; Patrick R. Leacock; Milagro Mata; Loengrin Umania; Qiuxin (Florence) Wu; Daniel L. Czederpiltz
2004-01-01
This chapter discusses several issues regarding recommended protocols for sampling macrofungi: opportunistic sampling of macrofungi, sampling conspicuous macrofungi using fixed-size, sampling small Ascomycetes using microplots, and sampling a fixed number of downed logs.
Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling
2006-01-01
Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling. PMID:16937083
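The closing recommendation above, a sample size twice as large as under simple random sampling, amounts to a design effect of 2. The sketch below applies that multiplier to a precision-based sample size for a proportion. This is an illustrative choice rather than the paper's own power calculation, and the function names and default values are assumptions.

```python
import math
from statistics import NormalDist


def srs_sample_size(p: float, margin: float, confidence: float = 0.95) -> float:
    """Sample size under simple random sampling to estimate a proportion p
    within +/- margin at the given confidence level (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z**2 * p * (1 - p) / margin**2


def rds_sample_size(p: float, margin: float, design_effect: float = 2.0,
                    confidence: float = 0.95) -> int:
    """Inflate the simple-random-sampling size by a design effect
    (2.0 reflects the abstract's general recommendation)."""
    return math.ceil(design_effect * srs_sample_size(p, margin, confidence))
```

For example, estimating a prevalence near 50% to within 5 percentage points needs about 385 respondents under simple random sampling, so the doubled requirement is roughly 769.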
Accounting for Incomplete Species Detection in Fish Community Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
McManamay, Ryan A; Orth, Dr. Donald J; Jager, Yetta
2013-01-01
Riverine fish assemblages are heterogeneous and very difficult to characterize with a one-size-fits-all approach to sampling. Furthermore, detecting changes in fish assemblages over time requires accounting for variation in sampling designs. We present a modeling approach that permits heterogeneous sampling by accounting for site and sampling covariates (including method) in a model-based framework for estimation (versus a sampling-based framework). We snorkeled during three surveys and electrofished during a single survey in a suite of delineated habitats stratified by reach types. We developed single-species occupancy models to determine covariates influencing patch occupancy and species detection probabilities whereas community occupancy models estimated species richness in light of incomplete detections. For most species, information-theoretic criteria showed higher support for models that included patch size and reach as covariates of occupancy. In addition, models including patch size and sampling method as covariates of detection probabilities also had higher support. Detection probability estimates for snorkeling surveys were higher for larger non-benthic species whereas electrofishing was more effective at detecting smaller benthic species. The number of sites and sampling occasions required to accurately estimate occupancy varied among fish species. For rare benthic species, our results suggested that a higher number of occasions, and especially the addition of electrofishing, may be required to improve detection probabilities and obtain accurate occupancy estimates. Community models suggested that richness was 41% higher than the number of species actually observed and the addition of an electrofishing survey increased estimated richness by 13%. These results can be useful to future fish assemblage monitoring efforts by informing sampling designs, such as site selection (e.g. stratifying based on patch size) and determining effort required (e.g. number of sites versus occasions).
To measure airborne asbestos and other fibers, an air sample must represent the actual number and size of fibers. Typically, mixed cellulose ester (MCE, 0.45 or 0.8 µm pore size) and to a much lesser extent, capillary-pore polycarbonate (PC, 0.4 µm pore size) membrane filters are...
Trap configuration and spacing influences parameter estimates in spatial capture-recapture models
Sun, Catherine C.; Fuller, Angela K.; Royle, J. Andrew
2014-01-01
An increasing number of studies employ spatial capture-recapture models to estimate population size, but there has been limited research on how different spatial sampling designs and trap configurations influence parameter estimators. Spatial capture-recapture models provide an advantage over non-spatial models by explicitly accounting for heterogeneous detection probabilities among individuals that arise due to the spatial organization of individuals relative to sampling devices. We simulated black bear (Ursus americanus) populations and spatial capture-recapture data to evaluate the influence of trap configuration and trap spacing on estimates of population size and a spatial scale parameter, sigma, that relates to home range size. We varied detection probability and home range size, and considered three trap configurations common to large-mammal mark-recapture studies: regular spacing, clustered, and a temporal sequence of different cluster configurations (i.e., trap relocation). We explored trap spacing and number of traps per cluster by varying the number of traps. The clustered arrangement performed well when detection rates were low, and provides for easier field implementation than the sequential trap arrangement. However, performance differences between trap configurations diminished as home range size increased. Our simulations suggest it is important to consider trap spacing relative to home range sizes, with traps ideally spaced no more than twice the spatial scale parameter. While spatial capture-recapture models can accommodate different sampling designs and still estimate parameters with accuracy and precision, our simulations demonstrate that aspects of sampling design, namely trap configuration and spacing, must consider study area size, ranges of individual movement, and home range sizes in the study population.
Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel
2016-10-01
Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply better performance on average or at the population level, and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For a smaller number of correlated features (with the number of features not exceeding approximately half the sample size), LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and surpasses that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA.
Applications to a number of real datasets supported the findings from the simulation study. © The Author(s) 2013.
NASA Astrophysics Data System (ADS)
Cantarello, Elena; Steck, Claude E.; Fontana, Paolo; Fontaneto, Diego; Marini, Lorenzo; Pautasso, Marco
2010-03-01
Recent large-scale studies have shown that biodiversity-rich regions also tend to be densely populated areas. The most obvious explanation is that biodiversity and human beings tend to match the distribution of energy availability, environmental stability and/or habitat heterogeneity. However, the species-people correlation can also be an artefact, as more populated regions could show more species because of a more thorough sampling. Few studies have tested this sampling bias hypothesis. Using a newly collated dataset, we studied whether Orthoptera species richness is related to human population size in Italy’s regions (average area 15,000 km²) and provinces (2,900 km²). As expected, the observed number of species increases significantly with increasing human population size for both grain sizes, although the proportion of variance explained is minimal at the provincial level. However, variations in observed Orthoptera species richness are primarily associated with the available number of records, which is in turn well correlated with human population size (at least at the regional level). Estimated Orthoptera species richness (Chao2 and Jackknife) also increases with human population size both for regions and provinces. Both for regions and provinces, this increase is not significant when controlling for variation in area and number of records. Our study confirms the hypothesis that broad-scale human population-biodiversity correlations can in some cases be artefactual. More systematic sampling of less studied taxa such as invertebrates is necessary to ascertain whether biogeographical patterns persist when sampling effort is kept constant or included in models.
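The abstract above relies on the Chao2 estimator to correct observed species richness for uneven sampling. A minimal sketch of the classic form of that estimator follows, assuming incidence data summarized as the number of sampling units (for example, records or sites) in which each species was found. The function name and input layout are hypothetical, and the study's exact variant (for instance, any small-sample correction factor) is not given in the abstract.

```python
def chao2(incidence_counts: list[int]) -> float:
    """Classic Chao2 richness estimate from per-species incidence counts.

    incidence_counts[i] is the number of sampling units in which species i occurred.
    """
    s_obs = sum(1 for c in incidence_counts if c > 0)  # species actually observed
    q1 = sum(1 for c in incidence_counts if c == 1)    # "uniques": found in one unit
    q2 = sum(1 for c in incidence_counts if c == 2)    # "duplicates": found in two units
    if q2 > 0:
        return s_obs + q1**2 / (2 * q2)
    # Bias-corrected form, used here only to avoid dividing by zero when q2 == 0.
    return s_obs + q1 * (q1 - 1) / 2
```

The estimate grows with the number of species seen only once, which is why poorly sampled areas with many single records receive a larger upward correction.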
The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics.
Shen, Dan; Shen, Haipeng; Zhu, Hongtu; Marron, J S
2016-10-01
The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented.
Haverkamp, Nicolas; Beauducel, André
2017-01-01
We investigated the effects of violations of the sphericity assumption on Type I error rates for different methodical approaches of repeated measures analysis using a simulation approach. In contrast to previous simulation studies on this topic, up to nine measurement occasions were considered. Effects of the level of inter-correlations between measurement occasions on Type I error rates were considered for the first time. Two populations with non-violation of the sphericity assumption, one with uncorrelated measurement occasions and one with moderately correlated measurement occasions, were generated. One population with violation of the sphericity assumption combines uncorrelated with highly correlated measurement occasions. A second population with violation of the sphericity assumption combines moderately correlated and highly correlated measurement occasions. From these four populations without any between-group effect or within-subject effect 5,000 random samples were drawn. Finally, the mean Type I error rates for multilevel linear models (MLM) with an unstructured covariance matrix (MLM-UN), MLM with compound-symmetry (MLM-CS) and for repeated measures analysis of variance (rANOVA) models (without correction, with Greenhouse-Geisser-correction, and Huynh-Feldt-correction) were computed. To examine the effect of both the sample size and the number of measurement occasions, sample sizes of n = 20, 40, 60, 80, and 100 were considered as well as measurement occasions of m = 3, 6, and 9. With respect to rANOVA, the results favor the use of rANOVA with Huynh-Feldt-correction, especially when the sphericity assumption is violated, the sample size is rather small and the number of measurement occasions is large. For MLM-UN, the results illustrate a massive progressive bias for small sample sizes (n = 20) and m = 6 or more measurement occasions. This effect could not be found in previous simulation studies with a smaller number of measurement occasions.
The proportionality of bias and number of measurement occasions should be considered when MLM-UN is used. The good news is that this proportionality can be compensated by means of large sample sizes. Accordingly, MLM-UN can be recommended even for small sample sizes for about three measurement occasions and for large sample sizes for about nine measurement occasions.
Assessment of sampling stability in ecological applications of discriminant analysis
Williams, B.K.; Titus, K.
1988-01-01
A simulation study was undertaken to assess the sampling stability of the variable loadings in linear discriminant function analysis. A factorial design was used for the factors of multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. A review of 60 published studies and 142 individual analyses indicated that sample sizes in ecological studies often have met that requirement. However, individual group sample sizes frequently were very unequal, and checks of assumptions usually were not reported. The authors recommend that ecologists obtain group sample sizes that are at least three times as large as the number of variables measured.
Moser, Barry Kurt; Halabi, Susan
2013-01-01
In this paper we develop the methodology for designing clinical trials with any factorial arrangement when the primary outcome is time to event. We provide a matrix formulation for calculating the sample size and study duration necessary to test any effect with a pre-specified type I error rate and power. Assuming that a time to event follows an exponential distribution, we describe the relationships between the effect size, the power, and the sample size. We present examples for illustration purposes. We provide a simulation study to verify the numerical calculations of the expected number of events and the duration of the trial. The change in the power produced by a reduced number of observations or by accruing no patients to certain factorial combinations is also described. PMID:25530661
Pinto-Leite, C M; Rocha, P L B
2012-12-01
Empirical studies using visual search methods to investigate spider communities were conducted with different sampling protocols, including a variety of plot sizes, sampling efforts, and diurnal periods for sampling. We sampled 11 plots ranging in size from 5 by 10 m to 5 by 60 m. In each plot, we computed the total number of species detected every 10 min during 1 hr during the daytime and during the nighttime (0630 hours to 1100 hours, both a.m. and p.m.). We measured the influence of time effort on the measurement of species richness by comparing the curves produced by sample-based rarefaction and species richness estimation (first-order jackknife). We used a general linear model with repeated measures to assess whether the phase of the day during which sampling occurred and the differences in the plot lengths influenced the number of species observed and the number of species estimated. To measure the differences in species composition between the phases of the day, we used a multiresponse permutation procedure and a graphical representation based on nonmetric multidimensional scaling. After 50 min of sampling, we noted a decreased rate of species accumulation and a tendency of the estimated richness curves to reach an asymptote. We did not detect an effect of plot size on the number of species sampled. However, differences in observed species richness and species composition were found between phases of the day. Based on these results, we propose guidelines for visual search for tropical web spiders.
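The abstract above estimates richness with the first-order jackknife. A minimal sketch of that estimator for incidence data follows, assuming each sampling unit (for example, one 10-min search interval) is represented as the set of species detected in it. The function name and data layout are hypothetical and not taken from the study.

```python
def jackknife1(samples: list[set[str]]) -> float:
    """First-order jackknife richness estimate: S_obs + Q1 * (m - 1) / m,
    where Q1 is the number of species found in exactly one of the m sampling units."""
    m = len(samples)
    if m == 0:
        raise ValueError("at least one sampling unit is required")
    species = set().union(*samples)  # all species observed in any unit
    uniques = sum(
        1 for sp in species
        if sum(sp in unit for unit in samples) == 1
    )
    return len(species) + uniques * (m - 1) / m
```

As more sampling units are added and fewer species remain unique to a single unit, the correction term shrinks and the estimate approaches the observed richness, which matches the asymptote described in the abstract.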
An In Situ Method for Sizing Insoluble Residues in Precipitation and Other Aqueous Samples
Axson, Jessica L.; Creamean, Jessie M.; Bondy, Amy L.; Capracotta, Sonja S.; Warner, Katy Y.; Ault, Andrew P.
2015-01-01
Particles are frequently incorporated into clouds or precipitation, influencing climate by acting as cloud condensation or ice nuclei, taking up coatings during cloud processing, and removing species through wet deposition. Many of these particles, particularly ice nuclei, can remain suspended within cloud droplets/crystals as insoluble residues. While previous studies have measured the soluble or bulk mass of species within clouds and precipitation, no studies to date have determined the number concentration and size distribution of insoluble residues in precipitation or cloud water using in situ methods. Herein, for the first time we demonstrate that Nanoparticle Tracking Analysis (NTA) is a powerful in situ method for determining the total number concentration, number size distribution, and surface area distribution of insoluble residues in precipitation, both of rain and melted snow. The method uses 500 μL or less of liquid sample and does not require sample modification. Number concentrations for the insoluble residues in aqueous precipitation samples ranged from 2.0–3.0(±0.3)×10^8 particles cm^−3, while surface area ranged from 1.8(±0.7)–3.2(±1.0)×10^7 μm^2 cm^−3. Number size distributions peaked between 133–150 nm, with both single and multi-modal character, while surface area distributions peaked between 173–270 nm. Comparison with electron microscopy of particles up to 10 μm shows that, by number, >97% of residues are <1 μm in diameter, the upper limit of the NTA. The range of concentration and distribution properties indicates that insoluble residue properties vary with ambient aerosol concentrations, cloud microphysics, and meteorological dynamics. NTA has great potential for studying the role that insoluble residues play in critical atmospheric processes. PMID:25705069
Doyle, Jacqueline M; McCormick, Cory R; DeWoody, J Andrew
2011-01-01
Many animals, such as crustaceans, insects, and salamanders, package their sperm into spermatophores, and the number of spermatozoa contained in a spermatophore is relevant to studies of sexual selection and sperm competition. We used two molecular methods, real-time quantitative polymerase chain reaction (RT-qPCR) and spectrophotometry, to estimate sperm numbers from spermatophores. First, we designed gene-specific primers that produced a single amplicon in four species of ambystomatid salamanders. A standard curve generated from cloned amplicons revealed a strong positive relationship between template DNA quantity and cycle threshold, suggesting that RT-qPCR could be used to quantify sperm in a given sample. We then extracted DNA from multiple Ambystoma maculatum spermatophores, performed RT-qPCR on each sample, and estimated template copy numbers (i.e. sperm number) using the standard curve. Second, we used spectrophotometry to determine the number of sperm per spermatophore by measuring DNA concentration relative to the genome size. We documented a significant positive relationship between the estimates of sperm number based on RT-qPCR and those based on spectrophotometry. When these molecular estimates were compared to spermatophore cap size, which in principle could predict the number of sperm contained in the spermatophore, we also found a significant positive relationship between sperm number and spermatophore cap size. This linear model allows estimates of sperm number strictly from cap size, an approach which could greatly simplify the estimation of sperm number in future studies. These methods may help explain variation in fertilization success where sperm competition is mediated by sperm quantity. © 2010 Blackwell Publishing Ltd.
MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.
Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi
2018-03-05
Currently, a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, in each genomic position. MSeq-CNV is applied to simulated data and also to real data of six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.
NASA Astrophysics Data System (ADS)
Prasetya, A.; Mawadati, A.; Putri, A. M. R.; Petrus, H. T. B. M.
2018-01-01
Comminution is one of the crucial steps in gold ore processing, used to liberate the valuable minerals from gangue minerals. This research was done to find the particle size distribution of gold ore after comminution in a rod mill with various numbers of rods and rotational speeds, in order to identify one optimum milling condition. For the initial step, Sumbawa gold ore was crushed and then sieved to pass the 2.5 mesh and be retained on the 5 mesh (this condition was taken to mimic real application in artisanal gold mining). After the prepared sample was inserted into the rod mill, the effect of rod number and rotational speed was observed by varying the rod number between 7 and 10 and the rotational speed among 60, 85, and 110 rpm. To estimate the particle distribution under every condition, comminution kinetics were applied by taking samples at 15, 30, 60, and 120 minutes for size distribution analysis. The change in particle distribution of the top and bottom products as a time series was then treated using the Rosin-Rammler distribution equation. The results show that the homogeneity of particle size and the particle size distribution are affected by rod number and rotational speed. The particle size distribution becomes more homogeneous with increasing milling time, regardless of rod number and rotational speed. The mean particle size does not change significantly after 60 minutes of milling. Experimental results showed that the optimum condition was achieved at a rotational speed of 85 rpm, using a rod number of 7.
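As a rough illustration of how sieve data are commonly treated with the Rosin-Rammler equation, the sketch below fits the cumulative passing fraction P(d) = 1 − exp(−(d/d0)^n) by linear regression on the linearized form ln(−ln(1 − P)) = n ln d − n ln d0. The function name, the sieve sizes, and the parameter values are hypothetical and synthetic; the abstract does not state which fitting procedure the authors used.

```python
import math


def fit_rosin_rammler(sizes, passing):
    """Least-squares fit of the linearized Rosin-Rammler equation.

    sizes   -- sieve openings (any consistent length unit)
    passing -- cumulative fraction passing each sieve, strictly between 0 and 1
    Returns (d0, n): characteristic size and uniformity index.
    """
    xs = [math.log(d) for d in sizes]
    ys = [math.log(-math.log(1.0 - p)) for p in passing]
    k = len(xs)
    mean_x, mean_y = sum(xs) / k, sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope = uniformity index
    intercept = mean_y - n * mean_x    # intercept = -n * ln(d0)
    return math.exp(-intercept / n), n


# Synthetic check: generate data from known parameters and recover them.
true_d0, true_n = 1.5, 1.2
sizes = [0.25, 0.5, 1.0, 2.0, 4.0]
passing = [1.0 - math.exp(-((d / true_d0) ** true_n)) for d in sizes]
print(fit_rosin_rammler(sizes, passing))
```

A larger fitted n indicates a narrower, more homogeneous size distribution, so tracking n across milling times is one way to express the homogeneity trend described above.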
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival
Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas
2016-01-01
Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561
A note on sample size calculation for mean comparisons based on noncentral t-statistics.
Chow, Shein-Chung; Shao, Jun; Wang, Hansheng
2002-11-01
One-sample and two-sample t-tests are commonly used in analyzing data from clinical trials in comparing mean responses from two drug products. During the planning stage of a clinical study, a crucial step is the sample size calculation, i.e., the determination of the number of subjects (patients) needed to achieve a desired power (e.g., 80%) for detecting a clinically meaningful difference in the mean drug responses. Based on noncentral t-distributions, we derive some sample size calculation formulas for testing equality, testing therapeutic noninferiority/superiority, and testing therapeutic equivalence, under the popular one-sample design, two-sample parallel design, and two-sample crossover design. Useful tables are constructed and some examples are given for illustration.
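For orientation, the sketch below computes the per-group sample size for a two-sample test of equality using the familiar normal approximation, n = 2 (z_{1−α/2} + z_{1−β})^2 σ^2 / δ^2. This is a simpler stand-in, not the noncentral t formulas derived in the paper; results based on the noncentral t-distribution are typically slightly larger. The function name and the example inputs are assumptions for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means
    (normal approximation, equal variances, 1:1 allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical planning values: detect a difference of half a standard deviation.
print(n_per_group(delta=0.5, sigma=1.0))
```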
Christopher W. Woodall; Vicente J. Monleon
2009-01-01
The Forest Inventory and Analysis program of the Forest Service, U.S. Department of Agriculture conducts a national inventory of fine woody debris (FWD); however, the sampling protocols involve tallying only the number of FWD pieces by size class that intersect a sampling transect with no measure of actual size. The line intersect estimator used with those samples...
Sampling studies to estimate the HIV prevalence rate in female commercial sex workers.
Pascom, Ana Roberta Pati; Szwarcwald, Célia Landmann; Barbosa Júnior, Aristides
2010-01-01
We investigated sampling methods being used to estimate the HIV prevalence rate among female commercial sex workers. The studies were classified according to the adequacy or not of the sample size to estimate HIV prevalence rate and according to the sampling method (probabilistic or convenience). We identified 75 studies that estimated the HIV prevalence rate among female sex workers. Most of the studies employed convenience samples. The sample size was not adequate to estimate HIV prevalence rate in 35 studies. The use of convenience sample limits statistical inference for the whole group. It was observed that there was an increase in the number of published studies since 2005, as well as in the number of studies that used probabilistic samples. This represents a large advance in the monitoring of risk behavior practices and HIV prevalence rate in this group.
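One common way to judge whether a sample is large enough to estimate a prevalence is the simple-random-sampling formula n = z^2 p(1 − p)/d^2, where d is the desired half-width of the confidence interval. The sketch below applies it; the abstract does not say which adequacy criterion the authors used, so the function name and inputs are illustrative assumptions, and the formula ignores design effects from cluster or convenience sampling.

```python
import math
from statistics import NormalDist


def n_for_prevalence(p, half_width, confidence=0.95):
    """Sample size to estimate a proportion p within +/- half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / half_width ** 2)


# Worst case (p = 0.5) with a 5 percentage-point margin.
print(n_for_prevalence(0.5, 0.05))
```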
Chow, Jeffrey T Y; Turkstra, Timothy P; Yim, Edmund; Jones, Philip M
2018-06-01
Although every randomized clinical trial (RCT) needs participants, determining the ideal number of participants that balances limited resources and the ability to detect a real effect is difficult. Focussing on two-arm, parallel group, superiority RCTs published in six general anesthesiology journals, the objective of this study was to compare the quality of sample size calculations for RCTs published in 2010 vs 2016. Each RCT's full text was searched for the presence of a sample size calculation, and the assumptions made by the investigators were compared with the actual values observed in the results. Analyses were only performed for sample size calculations that were amenable to replication, defined as using a clearly identified outcome that was continuous or binary in a standard sample size calculation procedure. The percentage of RCTs reporting all sample size calculation assumptions increased from 51% in 2010 to 84% in 2016. The difference between the values observed in the study and the expected values used for the sample size calculation for most RCTs was usually > 10% of the expected value, with negligible improvement from 2010 to 2016. While the reporting of sample size calculations improved from 2010 to 2016, the expected values in these sample size calculations often assumed effect sizes larger than those actually observed in the study. Since overly optimistic assumptions may systematically lead to underpowered RCTs, improvements in how to calculate and report sample sizes in anesthesiology research are needed.
Size Distributions and Characterization of Native and Ground Samples for Toxicology Studies
NASA Technical Reports Server (NTRS)
McKay, David S.; Cooper, Bonnie L.; Taylor, Larry A.
2010-01-01
This slide presentation shows charts and graphs that review the particle size distribution and characterization of natural and ground samples for toxicology studies. There are graphs which show the volume distribution versus the number distribution for natural occurring dust, jet mill ground dust, and ball mill ground dust.
Re-estimating sample size in cluster randomised trials with active recruitment within clusters.
van Schie, S; Moerbeek, M
2014-08-30
Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster level and individual level variance should be known before the study starts, but this is often not the case. We suggest using an internal pilot study design to address this problem of unknown variances. A pilot can be useful to re-estimate the variances and re-calculate the sample size during the trial. Using simulated data, it is shown that an initially low or high power can be adjusted using an internal pilot with the type I error rate remaining within an acceptable range. The intracluster correlation coefficient can be re-estimated with more precision, which has a positive effect on the sample size. We conclude that an internal pilot study design may be used if active recruitment is feasible within a limited number of clusters. Copyright © 2014 John Wiley & Sons, Ltd.
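To make the role of the intracluster correlation coefficient (ICC) concrete, the sketch below uses the standard design effect for equal cluster sizes, DE = 1 + (m − 1) × ICC, to convert an individually randomised sample size into a number of clusters per arm. Re-running it with an ICC re-estimated from an internal pilot mimics the re-calculation step in spirit only; the function names and numbers are hypothetical, and the paper's own procedure may differ.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation for equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual, cluster_size, icc):
    """Clusters needed per arm, given the per-arm n of an
    individually randomised trial with the same power."""
    inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)


# Planning ICC versus a (hypothetical) ICC re-estimated at the internal pilot.
print(clusters_per_arm(63, 20, 0.05))
print(clusters_per_arm(63, 20, 0.10))
```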
Williams, Michael S; Cao, Yong; Ebel, Eric D
2013-07-15
Levels of pathogenic organisms in food and water have steadily declined in many parts of the world. A consequence of this reduction is that the proportion of samples that test positive for the most contaminated product-pathogen pairings has fallen to less than 0.1. While this is unequivocally beneficial to public health, datasets with very few enumerated samples present an analytical challenge because a large proportion of the observations are censored values. One application of particular interest to risk assessors is the fitting of a statistical distribution function to datasets collected at some point in the farm-to-table continuum. The fitted distribution forms an important component of an exposure assessment. A number of studies have compared different fitting methods and proposed lower limits on the proportion of samples where the organisms of interest are identified and enumerated, with the recommended lower limit of enumerated samples being 0.2. This recommendation may not be applicable to food safety risk assessments for a number of reasons, which include the development of new Bayesian fitting methods, the use of highly sensitive screening tests, and the generally larger sample sizes found in surveys of food commodities. This study evaluates the performance of a Markov chain Monte Carlo fitting method when used in conjunction with a screening test and enumeration of positive samples by the Most Probable Number technique. The results suggest that levels of contamination for common product-pathogen pairs, such as Salmonella on poultry carcasses, can be reliably estimated with the proposed fitting method and sample sizes in excess of 500 observations. The results do, however, demonstrate that simple guidelines for this application, such as the proportion of positive samples, cannot be provided. Published by Elsevier B.V.
Ramezani, Habib; Holm, Sören; Allard, Anna; Ståhl, Göran
2010-05-01
Environmental monitoring of landscapes is of increasing interest. To quantify landscape patterns, a number of metrics are used, of which Shannon's diversity, edge length, and density are studied here. As an alternative to complete mapping, point sampling was applied to estimate the metrics for already mapped landscapes selected from the National Inventory of Landscapes in Sweden (NILS). Monte-Carlo simulation was applied to study the performance of different designs. Random and systematic samplings were applied for four sample sizes and five buffer widths. The latter feature was relevant for edge length, since length was estimated through the number of points falling in buffer areas around edges. In addition, two landscape complexities were tested by applying two classification schemes with seven or 20 land cover classes to the NILS data. As expected, the root mean square error (RMSE) of the estimators decreased with increasing sample size. The estimators of both metrics were slightly biased, but the bias of Shannon's diversity estimator was shown to decrease when sample size increased. In the edge length case, an increasing buffer width resulted in larger bias due to the increased impact of boundary conditions; this effect was shown to be independent of sample size. However, we also developed adjusted estimators that eliminate the bias of the edge length estimator. The rates of decrease of RMSE with increasing sample size and buffer width were quantified by a regression model. Finally, indicative cost-accuracy relationships were derived showing that point sampling could be a competitive alternative to complete wall-to-wall mapping.
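The point-sampling idea for Shannon's diversity can be sketched as a plug-in estimate: classify the land cover at each sample point, convert the counts to proportions, and compute H = −Σ p_i ln p_i. The function name and the class labels below are illustrative assumptions. The plug-in estimate tends to underestimate H for small samples, which is in line with the sample-size-dependent bias reported above.

```python
import math
from collections import Counter


def shannon_from_points(labels):
    """Plug-in Shannon diversity from land-cover labels at sample points."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical sample of 100 points falling evenly into four cover classes.
points = ["forest"] * 25 + ["arable"] * 25 + ["water"] * 25 + ["urban"] * 25
print(shannon_from_points(points))
```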
Sample size determination for mediation analysis of longitudinal data.
Pan, Haitao; Liu, Suyu; Miao, Danmin; Yuan, Ying
2018-03-27
Sample size planning for longitudinal data is crucial when designing mediation studies because sufficient statistical power is not only required in grant applications and peer-reviewed publications, but is essential to reliable research results. However, sample size determination is not straightforward for mediation analysis of longitudinal designs. To facilitate planning the sample size for longitudinal mediation studies with a multilevel mediation model, this article provides the sample size required to achieve 80% power by simulations under various sizes of the mediation effect, within-subject correlations and numbers of repeated measures. The sample size calculation is based on three commonly used mediation tests: Sobel's method, distribution of product method and the bootstrap method. Among the three methods of testing the mediation effects, Sobel's method required the largest sample size to achieve 80% power. Bootstrapping and the distribution of the product method performed similarly and were more powerful than Sobel's method, as reflected by the relatively smaller sample sizes. For all three methods, the sample size required to achieve 80% power depended on the value of the ICC (i.e., within-subject correlation). A larger value of ICC typically required a larger sample size to achieve 80% power. Simulation results also illustrated the advantage of the longitudinal study design. Sample size tables for the scenarios most commonly encountered in practice have also been published for convenient use. An extensive simulation study showed that the distribution of the product method and the bootstrapping method perform better than Sobel's method, but the product method is recommended in practice because it requires less computation time than the bootstrapping method. An R package has been developed for the product method of sample size determination in longitudinal mediation study designs.
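For readers unfamiliar with Sobel's method, the sketch below shows the basic single-level form of the test: the indirect effect a × b is divided by its first-order standard error, sqrt(b^2 se_a^2 + a^2 se_b^2), and referred to a standard normal distribution. The multilevel longitudinal model in the paper is more involved; the function name and the path estimates here are hypothetical.

```python
import math
from statistics import NormalDist


def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided p-value for the indirect effect a*b."""
    se_ab = math.sqrt(b * b * se_a * se_a + a * a * se_b * se_b)
    z = (a * b) / se_ab
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical path estimates: X -> M (a) and M -> Y given X (b).
print(sobel_test(a=0.4, se_a=0.1, b=0.3, se_b=0.1))
```

Because the product a × b is generally not normally distributed, this normal reference is conservative, which is consistent with the abstract's finding that Sobel's method needs the largest samples.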
van Hassel, Daniël; van der Velden, Lud; de Bakker, Dinny; van der Hoek, Lucas; Batenburg, Ronald
2017-12-04
Our research is based on a technique for time sampling, an innovative method for measuring the working hours of Dutch general practitioners (GPs), which was deployed in an earlier study. In this study, 1051 GPs were questioned about their activities in real time by sending them one SMS text message every 3 h during 1 week. The required sample size for this study is important for health workforce planners to know if they want to apply this method to target groups who are hard to reach or if fewer resources are available. In this time-sampling method, however, a standard power analysis is not sufficient for calculating the required sample size as this accounts only for sample fluctuation and not for the fluctuation of measurements taken from every participant. We investigated the impact of the number of participants and frequency of measurements per participant upon the confidence intervals (CIs) for the hours worked per week. Statistical analyses of the time-use data we obtained from GPs were performed. Ninety-five percent CIs were calculated, using equations and simulation techniques, for various different numbers of GPs included in the dataset and for various frequencies of measurements per participant. Our results showed that the one-tailed CI, including sample and measurement fluctuation, decreased from 21 to 3 h as the number of GPs increased from one to 50. As follows from the formulas used to calculate CIs, precision continued to increase beyond this point, but each additional group of the same number of GPs yielded a smaller gain. Likewise, the analyses showed how the number of participants required decreased if more measurements per participant were taken. For example, one measurement per 3-h time slot during the week requires 300 GPs to achieve a CI of 1 h, while one measurement per hour requires 100 GPs to obtain the same result. The sample size needed for time-use research based on a time-sampling technique depends on the design and aim of the study. 
In this paper, we showed how the precision of the measurement of hours worked each week by GPs strongly varied according to the number of GPs included and the frequency of measurements per GP during the week measured. The best balance between both dimensions will depend upon different circumstances, such as the target group and the budget available.
Dilution effects on ultrafine particle emissions from Euro 5 and Euro 6 diesel and gasoline vehicles
NASA Astrophysics Data System (ADS)
Louis, Cédric; Liu, Yao; Martinet, Simon; D'Anna, Barbara; Valiente, Alvaro Martinez; Boreave, Antoinette; R'Mili, Badr; Tassel, Patrick; Perret, Pascal; André, Michel
2017-11-01
Dilution and temperature used during sampling of vehicle exhaust can modify particle number concentration and size distribution. Two experiments were performed on a chassis dynamometer to assess the effects of exhaust dilution and temperature on particle number and particle size distribution for Euro 5 and Euro 6 vehicles. In the first experiment, the effects of dilution (ratio from 8 to 4 000) and temperature (ranging from 50 °C to 150 °C) on particle quantification were investigated directly from the tailpipe for a diesel and a gasoline Euro 5 vehicle. In the second experiment, particle emissions from Euro 6 diesel and gasoline vehicles directly sampled from the tailpipe were compared to the constant volume sampling (CVS) measurements under similar sampling conditions. Low primary dilutions (3-5) induced an increase in particle number concentration by a factor of 2 compared to high primary dilutions (12-20). Low dilution temperatures (50 °C) induced 1.4-3 times higher particle number concentration than high dilution temperatures (150 °C). For the Euro 6 gasoline vehicle with direct injection, constant volume sampling (CVS) particle number concentrations were higher than after the tailpipe by a factor of 6, 80 and 22 for Artemis urban, road and motorway, respectively. For the same vehicle, particle size distribution measured after the tailpipe was centred on 10 nm, and particles were smaller than the ones measured after CVS that was centred between 50 nm and 70 nm. The high particle concentration (≈10^6 #/cm^3) and the growth in diameter, measured in the CVS, highlighted aerosol transformations, such as nucleation, condensation and coagulation occurring in the sampling system and this might have biased the particle measurements.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren
2011-01-01
Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
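The recommendation to equalize read numbers before comparing libraries can be sketched as random subsampling (rarefying) to a common depth, followed by a richness estimate. The example uses the bias-corrected Chao1 form S_obs + F1(F1 − 1)/(2(F2 + 1)), where F1 and F2 are the numbers of taxa seen exactly once and twice. The function names, the seed, and the toy reads are assumptions for illustration and do not represent the tools referred to in the abstract.

```python
import random
from collections import Counter


def chao1(reads):
    """Bias-corrected Chao1 richness from a list of per-read taxon labels."""
    abundance = Counter(reads)
    f1 = sum(1 for c in abundance.values() if c == 1)   # singletons
    f2 = sum(1 for c in abundance.values() if c == 2)   # doubletons
    return len(abundance) + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefy(reads, depth, seed=0):
    """Randomly subsample a library to a fixed number of reads."""
    return random.Random(seed).sample(list(reads), depth)


# Hypothetical library: taxon label per read.
library = ["a"] * 5 + ["b"] * 2 + ["c"] * 2 + ["d", "e", "f"]
print(chao1(library))
print(chao1(rarefy(library, depth=8)))
```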
Chen, Hua; Chen, Kun
2013-01-01
The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n − An(t) follows a Poisson distribution, and as m → n, n(n−1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference. PMID:23666939
Chen, Hua; Chen, Kun
2013-07-01
The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n − An(t) follows a Poisson distribution, and as m → n, n(n − 1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference.
Gondikas, Andreas; von der Kammer, Frank; Hofmann, Thilo; Marchetti-Deschmann, Martina; Allmaier, Günter; Marko-Varga, György; Andersson, Roland
2017-01-01
For drug delivery, characterization of liposomes regarding size, particle number concentrations, occurrence of low-sized liposome artefacts and drug encapsulation are of importance to understand their pharmacodynamic properties. In our study, we aimed to demonstrate the applicability of nano Electrospray Gas-Phase Electrophoretic Mobility Molecular Analyser (nES GEMMA) as a suitable technique for analyzing these parameters. We measured number-based particle concentrations, identified differences in size between nominally identical liposomal samples, and detected the presence of low-diameter material which yielded bimodal particle size distributions. Subsequently, we compared these findings to dynamic light scattering (DLS) data and results from light scattering experiments coupled to Asymmetric Flow-Field Flow Fractionation (AF4), the latter improving the detectability of smaller particles in polydisperse samples due to a size separation step prior to detection. However, the bimodal size distribution could not be detected due to method-inherent limitations. In contrast, cryo transmission electron microscopy corroborated nES GEMMA results. Hence, gas-phase electrophoresis proved to be a versatile tool for liposome characterization as it could analyze both vesicle size and size distribution. Finally, a correlation of nES GEMMA results with cell viability experiments was carried out to demonstrate the importance of liposome batch-to-batch control as low-sized sample components possibly impact cell viability. PMID:27639623
Parajulee, M N; Shrestha, R B; Leser, J F
2006-04-01
A 2-yr field study was conducted to examine the effectiveness of two sampling methods (visual and plant washing techniques) for western flower thrips, Frankliniella occidentalis (Pergande), and five sampling methods (visual, beat bucket, drop cloth, sweep net, and vacuum) for cotton fleahopper, Pseudatomoscelis seriatus (Reuter), in Texas cotton, Gossypium hirsutum (L.), and to develop sequential sampling plans for each pest. The plant washing technique gave similar results to the visual method in detecting adult thrips, but the washing technique detected significantly higher number of thrips larvae compared with the visual sampling. Visual sampling detected the highest number of fleahoppers followed by beat bucket, drop cloth, vacuum, and sweep net sampling, with no significant difference in catch efficiency between vacuum and sweep net methods. However, based on fixed precision cost reliability, the sweep net sampling was the most cost-effective method followed by vacuum, beat bucket, drop cloth, and visual sampling. Taylor's Power Law analysis revealed that the field dispersion patterns of both thrips and fleahoppers were aggregated throughout the crop growing season. For thrips management decision based on visual sampling (0.25 precision), 15 plants were estimated to be the minimum sample size when the estimated population density was one thrips per plant, whereas the minimum sample size was nine plants when thrips density approached 10 thrips per plant. The minimum visual sample size for cotton fleahoppers was 16 plants when the density was one fleahopper per plant, but the sample size decreased rapidly with an increase in fleahopper density, requiring only four plants to be sampled when the density was 10 fleahoppers per plant. Sequential sampling plans were developed and validated with independent data for both thrips and cotton fleahoppers.
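The link between Taylor's Power Law and the minimum sample sizes quoted above can be sketched with the usual fixed-precision formula. If the variance follows s^2 = a m^b and precision D is defined as the standard error divided by the mean, then n = a m^(b − 2) / D^2. The coefficients below are hypothetical placeholders, not the values fitted in the study.

```python
import math


def taylor_sample_size(mean_density, a, b, precision=0.25):
    """Minimum sample size for a fixed precision (SE/mean) under
    Taylor's Power Law, variance = a * mean**b."""
    return math.ceil(a * mean_density ** (b - 2) / precision ** 2)


# Hypothetical aggregated population (1 < b < 2): required n falls as density rises.
for density in (1.0, 4.0):
    print(density, taylor_sample_size(density, a=1.0, b=1.5))
```

With b below 2, the exponent b − 2 is negative, which reproduces the qualitative pattern in the abstract: fewer plants need to be inspected at higher pest densities.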
Influence of item distribution pattern and abundance on efficiency of benthic core sampling
Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.
2014-01-01
Core sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm²), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m²). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
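A rough sketch of this kind of simulation, written in plain Python rather than a GIS, is given below. Everything in it is an illustrative assumption: the unit-square plot, the clump model (offspring scattered around random parent points), the parameter values, and the function names `make_items` and `core_sample`. Detection probability is approximated here simply as the fraction of cores that capture at least one item.

```python
import math
import random


def make_items(n_items, clumped, rng, n_clumps=10, spread=0.02):
    """Place items in a 1 x 1 plot, either uniformly or in clumps."""
    if not clumped:
        return [(rng.random(), rng.random()) for _ in range(n_items)]
    parents = [(rng.random(), rng.random()) for _ in range(n_clumps)]
    items = []
    for _ in range(n_items):
        px, py = rng.choice(parents)
        # Scatter each item around a parent; wrap so it stays in the plot.
        items.append(((px + rng.gauss(0, spread)) % 1.0,
                      (py + rng.gauss(0, spread)) % 1.0))
    return items


def core_sample(items, n_cores, radius, rng):
    """Return (estimated density, fraction of cores with >= 1 item)."""
    core_area = math.pi * radius ** 2
    counts = []
    for _ in range(n_cores):
        cx = rng.uniform(radius, 1 - radius)
        cy = rng.uniform(radius, 1 - radius)
        counts.append(sum(1 for x, y in items
                          if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2))
    density = sum(counts) / (n_cores * core_area)
    detection = sum(1 for c in counts if c > 0) / n_cores
    return density, detection


if __name__ == "__main__":
    rng = random.Random(1)
    for label, clumped in (("random", False), ("clumped", True)):
        items = make_items(200, clumped, rng)
        print(label, core_sample(items, n_cores=1000, radius=0.03, rng=rng))
```

Repeating `core_sample` with small `n_cores` many times would give the spread (precision) of the density estimate for each distribution pattern.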
NASA Astrophysics Data System (ADS)
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-04-01
In the last three decades, an increasing number of studies analyzed spatial patterns in throughfall to investigate the consequences of rainfall redistribution for biogeochemical and hydrological processes in forests. In the majority of cases, variograms were used to characterize the spatial properties of the throughfall data. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and an appropriate layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation methods on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with heavy outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling), and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. 
Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the numbers recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes << 200, our current knowledge about throughfall spatial variability stands on shaky ground.
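As an illustration of the method-of-moments approach mentioned above, the sketch below computes an empirical semivariogram with the classical (non-robust) Matheron estimator, γ(h) = Σ(z_i − z_j)² / (2·N(h)), over distance bins. This is a generic textbook form assumed for illustration, not the authors' code; the function name and binning scheme are hypothetical, and residual maximum likelihood is not shown.

```python
import math


def empirical_variogram(coords, values, bin_width, n_bins):
    """Classical method-of-moments semivariogram per distance bin.

    coords: list of (x, y) sampling locations; values: measurements there.
    Returns one semivariance per bin, or None where a bin has no pairs.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = math.hypot(coords[i][0] - coords[j][0],
                              coords[i][1] - coords[j][1])
            k = int(dist // bin_width)
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


if __name__ == "__main__":
    # Three collectors on a transect with a linear trend in the values.
    print(empirical_variogram([(0, 0), (1, 0), (2, 0)], [0.0, 1.0, 2.0], 1.0, 3))
```

Because every squared difference enters the sum directly, a few extreme values can dominate a bin, which is one reason heavy-tailed throughfall data call for larger samples or alternative estimators.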
NASA Astrophysics Data System (ADS)
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-09-01
In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. 
Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes ≪200, currently available data are prone to large uncertainties.
Estimation of the bottleneck size in Florida panthers
Culver, M.; Hedrick, P.W.; Murphy, K.; O'Brien, S.; Hornocker, M.G.
2008-01-01
We have estimated the extent of genetic variation in museum (1890s) and contemporary (1980s) samples of Florida panthers Puma concolor coryi for both nuclear loci and mtDNA. The microsatellite heterozygosity in the contemporary sample was only 0.325 of that in the museum samples, although our sample size and number of loci are limited. Support for this estimate is provided by a sample of 84 microsatellite loci in contemporary Florida panthers and Idaho pumas Puma concolor hippolestes in which the contemporary Florida panther sample had only 0.442 of the heterozygosity of Idaho pumas. The estimated diversities in mtDNA in the museum and contemporary samples were 0.600 and 0.000, respectively. Using a population genetics approach, we have estimated that reducing either the microsatellite heterozygosity or the mtDNA diversity this much (in a period of c. 80 years during the 20th century when the numbers were thought to be low) requires a very small bottleneck size of c. 2 for several generations and a small effective population size in other generations. Using demographic data from Yellowstone pumas, we estimated the ratio of effective to census population size to be 0.315. Using this ratio, the census population size in the Florida panthers necessary to explain the loss of microsatellite variation was c. 41 for the non-bottleneck generations and 6.2 for the two bottleneck generations. These low bottleneck population sizes and the concomitant reduced effectiveness of selection are probably responsible for the high frequency of several detrimental traits in Florida panthers, namely undescended testicles and poor sperm quality. The recent intensive monitoring both before and after the introduction of Texas pumas in 1995 will make the recovery and genetic restoration of Florida panthers a classic study of an endangered species. 
Our estimates of the bottleneck size responsible for the loss of genetic variation in the Florida panther complete an unknown aspect of this account. © 2008 The Authors. Journal compilation © 2008 The Zoological Society of London.
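The reasoning from heterozygosity loss to bottleneck size can be sketched with the standard drift expectation for diploid nuclear loci, H_t/H_0 = Π(1 − 1/(2·Ne_t)), taken over generations. This is a textbook relation assumed here for illustration; the abstract does not state the exact model, generation time, or number of generations used, so the generation schedule in the example is hypothetical. Only the effective-to-census ratio of 0.315 comes from the abstract.

```python
def heterozygosity_retained(ne_per_generation):
    """Expected fraction of heterozygosity left after drift.

    ne_per_generation: effective population size in each generation
    (diploid nuclear loci; mutation and migration ignored).
    """
    retained = 1.0
    for ne in ne_per_generation:
        retained *= 1.0 - 1.0 / (2.0 * ne)
    return retained


def census_from_effective(ne, ratio=0.315):
    """Census size implied by an effective size and an Ne/N ratio."""
    return ne / ratio


if __name__ == "__main__":
    # Hypothetical schedule: two bottleneck generations, then ten larger ones.
    schedule = [2, 2] + [13] * 10
    print(heterozygosity_retained(schedule))
    print(census_from_effective(2))
```

Under this relation a few generations at an effective size near 2 remove far more variation than many generations at a moderately small size, which is why a brief, severe bottleneck is needed to explain a large loss over a short period.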
What about N? A methodological study of sample-size reporting in focus group studies.
Carlsen, Benedicte; Glenton, Claire
2011-03-11
Focus group studies are increasingly published in health related journals, but we know little about how researchers use this method, particularly how they determine the number of focus groups to conduct. The methodological literature commonly advises researchers to follow principles of data saturation, although practical advice on how to do this is lacking. Our objectives were, firstly, to describe the current status of sample size in focus group studies reported in health journals and, secondly, to assess whether and how researchers explain the number of focus groups they carry out. We searched PubMed for studies that had used focus groups and that had been published in open access journals during 2008, and extracted data on the number of focus groups and on any explanation authors gave for this number. We also did a qualitative assessment of the papers with regard to how number of groups was explained and discussed. We identified 220 papers published in 117 journals. In these papers insufficient reporting of sample sizes was common. The number of focus groups conducted varied greatly (mean 8.4, median 5, range 1 to 96). Thirty seven (17%) studies attempted to explain the number of groups. Six studies referred to rules of thumb in the literature, three stated that they were unable to organize more groups for practical reasons, while 28 studies stated that they had reached a point of saturation. Among those stating that they had reached a point of saturation, several appeared not to have followed principles from grounded theory where data collection and analysis is an iterative process until saturation is reached. Studies with high numbers of focus groups did not offer explanations for number of groups. Too much data as a study weakness was not an issue discussed in any of the reviewed papers. Based on these findings we suggest that journals adopt more stringent requirements for focus group method reporting. 
The often poor and inconsistent reporting seen in these studies may also reflect the lack of clear, evidence-based guidance about deciding on sample size. More empirical research is needed to develop focus group methodology.
Observational studies of patients in the emergency department: a comparison of 4 sampling methods.
Valley, Morgan A; Heard, Kennon J; Ginde, Adit A; Lezotte, Dennis C; Lowenstein, Steven R
2012-08-01
We evaluate the ability of 4 sampling methods to generate representative samples of the emergency department (ED) population. We analyzed the electronic records of 21,662 consecutive patient visits at an urban, academic ED. From this population, we simulated different models of study recruitment in the ED by using 2 sample sizes (n=200 and n=400) and 4 sampling methods: true random, random 4-hour time blocks by exact sample size, random 4-hour time blocks by a predetermined number of blocks, and convenience or "business hours." For each method and sample size, we obtained 1,000 samples from the population. Using χ² tests, we measured the number of statistically significant differences between the sample and the population for 8 variables (age, sex, race/ethnicity, language, triage acuity, arrival mode, disposition, and payer source). Then, for each variable, method, and sample size, we compared the proportion of the 1,000 samples that differed from the overall ED population to the expected proportion (5%). Only the true random samples represented the population with respect to sex, race/ethnicity, triage acuity, mode of arrival, language, and payer source in at least 95% of the samples. Patient samples obtained using random 4-hour time blocks and business hours sampling systematically differed from the overall ED patient population for several important demographic and clinical variables. However, the magnitude of these differences was not large. Common sampling strategies selected for ED-based studies may affect parameter estimates for several representative population variables. However, the potential for bias for these variables appears small. Copyright © 2012. Published by Mosby, Inc.
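The resampling logic described above can be sketched on synthetic data. The example below is an assumption-laden toy, not the study's procedure: it invents a population in which one binary variable (here called ambulance arrival) is more common at night, compares true random sampling with a "business hours" pool (09:00 to 17:00), and uses a one-degree-of-freedom χ² goodness-of-fit statistic against the population proportion with the 5% critical value 3.841. All names and parameter values are hypothetical.

```python
import random

CHI2_CRIT_DF1 = 3.841  # 5% critical value, chi-square with 1 degree of freedom


def make_population(n_visits, rng):
    """Synthetic visits as (hour, flag); the flag is more common at night."""
    visits = []
    for _ in range(n_visits):
        hour = rng.randrange(24)
        p_flag = 0.2 if 8 <= hour < 18 else 0.4
        visits.append((hour, rng.random() < p_flag))
    return visits


def chi2_stat(sample, pop_prop):
    """Goodness-of-fit statistic of a sample against the population proportion."""
    n = len(sample)
    observed_yes = sum(1 for _, flag in sample if flag)
    expected_yes = n * pop_prop
    expected_no = n - expected_yes
    return ((observed_yes - expected_yes) ** 2 / expected_yes
            + ((n - observed_yes) - expected_no) ** 2 / expected_no)


def rejection_rate(pool, pop_prop, n, reps, rng):
    """Share of repeated samples that differ significantly from the population."""
    hits = sum(1 for _ in range(reps)
               if chi2_stat(rng.sample(pool, n), pop_prop) > CHI2_CRIT_DF1)
    return hits / reps


def compare_methods(seed=0, n_visits=20000, n=200, reps=500):
    rng = random.Random(seed)
    population = make_population(n_visits, rng)
    pop_prop = sum(1 for _, flag in population if flag) / n_visits
    business = [v for v in population if 9 <= v[0] < 17]
    return (rejection_rate(population, pop_prop, n, reps, rng),
            rejection_rate(business, pop_prop, n, reps, rng))


if __name__ == "__main__":
    random_rate, business_rate = compare_methods()
    print("true random:", random_rate, "business hours:", business_rate)
```

A representative method should reject in roughly 5% of samples, the benchmark the abstract uses; a method that draws only from daytime visits rejects far more often whenever the variable depends on time of day.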
Estimating the Size of a Large Network and its Communities from a Random Sample
Chen, Lin; Karbasi, Amin; Crawford, Forrest W.
2017-01-01
Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924
Estimating the Size of a Large Network and its Communities from a Random Sample.
Chen, Lin; Karbasi, Amin; Crawford, Forrest W
2016-01-01
Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.
Stability and bias of classification rates in biological applications of discriminant analysis
Williams, B.K.; Titus, K.; Hines, J.E.
1990-01-01
We assessed the sampling stability of classification rates in discriminant analysis by using a factorial design with factors for multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. Simulation results indicated strong bias in correct classification rates when group sample sizes were small and when overlap among groups was high. We also found that stability of the correct classification rates was influenced by these factors, indicating that the number of samples required for a given level of precision increases with the amount of overlap among groups. In a review of 60 published studies, we found that 57% of the articles presented results on classification rates, though few of them mentioned potential biases in their results. Wildlife researchers should choose the total number of samples per group to be at least 2 times the number of variables to be measured when overlap among groups is low. Substantially more samples are required as the overlap among groups increases.
Optimal design in pediatric pharmacokinetic and pharmacodynamic clinical studies.
Roberts, Jessica K; Stockmann, Chris; Balch, Alfred; Yu, Tian; Ward, Robert M; Spigarelli, Michael G; Sherwin, Catherine M T
2015-03-01
It is not trivial to conduct clinical trials with pediatric participants. Ethical, logistical, and financial considerations add to the complexity of pediatric studies. Optimal design theory allows investigators the opportunity to apply mathematical optimization algorithms to define how to structure their data collection to answer focused research questions. These techniques can be used to determine an optimal sample size, optimal sample times, and the number of samples required for pharmacokinetic and pharmacodynamic studies. The aim of this review is to demonstrate how to determine optimal sample size, optimal sample times, and the number of samples required from each patient by presenting specific examples using optimal design tools. Additionally, this review aims to discuss the relative usefulness of sparse vs rich data. This review is intended to educate the clinician, as well as the basic research scientist, who plan on conducting a pharmacokinetic/pharmacodynamic clinical trial in pediatric patients. © 2015 John Wiley & Sons Ltd.
Number of pins in two-stage stratified sampling for estimating herbage yield
William G. O'Regan; C. Eugene Conrad
1975-01-01
In a two-stage stratified procedure for sampling herbage yield, plots are stratified by a pin frame in stage one, and clipped. In stage two, clippings from selected plots are sorted, dried, and weighed. Sample size and distribution of plots between the two stages are determined by equations. A way to compute the effect of number of pins on the variance of estimated...
Further improvement of hydrostatic pressure sample injection for microchip electrophoresis.
Luo, Yong; Zhang, Qingquan; Qin, Jianhua; Lin, Bingcheng
2007-12-01
The hydrostatic pressure sample injection method can minimize the number of electrodes needed for a microchip electrophoresis process; however, it can neither be applied to electrophoretic DNA sizing nor be implemented on the widely used single-cross microchip. This paper presents an injector design that makes the hydrostatic pressure sample injection method suitable for DNA sizing. By introducing an assistant channel into the normal double-cross injector, a rugged DNA sample plug suitable for sizing can be successfully formed within the cross area during the sample loading. This paper also demonstrates that the hydrostatic pressure sample injection can be performed in the single-cross microchip by controlling the radial position of the detection point in the separation channel. Rhodamine 123 and its derivative, used as model samples, were successfully separated.
Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G
2012-10-01
Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires the presence of species in a sample to be assessed, while counts of the number of individuals per species are required for only a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample and at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.
Study design requirements for RNA sequencing-based breast cancer diagnostics.
Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias
2016-02-01
Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.
Closed percutaneous pleural biopsy. A lost art in the new era.
Khadadah, Mousa E; Muqim, Abdulaziz T; Al-Mutairi, Abdulla D; Nahar, Ibrahim K; Sharma, Prem N; Behbehani, Nasser H; El-Maradni, Nabeel M
2009-06-01
To assess the association between the size and number of biopsy specimens obtained by percutaneous closed pleural biopsy and the diagnostic yield, both overall and for histopathological evidence of tuberculous pleurisy in particular. One hundred and forty-three patients with a high index of clinical suspicion for tuberculous pleurisy were referred to the respiratory division of Mubarak Al-Kabeer Hospital in Kuwait during a 9-year period (January 1999 to December 2007). All subjects with exudative lymphocytic predominant effusion underwent percutaneous closed pleural biopsy, looking for tuberculous granulomas. The clinical diagnosis and pathological characteristics (number and size of biopsy samples) were analyzed. The overall diagnostic yield of percutaneous closed pleural biopsy in all cases was 52%. A larger biopsy sample size of 3 mm or more and a higher number of specimens (≥ 4) were significantly associated with an increased diagnostic yield for tuberculous pleurisy (p=0.007 and 0.047). Obtaining 4 or more biopsy samples, and larger specimens of 3 mm or more, for histopathological evaluation through percutaneous pleural biopsy results in a better diagnostic yield for tuberculous pleurisy.
Ngamjarus, Chetta; Chongsuvivatwong, Virasakdi; McNeil, Edward; Holling, Heinz
2017-01-01
Sample size determination is usually taught based on theory and is difficult to understand. Using a smartphone application to teach sample size calculation ought to be more attractive to students than using lectures only. This study compared levels of understanding of sample size calculations for research studies between participants attending a lecture only versus lecture combined with using a smartphone application to calculate sample sizes, to explore factors affecting level of post-test score after training sample size calculation, and to investigate participants’ attitude toward a sample size application. A cluster-randomized controlled trial involving a number of health institutes in Thailand was carried out from October 2014 to March 2015. A total of 673 professional participants were enrolled and randomly allocated to one of two groups, namely, 341 participants in 10 workshops to control group and 332 participants in 9 workshops to intervention group. Lectures on sample size calculation were given in the control group, while lectures using a smartphone application were supplied to the test group. Participants in the intervention group had better learning of sample size calculation (2.7 points out of maximum 10 points, 95% CI: 2.4 - 2.9) than the participants in the control group (1.6 points, 95% CI: 1.4 - 1.8). Participants doing research projects had a higher post-test score than those who did not have a plan to conduct research projects (0.9 point, 95% CI: 0.5 - 1.4). The majority of the participants had a positive attitude towards the use of smartphone application for learning sample size calculation.
The SDSS-IV MaNGA Sample: Design, Optimization, and Usage Considerations
NASA Astrophysics Data System (ADS)
Wake, David A.; Bundy, Kevin; Diamond-Stanic, Aleksandar M.; Yan, Renbin; Blanton, Michael R.; Bershady, Matthew A.; Sánchez-Gallego, José R.; Drory, Niv; Jones, Amy; Kauffmann, Guinevere; Law, David R.; Li, Cheng; MacDonald, Nicholas; Masters, Karen; Thomas, Daniel; Tinker, Jeremy; Weijmans, Anne-Marie; Brownstein, Joel R.
2017-09-01
We describe the sample design for the SDSS-IV MaNGA survey and present the final properties of the main samples along with important considerations for using these samples for science. Our target selection criteria were developed while simultaneously optimizing the size distribution of the MaNGA integral field units (IFUs), the IFU allocation strategy, and the target density to produce a survey defined in terms of maximizing signal-to-noise ratio, spatial resolution, and sample size. Our selection strategy makes use of redshift limits that only depend on I-band absolute magnitude ($M_I$), or, for a small subset of our sample, $M_I$ and color (NUV - I). Such a strategy ensures that all galaxies span the same range in angular size irrespective of luminosity and are therefore covered evenly by the adopted range of IFU sizes. We define three samples: the Primary and Secondary samples are selected to have a flat number density with respect to $M_I$ and are targeted to have spectroscopic coverage to 1.5 and 2.5 effective radii ($R_e$), respectively. The Color-Enhanced supplement increases the number of galaxies in the low-density regions of color-magnitude space by extending the redshift limits of the Primary sample in the appropriate color bins. The samples cover the stellar mass range $5\times10^{8} \leq M_* \leq 3\times10^{11}\,M_\odot\,h^{-2}$ and are sampled at median physical resolutions of 1.37 and 2.5 kpc for the Primary and Secondary samples, respectively. We provide weights that will statistically correct for our luminosity and color-dependent selection function and IFU allocation strategy, thus correcting the observed sample to a volume-limited sample.
Effect of finite sample size on feature selection and classification: a simulation study.
Way, Ted W; Sahiner, Berkman; Hadjiiski, Lubomir M; Chan, Heang-Ping
2010-02-01
The small number of samples available for training and testing is often the limiting factor in finding the most effective features and designing an optimal computer-aided diagnosis (CAD) system. Training on a limited set of samples introduces bias and variance in the performance of a CAD system relative to that trained with an infinite sample size. In this work, the authors conducted a simulation study to evaluate the performances of various combinations of classifiers and feature selection techniques and their dependence on the class distribution, dimensionality, and the training sample size. The understanding of these relationships will facilitate development of effective CAD systems under the constraint of limited available samples. Three feature selection techniques, the stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and support vector machine (SVM), were investigated. Samples were drawn from multidimensional feature spaces of multivariate Gaussian distributions with equal or unequal covariance matrices and unequal means, and with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve Az. The mean Az values obtained by resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was chosen to be 50, 100, and 200. It was found that the relative performance of the different combinations of classifier and feature selection method depends on the feature space distributions, the dimensionality, and the available training sample sizes. 
The LDA and SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for other conditions. PCA was comparable to or better than SFS and SFFS for LDA at small samples sizes, but inferior for SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples were available. None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.
Thompson, Jennifer A; Fielding, Katherine; Hargreaves, James; Copas, Andrew
2017-12-01
Background/Aims We sought to optimise the design of stepped wedge trials with an equal allocation of clusters to sequences and explored sample size comparisons with alternative trial designs. Methods We developed a new expression for the design effect for a stepped wedge trial, assuming that observations are equally correlated within clusters and an equal number of observations in each period between sequences switching to the intervention. We minimised the design effect with respect to (1) the fraction of observations before the first and after the final sequence switches (the periods with all clusters in the control or intervention condition, respectively) and (2) the number of sequences. We compared the design effect of this optimised stepped wedge trial to the design effects of a parallel cluster-randomised trial, a cluster-randomised trial with baseline observations, and a hybrid trial design (a mixture of cluster-randomised trial and stepped wedge trial) with the same total cluster size for all designs. Results We found that a stepped wedge trial with an equal allocation to sequences is optimised by obtaining all observations after the first sequence switches and before the final sequence switches to the intervention; this means that the first sequence remains in the control condition and the last sequence remains in the intervention condition for the duration of the trial. With this design, the optimal number of sequences is [Formula: see text], where [Formula: see text] is the cluster-mean correlation, [Formula: see text] is the intracluster correlation coefficient, and m is the total cluster size. The optimal number of sequences is small when the intracluster correlation coefficient and cluster size are small and large when the intracluster correlation coefficient or cluster size is large. A cluster-randomised trial remains more efficient than the optimised stepped wedge trial when the intracluster correlation coefficient or cluster size is small. 
A cluster-randomised trial with baseline observations always requires a larger sample size than the optimised stepped wedge trial. The hybrid design can always give an equally or more efficient design, but will be at most 5% more efficient. We provide a strategy for selecting a design if the optimal number of sequences is unfeasible. For a non-optimal number of sequences, the sample size may be reduced by allowing a proportion of observations before the first or after the final sequence has switched. Conclusion The standard stepped wedge trial is inefficient. To reduce sample sizes when a hybrid design is unfeasible, stepped wedge trial designs should have no observations before the first sequence switches or after the final sequence switches.
Required sample size for monitoring stand dynamics in strict forest reserves: a case study
Diego Van Den Meersschaut; Bart De Cuyper; Kris Vandekerkhove; Noel Lust
2000-01-01
Stand dynamics in European strict forest reserves are commonly monitored using inventory densities of 5 to 15 percent of the total surface. The assumption that these densities guarantee a representative image of certain parameters is critically analyzed in a case study for the parameters basal area and stem number. The required sample sizes for different accuracy and...
NASA Astrophysics Data System (ADS)
Gnanasaravanan, S.; Rajkumar, P.
2013-05-01
The present study investigates the characterization of minerals in the River Sand (R - Sand) and the Manufactured sand (M-Sand) through FTIR spectroscopic studies. The R - Sand is collected from seven different locations in Cauvery River and M - Sand is collected from eight different manufactures around the Cauvery River belt in Salem, Erode, Tirupur and Namakkal districts of Tamilnadu, India. To extend the effectiveness of the analysis, the samples were subjected to grain size separation to classify the bulk samples into different grain sizes. All the samples were analyzed using FTIR spectrometer. The number of minerals identified with the help of FTIR spectra in overall (bulk) samples of R - Sand is 14 and of M - Sand is 13. The number has been increased while going for grain size separation, i.e., from 14 to 31 for R - Sand and from 13 to 20 for M - Sand. Among all minerals, quartz plays a major role. The relative distribution and the crystallinity nature of quartz have been discussed based on the extinction co-efficient and the crystallinity index values computed. There is no major variation found in M - Sand while going for grain size separation.
Sample Size Methods for Estimating HIV Incidence from Cross-Sectional Surveys
Brookmeyer, Ron
2015-01-01
Summary Understanding HIV incidence, the rate at which new infections occur in populations, is critical for tracking and surveillance of the epidemic. In this paper we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power. In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate. Our methods account for uncertainty in the durations of time spent in the biomarker defined early disease stage. We find that failure to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies. We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics. Code for implementing our methods in R is available with this paper at the Biometrics website on Wiley Online Library. PMID:26302040
Sample size methods for estimating HIV incidence from cross-sectional surveys.
Konikoff, Jacob; Brookmeyer, Ron
2015-12-01
Understanding HIV incidence, the rate at which new infections occur in populations, is critical for tracking and surveillance of the epidemic. In this article, we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power. In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate. Our methods account for uncertainty in the durations of time spent in the biomarker defined early disease stage. We find that failure to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies. We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics. Code for implementing our methods in R is available with this article at the Biometrics website on Wiley Online Library. © 2015, The International Biometric Society.
In situ measurement of particulate number density and size distribution from an aircraft
NASA Technical Reports Server (NTRS)
Briehl, D.
1974-01-01
Commercial particulate measuring instruments were flown aboard the NASA Convair 990. A condensation nuclei monitor was utilized to measure particles larger than approximately 0.003 micrometers in diameter. A specially designed pressurization system was used with this counter so that the sample could be fed into the monitor at cabin altitude pressure. A near-forward light scattering counter was used to measure the number and size distribution particles in the size range from 0.5 to 5 micrometers and greater in diameter.
Phenotypic constraints promote latent versatility and carbon efficiency in metabolic networks.
Bardoscia, Marco; Marsili, Matteo; Samal, Areejit
2015-07-01
System-level properties of metabolic networks may be the direct product of natural selection or arise as a by-product of selection on other properties. Here we study the effect of direct selective pressure for growth or viability in particular environments on two properties of metabolic networks: latent versatility to function in additional environments and carbon usage efficiency. Using a Markov chain Monte Carlo (MCMC) sampling based on flux balance analysis (FBA), we sample from a known biochemical universe random viable metabolic networks that differ in the number of directly constrained environments. We find that the latent versatility of sampled metabolic networks increases with the number of directly constrained environments and with the size of the networks. We then show that the average carbon wastage of sampled metabolic networks across the constrained environments decreases with the number of directly constrained environments and with the size of the networks. Our work expands the growing body of evidence about nonadaptive origins of key functional properties of biological networks.
Moerbeek, Mirjam
2018-01-01
Background This article studies the design of trials that compare three treatment conditions that are delivered by two types of health professionals. The one type of health professional delivers one treatment, and the other type delivers two treatments, hence, this design is a combination of a nested and crossed design. As each health professional treats multiple patients, the data have a nested structure. This nested structure has thus far been ignored in the design of such trials, which may result in an underestimate of the required sample size. In the design stage, the sample sizes should be determined such that a desired power is achieved for each of the three pairwise comparisons, while keeping costs or sample size at a minimum. Methods The statistical model that relates outcome to treatment condition and explicitly takes the nested data structure into account is presented. Mathematical expressions that relate sample size to power are derived for each of the three pairwise comparisons on the basis of this model. The cost-efficient design achieves sufficient power for each pairwise comparison at lowest costs. Alternatively, one may minimize the total number of patients. The sample sizes are found numerically and an Internet application is available for this purpose. The design is also compared to a nested design in which each health professional delivers just one treatment. Results Mathematical expressions show that this design is more efficient than the nested design. For each pairwise comparison, power increases with the number of health professionals and the number of patients per health professional. The methodology of finding a cost-efficient design is illustrated using a trial that compares treatments for social phobia. The optimal sample sizes reflect the costs for training and supervising psychologists and psychiatrists, and the patient-level costs in the three treatment conditions. 
Conclusion This article provides the methodology for designing trials that compare three treatment conditions while taking the nesting of patients within health professionals into account. As such, it helps to avoid underpowered trials. To use the methodology, a priori estimates of the total outcome variances and intraclass correlation coefficients must be obtained from experts’ opinions or findings in the literature. PMID:29316807
Support vector regression to predict porosity and permeability: Effect of sample size
NASA Astrophysics Data System (ADS)
Al-Anazi, A. F.; Gates, I. D.
2012-02-01
Porosity and permeability are key petrophysical parameters obtained from laboratory core analysis. Cores, obtained from drilled wells, are often few in number for most oil and gas fields. Porosity and permeability correlations based on conventional techniques such as linear regression or neural networks trained with core and geophysical logs suffer poor generalization to wells with only geophysical logs. The generalization problem of correlation models often becomes pronounced when the training sample size is small. This is attributed to the underlying assumption that conventional techniques employing the empirical risk minimization (ERM) inductive principle converge asymptotically to the true risk values as the number of samples increases. In small sample size estimation problems, the available training samples must span the complexity of the parameter space so that the model is able both to match the available training samples reasonably well and to generalize to new data. This is achieved using the structural risk minimization (SRM) inductive principle by matching the capability of the model to the available training data. One method that uses SRM is the support vector regression (SVR) network. In this research, the capability of SVR to predict porosity and permeability in a heterogeneous sandstone reservoir under the effect of small sample size is evaluated. In particular, the impact of Vapnik's ɛ-insensitivity loss function and least-modulus loss function on generalization performance was empirically investigated. The results are compared to the multilayer perceptron (MLP) neural network, a widely used regression method, which operates under the ERM principle. The mean square error and correlation coefficients were used to measure the quality of predictions. The results demonstrate that SVR yields consistently better predictions of the porosity and permeability with small sample size than the MLP method. 
Also, the performance of SVR depends on both kernel function type and loss functions used.
Lee, Paul H; Tse, Andy C Y
2017-05-01
There are limited data on the quality of reporting of information essential for replication of the calculation as well as the accuracy of the sample size calculation. We examined the current quality of reporting of the sample size calculation in randomized controlled trials (RCTs) published in PubMed and the variation in reporting across study design, study characteristics, and journal impact factor. We also reviewed the targeted sample size reported in trial registries. We reviewed and analyzed all RCTs published in December 2014 with journals indexed in PubMed. The 2014 Impact Factors for the journals were used as proxies for their quality. Of the 451 analyzed papers, 58.1% reported an a priori sample size calculation. Nearly all papers provided the level of significance (97.7%) and desired power (96.6%), and most of the papers reported the minimum clinically important effect size (73.3%). The median (inter-quartile range) of the percentage difference between the reported and calculated sample sizes was 0.0% (IQR -4.6%;3.0%). The accuracy of the reported sample size was better for studies published in journals that endorsed the CONSORT statement and journals with an impact factor. A total of 98 papers had provided targeted sample size on trial registries and about two-thirds of these papers (n=62) reported sample size calculation, but only 25 (40.3%) had no discrepancy with the reported number in the trial registries. The reporting of the sample size calculation in RCTs published in PubMed-indexed journals and trial registries was poor. The CONSORT statement should be more widely endorsed. Copyright © 2016 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
Kondrashova, Olga; Love, Clare J.; Lunke, Sebastian; Hsu, Arthur L.; Waring, Paul M.; Taylor, Graham R.
2015-01-01
Whilst next generation sequencing can report point mutations in fixed tissue tumour samples reliably, the accurate determination of copy number is more challenging. The conventional Multiplex Ligation-dependent Probe Amplification (MLPA) assay is an effective tool for measurement of gene dosage, but is restricted to around 50 targets due to size resolution of the MLPA probes. By switching from a size-resolved format, to a sequence-resolved format we developed a scalable, high-throughput, quantitative assay. MLPA-seq is capable of detecting deletions, duplications, and amplifications in as little as 5ng of genomic DNA, including from formalin-fixed paraffin-embedded (FFPE) tumour samples. We show that this method can detect BRCA1, BRCA2, ERBB2 and CCNE1 copy number changes in DNA extracted from snap-frozen and FFPE tumour tissue, with 100% sensitivity and >99.5% specificity. PMID:26569395
A Monte Carlo Program for Simulating Selection Decisions from Personnel Tests
ERIC Educational Resources Information Center
Petersen, Calvin R.; Thain, John W.
1976-01-01
Relative to test and criterion parameters and cutting scores, the correlation coefficient, sample size, and number of samples to be drawn (all inputs), this program calculates decision classification rates across samples and for combined samples. Several other related indices are also computed. (Author)
Hancock, Bruno C; Ketterhagen, William R
2011-10-14
Discrete element model (DEM) simulations of the discharge of powders from hoppers under gravity were analyzed to provide estimates of dosage form content uniformity during the manufacture of solid dosage forms (tablets and capsules). For a system that exhibits moderate segregation the effects of sample size, number, and location within the batch were determined. The various sampling approaches were compared to current best-practices for sampling described in the Product Quality Research Institute (PQRI) Blend Uniformity Working Group (BUWG) guidelines. Sampling uniformly across the discharge process gave the most accurate results with respect to identifying segregation trends. Sigmoidal sampling (as recommended in the PQRI BUWG guidelines) tended to overestimate potential segregation issues, whereas truncated sampling (common in industrial practice) tended to underestimate them. The size of the sample had a major effect on the absolute potency RSD. The number of sampling locations (10 vs. 20) had very little effect on the trends in the data, and the number of samples analyzed at each location (1 vs. 3 vs. 7) had only a small effect for the sampling conditions examined. The results of this work provide greater understanding of the effect of different sampling approaches on the measured content uniformity of real dosage forms, and can help to guide the choice of appropriate sampling protocols. Copyright © 2011 Elsevier B.V. All rights reserved.
Urey, Carlos; Weiss, Victor U; Gondikas, Andreas; von der Kammer, Frank; Hofmann, Thilo; Marchetti-Deschmann, Martina; Allmaier, Günter; Marko-Varga, György; Andersson, Roland
2016-11-20
For drug delivery, characterization of liposomes regarding size, particle number concentrations, occurrence of low-sized liposome artefacts and drug encapsulation are of importance to understand their pharmacodynamic properties. In our study, we aimed to demonstrate the applicability of nano Electrospray Gas-Phase Electrophoretic Mobility Molecular Analyser (nES GEMMA) as a suitable technique for analyzing these parameters. We measured number-based particle concentrations, identified differences in size between nominally identical liposomal samples, and detected the presence of low-diameter material which yielded bimodal particle size distributions. Subsequently, we compared these findings to dynamic light scattering (DLS) data and results from light scattering experiments coupled to Asymmetric Flow-Field Flow Fractionation (AF4), the latter improving the detectability of smaller particles in polydisperse samples due to a size separation step prior to detection. However, the bimodal size distribution could not be detected due to method inherent limitations. In contrast, cryo transmission electron microscopy corroborated nES GEMMA results. Hence, gas-phase electrophoresis proved to be a versatile tool for liposome characterization as it could analyze both vesicle size and size distribution. Finally, a correlation of nES GEMMA results with cell viability experiments was carried out to demonstrate the importance of liposome batch-to-batch control as low-sized sample components possibly impact cell viability. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Barkhofen, Sonja; Bartley, Tim J; Sansoni, Linda; Kruse, Regina; Hamilton, Craig S; Jex, Igor; Silberhorn, Christine
2017-01-13
Sampling the distribution of bosons that have undergone a random unitary evolution is strongly believed to be a computationally hard problem. Key to outperforming classical simulations of this task is to increase both the number of input photons and the size of the network. We propose driven boson sampling, in which photons are input within the network itself, as a means to approach this goal. We show that the mean number of photons entering a boson sampling experiment can exceed one photon per input mode, while maintaining the required complexity, potentially leading to less stringent requirements on the input states for such experiments. When using heralded single-photon sources based on parametric down-conversion, this approach offers an ∼e-fold enhancement in the input state generation rate over scattershot boson sampling, reaching the scaling limit for such sources. This approach also offers a dramatic increase in the signal-to-noise ratio with respect to higher-order photon generation from such probabilistic sources, which removes the need for photon number resolution during the heralding process as the size of the system increases.
Designing a two-rank acceptance sampling plan for quality inspection of geospatial data products
NASA Astrophysics Data System (ADS)
Tong, Xiaohua; Wang, Zhenhua; Xie, Huan; Liang, Dan; Jiang, Zuoqin; Li, Jinchao; Li, Jun
2011-10-01
To address the disadvantages of classical sampling plans designed for traditional industrial products, we propose a two-rank acceptance sampling plan (TRASP) for the inspection of geospatial data outputs based on the acceptance quality level (AQL). The first rank sampling plan is to inspect the lot consisting of map sheets, and the second is to inspect the lot consisting of features in an individual map sheet. The TRASP design is formulated as an optimization problem with respect to sample size and acceptance number, which covers two lot size cases. The first case is for a small lot size with nonconformities being modeled by a hypergeometric distribution function, and the second is for a larger lot size with nonconformities being modeled by a Poisson distribution function. The proposed TRASP is illustrated through two empirical case studies. Our analysis demonstrates that: (1) the proposed TRASP provides a general approach for quality inspection of geospatial data outputs consisting of non-uniform items and (2) the proposed acceptance sampling plan based on TRASP performs better than other classical sampling plans. It overcomes the drawbacks of percent sampling, i.e., "strictness for large lot size, toleration for small lot size," and those of a national standard used specifically for industrial outputs, i.e., "lots with different sizes corresponding to the same sampling plan."
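The two lot-size cases in the abstract above can be illustrated with the textbook acceptance probability of a single sampling plan (sample size n, acceptance number c). This is only a minimal sketch of the hypergeometric and Poisson models the abstract names, not the authors' TRASP optimization; the function names and the example numbers are hypothetical.

```python
from math import comb, exp, factorial

def accept_prob_hypergeom(lot_size, nonconforming, n, c):
    """P(accept) = P(X <= c) when X counts nonconforming items in a
    sample of n drawn without replacement from a small lot."""
    total = comb(lot_size, n)
    ways = sum(
        comb(nonconforming, k) * comb(lot_size - nonconforming, n - k)
        for k in range(min(c, n) + 1)
    )
    return ways / total

def accept_prob_poisson(n, p, c):
    """P(accept) = P(X <= c) with X ~ Poisson(n * p), the usual
    approximation for a large lot with nonconformity rate p."""
    lam = n * p
    return sum(exp(-lam) * lam ** k / factorial(k) for k in range(c + 1))

# Hypothetical example: lot of 10 map sheets, 2 nonconforming, inspect 3, accept on 0.
small_lot = accept_prob_hypergeom(10, 2, 3, 0)
# Hypothetical example: 50 features inspected, 2% nonconformity rate, accept on 0.
large_lot = accept_prob_poisson(50, 0.02, 0)
```

Scanning such probabilities over candidate (n, c) pairs is one common way to search for a plan that meets a given AQL; the constraints used in the paper itself are not stated in the abstract.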
Sample size for estimating mean and coefficient of variation in species of crotalarias.
Toebe, Marcos; Machado, Letícia N; Tartaglia, Francieli L; Carvalho, Juliana O DE; Bandeira, Cirineu T; Cargnelutti Filho, Alberto
2018-04-16
The objective of this study was to determine the sample size necessary to estimate the mean and coefficient of variation in four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca). An experiment was carried out for each species during the season 2014/15. At harvest, 1,000 pods of each species were randomly collected. In each pod were measured: mass of pod with and without seeds, length, width and height of pods, number and mass of seeds per pod, and mass of hundred seeds. Measures of central tendency, variability and distribution were calculated, and the normality was verified. The sample size necessary to estimate the mean and coefficient of variation with amplitudes of the confidence interval of 95% (ACI95%) of 2%, 4%, ..., 20% was determined by resampling with replacement. The sample size varies among species and characters, and a larger sample size is needed to estimate the mean than to estimate the coefficient of variation.
Measures of precision for dissimilarity-based multivariate analysis of ecological communities
Anderson, Marti J; Santana-Garcon, Julia
2015-01-01
Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. PMID:25438826
Smith, Philip L; Lilburn, Simon D; Corbett, Elaine A; Sewell, David K; Kyllingsbæk, Søren
2016-09-01
We investigated the capacity of visual short-term memory (VSTM) in a phase discrimination task that required judgments about the configural relations between pairs of black and white features. Sewell et al. (2014) previously showed that VSTM capacity in an orientation discrimination task was well described by a sample-size model, which views VSTM as a resource comprised of a finite number of noisy stimulus samples. The model predicts the invariance of [Formula: see text] , the sum of squared sensitivities across items, for displays of different sizes. For phase discrimination, the set-size effect significantly exceeded that predicted by the sample-size model for both simultaneously and sequentially presented stimuli. Instead, the set-size effect and the serial position curves with sequential presentation were predicted by an attention-weighted version of the sample-size model, which assumes that one of the items in the display captures attention and receives a disproportionate share of resources. The choice probabilities and response time distributions from the task were well described by a diffusion decision model in which the drift rates embodied the assumptions of the attention-weighted sample-size model. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Kim, Soyoung; Olejnik, Stephen
2005-01-01
The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…
Okada, Kensuke; Hoshino, Takahiro
2017-04-01
In psychology, the reporting of variance-accounted-for effect size indices has been recommended and widely accepted through the movement away from null hypothesis significance testing. However, most researchers have paid insufficient attention to the fact that effect sizes depend on the choice of the number of levels and their ranges in experiments. Moreover, the functional form of how and how much this choice affects the resultant effect size has not thus far been studied. We show that the relationship between the population effect size and number and range of levels is given as an explicit function under reasonable assumptions. Counterintuitively, it is found that researchers may affect the resultant effect size to be either double or half simply by suitably choosing the number of levels and their ranges. Through a simulation study, we confirm that this relation also applies to sample effect size indices in much the same way. Therefore, the variance-accounted-for effect size would be substantially affected by the basic research design such as the number of levels. Simple cross-study comparisons and a meta-analysis of variance-accounted-for effect sizes would generally be irrational unless differences in research designs are explicitly considered.
Sample allocation balancing overall representativeness and stratum precision.
Diaz-Quijano, Fredi Alexander
2018-05-07
In large-scale surveys, it is often necessary to distribute a preset sample size among a number of strata. Researchers must make a decision between prioritizing overall representativeness or precision of stratum estimates. Hence, I evaluated different sample allocation strategies based on stratum size. The strategies evaluated herein included allocation proportional to stratum population; equal sample for all strata; and proportional to the natural logarithm, cubic root, and square root of the stratum population. This study considered the fact that, from a preset sample size, the dispersion index of stratum sampling fractions is correlated with the population estimator error and the dispersion index of stratum-specific sampling errors would measure the inequality in precision distribution. Identification of a balanced and efficient strategy was based on comparing both of those dispersion indices. Balance and efficiency of the strategies changed depending on overall sample size. As the sample to be distributed increased, the most efficient allocation strategies were equal sample for each stratum; proportional to the logarithm, to the cubic root, to the square root; and proportional to the stratum population, respectively. Depending on sample size, each of the strategies evaluated could be considered in optimizing the sample to keep both overall representativeness and stratum-specific precision. Copyright © 2018 Elsevier Inc. All rights reserved.
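The five allocation rules listed in the abstract above differ only in the weight given to each stratum population. The following is a minimal sketch under that reading; the function name, the strategy labels and the example populations are hypothetical, and the dispersion indices used in the study to compare strategies are not reproduced here.

```python
import math

# Weight applied to each stratum population under each strategy.
STRATEGIES = {
    "proportional": lambda pop: float(pop),
    "equal": lambda pop: 1.0,
    "log": lambda pop: math.log(pop),          # natural logarithm
    "cube_root": lambda pop: pop ** (1.0 / 3.0),
    "square_root": lambda pop: math.sqrt(pop),
}

def allocate(total_sample, stratum_pops, strategy):
    """Split total_sample across strata in proportion to the chosen weights.
    Returns fractional sizes; rounding to integers is left to the caller."""
    weights = [STRATEGIES[strategy](pop) for pop in stratum_pops]
    weight_sum = sum(weights)
    return [total_sample * w / weight_sum for w in weights]

# Hypothetical strata of 100 and 400 people sharing a sample of 100.
by_population = allocate(100, [100, 400], "proportional")  # 20 and 80
by_sqrt = allocate(100, [100, 400], "square_root")         # about 33.3 and 66.7
by_equal = allocate(100, [100, 400], "equal")              # 50 and 50
```

Moving from proportional toward equal allocation shifts sample from large strata to small ones, which is the trade-off between overall representativeness and stratum-level precision that the abstract describes.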
Study on effect of microparticle's size on cavitation erosion in solid-liquid system
NASA Astrophysics Data System (ADS)
Chen, Haosheng; Liu, Shihan; Wang, Jiadao; Chen, Darong
2007-05-01
Five different solutions containing microparticles of different sizes were tested in a vibration cavitation erosion experiment. After the experiment, the number of erosion pits on sample surfaces, free radicals HO• in solutions, and mass loss all show that the cavitation erosion strength is strongly related to the particle size, and 500nm particles cause more severe cavitation erosion than other smaller or larger particles do. A model is presented to explain this result considering both nucleation and bubble-particle collision effects. Particles of a proper size will increase the number of heterogeneous nucleation and at the same time reduce the number of bubble-particle combinations, which results in more free bubbles in the solution to generate stronger cavitation erosion.
An Investigation of the Sampling Distribution of the Congruence Coefficient.
ERIC Educational Resources Information Center
Broadbooks, Wendy J.; Elmore, Patricia B.
This study developed and investigated an empirical sampling distribution of the congruence coefficient. The effects of sample size, number of variables, and population value of the congruence coefficient on the sampling distribution of the congruence coefficient were examined. Sample data were generated on the basis of the common factor model and…
Threshold-dependent sample sizes for selenium assessment with stream fish tissue
Hitt, Nathaniel P.; Smith, David R.
2015-01-01
Natural resource managers are developing assessments of selenium (Se) contamination in freshwater ecosystems based on fish tissue concentrations. We evaluated the effects of sample size (i.e., number of fish per site) on the probability of correctly detecting mean whole-body Se values above a range of potential management thresholds. We modeled Se concentrations as gamma distributions with shape and scale parameters fitting an empirical mean-to-variance relationship in data from southwestern West Virginia, USA (63 collections, 382 individuals). We used parametric bootstrapping techniques to calculate statistical power as the probability of detecting true mean concentrations up to 3 mg Se/kg above management thresholds ranging from 4 to 8 mg Se/kg. Sample sizes required to achieve 80% power varied as a function of management thresholds and Type I error tolerance (α). Higher thresholds required more samples than lower thresholds because populations were more heterogeneous at higher mean Se levels. For instance, to assess a management threshold of 4 mg Se/kg, a sample of eight fish could detect an increase of approximately 1 mg Se/kg with 80% power (given α = 0.05), but this sample size would be unable to detect such an increase from a management threshold of 8 mg Se/kg with more than a coin-flip probability. Increasing α decreased sample size requirements to detect above-threshold mean Se concentrations with 80% power. For instance, at an α-level of 0.05, an 8-fish sample could detect an increase of approximately 2 units above a threshold of 8 mg Se/kg with 80% power, but when α was relaxed to 0.2, this sample size was more sensitive to increasing mean Se concentrations, allowing detection of an increase of approximately 1.2 units with equivalent power. 
Combining individuals into 2- and 4-fish composite samples for laboratory analysis did not decrease power because the reduced number of laboratory samples was compensated for by increased precision of composites for estimating mean conditions. However, low sample sizes (<5 fish) did not achieve 80% power to detect near-threshold values (i.e., <1 mg Se/kg) under any scenario we evaluated. This analysis can assist the sampling design and interpretation of Se assessments from fish tissue by accounting for natural variation in stream fish populations.
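The power calculation described in the abstract above can be sketched as a parametric bootstrap on gamma-distributed concentrations. Everything specific here is an assumption made for illustration: the mean-to-variance rule (a constant coefficient of variation of 0.5), the one-sided decision rule based on the sample mean, and all function names. The study fitted its own empirical mean-to-variance relationship, which the abstract does not give.

```python
import random

def cv_half(mean):
    """Hypothetical mean-to-variance rule: standard deviation = 0.5 * mean."""
    return (0.5 * mean) ** 2

def gamma_sample_mean(rng, mean, var, n):
    """Mean of n gamma draws with the requested mean and variance."""
    shape = mean * mean / var   # gamma mean = shape * scale
    scale = var / mean          # gamma variance = shape * scale**2
    return sum(rng.gammavariate(shape, scale) for _ in range(n)) / n

def bootstrap_power(threshold, true_mean, n, var_of_mean,
                    alpha=0.05, sims=2000, seed=1):
    """Estimate the probability that a sample of n fish signals a mean
    above the threshold when the true mean is true_mean."""
    rng = random.Random(seed)
    # Null distribution: sample means when the true mean equals the threshold.
    null = sorted(
        gamma_sample_mean(rng, threshold, var_of_mean(threshold), n)
        for _ in range(sims)
    )
    critical = null[min(int((1 - alpha) * sims), sims - 1)]
    # Alternative: count how often the sample mean exceeds the critical value.
    hits = sum(
        gamma_sample_mean(rng, true_mean, var_of_mean(true_mean), n) > critical
        for _ in range(sims)
    )
    return hits / sims

# Hypothetical usage: 8 fish, threshold 4 mg Se/kg, true mean 5 mg Se/kg.
power = bootstrap_power(4.0, 5.0, 8, cv_half)
```

Raising alpha lowers the critical value and so raises the estimated power, which mirrors the direction of the effect reported in the abstract.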
The impact of multiple endpoint dependency on Q and I² in meta-analysis.
Thompson, Christopher Glen; Becker, Betsy Jane
2014-09-01
A common assumption in meta-analysis is that effect sizes are independent. When correlated effect sizes are analyzed using traditional univariate techniques, this assumption is violated. This research assesses the impact of dependence arising from treatment-control studies with multiple endpoints on homogeneity measures Q and I² in scenarios using the unbiased standardized-mean-difference effect size. Univariate and multivariate meta-analysis methods are examined. Conditions included different overall outcome effects, study sample sizes, numbers of studies, between-outcomes correlations, dependency structures, and ways of computing the correlation. The univariate approach used typical fixed-effects analyses whereas the multivariate approach used generalized least-squares (GLS) estimates of a fixed-effects model, weighted by the inverse variance-covariance matrix. Increased dependence among effect sizes led to increased Type I error rates from univariate models. When effect sizes were strongly dependent, error rates were drastically higher than nominal levels regardless of study sample size and number of studies. In contrast, using GLS estimation to account for multiple-endpoint dependency maintained error rates within nominal levels. Conversely, mean I² values were not greatly affected by increased amounts of dependency. Last, we point out that the between-outcomes correlation should be estimated as a pooled within-groups correlation rather than using a full-sample estimator that does not consider treatment/control group membership. Copyright © 2014 John Wiley & Sons, Ltd.
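For readers unfamiliar with the two homogeneity measures, the standard univariate fixed-effects definitions can be sketched as below. The function name and example numbers are illustrative assumptions; this is the independence-assuming calculation that the abstract warns about, not the GLS alternative.

```python
def q_and_i2(effects, variances):
    """Fixed-effects pooled estimate, Cochran's Q, and I-squared (in percent).

    Treats the effect sizes as independent, i.e. the univariate approach.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of Q in excess of its degrees of freedom, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


if __name__ == "__main__":
    # Hypothetical standardized mean differences with equal sampling variances.
    print(q_and_i2([0.1, 0.5, 0.9], [0.04, 0.04, 0.04]))
```

With these made-up inputs the pooled effect is 0.5, Q is 8 on 2 degrees of freedom, and I² is 75%.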
Power calculation for overall hypothesis testing with high-dimensional commensurate outcomes.
Chi, Yueh-Yun; Gribbin, Matthew J; Johnson, Jacqueline L; Muller, Keith E
2014-02-28
The complexity of system biology means that any metabolic, genetic, or proteomic pathway typically includes so many components (e.g., molecules) that statistical methods specialized for overall testing of high-dimensional and commensurate outcomes are required. While many overall tests have been proposed, very few have power and sample size methods. We develop accurate power and sample size methods and software to facilitate study planning for high-dimensional pathway analysis. With an account of any complex correlation structure between high-dimensional outcomes, the new methods allow power calculation even when the sample size is less than the number of variables. We derive the exact (finite-sample) and approximate non-null distributions of the 'univariate' approach to repeated measures test statistic, as well as power-equivalent scenarios useful to generalize our numerical evaluations. Extensive simulations of group comparisons support the accuracy of the approximations even when the ratio of number of variables to sample size is large. We derive a minimum set of constants and parameters sufficient and practical for power calculation. Using the new methods and specifying the minimum set to determine power for a study of metabolic consequences of vitamin B6 deficiency helps illustrate the practical value of the new results. Free software implementing the power and sample size methods applies to a wide range of designs, including one group pre-intervention and post-intervention comparisons, multiple parallel group comparisons with one-way or factorial designs, and the adjustment and evaluation of covariate effects. Copyright © 2013 John Wiley & Sons, Ltd.
Zattoni, Andrea; Melucci, Dora; Reschiglian, Pierluigi; Sanz, Ramsés; Puignou, Lluís; Galceran, Maria Teresa
2004-10-29
Yeasts are widely used in several areas of the food industry, e.g. baking, beer brewing, and wine production. Interest in new analytical methods for quality control and characterization of yeast cells is thus increasing. The biophysical properties of yeast cells, including cell size, are related to the cells' capability to produce primary and secondary metabolites during the fermentation process. Biophysical properties of winemaking yeast strains can be screened by field-flow fractionation (FFF). In this work we present the use of flow FFF (FlFFF) with turbidimetric multi-wavelength detection for the number-size distribution analysis of different commercial winemaking yeast varieties. The use of a diode-array detector makes it possible to apply the recently developed method for number-size (or mass-size) analysis in flow-assisted separation techniques to dispersed samples such as yeast cells. Results for six commercial winemaking yeast strains are compared with data obtained by a standard method for cell sizing (Coulter counter). The proposed method gives, in a short analysis time, accurate information on the number of cells of a given size and on the total number of cells.
Allen, John C; Thumboo, Julian; Lye, Weng Kit; Conaghan, Philip G; Chew, Li-Ching; Tan, York Kiat
2018-03-01
To determine whether novel methods of selecting joints through (i) ultrasonography (individualized-ultrasound [IUS] method), or (ii) ultrasonography and clinical examination (individualized-composite-ultrasound [ICUS] method) translate into smaller rheumatoid arthritis (RA) clinical trial sample sizes when compared to existing methods utilizing predetermined joint sites for ultrasonography. Cohen's effect size (ES) was estimated (ES^) and a 95% CI (ES^L, ES^U) calculated on a mean change in 3-month total inflammatory score for each method. Corresponding 95% CIs [nL(ES^U), nU(ES^L)] were obtained on a post hoc sample size reflecting the uncertainty in ES^. Sample size calculations were based on a one-sample t-test as the patient numbers needed to provide 80% power at α = 0.05 to reject a null hypothesis H0: ES = 0 versus alternative hypotheses H1: ES = ES^, ES = ES^L and ES = ES^U. We aimed to provide point and interval estimates on projected sample sizes for future studies reflecting the uncertainty in our study ES^ estimates. Twenty-four treated RA patients were followed up for 3 months. Utilizing the 12-joint approach and existing methods, the post hoc sample size (95% CI) was 22 (10-245). Corresponding sample sizes using ICUS and IUS were 11 (7-40) and 11 (6-38), respectively. Utilizing a seven-joint approach, the corresponding sample sizes using ICUS and IUS methods were nine (6-24) and 11 (6-35), respectively. Our pilot study suggests that sample size for RA clinical trials with ultrasound endpoints may be reduced using the novel methods, providing justification for larger studies to confirm these observations. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
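The mapping from an effect size to a required sample size, and from an effect-size interval to a sample-size interval, can be illustrated with a rough sketch. It uses the normal approximation to the one-sample test rather than the exact t-based calculation the study reports, so it tends to give slightly smaller numbers; the two-sided test, the function name and the effect sizes in the demo are assumptions, not values from the study.

```python
import math
from statistics import NormalDist


def approx_n_one_sample(effect_size, alpha=0.05, power=0.80):
    """Approximate n to reject H0: ES = 0 with a two-sided one-sample test.

    Normal approximation: n = ((z_{1-alpha/2} + z_{power}) / ES) ** 2, rounded up.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical point estimate with lower and upper confidence limits.
    es_lower, es_hat, es_upper = 0.3, 0.6, 0.9
    # The upper ES limit gives the lower sample-size limit, and vice versa.
    print(approx_n_one_sample(es_hat),
          (approx_n_one_sample(es_upper), approx_n_one_sample(es_lower)))
```

Because n scales with 1/ES², a modest lower confidence limit on the effect size produces a very wide upper limit on the sample size, which is consistent with the asymmetric intervals quoted above.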
The use of mini-samples in palaeomagnetism
NASA Astrophysics Data System (ADS)
Böhnel, Harald; Michalk, Daniel; Nowaczyk, Norbert; Naranjo, Gildardo Gonzalez
2009-10-01
Rock cores of ~25 mm diameter are widely used in palaeomagnetism. Occasionally smaller diameters have been used as well, which offers distinct advantages in terms of throughput, weight of equipment and core collections. How their orientation precision compares to 25 mm cores, however, has not been evaluated in detail before. Here we compare the site mean directions and their statistical parameters for 12 lava flows sampled with 25 mm cores (standard samples, typically 8 cores per site) and with 12 mm drill cores (mini-samples, typically 14 cores per site). The site-mean directions for both sample sizes appear to be indistinguishable in most cases. For the mini-samples, site dispersion parameters k are on average slightly lower than for the standard samples, reflecting their larger orienting and measurement errors. Applying the Wilcoxon signed-rank test, the probability that k or α95 have the same distribution for both sizes is acceptable only at the 17.4 or 66.3 per cent level, respectively. The larger number of mini-cores per site appears to outweigh the lower k values, also yielding slightly smaller confidence limits α95. Further, both k and α95 are less variable for mini-samples than for standard size samples. This is also interpreted to result from the larger number of mini-samples per site, which better averages out the detrimental effect of undetected abnormal remanence directions. Sampling of volcanic rocks with mini-samples therefore does not present a disadvantage in terms of the overall obtainable uncertainty of site mean directions. Apart from this, mini-samples do present clear advantages during the field work, as about twice the number of drill cores can be recovered compared to 25 mm cores, and the sampled rock unit is then more widely covered, which reduces the contribution of natural random errors produced, for example, by fractures, cooling joints, and palaeofield inhomogeneities. 
Mini-samples may be processed faster in the laboratory, which is of particular advantage when carrying out palaeointensity experiments.
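The dispersion parameter k and confidence limit α95 discussed in this abstract are conventionally computed with Fisher statistics. The sketch below uses the standard textbook estimators, k = (N - 1)/(N - R) and the usual α95 expression; whether the authors used exactly these estimators is an assumption, and the directions in the demo are invented.

```python
import math


def fisher_stats(directions, p=0.05):
    """Mean direction, precision k and confidence cone from (declination, inclination) pairs.

    Angles are in degrees. Requires at least two directions that are not all identical.
    """
    n = len(directions)
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)
        y += math.cos(i) * math.sin(d)
        z += math.sin(i)
    r = math.sqrt(x * x + y * y + z * z)  # length of the resultant vector
    if n < 2 or n - r <= 0:
        raise ValueError("need at least two non-identical directions")
    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.asin(z / r))
    k = (n - 1) / (n - r)
    cos_alpha = 1 - (n - r) / r * ((1 / p) ** (1 / (n - 1)) - 1)
    alpha95 = math.degrees(math.acos(max(-1.0, cos_alpha)))
    return mean_dec, mean_inc, k, alpha95


if __name__ == "__main__":
    # Hypothetical core directions for one site.
    print(fisher_stats([(0, 40), (0, 50), (10, 45), (350, 45)]))
```

For a fixed k, α95 shrinks as N grows, which is the mechanism by which a larger number of mini-cores per site can offset their somewhat lower k.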
McClure, Foster D; Lee, Jung K
2005-01-01
Sample size formulas are developed to estimate the repeatability and reproducibility standard deviations (Sr and SR) such that the actual errors in Sr and SR relative to their respective true values, σr and σR, are at predefined levels. The statistical consequences associated with the AOAC INTERNATIONAL required sample size to validate an analytical method are discussed. In addition, formulas to estimate the uncertainties of Sr and SR were derived and are provided as supporting documentation. Formula for the Number of Replicates Required for a Specified Margin of Relative Error in the Estimate of the Repeatability Standard Deviation.
Sample size considerations for clinical research studies in nuclear cardiology.
Chiuzan, Cody; West, Erin A; Duong, Jimmy; Cheung, Ken Y K; Einstein, Andrew J
2015-12-01
Sample size calculation is an important element of research design that investigators need to consider in the planning stage of the study. Funding agencies and research review panels request a power analysis, for example, to determine the minimum number of subjects needed for an experiment to be informative. Calculating the right sample size is crucial to gaining accurate information and ensures that research resources are used efficiently and ethically. The simple question "How many subjects do I need?" does not always have a simple answer. Before calculating the sample size requirements, a researcher must address several aspects, such as purpose of the research (descriptive or comparative), type of samples (one or more groups), and data being collected (continuous or categorical). In this article, we describe some of the most frequent methods for calculating the sample size with examples from nuclear cardiology research, including for t tests, analysis of variance (ANOVA), non-parametric tests, correlation, Chi-squared tests, and survival analysis. For the ease of implementation, several examples are also illustrated via user-friendly free statistical software.
Sample size for post-marketing safety studies based on historical controls.
Wu, Yu-te; Makuch, Robert W
2010-08-01
As part of a drug's entire life cycle, post-marketing studies play an important part in the identification of rare, serious adverse events. Recently, the US Food and Drug Administration (FDA) has begun to implement new post-marketing safety mandates as a consequence of increased emphasis on safety. The purpose of this research is to provide an exact sample size formula for the proposed hybrid design, based on a two-group cohort study with incorporation of historical external data. An exact sample size formula based on the Poisson distribution is developed, because the detection of rare events is our outcome of interest. The performance of the exact method is compared to its approximate large-sample theory counterpart. The proposed hybrid design requires a smaller sample size compared to the standard, two-group prospective study design. In addition, the exact method reduces the number of subjects required in the treatment group by up to 30% compared to the approximate method for the study scenarios examined. The proposed hybrid design satisfies the advantages and rationale of the two-group design with smaller sample sizes generally required. Copyright © 2010 John Wiley & Sons, Ltd.
Egg number-egg size: an important trade-off in parasite life history strategies.
Cavaleiro, Francisca I; Santos, Maria J
2014-03-01
Parasites produce from just a few to many eggs of variable size, but our understanding of the factors driving variation in these two life history traits at the intraspecific level is still very fragmentary. This study evaluates the importance of performing multilevel analyses on egg number and egg size, while characterising parasite life history strategies. A total of 120 ovigerous females of Octopicola superba (Copepoda: Octopicolidae) (one sample (n=30) per season) were characterised with respect to different body dimensions (total length; genital somite length) and measures of reproductive effort (fecundity; mean egg diameter; total reproductive effort; mean egg sac length). While endoparasites are suggested to follow both an r- and K-strategy simultaneously, the evidence found in this and other studies suggests that environmental conditions force ectoparasites into one of the two alternatives. The positive and negative skewness of the distributions of fecundity and mean egg diameter, respectively, suggest that O. superba is mainly a K-strategist (i.e. produces a relatively small number of large, well provisioned eggs). Significant sample differences were recorded concomitantly for all body dimensions and measures of reproductive effort, while a general linear model detected a significant influence of season*parasite total length in both egg number and size. This evidence suggests adaptive phenotypic plasticity in body dimensions and size-mediated changes in egg production. Seasonal changes in partitioning of resources between egg number and size resulted in significant differences in egg sac length but not in total reproductive effort. Evidence for a trade-off between egg number and size was found while controlling for a potential confounding effect of parasite total length. However, this trade-off became apparent only at high fecundity levels, suggesting a state of physiological exhaustion. Copyright © 2014 Australian Society for Parasitology Inc. 
Published by Elsevier Ltd. All rights reserved.
A new estimator of the discovery probability.
Favaro, Stefano; Lijoi, Antonio; Prünster, Igor
2012-12-01
Species sampling problems have a long history in ecological and biological studies, and a number of issues, including the evaluation of species richness, the design of sampling experiments, and the estimation of rare species variety, are to be addressed. Such inferential problems have also recently emerged in genomic applications, where they exhibit some peculiar features that make them more challenging: specifically, one has to deal with very large populations (genomic libraries) containing a huge number of distinct species (genes), and only a small portion of the library has been sampled (sequenced). These aspects motivate the Bayesian nonparametric approach we undertake, since it provides the degree of flexibility typically needed in this framework. Based on an observed sample of size n, the focus will be on prediction of a key aspect of the outcome from an additional sample of size m, namely, the so-called discovery probability. In particular, conditionally on an observed basic sample of size n, we derive a novel estimator of the probability of detecting, at the (n+m+1)th observation, species that have been observed with any given frequency in the enlarged sample of size n+m. Such an estimator admits a closed-form expression that can be exactly evaluated. The result we obtain allows us to quantify both the rate at which rare species are detected and the achieved sample coverage of abundant species, as m increases. Natural applications are represented by the estimation of the probability of discovering rare genes within genomic libraries and the results are illustrated by means of two expressed sequence tags datasets. © 2012, The International Biometric Society.
Backhouse, Martin E
2002-01-01
A number of approaches to conducting economic evaluations could be adopted. However, some decision makers have a preference for wholly stochastic cost-effectiveness analyses, particularly if the sampled data are derived from randomised controlled trials (RCTs). Formal requirements for cost-effectiveness evidence have heightened concerns in the pharmaceutical industry that development costs and times might be increased if formal requirements increase the number, duration or costs of RCTs. Whether this proves to be the case or not will depend upon the timing, nature and extent of the cost-effectiveness evidence required. To illustrate how different requirements for wholly stochastic cost-effectiveness evidence could have a significant impact on two of the major determinants of new drug development costs and times, namely RCT sample size and study duration. Using data collected prospectively in a clinical evaluation, sample sizes were calculated for a number of hypothetical cost-effectiveness study design scenarios. The results were compared with a baseline clinical trial design. The sample sizes required for the cost-effectiveness study scenarios were mostly larger than those for the baseline clinical trial design. Circumstances can be such that a wholly stochastic cost-effectiveness analysis might not be a practical proposition even though its clinical counterpart is. In such situations, alternative research methodologies would be required. For wholly stochastic cost-effectiveness analyses, the importance of prior specification of the different components of study design is emphasised. However, it is doubtful whether all the information necessary for doing this will typically be available when product registration trials are being designed. 
Formal requirements for wholly stochastic cost-effectiveness evidence based on the standard frequentist paradigm have the potential to increase the size, duration and number of RCTs significantly and hence the costs and timelines associated with new product development. Moreover, it is possible to envisage situations where such an approach would be impossible to adopt. Clearly, further research is required into the issue of how to appraise the economic consequences of alternative economic evaluation research strategies.
ESTIMATING SAMPLE REQUIREMENTS FOR FIELD EVALUATIONS OF PESTICIDE LEACHING
A method is presented for estimating the number of samples needed to evaluate pesticide leaching threats to ground water at a desired level of precision. Sample size projections are based on desired precision (exhibited as relative tolerable error), level of confidence (90 or 95%...
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.
Hero, Alfred O; Rajaratnam, Bala
2016-01-01
When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
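The difficulty of the sample-starved regime can be made concrete with a small simulation: with n held fixed, the largest sample correlation among p mutually independent variables drifts toward 1 as p grows. This is only an illustrative sketch with made-up settings and function names, not the framework developed in the paper.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def max_spurious_correlation(n_samples, n_variables, seed=0):
    """Largest |r| among independent Gaussian variables (true correlation is zero)."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_variables)]
    return max(
        abs(pearson(data[i], data[j]))
        for i in range(n_variables)
        for j in range(i + 1, n_variables)
    )


if __name__ == "__main__":
    for p in (5, 20, 80):
        print(p, round(max_spurious_correlation(n_samples=10, n_variables=p), 3))
```

Every correlation found here is spurious by construction, so any screening threshold has to rise with p when n cannot.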
Yao, Peng-Cheng; Gao, Hai-Yan; Wei, Ya-Nan; Zhang, Jian-Hang; Chen, Xiao-Yong; Li, Hong-Qing
2017-01-01
Environmental conditions in coastal salt marsh habitats have led to the development of specialist genetic adaptations. We evaluated six DNA barcode loci of the 53 species of Poaceae and 15 species of Chenopodiaceae from China's coastal salt marsh area and inland area. Our results indicate that the optimum DNA barcode was ITS for coastal salt-tolerant Poaceae and matK for the Chenopodiaceae. Sampling strategies for ten common species of Poaceae and Chenopodiaceae were analyzed according to optimum barcode. We found that by increasing the number of samples collected from the coastal salt marsh area on the basis of inland samples, the number of haplotypes of Arundinella hirta, Digitaria ciliaris, Eleusine indica, Imperata cylindrica, Setaria viridis, and Chenopodium glaucum increased, with a principal coordinate plot clearly showing increased distribution points. The results of a Mann-Whitney test showed that for Digitaria ciliaris, Eleusine indica, Imperata cylindrica, and Setaria viridis, the distribution of intraspecific genetic distances was significantly different when samples from the coastal salt marsh area were included (P < 0.01). These results suggest that increasing the sample size in specialist habitats can improve measurements of intraspecific genetic diversity, and will have a positive effect on the application of the DNA barcodes in widely distributed species. The results of random sampling showed that when sample size reached 11 for Chloris virgata, Chenopodium glaucum, and Dysphania ambrosioides, 13 for Setaria viridis, and 15 for Eleusine indica, Imperata cylindrica and Chenopodium album, average intraspecific distance tended to reach stability. These results indicate that the sample size for DNA barcode of globally distributed species should be increased to 11-15. PMID:28934362
Arnup, Sarah J; McKenzie, Joanne E; Pilcher, David; Bellomo, Rinaldo; Forbes, Andrew B
2018-06-01
The cluster randomised crossover (CRXO) design provides an opportunity to conduct randomised controlled trials to evaluate low risk interventions in the intensive care setting. Our aim is to provide a tutorial on how to perform a sample size calculation for a CRXO trial, focusing on the meaning of the elements required for the calculations, with application to intensive care trials. We use all-cause in-hospital mortality from the Australian and New Zealand Intensive Care Society Adult Patient Database clinical registry to illustrate the sample size calculations. We show sample size calculations for a two-intervention, two 12-month period, cross-sectional CRXO trial. We provide the formulae, and examples of their use, to determine the number of intensive care units required to detect a risk ratio (RR) with a designated level of power between two interventions for trials in which the elements required for sample size calculations remain constant across all ICUs (unstratified design); and in which there are distinct groups (strata) of ICUs that differ importantly in the elements required for sample size calculations (stratified design). The CRXO design markedly reduces the sample size requirement compared with the parallel-group, cluster randomised design for the example cases. The stratified design further reduces the sample size requirement compared with the unstratified design. The CRXO design enables the evaluation of routinely used interventions that can bring about small, but important, improvements in patient care in the intensive care setting.
ERIC Educational Resources Information Center
Spearing, Debra; Woehlke, Paula
To assess the effect on discriminant analysis in terms of correct classification into two groups, the following parameters were systematically altered using Monte Carlo techniques: sample sizes; proportions of one group to the other; number of independent variables; and covariance matrices. The pairing of the off diagonals (or covariances) with…
Rosenthal, Mariana; Anderson, Katey; Tengelsen, Leslie; Carter, Kris; Hahn, Christine; Ball, Christopher
2017-08-24
The Right Size Roadmap was developed by the Association of Public Health Laboratories and the Centers for Disease Control and Prevention to improve influenza virologic surveillance efficiency. Guidelines were provided to state health departments regarding representativeness and statistical estimates of specimen numbers needed for seasonal influenza situational awareness, rare or novel influenza virus detection, and rare or novel influenza virus investigation. The aim of this study was to compare Roadmap sampling recommendations with Idaho's influenza virologic surveillance to determine implementation feasibility. We calculated the proportion of medically attended influenza-like illness (MA-ILI) from Idaho's influenza-like illness surveillance among outpatients during October 2008 to May 2014, applied data to Roadmap-provided sample size calculators, and compared calculations with actual numbers of specimens tested for influenza by the Idaho Bureau of Laboratories (IBL). We assessed representativeness among patients' tested specimens to census estimates by age, sex, and health district residence. Among outpatients surveilled, Idaho's mean annual proportion of MA-ILI was 2.30% (20,834/905,818) during a 5-year period. Thus, according to Roadmap recommendations, Idaho needs to collect 128 specimens from MA-ILI patients/week for situational awareness, 1496 influenza-positive specimens/week for detection of a rare or novel influenza virus at 0.2% prevalence, and after detection, 478 specimens/week to confirm true prevalence is ≤2% of influenza-positive samples. The mean number of respiratory specimens Idaho tested for influenza/week, excluding the 2009-2010 influenza season, ranged from 6 to 24. Various influenza virus types and subtypes were collected and specimen submission sources were representative in terms of geographic distribution, patient age range and sex, and disease severity. 
Insufficient numbers of respiratory specimens are submitted to IBL for influenza laboratory testing. Increased specimen submission would facilitate meeting Roadmap sample size recommendations. ©Mariana Rosenthal, Katey Anderson, Leslie Tengelsen, Kris Carter, Christine Hahn, Christopher Ball. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 24.08.2017. PMID:28838883
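The order of magnitude of the rare-virus detection target can be checked with a standard "at least one positive" calculation. The sketch assumes independent specimens, an effectively infinite pool of influenza-positive specimens, and a 95% confidence level; whether the Roadmap calculators use this formula or these settings is an assumption, and the function name is invented for the example.

```python
import math


def detection_sample_size(prevalence, confidence=0.95):
    """Smallest n such that P(at least one positive among n) >= confidence.

    Solves 1 - (1 - prevalence) ** n >= confidence for n.
    """
    if not 0 < prevalence < 1 or not 0 < confidence < 1:
        raise ValueError("prevalence and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))


if __name__ == "__main__":
    print(detection_sample_size(0.002))
```

At 0.2% prevalence this gives 1497 specimens, within one specimen of the 1496 quoted from the Roadmap, and far above the 6 to 24 specimens per week that Idaho reports testing.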
Ancestral inference from haplotypes and mutations.
Griffiths, Robert C; Tavaré, Simon
2018-04-25
We consider inference about the history of a sample of DNA sequences, conditional upon the haplotype counts and the number of segregating sites observed at the present time. After deriving some theoretical results in the coalescent setting, we implement rejection sampling and importance sampling schemes to perform the inference. The importance sampling scheme addresses an extension of the Ewens Sampling Formula for a configuration of haplotypes and the number of segregating sites in the sample. The implementations include both constant and variable population size models. The methods are illustrated by two human Y chromosome datasets. Copyright © 2018. Published by Elsevier Inc.
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated whether sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly large populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
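The Jensen's inequality argument can be illustrated with a toy model. The sketch below is not the authors' simulation: it assumes a hypothetical two-stage matrix [[0, fecundity], [s_juv, s_adult]], estimates only juvenile survival from a binomial sample, and uses the closed-form dominant eigenvalue. Because lambda is a concave function of s_juv in this toy model, averaging lambda over noisy survival estimates gives a value no larger than lambda evaluated at the average estimate.

```python
import math
import random

def growth_rate(s_juv, s_adult=0.5, fecundity=2.0):
    """Dominant eigenvalue of the 2x2 matrix [[0, fecundity], [s_juv, s_adult]]."""
    return (s_adult + math.sqrt(s_adult ** 2 + 4.0 * fecundity * s_juv)) / 2.0

def simulate(n_sampled, true_s_juv=0.5, reps=2000, seed=1):
    """Return (mean of lambda over replicates, lambda at the mean survival estimate)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # Binomial sampling: how many of n_sampled juveniles survive.
        survivors = sum(rng.random() < true_s_juv for _ in range(n_sampled))
        estimates.append(survivors / n_sampled)
    mean_lambda = sum(growth_rate(s) for s in estimates) / reps
    lambda_at_mean = growth_rate(sum(estimates) / reps)
    return mean_lambda, lambda_at_mean

for n in (5, 20, 100):
    mean_lambda, lambda_at_mean = simulate(n)
    # The gap is the Jensen bias in this toy setting.
    print(n, round(lambda_at_mean - mean_lambda, 4))
```

All parameter values here (fecundity, adult survival, replicate count) are arbitrary choices for illustration.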
NASA Astrophysics Data System (ADS)
Atapour, Hadi; Mortazavi, Ali
2018-04-01
The effects of textural characteristics, especially grain size, on index properties of weakly solidified artificial sandstones are studied. For this purpose, a relatively large number of laboratory tests were carried out on artificial sandstones that were produced in the laboratory. The prepared samples represent fifteen sandstone types consisting of five different median grain sizes and three different cement contents. Index rock properties including effective porosity, bulk density, point load strength index, and Schmidt hammer values (SHVs) were determined. Experimental results showed that the grain size has significant effects on index properties of weakly solidified sandstones. The porosity of samples is inversely related to the grain size and decreases linearly as grain size increases. In contrast, a direct relationship was observed between grain size and dry bulk density: bulk density increased with increasing median grain size. Furthermore, it was observed that the point load strength index and SHV of samples increased as a result of grain size increase. These observations are indirectly related to the porosity decrease as a function of median grain size.
Hühn, M
1995-05-01
Some approaches to molecular marker-assisted linkage detection for a dominant disease-resistance trait based on a segregating F2 population are discussed. Analysis of two-point linkage is carried out by the traditional measure of maximum lod score. It depends on (1) the maximum-likelihood estimate of the recombination fraction between the marker and the disease-resistance gene locus, (2) the observed absolute frequencies, and (3) the unknown number of tested individuals. If one replaces the absolute frequencies by expressions depending on the unknown sample size and the maximum-likelihood estimate of recombination value, the conventional rule for significant linkage (maximum lod score exceeds a given linkage threshold) can be resolved for the sample size. For each sub-population used for linkage analysis [susceptible (= recessive) individuals, resistant (= dominant) individuals, complete F2] this approach gives a lower bound for the necessary number of individuals required for the detection of significant two-point linkage by the lod-score method.
Conservative Sample Size Determination for Repeated Measures Analysis of Covariance.
Morgan, Timothy M; Case, L Douglas
2013-07-05
In the design of a randomized clinical trial with one pre and multiple post randomized assessments of the outcome variable, one needs to account for the repeated measures in determining the appropriate sample size. Unfortunately, one seldom has a good estimate of the variance of the outcome measure, let alone the correlations among the measurements over time. We show how sample sizes can be calculated by making conservative assumptions regarding the correlations for a variety of covariance structures. The most conservative choice for the correlation depends on the covariance structure and the number of repeated measures. In the absence of good estimates of the correlations, the sample size is often based on a two-sample t-test, making the 'ultra' conservative and unrealistic assumption that there are zero correlations between the baseline and follow-up measures while at the same time assuming there are perfect correlations between the follow-up measures. Compared to the case of taking a single measurement, substantial savings in sample size can be realized by accounting for the repeated measures, even with very conservative assumptions regarding the parameters of the assumed correlation matrix. Assuming compound symmetry, the sample size from the two-sample t-test calculation can be reduced at least 44%, 56%, and 61% for repeated measures analysis of covariance by taking 2, 3, and 4 follow-up measures, respectively. The results offer a rational basis for determining a fairly conservative, yet efficient, sample size for clinical trials with repeated measures and a baseline value.
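The quoted reductions can be reproduced with a short calculation. The sketch below is a reconstruction under stated assumptions, not the authors' derivation: unit variance at every time point, a single common correlation rho (compound symmetry), and an analysis of the mean of k follow-up measures adjusted for baseline. Under those assumptions the residual variance relative to a single unadjusted measurement is (1 + (k - 1)rho)/k - rho^2, which is largest at rho = (k - 1)/(2k).

```python
def ancova_variance_factor(k, rho):
    """Residual variance of the mean of k follow-ups, adjusted for baseline,
    relative to one unadjusted measurement (compound symmetry, correlation rho)."""
    return (1.0 + (k - 1) * rho) / k - rho ** 2

def most_conservative(k):
    """Correlation that maximises the variance factor, and the factor itself."""
    rho = (k - 1) / (2.0 * k)
    return rho, ancova_variance_factor(k, rho)

for k in (2, 3, 4):
    rho, factor = most_conservative(k)
    print(k, round(rho, 3), f"{100 * (1 - factor):.0f}% reduction")
```

With these assumptions the worst-case factor simplifies to (k + 1)^2 / (4k^2), giving reductions of about 44%, 56%, and 61% for 2, 3, and 4 follow-up measures, matching the figures in the abstract.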
Comparison of Two Methods Used to Model Shape Parameters of Pareto Distributions
Liu, C.; Charpentier, R.R.; Su, J.
2011-01-01
Two methods are compared for estimating the shape parameters of Pareto field-size (or pool-size) distributions for petroleum resource assessment. Both methods assume mature exploration in which most of the larger fields have been discovered. Both methods use the sizes of larger discovered fields to estimate the numbers and sizes of smaller fields: (1) the tail-truncated method uses a plot of field size versus size rank, and (2) the log-geometric method uses data binned in field-size classes and the ratios of adjacent bin counts. Simulation experiments were conducted using discovered oil and gas pool-size distributions from four petroleum systems in Alberta, Canada and using Pareto distributions generated by Monte Carlo simulation. The estimates of the shape parameters of the Pareto distributions, calculated by both the tail-truncated and log-geometric methods, generally stabilize where discovered pool numbers are greater than 100. However, with fewer than 100 discoveries, these estimates can vary greatly with each new discovery. The estimated shape parameters of the tail-truncated method are more stable and larger than those of the log-geometric method where the number of discovered pools is more than 100. Both methods, however, tend to underestimate the shape parameter. Monte Carlo simulation was also used to create sequences of discovered pool sizes by sampling from a Pareto distribution with a discovery process model using a defined exploration efficiency (in order to show how biased the sampling was in favor of larger fields being discovered first). A higher (more biased) exploration efficiency gives better estimates of the Pareto shape parameters. © 2011 International Association for Mathematical Geosciences.
Accuracy or precision: Implications of sample design and methodology on abundance estimation
Kowalewski, Lucas K.; Chizinski, Christopher J.; Powell, Larkin A.; Pope, Kevin L.; Pegg, Mark A.
2015-01-01
Sampling by spatially replicated counts (point-count) is an increasingly popular method of estimating population size of organisms. Challenges exist when sampling by the point-count method: it is often impractical to sample the entire area of interest and impossible to detect every individual present. Ecologists encounter logistical limitations that force them to sample either few large sample units or many small sample units, introducing biases to sample counts. We generated a computer environment and simulated sampling scenarios to test the role of number of samples, sample unit area, number of organisms, and distribution of organisms in the estimation of population sizes using N-mixture models. Many sample units of small area provided estimates that were consistently closer to true abundance than sample scenarios with few sample units of large area. However, sample scenarios with few sample units of large area provided more precise abundance estimates than abundance estimates derived from sample scenarios with many sample units of small area. It is important to consider accuracy and precision of abundance estimates during the sample design process, with study goals and objectives fully recognized; in practice, however, consideration of accuracy and precision of abundance estimates is often an afterthought that occurs during the data analysis process.
Effect of laser irradiation on surface hardness and structural parameters of 7178 aluminium alloy
NASA Astrophysics Data System (ADS)
Maryam, Siddra; Bashir, Farooq
2018-04-01
Aluminium 7178 samples were prepared and irradiated with an Nd:YAG laser. The surfaces of exposed samples were investigated using optical microscopy, which revealed that the surface morphology of the samples changes drastically as a function of the number of laser shots. The micrographs show that the laser heat-affected area increases with the number of laser pulses. Furthermore, structural and mechanical properties were studied using XRD and Vickers hardness testing. The XRD study shows an increasing trend in grain size with increasing number of laser shots. The hardness of the samples as a function of the number of laser shots first increases and then decreases gradually. It was observed that the grain size has no pronounced effect on the hardness. The hardness profile has a decreasing trend with increasing linear distance from the boundary of the laser heat-affected area.
Analogical reasoning in amazons.
Obozova, Tanya; Smirnova, Anna; Zorina, Zoya; Wasserman, Edward
2015-11-01
Two juvenile orange-winged amazons (Amazona amazonica) were initially trained to match visual stimuli by color, shape, and number of items, but not by size. After learning these three identity matching-to-sample tasks, the parrots transferred discriminative responding to new stimuli from the same categories that had been used in training (other colors, shapes, and numbers of items) as well as to stimuli from a different category (stimuli varying in size). In the critical testing phase, both parrots exhibited reliable relational matching-to-sample (RMTS) behavior, suggesting that they perceived and compared the relationship between objects in the sample stimulus pair to the relationship between objects in the comparison stimulus pairs, even though no physical matches were possible between items in the sample and comparison pairs. The parrots spontaneously exhibited this higher-order relational responding without having ever before been trained on RMTS tasks, therefore joining apes and crows in displaying this abstract cognitive behavior.
2011-01-01
To obtain approval for the use of vertebrate animals in research, an investigator must assure an ethics committee that the proposed number of animals is the minimum necessary to achieve a scientific goal. How does an investigator make that assurance? A power analysis is most accurate when the outcome is known before the study, which it rarely is. A ‘pilot study’ is appropriate only when the number of animals used is a tiny fraction of the numbers that will be invested in the main study because the data for the pilot animals cannot legitimately be used again in the main study without increasing the rate of type I errors (false discovery). Traditional significance testing requires the investigator to determine the final sample size before any data are collected and then to delay analysis of any of the data until all of the data are final. An investigator often learns at that point either that the sample size was larger than necessary or too small to achieve significance. Subjects cannot be added at this point in the study without increasing type I errors. In addition, journal reviewers may require more replications in quantitative studies than are truly necessary. Sequential stopping rules used with traditional significance tests allow incremental accumulation of data on a biomedical research problem so that significance, replicability, and use of a minimal number of animals can be assured without increasing type I errors. PMID:21838970
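The claim that adding subjects after looking at the data inflates type I errors can be checked by simulation. The sketch below illustrates the problem only, not the sequential stopping rule the abstract proposes. It assumes a one-sample z-test with known unit variance under a true null hypothesis; the look schedule, replicate count, and function names are arbitrary choices for illustration.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate_null(looks=(10, 20, 30, 40, 50), reps=2000, alpha=0.05, seed=0):
    """Return (fixed-n false-positive rate, rate when testing at every look)."""
    rng = random.Random(seed)
    fixed = peeking = 0
    for _ in range(reps):
        total, n, hits = 0.0, 0, []
        for target in looks:
            while n < target:            # add subjects up to the next look
                total += rng.gauss(0.0, 1.0)
                n += 1
            hits.append(two_sided_p(total / math.sqrt(n)) < alpha)
        fixed += hits[-1]                # test once, at the final sample size
        peeking += any(hits)             # declare significance at any look
    return fixed / reps, peeking / reps

fixed_rate, peeking_rate = simulate_null()
print(fixed_rate, peeking_rate)
```

Because the final look is one of the looks, the peeking rate can never be lower than the fixed-sample rate in this simulation; the size of the gap depends on the number and spacing of looks.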
Drop size distributions and related properties of fog for five locations measured from aircraft
NASA Technical Reports Server (NTRS)
Zak, J. Allen
1994-01-01
Fog drop size distributions were collected from aircraft as part of the Synthetic Vision Technology Demonstration Program. Three west coast marine advection fogs, one frontal fog, and a radiation fog were sampled from the top of the cloud to the bottom as the aircraft descended on a 3-degree glideslope. Drop size versus altitude versus concentration are shown in three dimensional plots for each 10-meter altitude interval from 1-minute samples. Also shown are median volume radius and liquid water content. Advection fogs contained the largest drops with median volume radius of 5-8 micrometers, although the drop sizes in the radiation fog were also large just above the runway surface. Liquid water content increased with height, and the total number of drops generally increased with time. Multimodal variations in number density and particle size were noted in most samples where there was a peak concentration of small drops (2-5 micrometers) at low altitudes, midaltitude peak of drops 5-11 micrometers, and high-altitude peak of the larger drops (11-15 micrometers and above). These observations are compared with others and corroborate previous results in fog gross properties, although there is considerable variation with time and altitude even in the same type of fog.
Contrasting Size Distributions of Chondrules and Inclusions in Allende CV3
NASA Technical Reports Server (NTRS)
Fisher, Kent R.; Tait, Alastair W.; Simon, Jusin I.; Cuzzi, Jeff N.
2014-01-01
There are several leading theories on the processes that led to the formation of chondrites, e.g., sorting by mass, by X-winds, turbulent concentration, and by photophoresis. The juxtaposition of refractory inclusions (CAIs) and less refractory chondrules is central to these theories and there is much to be learned from their relative size distributions. There have been a number of studies into size distributions of particles in chondrites but only on relatively small scales primarily for chondrules, and rarely for both Calcium Aluminum-rich Inclusions (CAIs) and chondrules in the same sample. We have implemented macro-scale (25 cm diameter sample) and high-resolution microscale sampling of the Allende CV3 chondrite to create a complete data set of size frequencies for CAIs and chondrules.
Crans, Gerald G; Shuster, Jonathan J
2008-08-15
The debate as to which statistical methodology is most appropriate for the analysis of the two-sample comparative binomial trial has persisted for decades. Practitioners who favor the conditional methods of Fisher, Fisher's exact test (FET), claim that only experimental outcomes containing the same amount of information should be considered when performing analyses. Hence, the total number of successes should be fixed at its observed level in hypothetical repetitions of the experiment. Using conditional methods in clinical settings can pose interpretation difficulties, since results are derived using conditional sample spaces rather than the set of all possible outcomes. Perhaps more importantly from a clinical trial design perspective, this test can be too conservative, resulting in greater resource requirements and more subjects exposed to an experimental treatment. The actual significance level attained by FET (the size of the test) has not been reported in the statistical literature. Berger (J. R. Statist. Soc. D (The Statistician) 2001; 50:79-85) proposed assessing the conservativeness of conditional methods using p-value confidence intervals. In this paper we develop a numerical algorithm that calculates the size of FET for sample sizes, n, up to 125 per group at the two-sided significance level, alpha = 0.05. Additionally, this numerical method is used to define new significance levels alpha(*) = alpha+epsilon, where epsilon is a small positive number, for each n, such that the size of the test is as close as possible to the pre-specified alpha (0.05 for the current work) without exceeding it. Lastly, a sample size and power calculation example are presented, which demonstrates the statistical advantages of implementing the adjustment to FET (using alpha(*) instead of alpha) in the two-sample comparative binomial trial. 2008 John Wiley & Sons, Ltd
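The idea of computing the actual size of Fisher's exact test can be sketched for a small case. The code below is not the authors' algorithm: it assumes equal group sizes, defines the two-sided p-value as the sum of table probabilities no larger than that of the observed table, and approximates the supremum over the common success probability with a coarse grid. All function names are hypothetical.

```python
from math import comb

def fisher_two_sided_p(x1, x2, n):
    """Two-sided Fisher exact p-value for x1/n versus x2/n successes."""
    s = x1 + x2
    denom = comb(2 * n, s)

    def prob(k):
        # Hypergeometric probability of k successes in group 1 given total s.
        return comb(n, k) * comb(n, s - k) / denom

    p_obs = prob(x1)
    lo, hi = max(0, s - n), min(n, s)
    # Small relative tolerance so that ties are counted despite rounding.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

def fet_size(n, alpha=0.05, grid=99):
    """Largest null rejection probability over a grid of common success rates."""
    reject = [(a, b) for a in range(n + 1) for b in range(n + 1)
              if fisher_two_sided_p(a, b, n) <= alpha]
    size = 0.0
    for i in range(1, grid + 1):
        p = i / (grid + 1)
        pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
        size = max(size, sum(pmf[a] * pmf[b] for a, b in reject))
    return size

print(round(fet_size(10), 4))
```

The gap between the printed size and the nominal 0.05 is the conservativeness that the adjusted level alpha(*) is meant to recover.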
Fitts, Douglas A
2017-09-21
The variable criteria sequential stopping rule (vcSSR) is an efficient way to add sample size to planned ANOVA tests while holding the observed rate of Type I errors, α_o, constant. The only difference from regular null hypothesis testing is that criteria for stopping the experiment are obtained from a table based on the desired power, rate of Type I errors, and beginning sample size. The vcSSR was developed using between-subjects ANOVAs, but it should work with p values from any type of F test. In the present study, the α_o remained constant at the nominal level when using the previously published table of criteria with repeated measures designs with various numbers of treatments per subject, Type I error rates, values of ρ, and four different sample size models. New power curves allow researchers to select the optimal sample size model for a repeated measures experiment. The criteria held α_o constant either when used with a multiple correlation that varied the sample size model and the number of predictor variables, or when used with MANOVA with multiple groups and two levels of a within-subject variable at various levels of ρ. Although not recommended for use with χ² tests such as the Friedman rank ANOVA test, the vcSSR produces predictable results based on the relation between F and χ². Together, the data confirm the view that the vcSSR can be used to control Type I errors during sequential sampling with any t- or F-statistic rather than being restricted to certain ANOVA designs.
Bhaskar, Anand; Song, Yun S
2014-01-01
The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.
Bhaskar, Anand; Song, Yun S.
2016-01-01
The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011
Lindenfors, P; Tullberg, B S
2006-07-01
The fact that characters may co-vary in organism groups because of shared ancestry and not always because of functional correlations was the initial rationale for developing phylogenetic comparative methods. Here we point out a case where similarity due to shared ancestry can produce an undesired effect when conducting an independent contrasts analysis. Under special circumstances, using a low sample size will produce results indicating an evolutionary correlation between characters where an analysis of the same pattern utilizing a larger sample size will show that this correlation does not exist. This is the opposite effect of increased sample size to that expected; normally an increased sample size increases the chance of finding a correlation. The situation where the problem occurs is when co-variation between the two continuous characters analysed is clumped in clades; e.g. when some phylogenetically conservative factors affect both characters simultaneously. In such a case, the correlation between the two characters becomes contingent on the number of clades sharing this conservative factor that are included in the analysis, in relation to the number of species contained within these clades. Removing species scattered evenly over the phylogeny will in this case remove the exact variation that diffuses the evolutionary correlation between the two characters - the variation contained within the clades sharing the conservative factor. We exemplify this problem by discussing a parallel in nature where the described problem may be of importance. This concerns the question of the presence or absence of Rensch's rule in primates.
Automated sampling assessment for molecular simulations using the effective sample size
Zhang, Xin; Bhatt, Divesh; Zuckerman, Daniel M.
2010-01-01
To quantify the progress in the development of algorithms and forcefields used in molecular simulations, a general method for the assessment of the sampling quality is needed. Statistical mechanics principles suggest the populations of physical states characterize equilibrium sampling in a fundamental way. We therefore develop an approach for analyzing the variances in state populations, which quantifies the degree of sampling in terms of the effective sample size (ESS). The ESS estimates the number of statistically independent configurations contained in a simulated ensemble. The method is applicable to both traditional dynamics simulations as well as more modern (e.g., multi-canonical) approaches. Our procedure is tested in a variety of systems from toy models to atomistic protein simulations. We also introduce a simple automated procedure to obtain approximate physical states from dynamic trajectories: this allows sample-size estimation in systems for which physical states are not known in advance. PMID:21221418
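The link between state-population variance and effective sample size can be sketched with the binomial variance formula. This is an illustration under assumptions, not necessarily the authors' exact estimator: if a state's population p were estimated from N independent configurations, its variance would be p(1 - p)/N, so an observed variance across repeated estimates suggests N_eff = p(1 - p)/variance. The function name and the input numbers are hypothetical.

```python
import statistics

def effective_sample_size(state_fraction_estimates):
    """Estimate ESS from repeated estimates of one state's population.

    Inverts the binomial relation var(p_hat) = p (1 - p) / N.
    """
    p = statistics.mean(state_fraction_estimates)
    var = statistics.variance(state_fraction_estimates)  # sample variance
    return p * (1.0 - p) / var

# Hypothetical population estimates of one state from four trajectory blocks.
estimates = [0.42, 0.38, 0.45, 0.35]
ess = effective_sample_size(estimates)
print(round(ess, 1))
```

A small ESS relative to the number of stored frames indicates that the frames are strongly correlated.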
Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.
Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra
2016-11-20
The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.
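The inflation mentioned in the first sentence can be sketched with the standard design effect for a simple parallel cluster randomised trial, 1 + (m - 1) x ICC for clusters of size m. This is the textbook single-level case only, not the longitudinal formulae derived in the paper, which also involve the cluster and individual autocorrelations. Function names and the example numbers are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters in a parallel design."""
    return 1.0 + (cluster_size - 1) * icc

def clusters_per_arm(n_individual, cluster_size, icc):
    """Clusters needed per arm, given the per-arm size under individual randomisation."""
    inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)

# Example: 128 per arm if individually randomised, clusters of 20, ICC 0.05.
print(design_effect(20, 0.05), clusters_per_arm(128, 20, 0.05))
```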
Measures of precision for dissimilarity-based multivariate analysis of ecological communities.
Anderson, Marti J; Santana-Garcon, Julia
2015-01-01
Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. © 2014 The Authors. Ecology Letters published by John Wiley & Sons Ltd and CNRS.
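A minimal sketch of the MultSE calculation follows, assuming the form sqrt(V/n), where V is the sum of squared pairwise dissimilarities divided by n(n - 1). That form is my reading of "based on sums of squared dissimilarities" and should be checked against the paper; the authors provide R code, so this Python version and its function name are illustrative only. For one-dimensional Euclidean distances the quantity reduces to the ordinary standard error of the mean, which makes a convenient check.

```python
import math

def mult_se(dist):
    """Pseudo multivariate standard error from an n x n dissimilarity matrix.

    dist is a symmetric list of lists; only the upper triangle is used.
    """
    n = len(dist)
    # Pseudo sum of squares: squared dissimilarities over pairs, divided by n.
    ss = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    pseudo_variance = ss / (n - 1)
    return math.sqrt(pseudo_variance / n)

# Check with 1-D Euclidean distances, where MultSE equals sd / sqrt(n).
x = [1.0, 2.0, 3.0, 4.0]
d = [[abs(a - b) for b in x] for a in x]
print(round(mult_se(d), 4))
```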
Aerosol mobility size spectrometer
Wang, Jian; Kulkarni, Pramod
2007-11-20
A device for measuring aerosol size distribution within a sample containing aerosol particles. The device generally includes a spectrometer housing defining an interior chamber and a camera for recording aerosol size streams exiting the chamber. The housing includes an inlet for introducing a flow medium into the chamber in a flow direction, an aerosol injection port adjacent the inlet for introducing a charged aerosol sample into the chamber, a separation section for applying an electric field to the aerosol sample across the flow direction and an outlet opposite the inlet. In the separation section, the aerosol sample becomes entrained in the flow medium and the aerosol particles within the aerosol sample are separated by size into a plurality of aerosol flow streams under the influence of the electric field. The camera is disposed adjacent the housing outlet for optically detecting a relative position of at least one aerosol flow stream exiting the outlet and for optically detecting the number of aerosol particles within the at least one aerosol flow stream.
Demenais, F; Lathrop, G M; Lalouel, J M
1988-07-01
A simulation study is here conducted to measure the power of the lod score method to detect linkage between a quantitative trait and a marker locus in various situations. The number of families necessary to detect such linkage with 80% power is assessed for different sets of parameters at the trait locus and different values of the recombination fraction. The effects of varying the mode of sampling families and the sibship size are also evaluated.
NASA Astrophysics Data System (ADS)
Willie, Jacob; Petre, Charles-Albert; Tagg, Nikki; Lens, Luc
2012-11-01
Data from forest herbaceous plants in a site of known species richness in Cameroon were used to test the performance of rarefaction and eight species richness estimators (ACE, ICE, Chao1, Chao2, Jack1, Jack2, Bootstrap and MM). Bias, accuracy, precision and sensitivity to patchiness and sample grain size were the evaluation criteria. An evaluation of the effects of sampling effort and patchiness on diversity estimation is also provided. Stems were identified and counted in linear series of 1-m² contiguous square plots distributed in six habitat types. Initially, 500 plots were sampled in each habitat type. The sampling process was monitored using rarefaction and a set of richness estimator curves. Curves from the first dataset suggested adequate sampling in riparian forest only. Additional plots ranging from 523 to 2143 were subsequently added in the undersampled habitats until most of the curves stabilized. Jack1 and ICE, the non-parametric richness estimators, performed better, being more accurate and less sensitive to patchiness and sample grain size, and significantly reducing biases that could not be detected by rarefaction and other estimators. This study confirms the usefulness of non-parametric incidence-based estimators, and recommends Jack1 or ICE alongside rarefaction while describing taxon richness and comparing results across areas sampled using similar or different grain sizes. As patchiness varied across habitat types, accurate estimations of diversity did not require the same number of plots. The number of samples needed to fully capture diversity is not necessarily the same across habitats, and can only be known when taxon sampling curves have indicated adequate sampling. Differences in observed species richness between habitats were generally due to differences in patchiness, except between two habitats where they resulted from differences in abundance.
We suggest that communities should first be sampled thoroughly using appropriate taxon sampling curves before explaining differences in diversity.
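As a rough illustration of the incidence-based estimators named in the abstract above, the sketch below computes the first-order jackknife (Jack1) and the bias-corrected form of Chao2 from plot-by-plot species lists. The function names and the toy plot data are assumptions made for illustration; they are not taken from the study, and the formulas are the standard textbook forms rather than the authors' exact implementation.

```python
from collections import Counter


def incidence_counts(plots):
    """Return, for each species, the number of plots in which it occurs."""
    counts = Counter()
    for plot in plots:
        counts.update(set(plot))  # presence/absence only, not abundance
    return counts


def jack1(plots):
    """First-order jackknife: S_obs + Q1 * (m - 1) / m."""
    m = len(plots)
    counts = incidence_counts(plots)
    s_obs = len(counts)
    q1 = sum(1 for c in counts.values() if c == 1)  # species found in one plot
    return s_obs + q1 * (m - 1) / m


def chao2_bias_corrected(plots):
    """Bias-corrected Chao2: S_obs + ((m-1)/m) * Q1*(Q1-1) / (2*(Q2+1))."""
    m = len(plots)
    counts = incidence_counts(plots)
    s_obs = len(counts)
    q1 = sum(1 for c in counts.values() if c == 1)
    q2 = sum(1 for c in counts.values() if c == 2)  # species found in two plots
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


# Hypothetical data: four 1-m2 plots, species coded as letters.
example_plots = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
jack1_estimate = jack1(example_plots)
chao2_estimate = chao2_bias_corrected(example_plots)
```

Both estimators add a correction to the observed richness that grows with the number of species seen in only one plot, which is why they respond to undersampling that a rarefaction curve alone may not reveal.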
Willan, Andrew R
2016-07-05
The Pessary for the Prevention of Preterm Birth Study (PS3) is an international, multicenter, randomized clinical trial designed to examine the effectiveness of the Arabin pessary in preventing preterm birth in pregnant women with a short cervix. During the design of the study two methodological issues regarding power and sample size were raised. Since treatment in the Standard Arm will vary between centers, it is anticipated that so too will the probability of preterm birth in that arm. This will likely result in a treatment by center interaction, and the issue of how this will affect the sample size requirements was raised. The sample size requirements to examine the effect of the pessary on the baby's clinical outcome were prohibitively high, so the second issue is how best to examine the effect on clinical outcome. The approaches taken to address these issues are presented. Simulation and sensitivity analysis were used to address the sample size issue. The probability of preterm birth in the Standard Arm was assumed to vary between centers following a Beta distribution with a mean of 0.3 and a coefficient of variation of 0.3. To address the second issue a Bayesian decision model is proposed that combines the information regarding the between-treatment difference in the probability of preterm birth from PS3 with the data from the Multiple Courses of Antenatal Corticosteroids for Preterm Birth Study that relate preterm birth and perinatal mortality/morbidity. The approach provides a between-treatment comparison with respect to the probability of a bad clinical outcome. The performance of the approach was assessed using simulation and sensitivity analysis. Accounting for a possible treatment by center interaction increased the sample size from 540 to 700 patients per arm for the base case. The sample size requirements increase with the coefficient of variation and decrease with the number of centers. 
Under the same assumptions used for determining the sample size requirements, the simulated mean probability that pessary reduces the risk of perinatal mortality/morbidity is 0.98. The simulated mean decreased with coefficient of variation and increased with the number of clinical sites. Employing simulation and sensitivity analysis is a useful approach for determining sample size requirements while accounting for the additional uncertainty due to a treatment by center interaction. Using a surrogate outcome in conjunction with a Bayesian decision model is an efficient way to compare important clinical outcomes in a randomized clinical trial in situations where the direct approach requires a prohibitively high sample size.
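The Beta distribution in the abstract above is specified by its mean (0.3) and coefficient of variation (0.3) rather than by its shape parameters. A minimal sketch of the conversion follows, using the method-of-moments identities for the Beta distribution. The function name, the seed and the choice of 20 centers are assumptions for illustration only; the abstract does not describe the simulation code.

```python
import random


def beta_params(mean, cv):
    """Convert a mean and coefficient of variation to Beta(alpha, beta)."""
    var = (mean * cv) ** 2
    if var >= mean * (1 - mean):
        raise ValueError("variance too large for a Beta distribution")
    k = mean * (1 - mean) / var - 1  # k equals alpha + beta
    return mean * k, (1 - mean) * k


alpha, beta = beta_params(0.3, 0.3)  # roughly 7.48 and 17.45

# Hypothetical use: draw a preterm-birth probability for each of 20 centers.
rng = random.Random(0)
center_probs = [rng.betavariate(alpha, beta) for _ in range(20)]
```

A larger coefficient of variation gives smaller shape parameters and therefore more spread between centers, which is consistent with the reported increase in required sample size as the coefficient of variation grows.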
How large a training set is needed to develop a classifier for microarray data?
Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M
2008-01-01
A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.
The Impacts of Family Size on Investment in Child Quality
ERIC Educational Resources Information Center
Caceres-Delpiano, Julio
2006-01-01
Using multiple births as an exogenous shift in family size, I investigate the impact of the number of children on child investment and child well-being. Using data from the 1980 US Census Five-Percent Public Use Micro Sample, 2SLS results demonstrate that parents facing a change in family size reallocate resources in a way consistent with Becker's…
Submicrometer Particle Sizing by Multiangle Light Scattering following Fractionation
Wyatt
1998-01-01
The acid test for any particle sizing technique is its ability to determine the differential number fraction size distribution of a simple, well-defined sample. The very best characterized polystyrene latex sphere standards have been measured extensively using transmission electron microscope (TEM) images of a large subpopulation of such samples or by means of the electrostatic classification method as refined at the National Institute of Standards and Technology. The great success, in the past decade, of on-line multiangle light scattering (MALS) detection combined with size exclusion chromatography for the measurement of polymer mass and size distributions suggested, in the early 1990s, that a similar attack for particle characterization might prove useful as well. At that time, fractionation of particles was achievable by capillary hydrodynamic chromatography (CHDF) and field flow fractionation (FFF) methods. The latter has proven most useful when combined with MALS to provide accurate differential number fraction size distributions for a broad range of particle classes. The MALS/FFF combination provides unique advantages and precision relative to FFF, photon correlation spectroscopy, and CHDF techniques used alone. For many classes of particles, resolution of the MALS/FFF combination far exceeds that of TEM measurements. Copyright 1998 Academic Press.
The effects of sample size on population genomic analyses--implications for the tests of neutrality.
Subramanian, Sankar
2016-02-20
One of the fundamental measures of molecular genetic variation is Watterson's estimator (θ), which is based on the number of segregating sites. The estimation of θ is unbiased only under neutrality and constant population growth. It is well known that the estimation of θ is biased when these assumptions are violated. However, the effect of sample size in modulating the bias was not well appreciated. We examined this issue in detail based on large-scale exome data and robust simulations. Our investigation revealed that sample size appreciably influences θ estimation and this effect was much higher for constrained genomic regions than for neutral regions. For instance, θ estimated for synonymous sites using 512 human exomes was 1.9 times higher than that obtained using 16 exomes. However, this difference was 2.5 times for the nonsynonymous sites of the same data. We observed a positive correlation between the rate of increase in θ estimates (with respect to the sample size) and the magnitude of selection pressure. For example, θ estimated for the nonsynonymous sites of highly constrained genes (dN/dS < 0.1) using 512 exomes was 3.6 times higher than that estimated using 16 exomes. In contrast, this difference was only 2 times for the less constrained genes (dN/dS > 0.9). The results of this study reveal the extent of underestimation owing to small sample sizes and thus emphasize the importance of sample size in estimating a number of population genomic parameters. Our results have serious implications for neutrality tests such as Tajima D, Fu-Li D and those based on the McDonald and Kreitman test: Neutrality Index and the fraction of adaptive substitutions. For instance, use of 16 exomes produced a 2.4 times higher proportion of adaptive substitutions compared to that obtained using 512 exomes (24% vs 10%).
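For readers unfamiliar with the estimator discussed above, the sketch below shows the standard definition of Watterson's θ: the number of segregating sites S divided by the harmonic-number term a_n = 1 + 1/2 + ... + 1/(n - 1) for a sample of n sequences. The function names and the example counts are assumptions for illustration; they are not data from the study.

```python
def harmonic_term(n):
    """a_n = sum of 1/i for i = 1 .. n-1, where n is the number of sequences."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(segregating_sites, n):
    """Watterson's estimator: theta_W = S / a_n."""
    return segregating_sites / harmonic_term(n)


# Hypothetical example: the same count of segregating sites yields a smaller
# estimate in a larger sample, because a_n grows with n.
theta_small = watterson_theta(100, 16)
theta_large = watterson_theta(100, 512)
```

Under neutrality and constant population size, S grows in proportion to a_n and the ratio stays stable across sample sizes. The abstract's point is that when those assumptions fail, S grows faster than a_n, so the estimate rises with sample size.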
Murakami, Y; Hashimoto, S; Taniguchi, K; Nagai, M
1999-12-01
To describe the characteristics of monitoring stations for the infectious disease surveillance system in Japan, we compared the distributions of the number of monitoring stations in terms of population, region, size of medical institution, and medical specialty. The distributions of annual number of reported cases in terms of the type of diseases, the size of medical institution, and medical specialty were also compared. We conducted a nationwide survey of the pediatrics stations (16 diseases), ophthalmology stations (3 diseases) and the stations of sexually transmitted diseases (STD) (5 diseases) in Japan. In the survey, we collected the data of monitoring stations and the annual reported cases of diseases. We also collected the data on the population, served by the health center where the monitoring stations existed, from the census. First, we compared the difference between the present number of monitoring stations and the current standard established by the Ministry of Health and Welfare (MHW). Second, we compared the distribution of all medical institutions in Japan and the monitoring stations in terms of the size of the medical institution. Third, we compared the average number of annual reported cases of diseases in terms of the size of medical institution and the medical specialty. In most health centers, the number of monitoring stations achieved the current standard of MHW, while a few health centers had no monitoring station, although they had a large population. Most prefectures also achieved the current standard of MHW, but some prefectures were well below the standard. Among pediatric stations, the sampling proportion of large hospitals was higher than other categories. Among the ophthalmology stations, the sampling proportion of hospitals was higher than other categories. Among the STD stations, the sampling proportion of clinics of obstetrics and gynecology was lower than other categories. 
Except for some diseases, it made little difference in the average number of annual reported cases of diseases in terms of the type of medical institution. Among STD, there was a great difference in the average number of annual reported cases of diseases in terms of medical specialty.
The Mars Orbital Catalog of Hydrated Alteration Signatures (MOCHAS) - Initial release
NASA Astrophysics Data System (ADS)
Carter, John; OMEGA and CRISM Teams
2016-10-01
Aqueous minerals have been identified from orbit at a number of localities, and their analysis allowed refining the water story of Early Mars. They are also a main science driver when selecting current and upcoming landing sites for roving missions. Available catalogs of mineral detections exhibit a number of drawbacks such as a limited sample size (a thousand sites at most), inhomogeneous sampling of the surface and of the investigation methods, and the lack of contextual information (e.g. spatial extent, morphological context). The MOCHAS project strives to address such limitations by providing a global, detailed survey of aqueous minerals on Mars based on 10 years of data from the OMEGA and CRISM imaging spectrometers. Contextual data is provided, including deposit sizes, morphology and detailed composition when available. Sampling biases are also addressed. It will be openly distributed in GIS-ready format and will be participative. For example, it will be possible for researchers to submit requests for specific mapping of regions of interest, or add/refine mineral detections. An initial release is scheduled in Fall 2016 and will feature a two orders of magnitude increase in sample size compared to previous studies.
Graf, Alexandra C; Bauer, Peter
2011-06-30
We calculate the maximum type 1 error rate of the pre-planned conventional fixed sample size test for comparing the means of independent normal distributions (with common known variance) which can be yielded when sample size and allocation rate to the treatment arms can be modified in an interim analysis. Thereby it is assumed that the experimenter fully exploits knowledge of the unblinded interim estimates of the treatment effects in order to maximize the conditional type 1 error rate. The 'worst-case' strategies require knowledge of the unknown common treatment effect under the null hypothesis. Although this is a rather hypothetical scenario it may be approached in practice when using a standard control treatment for which precise estimates are available from historical data. The maximum inflation of the type 1 error rate is substantially larger than derived by Proschan and Hunsberger (Biometrics 1995; 51:1315-1324) for design modifications applying balanced samples before and after the interim analysis. Corresponding upper limits for the maximum type 1 error rate are calculated for a number of situations arising from practical considerations (e.g. restricting the maximum sample size, not allowing sample size to decrease, allowing only increase in the sample size in the experimental treatment). The application is discussed for a motivating example. Copyright © 2011 John Wiley & Sons, Ltd.
Melvin, Elizabeth M; Moore, Brandon R; Gilchrist, Kristin H; Grego, Sonia; Velev, Orlin D
2011-09-01
The recent development of microfluidic "lab on a chip" devices requiring sample sizes <100 μL has given rise to the need to concentrate dilute samples and trap analytes, especially for surface-based detection techniques. We demonstrate a particle collection device capable of concentrating micron-sized particles in a predetermined area by combining AC electroosmosis (ACEO) and dielectrophoresis (DEP). The planar asymmetric electrode pattern uses ACEO pumping to induce equal, quadrilateral flow directed towards a stagnant region in the center of the device. A number of system parameters affecting particle collection efficiency were investigated including electrode and gap width, chamber height, applied potential and frequency, and number of repeating electrode pairs and electrode geometry. The robustness of the on-chip collection design was evaluated against varying electrolyte concentrations, particle types, and particle sizes. These devices are amenable to integration with a variety of detection techniques such as optical evanescent waveguide sensing.
Geometrical characteristics of sandstone with different sample sizes
NASA Astrophysics Data System (ADS)
Cheon, D. S.; Takahashi, M.
2017-12-01
In many rock engineering projects, such as underground CO2 storage and engineered geothermal systems, it is important to understand fluid flow behavior under deep geological conditions. This fluid flow is generally affected by the geometrical characteristics of the rock, especially in porous media. Furthermore, the physical properties of rock may depend on the void space it contains. Total porosity and pore size distribution can be measured by mercury intrusion porosimetry, and other geometrical and spatial information on pores can be obtained through micro-focus X-ray CT. Using micro-focus X-ray CT, we obtained the extracted void space and transparent images from the original CT voxel images of samples of different sizes (1 mm, 2 mm and 3 mm cubes). The test samples are Berea sandstone and Otway sandstone. The former is a well-known sandstone and is used as the standard sample for comparison with the results from the Otway sandstone. The Otway sandstone was obtained from the CO2CRC Otway pilot site for the CO2 geosequestration project. From the X-ray scans and the ExFACT software, we obtained information including effective pore radii, coordination number, tortuosity and effective throat/pore radius ratio. The geometrical analysis showed that, for both Berea and Otway sandstone, there are few differences between sample sizes, that the total coordination number indicates high porosity, and that the tortuosity of Berea sandstone is higher than that of Otway sandstone. In the future, this information will be used to evaluate the permeability of the samples.
Chen, Qixuan; Li, Jingguang
2014-05-01
Many recent studies have examined the association between number acuity, which is the ability to rapidly and non-symbolically estimate the quantity of items appearing in a scene, and symbolic math performance. However, various contradictory results have been reported. To comprehensively evaluate the association between number acuity and symbolic math performance, we conduct a meta-analysis to synthesize the results observed in previous studies. First, a meta-analysis of cross-sectional studies (36 samples, N = 4705) revealed a significant positive correlation between these skills (r = 0.20, 95% CI = [0.14, 0.26]); the association remained after considering other potential moderators (e.g., whether general cognitive abilities were controlled). Moreover, a meta-analysis of longitudinal studies revealed 1) that number acuity may prospectively predict later math performance (r = 0.24, 95% CI = [0.11, 0.37]; 6 samples) and 2) that number acuity is retrospectively correlated to early math performance as well (r = 0.17, 95% CI = [0.07, 0.26]; 5 samples). In summary, these pieces of evidence demonstrate a moderate but statistically significant association between number acuity and math performance. Based on the estimated effect sizes, power analyses were conducted, which suggested that many previous studies were underpowered due to small sample sizes. This may account for the disparity between findings in the literature, at least in part. Finally, the theoretical and practical implications of our meta-analytic findings are presented, and future research questions are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.
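The power analysis mentioned in the abstract above can be approximated with the Fisher z-transformation. The sketch below is an illustration under stated assumptions (two-sided test, α = 0.05, target power 0.80, normal approximation that ignores the negligible opposite tail). It is not the authors' procedure, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_for_correlation(r, n, alpha=0.05):
    """Approximate power to detect correlation r with n participants."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Fisher z of r has standard error 1 / sqrt(n - 3).
    return _STD_NORMAL.cdf(math.atanh(r) * math.sqrt(n - 3) - z_crit)


def required_n(r, alpha=0.05, power=0.80):
    """Smallest n reaching the target power under the same approximation."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_crit + z_power) / math.atanh(r)) ** 2 + 3)


# Pooled cross-sectional estimate reported in the abstract: r = 0.20.
n_needed = required_n(0.20)
```

Under these assumptions, detecting r = 0.20 with 80% power takes roughly 190 to 200 participants, which gives a sense of why individual studies with small samples could reach contradictory conclusions.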
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-04
... approved information collection, the List Sampling Frame Surveys. Revision to burden hours will be needed due to changes in the size of the target population, sampling design, and/or questionnaire length... Agriculture, (202) 720-4333. SUPPLEMENTARY INFORMATION: Title: List Sampling Frame Surveys. OMB Control Number...
Big Data and Large Sample Size: A Cautionary Note on the Potential for Bias
Chambers, David A.; Glasgow, Russell E.
2014-01-01
A number of commentaries have suggested that large studies are more reliable than smaller studies and there is a growing interest in the analysis of “big data” that integrates information from many thousands of persons and/or different data sources. We consider a variety of biases that are likely in the era of big data, including sampling error, measurement error, multiple comparisons errors, aggregation error, and errors associated with the systematic exclusion of information. Using examples from epidemiology, health services research, studies on determinants of health, and clinical trials, we conclude that it is necessary to exercise greater caution to be sure that big sample size does not lead to big inferential errors. Despite the advantages of big studies, large sample size can magnify the bias associated with error resulting from sampling or study design. Clin Trans Sci 2014; Volume #: 1–5 PMID:25043853
Monitoring the impact of Bt maize on butterflies in the field: estimation of required sample sizes.
Lang, Andreas
2004-01-01
The monitoring of genetically modified organisms (GMOs) after deliberate release is important in order to assess and evaluate possible environmental effects. Concerns have been raised that the transgenic crop, Bt maize, may affect butterflies occurring in field margins. Therefore, a monitoring of butterflies was suggested accompanying the commercial cultivation of Bt maize. In this study, baseline data on the butterfly species and their abundance in maize field margins is presented together with implications for butterfly monitoring. The study was conducted in Bavaria, South Germany, between 2000-2002. A total of 33 butterfly species was recorded in field margins. A small number of species dominated the community, and butterflies observed were mostly common species. Observation duration was the most important factor influencing the monitoring results. Field margin size affected the butterfly abundance, and habitat diversity had a tendency to influence species richness. Sample size and statistical power analyses indicated that a sample size in the range of 75 to 150 field margins for treatment (transgenic maize) and control (conventional maize) would detect (power of 80%) effects larger than 15% in species richness and the butterfly abundance pooled across species. However, a much higher number of field margins must be sampled in order to achieve a higher statistical power, to detect smaller effects, and to monitor single butterfly species.
Lee, K V; Moon, R D; Burkness, E C; Hutchison, W D; Spivak, M
2010-08-01
The parasitic mite Varroa destructor Anderson & Trueman (Acari: Varroidae) is arguably the most detrimental pest of the European-derived honey bee, Apis mellifera L. Unfortunately, beekeepers lack a standardized sampling plan to make informed treatment decisions. Based on data from 31 commercial apiaries, we developed sampling plans for use by beekeepers and researchers to estimate the density of mites in individual colonies or whole apiaries. Beekeepers can estimate a colony's mite density with a chosen level of precision by dislodging mites from approximately 300 adult bees taken from one brood box frame in the colony, and they can extrapolate to mite density on a colony's adults and pupae combined by doubling the number of mites on adults. For sampling whole apiaries, beekeepers can repeat the process in each of n = 8 colonies, regardless of apiary size. Researchers desiring greater precision can estimate mite density in an individual colony by examining three 300-bee sample units. Extrapolation to density on adults and pupae may require independent estimates of numbers of adults, of pupae, and of their respective mite densities. Researchers can estimate apiary-level mite density by taking one 300-bee sample unit per colony, but should do so from a variable number of colonies, depending on apiary size. These practical sampling plans will allow beekeepers and researchers to quantify mite infestation levels and enhance understanding and management of V. destructor.
Laboratory and Airborne BRDF Analysis of Vegetation Leaves and Soil Samples
NASA Technical Reports Server (NTRS)
Georgiev, Georgi T.; Gatebe, Charles K.; Butler, James J.; King, Michael D.
2008-01-01
Laboratory-based Bidirectional Reflectance Distribution Function (BRDF) analysis of vegetation leaves, soil, and leaf litter samples is presented. The leaf litter and soil samples, numbered 1 and 2, were obtained from a site located in the savanna biome of South Africa (Skukuza: 25.0°S, 31.5°E). A third soil sample, number 3, was obtained from Etosha Pan, Namibia (19.20°S, 15.93°E, alt. 1100 m). In addition, BRDF of local fresh and dry leaves from tulip tree (Liriodendron tulipifera) and acacia tree (Acacia greggii) were studied. It is shown how the BRDF depends on the incident and scatter angles, sample size (i.e. crushed versus whole leaf), soil samples fraction size, sample status (i.e. fresh versus dry leaves), vegetation species (poplar versus acacia), and the vegetation's biochemical composition. As a demonstration of the application of the results of this study, airborne BRDF measurements acquired with NASA's Cloud Absorption Radiometer (CAR) over the same general site where the soil and leaf litter samples were obtained are compared to the laboratory results. Good agreement between laboratory and airborne measured BRDF is reported.
Thorlund, Kristian; Imberger, Georgina; Walsh, Michael; Chu, Rong; Gluud, Christian; Wetterslev, Jørn; Guyatt, Gordon; Devereaux, Philip J.; Thabane, Lehana
2011-01-01
Background Meta-analyses including a limited number of patients and events are prone to yield overestimated intervention effect estimates. While many assume bias is the cause of overestimation, theoretical considerations suggest that random error may be an equal or more frequent cause. The independent impact of random error on meta-analyzed intervention effects has not previously been explored. It has been suggested that surpassing the optimal information size (i.e., the required meta-analysis sample size) provides sufficient protection against overestimation due to random error, but this claim has not yet been validated. Methods We simulated a comprehensive array of meta-analysis scenarios where no intervention effect existed (i.e., relative risk reduction (RRR) = 0%) or where a small but possibly unimportant effect existed (RRR = 10%). We constructed different scenarios by varying the control group risk, the degree of heterogeneity, and the distribution of trial sample sizes. For each scenario, we calculated the probability of observing overestimates of RRR>20% and RRR>30% for each cumulative 500 patients and 50 events. We calculated the cumulative number of patients and events required to reduce the probability of overestimation of intervention effect to 10%, 5%, and 1%. We calculated the optimal information size for each of the simulated scenarios and explored whether meta-analyses that surpassed their optimal information size had sufficient protection against overestimation of intervention effects due to random error. Results The risk of overestimation of intervention effects was usually high when the number of patients and events was small and this risk decreased exponentially over time as the number of patients and events increased. The number of patients and events required to limit the risk of overestimation depended considerably on the underlying simulation settings. 
Surpassing the optimal information size generally provided sufficient protection against overestimation. Conclusions Random errors are a frequent cause of overestimation of intervention effects in meta-analyses. Surpassing the optimal information size will provide sufficient protection against overestimation. PMID:22028777
Jiang, Wenyu; Simon, Richard
2007-12-20
This paper first provides a critical review on some existing methods for estimating the prediction error in classifying microarray data where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate for the prediction error. Even with small samples, it does not suffer from large upward bias as the leave-one-out bootstrap and the 0.632+ bootstrap, and it does not suffer from large variability as the leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.
A normative inference approach for optimal sample sizes in decisions from experience
Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph
2015-01-01
“Decisions from experience” (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the “sampling paradigm,” which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw from in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the “optimal” sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE. PMID:26441720
A multi-stage drop-the-losers design for multi-arm clinical trials.
Wason, James; Stallard, Nigel; Bowden, Jack; Jennison, Christopher
2017-02-01
Multi-arm multi-stage trials can improve the efficiency of the drug development process when multiple new treatments are available for testing. A group-sequential approach can be used in order to design multi-arm multi-stage trials, using an extension to Dunnett's multiple-testing procedure. The actual sample size used in such a trial is a random variable that has high variability. This can cause problems when applying for funding as the cost will also be generally highly variable. This motivates a type of design that provides the efficiency advantages of a group-sequential multi-arm multi-stage design, but has a fixed sample size. One such design is the two-stage drop-the-losers design, in which a number of experimental treatments, and a control treatment, are assessed at a prescheduled interim analysis. The best-performing experimental treatment and the control treatment then continue to a second stage. In this paper, we discuss extending this design to have more than two stages, which is shown to considerably reduce the sample size required. We also compare the resulting sample size requirements to the sample size distribution of analogous group-sequential multi-arm multi-stage designs. The sample size required for a multi-stage drop-the-losers design is usually higher than, but close to, the median sample size of a group-sequential multi-arm multi-stage trial. In many practical scenarios, the disadvantage of a slight loss in average efficiency would be overcome by the huge advantage of a fixed sample size. We assess the impact of delay between recruitment and assessment as well as unknown variance on the drop-the-losers designs.
Meta-analysis of multiple outcomes: a multilevel approach.
Van den Noortgate, Wim; López-López, José Antonio; Marín-Martínez, Fulgencio; Sánchez-Meca, Julio
2015-12-01
In meta-analysis, dependent effect sizes are very common. An example is where in one or more studies the effect of an intervention is evaluated on multiple outcome variables for the same sample of participants. In this paper, we evaluate a three-level meta-analytic model to account for this kind of dependence, extending the simulation results of Van den Noortgate, López-López, Marín-Martínez, and Sánchez-Meca Behavior Research Methods, 45, 576-594 (2013) by allowing for a variation in the number of effect sizes per study, in the between-study variance, in the correlations between pairs of outcomes, and in the sample size of the studies. At the same time, we explore the performance of the approach if the outcomes used in a study can be regarded as a random sample from a population of outcomes. We conclude that although this approach is relatively simple and does not require prior estimates of the sampling covariances between effect sizes, it gives appropriate mean effect size estimates, standard error estimates, and confidence interval coverage proportions in a variety of realistic situations.
Long-term effective population size dynamics of an intensively monitored vertebrate population
Mueller, A-K; Chakarov, N; Krüger, O; Hoffman, J I
2016-01-01
Long-term genetic data from intensively monitored natural populations are important for understanding how effective population sizes (Ne) can vary over time. We therefore genotyped 1622 common buzzard (Buteo buteo) chicks sampled over 12 consecutive years (2002–2013 inclusive) at 15 microsatellite loci. This data set allowed us to both compare single-sample with temporal approaches and explore temporal patterns in the effective number of parents that produced each cohort in relation to the observed population dynamics. We found reasonable consistency between linkage disequilibrium-based single-sample and temporal estimators, particularly during the latter half of the study, but no clear relationship between annual Ne estimates and census sizes. We also documented a 14-fold increase in estimated Ne between 2008 and 2011, a period during which the census size doubled, probably reflecting a combination of higher adult survival and immigration from further afield. Our study thus reveals appreciable temporal heterogeneity in the effective population size of a natural vertebrate population, confirms the need for long-term studies and cautions against drawing conclusions from a single sample. PMID:27553455
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hero, Alfred O.; Rajaratnam, Bala
When can reliable inference be drawn in the "Big Data" context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining
Hero, Alfred O.; Rajaratnam, Bala
2015-01-01
When can reliable inference be drawn in the “Big Data” context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data”. Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks. PMID:27087700
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining
Hero, Alfred O.; Rajaratnam, Bala
2015-12-09
When can reliable inference be drawn in the "Big Data" context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
Song, Rui; Kosorok, Michael R.; Cai, Jianwen
2009-01-01
Summary: Recurrent events data are frequently encountered in clinical trials. This article develops robust covariate-adjusted log-rank statistics applied to recurrent events data with arbitrary numbers of events under independent censoring and the corresponding sample size formula. The proposed log-rank tests are robust with respect to different data-generating processes and are adjusted for predictive covariates. The approach reduces to the Kong and Slud (1997, Biometrika 84, 847–862) setting in the case of a single event. The sample size formula is derived based on the asymptotic normality of the covariate-adjusted log-rank statistics under certain local alternatives and a working model for baseline covariates in the recurrent event data context. When the effect size is small and the baseline covariates do not contain significant information about event times, it reduces to the same form as that of Schoenfeld (1983, Biometrics 39, 499–503) for cases of a single event or independent event times within a subject. We carry out simulations to study the control of type I error and the comparison of powers between several methods in finite samples. The proposed sample size formula is illustrated using data from an rhDNase study. PMID:18162107
A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.
Yu, Qingzhao; Zhu, Lin; Zhu, Han
2017-11-01
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently attribute newly recruited patients to different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence on design by changing the prior distributions. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when total sample size is fixed, the proposed design can obtain greater power and/or require a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce required sample size. Copyright © 2017 John Wiley & Sons, Ltd.
Estimating individual glomerular volume in the human kidney: clinical perspectives.
Puelles, Victor G; Zimanyi, Monika A; Samuel, Terence; Hughson, Michael D; Douglas-Denton, Rebecca N; Bertram, John F; Armitage, James A
2012-05-01
Measurement of individual glomerular volumes (IGV) has allowed the identification of drivers of glomerular hypertrophy in subjects without overt renal pathology. This study aims to highlight the relevance of IGV measurements with possible clinical implications and determine how many profiles must be measured in order to achieve stable size distribution estimates. We re-analysed 2250 IGV estimates obtained using the disector/Cavalieri method in 41 African and 34 Caucasian Americans. Pooled IGV analysis of mean and variance was conducted. Monte-Carlo (Jackknife) simulations determined the effect of the number of sampled glomeruli on mean IGV. Lin's concordance coefficient (R(C)), coefficient of variation (CV) and coefficient of error (CE) measured reliability. IGV mean and variance increased with overweight and hypertensive status. Superficial glomeruli were significantly smaller than juxtamedullary glomeruli in all subjects (P < 0.01), by race (P < 0.05) and in obese individuals (P < 0.01). Subjects with multiple chronic kidney disease (CKD) comorbidities showed significant increases in IGV mean and variability. Overall, mean IGV was particularly reliable with nine or more sampled glomeruli (R(C) > 0.95, <5% difference in CV and CE). These observations were not affected by a reduced sample size and did not disrupt the inverse linear correlation between mean IGV and estimated total glomerular number. Multiple comorbidities for CKD are associated with increased IGV mean and variance within subjects, including overweight, obesity and hypertension. Zonal selection and the number of sampled glomeruli do not represent drawbacks for future longitudinal biopsy-based studies of glomerular size and distribution.
Bayesian sample size calculations in phase II clinical trials using a mixture of informative priors.
Gajewski, Byron J; Mayo, Matthew S
2006-08-15
A number of researchers have discussed phase II clinical trials from a Bayesian perspective. A recent article by Mayo and Gajewski focuses on sample size calculations, which they determine by specifying an informative prior distribution and then calculating a posterior probability that the true response will exceed a prespecified target. In this article, we extend these sample size calculations to include a mixture of informative prior distributions. The mixture comes from several sources of information. For example, consider information from two (or more) clinicians. The first clinician is pessimistic about the drug and the second clinician is optimistic. We tabulate the results for sample size design using the fact that the simple mixture of Betas is a conjugate family for the Beta-Binomial model. We discuss the theoretical framework for these types of Bayesian designs and show that the Bayesian designs in this paper approximate this theoretical framework. Copyright 2006 John Wiley & Sons, Ltd.
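The conjugacy mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the two example priors Beta(2, 8) and Beta(8, 2), the equal prior weights, the trial data (12 responses in 20 patients) and the target rate 0.4 are all hypothetical choices made for illustration. A mixture of Beta priors updated with binomial data stays a mixture of Betas, with each weight rescaled by that component's marginal likelihood.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def update_mixture(prior, successes, n):
    """Update a mixture of Beta priors with binomial data.

    prior is a list of (weight, a, b) tuples; the posterior is returned
    in the same form. Each weight is multiplied by the component's
    marginal likelihood B(a + x, b + n - x) / B(a, b) and renormalised.
    """
    failures = n - successes
    log_terms = [
        math.log(w) + log_beta(a + successes, b + failures) - log_beta(a, b)
        for w, a, b in prior
    ]
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - top) for t in log_terms]
    total = sum(unnorm)
    return [
        (u / total, a + successes, b + failures)
        for u, (_, a, b) in zip(unnorm, prior)
    ]

def beta_tail(a, b, p0, steps=20000):
    """P(theta > p0) under Beta(a, b), by the midpoint rule (assumes a, b >= 1)."""
    h = (1.0 - p0) / steps
    lb = log_beta(a, b)
    total = 0.0
    for i in range(steps):
        t = p0 + (i + 0.5) * h
        total += math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - lb)
    return total * h

def prob_exceeds(mixture, p0):
    """Posterior probability that the response rate exceeds the target p0."""
    return sum(w * beta_tail(a, b, p0) for w, a, b in mixture)

# Hypothetical priors: a pessimistic and an optimistic clinician, equally weighted.
prior = [(0.5, 2.0, 8.0), (0.5, 8.0, 2.0)]
posterior = update_mixture(prior, successes=12, n=20)
print(posterior)
print(prob_exceeds(posterior, p0=0.4))
```

In a sample size search one would presumably repeat this calculation over candidate values of n and possible outcomes; that outer loop is left out here because the abstract does not specify the decision rule.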
Sample size calculation in economic evaluations.
Al, M J; van Hout, B A; Michel, B C; Rutten, F F
1998-06-01
A simulation method is presented for sample size calculation in economic evaluations. As input the method requires: the expected difference and variance of costs and effects, their correlation, the significance level (alpha) and the power of the testing method and the maximum acceptable ratio of incremental effectiveness to incremental costs. The method is illustrated with data from two trials. The first compares primary coronary angioplasty with streptokinase in the treatment of acute myocardial infarction, in the second trial, lansoprazole is compared with omeprazole in the treatment of reflux oesophagitis. These case studies show how the various parameters influence the sample size. Given the large number of parameters that have to be specified in advance, the lack of knowledge about costs and their standard deviation, and the difficulty of specifying the maximum acceptable ratio of incremental effectiveness to incremental costs, the conclusion of the study is that from a technical point of view it is possible to perform a sample size calculation for an economic evaluation, but one should wonder how useful it is.
Filleron, Thomas; Gal, Jocelyn; Kramar, Andrew
2012-10-01
A major and difficult task is the design of clinical trials with a time to event endpoint. In fact, it is necessary to compute the number of events and in a second step the required number of patients. Several commercial software packages are available for computing sample size in clinical trials with sequential designs and time to event endpoints, but few R functions have been implemented. The purpose of this paper is to describe the features and use of the R function plansurvct.func, an add-on function to the package gsDesign that calculates, in one run of the program, the number of events and the required sample size, as well as the boundaries and corresponding p-values for a group sequential design. The use of the function plansurvct.func is illustrated by several examples and validated using East software. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Johnston, Lisa G; Hakim, Avi J; Dittrich, Samantha; Burnett, Janet; Kim, Evelyn; White, Richard G
2016-08-01
Reporting key details of respondent-driven sampling (RDS) survey implementation and analysis is essential for assessing the quality of RDS surveys. RDS is both a recruitment and analytic method and, as such, it is important to adequately describe both aspects in publications. We extracted data from peer-reviewed literature published through September, 2013 that reported collected biological specimens using RDS. We identified 151 eligible peer-reviewed articles describing 222 surveys conducted in seven regions throughout the world. Most published surveys reported basic implementation information such as survey city, country, year, population sampled, interview method, and final sample size. However, many surveys did not report essential methodological and analytical information for assessing RDS survey quality, including number of recruitment sites, seeds at start and end, maximum number of waves, and whether data were adjusted for network size. Understanding the quality of data collection and analysis in RDS is useful for effectively planning public health service delivery and funding priorities.
Sediment quantity and quality in three impoundments in Massachusetts
Zimmerman, Marc James; Breault, Robert F.
2003-01-01
As part of a study with an overriding goal of providing information that would assist State and Federal agencies in developing screening protocols for managing sediments impounded behind dams that are potential candidates for removal, the U.S. Geological Survey determined sediment quantity and quality at three locations: one on the French River and two on Yokum Brook, a tributary to the west branch of the Westfield River. Data collected with a global positioning system, a geographic information system, and sediment-thickness data aided in the creation of sediment maps and the calculation of sediment volumes at Perryville Pond on the French River in Webster, Massachusetts, and at the Silk Mill and Ballou Dams on Yokum Brook in Becket, Massachusetts. From these data the following sediment volumes were determined: Perryville Pond, 71,000 cubic yards; Silk Mill, 1,600 cubic yards; and Ballou, 800 cubic yards. Sediment characteristics were assessed in terms of grain size and concentrations of potentially hazardous organic compounds and metals. Assessment of the approaches and methods used at study sites indicated that ground-penetrating radar produced data that were extremely difficult and time-consuming to interpret for the three study sites. Because of these difficulties, a steel probe was ultimately used to determine sediment depth and extent for inclusion in the sediment maps. Use of these methods showed that, where sampling sites were accessible, a machine-driven coring device would be preferable to the physically exhausting, manual sediment-coring methods used in this investigation. Enzyme-linked immunosorbent assays were an effective tool for screening large numbers of samples for a range of organic contaminant compounds. 
An example calculation of the number of samples needed to characterize mean concentrations of contaminants indicated that the number of samples collected for most analytes was adequate; however, additional analyses for lead, copper, silver, arsenic, total petroleum hydrocarbons, and chlordane are needed to meet the criteria determined from the calculations. Particle-size analysis did not reveal a clear spatial distribution pattern at Perryville Pond. On average, less than 65 percent of each sample was greater in size than very fine sand. The sample with the highest percentage of clay-sized particles (24.3 percent) was collected just upstream from the dam and generally had the highest concentrations of contaminants determined here. In contrast, more than 90 percent of the sediment samples in the Becket impoundments had grain sizes larger than very fine sand; as determined by direct observation, rocks, cobbles, and boulders constituted a substantial amount of the material impounded at Becket. In general, the highest percentages of the finest particles, clays, occurred in association with the highest concentrations of contaminants. Enzyme-linked immunosorbent assays of the Perryville samples showed the widespread presence of petroleum hydrocarbons (16 out of 26 samples), polycyclic aromatic hydrocarbons (23 out of 26 samples), and chlordane (18 out of 26 samples); polychlorinated biphenyls were detected in five samples from four locations. Neither petroleum hydrocarbons nor polychlorinated biphenyls were detected at Becket, and chlordane was detected in only one sample. All 14 Becket samples contained polycyclic aromatic hydrocarbons. Replicate quality-control analyses revealed consistent results between paired samples. Samples from throughout Perryville Pond contained a number of metals at potentially toxic concentrations. These metals included arsenic, cadmium, copper, lead, nickel, and zinc. At Becket, no metals were found in elevated concentrations. 
In general, most of the concentrations of organic compounds and metals detected in Perryville Pond exceeded standards for benthic organisms, but only rarely exceeded standards for human contact. The most highly contaminated samples were
Horowitz, A.J.; Lum, K.R.; Garbarino, J.R.; Hall, G.E.M.; Lemieux, C.; Demas, C.R.
1996-01-01
Field and laboratory experiments indicate that a number of factors associated with filtration other than just pore size (e.g., diameter, manufacturer, volume of sample processed, amount of suspended sediment in the sample) can produce significant variations in the 'dissolved' concentrations of such elements as Fe, Al, Cu, Zn, Pb, Co, and Ni. The bulk of these variations result from the inclusion/exclusion of colloidally associated trace elements in the filtrate, although dilution and sorption/desorption from filters also may be factors. Thus, dissolved trace element concentrations quantitated by analyzing filtrates generated by processing whole water through similar pore-sized filters may not be equal or comparable. As such, simple filtration of unspecified volumes of natural water through unspecified 0.45-µm membrane filters may no longer represent an acceptable operational definition for a number of dissolved chemical constituents.
Overall, John E; Tonidandel, Scott; Starbuck, Robert R
2006-01-01
Recent contributions to the statistical literature have provided elegant model-based solutions to the problem of estimating sample sizes for testing the significance of differences in mean rates of change across repeated measures in controlled longitudinal studies with differentially correlated error and missing data due to dropouts. However, the mathematical complexity and model specificity of these solutions make them generally inaccessible to most applied researchers who actually design and undertake treatment evaluation research in psychiatry. In contrast, this article relies on a simple two-stage analysis in which dropout-weighted slope coefficients fitted to the available repeated measurements for each subject separately serve as the dependent variable for a familiar ANCOVA test of significance for differences in mean rates of change. This article is about how a sample size that is estimated or calculated to provide desired power for testing that hypothesis without considering dropouts can be adjusted appropriately to take dropouts into account. Empirical results support the conclusion that, whatever reasonable level of power would be provided by a given sample size in the absence of dropouts, essentially the same power can be realized in the presence of dropouts simply by adding to the original dropout-free sample size the number of subjects who would be expected to drop from a sample of that original size under conditions of the proposed study.
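The adjustment rule stated at the end of this abstract amounts to simple arithmetic, sketched below. The function name, the rounding up to a whole subject, and the example figures (80 subjects, 25% expected dropout) are assumptions for illustration and do not come from the article, which supports the rule only for its own two-stage analysis.

```python
import math

def adjust_for_dropouts(n_no_dropout, dropout_rate):
    """Add the number of subjects expected to drop out of a sample of the
    original (dropout-free) size to that original size.

    n_no_dropout: sample size computed without considering dropouts.
    dropout_rate: expected proportion of subjects who drop out (0 to 1).
    Rounding up is an assumption made here, not a rule from the article.
    """
    expected_dropouts = math.ceil(n_no_dropout * dropout_rate)
    return n_no_dropout + expected_dropouts

# Hypothetical example: 80 subjects needed without dropouts, 25% expected to drop.
adjusted_n = adjust_for_dropouts(80, 0.25)
print(adjusted_n)  # 100
```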
Reproductive traits of the small Patagonian octopus Octopus tehuelchus
NASA Astrophysics Data System (ADS)
Storero, Lorena P.; Narvarte, Maite A.; González, Raúl A.
2012-12-01
This study evaluated the reproductive features of Octopus tehuelchus in three coastal environments of San Matías Gulf (Patagonia). Monthly samples of O. tehuelchus were used to estimate size at maturity, compare seasonal changes in oocyte size frequency distributions between sites as well as oocyte number and size between female maturity stage and sites. Females in Islote Lobos had a smaller size at maturity than females in San Antonio Bay and El Fuerte, probably as a consequence of a generally smaller body size. Males in San Antonio Bay were smaller at maturity than females. O. tehuelchus is a simultaneous terminal spawner. Fecundity (expressed as number of vitellogenic oocytes in ovary) was lower in Islote Lobos, and an increase in oocyte number in relation to female total weight was found. Females in San Antonio Bay had the largest oocytes, which may indicate higher energy reserves for the embryo and therefore higher juvenile survival. There was a close relationship between reproduction, growth and condition, represented as size at maturity, number and size of vitellogenic oocytes and period of maturity and spawning. Given the local variation in some reproductive features of O. tehuelchus, studies should focus on the environmental factors, which bring about this variation, and on how it affects the dynamics of local populations.
Sizing for the apparel industry using statistical analysis - a Brazilian case study
NASA Astrophysics Data System (ADS)
Capelassi, C. H.; Carvalho, M. A.; El Kattel, C.; Xu, B.
2017-10-01
The study of the body measurements of Brazilian women used the Kinect Body Imaging system for 3D body scanning. The study aims to meet the needs of the apparel industry for accurate measurements. Data were statistically treated using the IBM SPSS 23 system, with 95% confidence (P<0.05) for the inferential analysis, with the purpose of grouping the measurements into sizes, so that a smaller number of sizes can cover a greater number of people. The sample consisted of 101 volunteers aged between 19 and 62 years. A cluster analysis was performed to identify the main body shapes of the sample. The results were divided between the top and bottom body portions: for the top portion, the measurements of the abdomen, waist and bust circumferences, as well as the height, were used; for the bottom portion, the measurements of the hip circumference and the height were used. Three sizing systems were developed for the researched sample from the Abdomen-to-Height Ratio - AHR (top portion): Small (AHR < 0.52), Medium (AHR: 0.52-0.58), Large (AHR > 0.58) and from the Hip-to-Height Ratio - HHR (bottom portion): Small (HHR < 0.62), Medium (HHR: 0.62-0.68), Large (HHR > 0.68).
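The thresholds reported above can be applied as a simple lookup, sketched here. The function names are hypothetical, and treating both Medium boundaries as inclusive is an assumption read from the stated ranges; the abstract does not say how measurements are rounded.

```python
def size_from_ratio(ratio, low, high):
    """Map a circumference-to-height ratio to Small, Medium or Large.

    Values from low to high (inclusive, by assumption) count as Medium.
    """
    if ratio < low:
        return "Small"
    if ratio <= high:
        return "Medium"
    return "Large"

def top_size(abdomen_cm, height_cm):
    """Top-portion size from the Abdomen-to-Height Ratio (AHR)."""
    return size_from_ratio(abdomen_cm / height_cm, 0.52, 0.58)

def bottom_size(hip_cm, height_cm):
    """Bottom-portion size from the Hip-to-Height Ratio (HHR)."""
    return size_from_ratio(hip_cm / height_cm, 0.62, 0.68)

# Hypothetical measurements in centimetres.
print(top_size(88, 160))      # AHR = 0.55
print(bottom_size(104, 160))  # HHR = 0.65
```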
USDA-ARS's Scientific Manuscript database
Small, coded, pill-sized tracers embedded in grain are proposed as a method for grain traceability. A sampling process for a grain traceability system was designed and investigated by applying probability statistics using a science-based sampling approach to collect an adequate number of tracers fo...
Labra, Fabio A; Hernández-Miranda, Eduardo; Quiñones, Renato A
2015-01-01
We study the temporal variation in the empirical relationships among body size (S), species richness (R), and abundance (A) in a shallow marine epibenthic faunal community in Coliumo Bay, Chile. We also extend previous analyses by calculating individual energy use (E) and test whether its bivariate and trivariate relationships with S and R are in agreement with expectations derived from the energetic equivalence rule. Carnivorous and scavenger species representing over 95% of sample abundance and biomass were studied. For each individual, body size (g) was measured and E was estimated following published allometric relationships. Data for each sample were tabulated into exponential body size bins, comparing species-averaged values with individual-based estimates which allow species to potentially occupy multiple size classes. For individual-based data, both the number of individuals and species across body size classes are fit by a Weibull function rather than by a power law scaling. Species richness is also a power law of the number of individuals. Energy use shows a piecewise scaling relationship with body size, with energetic equivalence holding true only for size classes above the modal abundance class. Species-based data showed either weak linear or no significant patterns, likely due to the decrease in the number of data points across body size classes. Hence, for individual-based size spectra, the SRA relationship seems to be general despite seasonal forcing and strong disturbances in Coliumo Bay. The unimodal abundance distribution results in a piecewise energy scaling relationship, with small individuals showing a positive scaling and large individuals showing energetic equivalence. Hence, strict energetic equivalence should not be expected for unimodal abundance distributions. 
On the other hand, while species-based data do not show unimodal SRA relationships, energy use across body size classes did not show significant trends, supporting energetic equivalence. PMID:25691966
NASA Astrophysics Data System (ADS)
Lai, Xiaoming; Zhu, Qing; Zhou, Zhiwen; Liao, Kaihua
2017-12-01
In this study, seven random combination sampling strategies were applied to investigate the uncertainties in estimating the hillslope mean soil water content (SWC) and correlation coefficients between the SWC and soil/terrain properties on a tea + bamboo hillslope. One of the sampling strategies is global random sampling and the other six are stratified random sampling on the top, middle, toe, top + mid, top + toe and mid + toe slope positions. When each sampling strategy was applied, sample sizes were gradually reduced and each sample size contained 3000 replicates. Under each sample size of each sampling strategy, the relative errors (REs) and coefficients of variation (CVs) of the estimated hillslope mean SWC and correlation coefficients between the SWC and soil/terrain properties were calculated to quantify the accuracy and uncertainty. The results showed that the uncertainty of the estimations decreased as the sample size increased. However, larger sample sizes were required to reduce the uncertainty in correlation coefficient estimation than in hillslope mean SWC estimation. Under global random sampling, 12 randomly sampled sites on this hillslope were adequate to estimate the hillslope mean SWC with RE and CV ≤10%. However, at least 72 randomly sampled sites were needed to estimate the correlation coefficients with REs and CVs ≤10%. Comparing all sampling strategies, reducing sampling sites on the middle slope had the least influence on the estimation of hillslope mean SWC and correlation coefficients. Under this strategy, 60 sites (10 on the middle slope and 50 on the top and toe slopes) were enough to estimate the correlation coefficients with REs and CVs ≤10%. This suggested that when designing the SWC sampling, the proportion of sites on the middle slope can be reduced to 16.7% of the total number of sites. Findings of this study will be useful for the optimal SWC sampling design.
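The resampling idea in this abstract (repeated random subsamples at each sample size, summarised by RE and CV) can be sketched as follows for global random sampling of the mean only. Everything concrete here is an assumption: the SWC values are synthetic, the function name is hypothetical, and the exact RE and CV definitions (mean absolute relative error of the replicate estimates, and their standard deviation divided by their mean) are one plausible reading rather than the authors' stated formulas.

```python
import random
import statistics

def subsample_uncertainty(values, sample_size, replicates=3000, seed=0):
    """Return (RE, CV) in percent for the mean estimated from random subsamples.

    RE: mean absolute relative error of the replicate means against the
        mean of all sites (assumed definition).
    CV: standard deviation of the replicate means divided by their mean
        (assumed definition).
    """
    rng = random.Random(seed)
    true_mean = statistics.fmean(values)
    estimates = [
        statistics.fmean(rng.sample(values, sample_size))
        for _ in range(replicates)
    ]
    rel_errors = [abs(e - true_mean) / true_mean for e in estimates]
    rel_err = 100 * statistics.fmean(rel_errors)
    cv = 100 * statistics.stdev(estimates) / statistics.fmean(estimates)
    return rel_err, cv

# Synthetic soil water content at 100 hypothetical sites (not the study's data).
data_rng = random.Random(1)
swc = [data_rng.gauss(0.25, 0.05) for _ in range(100)]

results = {n: subsample_uncertainty(swc, n) for n in (5, 12, 30, 60)}
for n, (rel_err, cv) in results.items():
    print(f"n={n:3d}  RE={rel_err:5.2f}%  CV={cv:5.2f}%")
```

A stratified version would draw a fixed number of sites from each slope position instead of sampling from the pooled list.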
Statistical power analysis in wildlife research
Steidl, R.J.; Hayes, J.P.
1997-01-01
Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to 'accept' a null hypothesis as true. 
We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
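The point about retrospective power computed from the observed effect size can be illustrated with a small sketch. It assumes a simplified setting that is not taken from the paper: a one-sided z-test with known variance and no bias adjustment, and the function name `observed_power_one_sided` is hypothetical. In that setting, any test that was not rejected has an observed test statistic below the critical value, so the "observed power" is necessarily below 0.50.

```python
from statistics import NormalDist

def observed_power_one_sided(z_obs, alpha=0.05):
    """Power of a one-sided z-test if the true effect equalled the observed one."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Under the assumed effect, the test statistic is Normal(z_obs, 1),
    # so the chance of exceeding z_crit is Phi(z_obs - z_crit).
    return NormalDist().cdf(z_obs - z_crit)

z_crit = NormalDist().inv_cdf(0.95)
for z_obs in (0.0, 0.5, 1.0, 1.5):  # all below z_crit, i.e. not rejected
    print(f"z_obs={z_obs:.1f}  observed power={observed_power_one_sided(z_obs):.3f}")
```

Because the observed power is a monotone function of the observed statistic, it carries no information beyond the p-value, which is why the abstract recommends confidence intervals instead.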
Extension of latin hypercube samples with correlated variables.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hora, Stephen Curtis; Helton, Jon Craig; Sallaberry, Cedric J. PhD.
2006-11-01
A procedure for extending the size of a Latin hypercube sample (LHS) with rank correlated variables is described and illustrated. The extension procedure starts with an LHS of size m and associated rank correlation matrix C and constructs a new LHS of size 2m that contains the elements of the original LHS and has a rank correlation matrix that is close to the original rank correlation matrix C. The procedure is intended for use in conjunction with uncertainty and sensitivity analysis of computationally demanding models in which it is important to make efficient use of a necessarily limited number of model evaluations.
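The doubling step alone can be sketched as below. This is a simplified illustration and deliberately omits the part of the procedure that keeps the rank correlation matrix close to C: new values are paired across variables at random. It also assumes variables uniform on [0, 1); the function names `latin_hypercube` and `extend_lhs` are hypothetical. The idea shown is that each original stratum of width 1/m splits into two halves, one of which is already occupied, so the m new points are placed in the m unoccupied halves.

```python
import random

def latin_hypercube(m, dims, rng):
    """LHS of size m on [0,1)^dims: one point per stratum of width 1/m per dimension."""
    columns = []
    for _ in range(dims):
        strata = list(range(m))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / m for s in strata])
    return [tuple(col[i] for col in columns) for i in range(m)]

def extend_lhs(sample, rng):
    """Add m points so that the 2m points form an LHS with strata of width 1/(2m)."""
    m = len(sample)
    dims = len(sample[0])
    new_columns = []
    for d in range(dims):
        # Half-strata already occupied by the original points in this dimension.
        occupied = {int(point[d] * 2 * m) for point in sample}
        free = [s for s in range(2 * m) if s not in occupied]
        rng.shuffle(free)  # random pairing; no rank-correlation control here
        new_columns.append([(s + rng.random()) / (2 * m) for s in free])
    new_points = [tuple(col[i] for col in new_columns) for i in range(m)]
    return sample + new_points

rng = random.Random(1)
original = latin_hypercube(5, 2, rng)
extended = extend_lhs(original, rng)
print(len(original), len(extended))
```

A fuller implementation would choose the pairing of new values across variables so that the rank correlation of the combined sample stays close to the target matrix, as the abstract describes.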
Size separation of analytes using monomeric surfactants
Yeung, Edward S.; Wei, Wei
2005-04-12
A sieving medium for use in the separation of analytes in a sample containing at least one such analyte comprises a monomeric non-ionic surfactant of the general formula, B-A, wherein A is a hydrophilic moiety and B is a hydrophobic moiety, present in a solvent at a concentration forming a self-assembled micelle configuration under selected conditions and having an aggregation number providing an equivalent weight capable of effecting the size separation of the sample solution so as to resolve a target analyte(s) in a solution containing the same, the size separation taking place in a chromatography or electrophoresis separation system.
A Size Exclusion Chromatography Laboratory with Unknowns for Introductory Students
ERIC Educational Resources Information Center
McIntee, Edward J.; Graham, Kate J.; Colosky, Edward C.; Jakubowski, Henry V.
2015-01-01
Size exclusion chromatography is an important technique in the separation of biological and polymeric samples by molecular weight. While a number of laboratory experiments have been published that use this technique for the purification of large molecules, this is the first report of an experiment that focuses on purifying an unknown small…
Can we estimate molluscan abundance and biomass on the continental shelf?
NASA Astrophysics Data System (ADS)
Powell, Eric N.; Mann, Roger; Ashton-Alcox, Kathryn A.; Kuykendall, Kelsey M.; Chase Long, M.
2017-11-01
Few empirical studies have focused on the effect of sample density on the estimate of abundance of the dominant carbonate-producing fauna of the continental shelf. Here, we present such a study and consider the implications of suboptimal sampling design on estimates of abundance and size-frequency distribution. We focus on a principal carbonate producer of the U.S. Atlantic continental shelf, the Atlantic surfclam, Spisula solidissima. To evaluate the degree to which the results are typical, we analyze a dataset for the principal carbonate producer of Mid-Atlantic estuaries, the Eastern oyster Crassostrea virginica, obtained from Delaware Bay. These two species occupy different habitats and display different lifestyles, yet demonstrate similar challenges to survey design and similar trends with sampling density. The median of a series of simulated survey mean abundances, the central tendency obtained over a large number of surveys of the same area, always underestimated true abundance at low sample densities. More dramatic were the trends in the probability of a biased outcome. As sample density declined, the probability of a survey availability event, defined as a survey yielding indices >125% or <75% of the true population abundance, increased and that increase was disproportionately biased towards underestimates. For these cases where a single sample accessed about 0.001-0.004% of the domain, 8-15 random samples were required to reduce the probability of a survey availability event below 40%. The problem of differential bias, in which the probabilities of a biased-high and a biased-low survey index were distinctly unequal, was resolved with fewer samples than the problem of overall bias. These trends suggest that the influence of sampling density on survey design comes with a series of incremental challenges. At woefully inadequate sampling density, the probability of a biased-low survey index will substantially exceed the probability of a biased-high index. 
The survey time series on the average will return an estimate of the stock that underestimates true stock abundance. If sampling intensity is increased, the frequency of biased indices balances between high and low values. Incrementing sample number from this point steadily reduces the likelihood of a biased survey; however, the number of samples necessary to drive the probability of survey availability events to a preferred level of infrequency may be daunting. Moreover, certain size classes will be disproportionately susceptible to such events and the impact on size frequency will be species specific, depending on the relative dispersion of the size classes.
Fischer, Jesse R.; Quist, Michael C.
2014-01-01
All freshwater fish sampling methods are biased toward particular species, sizes, and sexes and are further influenced by season, habitat, and fish behavior changes over time. However, little is known about gear-specific biases for many common fish species because few multiple-gear comparison studies exist that have incorporated seasonal dynamics. We sampled six lakes and impoundments representing a diversity of trophic and physical conditions in Iowa, USA, using multiple gear types (i.e., standard modified fyke net, mini-modified fyke net, sinking experimental gill net, bag seine, benthic trawl, boat-mounted electrofisher used diurnally and nocturnally) to determine the influence of sampling methodology and season on fisheries assessments. Specifically, we describe the influence of season on catch per unit effort, proportional size distribution, and the number of samples required to obtain 125 stock-length individuals for 12 species of recreational and ecological importance. Mean catch per unit effort generally peaked in the spring and fall as a result of increased sampling effectiveness in shallow areas and seasonal changes in habitat use (e.g., movement offshore during summer). Mean proportional size distribution decreased from spring to fall for white bass Morone chrysops, largemouth bass Micropterus salmoides, bluegill Lepomis macrochirus, and black crappie Pomoxis nigromaculatus, suggesting selectivity for large and presumably sexually mature individuals in the spring and summer. Overall, the mean number of samples required to sample 125 stock-length individuals was minimized in the fall with sinking experimental gill nets, a boat-mounted electrofisher used at night, and standard modified nets for 11 of the 12 species evaluated. 
Our results provide fisheries scientists with relative comparisons between several recommended standard sampling methods and illustrate the effects of seasonal variation on estimates of population indices that will be critical to the future development of standardized sampling methods for freshwater fish in lentic ecosystems.
Shoukri, Mohamed M; Elkum, Nasser; Walter, Stephen D
2006-01-01
Background: In this paper we propose the use of the within-subject coefficient of variation as an index of a measurement's reliability. For continuous variables, and based on its maximum likelihood estimation, we derive a variance-stabilizing transformation and discuss confidence interval construction within the framework of a one-way random effects model. We investigate sample size requirements for the within-subject coefficient of variation for continuous and binary variables. Methods: We investigate the validity of the approximate normal confidence interval by Monte Carlo simulations. In designing a reliability study, a crucial issue is the balance between the number of subjects to be recruited and the number of repeated measurements per subject. We discuss efficiency of estimation and cost considerations for the optimal allocation of the sample resources. The approach is illustrated by an example on Magnetic Resonance Imaging (MRI). We also discuss the issue of sample size estimation for dichotomous responses with two examples. Results: For the continuous variable, we found that the variance-stabilizing transformation improves the asymptotic coverage probabilities for the within-subject coefficient of variation. The maximum likelihood estimation and the sample size estimation based on a pre-specified confidence interval width are novel contributions to the literature for the binary variable. Conclusion: Using the sample size formulas, we hope to help clinical epidemiologists and practicing statisticians to efficiently design reliability studies using the within-subject coefficient of variation, whether the variable of interest is continuous or binary. PMID:16686943
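For the continuous case, the within-subject coefficient of variation under a one-way random effects model is the within-subject standard deviation divided by the overall mean. A minimal sketch follows; it uses the ANOVA within-subject mean square rather than the maximum likelihood estimator discussed in the abstract, and the function name `within_subject_cv` and the toy data are assumptions for illustration only.

```python
import math
import statistics

def within_subject_cv(measurements):
    """measurements: list of per-subject lists of repeated measurements."""
    all_values = [x for subject in measurements for x in subject]
    grand_mean = statistics.fmean(all_values)
    # Pooled within-subject sum of squares around each subject's own mean.
    ss_within = 0.0
    for subject in measurements:
        subject_mean = statistics.fmean(subject)
        ss_within += sum((x - subject_mean) ** 2 for x in subject)
    df_within = len(all_values) - len(measurements)
    return math.sqrt(ss_within / df_within) / grand_mean

# Toy data: three subjects, two repeated measurements each (hypothetical values).
toy = [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]]
print(f"WSCV = {within_subject_cv(toy):.4f}")
```

The trade-off mentioned in the abstract shows up in `df_within`: for a fixed total number of measurements, more repeats per subject increase the degrees of freedom for the within-subject variance, while more subjects improve the estimate of the overall mean.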
Size-selective separation of submicron particles in suspensions with ultrasonic atomization.
Nii, Susumu; Oka, Naoyoshi
2014-11-01
Aqueous suspensions containing silica or polystyrene latex were ultrasonically atomized for separating particles of a specific size. With the help of a fog involving fine liquid droplets with a narrow size distribution, submicron particles in a limited size range were successfully separated from suspensions. Performance of the separation was characterized by analyzing the size and the concentration of collected particles with a high-resolution method. Irradiation of sample suspensions with 2.4 MHz ultrasound allowed the separation of particles of specific sizes from 90 to 320 nm regardless of the type of material. Addition of a small amount of the nonionic surfactant PONPE20 to SiO2 suspensions enhanced the collection of finer particles and achieved a remarkable increase in the number of collected particles. Degassing of the sample suspension eliminated the separation performance, indicating that dissolved air in suspensions plays an important role in this separation. Copyright © 2014 Elsevier B.V. All rights reserved.
Li, Huili; Ostermann, Anne; Karunarathna, Samantha C; Xu, Jianchu; Hyde, Kevin D; Mortimer, Peter E
2018-07-01
The species-area relationship is an important factor in the study of species diversity, conservation biology, and landscape ecology. A deeper understanding of this relationship is necessary in order to provide recommendations on how to improve the quality of data collection on macrofungal diversity in different land use systems in future studies; this requires a systematic assessment of methodological parameters, in particular optimal plot sizes. The species-area relationship of macrofungi in tropical and temperate climatic zones and four different land use systems was investigated by determining the macrofungal species richness in plot sizes ranging from 100 m² to 10 000 m² over two sampling seasons. We found that the effect of plot size on recorded species richness significantly differed between land use systems with the exception of monoculture systems. For both climate zones, land use system needs to be considered when determining optimal plot size. Using an optimal plot size was more important than temporal replication (over two sampling seasons) in accurately recording species richness. Copyright © 2018 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Fungal Fragments in Moldy Houses: A Field Study in Homes in New Orleans and Southern Ohio
Reponen, Tiina; Seo, Sung-Chul; Grimsley, Faye; Lee, Taekhee; Crawford, Carlos; Grinshpun, Sergey A.
2007-01-01
Smaller-sized fungal fragments (<1 μm) may contribute to mold-related health effects. Previous laboratory-based studies have shown that the number concentration of fungal fragments can be up to 500 times higher than that of fungal spores, but this has not yet been confirmed in a field study due to lack of suitable methodology. We have recently developed a field-compatible method for the sampling and analysis of airborne fungal fragments. The new methodology was utilized for characterizing fungal fragment exposures in mold-contaminated homes selected in New Orleans, Louisiana and Southern Ohio. Airborne fungal particles were separated into three distinct size fractions: (i) >2.25 μm (spores); (ii) 1.05–2.25 μm (mixture); and (iii) < 1.0 μm (submicrometer-sized fragments). Samples were collected in five homes in summer and winter and analyzed for (1→3)-β-D-glucan. The total (1→3)-β-D-glucan varied from 0.2 to 16.0 ng m−3. The ratio of (1→3)-β-D-glucan mass in fragment size fraction to that in spore size fraction (F/S) varied from 0.011 to 2.163. The mass ratio was higher in winter (average = 1.017) than in summer (0.227) coinciding with a lower relative humidity in the winter. Assuming a mass-based F/S-ratio=1 and the spore size = 3 μm, the corresponding number-based F/S-ratio (fragment number/spore number) would be 10³ and 10⁶, for the fragment sizes of 0.3 and 0.03 μm, respectively. These results indicate that the actual (field) contribution of fungal fragments to the overall exposure may be very high, even much greater than that estimated in our earlier laboratory-based studies. PMID:19050738
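The conversion from a mass-based to a number-based F/S ratio in this abstract follows from particle volume scaling with the cube of diameter. The short sketch below works through that arithmetic; it assumes spherical particles of equal density in both fractions (an assumption implied by, but not stated in, the abstract), and the function name `number_ratio` is hypothetical.

```python
def number_ratio(mass_ratio, spore_diameter_um, fragment_diameter_um):
    """Convert a mass-based F/S ratio to a number-based one.

    Assumes spheres of equal density, so mass per particle scales with diameter cubed.
    """
    return mass_ratio * (spore_diameter_um / fragment_diameter_um) ** 3

for fragment_um in (0.3, 0.03):
    ratio = number_ratio(1.0, 3.0, fragment_um)
    print(f"fragment size {fragment_um} um: number-based F/S ~ {ratio:.3g}")
```

With a 3 μm spore, a tenfold smaller fragment gives a factor of 10³ and a hundredfold smaller fragment gives 10⁶, matching the values quoted in the abstract.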
Patty, Philipus J; Frisken, Barbara J
2006-04-01
We compare results for the number-weighted mean radius and polydispersity obtained either by directly fitting number distributions to dynamic light-scattering data or by converting results obtained by fitting intensity-weighted distributions. We find that results from fits using number distributions are angle independent and that converting intensity-weighted distributions is not always reliable, especially when the polydispersity of the sample is large. We compare the results of fitting symmetric and asymmetric distributions, as represented by Gaussian and Schulz distributions, respectively, to data for extruded vesicles and find that the Schulz distribution provides a better estimate of the size distribution for these samples.
Olives, Casey; Valadez, Joseph J; Brooker, Simon J; Pagano, Marcello
2012-01-01
Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n=15 and n=25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n=15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalence of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools.
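An operating characteristic calculation for a three-category design can be sketched with the binomial distribution. The sketch assumes a fixed sample of n children (no curtailment) and two decision thresholds; the threshold values d1=2 and d2=7 below are hypothetical placeholders, not the design values from the paper, and `classification_probs` is an illustrative name.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k positives among n when prevalence is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def classification_probs(n, d1, d2, p):
    """Return P(low), P(mid), P(high) for X ~ Binomial(n, p).

    A school is classified low if X <= d1, high if X > d2, and mid otherwise.
    """
    low = sum(binom_pmf(k, n, p) for k in range(0, d1 + 1))
    mid = sum(binom_pmf(k, n, p) for k in range(d1 + 1, d2 + 1))
    high = sum(binom_pmf(k, n, p) for k in range(d2 + 1, n + 1))
    return low, mid, high

# Hypothetical thresholds for n = 15 (assumption for illustration only).
for prevalence in (0.05, 0.10, 0.30, 0.50, 0.70):
    low, mid, high = classification_probs(15, 2, 7, prevalence)
    print(f"p={prevalence:.2f}  low={low:.3f}  mid={mid:.3f}  high={high:.3f}")
```

Plotting each of the three probabilities against prevalence gives the operating characteristic curves; the regions near the category boundaries (10% and 50%) are where misclassification is most likely, consistent with the abstract's remark about regions of high potential error.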
Engblom, Henrik; Heiberg, Einar; Erlinge, David; Jensen, Svend Eggert; Nordrehaug, Jan Erik; Dubois-Randé, Jean-Luc; Halvorsen, Sigrun; Hoffmann, Pavel; Koul, Sasha; Carlsson, Marcus; Atar, Dan; Arheden, Håkan
2016-03-09
Cardiac magnetic resonance (CMR) can quantify myocardial infarct (MI) size and myocardium at risk (MaR), enabling assessment of myocardial salvage index (MSI). We assessed how MSI impacts the number of patients needed to reach statistical power in relation to MI size alone and levels of biochemical markers in clinical cardioprotection trials and how scan day affects sample size. Controls (n=90) from the recent CHILL-MI and MITOCARE trials were included. MI size, MaR, and MSI were assessed from CMR. High-sensitivity troponin T (hsTnT) and creatine kinase isoenzyme MB (CKMB) levels were assessed in CHILL-MI patients (n=50). Utilizing the distribution of these variables, 100 000 clinical trials were simulated for calculation of the sample size required to reach sufficient power. For a treatment effect of a 25% decrease in outcome variables, 50 patients were required in each arm using MSI compared to 93, 98, 120, 141, and 143 for MI size alone, hsTnT (area under the curve [AUC] and peak), and CKMB (AUC and peak) in order to reach a power of 90%. If the average CMR scan day between treatment and control arms differed by 1 day, the sample size needed to be increased by 54% (77 vs 50) to avoid scan day bias masking a treatment effect of 25%. Sample size in cardioprotection trials can be reduced 46% to 65% without compromising statistical power when using MSI by CMR as an outcome variable instead of MI size alone or biochemical markers. It is essential to ensure lack of bias in scan day between treatment and control arms to avoid compromising statistical power. © 2016 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley Blackwell.
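The percentage figures in this abstract follow directly from the per-arm sample sizes it reports. The sketch below only re-derives that arithmetic; the pairing of each number with an outcome label follows the order in which the abstract lists them, and the variable names are illustrative.

```python
msi_per_arm = 50
# Per-arm sample sizes reported in the abstract, in the order listed there.
alternatives = {
    "MI size alone": 93,
    "hsTnT AUC": 98,
    "hsTnT peak": 120,
    "CKMB AUC": 141,
    "CKMB peak": 143,
}

# Relative reduction in sample size when MSI replaces each alternative outcome.
reductions = {name: 100 * (1 - msi_per_arm / n) for name, n in alternatives.items()}
for name, pct in reductions.items():
    print(f"{name}: {pct:.0f}% fewer patients per arm with MSI")

# Relative increase needed when scan day differs by one day between arms.
scan_day_increase = 100 * (77 / msi_per_arm - 1)
print(f"scan-day imbalance: {scan_day_increase:.0f}% more patients per arm")
```

The smallest and largest reductions correspond to the 46% and 65% range quoted in the abstract, and 77 versus 50 patients is the stated 54% increase.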
Results suggest that where information on variance components for a specific chemical in a specific media is not available, a chemical's compound class may provide guidance in selecting sample size and in apportioning resources between numbers of subjects and numbers of repeated ...
Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi; Jafari, Peyman
2017-01-01
Evaluating measurement equivalence (also known as differential item functioning (DIF)) is an important part of the process of validating psychometric questionnaires. This study aimed at evaluating the multiple indicators multiple causes (MIMIC) model for DIF detection when the latent construct distribution is nonnormal and the focal group sample size is small. In this simulation-based study, Type I error rates and power of the MIMIC model for detecting uniform-DIF were investigated under different combinations of reference to focal group sample size ratio, magnitude of the uniform-DIF effect, scale length, the number of response categories, and latent trait distribution. Moderate and high skewness in the latent trait distribution led to decreases of 0.33% and 0.47%, respectively, in the power of the MIMIC model for detecting uniform-DIF. The findings indicated that increasing the scale length, the number of response categories, and the magnitude of DIF improved the power of the MIMIC model by 3.47%, 4.83%, and 20.35%, respectively; it also decreased the Type I error of the MIMIC approach by 2.81%, 5.66%, and 0.04%, respectively. This study revealed that the power of the MIMIC model was at an acceptable level when latent trait distributions were skewed. However, the empirical Type I error rate was slightly greater than the nominal significance level. Consequently, the MIMIC model was recommended for detection of uniform-DIF when the latent construct distribution is nonnormal and the focal group sample size is small. PMID:28713828
Lyons, James E.; Kendall, William L.; Royle, J. Andrew; Converse, Sarah J.; Andres, Brad A.; Buchanan, Joseph B.
2016-01-01
We present a novel formulation of a mark–recapture–resight model that allows estimation of population size, stopover duration, and arrival and departure schedules at migration areas. Estimation is based on encounter histories of uniquely marked individuals and relative counts of marked and unmarked animals. We use a Bayesian analysis of a state–space formulation of the Jolly–Seber mark–recapture model, integrated with a binomial model for counts of unmarked animals, to derive estimates of population size and arrival and departure probabilities. We also provide a novel estimator for stopover duration that is derived from the latent state variable representing the interim between arrival and departure in the state–space model. We conduct a simulation study of field sampling protocols to understand the impact of superpopulation size, proportion marked, and number of animals sampled on bias and precision of estimates. Simulation results indicate that relative bias of estimates of the proportion of the population with marks was low for all sampling scenarios and never exceeded 2%. Our approach does not require enumeration of all unmarked animals detected or direct knowledge of the number of marked animals in the population at the time of the study. This provides flexibility and potential application in a variety of sampling situations (e.g., migratory birds, breeding seabirds, sea turtles, fish, pinnipeds, etc.). Application of the methods is demonstrated with data from a study of migratory sandpipers.
Estimating the Latent Number of Types in Growing Corpora with Reduced Cost-Accuracy Trade-Off
ERIC Educational Resources Information Center
Hidaka, Shohei
2016-01-01
The number of unique words in children's speech is one of the most basic statistics indicating their language development. We may, however, face difficulties when trying to accurately evaluate the number of unique words in a child's growing corpus over time with a limited sample size. This study proposes a novel technique to estimate the latent number…
Yi, Honghong; Hao, Jiming; Duan, Lei; Li, Xinghua; Guo, Xingming
2006-09-01
In this investigation, the collection efficiency of particulate emission control devices (PECDs), particulate matter (PM) emissions, and PM size distribution were determined experimentally at the inlet and outlet of PECDs at five coal-fired power plants. Different boilers, coals, and PECDs are used in these power plants. In situ measurements were performed with an electrical low-pressure impactor with a sampling system, which consisted of an isokinetic sampler probe, precut cyclone, and two-stage dilution system with a sample line to the instruments. The size distribution was measured over a range from 0.03 to 10 μm. Before and after all of the PECDs, the particle number size distributions are bimodal. The PM2.5 fraction emitted to the atmosphere includes a significant amount of the mass from the coarse particle mode. The controlled and uncontrolled emission factors of total PM, inhalable PM (PM10), and fine PM (PM2.5) were obtained. Electrostatic precipitator (ESP) and baghouse total collection efficiencies are 96.38-99.89% and 99.94%, respectively. The minimum collection efficiencies of the ESP and the baghouse both appear in the particle size range of 0.1-1 μm. In this size range, ESP and baghouse collection efficiencies are 85.79-98.6% and 99.54%, respectively. Real-time measurement shows that the mass and number concentrations of PM10 are greatly affected by the operating conditions of the PECDs. The number of emitted particles increases with increasing boiler load level because of higher combustion temperature. During test run periods, the data reproducibility is satisfactory.
New Measurements of the Particle Size Distribution of Apollo 11 Lunar Soil 10084
NASA Technical Reports Server (NTRS)
McKay, D.S.; Cooper, B.L.; Riofrio, L.M.
2009-01-01
We have initiated a major new program to determine the grain size distribution of nearly all lunar soils collected in the Apollo program. Following the return of Apollo soil and core samples, a number of investigators including our own group performed grain size distribution studies and published the results [1-11]. Nearly all of these studies were done by sieving the samples, usually with a working fluid such as Freon™ or water. We have measured the particle size distribution of lunar soil 10084,2005 in water, using a Microtrac™ laser diffraction instrument. Details of our own sieving technique and protocol (also used in [11]) are given in [4]. While sieving usually produces accurate and reproducible results, it has disadvantages. First, it is very labor intensive and requires hours to days to perform properly. Even using automated sieve shaking devices, four or five days may be needed to sieve each sample, although multiple sieve stacks increase productivity. Second, sieving is subject to loss of grains through handling and weighing operations, and these losses are concentrated in the finest grain sizes. Loss from handling becomes a more acute problem when smaller amounts of material are used. While we were able to quantitatively sieve into 6 or 8 size fractions using starting soil masses as low as 50 mg, attrition and handling problems limit the practicality of sieving smaller amounts. Third, sieving below 10 or 20 microns is not practical because of the problems of grain loss and of smaller grains sticking to coarser grains. Sieving is completely impractical below about 5-10 microns. Consequently, sieving gives no information on the size distribution below approx. 10 microns, which includes the important submicrometer and nanoparticle size ranges. Finally, sieving creates a limited number of size bins and may therefore miss fine structure of the distribution which would be revealed by other methods that produce many smaller size bins.
Digital image processing of nanometer-size metal particles on amorphous substrates
NASA Technical Reports Server (NTRS)
Soria, F.; Artal, P.; Bescos, J.; Heinemann, K.
1989-01-01
The task of differentiating very small metal aggregates supported on amorphous films from the phase contrast image features inherently stemming from the support is extremely difficult in the nanometer particle size range. Digital image processing was employed to overcome some of the ambiguities in evaluating such micrographs. It was demonstrated that such processing allowed positive particle detection and a limited degree of statistical size analysis even for micrographs where, by bare eye examination, the distinction between particles and erroneous substrate features would seem highly ambiguous. The smallest size class detected for Pd/C samples peaks at 0.8 nm. This size class was found in various samples prepared under different evaporation conditions and it is concluded that these particles consist of 'a magic number' of 13 atoms and have cuboctahedral or icosahedral crystal structure.
Rosenblum, Michael A; Laan, Mark J van der
2009-01-07
The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).
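A tail-bound interval of the kind mentioned in this abstract can be sketched from the standard form of Bernstein's inequality. The sketch assumes independent observations with |X − μ| ≤ `bound` and variance at most `variance`, both known in advance; these assumptions, and the function names, are illustrative and are not taken from the paper, whose survey-sampling setting and narrower intervals are not reproduced here. Under those assumptions, P(|mean − μ| ≥ t) ≤ 2·exp(−n t² / (2σ² + 2Mt/3)), and solving for t at a chosen error level δ gives the half-width.

```python
import math
from statistics import NormalDist

def bernstein_half_width(n, variance, bound, delta=0.05):
    """Half-width t with P(|mean - mu| >= t) <= delta by Bernstein's inequality.

    Assumes i.i.d. observations, |X - mu| <= bound, and Var(X) <= variance.
    """
    log_term = math.log(2 / delta)
    a = bound * log_term / (3 * n)
    # Positive root of n*t^2 - (2*bound*log_term/3)*t - 2*variance*log_term = 0.
    return a + math.sqrt(a * a + 2 * variance * log_term / n)

def normal_half_width(n, variance, delta=0.05):
    """Half-width of the usual central-limit-theorem interval, for comparison."""
    z = NormalDist().inv_cdf(1 - delta / 2)
    return z * math.sqrt(variance / n)

# Example: observations in [0, 1], so bound = 1 and variance <= 0.25.
for n in (10, 100, 1000):
    b = bernstein_half_width(n, 0.25, 1.0)
    c = normal_half_width(n, 0.25)
    print(f"n={n:4d}  Bernstein={b:.3f}  normal approx={c:.3f}")
```

The Bernstein half-width is valid for every n but is noticeably wider than the normal-approximation half-width, which illustrates the drawback the abstract points out.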
Horowitz, Arthur J.; Clarke, Robin T.; Merten, Gustavo Henrique
2015-01-01
Since the 1970s, there has been both continuing and growing interest in developing accurate estimates of the annual fluvial transport (fluxes and loads) of suspended sediment and sediment-associated chemical constituents. This study provides an evaluation of the effects of manual sample numbers (from 4 to 12 year⁻¹) and sample scheduling (random-based, calendar-based and hydrology-based) on the precision, bias and accuracy of annual suspended sediment flux estimates. The evaluation is based on data from selected US Geological Survey daily suspended sediment stations in the USA and covers basins ranging in area from just over 900 km² to nearly 2 million km² and annual suspended sediment fluxes ranging from about 4 Kt year⁻¹ to about 200 Mt year⁻¹. The results appear to indicate that there is a scale effect for random-based and calendar-based sampling schemes, with larger sample numbers required as basin size decreases. All the sampling schemes evaluated display some level of positive (overestimates) or negative (underestimates) bias. The study further indicates that hydrology-based sampling schemes are likely to generate the most accurate annual suspended sediment flux estimates with the fewest number of samples, regardless of basin size. This type of scheme seems most appropriate when the determination of suspended sediment concentrations, sediment-associated chemical concentrations, annual suspended sediment and annual suspended sediment-associated chemical fluxes only represent a few of the parameters of interest in multidisciplinary, multiparameter monitoring programmes.
The results are just as applicable to the calibration of autosamplers/suspended sediment surrogates currently used to measure/estimate suspended sediment concentrations and ultimately, annual suspended sediment fluxes, because manual samples are required to adjust the sample data/measurements generated by these techniques so that they provide depth-integrated and cross-sectionally representative data.
Increased accuracy of batch fecundity estimates using oocyte stage ratios in Plectropomus leopardus.
Carter, A B; Williams, A J; Russ, G R
2009-08-01
Using the ratio of the number of migratory nuclei to hydrated oocytes to estimate batch fecundity of common coral trout Plectropomus leopardus increases the time over which samples can be collected and, therefore, increases the sample size available and reduces biases in batch fecundity estimates.
Duy, Pham K; Chun, Seulah; Chung, Hoeil
2017-11-21
We have systematically characterized Raman scatterings in solid samples with different particle sizes and investigated subsequent trends of particle size-induced intensity variations. For this purpose, both lactose powders and pellets composed of five different particle sizes were prepared. Uniquely in this study, three spectral acquisition schemes with different sizes of laser illuminations and detection windows were employed for the evaluation, since it was expected that the experimental configuration would be another factor potentially influencing the intensity of the lactose peak, along with the particle size itself. In both samples, the distribution of Raman photons became broader with the increase in particle size, as the mean free path of laser photons, the average photon travel distance between consecutive scattering locations, became longer under this situation. When the particle size was the same, the Raman photon distribution was narrower in the pellets since the individual particles were more densely packed in a given volume (the shorter mean free path). When the size of the detection window was small, the number of photons reaching the detector decreased as the photon distribution was larger. Meanwhile, a large-window detector was able to collect the widely distributed Raman photons more effectively; therefore, the trends of intensity change with the variation in particle size were dissimilar depending on the employed spectral acquisition schemes. Overall, the Monte Carlo simulation was effective at probing the photon distribution inside the samples and helped to support the experimental observations.
Gebler, J.B.
2004-01-01
The related topics of spatial variability of aquatic invertebrate community metrics, implications of spatial patterns of metric values to distributions of aquatic invertebrate communities, and ramifications of natural variability to the detection of human perturbations were investigated. Four metrics commonly used for stream assessment were computed for 9 stream reaches within a fairly homogeneous, minimally impaired stream segment of the San Pedro River, Arizona. Metric variability was assessed for differing sampling scenarios using simple permutation procedures. Spatial patterns of metric values suggest that aquatic invertebrate communities are patchily distributed on subsegment and segment scales, which causes metric variability. Wide ranges of metric values resulted in wide ranges of metric coefficients of variation (CVs) and minimum detectable differences (MDDs), and both CVs and MDDs often increased as sample size (number of reaches) increased, suggesting that any particular set of sampling reaches could yield misleading estimates of population parameters and effects that can be detected. Mean metric variabilities were substantial, with the result that only fairly large differences in metrics would be declared significant at α = 0.05 and β = 0.20. The number of reaches required to obtain MDDs of 10% and 20% varied with significance level and power, and differed for different metrics, but were generally large, ranging into tens and hundreds of reaches. Study results suggest that metric values from one or a small number of stream reach(es) may not be adequate to represent a stream segment, depending on effect sizes of interest, and that larger sample sizes are necessary to obtain reasonable estimates of metrics and sample statistics. For bioassessment to progress, spatial variability may need to be investigated in many systems and should be considered when designing studies and interpreting data.
Zhang, Fang; Wagner, Anita K; Ross-Degnan, Dennis
2011-11-01
Interrupted time series is a strong quasi-experimental research design to evaluate the impacts of health policy interventions. Using simulation methods, we estimated the power requirements for interrupted time series studies under various scenarios. Simulations were conducted to estimate the power of segmented autoregressive (AR) error models when autocorrelation ranged from -0.9 to 0.9 and effect size was 0.5, 1.0, and 2.0, investigating balanced and unbalanced numbers of time periods before and after an intervention. Simple scenarios of autoregressive conditional heteroskedasticity (ARCH) models were also explored. For AR models, power increased when sample size or effect size increased, and tended to decrease when autocorrelation increased. Compared with a balanced number of study periods before and after an intervention, designs with unbalanced numbers of periods had less power, although that was not the case for ARCH models. The power to detect effect size 1.0 appeared to be reasonable for many practical applications with a moderate or large number of time points in the study equally divided around the intervention. Investigators should be cautious when the expected effect size is small or the number of time points is small. We recommend conducting various simulations before investigation. Copyright © 2011 Elsevier Inc. All rights reserved.
1988-09-01
tested. To measure the adequacy of the sample, the Kaiser-Meyer-Olkin measure of sampling adequacy was used. This technique is described in Factor... the relatively large number of variables, there was concern about the adequacy of the sample size. A Kaiser-Meyer-Olkin
Sampling plantations to determine white-pine weevil injury
Robert L. Talerico; Robert W., Jr. Wilson
1973-01-01
Use of 1/10-acre square plots to obtain estimates of the proportion of never-weeviled trees necessary for evaluating and scheduling white-pine weevil control is described. The optimum number of trees to observe per plot is estimated from data obtained from sample plantations in the Northeast, and a table is given of the sample size required to achieve a standard error of...
Rast, Philippe; Hofer, Scott M.
2014-01-01
We investigated the power to detect variances and covariances in rates of change in the context of existing longitudinal studies using linear bivariate growth curve models. Power was estimated by means of Monte Carlo simulations. Our findings show that typical longitudinal study designs have substantial power to detect both variances and covariances among rates of change in a variety of cognitive, physical functioning, and mental health outcomes. We performed simulations to investigate the interplay among number and spacing of occasions, total duration of the study, effect size, and error variance on power and required sample size. The relation between growth rate reliability (GRR) and effect size to the sample size required to detect power ≥ .80 was non-linear, with rapidly decreasing sample sizes needed as GRR increases. The results presented here stand in contrast to previous simulation results and recommendations (Hertzog, Lindenberger, Ghisletta, & von Oertzen, 2006; Hertzog, von Oertzen, Ghisletta, & Lindenberger, 2008; von Oertzen, Ghisletta, & Lindenberger, 2010), which are limited due to confounds between study length and number of waves, error variance with GCR, and parameter values which are largely out of bounds of actual study values. Power to detect change is generally low in the early phases (i.e. first years) of longitudinal studies but can substantially increase if the design is optimized. We recommend additional assessments, including embedded intensive measurement designs, to improve power in the early phases of long-term longitudinal studies. PMID:24219544
Melvin, Elizabeth M.; Moore, Brandon R.; Gilchrist, Kristin H.; Grego, Sonia; Velev, Orlin D.
2011-01-01
The recent development of microfluidic “lab on a chip” devices requiring sample sizes <100 μL has given rise to the need to concentrate dilute samples and trap analytes, especially for surface-based detection techniques. We demonstrate a particle collection device capable of concentrating micron-sized particles in a predetermined area by combining AC electroosmosis (ACEO) and dielectrophoresis (DEP). The planar asymmetric electrode pattern uses ACEO pumping to induce equal, quadrilateral flow directed towards a stagnant region in the center of the device. A number of system parameters affecting particle collection efficiency were investigated including electrode and gap width, chamber height, applied potential and frequency, and number of repeating electrode pairs and electrode geometry. The robustness of the on-chip collection design was evaluated against varying electrolyte concentrations, particle types, and particle sizes. These devices are amenable to integration with a variety of detection techniques such as optical evanescent waveguide sensing. PMID:22662040
Sampling guidelines for oral fluid-based surveys of group-housed animals.
Rotolo, Marisa L; Sun, Yaxuan; Wang, Chong; Giménez-Lirola, Luis; Baum, David H; Gauger, Phillip C; Harmon, Karen M; Hoogland, Marlin; Main, Rodger; Zimmerman, Jeffrey J
2017-09-01
Formulas and software for calculating sample size for surveys based on individual animal samples are readily available. However, sample size formulas are not available for oral fluids and other aggregate samples that are increasingly used in production settings. Therefore, the objective of this study was to develop sampling guidelines for oral fluid-based porcine reproductive and respiratory syndrome virus (PRRSV) surveys in commercial swine farms. Oral fluid samples were collected in 9 weekly samplings from all pens in 3 barns on one production site beginning shortly after placement of weaned pigs. Samples (n=972) were tested by real-time reverse-transcription PCR (RT-rtPCR) and the binary results analyzed using a piecewise exponential survival model for interval-censored, time-to-event data with misclassification. Thereafter, simulation studies were used to study the barn-level probability of PRRSV detection as a function of sample size, sample allocation (simple random sampling vs fixed spatial sampling), assay diagnostic sensitivity and specificity, and pen-level prevalence. These studies provided estimates of the probability of detection by sample size and within-barn prevalence. Detection using fixed spatial sampling was as good as, or better than, simple random sampling. Sampling multiple barns on a site increased the probability of detection with the number of barns sampled. These results are relevant to PRRSV control or elimination projects at the herd, regional, or national levels, but the results are also broadly applicable to contagious pathogens of swine for which oral fluid tests of equivalent performance are available. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
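The relationship between sample size and probability of detection can be illustrated with the standard independent-sampling approximation. This sketch is deliberately simpler than the survival model and simulations described in the abstract: it assumes each sampled pen tests positive independently with probability prevalence × sensitivity, perfect specificity, and constant prevalence. The function names are hypothetical.

```python
import math


def detection_probability(n_samples, prevalence, sensitivity=1.0):
    """Probability that at least one of n_samples pens tests positive.

    Assumes independent samples and perfect specificity (no false positives).
    """
    p_positive = prevalence * sensitivity
    return 1.0 - (1.0 - p_positive) ** n_samples


def samples_needed(prevalence, sensitivity=1.0, target=0.95):
    """Smallest n with detection probability >= target under the same assumptions."""
    p_positive = prevalence * sensitivity
    if not 0.0 < p_positive < 1.0:
        raise ValueError("prevalence * sensitivity must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_positive))
```

Under these assumptions, 10% pen-level prevalence with a perfectly sensitive assay needs 29 samples for a 95% chance of detection; lower sensitivity or prevalence pushes that number up quickly.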
Image subsampling and point scoring approaches for large-scale marine benthic monitoring programs
NASA Astrophysics Data System (ADS)
Perkins, Nicholas R.; Foster, Scott D.; Hill, Nicole A.; Barrett, Neville S.
2016-07-01
Benthic imagery is an effective tool for quantitative description of ecologically and economically important benthic habitats and biota. The recent development of autonomous underwater vehicles (AUVs) allows surveying of spatial scales that were previously unfeasible. However, an AUV collects a large number of images, the scoring of which is time and labour intensive. There is a need to optimise the way that subsamples of imagery are chosen and scored to gain meaningful inferences for ecological monitoring studies. We examine the trade-off between the number of images selected within transects and the number of random points scored within images on the percent cover of target biota, the typical output of such monitoring programs. We also investigate the efficacy of various image selection approaches, such as systematic or random, on the bias and precision of cover estimates. We use simulated biotas that have varying size, abundance and distributional patterns. We find that a relatively small sampling effort is required to minimise bias. An increased precision for groups that are likely to be the focus of monitoring programs is best gained through increasing the number of images sampled rather than the number of points scored within images. For rare species, sampling using point count approaches is unlikely to provide sufficient precision, and alternative sampling approaches may need to be employed. The approach by which images are selected (simple random sampling, regularly spaced etc.) had no discernible effect on mean and variance estimates, regardless of the distributional pattern of biota. Field validation of our findings is provided through Monte Carlo resampling analysis of a previously scored benthic survey from temperate waters. We show that point count sampling approaches are capable of providing relatively precise cover estimates for candidate groups that are not overly rare. 
The amount of sampling required, in terms of both the number of images and number of points, varies with the abundance, size and distributional pattern of target biota. Therefore, we advocate either the incorporation of prior knowledge or the use of baseline surveys to establish key properties of intended target biota in the initial stages of monitoring programs.
Effects of plot size on forest-type algorithm accuracy
James A. Westfall
2009-01-01
The Forest Inventory and Analysis (FIA) program utilizes an algorithm to consistently determine the forest type for forested conditions on sample plots. Forest type is determined from tree size and species information. Thus, the accuracy of results is often dependent on the number of trees present, which is highly correlated with plot area. This research examines the...
Design of Phase II Non-inferiority Trials.
Jung, Sin-Ho
2017-09-01
With the development of inexpensive treatment regimens and less invasive surgical procedures, we are confronted with non-inferiority study objectives. A non-inferiority phase III trial requires a roughly four times larger sample size than that of a similar standard superiority trial. Because of the large required sample size, we often face feasibility issues to open a non-inferiority trial. Furthermore, due to lack of phase II non-inferiority trial design methods, we do not have an opportunity to investigate the efficacy of the experimental therapy through a phase II trial. As a result, we often fail to open a non-inferiority phase III trial and a large number of non-inferiority clinical questions still remain unanswered. In this paper, we want to develop some designs for non-inferiority randomized phase II trials with feasible sample sizes. At first, we review a design method for non-inferiority phase III trials. Subsequently, we propose three different designs for non-inferiority phase II trials that can be used under different settings. Each method is demonstrated with examples. Each of the proposed design methods is shown to require a reasonable sample size for non-inferiority phase II trials. The three different non-inferiority phase II trial designs are used under different settings, but require similar sample sizes that are typical for phase II trials.
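The "roughly four times larger" figure can be reproduced with the usual normal-approximation formula for comparing two means. This is a sketch under stated assumptions, not the design method of the paper: a common standard deviation \sigma, one-sided level \alpha, power 1-\beta, a superiority trial powered to detect a difference \Delta, and a non-inferiority margin \delta set to half of \Delta (a common convention that the abstract does not state).

```latex
% Per-arm sample size, superiority trial powered for a true difference \Delta:
n_{\mathrm{sup}} = \frac{2\sigma^{2}\,(z_{1-\alpha} + z_{1-\beta})^{2}}{\Delta^{2}}

% Per-arm sample size, non-inferiority trial with margin \delta and true difference 0:
n_{\mathrm{NI}} = \frac{2\sigma^{2}\,(z_{1-\alpha} + z_{1-\beta})^{2}}{\delta^{2}}

% With \delta = \Delta / 2 the ratio is
\frac{n_{\mathrm{NI}}}{n_{\mathrm{sup}}} = \frac{\Delta^{2}}{\delta^{2}} = 4
```

Because the required sample size scales with the inverse square of the difference to be resolved, halving that difference quadruples the sample size.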
Peel, D; Waples, R S; Macbeth, G M; Do, C; Ovenden, J R
2013-03-01
Theoretical models are often applied to population genetic data sets without fully considering the effect of missing data. Researchers can deal with missing data by removing individuals that have failed to yield genotypes and/or by removing loci that have failed to yield allelic determinations, but despite their best efforts, most data sets still contain some missing data. As a consequence, realized sample size differs among loci, and this poses a problem for unbiased methods that must explicitly account for random sampling error. One commonly used solution for the calculation of contemporary effective population size (N(e)) is to calculate the effective sample size as an unweighted mean or harmonic mean across loci. This is not ideal because it fails to account for the fact that loci with different numbers of alleles have different information content. Here we consider this problem for genetic estimators of contemporary effective population size (N(e)). To evaluate bias and precision of several statistical approaches for dealing with missing data, we simulated populations with known N(e) and various degrees of missing data. Across all scenarios, one method of correcting for missing data (fixed-inverse variance-weighted harmonic mean) consistently performed the best for both single-sample and two-sample (temporal) methods of estimating N(e) and outperformed some methods currently in widespread use. The approach adopted here may be a starting point to adjust other population genetics methods that include per-locus sample size components. © 2012 Blackwell Publishing Ltd.
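The two commonly used summaries mentioned above, the unweighted mean and the harmonic mean of per-locus sample sizes, can be compared directly. This sketch covers only those two baselines; the inverse-variance weighting recommended in the abstract is not reproduced because its weights are not given here. The per-locus counts are hypothetical.

```python
from statistics import harmonic_mean, mean


def summarise_sample_sizes(per_locus_n):
    """Return (arithmetic mean, harmonic mean) of realised per-locus sample sizes."""
    return mean(per_locus_n), harmonic_mean(per_locus_n)


# Hypothetical data: individuals successfully genotyped at each of three loci.
per_locus_n = [50, 40, 25]
arithmetic, harmonic = summarise_sample_sizes(per_locus_n)
```

The harmonic mean never exceeds the arithmetic mean and is pulled towards the loci with the most missing data, but neither summary reflects differences in allele number among loci, which is the shortcoming the abstract points to.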
Estimating individual glomerular volume in the human kidney: clinical perspectives
Puelles, Victor G.; Zimanyi, Monika A.; Samuel, Terence; Hughson, Michael D.; Douglas-Denton, Rebecca N.; Bertram, John F.
2012-01-01
Background. Measurement of individual glomerular volumes (IGV) has allowed the identification of drivers of glomerular hypertrophy in subjects without overt renal pathology. This study aims to highlight the relevance of IGV measurements with possible clinical implications and determine how many profiles must be measured in order to achieve stable size distribution estimates. Methods. We re-analysed 2250 IGV estimates obtained using the disector/Cavalieri method in 41 African and 34 Caucasian Americans. Pooled IGV analysis of mean and variance was conducted. Monte-Carlo (Jackknife) simulations determined the effect of the number of sampled glomeruli on mean IGV. Lin’s concordance coefficient (RC), coefficient of variation (CV) and coefficient of error (CE) measured reliability. Results. IGV mean and variance increased with overweight and hypertensive status. Superficial glomeruli were significantly smaller than juxtamedullary glomeruli in all subjects (P < 0.01), by race (P < 0.05) and in obese individuals (P < 0.01). Subjects with multiple chronic kidney disease (CKD) comorbidities showed significant increases in IGV mean and variability. Overall, mean IGV was particularly reliable with nine or more sampled glomeruli (RC > 0.95, <5% difference in CV and CE). These observations were not affected by a reduced sample size and did not disrupt the inverse linear correlation between mean IGV and estimated total glomerular number. Conclusions. Multiple comorbidities for CKD are associated with increased IGV mean and variance within subjects, including overweight, obesity and hypertension. Zonal selection and the number of sampled glomeruli do not represent drawbacks for future longitudinal biopsy-based studies of glomerular size and distribution. PMID:21984554
Salmonella Typhimurium DT193 and DT99 are present in great and blue tits in Flanders, Belgium
Verbrugghe, E.; Dekeukeleire, D.; De Beelde, R.; Rouffaer, L. O.; Haesendonck, R.; Strubbe, D.; Mattheus, W.; Bertrand, S.; Pasmans, F.; Bonte, D.; Verheyen, K.; Lens, L.; Martel, A.
2017-01-01
Endemic infections with the common avian pathogen Salmonella enterica subspecies enterica serovar Typhimurium (Salmonella Typhimurium) may incur a significant cost on the host population. In this study, we determined the potential of endemic Salmonella infections to reduce the reproductive success of blue (Cyanistes caeruleus) and great (Parus major) tits by correlating eggshell infection with reproductive parameters. The fifth egg of each clutch was collected from nest boxes in 19 deciduous forest fragments. Out of the 101 sampled eggs, 7 Salmonella Typhimurium isolates were recovered. The low bacterial prevalence was reflected by a similarly low serological prevalence in the fledglings. In this study with a relatively small sample size, presence of Salmonella did not affect reproductive parameters (egg volume, clutch size, number of nestlings and number of fledglings), nor the health status of the fledglings. However, in order to clarify the impact on health and reproduction a larger number of samples have to be analyzed. Phage typing showed that the isolates belonged to the definitive phage types (DT) 193 and 99, and multi-locus variable number tandem repeat analysis (MLVA) demonstrated a high similarity among the tit isolates, but distinction to human isolates. These findings suggest the presence of passerine-adapted Salmonella strains in free-ranging tit populations with host pathogen co-existence. PMID:29112955
Image analysis of representative food structures: application of the bootstrap method.
Ramírez, Cristian; Germain, Juan C; Aguilera, José M
2009-08-01
Images (for example, photomicrographs) are routinely used as qualitative evidence of the microstructure of foods. In quantitative image analysis it is important to estimate the area (or volume) to be sampled, the field of view, and the resolution. The bootstrap method is proposed to estimate the size of the sampling area as a function of the coefficient of variation (CV(Bn)) and standard error (SE(Bn)) of the bootstrap taking sub-areas of different sizes. The bootstrap method was applied to simulated and real structures (apple tissue). For simulated structures, 10 computer-generated images were constructed containing 225 black circles (elements) and different coefficients of variation (CV(image)). For apple tissue, 8 images of apple tissue containing cellular cavities with different CV(image) were analyzed. Results confirmed that for simulated and real structures, increasing the size of the sampling area decreased the CV(Bn) and SE(Bn). Furthermore, there was a linear relationship between the CV(image) and CV(Bn). For example, to obtain a CV(Bn) = 0.10 in an image with CV(image) = 0.60, a sampling area of 400 x 400 pixels (11% of whole image) was required, whereas if CV(image) = 1.46, a sampling area of 1000 x 1000 pixels (69% of whole image) became necessary. This suggests that a large-size dispersion of element sizes in an image requires increasingly larger sampling areas or a larger number of images.
Effects of crystallite size on the structure and magnetism of ferrihydrite
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Xiaoming; Zhu, Mengqiang; Koopal, Luuk K.
2015-12-15
The structure and magnetic properties of nano-sized (1.6 to 4.4 nm) ferrihydrite samples are systematically investigated through a combination of X-ray diffraction (XRD), X-ray pair distribution function (PDF), X-ray absorption spectroscopy (XAS) and magnetic analyses. The XRD, PDF and Fe K-edge XAS data of the ferrihydrite samples are all fitted well with the Michel ferrihydrite model, indicating similar local-, medium- and long-range ordered structures. PDF and XAS fitting results indicate that, with increasing crystallite size, the average coordination numbers of Fe–Fe and the unit cell parameter c increase, while Fe2 and Fe3 vacancies and the unit cell parameter a decrease. Mössbauer results indicate that the surface layer is relatively disordered, which might have been caused by the random distribution of Fe vacancies. These results support Hiemstra's surface-depletion model in terms of the location of disorder and the variations of Fe2 and Fe3 occupancies with size. Magnetic data indicate that the ferrihydrite samples show antiferromagnetism superimposed with a ferromagnetic-like moment at lower temperatures (100 K and 10 K), but ferrihydrite is paramagnetic at room temperature. In addition, both the magnetization and coercivity decrease with increasing ferrihydrite crystallite size due to strong surface effects in fine-grained ferrihydrites. Smaller ferrihydrite samples show less magnetic hyperfine splitting and a lower unblocking temperature (T_B) than larger samples. The dependence of magnetic properties on grain size for nano-sized ferrihydrite provides a practical way to determine the crystallite size of ferrihydrite quantitatively in natural environments or artificial systems.
Trattner, Sigal; Cheng, Bin; Pieniazek, Radoslaw L.; Hoffmann, Udo; Douglas, Pamela S.; Einstein, Andrew J.
2014-01-01
Purpose: Effective dose (ED) is a widely used metric for comparing ionizing radiation burden between different imaging modalities, scanners, and scan protocols. In computed tomography (CT), ED can be estimated by performing scans on an anthropomorphic phantom in which metal-oxide-semiconductor field-effect transistor (MOSFET) solid-state dosimeters have been placed to enable organ dose measurements. Here a statistical framework is established to determine the sample size (number of scans) needed for estimating ED to a desired precision and confidence, for a particular scanner and scan protocol, subject to practical limitations. Methods: The statistical scheme involves solving equations which minimize the sample size required for estimating ED to desired precision and confidence. It is subject to a constrained variation of the estimated ED and solved using the Lagrange multiplier method. The scheme incorporates measurement variation introduced both by MOSFET calibration, and by variation in MOSFET readings between repeated CT scans. Sample size requirements are illustrated on cardiac, chest, and abdomen–pelvis CT scans performed on a 320-row scanner and chest CT performed on a 16-row scanner. Results: Sample sizes for estimating ED vary considerably between scanners and protocols. Sample size increases as the required precision or confidence is higher and also as the anticipated ED is lower. For example, for a helical chest protocol, for 95% confidence and 5% precision for the ED, 30 measurements are required on the 320-row scanner and 11 on the 16-row scanner when the anticipated ED is 4 mSv; these sample sizes are 5 and 2, respectively, when the anticipated ED is 10 mSv. Conclusions: Applying the suggested scheme, it was found that even at modest sample sizes, it is feasible to estimate ED with high precision and a high degree of confidence. 
As CT technology develops enabling ED to be lowered, more MOSFET measurements are needed to estimate ED with the same precision and confidence. PMID:24694150
Internal pilots for a class of linear mixed models with Gaussian and compound symmetric data
Gurka, Matthew J.; Coffey, Christopher S.; Muller, Keith E.
2015-01-01
SUMMARY An internal pilot design uses interim sample size analysis, without interim data analysis, to adjust the final number of observations. The approach helps to choose a sample size sufficiently large (to achieve the statistical power desired), but not too large (which would waste money and time). We report on recent research in cerebral vascular tortuosity (curvature in three dimensions) which would benefit greatly from internal pilots due to uncertainty in the parameters of the covariance matrix used for study planning. Unfortunately, observations correlated across the four regions of the brain and small sample sizes preclude using existing methods. However, as in a wide range of medical imaging studies, tortuosity data have no missing or mistimed data, a factorial within-subject design, the same between-subject design for all responses, and a Gaussian distribution with compound symmetry. For such restricted models, we extend exact, small sample univariate methods for internal pilots to linear mixed models with any between-subject design (not just two groups). Planning a new tortuosity study illustrates how the new methods help to avoid sample sizes that are too small or too large while still controlling the type I error rate. PMID:17318914
Cherry, S.; White, G.C.; Keating, K.A.; Haroldson, Mark A.; Schwartz, Charles C.
2007-01-01
Current management of the grizzly bear (Ursus arctos) population in Yellowstone National Park and surrounding areas requires annual estimation of the number of adult female bears with cubs-of-the-year. We examined the performance of nine estimators of population size via simulation. Data were simulated using two methods for different combinations of population size, sample size, and coefficient of variation of individual sighting probabilities. We show that the coefficient of variation does not, by itself, adequately describe the effects of capture heterogeneity, because two different distributions of capture probabilities can have the same coefficient of variation. All estimators produced biased estimates of population size with bias decreasing as effort increased. Based on the simulation results we recommend the Chao estimator for model Mh be used to estimate the number of female bears with cubs-of-the-year; however, the estimator of Chao and Shen may also be useful depending on the goals of the research.
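The Chao estimator for heterogeneous sighting probabilities uses only the number of distinct individuals seen and how many were seen exactly once or twice. The sketch below implements the classical form, observed + f1²/(2·f2), together with a widely used bias-corrected variant; whether the study applied the bias correction is not stated in the abstract, and the function name and input layout are hypothetical.

```python
from collections import Counter


def chao_mh_estimate(sighting_counts, bias_corrected=True):
    """Chao-type population size estimate from per-individual sighting counts.

    sighting_counts: one entry per distinct individual seen, giving the number
    of times that individual was sighted (entries of 0 are ignored).
    """
    counts = [c for c in sighting_counts if c > 0]
    observed = len(counts)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]  # individuals seen exactly once / exactly twice
    if bias_corrected:
        # Variant that stays defined when no individual was seen exactly twice.
        return observed + f1 * (f1 - 1) / (2.0 * (f2 + 1))
    if f2 == 0:
        raise ValueError("classical form is undefined when f2 == 0")
    return observed + f1 * f1 / (2.0 * f2)
```

Many single sightings relative to double sightings signal that a large share of the population was missed, so the estimate rises above the observed count accordingly.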
40 CFR 86.1845-04 - Manufacturer in-use verification testing requirements.
Code of Federal Regulations, 2014 CFR
2014-07-01
... of test vehicles in the sample comply with the sample size requirements of this section. Any post... HDV must test, or cause to have tested, a specified number of vehicles. Such testing must be conducted... first test will be considered the official results for the test vehicle, regardless of any test results...
ERIC Educational Resources Information Center
Nitko, Anthony J.; Hsu, Tse-chi
Item analysis procedures appropriate for domain-referenced classroom testing are described. A conceptual framework within which item statistics can be considered and promising statistics in light of this framework are presented. The sampling fluctuations of the more promising item statistics for sample sizes comparable to the typical classroom…
Landguth, Erin L.; Fedy, Bradley C.; Oyler-McCance, Sara J.; Garey, Andrew L.; Emel, Sarah L.; Mumma, Matthew; Wagner, Helene H.; Fortin, Marie-Josée; Cushman, Samuel A.
2012-01-01
The influence of study design on the ability to detect the effects of landscape pattern on gene flow is one of the most pressing methodological gaps in landscape genetic research. To investigate the effect of study design on landscape genetics inference, we used a spatially-explicit, individual-based program to simulate gene flow in a spatially continuous population inhabiting a landscape with gradual spatial changes in resistance to movement. We simulated a wide range of combinations of number of loci, number of alleles per locus and number of individuals sampled from the population. We assessed how these three aspects of study design influenced the statistical power to successfully identify the generating process among competing hypotheses of isolation-by-distance, isolation-by-barrier, and isolation-by-landscape resistance using a causal modelling approach with partial Mantel tests. We modelled the statistical power to identify the generating process as a response surface for equilibrium and non-equilibrium conditions after introduction of isolation-by-landscape resistance. All three variables (loci, alleles and sampled individuals) affect the power of causal modelling, but to different degrees. Stronger partial Mantel r correlations between landscape distances and genetic distances were found when more loci were used and when loci were more variable, which makes comparisons of effect size between studies difficult. Number of individuals did not affect the accuracy through mean equilibrium partial Mantel r, but larger samples decreased the uncertainty (increasing the precision) of equilibrium partial Mantel r estimates. We conclude that amplifying more (and more variable) loci is likely to increase the power of landscape genetic inferences more than increasing number of individuals.
Landguth, E.L.; Fedy, B.C.; Oyler-McCance, S.J.; Garey, A.L.; Emel, S.L.; Mumma, M.; Wagner, H.H.; Fortin, M.-J.; Cushman, S.A.
2012-01-01
The influence of study design on the ability to detect the effects of landscape pattern on gene flow is one of the most pressing methodological gaps in landscape genetic research. To investigate the effect of study design on landscape genetics inference, we used a spatially-explicit, individual-based program to simulate gene flow in a spatially continuous population inhabiting a landscape with gradual spatial changes in resistance to movement. We simulated a wide range of combinations of number of loci, number of alleles per locus and number of individuals sampled from the population. We assessed how these three aspects of study design influenced the statistical power to successfully identify the generating process among competing hypotheses of isolation-by-distance, isolation-by-barrier, and isolation-by-landscape resistance using a causal modelling approach with partial Mantel tests. We modelled the statistical power to identify the generating process as a response surface for equilibrium and non-equilibrium conditions after introduction of isolation-by-landscape resistance. All three variables (loci, alleles and sampled individuals) affect the power of causal modelling, but to different degrees. Stronger partial Mantel r correlations between landscape distances and genetic distances were found when more loci were used and when loci were more variable, which makes comparisons of effect size between studies difficult. Number of individuals did not affect the accuracy through mean equilibrium partial Mantel r, but larger samples decreased the uncertainty (increasing the precision) of equilibrium partial Mantel r estimates. We conclude that amplifying more (and more variable) loci is likely to increase the power of landscape genetic inferences more than increasing number of individuals. © 2011 Blackwell Publishing Ltd.
Chung, Kyong-Sook; Weber, Jaime A; Hipp, Andrew L
2011-01-01
High intraspecific cytogenetic variation in the sedge genus Carex (Cyperaceae) is hypothesized to be due to the "diffuse" or non-localized centromeres, which facilitate chromosome fission and fusion. If chromosome number changes are dominated by fission and fusion, then chromosome evolution will result primarily in changes in the potential for recombination among populations. Chromosome duplications, on the other hand, entail consequent opportunities for divergent evolution of paralogs. In this study, we evaluate whether genome size and chromosome number covary within species. We used flow cytometry to estimate genome sizes in Carex scoparia var. scoparia, sampling 99 plants (23 populations) in the Chicago region, and we used meiotic chromosome observations to document chromosome numbers and chromosome pairing relations. Chromosome numbers range from 2n = 62 to 2n = 68, and nuclear DNA 1C content from 0.342 to 0.361 pg DNA. Regressions of DNA content on chromosome number are nonsignificant for data analyzed by individual or population, and a regression model that excludes slope is favored over a model in which chromosome number predicts genome size. Chromosome rearrangements within cytogenetically variable Carex species are more likely a consequence of fission and fusion than of duplication and deletion. Moreover, neither genome size nor chromosome number is spatially autocorrelated, which suggests the potential for rapid chromosome evolution by fission and fusion at a relatively fine geographic scale (<350 km). These findings have important implications for ecological restoration and speciation within the largest angiosperm genus of the temperate zone.
Sampling strategies for radio-tracking coyotes
Smith, G.J.; Cary, J.R.; Rongstad, O.J.
1981-01-01
Ten coyotes radio-tracked for 24 h periods were most active at night and moved little during daylight hours. Home-range size determined from radio-locations of 3 adult coyotes increased with the number of locations until an asymptote was reached at about 35-40 independent day locations or 3-6 nights of hourly radio-locations. Activity of the coyote did not affect the asymptotic nature of the home-range calculations, but home-range sizes determined from more than 3 nights of hourly locations were considerably larger than home-range sizes determined from daylight locations. Coyote home-range sizes were calculated from daylight locations, full-night tracking periods, and half-night tracking periods. Full- and half-night sampling strategies involved obtaining hourly radio-locations during 12 and 6 h periods, respectively. The half-night sampling strategy was the best compromise for our needs, as it adequately indexed the home-range size, reduced time and energy spent, and standardized the area calculation without requiring the researcher to become completely nocturnal. Sight tracking also provided information about coyote activity and sociability.
NASA Astrophysics Data System (ADS)
Clement, Sandhya; Gardner, Brint; Razali, Wan Aizuddin W.; Coleman, Victoria A.; Jämting, Åsa K.; Catchpoole, Heather J.; Goldys, Ewa M.; Herrmann, Jan; Zvyagin, Andrei
2017-11-01
The estimation of nanoparticle number concentration in colloidal suspensions is a prerequisite in many procedures, and in particular in multi-stage, low-yield reactions. Here, we describe a rapid, non-destructive method based on optical extinction and dynamic light scattering (DLS), which combines measurements using common bench-top instrumentation with a numerical algorithm to calculate the particle size distribution (PSD) and concentration. These quantities were derived from Mie theory applied to measurements of the optical extinction spectrum of homogeneous, non-absorbing nanoparticles, and the relative PSD of a colloidal suspension. The work presents an approach to account for PSDs achieved by DLS which, due to the underlying model, may not be representative of the true sample PSD. The presented approach estimates the absolute particle number concentration of samples with mono-, bi-modal and broad size distributions with <50% precision. This provides a convenient and practical solution for number concentration estimation required during many applications of colloidal nanomaterials.
Bayesian assurance and sample size determination in the process validation life-cycle.
Faya, Paul; Seaman, John W; Stamey, James D
2017-01-01
Validation of pharmaceutical manufacturing processes is a regulatory requirement and plays a key role in the assurance of drug quality, safety, and efficacy. The FDA guidance on process validation recommends a life-cycle approach which involves process design, qualification, and verification. The European Medicines Agency makes similar recommendations. The main purpose of process validation is to establish scientific evidence that a process is capable of consistently delivering a quality product. A major challenge faced by manufacturers is the determination of the number of batches to be used for the qualification stage. In this article, we present a Bayesian assurance and sample size determination approach where prior process knowledge and data are used to determine the number of batches. An example is presented in which potency uniformity data is evaluated using a process capability metric. By using the posterior predictive distribution, we simulate qualification data and make a decision on the number of batches required for a desired level of assurance.
Spineli, Loukia M; Jenz, Eva; Großhennig, Anika; Koch, Armin
2017-08-17
A number of papers have proposed or evaluated the delayed-start design as an alternative to the standard two-arm parallel group randomized clinical trial (RCT) design in the field of rare disease. However the discussion is felt to lack a sufficient degree of consideration devoted to the true virtues of the delayed start design and the implications either in terms of required sample-size, overall information, or interpretation of the estimate in the context of small populations. To evaluate whether there are real advantages of the delayed-start design particularly in terms of overall efficacy and sample size requirements as a proposed alternative to the standard parallel group RCT in the field of rare disease. We used a real-life example to compare the delayed-start design with the standard RCT in terms of sample size requirements. Then, based on three scenarios regarding the development of the treatment effect over time, the advantages, limitations and potential costs of the delayed-start design are discussed. We clarify that delayed-start design is not suitable for drugs that establish an immediate treatment effect, but for drugs with effects developing over time, instead. In addition, the sample size will always increase as an implication for a reduced time on placebo resulting in a decreased treatment effect. A number of papers have repeated well-known arguments to justify the delayed-start design as appropriate alternative to the standard parallel group RCT in the field of rare disease and do not discuss the specific needs of research methodology in this field. The main point is that a limited time on placebo will result in an underestimated treatment effect and, in consequence, in larger sample size requirements compared to those expected under a standard parallel-group design. This also impacts on benefit-risk assessment.
NASA Astrophysics Data System (ADS)
Lastra, M.; de La Huz, R.; Sánchez-Mata, A. G.; Rodil, I. F.; Aerts, K.; Beloso, S.; López, J.
2006-02-01
Thirty-four exposed sandy beaches on the northern coast of Spain (from 42°11' to 43°44' N, and from 2°04' to 8°52' W; ca. 1000 km) were sampled over a range of beach sizes, beach morphodynamics and exposure rates. Ten equally spaced intertidal shore levels along six replicated transects were sampled at each beach. Sediment and macrofauna samples were collected using corers to a depth of 15 cm. Morphodynamic characteristics such as the beach face slope, wave environment, exposure rates, Dean's parameter and Beach State Index were estimated. Biotic results indicated that in all the beaches the community was dominated by isopods, amphipods and polychaetes, mostly belonging to the detritivorous-opportunistic trophic group. The number of intertidal species ranged from 9 to 31, their density being between 31 and 618 individuals m⁻², while individuals per linear metre (m⁻¹) ranged from 4962 to 172,215. The biomass, calculated as total ash-free dry weight (AFDW), varied from 0.027 to 2.412 g m⁻², and from 3.6 to 266.6 g m⁻¹. Multiple regression analysis indicated that number of species significantly increased with proximity to the wind-driven upwelling zone located to the west, i.e., west-coast beaches hosted more species than east-coast beaches. The number of species increased with decreasing mean grain size and increasing beach length. The density of individuals m⁻² increased with decreasing mean grain size, while biomass m⁻² increased with increasing food availability estimated as chlorophyll-a concentration in the water column of the swash zone. Multiple-regression analysis indicated that chlorophyll-a in the water column increased with increasing western longitude. Additional insights provided by single-regression analysis showed a positive relationship between the number of species and chlorophyll-a, while increasing biomass occurred with increasing mean grain size of the beach.
The results indicate that community characteristics in the exposed sandy beaches studied are affected by physical characteristics such as sediment size and beach length, but also by other factors dependent on coastal processes, such as food availability in the water column.
NASA Astrophysics Data System (ADS)
Mathis, Urs; Mohr, Martin; Forss, Anna-Maria
Particle measurements were performed in the exhaust of five light-duty vehicles (Euro-3) at +23, -7, and -20 °C ambient temperatures. The characterization included measurements of particle number, active surface area, number size distribution, and mass size distribution. We investigated two port-injection spark-ignition (PISI) vehicles, a direct-injection spark-ignition (DISI) vehicle, a compressed ignition (CI) vehicle with diesel particle filter (DPF), and a CI vehicle without DPF. To minimize sampling effects, particles were directly sampled from the tailpipe with a novel porous tube diluter at controlled sampling parameters. The diluted exhaust was split into two branches to measure either all or only non-volatile particles. Effect of ambient temperature was investigated on particle emission for cold and warmed-up engine. For the gasoline vehicles and the CI vehicle with DPF, the main portion of particle emission was found in the first minutes of the driving cycle at cold engine start. The particle emission of the CI vehicle without DPF was hardly affected by cold engine start. For the PISI vehicles, particle number emissions were superproportionally increased in the diameter size range from 0.1 to 0.3 μm during cold start at low ambient temperature. Based on the particle mass size distribution, the DPF removed smaller particles (dp < 0.5 μm) more efficiently than larger particles (dp > 0.5 μm). No significant effect of ambient temperature was observed when the engine was warmed up. Peak emission of volatile nanoparticles only took place at specific conditions and was poorly repeatable. Nucleation of particles was predominately observed during or after strong acceleration at high speed and during regeneration of the DPF.
Orth, Patrick; Zurakowski, David; Alini, Mauro; Cucchiarini, Magali
2013-01-01
Advanced tissue engineering approaches for articular cartilage repair in the knee joint rely on translational animal models. In these investigations, cartilage defects may be established either in one joint (unilateral design) or in both joints of the same animal (bilateral design). We hypothesized that a lower intraindividual variability following the bilateral strategy would reduce the number of required joints. Standardized osteochondral defects were created in the trochlear groove of 18 rabbits. In 12 animals, defects were produced unilaterally (unilateral design; n=12 defects), while defects were created bilaterally in 6 animals (bilateral design; n=12 defects). After 3 weeks, osteochondral repair was evaluated histologically applying an established grading system. Based on intra- and interindividual variabilities, required sample sizes for the detection of discrete differences in the histological score were determined for both study designs (α=0.05, β=0.20). Coefficients of variation (%CV) of the total histological score values were 1.9-fold increased following the unilateral design when compared with the bilateral approach (26 versus 14%CV). The resulting numbers of joints needed to treat were always higher for the unilateral design, resulting in an up to 3.9-fold increase in the required number of experimental animals. This effect was most pronounced for the detection of small-effect sizes and estimating large standard deviations. The data underline the possible benefit of bilateral study designs for the decrease of sample size requirements for certain investigations in articular cartilage research. These findings might also be transferred to other scoring systems, defect types, or translational animal models in the field of cartilage tissue engineering. PMID:23510128
Ferguson, Philip E; Sales, Catherine M; Hodges, Dalton C; Sales, Elizabeth W
2015-01-01
Recent publications have emphasized the importance of a multidisciplinary strategy for maximum conservation and utilization of lung biopsy material for advanced testing, which may determine therapy. This paper quantifies the effect of a multidisciplinary strategy implemented to optimize and increase tissue volume in CT-guided transthoracic needle core lung biopsies. The strategy was three-pronged: (1) once there was confidence diagnostic tissue had been obtained and if safe for the patient, additional biopsy passes were performed to further increase volume of biopsy material, (2) biopsy material was placed in multiple cassettes for processing, and (3) all tissue ribbons were conserved when cutting blocks in the histology laboratory. This study quantifies the effects of strategies #1 and #2. This retrospective analysis comparing CT-guided lung biopsies from 2007 and 2012 (before and after multidisciplinary approach implementation) was performed at a single institution. Patient medical records were reviewed and main variables analyzed include biopsy sample size, radiologist, number of blocks submitted, diagnosis, and complications. The biopsy sample size measured was considered to be directly proportional to tissue volume in the block. Biopsy sample size increased 2.5 fold with the average total biopsy sample size increasing from 1.0 cm (0.9-1.1 cm) in 2007 to 2.5 cm (2.3-2.8 cm) in 2012 (P<0.0001). The improvement was statistically significant for each individual radiologist. During the same time, the rate of pneumothorax requiring chest tube placement decreased from 15% to 7% (P = 0.065). No other major complications were identified. The proportion of tumor within the biopsy material was similar at 28% (23%-33%) and 35% (30%-40%) for 2007 and 2012, respectively. The number of cases with at least two blocks available for testing increased from 10.7% to 96.4% (P<0.0001). 
The effect of this multidisciplinary strategy to CT-guided lung biopsies was effective in significantly increasing tissue volume and number of blocks available for advanced diagnostic testing.
MaNGA: Target selection and Optimization
NASA Astrophysics Data System (ADS)
Wake, David
2015-01-01
The 6-year SDSS-IV MaNGA survey will measure spatially resolved spectroscopy for 10,000 nearby galaxies using the Sloan 2.5m telescope and the BOSS spectrographs with a new fiber arrangement consisting of 17 individually deployable IFUs. We present the simultaneous design of the target selection and IFU size distribution to optimally meet our targeting requirements. The requirements for the main samples were to use simple cuts in redshift and magnitude to produce an approximately flat number density of targets as a function of stellar mass, ranging from 1×10⁹ to 1×10¹¹ M⊙, and radial coverage to either 1.5 (Primary sample) or 2.5 (Secondary sample) effective radii, while maximizing S/N and spatial resolution. In addition we constructed a 'Color-Enhanced' sample where we required 25% of the targets to have an approximately flat number density in the color and mass plane. We show how these requirements are met using simple absolute magnitude (and color) dependent redshift cuts applied to an extended version of the NASA Sloan Atlas (NSA), how this determines the distribution of IFU sizes and the resulting properties of the MaNGA sample.
MaNGA: Target selection and Optimization
NASA Astrophysics Data System (ADS)
Wake, David
2016-01-01
The 6-year SDSS-IV MaNGA survey will measure spatially resolved spectroscopy for 10,000 nearby galaxies using the Sloan 2.5m telescope and the BOSS spectrographs with a new fiber arrangement consisting of 17 individually deployable IFUs. We present the simultaneous design of the target selection and IFU size distribution to optimally meet our targeting requirements. The requirements for the main samples were to use simple cuts in redshift and magnitude to produce an approximately flat number density of targets as a function of stellar mass, ranging from 1×10⁹ to 1×10¹¹ M⊙, and radial coverage to either 1.5 (Primary sample) or 2.5 (Secondary sample) effective radii, while maximizing S/N and spatial resolution. In addition we constructed a "Color-Enhanced" sample where we required 25% of the targets to have an approximately flat number density in the color and mass plane. We show how these requirements are met using simple absolute magnitude (and color) dependent redshift cuts applied to an extended version of the NASA Sloan Atlas (NSA), how this determines the distribution of IFU sizes and the resulting properties of the MaNGA sample.
Dry particle generation with a 3-D printed fluidized bed generator
Roesch, Michael; Roesch, Carolin; Cziczo, Daniel J.
2017-06-02
We describe the design and testing of PRIZE (PRinted fluidIZed bed gEnerator), a compact fluidized bed aerosol generator manufactured using stereolithography (SLA) printing. Dispersing small quantities of powdered materials – due to either rarity or expense – is challenging due to a lack of small, low-cost dry aerosol generators. With this as motivation, we designed and built a generator that uses a mineral dust or other dry powder sample mixed with bronze beads that sit atop a porous screen. A particle-free airflow is introduced, dispersing the sample as airborne particles. The total particle number concentrations and size distributions were measured during different stages of the assembling process to show that the SLA 3-D printed generator did not generate particles until the mineral dust sample was introduced. Furthermore, time-series measurements with Arizona Test Dust (ATD) showed stable total particle number concentrations of 10–150 cm⁻³, depending on the sample mass, from the sub- to super-micrometer size range. Additional tests with collected soil dust samples are also presented. PRIZE is simple to assemble, easy to clean, inexpensive and deployable for laboratory and field studies that require dry particle generation.
Dry particle generation with a 3-D printed fluidized bed generator
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roesch, Michael; Roesch, Carolin; Cziczo, Daniel J.
We describe the design and testing of PRIZE (PRinted fluidIZed bed gEnerator), a compact fluidized bed aerosol generator manufactured using stereolithography (SLA) printing. Dispersing small quantities of powdered materials – due to either rarity or expense – is challenging due to a lack of small, low-cost dry aerosol generators. With this as motivation, we designed and built a generator that uses a mineral dust or other dry powder sample mixed with bronze beads that sit atop a porous screen. A particle-free airflow is introduced, dispersing the sample as airborne particles. The total particle number concentrations and size distributions were measured during different stages of the assembling process to show that the SLA 3-D printed generator did not generate particles until the mineral dust sample was introduced. Furthermore, time-series measurements with Arizona Test Dust (ATD) showed stable total particle number concentrations of 10–150 cm⁻³, depending on the sample mass, from the sub- to super-micrometer size range. Additional tests with collected soil dust samples are also presented. PRIZE is simple to assemble, easy to clean, inexpensive and deployable for laboratory and field studies that require dry particle generation.
Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.
2015-01-01
Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
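The combination of singular value decomposition and regularization described above can be illustrated with a small sketch. This is not the authors' model. It assumes a zero-mean Gaussian prior on the regression coefficients, in which case the posterior mean coincides with the ridge estimate, and it computes that estimate through the SVD of the predictor matrix. The function name `svd_ridge_fit`, the penalty `lam`, and the synthetic data are hypothetical, and Python with NumPy is used for illustration only.

```python
import numpy as np


def svd_ridge_fit(X, y, lam):
    """Ridge estimate (posterior mean under a zero-mean Gaussian prior), via SVD.

    X   : (n, p) predictor matrix; p may exceed n
    y   : (n,) response vector
    lam : positive penalty (hypothetical name; plays the role of the prior precision ratio)
    """
    # Thin SVD: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Shrink each singular direction; small singular values
    # (near-collinear predictors) are damped the most.
    shrink = s / (s**2 + lam)
    return Vt.T @ (shrink * (U.T @ y))


if __name__ == "__main__":
    # Synthetic example with more predictors than field plots (assumed sizes).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 120))
    y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=50)
    beta = svd_ridge_fit(X, y, lam=1.0)
    print(beta[:3])
```

Working in the singular-vector basis keeps the fit well defined even when predictors outnumber plots or are strongly correlated, which are the two problems the abstract names.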
Exposure to ultrafine particles in hospitality venues with partial smoking bans.
Neuberger, Manfred; Moshammer, Hanns; Schietz, Armin
2013-01-01
Fine particles in hospitality venues with insufficient smoking bans indicate health risks from passive smoking. In a random sample of Viennese inns (restaurants, cafes, bars, pubs and discotheques) effects of partial smoking bans on indoor air quality were examined by measurement of count, size and chargeable surface of ultrafine particles (UFPs) sized 10-300 nm, simultaneously with mass of particles sized 300-2500 nm (PM2.5). Air samples were taken in 134 rooms unannounced during busy hours and analyzed by a diffusion size classifier and an optical particle counter. Highest number concentrations of particles were found in smoking venues and smoking rooms (median 66,011 pt/cm³). Even non-smoking rooms adjacent to smoking rooms were highly contaminated (median 25,973 pt/cm³), compared with non-smoking venues (median 7408 pt/cm³). The particle number concentration was significantly correlated with the fine particle mass (P<0.001). We conclude that the existing tobacco law in Austria is ineffective to protect customers in non-smoking rooms of hospitality premises. Health protection of non-smoking guests and employees from risky UFP concentration is insufficient, even in rooms labeled "non-smoking". Partial smoking bans with separation of smoking rooms failed.
The Discovery of Single-Nucleotide Polymorphisms—and Inferences about Human Demographic History
Wakeley, John; Nielsen, Rasmus; Liu-Cordero, Shau Neen; Ardlie, Kristin
2001-01-01
A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation “SDL,” for “SNP-discovered locus,” in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected. 
An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled. PMID:11704929
Li, Peng; Redden, David T.
2014-01-01
SUMMARY The sandwich estimator in generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to the moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for the CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738
Chaibub Neto, Elias
2015-01-01
In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson’s sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965
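The multinomial-weighting formulation described above can be sketched as follows. The paper works in R; this illustration uses Python with NumPy instead. The function names `multinomial_weights` and `pearson_from_weights` are hypothetical, and the sketch makes no claim to match the authors' code or performance results. Each row of the weight matrix holds one replication's multinomial counts divided by n, so every weighted moment for all replications comes from a single matrix-vector product.

```python
import numpy as np


def multinomial_weights(n, n_boot, rng):
    """(n_boot, n) matrix of bootstrap weights; each row sums to 1."""
    counts = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot)
    return counts / n


def pearson_from_weights(W, x, y):
    """Bootstrap replications of Pearson's r from weighted sample moments."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # First and second weighted moments for every replication at once.
    mx, my = W @ x, W @ y
    mxx, myy, mxy = W @ (x * x), W @ (y * y), W @ (x * y)
    # r = cov / (sd_x * sd_y); the 1/n vs 1/(n-1) factor cancels in the ratio.
    # A replication that draws a single observation n times has zero variance
    # and yields nan; this is vanishingly rare for moderate n.
    return (mxy - mx * my) / np.sqrt((mxx - mx**2) * (myy - my**2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=30)
    y = 0.5 * x + rng.normal(size=30)
    W = multinomial_weights(30, 1000, rng)
    reps = pearson_from_weights(W, x, y)
    print(np.percentile(reps, [2.5, 97.5]))  # percentile bootstrap interval
```

No resampled copy of the data is ever built. The weight matrix does cost O(n_boot × n) memory, which fits the abstract's note that generating the weights becomes the bottleneck at large sample sizes.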
Do icon arrays help reduce denominator neglect?
Garcia-Retamero, Rocio; Galesic, Mirta; Gigerenzer, Gerd
2010-01-01
Denominator neglect is the focus on the number of times a target event has happened (e.g., the number of treated and nontreated patients who die) without considering the overall number of opportunities for it to happen (e.g., the overall number of treated and nontreated patients). In 2 studies, we addressed the effect of denominator neglect in problems involving treatment risk reduction where samples of treated and non-treated patients and the relative risk reduction were of different sizes. We also tested whether using icon arrays helps people take these different sample sizes into account. We especially focused on older adults, who are often more disadvantaged when making decisions about their health. Study 1 was conducted on a laboratory sample using a within-subjects design; study 2 was conducted on a nonstudent sample interviewed through the Web using a between-subjects design. Accuracy of understanding risk reduction. Participants often paid too much attention to numerators and insufficient attention to denominators when numerical information about treatment risk reduction was provided. Adding icon arrays to the numerical information, however, drew participants' attention to the denominators and helped them make more accurate assessments of treatment risk reduction. Icon arrays were equally helpful to younger and older adults. Building on previous research showing that problems with understanding numerical information often do not reside in the mind but in the representation of the problem, the results show that icon arrays are an effective method of eliminating denominator neglect.
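A short worked example may help show why unequal sample sizes invite denominator neglect. The patient counts below are invented for illustration and do not come from the study, and the helper names `risk` and `relative_risk_reduction` are hypothetical. Comparing numerators alone (80 deaths versus 30) suggests the treatment is harmful, whereas dividing by the group sizes shows a lower risk among treated patients.

```python
from fractions import Fraction


def risk(events, total):
    """Proportion of patients in a group who experienced the event."""
    return Fraction(events, total)


def relative_risk_reduction(treated_events, treated_total,
                            control_events, control_total):
    """1 - (risk among treated / risk among untreated)."""
    return 1 - risk(treated_events, treated_total) / risk(control_events, control_total)


# Hypothetical counts: 80 of 800 treated patients die, 30 of 100 untreated die.
treated_risk = risk(80, 800)
control_risk = risk(30, 100)
rrr = relative_risk_reduction(80, 800, 30, 100)

print(f"treated risk   = {float(treated_risk):.2f}")
print(f"untreated risk = {float(control_risk):.2f}")
print(f"relative risk reduction = {float(rrr):.2f}")
```

An icon array conveys the same denominators visually, by drawing every patient in each group and highlighting those who died.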
Burgess, George H.; Bruce, Barry D.; Cailliet, Gregor M.; Goldman, Kenneth J.; Grubbs, R. Dean; Lowe, Christopher G.; MacNeil, M. Aaron; Mollet, Henry F.; Weng, Kevin C.; O'Sullivan, John B.
2014-01-01
White sharks are highly migratory and segregate by sex, age and size. Unlike marine mammals, they neither surface to breathe nor frequent haul-out sites, hindering generation of abundance data required to estimate population size. A recent tag-recapture study used photographic identifications of white sharks at two aggregation sites to estimate abundance in “central California” at 219 mature and sub-adult individuals. They concluded this represented approximately one-half of the total abundance of mature and sub-adult sharks in the entire eastern North Pacific Ocean (ENP). This low estimate generated great concern within the conservation community, prompting petitions for governmental endangered species designations. We critically examine that study and find violations of model assumptions that, when considered in total, lead to population underestimates. We also use a Bayesian mixture model to demonstrate that the inclusion of transient sharks, characteristic of white shark aggregation sites, would substantially increase abundance estimates for the adults and sub-adults in the surveyed sub-population. Using a dataset obtained from the same sampling locations and widely accepted demographic methodology, our analysis indicates a minimum all-life stages population size of >2000 individuals in the California subpopulation is required to account for the number and size range of individual sharks observed at the two sampled sites. Even accounting for methodological and conceptual biases, an extrapolation of these data to estimate the white shark population size throughout the ENP is inappropriate. The true ENP white shark population size is likely several-fold greater as both our study and the original published estimate exclude non-aggregating sharks and those that independently aggregate at other important ENP sites. 
Accurately estimating the central California and ENP white shark population size requires methodologies that account for biases introduced by sampling a limited number of sites and that account for all life history stages across the species' range of habitats. PMID:24932483
Oba, Yurika; Yamada, Toshihiro
2017-05-01
We estimated the sample size (the number of samples) required to evaluate the concentration of radiocesium (137Cs) in Japanese fir (Abies firma Sieb. & Zucc.), 5 years after the outbreak of the Fukushima Daiichi Nuclear Power Plant accident. We investigated the spatial structure of the contamination levels in this species growing in a mixed deciduous broadleaf and evergreen coniferous forest stand. We sampled 40 saplings with a tree height of 150-250 cm in a Fukushima forest community. The results showed that: (1) there was no correlation between the 137Cs concentration in needles and soil, and (2) the difference in the spatial distribution pattern of 137Cs concentration between needles and soil suggests that the contribution of root uptake to 137Cs in new needles of this species may be minor in the 5 years after the radionuclides were released into the atmosphere. The concentration of 137Cs in needles showed a strong positive spatial autocorrelation in the distance class from 0 to 2.5 m, suggesting that the statistical analysis of data should consider spatial autocorrelation in the case of an assessment of the radioactive contamination of forest trees. According to our sample size analysis, a sample size of seven trees was required to determine the mean contamination level within an error in the means of no more than 10%. This required sample size may be feasible for most sites. Copyright © 2017 Elsevier Ltd. All rights reserved.
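The abstract does not say which formula the authors used for their sample size analysis. As a rough sketch of how such a calculation is often done, assuming independent samples and a normal approximation (an assumption that the reported spatial autocorrelation would weaken), the number of samples needed to estimate a mean within a relative error e is about (z * CV / e)^2, where CV is the coefficient of variation. The function name and the CV value below are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(cv, relative_error, confidence=0.95):
    """Samples needed so the mean is within `relative_error` of the truth.

    Normal approximation with independent samples; cv is the coefficient
    of variation (standard deviation divided by the mean).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil((z * cv / relative_error) ** 2)


# Hypothetical coefficient of variation of 25%, target error of 10%.
print(required_sample_size(cv=0.25, relative_error=0.10))
```

Halving the tolerated error roughly quadruples the required number of samples, which is why the target precision dominates this kind of planning.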
NASA Astrophysics Data System (ADS)
Alexander, Louise; Snape, Joshua F.; Joy, Katherine H.; Downes, Hilary; Crawford, Ian A.
2016-09-01
Lunar mare basalts provide insights into the compositional diversity of the Moon's interior. Basalt fragments from the lunar regolith can potentially sample lava flows from regions of the Moon not previously visited, thus, increasing our understanding of lunar geological evolution. As part of a study of basaltic diversity at the Apollo 12 landing site, detailed petrological and geochemical data are provided here for 13 basaltic chips. In addition to bulk chemistry, we have analyzed the major, minor, and trace element chemistry of mineral phases which highlight differences between basalt groups. Where samples contain olivine, the equilibrium parent melt magnesium number (Mg#; atomic Mg/[Mg + Fe]) can be calculated to estimate parent melt composition. Ilmenite and plagioclase chemistry can also determine differences between basalt groups. We conclude that samples of approximately 1-2 mm in size can be categorized provided that appropriate mineral phases (olivine, plagioclase, and ilmenite) are present. Where samples are fine-grained (grain size <0.3 mm), a "paired samples t-test" can provide a statistical comparison between a particular sample and known lunar basalts. Of the fragments analyzed here, three are found to belong to each of the previously identified olivine and ilmenite basalt suites, four to the pigeonite basalt suite, one is an olivine cumulate, and two could not be categorized because of their coarse grain sizes and lack of appropriate mineral phases. Our approach introduces methods that can be used to investigate small sample sizes (i.e., fines) from future sample return missions to investigate lava flow diversity and petrological significance.
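The magnesium number defined in this abstract, atomic Mg/[Mg + Fe], can be computed from oxide weight percentages as sketched below. The sketch assumes all iron is reported as FeO, and the function name is illustrative; it covers only the Mg# definition itself, not the equilibrium parent-melt calculation, which additionally needs an olivine-melt Fe-Mg exchange coefficient that the abstract does not give.

```python
MGO_MOLAR_MASS = 40.304  # g/mol (Mg 24.305 + O 15.999)
FEO_MOLAR_MASS = 71.844  # g/mol (Fe 55.845 + O 15.999)


def mg_number(mgo_wt_pct, feo_wt_pct):
    """Atomic Mg / (Mg + Fe) from oxide weight percent, assuming all Fe is FeO."""
    mg_moles = mgo_wt_pct / MGO_MOLAR_MASS
    fe_moles = feo_wt_pct / FEO_MOLAR_MASS
    return mg_moles / (mg_moles + fe_moles)


# Hypothetical bulk composition: 10 wt% MgO, 20 wt% FeO.
print(round(mg_number(10.0, 20.0), 3))
```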
NASA Technical Reports Server (NTRS)
Woods, D.
1980-01-01
The size distributions of particles in the exhaust plumes from the Titan rockets launched in August and September 1977 were determined from in situ measurements made from a small sampling aircraft that flew through the plumes. Two different sampling instruments were employed, a quartz crystal microbalance (QCM) cascade impactor and a forward scattering spectrometer probe (FSSP). The QCM measured the nonvolatile component of the aerosols in the plume covering an aerodynamic size ranging from 0.05 to 25 micrometers diameter. The FSSP, flown outside the aircraft under the nose section, measured both the liquid droplets and the solid particles over a size range from 0.5 to 7.5 micrometers in diameter. The particles were counted and classified into 15 size intervals. The presence of a large number of liquid droplets in the exhaust clouds is discussed and data are plotted for each launch and compared.
Quantitative study of fungiform papillae and taste buds on the cat's tongue.
Robinson, P P; Winkles, P A
1990-01-01
The number of fungiform papillae has been counted on the tongues of six adult cats and of kittens both at birth and aged 2 and 4 months. Papillae were sampled from different regions of the tongue, and their size and the number of taste buds they contained were determined using histological sections taken parallel to the tongue surface. There were approximately 250 fungiform papillae on the tongues of the adult cats, the papillae were most numerous at the tip of the tongue, and there was no significant difference between the number of papillae on each side. The size of the papillae increased from a mean maximum diameter of 0.28 mm at the tip of the tongue to 0.48 mm at the back; the mean number of taste buds increased correspondingly from 6.9 to 16.6. The kitten tongues had a number and distribution of fungiform papillae similar to that found in the adults. In the neonate, papillae were smaller and contained fewer taste buds; these parameters increased with the corresponding increase in tongue size in the 2- and 4-month-old kittens.
Identification of missing variants by combining multiple analytic pipelines.
Ren, Yingxue; Reddy, Joseph S; Pottier, Cyril; Sarangi, Vivekananda; Tian, Shulan; Sinnwell, Jason P; McDonnell, Shannon K; Biernacka, Joanna M; Carrasquillo, Minerva M; Ross, Owen A; Ertekin-Taner, Nilüfer; Rademakers, Rosa; Hudson, Matthew; Mainzer, Liudmila Sergeevna; Asmann, Yan W
2018-04-16
After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires large sample sizes for statistical power and has brought up questions about whether the current variant calling practices are adequate for large cohorts. It is well-known that there are discrepancies between variants called by different pipelines, and that using a single pipeline always misses true variants exclusively identifiable by other pipelines. Nonetheless, it is common practice today to call variants by one pipeline due to computational cost and assume that false negative calls are a small percent of total. We analyzed 10,000 exomes from the Alzheimer's Disease Sequencing Project (ADSP) using multiple analytic pipelines consisting of different read aligners and variant calling strategies. We compared variants identified by using two aligners in 50, 100, 200, 500, 1000, and 1952 samples; and compared variants identified by adding single-sample genotyping to the default multi-sample joint genotyping in 50, 100, 500, 2000, 5000 and 10,000 samples. We found that using a single pipeline missed increasing numbers of high-quality variants correlated with sample sizes. By combining two read aligners and two variant calling strategies, we rescued 30% of pass-QC variants at sample size of 2000, and 56% at 10,000 samples. The rescued variants had higher proportions of low frequency (minor allele frequency [MAF] 1-5%) and rare (MAF < 1%) variants, which are the very type of variants of interest. In 660 Alzheimer's disease cases with earlier onset ages of ≤65, 4 out of 13 (31%) previously-published rare pathogenic and protective mutations in APP, PSEN1, and PSEN2 genes were undetected by the default one-pipeline approach but recovered by the multi-pipeline approach.
Identification of the complete variant set from sequencing data is the prerequisite of genetic association analyses. The current analytic practice of calling genetic variants from sequencing data using a single bioinformatics pipeline is no longer adequate with the increasingly large projects. The number and percentage of quality variants that passed quality filters but are missed by the one-pipeline approach rapidly increased with sample size.
STUDY OF HOME DEMONSTRATION UNITS IN A SAMPLE OF 27 COUNTIES IN NEW YORK STATE, NUMBER 3.
ERIC Educational Resources Information Center
ALEXANDER, FRANK D.; HARSHAW, JEAN
AN EXPLORATORY STUDY EXAMINED CHARACTERISTICS OF 1,128 HOME DEMONSTRATION UNITS TO SUGGEST HYPOTHESES AND SCOPE FOR A MORE INTENSIVE STUDY OF A SMALL SAMPLE OF UNITS, AND TO PROVIDE GUIDANCE IN SAMPLING. DATA WERE OBTAINED FROM A SPECIALLY DESIGNED MEMBERSHIP CARD USED IN 1962. UNIT SIZE AVERAGED 23.6 MEMBERS BUT THE RANGE WAS FAIRLY GREAT. A NEED…
Size distribution and growth rate of crystal nuclei near critical undercooling in small volumes
NASA Astrophysics Data System (ADS)
Kožíšek, Z.; Demo, P.
2017-11-01
Kinetic equations are numerically solved within the standard nucleation model to determine the size distribution of nuclei in small volumes near critical undercooling. Critical undercooling, at which the first nuclei are detected within the system, depends on the droplet volume. The size distribution of nuclei reaches its stationary value after some time delay and decreases with nucleus size. Only a certain maximum size of nuclei is reached in small volumes near critical undercooling. As a model system, we selected the recently studied nucleation in Ni droplets [J. Bokeloh et al., Phys. Rev. Lett. 107 (2011) 145701] because experimental and simulation data are available. However, using these data for sample masses from 23 μg up to 63 mg (corresponding to experiments) leads to a size distribution of nuclei in which no critical nuclei are formed in the Ni droplet (the number of critical nuclei < 1). If one takes into account the size dependence of the interfacial energy, the size distribution of nuclei increases to reasonable values. In smaller volumes (V ≤ 10^-9 m^3) the nucleus size reaches a maximum, which quickly increases with undercooling. Supercritical clusters continue their growth only if the number of critical nuclei is sufficiently high.
Vitamin D receptor gene and osteoporosis - author's response
DOE Office of Scientific and Technical Information (OSTI.GOV)
Looney, J.E.; Yoon, Hyun Koo; Fischer, M.
1996-04-01
We appreciate the comments of Dr. Nguyen et al. about our recent study, but we disagree with their suggestion that the lack of an association between low bone density and the BB VDR genotype, which we reported, is an artifact generated by the small sample size. Furthermore, our results are consistent with similar conclusions reached by a number of other investigators, as recently reported by Peacock. Peacock states "Taken as a whole, the results of studies outlined ... indicate that VDR alleles, cannot account for the major part of the heritable component of bone density as indicated by Morrison et al.". The majority of the 17 studies cited in this editorial could not confirm an association between the VDR genotype and the bone phenotype. Surely one cannot criticize this combined work as representing an artifact because of too small a sample size. We do not dispute the suggestion by Nguyen et al. that large sample sizes are required to analyze small biological effects. This is evident in both Peacock's summary and in their own bone density studies. We did not design our study with a larger sample size because, based on the work of Morrison et al., we had hypothesized a large biological effect; large sample sizes are only needed for small biological effects. 4 refs.
ERIC Educational Resources Information Center
Nevitt, Jonathan; Hancock, Gregory R.
2001-01-01
Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trattner, Sigal; Cheng, Bin; Pieniazek, Radoslaw L.
2014-04-15
Purpose: Effective dose (ED) is a widely used metric for comparing ionizing radiation burden between different imaging modalities, scanners, and scan protocols. In computed tomography (CT), ED can be estimated by performing scans on an anthropomorphic phantom in which metal-oxide-semiconductor field-effect transistor (MOSFET) solid-state dosimeters have been placed to enable organ dose measurements. Here a statistical framework is established to determine the sample size (number of scans) needed for estimating ED to a desired precision and confidence, for a particular scanner and scan protocol, subject to practical limitations. Methods: The statistical scheme involves solving equations which minimize the sample size required for estimating ED to desired precision and confidence. It is subject to a constrained variation of the estimated ED and solved using the Lagrange multiplier method. The scheme incorporates measurement variation introduced both by MOSFET calibration, and by variation in MOSFET readings between repeated CT scans. Sample size requirements are illustrated on cardiac, chest, and abdomen–pelvis CT scans performed on a 320-row scanner and chest CT performed on a 16-row scanner. Results: Sample sizes for estimating ED vary considerably between scanners and protocols. Sample size increases as the required precision or confidence is higher and also as the anticipated ED is lower. For example, for a helical chest protocol, for 95% confidence and 5% precision for the ED, 30 measurements are required on the 320-row scanner and 11 on the 16-row scanner when the anticipated ED is 4 mSv; these sample sizes are 5 and 2, respectively, when the anticipated ED is 10 mSv. Conclusions: Applying the suggested scheme, it was found that even at modest sample sizes, it is feasible to estimate ED with high precision and a high degree of confidence. As CT technology develops enabling ED to be lowered, more MOSFET measurements are needed to estimate ED with the same precision and confidence.
Kunz, Cornelia U; Stallard, Nigel; Parsons, Nicholas; Todd, Susan; Friede, Tim
2017-03-01
Regulatory authorities require that the sample size of a confirmatory trial is calculated prior to the start of the trial. However, the sample size quite often depends on parameters that might not be known in advance of the study. Misspecification of these parameters can lead to under- or overestimation of the sample size. Both situations are unfavourable as the first one decreases the power and the latter one leads to a waste of resources. Hence, designs have been suggested that allow a re-assessment of the sample size in an ongoing trial. These methods usually focus on estimating the variance. However, for some methods the performance depends not only on the variance but also on the correlation between measurements. We develop and compare different methods for blinded estimation of the correlation coefficient that are less likely to introduce operational bias when the blinding is maintained. Their performance with respect to bias and standard error is compared to the unblinded estimator. We simulated two different settings: one assuming that all group means are the same and one assuming that different groups have different means. Simulation results show that the naïve (one-sample) estimator is only slightly biased and has a standard error comparable to that of the unblinded estimator. However, if the group means differ, other estimators have better performance depending on the sample size per group and the number of groups. © 2016 The Authors. Biometrical Journal Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
PMID:27886393
Sample size and power considerations in network meta-analysis
2012-01-01
Background: Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings: In this article, we develop easy-to-use flexible methods for estimating the 'effective sample size' in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion: The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327
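One way to make the "effective sample size" idea concrete is the following sketch, which rests on a simplifying assumption not stated in the abstract: the variance of each pairwise estimate is inversely proportional to its total sample size, with the same proportionality constant throughout. The variance of an indirect A-versus-B estimate through a common comparator C is the sum of the A-versus-C and B-versus-C variances, so 1/n_eff = 1/n_AC + 1/n_BC. The function name is illustrative.

```python
def effective_indirect_sample_size(n_ac, n_bc):
    """Pairwise-equivalent sample size of an indirect A-vs-B comparison via C.

    Assumes variance of each comparison is proportional to 1/n with a common
    constant, so the indirect variance is the sum of the two direct variances.
    """
    return (n_ac * n_bc) / (n_ac + n_bc)


# Hypothetical totals: 1000 patients in A-vs-C trials, 1000 in B-vs-C trials.
print(effective_indirect_sample_size(1000, 1000))

# Unbalanced evidence is limited by the smaller side.
print(effective_indirect_sample_size(4000, 1000))
```

Under this assumption an indirect comparison built from two equally sized bodies of evidence carries only a quarter of the information of a head-to-head comparison with the same total number of patients.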
Penton, C. Ryan; Gupta, Vadakattu V. S. R.; Yu, Julian; Tiedje, James M.
2016-01-01
We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5, and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI) were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungal community structure, replicate dispersion and the number of operational taxonomic units (OTUs) retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified Glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation. PMID:27313569
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pan, Bo; Shibutani, Yoji, E-mail: sibutani@mech.eng.osaka-u.ac.jp; Zhang, Xu
2015-07-07
Recent research has explained that the steeply increasing yield strength in metals depends on decreasing sample size. In this work, we derive a statistical physical model of the yield strength of finite single-crystal micro-pillars that depends on single-ended dislocation pile-up inside the micro-pillars. We show that this size effect can be explained almost completely by considering the stochastic lengths of the dislocation source and the dislocation pile-up length in the single-crystal micro-pillars. The Hall–Petch-type relation holds even in a microscale single-crystal, which is characterized by its dislocation source lengths. Our quantitative conclusions suggest that the number of dislocation sources and pile-ups are significant factors for the size effect. They also indicate that starvation of dislocation sources is another reason for the size effect. Moreover, we investigated the explicit relationship between the stacking fault energy and the dislocation "pile-up" effect inside the sample: materials with low stacking fault energy exhibit an obvious dislocation pile-up effect. Our proposed physical model predicts a sample strength that agrees well with experimental data, and our model can give a more precise prediction than the current single arm source model, especially for materials with low stacking fault energy.
Dombrowski, Kirk; Khan, Bilal; Wendel, Travis; McLean, Katherine; Misshula, Evan; Curtis, Ric
2012-12-01
As part of a recent study of the dynamics of the retail market for methamphetamine use in New York City, we used network sampling methods to estimate the size of the total networked population. This process involved sampling from respondents' list of co-use contacts, which in turn became the basis for capture-recapture estimation. Recapture sampling was based on links to other respondents derived from demographic and "telefunken" matching procedures, the latter being an anonymized version of telephone number matching. This paper describes the matching process used to discover the links between the solicited contacts and project respondents, the capture-recapture calculation, the estimation of "false matches", and the development of confidence intervals for the final population estimates. A final population of 12,229 was estimated, with a range of 8,235 to 23,750. The techniques described here have the special virtue of deriving an estimate for a hidden population while retaining respondent anonymity and the anonymity of network alters, but likely require a larger sample size than the 132 persons interviewed to attain acceptable confidence levels for the estimate.
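The abstract does not specify which capture-recapture estimator was used, and the study's network-based procedure with false-match correction is more involved than any two-sample formula. Purely to illustrate the underlying logic, the sketch below shows the classical two-sample Lincoln-Petersen estimator and Chapman's bias-corrected variant; the counts are hypothetical and unrelated to the study's data.

```python
def lincoln_petersen(n_first, n_second, n_recaptured):
    """Classical two-sample estimate: N = n1 * n2 / m."""
    if n_recaptured == 0:
        raise ValueError("estimate is undefined with zero recaptures")
    return n_first * n_second / n_recaptured


def chapman(n_first, n_second, n_recaptured):
    """Chapman's bias-corrected estimate: N = (n1 + 1)(n2 + 1)/(m + 1) - 1."""
    return (n_first + 1) * (n_second + 1) / (n_recaptured + 1) - 1


# Hypothetical counts: 100 in the first sample, 80 in the second, 20 in both.
print(lincoln_petersen(100, 80, 20))
print(round(chapman(100, 80, 20), 1))
```

Both estimators become unstable when the number of recaptures is small, which is consistent with the abstract's remark that a larger sample would be needed to narrow the interval.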
Decomposition and model selection for large contingency tables.
Dahinden, Corinne; Kalisch, Markus; Bühlmann, Peter
2010-04-01
Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio-medical problem in cancer research.
Secular trends in impact factor of neonatology publications over a 10-year period.
Marom, Ronella; Mimouni, Francis B; Cohen, Shlomi; Lubetzky, Ronit; Mandel, Dror
2012-10-01
To test the hypotheses that published randomized clinical trials (RCTs) in neonatology with negative results (NR) are more likely to be published in journals with lower impact factor (IF) than those with positive results (PR); that there is an increase in the number of yearly published RCTs; that studies with large sample sizes are likely to be published in journals with higher IF. We used all English-written RCTs registered in MEDLINE between 1/1/2001-31/12/2010 in the field of neonatology. Each RCT was classified as having a PR or NR. IF of each journal was determined for the year of publication. We identified 329 RCTs. Yearly number of RCTs varied between 19 and 46, with no significant consistent linear increase over the years. There was no significant change over the years in average IF or in average patient size. IF and sample size of the studies were not significantly higher in studies with PR than in studies with NR. The number of RCTs per year in the field of neonatology has stabilized in the past 10 years, and RCTs with positive or negative results are published in journals of similar IF. © 2012 The Author(s)/Acta Paediatrica © 2012 Foundation Acta Paediatrica.
Dziak, John J.; Nahum-Shani, Inbal; Collins, Linda M.
2012-01-01
Factorial experimental designs have many potential advantages for behavioral scientists. For example, such designs may be useful in building more potent interventions, by helping investigators to screen several candidate intervention components simultaneously and decide which are likely to offer greater benefit before evaluating the intervention as a whole. However, sample size and power considerations may challenge investigators attempting to apply such designs, especially when the population of interest is multilevel (e.g., when students are nested within schools, or employees within organizations). In this article we examine the feasibility of factorial experimental designs with multiple factors in a multilevel, clustered setting (i.e., of multilevel multifactor experiments). We conduct Monte Carlo simulations to demonstrate how design elements such as the number of clusters, the number of lower-level units, and the intraclass correlation affect power. Our results suggest that multilevel, multifactor experiments are feasible for factor-screening purposes, because of the economical properties of complete and fractional factorial experimental designs. We also discuss resources for sample size planning and power estimation for multilevel factorial experiments. These results are discussed from a resource management perspective, in which the goal is to choose a design that maximizes the scientific benefit using the resources available for an investigation. PMID:22309956
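The abstract's point that the number of clusters, the cluster size, and the intraclass correlation jointly drive power can be illustrated with the standard design effect for equal-sized clusters, 1 + (m - 1) * ICC. This is a textbook approximation rather than the authors' Monte Carlo procedure, and the function names and numbers below are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering with equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# Hypothetical design: 20 schools, 30 students each, ICC = 0.05.
print(round(effective_sample_size(20, 30, 0.05), 1))

# Same 600 students spread over more, smaller clusters.
print(round(effective_sample_size(40, 15, 0.05), 1))
```

For a fixed total number of individuals, adding clusters raises the effective sample size more than adding members per cluster whenever the intraclass correlation is positive.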
Dziak, John J; Nahum-Shani, Inbal; Collins, Linda M
2012-06-01
Factorial experimental designs have many potential advantages for behavioral scientists. For example, such designs may be useful in building more potent interventions by helping investigators to screen several candidate intervention components simultaneously and to decide which are likely to offer greater benefit before evaluating the intervention as a whole. However, sample size and power considerations may challenge investigators attempting to apply such designs, especially when the population of interest is multilevel (e.g., when students are nested within schools, or when employees are nested within organizations). In this article, we examine the feasibility of factorial experimental designs with multiple factors in a multilevel, clustered setting (i.e., of multilevel, multifactor experiments). We conduct Monte Carlo simulations to demonstrate how design elements-such as the number of clusters, the number of lower-level units, and the intraclass correlation-affect power. Our results suggest that multilevel, multifactor experiments are feasible for factor-screening purposes because of the economical properties of complete and fractional factorial experimental designs. We also discuss resources for sample size planning and power estimation for multilevel factorial experiments. These results are discussed from a resource management perspective, in which the goal is to choose a design that maximizes the scientific benefit using the resources available for an investigation. (c) 2012 APA, all rights reserved
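As a rough illustration of why the number of clusters, the number of lower-level units, and the intraclass correlation all matter for power, the textbook design effect for cluster-level assignment can be sketched as below. This is not the Monte Carlo procedure used in the article; the function names and the example values (40 clusters of 20 members, intraclass correlation 0.05) are assumptions made for illustration only.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC (equal cluster sizes assumed)."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations carrying the same information as the clustered sample."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# Hypothetical design: 40 clusters of 20 members, ICC = 0.05
print(round(effective_sample_size(40, 20, 0.05), 1))
```

Under this approximation, adding clusters raises the effective sample size roughly in proportion, whereas adding members to existing clusters gives diminishing returns once the intraclass correlation is above zero.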
NASA Astrophysics Data System (ADS)
KIM, H.; Suk, M. K.; Jung, S. A.; Park, J. S.; Ko, J. S.
2016-12-01
The data quality of dual-polarimetric weather radar depends on the radar scanning strategy, including pulse length, pulse repetition frequency (PRF), antenna scan speed, and sampling number. At a given PRF and pulse length, the quality of radar moment data improves as the sampling number increases, but a higher sampling number either reduces the number of elevation angles that can be scanned in a given time or lengthens the time required for a radar volume scan. For operational weather radar, the sampling number is determined subjectively by an experienced radar operator. Determining a suitable sampling number remains challenging for operational dual-polarimetric weather radar. In this study, we analyzed the sensitivity of polarimetric measurements to sampling number based on a special radar experiment for rainfall and snowfall events using the S-band dual-polarimetric radar (YIT) at the Yong-In test bed. For this experiment, the YIT radar transmitted a beam polarized simultaneously in the horizontal and vertical with a pulse length of 1.0 μs and a single PRF of 600 Hz. The beam width and gate size were 1.0° and 250 m, respectively. The volume scan was composed of three PPI scans with sampling numbers (antenna scan speeds) of 40 (15°/s), 60 (10°/s), and 85 (7°/s) at the same elevation angle (0.2°). We first investigated the spatial fluctuation of the polarimetric measurements for the three sampling numbers using radial texture. As the sampling number increases, the radial fluctuations of the polarimetric measurements decrease. Second, we examined the sensitivity of a fuzzy-logic-based quality control algorithm for dual-polarimetric radar (Ye et al. 2015) to sampling number. The probability density functions (PDFs) of the fuzzy logic feature parameters for ground clutter and meteorological echo areas were compared. The overlapping area between the ground clutter and meteorological echo PDFs increases as the sampling number decreases. As the overlapping area increases, the classification of ground clutter (or meteorological echo) in the fuzzy logic classifier becomes more difficult because the characteristics of ground clutter and meteorological echoes are similar.
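The three sampling numbers quoted above are consistent with a simple relation between PRF and antenna scan speed. The sketch below assumes that one ray spans 1.0° of azimuth (taken here to equal the stated beam width) and that the result is truncated to a whole number of pulses; both are assumptions, and the function name is hypothetical.

```python
def sampling_number(prf_hz, scan_speed_deg_per_s, ray_width_deg=1.0):
    """Pulses available per ray: pulses per second times seconds spent sweeping one ray."""
    return prf_hz * ray_width_deg / scan_speed_deg_per_s


# PRF of 600 Hz at the three antenna scan speeds used in the experiment
for speed in (15, 10, 7):
    print(speed, int(sampling_number(600, speed)))
```

Slowing the antenna from 15°/s to 7°/s roughly doubles the pulses per ray, which is the trade-off between data quality and volume scan time described in the abstract.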
Olives, Casey; Valadez, Joseph J.; Brooker, Simon J.; Pagano, Marcello
2012-01-01
Background Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. Methodology We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n = 15 and n = 25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Principal Findings Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n = 15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. Conclusion/Significance This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalence of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools. PMID:22970333
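A minimal sketch of how operating characteristic curves for a three-category design might be computed, assuming the number of positives in a sample of n children is binomial and that a school is classified as low if the count is at most d1, high if it exceeds d2, and medium otherwise. The decision thresholds d1 = 1 and d2 = 7 are hypothetical and are not taken from the article; only n = 15 comes from the abstract.

```python
from math import comb


def binom_pmf(x, n, p):
    """Probability of exactly x positives among n children at true prevalence p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def classification_probs(n, d1, d2, p):
    """Return (P(low), P(medium), P(high)) at true prevalence p.

    Rule assumed for illustration: low if X <= d1, high if X > d2, medium otherwise.
    """
    low = sum(binom_pmf(x, n, p) for x in range(0, d1 + 1))
    high = sum(binom_pmf(x, n, p) for x in range(d2 + 1, n + 1))
    return low, 1.0 - low - high, high


# Trace the three curves over a grid of true prevalences
for p in (0.05, 0.10, 0.30, 0.50, 0.70):
    print(p, [round(q, 3) for q in classification_probs(15, 1, 7, p)])
```

Plotting the three probabilities against p gives one operating characteristic curve per category; the regions where two curves cross are where misclassification is most likely.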
NASA Astrophysics Data System (ADS)
Song, Ho-Jun; Kim, Ji-Woo; Kook, Min-Suk; Moon, Won-Jin; Park, Yeong-Joon
2010-09-01
AC-type microarc oxidation (MAO) and hydrothermal treatment techniques were used to enhance the bioactivity of commercially pure titanium (CP-Ti). The porous TiO2 layer fabricated by the MAO treatment had a dominant anatase structure and contained Ca and P ions. The MAO-treated specimens were treated hydrothermally to form HAp crystallites on the titanium oxide layer in an alkaline aqueous solution (OH-solution) or phosphorous-containing alkaline solution (POH-solution). A small number of micro-sized hydroxyapatite (HAp) crystallites and a thin layer composed of nano-sized HAps were formed on the Ti-MAO-OH group treated hydrothermally in an OH-solution, whereas a large number of micro-sized HAp crystallites and dense anatase TiO2 nanorods were formed on the Ti-MAO-POH group treated hydrothermally in a POH-solution. The layer of bone-like apatite that formed on the surface of the POH-treated sample after soaking in a modified simulated body fluid was thicker than that on the OH-treated samples.
Mercury and methylmercury in reservoirs in Indiana
Risch, Martin R.; Fredericksen, Amanda L.
2015-01-01
Methylmercury (reported as Hg) in fish-tissue samples collected for the State fish consumption advisory program was used to describe MeHg food-web accumulation and magnification in the reservoirs. The highest percentages of fish-tissue samples with Hg concentrations that exceeded the criterion of 0.30 milligram per kilogram for protection of human health were from Monroe Lake (38 percent) and Patoka Lake (33 percent). A review of the number and size of fish species caught from these two reservoirs resulted in two implications for fish consumption by humans. First, the highest numbers of fish harvested for potential human consumption were species more likely to have MeHg concentrations lower than the human-health criterion (crappie, bluegill, and catfish). Second, although largemouth bass were likely to have MeHg concentrations higher than the human-health criterion, they were caught and released more often than they were harvested. However, the average size largemouth bass (in both reservoirs) and above-average size walleye (in Monroe Lake) that were harvested for potential human consumption were likely to have MeHg concentrations higher than the human-health criterion.
Smith, W.P.; Wiedenfeld, D.A.; Hanel, P.B.; Twedt, D.J.; Ford, R.P.; Cooper, R.J.; Smith, Winston Paul
1993-01-01
To quantify efficacy of point count sampling in bottomland hardwood forests, we examined the influence of point count duration on corresponding estimates of number of individuals and species recorded. To accomplish this we conducted a total of 82 point counts 7 May-16 May 1992 distributed among three habitats (Wet, Mesic, Dry) in each of three regions within the lower Mississippi Alluvial Valley (MAV). Each point count consisted of recording the number of individual birds (all species) seen or heard during the initial three minutes and per each minute thereafter for a period totaling ten minutes. In addition, we included 384 point counts recorded during an 8-week period in each of 3 years (1985-1987) among 56 randomly-selected forest patches within the bottomlands of western Tennessee. Each point count consisted of recording the number of individuals (excluding migrating species) during each of four, 5 minute intervals for a period totaling 20 minutes. To estimate minimum sample size, we determined sampling variation at each level (region, habitat, and locality) with the 82 point counts from the lower MAV and applied the procedures of Neter and Wasserman (1974:493; Applied linear statistical models). Neither the cumulative number of individuals nor number of species per sampling interval attained an asymptote after 10 or 20 minutes of sampling. For western Tennessee bottomlands, total individual and species counts relative to point count duration were similar among years and comparable to the pattern observed throughout the lower MAV. Across the MAV, we recorded a total of 1,621 birds distributed among 52 species with the majority (8721/1621) representing 8 species. More birds were recorded within 25-50 m than in either of the other distance categories. There was significant variation in numbers of individuals and species among point counts. For both, significant differences between region and patch (nested within region) occurred; neither habitat nor interaction between habitat and region was significant. For α = 0.05 and β = 0.10, minimum sample size estimates (per factor level) varied by orders of magnitude depending upon the observed or specified range of desired detectable difference. For observed regional variation, 20 and 40 point counts were required to accommodate variability in total birds (MSE = 9.28) and species (MSE = 3.79), respectively; 25 percent of the mean could be achieved with 5 counts per factor level. Corresponding sample sizes required to detect differences of rarer species (e.g., Wood Thrush) were 500; for common species (e.g., Northern Cardinal) this same level of precision could be achieved with 100 counts.
Krishnan, Neeraja M.; Gaur, Prakhar; Chaudhary, Rakshit; Rao, Arjun A.; Panda, Binay
2012-01-01
Copy Number Alterations (CNAs), such as deletions and duplications, compose a larger percentage of genetic variations than single nucleotide polymorphisms or other structural variations in cancer genomes that undergo major chromosomal re-arrangements. It is, therefore, imperative to identify cancer-specific somatic copy number alterations (SCNAs), with respect to matched normal tissue, in order to understand their association with the disease. We have devised an accurate, sensitive, and easy-to-use tool, COPS, COpy number using Paired Samples, for detecting SCNAs. We rigorously tested the performance of COPS using short sequence simulated reads at various sizes and coverage of SCNAs, read depths, read lengths and also with real tumor:normal paired samples. We found COPS to perform better in comparison to other known SCNA detection tools for all evaluated parameters, namely, sensitivity (detection of true positives), specificity (detection of false positives) and size accuracy. COPS performed well for sequencing reads of all lengths when used with most upstream read alignment tools. Additionally, by incorporating a downstream boundary segmentation detection tool, the accuracy of SCNA boundaries was further improved. Here, we report an accurate, sensitive and easy-to-use tool for detecting cancer-specific SCNAs using short-read sequence data. In addition to cancer, COPS can be used for any disease as long as sequence reads from both disease and normal samples from the same individual are available. An added boundary segmentation detection module makes COPS-detected SCNA boundaries more specific for the samples studied. COPS is available at ftp://115.119.160.213 with username “cops” and password “cops”. PMID:23110103
NASA Astrophysics Data System (ADS)
Starost, K.; Frijns, E.; Laer, J. V.; Faisal, N.; Egizabal, A.; Elizextea, C.; Nelissen, I.; Blazquez, M.; Njuguna, J.
2017-05-01
In this study, the effect on nanoparticle emissions due to drilling on Polypropylene (PP) reinforced with 20% talc, 5% montmorillonite (MMT) and 5% Wollastonite (WO) is investigated. The study is the first to explore the nanoparticle release from WO and talc reinforced composites and compares the results to previously researched MMT. With 5% WO, equivalent tensile properties with a 10% weight reduction were obtained relative to the reference 20% talc sample. The materials were fabricated through injection moulding. The nanorelease studies were undertaken using the controlled drilling methodology for nanoparticle exposure assessment developed within the European Commission funded SIRENA Life 11 ENV/ES/506 project. Measurements were taken using CPC and DMS50 equipment for real-time characterization and measurements. The particle number concentration (of particles <1000 nm) and particle size distribution (4.87 nm - 562.34 nm) of the particles emitted during drilling were evaluated to investigate the effect of the silicate fillers on the particles released. The nano-filled samples exhibited a 33% decrease (MMT sample) or a 30% increase (WO sample) in the average particle number concentration released in comparison to the neat polypropylene sample. The size distribution data displayed a substantial percentage of the particles released from the PP, PP/WO and PP/MMT samples to be between 5 and 20 nm, whereas the PP/talc sample emitted larger particle diameters.
Dimensions of design space: a decision-theoretic approach to optimal research design.
Conti, Stefano; Claxton, Karl
2009-01-01
Bayesian decision theory can be used not only to establish the optimal sample size and its allocation in a single clinical study but also to identify an optimal portfolio of research combining different types of study design. Within a single study, the highest societal payoff to proposed research is achieved when its sample sizes and allocation between available treatment options are chosen to maximize the expected net benefit of sampling (ENBS). Where a number of different types of study informing different parameters in the decision problem could be conducted, the simultaneous estimation of ENBS across all dimensions of the design space is required to identify the optimal sample sizes and allocations within such a research portfolio. This is illustrated through a simple example of a decision model of zanamivir for the treatment of influenza. The possible study designs include: 1) a single trial of all the parameters, 2) a clinical trial providing evidence only on clinical endpoints, 3) an epidemiological study of natural history of disease, and 4) a survey of quality of life. The possible combinations, sample sizes, and allocation between trial arms are evaluated over a range of cost-effectiveness thresholds. The computational challenges are addressed by implementing optimization algorithms to search the ENBS surface more efficiently over such large dimensions.
Photographic techniques for characterizing streambed particle sizes
Whitman, Matthew S.; Moran, Edward H.; Ourso, Robert T.
2003-01-01
We developed photographic techniques to characterize coarse (>2-mm) and fine (≤2-mm) streambed particle sizes in 12 streams in Anchorage, Alaska. Results were compared with current sampling techniques to assess which provided greater sampling efficiency and accuracy. The streams sampled were wadeable and contained gravel—cobble streambeds. Gradients ranged from about 5% at the upstream sites to about 0.25% at the downstream sites. Mean particle sizes and size-frequency distributions resulting from digitized photographs differed significantly from those resulting from Wolman pebble counts for five sites in the analysis. Wolman counts were biased toward selecting larger particles. Photographic analysis also yielded a greater number of measured particles (mean = 989) than did the Wolman counts (mean = 328). Stream embeddedness ratings assigned from field and photographic observations were significantly different at 5 of the 12 sites, although both types of ratings showed a positive relationship with digitized surface fines. Visual estimates of embeddedness and digitized surface fines may both be useful indicators of benthic conditions, but digitizing surface fines produces quantitative rather than qualitative data. Benefits of the photographic techniques include reduced field time, minimal streambed disturbance, convenience of postfield processing, easy sample archiving, and improved accuracy and replication potential.
Ait Kaci Azzou, Sadoune; Larribe, Fabrice; Froda, Sorana
2015-01-01
The effective population size over time (demographic history) can be retraced from a sample of contemporary DNA sequences. In this paper, we propose a novel methodology based on importance sampling (IS) for exploring such demographic histories. Our starting point is the generalized skyline plot with the main difference being that our procedure, skywis plot, uses a large number of genealogies. The information provided by these genealogies is combined according to the IS weights. Thus, we compute a weighted average of the effective population sizes on specific time intervals (epochs), where the genealogies that agree more with the data are given more weight. We illustrate by a simulation study that the skywis plot correctly reconstructs the recent demographic history under the scenarios most commonly considered in the literature. In particular, our method can capture a change point in the effective population size, and its overall performance is comparable with that of the Bayesian skyline plot. We also introduce the case of serially sampled sequences and illustrate that it is possible to improve the performance of the skywis plot in the case of an exponential expansion of the effective population size. PMID:26300910
Divergent estimation error in portfolio optimization and in linear regression
NASA Astrophysics Data System (ADS)
Kondor, I.; Varga-Haszonits, I.
2008-08-01
The problem of estimation error in portfolio optimization is discussed, in the limit where the portfolio size N and the sample size T go to infinity such that their ratio is fixed. The estimation error strongly depends on the ratio N/T and diverges for a critical value of this parameter. This divergence is the manifestation of an algorithmic phase transition, it is accompanied by a number of critical phenomena, and displays universality. As the structure of a large number of multidimensional regression and modelling problems is very similar to portfolio optimization, the scope of the above observations extends far beyond finance, and covers a large number of problems in operations research, machine learning, bioinformatics, medical science, economics, and technology.
Grimplet, Jérôme; Tello, Javier; Laguna, Natalia; Ibáñez, Javier
2017-01-01
Grapevine cluster compactness has a clear impact on fruit quality and health status, as clusters with greater compactness are more susceptible to pests and diseases and ripen more asynchronously. Different parameters related to inflorescence and cluster architecture (length, width, branching, etc.), fruitfulness (number of berries, number of seeds) and berry size (length, width) contribute to the final level of compactness. From a collection of 501 clones of cultivar Garnacha Tinta, two compact and two loose clones with stable differences for cluster compactness-related traits were selected and phenotyped. Key organs and developmental stages were selected for sampling and transcriptomic analyses. Comparison of global gene expression patterns in flowers at the end of bloom allowed identification of potential gene networks with a role in determining the final berry number, berry size and ultimately cluster compactness. A large portion of the differentially expressed genes were found in networks related to cell division (carbohydrates uptake, cell wall metabolism, cell cycle, nucleic acids metabolism, cell division, DNA repair). Their greater expression level in flowers of compact clones indicated that the number of berries and the berry size at ripening appear related to the rate of cell replication in flowers during the early growth stages after pollination. In addition, fluctuations in auxin and gibberellin signaling and transport related gene expression support that they play a central role in fruit set and impact berry number and size. Other hormones, such as ethylene and jasmonate may differentially regulate indirect effects, such as defense mechanisms activation or polyphenols production. This is the first transcriptomic based analysis focused on the discovery of the underlying gene networks involved in grapevine traits of grapevine cluster compactness, berry number and berry size. PMID:28496449
Modeling the development of written language
Puranik, Cynthia S.; Foorman, Barbara; Foster, Elizabeth; Wilson, Laura Gehron; Tschinkel, Erika; Kantor, Patricia Thatcher
2011-01-01
Alternative models of the structure of individual and developmental differences of written composition and handwriting fluency were tested using confirmatory factor analysis of writing samples provided by first- and fourth-grade students. For both groups, a five-factor model provided the best fit to the data. Four of the factors represented aspects of written composition: macro-organization (use of top sentence and number and ordering of ideas), productivity (number and diversity of words used), complexity (mean length of T-unit and syntactic density), and spelling and punctuation. The fifth factor represented handwriting fluency. Handwriting fluency was correlated with written composition factors at both grades. The magnitude of developmental differences between first grade and fourth grade expressed as effect sizes varied for variables representing the five constructs: large effect sizes were found for productivity and handwriting fluency variables; moderate effect sizes were found for complexity and macro-organization variables; and minimal effect sizes were found for spelling and punctuation variables. PMID:22228924
Evans, T A
2001-12-01
Although mark-recapture protocols produce inaccurate population estimates of termite colonies, they might be employed to estimate a relative change in colony size. This possibility was tested using two Australian, mound-building, wood-eating, subterranean Coptotermes species. Three different toxicants delivered in baits were used to decrease (but not eliminate) colony size, and a single mark-recapture protocol was used to estimate pre- and postbaiting population sizes. For both species, the numbers of termites retrieved from bait stations varied widely, resulting in no significant differences in the numbers of termites sampled between treatments in either the pre- or postbaiting protocols. There were significantly fewer termites sampled in all treatments, controls included, in the postbaiting protocol compared with the prebaiting protocol, suggesting a seasonal change in forager numbers. The comparison of population estimates shows a large decrease in toxicant treated colonies compared with little change in control colonies, which suggests that estimating the relative decline in population size using mark-recapture protocols might be possible. However, the change in population estimate was due entirely to the significantly lower recapture rate in the control colonies relative to the toxicant treated colonies, as numbers of unmarked termites did not change between treatments. The population estimates should be treated with caution because low recapture rates produce dubious population estimates and, in some cases, postbaiting mark-recapture population estimates could be much greater than those at prebaiting, despite consumption of bait in sufficient quantities to cause population decline. A possible interaction between fat-stain markers and toxicants should be investigated if mark-recapture population estimates are used. Alternative methods of population change are advised, along with other indirect measures.
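The sensitivity of such estimates to the recapture rate can be seen in a short sketch. The abstract does not name the estimator used, so the simplest single mark-recapture estimator (Lincoln-Petersen) is assumed here, and the counts are invented for illustration.

```python
def lincoln_petersen(marked, caught, recaptured):
    """Estimate population size as marked * caught / recaptured.

    marked:     individuals marked and released in the first round
    caught:     individuals collected in the second round
    recaptured: marked individuals found among those collected
    """
    if recaptured <= 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return marked * caught / recaptured


# Same marking and sampling effort, different recapture counts (hypothetical numbers)
print(lincoln_petersen(1000, 500, 50))
print(lincoln_petersen(1000, 500, 5))
```

Because the recapture count sits in the denominator, a drop from 50 to 5 recaptured individuals multiplies the estimate by ten even though the number of unmarked individuals collected barely changes, which matches the caution expressed in the abstract.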
Karyological features of wild and cultivated forms of myrtle (Myrtus communis, Myrtaceae).
Serçe, S; Ekbiç, E; Suda, J; Gündüz, K; Kiyga, Y
2010-03-09
Myrtle is an evergreen shrub or small tree widespread throughout the Mediterranean region. In Turkey, both cultivated and wild forms, differing in plant and fruit size and fruit composition, can be found. These differences may have resulted from the domestication of the cultivated form over a long period of time. We investigated whether wild and cultivated forms of myrtle differ in karyological features (i.e., number of somatic chromosomes and relative genome size). We sampled two wild forms and six cultivated types of myrtle. All the samples had the same chromosome number (2n = 2x = 22). The results were confirmed by 4',6-diamidino-2-phenylindole (DAPI) flow cytometry. Only negligible variation (approximately 3%) in relative fluorescence intensity was observed among the different myrtle accessions, with wild genotypes having the smallest values. We concluded that despite considerable morphological differentiation, cultivated and wild myrtle genotypes in Turkey have similar karyological features.
Scott, Frank I; McConnell, Ryan A; Lewis, Matthew E; Lewis, James D
2012-04-01
Significant advances have been made in clinical and epidemiologic research methods over the past 30 years. We sought to demonstrate the impact of these advances on published gastroenterology research from 1980 to 2010. Twenty original clinical articles were randomly selected from each of three journals from 1980, 1990, 2000, and 2010. Each article was assessed for topic, whether the outcome was clinical or physiologic, study design, sample size, number of authors and centers collaborating, reporting of various statistical methods, and external funding. From 1980 to 2010, there was a significant increase in analytic studies, clinical outcomes, number of authors per article, multicenter collaboration, sample size, and external funding. There was increased reporting of P values, confidence intervals, and power calculations, and increased use of large multicenter databases, multivariate analyses, and bioinformatics. The complexity of clinical gastroenterology and hepatology research has increased dramatically, highlighting the need for advanced training of clinical investigators.
Detecting a Weak Association by Testing its Multiple Perturbations: a Data Mining Approach
NASA Astrophysics Data System (ADS)
Lo, Min-Tzu; Lee, Wen-Chung
2014-05-01
Many risk factors/interventions in epidemiologic/biomedical studies have minuscule effects. To detect such weak associations, one needs a study with a very large sample size (the number of subjects, n). The n of a study can be increased, but unfortunately only to an extent. Here, we propose a novel method which hinges on increasing sample size in a different direction: the total number of variables (p). We construct a p-based 'multiple perturbation test', and conduct power calculations and computer simulations to show that it can achieve a very high power to detect weak associations when p can be made very large. As a demonstration, we apply the method to analyze a genome-wide association study on age-related macular degeneration and identify two novel genetic variants that are significantly associated with the disease. The p-based method may set a stage for a new paradigm of statistical tests.
2011-01-01
Background The relationship between urbanicity and adolescent health is a critical issue for which little empirical evidence has been reported. Although an association has been suggested, a dichotomous rural versus urban comparison may not succeed in identifying differences between adolescent contexts. This study aims to assess the influence of locality size on risk behaviors in a national sample of young Mexicans living in low-income households, while considering the moderating effect of socioeconomic status (SES). Methods This is a secondary analysis of three national surveys of low-income households in Mexico in different settings: rural, semi-urban and urban areas. We analyzed risk behaviors in 15-21-year-olds and their potential relation to urbanicity. The risk behaviors explored were: tobacco and alcohol consumption, sexual initiation and condom use. The adolescents' localities of residence were classified according to the number of inhabitants in each locality. We used a logistic model to identify an association between locality size and risk behaviors, including an interaction term with SES. Results The final sample included 17,974 adolescents from 704 localities in Mexico. Locality size was associated with tobacco and alcohol consumption, showing a similar effect throughout all SES levels: the larger the size of the locality, the lower the risk of consuming tobacco or alcohol compared with rural settings. The effect of locality size on sexual behavior was more complex. The odds of adolescent condom use were higher in larger localities only among adolescents in the lowest SES levels. We found no statistically significant association between locality size and sexual initiation. Conclusions The results suggest that in this sample of adolescents from low-income areas in Mexico, risk behaviors are related to locality size (number of inhabitants). Furthermore, for condom use, this relation is moderated by SES. 
Such heterogeneity suggests the need for more detailed analyses of both the effects of urbanicity on behavior, and the responses--which are also heterogeneous--required to address this situation. PMID:22129110
Almutairy, Meznah; Torng, Eric
2018-01-01
Bioinformatics applications and pipelines increasingly use k-mer indexes to search for similar sequences. The major problem with k-mer indexes is that they require lots of memory. Sampling is often used to reduce index size and query time. Most applications use one of two major types of sampling: fixed sampling and minimizer sampling. It is well known that fixed sampling will produce a smaller index, typically by roughly a factor of two, whereas it is generally assumed that minimizer sampling will produce faster query times since query k-mers can also be sampled. However, no direct comparison of fixed and minimizer sampling has been performed to verify these assumptions. We systematically compare fixed and minimizer sampling using the human genome as our database. We use the resulting k-mer indexes for fixed sampling and minimizer sampling to find all maximal exact matches between our database, the human genome, and three separate query sets, the mouse genome, the chimp genome, and an NGS data set. We reach the following conclusions. First, using larger k-mers reduces query time for both fixed sampling and minimizer sampling at a cost of requiring more space. If we use the same k-mer size for both methods, fixed sampling requires typically half as much space whereas minimizer sampling processes queries only slightly faster. If we are allowed to use any k-mer size for each method, then we can choose a k-mer size such that fixed sampling both uses less space and processes queries faster than minimizer sampling. The reason is that although minimizer sampling is able to sample query k-mers, the number of shared k-mer occurrences that must be processed is much larger for minimizer sampling than fixed sampling. In conclusion, we argue that for any application where each shared k-mer occurrence must be processed, fixed sampling is the right sampling method.
Torng, Eric
2018-01-01
Bioinformatics applications and pipelines increasingly use k-mer indexes to search for similar sequences. The major problem with k-mer indexes is that they require lots of memory. Sampling is often used to reduce index size and query time. Most applications use one of two major types of sampling: fixed sampling and minimizer sampling. It is well known that fixed sampling will produce a smaller index, typically by roughly a factor of two, whereas it is generally assumed that minimizer sampling will produce faster query times since query k-mers can also be sampled. However, no direct comparison of fixed and minimizer sampling has been performed to verify these assumptions. We systematically compare fixed and minimizer sampling using the human genome as our database. We use the resulting k-mer indexes for fixed sampling and minimizer sampling to find all maximal exact matches between our database, the human genome, and three separate query sets, the mouse genome, the chimp genome, and an NGS data set. We reach the following conclusions. First, using larger k-mers reduces query time for both fixed sampling and minimizer sampling at a cost of requiring more space. If we use the same k-mer size for both methods, fixed sampling requires typically half as much space whereas minimizer sampling processes queries only slightly faster. If we are allowed to use any k-mer size for each method, then we can choose a k-mer size such that fixed sampling both uses less space and processes queries faster than minimizer sampling. The reason is that although minimizer sampling is able to sample query k-mers, the number of shared k-mer occurrences that must be processed is much larger for minimizer sampling than fixed sampling. In conclusion, we argue that for any application where each shared k-mer occurrence must be processed, fixed sampling is the right sampling method. PMID:29389989
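The two sampling schemes compared in the abstract above can be illustrated with a minimal sketch. It assumes a lexicographic ordering of k-mers for minimizer selection and a sampling step equal to the window size w; both choices, and all function names, are illustrative assumptions rather than the authors' implementation.

```python
def kmers(seq, k):
    """Return (position, k-mer) pairs for every k-mer in seq."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


def fixed_sample(seq, k, w):
    """Fixed sampling: keep only the k-mers that start at every w-th position."""
    return [(i, km) for i, km in kmers(seq, k) if i % w == 0]


def minimizer_sample(seq, k, w):
    """Minimizer sampling: in each window of w consecutive k-mers, keep the
    smallest k-mer (lexicographic order assumed; leftmost one on ties)."""
    all_kmers = kmers(seq, k)
    chosen = set()
    for start in range(len(all_kmers) - w + 1):
        window = all_kmers[start:start + w]
        # Sort key (k-mer, position) picks the smallest k-mer, leftmost on ties.
        pos, _ = min(window, key=lambda pk: (pk[1], pk[0]))
        chosen.add(pos)
    return [(i, seq[i:i + k]) for i in sorted(chosen)]


if __name__ == "__main__":
    db = "ACGTACGTAC"  # toy sequence, not real genome data
    print(len(fixed_sample(db, 3, 2)), "k-mers kept by fixed sampling")
    print(len(minimizer_sample(db, 3, 2)), "k-mers kept by minimizer sampling")
```

With fixed sampling every query k-mer still has to be looked up, because the sampled positions in the database are not known from the query alone; with minimizer sampling the same selection rule can be applied to the query, which is the trade-off the abstract evaluates.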
ERIC Educational Resources Information Center
Hasselhorn, Marcus; Linke-Hasselhorn, Kathrin
2013-01-01
Eight six-year old German children with development disabilities regarding such number competencies as have been demonstrated to be among the most relevant precursor skills for the acquisition of elementary mathematics received intensive training with the program "Mengen, zählen, Zahlen" ["quantities, counting, numbers"] (MZZ,…
USDA-ARS?s Scientific Manuscript database
A first step in exploring population structure in crop plants and other organisms is to define the number of subpopulations that exist for a given data set. The genetic marker data sets being generated have become increasingly large over time and commonly are the high-dimension, low sample size (HDL...
Rakhshan, Hamid
2016-01-01
Summary Background and purpose: Dental aplasia (or hypodontia) is a frequent and challenging anomaly and thus of interest to many dental fields. Although the number of missing teeth (NMT) in each person is a major clinical determinant of treatment need, there is no meta-analysis on this subject. Therefore, we aimed to investigate the relevant literature, including epidemiological studies and research on dental/orthodontic patients. Methods: Among 50 reports, the effects of ethnicities, regions, sample sizes/types, subjects’ minimum ages, journals’ scientific credit, publication year, and gender composition of samples on the number of missing permanent teeth (except the third molars) per person were statistically analysed (α = 0.05, 0.025, 0.01). Limitations: The inclusion of small studies and second-hand information might reduce the reliability. Nevertheless, these strategies increased the meta-sample size and favoured the generalisability. Moreover, data weighting was carried out to account for the effect of study sizes/precisions. Results: The NMT per affected person was 1.675 [95% confidence interval (CI) = 1.621–1.728], 1.987 (95% CI = 1.949–2.024), and 1.893 (95% CI = 1.864–1.923), in randomly selected subjects, dental/orthodontic patients, and both groups combined, respectively. The effects of ethnicities (P > 0.9), continents (P > 0.3), and time (adjusting for the population type, P = 0.7) were not significant. Dental/orthodontic patients exhibited a significantly greater NMT compared to randomly selected subjects (P < 0.012). Larger samples (P = 0.000) and enrolling younger individuals (P = 0.000) might inflate the observed NMT per person. Conclusions: Time, ethnic backgrounds, and continents seem unlikely influencing factors. Subjects younger than 13 years should be excluded. Larger samples should be investigated by more observers. PMID:25840586
Correlated Observations, the Law of Small Numbers and Bank Runs
2016-01-01
Empirical descriptions and studies suggest that generally depositors observe a sample of previous decisions before deciding if to keep their funds deposited or to withdraw them. These observed decisions may exhibit different degrees of correlation across depositors. In our model depositors decide sequentially and are assumed to follow the law of small numbers in the sense that they believe that a bank run is underway if the number of observed withdrawals in their sample is large. Theoretically, with highly correlated samples and infinite depositors runs occur with certainty, while with random samples it needs not be the case, as for many parameter settings the likelihood of bank runs is zero. We investigate the intermediate cases and find that i) decreasing the correlation and ii) increasing the sample size reduces the likelihood of bank runs, ceteris paribus. Interestingly, the multiplicity of equilibria, a feature of the canonical Diamond-Dybvig model that we use also, disappears almost completely in our setup. Our results have relevant policy implications. PMID:27035435
Correlated Observations, the Law of Small Numbers and Bank Runs.
Horváth, Gergely; Kiss, Hubert János
2016-01-01
Empirical descriptions and studies suggest that generally depositors observe a sample of previous decisions before deciding if to keep their funds deposited or to withdraw them. These observed decisions may exhibit different degrees of correlation across depositors. In our model depositors decide sequentially and are assumed to follow the law of small numbers in the sense that they believe that a bank run is underway if the number of observed withdrawals in their sample is large. Theoretically, with highly correlated samples and infinite depositors runs occur with certainty, while with random samples it needs not be the case, as for many parameter settings the likelihood of bank runs is zero. We investigate the intermediate cases and find that i) decreasing the correlation and ii) increasing the sample size reduces the likelihood of bank runs, ceteris paribus. Interestingly, the multiplicity of equilibria, a feature of the canonical Diamond-Dybvig model that we use also, disappears almost completely in our setup. Our results have relevant policy implications.
Mesh-size effects on drift sample composition as determined with a triple net sampler
Slack, K.V.; Tilley, L.J.; Kennelly, S.S.
1991-01-01
Nested nets of three different mesh apertures were used to study mesh-size effects on drift collected in a small mountain stream. The innermost, middle, and outermost nets had, respectively, 425 µm, 209 µm and 106 µm openings, a design that reduced clogging while partitioning collections into three size groups. The open area of mesh in each net, from largest to smallest mesh opening, was 3.7, 5.7 and 8.0 times the area of the net mouth. Volumes of filtered water were determined with a flowmeter. The results are expressed as (1) drift retained by each net, (2) drift that would have been collected by a single net of given mesh size, and (3) the percentage of total drift (the sum of the catches from all three nets) that passed through the 425 µm and 209 µm nets. During a two day period in August 1986, Chironomidae larvae were dominant numerically in all 209 µm and 106 µm samples and midday 425 µm samples. Large drifters (Ephemerellidae) occurred only in 425 µm or 209 µm nets, but the general pattern was an increase in abundance and number of taxa with decreasing mesh size. Relatively more individuals occurred in the larger mesh nets at night than during the day. The two larger mesh sizes retained 70% of the total sediment/detritus in the drift collections, and this decreased the rate of clogging of the 106 µm net. If an objective of a sampling program is to compare drift density or drift rate between areas or sampling dates, the same mesh size should be used for all sample collection and processing. The mesh aperture used for drift collection should retain all species and life stages of significance in a study. The nested net design enables an investigator to test the adequacy of drift samples. © 1991 Kluwer Academic Publishers.
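The three ways of expressing the results in the abstract above reduce to simple arithmetic on the per-net catches. The sketch below shows that bookkeeping; the counts are hypothetical placeholders, not data from the study, and the function names are illustrative.

```python
def single_net_catch(retained, mesh_um):
    """Drift that a single net of aperture mesh_um would have collected:
    the catch of that net plus the catches of all coarser nets nested inside it."""
    return sum(count for mesh, count in retained.items() if mesh >= mesh_um)


def percent_passing(retained, mesh_um):
    """Percentage of total drift (all nets combined) that passed through
    a net of aperture mesh_um."""
    total = sum(retained.values())
    return 100.0 * (total - single_net_catch(retained, mesh_um)) / total


if __name__ == "__main__":
    # Hypothetical counts of individuals retained by each nested net,
    # keyed by mesh aperture in micrometres (illustrative values only).
    retained = {425: 50, 209: 150, 106: 300}
    for mesh in sorted(retained, reverse=True):
        print(mesh, single_net_catch(retained, mesh), percent_passing(retained, mesh))
```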
ERIC Educational Resources Information Center
Fiedler, Klaus; Kareev, Yaakov
2006-01-01
Adaptive decision making requires that contingencies between decision options and their relative assets be assessed accurately and quickly. The present research addresses the challenging notion that contingencies may be more visible from small than from large samples of observations. An algorithmic account for such a seemingly paradoxical effect…
The solvation study of carbon, silicon and their mixed nanotubes in water solution.
Hashemi Haeri, Haleh; Ketabi, Sepideh; Hashemianzadeh, Seyed Majid
2012-07-01
Nanotubes are believed to open the road toward different modern fields, either technological or biological. However, the applications of nanotubes have been badly impeded for the poor solubility in water which is especially essential for studies in the presence of living cells. Therefore, water soluble samples are in demand. Herein, the outcomes of Monte Carlo simulations of different sets of multiwall nanotubes immersed in water are reported. A number of multi wall nanotube samples, comprised of pure carbon, pure silicon and several mixtures of carbon and silicon are the subjects of study. The simulations are carried out in an (N,V,T) ensemble. The purpose of this report is to look at the effects of nanotube size (diameter) and nanotube type (pure carbon, pure silicon or a mixture of carbon and silicon) variation on solubility of multiwall nanotubes in terms of number of water molecules in shell volume. It is found that the solubility of the multi wall carbon nanotube samples is size independent, whereas multi wall silicon nanotube samples solubility varies with diameter of the inner tube. The higher solubility of samples containing silicon can be attributed to the larger atomic size of silicon atom which provides more direct contact with the water molecules. The other affecting factor is the bigger inter space (the space between inner and outer tube) in the case of silicon samples. Carbon type multi wall nanotubes appeared as better candidates for transporting water molecules through a multi wall nanotube structure, while in the case of water adsorption problems it is better to use multi wall silicon nanotubes or a mixture of multi wall carbon/ silicon nanotubes.
Feng, Dai; Cortese, Giuliana; Baumgartner, Richard
2017-12-01
The receiver operating characteristic (ROC) curve is frequently used as a measure of accuracy of continuous markers in diagnostic tests. The area under the ROC curve (AUC) is arguably the most widely used summary index for the ROC curve. Although the small sample size scenario is common in medical tests, a comprehensive study of small sample size properties of various methods for the construction of the confidence/credible interval (CI) for the AUC has been by and large missing in the literature. In this paper, we describe and compare 29 non-parametric and parametric methods for the construction of the CI for the AUC when the number of available observations is small. The methods considered include not only those that have been widely adopted, but also those that have been less frequently mentioned or, to our knowledge, never applied to the AUC context. To compare different methods, we carried out a simulation study with data generated from binormal models with equal and unequal variances and from exponential models with various parameters and with equal and unequal small sample sizes. We found that the larger the true AUC value and the smaller the sample size, the larger the discrepancy among the results of different approaches. When the model is correctly specified, the parametric approaches tend to outperform the non-parametric ones. Moreover, in the non-parametric domain, we found that a method based on the Mann-Whitney statistic is in general superior to the others. We further elucidate potential issues and provide possible solutions, along with general guidance on the CI construction for the AUC when the sample size is small. Finally, we illustrate the utility of different methods through real life examples.
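For readers unfamiliar with the link between the AUC and the Mann-Whitney statistic mentioned above, the following minimal sketch computes the empirical AUC as the proportion of (diseased, healthy) pairs ranked correctly, with ties counted as one half. It covers only the point estimate; none of the 29 interval methods compared in the paper is reproduced here, and the function name is an assumption.

```python
def auc_mann_whitney(diseased, healthy):
    """Empirical AUC: the Mann-Whitney U statistic divided by the number of
    (diseased, healthy) pairs. Ties contribute one half."""
    wins = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


if __name__ == "__main__":
    # Toy marker values (illustrative only).
    print(auc_mann_whitney([3, 4, 5], [1, 2, 3]))
```

With samples this small, one tied or swapped pair moves the estimate by a large step, which is part of why interval construction is difficult in the small-sample setting the abstract describes.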
Researchers’ Intuitions About Power in Psychological Research
Bakker, Marjan; Hartgerink, Chris H. J.; Wicherts, Jelte M.; van der Maas, Han L. J.
2016-01-01
Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers’ experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies. PMID:27354203
Researchers' Intuitions About Power in Psychological Research.
Bakker, Marjan; Hartgerink, Chris H J; Wicherts, Jelte M; van der Maas, Han L J
2016-08-01
Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers' experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies. © The Author(s) 2016.
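The kind of formal power analysis recommended above can be sketched with the standard normal approximation for a two-sided, two-group comparison of means. The sketch assumes that a "small" effect means a standardized difference of d = 0.2 (a common convention, not stated in the abstract); the approximation ignores the t-distribution correction, so exact software may return slightly larger sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate subjects per group needed to detect a standardized mean
    difference d with a two-sided two-sample test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(d, n, alpha=0.05):
    """Approximate power with n subjects per group (opposite tail ignored)."""
    z = NormalDist()
    return z.cdf(d * sqrt(n / 2) - z.inv_cdf(1 - alpha / 2))


if __name__ == "__main__":
    print(n_per_group(0.2))        # 393 per group under these assumptions
    print(approx_power(0.2, 20))   # power with 20 per group
```

Under these assumptions a cell size of 20 yields power of roughly 0.09 for d = 0.2, which illustrates how far intuition-based sample sizes can fall short of the .80 target.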
Speckle size in optical Fourier domain imaging
NASA Astrophysics Data System (ADS)
Lamouche, G.; Vergnole, S.; Bisaillon, C.-E.; Dufour, M.; Maciejko, R.; Monchalin, J.-P.
2007-06-01
As in conventional time-domain optical coherence tomography (OCT), speckle is inherent to any Optical Fourier Domain Imaging (OFDI) of biological tissue. OFDI is also known as swept-source OCT (SS-OCT). The axial speckle size is mainly determined by the OCT resolution length and the transverse speckle size by the focusing optics illuminating the sample. There is also a contribution from the sample related to the number of scatterers contained within the probed volume. In the OFDI data processing, there is some liberty in selecting the range of wavelengths used and this allows variation in the OCT resolution length. Consequently the probed volume can be varied. By performing measurements on an optical phantom with a controlled density of discrete scatterers and by changing the probed volume with different range of wavelengths in the OFDI data processing, there is an obvious change in the axial speckle size, but we show that there is also a less obvious variation in the transverse speckle size. This work contributes to a better understanding of speckle in OCT.
Structure of Nano-sized CeO2 Materials: Combined Scattering and Spectroscopic Investigations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marchbank, Huw R.; Clark, Adam H.; Hyde, Timothy I.
Here, the nature of nano-sized ceria, CeO2, systems was investigated using neutron and X-ray diffraction and X-ray absorption spectroscopy. Whilst both diffraction and total pair distribution functions (PDFs) revealed that in all the samples the occupancy of both Ce4+ and O2- is very close to the ideal stoichiometry, the analysis using the reverse Monte Carlo technique revealed significant disorder around oxygen atoms in the nano-sized ceria samples in comparison to the highly crystalline NIST standard. In addition, the analysis reveals that the main differences observed in the pair correlations from various X-ray and neutron diffraction techniques were attributed to the particle size of the CeO2 prepared by the reported three methods. Furthermore, detailed analysis of the Ce L3- and K-edge EXAFS data supports this finding; in particular, the decrease in higher shell coordination numbers with respect to the NIST standard is attributed to differences in particle size.
Structure of Nano-sized CeO2 Materials: Combined Scattering and Spectroscopic Investigations
Marchbank, Huw R.; Clark, Adam H.; Hyde, Timothy I.; ...
2016-08-29
Here, the nature of nano-sized ceria, CeO2, systems was investigated using neutron and X-ray diffraction and X-ray absorption spectroscopy. Whilst both diffraction and total pair distribution functions (PDFs) revealed that in all the samples the occupancy of both Ce4+ and O2- is very close to the ideal stoichiometry, the analysis using the reverse Monte Carlo technique revealed significant disorder around oxygen atoms in the nano-sized ceria samples in comparison to the highly crystalline NIST standard. In addition, the analysis reveals that the main differences observed in the pair correlations from various X-ray and neutron diffraction techniques were attributed to the particle size of the CeO2 prepared by the reported three methods. Furthermore, detailed analysis of the Ce L3- and K-edge EXAFS data supports this finding; in particular, the decrease in higher shell coordination numbers with respect to the NIST standard is attributed to differences in particle size.
Workplace exposure to nanoparticles from gas metal arc welding process
NASA Astrophysics Data System (ADS)
Zhang, Meibian; Jian, Le; Bin, Pingfan; Xing, Mingluan; Lou, Jianlin; Cong, Liming; Zou, Hua
2013-11-01
Workplace exposure to nanoparticles from gas metal arc welding (GMAW) process in an automobile manufacturing factory was investigated using a combination of multiple metrics and a comparison with background particles. The number concentration (NC), lung-deposited surface area concentration (SAC), estimated SAC and mass concentration (MC) of nanoparticles produced from the GMAW process were significantly higher than those of background particles before welding ( P < 0.01). A bimodal size distribution by mass for welding particles with two peak values (i.e., 10,000-18,000 and 560-320 nm) and a unimodal size distribution by number with 190.7-nm mode size or 154.9-nm geometric size were observed. Nanoparticles by number comprised 60.7 % of particles, whereas nanoparticles by mass only accounted for 18.2 % of the total particles. The morphology of welding particles was dominated by the formation of chain-like agglomerates of primary particles. The metal composition of these welding particles consisted primarily of Fe, Mn, and Zn. The size distribution, morphology, and elemental compositions of welding particles were significantly different from background particles. Working activities, sampling distances from the source, air velocity, engineering control measures, and background particles in working places had significant influences on concentrations of airborne nanoparticle. In addition, SAC showed a high correlation with NC and a relatively low correlation with MC. These findings indicate that the GMAW process is able to generate significant levels of nanoparticles. It is recommended that a combination of multiple metrics is measured as part of a well-designed sampling strategy for airborne nanoparticles. 
Key exposure factors, such as particle agglomeration/aggregation, background particles, working activities, temporal and spatial distributions of the particles, air velocity, engineering control measures, should be investigated when measuring workplace exposure to nanoparticles.
Sampling for Global Epidemic Models and the Topology of an International Airport Network
Bobashev, Georgiy; Morris, Robert J.; Goedecke, D. Michael
2008-01-01
Mathematical models that describe the global spread of infectious diseases such as influenza, severe acute respiratory syndrome (SARS), and tuberculosis (TB) often consider a sample of international airports as a network supporting disease spread. However, there is no consensus on how many cities should be selected or on how to select those cities. Using airport flight data that commercial airlines reported to the Official Airline Guide (OAG) in 2000, we have examined the network characteristics of network samples obtained under different selection rules. In addition, we have examined different size samples based on largest flight volume and largest metropolitan populations. We have shown that although the bias in network characteristics increases with the reduction of the sample size, a relatively small number of areas that includes the largest airports, the largest cities, the most-connected cities, and the most central cities is enough to describe the dynamics of the global spread of influenza. The analysis suggests that a relatively small number of cities (around 200 or 300 out of almost 3000) can capture enough network information to adequately describe the global spread of a disease such as influenza. Weak traffic flows between small airports can contribute to noise and mask other means of spread such as the ground transportation. PMID:18776932
NASA Astrophysics Data System (ADS)
Bozorgzadeh, Nezam; Yanagimura, Yoko; Harrison, John P.
2017-12-01
The Hoek-Brown empirical strength criterion for intact rock is widely used as the basis for estimating the strength of rock masses. Estimations of the intact rock H-B parameters, namely the empirical constant m and the uniaxial compressive strength σc, are commonly obtained by fitting the criterion to triaxial strength data sets of small sample size. This paper investigates how such small sample sizes affect the uncertainty associated with the H-B parameter estimations. We use Monte Carlo (MC) simulation to generate data sets of different sizes and different combinations of H-B parameters, and then investigate the uncertainty in H-B parameters estimated from these limited data sets. We show that the uncertainties depend not only on the level of variability but also on the particular combination of parameters being investigated. As particular combinations of H-B parameters can informally be considered to represent specific rock types, we discuss that as the minimum number of required samples depends on rock type it should correspond to some acceptable level of uncertainty in the estimations. Also, a comparison of the results from our analysis with actual rock strength data shows that the probability of obtaining reliable strength parameter estimations using small samples may be very low. We further discuss the impact of this on ongoing implementation of reliability-based design protocols and conclude with suggestions for improvements in this respect.
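The effect described above can be explored with a small Monte Carlo sketch. It uses the intact-rock Hoek-Brown form σ1 = σ3 + σc·sqrt(m·σ3/σc + 1) and fits it by ordinary least squares on the linearized relation (σ1 − σ3)² = m·σc·σ3 + σc². The multiplicative Gaussian noise model, the confining-pressure range, the parameter values, and the function names are all assumptions made for illustration; the paper's own simulation design may differ.

```python
import random
from math import sqrt
from statistics import mean, stdev


def hb_strength(sigma3, sigma_c, m):
    """Hoek-Brown peak strength sigma1 for intact rock (s = 1, a = 0.5)."""
    return sigma3 + sigma_c * sqrt(m * sigma3 / sigma_c + 1.0)


def fit_hb(sigma3s, sigma1s):
    """Fit (sigma_c, m) by least squares on the linearized criterion
    (sigma1 - sigma3)**2 = (m * sigma_c) * sigma3 + sigma_c**2.
    Returns None when the fitted intercept is not positive."""
    xs = list(sigma3s)
    ys = [(s1 - s3) ** 2 for s1, s3 in zip(sigma1s, xs)]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    if intercept <= 0.0:
        return None
    sigma_c = sqrt(intercept)
    return sigma_c, slope / sigma_c


def m_spread(n_tests, sigma_c=100.0, m=10.0, cv=0.05, runs=1000, seed=0):
    """Standard deviation of the fitted m across repeated synthetic data sets,
    each with n_tests triaxial tests at evenly spaced confinement (0-50 MPa).
    Noise model (assumed): sigma1 scaled by (1 + Gaussian(0, cv))."""
    rng = random.Random(seed)
    sigma3s = [50.0 * i / (n_tests - 1) for i in range(n_tests)]
    estimates = []
    for _ in range(runs):
        sigma1s = [hb_strength(s3, sigma_c, m) * (1.0 + rng.gauss(0.0, cv))
                   for s3 in sigma3s]
        fit = fit_hb(sigma3s, sigma1s)
        if fit is not None:
            estimates.append(fit[1])
    return stdev(estimates)


if __name__ == "__main__":
    for n in (5, 10, 30):
        print(n, m_spread(n))
```

Changing the assumed σc and m in `m_spread` is one way to see the abstract's point that the spread depends on the parameter combination as well as on the number of tests.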
The relation between statistical power and inference in fMRI
Wager, Tor D.; Yarkoni, Tal
2017-01-01
Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial—especially in regards to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20–30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation on subsequent replications. Empirical data from the Human Connectome Project resembles the weak diffuse scenario much more than the localized strong scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region-of-interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches. PMID:29155843
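The low power of common sample sizes for brain-behavior correlations can be approximated with the Fisher z transformation. The sketch below is a rough illustration only: the true correlation of 0.2 and the stricter threshold of 0.001 (standing in for a multiple-comparison correction) are assumed values, not figures from the paper, and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n
    subjects, using the Fisher z approximation (opposite tail ignored)."""
    z = NormalDist()
    return z.cdf(abs(atanh(r)) * sqrt(n - 3) - z.inv_cdf(1 - alpha / 2))


if __name__ == "__main__":
    for n in (20, 25, 30, 200):
        print(n, correlation_power(0.2, n), correlation_power(0.2, n, alpha=0.001))
```

Tightening alpha to mimic a whole-brain correction lowers the power at every n, which is the interaction between sample size and thresholding that the abstract highlights.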
Blanks: a computer program for analyzing furniture rough-part needs in standard-size blanks
Philip A. Araman
1983-01-01
A computer program is described that allows a company to determine the number of edge-glued, standard-size blanks required to satisfy its rough-part needs for a given production period. Yield and cost information also is determined by the program. A list of the program inputs, outputs, and uses of outputs is described, and an example analysis with sample output is...
Forest inventory using multistage sampling with probability proportional to size. [Brazil
NASA Technical Reports Server (NTRS)
Parada, N. D. J. (Principal Investigator); Lee, D. C. L.; Hernandezfilho, P.; Shimabukuro, Y. E.; Deassis, O. R.; Demedeiros, J. S.
1984-01-01
A multistage sampling technique, with probability proportional to size, for forest volume inventory using remote sensing data is developed and evaluated. The study area is located in southeastern Brazil. The LANDSAT 4 digital data of the study area are used in the first stage for automatic classification of reforested areas. Four classes of pine and eucalypt with different tree volumes are classified utilizing a maximum likelihood classification algorithm. Color infrared aerial photographs are utilized in the second stage of sampling. In the third stage (ground level) the time volume of each class is determined. The total time volume of each class is expanded through a statistical procedure taking into account all the three stages of sampling. This procedure results in an accurate time volume estimate with a smaller number of aerial photographs and reduced time in field work.
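Sampling with probability proportional to size can be illustrated for a single stage as follows. The sketch draws units with replacement with probability proportional to a size measure and expands the sampled values with the Hansen-Hurwitz estimator; this estimator is a standard choice that is assumed here, and the report's actual three-stage expansion is not reproduced. All names and numbers are illustrative.

```python
import random


def pps_sample(sizes, n, rng):
    """Draw n unit indices with replacement, with selection probability
    proportional to each unit's size measure. Returns (indices, probabilities)."""
    total = sum(sizes)
    probs = [s / total for s in sizes]
    idx = rng.choices(range(len(sizes)), weights=sizes, k=n)
    return idx, probs


def hansen_hurwitz_total(values, probs, idx):
    """Estimate the population total as the mean of value / selection probability
    over the sampled units."""
    return sum(values[i] / probs[i] for i in idx) / len(idx)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical stand areas (size measure) and stand volumes (illustrative only).
    areas = [10.0, 30.0, 60.0, 25.0]
    volumes = [210.0, 590.0, 1250.0, 480.0]
    idx, probs = pps_sample(areas, 3, rng)
    print(hansen_hurwitz_total(volumes, probs, idx), "vs true total", sum(volumes))
```

The closer the size measure tracks the quantity being estimated, the smaller the variance of the estimate, which is why a good first-stage classification can reduce the number of photographs and field plots needed.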
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate are considered. These equations suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N_0 approaches infinity (regardless of the relative sizes of N_0 and N_i, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
The reliability and stability of visual working memory capacity.
Xu, Z; Adam, K C S; Fang, X; Vogel, E K
2018-04-01
Because of the central role of working memory capacity in cognition, many studies have used short measures of working memory capacity to examine its relationship to other domains. Here, we measured the reliability and stability of visual working memory capacity, measured using a single-probe change detection task. In Experiment 1, the participants (N = 135) completed a large number of trials of a change detection task (540 in total, 180 each of set sizes 4, 6, and 8). With large numbers of both trials and participants, reliability estimates were high (α > .9). We then used an iterative down-sampling procedure to create a look-up table for expected reliability in experiments with small sample sizes. In Experiment 2, the participants (N = 79) completed 31 sessions of single-probe change detection. The first 30 sessions took place over 30 consecutive days, and the last session took place 30 days later. This unprecedented number of sessions allowed us to examine the effects of practice on stability and internal reliability. Even after much practice, individual differences were stable over time (average between-session r = .76).
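The internal reliability figure quoted above can be computed along the following lines, assuming that α denotes Cronbach's alpha calculated over per-participant scores on k blocks (or items) of trials. That reading, the function name, and the toy scores are assumptions; the paper's down-sampling procedure is not reproduced.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants-by-items table.
    scores: list of rows, one per participant, each with k item (block) scores."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Toy capacity estimates for four participants on three blocks (illustrative only).
    toy = [[2.1, 2.4, 2.0], [3.0, 3.2, 2.9], [1.5, 1.4, 1.8], [2.6, 2.7, 2.5]]
    print(cronbach_alpha(toy))
```

Recomputing the statistic on random subsets of participants and trials is the general idea behind a look-up table of expected reliability at smaller sample sizes.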
Temporal change in the size distribution of airborne Radiocesium derived from the Fukushima accident
NASA Astrophysics Data System (ADS)
Kaneyasu, Naoki; Ohashi, Hideo; Suzuki, Fumie; Okuda, Tomoaki; Ikemori, Fumikazu; Akata, Naofumi
2013-04-01
The accident at the Fukushima Dai-ichi nuclear power plant discharged a large amount of radioactive material into the environment. Forty days after the accident, we started to collect size-segregated aerosol at Tsukuba City, Japan, located 170 km south of the plant, using a low-pressure cascade impactor. Sampling continued from April 28 through October 26, 2011. A total of 8 sample sets were collected. The radioactivity of 134Cs and 137Cs in the aerosols collected at each stage was determined by gamma-ray spectrometry with a high-sensitivity germanium detector. After the gamma-ray spectrometry analysis, the chemical species in the aerosols were analyzed. The analyses of the first (April 28-May 12) and second (May 12-26) samples showed that the activity size distributions of 134Cs and 137Cs in aerosols reside mostly in the accumulation mode size range. These activity size distributions almost overlapped with the mass size distribution of non-sea-salt sulfate aerosol. From these results, we concluded that sulfate was the main transport medium of these radionuclides, and that re-suspended soil particles carrying radionuclides were not the major airborne radioactive substances by the end of May 2011 (Kaneyasu et al., 2012). We further conducted a successive extraction experiment of radiocesium from the aerosol deposits on the aluminum sheet substrate (8th stage of the first aerosol sample, 0.5-0.7 μm in aerodynamic diameter) with water and 0.1M HCl. In contrast to the relatively insoluble Chernobyl radionuclides, those in the fine-mode aerosols collected at Tsukuba are completely water-soluble (100%). From the third aerosol sample onward, the activity size distributions started to change: the major peak in the accumulation mode size range seen in the first and second aerosol samples became smaller, and an additional peak appeared in the coarse mode size range.
The comparison of the activity size distributions of radiocesium and the mass size distributions of major aerosol components collected by the end of August, 2011, (i.e., sample No.5) and its implication will be discussed in the presentation. Reference Kaneyasu et al., Environ. Sci. Technol. 46, 5720-5726 (2012).
Effect of Sampling Plans on the Risk of Escherichia coli O157 Illness.
Kiermeier, Andreas; Sumner, John; Jenson, Ian
2015-07-01
Australia exports about 150,000 to 200,000 tons of manufacturing beef to the United States annually. Each lot is tested for Escherichia coli O157 using the N-60 sampling protocol, where 60 small pieces of surface meat from each lot of production are tested. A risk assessment of E. coli O157 illness from the consumption of hamburgers made from Australian manufacturing meat formed the basis to evaluate the effect of sample size and amount on the number of illnesses predicted. The sampling plans evaluated included no sampling (resulting in an estimated 55.2 illnesses per annum), the current N-60 plan (50.2 illnesses), N-90 (49.6 illnesses), N-120 (48.4 illnesses), and a more stringent N-60 sampling plan taking five 25-g samples from each of 12 cartons (47.4 illnesses per annum). While sampling may detect some highly contaminated lots, it does not guarantee that all such lots are removed from commerce. It is concluded that increasing the sample size or sample amount from the current N-60 plan would have a very small public health effect.
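The diminishing return from larger sample sizes can be illustrated with a deliberately simplified calculation. The sketch below assumes that each tested piece is independently positive with the same probability `p_piece`; this is an assumption introduced here, not the risk-assessment model used in the study, and the function name is hypothetical.

```python
def detection_probability(p_piece: float, n_pieces: int) -> float:
    """Probability that at least one of n_pieces tests positive.

    Assumes independent pieces, each positive with probability p_piece
    (a simplification made for illustration only).
    """
    return 1.0 - (1.0 - p_piece) ** n_pieces


if __name__ == "__main__":
    # Compare the sample sizes named in the abstract for one assumed p_piece.
    for n in (60, 90, 120):
        print(n, round(detection_probability(0.01, n), 3))
```

Under this simplification each additional block of 30 pieces adds less detection probability than the previous one, which is in line with the small differences in predicted illnesses between the plans.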
Constraints on the Dark Matter Particle Mass from the Number of Milky Way Satellites
2010-04-12
but our lower mass limits do not necessarily apply to mixed dark matter cosmologies. Higgs decay produced sterile neutrinos can, however, constitute...simulations of the growth of Milky Way-sized halos in cold and warm dark matter cosmologies. The number of dark matter satellites in our simulated Milky...tions of WDM cosmologies due to numerical artifacts produced by discrete sampling of the gravitational potential with a finite number of particles
Drew, L.J.; Attanasi, E.D.; Schuenemeyer, J.H.
1988-01-01
If observed oil and gas field size distributions are obtained by random samplings, the fitted distributions should approximate that of the parent population of oil and gas fields. However, empirical evidence strongly suggests that larger fields tend to be discovered earlier in the discovery process than they would be by random sampling. Economic factors also can limit the number of small fields that are developed and reported. This paper examines observed size distributions in state and federal waters of offshore Texas. Results of the analysis demonstrate how the shape of the observable size distributions changes with significant hydrocarbon price changes. Comparison of state and federal observed size distributions in the offshore area shows how production cost differences also affect the shape of the observed size distribution. Methods for modifying the discovery rate estimation procedures when economic factors significantly affect the discovery sequence are presented. A primary conclusion of the analysis is that, because hydrocarbon price changes can significantly affect the observed discovery size distribution, one should not be confident about inferring the form and specific parameters of the parent field size distribution from the observed distributions. © 1988 International Association for Mathematical Geology.
What is the extent of prokaryotic diversity?
Curtis, Thomas P; Head, Ian M; Lunn, Mary; Woodcock, Stephen; Schloss, Patrick D; Sloan, William T
2006-01-01
The extent of microbial diversity is an intrinsically fascinating subject of profound practical importance. The term ‘diversity’ may allude to the number of taxa or species richness as well as their relative abundance. There is uncertainty about both, primarily because sample sizes are too small. Non-parametric diversity estimators make gross underestimates if used with small sample sizes on unevenly distributed communities. One can make richness estimates over many scales using small samples by assuming a species/taxa-abundance distribution. However, no one knows what the underlying taxa-abundance distributions are for bacterial communities. Latterly, diversity has been estimated by fitting data from gene clone libraries and extrapolating from this to taxa-abundance curves to estimate richness. However, since sample sizes are small, we cannot be sure that such samples are representative of the community from which they were drawn. It is however possible to formulate, and calibrate, models that predict the diversity of local communities and of samples drawn from that local community. The calibration of such models suggests that migration rates are small and decrease as the community gets larger. The preliminary predictions of the model are qualitatively consistent with the patterns seen in clone libraries in ‘real life’. The validation of this model is also confounded by small sample sizes. However, if such models were properly validated, they could form invaluable tools for the prediction of microbial diversity and a basis for the systematic exploration of microbial diversity on the planet. PMID:17028084
Surface degassing and modifications to vesicle size distributions in active basalt flows
Cashman, K.V.; Mangan, M.T.; Newman, S.
1994-01-01
The character of the vesicle population in lava flows includes several measurable parameters that may provide important constraints on lava flow dynamics and rheology. Interpretation of vesicle size distributions (VSDs), however, requires an understanding of vesiculation processes in feeder conduits, and of post-eruption modifications to VSDs during transport and emplacement. To this end we collected samples from active basalt flows at Kilauea Volcano: (1) near the effusive Kupaianaha vent; (2) through skylights in the approximately isothermal Wahaula and Kamoamoa tube systems transporting lava to the coast; (3) from surface breakouts at different locations along the lava tubes; and (4) from different locations in a single breakout from a lava tube 1 km from the 51 vent at Pu'u 'O'o. Near-vent samples are characterized by VSDs that show exponentially decreasing numbers of vesicles with increasing vesicle size. These size distributions suggest that nucleation and growth of bubbles were continuous during ascent in the conduit, with minor associated bubble coalescence resulting from differential bubble rise. The entire vesicle population can be attributed to shallow exsolution of H2O-dominated gases at rates consistent with those predicted by simple diffusion models. Measurements of H2O, CO2 and S in the matrix glass show that the melt equilibrated rapidly at atmospheric pressure. Down-tube samples maintain similar VSD forms but show a progressive decrease in both overall vesicularity and mean vesicle size. We attribute this change to open system, "passive" rise and escape of larger bubbles to the surface. Such gas loss from the tube system results in the output of 1.2 × 10^6 g/day SO2, an output representing an addition of approximately 1% to overall volatile budget calculations. A steady increase in bubble number density with downstream distance is best explained by continued bubble nucleation at rates of 7-8/cm^3 s.
Rates are ~25% of those estimated from the vent samples, and thus represent volatile supersaturations considerably less than those of the conduit. We note also that the small total volume represented by this new bubble population does not: (1) measurably deplete the melt in volatiles; or (2) make up for the overall vesicularity decrease resulting from the loss of larger bubbles. Surface breakout samples have distinctive VSDs characterized by an extreme depletion in the small vesicle population. This results in samples with much lower number densities and larger mean vesicle sizes than corresponding tube samples. Similar VSD patterns have been observed in solidified lava flows and are interpreted to result from either static (wall rupture) or dynamic (bubble rise and capture) coalescence. Through comparison with vent and tube vesicle populations, we suggest that, in addition to coalescence, the observed vesicle populations in the breakout samples have experienced a rapid loss of small vesicles consistent with 'ripening' of the VSD resulting from interbubble diffusion of volatiles. Confinement of ripening features to surface flows suggests that the thin skin that forms on surface breakouts may play a role in the observed VSD modification. © 1994.
Determining chewing efficiency using a solid test food and considering all phases of mastication.
Liu, Ting; Wang, Xinmiao; Chen, Jianshe; van der Glas, Hilbert W
2018-07-01
After a solid food is chewed, the median particle size, X50, is determined after N chewing cycles by curve-fitting of the particle size distribution. Reduction of X50 with N is traditionally followed from N ≥ 15-20 cycles when using the artificial test food Optosil®, because of initially unreliable values of X50. The aims of the study were (i) to enable testing at small N-values by using initial particles of appropriate size, shape and amount, and (ii) to compare measures of chewing ability, i.e. chewing efficiency (N needed to halve the initial particle size, N(1/2-Xo)) and chewing performance (X50 at a particular N-value, X50,N). Eight subjects with a natural dentition chewed 4 types of samples of Optosil particles: (1) 8 cubes of 8 mm, border size relative to bin size (traditional test), (2) 9 half-cubes of 9.6 mm, mid-size; similar sample volume, (3) 4 half-cubes of 9.6 mm, and (4) 2 half-cubes of 9.6 mm; reduced particle number and sample volume. All samples were tested with 4 N-values. Curve-fitting with a 2nd order polynomial function yielded log(X50)-log(N) relationships, after which N(1/2-Xo) and X50,N were obtained. Reliable X50-values are obtained for all N-values when using half-cubes with a mid-size relative to bin sizes. By using 2 or 4 half-cubes, determination of N(1/2-Xo) or X50,N needs fewer chewing cycles than traditionally. Chewing efficiency is preferable over chewing performance because of a comparison of inter-subject chewing ability at the same stage of food comminution and constant intra-subject and inter-subject ratios between and within samples respectively. Copyright © 2018 Elsevier Ltd. All rights reserved.
Aitken, C G
1999-07-01
It is thought that, in a consignment of discrete units, a certain proportion of the units contain illegal material. A sample of the consignment is to be inspected. Various methods for the determination of the sample size are compared. The consignment will be considered as a random sample from some super-population of units, a certain proportion of which contain drugs. For large consignments, a probability distribution, known as the beta distribution, for the proportion of the consignment which contains illegal material is obtained. This distribution is based on prior beliefs about the proportion. Under certain specific conditions the beta distribution gives the same numerical results as an approach based on the binomial distribution. The binomial distribution provides a probability for the number of units in a sample which contain illegal material, conditional on knowing the proportion of the consignment which contains illegal material. This is in contrast to the beta distribution which provides probabilities for the proportion of a consignment which contains illegal material, conditional on knowing the number of units in the sample which contain illegal material. The interpretation when the beta distribution is used is much more intuitively satisfactory. It is also much more flexible in its ability to cater for prior beliefs which may vary given the different circumstances of different crimes. For small consignments, a distribution, known as the beta-binomial distribution, for the number of units in the consignment which are found to contain illegal material, is obtained, based on prior beliefs about the number of units in the consignment which are thought to contain illegal material. As with the beta and binomial distributions for large samples, it is shown that, in certain specific conditions, the beta-binomial and hypergeometric distributions give the same numerical results. 
However, the beta-binomial distribution, as with the beta distribution, has a more intuitively satisfactory interpretation and greater flexibility. The beta and the beta-binomial distributions provide methods for the determination of the minimum sample size to be taken from a consignment in order to satisfy a certain criterion. The criterion requires the specification of a proportion and a probability.
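The large-consignment criterion described above can be sketched for one special case. The code assumes a uniform Beta(1, 1) prior and that every inspected unit is found to contain illegal material, so the posterior for the proportion θ is Beta(m + 1, 1) and P(θ > θ0) = 1 − θ0^(m+1). Both assumptions, and the function name `min_sample_size`, are introduced here for illustration; other priors or mixed inspection results require the incomplete beta function.

```python
def min_sample_size(theta0: float, prob: float) -> int:
    """Smallest m such that P(theta > theta0 | m of m units positive) >= prob.

    Assumes a uniform Beta(1, 1) prior and a large consignment, so the
    posterior is Beta(m + 1, 1) and the tail probability is 1 - theta0**(m + 1).
    """
    if not (0.0 < theta0 < 1.0 and 0.0 < prob < 1.0):
        raise ValueError("theta0 and prob must lie strictly between 0 and 1")
    m = 0
    while 1.0 - theta0 ** (m + 1) < prob:
        m += 1
    return m


if __name__ == "__main__":
    # Criterion: 95% probability that more than half the units are positive.
    print(min_sample_size(0.5, 0.95))
```

The two arguments correspond to the proportion and the probability that the criterion in the abstract requires to be specified.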
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
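The construction for logistic regression can be sketched as follows. The code finds two group probabilities whose log-odds differ by the slope times twice the standard deviation of the covariate while their average equals the overall response probability, then applies a standard unpooled two-proportion sample size formula. Reading "overall expected number of events is unchanged" as "the mean of the two probabilities equals the overall probability", the choice of sample size formula, and all function names are assumptions made here, not details taken from the paper.

```python
import math
from statistics import NormalDist


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def equivalent_two_sample(slope: float, sd_x: float, p_overall: float):
    """Return (p1, p2) with logit(p2) - logit(p1) = 2 * slope * sd_x
    and (p1 + p2) / 2 = p_overall, found by bisection on the midpoint logit."""
    half_gap = slope * sd_x
    lo, hi = -30.0, 30.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean_p = 0.5 * (expit(mid - half_gap) + expit(mid + half_gap))
        if mean_p < p_overall:
            lo = mid
        else:
            hi = mid
    centre = 0.5 * (lo + hi)
    return expit(centre - half_gap), expit(centre + half_gap)


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Per-group size for a two-sided comparison of two proportions
    (normal approximation, unpooled variance)."""
    z = NormalDist().inv_cdf
    z_total = z(1.0 - alpha / 2.0) + z(power)
    return math.ceil(z_total ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2)
```

Under these assumptions, twice the value returned by `n_per_group` would be read as the approximate total sample size for the regression problem.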
Burkness, Eric C; Hutchison, W D
2009-10-01
Populations of cabbage looper, Trichoplusia ni (Lepidoptera: Noctuidae), were sampled in experimental plots and commercial fields of cabbage (Brassica spp.) in Minnesota during 1998-1999 as part of a larger effort to implement an integrated pest management program. Using a resampling approach and Wald's sequential probability ratio test, sampling plans with different sampling parameters were evaluated using independent presence/absence and enumerative data. Evaluations and comparisons of the different sampling plans were made based on the operating characteristic and average sample number functions generated for each plan and through the use of a decision probability matrix. Values for upper and lower decision boundaries, sequential error rates (alpha, beta), and tally threshold were modified to determine parameter influence on the operating characteristic and average sample number functions. The following parameters resulted in the most desirable operating characteristic and average sample number functions: action threshold of 0.1 proportion of plants infested, tally threshold of 1, alpha = beta = 0.1, upper boundary of 0.15, lower boundary of 0.05, and resampling with replacement. We found that sampling parameters can be modified and evaluated using resampling software to achieve desirable operating characteristic and average sample number functions. Moreover, management of T. ni by using binomial sequential sampling should provide a good balance between cost and reliability by minimizing sample size and maintaining a high level of correct decisions (>95%) to treat or not treat.
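The stopping rule behind a binomial sequential plan of this kind can be sketched with the textbook form of Wald's sequential probability ratio test. The sketch assumes the lower and upper boundaries (0.05 and 0.15) play the roles of the null and alternative infestation proportions, and that a plant counts as infested when it meets the tally threshold. This mapping and the function name `sprt_decision` are assumptions for illustration, and the resampling evaluation described in the abstract is not reproduced.

```python
import math


def sprt_decision(n_plants: int, n_infested: int,
                  p0: float = 0.05, p1: float = 0.15,
                  alpha: float = 0.1, beta: float = 0.1) -> str:
    """Wald SPRT for a proportion: 'treat', 'no treat', or 'continue'.

    p0 / p1: infestation proportions under the 'no treat' / 'treat' hypotheses.
    alpha / beta: nominal type I / type II error rates.
    """
    # Log-likelihood ratio of H1 (p = p1) against H0 (p = p0).
    llr = (n_infested * math.log(p1 / p0)
           + (n_plants - n_infested) * math.log((1 - p1) / (1 - p0)))
    upper = math.log((1 - beta) / alpha)   # crossing it favours H1
    lower = math.log(beta / (1 - alpha))   # crossing it favours H0
    if llr >= upper:
        return "treat"
    if llr <= lower:
        return "no treat"
    return "continue"
```

Calling the function after each inspected plant gives the sequential behaviour: sampling stops as soon as the result is no longer "continue".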
Rock sample brought to earth from the Apollo 12 lunar landing mission
NASA Technical Reports Server (NTRS)
1969-01-01
Close-up view of Apollo 12 sample 12,065 under observation in the Manned Spacecraft Center's Lunar Receiving Laboratory. This sample, collected during the second Apollo 12 extravehicular activity (EVA-2) of Astronauts Charles Conrad Jr., and Alan L. Bean, is a fine-grained rock. Note the glass-lined pits. An idea of the size of the rock can be gained by reference to the gauge on the bottom portion of the number meter.
Speckle imaging through turbulent atmosphere based on adaptable pupil segmentation
NASA Astrophysics Data System (ADS)
Loktev, Mikhail; Soloviev, Oleg; Savenko, Svyatoslav; Vdovin, Gleb
2011-07-01
We report on the first results to our knowledge obtained with adaptable multiaperture imaging through turbulence on a horizontal atmospheric path. We show that the resolution can be improved by adaptively matching the size of the subaperture to the characteristic size of the turbulence. Further improvement is achieved by the deconvolution of a number of subimages registered simultaneously through multiple subapertures. Different implementations of multiaperture geometry, including pupil multiplication, pupil image sampling, and a plenoptic telescope, are considered. Resolution improvement has been demonstrated on a ~550 m horizontal turbulent path, using a combination of aperture sampling, speckle image processing, and, optionally, frame selection.
NASA Astrophysics Data System (ADS)
Venero, I. M.; Mayol-Bracero, O. L.; Anderson, J. R.
2012-12-01
As part of the Puerto Rican African Dust and Cloud Study (PRADACS) and the Ice in Clouds Experiment - Tropical (ICE-T), we sampled giant airborne particles to study their elemental composition, morphology, and size distributions. Samples were collected in July 2011 during field measurements performed by NCAR's C-130 aircraft based on St. Croix, U.S. Virgin Islands. The results presented here correspond to the measurements made during research flight #8 (RF8). Aerosol particles with Dp > 1 um were sampled with the Giant Nuclei Impactor and particles with Dp < 1 um were collected with the Wyoming Inlet. Collected particles were later analyzed using an automated scanning electron microscope (SEM) and manual observation by field emission SEM. We identified the chemical composition and morphology of major particle types in filter samples collected at different altitudes (e.g., 300 ft, 1000 ft, and 4500 ft). Results from the flight upwind of Puerto Rico show that particles in the giant nuclei size range are dominated by sea salt. Samples collected at altitudes of 300 ft and 1000 ft showed the highest number of sea salt particles, and the samples collected at higher altitudes (> 4000 ft) showed the highest concentrations of clay material. HYSPLIT back trajectories for all samples showed that the low-altitude samples initiated in the free troposphere over the Atlantic Ocean, which may account for the high sea salt content, and that the source of the high-altitude samples was closer to the Saharan-Sahel desert region; these samples were therefore possibly influenced by African dust. Size distribution results for quartz and unreacted sea-salt aerosols collected on the Giant Nuclei Impactor showed that sample RF08 - 12:05 UTM (300 ft) had a larger mean size (2.936 μm) than all the other samples.
Additional information was also obtained from the Wyoming Inlet on the C-130 aircraft, which showed that size distribution results for all particles were smaller in size. The different mineral components of the dust have different size distributions, so a fractionation process could occur during transport. Also, the presence of supermicron sea salt at altitude is important for cloud processes.
The grain size(s) of Black Hills Quartzite deformed in the dislocation creep regime
NASA Astrophysics Data System (ADS)
Heilbronner, Renée; Kilian, Rüdiger
2017-10-01
General shear experiments on Black Hills Quartzite (BHQ) deformed in the dislocation creep regimes 1 to 3 have been previously analyzed using the CIP method (Heilbronner and Tullis, 2002, 2006). They are reexamined using the higher spatial and orientational resolution of EBSD. Criteria for coherent segmentations based on c-axis orientation and on full crystallographic orientations are determined. Texture domains of preferred c-axis orientation (Y and B domains) are extracted and analyzed separately. Subdomains are recognized, and their shape and size are related to the kinematic framework and the original grains in the BHQ. Grain size analysis is carried out for all samples, high- and low-strain samples, and separately for a number of texture domains. When comparing the results to the recrystallized quartz piezometer of Stipp and Tullis (2003), it is found that grain sizes are consistently larger for a given flow stress. It is therefore suggested that the recrystallized grain size also depends on texture, grain-scale deformation intensity, and the kinematic framework (of axial vs. general shear experiments).
Penton, C. Ryan; Gupta, Vadakattu V. S. R.; Yu, Julian; ...
2016-06-02
We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5, and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI) were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungal community structure, replicate dispersion and the number of operational taxonomic units (OTUs) retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation.
Set size and culture influence children's attention to number.
Cantrell, Lisa; Kuwabara, Megumi; Smith, Linda B
2015-03-01
Much research evidences a system in adults and young children for approximately representing quantity. Here we provide evidence that the bias to attend to discrete quantity versus other dimensions may be mediated by set size and culture. Preschool-age English-speaking children in the United States and Japanese-speaking children in Japan were tested in a match-to-sample task where number was pitted against cumulative surface area in both large and small numerical set comparisons. Results showed that children from both cultures were biased to attend to the number of items for small sets. Large set responses also showed a general attention to number when ratio difficulty was easy. However, relative to the responses for small sets, attention to number decreased for both groups; moreover, both U.S. and Japanese children showed a significant bias to attend to total amount for difficult numerical ratio distances, although Japanese children shifted attention to total area at relatively smaller set sizes than U.S. children. These results add to our growing understanding of how quantity is represented and how such representation is influenced by context--both cultural and perceptual. Copyright © 2014 Elsevier Inc. All rights reserved.
Effects of 16S rDNA sampling on estimates of the number of endosymbiont lineages in sucking lice
Burleigh, J. Gordon; Light, Jessica E.; Reed, David L.
2016-01-01
Phylogenetic trees can reveal the origins of endosymbiotic lineages of bacteria and detect patterns of co-evolution with their hosts. Although taxon sampling can greatly affect phylogenetic and co-evolutionary inference, most hypotheses of endosymbiont relationships are based on few available bacterial sequences. Here we examined how different sampling strategies of Gammaproteobacteria sequences affect estimates of the number of endosymbiont lineages in parasitic sucking lice (Insecta: Phthiraptera: Anoplura). We estimated the number of louse endosymbiont lineages using both newly obtained and previously sequenced 16S rDNA bacterial sequences and more than 42,000 16S rDNA sequences from other Gammaproteobacteria. We also performed parametric and nonparametric bootstrapping experiments to examine the effects of phylogenetic error and uncertainty on these estimates. Sampling of 16S rDNA sequences affects the estimates of endosymbiont diversity in sucking lice until we reach a threshold of genetic diversity, the size of which depends on the sampling strategy. Sampling by maximizing the diversity of 16S rDNA sequences is more efficient than randomly sampling available 16S rDNA sequences. Although simulation results validate estimates of multiple endosymbiont lineages in sucking lice, the bootstrap results suggest that the precise number of endosymbiont origins is still uncertain. PMID:27547523
Generalizing the Network Scale-Up Method: A New Estimator for the Size of Hidden Populations
Feehan, Dennis M.; Salganik, Matthew J.
2018-01-01
The network scale-up method enables researchers to estimate the size of hidden populations, such as drug injectors and sex workers, using sampled social network data. The basic scale-up estimator offers advantages over other size estimation techniques, but it depends on problematic modeling assumptions. We propose a new generalized scale-up estimator that can be used in settings with non-random social mixing and imperfect awareness about membership in the hidden population. Further, the new estimator can be used when data are collected via complex sample designs and from incomplete sampling frames. However, the generalized scale-up estimator also requires data from two samples: one from the frame population and one from the hidden population. In some situations these data from the hidden population can be collected by adding a small number of questions to already planned studies. For other situations, we develop interpretable adjustment factors that can be applied to the basic scale-up estimator. We conclude with practical recommendations for the design and analysis of future studies. PMID:29375167
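For orientation, the basic scale-up estimator that the abstract takes as its starting point can be written as a ratio: the total number of hidden-population members reported by respondents, divided by the respondents' total personal network size, scaled by the size of the whole population. The sketch below implements only this basic form for an unweighted sample. The function and argument names are hypothetical, and the generalized estimator and adjustment factors proposed in the paper are not reproduced.

```python
def basic_scale_up(reported_hidden, network_sizes, population_size):
    """Basic network scale-up estimate of a hidden population's size.

    reported_hidden: per-respondent count of hidden-population members known.
    network_sizes:   per-respondent personal network size (degree).
    population_size: size of the total population the respondents represent.
    Assumes an unweighted sample and the basic estimator's usual assumptions
    (random mixing, perfect awareness of membership).
    """
    if len(reported_hidden) != len(network_sizes):
        raise ValueError("one network size is needed per respondent")
    total_degree = sum(network_sizes)
    if total_degree == 0:
        raise ValueError("total network size must be positive")
    return sum(reported_hidden) / total_degree * population_size


if __name__ == "__main__":
    # Three hypothetical respondents in a population of one million.
    print(basic_scale_up([1, 0, 2], [100, 200, 300], 1_000_000))
```

Data collected under a complex sample design would call for weighted sums in place of the plain sums.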
Allometry and Ecology of the Bilaterian Gut Microbiome
Sherrill-Mix, Scott; McCormick, Kevin; Lauder, Abigail; Bailey, Aubrey; Zimmerman, Laurie; Li, Yingying; Django, Jean-Bosco N.; Bertolani, Paco; Colin, Christelle; Hart, John A.; Hart, Terese B.; Georgiev, Alexander V.; Sanz, Crickette M.; Morgan, David B.; Atencia, Rebeca; Cox, Debby; Muller, Martin N.; Sommer, Volker; Piel, Alexander K.; Stewart, Fiona A.; Speede, Sheri; Roman, Joe; Wu, Gary; Taylor, Josh; Bohm, Rudolf; Rose, Heather M.; Carlson, John; Mjungu, Deus; Schmidt, Paul; Gaughan, Celeste; Bushman, Joyslin I.; Schmidt, Ella; Bittinger, Kyle; Collman, Ronald G.; Hahn, Beatrice H.
2018-01-01
Classical ecology provides principles for construction and function of biological communities, but to what extent these apply to the animal-associated microbiota is just beginning to be assessed. Here, we investigated the influence of several well-known ecological principles on animal-associated microbiota by characterizing gut microbial specimens from bilaterally symmetrical animals (Bilateria) ranging from flies to whales. A rigorously vetted sample set containing 265 specimens from 64 species was assembled. Bacterial lineages were characterized by 16S rRNA gene sequencing. Previously published samples were also compared, allowing analysis of over 1,098 samples in total. A restricted number of bacterial phyla was found to account for the great majority of gut colonists. Gut microbial composition was associated with host phylogeny and diet. We identified numerous gut bacterial 16S rRNA gene sequences that diverged deeply from previously studied taxa, identifying opportunities to discover new bacterial types. The number of bacterial lineages per gut sample was positively associated with animal mass, paralleling known species-area relationships from island biogeography and implicating body size as a determinant of community stability and niche complexity. Samples from larger animals harbored greater numbers of anaerobic communities, specifying a mechanism for generating more-complex microbial environments. Predictions for species/abundance relationships from models of neutral colonization did not match the data set, pointing to alternative mechanisms such as selection of specific colonists by environmental niche. Taken together, the data suggest that niche complexity increases with gut size and that niche selection forces dominate gut community construction. PMID:29588401
Smith, D.R.; Rogala, J.T.; Gray, B.R.; Zigler, S.J.; Newton, T.J.
2011-01-01
Reliable estimates of abundance are needed to assess consequences of proposed habitat restoration and enhancement projects on freshwater mussels in the Upper Mississippi River (UMR). Although there is general guidance on sampling techniques for population assessment of freshwater mussels, the actual performance of sampling designs can depend critically on the population density and spatial distribution at the project site. To evaluate various sampling designs, we simulated sampling of populations, which varied in density and degree of spatial clustering. Because of logistics and costs of large river sampling and spatial clustering of freshwater mussels, we focused on adaptive and non-adaptive versions of single and two-stage sampling. The candidate designs performed similarly in terms of precision (CV) and probability of species detection for fixed sample size. Both CV and species detection were determined largely by density, spatial distribution and sample size. However, designs did differ in the rate that occupied quadrats were encountered. Occupied units had a higher probability of selection using adaptive designs than conventional designs. We used two measures of cost: sample size (i.e. number of quadrats) and distance travelled between the quadrats. Adaptive and two-stage designs tended to reduce distance between sampling units, and thus performed better when distance travelled was considered. Based on the comparisons, we provide general recommendations on the sampling designs for the freshwater mussels in the UMR, and presumably other large rivers.
Park, Myung Sook; Kang, Kyung Ja; Jang, Sun Joo; Lee, Joo Yun; Chang, Sun Ju
2018-03-01
This study aimed to evaluate the components of test-retest reliability including time interval, sample size, and statistical methods used in patient-reported outcome measures in older people and to provide suggestions on the methodology for calculating test-retest reliability for patient-reported outcomes in older people. This was a systematic literature review. MEDLINE, Embase, CINAHL, and PsycINFO were searched from January 1, 2000 to August 10, 2017 by an information specialist. This systematic review was guided by both the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist and the guideline for systematic review published by the National Evidence-based Healthcare Collaborating Agency in Korea. The methodological quality was assessed by the Consensus-based Standards for the selection of health Measurement Instruments checklist box B. Ninety-five out of 12,641 studies were selected for the analysis. The median time interval for test-retest reliability was 14 days, and the ratio of sample size for test-retest reliability to the number of items in each measure ranged from 1:1 to 1:4. The most frequently used statistical method for continuous scores was the intraclass correlation coefficient (ICC). Among the 63 studies that used ICCs, 21 studies presented models for ICC calculations and 30 studies reported 95% confidence intervals of the ICCs. Additional analyses using 17 studies that reported a strong ICC (>0.09) showed that the mean time interval was 12.88 days and the mean ratio of the number of items to sample size was 1:5.37. When researchers plan to assess the test-retest reliability of patient-reported outcome measures for older people, they need to consider an adequate time interval of approximately 13 days and a sample size of about 5 times the number of items. 
Particularly, statistical methods should not only be selected based on the types of scores of the patient-reported outcome measures, but should also be described clearly in the studies that report the results of test-retest reliability. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ferguson, Philip E.; Sales, Catherine M.; Hodges, Dalton C.; Sales, Elizabeth W.
2015-01-01
Background Recent publications have emphasized the importance of a multidisciplinary strategy for maximum conservation and utilization of lung biopsy material for advanced testing, which may determine therapy. This paper quantifies the effect of a multidisciplinary strategy implemented to optimize and increase tissue volume in CT-guided transthoracic needle core lung biopsies. The strategy was three-pronged: (1) once there was confidence diagnostic tissue had been obtained and if safe for the patient, additional biopsy passes were performed to further increase volume of biopsy material, (2) biopsy material was placed in multiple cassettes for processing, and (3) all tissue ribbons were conserved when cutting blocks in the histology laboratory. This study quantifies the effects of strategies #1 and #2. Design This retrospective analysis comparing CT-guided lung biopsies from 2007 and 2012 (before and after multidisciplinary approach implementation) was performed at a single institution. Patient medical records were reviewed and main variables analyzed include biopsy sample size, radiologist, number of blocks submitted, diagnosis, and complications. The biopsy sample size measured was considered to be directly proportional to tissue volume in the block. Results Biopsy sample size increased 2.5 fold with the average total biopsy sample size increasing from 1.0 cm (0.9–1.1 cm) in 2007 to 2.5 cm (2.3–2.8 cm) in 2012 (P<0.0001). The improvement was statistically significant for each individual radiologist. During the same time, the rate of pneumothorax requiring chest tube placement decreased from 15% to 7% (P = 0.065). No other major complications were identified. The proportion of tumor within the biopsy material was similar at 28% (23%–33%) and 35% (30%–40%) for 2007 and 2012, respectively. The number of cases with at least two blocks available for testing increased from 10.7% to 96.4% (P<0.0001). 
Conclusions This multidisciplinary strategy for CT-guided lung biopsies was effective in significantly increasing tissue volume and the number of blocks available for advanced diagnostic testing. PMID:26479367
Subsampling program for the estimation of fish impingement
NASA Astrophysics Data System (ADS)
Beauchamp, John J.; Kumar, K. D.
1984-11-01
Federal regulations require operators of nuclear and coal-fired power-generating stations to estimate the number of fish impinged on intake screens. During winter months, impingement may range into the hundreds of thousands for certain species, making it impossible to count all intake screens completely. We present graphs for determining the appropriate “optimal” subsample that must be obtained to estimate the total number impinged. Since the number of fish impinged tends to change drastically within a short time period, the subsample size is determined based on the most recent data. This allows for the changing nature of the species-age composition of the impinged fish. These graphs can also be used for subsampling fish catches in an aquatic system when the size of the catch is too large to sample completely.
Areal Control Using Generalized Least Squares As An Alternative to Stratification
Raymond L. Czaplewski
2001-01-01
Stratification for both variance reduction and areal control proliferates the number of strata, which causes small sample sizes in many strata. This might compromise statistical efficiency. Generalized least squares can, in principle, replace stratification for areal control.
Recommendations for the use of mist nets for inventory and monitoring of bird populations
C. John Ralph; Erica H. Dunn; Will J. Peach; Colleen M. Handel
2004-01-01
We provide recommendations on the best practices for mist netting for the purposes of monitoring population parameters such as abundance and demography. Studies should be carefully thought out before nets are set up, to ensure that sampling design and estimated sample size will allow study objectives to be met. Station location, number of nets, type of nets, net...
A. Broido; Hsiukang Yow
1977-01-01
Even before weight loss in the low-temperature pyrolysis of cellulose becomes significant, the average degree of polymerization of the partially pyrolyzed samples drops sharply. The gel permeation chromatograms of nitrated derivatives of the samples can be described in terms of a small number of mixed size populations, each component fitted within reasonable limits by a...
Kim, Hyoungrae; Jang, Cheongyun; Yadav, Dharmendra K; Kim, Mi-Hyun
2017-03-23
The accuracy of any 3D-QSAR, pharmacophore, or 3D-similarity based chemometric target fishing model is highly dependent on a reasonable sample of active conformations. A number of diverse conformational sampling algorithms exist that exhaustively generate enough conformers; however, model building methods rely on an explicit number of common conformers. In this work, we have attempted to make clustering algorithms that could automatically find a reasonable number of representative conformer ensembles from an asymmetric dissimilarity matrix generated with the OpenEye toolkit. RMSD was the important descriptor (variable) of each column of the N × N matrix, considered as N variables describing the relationship (network) between the conformer (in a row) and the other N conformers. This approach was used to evaluate the performance of well-known clustering algorithms by comparing them in terms of generating representative conformer ensembles, and to test them over different matrix transformation functions with regard to stability. In the network, the representative conformer group could be resampled for four kinds of algorithms with implicit parameters. The directed dissimilarity matrix becomes the only input to the clustering algorithms. The Dunn index, Davies-Bouldin index, eta-squared values and omega-squared values were used to evaluate the clustering algorithms with respect to compactness and explanatory power. The evaluation also includes the reduction (abstraction) rate of the data, the correlation between the sizes of the population and the samples, the computational complexity and the memory usage. Every algorithm could find representative conformers automatically without any user intervention, and they reduced the data to 14-19% of the original values within 1.13 s per sample at the most. The clustering methods are simple and practical as they are fast and do not ask for any explicit parameters. 
RCDTC presented the maximum Dunn and omega-squared values of the four algorithms in addition to consistent reduction rate between the population size and the sample size. The performance of the clustering algorithms was consistent over different transformation functions. Moreover, the clustering method can also be applied to molecular dynamics sampling simulation results.
NASA Astrophysics Data System (ADS)
Hohenegger, Johann
2015-04-01
The shells of symbiont-bearing larger benthic Foraminifera (LBF) represent the response to physiological requirements in dependence of environmental conditions. All compartments of the shell such as chambers and chamberlets accommodate the growth of the cell protoplasm and are adaptations for housing photosymbiotic algae. Investigations on the biology of LBF were predominantly based on laboratory studies. The lifetime of LBF under natural conditions is still unclear. LBF, which can build >100 chambers during their lifetime, are thought to live at least one year under natural conditions. This is supported by studies on population dynamics of eulittoral foraminifera. In species characterized by a time-restricted single reproduction period the mean size of specimens increases from small to large during lifetime simultaneously reducing individual number. This becomes more complex when two or more reproduction times are present within a one-year cycle leading to a mixture of abundant small individuals with few large specimens during the year, while keeping mean size more or less constant. This mixture is typical for most sublittoral megalospheric (gamonts or schizonts) LBF. Nothing is known on the lifetime of agamonts, the diploid asexually reproducing generation. In all hyaline LBF it is thought to be significantly longer than 1 year based on the large size and considering the mean chamber building rate of the gamont/schizonts. Observations on LBF under natural conditions have not been performed yet in the deeper sublittoral. This reflects the difficulties due to intense hydrodynamics that hinder deploying technical equipment for studies in the natural environment. Therefore, studying growth, lifetime and reproduction of sublittoral LBF under natural conditions can be performed using the so-called 'natural laboratory' in comparison with laboratory investigations. The best sampling method in the upper sublittoral from 5 to 70 m depth is by SCUBA diving. 
Irregular sampling intervals caused by differing weather conditions may range from weeks to one month, whereby the latter represents the upper limit: larger intervals could render the data set worthless. The number of sampling points at the location must be more than 4, randomly distributed and approximately 5 m apart to smooth the effects of patchy distributions, which are typical for most LBF. Only three simple measurements are necessary to determine chamber building rate and population dynamics under natural conditions. These are the number of individuals, number of chambers and the largest diameter of the individual. The determination of a standardized sample surface area, which is necessary for population dynamic investigations, depends on the sampling method. Reproduction and longevity can be estimated based on shell size using the date where the mean abundance of specimens with minimum size (expected after one month's growth) characterizes the reproduction period. Then the difference to the date with the mean abundance of specimens characterized by large size indicating readiness for reproduction marks the lifetime. Calculation of the chamber-building rate based on chamber number is more complex and depends on the reproduction period and longevity. This can be fitted with theoretical growth functions (e.g. the Michaelis-Menten function). According to the above-mentioned methods, chamber building rates, longevity and population dynamics can be obtained for the shallow sublittoral symbiont-bearing LBF using the 'natural laboratory'.
Improvement of sampling plans for Salmonella detection in pooled table eggs by use of real-time PCR.
Pasquali, Frédérique; De Cesare, Alessandra; Valero, Antonio; Olsen, John Emerdhal; Manfreda, Gerardo
2014-08-01
Eggs and egg products have been described as the most critical food vehicles of salmonellosis. The prevalence and level of contamination of Salmonella on table eggs are low, which severely affects the sensitivity of sampling plans applied voluntarily in some European countries, where one to five pools of 10 eggs are tested by the culture based reference method ISO 6579:2004. In the current study we have compared the testing-sensitivity of the reference culture method ISO 6579:2004 and an alternative real-time PCR method on Salmonella contaminated egg pools of different sizes (4-9 uninfected eggs mixed with one contaminated egg) and contamination levels ($10^0$-$10^1$, $10^1$-$10^2$, $10^2$-$10^3$ CFU/eggshell). Two hundred and seventy samples corresponding to 15 replicates per pool size and inoculum level were tested. At the lowest contamination level real-time PCR detected Salmonella in 40% of contaminated pools vs 12% using ISO 6579. The results were used to estimate the lowest number of sample units needed to be tested in order to have a 95% certainty not falsely to accept a contaminated lot by Monte Carlo simulation. According to this simulation, at least 16 pools of 10 eggs each are needed to be tested by ISO 6579 in order to obtain this confidence level, while the minimum number of pools to be tested was reduced to 8 pools of 9 eggs each, when real-time PCR was applied as analytical method. This result underlines the importance of including analytical methods with higher sensitivity in order to improve the efficiency of sampling and reduce the number of samples to be tested. Copyright © 2013 Elsevier B.V. All rights reserved.
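The link between per-pool detection sensitivity and the number of pools to test can be illustrated with a closed-form calculation. The sketch below is not the Monte Carlo simulation from the abstract. It assumes every tested pool is contaminated and is detected independently with the same probability, and it uses the two lowest-level detection rates quoted above (40% and 12%) only as illustrative inputs. The function name `pools_needed` is hypothetical. Because the simulation's full inputs are not given in the abstract, this simplified formula is not expected to reproduce the reported figures of 16 and 8 pools.

```python
import math


def pools_needed(p_detect, confidence=0.95):
    """Smallest n such that 1 - (1 - p_detect)**n >= confidence.

    Assumption (not from the abstract): each tested pool is contaminated
    and is detected independently with probability p_detect.
    """
    if not 0.0 < p_detect < 1.0:
        raise ValueError("p_detect must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_detect))


if __name__ == "__main__":
    # Illustrative per-pool detection probabilities taken from the abstract's
    # lowest contamination level (real-time PCR vs. culture method).
    for label, p in (("real-time PCR", 0.40), ("ISO 6579 culture", 0.12)):
        print(label, pools_needed(p))
```

Under these simplifying assumptions, the higher-sensitivity method needs far fewer pools, which is the qualitative point the abstract makes.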
Estimating and comparing microbial diversity in the presence of sequencing errors
Chiu, Chun-Huo
2016-01-01
Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. 
This approach aims to compare diversity estimates for equally-large or equally-complete samples; it is based on the seamless rarefaction and extrapolation sampling curves of Hill numbers, specifically for q = 0, 1 and 2. (2) An asymptotic approach refers to the comparison of the estimated asymptotic diversity profiles. That is, this approach compares the estimated profiles for complete samples or samples whose size tends to be sufficiently large. It is based on statistical estimation of the true Hill number of any order q ≥ 0. In the two approaches, replacing the spurious singleton count by our estimated count, we can greatly remove the positive biases associated with diversity estimates due to spurious singletons and also make fair comparisons across microbial communities, as illustrated in our simulation results and in applying our method to analyze sequencing data from viral metagenomes. PMID:26855872
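The three Hill numbers named in the abstract (q = 0, 1, 2) can be computed directly from a vector of taxon counts. The sketch below is the plain empirical (plug-in) version only. It does not implement the singleton-count correction or the rarefaction/extrapolation estimators described in the abstract, and the function name `hill_number` and the example counts are hypothetical.

```python
import math


def hill_number(counts, q):
    """Empirical Hill number (effective number of taxa) of order q.

    counts: iterable of non-negative abundance counts, one per taxon.
    q = 0 gives taxa richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    p = [c / total for c in counts if c > 0]
    if q == 1:
        # The general formula is undefined at q = 1; use its limit.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    example_counts = [50, 30, 10, 5, 5]  # hypothetical community
    for q in (0, 1, 2):
        print(q, round(hill_number(example_counts, q), 3))
```

For a perfectly even community the Hill number equals the number of taxa at every order q, which is a convenient sanity check on any implementation.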
NASA Astrophysics Data System (ADS)
Collier, Jordan; Filipovic, Miroslav; Norris, Ray; Chow, Kate; Huynh, Minh; Banfield, Julie; Tothill, Nick; Sirothia, Sandeep Kumar; Shabala, Stanislav
2014-04-01
This proposal is a continuation of an extensive project (the core of Collier's PhD) to explore the earliest stages of AGN formation, using Gigahertz-Peaked Spectrum (GPS) and Compact Steep Spectrum (CSS) sources. Both are widely believed to represent the earliest stages of radio-loud AGN evolution, with GPS sources preceding CSS sources. In this project, we plan to (a) test this hypothesis, (b) place GPS and CSS sources into an evolutionary sequence with a number of other young AGN candidates, and (c) search for evidence of the evolving accretion mode. We will do this using high-resolution radio observations, with a number of other multiwavelength age indicators, of a carefully selected complete faint sample of 80 GPS/CSS sources. Analysis of the C2730 ELAIS-S1 data shows that we have so far met our goals, resolving the jets of 10/49 sources, and measuring accurate spectral indices from 0.843-10 GHz. This particular proposal is to almost triple the sample size by observing an additional 80 GPS/CSS sources in the Chandra Deep Field South (arguably the best-studied field) and allow a turnover frequency - linear size relation to be derived at >10-sigma. Sources found to be unresolved in our final sample will subsequently be observed with VLBI. Comparing those sources resolved with ATCA to the more compact sources resolved with VLBI will give a distribution of source sizes, helping to answer the question of whether all GPS/CSS sources grow to larger sizes.
THE OCCURRENCE RATE OF SMALL PLANETS AROUND SMALL STARS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dressing, Courtney D.; Charbonneau, David, E-mail: cdressing@cfa.harvard.edu
We use the optical and near-infrared photometry from the Kepler Input Catalog to provide improved estimates of the stellar characteristics of the smallest stars in the Kepler target list. We find 3897 dwarfs with temperatures below 4000 K, including 64 planet candidate host stars orbited by 95 transiting planet candidates. We refit the transit events in the Kepler light curves for these planet candidates and combine the revised planet/star radius ratios with our improved stellar radii to revise the radii of the planet candidates orbiting the cool target stars. We then compare the number of observed planet candidates to the number of stars around which such planets could have been detected in order to estimate the planet occurrence rate around cool stars. We find that the occurrence rate of 0.5-4 $R_\oplus$ planets with orbital periods shorter than 50 days is $0.90^{+0.04}_{-0.03}$ planets per star. The occurrence rate of Earth-size (0.5-1.4 $R_\oplus$) planets is constant across the temperature range of our sample at $0.51^{+0.06}_{-0.05}$ Earth-size planets per star, but the occurrence of 1.4-4 $R_\oplus$ planets decreases significantly at cooler temperatures. Our sample includes two Earth-size planet candidates in the habitable zone, allowing us to estimate that the mean number of Earth-size planets in the habitable zone is $0.15^{+0.13}_{-0.06}$ planets per cool star. Our 95% confidence lower limit on the occurrence rate of Earth-size planets in the habitable zones of cool stars is 0.04 planets per star. With 95% confidence, the nearest transiting Earth-size planet in the habitable zone of a cool star is within 21 pc. Moreover, the nearest non-transiting planet in the habitable zone is within 5 pc with 95% confidence.
Carson, Evan W; Turner, Thomas F; Saltzgiver, Melody J; Adams, Deborah; Kesner, Brian R; Marsh, Paul C; Pilger, Tyler J; Dowling, Thomas E
2016-11-01
As with many endangered, long-lived iteroparous fishes, survival of razorback sucker depends on a management strategy that circumvents recruitment failure that results from predation by non-native fishes. In Lake Mohave, AZ-NV, management of razorback sucker centers on capture of larvae spawned in the lake, rearing them in off-channel habitats, and subsequent release ("repatriation") to the lake when adults are sufficiently large to resist predation. The effects of this strategy on genetic diversity, however, remained uncertain. After correction for differences in sample size among groups, metrics of mitochondrial DNA (mtDNA; number of haplotypes, $N_H$, and haplotype diversity, $H_D$) and microsatellite (number of alleles, $N_A$, and expected heterozygosity, $H_E$) diversity did not differ significantly between annual samples of repatriated adults and larval year-classes or among pooled samples of repatriated adults, larvae, and wild fish. These findings indicate that the current management program thus far maintained historical genetic variation of razorback sucker in the lake. Because effective population size, $N_e$, is closely tied to the small census population size ($N_c$ = ~1500-3000) of razorback sucker in Lake Mohave, this population will remain at risk from genetic, as well as demographic risk of extinction unless $N_c$ is increased substantially. © The American Genetic Association 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
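The two mtDNA metrics named above can be made concrete with a short sketch. It assumes the widely used Nei estimator of haplotype diversity, n/(n-1) × (1 − Σ p_i²); the abstract does not state which estimator or which sample-size correction the authors applied, so this is an illustration rather than their method. The function names and the example haplotype labels are hypothetical.

```python
from collections import Counter


def number_of_haplotypes(haplotypes):
    """Count of distinct haplotypes observed in the sample."""
    return len(set(haplotypes))


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies).

    Assumed estimator; the source abstract does not specify one.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


if __name__ == "__main__":
    sample = ["A", "A", "B", "B", "C"]  # hypothetical haplotype labels
    print(number_of_haplotypes(sample), round(haplotype_diversity(sample), 3))
```

The raw count of haplotypes grows with the number of individuals sampled, which is why comparisons across groups of unequal size need a correction of the kind the abstract mentions.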
Multiclass classification of microarray data samples with a reduced number of genes
2011-01-01
Background Multiclass classification of microarray data samples with a reduced number of genes is a rich and challenging problem in Bioinformatics research. The problem gets harder as the number of classes is increased. In addition, the performance of most classifiers is tightly linked to the effectiveness of mandatory gene selection methods. Critical to gene selection is the availability of estimates about the maximum number of genes that can be handled by any classification algorithm. Lack of such estimates may lead to either computationally demanding explorations of a search space with thousands of dimensions or classification models based on gene sets of unrestricted size. In the former case, unbiased but possibly overfitted classification models may arise. In the latter case, biased classification models unable to support statistically significant findings may be obtained. Results A novel bound on the maximum number of genes that can be handled by binary classifiers in binary mediated multiclass classification algorithms of microarray data samples is presented. The bound suggests that high-dimensional binary output domains might favor the existence of accurate and sparse binary mediated multiclass classifiers for microarray data samples. Conclusions A comprehensive experimental work shows that the bound is indeed useful to induce accurate and sparse multiclass classifiers for microarray data samples. PMID:21342522
Characterizing temporal changes of agricultural particulate matter number concentrations
NASA Astrophysics Data System (ADS)
Docekal, G. P.; Mahmood, R.; Larkin, G. P.; Silva, P. J.
2017-12-01
It is widely accepted in the literature that particulate matter (PM) is detrimental to human health and the environment as a whole. These effects can vary depending on the particle size. This study examines PM size distributions and number concentrations at a poultry house. Despite much literature on PM concentrations at agricultural facilities, few studies have looked at the size distribution of particles at such facilities from the nucleation up through the coarse mode. Two optical particle counters (OPCs) were placed, one inside of a chicken house, and one on the outside of an exhaust fan to determine particle size distributions. In addition, a scanning mobility particle sizer (SMPS) and aerodynamic particle sizer (APS) sampled poultry house particles to give sizing information from a full size range of 10 nm - 20 μm. The data collected show several different types of events where observed size distributions changed. While some of these are due to expected dust generation events producing coarse mode particles, others suggest that particle nucleation and accumulation events also occurred at the smaller size ranges. The data suggest that agricultural facilities have an impact on the presence of PM in the environment beyond just generation of coarse mode dust. Data for different types of size distribution changes observed will be discussed.
Early detection of nonnative alleles in fish populations: When sample size actually matters
Croce, Patrick Della; Poole, Geoffrey C.; Payne, Robert A.; Gresswell, Bob
2017-01-01
Reliable detection of nonnative alleles is crucial for the conservation of sensitive native fish populations at risk of introgression. Typically, nonnative alleles in a population are detected through the analysis of genetic markers in a sample of individuals. Here we show that common assumptions associated with such analyses yield substantial overestimates of the likelihood of detecting nonnative alleles. We present a revised equation to estimate the likelihood of detecting nonnative alleles in a population with a given level of admixture. The new equation incorporates the effects of the genotypic structure of the sampled population and shows that conventional methods overestimate the likelihood of detection, especially when nonnative or F-1 hybrid individuals are present. Under such circumstances—which are typical of early stages of introgression and therefore most important for conservation efforts—our results show that improved detection of nonnative alleles arises primarily from increasing the number of individuals sampled rather than increasing the number of genetic markers analyzed. Using the revised equation, we describe a new approach to determining the number of individuals to sample and the number of diagnostic markers to analyze when attempting to monitor the arrival of nonnative alleles in native populations.
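The abstract's point about genotypic structure can be illustrated by comparing two extreme cases. The revised equation itself is not given in the abstract, so the sketch below does not reproduce it. It assumes (a) a "conventional" calculation in which every sampled allele copy is independently nonnative with probability q, giving 1 − (1 − q)^(2nm) for n diploid individuals and m diagnostic markers, and (b) a population where all nonnative alleles are carried by pure nonnative individuals making up a fraction q of the population, so that detection only requires sampling one such individual. Function names and parameter values are hypothetical.

```python
def p_detect_conventional(q, n, m):
    """Detection probability if every allele copy were independently nonnative.

    q: admixture proportion, n: diploid individuals sampled,
    m: diagnostic markers analyzed (assumed formula, see lead-in).
    """
    return 1.0 - (1.0 - q) ** (2 * n * m)


def p_detect_pure_individuals(q, n):
    """Detection probability when a fraction q of individuals is pure nonnative.

    Extra markers add nothing here: one nonnative individual in the sample
    is both necessary and sufficient for detection.
    """
    return 1.0 - (1.0 - q) ** n


if __name__ == "__main__":
    q, n, m = 0.01, 30, 7  # hypothetical admixture level and sampling effort
    print(round(p_detect_conventional(q, n, m), 3))
    print(round(p_detect_pure_individuals(q, n), 3))
```

In case (b) the result does not depend on m at all, which is consistent with the abstract's conclusion that, early in introgression, sampling more individuals helps more than analyzing more markers.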
Grabitz, Clara R; Button, Katherine S; Munafò, Marcus R; Newbury, Dianne F; Pernet, Cyril R; Thompson, Paul A; Bishop, Dorothy V M
2018-01-01
Genetics and neuroscience are two areas of science that pose particular methodological problems because they involve detecting weak signals (i.e., small effects) in noisy data. In recent years, increasing numbers of studies have attempted to bridge these disciplines by looking for genetic factors associated with individual differences in behavior, cognition, and brain structure or function. However, different methodological approaches to guarding against false positives have evolved in the two disciplines. To explore methodological issues affecting neurogenetic studies, we conducted an in-depth analysis of 30 consecutive articles in 12 top neuroscience journals that reported on genetic associations in nonclinical human samples. It was often difficult to estimate effect sizes in neuroimaging paradigms. Where effect sizes could be calculated, the studies reporting the largest effect sizes tended to have two features: (i) they had the smallest samples and were generally underpowered to detect genetic effects, and (ii) they did not fully correct for multiple comparisons. Furthermore, only a minority of studies used statistical methods for multiple comparisons that took into account correlations between phenotypes or genotypes, and only nine studies included a replication sample or explicitly set out to replicate a prior finding. Finally, presentation of methodological information was not standardized and was often distributed across Methods sections and Supplementary Material, making it challenging to assemble basic information from many studies. Space limits imposed by journals could mean that highly complex statistical methods were described in only a superficial fashion. In summary, methods that have become standard in the genetics literature (stringent statistical standards, use of large samples, and replication of findings) are not always adopted when behavioral, cognitive, or neuroimaging phenotypes are used, leading to an increased risk of false-positive findings. 
Studies need to correct not just for the number of phenotypes collected but also for the number of genotypes examined, genetic models tested, and subsamples investigated. The field would benefit from more widespread use of methods that take into account correlations between the factors corrected for, such as spectral decomposition, or permutation approaches. Replication should become standard practice; this, together with the need for larger sample sizes, will entail greater emphasis on collaboration between research groups. We conclude with some specific suggestions for standardized reporting in this area.
Lamarre, Sophie; Frasse, Pierre; Zouine, Mohamed; Labourdette, Delphine; Sainderichin, Elise; Hu, Guojian; Le Berre-Anton, Véronique; Bouzayen, Mondher; Maza, Elie
2018-01-01
RNA-Seq is a widely used technology that allows an efficient genome-wide quantification of gene expression for purposes such as differential expression (DE) analysis. After a brief review of the main issues, methods and tools related to the DE analysis of RNA-Seq data, this article focuses on the impact of both the replicate number and library size in such analyses. While the main drawback of previous relevant studies is the lack of generality, we conducted both an analysis of a two-condition experiment (with eight biological replicates per condition) to compare the results with previous benchmark studies, and a meta-analysis of 17 experiments with up to 18 biological conditions, eight biological replicates and 100 million (M) reads per sample. As a global trend, we concluded that the replicate number has a larger impact than the library size on the power of the DE analysis, except for low-expressed genes, for which both parameters seem to have the same impact. Our study also provides new insights for practitioners aiming to enhance their experimental designs. For instance, by analyzing both the sensitivity and specificity of the DE analysis, we showed that the optimal threshold to control the false discovery rate (FDR) is approximately 2^(-r), where r is the replicate number. Furthermore, we showed that the false positive rate (FPR) is rather well controlled by all three studied R packages: DESeq, DESeq2, and edgeR. We also analyzed the impact of both the replicate number and library size on gene ontology (GO) enrichment analysis. Interestingly, we concluded that increases in the replicate number and library size tend to enhance the sensitivity and specificity, respectively, of the GO analysis. Finally, we recommend to RNA-Seq practitioners the production of a pilot data set to strictly analyze the power of their experimental design, or the use of a public data set, which should be similar to the data set they will obtain. 
For individuals working on tomato research, on the basis of the meta-analysis, we recommend at least four biological replicates per condition and 20 M reads per sample to be almost sure of obtaining about 1000 DE genes if they exist. PMID:29491871
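The threshold rule quoted in the abstract above (an FDR cutoff of roughly 2 raised to the power -r, with r the number of biological replicates) is easy to tabulate. The following is only an illustrative sketch: the function name `fdr_threshold` and the replicate counts in the loop are assumptions made for this example and do not come from the paper.

```python
def fdr_threshold(replicates: int) -> float:
    """Rule-of-thumb FDR cutoff 2**(-r) for r biological replicates.

    The function name is hypothetical; only the 2**(-r) rule is taken
    from the abstract.
    """
    if replicates < 1:
        raise ValueError("replicates must be a positive integer")
    return 2.0 ** (-replicates)


if __name__ == "__main__":
    # Illustrative replicate counts, not values analysed in the study.
    for r in (2, 3, 4, 6, 8):
        print(f"r = {r}: FDR threshold ~ {fdr_threshold(r):.4f}")
```

Under this rule the cutoff halves with every added replicate, so designs with more replicates can afford a stricter FDR threshold.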
A Powder Delivery System (PoDS) for Mars in situ Science
NASA Astrophysics Data System (ADS)
Bryson, C.; Blake, D.; Saha, C.; Sarrazin, P.
2004-12-01
Many instruments proposed for in situ Mars science investigations work best with fine-grained samples of rocks or soils. Such instruments include the mineral analyzer CheMin [1] and any instrument that requires samples having high surface areas (e.g., mass spectrometers, organic analyzers, etc). The Powder Delivery System (PoDS) is designed to deliver powders of selected grain sizes from a sample acquisition device such as an arm-deployed robotic driller or corer to an instrument suite located on the body of a rover/lander. PoDS is capable of size-selective sampling of crushed rocks, soil or drill powder for delivery to instruments that require specific grain sizes (e.g. 5-50 mg of less than 150 micron powder for CheMin). Sample material is transported as an aerosol of particles and gas by vacuum advection. In the laboratory a venturi pump driven by compressed air provides the impulse. On Mars, the ambient atmosphere is a source of CO2 that can be captured and compressed by adsorption pumping during diurnal temperature cycling [2]. The lower atmospheric pressure on the surface of Mars (7 torr) will affect fundamental parameters of gas-particle interaction such as Reynolds, Stokes and Knudsen numbers [3]. However, calculations show that the PoDS will operate under both Martian and terrestrial atmospheric conditions. Cyclone separators with appropriate particle size selection ranges remove particles from the aerosol stream. The vortex flow inside the cyclone causes grains larger than a specific size to be collected, while smaller grains remain entrained in the gas. Cyclones are very efficient inertial and centrifugal particle separators with cut sizes (d50) as low as 4 microns. Depending on the particle size ranges desired, a series of cyclones with descending cut sizes may be used, the simplest case being a single cyclone for particle deposition without mass separation. 
Transmission / membrane filters of appropriate pore sizes may also be used to collect powder from the aerosol stream. Results of a number of tests of the prototype PoDS will be presented. [1] Blake D. F., Sarrazin P., Bish D. L., Feldman S., Chipera S. J, Vaniman D.T., and Collins S., 2004, Definitive Mineralogical Analysis of Mars Analog Rocks Using the CheMin XRD/XRF Instrument, LPSC XXXV abstr. #1794 (CD-ROM). [2] Finn J. E., McKay C. P. and Sridhar R. K., 1999, Martian Atmosphere Utilization by Temperature-Swing Adsorption, University of Arizona, Publication No.961597, http://stl.ame.arizona.edu/publications/961597.pdf [3] Hinds W. C., 1999, Aerosol Technology - Properties, Behavior, and Measurement of Airborne Particles, Second edition, John Wiley & Sons, Inc., pp 15-67, 111-136.
A multi-particle crushing apparatus for studying rock fragmentation due to repeated impacts
NASA Astrophysics Data System (ADS)
Huang, S.; Mohanty, B.; Xia, K.
2017-12-01
Rock crushing is a common process in mining and related operations. Although a number of particle crushing tests have been proposed in the literature, most of them are concerned with single-particle crushing, i.e., a single rock sample is crushed in each test. Considering the realistic scenario in crushers where many fragments are involved, a laboratory crushing apparatus is developed in this study. This device consists of a Hopkinson pressure bar system and a piston-holder system. The Hopkinson pressure bar system is used to apply calibrated dynamic loads to the piston-holder system, and the piston-holder system is used to hold rock samples and to recover fragments for subsequent particle size analysis. The rock samples are subjected to three to seven impacts under three impact velocities (2.2, 3.8, and 5.0 m/s), with the feed size of the rock particle samples limited between 9.5 and 12.7 mm. Several key parameters are determined from this test, including particle size distribution parameters, impact velocity, loading pressure, and total work. The results show that the total work correlates well with resulting fragmentation size distribution, and the apparatus provides a useful tool for studying the mechanism of crushing, which further provides guidelines for the design of commercial crushers.
Upward counterfactual thinking and depression: A meta-analysis.
Broomhall, Anne Gene; Phillips, Wendy J; Hine, Donald W; Loi, Natasha M
2017-07-01
This meta-analysis examined the strength of association between upward counterfactual thinking and depressive symptoms. Forty-two effect sizes from a pooled sample of 13,168 respondents produced a weighted average effect size of r=.26, p<.001. Moderator analyses using an expanded set of 96 effect sizes indicated that upward counterfactuals and regret produced significant positive effects that were similar in strength. Effects also did not vary as a function of the theme of the counterfactual-inducing situation or study design (cross-sectional versus longitudinal). Significant effect size heterogeneity was observed across sample types, methods of assessing upward counterfactual thinking, and types of depression scale. Significant positive effects were found in studies that employed samples of bereaved individuals, older adults, terminally ill patients, or university students, but not adolescent mothers or mixed samples. Both number-based and Likert-based upward counterfactual thinking assessments produced significant positive effects, with the latter generating a larger effect. All depression scales produced significant positive effects, except for the Psychiatric Epidemiology Research Interview. Research and theoretical implications are discussed in relation to cognitive theories of depression and the functional theory of upward counterfactual thinking, and important gaps in the extant research literature are identified. Copyright © 2017 Elsevier Ltd. All rights reserved.
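The abstract above reports a weighted average effect size but does not say which weighting scheme was used. As a hedged illustration only, the sketch below pools correlations with the common fixed-effect Fisher z approach (weights n - 3); that choice, the function name `pooled_correlation`, and the example (r, n) pairs are all assumptions, not the authors' procedure or data.

```python
import math


def pooled_correlation(studies):
    """Fisher-z weighted mean correlation.

    studies: iterable of (r, n) pairs, one per study.
    Assumes a fixed-effect model with weights n - 3, i.e. the inverse of
    the approximate variance of Fisher's z. This is an illustrative
    choice, not necessarily the method used in the meta-analysis.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        if n <= 3:
            raise ValueError("each study needs n > 3")
        z = math.atanh(r)      # Fisher z transform of the correlation
        weight = n - 3         # inverse variance of z
        weighted_sum += weight * z
        total_weight += weight
    return math.tanh(weighted_sum / total_weight)  # back-transform to r


if __name__ == "__main__":
    # Hypothetical (r, n) pairs, invented for the example.
    example = [(0.20, 150), (0.31, 420), (0.25, 90)]
    print(round(pooled_correlation(example), 3))
```

Larger studies pull the pooled value toward their own estimate, which is why a pooled r can differ from the simple mean of the study correlations.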
Epistemological Issues in Astronomy Education Research: How Big of a Sample is "Big Enough"?
NASA Astrophysics Data System (ADS)
Slater, Stephanie; Slater, T. F.; Souri, Z.
2012-01-01
As astronomy education research (AER) continues to evolve into a sophisticated enterprise, we must begin to grapple with defining our epistemological parameters. Moreover, as we attempt to make pragmatic use of our findings, we must make a concerted effort to communicate those parameters in a sensible way to the larger astronomical community. One area of much current discussion involves a basic discussion of methodologies, and subsequent sample sizes, that should be considered appropriate for generating knowledge in the field. To address this question, we completed a meta-analysis of nearly 1,000 peer-reviewed studies published in top tier professional journals. Data related to methodologies and sample sizes were collected from "hard science" and "human science" journals to compare the epistemological systems of these two bodies of knowledge. Working back in time from August 2011, the 100 most recent studies reported in each journal were used as a data source: Icarus, ApJ and AJ, NARST, IJSE and SciEd. In addition, data were collected from the 10 most recent AER dissertations, a set of articles determined by the science education community to be the most influential in the field, and the nearly 400 articles used as reference materials for the NRC's Taking Science to School. Analysis indicates these bodies of knowledge have a great deal in common; each relying on a large variety of methodologies, and each building its knowledge through studies that proceed from surprisingly low sample sizes. While both fields publish a small percentage of studies with large sample sizes, the vast majority of top tier publications consist of rich studies of a small number of objects. We conclude that rigor in each field is determined not by a circumscription of methodologies and sample sizes, but by peer judgments that the methods and sample sizes are appropriate to the research question.
Porous structure, permeability, and mechanical properties of polyolefin microporous films
NASA Astrophysics Data System (ADS)
Elyashevich, G. K.; Kuryndin, I. S.; Lavrentyev, V. K.; Bobrovsky, A. Yu.; Bukošek, V.
2012-09-01
Microporous films of polyolefins, namely, polyethylene and polypropylene, have been prepared using the process based on the extrusion of the melt with the subsequent annealing, uniaxial extension, and thermal fixation. The influence of the conditions used for preparation of the films on their morphology, porosity, number and sizes of through-flow channels, and mechanical properties has been investigated. It has been found that a significant influence on the characteristics of the porous structure of the films is exerted by the degree of orientation of the melt at extrusion, the annealing temperature, and the degree of uniaxial extension of the films. The threshold values of these parameters, at which through-flow channels are formed in the films, have been determined. It has been shown using filtration porosimetry that polyethylene films have a higher permeability to liquids as compared to the polypropylene samples (240 and 180 L/(m2 h atm), respectively). The porous structure of the polyethylene films is characterized by larger sizes of through pores than those of the polypropylene samples (the average pore sizes are 210 and 160 nm, respectively), whereas the polypropylene films contain a larger number of through-flow channels.
Socioeconomic Factors Influence Physical Activity and Sport in Quebec Schools.
Morin, Pascale; Lebel, Alexandre; Robitaille, Éric; Bisset, Sherri
2016-11-01
School environments providing a wide selection of physical activities and sufficient facilities are both essential and formative to ensure young people adopt active lifestyles. We describe the association between school opportunities for physical activity and socioeconomic factors measured by low-income cutoff index, school size (number of students), and neighborhood population density. A cross-sectional survey using a 2-stage stratified sampling method built a representative sample of 143 French-speaking public schools in Quebec, Canada. Self-administered questionnaires collected data describing the physical activities offered and schools' sports facilities. Descriptive and bivariate analyses were performed separately for primary and secondary schools. In primary schools, school size was positively associated with more intramural and extracurricular activities, more diverse interior facilities, and activities promoting active transportation. Low-income primary schools were more likely to offer a single gym. Low-income secondary schools offered lower diversity of intramural activities and fewer exterior sporting facilities. High-income secondary schools with a large school size provided a greater number of opportunities, larger infrastructures, and a wider selection of physical activities than smaller low-income schools. Results reveal an overall positive association between school availability of physical and sport activity and socioeconomic factors. © 2016, American School Health Association.
Dust generation in powders: Effect of particle size distribution
NASA Astrophysics Data System (ADS)
Chakravarty, Somik; Le Bihan, Olivier; Fischer, Marc; Morgeneyer, Martin
2017-06-01
This study explores the relationship between the bulk and grain-scale properties of powders and dust generation. A vortex shaker dustiness tester was used to evaluate 8 calcium carbonate test powders with median particle sizes ranging from 2 μm to 136 μm. Respirable aerosols released from the powder samples were characterised by their particle number and mass concentrations. All the powder samples were found to release respirable fractions of dust particles, and the amounts released decreased with time. The variation of powder dustiness as a function of the particle size distribution was analysed for the powders, which were classified into three groups based on the fraction of particles within the respirable range. The trends we observe might be due to the interplay of several mechanisms, such as de-agglomeration and attrition, and their relative importance.
Michen, Benjamin; Geers, Christoph; Vanhecke, Dimitri; Endes, Carola; Rothen-Rutishauser, Barbara; Balog, Sandor; Petri-Fink, Alke
2015-01-01
Standard transmission electron microscopy nanoparticle sample preparation generally requires the complete removal of the suspending liquid. Drying often introduces artifacts, which can obscure the state of the dispersion prior to drying and preclude automated image analysis typically used to obtain number-weighted particle size distribution. Here we present a straightforward protocol for prevention of the onset of drying artifacts, thereby allowing the preservation of in-situ colloidal features of nanoparticles during TEM sample preparation. This is achieved by adding a suitable macromolecular agent to the suspension. Both research- and economically-relevant particles with high polydispersity and/or shape anisotropy are easily characterized following our approach (http://bsa.bionanomaterials.ch), which allows for rapid and quantitative classification in terms of dimensionality and size: features that are major targets of European Union recommendations and legislation. PMID:25965905
Infrared reflectance spectra: Effects of particle size, provenance and preparation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Su, Yin-Fong; Myers, Tanya L.; Brauer, Carolyn S.
2014-09-22
We have recently developed methods for making more accurate infrared total and diffuse directional - hemispherical reflectance measurements using an integrating sphere. We have found that reflectance spectra of solids, especially powders, are influenced by a number of factors including the sample preparation method, the particle size and morphology, as well as the sample origin. On a quantitative basis we have investigated some of these parameters and the effects they have on reflectance spectra, particularly in the longwave infrared. In the IR the spectral features may be observed as either maxima or minima: In general, upward-going peaks in the reflectance spectrum result from strong surface scattering, i.e. rays that are reflected from the surface without bulk penetration, whereas downward-going peaks are due to either absorption or volume scattering, i.e. rays that have penetrated or refracted into the sample interior and are not reflected. The light signals reflected from solids usually encompass all such effects, but with strong dependencies on particle size and preparation. This paper measures the reflectance spectra in the 1.3 – 16 micron range for various bulk materials that have a combination of strong and weak absorption bands in order to observe the effects on the spectral features: Bulk materials were ground with a mortar and pestle and sieved to separate the samples into various size fractions between 5 and 500 microns. The median particle size is demonstrated to have large effects on the reflectance spectra. For certain minerals we also observe significant spectral change depending on the geologic origin of the sample. All three such effects (particle size, preparation and provenance) result in substantial change in the reflectance spectra for solid materials; successful identification algorithms will require sufficient flexibility to account for these parameters.
Infrared reflectance spectra: effects of particle size, provenance and preparation
NASA Astrophysics Data System (ADS)
Su, Yin-Fong; Myers, Tanya L.; Brauer, Carolyn S.; Blake, Thomas A.; Forland, Brenda M.; Szecsody, J. E.; Johnson, Timothy J.
2014-10-01
We have recently developed methods for making more accurate infrared total and diffuse directional - hemispherical reflectance measurements using an integrating sphere. We have found that reflectance spectra of solids, especially powders, are influenced by a number of factors including the sample preparation method, the particle size and morphology, as well as the sample origin. On a quantitative basis we have investigated some of these parameters and the effects they have on reflectance spectra, particularly in the longwave infrared. In the IR the spectral features may be observed as either maxima or minima: In general, upward-going peaks in the reflectance spectrum result from strong surface scattering, i.e. rays that are reflected from the surface without bulk penetration, whereas downward-going peaks are due to either absorption or volume scattering, i.e. rays that have penetrated or refracted into the sample interior and are not reflected. The light signals reflected from solids usually encompass all such effects, but with strong dependencies on particle size and preparation. This paper measures the reflectance spectra in the 1.3 - 16 micron range for various bulk materials that have a combination of strong and weak absorption bands in order to observe the effects on the spectral features: Bulk materials were ground with a mortar and pestle and sieved to separate the samples into various size fractions between 5 and 500 microns. The median particle size is demonstrated to have large effects on the reflectance spectra. For certain minerals we also observe significant spectral change depending on the geologic origin of the sample. All three such effects (particle size, preparation and provenance) result in substantial change in the reflectance spectra for solid materials; successful identification algorithms will require sufficient flexibility to account for these parameters.
Improving the quality of biomarker discovery research: the right samples and enough of them.
Pepe, Margaret S; Li, Christopher I; Feng, Ziding
2015-06-01
Biomarker discovery research has yielded few biomarkers that validate for clinical use. A contributing factor may be poor study designs. The goal in discovery research is to identify a subset of potentially useful markers from a large set of candidates assayed on case and control samples. We recommend the PRoBE design for selecting samples. We propose sample size calculations that require specifying: (i) a definition for biomarker performance; (ii) the proportion of useful markers the study should identify (Discovery Power); and (iii) the tolerable number of useless markers amongst those identified (False Leads Expected, FLE). We apply the methodology to a study of 9,000 candidate biomarkers for risk of colon cancer recurrence where a useful biomarker has positive predictive value ≥ 30%. We find that 40 patients with recurrence and 160 without recurrence suffice to filter out 98% of useless markers (2% FLE) while identifying 95% of useful biomarkers (95% Discovery Power). Alternative methods for sample size calculation required more assumptions. Biomarker discovery research should utilize quality biospecimen repositories and include sample sizes that enable markers meeting prespecified performance characteristics for well-defined clinical applications to be identified. The scientific rigor of discovery research should be improved. ©2015 American Association for Cancer Research.
Sampling and counting genome rearrangement scenarios
2015-01-01
Background Even for moderate-size inputs, there is a tremendous number of optimal rearrangement scenarios, regardless of what the model is and which specific question is to be answered. Therefore, giving one optimal solution might be misleading and cannot be used for statistical inference. Statistically well-founded methods are necessary to sample uniformly from the solution space, after which a small number of samples is sufficient for statistical inference. Contribution In this paper, we give a mini-review of the state of the art of sampling and counting rearrangement scenarios, focusing on the reversal, DCJ and SCJ models. In addition, we give a Gibbs sampler for sampling most parsimonious labelings of evolutionary trees under the SCJ model. The method has been implemented and tested on real-life data. The software package together with example data can be downloaded from http://www.renyi.hu/~miklosi/SCJ-Gibbs/ PMID:26452124
MEPAG Recommendations for a 2018 Mars Sample Return Caching Lander - Sample Types, Number, and Sizes
NASA Technical Reports Server (NTRS)
Allen, Carlton C.
2011-01-01
The return to Earth of geological and atmospheric samples from the surface of Mars is among the highest priority objectives of planetary science. The MEPAG Mars Sample Return (MSR) End-to-End International Science Analysis Group (MEPAG E2E-iSAG) was chartered to propose scientific objectives and priorities for returned sample science, and to map out the implications of these priorities, including for the proposed joint ESA-NASA 2018 mission that would be tasked with the crucial job of collecting and caching the samples. The E2E-iSAG identified four overarching scientific aims that relate to understanding: (A) the potential for life and its pre-biotic context, (B) the geologic processes that have affected the martian surface, (C) planetary evolution of Mars and its atmosphere, (D) potential for future human exploration. The types of samples deemed most likely to achieve the science objectives are, in priority order: (1A). Subaqueous or hydrothermal sediments (1B). Hydrothermally altered rocks or low temperature fluid-altered rocks (equal priority) (2). Unaltered igneous rocks (3). Regolith, including airfall dust (4). Present-day atmosphere and samples of sedimentary-igneous rocks containing ancient trapped atmosphere Collection of geologically well-characterized sample suites would add considerable value to interpretations of all collected rocks. To achieve this, the total number of rock samples should be about 30-40. In order to evaluate the size of individual samples required to meet the science objectives, the E2E-iSAG reviewed the analytical methods that would likely be applied to the returned samples by preliminary examination teams, for planetary protection (i.e., life detection, biohazard assessment) and, after distribution, by individual investigators. It was concluded that sample size should be sufficient to perform all high-priority analyses in triplicate. 
In keeping with long-established curatorial practice of extraterrestrial material, at least 40% by mass of each sample should be preserved to support future scientific investigations. Samples of 15-16 grams are considered optimal. The total mass of returned rocks, soils, blanks and standards should be approximately 500 grams. Atmospheric gas samples should be the equivalent of 50 cubic cm at 20 times Mars ambient atmospheric pressure.
Sample size calculations for stepped wedge and cluster randomised trials: a unified approach
Hemming, Karla; Taljaard, Monica
2016-01-01
Objectives To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study Design and Setting We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario. PMID:26344808
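The abstract above refers to design effects without stating them. As a hedged sketch, the code below uses only the textbook design effect for a parallel cluster randomized trial with equal cluster sizes, 1 + (m - 1) * ICC; the stepped wedge and before-and-after design effects derived in the paper are not reproduced here. The function names and the numbers in the example are assumptions for illustration.

```python
import math


def crt_design_effect(cluster_size: int, icc: float) -> float:
    """Textbook design effect for a parallel CRT with equal cluster sizes."""
    if cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("need cluster_size >= 1 and 0 <= icc <= 1")
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm.

    n_individual is the per-arm sample size that an individually
    randomized trial would require; it is inflated by the design effect
    and then divided by the cluster size, rounding up.
    """
    inflated = n_individual * crt_design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: 100 per arm, clusters of 20, ICC of 0.05.
    print(crt_design_effect(20, 0.05))
    print(clusters_per_arm(100, 20, 0.05))
```

Because the inflation grows with both cluster size and ICC, adding individuals to existing clusters yields diminishing returns compared with adding clusters.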
Agus, Emily L; Young, David T; Lingard, Justin J N; Smalley, Robert J; Tate, James E; Goodman, Paul S; Tomlin, Alison S
2007-11-01
Measurements of urban particle number concentrations and size distributions in the range 5-1000 nm were taken at elevated (roof-level) and roadside sampling sites on Narborough Road in Leicester, UK, along with simultaneous measurements of traffic, NO(x), CO and 1,3-butadiene concentrations and meteorological parameters. A fitting program was used to determine the characteristics of up to five modal groups present in the particle size distributions. All particle modal concentrations peaked during the morning and evening rush hours. Additional events associated with the smallest mode, that were not observed to be connected to primary emissions, were also present suggesting that this mode consisted of newly formed secondary particles. These events included peaks in concentration which coincided with peaks in solar radiation, and lower concentrations of the larger modes. Investigation into the relationships between traffic flow and occupancy indicated three flow regimes; free-flow, unstable and congested. During free-flow conditions, positive linear relationships existed between traffic flow and particle modal number concentrations. However, during unstable and congested periods, this relationship was shown to break-down. Similar trends were observed for concentrations of the gas phase pollutants NO(x), CO and 1,3-butadiene. Strong linear relationships existed between NO(x), CO, 1,3-butadiene concentrations, nucleation and Aitken mode concentrations at both sampling locations, indicating a local traffic related emission source. At the roadside, both nucleation and Aitken mode are best represented by a decreasing exponential function with wind speed, whereas at the roof-level this relationship only occurred for Aitken mode particles. The differing relationships at the two sampling locations are most likely due to a combination of meteorological factors and distance from the local emission source.
Exposure to particulate matter in a mosque
NASA Astrophysics Data System (ADS)
Ocak, Yılmaz; Kılıçvuran, Akın; Eren, Aykut Balkan; Sofuoglu, Aysun; Sofuoglu, Sait C.
2012-09-01
Indoor air quality in mosques during prayers may be of concern for sensitive/susceptible sub-groups of the population. However, no indoor air pollutant levels of potentially toxic agents in mosques have been reported in the literature. This study measured PM concentrations in a mosque on Friday when the mid-day prayer always receives high attendance. Particle number and CO2 concentrations were measured on nine sampling days in three different campaigns before, during, and after prayer under three different cleaning schedules: vacuuming a week before, a day before, and on the morning of the prayer. In addition, daily PM2.5 concentrations were measured. Number concentrations in 0.5-1.0, 1.0-5.0, and > 5.0 μm diameter size ranges were monitored. In all campaigns the maximum number concentrations were observed on the most crowded days. The lowest number concentrations occurred when vacuuming was performed a day before the prayer day in two of the three size ranges considered. PM2.5 concentrations (four-hour samples that integrated before, during, and after the prayer) were comparable to the other indoor environments reported in the literature. CO2 concentrations suggested that ventilation was not sufficient in the mosque during the prayers. The results showed that better ventilation, a preventive cleaning strategy, and a more detailed study are needed.
Costa, Marilia G; Barbosa, José C; Yamamoto, Pedro T
2007-01-01
Sequential sampling is characterized by the use of samples of variable size and has the advantage of reducing sampling time and costs compared with fixed-size sampling. To support adequate management of orthezia, sequential sampling plans were developed for orchards under low and high infestation. Data were collected in Matão, SP, in commercial stands of the orange variety 'Pêra Rio' at five, nine and 15 years of age. Twenty samplings were performed over the whole area of each stand by recording the presence or absence of scales on plants, with each plot comprising ten plants. After observing that in all three stands the scale population followed a contagious distribution, fitting the Negative Binomial Distribution in most samplings, two sequential sampling plans were constructed according to the Sequential Likelihood Ratio Test (SLRT). To construct these plans an economic threshold of 2% was adopted and the type I and II error probabilities were fixed at alpha = beta = 0.10. Results showed that the maximum numbers of samples expected to determine the need for control were 172 and 76 samples for stands with low and high infestation, respectively.
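To make the idea of a sequential plan concrete, here is a simplified sketch of Wald's sequential probability ratio test for presence/absence counts. It assumes a binomial model rather than the negative binomial fitted in the study, and the infestation proportions p0 and p1 used in the example are hypothetical; only alpha = beta = 0.10 is taken from the abstract. The function names are likewise invented for illustration.

```python
import math


def sprt_lines(p0, p1, alpha=0.10, beta=0.10):
    """Slope and intercepts of Wald's SPRT decision lines (binomial model).

    p0: infestation proportion regarded as acceptable (hypothetical).
    p1: infestation proportion that calls for control (hypothetical).
    Returns (slope, lower_intercept, upper_intercept).
    """
    if not 0.0 < p0 < p1 < 1.0:
        raise ValueError("need 0 < p0 < p1 < 1")
    k = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    slope = math.log((1 - p0) / (1 - p1)) / k
    upper = math.log((1 - beta) / alpha) / k
    lower = math.log(beta / (1 - alpha)) / k
    return slope, lower, upper


def decide(n, infested, p0, p1, alpha=0.10, beta=0.10):
    """Decision after inspecting n units, of which `infested` carry scales."""
    slope, lower, upper = sprt_lines(p0, p1, alpha, beta)
    if infested >= upper + slope * n:
        return "control"
    if infested <= lower + slope * n:
        return "no control"
    return "continue sampling"


if __name__ == "__main__":
    # Hypothetical proportions of 1% (acceptable) and 2% (action level).
    for n, d in [(10, 0), (300, 0), (10, 10)]:
        print(n, d, decide(n, d, 0.01, 0.02))
```

Sampling stops as soon as the cumulative count leaves the band between the two parallel lines, which is what lets a sequential plan use fewer observations than a fixed-size plan when the infestation is clearly low or clearly high.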
NASA Astrophysics Data System (ADS)
Yulia, M.; Suhandy, D.
2018-03-01
NIR spectra obtained from a spectral data acquisition system contain both chemical information about the samples and physical information, such as particle size and bulk density. Several methods have been established for developing calibration models that can compensate for variations in the physical properties of samples. One common approach is to include physical information variation in the calibration model both explicitly and implicitly. The objective of this study was to evaluate the feasibility of using the explicit method to compensate for the influence of different particle sizes of coffee powder on NIR calibration model performance. A total of 220 coffee powder samples covering two types of coffee (civet and non-civet) and two particle sizes (212 and 500 µm) were prepared. Spectral data were acquired using an NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement. A discrimination method based on PLS-DA was applied and the influence of particle size on the performance of PLS-DA was investigated. In the explicit method, the particle size is added directly as a predicted variable, which results in an X block containing only the NIR spectra and a Y block containing the particle size and the type of coffee. The explicit inclusion of the particle size in the calibration model is expected to improve the accuracy of coffee type determination. The results show that, with the explicit method, the quality of the calibration model for coffee type determination is slightly better, with a coefficient of determination (R2) = 0.99 and a root mean square error of cross-validation (RMSECV) = 0.041. The PLS2 calibration model for coffee type determination with particle size compensation performed well and was able to predict the type of coffee at two different particle sizes with relatively high R2 pred values. The prediction also resulted in low bias and RMSEP values.
Clinical decision making and the expected value of information.
Willan, Andrew R
2007-01-01
The results of the HOPE study, a randomized clinical trial, provide strong evidence that 1) ramipril prevents the composite outcome of cardiovascular death, myocardial infarction or stroke in patients who are at high risk of a cardiovascular event and 2) ramipril is cost-effective at a threshold willingness-to-pay of $10,000 to prevent an event of the composite outcome. In this report the concept of the expected value of information is used to determine if the information provided by the HOPE study is sufficient for decision making in the US and Canada. Using the cost-effectiveness data from a clinical trial, or from a meta-analysis of several trials, one can determine, based on the number of future patients that would benefit from the health technology under investigation, the expected value of sample information (EVSI) of a future trial as a function of proposed sample size. If the EVSI exceeds the cost for any particular sample size then the current information is insufficient for decision making and a future trial is indicated. If, on the other hand, there is no sample size for which the EVSI exceeds the cost, then there is sufficient information for decision making and no future trial is required. Using the data from the HOPE study these concepts are applied for various assumptions regarding the fixed and variable cost of a future trial and the number of patients who would benefit from ramipril. Expected value of information methods provide a decision-analytic alternative to the standard likelihood methods for assessing the evidence provided by cost-effectiveness data from randomized clinical trials.
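The decision rule described in this abstract (a future trial is indicated only if the EVSI exceeds the trial cost for at least one sample size) can be sketched as follows. This is a minimal illustration, not the author's method: the function name `trial_indicated`, the linear cost model (fixed cost plus a per-patient cost) and all numbers passed to it are hypothetical, and the EVSI values are assumed to have been computed elsewhere.

```python
def trial_indicated(evsi_by_n, fixed_cost, cost_per_patient):
    """Return (indicated, best_n, best_gain) over candidate sample sizes.

    evsi_by_n maps a proposed sample size n to its EVSI (hypothetical inputs).
    Trial cost is assumed to be fixed_cost + cost_per_patient * n.
    """
    best_n, best_gain = None, 0.0
    for n, evsi in sorted(evsi_by_n.items()):
        gain = evsi - (fixed_cost + cost_per_patient * n)  # expected net gain
        if gain > best_gain:
            best_n, best_gain = n, gain
    # A trial is indicated only if some sample size has EVSI above its cost.
    return best_n is not None, best_n, best_gain
```

If no candidate sample size yields a positive expected net gain, the function reports that current information is sufficient, which mirrors the second branch of the rule in the abstract.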
Robust gene selection methods using weighting schemes for microarray data analysis.
Kang, Suyeon; Song, Jongwoo
2017-09-02
A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.
Simplified pupal surveys of Aedes aegypti (L.) for entomologic surveillance and dengue control.
Barrera, Roberto
2009-07-01
Pupal surveys of Aedes aegypti (L.) are useful indicators of risk for dengue transmission, although sample sizes for reliable estimations can be large. This study explores two methods for making pupal surveys more practical yet reliable and used data from 10 pupal surveys conducted in Puerto Rico during 2004-2008. The number of pupae per person for each sampling followed a negative binomial distribution, thus showing aggregation. One method found a common aggregation parameter (k) for the negative binomial distribution, a finding that enabled the application of a sequential sampling method requiring few samples to determine whether the number of pupae/person was above a vector density threshold for dengue transmission. A second approach used the finding that the mean number of pupae/person is correlated with the proportion of pupa-infested households and calculated equivalent threshold proportions of pupa-positive households. A sequential sampling program was also developed for this method to determine whether observed proportions of infested households were above threshold levels. These methods can be used to validate entomological thresholds for dengue transmission.
NASA Astrophysics Data System (ADS)
Hiraga, T.; Miyazaki, T.; Tasaka, M.; Yoshida, H.
2011-12-01
Using very fine-grained aggregates of forsterite containing ~10 vol% of a secondary mineral phase such as periclase and enstatite, we have been able to demonstrate their superplasticity, that is, the achievement of tensile strains of more than a few hundred percent (Hiraga et al. 2010). Superplastic deformation is commonly considered to proceed via grain boundary sliding (GBS), which results in grain switching in the samples. Hiraga et al. (2010) succeeded in detecting the operation of GBS by observing the coalescence of grains of the secondary phase in superplastically deformed samples. The secondary phase pins the motion of grain boundaries of the primary phase; however, the reduction in the number of secondary-phase grains due to their coalescence allows grain growth of the primary phase. We analyzed the relationships between grain size of the primary and secondary phases, between strain and grain size, and between strain and the number of coalesced grains in the superplastically deformed samples. The results support participation of all the grains of the primary phase in the grain switching process, indicating that grain boundary sliding accommodates almost the entire strain during the deformation. Mechanical properties of these materials, such as their stress and grain size exponents of 1-2, do not conflict with this conclusion. We applied the relationships obtained from analyzing superplastic materials to the microstructure of natural samples that have been considered to have deformed via grain boundary sliding, that is, ultramylonite. The microstructure of greenschist-grade ultramylonite reported by Fliervoet et al. (1997) was analyzed. Distributions of the mineral phases (i.e., quartz, plagioclase, K-feldspar and biotite) show distinct coalescence of the same mineral phases in the direction almost perpendicular to the foliation of the rock. The number of coalesced grains indicates that the strain the rock experienced is > 2. [reference] Hiraga et al. (2010) Nature 468, 1091-1094; Fliervoet et al. (1997) Journal of Structural Geology 19, 1495-1520
Voineskos, Sophocles H; Coroneos, Christopher J; Ziolkowski, Natalia I; Kaur, Manraj N; Banfield, Laura; Meade, Maureen O; Chung, Kevin C; Thoma, Achilleas; Bhandari, Mohit
2016-02-01
The authors examined industry support, conflict of interest, and sample size in plastic surgery randomized controlled trials that compared surgical interventions. They hypothesized that industry-funded trials demonstrate statistically significant outcomes more often, and randomized controlled trials with small sample sizes report statistically significant results more frequently. An electronic search identified randomized controlled trials published between 2000 and 2013. Independent reviewers assessed manuscripts and performed data extraction. Funding source, conflict of interest, primary outcome direction, and sample size were examined. Chi-squared and independent-samples t tests were used in the analysis. The search identified 173 randomized controlled trials, of which 100 (58 percent) did not acknowledge funding status. A relationship between funding source and trial outcome direction was not observed. Both funding status and conflict of interest reporting improved over time. Only 24 percent (six of 25) of industry-funded randomized controlled trials reported authors to have independent control of data and manuscript contents. The mean number of patients randomized was 73 per trial (median, 43, minimum, 3, maximum, 936). Small trials were not found to be positive more often than large trials (p = 0.87). Randomized controlled trials with small sample size were common; however, this provides great opportunity for the field to engage in further collaboration and produce larger, more definitive trials. Reporting of trial funding and conflict of interest is historically poor, but it greatly improved over the study period. Underreporting at author and journal levels remains a limitation when assessing the relationship between funding source and trial outcomes. Improved reporting and manuscript control should be goals that both authors and journals can actively achieve.
Final Results of Shuttle MMOD Impact Database
NASA Technical Reports Server (NTRS)
Hyde, J. L.; Christiansen, E. L.; Lear, D. M.
2015-01-01
The Shuttle Hypervelocity Impact Database documents damage features on each Orbiter thought to be from micrometeoroids (MM) or orbital debris (OD). Data is divided into tables for crew module windows, payload bay door radiators and thermal protection systems along with other miscellaneous regions. The combined number of records in the database is nearly 3000. Each database record provides impact feature dimensions, location on the vehicle and relevant mission information. Additional detail on the type and size of particle that produced the damage site is provided when sampling data and definitive spectroscopic analysis results are available. Guidelines are described which were used in determining whether impact damage is from micrometeoroid or orbital debris impact based on the findings from scanning electron microscopy chemical analysis. Relationships assumed when converting from observed feature sizes in different shuttle materials to particle sizes will be presented. A small number of significant impacts on the windows, radiators and wing leading edge will be highlighted and discussed in detail, including the hypervelocity impact testing performed to estimate particle sizes that produced the damage.
On the comparison of the strength of morphological integration across morphometric datasets.
Adams, Dean C; Collyer, Michael L
2016-11-01
Evolutionary morphologists frequently wish to understand the extent to which organisms are integrated, and whether the strength of morphological integration among subsets of phenotypic variables differs among taxa or other groups. However, comparisons of the strength of integration across datasets are difficult, in part because the summary measures that characterize these patterns (the RV coefficient and r_PLS) are dependent both on sample size and on the number of variables. As a solution to this issue, we propose a standardized test statistic (a z-score) for measuring the degree of morphological integration between sets of variables. The approach is based on a partial least squares analysis of trait covariation, and its permutation-based sampling distribution. Under the null hypothesis of a random association of variables, the method displays a constant expected value and confidence intervals for datasets of differing sample sizes and variable number, thereby providing a consistent measure of integration suitable for comparisons across datasets. A two-sample test is also proposed to statistically determine whether levels of integration differ between datasets, and an empirical example examining cranial shape integration in Mediterranean wall lizards illustrates its use. Some extensions of the procedure are also discussed. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.
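The idea of standardizing a PLS correlation against its permutation distribution can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: it assumes the z-score is the observed statistic minus the mean of the permuted values, divided by their standard deviation, and it takes the PLS correlation to be the correlation between the first pair of singular-vector scores of the cross-covariance matrix. The function names `r_pls` and `integration_z` are hypothetical.

```python
import numpy as np

def r_pls(x, y):
    """Correlation between the first pair of PLS scores of blocks x and y."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Singular vectors of the cross-covariance matrix define the PLS axes.
    u, _, vt = np.linalg.svd(xc.T @ yc, full_matrices=False)
    return np.corrcoef(xc @ u[:, 0], yc @ vt[0])[0, 1]

def integration_z(x, y, n_perm=999, seed=0):
    """Assumed form: (observed - mean of permuted) / sd of permuted."""
    rng = np.random.default_rng(seed)
    observed = r_pls(x, y)
    # Shuffling the rows of y breaks the association between the two blocks.
    permuted = np.array(
        [r_pls(x, y[rng.permutation(len(y))]) for _ in range(n_perm)]
    )
    return (observed - permuted.mean()) / permuted.std(ddof=1)
```

Because the permuted values absorb the dependence of the raw correlation on sample size and variable number, the standardized value is the quantity one would compare across datasets.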
Effect of Microstructural Interfaces on the Mechanical Response of Crystalline Metallic Materials
NASA Astrophysics Data System (ADS)
Aitken, Zachary H.
Advances in nano-scale mechanical testing have brought about progress in the understanding of physical phenomena in materials and a measure of control in the fabrication of novel materials. In contrast to bulk materials that display size-invariant mechanical properties, sub-micron metallic samples show a critical dependence on sample size. The strength of nano-scale single crystalline metals is well-described by a power-law function, σ ∝ D^(-n), where D is a critical sample size and n is an experimentally fit positive exponent. This relationship is attributed to source-driven plasticity and demonstrates a strengthening as the decreasing sample size begins to limit the size and number of dislocation sources. A full understanding of this size-dependence is complicated by the presence of microstructural features such as interfaces that can compete with the dominant dislocation-based deformation mechanisms. In this thesis, the effects of microstructural features such as grain boundaries and anisotropic crystallinity on nano-scale metals are investigated through uniaxial compression testing. We find that nano-sized Cu covered by a hard coating displays a Bauschinger effect and the emergence of this behavior can be explained through a simple dislocation-based analytic model. Al nano-pillars containing a single vertically-oriented coincident site lattice grain boundary are found to show similar deformation to single-crystalline nano-pillars with slip traces passing through the grain boundary. With increasing tilt angle of the grain boundary from the pillar axis, we observe a transition from dislocation-dominated deformation to grain boundary sliding. Crystallites are observed to shear along the grain boundary and molecular dynamics simulations reveal a mechanism of atomic migration that accommodates boundary sliding. We conclude with an analysis of the effects of inherent crystal anisotropy and alloying on the mechanical behavior of the Mg alloy, AZ31. Through comparison to pure Mg, we show that the size effect dominates the strength of samples below 10 μm and that differences in the size effect between hexagonal slip systems are due to the inherent crystal anisotropy, suggesting that the fundamental mechanism of the size effect in these slip systems is the same.
Arnup, Sarah J; McKenzie, Joanne E; Hemming, Karla; Pilcher, David; Forbes, Andrew B
2017-08-15
In a cluster randomised crossover (CRXO) design, a sequence of interventions is assigned to a group, or 'cluster' of individuals. Each cluster receives each intervention in a separate period of time, forming 'cluster-periods'. Sample size calculations for CRXO trials need to account for both the cluster randomisation and crossover aspects of the design. Formulae are available for the two-period, two-intervention, cross-sectional CRXO design; however, implementation of these formulae is known to be suboptimal. The aims of this tutorial are to illustrate the intuition behind the design and to provide guidance on performing sample size calculations. Graphical illustrations are used to describe the effect of the cluster randomisation and crossover aspects of the design on the correlation between individual responses in a CRXO trial. Sample size calculations for binary and continuous outcomes are illustrated using parameters estimated from the Australia and New Zealand Intensive Care Society - Adult Patient Database (ANZICS-APD) for patient mortality and length(s) of stay (LOS). The similarity between individual responses in a CRXO trial can be understood in terms of three components of variation: variation in cluster mean response; variation in the cluster-period mean response; and variation between individual responses within a cluster-period; or equivalently in terms of the correlation between individual responses in the same cluster-period (within-cluster within-period correlation, WPC), and between individual responses in the same cluster, but in different periods (within-cluster between-period correlation, BPC). The BPC lies between zero and the WPC. When the WPC and BPC are equal the precision gained by the crossover aspect of the CRXO design equals the precision lost by cluster randomisation. When the BPC is zero there is no advantage in a CRXO over a parallel-group cluster randomised trial. 
Sample size calculations illustrate that small changes in the specification of the WPC or BPC can increase the required number of clusters. By illustrating how the parameters required for sample size calculations arise from the CRXO design and by providing guidance on both how to choose values for the parameters and perform the sample size calculations, the implementation of the sample size formulae for CRXO trials may improve.
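The sensitivity of the sample size to the WPC and BPC can be illustrated with a small sketch. The abstract does not state the formula, so the design effect used here, 1 + (m - 1) x WPC - m x BPC with m participants per cluster-period, is an assumption based on the form commonly given for the two-period, two-intervention, cross-sectional CRXO design; the function name and the example values are hypothetical.

```python
def crxo_design_effect(m, wpc, bpc):
    """Assumed design effect for a two-period, two-intervention CRXO trial.

    m   : participants per cluster-period
    wpc : within-cluster within-period correlation
    bpc : within-cluster between-period correlation
    Assumed form (not given in the abstract): 1 + (m - 1) * wpc - m * bpc.
    """
    if not 0 <= bpc <= wpc:
        raise ValueError("BPC is expected to lie between zero and the WPC")
    return 1 + (m - 1) * wpc - m * bpc
```

Under this assumed form, with m = 100 and WPC = 0.05, lowering the BPC from 0.04 to 0.03 raises the design effect from about 1.95 to about 2.95, which is the kind of sensitivity to small specification changes the abstract describes. Setting the BPC equal to the WPC gives a value close to 1, and setting it to zero recovers the usual cluster-randomisation inflation for clusters of size m.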
The Effects of Popping Popcorn Under Reduced Pressure
NASA Astrophysics Data System (ADS)
Quinn, Paul; Cooper, Amanda
2008-03-01
In our experiments, we model the popping of popcorn as an adiabatic process and develop a process for improving the efficiency of popcorn production. By lowering the pressure of the popcorn during the popping process, we induce an increase in popcorn size, while decreasing the number of remaining unpopped kernels. In this project we run numerous experiments using three of the most common popping devices, a movie popcorn maker, a stove pot, and a microwave. We specifically examine the effects of varying the pressure on total sample size, flake size and waste. An empirical relationship is found between these variables and the pressure.
Speckle imaging through turbulent atmosphere based on adaptable pupil segmentation.
Loktev, Mikhail; Soloviev, Oleg; Savenko, Svyatoslav; Vdovin, Gleb
2011-07-15
We report on the first results to our knowledge obtained with adaptable multiaperture imaging through turbulence on a horizontal atmospheric path. We show that the resolution can be improved by adaptively matching the size of the subaperture to the characteristic size of the turbulence. Further improvement is achieved by the deconvolution of a number of subimages registered simultaneously through multiple subapertures. Different implementations of multiaperture geometry, including pupil multiplication, pupil image sampling, and a plenoptic telescope, are considered. Resolution improvement has been demonstrated on a ∼550 m horizontal turbulent path, using a combination of aperture sampling, speckle image processing, and, optionally, frame selection. © 2011 Optical Society of America
Stratospheric CCN sampling program
NASA Technical Reports Server (NTRS)
Rogers, C. F.
1981-01-01
When Mt. St. Helens produced several major eruptions in the late spring of 1980, there was a strong interest in the characterization of the cloud condensation nuclei (CCN) activity of the material that was injected into the troposphere and stratosphere. The scientific value of CCN measurements is twofold: CCN counts may be directly applied to calculations of the interaction of the aerosol (enlargement) at atmospherically realistic relative humidities or supersaturations; and if the chemical constituency of the aerosol can be assumed, the number-versus-critical supersaturation spectrum may be converted into a dry aerosol size spectrum covering a size region not readily measured by other methods. The sampling method is described along with the instrumentation used in the experiments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Öztürk, Hande; Noyan, I. Cevdet
2017-08-24
A rigorous study of sampling and intensity statistics applicable for a powder diffraction experiment as a function of crystallite size is presented. Our analysis yields approximate equations for the expected value, variance and standard deviations for both the number of diffracting grains and the corresponding diffracted intensity for a given Bragg peak. The classical formalism published in 1948 by Alexander, Klug & Kummer [J. Appl. Phys. (1948), 19, 742–753] appears as a special case, limited to large crystallite sizes, here. It is observed that both the Lorentz probability expression and the statistics equations used in the classical formalism are inapplicable for nanocrystalline powder samples.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
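The relationship between the step-size and the successive-approximations procedure can be sketched for a one-dimensional mixture. This is a rough illustration under assumptions, not the authors' algorithm: it treats the successive-approximations map G as the familiar EM-type update (with the variance computed about the freshly updated mean) and forms the relaxed iteration theta <- theta + step * (G(theta) - theta), so that step = 1 reproduces the plain update. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def em_update(x, w, mu, var):
    """One successive-approximations (EM-type) update for a 1-D normal mixture."""
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)  # posterior membership probabilities
    nk = resp.sum(axis=0)
    new_w = nk / len(x)
    new_mu = (resp * x[:, None]).sum(axis=0) / nk
    new_var = (resp * (x[:, None] - new_mu) ** 2).sum(axis=0) / nk
    return new_w, new_mu, new_var

def fit_mixture(x, w, mu, var, step=1.0, n_iter=200):
    """Iterate theta <- theta + step * (G(theta) - theta); step=1 is the plain update."""
    for _ in range(n_iter):
        gw, gmu, gvar = em_update(x, w, mu, var)
        w = w + step * (gw - w)        # weights still sum to one for any step
        mu = mu + step * (gmu - mu)
        var = var + step * (gvar - var)
    return w, mu, var

# Synthetic, well-separated two-component data (hypothetical example).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])
start = (np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0]))
w_hat, mu_hat, var_hat = fit_mixture(x, *start, step=1.5)
```

The convergence result in the abstract is local, so a step-size above 1 is only safe near the solution; far from it, an over-relaxed variance or weight can leave its valid range, and this sketch does not guard against that.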
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
Instrumental neutron activation analysis for studying size-fractionated aerosols
NASA Astrophysics Data System (ADS)
Salma, Imre; Zemplén-Papp, Éva
1999-10-01
Instrumental neutron activation analysis (INAA) was utilized for studying aerosol samples collected into a coarse and a fine size fraction on Nuclepore polycarbonate membrane filters. As a result of the panoramic INAA, 49 elements were determined in an amount of about 200-400 μg of particulate matter by two irradiations and four γ-spectrometric measurements. The analytical calculations were performed by the absolute (k0) standardization method. The calibration procedures, application protocol and the data evaluation process are described and discussed. They now make it possible to analyse a considerable number of samples while assuring the quality of the results. As a means of demonstrating the system's analytical capabilities, the concentration ranges, median or mean atmospheric concentrations and detection limits are presented for an extensive series of aerosol samples collected within the framework of an urban air pollution study in Budapest. For most elements, the precision of the analysis was found to be beyond the uncertainty represented by the sampling techniques and sample variability.
Sample preparation techniques for the determination of trace residues and contaminants in foods.
Ridgway, Kathy; Lalljie, Sam P D; Smith, Roger M
2007-06-15
The determination of trace residues and contaminants in complex matrices, such as food, often requires extensive sample extraction and preparation prior to instrumental analysis. Sample preparation is often the bottleneck in analysis and there is a need to minimise the number of steps to reduce both time and sources of error. There is also a move towards more environmentally friendly techniques, which use less solvent and smaller sample sizes. Smaller sample size becomes important when dealing with real life problems, such as consumer complaints and alleged chemical contamination. Optimal sample preparation can reduce analysis time, sources of error, enhance sensitivity and enable unequivocal identification, confirmation and quantification. This review considers all aspects of sample preparation, covering general extraction techniques, such as Soxhlet and pressurised liquid extraction, microextraction techniques such as liquid phase microextraction (LPME) and more selective techniques, such as solid phase extraction (SPE), solid phase microextraction (SPME) and stir bar sorptive extraction (SBSE). The applicability of each technique in food analysis, particularly for the determination of trace organic contaminants in foods is discussed.
Sparse feature learning for instrument identification: Effects of sampling and pooling methods.
Han, Yoonchang; Lee, Subin; Nam, Juhan; Lee, Kyogu
2016-05-01
Feature learning for music applications has recently received considerable attention from many researchers. This paper reports on the sparse feature learning algorithm for musical instrument identification, and in particular, focuses on the effects of the frame sampling techniques for dictionary learning and the pooling methods for feature aggregation. To this end, two frame sampling techniques are examined: fixed and proportional random sampling. Furthermore, the effect of using onset frames was analyzed for both of the proposed sampling methods. Regarding summarization of the feature activation, a standard deviation pooling method is used and compared with the commonly used max- and average-pooling techniques. Using more than 47 000 recordings of 24 instruments from various performers, playing styles, and dynamics, a number of tuning parameters are varied experimentally, including the analysis frame size, the dictionary size, and the type of frequency scaling as well as the different sampling and pooling methods. The results show that the combination of proportional sampling and standard deviation pooling achieves the best overall performance of 95.62%, while the optimal parameter set varies among the instrument classes.
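The three pooling methods compared in this abstract differ only in how frame-level activations are summarized over time. A minimal sketch, assuming the activations are stored as a frames-by-features array and that the standard deviation is the population form; the function name `pool_activations` and the array layout are hypothetical, not taken from the paper.

```python
import numpy as np

def pool_activations(activations, method="std"):
    """Summarize a (frames x features) activation matrix into one feature vector."""
    if method == "max":
        return activations.max(axis=0)      # strongest activation per feature
    if method == "average":
        return activations.mean(axis=0)     # mean activation per feature
    if method == "std":
        return activations.std(axis=0)      # spread of activation per feature
    raise ValueError(f"unknown pooling method: {method}")
```

Each method returns one value per learned feature, so recordings of different lengths map to vectors of the same size before classification.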
Measurement and classification methods using the ASAE S572-1 reference nozzles
USDA-ARS's Scientific Manuscript database
An increasing number of spray nozzle and agrochemical manufacturers are incorporating droplet size measurements into both research and development with each laboratory invariably having their own sampling setup and procedures, particularly with regard to both measurement distance from the nozzle and...
A closer look at the size of the gaze-liking effect: a preregistered replication.
Tipples, Jason; Pecchinenda, Anna
2018-04-30
This study is a direct replication of the gaze-liking effect using the same design, stimuli and procedure. The gaze-liking effect describes the tendency for people to rate objects as more likeable when they have recently seen a person repeatedly gaze toward rather than away from the object. However, as subsequent studies show considerable variability in the size of this effect, we sampled a larger number of participants (N = 98) than the original study (N = 24) to gain a more precise estimate of the gaze-liking effect size. Our results indicate a much smaller standardised effect size (d_z = 0.02) than that of the original study (d_z = 0.94). Our smaller effect size was not due to general insensitivity to eye-gaze effects because the same sample showed a clear (d_z = 1.09) gaze-cuing effect: faster reaction times when eyes looked toward vs away from target objects. We discuss the implications of our findings for future studies wishing to study the gaze-liking effect.
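The standardised effect size reported here is the within-subject form, conventionally defined as the mean of the paired differences divided by their standard deviation. A minimal sketch of that calculation; the function name and any ratings passed to it are hypothetical and not data from the study.

```python
import statistics

def cohens_dz(cond_a, cond_b):
    """Within-subject effect size: mean paired difference / SD of the differences."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)
```

Because the denominator is the standard deviation of the differences, the value depends on how strongly the two conditions correlate within participants, which is one reason it is not directly comparable with between-subject effect sizes.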
Catch of channel catfish with tandem-set hoop nets and gill nets in lentic systems of Nebraska
Richters, Lindsey K.; Pope, Kevin L.
2011-01-01
Twenty-six Nebraska water bodies representing two ecosystem types (small standing waters and large standing waters) were surveyed during 2008 and 2009 with tandem-set hoop nets and experimental gill nets to determine if similar trends existed in catch rates and size structures of channel catfish Ictalurus punctatus captured with these gears. Gear efficiency was assessed as the number of sets (nets) that would be required to capture 100 channel catfish given observed catch per unit effort (CPUE). Efficiency of gill nets was not correlated with efficiency of hoop nets for capturing channel catfish. Small sample sizes prohibited estimation of proportional size distributions in most surveys; in the four surveys for which sample size was sufficient to quantify length-frequency distributions of captured channel catfish, distributions differed between gears. The CPUE of channel catfish did not differ between small and large water bodies for either gear. While catch rates of hoop nets were lower than rates recorded in previous studies, this gear was more efficient than gill nets at capturing channel catfish. However, comparisons of size structure between gears may be problematic.
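The efficiency measure used in this study, the number of sets needed to capture 100 channel catfish at the observed catch per unit effort, is a simple ratio. A minimal sketch, assuming CPUE is expressed as fish per net set and that the result is rounded up to whole sets; the rounding choice and the function name are assumptions, not details from the abstract.

```python
import math

def sets_for_target(cpue, target=100):
    """Net sets needed to catch `target` fish at `cpue` fish per set (rounded up)."""
    if cpue <= 0:
        raise ValueError("CPUE must be positive to estimate the required effort")
    return math.ceil(target / cpue)
```

A gear with a higher CPUE needs fewer sets to reach the target, so a smaller value indicates a more efficient gear under this measure.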
Error simulation of paired-comparison-based scaling methods
NASA Astrophysics Data System (ADS)
Cui, Chengwu
2000-12-01
Subjective image quality measurement usually resorts to psychophysical scaling. However, it is difficult to evaluate the inherent precision of these scaling methods. Without knowing the potential errors of the measurement, subsequent use of the data can be misleading. In this paper, the errors on scaled values derived from paired-comparison-based scaling methods are simulated with randomly introduced proportions of choice errors that follow the binomial distribution. Simulation results are given for various combinations of the number of stimuli and the sampling size. The errors are presented in the form of the average standard deviation of the scaled values and can be fitted reasonably well with an empirical equation that can be used for scaling error estimation and measurement design. The simulation shows that paired-comparison-based scaling methods can have large errors on the derived scaled values when the sampling size and the number of stimuli are small. Examples are also given to show the potential errors on actually scaled values of color image prints as measured by the method of paired comparison.
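A simulation of this kind can be sketched as follows. The abstract does not specify the scaling model, so this sketch assumes Thurstone Case V scaling (choice probability given by the normal CDF of the scale difference, scale values recovered as row means of the z-score matrix) and draws the number of wins for each pair from a binomial distribution. The function name, the clipping of extreme proportions and the example scale values are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist

def simulate_scaling_error(true_scale, n_obs, n_rep=500, seed=0):
    """Average SD of recovered scale values over repeated simulated experiments."""
    rng = np.random.default_rng(seed)
    nd = NormalDist()
    s = np.asarray(true_scale, dtype=float)
    k = len(s)
    # Assumed Case V model: P(i preferred over j) = Phi(s_i - s_j).
    p_true = np.vectorize(nd.cdf)(s[:, None] - s[None, :])
    iu = np.triu_indices(k, 1)
    estimates = np.empty((n_rep, k))
    for r in range(n_rep):
        wins = rng.binomial(n_obs, p_true[iu])  # binomial choice counts per pair
        # Clip so that unanimous outcomes do not map to infinite z-scores.
        p_hat = np.clip(wins / n_obs, 0.5 / n_obs, 1 - 0.5 / n_obs)
        z = np.zeros((k, k))
        z[iu] = np.vectorize(nd.inv_cdf)(p_hat)
        z = z - z.T                             # antisymmetric z-score matrix
        estimates[r] = z.mean(axis=1)           # Case V scale values (zero-mean)
    return estimates.std(axis=0).mean()
```

Calling the function with the same stimuli and a smaller number of observations per pair should give a larger average standard deviation, in line with the abstract's conclusion about small sampling sizes.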
NASA Astrophysics Data System (ADS)
Zou, Xiaodong; Zhao, Dapeng; Sun, Jincheng; Wang, Cong; Matsuura, Hiroyuki
2018-04-01
Inclusion evolution behaviors, in terms of composition, size, and number density, and associated influence on the microstructures of the as-cast slabs, rolled plates, and simulated welded samples of plain EH36 and EH36-Mg shipbuilding steels have been systematically investigated. The results indicate that the inclusions in the as-cast plain EH36 are almost all Al-Ca-S-O-(Mn) complex oxides with sizes ranging from 1.0 to 2.0 μm. After Mg addition, a large amount of individual fine MnS precipitates and Mg-containing Ti-Al-Mg-O-(Mn-S) complex inclusions are generated, which significantly refine the microstructure and are conducive to the nucleation of acicular ferrite in the rolled and welded samples. Moreover, after rolling and welding thermal simulation, the number of individual MnS precipitates decreases gradually due to their precipitation on the surface of Ti-Al-Mg-O oxides.
Puechmaille, Sebastien J
2016-05-01
Inferences of population structure and more precisely the identification of genetically homogeneous groups of individuals are essential to the fields of ecology, evolutionary biology and conservation biology. Such population structure inferences are routinely investigated via the program structure implementing a Bayesian algorithm to identify groups of individuals at Hardy-Weinberg and linkage equilibrium. While the method is performing relatively well under various population models with even sampling between subpopulations, the robustness of the method to uneven sample size between subpopulations and/or hierarchical levels of population structure has not yet been tested despite being commonly encountered in empirical data sets. In this study, I used simulated and empirical microsatellite data sets to investigate the impact of uneven sample size between subpopulations and/or hierarchical levels of population structure on the detected population structure. The results demonstrated that uneven sampling often leads to wrong inferences on hierarchical structure and downward-biased estimates of the true number of subpopulations. Distinct subpopulations with reduced sampling tended to be merged together, while at the same time, individuals from extensively sampled subpopulations were generally split, despite belonging to the same panmictic population. Four new supervised methods to detect the number of clusters were developed and tested as part of this study and were found to outperform the existing methods using both evenly and unevenly sampled data sets. Additionally, a subsampling strategy aiming to reduce sampling unevenness between subpopulations is presented and tested. These results altogether demonstrate that when sampling evenness is accounted for, the detection of the correct population structure is greatly improved. © 2016 John Wiley & Sons Ltd.
Spatial Sampling of Weather Data for Regional Crop Yield Simulations
NASA Technical Reports Server (NTRS)
Van Bussel, Lenny G. J.; Ewert, Frank; Zhao, Gang; Hoffmann, Holger; Enders, Andreas; Wallach, Daniel; Asseng, Senthold; Baigorria, Guillermo A.; Basso, Bruno; Biernath, Christian;
2016-01-01
Field-scale crop models are increasingly applied at spatio-temporal scales that range from regions to the globe and from decades up to 100 years. Sufficiently detailed data to capture the prevailing spatio-temporal heterogeneity in weather, soil, and management conditions as needed by crop models are rarely available. Effective sampling may overcome the problem of missing data but has rarely been investigated. In this study the effect of sampling weather data has been evaluated for simulating yields of winter wheat in a region in Germany over a 30-year period (1982-2011) using 12 process-based crop models. A stratified sampling was applied to compare the effect of different sizes of spatially sampled weather data (10, 30, 50, 100, 500, 1000 and full coverage of 34,078 sampling points) on simulated wheat yields. Stratified sampling was further compared with random sampling. Possible interactions between sample size and crop model were evaluated. The results showed differences in simulated yields among crop models but all models reproduced well the pattern of the stratification. Importantly, the regional mean of simulated yields based on full coverage could already be reproduced by a small sample of 10 points. This was also true for reproducing the temporal variability in simulated yields but more sampling points (about 100) were required to accurately reproduce spatial yield variability. The number of sampling points can be smaller when a stratified sampling is applied as compared to a random sampling. However, differences between crop models were observed including some interaction between the effect of sampling on simulated yields and the model used. We concluded that stratified sampling can considerably reduce the number of required simulations. But, differences between crop models must be considered as the choice for a specific model can have larger effects on simulated yields than the sampling strategy. 
Assessing the impact of sampling soil and crop management data for regional simulations of crop yields is still needed.
Yu, Bi-yun; Zhang, Wen-hui; He, Ting; You, Jian-jian; Li, Gang
2014-12-01
A typical sampling method was used to survey the effects of forest gap size on branch architecture, leaf characteristics and their vertical distribution in Quercus variabilis seedlings from gaps of different sizes in a natural secondary Q. variabilis thinning forest on the south slope of the Qinling Mountains. The results showed that gap size significantly affected the diameter and crown area of Q. variabilis seedlings. Gap size was positively correlated with diameter and negatively correlated with crown area, while it had no significant impact on seedling height, crown length and crown rates. The overall bifurcation ratio, stepwise bifurcation ratio, and ratio of branch diameter followed the order large gap > middle gap > small gap > understory. The first-order branches under different size gaps were mainly concentrated at the middle and upper part of the trunk, larger diameter first-order branches were mainly distributed at the lower part of the trunk, and the angle of first-order branches increased at first and then declined with increasing seedling height. With increasing forest gap size, the leaf length, leaf width and average leaf area of seedlings all gradually declined, while the average leaf number per plant and relative total leaf number increased; the leaf length-width ratio remained stable, the relative leaf number was mainly distributed at the middle and upper parts of the trunk, and the change in leaf area index was consistent with the change in the relative total number of leaves. There was no significant difference between the diameters of middle gap and large gap seedlings, but the diameter of middle gap seedlings was higher than that of large gap seedlings, suggesting that middle gaps would benefit seedling regeneration and high-quality timber cultivation. To promote the regeneration of Q. variabilis seedlings and to cultivate high-quality timber, appropriate thinning should be undertaken to increase the number of middle gaps in the management of Q. variabilis forest.
Sample design effects in landscape genetics
Oyler-McCance, Sara J.; Fedy, Bradley C.; Landguth, Erin L.
2012-01-01
An important research gap in landscape genetics is the impact of different field sampling designs on the ability to detect the effects of landscape pattern on gene flow. We evaluated how five different sampling regimes (random, linear, systematic, cluster, and single study site) affected the probability of correctly identifying the generating landscape process of population structure. Sampling regimes were chosen to represent a suite of designs common in field studies. We used genetic data generated from a spatially-explicit, individual-based program and simulated gene flow in a continuous population across a landscape with gradual spatial changes in resistance to movement. Additionally, we evaluated the sampling regimes using realistic and obtainable numbers of loci (10 and 20), numbers of alleles per locus (5 and 10), numbers of individuals sampled (10-300), and generational times after the landscape was introduced (20 and 400). For a simulated continuously distributed species, we found that random, linear, and systematic sampling regimes performed well with high sample sizes (>200), levels of polymorphism (10 alleles per locus), and number of molecular markers (20). The cluster and single study site sampling regimes were not able to correctly identify the generating process under any conditions and thus are not advisable strategies for scenarios similar to our simulations. Our research emphasizes the importance of sampling data at ecologically appropriate spatial and temporal scales and suggests careful consideration for sampling near landscape components that are likely to most influence the genetic structure of the species. In addition, simulating sampling designs a priori could help guide field data collection efforts.
Yoshida, Sachiyo; Rudan, Igor; Cousens, Simon
2016-01-01
Introduction: Crowdsourcing has become an increasingly important tool to address many problems – from government elections in democracies, stock market prices, to modern online tools such as TripAdvisor or Internet Movie Database (IMDB). The CHNRI method (the acronym for the Child Health and Nutrition Research Initiative) for setting health research priorities has crowdsourcing as the major component, which it uses to generate, assess and prioritize between many competing health research ideas. Methods: We conducted a series of analyses using data from a group of 91 scorers to explore the quantitative properties of their collective opinion. We were interested in the stability of their collective opinion as the sample size increases from 15 to 90. From a pool of 91 scorers who took part in a previous CHNRI exercise, we used sampling with replacement to generate multiple random samples of different size. First, for each sample generated, we identified the top 20 ranked research ideas, among 205 that were proposed and scored, and calculated the concordance with the ranking generated by the 91 original scorers. Second, we used rank correlation coefficients to compare the ranks assigned to all 205 proposed research ideas when samples of different size are used. We also analysed the original pool of 91 scorers to look for evidence of scoring variations based on scorers' characteristics. Results: The sample sizes investigated ranged from 15 to 90. The concordance for the top 20 scored research ideas increased with sample sizes up to about 55 experts. At this point, the median level of concordance stabilized at 15/20 top ranked questions (75%), with the interquartile range also generally stable (14–16). There was little further increase in overlap when the sample size increased from 55 to 90.
When analysing the ranking of all 205 ideas, the rank correlation coefficient increased as the sample size increased, with a median correlation of 0.95 reached at the sample size of 45 experts (median of the rank correlation coefficient = 0.95; IQR 0.94–0.96). Conclusions: Our analyses suggest that the collective opinion of an expert group on a large number of research ideas, expressed through categorical variables (Yes/No/Not Sure/Don't know), stabilises relatively quickly in terms of identifying the ideas that have most support. In the exercise we found that a high degree of reproducibility of the identified research priorities was achieved with as few as 45–55 experts. PMID:27350874
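The resampling step described in the Methods can be sketched as follows. This is only an illustration under stated assumptions: scores are held in a scorer-by-idea table, ideas are ranked by their mean score, and concordance is the overlap between a resample's top-k set and the full panel's top-k set. The names `top_k` and `median_concordance` are hypothetical, and the actual CHNRI scoring rules may differ.

```python
import random
from statistics import median


def top_k(scores, scorer_ids, k):
    """Indices of the k ideas with the highest mean score among scorer_ids."""
    n_ideas = len(scores[0])
    means = [
        sum(scores[s][i] for s in scorer_ids) / len(scorer_ids)
        for i in range(n_ideas)
    ]
    ranked = sorted(range(n_ideas), key=lambda i: means[i], reverse=True)
    return set(ranked[:k])


def median_concordance(scores, sample_size, k=20, n_boot=200, seed=0):
    """Median overlap between resampled and full-panel top-k idea sets."""
    rng = random.Random(seed)
    all_ids = list(range(len(scores)))
    reference = top_k(scores, all_ids, k)
    overlaps = []
    for _ in range(n_boot):
        # Sampling scorers WITH replacement, as in the abstract.
        sample = rng.choices(all_ids, k=sample_size)
        overlaps.append(len(top_k(scores, sample, k) & reference))
    return median(overlaps)


if __name__ == "__main__":
    # Synthetic stand-in data: 91 scorers, 205 ideas, binary scores.
    rng = random.Random(1)
    quality = [rng.random() for _ in range(205)]
    scores = [[int(rng.random() < q) for q in quality] for _ in range(91)]
    for m in (15, 45, 90):
        print(m, median_concordance(scores, sample_size=m))
```

The synthetic data in the usage block are invented for illustration and do not reproduce the study's figures.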
Yoshida, Sachiyo; Rudan, Igor; Cousens, Simon
2016-06-01
Crowdsourcing has become an increasingly important tool to address many problems - from government elections in democracies, stock market prices, to modern online tools such as TripAdvisor or Internet Movie Database (IMDB). The CHNRI method (the acronym for the Child Health and Nutrition Research Initiative) for setting health research priorities has crowdsourcing as the major component, which it uses to generate, assess and prioritize between many competing health research ideas. We conducted a series of analyses using data from a group of 91 scorers to explore the quantitative properties of their collective opinion. We were interested in the stability of their collective opinion as the sample size increases from 15 to 90. From a pool of 91 scorers who took part in a previous CHNRI exercise, we used sampling with replacement to generate multiple random samples of different size. First, for each sample generated, we identified the top 20 ranked research ideas, among 205 that were proposed and scored, and calculated the concordance with the ranking generated by the 91 original scorers. Second, we used rank correlation coefficients to compare the ranks assigned to all 205 proposed research ideas when samples of different size are used. We also analysed the original pool of 91 scorers to look for evidence of scoring variations based on scorers' characteristics. The sample sizes investigated ranged from 15 to 90. The concordance for the top 20 scored research ideas increased with sample sizes up to about 55 experts. At this point, the median level of concordance stabilized at 15/20 top ranked questions (75%), with the interquartile range also generally stable (14-16). There was little further increase in overlap when the sample size increased from 55 to 90.
When analysing the ranking of all 205 ideas, the rank correlation coefficient increased as the sample size increased, with a median correlation of 0.95 reached at the sample size of 45 experts (median of the rank correlation coefficient = 0.95; IQR 0.94-0.96). Our analyses suggest that the collective opinion of an expert group on a large number of research ideas, expressed through categorical variables (Yes/No/Not Sure/Don't know), stabilises relatively quickly in terms of identifying the ideas that have most support. In the exercise we found a high degree of reproducibility of the identified research priorities was achieved with as few as 45-55 experts.
A survey of FRAXE allele sizes in three populations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhong, N.; Ju, W.; Curley, D.
1996-08-09
FRAXE is a fragile site located at Xq27-8, which contains polymorphic triplet GCC repeats associated with a CpG island. Similar to FRAXA, expansion of the GCC repeats results in an abnormal methylation of the CpG island and is associated with a mild mental retardation syndrome (FRAXE-MR). We surveyed the GCC repeat alleles of FRAXE from 3 populations. A total of 665 X chromosomes including 416 from a New York Euro-American sample (259 normal and 157 with FRAXA mutations), 157 from a Chinese sample (144 normal and 13 FRAXA), and 92 from a Finnish sample (56 normal and 36 FRAXA) were analyzed by polymerase chain reaction. Twenty-seven alleles, ranging from 4 to 39 GCC repeats, were observed. The modal repeat number was 16 in the New York and Finnish samples and accounted for 24% of all the chromosomes tested (162/665). The modal repeat number in the Chinese sample was 18. A founder effect for FRAXA was suggested among the Finnish FRAXA samples in that 75% had the FRAXE 16 repeat allele versus only 30% of controls. Sequencing of the FRAXE region showed no imperfections within the GCC repeat region, such as those commonly seen in FRAXA. The smaller size and limited range of repeats and the lack of imperfections suggest the molecular mechanisms underlying FRAXE triplet mutations may be different from those underlying FRAXA. 27 refs., 4 figs., 1 tab.
Random vs. systematic sampling from administrative databases involving human subjects.
Hagino, C; Lo, R J
1998-09-01
Two sampling techniques, simple random sampling (SRS) and systematic sampling (SS), were compared to determine whether they yield similar and accurate distributions for the following four factors: age, gender, geographic location and years in practice. Any point estimate within 7 yr or 7 percentage points of its reference standard (SRS or the entire data set, i.e., the target population) was considered "acceptably similar" to the reference standard. The sampling frame was the entire membership database of the Canadian Chiropractic Association. The two sampling methods were tested using eight different sample sizes of n (50, 100, 150, 200, 250, 300, 500, 800). From the profile/characteristics summaries of four known factors [gender, average age, number (%) of chiropractors in each province and years in practice], between- and within-methods chi-square tests and unpaired t tests were performed to determine whether any of the differences [descriptively greater than 7% or 7 yr] were also statistically significant. The strengths of the agreements between the provincial distributions were quantified by calculating the percent agreements for each (provincial pairwise-comparison methods). Any percent agreement less than 70% was judged to be unacceptable. Our assessments of the two sampling methods (SRS and SS) for the different sample sizes tested suggest that SRS and SS yielded acceptably similar results. Both methods started to yield "correct" sample profiles at approximately the same sample size (n > 200). SS is not only convenient, it can be recommended for sampling from large databases in which the data are listed without any inherent order biases other than alphabetical listing by surname.
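The two designs compared above can be sketched in a few lines. The code below is a generic illustration, not the study's procedure: it assumes the sampling frame is an ordered list, draws SRS without replacement, and implements systematic sampling with a random start and a fixed (possibly fractional) interval. The function names are hypothetical.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n records without replacement, each subset equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Take every (len(frame)/n)-th record after a random start."""
    step = len(frame) / n
    start = rng.random() * step
    last = len(frame) - 1
    # min() guards against a floating-point index equal to len(frame).
    return [frame[min(int(start + i * step), last)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical frame: member ages listed in database order.
    frame = [25 + (i * 7) % 40 for i in range(5000)]
    for n in (50, 200, 800):
        srs = simple_random_sample(frame, n, rng)
        ss = systematic_sample(frame, n, rng)
        print(n, sum(srs) / n, sum(ss) / n, sum(frame) / len(frame))
```

Systematic sampling needs only one random number and a single pass through the list, which is the convenience the abstract refers to; it can mislead when the list order is periodic in a way that aligns with the sampling interval.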
NASA Technical Reports Server (NTRS)
Franklin, Janet; Simonett, David
1988-01-01
The Li-Strahler reflectance model, driven by LANDSAT Thematic Mapper (TM) data, provided regional estimates of tree size and density within 20 percent of sampled values in two bioclimatic zones in West Africa. This model exploits tree geometry in an inversion technique to predict average tree size and density from reflectance data using a few simple parameters measured in the field (spatial pattern, shape, and size distribution of trees) and in the imagery (spectral signatures of scene components). Trees are treated as simply shaped objects, and multispectral reflectance of a pixel is assumed to be related only to the proportions of tree crown, shadow, and understory in the pixel. These, in turn, are a direct function of the number and size of trees, the solar illumination angle, and the spectral signatures of crown, shadow and understory. Given the variance in reflectance from pixel to pixel within a homogeneous area of woodland, caused by the variation in the number and size of trees, the model can be inverted to give estimates of average tree size and density. Because the inversion is sensitive to correct determination of component signatures, predictions are not accurate for small areas.
Jones, Jeffery I.; Gardner, Michael S.; Schieltz, David M.; Parks, Bryan A.; Toth, Christopher A.; Rees, Jon C.; Andrews, Michael L.; Carter, Kayla; Lehtikoski, Antony K.; McWilliams, Lisa G.; Williamson, Yulanda M.; Bierbaum, Kevin P.; Pirkle, James L.; Barr, John R.
2018-01-01
Lipoproteins are complex molecular assemblies that are key participants in the intricate cascade of extracellular lipid metabolism with important consequences in the formation of atherosclerotic lesions and the development of cardiovascular disease. Multiplexed mass spectrometry (MS) techniques have substantially improved the ability to characterize the composition of lipoproteins. However, these advanced MS techniques are limited by traditional pre-analytical fractionation techniques that compromise the structural integrity of lipoprotein particles during separation from serum or plasma. In this work, we applied a highly effective and gentle hydrodynamic size based fractionation technique, asymmetric flow field-flow fractionation (AF4), and integrated it into a comprehensive tandem mass spectrometry based workflow that was used for the measurement of apolipoproteins (apos A-I, A-II, A-IV, B, C-I, C-II, C-III and E), free cholesterol (FC), cholesterol esters (CE), triglycerides (TG), and phospholipids (PL) (phosphatidylcholine (PC), sphingomyelin (SM), phosphatidylethanolamine (PE), phosphatidylinositol (PI) and lysophosphatidylcholine (LPC)). Hydrodynamic size in each of 40 size fractions separated by AF4 was measured by dynamic light scattering. Measuring all major lipids and apolipoproteins in each size fraction and in the whole serum, using a total of 0.1 ml, allowed the volumetric calculation of lipoprotein particle numbers and expression of composition in molar analyte per particle number ratios. Measurements in 110 serum samples showed substantive differences between size fractions of HDL and LDL. Lipoprotein composition within size fractions was expressed in molar ratios of analytes (A-I/A-II, C-II/C-I, C-II/C-III, E/C-III, FC/PL, SM/PL, PE/PL, and PI/PL), showing differences in sample categories with combinations of normal and high levels of Total-C and/or Total-TG.
The agreement with previous studies indirectly validates the AF4-LC-MS/MS approach and demonstrates the potential of this workflow for characterization of lipoprotein composition in clinical studies using small volumes of archived frozen samples. PMID:29634782
Relation Between Pore Size and the Compressibility of a Confined Fluid
Gor, Gennady Y.; Siderius, Daniel W.; Rasmussen, Christopher J.; Krekelberg, William P.; Shen, Vincent K.; Bernstein, Noam
2015-01-01
When a fluid is confined to a nanopore, its thermodynamic properties differ from the properties of a bulk fluid, so measuring such properties of the confined fluid can provide information about the pore sizes. Here we report a simple relation between the pore size and isothermal compressibility of argon confined in these pores. Compressibility is calculated from the fluctuations of the number of particles in the grand canonical ensemble using two different simulation techniques: conventional grand-canonical Monte Carlo and grand-canonical ensemble transition-matrix Monte Carlo. Our results provide a theoretical framework for extracting the information on the pore sizes of fluid-saturated samples by measuring the compressibility from ultrasonic experiments. PMID:26590541
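For reference, the standard grand-canonical fluctuation relation that underlies this kind of calculation is written out below. The abstract does not state the formula; the symbols are assumptions introduced here: N is the instantaneous number of particles, V the volume accessible to the fluid (for a confined fluid, the pore volume, whose definition is itself a modeling choice), T the temperature and k_B the Boltzmann constant.

```latex
\kappa_T \;=\; -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_{T}
\;=\; \frac{V\,\langle (\delta N)^2 \rangle}{k_B T\,\langle N \rangle^{2}},
\qquad
\langle (\delta N)^2 \rangle \;=\; \langle N^2 \rangle - \langle N \rangle^{2}
```

In a grand-canonical Monte Carlo run, the two averages on the right are accumulated directly from the sampled particle numbers, so the compressibility follows without any additional simulation.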
NASA Astrophysics Data System (ADS)
Wiedensohler, A.; Birmili, W.; Nowak, A.; Sonntag, A.; Weinhold, K.; Merkel, M.; Wehner, B.; Tuch, T.; Pfeifer, S.; Fiebig, M.; Fjäraa, A. M.; Asmi, E.; Sellegri, K.; Depuy, R.; Venzac, H.; Villani, P.; Laj, P.; Aalto, P.; Ogren, J. A.; Swietlicki, E.; Williams, P.; Roldin, P.; Quincey, P.; Hüglin, C.; Fierz-Schmidhauser, R.; Gysel, M.; Weingartner, E.; Riccobono, F.; Santos, S.; Grüning, C.; Faloon, K.; Beddows, D.; Harrison, R.; Monahan, C.; Jennings, S. G.; O'Dowd, C. D.; Marinoni, A.; Horn, H.-G.; Keck, L.; Jiang, J.; Scheckman, J.; McMurry, P. H.; Deng, Z.; Zhao, C. S.; Moerman, M.; Henzing, B.; de Leeuw, G.; Löschau, G.; Bastian, S.
2012-03-01
Mobility particle size spectrometers, often referred to as DMPS (Differential Mobility Particle Sizers) or SMPS (Scanning Mobility Particle Sizers), have found a wide range of applications in atmospheric aerosol research. However, comparability of measurements conducted world-wide is hampered by the lack of generally accepted technical standards and guidelines with respect to the instrumental set-up, measurement mode, data evaluation as well as quality control. Technical standards were developed for a minimum requirement of mobility size spectrometry to perform long-term atmospheric aerosol measurements. Technical recommendations include continuous monitoring of flow rates, temperature, pressure, and relative humidity for the sheath and sample air in the differential mobility analyzer. We compared commercial and custom-made inversion routines to calculate the particle number size distributions from the measured electrical mobility distribution. All inversion routines are comparable within a few per cent uncertainty for a given set of raw data. Furthermore, this work summarizes the results from several instrument intercomparison workshops conducted within the European infrastructure project EUSAAR (European Supersites for Atmospheric Aerosol Research) and ACTRIS (Aerosols, Clouds, and Trace gases Research InfraStructure Network) to determine present uncertainties, especially of custom-built mobility particle size spectrometers. Under controlled laboratory conditions, the particle number size distributions from 20 to 200 nm determined by mobility particle size spectrometers of different design are within an uncertainty range of around ±10% after correcting internal particle losses, while below and above this size range the discrepancies increased. For particles larger than 200 nm, the uncertainty range increased to 30%, which could not be explained.
The network reference mobility spectrometers with identical design agreed within ±4% in the peak particle number concentration when all settings were done carefully. The consistency of these reference instruments to the total particle number concentration was demonstrated to be less than 5%. Additionally, a new data structure for particle number size distributions was introduced to store and disseminate the data at EMEP (European Monitoring and Evaluation Program). This structure contains three levels: raw data, processed data, and final particle size distributions. Importantly, we recommend reporting raw measurements including all relevant instrument parameters as well as a complete documentation on all data transformation and correction steps. These technical and data structure standards aim to enhance the quality of long-term size distribution measurements, their comparability between different networks and sites, and their transparency and traceability back to raw data.
A Bayesian nonparametric method for prediction in EST analysis
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-01-01
Background Expressed sequence tags (ESTs) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445
Structure and coarsening at the surface of a dry three-dimensional aqueous foam.
Roth, A E; Chen, B G; Durian, D J
2013-12-01
We utilize total-internal reflection to isolate the two-dimensional surface foam formed at the planar boundary of a three-dimensional sample. The resulting images of surface Plateau borders are consistent with Plateau's laws for a truly two-dimensional foam. Samples are allowed to coarsen into a self-similar scaling state where statistical distributions appear independent of time, except for an overall scale factor. There we find that statistical measures of side number distributions, size-topology correlations, and bubble shapes are all very similar to those for two-dimensional foams. However, the size number distribution is slightly broader, and the shapes are slightly more elongated. A more obvious difference is that T2 processes now include the creation of surface bubbles, due to rearrangement in the bulk, and von Neumann's law is dramatically violated for individual bubbles. Nevertheless, our most striking finding is that von Neumann's law appears to hold on average, namely, the average rate of area change for surface bubbles appears to be proportional to the number of sides minus six, but with individual bubbles showing a wide distribution of deviations from this average behavior.
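Written as equations, the contrast drawn in this abstract is between the exact law for an ideal two-dimensional foam and its averaged counterpart at the surface. The notation is an assumption introduced here: A is the area of a bubble with n sides, K_0 and K are positive rate constants, and the angle brackets denote an average over all surface bubbles with the same n.

```latex
% von Neumann's law for each bubble of an ideal two-dimensional dry foam
\frac{dA}{dt} \;=\; K_0\,(n - 6)

% Averaged form reported for surface bubbles of a three-dimensional foam
\left\langle \frac{dA}{dt} \right\rangle_{n} \;\propto\; (n - 6),
\qquad\text{i.e.}\qquad
\left\langle \frac{dA}{dt} \right\rangle_{n} \;=\; K\,(n - 6)
```

In both forms, bubbles with fewer than six sides shrink on average and those with more than six grow; the difference is that at the surface the relation holds only for the mean, with individual bubbles scattered widely around it.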
NASA Astrophysics Data System (ADS)
Royalty, T. M.; Phillips, B.; Dawson, K. W.; Reed, R. E.; Meskhidze, N.
2016-12-01
We report aerosol number size distribution and hygroscopicity data collected over the Pacific Ocean near the Hawaii Ocean Timeseries (HOT) Station ALOHA (centered near 22°N, 158°W). From June 25 to July 3, 2016, our hygroscopicity tandem differential mobility analyzer (HTDMA)/scanning mobility particle sizer (SMPS) system was deployed onboard the NOAA Ship Hi'ialakai, which participated in mooring operations associated with the Woods Hole Oceanographic Institution WHOTS project. The ambient aerosol data were collected during the ship's planned operations. The inlet was located at the bow of the ship and the air samples were drawn (using 3/8 inch stainless steel tubing) inside a dry, air-conditioned lab. The region north of Oahu was very clean, with total particle number approximately 200 cm⁻³, occasionally dropping below 100 cm⁻³. We compare our particle number size distribution and hygroscopicity data with previously reported estimates. Our measurements contribute to process-level understanding of the role of sea spray aerosol in the marine boundary layer cloud condensation nuclei (CCN) budget and provide crucial information to the community interested in studying and projecting climate change using Earth System Models.
Paquet, Victor; Joseph, Caroline; D'Souza, Clive
2012-01-01
Anthropometric studies typically require a large number of individuals that are selected in a manner so that demographic characteristics that impact body size and function are proportionally representative of a user population. This sampling approach does not allow for an efficient characterization of the distribution of body sizes and functions of sub-groups within a population and the demographic characteristics of user populations can often change with time, limiting the application of the anthropometric data in design. The objective of this study is to demonstrate how demographically representative user populations can be developed from samples that are not proportionally representative in order to improve the application of anthropometric data in design. An engineering anthropometry problem of door width and clear floor space width is used to illustrate the value of the approach.
Laboratory evaluation of the Sequoia Scientific LISST-ABS acoustic backscatter sediment sensor
Snazelle, Teri T.
2017-12-18
Sequoia Scientific’s LISST-ABS is an acoustic backscatter sensor designed to measure suspended-sediment concentration at a point source. Three LISST-ABS were evaluated at the U.S. Geological Survey (USGS) Hydrologic Instrumentation Facility (HIF). Serial numbers 6010, 6039, and 6058 were assessed for accuracy in solutions with varying particle-size distributions and for the effect of temperature on sensor accuracy. Certified sediment samples composed of different ranges of particle size were purchased from Powder Technology Inc. These sediment samples were 30–80-micron (µm) Arizona Test Dust; less than 22-µm ISO 12103-1, A1 Ultrafine Test Dust; and 149-µm MIL-STD 810E Silica Dust. The sensor was able to accurately measure suspended-sediment concentration when calibrated with sediment of the same particle-size distribution as the measured. Overall testing demonstrated that sensors calibrated with finer sized sediments overdetect sediment concentrations with coarser sized sediments, and sensors calibrated with coarser sized sediments do not detect increases in sediment concentrations from small and fine sediments. These test results are not unexpected for an acoustic-backscatter device and stress the need for using accurate site-specific particle-size distributions during sensor calibration. When calibrated for ultrafine dust with a less than 22-µm particle size (silt) and with the Arizona Test Dust with a 30–80-µm range, the data from sensor 6039 were biased high when fractions of the coarser (149-µm) Silica Dust were added. Data from sensor 6058 showed similar results with an elevated response to coarser material when calibrated with a finer particle-size distribution and a lack of detection when subjected to finer particle-size sediment. Sensor 6010 was also tested for the effect of dissimilar particle size during the calibration and showed little effect. 
Subsequent testing revealed problems with this sensor, including inadequate temperature compensation, making these data questionable. The sensor was replaced by Sequoia Scientific with serial number 6039. Results from the extended temperature testing showed proper temperature compensation for sensor 6039, and results from the dissimilar calibration/testing particle-size distribution closely corroborated the results from sensor 6058.
Physical properties of the WAIS Divide ice core
Fitzpatrick, Joan J.; Voigt, Donald E.; Fegyveresi, John M.; Stevens, Nathan T.; Spencer, Matthew K.; Cole-Dai, Jihong; Alley, Richard B.; Jardine, Gabriella E.; Cravens, Eric; Wilen, Lawrence A.; Fudge, T. J.; McConnell, Joseph R.
2014-01-01
The WAIS (West Antarctic Ice Sheet) Divide deep ice core was recently completed to a total depth of 3405 m, ending ∼50 m above the bed. Investigation of the visual stratigraphy and grain characteristics indicates that the ice column at the drilling location is undisturbed by any large-scale overturning or discontinuity. The climate record developed from this core is therefore likely to be continuous and robust. Measured grain-growth rates, recrystallization characteristics, and grain-size response at climate transitions fit within current understanding. Significant impurity control on grain size is indicated from correlation analysis between impurity loading and grain size. Bubble-number densities and bubble sizes and shapes are presented through the full extent of the bubbly ice. Where bubble elongation is observed, the direction of elongation is preferentially parallel to the trace of the basal (0001) plane. Preferred crystallographic orientation of grains is present in the shallowest samples measured, and increases with depth, progressing to a vertical-girdle pattern that tightens to a vertical single-maximum fabric. This single-maximum fabric switches into multiple maxima as the grain size increases rapidly in the deepest, warmest ice. A strong dependence of the fabric on the impurity-mediated grain size is apparent in the deepest samples.
NASA Technical Reports Server (NTRS)
Heath, Christopher M.
2012-01-01
An isokinetic dilution probe has been designed with the aid of computational fluid dynamics to sample sub-micron particles emitted from aviation combustion sources. The intended operational range includes standard day atmospheric conditions up to 40,000-ft. With dry nitrogen as the diluent, the probe is intended to minimize losses from particle microphysics and transport while rapidly quenching chemical kinetics. Initial results indicate that the Mach number ratio of the aerosol sample and dilution streams in the mixing region is an important factor for successful operation. Flow rate through the probe tip was found to be highly sensitive to the static pressure at the probe exit. Particle losses through the system were estimated to be on the order of 50% with minimal change in the overall particle size distribution apparent. Following design refinement, experimental testing and validation will be conducted in the Particle Aerosol Laboratory, a research facility located at the NASA Glenn Research Center to study the evolution of aviation emissions at lower stratospheric conditions. Particle size distributions and number densities from various combustion sources will be used to better understand particle-phase microphysics, plume chemistry, evolution to cirrus, and environmental impacts of aviation.
Molecular weight, polydispersity, and spectroscopic properties of aquatic humic substances
Chin, Y.-P.; Aiken, G.; O'Loughlin, E.
1994-01-01
The number- and weight-averaged molecular weights of a number of aquatic fulvic acids, a commercial humic acid, and unfractionated organic matter from four natural water samples were measured by high-pressure size exclusion chromatography (HPSEC). Molecular weights determined in this manner compared favorably with those values reported in the literature. Both recent literature values and our data indicate that these substances are smaller and less polydisperse than previously believed. Moreover, the molecular weights of the organic matter from three of the four natural water samples compared favorably to the fulvic acid samples extracted from similar environments. Bulk spectroscopic properties of the fulvic substances such as molar absorptivity at 280 nm and the E4/E6 ratio were also measured. A strong correlation was observed between molar absorptivity, total aromaticity, and the weight average molecular weights of all the humic substances. This observation suggests that bulk spectroscopic properties can be used to quickly estimate the size of humic substances and their aromatic contents. Both parameters are important with respect to understanding humic substance mobility and their propensity to react with both organic and inorganic pollutants. © 1994 American Chemical Society.
NASA Astrophysics Data System (ADS)
Gopi, K. R.; Nayaka, H. Shivananda; Sahu, Sandeep
2016-09-01
Magnesium alloy Mg-Al-Mn (AM70) was processed by equal channel angular pressing (ECAP) at 275 °C for up to 4 passes in order to produce ultrafine-grained microstructure and improve its mechanical properties. ECAP-processed samples were characterized for microstructural analysis using optical microscopy, scanning electron microscopy, and transmission electron microscopy. Microstructural analysis showed that, with an increase in the number of ECAP passes, grains refined and grain size reduced from an average of 45 to 1 µm. Electron backscatter diffraction analysis showed the transition from low angle grain boundaries to high angle grain boundaries in the ECAP 4-pass sample as compared to the as-cast sample. The strength and hardness values showed an increasing trend for the initial 2 passes of ECAP processing and then started decreasing with further increase in the number of ECAP passes, even though the grain size continued to decrease in all the successive ECAP passes. However, the strength and hardness values still remained quite high when compared to the initial condition. This behavior was found to be correlated with texture modification in the material as a result of ECAP processing.
Ozay, Guner; Seyhan, Ferda; Yilmaz, Aysun; Whitaker, Thomas B; Slate, Andrew B; Giesbrecht, Francis
2006-01-01
The variability associated with the aflatoxin test procedure used to estimate aflatoxin levels in bulk shipments of hazelnuts was investigated. Sixteen 10 kg samples of shelled hazelnuts were taken from each of 20 lots that were suspected of aflatoxin contamination. The total variance associated with testing shelled hazelnuts was estimated and partitioned into sampling, sample preparation, and analytical variance components. Each variance component increased as aflatoxin concentration (either B1 or total) increased. With the use of regression analysis, mathematical expressions were developed to model the relationship between aflatoxin concentration and the total, sampling, sample preparation, and analytical variances. The expressions for these relationships were used to estimate the variance for any sample size, subsample size, and number of analyses for a specific aflatoxin concentration. The sampling, sample preparation, and analytical variances associated with estimating aflatoxin in a hazelnut lot at a total aflatoxin level of 10 ng/g and using a 10 kg sample, a 50 g subsample, dry comminution with a Robot Coupe mill, and a high-performance liquid chromatographic analytical method are 174.40, 0.74, and 0.27, respectively. The sampling, sample preparation, and analytical steps of the aflatoxin test procedure accounted for 99.4, 0.4, and 0.2% of the total variability, respectively.
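The percentage split quoted in the preceding abstract follows directly from its three variance components. A minimal sketch in Python, using only the values reported there (the variable names are illustrative and not taken from the study):

```python
# Variance components reported in the abstract for a lot at 10 ng/g total
# aflatoxin, a 10 kg sample, a 50 g subsample and one HPLC analysis.
components = {
    "sampling": 174.40,
    "sample preparation": 0.74,
    "analytical": 0.27,
}

# The total variance is the sum of the three components.
total_variance = sum(components.values())

# Share of the total variability contributed by each step, in percent.
shares = {step: 100.0 * var / total_variance for step, var in components.items()}

for step, pct in shares.items():
    print(f"{step}: {pct:.1f}% of total variance")
```

Sampling dominates the total, which is why the rounded shares reproduce the 99.4, 0.4, and 0.2% figures in the abstract.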
Design of an occulter testbed at flight Fresnel numbers
NASA Astrophysics Data System (ADS)
Sirbu, Dan; Kasdin, N. Jeremy; Kim, Yunjong; Vanderbei, Robert J.
2015-01-01
An external occulter is a spacecraft flown along the line-of-sight of a space telescope to suppress starlight and enable high-contrast direct imaging of exoplanets. Laboratory verification of occulter designs is necessary to validate the optical models used to design and predict occulter performance. At Princeton, we are designing and building a testbed that allows verification of scaled occulter designs whose suppressed shadow is mathematically identical to that of space occulters. Here, we present a sample design that operates at a flight Fresnel number and is thus representative of a realistic space mission. We present calculations of experimental limits arising from the finite size and propagation distance available in the testbed, limitations due to manufacturing feature size, and non-ideal input beam. We demonstrate how the testbed is designed to be feature-size limited, and provide an estimation of the expected performance.
Porosity of the Marcellus Shale: A contrast matching small-angle neutron scattering study
Bahadur, Jitendra; Ruppert, Leslie F.; Pipich, Vitaliy; Sakurovs, Richard; Melnichenko, Yuri B.
2018-01-01
Neutron scattering techniques were used to determine the effect of mineral matter on the accessibility of water and toluene to pores in the Devonian Marcellus Shale. Three Marcellus Shale samples, representing quartz-rich, clay-rich, and carbonate-rich facies, were examined using contrast matching small-angle neutron scattering (CM-SANS) at ambient pressure and temperature. Contrast matching compositions of H2O, D2O and toluene, deuterated toluene were used to probe open and closed pores of these three shale samples. Results show that although the mean pore radius was approximately the same for all three samples, the fractal dimension of the quartz-rich sample was higher than for the clay-rich and carbonate-rich samples, indicating different pore size distributions among the samples. The number density of pores was highest in the clay-rich sample and lowest in the quartz-rich sample. Contrast matching with water and toluene mixtures shows that the accessibility of pores to water and toluene also varied among the samples. In general, water accessed approximately 70–80% of the larger pores (>80 nm radius) in all three samples. At smaller pore sizes (~5–80 nm radius), the fraction of accessible pores decreases. The lowest accessibility to both fluids is at pore throat size of ~25 nm radii with the quartz-rich sample exhibiting lower accessibility than the clay- and carbonate-rich samples. The mechanism for this behaviour is unclear, but because the mineralogy of the three samples varies, it is likely that the inaccessible pores in this size range are associated with organics and not a specific mineral within the samples. At even smaller pore sizes (~<2.5 nm radius), in all samples, the fraction of accessible pores to water increases again to approximately 70–80%. 
Accessibility to toluene generally follows that of water; however, in the smallest pores (~<2.5 nm radius), accessibility to toluene decreases, especially in the clay-rich sample, which contains about 30% more closed pores than the quartz- and carbonate-rich samples. Results from this study show that mineralogy of producing intervals within a shale reservoir can affect accessibility of pores to water and toluene, and these mineralogic differences may affect hydrocarbon storage and production and hydraulic fracturing characteristics.
Two-sample binary phase 2 trials with low type I error and low sample size
Litwin, Samuel; Basickes, Stanley; Ross, Eric A.
2017-01-01
Summary: We address the design of two-stage clinical trials comparing experimental and control patients. Our end-point is success or failure, however measured, with null hypothesis that the chance of success in both arms is p0 and alternative that it is p0 among controls and p1 > p0 among experimental patients. Standard rules will have the null hypothesis rejected when the number of successes in the (E)xperimental arm, E, sufficiently exceeds C, that among (C)ontrols. Here, we combine one-sample rejection decision rules, E ≥ m, with two-sample rules of the form E – C > r to achieve two-sample tests with low sample number and low type I error. We find designs with sample numbers not far from the minimum possible using standard two-sample rules, but with type I error of 5% rather than 15% or 20% associated with them, and of equal power. This level of type I error is achieved locally, near the stated null, and increases to 15% or 20% when the null is significantly higher than specified. We increase the attractiveness of these designs to patients by using 2:1 randomization. Examples of the application of this new design covering both high and low success rates under the null hypothesis are provided. PMID:28118686
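The combined decision rule in the preceding abstract (reject when E ≥ m and E – C > r) can be evaluated exactly under independent binomial arms. The sketch below is a simplified single-stage illustration, not the authors' two-stage design. The arm sizes, success rates, and thresholds in the usage example are hypothetical values chosen only to show the calculation.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n_exp, n_ctl, p_exp, p_ctl, m, r):
    """Exact probability of rejecting the null under the combined rule.

    Rejection requires both E >= m (one-sample rule) and E - C > r
    (two-sample rule), where E ~ Binomial(n_exp, p_exp) and
    C ~ Binomial(n_ctl, p_ctl) are independent.
    """
    total = 0.0
    for e in range(max(m, 0), n_exp + 1):
        prob_e = binom_pmf(e, n_exp, p_exp)
        for c in range(n_ctl + 1):
            if e - c > r:
                total += prob_e * binom_pmf(c, n_ctl, p_ctl)
    return total


# Hypothetical design: 2:1 randomization, p0 = 0.2, p1 = 0.4.
type_i_error = rejection_probability(40, 20, 0.2, 0.2, m=12, r=3)
power = rejection_probability(40, 20, 0.4, 0.2, m=12, r=3)
print(f"type I error = {type_i_error:.4f}, power = {power:.4f}")
```

Calling the function with a control success rate above the assumed p0 in both arms shows how the type I error changes when the null is misspecified, the effect described in the abstract.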
Elahi, Fanny M; Marx, Gabe; Cobigo, Yann; Staffaroni, Adam M; Kornak, John; Tosun, Duygu; Boxer, Adam L; Kramer, Joel H; Miller, Bruce L; Rosen, Howard J
2017-01-01
Degradation of white matter microstructure has been demonstrated in frontotemporal lobar degeneration (FTLD) and Alzheimer's disease (AD). In preparation for clinical trials, ongoing studies are investigating the utility of longitudinal brain imaging for quantification of disease progression. To date only one study has examined sample size calculations based on longitudinal changes in white matter integrity in FTLD. To quantify longitudinal changes in white matter microstructural integrity in the three canonical subtypes of frontotemporal dementia (FTD) and AD using diffusion tensor imaging (DTI). 60 patients with clinical diagnoses of FTD, including 27 with behavioral variant frontotemporal dementia (bvFTD), 14 with non-fluent variant primary progressive aphasia (nfvPPA), and 19 with semantic variant PPA (svPPA), as well as 19 patients with AD and 69 healthy controls were studied. We used a voxel-wise approach to calculate annual rate of change in fractional anisotropy (FA) and mean diffusivity (MD) in each group using two time points approximately one year apart. Mean rates of change in FA and MD in 48 atlas-based regions-of-interest, as well as global measures of cognitive function were used to calculate sample sizes for clinical trials (80% power, alpha of 5%). All FTD groups showed statistically significant baseline and longitudinal white matter degeneration, with predominant involvement of frontal tracts in the bvFTD group, frontal and temporal tracts in the PPA groups and posterior tracts in the AD group. Longitudinal change in MD yielded a larger number of regions with sample sizes below 100 participants per therapeutic arm in comparison with FA. SvPPA had the smallest sample size based on change in MD in the fornix (n = 41 participants per study arm to detect a 40% effect of drug), and nfvPPA and AD had their smallest sample sizes based on rate of change in MD within the left superior longitudinal fasciculus (n = 49 for nfvPPA, and n = 23 for AD). 
BvFTD generally showed the largest sample size estimates (minimum n = 140 based on MD in the corpus callosum). The corpus callosum appeared to be the best region for a potential study that would include all FTD subtypes. Change in global measure of functional status (CDR box score) yielded the smallest sample size for bvFTD (n = 71), but clinical measures were inferior to white matter change for the other groups. All three of the canonical subtypes of FTD are associated with significant change in white matter integrity over one year. These changes are consistent enough that drug effects in future clinical trials could be detected with relatively small numbers of participants. While there are some differences in regions of change across groups, the genu of the corpus callosum is a region that could be used to track progression in studies that include all subtypes.
NASA Astrophysics Data System (ADS)
Wiedensohler, A.; Birmili, W.; Nowak, A.; Sonntag, A.; Weinhold, K.; Merkel, M.; Wehner, B.; Tuch, T.; Pfeifer, S.; Fiebig, M.; Fjäraa, A. M.; Asmi, E.; Sellegri, K.; Depuy, R.; Venzac, H.; Villani, P.; Laj, P.; Aalto, P.; Ogren, J. A.; Swietlicki, E.; Roldin, P.; Williams, P.; Quincey, P.; Hüglin, C.; Fierz-Schmidhauser, R.; Gysel, M.; Weingartner, E.; Riccobono, F.; Santos, S.; Grüning, C.; Faloon, K.; Beddows, D.; Harrison, R. M.; Monahan, C.; Jennings, S. G.; O'Dowd, C. D.; Marinoni, A.; Horn, H.-G.; Keck, L.; Jiang, J.; Scheckman, J.; McMurry, P. H.; Deng, Z.; Zhao, C. S.; Moerman, M.; Henzing, B.; de Leeuw, G.
2010-12-01
Particle mobility size spectrometers often referred to as DMPS (Differential Mobility Particle Sizers) or SMPS (Scanning Mobility Particle Sizers) have found a wide application in atmospheric aerosol research. However, comparability of measurements conducted world-wide is hampered by lack of generally accepted technical standards with respect to the instrumental set-up, measurement mode, data evaluation as well as quality control. This article results from several instrument intercomparison workshops conducted within the European infrastructure project EUSAAR (European Supersites for Atmospheric Aerosol Research). Under controlled laboratory conditions, the number size distribution from 20 to 200 nm determined by mobility size spectrometers of different design are within an uncertainty range of ±10% after correcting internal particle losses, while below and above this size range the discrepancies increased. Instruments with identical design agreed within ±3% in the peak number concentration when all settings were done carefully. Technical standards were developed for a minimum requirement of mobility size spectrometry for atmospheric aerosol measurements. Technical recommendations are given for atmospheric measurements including continuous monitoring of flow rates, temperature, pressure, and relative humidity for the sheath and sample air in the differential mobility analyser. In cooperation with EMEP (European Monitoring and Evaluation Program), a new uniform data structure was introduced for saving and disseminating the data within EMEP. This structure contains three levels: raw data, processed data, and final particle size distributions. Importantly, we recommend reporting raw measurements including all relevant instrument parameters as well as a complete documentation on all data transformation and correction steps. 
These technical and data structure standards aim to enhance the quality of long-term size distribution measurements, their comparability between different networks and sites, and their transparency and traceability back to raw data.
Puls, Robert W.; Eychaner, James H.; Powell, Robert M.
1996-01-01
Investigations at Pinal Creek, Arizona, evaluated routine sampling procedures for determination of aqueous inorganic geochemistry and assessment of contaminant transport by colloidal mobility. Sampling variables included pump type and flow rate, collection under air or nitrogen, and filter pore diameter. During well purging and sample collection, suspended particle size and number as well as dissolved oxygen, temperature, specific conductance, pH, and redox potential were monitored. Laboratory analyses of both unfiltered samples and the filtrates were performed by inductively coupled argon plasma, atomic absorption with graphite furnace, and ion chromatography. Scanning electron microscopy with Energy Dispersive X-ray was also used for analysis of filter particulates. Suspended particle counts consistently required approximately twice as long as the other field-monitored indicators to stabilize. High-flow-rate pumps entrained normally nonmobile particles. Differences in elemental concentrations using different filter-pore sizes were generally not large with only two wells having differences greater than 10 percent in most wells. Similar differences (>10%) were observed for some wells when samples were collected under nitrogen rather than in air. Fe2+/Fe3+ ratios for air-collected samples were smaller than for samples collected under a nitrogen atmosphere, reflecting sampling-induced oxidation.
Dependence of Some Properties of Groups on Group Local Number Density
NASA Astrophysics Data System (ADS)
Deng, Xin-Fa; Wu, Ping
2014-09-01
In this study we investigate the dependence of projected size Size_sky, rms deviation σ_R of projected distance in the sky from the group center, rms velocities σ_V, and virial radius R_Vir of groups on group local number density. In the volume-limited group samples, it is found that groups in high density regions preferentially have larger Size_sky, σ_R, σ_V, and R_Vir than ones in low density regions.
Yeshaya, J; Shalgi, R; Shohat, M; Avivi, L
1999-01-01
X-chromosome inactivation and the size of the CGG repeat number are assumed to play a role in the clinical, physical, and behavioral phenotype of female carriers of a mutated FMR1 allele. In view of the tight relationship between replication timing and the expression of a given DNA sequence, we have examined the replication timing of FMR1 alleles on active and inactive X-chromosomes in cell samples (lymphocytes or amniocytes) of 25 females: 17 heterozygous for a mutated FMR1 allele with a trinucleotide repeat number varying from 58 to a few hundred, and eight homozygous for a wild-type allele. We have applied two-color fluorescence in situ hybridization (FISH) with FMR1 and X-chromosome alpha-satellite probes to interphase cells of the various genotypes: the alpha-satellite probe was used to distinguish between early replicating (active) and late replicating (inactive) X-chromosomes, and the FMR1 probe revealed the replication pattern of this locus. All samples, except one with a large trinucleotide expansion, showed an early replicating FMR1 allele on the active X-chromosome and a late replicating allele on the inactive X-chromosome. In samples of mutation carriers, both the early and the late alleles showed delayed replication compared with normal alleles, regardless of repeat size. We conclude therefore that: (1) the FMR1 locus is subjected to X-inactivation; (2) mutated FMR1 alleles, regardless of repeat size, replicate later than wild-type alleles on both the active and inactive X-chromosomes; and (3) the delaying effect of the trinucleotide expansion, even with a low repeat size, is superimposed on the delay in replication associated with X-inactivation.
Double asymptotics for the chi-square statistic.
Rempała, Grzegorz A; Wesołowski, Jacek
2016-12-01
Consider the distributional limit of the Pearson chi-square statistic when the number of classes m_n increases with the sample size n and [Formula: see text]. Under mild moment conditions, the limit is Gaussian for λ = ∞, Poisson for finite λ > 0, and degenerate for λ = 0.
David J. Nowak
1994-01-01
Urban forests are complex ecosystems created by the interaction of anthropogenic and natural processes. One key to better management of these systems is to understand urban forest structure and its relationship to forest functions. Through sampling and inventories, urban foresters often obtain structural information (e.g., numbers, location, size, and condition) on...
Ecologists are often faced with the problem of small sample size, a large number of correlated predictors, and high noise-to-signal relationships. This necessitates excluding important variables from the model when applying standard multiple or multivariate regression analyses. In ...
Cueva Del Castillo, R
2015-04-01
Body size is directly or indirectly correlated with fitness. Body size, which conveys maximal fitness, often differs between sexes. Sexual size dimorphism (SSD) evolves because body size tends to be related to reproductive success through different pathways in males and females. In general, female insects are larger than males, suggesting that natural selection for high female fecundity could be stronger than sexual selection in males. I assessed the role of body size and fecundity in SSD in the Neotropical cricket Macroanaxipha macilenta (Saussure). This species shows a SSD bias toward males. Females did not present a correlation between number of eggs and body size. Nonetheless, there were fluctuations in the number of eggs carried by females during the sampling period, and the size of females that were collected carrying eggs was larger than that of females collected with no eggs. Since mating induces vitellogenesis in some cricket species, differences in female body size might suggest male mate choice. Sexual selection in the body size of males of M. macilenta may possibly be stronger than the selection of female fecundity. Even so, no mating behavior was observed during the field observations, including audible male calling or courtship songs, yet males may produce ultrasonic calls due to their size. If female body size in M. macilenta is not directly related to fecundity, the lack of a correlated response to selection on female body size could represent an alternate evolutionary pathway in the evolution of body size and SSD in insects.
Accounting for randomness in measurement and sampling in studying cancer cell population dynamics.
Ghavami, Siavash; Wolkenhauer, Olaf; Lahouti, Farshad; Ullah, Mukhtar; Linnebacher, Michael
2014-10-01
Knowing the expected temporal evolution of the proportion of different cell types in sample tissues gives an indication about the progression of the disease and its possible response to drugs. Such systems have been modelled using Markov processes. We here consider an experimentally realistic scenario in which transition probabilities are estimated from noisy cell population size measurements. Using aggregated data of FACS measurements, we develop MMSE and ML estimators and formulate two problems to find the minimum number of required samples and measurements to guarantee the accuracy of predicted population sizes. Our numerical results show that the convergence mechanism of transition probabilities and steady states differ widely from the real values if one uses the standard deterministic approach for noisy measurements. This provides support for our argument that for the analysis of FACS data one should consider the observed state as a random variable. The second problem we address is about the consequences of estimating the probability of a cell being in a particular state from measurements of small population of cells. We show how the uncertainty arising from small sample sizes can be captured by a distribution for the state probability.
Kabaluk, J Todd; Binns, Michael R; Vernon, Robert S
2006-06-01
Counts of green peach aphid, Myzus persicae (Sulzer) (Hemiptera: Aphididae), in potato, Solanum tuberosum L., fields were used to evaluate the performance of the sampling plan from a pest management company. The counts were further used to develop a binomial sampling method, and both full count and binomial plans were evaluated using operating characteristic curves. Taylor's power law provided a good fit of the data (r^2 = 0.95), with the relationship between the variance (s^2) and mean (m) as ln(s^2) = 1.81 (± 0.02) + 1.55 (± 0.01) ln(m). A binomial sampling method was developed using the empirical model ln(m) = c + d ln(-ln(1 - P(T))), to which the data fit well for tally numbers (T) of 0, 1, 3, 5, 7, and 10. Although T = 3 was considered the most reasonable given its operating characteristics and presumed ease of classification above or below critical densities (i.e., action thresholds) of one and 10 M. persicae per leaf, the full count method is shown to be superior. The mean number of sample sites per field visit by the pest management company was 42 ± 19, with more than one-half (54%) of the field visits involving sampling 31-50 sample sites, which was acceptable in the context of operating characteristic curves for a critical density of 10 M. persicae per leaf. Based on operating characteristics, actual sample sizes used by the pest management company can be reduced by at least 50%, on average, for a critical density of 10 M. persicae per leaf. For a critical density of one M. persicae per leaf used to avert the spread of potato leaf roll virus, sample sizes from 50 to 100 were considered more suitable.
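The two fitted relationships in the preceding abstract can be written as small functions. The sketch below uses the Taylor's power law coefficients reported there (1.81 and 1.55). The abstract does not report the coefficients c and d of the binomial model, so the values used in the usage example are hypothetical placeholders chosen only to show how the model is inverted.

```python
import math

# Taylor's power law coefficients reported in the abstract:
# ln(s^2) = 1.81 + 1.55 * ln(m)
TAYLOR_INTERCEPT = 1.81
TAYLOR_SLOPE = 1.55


def taylor_variance(mean_density):
    """Variance predicted by Taylor's power law for a mean count per leaf."""
    return math.exp(TAYLOR_INTERCEPT + TAYLOR_SLOPE * math.log(mean_density))


def mean_from_proportion(p, c, d):
    """Empirical binomial model: ln(m) = c + d * ln(-ln(1 - P(T)))."""
    return math.exp(c + d * math.log(-math.log(1.0 - p)))


def proportion_from_mean(mean_density, c, d):
    """Inverse of the model above: the P(T) that corresponds to a mean."""
    return 1.0 - math.exp(-math.exp((math.log(mean_density) - c) / d))


# Variance at the two critical densities named in the abstract.
for density in (1.0, 10.0):
    print(f"m = {density}: predicted s^2 = {taylor_variance(density):.2f}")

# Hypothetical c and d, for illustration only (not from the study).
c_example, d_example = 1.0, 1.2
p = proportion_from_mean(3.0, c_example, d_example)
print(f"P(T) for m = 3: {p:.3f}")
```

Because the slope is greater than 1, the predicted variance grows faster than the mean, which is the usual signature of aggregated counts.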
Intrafamilial clustering of anti-ATLA-positive persons.
Kajiyama, W; Kashiwagi, S; Hayashi, J; Nomura, H; Ikematsu, H; Okochi, K
1986-11-01
A total of 1,333 persons in 627 families were surveyed for presence of antibody to adult T-cell leukemia-associated antigen (anti-ATLA). Each person was classified according to the anti-ATLA status (positive for sample 1, negative for sample 2) of the head of household of his or her family. In sample 1, the sex- and age-standardized prevalence of anti-ATLA was 38.5%. This was five times as high as the standardized prevalence in sample 2 (7.8%). There were significant differences in prevalence of anti-ATLA between males in samples 1 and 2 and between females in samples 1 and 2. In every age group, prevalence in sample 1 was greater than that in sample 2 except for males aged 60-69 years. In each of four subareas, families in sample 1 had higher standardized prevalence (29.6-42.5%) than families in sample 2 (6.0-9.7%). Although crude prevalence decreased with family size in sample 1 (62.1-25.4%) as well as in sample 2, indirectly standardized prevalence was almost equal within each sample, regardless of number of family members. The degree of aggregation was independent of locality and family size. These data suggest that anti-ATLA-positive persons aggregate in family units.
A two-stage Monte Carlo approach to the expression of uncertainty with finite sample sizes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crowder, Stephen Vernon; Moyer, Robert D.
2005-05-01
Proposed supplement I to the GUM outlines a 'propagation of distributions' approach to deriving the distribution of a measurand for any non-linear function and for any set of random inputs. The supplement's proposed Monte Carlo approach assumes that the distributions of the random inputs are known exactly. This implies that the sample sizes are effectively infinite. In this case, the mean of the measurand can be determined precisely using a large number of Monte Carlo simulations. In practice, however, the distributions of the inputs will rarely be known exactly, but must be estimated using possibly small samples. If these approximated distributions are treated as exact, the uncertainty in estimating the mean is not properly taken into account. In this paper, we propose a two-stage Monte Carlo procedure that explicitly takes into account the finite sample sizes used to estimate parameters of the input distributions. We will illustrate the approach with a case study involving the efficiency of a thermistor mount power sensor. The performance of the proposed approach will be compared to the standard GUM approach for finite samples using simple non-linear measurement equations. We will investigate performance in terms of coverage probabilities of derived confidence intervals.
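One way to picture a two-stage procedure of the kind described in the preceding abstract is sketched below. It is not the authors' algorithm. It assumes normally distributed inputs, draws plausible input parameters from their standard sampling distributions in the outer stage, and propagates them through a hypothetical measurement equation in the inner stage. The function names, the measurement equation, and the data are all illustrative assumptions.

```python
import random
import statistics


def measurand(x1, x2):
    """Hypothetical non-linear measurement equation (illustration only)."""
    return x1 * x2**2


def draw_parameters(sample, rng):
    """Stage 1: draw a plausible (mean, sd) pair for one normal input.

    Assumes sigma^2 ~ (n - 1) s^2 / chi2(n - 1) and mu ~ N(xbar, sigma^2 / n),
    so that a small sample yields widely varying parameter draws.
    """
    n = len(sample)
    xbar = statistics.mean(sample)
    s2 = statistics.variance(sample)
    chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)  # chi-square with n - 1 dof
    sigma2 = (n - 1) * s2 / chi2
    mu = rng.gauss(xbar, (sigma2 / n) ** 0.5)
    return mu, sigma2**0.5


def two_stage_means(samples, n_outer, n_inner, seed=0):
    """Return one Monte Carlo estimate of the measurand mean per outer draw."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_outer):
        params = [draw_parameters(s, rng) for s in samples]  # stage 1
        ys = [
            measurand(*(rng.gauss(mu, sd) for mu, sd in params))  # stage 2
            for _ in range(n_inner)
        ]
        means.append(statistics.mean(ys))
    return means


def percentile_interval(values, level=0.95):
    """Simple percentile interval from the sorted outer-stage results."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lo, hi


# Hypothetical small samples for two inputs.
x1_sample = [2.01, 1.98, 2.03, 1.99, 2.02]
x2_sample = [10.1, 9.9, 10.0, 10.2, 9.8]
means = two_stage_means([x1_sample, x2_sample], n_outer=200, n_inner=200)
print(percentile_interval(means))
```

The spread of the outer-stage means reflects how uncertain the input parameters are given only five observations each, which is the contribution a single-stage simulation with fixed parameters would omit.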
Do online social media cut through the constraints that limit the size of offline social networks?
Dunbar, R. I. M.
2016-01-01
The social brain hypothesis has suggested that natural social network sizes may have a characteristic size in humans. This is determined in part by cognitive constraints and in part by the time costs of servicing relationships. Online social networking offers the potential to break through the glass ceiling imposed by at least the second of these, potentially enabling us to maintain much larger social networks. This is tested using two separate UK surveys, each randomly stratified by age, gender and regional population size. The data show that the size and range of online egocentric social networks, indexed as the number of Facebook friends, is similar to that of offline face-to-face networks. For one sample, respondents also specified the number of individuals in the inner layers of their network (formally identified as support clique and sympathy group), and these were also similar in size to those observed in offline networks. This suggests that, as originally proposed by the social brain hypothesis, there is a cognitive constraint on the size of social networks that even the communication advantages of online media are unable to overcome. In practical terms, it may reflect the fact that real (as opposed to casual) relationships require at least occasional face-to-face interaction to maintain them. PMID:26909163
Genome-Wide Chromosomal Targets of Oncogenic Transcription Factors
2008-04-01
axis. (a) Comparison between STAGE and ChIP-chip when the same sample was analyzed by both methods. The gray line indicates all predicted STAGE targets...numbers of single-hit tags (Y-axis) were plotted against the frequencies of those tags in the random (gray bars) and experimental (black bars) tag...size of 500 bp gave an optimal separation between random and real data. Data shown is for a window size of 500 bp. The gray bars indicate log10 of the
Development of a Multiple-Stage Differential Mobility Analyzer (MDMA)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Da-Ren; Cheng, Mengdawn
2007-01-01
A new DMA column has been designed with the capability of simultaneously extracting monodisperse particles of different sizes in multiple stages. We call this design a multistage DMA, or MDMA. A prototype MDMA has been constructed and experimentally evaluated in this study. The new column enables the fast measurement of particles in a wide size range, while preserving the powerful particle classification function of a DMA. The prototype MDMA has three sampling stages, capable of classifying monodisperse particles of three different sizes simultaneously. The scanning voltage operation of a DMA can be applied to this new column. Each stage of the MDMA column covers a fraction of the entire particle size range to be measured. The size fractions covered by two adjacent stages of the MDMA are designed to overlap somewhat. This arrangement reduces the scanning voltage range and thus the cycling time of the measurement. The modular sampling stage design of the MDMA allows flexible configuration of the desired particle classification lengths and a variable number of stages in the MDMA. The design of our MDMA also permits operation at high sheath flow, enabling high-resolution particle size measurement and/or reduction of the lower sizing limit. Using the tandem DMA technique, the performance of the MDMA, i.e., sizing accuracy, resolution, and transmission efficiency, was evaluated at different ratios of aerosol and sheath flowrates. Two aerosol sampling schemes were investigated. One was to extract aerosol flows at an evenly partitioned flowrate at each stage, and the other was to extract aerosol at a rate the same as the polydisperse aerosol flowrate at each stage. We detail the prototype design of the MDMA and the evaluation result on the transfer functions of the MDMA at different particle sizes and operational conditions.
Size effect on atomic structure in low-dimensional Cu-Zr amorphous systems.
Zhang, W B; Liu, J; Lu, S H; Zhang, H; Wang, H; Wang, X D; Cao, Q P; Zhang, D X; Jiang, J Z
2017-08-04
The size effect on atomic structure of a Cu64Zr36 amorphous system, including zero-dimensional small-size amorphous particles (SSAPs) and two-dimensional small-size amorphous films (SSAFs) together with a bulk sample, was investigated by molecular dynamics simulations. We revealed that sample size strongly affects local atomic structure in both Cu64Zr36 SSAPs and SSAFs, which are composed of core and shell (surface) components. Compared with the core component, the shell component of SSAPs has lower average coordination number and average bond length, higher degree of ordering, and lower packing density due to the segregation of Cu atoms on the shell of Cu64Zr36 SSAPs. These atomic structure differences in SSAPs with various sizes result in different glass transition temperatures, in which the glass transition temperature for the shell component is found to be 577 K, which is much lower than 910 K for the core component. We further extended the size effect on the structure and glass transition temperature to Cu64Zr36 SSAFs, and revealed that the Tg decreases as SSAFs become thinner due to the following factors: different dynamic motion (mean square displacement), different density of core and surface, and Cu segregation on the surface of SSAFs. The results obtained here differ from the results for the size effect on atomic structure of nanometer-sized crystalline metallic alloys.
The change of family size and structure in China.
1992-04-01
With socioeconomic development and changes in people's values, family size and structure in China have changed significantly. According to the 10% sample data from the 4th Census, the average family has 3.97 persons, 0.44 fewer than in the 3rd Census; among all types of families, 1-generation families account for 13.5%, 3-generation families for 18.5%, and 2-generation families for 68%. Instead of large families consisting of several generations and many members, small families have now become a principal family type in China. According to the analysis of the sample data from the 4th Census, family size is mainly determined by the fertility level in particular regions, and it also depends on economic development. Family size is therefore usually smaller in more developed regions, such as Beijing, Tianjin, Zhejiang, Liaoning, and Shanghai, where family size is only 3.08 persons; it is generally larger in less developed regions such as Qinghai, Guangxi, Gansu, Xinjiang, and Tibet, where family size is as large as 5.13 persons. Specialists regard the increase in the number of families as one of the major consequences of economic development, changes in living style, and improved living standards. Young people are now more inclined to live separately from their parents. However, the increase in the number of families will undoubtedly place more pressure on housing and require more furniture and other durable consumer goods from the market. Therefore, the government and other related social sectors should make corresponding plans and policies to cope with the increase in families and the shrinking of family size so as to promote family planning and socioeconomic development, and to create better social circumstances for small families.
Using meta-analysis to inform the design of subsequent studies of diagnostic test accuracy.
Hinchliffe, Sally R; Crowther, Michael J; Phillips, Robert S; Sutton, Alex J
2013-06-01
An individual diagnostic accuracy study rarely provides enough information to make conclusive recommendations about the accuracy of a diagnostic test, particularly when the study is small. Meta-analysis methods provide a way of combining information from multiple studies, reducing uncertainty in the result and hopefully providing substantial evidence to underpin reliable clinical decision-making. Very few investigators consider any sample size calculations when designing a new diagnostic accuracy study. However, it is important to consider the number of subjects in a new study in order to achieve a precise measure of accuracy. Sutton et al. have suggested previously that when designing a new therapeutic trial, it could be more beneficial to consider the power of the updated meta-analysis including the new trial rather than of the new trial itself. The methodology involves simulating new studies for a range of sample sizes and estimating the power of the updated meta-analysis with each new study added. Plotting the power values against the range of sample sizes allows the clinician to make an informed decision about the sample size of a new trial. This paper extends this approach from the trial setting and applies it to diagnostic accuracy studies. Several meta-analytic models are considered including bivariate random effects meta-analysis that models the correlation between sensitivity and specificity. Copyright © 2012 John Wiley & Sons, Ltd.
Daaboul, George G; Lopez, Carlos A; Chinnala, Jyothsna; Goldberg, Bennett B; Connor, John H; Ünlü, M Selim
2014-06-24
Rapid, sensitive, and direct label-free capture and characterization of nanoparticles from complex media such as blood or serum will broadly impact medicine and the life sciences. We demonstrate identification of virus particles in complex samples for replication-competent wild-type vesicular stomatitis virus (VSV), defective VSV, and Ebola- and Marburg-pseudotyped VSV with high sensitivity and specificity. Size discrimination of the imaged nanoparticles (virions) allows differentiation between modified viruses having different genome lengths and facilitates a reduction in the counting of nonspecifically bound particles to achieve a limit-of-detection (LOD) of 5 × 10³ pfu/mL for the Ebola and Marburg VSV pseudotypes. We demonstrate the simultaneous detection of multiple viruses in a single sample (composed of serum or whole blood) for screening applications and uncompromised detection capabilities in samples contaminated with high levels of bacteria. By employing affinity-based capture, size discrimination, and a "digital" detection scheme to count single virus particles, we show that a robust and sensitive virus/nanoparticle sensing assay can be established for targets in complex samples. The nanoparticle microscopy system is termed the Single Particle Interferometric Reflectance Imaging Sensor (SP-IRIS) and is capable of high-throughput and rapid sizing of large numbers of biological nanoparticles on an antibody microarray for research and diagnostic applications.
Hall, Damien
2010-03-15
Observations of the motion of individual molecules in the membrane of a number of different cell types have led to the suggestion that the outer membrane of many eukaryotic cells may be effectively partitioned into microdomains. A major cause of this suggested partitioning is believed to be due to the direct/indirect association of the cytosolic face of the cell membrane with the cortical cytoskeleton. Such intimate association is thought to introduce effective hydrodynamic barriers into the membrane that are capable of frustrating molecular Brownian motion over distance scales greater than the average size of the compartment. To date, the standard analytical method for deducing compartment characteristics has relied on observing the random walk behavior of a labeled lipid or protein at various temporal frequencies and different total lengths of time. Simple theoretical arguments suggest that the presence of restrictive barriers imparts a characteristic turnover to a plot of mean squared displacement versus sampling period that can be interpreted to yield the average dimensions of the compartment expressed as the respective side lengths of a rectangle. In the following series of articles, we used computer simulation methods to investigate how well the conventional analytical strategy coped with heterogeneity in size, shape, and barrier permeability of the cell membrane compartments. We also explored questions relating to the necessary extent of sampling required (with regard to both the recorded time of a single trajectory and the number of trajectories included in the measurement bin) for faithful representation of the actual distribution of compartment sizes found using the SPT technique. In the current investigation, we turned our attention to the analytical characterization of diffusion through cell membrane compartments having both a uniform size and permeability. 
For this ideal case, we found that (i) an optimum sampling time interval existed for the analysis and (ii) the total length of time for which a trajectory was recorded was a key factor. Copyright (c) 2009 Elsevier Inc. All rights reserved.
Efficient Sample Tracking With OpenLabFramework
List, Markus; Schmidt, Steffen; Trojnar, Jakub; Thomas, Jochen; Thomassen, Mads; Kruse, Torben A.; Tan, Qihua; Baumbach, Jan; Mollenhauer, Jan
2014-01-01
The advance of new technologies in biomedical research has led to a dramatic growth in experimental throughput. Projects therefore steadily grow in size and involve a larger number of researchers. Spreadsheets traditionally used are thus no longer suitable for keeping track of the vast amounts of samples created and need to be replaced with state-of-the-art laboratory information management systems. Such systems have been developed in large numbers, but they are often limited to specific research domains and types of data. One domain so far neglected is the management of libraries of vector clones and genetically engineered cell lines. OpenLabFramework is a newly developed web-application for sample tracking, particularly laid out to fill this gap, but with an open architecture allowing it to be extended for other biological materials and functional data. Its sample tracking mechanism is fully customizable and aids productivity further through support for mobile devices and barcoded labels. PMID:24589879
The relationship between offspring size and fitness: integrating theory and empiricism.
Rollinson, Njal; Hutchings, Jeffrey A
2013-02-01
How parents divide the energy available for reproduction between size and number of offspring has a profound effect on parental reproductive success. Theory indicates that the relationship between offspring size and offspring fitness is of fundamental importance to the evolution of parental reproductive strategies: this relationship predicts the optimal division of resources between size and number of offspring, it describes the fitness consequences for parents that deviate from optimality, and its shape can predict the most viable type of investment strategy in a given environment (e.g., conservative vs. diversified bet-hedging). Many previous attempts to estimate this relationship and the corresponding value of optimal offspring size have been frustrated by a lack of integration between theory and empiricism. In the present study, we draw from C. Smith and S. Fretwell's classic model to explain how a sound estimate of the offspring size–fitness relationship can be derived with empirical data. We evaluate what measures of fitness can be used to model the offspring size–fitness curve and optimal size, as well as which statistical models should and should not be used to estimate offspring size–fitness relationships. To construct the fitness curve, we recommend that offspring fitness be measured as survival up to the age at which the instantaneous rate of offspring mortality becomes random with respect to initial investment. Parental fitness is then expressed in ecologically meaningful, theoretically defensible, and broadly comparable units: the number of offspring surviving to independence. Although logistic and asymptotic regression have been widely used to estimate offspring size–fitness relationships, the former provides relatively unreliable estimates of optimal size when offspring survival and sample sizes are low, and the latter is unreliable under all conditions. 
We recommend that the Weibull-1 model be used to estimate this curve because it provides modest improvements in prediction accuracy under experimentally relevant conditions.
A Complete Analytical Screening Identifies the Real Pesticide Contamination of Surface Waters
NASA Astrophysics Data System (ADS)
Moschet, Christoph; Wittmer, Irene; Simovic, Jelena; Junghans, Marion; Singer, Heinz; Stamm, Christian; Leu, Christian; Hollender, Juliane
2014-05-01
A comprehensive assessment of pesticides in surface waters is challenging due to the large number of potential contaminants. In Switzerland, for example, roughly 500 active ingredients are registered as either plant protection agent (PPA) or as biocide. In addition, an unlimited number of transformation products (TPs) can enter or be formed in surface waters. Most scientific publications or regulatory monitoring authorities have implemented 15-40 pesticides in their analytics. Only a few TPs are normally included. Interpretations of the surface water quality based on these subsets remain error prone. In the presented study, we carried out a nearly complete analytical screening covering 86% of all polar organic pesticides (from agricultural and urban sources) in Switzerland (300 substances) and 134 TPs with limits of quantification in the low ng/L range. The comprehensive pesticide screening was conducted by liquid chromatography coupled to high-resolution tandem mass spectrometry. Five medium-sized rivers (Strahler stream order 3-4, catchment size 35-105 km²), containing high percentiles of diverse crops, orchards and urban settlements in their catchments, were sampled from March till July 2012. Nine subsequent time-proportional bi-weekly composite samples were taken in order to quantify average concentrations. In total, 104 different active ingredients could be detected in at least one of the five rivers. Of these, 82 substances were only registered as PPA, 20 were registered as PPA and as biocide and 2 were only registered as biocide. Within the PPAs, herbicides had the most frequent detections and the highest concentrations, followed by fungicides and insecticides. Most concentrations were found between 1 and 50 ng/L; however, 31 substances (mainly herbicides) had concentrations above 100 ng/L and 3 herbicides above 1000 ng/L. 
It has to be noted that the measured concentrations are average concentrations over two weeks in medium-sized streams and that maximum concentrations, especially in smaller streams, can be much higher. In each sample, between 30 and 50 pesticides were detected and the concentration sum of all active ingredients exceeded 1000 ng/L in 78% of the samples. Forty of the 134 investigated TPs could be detected in all five rivers. As for the active ingredients, herbicide TPs dominated the detection frequency and the concentration range. Twelve TPs exceeded 100 ng/L in at least one sample. Between 15 and 25 TPs were detected in each sample, and 35% of all samples had a concentration sum of more than 1000 ng/L. The comparison of the measured concentrations of the parent compounds with chronic environmental quality standards (AA-EQS) revealed that 70% of all surface water samples exceeded at least one of them; in some samples up to seven AA-EQS exceedances were observed. In total, 19 substances (mainly herbicides and insecticides) exceeded critical concentrations in at least one sample. The study showed that the investigated medium-sized rivers were exposed to a large number of pesticides and TPs over the whole sampling period. For a correct assessment of the surface water quality, it is therefore crucial to measure as many pesticides as possible in order to capture the real pesticide contamination of surface waters.
Shahbi, M; Rajabpour, A
2017-08-01
Phthorimaea operculella Zeller is an important pest of potato in Iran. Spatial distribution and fixed-precision sequential sampling for population estimation of the pest on two potato cultivars, Arinda® and Sante®, were studied in two separate potato fields during two growing seasons (2013-2014 and 2014-2015). Spatial distribution was investigated by Taylor's power law and Iwao's patchiness. Results showed that the spatial distribution of eggs and larvae was random. In contrast to Iwao's patchiness, Taylor's power law provided a highly significant relationship between variance and mean density. Therefore, a fixed-precision sequential sampling plan was developed by Green's model at two precision levels of 0.25 and 0.1. The optimum sample size on Arinda® and Sante® cultivars at precision level of 0.25 ranged from 151 to 813 and 149 to 802 leaves, respectively. At 0.1 precision level, the sample sizes varied from 5083 to 1054 and 5100 to 1050 leaves for Arinda® and Sante® cultivars, respectively. Therefore, the optimum sample sizes for the cultivars, with different resistance levels, were not significantly different. According to the calculated stop lines, the sampling must be continued until the cumulative number of eggs + larvae reaches 15-16 or 96-101 individuals at precision levels of 0.25 or 0.1, respectively. The performance of the sampling plan was validated by resampling analysis using resampling for validation of sampling plans software. The sampling plan provided in this study can be used to obtain a rapid estimate of the pest density with minimal effort.
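The relationship between Taylor's power law and Green's fixed-precision stop line mentioned in this abstract can be sketched as follows. The sketch assumes the usual textbook forms: variance s² = a·mᵇ, precision D defined as standard error divided by mean, and a cumulative-count stop line T(n) = (D²/a)^(1/(b−2)) · n^((b−1)/(b−2)). The function names and any coefficient values passed in are hypothetical, not taken from the study.

```python
def required_sample_size(mean_density, a, b, precision):
    """Sample size giving the requested precision (SE/mean) when the
    variance follows Taylor's power law, s^2 = a * mean^b."""
    return a * mean_density ** (b - 2) / precision ** 2


def green_stop_line(n, a, b, precision):
    """Cumulative count at which sampling can stop after n sample units
    (Green's fixed-precision plan; requires b != 2)."""
    return (precision ** 2 / a) ** (1 / (b - 2)) * n ** ((b - 1) / (b - 2))
```

For a random distribution (a and b both close to 1) the stop line reduces to about 1/D² regardless of n, that is 16 individuals at D = 0.25 and 100 at D = 0.1, which is in line with the 15-16 and 96-101 individuals reported above.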
Herath, Samantha; Yap, Elaine
2018-02-01
In diagnosing peripheral pulmonary lesions (PPL), radial endobronchial ultrasound (R-EBUS) is emerging as a safer method in comparison to CT-guided biopsy. Despite the better safety profile, the yield of R-EBUS remains lower (73%) than CT-guided biopsy (90%) due to the smaller size of samples. We adopted a hybrid method by adding cryobiopsy via the R-EBUS Guide Sheath (GS) to produce larger, non-crushed samples to improve diagnostic capability and enhance molecular testing. We report six prospective patients who underwent this procedure in our institution. R-EBUS samples were obtained via conventional sampling methods (needle aspiration, forceps biopsy, and cytology brush), followed by a cryobiopsy. An endobronchial blocker was placed near the planned area of biopsy in advance and inflated post-biopsy to minimize the risk of bleeding in all patients. A chest X-ray was performed 1 h post-procedure. All the PPLs were visualized with R-EBUS. The mean diameter of cryobiopsy samples was twice the size of forceps biopsy samples. In four patients, cryobiopsy samples were superior in size and in the number of malignant cells per high-power field and were the preferred samples selected for mutation analysis and molecular testing. There was no pneumothorax or significant bleeding to report. Cryobiopsy samples were consistently larger and were the preferred samples for molecular testing, with an increase in the diagnostic yield and reduction in the need for repeat procedures, without hindering the marked safety profile of R-EBUS. Using an endobronchial blocker improves the safety of this procedure.
WHERE ARE THE LOW-MASS POPULATION III STARS?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ishiyama, Tomoaki; Sudo, Kae; Yokoi, Shingo
2016-07-20
We study the number and the distribution of low-mass Population III (Pop III) stars in the Milky Way. In our numerical model, hierarchical formation of dark matter minihalos and Milky-Way-sized halos are followed by a high-resolution cosmological simulation. We model the Pop III formation in H₂ cooling minihalos without metal under UV radiation of the Lyman–Werner bands. Assuming a Kroupa initial mass function (IMF) from 0.15 to 1.0 M⊙ for low-mass Pop III stars, as a working hypothesis, we try to constrain the theoretical models in reverse by current and future observations. We find that the survivors tend to concentrate on the center of halo and subhalos. We also evaluate the observability of Pop III survivors in the Milky Way and dwarf galaxies, and constraints on the number of Pop III survivors per minihalo. The higher latitude fields require lower sample sizes because of the high number density of stars in the galactic disk; the required sample sizes are comparable in the high- and middle-latitude fields by photometrically selecting low-metallicity stars with optimized narrow-band filters; and the required number of dwarf galaxies to find one Pop III survivor is less than 10 at <100 kpc for the tip of red giant stars. Provided that available observations have not detected any survivors, the formation models of low-mass Pop III stars with more than 10 stars per minihalo are already excluded. Furthermore, we discuss the way to constrain the IMF of Pop III stars at a high mass range of ≳10 M⊙.
Visual search by chimpanzees (Pan): assessment of controlling relations.
Tomonaga, M
1995-01-01
Three experimentally sophisticated chimpanzees (Pan), Akira, Chloe, and Ai, were trained on visual search performance using a modified multiple-alternative matching-to-sample task in which a sample stimulus was followed by the search display containing one target identical to the sample and several uniform distractors (i.e., negative comparison stimuli were identical to each other). After they acquired this task, they were tested for transfer of visual search performance to trials in which the sample was not followed by the uniform search display (odd-item search). Akira showed positive transfer of visual search performance to odd-item search even when the display size (the number of stimulus items in the search display) was small, whereas Chloe and Ai showed a transfer only when the display size was large. Chloe and Ai used some nonrelational cues such as perceptual isolation of the target among uniform distractors (so-called pop-out). In addition to the odd-item search test, various types of probe trials were presented to clarify the controlling relations in multiple-alternative matching to sample. Akira showed a decrement of accuracy as a function of the display size when the search display was nonuniform (i.e., each "distractor" stimulus was not the same), whereas Chloe and Ai showed perfect performance. Furthermore, when the sample was identical to the uniform distractors in the search display, Chloe and Ai never selected an odd-item target, but Akira selected it when the display size was large. These results indicated that Akira's behavior was controlled mainly by relational cues of target-distractor oddity, whereas an identity relation between the sample and the target strongly controlled the performance of Chloe and Ai. PMID:7714449
Evolutionary Trends and the Salience Bias (with Apologies to Oil Tankers, Karl Marx, and Others).
ERIC Educational Resources Information Center
McShea, Daniel W.
1994-01-01
Examines evolutionary trends, specifically trends in size, complexity, and fitness. Notes that documentation of these trends consists of either long lists of cases, or descriptions of a small number of salient cases. Proposes the use of random samples to avoid this "saliency bias." (SR)
Response surface methodology, often supported by factorial designs, is the classical experimental approach that is widely accepted for detecting and characterizing interactions among chemicals in a mixture. In an effort to reduce the experimental effort as the number of compound...
American Samoa's forest resources, 2001.
Joseph A. Donnegan; Sheri S. Mann; Sarah L. Butler; Bruce A. Hiserote
2004-01-01
The Forest Inventory and Analysis Program of the Pacific Northwest Research Station collected, analyzed, and summarized data from field plots, and mapped land cover on four islands in American Samoa. This statistical sample provides estimates of forest area, stem volume, biomass, numbers of trees, damages to trees, and tree size distribution. The summary provides...
ERIC Educational Resources Information Center
Begeny, John C.; Krouse, Hailey E.; Brown, Kristina G.; Mann, Courtney M.
2011-01-01
Teacher judgments about students' academic abilities are important for instructional decision making and potential special education entitlement decisions. However, the small number of studies evaluating teachers' judgments are limited methodologically (e.g., sample size, procedural sophistication) and have yet to answer important questions…
Class Extraction and Classification Accuracy in Latent Class Models
ERIC Educational Resources Information Center
Wu, Qiong
2009-01-01
Despite the increasing popularity of latent class models (LCM) in educational research, methodological studies have not yet accumulated much information on the appropriate application of this modeling technique, especially with regard to requirement on sample size and number of indicators. This dissertation study represented an initial attempt to…
Hard choices in assessing survival past dams — a comparison of single- and paired-release strategies
Zydlewski, Joseph D.; Stich, Daniel S.; Sigourney, Douglas B.
2017-01-01
Mark–recapture models are widely used to estimate survival of salmon smolts migrating past dams. Paired releases have been used to improve estimate accuracy by removing components of mortality not attributable to the dam. This method is accompanied by reduced precision because (i) sample size is reduced relative to a single, large release; and (ii) variance calculations inflate error. We modeled an idealized system with a single dam to assess trade-offs between accuracy and precision and compared methods using root mean squared error (RMSE). Simulations were run under predefined conditions (dam mortality, background mortality, detection probability, and sample size) to determine scenarios when the paired release was preferable to a single release. We demonstrate that a paired-release design provides a theoretical advantage over a single-release design only at large sample sizes and high probabilities of detection. At release numbers typical of many survival studies, paired release can result in overestimation of dam survival. Failures to meet model assumptions of a paired release may result in further overestimation of dam-related survival. Under most conditions, a single-release strategy was preferable.
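The accuracy-versus-precision trade-off described in this abstract can be illustrated with a deliberately simplified simulation. This is a sketch under strong assumptions, not the authors' model: detection probability is treated as known, fates are binomial, the paired design splits the fish evenly between a release above the dam and a control release below it, and the function name is hypothetical.

```python
import numpy as np


def rmse_single_vs_paired(n_fish, s_dam, s_bg, p_detect, n_sim=5000, seed=1):
    """Return (RMSE single release, RMSE paired release) for estimating
    dam survival s_dam in an idealized one-dam system."""
    rng = np.random.default_rng(seed)

    # Single release: every fish passes the dam and the background reach,
    # so the estimate targets s_dam * s_bg and is biased low for s_dam.
    det_single = rng.binomial(n_fish, s_dam * s_bg * p_detect, size=n_sim)
    est_single = det_single / (n_fish * p_detect)

    # Paired release: half above the dam (treatment), half below (control).
    half = n_fish // 2
    det_treat = rng.binomial(half, s_dam * s_bg * p_detect, size=n_sim)
    det_ctrl = rng.binomial(half, s_bg * p_detect, size=n_sim)
    # The ratio removes background mortality; guard against zero controls.
    est_paired = det_treat / np.maximum(det_ctrl, 1)

    def rmse(est):
        return float(np.sqrt(np.mean((est - s_dam) ** 2)))

    return rmse(est_single), rmse(est_paired)
```

With a small release, low background mortality and modest detection, the noisier ratio estimator tends to lose to the single release; with a very large release the bias of the single release dominates and the paired design wins, which mirrors the qualitative conclusion of the abstract.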
In Situ Balloon-Borne Ice Particle Imaging in High-Latitude Cirrus
NASA Astrophysics Data System (ADS)
Kuhn, Thomas; Heymsfield, Andrew J.
2016-09-01
Cirrus clouds reflect incoming solar radiation, creating a cooling effect. At the same time, these clouds absorb the infrared radiation from the Earth, creating a greenhouse effect. The net effect, crucial for radiative transfer, depends on the cirrus microphysical properties, such as particle size distributions and particle shapes. Knowledge of these cloud properties is also needed for calibrating and validating passive and active remote sensors. Ice particles of sizes below 100 µm are inherently difficult to measure with aircraft-mounted probes due to issues with resolution, sizing, and size-dependent sampling volume. Furthermore, artefacts are produced by shattering of particles on the leading surfaces of the aircraft probes when particles several hundred microns or larger are present. Here, we report on a series of balloon-borne in situ measurements that were carried out at a high-latitude location, Kiruna in northern Sweden (68°N, 21°E). The method used here avoids these issues experienced with the aircraft probes. Furthermore, with a balloon-borne instrument, data are collected as vertical profiles, more useful for calibrating or evaluating remote sensing measurements than data collected along horizontal traverses. Particles are collected on an oil-coated film at a sampling speed given directly by the ascending rate of the balloon, 4 m s⁻¹. The collecting film is advanced uniformly inside the instrument so that an always unused section of the film is exposed to ice particles, which are measured by imaging shortly after sampling. The high optical resolution of about 4 µm together with a pixel resolution of 1.65 µm allows particle detection at sizes of 10 µm and larger. For particles that are 20 µm (12 pixels) in size or larger, the shape can be recognized. The sampling volume, 130 cm³ s⁻¹, is well defined and independent of particle size. 
With the encountered number concentrations of between 4 and 400 L⁻¹, this required sampling times of about 90 to 4 s to determine particle size distributions of cloud layers. Depending on how ice particles vary through the cloud, several layers per cloud with relatively uniform properties have been analysed. Preliminary results of the balloon campaign, targeting upper tropospheric, cold cirrus clouds, are presented here. Ice particles in these clouds were predominantly very small, with a median size of measured particles of around 50 µm and about 80 % of all particles below 100 µm in size. The properties of the particle size distributions at temperatures between -36 and -67 °C have been studied, as well as particle areas, extinction coefficients, and their shapes (area ratios). Gamma and log-normal distribution functions could be fitted to all measured particle size distributions, achieving very good correlation with coefficients R of up to 0.95. Each distribution features one distinct mode. With decreasing temperature, the mode diameter decreases exponentially, whereas the total number concentration increases by two orders of magnitude over the same temperature range. The high concentrations at cold temperatures also caused larger extinction coefficients, directly determined from cross-sectional areas of single ice particles, than at warmer temperatures. The mass of particles has been estimated from area and size. Ice water content (IWC) and effective diameters are then determined from the data. IWC varied only between 1 × 10⁻³ and 5 × 10⁻³ g m⁻³ at temperatures below -40 °C and did not show a clear temperature trend. These measurements are part of an ongoing study.
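The relation between sampling volume, number concentration, and sampling time quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the expected particle count is simply concentration times sampled volume, with the 130 cm3 per second rate, the 4 and 400 per litre concentrations, and the 1.65 µm pixel size taken from the abstract.

```python
# Sketch: expected number of particles collected for a fixed sampling volume rate.
# Assumption (not stated in the abstract): count = concentration * sampled volume.

SAMPLING_RATE_CM3_PER_S = 130.0  # sampling volume rate quoted in the abstract
CM3_PER_LITRE = 1000.0
PIXEL_SIZE_UM = 1.65             # pixel resolution quoted in the abstract


def expected_particle_count(concentration_per_litre, sampling_time_s,
                            rate_cm3_per_s=SAMPLING_RATE_CM3_PER_S):
    """Expected particle count for a given concentration and sampling time."""
    sampled_volume_litres = rate_cm3_per_s * sampling_time_s / CM3_PER_LITRE
    return concentration_per_litre * sampled_volume_litres


def pixels_across(particle_size_um, pixel_size_um=PIXEL_SIZE_UM):
    """Number of image pixels spanned by a particle of the given size."""
    return particle_size_um / pixel_size_um


if __name__ == "__main__":
    # The two extreme cases mentioned in the abstract.
    for concentration, seconds in [(4.0, 90.0), (400.0, 4.0)]:
        count = expected_particle_count(concentration, seconds)
        print(f"{concentration:6.1f} per litre for {seconds:4.1f} s: "
              f"about {count:.0f} particles")
    print(f"A 20 um particle spans about {pixels_across(20.0):.1f} pixels")
```

Under these assumptions, both extremes yield on the order of tens to a few hundred particles per layer, which is consistent with longer sampling times being needed at low concentrations.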
Shirazi, Mohammadali; Reddy Geedipally, Srinivas; Lord, Dominique
2017-01-01
Severity distribution functions (SDFs) are used in highway safety to estimate the severity of crashes and conduct different types of safety evaluations and analyses. Developing a new SDF is a difficult task and demands significant time and resources. To simplify the process, the Highway Safety Manual (HSM) has started to document SDF models for different types of facilities. As such, SDF models have recently been introduced for freeways and ramps in the HSM addendum. However, since these functions or models are fitted and validated using data from a small number of selected states, they need to be calibrated to local conditions when applied to a new jurisdiction. The HSM provides a methodology to calibrate the models through a scalar calibration factor. However, the proposed methodology to calibrate SDFs was never validated through research. Furthermore, there are no concrete guidelines for selecting a reliable sample size. Using extensive simulation, this paper documents an analysis that examined the bias between the 'true' and 'estimated' calibration factors. The results indicated that as the value of the true calibration factor deviates further from '1', more bias is observed between the 'true' and 'estimated' calibration factors. In addition, simulation studies were performed to determine the calibration sample size for various conditions. It was found that, as the average of the coefficient of variation (CV) of the 'KAB' and 'C' crashes increases, the analyst needs to collect a larger sample to calibrate SDF models. Taking this observation into account, sample-size guidelines are proposed based on the average CV of crash severities that are used for the calibration process. Copyright © 2016 Elsevier Ltd. All rights reserved.
Quality Evaluation of Potato Clones as Processed Material Cultivated in Lembang
NASA Astrophysics Data System (ADS)
Rahayu, S. T.; Handayani, T.; Levianny, P. S.
2017-03-01
Potatoes are widely grown in temperate as well as tropical zones and are the fourth largest staple crop in the world after maize, wheat and rice. The study aimed to evaluate the quality of several potato clones as raw material for potato-based products (chips and boiled potatoes). The study was conducted at the Indonesian Vegetable Research Institute, Lembang, at about 1200 m asl, in 2016. The design used was a randomized complete block design with three replications. The samples tested were five selected clones (clone numbers 1, 2, 3, 4 and 10). The varieties Granola (clone number 6) and Atlantic (clone number 7) were used as susceptible controls, while Katahdin (clone number 8) and SP 951 (clone number 9) were used as resistant controls. The chemical properties tested were starch, reducing sugar, water content, specific gravity, and total soluble solids (TSS). The organoleptic assessment was a hedonic test on a scale of 1-5 (from ‘very like’ to ‘very dislike’) carried out by 15 untrained panelists. Data were statistically analyzed by Duncan’s test (5%). Clones 1 and 2 were preferred by the panelists as raw material for potato chips, scoring ‘very like’ to ‘like’ for the color, size, taste, and texture parameters. Although there was no significant difference in the color and size parameters among the boiled potato samples, clone number 8 can be considered the favourite based on the taste and texture parameters.
Urban Land Cover Mapping Accuracy Assessment - A Cost-benefit Analysis Approach
NASA Astrophysics Data System (ADS)
Xiao, T.
2012-12-01
One of the most important components in urban land cover mapping is mapping accuracy assessment. Many statistical models have been developed to help design simple schemes based on both accuracy and confidence levels. It is intuitive that an increased number of samples increases the accuracy as well as the cost of an assessment. Understanding cost and sample size is crucial in implementing efficient and effective field data collection. Few studies have included a cost calculation component as part of the assessment. In this study, a cost-benefit sampling analysis model was created by combining sample size design and sampling cost calculation. The sampling cost included transportation cost, field data collection cost, and laboratory data analysis cost. Simple Random Sampling (SRS) and Modified Systematic Sampling (MSS) methods were used to design sample locations and to extract land cover data in ArcGIS. High-resolution land cover data layers of Denver, CO and Sacramento, CA, street networks, and parcel GIS data layers were used in this study to test and verify the model. The relationship between cost and accuracy was used to determine the effectiveness of each sampling method. The results of this study can be applied to other environmental studies that require spatial sampling.
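As a rough illustration of how sample size design and sampling cost can be combined, the sketch below pairs the common binomial sample size formula, n = z² p (1 - p) / d², with a linear per-sample cost. This is an assumption made for illustration and not necessarily the model used in the study; the function names and all cost figures are hypothetical.

```python
# Sketch: sample size for a target accuracy estimate plus a simple linear cost.
# Assumptions: binomial formula n = z^2 * p * (1 - p) / d^2 and a purely
# per-sample cost; neither is confirmed as the study's own model.
import math
from statistics import NormalDist


def required_sample_size(expected_accuracy, half_width, confidence):
    """Samples needed to estimate map accuracy within +/- half_width."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    n = z ** 2 * expected_accuracy * (1.0 - expected_accuracy) / half_width ** 2
    return math.ceil(n)


def total_cost(n_samples, transport_cost, field_cost, lab_cost):
    """Total cost when each sample incurs transport, field, and lab costs."""
    return n_samples * (transport_cost + field_cost + lab_cost)


if __name__ == "__main__":
    # Hypothetical scenario: 85 % expected accuracy, 95 % confidence.
    for half_width in (0.10, 0.05, 0.02):
        n = required_sample_size(0.85, half_width, 0.95)
        cost = total_cost(n, transport_cost=10.0, field_cost=5.0, lab_cost=2.0)
        print(f"half-width {half_width:.2f}: n = {n}, cost = {cost:.0f}")
```

Because n grows with the inverse square of the half-width, halving the tolerated error roughly quadruples the cost under this linear cost assumption.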
NASA Technical Reports Server (NTRS)
Tomberlin, T. J.
1985-01-01
Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
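The idea of allocating a sample across design stages under a simple cost function can be illustrated with the textbook two-stage result for estimating a mean. This is a simplified stand-in, not the report's regression-based method: it assumes the variance of the estimate is var_between / n + var_within / (n m) for n study areas with m residents each, and a cost of cost_per_area * n + cost_per_resident * n * m. All names and numbers are hypothetical.

```python
# Sketch: classic optimum allocation for a two-stage sample (mean estimation).
# Assumption: this textbook case stands in for the report's regression setting.
import math


def optimal_residents_per_area(cost_per_area, cost_per_resident,
                               var_between, var_within):
    """Residents per area that minimise variance for a fixed budget."""
    return math.sqrt((cost_per_area * var_within)
                     / (cost_per_resident * var_between))


def areas_for_budget(budget, cost_per_area, cost_per_resident,
                     residents_per_area):
    """Number of study areas affordable at the given residents per area."""
    return budget / (cost_per_area + cost_per_resident * residents_per_area)


def variance_of_mean(n_areas, residents_per_area, var_between, var_within):
    """Variance of the estimated mean under the two-stage design."""
    return (var_between / n_areas
            + var_within / (n_areas * residents_per_area))


if __name__ == "__main__":
    budget, c_area, c_res = 1400.0, 100.0, 4.0
    v_between, v_within = 1.0, 4.0
    m_opt = optimal_residents_per_area(c_area, c_res, v_between, v_within)
    for m in (5.0, m_opt, 20.0):
        n = areas_for_budget(budget, c_area, c_res, m)
        var = variance_of_mean(n, m, v_between, v_within)
        print(f"m = {m:5.1f}, n = {n:5.2f}, variance = {var:.4f}")
```

In this hypothetical case the optimum is about ten residents in each of ten areas; sampling either fewer or more residents per area at the same budget gives a larger variance.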
Microstructural development of cobalt ferrite ceramics and its influence on magnetic properties
NASA Astrophysics Data System (ADS)
Kim, Gi-Yeop; Jeon, Jae-Ho; Kim, Myong-Ho; Suvorov, Danilo; Choi, Si-Young
2013-11-01
The microstructural evolution of cobalt ferrite and its influence on magnetic properties were investigated. The cobalt ferrite powders were prepared via a solid-state reaction route and then sintered at 1200 °C for 1, 2, and 16 h in air. The microstructures of the sintered samples exhibited a bimodal distribution of grain size, which is associated with abnormal grain growth behavior. With increasing sintering time, the number and size of abnormal grains increased, while the growth of the matrix grains stagnated. In the sample sintered for 16 h, all of the matrix grains were consumed and the abnormal grains consequently impinged on each other. With the appearance of abnormal grains, the magnetic coercivity decreased significantly from 586.3 Oe (1 h sintered sample) to 168.3 Oe (16 h sintered sample). This is because the magnetization in abnormal grains is easily flipped. To achieve high magnetic coercivity in cobalt ferrite, it is thus imperative to fabricate a fine and homogeneous microstructure.
NASA Astrophysics Data System (ADS)
Manjili, Mohsen Hajipour; Halali, Mohammad
2018-02-01
Samples of INCONEL 718 were levitated and melted in a slag by the application of an electromagnetic field. The effects of temperature, time, and slag composition on the inclusion content of the samples were studied. Samples were compared with the original alloy to study the effect of the process on inclusions. The size, shape, and chemical composition of the remaining non-metallic inclusions were investigated. The samples were prepared according to the Standard Guide for Preparing and Evaluating Specimens for Automatic Inclusion Assessment of Steel (ASTM E 768-99), and the results were reported by means of the Standard Test Methods for Determining the Inclusion Content of Steel (ASTM E 45-97). Results indicated that by increasing temperature and processing time, a greater level of cleanliness could be achieved, and the number and size of the remaining inclusions decreased significantly. It was also observed that increasing the calcium fluoride content of the slag helped reduce the inclusion content.
Total Water Content Measurements with an Isokinetic Sampling Probe
NASA Technical Reports Server (NTRS)
Reehorst, Andrew L.; Miller, Dean R.; Bidwell, Colin S.
2010-01-01
The NASA Glenn Research Center has developed a Total Water Content (TWC) Isokinetic Sampling Probe. Since it is sensitive to neither the phase nor the size of cloud water particles, it is particularly attractive for supporting super-cooled large droplet and high ice water content aircraft icing studies. The instrument is comprised of the Sampling Probe, Sample Flow Control, and Water Vapor Measurement subsystems. Analysis and testing have been conducted on the subsystems to ensure their proper function and accuracy. End-to-end bench testing has also been conducted to ensure the reliability of the entire instrument system. A Stokes Number based collection efficiency correction was developed to correct for probe thickness effects. The authors further discuss the need to ensure that no condensation occurs within the instrument plumbing. Instrument measurements compared to facility calibrations from testing in the NASA Glenn Icing Research Tunnel are presented and discussed. There appear to be liquid water content and droplet size effects in the differences between the two measurement techniques.
The effect of membrane filtration on dissolved trace element concentrations
Horowitz, A.J.; Lum, K.R.; Garbarino, J.R.; Hall, G.E.M.; Lemieux, C.; Demas, C.R.
1996-01-01
The almost universally accepted operational definition for dissolved constituents is based on processing whole-water samples through a 0.45-µm membrane filter. Results from field and laboratory experiments indicate that a number of factors associated with filtration, other than just pore size (e.g., diameter, manufacturer, volume of sample processed, amount of suspended sediment in the sample), can produce substantial variations in the 'dissolved' concentrations of such elements as Fe, Al, Cu, Zn, Pb, Co, and Ni. These variations result from the inclusion/exclusion of colloidally associated trace elements. Thus, 'dissolved' concentrations quantitated by analyzing filtrates generated by processing whole-water through similar pore-sized membrane filters may not be equal/comparable. As such, simple filtration through a 0.45-µm membrane filter may no longer represent an acceptable operational definition for dissolved chemical constituents. This conclusion may have important implications for environmental studies and regulatory agencies.
Classification of urine sediment based on convolution neural network
NASA Astrophysics Data System (ADS)
Pan, Jingjing; Jiang, Cunbo; Zhu, Tiantian
2018-04-01
By designing a new convolutional neural network framework, this paper removes the constraints of the original convolutional neural network framework, which requires large training sets and samples of the same size. The input images are shifted and cropped to generate sub-images of the same size. Dropout is then applied to the generated sub-images, increasing the diversity of samples and preventing overfitting. Proper subsets of the sub-image set are selected at random such that each subset contains the same number of elements and no two subsets are identical. These proper subsets are used as inputs to the convolutional neural network. After the convolution, pooling, fully connected, and output layers, the classification loss rates on the test set and training set are obtained. In a classification experiment on red blood cells, white blood cells, and calcium oxalate crystals, the classification accuracy is 97% or more.
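The data preparation steps described in this abstract (shifting and cropping to equal-sized sub-images, then drawing equal-sized but distinct proper subsets) might be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' implementation: images are represented as lists of pixel rows, the function names and the stride parameter are hypothetical, and the dropout step and the network itself are omitted.

```python
# Sketch: shift-and-crop augmentation and distinct equal-sized subsets.
# Assumptions: images are lists of rows; names and stride are illustrative.
import math
import random


def shifted_crops(image, crop_height, crop_width, stride):
    """Slide a window over the image and return equal-sized sub-images."""
    height, width = len(image), len(image[0])
    crops = []
    for top in range(0, height - crop_height + 1, stride):
        for left in range(0, width - crop_width + 1, stride):
            rows = image[top:top + crop_height]
            crops.append([row[left:left + crop_width] for row in rows])
    return crops


def distinct_equal_subsets(n_items, subset_size, n_subsets, rng):
    """Draw different index subsets that all have the same size."""
    if not 0 < subset_size < n_items:
        raise ValueError("subset_size must give a proper, non-empty subset")
    if n_subsets > math.comb(n_items, subset_size):
        raise ValueError("not enough distinct subsets of that size")
    chosen = set()
    while len(chosen) < n_subsets:
        # frozenset ignores order, so duplicates are rejected automatically.
        chosen.add(frozenset(rng.sample(range(n_items), subset_size)))
    return [sorted(subset) for subset in chosen]


if __name__ == "__main__":
    # Toy 6 x 8 "image" whose pixel values encode their position.
    image = [[10 * r + c for c in range(8)] for r in range(6)]
    crops = shifted_crops(image, crop_height=4, crop_width=4, stride=2)
    subsets = distinct_equal_subsets(len(crops), 3, 4, random.Random(0))
    print(f"{len(crops)} sub-images, {len(subsets)} training subsets")
```

Each index subset selects which sub-images form one input set, so every set has the same number of elements while no two sets are identical.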