Sample records for proposed sample size

  1. Designing a two-rank acceptance sampling plan for quality inspection of geospatial data products

    NASA Astrophysics Data System (ADS)

    Tong, Xiaohua; Wang, Zhenhua; Xie, Huan; Liang, Dan; Jiang, Zuoqin; Li, Jinchao; Li, Jun

    2011-10-01

    To address the disadvantages of classical sampling plans designed for traditional industrial products, we originally propose a two-rank acceptance sampling plan (TRASP) for the inspection of geospatial data outputs based on the acceptance quality level (AQL). The first rank sampling plan is to inspect the lot consisting of map sheets, and the second is to inspect the lot consisting of features in an individual map sheet. The TRASP design is formulated as an optimization problem with respect to sample size and acceptance number, which covers two lot size cases. The first case is for a small lot size with nonconformities being modeled by a hypergeometric distribution function, and the second is for a larger lot size with nonconformities being modeled by a Poisson distribution function. The proposed TRASP is illustrated through two empirical case studies. Our analysis demonstrates that: (1) the proposed TRASP provides a general approach for quality inspection of geospatial data outputs consisting of non-uniform items and (2) the proposed acceptance sampling plan based on TRASP performs better than other classical sampling plans. It overcomes the drawbacks of percent sampling, i.e., "strictness for large lot size, toleration for small lot size," and those of a national standard used specifically for industrial outputs, i.e., "lots with different sizes corresponding to the same sampling plan."

  2. A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.

    PubMed

    Yu, Qingzhao; Zhu, Lin; Zhu, Han

    2017-11-01

    Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potentials to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently attribute newly recruited patients to different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence on design by changing the prior distributions. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when total sample size is fixed, the proposed design can obtain greater power and/or cost smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce required sample size. Copyright © 2017 John Wiley & Sons, Ltd.

  3. Approximate sample size formulas for the two-sample trimmed mean test with unequal variances.

    PubMed

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2007-05-01

    Yuen's two-sample trimmed mean test statistic is one of the most robust methods to apply when variances are heterogeneous. The present study develops formulas for the sample size required for the test. The formulas are applicable for the cases of unequal variances, non-normality and unequal sample sizes. Given the specified alpha and the power (1-beta), the minimum sample size needed by the proposed formulas under various conditions is less than is given by the conventional formulas. Moreover, given a specified size of sample calculated by the proposed formulas, simulation results show that Yuen's test can achieve statistical power which is generally superior to that of the approximate t test. A numerical example is provided.

  4. Optimum sample size allocation to minimize cost or maximize power for the two-sample trimmed mean test.

    PubMed

    Guo, Jiin-Huarng; Luh, Wei-Ming

    2009-05-01

    When planning a study, sample size determination is one of the most important tasks facing the researcher. The size will depend on the purpose of the study, the cost limitations, and the nature of the data. By specifying the standard deviation ratio and/or the sample size ratio, the present study considers the problem of heterogeneous variances and non-normality for Yuen's two-group test and develops sample size formulas to minimize the total cost or maximize the power of the test. For a given power, the sample size allocation ratio can be manipulated so that the proposed formulas can minimize the total cost, the total sample size, or the sum of total sample size and total cost. On the other hand, for a given total cost, the optimum sample size allocation ratio can maximize the statistical power of the test. After the sample size is determined, the present simulation applies Yuen's test to the sample generated, and then the procedure is validated in terms of Type I errors and power. Simulation results show that the proposed formulas can control Type I errors and achieve the desired power under the various conditions specified. Finally, the implications for determining sample sizes in experimental studies and future research are discussed.

  5. A Bayesian sequential design using alpha spending function to control type I error.

    PubMed

    Zhu, Han; Yu, Qingzhao

    2017-10-01

    We propose in this article a Bayesian sequential design using alpha spending functions to control the overall type I error in phase III clinical trials. We provide algorithms to calculate critical values, power, and sample sizes for the proposed design. Sensitivity analysis is implemented to check the effects from different prior distributions, and conservative priors are recommended. We compare the power and actual sample sizes of the proposed Bayesian sequential design with different alpha spending functions through simulations. We also compare the power of the proposed method with frequentist sequential design using the same alpha spending function. Simulations show that, at the same sample size, the proposed method provides larger power than the corresponding frequentist sequential design. It also has larger power than traditional Bayesian sequential design which sets equal critical values for all interim analyses. When compared with other alpha spending functions, O'Brien-Fleming alpha spending function has the largest power and is the most conservative in terms that at the same sample size, the null hypothesis is the least likely to be rejected at early stage of clinical trials. And finally, we show that adding a step of stop for futility in the Bayesian sequential design can reduce the overall type I error and reduce the actual sample sizes.

  6. Sample size considerations for paired experimental design with incomplete observations of continuous outcomes.

    PubMed

    Zhu, Hong; Xu, Xiaohan; Ahn, Chul

    2017-01-01

    Paired experimental design is widely used in clinical and health behavioral studies, where each study unit contributes a pair of observations. Investigators often encounter incomplete observations of paired outcomes in the data collected. Some study units contribute complete pairs of observations, while the others contribute either pre- or post-intervention observations. Statistical inference for paired experimental design with incomplete observations of continuous outcomes has been extensively studied in literature. However, sample size method for such study design is sparsely available. We derive a closed-form sample size formula based on the generalized estimating equation approach by treating the incomplete observations as missing data in a linear model. The proposed method properly accounts for the impact of mixed structure of observed data: a combination of paired and unpaired outcomes. The sample size formula is flexible to accommodate different missing patterns, magnitude of missingness, and correlation parameter values. We demonstrate that under complete observations, the proposed generalized estimating equation sample size estimate is the same as that based on the paired t-test. In the presence of missing data, the proposed method would lead to a more accurate sample size estimate comparing with the crude adjustment. Simulation studies are conducted to evaluate the finite-sample performance of the generalized estimating equation sample size formula. A real application example is presented for illustration.

  7. Sample size for post-marketing safety studies based on historical controls.

    PubMed

    Wu, Yu-te; Makuch, Robert W

    2010-08-01

    As part of a drug's entire life cycle, post-marketing studies are an important part in the identification of rare, serious adverse events. Recently, the US Food and Drug Administration (FDA) has begun to implement new post-marketing safety mandates as a consequence of increased emphasis on safety. The purpose of this research is to provide exact sample size formula for the proposed hybrid design, based on a two-group cohort study with incorporation of historical external data. Exact sample size formula based on the Poisson distribution is developed, because the detection of rare events is our outcome of interest. Performance of exact method is compared to its approximate large-sample theory counterpart. The proposed hybrid design requires a smaller sample size compared to the standard, two-group prospective study design. In addition, the exact method reduces the number of subjects required in the treatment group by up to 30% compared to the approximate method for the study scenarios examined. The proposed hybrid design satisfies the advantages and rationale of the two-group design with smaller sample sizes generally required. 2010 John Wiley & Sons, Ltd.

  8. A note on power and sample size calculations for the Kruskal-Wallis test for ordered categorical data.

    PubMed

    Fan, Chunpeng; Zhang, Donghui

    2012-01-01

    Although the Kruskal-Wallis test has been widely used to analyze ordered categorical data, power and sample size methods for this test have been investigated to a much lesser extent when the underlying multinomial distributions are unknown. This article generalizes the power and sample size procedures proposed by Fan et al. ( 2011 ) for continuous data to ordered categorical data, when estimates from a pilot study are used in the place of knowledge of the true underlying distribution. Simulations show that the proposed power and sample size formulas perform well. A myelin oligodendrocyte glycoprotein (MOG) induced experimental autoimmunce encephalomyelitis (EAE) mouse study is used to demonstrate the application of the methods.

  9. Sample size determination for logistic regression on a logit-normal distribution.

    PubMed

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination ([Formula: see text]) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for [Formula: see text] for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.

  10. Robust gene selection methods using weighting schemes for microarray data analysis.

    PubMed

    Kang, Suyeon; Song, Jongwoo

    2017-09-02

    A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.

  11. Design and analysis of three-arm trials with negative binomially distributed endpoints.

    PubMed

    Mütze, Tobias; Munk, Axel; Friede, Tim

    2016-02-20

    A three-arm clinical trial design with an experimental treatment, an active control, and a placebo control, commonly referred to as the gold standard design, enables testing of non-inferiority or superiority of the experimental treatment compared with the active control. In this paper, we propose methods for designing and analyzing three-arm trials with negative binomially distributed endpoints. In particular, we develop a Wald-type test with a restricted maximum-likelihood variance estimator for testing non-inferiority or superiority. For this test, sample size and power formulas as well as optimal sample size allocations will be derived. The performance of the proposed test will be assessed in an extensive simulation study with regard to type I error rate, power, sample size, and sample size allocation. For the purpose of comparison, Wald-type statistics with a sample variance estimator and an unrestricted maximum-likelihood estimator are included in the simulation study. We found that the proposed Wald-type test with a restricted variance estimator performed well across the considered scenarios and is therefore recommended for application in clinical trials. The methods proposed are motivated and illustrated by a recent clinical trial in multiple sclerosis. The R package ThreeArmedTrials, which implements the methods discussed in this paper, is available on CRAN. Copyright © 2015 John Wiley & Sons, Ltd.

  12. Inference and sample size calculation for clinical trials with incomplete observations of paired binary outcomes.

    PubMed

    Zhang, Song; Cao, Jing; Ahn, Chul

    2017-02-20

    We investigate the estimation of intervention effect and sample size determination for experiments where subjects are supposed to contribute paired binary outcomes with some incomplete observations. We propose a hybrid estimator to appropriately account for the mixed nature of observed data: paired outcomes from those who contribute complete pairs of observations and unpaired outcomes from those who contribute either pre-intervention or post-intervention outcomes. We theoretically prove that if incomplete data are evenly distributed between the pre-intervention and post-intervention periods, the proposed estimator will always be more efficient than the traditional estimator. A numerical research shows that when the distribution of incomplete data is unbalanced, the proposed estimator will be superior when there is moderate-to-strong positive within-subject correlation. We further derive a closed-form sample size formula to help researchers determine how many subjects need to be enrolled in such studies. Simulation results suggest that the calculated sample size maintains the empirical power and type I error under various design configurations. We demonstrate the proposed method using a real application example. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  13. A general approach for sample size calculation for the three-arm 'gold standard' non-inferiority design.

    PubMed

    Stucke, Kathrin; Kieser, Meinhard

    2012-12-10

    In the three-arm 'gold standard' non-inferiority design, an experimental treatment, an active reference, and a placebo are compared. This design is becoming increasingly popular, and it is, whenever feasible, recommended for use by regulatory guidelines. We provide a general method to calculate the required sample size for clinical trials performed in this design. As special cases, the situations of continuous, binary, and Poisson distributed outcomes are explored. Taking into account the correlation structure of the involved test statistics, the proposed approach leads to considerable savings in sample size as compared with application of ad hoc methods for all three scale levels. Furthermore, optimal sample size allocation ratios are determined that result in markedly smaller total sample sizes as compared with equal assignment. As optimal allocation makes the active treatment groups larger than the placebo group, implementation of the proposed approach is also desirable from an ethical viewpoint. Copyright © 2012 John Wiley & Sons, Ltd.

  14. Phylogenetic effective sample size.

    PubMed

    Bartoszek, Krzysztof

    2016-10-21

    In this paper I address the question-how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes-the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an already present concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Sample size and power calculations for detecting changes in malaria transmission using antibody seroconversion rate.

    PubMed

    Sepúlveda, Nuno; Paulino, Carlos Daniel; Drakeley, Chris

    2015-12-30

    Several studies have highlighted the use of serological data in detecting a reduction in malaria transmission intensity. These studies have typically used serology as an adjunct measure and no formal examination of sample size calculations for this approach has been conducted. A sample size calculator is proposed for cross-sectional surveys using data simulation from a reverse catalytic model assuming a reduction in seroconversion rate (SCR) at a given change point before sampling. This calculator is based on logistic approximations for the underlying power curves to detect a reduction in SCR in relation to the hypothesis of a stable SCR for the same data. Sample sizes are illustrated for a hypothetical cross-sectional survey from an African population assuming a known or unknown change point. Overall, data simulation demonstrates that power is strongly affected by assuming a known or unknown change point. Small sample sizes are sufficient to detect strong reductions in SCR, but invariantly lead to poor precision of estimates for current SCR. In this situation, sample size is better determined by controlling the precision of SCR estimates. Conversely larger sample sizes are required for detecting more subtle reductions in malaria transmission but those invariantly increase precision whilst reducing putative estimation bias. The proposed sample size calculator, although based on data simulation, shows promise of being easily applicable to a range of populations and survey types. Since the change point is a major source of uncertainty, obtaining or assuming prior information about this parameter might reduce both the sample size and the chance of generating biased SCR estimates.

  16. Species richness in soil bacterial communities: a proposed approach to overcome sample size bias.

    PubMed

    Youssef, Noha H; Elshahed, Mostafa S

    2008-09-01

    Estimates of species richness based on 16S rRNA gene clone libraries are increasingly utilized to gauge the level of bacterial diversity within various ecosystems. However, previous studies have indicated that regardless of the utilized approach, species richness estimates obtained are dependent on the size of the analyzed clone libraries. We here propose an approach to overcome sample size bias in species richness estimates in complex microbial communities. Parametric (Maximum likelihood-based and rarefaction curve-based) and non-parametric approaches were used to estimate species richness in a library of 13,001 near full-length 16S rRNA clones derived from soil, as well as in multiple subsets of the original library. Species richness estimates obtained increased with the increase in library size. To obtain a sample size-unbiased estimate of species richness, we calculated the theoretical clone library sizes required to encounter the estimated species richness at various clone library sizes, used curve fitting to determine the theoretical clone library size required to encounter the "true" species richness, and subsequently determined the corresponding sample size-unbiased species richness value. Using this approach, sample size-unbiased estimates of 17,230, 15,571, and 33,912 were obtained for the ML-based, rarefaction curve-based, and ACE-1 estimators, respectively, compared to bias-uncorrected values of 15,009, 11,913, and 20,909.

  17. Sample size determination for estimating antibody seroconversion rate under stable malaria transmission intensity.

    PubMed

    Sepúlveda, Nuno; Drakeley, Chris

    2015-04-03

    In the last decade, several epidemiological studies have demonstrated the potential of using seroprevalence (SP) and seroconversion rate (SCR) as informative indicators of malaria burden in low transmission settings or in populations on the cusp of elimination. However, most of studies are designed to control ensuing statistical inference over parasite rates and not on these alternative malaria burden measures. SP is in essence a proportion and, thus, many methods exist for the respective sample size determination. In contrast, designing a study where SCR is the primary endpoint, is not an easy task because precision and statistical power are affected by the age distribution of a given population. Two sample size calculators for SCR estimation are proposed. The first one consists of transforming the confidence interval for SP into the corresponding one for SCR given a known seroreversion rate (SRR). The second calculator extends the previous one to the most common situation where SRR is unknown. In this situation, data simulation was used together with linear regression in order to study the expected relationship between sample size and precision. The performance of the first sample size calculator was studied in terms of the coverage of the confidence intervals for SCR. The results pointed out to eventual problems of under or over coverage for sample sizes ≤250 in very low and high malaria transmission settings (SCR ≤ 0.0036 and SCR ≥ 0.29, respectively). The correct coverage was obtained for the remaining transmission intensities with sample sizes ≥ 50. Sample size determination was then carried out for cross-sectional surveys using realistic SCRs from past sero-epidemiological studies and typical age distributions from African and non-African populations. For SCR < 0.058, African studies require a larger sample size than their non-African counterparts in order to obtain the same precision. The opposite happens for the remaining transmission intensities. With respect to the second sample size calculator, simulation unravelled the likelihood of not having enough information to estimate SRR in low transmission settings (SCR ≤ 0.0108). In that case, the respective estimates tend to underestimate the true SCR. This problem is minimized by sample sizes of no less than 500 individuals. The sample sizes determined by this second method highlighted the prior expectation that, when SRR is not known, sample sizes are increased in relation to the situation of a known SRR. In contrast to the first sample size calculation, African studies would now require lesser individuals than their counterparts conducted elsewhere, irrespective of the transmission intensity. Although the proposed sample size calculators can be instrumental to design future cross-sectional surveys, the choice of a particular sample size must be seen as a much broader exercise that involves weighting statistical precision with ethical issues, available human and economic resources, and possible time constraints. Moreover, if the sample size determination is carried out on varying transmission intensities, as done here, the respective sample sizes can also be used in studies comparing sites with different malaria transmission intensities. In conclusion, the proposed sample size calculators are a step towards the design of better sero-epidemiological studies. Their basic ideas show promise to be applied to the planning of alternative sampling schemes that may target or oversample specific age groups.

  18. Allocating Sample Sizes to Reduce Budget for Fixed-Effect 2×2 Heterogeneous Analysis of Variance

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2016-01-01

    This article discusses the sample size requirements for the interaction, row, and column effects, respectively, by forming a linear contrast for a 2×2 factorial design for fixed-effects heterogeneous analysis of variance. The proposed method uses the Welch t test and its corresponding degrees of freedom to calculate the final sample size in a…

  19. Approximate Sample Size Formulas for Testing Group Mean Differences when Variances Are Unequal in One-Way ANOVA

    ERIC Educational Resources Information Center

    Guo, Jiin-Huarng; Luh, Wei-Ming

    2008-01-01

    This study proposes an approach for determining appropriate sample size for Welch's F test when unequal variances are expected. Given a certain maximum deviation in population means and using the quantile of F and t distributions, there is no need to specify a noncentrality parameter and it is easy to estimate the approximate sample size needed…

  20. PTM Modeling of Dredged Suspended Sediment at Proposed Polaris Point and Ship Repair Facility CVN Berthing Sites - Apra Harbor, Guam

    DTIC Science & Technology

    2017-09-01

    ADCP locations used for model calibration. ......................................................................... 12 Figure 4-3. Sample water...Example of fine sediment sample [Set d, Sample B30]. (B) Example of coarse sediment sample [Set d, sample B05...Turning Basin average sediment size distribution curve. ................................................... 21 Figure 5-5. Turning Basin average size

  1. On sample size of the kruskal-wallis test with application to a mouse peritoneal cavity study.

    PubMed

    Fan, Chunpeng; Zhang, Donghui; Zhang, Cun-Hui

    2011-03-01

    As the nonparametric generalization of the one-way analysis of variance model, the Kruskal-Wallis test applies when the goal is to test the difference between multiple samples and the underlying population distributions are nonnormal or unknown. Although the Kruskal-Wallis test has been widely used for data analysis, power and sample size methods for this test have been investigated to a much lesser extent. This article proposes new power and sample size calculation methods for the Kruskal-Wallis test based on the pilot study in either a completely nonparametric model or a semiparametric location model. No assumption is made on the shape of the underlying population distributions. Simulation results show that, in terms of sample size calculation for the Kruskal-Wallis test, the proposed methods are more reliable and preferable to some more traditional methods. A mouse peritoneal cavity study is used to demonstrate the application of the methods. © 2010, The International Biometric Society.

  2. On the repeated measures designs and sample sizes for randomized controlled trials.

    PubMed

    Tango, Toshiro

    2016-04-01

    For the analysis of longitudinal or repeated measures data, generalized linear mixed-effects models provide a flexible and powerful tool to deal with heterogeneity among subject response profiles. However, the typical statistical design adopted in usual randomized controlled trials is an analysis of covariance type analysis using a pre-defined pair of "pre-post" data, in which pre-(baseline) data are used as a covariate for adjustment together with other covariates. Then, the major design issue is to calculate the sample size or the number of subjects allocated to each treatment group. In this paper, we propose a new repeated measures design and sample size calculations combined with generalized linear mixed-effects models that depend not only on the number of subjects but on the number of repeated measures before and after randomization per subject used for the analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are (1) it can easily handle missing data by applying the likelihood-based ignorable analyses under the missing at random assumption and (2) it may lead to a reduction in sample size, compared with the simple pre-post design. The proposed designs and the sample size calculations are illustrated with real data arising from randomized controlled trials. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference

    PubMed Central

    Karcher, Michael D.; Palacios, Julia A.; Bedford, Trevor; Suchard, Marc A.; Minin, Vladimir N.

    2016-01-01

    Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task formulates an observed sequence data likelihood exploiting a coalescent model for the sampled individuals’ genealogy and then integrating over all possible genealogies via Monte Carlo or, less efficiently, by conditioning on one genealogy estimated from the sequence data. However, when analyzing sequences sampled serially through time, current methods implicitly assume either that sampling times are fixed deterministically by the data collection protocol or that their distribution does not depend on the size of the population. Through simulation, we first show that, when sampling times do probabilistically depend on effective population size, estimation methods may be systematically biased. To correct for this deficiency, we propose a new model that explicitly accounts for preferential sampling by modeling the sampling times as an inhomogeneous Poisson process dependent on effective population size. We demonstrate that in the presence of preferential sampling our new model not only reduces bias, but also improves estimation precision. Finally, we compare the performance of the currently used phylodynamic methods with our proposed model through clinically-relevant, seasonal human influenza examples. PMID:26938243

  4. Sample size and power considerations in network meta-analysis

    PubMed Central

    2012-01-01

    Background Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327

  5. An internal pilot design for prospective cancer screening trials with unknown disease prevalence.

    PubMed

    Brinton, John T; Ringham, Brandy M; Glueck, Deborah H

    2015-10-13

    For studies that compare the diagnostic accuracy of two screening tests, the sample size depends on the prevalence of disease in the study population, and on the variance of the outcome. Both parameters may be unknown during the design stage, which makes finding an accurate sample size difficult. To solve this problem, we propose adapting an internal pilot design. In this adapted design, researchers will accrue some percentage of the planned sample size, then estimate both the disease prevalence and the variances of the screening tests. The updated estimates of the disease prevalence and variance are used to conduct a more accurate power and sample size calculation. We demonstrate that in large samples, the adapted internal pilot design produces no Type I inflation. For small samples (N less than 50), we introduce a novel adjustment of the critical value to control the Type I error rate. We apply the method to two proposed prospective cancer screening studies: 1) a small oral cancer screening study in individuals with Fanconi anemia and 2) a large oral cancer screening trial. Conducting an internal pilot study without adjusting the critical value can cause Type I error rate inflation in small samples, but not in large samples. An internal pilot approach usually achieves goal power and, for most studies with sample size greater than 50, requires no Type I error correction. Further, we have provided a flexible and accurate approach to bound Type I error below a goal level for studies with small sample size.

  6. Sample size determination in group-sequential clinical trials with two co-primary endpoints

    PubMed Central

    Asakura, Koko; Hamasaki, Toshimitsu; Sugimoto, Tomoyuki; Hayashi, Kenichi; Evans, Scott R; Sozu, Takashi

    2014-01-01

    We discuss sample size determination in group-sequential designs with two endpoints as co-primary. We derive the power and sample size within two decision-making frameworks. One is to claim the test intervention’s benefit relative to control when superiority is achieved for the two endpoints at the same interim timepoint of the trial. The other is when the superiority is achieved for the two endpoints at any interim timepoint, not necessarily simultaneously. We evaluate the behaviors of sample size and power with varying design elements and provide a real example to illustrate the proposed sample size methods. In addition, we discuss sample size recalculation based on observed data and evaluate the impact on the power and Type I error rate. PMID:24676799

  7. Selective counting and sizing of single virus particles using fluorescent aptamer-based nanoparticle tracking analysis.

    PubMed

    Szakács, Zoltán; Mészáros, Tamás; de Jonge, Marien I; Gyurcsányi, Róbert E

    2018-05-30

    Detection and counting of single virus particles in liquid samples are largely limited to narrow size distribution of viruses and purified formulations. To address these limitations, here we propose a calibration-free method that enables concurrently the selective recognition, counting and sizing of virus particles as demonstrated through the detection of human respiratory syncytial virus (RSV), an enveloped virus with a broad size distribution, in throat swab samples. RSV viruses were selectively labeled through their attachment glycoproteins (G) with fluorescent aptamers, which further enabled their identification, sizing and counting at the single particle level by fluorescent nanoparticle tracking analysis. The proposed approach seems to be generally applicable to virus detection and quantification. Moreover, it could be successfully applied to detect single RSV particles in swab samples of diagnostic relevance. Since the selective recognition is associated with the sizing of each detected particle, this method enables to discriminate viral elements linked to the virus as well as various virus forms and associations.

  8. 77 FR 2697 - Proposed Information Collection; Comment Request; Annual Services Report

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-19

    ... and from a sample of small- and medium-sized businesses selected using a stratified sampling procedure... be canvassed when the sample is re-drawn, while nearly all of the small- and medium-sized firms from...); Educational Services (NAICS 61); Health Care and Social Assistance (NAICS 62); Arts, Entertainment, and...

  9. Determination of sample size for higher volatile data using new framework of Box-Jenkins model with GARCH: A case study on gold price

    NASA Astrophysics Data System (ADS)

    Roslindar Yaziz, Siti; Zakaria, Roslinazairimah; Hura Ahmad, Maizah

    2017-09-01

    The model of Box-Jenkins - GARCH has been shown to be a promising tool for forecasting higher volatile time series. In this study, the framework of determining the optimal sample size using Box-Jenkins model with GARCH is proposed for practical application in analysing and forecasting higher volatile data. The proposed framework is employed to daily world gold price series from year 1971 to 2013. The data is divided into 12 different sample sizes (from 30 to 10200). Each sample is tested using different combination of the hybrid Box-Jenkins - GARCH model. Our study shows that the optimal sample size to forecast gold price using the framework of the hybrid model is 1250 data of 5-year sample. Hence, the empirical results of model selection criteria and 1-step-ahead forecasting evaluations suggest that the latest 12.25% (5-year data) of 10200 data is sufficient enough to be employed in the model of Box-Jenkins - GARCH with similar forecasting performance as by using 41-year data.

  10. A two-stage Monte Carlo approach to the expression of uncertainty with finite sample sizes.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crowder, Stephen Vernon; Moyer, Robert D.

    2005-05-01

    Proposed supplement I to the GUM outlines a 'propagation of distributions' approach to deriving the distribution of a measurand for any non-linear function and for any set of random inputs. The supplement's proposed Monte Carlo approach assumes that the distributions of the random inputs are known exactly. This implies that the sample sizes are effectively infinite. In this case, the mean of the measurand can be determined precisely using a large number of Monte Carlo simulations. In practice, however, the distributions of the inputs will rarely be known exactly, but must be estimated using possibly small samples. If these approximatedmore » distributions are treated as exact, the uncertainty in estimating the mean is not properly taken into account. In this paper, we propose a two-stage Monte Carlo procedure that explicitly takes into account the finite sample sizes used to estimate parameters of the input distributions. We will illustrate the approach with a case study involving the efficiency of a thermistor mount power sensor. The performance of the proposed approach will be compared to the standard GUM approach for finite samples using simple non-linear measurement equations. We will investigate performance in terms of coverage probabilities of derived confidence intervals.« less

  11. What is a species? A new universal method to measure differentiation and assess the taxonomic rank of allopatric populations, using continuous variables

    PubMed Central

    Donegan, Thomas M.

    2018-01-01

    Abstract Existing models for assigning species, subspecies, or no taxonomic rank to populations which are geographically separated from one another were analyzed. This was done by subjecting over 3,000 pairwise comparisons of vocal or biometric data based on birds to a variety of statistical tests that have been proposed as measures of differentiation. One current model which aims to test diagnosability (Isler et al. 1998) is highly conservative, applying a hard cut-off, which excludes from consideration differentiation below diagnosis. It also includes non-overlap as a requirement, a measure which penalizes increases to sample size. The “species scoring” model of Tobias et al. (2010) involves less drastic cut-offs, but unlike Isler et al. (1998), does not control adequately for sample size and attributes scores in many cases to differentiation which is not statistically significant. Four different models of assessing effect sizes were analyzed: using both pooled and unpooled standard deviations and controlling for sample size using t-distributions or omitting to do so. Pooled standard deviations produced more conservative effect sizes when uncontrolled for sample size but less conservative effect sizes when so controlled. Pooled models require assumptions to be made that are typically elusive or unsupported for taxonomic studies. Modifications to improving these frameworks are proposed, including: (i) introducing statistical significance as a gateway to attributing any weighting to findings of differentiation; (ii) abandoning non-overlap as a test; (iii) recalibrating Tobias et al. (2010) scores based on effect sizes controlled for sample size using t-distributions. A new universal method is proposed for measuring differentiation in taxonomy using continuous variables and a formula is proposed for ranking allopatric populations. This is based first on calculating effect sizes using unpooled standard deviations, controlled for sample size using t-distributions, for a series of different variables. All non-significant results are excluded by scoring them as zero. Distance between any two populations is calculated using Euclidian summation of non-zeroed effect size scores. If the score of an allopatric pair exceeds that of a related sympatric pair, then the allopatric population can be ranked as species and, if not, then at most subspecies rank should be assigned. A spreadsheet has been programmed and is being made available which allows this and other tests of differentiation and rank studied in this paper to be rapidly analyzed. PMID:29780266

  12. Robust Covariate-Adjusted Log-Rank Statistics and Corresponding Sample Size Formula for Recurrent Events Data

    PubMed Central

    Song, Rui; Kosorok, Michael R.; Cai, Jianwen

    2009-01-01

    Summary Recurrent events data are frequently encountered in clinical trials. This article develops robust covariate-adjusted log-rank statistics applied to recurrent events data with arbitrary numbers of events under independent censoring and the corresponding sample size formula. The proposed log-rank tests are robust with respect to different data-generating processes and are adjusted for predictive covariates. It reduces to the Kong and Slud (1997, Biometrika 84, 847–862) setting in the case of a single event. The sample size formula is derived based on the asymptotic normality of the covariate-adjusted log-rank statistics under certain local alternatives and a working model for baseline covariates in the recurrent event data context. When the effect size is small and the baseline covariates do not contain significant information about event times, it reduces to the same form as that of Schoenfeld (1983, Biometrics 39, 499–503) for cases of a single event or independent event times within a subject. We carry out simulations to study the control of type I error and the comparison of powers between several methods in finite samples. The proposed sample size formula is illustrated using data from an rhDNase study. PMID:18162107

  13. An imbalance in cluster sizes does not lead to notable loss of power in cross-sectional, stepped-wedge cluster randomised trials with a continuous outcome.

    PubMed

    Kristunas, Caroline A; Smith, Karen L; Gray, Laura J

    2017-03-07

    The current methodology for sample size calculations for stepped-wedge cluster randomised trials (SW-CRTs) is based on the assumption of equal cluster sizes. However, as is often the case in cluster randomised trials (CRTs), the clusters in SW-CRTs are likely to vary in size, which in other designs of CRT leads to a reduction in power. The effect of an imbalance in cluster size on the power of SW-CRTs has not previously been reported, nor what an appropriate adjustment to the sample size calculation should be to allow for any imbalance. We aimed to assess the impact of an imbalance in cluster size on the power of a cross-sectional SW-CRT and recommend a method for calculating the sample size of a SW-CRT when there is an imbalance in cluster size. The effect of varying degrees of imbalance in cluster size on the power of SW-CRTs was investigated using simulations. The sample size was calculated using both the standard method and two proposed adjusted design effects (DEs), based on those suggested for CRTs with unequal cluster sizes. The data were analysed using generalised estimating equations with an exchangeable correlation matrix and robust standard errors. An imbalance in cluster size was not found to have a notable effect on the power of SW-CRTs. The two proposed adjusted DEs resulted in trials that were generally considerably over-powered. We recommend that the standard method of sample size calculation for SW-CRTs be used, provided that the assumptions of the method hold. However, it would be beneficial to investigate, through simulation, what effect the maximum likely amount of inequality in cluster sizes would be on the power of the trial and whether any inflation of the sample size would be required.

  14. The Petersen-Lincoln estimator and its extension to estimate the size of a shared population.

    PubMed

    Chao, Anne; Pan, H-Y; Chiang, Shu-Chuan

    2008-12-01

    The Petersen-Lincoln estimator has been used to estimate the size of a population in a single mark release experiment. However, the estimator is not valid when the capture sample and recapture sample are not independent. We provide an intuitive interpretation for "independence" between samples based on 2 x 2 categorical data formed by capture/non-capture in each of the two samples. From the interpretation, we review a general measure of "dependence" and quantify the correlation bias of the Petersen-Lincoln estimator when two types of dependences (local list dependence and heterogeneity of capture probability) exist. An important implication in the census undercount problem is that instead of using a post enumeration sample to assess the undercount of a census, one should conduct a prior enumeration sample to avoid correlation bias. We extend the Petersen-Lincoln method to the case of two populations. This new estimator of the size of the shared population is proposed and its variance is derived. We discuss a special case where the correlation bias of the proposed estimator due to dependence between samples vanishes. The proposed method is applied to a study of the relapse rate of illicit drug use in Taiwan. ((c) 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim).

  15. Optimal flexible sample size design with robust power.

    PubMed

    Zhang, Lanju; Cui, Lu; Yang, Bo

    2016-08-30

    It is well recognized that sample size determination is challenging because of the uncertainty on the treatment effect size. Several remedies are available in the literature. Group sequential designs start with a sample size based on a conservative (smaller) effect size and allow early stop at interim looks. Sample size re-estimation designs start with a sample size based on an optimistic (larger) effect size and allow sample size increase if the observed effect size is smaller than planned. Different opinions favoring one type over the other exist. We propose an optimal approach using an appropriate optimality criterion to select the best design among all the candidate designs. Our results show that (1) for the same type of designs, for example, group sequential designs, there is room for significant improvement through our optimization approach; (2) optimal promising zone designs appear to have no advantages over optimal group sequential designs; and (3) optimal designs with sample size re-estimation deliver the best adaptive performance. We conclude that to deal with the challenge of sample size determination due to effect size uncertainty, an optimal approach can help to select the best design that provides most robust power across the effect size range of interest. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  16. Repeated significance tests of linear combinations of sensitivity and specificity of a diagnostic biomarker

    PubMed Central

    Wu, Mixia; Shu, Yu; Li, Zhaohai; Liu, Aiyi

    2016-01-01

    A sequential design is proposed to test whether the accuracy of a binary diagnostic biomarker meets the minimal level of acceptance. The accuracy of a binary diagnostic biomarker is a linear combination of the marker’s sensitivity and specificity. The objective of the sequential method is to minimize the maximum expected sample size under the null hypothesis that the marker’s accuracy is below the minimal level of acceptance. The exact results of two-stage designs based on Youden’s index and efficiency indicate that the maximum expected sample sizes are smaller than the sample sizes of the fixed designs. Exact methods are also developed for estimation, confidence interval and p-value concerning the proposed accuracy index upon termination of the sequential testing. PMID:26947768

  17. Minimizing the Maximum Expected Sample Size in Two-Stage Phase II Clinical Trials with Continuous Outcomes

    PubMed Central

    Wason, James M. S.; Mander, Adrian P.

    2012-01-01

    Two-stage designs are commonly used for Phase II trials. Optimal two-stage designs have the lowest expected sample size for a specific treatment effect, for example, the null value, but can perform poorly if the true treatment effect differs. Here we introduce a design for continuous treatment responses that minimizes the maximum expected sample size across all possible treatment effects. The proposed design performs well for a wider range of treatment effects and so is useful for Phase II trials. We compare the design to a previously used optimal design and show it has superior expected sample size properties. PMID:22651118

  18. Robust functional statistics applied to Probability Density Function shape screening of sEMG data.

    PubMed

    Boudaoud, S; Rix, H; Al Harrach, M; Marin, F

    2014-01-01

    Recent studies pointed out possible shape modifications of the Probability Density Function (PDF) of surface electromyographical (sEMG) data according to several contexts like fatigue and muscle force increase. Following this idea, criteria have been proposed to monitor these shape modifications mainly using High Order Statistics (HOS) parameters like skewness and kurtosis. In experimental conditions, these parameters are confronted with small sample size in the estimation process. This small sample size induces errors in the estimated HOS parameters restraining real-time and precise sEMG PDF shape monitoring. Recently, a functional formalism, the Core Shape Model (CSM), has been used to analyse shape modifications of PDF curves. In this work, taking inspiration from CSM method, robust functional statistics are proposed to emulate both skewness and kurtosis behaviors. These functional statistics combine both kernel density estimation and PDF shape distances to evaluate shape modifications even in presence of small sample size. Then, the proposed statistics are tested, using Monte Carlo simulations, on both normal and Log-normal PDFs that mimic observed sEMG PDF shape behavior during muscle contraction. According to the obtained results, the functional statistics seem to be more robust than HOS parameters to small sample size effect and more accurate in sEMG PDF shape screening applications.

  19. 76 FR 28786 - Proposed Data Collections Submitted for Public Comment and Recommendations

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-18

    .... The sample size is based on recommendations related to qualitative interview methods and the research... than 10 employees (CPWR, 2007), and this establishment size experiences the highest fatality rate... out occupational safety and health training. This interview will be administered to a sample of...

  20. Relative efficiency of unequal versus equal cluster sizes in cluster randomized trials using generalized estimating equation models.

    PubMed

    Liu, Jingxia; Colditz, Graham A

    2018-05-01

    There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyze a set of correlated data is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which the "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of variance of the estimator of the treatment effect for equal to unequal cluster sizes. We discuss a commonly used structure in CRTs-exchangeable, and derive the simpler formula of RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size due to efficiency loss. Additionally, we also propose an optimal sample size estimation based on the GEE models under a fixed budget for known and unknown association parameter (ρ) in the working correlation structure within the cluster. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Summary and Synthesis: How to Present a Research Proposal.

    PubMed

    Setia, Maninder Singh; Panda, Saumya

    2017-01-01

    This concluding module attempts to synthesize the key learning points discussed during the course of the previous ten sets of modules on methodology and biostatistics. The objective of this module is to discuss how to present a model research proposal, based on whatever was discussed in the preceding modules. The lynchpin of a research proposal is the protocol, and the key component of a protocol is the study design. However, one must not neglect the other areas, be it the project summary through which one catches the eyes of the reviewer of the proposal, or the background and the literature review, or the aims and objectives of the study. Two critical areas in the "methods" section that cannot be emphasized more are the sampling strategy and a formal estimation of sample size. Without a legitimate sample size, none of the conclusions based on the statistical analysis would be valid. Finally, the ethical parameters of the study should be well understood by the researchers, and that should get reflected in the proposal.

  2. Summary and Synthesis: How to Present a Research Proposal

    PubMed Central

    Setia, Maninder Singh; Panda, Saumya

    2017-01-01

    This concluding module attempts to synthesize the key learning points discussed during the course of the previous ten sets of modules on methodology and biostatistics. The objective of this module is to discuss how to present a model research proposal, based on whatever was discussed in the preceding modules. The lynchpin of a research proposal is the protocol, and the key component of a protocol is the study design. However, one must not neglect the other areas, be it the project summary through which one catches the eyes of the reviewer of the proposal, or the background and the literature review, or the aims and objectives of the study. Two critical areas in the “methods” section that cannot be emphasized more are the sampling strategy and a formal estimation of sample size. Without a legitimate sample size, none of the conclusions based on the statistical analysis would be valid. Finally, the ethical parameters of the study should be well understood by the researchers, and that should get reflected in the proposal. PMID:28979004

  3. Sample Size in Qualitative Interview Studies: Guided by Information Power.

    PubMed

    Malterud, Kirsti; Siersma, Volkert Dirk; Guassora, Ann Dorrit

    2015-11-27

    Sample sizes must be ascertained in qualitative studies like in quantitative studies but not by the same means. The prevailing concept for sample size in qualitative studies is "saturation." Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose the concept "information power" to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds, relevant for the actual study, the lower amount of participants is needed. We suggest that the size of a sample with sufficient information power depends on (a) the aim of the study, (b) sample specificity, (c) use of established theory, (d) quality of dialogue, and (e) analysis strategy. We present a model where these elements of information and their relevant dimensions are related to information power. Application of this model in the planning and during data collection of a qualitative study is discussed. © The Author(s) 2015.

  4. Sampling intraspecific variability in leaf functional traits: Practical suggestions to maximize collected information.

    PubMed

    Petruzzellis, Francesco; Palandrani, Chiara; Savi, Tadeja; Alberti, Roberto; Nardini, Andrea; Bacaro, Giovanni

    2017-12-01

    The choice of the best sampling strategy to capture mean values of functional traits for a species/population, while maintaining information about traits' variability and minimizing the sampling size and effort, is an open issue in functional trait ecology. Intraspecific variability (ITV) of functional traits strongly influences sampling size and effort. However, while adequate information is available about intraspecific variability between individuals (ITV BI ) and among populations (ITV POP ), relatively few studies have analyzed intraspecific variability within individuals (ITV WI ). Here, we provide an analysis of ITV WI of two foliar traits, namely specific leaf area (SLA) and osmotic potential (π), in a population of Quercus ilex L. We assessed the baseline ITV WI level of variation between the two traits and provided the minimum and optimal sampling size in order to take into account ITV WI , comparing sampling optimization outputs with those previously proposed in the literature. Different factors accounted for different amount of variance of the two traits. SLA variance was mostly spread within individuals (43.4% of the total variance), while π variance was mainly spread between individuals (43.2%). Strategies that did not account for all the canopy strata produced mean values not representative of the sampled population. The minimum size to adequately capture the studied functional traits corresponded to 5 leaves taken randomly from 5 individuals, while the most accurate and feasible sampling size was 4 leaves taken randomly from 10 individuals. We demonstrate that the spatial structure of the canopy could significantly affect traits variability. Moreover, different strategies for different traits could be implemented during sampling surveys. We partially confirm sampling sizes previously proposed in the recent literature and encourage future analysis involving different traits.

  5. Outcome-Dependent Sampling Design and Inference for Cox's Proportional Hazards Model.

    PubMed

    Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P; Zhou, Haibo

    2016-11-01

    We propose a cost-effective outcome-dependent sampling design for the failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criteria that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure is shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with an existing real data from the Cancer Incidence and Mortality of Uranium Miners Study.

  6. Design of Phase II Non-inferiority Trials.

    PubMed

    Jung, Sin-Ho

    2017-09-01

    With the development of inexpensive treatment regimens and less invasive surgical procedures, we are confronted with non-inferiority study objectives. A non-inferiority phase III trial requires a roughly four times larger sample size than that of a similar standard superiority trial. Because of the large required sample size, we often face feasibility issues to open a non-inferiority trial. Furthermore, due to lack of phase II non-inferiority trial design methods, we do not have an opportunity to investigate the efficacy of the experimental therapy through a phase II trial. As a result, we often fail to open a non-inferiority phase III trial and a large number of non-inferiority clinical questions still remain unanswered. In this paper, we want to develop some designs for non-inferiority randomized phase II trials with feasible sample sizes. At first, we review a design method for non-inferiority phase III trials. Subsequently, we propose three different designs for non-inferiority phase II trials that can be used under different settings. Each method is demonstrated with examples. Each of the proposed design methods is shown to require a reasonable sample size for non-inferiority phase II trials. The three different non-inferiority phase II trial designs are used under different settings, but require similar sample sizes that are typical for phase II trials.

  7. Sample size determination for disease prevalence studies with partially validated data.

    PubMed

    Qiu, Shi-Fang; Poon, Wai-Yin; Tang, Man-Lai

    2016-02-01

    Disease prevalence is an important topic in medical research, and its study is based on data that are obtained by classifying subjects according to whether a disease has been contracted. Classification can be conducted with high-cost gold standard tests or low-cost screening tests, but the latter are subject to the misclassification of subjects. As a compromise between the two, many research studies use partially validated datasets in which all data points are classified by fallible tests, and some of the data points are validated in the sense that they are also classified by the completely accurate gold-standard test. In this article, we investigate the determination of sample sizes for disease prevalence studies with partially validated data. We use two approaches. The first is to find sample sizes that can achieve a pre-specified power of a statistical test at a chosen significance level, and the second is to find sample sizes that can control the width of a confidence interval with a pre-specified confidence level. Empirical studies have been conducted to demonstrate the performance of various testing procedures with the proposed sample sizes. The applicability of the proposed methods are illustrated by a real-data example. © The Author(s) 2012.

  8. Size constraints on a Majorana beam-splitter interferometer: Majorana coupling and surface-bulk scattering

    NASA Astrophysics Data System (ADS)

    Røising, Henrik Schou; Simon, Steven H.

    2018-03-01

    Topological insulator surfaces in proximity to superconductors have been proposed as a way to produce Majorana fermions in condensed matter physics. One of the simplest proposed experiments with such a system is Majorana interferometry. Here we consider two possibly conflicting constraints on the size of such an interferometer. Coupling of a Majorana mode from the edge (the arms) of the interferometer to vortices in the center of the device sets a lower bound on the size of the device. On the other hand, scattering to the usually imperfectly insulating bulk sets an upper bound. From estimates of experimental parameters, we find that typical samples may have no size window in which the Majorana interferometer can operate, implying that a new generation of more highly insulating samples must be explored.

  9. Designing a multiple dependent state sampling plan based on the coefficient of variation.

    PubMed

    Yan, Aijun; Liu, Sanyang; Dong, Xiaojuan

    2016-01-01

    A multiple dependent state (MDS) sampling plan is developed based on the coefficient of variation of the quality characteristic which follows a normal distribution with unknown mean and variance. The optimal plan parameters of the proposed plan are solved by a nonlinear optimization model, which satisfies the given producer's risk and consumer's risk at the same time and minimizes the sample size required for inspection. The advantages of the proposed MDS sampling plan over the existing single sampling plan are discussed. Finally an example is given to illustrate the proposed plan.

  10. An integrated approach to piezoactuator positioning in high-speed atomic force microscope imaging

    NASA Astrophysics Data System (ADS)

    Yan, Yan; Wu, Ying; Zou, Qingze; Su, Chanmin

    2008-07-01

    In this paper, an integrated approach to achieve high-speed atomic force microscope (AFM) imaging of large-size samples is proposed, which combines the enhanced inversion-based iterative control technique to drive the piezotube actuator control for lateral x-y axis positioning with the use of a dual-stage piezoactuator for vertical z-axis positioning. High-speed, large-size AFM imaging is challenging because in high-speed lateral scanning of the AFM imaging at large size, large positioning error of the AFM probe relative to the sample can be generated due to the adverse effects—the nonlinear hysteresis and the vibrational dynamics of the piezotube actuator. In addition, vertical precision positioning of the AFM probe is even more challenging (than the lateral scanning) because the desired trajectory (i.e., the sample topography profile) is unknown in general, and the probe positioning is also effected by and sensitive to the probe-sample interaction. The main contribution of this article is the development of an integrated approach that combines advanced control algorithm with an advanced hardware platform. The proposed approach is demonstrated in experiments by imaging a large-size (50μm ) calibration sample at high-speed (50Hz scan rate).

  11. Nomogram for sample size calculation on a straightforward basis for the kappa statistic.

    PubMed

    Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo

    2014-09-01

    Kappa is a widely used measure of agreement. However, it may not be straightforward in some situation such as sample size calculation due to the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation that the level of agreement under a certain marginal prevalence is considered in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than a kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. The sample size formulae using a simple proportion of agreement instead of a kappa statistic and nomograms to eliminate the inconvenience of using a mathematical formula were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is on testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures. Copyright © 2014 Elsevier Inc. All rights reserved.

  12. Critical appraisal of fundamental items in approved clinical trial research proposals in Mashhad University of Medical Sciences

    PubMed Central

    Shakeri, Mohammad-Taghi; Taghipour, Ali; Sadeghi, Masoumeh; Nezami, Hossein; Amirabadizadeh, Ali-Reza; Bonakchi, Hossein

    2017-01-01

    Background: Writing, designing, and conducting a clinical trial research proposal has an important role in achieving valid and reliable findings. Thus, this study aimed at critically appraising fundamental information in approved clinical trial research proposals in Mashhad University of Medical Sciences (MUMS) from 2008 to 2014. Methods: This cross-sectional study was conducted on all 935 approved clinical trial research proposals in MUMS from 2008 to 2014. A valid and reliable as well as comprehensive, simple, and usable checklist in sessions with biostatisticians and methodologists, consisting of 11 main items as research tool, were used. Agreement rate between the reviewers of the proposals, who were responsible for data collection, was assessed during 3 sessions, and Kappa statistics was calculated at the last session as 97%. Results: More than 60% of the research proposals had a methodologist consultant, moreover, type of study or study design had been specified in almost all of them (98%). Appropriateness of study aims with hypotheses was not observed in a significant number of research proposals (585 proposals, 62.6%). The required sample size for 66.8% of the approved proposals was based on a sample size formula; however, in 25% of the proposals, sample size formula was not in accordance with the study design. Data collection tool was not selected appropriately in 55.2% of the approved research proposals. Type and method of randomization were unknown in 21% of the proposals and dealing with missing data had not been described in most of them (98%). Inclusion and exclusion criteria were (92%) fully and adequately explained. Moreover, 44% and 31% of the research proposals were moderate and weak in rank, respectively, with respect to the correctness of the statistical analysis methods. Conclusion: Findings of the present study revealed that a large portion of the approved proposals were highly biased or ambiguous with respect to randomization, blinding, dealing with missing data, data collection tool, sampling methods, and statistical analysis. Thus, it is essential to consult and collaborate with a methodologist in all parts of a proposal to control the possible and specific biases in clinical trials. PMID:29445703

  13. Critical appraisal of fundamental items in approved clinical trial research proposals in Mashhad University of Medical Sciences.

    PubMed

    Shakeri, Mohammad-Taghi; Taghipour, Ali; Sadeghi, Masoumeh; Nezami, Hossein; Amirabadizadeh, Ali-Reza; Bonakchi, Hossein

    2017-01-01

    Background: Writing, designing, and conducting a clinical trial research proposal has an important role in achieving valid and reliable findings. Thus, this study aimed at critically appraising fundamental information in approved clinical trial research proposals in Mashhad University of Medical Sciences (MUMS) from 2008 to 2014. Methods: This cross-sectional study was conducted on all 935 approved clinical trial research proposals in MUMS from 2008 to 2014. A valid and reliable as well as comprehensive, simple, and usable checklist in sessions with biostatisticians and methodologists, consisting of 11 main items as research tool, were used. Agreement rate between the reviewers of the proposals, who were responsible for data collection, was assessed during 3 sessions, and Kappa statistics was calculated at the last session as 97%. Results: More than 60% of the research proposals had a methodologist consultant, moreover, type of study or study design had been specified in almost all of them (98%). Appropriateness of study aims with hypotheses was not observed in a significant number of research proposals (585 proposals, 62.6%). The required sample size for 66.8% of the approved proposals was based on a sample size formula; however, in 25% of the proposals, sample size formula was not in accordance with the study design. Data collection tool was not selected appropriately in 55.2% of the approved research proposals. Type and method of randomization were unknown in 21% of the proposals and dealing with missing data had not been described in most of them (98%). Inclusion and exclusion criteria were (92%) fully and adequately explained. Moreover, 44% and 31% of the research proposals were moderate and weak in rank, respectively, with respect to the correctness of the statistical analysis methods. Conclusion: Findings of the present study revealed that a large portion of the approved proposals were highly biased or ambiguous with respect to randomization, blinding, dealing with missing data, data collection tool, sampling methods, and statistical analysis. Thus, it is essential to consult and collaborate with a methodologist in all parts of a proposal to control the possible and specific biases in clinical trials.

  14. Outcome-Dependent Sampling Design and Inference for Cox’s Proportional Hazards Model

    PubMed Central

    Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P.; Zhou, Haibo

    2016-01-01

    We propose a cost-effective outcome-dependent sampling design for the failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criteria that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure is shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with an existing real data from the Cancer Incidence and Mortality of Uranium Miners Study. PMID:28090134

  15. Simple, Defensible Sample Sizes Based on Cost Efficiency

    PubMed Central

    Bacchetti, Peter; McCulloch, Charles E.; Segal, Mark R.

    2009-01-01

    Summary The conventional approach of choosing sample size to provide 80% or greater power ignores the cost implications of different sample size choices. Costs, however, are often impossible for investigators and funders to ignore in actual practice. Here, we propose and justify a new approach for choosing sample size based on cost efficiency, the ratio of a study’s projected scientific and/or practical value to its total cost. By showing that a study’s projected value exhibits diminishing marginal returns as a function of increasing sample size for a wide variety of definitions of study value, we are able to develop two simple choices that can be defended as more cost efficient than any larger sample size. The first is to choose the sample size that minimizes the average cost per subject. The second is to choose sample size to minimize total cost divided by the square root of sample size. This latter method is theoretically more justifiable for innovative studies, but also performs reasonably well and has some justification in other cases. For example, if projected study value is assumed to be proportional to power at a specific alternative and total cost is a linear function of sample size, then this approach is guaranteed either to produce more than 90% power or to be more cost efficient than any sample size that does. These methods are easy to implement, based on reliable inputs, and well justified, so they should be regarded as acceptable alternatives to current conventional approaches. PMID:18482055

  16. A size-dependent constitutive model of bulk metallic glasses in the supercooled liquid region

    PubMed Central

    Yao, Di; Deng, Lei; Zhang, Mao; Wang, Xinyun; Tang, Na; Li, Jianjun

    2015-01-01

    Size effect is of great importance in micro forming processes. In this paper, micro cylinder compression was conducted to investigate the deformation behavior of bulk metallic glasses (BMGs) in supercooled liquid region with different deformation variables including sample size, temperature and strain rate. It was found that the elastic and plastic behaviors of BMGs have a strong dependence on the sample size. The free volume and defect concentration were introduced to explain the size effect. In order to demonstrate the influence of deformation variables on steady stress, elastic modulus and overshoot phenomenon, four size-dependent factors were proposed to construct a size-dependent constitutive model based on the Maxwell-pulse type model previously presented by the authors according to viscosity theory and free volume model. The proposed constitutive model was then adopted in finite element method simulations, and validated by comparing the micro cylinder compression and micro double cup extrusion experimental data with the numerical results. Furthermore, the model provides a new approach to understanding the size-dependent plastic deformation behavior of BMGs. PMID:25626690

  17. Sample size allocation in multiregional equivalence studies.

    PubMed

    Liao, Jason J Z; Yu, Ziji; Li, Yulan

    2018-06-17

    With the increasing globalization of drug development, the multiregional clinical trial (MRCT) has gained extensive use. The data from MRCTs could be accepted by regulatory authorities across regions and countries as the primary sources of evidence to support global marketing drug approval simultaneously. The MRCT can speed up patient enrollment and drug approval, and it makes the effective therapies available to patients all over the world simultaneously. However, there are many challenges both operationally and scientifically in conducting a drug development globally. One of many important questions to answer for the design of a multiregional study is how to partition sample size into each individual region. In this paper, two systematic approaches are proposed for the sample size allocation in a multiregional equivalence trial. A numerical evaluation and a biosimilar trial are used to illustrate the characteristics of the proposed approaches. Copyright © 2018 John Wiley & Sons, Ltd.

  18. A novel sample size formula for the weighted log-rank test under the proportional hazards cure model.

    PubMed

    Xiong, Xiaoping; Wu, Jianrong

    2017-01-01

    The treatment of cancer has progressed dramatically in recent decades, such that it is no longer uncommon to see a cure or log-term survival in a significant proportion of patients with various types of cancer. To adequately account for the cure fraction when designing clinical trials, the cure models should be used. In this article, a sample size formula for the weighted log-rank test is derived under the fixed alternative hypothesis for the proportional hazards cure models. Simulation showed that the proposed sample size formula provides an accurate estimation of sample size for designing clinical trials under the proportional hazards cure models. Copyright © 2016 John Wiley & Sons, Ltd.

  19. Development of a copula-based particle filter (CopPF) approach for hydrologic data assimilation under consideration of parameter interdependence

    NASA Astrophysics Data System (ADS)

    Fan, Y. R.; Huang, G. H.; Baetz, B. W.; Li, Y. P.; Huang, K.

    2017-06-01

    In this study, a copula-based particle filter (CopPF) approach was developed for sequential hydrological data assimilation by considering parameter correlation structures. In CopPF, multivariate copulas are proposed to reflect parameter interdependence before the resampling procedure with new particles then being sampled from the obtained copulas. Such a process can overcome both particle degeneration and sample impoverishment. The applicability of CopPF is illustrated with three case studies using a two-parameter simplified model and two conceptual hydrologic models. The results for the simplified model indicate that model parameters are highly correlated in the data assimilation process, suggesting a demand for full description of their dependence structure. Synthetic experiments on hydrologic data assimilation indicate that CopPF can rejuvenate particle evolution in large spaces and thus achieve good performances with low sample size scenarios. The applicability of CopPF is further illustrated through two real-case studies. It is shown that, compared with traditional particle filter (PF) and particle Markov chain Monte Carlo (PMCMC) approaches, the proposed method can provide more accurate results for both deterministic and probabilistic prediction with a sample size of 100. Furthermore, the sample size would not significantly influence the performance of CopPF. Also, the copula resampling approach dominates parameter evolution in CopPF, with more than 50% of particles sampled by copulas in most sample size scenarios.

  20. A modified approach to estimating sample size for simple logistic regression with one continuous covariate.

    PubMed

    Novikov, I; Fund, N; Freedman, L S

    2010-01-15

    Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.

  1. 78 FR 74175 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-10

    ... precision requirements or power calculations that justify the proposed sample size, the expected response...: Proposed Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on... Information Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback...

  2. Size and modal analyses of fines and ultrafines from some Apollo 17 samples

    NASA Technical Reports Server (NTRS)

    Greene, G. M.; King, D. T., Jr.; Banholzer, G. S., Jr.; King, E. A.

    1975-01-01

    Scanning electron and optical microscopy techniques have been used to determine the grain-size frequency distributions and morphology-based modal analyses of fine and ultrafine fractions of some Apollo 17 regolith samples. There are significant and large differences between the grain-size frequency distributions of the less than 10-micron size fraction of Apollo 17 samples, but there are no clear relations to the local geologic setting from which individual samples have been collected. This may be due to effective lateral mixing of regolith particles in this size range by micrometeoroid impacts. None of the properties of the frequency distributions support the idea of selective transport of any fine grain-size fraction, as has been proposed by other workers. All of the particle types found in the coarser size fractions also occur in the less than 10-micron particles. In the size range from 105 to 10 microns there is a strong tendency for the percentage of regularly shaped glass to increase as the graphic mean grain size of the less than 1-mm size fraction decreases, both probably being controlled by exposure age.

  3. 75 FR 39200 - Periodic Reporting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-08

    ... using an alternative sample frame. Establishing this docket will allow the Commission to consider the... Information System/Revenue Pieces and Weight (ODIS/RPW) data by 20 percent. Id. at 3. In effect, Proposal Two... sample size to a special study utilizing an alternative sample frame. The alternative sample frame that...

  4. Estimation after classification using lot quality assurance sampling: corrections for curtailed sampling with application to evaluating polio vaccination campaigns.

    PubMed

    Olives, Casey; Valadez, Joseph J; Pagano, Marcello

    2014-03-01

    To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduces the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.

  5. Fragment size distribution statistics in dynamic fragmentation of laser shock-loaded tin

    NASA Astrophysics Data System (ADS)

    He, Weihua; Xin, Jianting; Zhao, Yongqiang; Chu, Genbai; Xi, Tao; Shui, Min; Lu, Feng; Gu, Yuqiu

    2017-06-01

    This work investigates the geometric statistics method to characterize the size distribution of tin fragments produced in the laser shock-loaded dynamic fragmentation process. In the shock experiments, the ejection of the tin sample with etched V-shape groove in the free surface are collected by the soft recovery technique. Subsequently, the produced fragments are automatically detected with the fine post-shot analysis techniques including the X-ray micro-tomography and the improved watershed method. To characterize the size distributions of the fragments, a theoretical random geometric statistics model based on Poisson mixtures is derived for dynamic heterogeneous fragmentation problem, which reveals linear combinational exponential distribution. The experimental data related to fragment size distributions of the laser shock-loaded tin sample are examined with the proposed theoretical model, and its fitting performance is compared with that of other state-of-the-art fragment size distribution models. The comparison results prove that our proposed model can provide far more reasonable fitting result for the laser shock-loaded tin.

  6. Extracting samples of high diversity from thematic collections of large gene banks using a genetic-distance based approach

    PubMed Central

    2010-01-01

    Background Breeding programs are usually reluctant to evaluate and use germplasm accessions other than the elite materials belonging to their advanced populations. The concept of core collections has been proposed to facilitate the access of potential users to samples of small sizes, representative of the genetic variability contained within the gene pool of a specific crop. The eventual large size of a core collection perpetuates the problem it was originally proposed to solve. The present study suggests that, in addition to the classic core collection concept, thematic core collections should be also developed for a specific crop, composed of a limited number of accessions, with a manageable size. Results The thematic core collection obtained meets the minimum requirements for a core sample - maintenance of at least 80% of the allelic richness of the thematic collection, with, approximately, 15% of its size. The method was compared with other methodologies based on the M strategy, and also with a core collection generated by random sampling. Higher proportions of retained alleles (in a core collection of equal size) or similar proportions of retained alleles (in a core collection of smaller size) were detected in the two methods based on the M strategy compared to the proposed methodology. Core sub-collections constructed by different methods were compared regarding the increase or maintenance of phenotypic diversity. No change on phenotypic diversity was detected by measuring the trait "Weight of 100 Seeds", for the tested sampling methods. Effects on linkage disequilibrium between unlinked microsatellite loci, due to sampling, are discussed. Conclusions Building of a thematic core collection was here defined by prior selection of accessions which are diverse for the trait of interest, and then by pairwise genetic distances, estimated by DNA polymorphism analysis at molecular marker loci. The resulting thematic core collection potentially reflects the maximum allele richness with the smallest sample size from a larger thematic collection. As an example, we used the development of a thematic core collection for drought tolerance in rice. It is expected that such thematic collections increase the use of germplasm by breeding programs and facilitate the study of the traits under consideration. The definition of a core collection to study drought resistance is a valuable contribution towards the understanding of the genetic control and the physiological mechanisms involved in water use efficiency in plants. PMID:20576152

  7. 76 FR 61360 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-04

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... Activities: Proposed Collection; Comment Request; Generic Clearance for the Collection of Qualitative... Collection of Qualitative Feedback on Agency Service Delivery'' to OMB for approval under the Paperwork...

  8. 76 FR 35069 - Agency Information Collection Activities; Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-15

    ... precision requirements or power calculations that justify the proposed sample size, the expected response...; Proposed Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on... (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery...

  9. 76 FR 41280 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-13

    ... proposed sample size, the expected response rate, methods for assessing potential non-response bias, the... Activities: Proposed Collection; Comment Request; Generic Clearance for the Collection of Qualitative...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery '' to OMB for...

  10. 77 FR 52708 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-08-30

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential...: Proposed Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on... Information Collection request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback...

  11. Sample size determination for equivalence assessment with multiple endpoints.

    PubMed

    Sun, Anna; Dong, Xiaoyu; Tsong, Yi

    2014-01-01

    Equivalence assessment between a reference and test treatment is often conducted by two one-sided tests (TOST). The corresponding power function and sample size determination can be derived from a joint distribution of the sample mean and sample variance. When an equivalence trial is designed with multiple endpoints, it often involves several sets of two one-sided tests. A naive approach for sample size determination in this case would select the largest sample size required for each endpoint. However, such a method ignores the correlation among endpoints. With the objective to reject all endpoints and when the endpoints are uncorrelated, the power function is the production of all power functions for individual endpoints. With correlated endpoints, the sample size and power should be adjusted for such a correlation. In this article, we propose the exact power function for the equivalence test with multiple endpoints adjusted for correlation under both crossover and parallel designs. We further discuss the differences in sample size for the naive method without and with correlation adjusted methods and illustrate with an in vivo bioequivalence crossover study with area under the curve (AUC) and maximum concentration (Cmax) as the two endpoints.

  12. A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data

    PubMed Central

    Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming

    2018-01-01

    The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully explored yet due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases, e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size but without losing important information becomes a more and more pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally-guided dictionary learning and sparse coding of whole brain's fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve more than 15 times speed-up without sacrificing the accuracy in identifying task-evoked functional brain networks. PMID:29706880

  13. A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data.

    PubMed

    Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming

    2018-01-01

    The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully explored yet due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases, e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size but without losing important information becomes a more and more pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally-guided dictionary learning and sparse coding of whole brain's fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve more than 15 times speed-up without sacrificing the accuracy in identifying task-evoked functional brain networks.

  14. Using Candy Samples to Learn about Sampling Techniques and Statistical Data Evaluation

    ERIC Educational Resources Information Center

    Canaes, Larissa S.; Brancalion, Marcel L.; Rossi, Adriana V.; Rath, Susanne

    2008-01-01

    A classroom exercise for undergraduate and beginning graduate students that takes about one class period is proposed and discussed. It is an easy, interesting exercise that demonstrates important aspects of sampling techniques (sample amount, particle size, and the representativeness of the sample in relation to the bulk material). The exercise…

  15. Wider-Opening Dewar Flasks for Cryogenic Storage

    NASA Technical Reports Server (NTRS)

    Ruemmele, Warren P.; Manry, John; Stafford, Kristin; Bue, Grant; Krejci, John; Evernden, Bent

    2010-01-01

    Dewar flasks have been proposed as containers for relatively long-term (25 days) storage of perishable scientific samples or other perishable objects at a temperature of 175 C. The refrigeration would be maintained through slow boiling of liquid nitrogen (LN2). For the purposes of the application for which these containers were proposed, (1) the neck openings of commercial off-the-shelf (COTS) Dewar flasks are too small for most NASA samples; (2) the round shapes of the COTS containers give rise to unacceptably low efficiency of packing in rectangular cargo compartments; and (3) the COTS containers include metal structures that are too thermally conductive, such that they cannot, without exceeding size and weight limits, hold enough LN2 for the required long-term-storage. In comparison with COTS Dewar flasks, the proposed containers would be rectangular, yet would satisfy the long-term storage requirement without exceeding size and weight limits; would have larger neck openings; and would have greater sample volumes, leading to a packing efficiency of about double the sample volume as a fraction of total volume. The proposed containers would be made partly of aerospace- type composite materials and would include vacuum walls, multilayer insulation, and aerogel insulation.

  16. A novel measure of effect size for mediation analysis.

    PubMed

    Lachowicz, Mark J; Preacher, Kristopher J; Kelley, Ken

    2018-06-01

    Mediation analysis has become one of the most popular statistical methods in the social sciences. However, many currently available effect size measures for mediation have limitations that restrict their use to specific mediation models. In this article, we develop a measure of effect size that addresses these limitations. We show how modification of a currently existing effect size measure results in a novel effect size measure with many desirable properties. We also derive an expression for the bias of the sample estimator for the proposed effect size measure and propose an adjusted version of the estimator. We present a Monte Carlo simulation study conducted to examine the finite sampling properties of the adjusted and unadjusted estimators, which shows that the adjusted estimator is effective at recovering the true value it estimates. Finally, we demonstrate the use of the effect size measure with an empirical example. We provide freely available software so that researchers can immediately implement the methods we discuss. Our developments here extend the existing literature on effect sizes and mediation by developing a potentially useful method of communicating the magnitude of mediation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  17. Single-arm phase II trial design under parametric cure models.

    PubMed

    Wu, Jianrong

    2015-01-01

    The current practice of designing single-arm phase II survival trials is limited under the exponential model. Trial design under the exponential model may not be appropriate when a portion of patients are cured. There is no literature available for designing single-arm phase II trials under the parametric cure model. In this paper, a test statistic is proposed, and a sample size formula is derived for designing single-arm phase II trials under a class of parametric cure models. Extensive simulations showed that the proposed test and sample size formula perform very well under different scenarios. Copyright © 2015 John Wiley & Sons, Ltd.

  18. Optimally estimating the sample mean from the sample size, median, mid-range, and/or mid-quartile range.

    PubMed

    Luo, Dehui; Wan, Xiang; Liu, Jiming; Tong, Tiejun

    2018-06-01

    The era of big data is coming, and evidence-based medicine is attracting increasing attention to improve decision making in medical practice via integrating evidence from well designed and conducted clinical research. Meta-analysis is a statistical technique widely used in evidence-based medicine for analytically combining the findings from independent clinical trials to provide an overall estimation of a treatment effectiveness. The sample mean and standard deviation are two commonly used statistics in meta-analysis but some trials use the median, the minimum and maximum values, or sometimes the first and third quartiles to report the results. Thus, to pool results in a consistent format, researchers need to transform those information back to the sample mean and standard deviation. In this article, we investigate the optimal estimation of the sample mean for meta-analysis from both theoretical and empirical perspectives. A major drawback in the literature is that the sample size, needless to say its importance, is either ignored or used in a stepwise but somewhat arbitrary manner, e.g. the famous method proposed by Hozo et al. We solve this issue by incorporating the sample size in a smoothly changing weight in the estimators to reach the optimal estimation. Our proposed estimators not only improve the existing ones significantly but also share the same virtue of the simplicity. The real data application indicates that our proposed estimators are capable to serve as "rules of thumb" and will be widely applied in evidence-based medicine.

  19. Measures of precision for dissimilarity-based multivariate analysis of ecological communities

    PubMed Central

    Anderson, Marti J; Santana-Garcon, Julia

    2015-01-01

    Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. PMID:25438826

  20. Exact tests using two correlated binomial variables in contemporary cancer clinical trials.

    PubMed

    Yu, Jihnhee; Kepner, James L; Iyer, Renuka

    2009-12-01

    New therapy strategies for the treatment of cancer are rapidly emerging because of recent technology advances in genetics and molecular biology. Although newer targeted therapies can improve survival without measurable changes in tumor size, clinical trial conduct has remained nearly unchanged. When potentially efficacious therapies are tested, current clinical trial design and analysis methods may not be suitable for detecting therapeutic effects. We propose an exact method with respect to testing cytostatic cancer treatment using correlated bivariate binomial random variables to simultaneously assess two primary outcomes. The method is easy to implement. It does not increase the sample size over that of the univariate exact test and in most cases reduces the sample size required. Sample size calculations are provided for selected designs.

  1. Population size estimation in Yellowstone wolves with error-prone noninvasive microsatellite genotypes.

    PubMed

    Creel, Scott; Spong, Goran; Sands, Jennifer L; Rotella, Jay; Zeigle, Janet; Joe, Lawrence; Murphy, Kerry M; Smith, Douglas

    2003-07-01

    Determining population sizes can be difficult, but is essential for conservation. By counting distinct microsatellite genotypes, DNA from noninvasive samples (hair, faeces) allows estimation of population size. Problems arise because genotypes from noninvasive samples are error-prone, but genotyping errors can be reduced by multiple polymerase chain reaction (PCR). For faecal genotypes from wolves in Yellowstone National Park, error rates varied substantially among samples, often above the 'worst-case threshold' suggested by simulation. Consequently, a substantial proportion of multilocus genotypes held one or more errors, despite multiple PCR. These genotyping errors created several genotypes per individual and caused overestimation (up to 5.5-fold) of population size. We propose a 'matching approach' to eliminate this overestimation bias.

  2. Multi-parameter analysis using photovoltaic cell-based optofluidic cytometer

    PubMed Central

    Yan, Chien-Shun; Wang, Yao-Nan

    2016-01-01

    A multi-parameter optofluidic cytometer based on two low-cost commercial photovoltaic cells and an avalanche photodetector is proposed. The optofluidic cytometer is fabricated on a polydimethylsiloxane (PDMS) substrate and is capable of detecting side scattered (SSC), extinction (EXT) and fluorescence (FL) signals simultaneously using a free-space light transmission technique without the need for on-chip optical waveguides. The feasibility of the proposed device is demonstrated by detecting fluorescent-labeled polystyrene beads with sizes of 3 μm, 5 μm and 10 μm, respectively, and label-free beads with a size of 7.26 μm. The detection experiments are performed using both single-bead population samples and mixed-bead population samples. The detection results obtained using the SSC/EXT, EXT/FL and SSC/FL signals are compared with those obtained using a commercial flow cytometer. It is shown that the optofluidic cytometer achieves a high detection accuracy for both single-bead population samples and mixed-bead population samples. Consequently, the proposed device provides a versatile, straightforward and low-cost solution for a wide variety of point-of-care (PoC) cytometry applications. PMID:27699122

  3. A cautionary note on Bayesian estimation of population size by removal sampling with diffuse priors.

    PubMed

    Bord, Séverine; Bioche, Christèle; Druilhet, Pierre

    2018-05-01

    We consider the problem of estimating a population size by removal sampling when the sampling rate is unknown. Bayesian methods are now widespread and allow to include prior knowledge in the analysis. However, we show that Bayes estimates based on default improper priors lead to improper posteriors or infinite estimates. Similarly, weakly informative priors give unstable estimators that are sensitive to the choice of hyperparameters. By examining the likelihood, we show that population size estimates can be stabilized by penalizing small values of the sampling rate or large value of the population size. Based on theoretical results and simulation studies, we propose some recommendations on the choice of the prior. Then, we applied our results to real datasets. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Critical appraisal of arguments for the delayed-start design proposed as alternative to the parallel-group randomized clinical trial design in the field of rare disease.

    PubMed

    Spineli, Loukia M; Jenz, Eva; Großhennig, Anika; Koch, Armin

    2017-08-17

    A number of papers have proposed or evaluated the delayed-start design as an alternative to the standard two-arm parallel group randomized clinical trial (RCT) design in the field of rare disease. However the discussion is felt to lack a sufficient degree of consideration devoted to the true virtues of the delayed start design and the implications either in terms of required sample-size, overall information, or interpretation of the estimate in the context of small populations. To evaluate whether there are real advantages of the delayed-start design particularly in terms of overall efficacy and sample size requirements as a proposed alternative to the standard parallel group RCT in the field of rare disease. We used a real-life example to compare the delayed-start design with the standard RCT in terms of sample size requirements. Then, based on three scenarios regarding the development of the treatment effect over time, the advantages, limitations and potential costs of the delayed-start design are discussed. We clarify that delayed-start design is not suitable for drugs that establish an immediate treatment effect, but for drugs with effects developing over time, instead. In addition, the sample size will always increase as an implication for a reduced time on placebo resulting in a decreased treatment effect. A number of papers have repeated well-known arguments to justify the delayed-start design as appropriate alternative to the standard parallel group RCT in the field of rare disease and do not discuss the specific needs of research methodology in this field. The main point is that a limited time on placebo will result in an underestimated treatment effect and, in consequence, in larger sample size requirements compared to those expected under a standard parallel-group design. This also impacts on benefit-risk assessment.

  5. Improved ASTM G72 Test Method for Ensuring Adequate Fuel-to-Oxidizer Ratios

    NASA Technical Reports Server (NTRS)

    Juarez, Alfredo; Harper, Susana A.

    2016-01-01

    The ASTM G72/G72M-15 Standard Test Method for Autogenous Ignition Temperature of Liquids and Solids in a High-Pressure Oxygen-Enriched Environment is currently used to evaluate materials for the ignition susceptibility driven by exposure to external heat in an enriched oxygen environment. Testing performed on highly volatile liquids such as cleaning solvents has proven problematic due to inconsistent test results (non-ignitions). Non-ignition results can be misinterpreted as favorable oxygen compatibility, although they are more likely associated with inadequate fuel-to-oxidizer ratios. Forced evaporation during purging and inadequate sample size were identified as two potential causes for inadequate available sample material during testing. In an effort to maintain adequate fuel-to-oxidizer ratios within the reaction vessel during test, several parameters were considered, including sample size, pretest sample chilling, pretest purging, and test pressure. Tests on a variety of solvents exhibiting a range of volatilities are presented in this paper. A proposed improvement to the standard test protocol as a result of this evaluation is also presented. Execution of the final proposed improved test protocol outlines an incremental step method of determining optimal conditions using increased sample sizes while considering test system safety limits. The proposed improved test method increases confidence in results obtained by utilizing the ASTM G72 autogenous ignition temperature test method and can aid in the oxygen compatibility assessment of highly volatile liquids and other conditions that may lead to false non-ignition results.

  6. Crack identification and evolution law in the vibration failure process of loaded coal

    NASA Astrophysics Data System (ADS)

    Li, Chengwu; Ai, Dihao; Sun, Xiaoyuan; Xie, Beijing

    2017-08-01

    To study the characteristics of coal cracks produced in the vibration failure process, we set up a static load and static and dynamic combination load failure test simulation system, prepared with different particle size, formation pressure, and firmness coefficient coal samples. Through static load damage testing of coal samples and then dynamic load (vibration exciter) and static (jack) combination destructive testing, the crack images of coal samples under the load condition were obtained. Combined with digital image processing technology, an algorithm of crack identification with high precision and in real-time is proposed. With the crack features of the coal samples under different load conditions as the research object, we analyzed the distribution of cracks on the surface of the coal samples and the factors influencing crack evolution using the proposed algorithm and a high-resolution industrial camera. Experimental results showed that the major portion of the crack after excitation is located in the rear of the coal sample where the vibration exciter cannot act. Under the same disturbance conditions, crack size and particle size exhibit a positive correlation, while crack size and formation pressure exhibit a negative correlation. Soft coal is more likely to lead to crack evolution than hard coal, and more easily causes instability failure. The experimental results and crack identification algorithm provide a solid basis for the prevention and control of instability and failure of coal and rock mass, and they are helpful in improving the monitoring method of coal and rock dynamic disasters.

  7. Sample Size for Tablet Compression and Capsule Filling Events During Process Validation.

    PubMed

    Charoo, Naseem Ahmad; Durivage, Mark; Rahman, Ziyaur; Ayad, Mohamad Haitham

    2017-12-01

    During solid dosage form manufacturing, the uniformity of dosage units (UDU) is ensured by testing samples at 2 stages, that is, blend stage and tablet compression or capsule/powder filling stage. The aim of this work is to propose a sample size selection approach based on quality risk management principles for process performance qualification (PPQ) and continued process verification (CPV) stages by linking UDU to potential formulation and process risk factors. Bayes success run theorem appeared to be the most appropriate approach among various methods considered in this work for computing sample size for PPQ. The sample sizes for high-risk (reliability level of 99%), medium-risk (reliability level of 95%), and low-risk factors (reliability level of 90%) were estimated to be 299, 59, and 29, respectively. Risk-based assignment of reliability levels was supported by the fact that at low defect rate, the confidence to detect out-of-specification units would decrease which must be supplemented with an increase in sample size to enhance the confidence in estimation. Based on level of knowledge acquired during PPQ and the level of knowledge further required to comprehend process, sample size for CPV was calculated using Bayesian statistics to accomplish reduced sampling design for CPV. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  8. Hierarchical modeling of cluster size in wildlife surveys

    USGS Publications Warehouse

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between delectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).

  9. OLT-centralized sampling frequency offset compensation scheme for OFDM-PON.

    PubMed

    Chen, Ming; Zhou, Hui; Zheng, Zhiwei; Deng, Rui; Chen, Qinghui; Peng, Miao; Liu, Cuiwei; He, Jing; Chen, Lin; Tang, Xionggui

    2017-08-07

    We propose an optical line terminal (OLT)-centralized sampling frequency offset (SFO) compensation scheme for adaptively-modulated OFDM-PON systems. By using the proposed SFO scheme, the phase rotation and inter-symbol interference (ISI) caused by SFOs between OLT and multiple optical network units (ONUs) can be centrally compensated in the OLT, which reduces the complexity of ONUs. Firstly, the optimal fast Fourier transform (FFT) size is identified in the intensity-modulated and direct-detection (IMDD) OFDM system in the presence of SFO. Then, the proposed SFO compensation scheme including phase rotation modulation (PRM) and length-adaptive OFDM frame has been experimentally demonstrated in the downlink transmission of an adaptively modulated optical OFDM with the optimal FFT size. The experimental results show that up to ± 300 ppm SFO can be successfully compensated without introducing any receiver performance penalties.

  10. Neuromuscular dose-response studies: determining sample size.

    PubMed

    Kopman, A F; Lien, C A; Naguib, M

    2011-02-01

    Investigators planning dose-response studies of neuromuscular blockers have rarely used a priori power analysis to determine the minimal sample size their protocols require. Institutional Review Boards and peer-reviewed journals now generally ask for this information. This study outlines a proposed method for meeting these requirements. The slopes of the dose-response relationships of eight neuromuscular blocking agents were determined using regression analysis. These values were substituted for γ in the Hill equation. When this is done, the coefficient of variation (COV) around the mean value of the ED₅₀ for each drug is easily calculated. Using these values, we performed an a priori one-sample two-tailed t-test of the means to determine the required sample size when the allowable error in the ED₅₀ was varied from ±10-20%. The COV averaged 22% (range 15-27%). We used a COV value of 25% in determining the sample size. If the allowable error in finding the mean ED₅₀ is ±15%, a sample size of 24 is needed to achieve a power of 80%. Increasing 'accuracy' beyond this point requires increasing greater sample sizes (e.g. an 'n' of 37 for a ±12% error). On the basis of the results of this retrospective analysis, a total sample size of not less than 24 subjects should be adequate for determining a neuromuscular blocking drug's clinical potency with a reasonable degree of assurance.

  11. Ultrafast image-based dynamic light scattering for nanoparticle sizing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Wu; Zhang, Jie; Liu, Lili

    An ultrafast sizing method for nanoparticles is proposed, called as UIDLS (Ultrafast Image-based Dynamic Light Scattering). This method makes use of the intensity fluctuation of scattered light from nanoparticles in Brownian motion, which is similar to the conventional DLS method. The difference in the experimental system is that the scattered light by nanoparticles is received by an image sensor instead of a photomultiplier tube. A novel data processing algorithm is proposed to directly get correlation coefficient between two images at a certain time interval (from microseconds to milliseconds) by employing a two-dimensional image correlation algorithm. This coefficient has been provedmore » to be a monotonic function of the particle diameter. Samples of standard latex particles (79/100/352/482/948 nm) were measured for validation of the proposed method. The measurement accuracy of higher than 90% was found with standard deviations less than 3%. A sample of nanosilver particle with nominal size of 20 ± 2 nm and a sample of polymethyl methacrylate emulsion with unknown size were also tested using UIDLS method. The measured results were 23.2 ± 3.0 nm and 246.1 ± 6.3 nm, respectively, which is substantially consistent with the transmission electron microscope results. Since the time for acquisition of two successive images has been reduced to less than 1 ms and the data processing time in about 10 ms, the total measuring time can be dramatically reduced from hundreds seconds to tens of milliseconds, which provides the potential for real-time and in situ nanoparticle sizing.« less

  12. Proposal of ultrasonic-assisted mid-infrared spectroscopy for incorporating into daily life like smart-toilet and non-invasive blood glucose sensor

    NASA Astrophysics Data System (ADS)

    Kitazaki, Tomoya; Mori, Keita; Yamamoto, Naoyuki; Wang, Congtao; Kawashima, Natsumi; Ishimaru, Ichiro

    2017-07-01

    We proposed the extremely compact beans-size snap-shot mid-infrared spectroscopy that will be able to be built in smartphones. And also the easy preparation method of thin-film samples generated by ultrasonic standing wave is proposed. Mid-infrared spectroscopy is able to identify material components and estimate component concentrations quantitatively from absorption spectra. But conventional spectral instruments were very large-size and too expensive to incorporate into daily life. And preparations of thin-film sample were very troublesome task. Because water absorption in mid-infrared lights is very strong, moisture-containing-sample thickness should be less than 100[μm]. Thus, midinfrared spectroscopy has been utilized only by analytical experts in their laboratories. Because ultrasonic standing wave is compressional wave, we can generate periodical refractive-index distributions inside of samples. A high refractiveindex plane is correspond to a reflection boundary. When we use a several MHz ultrasonic transducer, the distance between sample surface and generated first node become to be several ten μm. Thus, the double path of this distance is correspond to sample thickness. By combining these two proposed methods, as for liquid samples, urinary albumin and glucose concentrations will be able to be measured inside of toilet. And as for solid samples, by attaching these apparatus to earlobes, the enhancement of reflection lights from near skin surface will create a new path to realize the non-invasive blood glucose sensor. Using the small ultrasonic-transducer whose diameter was 10[mm] and applied voltage 8[V], we detected the internal reflection lights from colored water as liquid sample and acrylic board as solid sample.

  13. A modified Wald interval for the area under the ROC curve (AUC) in diagnostic case-control studies

    PubMed Central

    2014-01-01

    Background The area under the receiver operating characteristic (ROC) curve, referred to as the AUC, is an appropriate measure for describing the overall accuracy of a diagnostic test or a biomarker in early phase trials without having to choose a threshold. There are many approaches for estimating the confidence interval for the AUC. However, all are relatively complicated to implement. Furthermore, many approaches perform poorly for large AUC values or small sample sizes. Methods The AUC is actually a probability. So we propose a modified Wald interval for a single proportion, which can be calculated on a pocket calculator. We performed a simulation study to compare this modified Wald interval (without and with continuity correction) with other intervals regarding coverage probability and statistical power. Results The main result is that the proposed modified Wald intervals maintain and exploit the type I error much better than the intervals of Agresti-Coull, Wilson, and Clopper-Pearson. The interval suggested by Bamber, the Mann-Whitney interval without transformation and also the interval of the binormal AUC are very liberal. For small sample sizes the Wald interval with continuity has a comparable coverage probability as the LT interval and higher power. For large sample sizes the results of the LT interval and of the Wald interval without continuity correction are comparable. Conclusions If individual patient data is not available, but only the estimated AUC and the total sample size, the modified Wald intervals can be recommended as confidence intervals for the AUC. For small sample sizes the continuity correction should be used. PMID:24552686

  14. A modified Wald interval for the area under the ROC curve (AUC) in diagnostic case-control studies.

    PubMed

    Kottas, Martina; Kuss, Oliver; Zapf, Antonia

    2014-02-19

    The area under the receiver operating characteristic (ROC) curve, referred to as the AUC, is an appropriate measure for describing the overall accuracy of a diagnostic test or a biomarker in early phase trials without having to choose a threshold. There are many approaches for estimating the confidence interval for the AUC. However, all are relatively complicated to implement. Furthermore, many approaches perform poorly for large AUC values or small sample sizes. The AUC is actually a probability. So we propose a modified Wald interval for a single proportion, which can be calculated on a pocket calculator. We performed a simulation study to compare this modified Wald interval (without and with continuity correction) with other intervals regarding coverage probability and statistical power. The main result is that the proposed modified Wald intervals maintain and exploit the type I error much better than the intervals of Agresti-Coull, Wilson, and Clopper-Pearson. The interval suggested by Bamber, the Mann-Whitney interval without transformation and also the interval of the binormal AUC are very liberal. For small sample sizes the Wald interval with continuity has a comparable coverage probability as the LT interval and higher power. For large sample sizes the results of the LT interval and of the Wald interval without continuity correction are comparable. If individual patient data is not available, but only the estimated AUC and the total sample size, the modified Wald intervals can be recommended as confidence intervals for the AUC. For small sample sizes the continuity correction should be used.

  15. Comparison of Sample Size by Bootstrap and by Formulas Based on Normal Distribution Assumption.

    PubMed

    Wang, Zuozhen

    2018-01-01

    Bootstrapping technique is distribution-independent, which provides an indirect way to estimate the sample size for a clinical trial based on a relatively smaller sample. In this paper, sample size estimation to compare two parallel-design arms for continuous data by bootstrap procedure are presented for various test types (inequality, non-inferiority, superiority, and equivalence), respectively. Meanwhile, sample size calculation by mathematical formulas (normal distribution assumption) for the identical data are also carried out. Consequently, power difference between the two calculation methods is acceptably small for all the test types. It shows that the bootstrap procedure is a credible technique for sample size estimation. After that, we compared the powers determined using the two methods based on data that violate the normal distribution assumption. To accommodate the feature of the data, the nonparametric statistical method of Wilcoxon test was applied to compare the two groups in the data during the process of bootstrap power estimation. As a result, the power estimated by normal distribution-based formula is far larger than that by bootstrap for each specific sample size per group. Hence, for this type of data, it is preferable that the bootstrap method be applied for sample size calculation at the beginning, and that the same statistical method as used in the subsequent statistical analysis is employed for each bootstrap sample during the course of bootstrap sample size estimation, provided there is historical true data available that can be well representative of the population to which the proposed trial is planning to extrapolate.

  16. Variable aperture-based ptychographical iterative engine method

    NASA Astrophysics Data System (ADS)

    Sun, Aihui; Kong, Yan; Meng, Xin; He, Xiaoliang; Du, Ruijun; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng

    2018-02-01

    A variable aperture-based ptychographical iterative engine (vaPIE) is demonstrated both numerically and experimentally to reconstruct the sample phase and amplitude rapidly. By adjusting the size of a tiny aperture under the illumination of a parallel light beam to change the illumination on the sample step by step and recording the corresponding diffraction patterns sequentially, both the sample phase and amplitude can be faithfully reconstructed with a modified ptychographical iterative engine (PIE) algorithm. Since many fewer diffraction patterns are required than in common PIE and the shape, the size, and the position of the aperture need not to be known exactly, this proposed vaPIE method remarkably reduces the data acquisition time and makes PIE less dependent on the mechanical accuracy of the translation stage; therefore, the proposed technique can be potentially applied for various scientific researches.

  17. A Generalized Least Squares Regression Approach for Computing Effect Sizes in Single-Case Research: Application Examples

    ERIC Educational Resources Information Center

    Maggin, Daniel M.; Swaminathan, Hariharan; Rogers, Helen J.; O'Keeffe, Breda V.; Sugai, George; Horner, Robert H.

    2011-01-01

    A new method for deriving effect sizes from single-case designs is proposed. The strategy is applicable to small-sample time-series data with autoregressive errors. The method uses Generalized Least Squares (GLS) to model the autocorrelation of the data and estimate regression parameters to produce an effect size that represents the magnitude of…

  18. A Bayesian nonparametric method for prediction in EST analysis

    PubMed Central

    Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor

    2007-01-01

    Background Expressed sequence tags (ESTs) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445

  19. Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.

    PubMed

    You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary

    2011-02-01

    The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the numbers of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using t-test, and use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for an unequal cluster sizes trial to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that we have defined a relative efficiency that is greater than the relative efficiency in the literature under some conditions. Our measure of relative efficiency might be less than the measure in the literature under some conditions, underestimating the relative efficiency. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardiner, W.W.; Barrows, E.S.; Antrim, L.D

    Buttermilk Channel was one of seven waterways that was sampled and evaluated for dredging and sediment disposal. Sediment samples were collected and analyses were conducted on sediment core samples. The evaluation of proposed dredged material from the channel included bulk sediment chemical analyses, chemical analyses of site water and elutriate, water column and benthic acute toxicity tests, and bioaccumulation studies. Individual sediment core samples were analyzed for grain size, moisture content, and total organic carbon. A composite sediment samples, representing the entire area proposed for dredging, was analyzed for bulk density, polynuclear aromatic hydrocarbons, and 1,4-dichlorobenzene. Site water and elutriatemore » were analyzed for metals, pesticides, and PCBs.« less

  1. Measures of precision for dissimilarity-based multivariate analysis of ecological communities.

    PubMed

    Anderson, Marti J; Santana-Garcon, Julia

    2015-01-01

    Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. © 2014 The Authors. Ecology Letters published by John Wiley & Sons Ltd and CNRS.

  2. What is an adequate sample size? Operationalising data saturation for theory-based interview studies.

    PubMed

    Francis, Jill J; Johnston, Marie; Robertson, Clare; Glidewell, Liz; Entwistle, Vikki; Eccles, Martin P; Grimshaw, Jeremy M

    2010-12-01

    In interview studies, sample size is often justified by interviewing participants until reaching 'data saturation'. However, there is no agreed method of establishing this. We propose principles for deciding saturation in theory-based interview studies (where conceptual categories are pre-established by existing theory). First, specify a minimum sample size for initial analysis (initial analysis sample). Second, specify how many more interviews will be conducted without new ideas emerging (stopping criterion). We demonstrate these principles in two studies, based on the theory of planned behaviour, designed to identify three belief categories (Behavioural, Normative and Control), using an initial analysis sample of 10 and stopping criterion of 3. Study 1 (retrospective analysis of existing data) identified 84 shared beliefs of 14 general medical practitioners about managing patients with sore throat without prescribing antibiotics. The criterion for saturation was achieved for Normative beliefs but not for other beliefs or studywise saturation. In Study 2 (prospective analysis), 17 relatives of people with Paget's disease of the bone reported 44 shared beliefs about taking genetic testing. Studywise data saturation was achieved at interview 17. We propose specification of these principles for reporting data saturation in theory-based interview studies. The principles may be adaptable for other types of studies.

  3. A New Model of Size-graded Soil Veneer on the Lunar Surface

    NASA Technical Reports Server (NTRS)

    Basu, Abhijit; McKay, David S.

    2005-01-01

    Introduction. We propose a new model of distribution of submillimeter sized lunar soil grains on the lunar surface. We propose that in the uppermost millimeter or two of the lunar surface, soil-grains are size graded with the finest nanoscale dust on top and larger micron-scale particles below. This standard state is perturbed by ejecta deposition of larger grains at the lunar surface, which have a coating of dusty layer that may not have substrates of intermediate sizes. Distribution of solar wind elements (SWE), agglutinates, vapor deposited nanophase Fe0 in size fractions of lunar soils and ir spectra of size fractions of lunar soils are compatible with this model. A direct test of this model requires bringing back glue-impregnated tubes of lunar soil samples to be dissected and examined on Earth.

  4. Unfolding sphere size distributions with a density estimator based on Tikhonov regularization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Weese, J.; Korat, E.; Maier, D.

    1997-12-01

    This report proposes a method for unfolding sphere size distributions given a sample of radii that combines the advantages of a density estimator with those of Tikhonov regularization methods. The following topics are discusses in this report to achieve this method: the relation between the profile and the sphere size distribution; the method for unfolding sphere size distributions; the results based on simulations; and the experimental data comparison.

  5. A more powerful test based on ratio distribution for retention noninferiority hypothesis.

    PubMed

    Deng, Ling; Chen, Gang

    2013-03-11

    Rothmann et al. ( 2003 ) proposed a method for the statistical inference of fraction retention noninferiority (NI) hypothesis. A fraction retention hypothesis is defined as a ratio of the new treatment effect verse the control effect in the context of a time to event endpoint. One of the major concerns using this method in the design of an NI trial is that with a limited sample size, the power of the study is usually very low. This makes an NI trial not applicable particularly when using time to event endpoint. To improve power, Wang et al. ( 2006 ) proposed a ratio test based on asymptotic normality theory. Under a strong assumption (equal variance of the NI test statistic under null and alternative hypotheses), the sample size using Wang's test was much smaller than that using Rothmann's test. However, in practice, the assumption of equal variance is generally questionable for an NI trial design. This assumption is removed in the ratio test proposed in this article, which is derived directly from a Cauchy-like ratio distribution. In addition, using this method, the fundamental assumption used in Rothmann's test, that the observed control effect is always positive, that is, the observed hazard ratio for placebo over the control is greater than 1, is no longer necessary. Without assuming equal variance under null and alternative hypotheses, the sample size required for an NI trial can be significantly reduced if using the proposed ratio test for a fraction retention NI hypothesis.

  6. A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.

    PubMed

    Tang, Liansheng Larry; Balakrishnan, N

    2011-01-01

    The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.

  7. A Proposed Approach for Joint Modeling of the Longitudinal and Time-To-Event Data in Heterogeneous Populations: An Application to HIV/AIDS's Disease.

    PubMed

    Roustaei, Narges; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf

    2018-01-01

    In recent years, the joint models have been widely used for modeling the longitudinal and time-to-event data simultaneously. In this study, we proposed an approach (PA) to study the longitudinal and survival outcomes simultaneously in heterogeneous populations. PA relaxes the assumption of conditional independence (CI). We also compared PA with joint latent class model (JLCM) and separate approach (SA) for various sample sizes (150, 300, and 600) and different association parameters (0, 0.2, and 0.5). The average bias of parameters estimation (AB-PE), average SE of parameters estimation (ASE-PE), and coverage probability of the 95% confidence interval (CP) among the three approaches were compared. In most cases, when the sample sizes increased, AB-PE and ASE-PE decreased for the three approaches, and CP got closer to the nominal level of 0.95. When there was a considerable association, PA in comparison with SA and JLCM performed better in the sense that PA had the smallest AB-PE and ASE-PE for the longitudinal submodel among the three approaches for the small and moderate sample sizes. Moreover, JLCM was desirable for the none-association and the large sample size. Finally, the evaluated approaches were applied on a real HIV/AIDS dataset for validation, and the results were compared.

  8. 77 FR 70780 - Agency Forms Undergoing Paperwork Reduction Act Review

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-27

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... notice. Proposed Project Generic Clearance for the Collection of Qualitative Feedback on Agency Service... Collection of Qualitative Feedback on Agency Service Delivery'' to OMB for approval under the Paperwork...

  9. 78 FR 40729 - Agency Information Collection Activities; Proposed Collection; Comment Request: Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-08

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Qualitative Feedback on Agency Service Delivery AGENCY: Washington Headquarters Service (WHS), DOD. ACTION: 30... (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery...

  10. 76 FR 17861 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-31

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods...; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery... Information Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback...

  11. 76 FR 24920 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-03

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service... Information Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback...

  12. Design considerations for case series models with exposure onset measurement error.

    PubMed

    Mohammed, Sandra M; Dalrymple, Lorien S; Sentürk, Damla; Nguyen, Danh V

    2013-02-28

    The case series model allows for estimation of the relative incidence of events, such as cardiovascular events, within a pre-specified time window after an exposure, such as an infection. The method requires only cases (individuals with events) and controls for all fixed/time-invariant confounders. The measurement error case series model extends the original case series model to handle imperfect data, where the timing of an infection (exposure) is not known precisely. In this work, we propose a method for power/sample size determination for the measurement error case series model. Extensive simulation studies are used to assess the accuracy of the proposed sample size formulas. We also examine the magnitude of the relative loss of power due to exposure onset measurement error, compared with the ideal situation where the time of exposure is measured precisely. To facilitate the design of case series studies, we provide publicly available web-based tools for determining power/sample size for both the measurement error case series model as well as the standard case series model. Copyright © 2012 John Wiley & Sons, Ltd.

  13. Setting health research priorities using the CHNRI method: VI. Quantitative properties of human collective opinion

    PubMed Central

    Yoshida, Sachiyo; Rudan, Igor; Cousens, Simon

    2016-01-01

    Introduction Crowdsourcing has become an increasingly important tool to address many problems – from government elections in democracies, stock market prices, to modern online tools such as TripAdvisor or Internet Movie Database (IMDB). The CHNRI method (the acronym for the Child Health and Nutrition Research Initiative) for setting health research priorities has crowdsourcing as the major component, which it uses to generate, assess and prioritize between many competing health research ideas. Methods We conducted a series of analyses using data from a group of 91 scorers to explore the quantitative properties of their collective opinion. We were interested in the stability of their collective opinion as the sample size increases from 15 to 90. From a pool of 91 scorers who took part in a previous CHNRI exercise, we used sampling with replacement to generate multiple random samples of different size. First, for each sample generated, we identified the top 20 ranked research ideas, among 205 that were proposed and scored, and calculated the concordance with the ranking generated by the 91 original scorers. Second, we used rank correlation coefficients to compare the ranks assigned to all 205 proposed research ideas when samples of different size are used. We also analysed the original pool of 91 scorers to to look for evidence of scoring variations based on scorers' characteristics. Results The sample sizes investigated ranged from 15 to 90. The concordance for the top 20 scored research ideas increased with sample sizes up to about 55 experts. At this point, the median level of concordance stabilized at 15/20 top ranked questions (75%), with the interquartile range also generally stable (14–16). There was little further increase in overlap when the sample size increased from 55 to 90. When analysing the ranking of all 205 ideas, the rank correlation coefficient increased as the sample size increased, with a median correlation of 0.95 reached at the sample size of 45 experts (median of the rank correlation coefficient = 0.95; IQR 0.94–0.96). Conclusions Our analyses suggest that the collective opinion of an expert group on a large number of research ideas, expressed through categorical variables (Yes/No/Not Sure/Don't know), stabilises relatively quickly in terms of identifying the ideas that have most support. In the exercise we found a high degree of reproducibility of the identified research priorities was achieved with as few as 45–55 experts. PMID:27350874

  14. Setting health research priorities using the CHNRI method: VI. Quantitative properties of human collective opinion.

    PubMed

    Yoshida, Sachiyo; Rudan, Igor; Cousens, Simon

    2016-06-01

    Crowdsourcing has become an increasingly important tool to address many problems - from government elections in democracies, stock market prices, to modern online tools such as TripAdvisor or Internet Movie Database (IMDB). The CHNRI method (the acronym for the Child Health and Nutrition Research Initiative) for setting health research priorities has crowdsourcing as the major component, which it uses to generate, assess and prioritize between many competing health research ideas. We conducted a series of analyses using data from a group of 91 scorers to explore the quantitative properties of their collective opinion. We were interested in the stability of their collective opinion as the sample size increases from 15 to 90. From a pool of 91 scorers who took part in a previous CHNRI exercise, we used sampling with replacement to generate multiple random samples of different size. First, for each sample generated, we identified the top 20 ranked research ideas, among 205 that were proposed and scored, and calculated the concordance with the ranking generated by the 91 original scorers. Second, we used rank correlation coefficients to compare the ranks assigned to all 205 proposed research ideas when samples of different size are used. We also analysed the original pool of 91 scorers to to look for evidence of scoring variations based on scorers' characteristics. The sample sizes investigated ranged from 15 to 90. The concordance for the top 20 scored research ideas increased with sample sizes up to about 55 experts. At this point, the median level of concordance stabilized at 15/20 top ranked questions (75%), with the interquartile range also generally stable (14-16). There was little further increase in overlap when the sample size increased from 55 to 90. When analysing the ranking of all 205 ideas, the rank correlation coefficient increased as the sample size increased, with a median correlation of 0.95 reached at the sample size of 45 experts (median of the rank correlation coefficient = 0.95; IQR 0.94-0.96). Our analyses suggest that the collective opinion of an expert group on a large number of research ideas, expressed through categorical variables (Yes/No/Not Sure/Don't know), stabilises relatively quickly in terms of identifying the ideas that have most support. In the exercise we found a high degree of reproducibility of the identified research priorities was achieved with as few as 45-55 experts.

  15. Super-resolution imaging based on the temperature-dependent electron-phonon collision frequency effect of metal thin films

    NASA Astrophysics Data System (ADS)

    Ding, Chenliang; Wei, Jingsong; Xiao, Mufei

    2018-05-01

    We herein propose a far-field super-resolution imaging with metal thin films based on the temperature-dependent electron-phonon collision frequency effect. In the proposed method, neither fluorescence labeling nor any special properties are required for the samples. The 100 nm lands and 200 nm grooves on the Blu-ray disk substrates were clearly resolved and imaged through a laser scanning microscope of wavelength 405 nm. The spot size was approximately 0.80 μm , and the imaging resolution of 1/8 of the laser spot size was experimentally obtained. This work can be applied to the far-field super-resolution imaging of samples with neither fluorescence labeling nor any special properties.

  16. Multi-view L2-SVM and its multi-view core vector machine.

    PubMed

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Variable aperture-based ptychographical iterative engine method.

    PubMed

    Sun, Aihui; Kong, Yan; Meng, Xin; He, Xiaoliang; Du, Ruijun; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng

    2018-02-01

    A variable aperture-based ptychographical iterative engine (vaPIE) is demonstrated both numerically and experimentally to reconstruct the sample phase and amplitude rapidly. By adjusting the size of a tiny aperture under the illumination of a parallel light beam to change the illumination on the sample step by step and recording the corresponding diffraction patterns sequentially, both the sample phase and amplitude can be faithfully reconstructed with a modified ptychographical iterative engine (PIE) algorithm. Since many fewer diffraction patterns are required than in common PIE and the shape, the size, and the position of the aperture need not to be known exactly, this proposed vaPIE method remarkably reduces the data acquisition time and makes PIE less dependent on the mechanical accuracy of the translation stage; therefore, the proposed technique can be potentially applied for various scientific researches. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  18. A Monte-Carlo simulation analysis for evaluating the severity distribution functions (SDFs) calibration methodology and determining the minimum sample-size requirements.

    PubMed

    Shirazi, Mohammadali; Reddy Geedipally, Srinivas; Lord, Dominique

    2017-01-01

    Severity distribution functions (SDFs) are used in highway safety to estimate the severity of crashes and conduct different types of safety evaluations and analyses. Developing a new SDF is a difficult task and demands significant time and resources. To simplify the process, the Highway Safety Manual (HSM) has started to document SDF models for different types of facilities. As such, SDF models have recently been introduced for freeway and ramps in HSM addendum. However, since these functions or models are fitted and validated using data from a few selected number of states, they are required to be calibrated to the local conditions when applied to a new jurisdiction. The HSM provides a methodology to calibrate the models through a scalar calibration factor. However, the proposed methodology to calibrate SDFs was never validated through research. Furthermore, there are no concrete guidelines to select a reliable sample size. Using extensive simulation, this paper documents an analysis that examined the bias between the 'true' and 'estimated' calibration factors. It was indicated that as the value of the true calibration factor deviates further away from '1', more bias is observed between the 'true' and 'estimated' calibration factors. In addition, simulation studies were performed to determine the calibration sample size for various conditions. It was found that, as the average of the coefficient of variation (CV) of the 'KAB' and 'C' crashes increases, the analyst needs to collect a larger sample size to calibrate SDF models. Taking this observation into account, sample-size guidelines are proposed based on the average CV of crash severities that are used for the calibration process. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Joint inversion of NMR and SIP data to estimate pore size distribution of geomaterials

    NASA Astrophysics Data System (ADS)

    Niu, Qifei; Zhang, Chi

    2018-03-01

    There are growing interests in using geophysical tools to characterize the microstructure of geomaterials because of the non-invasive nature and the applicability in field. In these applications, multiple types of geophysical data sets are usually processed separately, which may be inadequate to constrain the key feature of target variables. Therefore, simultaneous processing of multiple data sets could potentially improve the resolution. In this study, we propose a method to estimate pore size distribution by joint inversion of nuclear magnetic resonance (NMR) T2 relaxation and spectral induced polarization (SIP) spectra. The petrophysical relation between NMR T2 relaxation time and SIP relaxation time is incorporated in a nonlinear least squares problem formulation, which is solved using Gauss-Newton method. The joint inversion scheme is applied to a synthetic sample and a Berea sandstone sample. The jointly estimated pore size distributions are very close to the true model and results from other experimental method. Even when the knowledge of the petrophysical models of the sample is incomplete, the joint inversion can still capture the main features of the pore size distribution of the samples, including the general shape and relative peak positions of the distribution curves. It is also found from the numerical example that the surface relaxivity of the sample could be extracted with the joint inversion of NMR and SIP data if the diffusion coefficient of the ions in the electrical double layer is known. Comparing to individual inversions, the joint inversion could improve the resolution of the estimated pore size distribution because of the addition of extra data sets. The proposed approach might constitute a first step towards a comprehensive joint inversion that can extract the full pore geometry information of a geomaterial from NMR and SIP data.

  20. 76 FR 23536 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-27

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Qualitative Feedback on Agency Service Delivery April 22, 2011. AGENCY: Department of Agriculture (USDA... Qualitative Feedback on Agency Service Delivery'' to OMB for approval under the Paperwork Reduction Act (PRA...

  1. 76 FR 25693 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-05

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service... Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency...

  2. 76 FR 79702 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-22

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... Qualitative Feedback on Agency Service Delivery AGENCY: National Institute of Mental Health (NIMH), HHS... Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency...

  3. 76 FR 13977 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-15

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... of Qualitative Feedback on Agency Service Delivery AGENCY: Office of the Secretary/Office of the...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery'' to OMB for...

  4. 77 FR 72361 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-05

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... Qualitative Feedback on Agency Service Delivery SUMMARY: As part of a Federal Government-wide effort to... Information Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback...

  5. 76 FR 15027 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-18

    ... clustering), the precision requirements or power calculations that justify the proposed sample size, the...; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery... Clearance for the Collection of Qualitative Feedback on Agency Service Delivery'' to OMB for approval under...

  6. 76 FR 19826 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-08

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY... (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery...

  7. 76 FR 10939 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-28

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... of Qualitative Feedback on Agency Service Delivery AGENCY: Federal Railroad Administration (FRA... Qualitative Feedback on Agency Service Delivery'' to OMB for approval under the Paperwork Reduction Act (PRA...

  8. 78 FR 44099 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-23

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service... Qualitative Feedback on Agency Service Delivery'' for approval under the Paperwork Reduction Act (PRA) (44 U.S...

  9. 75 FR 80542 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-22

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential...; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery... Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency...

  10. 76 FR 31383 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-31

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods...; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY: Peace... Qualitative Feedback on Agency Service Delivery '' to OMB for approval under the Paperwork Reduction Act (PRA...

  11. 78 FR 23755 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-04-22

    ... justify the proposed sample size, the expected response rate, methods for assessing potential non-response... Collection; Comment Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service... Qualitative Feedback on Agency Service Delivery '' for approval under the Paperwork Reduction Act (PRA) (44 U...

  12. 76 FR 55398 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-07

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Qualitative Feedback on Agency Service Delivery AGENCY: National Institutes of Health, Eunice Kennedy Shriver...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery '' to OMB for...

  13. 76 FR 44938 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-27

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... Qualitative Feedback on Agency Service Delivery: National Cancer Center (NCI) ACTION: 30-Day notice of... Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency...

  14. 76 FR 20967 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-14

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY: U... Clearance for the Collection of Qualitative Feedback on Agency Service Delivery'' to OMB for approval under...

  15. 76 FR 38355 - Agency Information Collection Activities: Proposed Collection; Comment Request; Generic Clearance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-30

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... of Qualitative Feedback on Agency Service Delivery AGENCY: Architectural and Transportation Barriers...: ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery'' to the Office of...

  16. 76 FR 22920 - Agency Information Collection Activities: Proposed Collection; Comment Request; DOL Generic...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-25

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Collection; Comment Request; DOL Generic Clearance for the Collection of Qualitative Feedback on Agency... of Qualitative Feedback on Agency Service Delivery'' to the Office of Management and Budget (OMB) for...

  17. Sequential ensemble-based optimal design for parameter estimation: SEQUENTIAL ENSEMBLE-BASED OPTIMAL DESIGN

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Man, Jun; Zhang, Jiangjiang; Li, Weixuan

    2016-10-01

    The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees ofmore » freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, larger ensemble size improves the parameter estimation and convergence of optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied in any other hydrological problems.« less

  18. Application of binomial and multinomial probability statistics to the sampling design process of a global grain tracing and recall system

    USDA-ARS?s Scientific Manuscript database

    Small, coded, pill-sized tracers embedded in grain are proposed as a method for grain traceability. A sampling process for a grain traceability system was designed and investigated by applying probability statistics using a science-based sampling approach to collect an adequate number of tracers fo...

  19. Inferring the demographic history from DNA sequences: An importance sampling approach based on non-homogeneous processes.

    PubMed

    Ait Kaci Azzou, S; Larribe, F; Froda, S

    2016-10-01

    In Ait Kaci Azzou et al. (2015) we introduced an Importance Sampling (IS) approach for estimating the demographic history of a sample of DNA sequences, the skywis plot. More precisely, we proposed a new nonparametric estimate of a population size that changes over time. We showed on simulated data that the skywis plot can work well in typical situations where the effective population size does not undergo very steep changes. In this paper, we introduce an iterative procedure which extends the previous method and gives good estimates under such rapid variations. In the iterative calibrated skywis plot we approximate the effective population size by a piecewise constant function, whose values are re-estimated at each step. These piecewise constant functions are used to generate the waiting times of non homogeneous Poisson processes related to a coalescent process with mutation under a variable population size model. Moreover, the present IS procedure is based on a modified version of the Stephens and Donnelly (2000) proposal distribution. Finally, we apply the iterative calibrated skywis plot method to a simulated data set from a rapidly expanding exponential model, and we show that the method based on this new IS strategy correctly reconstructs the demographic history. Copyright © 2016. Published by Elsevier Inc.

  20. Blinded sample size re-estimation in three-arm trials with 'gold standard' design.

    PubMed

    Mütze, Tobias; Friede, Tim

    2017-10-15

    In this article, we study blinded sample size re-estimation in the 'gold standard' design with internal pilot study for normally distributed outcomes. The 'gold standard' design is a three-arm clinical trial design that includes an active and a placebo control in addition to an experimental treatment. We focus on the absolute margin approach to hypothesis testing in three-arm trials at which the non-inferiority of the experimental treatment and the assay sensitivity are assessed by pairwise comparisons. We compare several blinded sample size re-estimation procedures in a simulation study assessing operating characteristics including power and type I error. We find that sample size re-estimation based on the popular one-sample variance estimator results in overpowered trials. Moreover, sample size re-estimation based on unbiased variance estimators such as the Xing-Ganju variance estimator results in underpowered trials, as it is expected because an overestimation of the variance and thus the sample size is in general required for the re-estimation procedure to eventually meet the target power. To overcome this problem, we propose an inflation factor for the sample size re-estimation with the Xing-Ganju variance estimator and show that this approach results in adequately powered trials. Because of favorable features of the Xing-Ganju variance estimator such as unbiasedness and a distribution independent of the group means, the inflation factor does not depend on the nuisance parameter and, therefore, can be calculated prior to a trial. Moreover, we prove that the sample size re-estimation based on the Xing-Ganju variance estimator does not bias the effect estimate. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  1. Sample Size Estimation in Cluster Randomized Educational Trials: An Empirical Bayes Approach

    ERIC Educational Resources Information Center

    Rotondi, Michael A.; Donner, Allan

    2009-01-01

    The educational field has now accumulated an extensive literature reporting on values of the intraclass correlation coefficient, a parameter essential to determining the required size of a planned cluster randomized trial. We propose here a simple simulation-based approach including all relevant information that can facilitate this task. An…

  2. A Monte-Carlo method which is not based on Markov chain algorithm, used to study electrostatic screening of ion potential

    NASA Astrophysics Data System (ADS)

    Šantić, Branko; Gracin, Davor

    2017-12-01

    A new simple Monte Carlo method is introduced for the study of electrostatic screening by surrounding ions. The proposed method is not based on the generally used Markov chain method for sample generation. Each sample is pristine and there is no correlation with other samples. As the main novelty, the pairs of ions are gradually added to a sample provided that the energy of each ion is within the boundaries determined by the temperature and the size of ions. The proposed method provides reliable results, as demonstrated by the screening of ion in plasma and in water.

  3. Approximated affine projection algorithm for feedback cancellation in hearing aids.

    PubMed

    Lee, Sangmin; Kim, In-Young; Park, Young-Cheol

    2007-09-01

    We propose an approximated affine projection (AP) algorithm for feedback cancellation in hearing aids. It is based on the conventional approach using the Gauss-Seidel (GS) iteration, but provides more stable convergence behaviour even with small step sizes. In the proposed algorithm, a residue of the weighted error vector, instead of the current error sample, is used to provide stable convergence. A new learning rate control scheme is also applied to the proposed algorithm to prevent signal cancellation and system instability. The new scheme determines step size in proportion to the prediction factor of the input, so that adaptation is inhibited whenever tone-like signals are present in the input. Simulation results verified the efficiency of the proposed algorithm.

  4. Does size really matter? A multisite study assessing the latent structure of the proposed ICD-11 and DSM-5 diagnostic criteria for PTSD

    PubMed Central

    Hansen, Maj; Hyland, Philip; Karstoft, Karen-Inge; Vaegter, Henrik B.; Bramsen, Rikke H.; Nielsen, Anni B. S.; Armour, Cherie; Andersen, Søren B.; Høybye, Mette Terp; Larsen, Simone Kongshøj; Andersen, Tonny E.

    2017-01-01

    ABSTRACT Background: Researchers and clinicians within the field of trauma have to choose between different diagnostic descriptions of posttraumatic stress disorder (PTSD) in the DSM-5 and the proposed ICD-11. Several studies support different competing models of the PTSD structure according to both diagnostic systems; however, findings show that the choice of diagnostic systems can affect the estimated prevalence rates. Objectives: The present study aimed to investigate the potential impact of using a large (i.e. the DSM-5) compared to a small (i.e. the ICD-11) diagnostic description of PTSD. In other words, does the size of PTSD really matter? Methods: The aim was investigated by examining differences in diagnostic rates between the two diagnostic systems and independently examining the model fit of the competing DSM-5 and ICD-11 models of PTSD across three trauma samples: university students (N = 4213), chronic pain patients (N = 573), and military personnel (N = 118). Results: Diagnostic rates of PTSD were significantly lower according to the proposed ICD-11 criteria in the university sample, but no significant differences were found for chronic pain patients and military personnel. The proposed ICD-11 three-factor model provided the best fit of the tested ICD-11 models across all samples, whereas the DSM-5 seven-factor Hybrid model provided the best fit in the university and pain samples, and the DSM-5 six-factor Anhedonia model provided the best fit in the military sample of the tested DSM-5 models. Conclusions: The advantages and disadvantages of using a broad or narrow set of symptoms for PTSD can be debated, however, this study demonstrated that choice of diagnostic system may influence the estimated PTSD rates both qualitatively and quantitatively. In the current described diagnostic criteria only the ICD-11 model can reflect the configuration of symptoms satisfactorily. Thus, size does matter when assessing PTSD. PMID:29201287

  5. Does size really matter? A multisite study assessing the latent structure of the proposed ICD-11 and DSM-5 diagnostic criteria for PTSD.

    PubMed

    Hansen, Maj; Hyland, Philip; Karstoft, Karen-Inge; Vaegter, Henrik B; Bramsen, Rikke H; Nielsen, Anni B S; Armour, Cherie; Andersen, Søren B; Høybye, Mette Terp; Larsen, Simone Kongshøj; Andersen, Tonny E

    2017-01-01

    Background : Researchers and clinicians within the field of trauma have to choose between different diagnostic descriptions of posttraumatic stress disorder (PTSD) in the DSM-5 and the proposed ICD-11. Several studies support different competing models of the PTSD structure according to both diagnostic systems; however, findings show that the choice of diagnostic systems can affect the estimated prevalence rates. Objectives : The present study aimed to investigate the potential impact of using a large (i.e. the DSM-5) compared to a small (i.e. the ICD-11) diagnostic description of PTSD. In other words, does the size of PTSD really matter? Methods: The aim was investigated by examining differences in diagnostic rates between the two diagnostic systems and independently examining the model fit of the competing DSM-5 and ICD-11 models of PTSD across three trauma samples: university students ( N  = 4213), chronic pain patients ( N  = 573), and military personnel ( N  = 118). Results : Diagnostic rates of PTSD were significantly lower according to the proposed ICD-11 criteria in the university sample, but no significant differences were found for chronic pain patients and military personnel. The proposed ICD-11 three-factor model provided the best fit of the tested ICD-11 models across all samples, whereas the DSM-5 seven-factor Hybrid model provided the best fit in the university and pain samples, and the DSM-5 six-factor Anhedonia model provided the best fit in the military sample of the tested DSM-5 models. Conclusions : The advantages and disadvantages of using a broad or narrow set of symptoms for PTSD can be debated, however, this study demonstrated that choice of diagnostic system may influence the estimated PTSD rates both qualitatively and quantitatively. In the current described diagnostic criteria only the ICD-11 model can reflect the configuration of symptoms satisfactorily. Thus, size does matter when assessing PTSD.

  6. Sequential sampling: a novel method in farm animal welfare assessment.

    PubMed

    Heath, C A E; Main, D C J; Mullan, S; Haskell, M J; Browne, W J

    2016-02-01

    Lameness in dairy cows is an important welfare issue. As part of a welfare assessment, herd level lameness prevalence can be estimated from scoring a sample of animals, where higher levels of accuracy are associated with larger sample sizes. As the financial cost is related to the number of cows sampled, smaller samples are preferred. Sequential sampling schemes have been used for informing decision making in clinical trials. Sequential sampling involves taking samples in stages, where sampling can stop early depending on the estimated lameness prevalence. When welfare assessment is used for a pass/fail decision, a similar approach could be applied to reduce the overall sample size. The sampling schemes proposed here apply the principles of sequential sampling within a diagnostic testing framework. This study develops three sequential sampling schemes of increasing complexity to classify 80 fully assessed UK dairy farms, each with known lameness prevalence. Using the Welfare Quality herd-size-based sampling scheme, the first 'basic' scheme involves two sampling events. At the first sampling event half the Welfare Quality sample size is drawn, and then depending on the outcome, sampling either stops or is continued and the same number of animals is sampled again. In the second 'cautious' scheme, an adaptation is made to ensure that correctly classifying a farm as 'bad' is done with greater certainty. The third scheme is the only scheme to go beyond lameness as a binary measure and investigates the potential for increasing accuracy by incorporating the number of severely lame cows into the decision. The three schemes are evaluated with respect to accuracy and average sample size by running 100 000 simulations for each scheme, and a comparison is made with the fixed size Welfare Quality herd-size-based sampling scheme. All three schemes performed almost as well as the fixed size scheme but with much smaller average sample sizes. For the third scheme, an overall association between lameness prevalence and the proportion of lame cows that were severely lame on a farm was found. However, as this association was found to not be consistent across all farms, the sampling scheme did not prove to be as useful as expected. The preferred scheme was therefore the 'cautious' scheme for which a sampling protocol has also been developed.

  7. A novel method for correcting scanline-observational bias of discontinuity orientation

    PubMed Central

    Huang, Lei; Tang, Huiming; Tan, Qinwen; Wang, Dingjian; Wang, Liangqing; Ez Eldin, Mutasim A. M.; Li, Changdong; Wu, Qiong

    2016-01-01

    Scanline observation is known to introduce an angular bias into the probability distribution of orientation in three-dimensional space. In this paper, numerical solutions expressing the functional relationship between the scanline-observational distribution (in one-dimensional space) and the inherent distribution (in three-dimensional space) are derived using probability theory and calculus under the independence hypothesis of dip direction and dip angle. Based on these solutions, a novel method for obtaining the inherent distribution (also for correcting the bias) is proposed, an approach which includes two procedures: 1) Correcting the cumulative probabilities of orientation according to the solutions, and 2) Determining the distribution of the corrected orientations using approximation methods such as the one-sample Kolmogorov-Smirnov test. The inherent distribution corrected by the proposed method can be used for discrete fracture network (DFN) modelling, which is applied to such areas as rockmass stability evaluation, rockmass permeability analysis, rockmass quality calculation and other related fields. To maximize the correction capacity of the proposed method, the observed sample size is suggested through effectiveness tests for different distribution types, dispersions and sample sizes. The performance of the proposed method and the comparison of its correction capacity with existing methods are illustrated with two case studies. PMID:26961249

  8. Generalized optimal design for two-arm, randomized phase II clinical trials with endpoints from the exponential dispersion family.

    PubMed

    Jiang, Wei; Mahnken, Jonathan D; He, Jianghua; Mayo, Matthew S

    2016-11-01

    For two-arm randomized phase II clinical trials, previous literature proposed an optimal design that minimizes the total sample sizes subject to multiple constraints on the standard errors of the estimated event rates and their difference. The original design is limited to trials with dichotomous endpoints. This paper extends the original approach to be applicable to phase II clinical trials with endpoints from the exponential dispersion family distributions. The proposed optimal design minimizes the total sample sizes needed to provide estimates of population means of both arms and their difference with pre-specified precision. Its applications on data from specific distribution families are discussed under multiple design considerations. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  9. Robustness-Based Design Optimization Under Data Uncertainty

    NASA Technical Reports Server (NTRS)

    Zaman, Kais; McDonald, Mark; Mahadevan, Sankaran; Green, Lawrence

    2010-01-01

    This paper proposes formulations and algorithms for design optimization under both aleatory (i.e., natural or physical variability) and epistemic uncertainty (i.e., imprecise probabilistic information), from the perspective of system robustness. The proposed formulations deal with epistemic uncertainty arising from both sparse and interval data without any assumption about the probability distributions of the random variables. A decoupled approach is proposed in this paper to un-nest the robustness-based design from the analysis of non-design epistemic variables to achieve computational efficiency. The proposed methods are illustrated for the upper stage design problem of a two-stage-to-orbit (TSTO) vehicle, where the information on the random design inputs are only available as sparse point and/or interval data. As collecting more data reduces uncertainty but increases cost, the effect of sample size on the optimality and robustness of the solution is also studied. A method is developed to determine the optimal sample size for sparse point data that leads to the solutions of the design problem that are least sensitive to variations in the input random variables.

  10. A single test for rejecting the null hypothesis in subgroups and in the overall sample.

    PubMed

    Lin, Yunzhi; Zhou, Kefei; Ganju, Jitendra

    2017-01-01

    In clinical trials, some patient subgroups are likely to demonstrate larger effect sizes than other subgroups. For example, the effect size, or informally the benefit with treatment, is often greater in patients with a moderate condition of a disease than in those with a mild condition. A limitation of the usual method of analysis is that it does not incorporate this ordering of effect size by patient subgroup. We propose a test statistic which supplements the conventional test by including this information and simultaneously tests the null hypothesis in pre-specified subgroups and in the overall sample. It results in more power than the conventional test when the differences in effect sizes across subgroups are at least moderately large; otherwise it loses power. The method involves combining p-values from models fit to pre-specified subgroups and the overall sample in a manner that assigns greater weight to subgroups in which a larger effect size is expected. Results are presented for randomized trials with two and three subgroups.

  11. Sampling and data handling methods for inhalable particulate sampling. Final report nov 78-dec 80

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, W.B.; Cushing, K.M.; Johnson, J.W.

    1982-05-01

    The report reviews the objectives of a research program on sampling and measuring particles in the inhalable particulate (IP) size range in emissions from stationary sources, and describes methods and equipment required. A computer technique was developed to analyze data on particle-size distributions of samples taken with cascade impactors from industrial process streams. Research in sampling systems for IP matter included concepts for maintaining isokinetic sampling conditions, necessary for representative sampling of the larger particles, while flowrates in the particle-sizing device were constant. Laboratory studies were conducted to develop suitable IP sampling systems with overall cut diameters of 15 micrometersmore » and conforming to a specified collection efficiency curve. Collection efficiencies were similarly measured for a horizontal elutriator. Design parameters were calculated for horizontal elutriators to be used with impactors, the EPA SASS train, and the EPA FAS train. Two cyclone systems were designed and evaluated. Tests on an Andersen Size Selective Inlet, a 15-micrometer precollector for high-volume samplers, showed its performance to be with the proposed limits for IP samplers. A stack sampling system was designed in which the aerosol is diluted in flow patterns and with mixing times simulating those in stack plumes.« less

  12. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range.

    PubMed

    Wan, Xiang; Wang, Wenqian; Liu, Jiming; Tong, Tiejun

    2014-12-19

    In systematic reviews and meta-analysis, researchers often pool the results of the sample mean and standard deviation from a set of similar clinical trials. A number of the trials, however, reported the study using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials. In this paper, we propose to improve the existing literature in several directions. First, we show that the sample standard deviation estimation in Hozo et al.'s method (BMC Med Res Methodol 5:13, 2005) has some serious limitations and is always less satisfactory in practice. Inspired by this, we propose a new estimation method by incorporating the sample size. Second, we systematically study the sample mean and standard deviation estimation problem under several other interesting settings where the interquartile range is also available for the trials. We demonstrate the performance of the proposed methods through simulation studies for the three frequently encountered scenarios, respectively. For the first two scenarios, our method greatly improves existing methods and provides a nearly unbiased estimate of the true sample standard deviation for normal data and a slightly biased estimate for skewed data. For the third scenario, our method still performs very well for both normal data and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and present some suggestions on which scenario is preferred in real-world applications. In this paper, we discuss different approximation methods in the estimation of the sample mean and standard deviation and propose some new estimation methods to improve the existing literature. We conclude our work with a summary table (an Excel spread sheet including all formulas) that serves as a comprehensive guidance for performing meta-analysis in different situations.

  13. Signal Sampling for Efficient Sparse Representation of Resting State FMRI Data

    PubMed Central

    Ge, Bao; Makkie, Milad; Wang, Jin; Zhao, Shijie; Jiang, Xi; Li, Xiang; Lv, Jinglei; Zhang, Shu; Zhang, Wei; Han, Junwei; Guo, Lei; Liu, Tianming

    2015-01-01

    As the size of brain imaging data such as fMRI grows explosively, it provides us with unprecedented and abundant information about the brain. How to reduce the size of fMRI data but not lose much information becomes a more and more pressing issue. Recent literature studies tried to deal with it by dictionary learning and sparse representation methods, however, their computation complexities are still high, which hampers the wider application of sparse representation method to large scale fMRI datasets. To effectively address this problem, this work proposes to represent resting state fMRI (rs-fMRI) signals of a whole brain via a statistical sampling based sparse representation. First we sampled the whole brain’s signals via different sampling methods, then the sampled signals were aggregate into an input data matrix to learn a dictionary, finally this dictionary was used to sparsely represent the whole brain’s signals and identify the resting state networks. Comparative experiments demonstrate that the proposed signal sampling framework can speed-up by ten times in reconstructing concurrent brain networks without losing much information. The experiments on the 1000 Functional Connectomes Project further demonstrate its effectiveness and superiority. PMID:26646924

  14. Rule-of-thumb adjustment of sample sizes to accommodate dropouts in a two-stage analysis of repeated measurements.

    PubMed

    Overall, John E; Tonidandel, Scott; Starbuck, Robert R

    2006-01-01

    Recent contributions to the statistical literature have provided elegant model-based solutions to the problem of estimating sample sizes for testing the significance of differences in mean rates of change across repeated measures in controlled longitudinal studies with differentially correlated error and missing data due to dropouts. However, the mathematical complexity and model specificity of these solutions make them generally inaccessible to most applied researchers who actually design and undertake treatment evaluation research in psychiatry. In contrast, this article relies on a simple two-stage analysis in which dropout-weighted slope coefficients fitted to the available repeated measurements for each subject separately serve as the dependent variable for a familiar ANCOVA test of significance for differences in mean rates of change. This article is about how a sample of size that is estimated or calculated to provide desired power for testing that hypothesis without considering dropouts can be adjusted appropriately to take dropouts into account. Empirical results support the conclusion that, whatever reasonable level of power would be provided by a given sample size in the absence of dropouts, essentially the same power can be realized in the presence of dropouts simply by adding to the original dropout-free sample size the number of subjects who would be expected to drop from a sample of that original size under conditions of the proposed study.

  15. Designing clinical trials to test disease-modifying agents: application to the treatment trials of Alzheimer's disease.

    PubMed

    Xiong, Chengjie; van Belle, Gerald; Miller, J Philip; Morris, John C

    2011-02-01

    Therapeutic trials of disease-modifying agents on Alzheimer's disease (AD) require novel designs and analyses involving switch of treatments for at least a portion of subjects enrolled. Randomized start and randomized withdrawal designs are two examples of such designs. Crucial design parameters such as sample size and the time of treatment switch are important to understand in designing such clinical trials. The purpose of this article is to provide methods to determine sample sizes and time of treatment switch as well as optimum statistical tests of treatment efficacy for clinical trials of disease-modifying agents on AD. A general linear mixed effects model is proposed to test the disease-modifying efficacy of novel therapeutic agents on AD. This model links the longitudinal growth from both the placebo arm and the treatment arm at the time of treatment switch for these in the delayed treatment arm or early withdrawal arm and incorporates the potential correlation on the rate of cognitive change before and after the treatment switch. Sample sizes and the optimum time for treatment switch of such trials as well as optimum test statistic for the treatment efficacy are determined according to the model. Assuming an evenly spaced longitudinal design over a fixed duration, the optimum treatment switching time in a randomized start or a randomized withdrawal trial is half way through the trial. With the optimum test statistic for the treatment efficacy and over a wide spectrum of model parameters, the optimum sample size allocations are fairly close to the simplest design with a sample size ratio of 1:1:1 among the treatment arm, the delayed treatment or early withdrawal arm, and the placebo arm. The application of the proposed methodology to AD provides evidence that much larger sample sizes are required to adequately power disease-modifying trials when compared with those for symptomatic agents, even when the treatment switch time and efficacy test are optimally chosen. The proposed method assumes that the only and immediate effect of treatment switch is on the rate of cognitive change. Crucial design parameters for the clinical trials of disease-modifying agents on AD can be optimally chosen. Government and industry officials as well as academia researchers should consider the optimum use of the clinical trials design for disease-modifying agents on AD in their effort to search for the treatments with the potential to modify the underlying pathophysiology of AD.

  16. Local Feature Selection for Data Classification.

    PubMed

    Armanfard, Narges; Reilly, James P; Komeili, Majid

    2016-06-01

    Typical feature selection methods choose an optimal global feature subset that is applied over all regions of the sample space. In contrast, in this paper we propose a novel localized feature selection (LFS) approach whereby each region of the sample space is associated with its own distinct optimized feature set, which may vary both in membership and size across the sample space. This allows the feature set to optimally adapt to local variations in the sample space. An associated method for measuring the similarities of a query datum to each of the respective classes is also proposed. The proposed method makes no assumptions about the underlying structure of the samples; hence the method is insensitive to the distribution of the data over the sample space. The method is efficiently formulated as a linear programming optimization problem. Furthermore, we demonstrate the method is robust against the over-fitting problem. Experimental results on eleven synthetic and real-world data sets demonstrate the viability of the formulation and the effectiveness of the proposed algorithm. In addition we show several examples where localized feature selection produces better results than a global feature selection method.

  17. Disposable cartridge extraction of retinol and alpha-tocopherol from fatty samples.

    PubMed

    Bourgeois, C F; Ciba, N

    1988-01-01

    A new approach is proposed for liquid/solid extraction of retinol and alpha-tocopherol from samples, using a disposable kieselguhr cartridge. The substitution of the mixture methanol-ethanol-n-butanol (4 + 3 + 1) for methanol in the alkaline hydrolysis solution makes it now possible to process fatty samples. Methanol is necessary to solubilize the antioxidant ascorbic acid, and a linear chain alcohol such as n-butanol is necessary to reduce the size of soap micelles so that they can penetrate into the kieselguhr pores. In comparisons of the proposed method with conventional methods on mineral premixes and fatty feedstuffs, recovery and accuracy are at least as good by the proposed method. Advantages are increased rate of determinations and the ability to hydrolyze and extract retinol and alpha-tocopherol together from the same sample.

  18. Ferromagnetism appears in nitrogen implanted nanocrystalline diamond films

    NASA Astrophysics Data System (ADS)

    Remes, Zdenek; Sun, Shih-Jye; Varga, Marian; Chou, Hsiung; Hsu, Hua-Shu; Kromka, Alexander; Horak, Pavel

    2015-11-01

    The nanocrystalline diamond films turn to be ferromagnetic after implanting various nitrogen doses on them. Through this research, we confirm that the room-temperature ferromagnetism of the implanted samples is derived from the measurements of magnetic circular dichroism (MCD) and superconducting quantum interference device (SQUID). Samples with larger crystalline grains as well as higher implanted doses present more robust ferromagnetic signals at room temperature. Raman spectra indicate that the small grain-sized samples are much more disordered than the large grain-sized ones. We propose that a slightly large saturated ferromagnetism could be observed at low temperature, because the increased localization effects have a significant impact on more disordered structure.

  19. Blinded and unblinded internal pilot study designs for clinical trials with count data.

    PubMed

    Schneider, Simon; Schmidli, Heinz; Friede, Tim

    2013-07-01

    Internal pilot studies are a popular design feature to address uncertainties in the sample size calculations caused by vague information on nuisance parameters. Despite their popularity, only very recently blinded sample size reestimation procedures for trials with count data were proposed and their properties systematically investigated. Although blinded procedures are favored by regulatory authorities, practical application is somewhat limited by fears that blinded procedures are prone to bias if the treatment effect was misspecified in the planning. Here, we compare unblinded and blinded procedures with respect to bias, error rates, and sample size distribution. We find that both procedures maintain the desired power and that the unblinded procedure is slightly liberal whereas the actual significance level of the blinded procedure is close to the nominal level. Furthermore, we show that in situations where uncertainty about the assumed treatment effect exists, the blinded estimator of the control event rate is biased in contrast to the unblinded estimator, which results in differences in mean sample sizes in favor of the unblinded procedure. However, these differences are rather small compared to the deviations of the mean sample sizes from the sample size required to detect the true, but unknown effect. We demonstrate that the variation of the sample size resulting from the blinded procedure is in many practically relevant situations considerably smaller than the one of the unblinded procedures. The methods are extended to overdispersed counts using a quasi-likelihood approach and are illustrated by trials in relapsing multiple sclerosis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Sample size guidelines for fitting a lognormal probability distribution to censored most probable number data with a Markov chain Monte Carlo method.

    PubMed

    Williams, Michael S; Cao, Yong; Ebel, Eric D

    2013-07-15

    Levels of pathogenic organisms in food and water have steadily declined in many parts of the world. A consequence of this reduction is that the proportion of samples that test positive for the most contaminated product-pathogen pairings has fallen to less than 0.1. While this is unequivocally beneficial to public health, datasets with very few enumerated samples present an analytical challenge because a large proportion of the observations are censored values. One application of particular interest to risk assessors is the fitting of a statistical distribution function to datasets collected at some point in the farm-to-table continuum. The fitted distribution forms an important component of an exposure assessment. A number of studies have compared different fitting methods and proposed lower limits on the proportion of samples where the organisms of interest are identified and enumerated, with the recommended lower limit of enumerated samples being 0.2. This recommendation may not be applicable to food safety risk assessments for a number of reasons, which include the development of new Bayesian fitting methods, the use of highly sensitive screening tests, and the generally larger sample sizes found in surveys of food commodities. This study evaluates the performance of a Markov chain Monte Carlo fitting method when used in conjunction with a screening test and enumeration of positive samples by the Most Probable Number technique. The results suggest that levels of contamination for common product-pathogen pairs, such as Salmonella on poultry carcasses, can be reliably estimated with the proposed fitting method and samples sizes in excess of 500 observations. The results do, however, demonstrate that simple guidelines for this application, such as the proportion of positive samples, cannot be provided. Published by Elsevier B.V.

  1. Highly accurate adaptive TOF determination method for ultrasonic thickness measurement

    NASA Astrophysics Data System (ADS)

    Zhou, Lianjie; Liu, Haibo; Lian, Meng; Ying, Yangwei; Li, Te; Wang, Yongqing

    2018-04-01

    Determining the time of flight (TOF) is very critical for precise ultrasonic thickness measurement. However, the relatively low signal-to-noise ratio (SNR) of the received signals would induce significant TOF determination errors. In this paper, an adaptive time delay estimation method has been developed to improve the TOF determination’s accuracy. An improved variable step size adaptive algorithm with comprehensive step size control function is proposed. Meanwhile, a cubic spline fitting approach is also employed to alleviate the restriction of finite sampling interval. Simulation experiments under different SNR conditions were conducted for performance analysis. Simulation results manifested the performance advantage of proposed TOF determination method over existing TOF determination methods. When comparing with the conventional fixed step size, and Kwong and Aboulnasr algorithms, the steady state mean square deviation of the proposed algorithm was generally lower, which makes the proposed algorithm more suitable for TOF determination. Further, ultrasonic thickness measurement experiments were performed on aluminum alloy plates with various thicknesses. They indicated that the proposed TOF determination method was more robust even under low SNR conditions, and the ultrasonic thickness measurement accuracy could be significantly improved.

  2. Bon-EV: an improved multiple testing procedure for controlling false discovery rates.

    PubMed

    Li, Dongmei; Xie, Zidian; Zand, Martin; Fogg, Thomas; Dye, Timothy

    2017-01-03

    Stability of multiple testing procedures, defined as the standard deviation of total number of discoveries, can be used as an indicator of variability of multiple testing procedures. Improving stability of multiple testing procedures can help to increase the consistency of findings from replicated experiments. Benjamini-Hochberg's and Storey's q-value procedures are two commonly used multiple testing procedures for controlling false discoveries in genomic studies. Storey's q-value procedure has higher power and lower stability than Benjamini-Hochberg's procedure. To improve upon the stability of Storey's q-value procedure and maintain its high power in genomic data analysis, we propose a new multiple testing procedure, named Bon-EV, to control false discovery rate (FDR) based on Bonferroni's approach. Simulation studies show that our proposed Bon-EV procedure can maintain the high power of the Storey's q-value procedure and also result in better FDR control and higher stability than Storey's q-value procedure for samples of large size(30 in each group) and medium size (15 in each group) for either independent, somewhat correlated, or highly correlated test statistics. When sample size is small (5 in each group), our proposed Bon-EV procedure has performance between the Benjamini-Hochberg procedure and the Storey's q-value procedure. Examples using RNA-Seq data show that the Bon-EV procedure has higher stability than the Storey's q-value procedure while maintaining equivalent power, and higher power than the Benjamini-Hochberg's procedure. For medium or large sample sizes, the Bon-EV procedure has improved FDR control and stability compared with the Storey's q-value procedure and improved power compared with the Benjamini-Hochberg procedure. The Bon-EV multiple testing procedure is available as the BonEV package in R for download at https://CRAN.R-project.org/package=BonEV .

  3. Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling.

    PubMed

    Rosenblum, Michael A; Laan, Mark J van der

    2009-01-07

    The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).

  4. Image re-sampling detection through a novel interpolation kernel.

    PubMed

    Hilal, Alaa

    2018-06-01

    Image re-sampling involved in re-size and rotation transformations is an essential element block in a typical digital image alteration. Fortunately, traces left from such processes are detectable, proving that the image has gone a re-sampling transformation. Within this context, we present in this paper two original contributions. First, we propose a new re-sampling interpolation kernel. It depends on five independent parameters that controls its amplitude, angular frequency, standard deviation, and duration. Then, we demonstrate its capacity to imitate the same behavior of the most frequent interpolation kernels used in digital image re-sampling applications. Secondly, the proposed model is used to characterize and detect the correlation coefficients involved in re-sampling transformations. The involved process includes a minimization of an error function using the gradient method. The proposed method is assessed over a large database of 11,000 re-sampled images. Additionally, it is implemented within an algorithm in order to assess images that had undergone complex transformations. Obtained results demonstrate better performance and reduced processing time when compared to a reference method validating the suitability of the proposed approaches. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. Estimation of a partially linear additive model for data from an outcome-dependent sampling design with a continuous outcome

    PubMed Central

    Tan, Ziwen; Qin, Guoyou; Zhou, Haibo

    2016-01-01

    Outcome-dependent sampling (ODS) designs have been well recognized as a cost-effective way to enhance study efficiency in both statistical literature and biomedical and epidemiologic studies. A partially linear additive model (PLAM) is widely applied in real problems because it allows for a flexible specification of the dependence of the response on some covariates in a linear fashion and other covariates in a nonlinear non-parametric fashion. Motivated by an epidemiological study investigating the effect of prenatal polychlorinated biphenyls exposure on children's intelligence quotient (IQ) at age 7 years, we propose a PLAM in this article to investigate a more flexible non-parametric inference on the relationships among the response and covariates under the ODS scheme. We propose the estimation method and establish the asymptotic properties of the proposed estimator. Simulation studies are conducted to show the improved efficiency of the proposed ODS estimator for PLAM compared with that from a traditional simple random sampling design with the same sample size. The data of the above-mentioned study is analyzed to illustrate the proposed method. PMID:27006375

  6. 75 FR 31420 - Magnuson Stevens Act Provisions; General Provisions for Domestic Fisheries; Application for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-03

    ... survey designed to estimate the population size of Pacific sardine. NMFS requests public comment on the... use of 4200 mt to replicate the summer survey conducted under the EFP approved in 2009 and expand the sample size and area covered. In addition to the summer survey, the applicants proposed to use 800 mt of...

  7. Multiscale modeling of porous ceramics using movable cellular automaton method

    NASA Astrophysics Data System (ADS)

    Smolin, Alexey Yu.; Smolin, Igor Yu.; Smolina, Irina Yu.

    2017-10-01

    The paper presents a multiscale model for porous ceramics based on movable cellular automaton method, which is a particle method in novel computational mechanics of solid. The initial scale of the proposed approach corresponds to the characteristic size of the smallest pores in the ceramics. At this scale, we model uniaxial compression of several representative samples with an explicit account of pores of the same size but with the unique position in space. As a result, we get the average values of Young's modulus and strength, as well as the parameters of the Weibull distribution of these properties at the current scale level. These data allow us to describe the material behavior at the next scale level were only the larger pores are considered explicitly, while the influence of small pores is included via effective properties determined earliar. If the pore size distribution function of the material has N maxima we need to perform computations for N-1 levels in order to get the properties step by step from the lowest scale up to the macroscale. The proposed approach was applied to modeling zirconia ceramics with bimodal pore size distribution. The obtained results show correct behavior of the model sample at the macroscale.

  8. Impact of non-uniform correlation structure on sample size and power in multiple-period cluster randomised trials.

    PubMed

    Kasza, J; Hemming, K; Hooper, R; Matthews, Jns; Forbes, A B

    2017-01-01

    Stepped wedge and cluster randomised crossover trials are examples of cluster randomised designs conducted over multiple time periods that are being used with increasing frequency in health research. Recent systematic reviews of both of these designs indicate that the within-cluster correlation is typically taken account of in the analysis of data using a random intercept mixed model, implying a constant correlation between any two individuals in the same cluster no matter how far apart in time they are measured: within-period and between-period intra-cluster correlations are assumed to be identical. Recently proposed extensions allow the within- and between-period intra-cluster correlations to differ, although these methods require that all between-period intra-cluster correlations are identical, which may not be appropriate in all situations. Motivated by a proposed intensive care cluster randomised trial, we propose an alternative correlation structure for repeated cross-sectional multiple-period cluster randomised trials in which the between-period intra-cluster correlation is allowed to decay depending on the distance between measurements. We present results for the variance of treatment effect estimators for varying amounts of decay, investigating the consequences of the variation in decay on sample size planning for stepped wedge, cluster crossover and multiple-period parallel-arm cluster randomised trials. We also investigate the impact of assuming constant between-period intra-cluster correlations instead of decaying between-period intra-cluster correlations. Our results indicate that in certain design configurations, including the one corresponding to the proposed trial, a correlation decay can have an important impact on variances of treatment effect estimators, and hence on sample size and power. An R Shiny app allows readers to interactively explore the impact of correlation decay.

  9. A pretreatment method for grain size analysis of red mudstones

    NASA Astrophysics Data System (ADS)

    Jiang, Zaixing; Liu, Li'an

    2011-11-01

    Traditional sediment disaggregation methods work well for loose mud sediments, but not for tightly cemented mudstones by ferric oxide minerals. In this paper, a new pretreatment method for analyzing the grain size of red mudstones is presented. The experimental samples are Eocene red mudstones from the Dongying Depression, Bohai Bay Basin. The red mudstones are composed mainly of clay minerals, clastic sediments and ferric oxides that make the mudstones red and tightly compacted. The procedure of the method is as follows. Firstly, samples of the red mudstones were crushed into fragments with a diameter of 0.6-0.8 mm in size; secondly, the CBD (citrate-bicarbonate-dithionite) treatment was used to remove ferric oxides so that the cementation of intra-aggregates and inter-aggregates became weakened, and then 5% dilute hydrochloric acid was added to further remove the cements; thirdly, the fragments were further ground with a rubber pestle; lastly, an ultrasonicator was used to disaggregate the samples. After the treatment, the samples could then be used for grain size analysis or for other geological analyses of sedimentary grains. Compared with other pretreatment methods for size analysis of mudstones, this proposed method is more effective and has higher repeatability.

  10. Model calibration and validation for OFMSW and sewage sludge co-digestion reactors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Esposito, G., E-mail: giovanni.esposito@unicas.it; Frunzo, L., E-mail: luigi.frunzo@unina.it; Panico, A., E-mail: anpanico@unina.it

    2011-12-15

    Highlights: > Disintegration is the limiting step of the anaerobic co-digestion process. > Disintegration kinetic constant does not depend on the waste particle size. > Disintegration kinetic constant depends only on the waste nature and composition. > The model calibration can be performed on organic waste of any particle size. - Abstract: A mathematical model has recently been proposed by the authors to simulate the biochemical processes that prevail in a co-digestion reactor fed with sewage sludge and the organic fraction of municipal solid waste. This model is based on the Anaerobic Digestion Model no. 1 of the International Watermore » Association, which has been extended to include the co-digestion processes, using surface-based kinetics to model the organic waste disintegration and conversion to carbohydrates, proteins and lipids. When organic waste solids are present in the reactor influent, the disintegration process is the rate-limiting step of the overall co-digestion process. The main advantage of the proposed modeling approach is that the kinetic constant of such a process does not depend on the waste particle size distribution (PSD) and rather depends only on the nature and composition of the waste particles. The model calibration aimed to assess the kinetic constant of the disintegration process can therefore be conducted using organic waste samples of any PSD, and the resulting value will be suitable for all the organic wastes of the same nature as the investigated samples, independently of their PSD. This assumption was proven in this study by biomethane potential experiments that were conducted on organic waste samples with different particle sizes. The results of these experiments were used to calibrate and validate the mathematical model, resulting in a good agreement between the simulated and observed data for any investigated particle size of the solid waste. This study confirms the strength of the proposed model and calibration procedure, which can thus be used to assess the treatment efficiency and predict the methane production of full-scale digesters.« less

  11. Demonstration of Multi- and Single-Reader Sample Size Program for Diagnostic Studies software.

    PubMed

    Hillis, Stephen L; Schartz, Kevin M

    2015-02-01

    The recently released software Multi- and Single-Reader Sample Size Sample Size Program for Diagnostic Studies , written by Kevin Schartz and Stephen Hillis, performs sample size computations for diagnostic reader-performance studies. The program computes the sample size needed to detect a specified difference in a reader performance measure between two modalities, when using the analysis methods initially proposed by Dorfman, Berbaum, and Metz (DBM) and Obuchowski and Rockette (OR), and later unified and improved by Hillis and colleagues. A commonly used reader performance measure is the area under the receiver-operating-characteristic curve. The program can be used with typical common reader-performance measures which can be estimated parametrically or nonparametrically. The program has an easy-to-use step-by-step intuitive interface that walks the user through the entry of the needed information. Features of the software include the following: (1) choice of several study designs; (2) choice of inputs obtained from either OR or DBM analyses; (3) choice of three different inference situations: both readers and cases random, readers fixed and cases random, and readers random and cases fixed; (4) choice of two types of hypotheses: equivalence or noninferiority; (6) choice of two output formats: power for specified case and reader sample sizes, or a listing of case-reader combinations that provide a specified power; (7) choice of single or multi-reader analyses; and (8) functionality in Windows, Mac OS, and Linux.

  12. VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS

    PubMed Central

    Huang, Jian; Horowitz, Joel L.; Wei, Fengrong

    2010-01-01

    We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method. PMID:21127739

  13. Small Sample Performance of Bias-corrected Sandwich Estimators for Cluster-Randomized Trials with Binary Outcomes

    PubMed Central

    Li, Peng; Redden, David T.

    2014-01-01

    SUMMARY The sandwich estimator in generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to the moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for the CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738

  14. Image analysis of representative food structures: application of the bootstrap method.

    PubMed

    Ramírez, Cristian; Germain, Juan C; Aguilera, José M

    2009-08-01

    Images (for example, photomicrographs) are routinely used as qualitative evidence of the microstructure of foods. In quantitative image analysis it is important to estimate the area (or volume) to be sampled, the field of view, and the resolution. The bootstrap method is proposed to estimate the size of the sampling area as a function of the coefficient of variation (CV(Bn)) and standard error (SE(Bn)) of the bootstrap taking sub-areas of different sizes. The bootstrap method was applied to simulated and real structures (apple tissue). For simulated structures, 10 computer-generated images were constructed containing 225 black circles (elements) and different coefficient of variation (CV(image)). For apple tissue, 8 images of apple tissue containing cellular cavities with different CV(image) were analyzed. Results confirmed that for simulated and real structures, increasing the size of the sampling area decreased the CV(Bn) and SE(Bn). Furthermore, there was a linear relationship between the CV(image) and CV(Bn) (.) For example, to obtain a CV(Bn) = 0.10 in an image with CV(image) = 0.60, a sampling area of 400 x 400 pixels (11% of whole image) was required, whereas if CV(image) = 1.46, a sampling area of 1000 x 100 pixels (69% of whole image) became necessary. This suggests that a large-size dispersion of element sizes in an image requires increasingly larger sampling areas or a larger number of images.

  15. 78 FR 27406 - Agency Information Collection Activities; Proposed Collection; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-10

    ... groups will be conducted with up to eight participants in each for a total sample size of 32. The second... determine eligibility for the pilot study to recruit a sample of 500 participants (50 from each clinical... participate in an in-depth, qualitative telephone interview for a total of 100 interviews. Finally, up to...

  16. SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.

    PubMed

    Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman

    2017-03-01

    We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).

  17. Conceptual data sampling for breast cancer histology image classification.

    PubMed

    Rezk, Eman; Awan, Zainab; Islam, Fahad; Jaoua, Ali; Al Maadeed, Somaya; Zhang, Nan; Das, Gautam; Rajpoot, Nasir

    2017-10-01

    Data analytics have become increasingly complicated as the amount of data has increased. One technique that is used to enable data analytics in large datasets is data sampling, in which a portion of the data is selected to preserve the data characteristics for use in data analytics. In this paper, we introduce a novel data sampling technique that is rooted in formal concept analysis theory. This technique is used to create samples reliant on the data distribution across a set of binary patterns. The proposed sampling technique is applied in classifying the regions of breast cancer histology images as malignant or benign. The performance of our method is compared to other classical sampling methods. The results indicate that our method is efficient and generates an illustrative sample of small size. It is also competing with other sampling methods in terms of sample size and sample quality represented in classification accuracy and F1 measure. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Linear Combinations of Multiple Outcome Measures to Improve the Power of Efficacy Analysis ---Application to Clinical Trials on Early Stage Alzheimer Disease

    PubMed Central

    Xiong, Chengjie; Luo, Jingqin; Morris, John C; Bateman, Randall

    2018-01-01

    Modern clinical trials on Alzheimer disease (AD) focus on the early symptomatic stage or even the preclinical stage. Subtle disease progression at the early stages, however, poses a major challenge in designing such clinical trials. We propose a multivariate mixed model on repeated measures to model the disease progression over time on multiple efficacy outcomes, and derive the optimum weights to combine multiple outcome measures by minimizing the sample sizes to adequately power the clinical trials. A cross-validation simulation study is conducted to assess the accuracy for the estimated weights as well as the improvement in reducing the sample sizes for such trials. The proposed methodology is applied to the multiple cognitive tests from the ongoing observational study of the Dominantly Inherited Alzheimer Network (DIAN) to power future clinical trials in the DIAN with a cognitive endpoint. Our results show that the optimum weights to combine multiple outcome measures can be accurately estimated, and that compared to the individual outcomes, the combined efficacy outcome with these weights significantly reduces the sample size required to adequately power clinical trials. When applied to the clinical trial in the DIAN, the estimated linear combination of six cognitive tests can adequately power the clinical trial. PMID:29546251

  19. Across-cohort QC analyses of GWAS summary statistics from complex traits.

    PubMed

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2016-01-01

    Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics F st statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy.

  20. Across-cohort QC analyses of GWAS summary statistics from complex traits

    PubMed Central

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2017-01-01

    Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy. PMID:27552965

  1. Coalescence computations for large samples drawn from populations of time-varying sizes

    PubMed Central

    Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek

    2017-01-01

    We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for coalescent with large sample size. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for analysis of large human mitochondrial DNA dataset. PMID:28170404

  2. Combining the boundary shift integral and tensor-based morphometry for brain atrophy estimation

    NASA Astrophysics Data System (ADS)

    Michalkiewicz, Mateusz; Pai, Akshay; Leung, Kelvin K.; Sommer, Stefan; Darkner, Sune; Sørensen, Lauge; Sporring, Jon; Nielsen, Mads

    2016-03-01

    Brain atrophy from structural magnetic resonance images (MRIs) is widely used as an imaging surrogate marker for Alzheimers disease. Their utility has been limited due to the large degree of variance and subsequently high sample size estimates. The only consistent and reasonably powerful atrophy estimation methods has been the boundary shift integral (BSI). In this paper, we first propose a tensor-based morphometry (TBM) method to measure voxel-wise atrophy that we combine with BSI. The combined model decreases the sample size estimates significantly when compared to BSI and TBM alone.

  3. Trade off between variable and fixed size normalization in orthogonal polynomials based iris recognition system.

    PubMed

    Krishnamoorthi, R; Anna Poorani, G

    2016-01-01

    Iris normalization is an important stage in any iris biometric, as it has a propensity to trim down the consequences of iris distortion. To indemnify the variation in size of the iris owing to the action of stretching or enlarging the pupil in iris acquisition process and camera to eyeball distance, two normalization schemes has been proposed in this work. In the first method, the iris region of interest is normalized by converting the iris into the variable size rectangular model in order to avoid the under samples near the limbus border. In the second method, the iris region of interest is normalized by converting the iris region into a fixed size rectangular model in order to avoid the dimensional discrepancies between the eye images. The performance of the proposed normalization methods is evaluated with orthogonal polynomials based iris recognition in terms of FAR, FRR, GAR, CRR and EER.

  4. Speeding Up Non-Parametric Bootstrap Computations for Statistics Based on Sample Moments in Small/Moderate Sample Size Applications

    PubMed Central

    Chaibub Neto, Elias

    2015-01-01

    In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson’s sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965

  5. Code Saturation Versus Meaning Saturation: How Many Interviews Are Enough?

    PubMed

    Hennink, Monique M; Kaiser, Bonnie N; Marconi, Vincent C

    2017-03-01

    Saturation is a core guiding principle to determine sample sizes in qualitative research, yet little methodological research exists on parameters that influence saturation. Our study compared two approaches to assessing saturation: code saturation and meaning saturation. We examined sample sizes needed to reach saturation in each approach, what saturation meant, and how to assess saturation. Examining 25 in-depth interviews, we found that code saturation was reached at nine interviews, whereby the range of thematic issues was identified. However, 16 to 24 interviews were needed to reach meaning saturation where we developed a richly textured understanding of issues. Thus, code saturation may indicate when researchers have "heard it all," but meaning saturation is needed to "understand it all." We used our results to develop parameters that influence saturation, which may be used to estimate sample sizes for qualitative research proposals or to document in publications the grounds on which saturation was achieved.

  6. Influence of Sample Size of Polymer Materials on Aging Characteristics in the Salt Fog Test

    NASA Astrophysics Data System (ADS)

    Otsubo, Masahisa; Anami, Naoya; Yamashita, Seiji; Honda, Chikahisa; Takenouchi, Osamu; Hashimoto, Yousuke

    Polymer insulators have been used in worldwide because of some superior properties; light weight, high mechanical strength, good hydrophobicity etc., as compared with porcelain insulators. In this paper, effect of sample size on the aging characteristics in the salt fog test is examined. Leakage current was measured by using 100 MHz AD board or 100 MHz digital oscilloscope and separated three components as conductive current, corona discharge current and dry band arc discharge current by using FFT and the current differential method newly proposed. Each component cumulative charge was estimated automatically by a personal computer. As the results, when the sample size increased under the same average applied electric field, the peak values of leakage current and each component current increased. Especially, the cumulative charges and the arc discharge length of dry band arc discharge increased remarkably with the increase of gap length.

  7. Characterization of the Particle Size and Polydispersity of Dicumarol Using Solid-State NMR Spectroscopy.

    PubMed

    Dempah, Kassibla Elodie; Lubach, Joseph W; Munson, Eric J

    2017-03-06

    A variety of particle sizes of a model compound, dicumarol, were prepared and characterized in order to investigate the correlation between particle size and solid-state NMR (SSNMR) proton spin-lattice relaxation ( 1 H T 1 ) times. Conventional laser diffraction and scanning electron microscopy were used as particle size measurement techniques and showed crystalline dicumarol samples with sizes ranging from tens of micrometers to a few micrometers. Dicumarol samples were prepared using both bottom-up and top-down particle size control approaches, via antisolvent microprecipitation and cryogrinding. It was observed that smaller particles of dicumarol generally had shorter 1 H T 1 times than larger ones. Additionally, cryomilled particles had the shortest 1 H T 1 times encountered (8 s). SSNMR 1 H T 1 times of all the samples were measured and showed as-received dicumarol to have a T 1 of 1500 s, whereas the 1 H T 1 times of the precipitated samples ranged from 20 to 80 s, with no apparent change in the physical form of dicumarol. Physical mixtures of different sized particles were also analyzed to determine the effect of sample inhomogeneity on 1 H T 1 values. Mixtures of cryoground and as-received dicumarol were clearly inhomogeneous as they did not fit well to a one-component relaxation model, but could be fit much better to a two-component model with both fast-and slow-relaxing regimes. Results indicate that samples of crystalline dicumarol containing two significantly different particle size populations could be deconvoluted solely based on their differences in 1 H T 1 times. Relative populations of each particle size regime could also be approximated using two-component fitting models. Using NMR theory on spin diffusion as a reference, and taking into account the presence of crystal defects, a model for the correlation between the particle size of dicumarol and its 1 H T 1 time was proposed.

  8. Prediction of accrual closure date in multi-center clinical trials with discrete-time Poisson process models.

    PubMed

    Tang, Gong; Kong, Yuan; Chang, Chung-Chou Ho; Kong, Lan; Costantino, Joseph P

    2012-01-01

    In a phase III multi-center cancer clinical trial or a large public health study, sample size is predetermined to achieve desired power, and study participants are enrolled from tens or hundreds of participating institutions. As the accrual is closing to the target size, the coordinating data center needs to project the accrual closure date on the basis of the observed accrual pattern and notify the participating sites several weeks in advance. In the past, projections were simply based on some crude assessment, and conservative measures were incorporated in order to achieve the target accrual size. This approach often resulted in excessive accrual size and subsequently unnecessary financial burden on the study sponsors. Here we proposed a discrete-time Poisson process-based method to estimate the accrual rate at time of projection and subsequently the trial closure date. To ensure that target size would be reached with high confidence, we also proposed a conservative method for the closure date projection. The proposed method was illustrated through the analysis of the accrual data of the National Surgical Adjuvant Breast and Bowel Project trial B-38. The results showed that application of the proposed method could help to save considerable amount of expenditure in patient management without compromising the accrual goal in multi-center clinical trials. Copyright © 2012 John Wiley & Sons, Ltd.

  9. Testing for qualitative heterogeneity: An application to composite endpoints in survival analysis.

    PubMed

    Oulhaj, Abderrahim; El Ghouch, Anouar; Holman, Rury R

    2017-01-01

    Composite endpoints are frequently used in clinical outcome trials to provide more endpoints, thereby increasing statistical power. A key requirement for a composite endpoint to be meaningful is the absence of the so-called qualitative heterogeneity to ensure a valid overall interpretation of any treatment effect identified. Qualitative heterogeneity occurs when individual components of a composite endpoint exhibit differences in the direction of a treatment effect. In this paper, we develop a general statistical method to test for qualitative heterogeneity, that is to test whether a given set of parameters share the same sign. This method is based on the intersection-union principle and, provided that the sample size is large, is valid whatever the model used for parameters estimation. We propose two versions of our testing procedure, one based on a random sampling from a Gaussian distribution and another version based on bootstrapping. Our work covers both the case of completely observed data and the case where some observations are censored which is an important issue in many clinical trials. We evaluated the size and power of our proposed tests by carrying out some extensive Monte Carlo simulations in the case of multivariate time to event data. The simulations were designed under a variety of conditions on dimensionality, censoring rate, sample size and correlation structure. Our testing procedure showed very good performances in terms of statistical power and type I error. The proposed test was applied to a data set from a single-center, randomized, double-blind controlled trial in the area of Alzheimer's disease.

  10. Coalescent: an open-science framework for importance sampling in coalescent theory.

    PubMed

    Tewari, Susanta; Spouge, John L

    2015-01-01

    Background. In coalescent theory, computer programs often use importance sampling to calculate likelihoods and other statistical quantities. An importance sampling scheme can exploit human intuition to improve statistical efficiency of computations, but unfortunately, in the absence of general computer frameworks on importance sampling, researchers often struggle to translate new sampling schemes computationally or benchmark against different schemes, in a manner that is reliable and maintainable. Moreover, most studies use computer programs lacking a convenient user interface or the flexibility to meet the current demands of open science. In particular, current computer frameworks can only evaluate the efficiency of a single importance sampling scheme or compare the efficiencies of different schemes in an ad hoc manner. Results. We have designed a general framework (http://coalescent.sourceforge.net; language: Java; License: GPLv3) for importance sampling that computes likelihoods under the standard neutral coalescent model of a single, well-mixed population of constant size over time following infinite sites model of mutation. The framework models the necessary core concepts, comes integrated with several data sets of varying size, implements the standard competing proposals, and integrates tightly with our previous framework for calculating exact probabilities. For a given dataset, it computes the likelihood and provides the maximum likelihood estimate of the mutation parameter. Well-known benchmarks in the coalescent literature validate the accuracy of the framework. The framework provides an intuitive user interface with minimal clutter. For performance, the framework switches automatically to modern multicore hardware, if available. It runs on three major platforms (Windows, Mac and Linux). Extensive tests and coverage make the framework reliable and maintainable. Conclusions. In coalescent theory, many studies of computational efficiency consider only effective sample size. Here, we evaluate proposals in the coalescent literature, to discover that the order of efficiency among the three importance sampling schemes changes when one considers running time as well as effective sample size. We also describe a computational technique called "just-in-time delegation" available to improve the trade-off between running time and precision by constructing improved importance sampling schemes from existing ones. Thus, our systems approach is a potential solution to the "2(8) programs problem" highlighted by Felsenstein, because it provides the flexibility to include or exclude various features of similar coalescent models or importance sampling schemes.

  11. Estimating the Size of a Large Network and its Communities from a Random Sample

    PubMed Central

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W.

    2017-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924

  12. Estimating the Size of a Large Network and its Communities from a Random Sample.

    PubMed

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W

    2016-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = ( V, E ) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G ( W ) be the induced subgraph in G of the vertices in W . In addition to G ( W ), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K , and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.

  13. Effect of dislocation pile-up on size-dependent yield strength in finite single-crystal micro-samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pan, Bo; Shibutani, Yoji, E-mail: sibutani@mech.eng.osaka-u.ac.jp; Zhang, Xu

    2015-07-07

    Recent research has explained that the steeply increasing yield strength in metals depends on decreasing sample size. In this work, we derive a statistical physical model of the yield strength of finite single-crystal micro-pillars that depends on single-ended dislocation pile-up inside the micro-pillars. We show that this size effect can be explained almost completely by considering the stochastic lengths of the dislocation source and the dislocation pile-up length in the single-crystal micro-pillars. The Hall–Petch-type relation holds even in a microscale single-crystal, which is characterized by its dislocation source lengths. Our quantitative conclusions suggest that the number of dislocation sources andmore » pile-ups are significant factors for the size effect. They also indicate that starvation of dislocation sources is another reason for the size effect. Moreover, we investigated the explicit relationship between the stacking fault energy and the dislocation “pile-up” effect inside the sample: materials with low stacking fault energy exhibit an obvious dislocation pile-up effect. Our proposed physical model predicts a sample strength that agrees well with experimental data, and our model can give a more precise prediction than the current single arm source model, especially for materials with low stacking fault energy.« less

  14. Multiscale Simulation of Porous Ceramics Based on Movable Cellular Automaton Method

    NASA Astrophysics Data System (ADS)

    Smolin, A.; Smolin, I.; Eremina, G.; Smolina, I.

    2017-10-01

    The paper presents a model for simulating mechanical behaviour of multiscale porous ceramics based on movable cellular automaton method, which is a novel particle method in computational mechanics of solid. The initial scale of the proposed approach corresponds to the characteristic size of the smallest pores in the ceramics. At this scale, we model uniaxial compression of several representative samples with an explicit account of pores of the same size but with the random unique position in space. As a result, we get the average values of Young’s modulus and strength, as well as the parameters of the Weibull distribution of these properties at the current scale level. These data allow us to describe the material behaviour at the next scale level were only the larger pores are considered explicitly, while the influence of small pores is included via the effective properties determined at the previous scale level. If the pore size distribution function of the material has N maxima we need to perform computations for N - 1 levels in order to get the properties from the lowest scale up to the macroscale step by step. The proposed approach was applied to modelling zirconia ceramics with bimodal pore size distribution. The obtained results show correct behaviour of the model sample at the macroscale.

  15. Determination of Hyaluronan Molecular Mass Distribution in Human Breast Milk

    PubMed Central

    Yuan, Han; Amin, Ripal; Ye, Xin; De La Motte, Carol A.; Cowman, Mary K.

    2015-01-01

    Hyaluronan (HA) in human milk mediates host responses to microbial infection, via TLR4- and CD44-dependent signaling. Signaling by HA is generally size-specific. Because pure HA with average molecular mass (M) of 35 kDa can elicit a protective response in intestinal epithelial cells, it has been proposed that human milk HA may have a bioactive low M component. Here we report the size distribution of HA in human milk samples from twenty unique donors. A new method for HA analysis, employingion exchange (IEX) chromatography to fractionate HA by size, and specific quantification of each size fraction by competitive Enzyme Linked Sorbent Assay (ELSA), was developed. When separated into four fractions, milk HA with M ≤ 20 kDa, M ≈20-60 kDa, and M ≈ 60-110 kDa comprised an average of 1.5%, 1.4% and 2% of the total HA, respectively. The remaining 95% was HA with M≥110 kDa. Electrophoretic analysis of the higher M HA from thirteen samples showed nearly identical M distributions, with an average M of ∼440 kDa. This higher M HA component in human milk is proposed to bind to CD44 and to enhance human beta defensin 2 (HBD2) induction by the low M HA components. PMID:25579786

  16. Random forests ensemble classifier trained with data resampling strategy to improve cardiac arrhythmia diagnosis.

    PubMed

    Ozçift, Akin

    2011-05-01

    Supervised classification algorithms are commonly used in the designing of computer-aided diagnosis systems. In this study, we present a resampling strategy based Random Forests (RF) ensemble classifier to improve diagnosis of cardiac arrhythmia. Random forests is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the class's output by individual trees. In this way, an RF ensemble classifier performs better than a single tree from classification performance point of view. In general, multiclass datasets having unbalanced distribution of sample sizes are difficult to analyze in terms of class discrimination. Cardiac arrhythmia is such a dataset that has multiple classes with small sample sizes and it is therefore adequate to test our resampling based training strategy. The dataset contains 452 samples in fourteen types of arrhythmias and eleven of these classes have sample sizes less than 15. Our diagnosis strategy consists of two parts: (i) a correlation based feature selection algorithm is used to select relevant features from cardiac arrhythmia dataset. (ii) RF machine learning algorithm is used to evaluate the performance of selected features with and without simple random sampling to evaluate the efficiency of proposed training strategy. The resultant accuracy of the classifier is found to be 90.0% and this is a quite high diagnosis performance for cardiac arrhythmia. Furthermore, three case studies, i.e., thyroid, cardiotocography and audiology, are used to benchmark the effectiveness of the proposed method. The results of experiments demonstrated the efficiency of random sampling strategy in training RF ensemble classification algorithm. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Adaptive Annealed Importance Sampling for Multimodal Posterior Exploration and Model Selection with Application to Extrasolar Planet Detection

    NASA Astrophysics Data System (ADS)

    Liu, Bin

    2014-07-01

    We describe an algorithm that can adaptively provide mixture summaries of multimodal posterior distributions. The parameter space of the involved posteriors ranges in size from a few dimensions to dozens of dimensions. This work was motivated by an astrophysical problem called extrasolar planet (exoplanet) detection, wherein the computation of stochastic integrals that are required for Bayesian model comparison is challenging. The difficulty comes from the highly nonlinear models that lead to multimodal posterior distributions. We resort to importance sampling (IS) to estimate the integrals, and thus translate the problem to be how to find a parametric approximation of the posterior. To capture the multimodal structure in the posterior, we initialize a mixture proposal distribution and then tailor its parameters elaborately to make it resemble the posterior to the greatest extent possible. We use the effective sample size (ESS) calculated based on the IS draws to measure the degree of approximation. The bigger the ESS is, the better the proposal resembles the posterior. A difficulty within this tailoring operation lies in the adjustment of the number of mixing components in the mixture proposal. Brute force methods just preset it as a large constant, which leads to an increase in the required computational resources. We provide an iterative delete/merge/add process, which works in tandem with an expectation-maximization step to tailor such a number online. The efficiency of our proposed method is tested via both simulation studies and real exoplanet data analysis.

  18. The effective elastic properties of human trabecular bone may be approximated using micro-finite element analyses of embedded volume elements.

    PubMed

    Daszkiewicz, Karol; Maquer, Ghislain; Zysset, Philippe K

    2017-06-01

    Boundary conditions (BCs) and sample size affect the measured elastic properties of cancellous bone. Samples too small to be representative appear stiffer under kinematic uniform BCs (KUBCs) than under periodicity-compatible mixed uniform BCs (PMUBCs). To avoid those effects, we propose to determine the effective properties of trabecular bone using an embedded configuration. Cubic samples of various sizes (2.63, 5.29, 7.96, 10.58 and 15.87 mm) were cropped from [Formula: see text] scans of femoral heads and vertebral bodies. They were converted into [Formula: see text] models and their stiffness tensor was established via six uniaxial and shear load cases. PMUBCs- and KUBCs-based tensors were determined for each sample. "In situ" stiffness tensors were also evaluated for the embedded configuration, i.e. when the loads were transmitted to the samples via a layer of trabecular bone. The Zysset-Curnier model accounting for bone volume fraction and fabric anisotropy was fitted to those stiffness tensors, and model parameters [Formula: see text] (Poisson's ratio) [Formula: see text] and [Formula: see text] (elastic and shear moduli) were compared between sizes. BCs and sample size had little impact on [Formula: see text]. However, KUBCs- and PMUBCs-based [Formula: see text] and [Formula: see text], respectively, decreased and increased with growing size, though convergence was not reached even for our largest samples. Both BCs produced upper and lower bounds for the in situ values that were almost constant across samples dimensions, thus appearing as an approximation of the effective properties. PMUBCs seem also appropriate for mimicking the trabecular core, but they still underestimate its elastic properties (especially in shear) even for nearly orthotropic samples.

  19. Rectification of depth measurement using pulsed thermography with logarithmic peak second derivative method

    NASA Astrophysics Data System (ADS)

    Li, Xiaoli; Zeng, Zhi; Shen, Jingling; Zhang, Cunlin; Zhao, Yuejin

    2018-03-01

    Logarithmic peak second derivative (LPSD) method is the most popular method for depth prediction in pulsed thermography. It is widely accepted that this method is independent of defect size. The theoretical model for LPSD method is based on the one-dimensional solution of heat conduction without considering the effect of defect size. When a decay term considering defect aspect ratio is introduced into the solution to correct the three-dimensional thermal diffusion effect, we found that LPSD method is affected by defect size by analytical model. Furthermore, we constructed the relation between the characteristic time of LPSD method and defect aspect ratio, which was verified with the experimental results of stainless steel and glass fiber reinforced plate (GFRP) samples. We also proposed an improved LPSD method for depth prediction when the effect of defect size was considered, and the rectification results of stainless steel and GFRP samples were presented and discussed.

  20. A new method for estimating the demographic history from DNA sequences: an importance sampling approach

    PubMed Central

    Ait Kaci Azzou, Sadoune; Larribe, Fabrice; Froda, Sorana

    2015-01-01

    The effective population size over time (demographic history) can be retraced from a sample of contemporary DNA sequences. In this paper, we propose a novel methodology based on importance sampling (IS) for exploring such demographic histories. Our starting point is the generalized skyline plot with the main difference being that our procedure, skywis plot, uses a large number of genealogies. The information provided by these genealogies is combined according to the IS weights. Thus, we compute a weighted average of the effective population sizes on specific time intervals (epochs), where the genealogies that agree more with the data are given more weight. We illustrate by a simulation study that the skywis plot correctly reconstructs the recent demographic history under the scenarios most commonly considered in the literature. In particular, our method can capture a change point in the effective population size, and its overall performance is comparable with the one of the bayesian skyline plot. We also introduce the case of serially sampled sequences and illustrate that it is possible to improve the performance of the skywis plot in the case of an exponential expansion of the effective population size. PMID:26300910

  1. Dual-window dual-bandwidth spectroscopic optical coherence tomography metric for qualitative scatterer size differentiation in tissues.

    PubMed

    Tay, Benjamin Chia-Meng; Chow, Tzu-Hao; Ng, Beng-Koon; Loh, Thomas Kwok-Seng

    2012-09-01

    This study investigates the autocorrelation bandwidths of dual-window (DW) optical coherence tomography (OCT) k-space scattering profile of different-sized microspheres and their correlation to scatterer size. A dual-bandwidth spectroscopic metric defined as the ratio of the 10% to 90% autocorrelation bandwidths is found to change monotonically with microsphere size and gives the best contrast enhancement for scatterer size differentiation in the resulting spectroscopic image. A simulation model supports the experimental results and revealed a tradeoff between the smallest detectable scatterer size and the maximum scatterer size in the linear range of the dual-window dual-bandwidth (DWDB) metric, which depends on the choice of the light source optical bandwidth. Spectroscopic OCT (SOCT) images of microspheres and tonsil tissue samples based on the proposed DWDB metric showed clear differentiation between different-sized scatterers as compared to those derived from conventional short-time Fourier transform metrics. The DWDB metric significantly improves the contrast in SOCT imaging and can aid the visualization and identification of dissimilar scatterer size in a sample. Potential applications include the early detection of cell nuclear changes in tissue carcinogenesis, the monitoring of healing tendons, and cell proliferation in tissue scaffolds.

  2. Dimensions of design space: a decision-theoretic approach to optimal research design.

    PubMed

    Conti, Stefano; Claxton, Karl

    2009-01-01

    Bayesian decision theory can be used not only to establish the optimal sample size and its allocation in a single clinical study but also to identify an optimal portfolio of research combining different types of study design. Within a single study, the highest societal payoff to proposed research is achieved when its sample sizes and allocation between available treatment options are chosen to maximize the expected net benefit of sampling (ENBS). Where a number of different types of study informing different parameters in the decision problem could be conducted, the simultaneous estimation of ENBS across all dimensions of the design space is required to identify the optimal sample sizes and allocations within such a research portfolio. This is illustrated through a simple example of a decision model of zanamivir for the treatment of influenza. The possible study designs include: 1) a single trial of all the parameters, 2) a clinical trial providing evidence only on clinical endpoints, 3) an epidemiological study of natural history of disease, and 4) a survey of quality of life. The possible combinations, samples sizes, and allocation between trial arms are evaluated over a range of cost-effectiveness thresholds. The computational challenges are addressed by implementing optimization algorithms to search the ENBS surface more efficiently over such large dimensions.

  3. Influence of specimen dimensions on ductile-to-brittle transition temperature in Charpy impact test

    NASA Astrophysics Data System (ADS)

    Rzepa, S.; Bucki, T.; Konopík, P.; Džugan, J.; Rund, M.; Procházka, R.

    2017-02-01

    This paper discusses the correlation between specimen dimensions and transition temperature. Notch toughness properties of Standard Charpy-V specimens are compared to samples with lower width (7.5 mm, 5 mm, 2.5 mm) and sub-size Charpy specimens with cross section 3×4. In this study transition curves are correlated with lateral ductile part of fracture related ones for 5 considered geometries. Based on the results obtained, correlation procedure for transition temperature determination of full size specimens defined by fracture appearance of sub-sized specimens is proposed.

  4. Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

    PubMed Central

    Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu

    2013-01-01

    The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920

  5. Chandra Observations of Three Newly Discovered Quadruply Gravitationally Lensed Quasars

    NASA Astrophysics Data System (ADS)

    Pooley, David

    2017-09-01

    Our previous work has shown the unique power of Chandra observations of quadruply gravitationally lensed quasars to address several fundamental astrophysical issues. We have used these observations to (1) determine the cause of flux ratio anomalies, (2) measure the sizes of quasar accretion disks, (3) determine the dark matter content of the lensing galaxies, and (4) measure the stellar mass-to-light ratio (in fact, this is the only way to measure the stellar mass-to-light ratio beyond the solar neighborhood). In all cases, the main source of uncertainty in our results is the small size of the sample of known quads; only 15 systems are available for study with Chandra. We propose Chandra observations of three newly discovered quads, increasing the sample size by 20%

  6. A novel approach for small sample size family-based association studies: sequential tests.

    PubMed

    Ilk, Ozlem; Rajabli, Farid; Dungul, Dilay Ciglidag; Ozdag, Hilal; Ilk, Hakki Gokhan

    2011-08-01

    In this paper, we propose a sequential probability ratio test (SPRT) to overcome the problem of limited samples in studies related to complex genetic diseases. The results of this novel approach are compared with the ones obtained from the traditional transmission disequilibrium test (TDT) on simulated data. Although TDT classifies single-nucleotide polymorphisms (SNPs) to only two groups (SNPs associated with the disease and the others), SPRT has the flexibility of assigning SNPs to a third group, that is, those for which we do not have enough evidence and should keep sampling. It is shown that SPRT results in smaller ratios of false positives and negatives, as well as better accuracy and sensitivity values for classifying SNPs when compared with TDT. By using SPRT, data with small sample size become usable for an accurate association analysis.

  7. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

    PubMed Central

    Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

    2016-01-01

    Outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme where one observes the exposure with a probability that depends on the outcome. The well-known such design is the case-control design for binary response, the case-cohort design for the failure time data and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies for taking the advantage of the available responses for subjects in a cluster, we propose a multivariate outcome dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric where all the underlying distributions of covariates are modeled nonparametrically using the empirical likelihood methods. We show that the proposed estimator is consistent and developed the asymptotically normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of PCB exposure to hearing loss in children born to the Collaborative Perinatal Study. PMID:27966260

  8. Brief Report: Accuracy and Response Time for the Recognition of Facial Emotions in a Large Sample of Children with Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Fink, Elian; de Rosnay, Marc; Wierda, Marlies; Koot, Hans M.; Begeer, Sander

    2014-01-01

    The empirical literature has presented inconsistent evidence for deficits in the recognition of basic emotion expressions in children with autism spectrum disorders (ASD), which may be due to the focus on research with relatively small sample sizes. Additionally, it is proposed that although children with ASD may correctly identify emotion…

  9. African Primary Care Research: qualitative interviewing in primary care.

    PubMed

    Reid, Steve; Mash, Bob

    2014-06-05

    This article is part of a series on African Primary Care Research and focuses on the topic of qualitative interviewing in primary care. In particular it looks at issues of study design, sample size, sampling and interviewing in relation to individual and focus group interviews.There is a particular focus on helping postgraduate students at a Masters level to write their research proposals.

  10. Factors Influencing Teachers' Competence in Developing Resilience in Vulnerable Children in Primary Schools in Uasin Gishu County, Kenya

    ERIC Educational Resources Information Center

    Silyvier, Tsindoli; Nyandusi, Charles

    2015-01-01

    The purpose of the study was to assess the effect of teacher characteristics on their competence in developing resilience in vulnerable primary school children. A descriptive survey research design was used. This study was based on resiliency theory as proposed by Krovetz (1998). Simple random sampling was used to select a sample size of 108…

  11. Sample size re-estimation and other midcourse adjustments with sequential parallel comparison design.

    PubMed

    Silverman, Rachel K; Ivanova, Anastasia

    2017-01-01

    Sequential parallel comparison design (SPCD) was proposed to reduce placebo response in a randomized trial with placebo comparator. Subjects are randomized between placebo and drug in stage 1 of the trial, and then, placebo non-responders are re-randomized in stage 2. Efficacy analysis includes all data from stage 1 and all placebo non-responding subjects from stage 2. This article investigates the possibility to re-estimate the sample size and adjust the design parameters, allocation proportion to placebo in stage 1 of SPCD, and weight of stage 1 data in the overall efficacy test statistic during an interim analysis.

  12. Power of tests for comparing trend curves with application to national immunization survey (NIS).

    PubMed

    Zhao, Zhen

    2011-02-28

    To develop statistical tests for comparing trend curves of study outcomes between two socio-demographic strata across consecutive time points, and compare statistical power of the proposed tests under different trend curves data, three statistical tests were proposed. For large sample size with independent normal assumption among strata and across consecutive time points, the Z and Chi-square test statistics were developed, which are functions of outcome estimates and the standard errors at each of the study time points for the two strata. For small sample size with independent normal assumption, the F-test statistic was generated, which is a function of sample size of the two strata and estimated parameters across study period. If two trend curves are approximately parallel, the power of Z-test is consistently higher than that of both Chi-square and F-test. If two trend curves cross at low interaction, the power of Z-test is higher than or equal to the power of both Chi-square and F-test; however, at high interaction, the powers of Chi-square and F-test are higher than that of Z-test. The measurement of interaction of two trend curves was defined. These tests were applied to the comparison of trend curves of vaccination coverage estimates of standard vaccine series with National Immunization Survey (NIS) 2000-2007 data. Copyright © 2011 John Wiley & Sons, Ltd.

  13. Estimation of signal-dependent noise level function in transform domain via a sparse recovery model.

    PubMed

    Yang, Jingyu; Gan, Ziqiao; Wu, Zhaoyang; Hou, Chunping

    2015-05-01

    This paper proposes a novel algorithm to estimate the noise level function (NLF) of signal-dependent noise (SDN) from a single image based on the sparse representation of NLFs. Noise level samples are estimated from the high-frequency discrete cosine transform (DCT) coefficients of nonlocal-grouped low-variation image patches. Then, an NLF recovery model based on the sparse representation of NLFs under a trained basis is constructed to recover NLF from the incomplete noise level samples. Confidence levels of the NLF samples are incorporated into the proposed model to promote reliable samples and weaken unreliable ones. We investigate the behavior of the estimation performance with respect to the block size, sampling rate, and confidence weighting. Simulation results on synthetic noisy images show that our method outperforms existing state-of-the-art schemes. The proposed method is evaluated on real noisy images captured by three types of commodity imaging devices, and shows consistently excellent SDN estimation performance. The estimated NLFs are incorporated into two well-known denoising schemes, nonlocal means and BM3D, and show significant improvements in denoising SDN-polluted images.

  14. 77 FR 35658 - Proposed Collection; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-14

    ... rulemaking mandates, the Bureau seeks to collect qualitative information from industry participants regarding... burdens on their respective institutions. The Bureau recognizes that burdens vary depending on the size... collections of information will seek to sample providers that are representative of markets affected by a...

  15. 75 FR 12003 - Investing in Innovation Fund

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-12

    ..., Proposed Practice, Strategy, or implemented experimental implemented strategy, or program, Program. study or well-designed experimental or quasi- or one similar to it, and well-implemented experimental study, has been attempted quasi-experimental with small sample sizes previously, albeit on a study; or (2...

  16. Variable-Size Bead Layer as Standard Reference for Endothelial Microscopes.

    PubMed

    Tufo, Simona; Prazzoli, Erica; Ferraro, Lorenzo; Cozza, Federica; Borghesi, Alessandro; Tavazzi, Silvia

    2017-02-01

    For morphometric analysis of the cell mosaic of corneal endothelium, checking accuracy and precision of instrumentation is a key step. In this study, a standard reference sample is proposed, developed to reproduce the cornea with its shape and the endothelium with its intrinsic variability in the cell size. A polystyrene bead layer (representing the endothelium) was deposited on a lens (representing the cornea). Bead diameters were 20, 25, and 30 μm (fractions in number 55%, 30%, and 15%, respectively). Bead density and hexagonality were simulated to obtain the expected true values and measured using a slit-lamp endothelial microscope applied to 1) a Takagi 700GL slit lamp at 40× magnification (recommended standard setup) and 2) a Takagi 2ZL slit lamp at 25× magnification. The simulation provided the expected bead density 2001 mm and hexagonality 47%. At 40×, density and hexagonality were measured to be 2009 mm (SD 93 mm) and 45% (SD 3%). At 25× on a different slit lamp, the comparison between measured and expected densities provided the factor 1.526 to resize the image and to use the current algorithms of the slit-lamp endothelial microscope for cell recognition. A variable-size polystyrene bead layer on a lens is proposed as a standard sample mimicking the real shape of the cornea and the variability of cell size and cell arrangement of corneal endothelium. The sample is suggested to evaluate accuracy and precision of cell density and hexagonality obtained by different endothelial microscopes, including a slit-lamp endothelial microscope applied to different slit lamps, also at different magnifications.

  17. Sequencing the Earliest Stages of Active Galactic Nuclei Development Using The Youngest Radio Sources

    NASA Astrophysics Data System (ADS)

    Collier, Jordan; Filipovic, Miroslav; Norris, Ray; Chow, Kate; Huynh, Minh; Banfield, Julie; Tothill, Nick; Sirothia, Sandeep Kumar; Shabala, Stanislav

    2014-04-01

    This proposal is a continuation of an extensive project (the core of Collier's PhD) to explore the earliest stages of AGN formation, using Gigahertz-Peaked Spectrum (GPS) and Compact Steep Spectrum (CSS) sources. Both are widely believed to represent the earliest stages of radio-loud AGN evolution, with GPS sources preceding CSS sources. In this project, we plan to (a) test this hypothesis, (b) place GPS and CSS sources into an evolutionary sequence with a number of other young AGN candidates, and (c) search for evidence of the evolving accretion mode. We will do this using high-resolution radio observations, with a number of other multiwavelength age indicators, of a carefully selected complete faint sample of 80 GPS/CSS sources. Analysis of the C2730 ELAIS-S1 data shows that we have so far met our goals, resolving the jets of 10/49 sources, and measuring accurate spectral indices from 0.843-10 GHz. This particular proposal is to almost triple the sample size by observing an additional 80 GPS/CSS sources in the Chandra Deep Field South (arguably the best-studied field) and allow a turnover frequency - linear size relation to be derived at >10-sigma. Sources found to be unresolved in our final sample will subsequently be observed with VLBI. Comparing those sources resolved with ATCA to the more compact sources resolved with VLBI will give a distribution of source sizes, helping to answer the question of whether all GPS/CSS sources grow to larger sizes.

  18. Performance of a Line Loss Correction Method for Gas Turbine Emission Measurements

    NASA Astrophysics Data System (ADS)

    Hagen, D. E.; Whitefield, P. D.; Lobo, P.

    2015-12-01

    International concern for the environmental impact of jet engine exhaust emissions in the atmosphere has led to increased attention on gas turbine engine emission testing. The Society of Automotive Engineers Aircraft Exhaust Emissions Measurement Committee (E-31) has published an Aerospace Information Report (AIR) 6241 detailing the sampling system for the measurement of non-volatile particulate matter from aircraft engines, and is developing an Aerospace Recommended Practice (ARP) for methodology and system specification. The Missouri University of Science and Technology (MST) Center for Excellence for Aerospace Particulate Emissions Reduction Research has led numerous jet engine exhaust sampling campaigns to characterize emissions at different locations in the expanding exhaust plume. Particle loss, due to various mechanisms, occurs in the sampling train that transports the exhaust sample from the engine exit plane to the measurement instruments. To account for the losses, both the size dependent penetration functions and the size distribution of the emitted particles need to be known. However in the proposed ARP, particle number and mass are measured, but size is not. Here we present a methodology to generate number and mass correction factors for line loss, without using direct size measurement. A lognormal size distribution is used to represent the exhaust aerosol at the engine exit plane and is defined by the measured number and mass at the downstream end of the sample train. The performance of this line loss correction is compared to corrections based on direct size measurements using data taken by MST during numerous engine test campaigns. The experimental uncertainty in these correction factors is estimated. Average differences between the line loss correction method and size based corrections are found to be on the order of 10% for number and 2.5% for mass.

  19. Orphan therapies: making best use of postmarket data.

    PubMed

    Maro, Judith C; Brown, Jeffrey S; Dal Pan, Gerald J; Li, Lingling

    2014-08-01

    Postmarket surveillance of the comparative safety and efficacy of orphan therapeutics is challenging, particularly when multiple therapeutics are licensed for the same orphan indication. To make best use of product-specific registry data collected to fulfill regulatory requirements, we propose the creation of a distributed electronic health data network among registries. Such a network could support sequential statistical analyses designed to detect early warnings of excess risks. We use a simulated example to explore the circumstances under which a distributed network may prove advantageous. We perform sample size calculations for sequential and non-sequential statistical studies aimed at comparing the incidence of hepatotoxicity following initiation of two newly licensed therapies for homozygous familial hypercholesterolemia. We calculate the sample size savings ratio, or the proportion of sample size saved if one conducted a sequential study as compared to a non-sequential study. Then, using models to describe the adoption and utilization of these therapies, we simulate when these sample sizes are attainable in calendar years. We then calculate the analytic calendar time savings ratio, analogous to the sample size savings ratio. We repeat these analyses for numerous scenarios. Sequential analyses detect effect sizes earlier or at the same time as non-sequential analyses. The most substantial potential savings occur when the market share is more imbalanced (i.e., 90% for therapy A) and the effect size is closest to the null hypothesis. However, due to low exposure prevalence, these savings are difficult to realize within the 30-year time frame of this simulation for scenarios in which the outcome of interest occurs at or more frequently than one event/100 person-years. We illustrate a process to assess whether sequential statistical analyses of registry data performed via distributed networks may prove a worthwhile infrastructure investment for pharmacovigilance.

  20. Simultaneous Determination of Size and Quantification of Gold Nanoparticles by Direct Coupling Thin layer Chromatography with Catalyzed Luminol Chemiluminescence

    PubMed Central

    Yan, Neng; Zhu, Zhenli; He, Dong; Jin, Lanlan; Zheng, Hongtao; Hu, Shenghong

    2016-01-01

    The increasing use of metal-based nanoparticle products has raised concerns in particular for the aquatic environment and thus the quantification of such nanomaterials released from products should be determined to assess their environmental risks. In this study, a simple, rapid and sensitive method for the determination of size and mass concentration of gold nanoparticles (AuNPs) in aqueous suspension was established by direct coupling of thin layer chromatography (TLC) with catalyzed luminol-H2O2 chemiluminescence (CL) detection. For this purpose, a moving stage was constructed to scan the chemiluminescence signal from TLC separated AuNPs. The proposed TLC-CL method allows the quantification of differently sized AuNPs (13 nm, 41 nm and 100 nm) contained in a mixture. Various experimental parameters affecting the characterization of AuNPs, such as the concentration of H2O2, the concentration and pH of the luminol solution, and the size of the spectrometer aperture were investigated. Under optimal conditions, the detection limits for AuNP size fractions of 13 nm, 41 nm and 100 nm were 38.4 μg L−1, 35.9 μg L−1 and 39.6 μg L−1, with repeatabilities (RSD, n = 7) of 7.3%, 6.9% and 8.1% respectively for 10 mg L−1 samples. The proposed method was successfully applied to the characterization of AuNP size and concentration in aqueous test samples. PMID:27080702

  1. High-speed adaptive contact-mode atomic force microscopy imaging with near-minimum-force

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Juan; Zou, Qingze, E-mail: qzzou@rci.rutgers.edu

    In this paper, an adaptive contact-mode imaging approach is proposed to replace the traditional contact-mode imaging by addressing the major concerns in both the speed and the force exerted to the sample. The speed of the traditional contact-mode imaging is largely limited by the need to maintain precision tracking of the sample topography over the entire imaged sample surface, while large image distortion and excessive probe-sample interaction force occur during high-speed imaging. In this work, first, the image distortion caused by the topography tracking error is accounted for in the topography quantification. Second, the quantified sample topography is utilized inmore » a gradient-based optimization method to adjust the cantilever deflection set-point for each scanline closely around the minimal level needed for maintaining stable probe-sample contact, and a data-driven iterative feedforward control that utilizes a prediction of the next-line topography is integrated to the topography feeedback loop to enhance the sample topography tracking. The proposed approach is demonstrated and evaluated through imaging a calibration sample of square pitches at both high speeds (e.g., scan rate of 75 Hz and 130 Hz) and large sizes (e.g., scan size of 30 μm and 80 μm). The experimental results show that compared to the traditional constant-force contact-mode imaging, the imaging speed can be increased by over 30 folds (with the scanning speed at 13 mm/s), and the probe-sample interaction force can be reduced by more than 15% while maintaining the same image quality.« less

  2. High-speed adaptive contact-mode atomic force microscopy imaging with near-minimum-force.

    PubMed

    Ren, Juan; Zou, Qingze

    2014-07-01

    In this paper, an adaptive contact-mode imaging approach is proposed to replace the traditional contact-mode imaging by addressing the major concerns in both the speed and the force exerted to the sample. The speed of the traditional contact-mode imaging is largely limited by the need to maintain precision tracking of the sample topography over the entire imaged sample surface, while large image distortion and excessive probe-sample interaction force occur during high-speed imaging. In this work, first, the image distortion caused by the topography tracking error is accounted for in the topography quantification. Second, the quantified sample topography is utilized in a gradient-based optimization method to adjust the cantilever deflection set-point for each scanline closely around the minimal level needed for maintaining stable probe-sample contact, and a data-driven iterative feedforward control that utilizes a prediction of the next-line topography is integrated to the topography feeedback loop to enhance the sample topography tracking. The proposed approach is demonstrated and evaluated through imaging a calibration sample of square pitches at both high speeds (e.g., scan rate of 75 Hz and 130 Hz) and large sizes (e.g., scan size of 30 μm and 80 μm). The experimental results show that compared to the traditional constant-force contact-mode imaging, the imaging speed can be increased by over 30 folds (with the scanning speed at 13 mm/s), and the probe-sample interaction force can be reduced by more than 15% while maintaining the same image quality.

  3. Design of the value of imaging in enhancing the wellness of your heart (VIEW) trial and the impact of uncertainty on power.

    PubMed

    Ambrosius, Walter T; Polonsky, Tamar S; Greenland, Philip; Goff, David C; Perdue, Letitia H; Fortmann, Stephen P; Margolis, Karen L; Pajewski, Nicholas M

    2012-04-01

    Although observational evidence has suggested that the measurement of coronary artery calcium (CAC) may improve risk stratification for cardiovascular events and thus help guide the use of lipid-lowering therapy, this contention has not been evaluated within the context of a randomized trial. The Value of Imaging in Enhancing the Wellness of Your Heart (VIEW) trial is proposed as a randomized study in participants at low intermediate risk of future coronary heart disease (CHD) events to evaluate whether CAC testing leads to improved patient outcomes. To describe the challenges encountered in designing a prototypical screening trial and to examine the impact of uncertainty on power. The VIEW trial was designed as an effectiveness clinical trial to examine the benefit of CAC testing to guide therapy on a primary outcome consisting of a composite of nonfatal myocardial infarction, probable or definite angina with revascularization, resuscitated cardiac arrest, nonfatal stroke (not transient ischemic attack (TIA)), CHD death, stroke death, other atherosclerotic death, or other cardiovascular disease (CVD) death. Many critical choices were faced in designing the trial, including (1) the choice of primary outcome, (2) the choice of therapy, (3) the target population with corresponding ethical issues, (4) specifications of assumptions for sample size calculations, and (5) impact of uncertainty in these assumptions on power/sample size determination. We have proposed a sample size of 30,000 (800 events), which provides 92.7% power. Alternatively, sample sizes of 20,228 (539 events), 23,138 (617 events), and 27,078 (722 events) provide 80%, 85%, and 90% power. We have also allowed for uncertainty in our assumptions by computing average power integrated over specified prior distributions. This relaxation of specificity indicates a reduction in power, dropping to 89.9% (95% confidence interval (CI): 89.8-89.9) for a sample size of 30,000. Samples sizes of 20,228, 23,138, and 27,078 provide power of 78.0% (77.9-78.0), 82.5% (82.5-82.6), and 87.2% (87.2-87.3), respectively. These power estimates are dependent on form and parameters of the prior distributions. Despite the pressing need for a randomized trial to evaluate the utility of CAC testing, conduct of such a trial requires recruiting a large patient population, making efficiency of critical importance. The large sample size is primarily due to targeting a study population at relatively low risk of a CVD event. Our calculations also illustrate the importance of formally considering uncertainty in power calculations of large trials as standard power calculations may tend to overestimate power.

  4. Design of the Value of Imaging in Enhancing the Wellness of Your Heart (VIEW) Trial and the Impact of Uncertainty on Power

    PubMed Central

    Ambrosius, Walter T.; Polonsky, Tamar S.; Greenland, Philip; Goff, David C.; Perdue, Letitia H.; Fortmann, Stephen P.; Margolis, Karen L.; Pajewski, Nicholas M.

    2014-01-01

    Background Although observational evidence has suggested that the measurement of CAC may improve risk stratification for cardiovascular events and thus help guide the use of lipid-lowering therapy, this contention has not been evaluated within the context of a randomized trial. The Value of Imaging in Enhancing the Wellness of Your Heart (VIEW) trial is proposed as a randomized study in participants at low intermediate risk of future coronary heart disease (CHD) events to evaluate whether coronary artery calcium (CAC) testing leads to improved patient outcomes. Purpose To describe the challenges encountered in designing a prototypical screening trial and to examine the impact of uncertainty on power. Methods The VIEW trial was designed as an effectiveness clinical trial to examine the benefit of CAC testing to guide therapy on a primary outcome consisting of a composite of non-fatal myocardial infarction, probable or definite angina with revascularization, resuscitated cardiac arrest, non-fatal stroke (not transient ischemic attack (TIA)), CHD death, stroke death, other atherosclerotic death, or other cardiovascular disease (CVD) death. Many critical choices were faced in designing the trial, including: (1) the choice of primary outcome, (2) the choice of therapy, (3) the target population with corresponding ethical issues, (4) specifications of assumptions for sample size calculations, and (5) impact of uncertainty in these assumptions on power/sample size determination. Results We have proposed a sample size of 30,000 (800 events) which provides 92.7% power. Alternatively, sample sizes of 20,228 (539 events), 23,138 (617 events) and 27,078 (722 events) provide 80, 85, and 90% power. We have also allowed for uncertainty in our assumptions by computing average power integrated over specified prior distributions. This relaxation of specificity indicates a reduction in power, dropping to 89.9% (95% confidence interval (CI): 89.8 to 89.9) for a sample size of 30,000. Samples sizes of 20,228, 23,138, and 27,078 provide power of 78.0% (77.9 to 78.0), 82.5% (82.5 to 82.6), and 87.2% (87.2 to 87.3), respectively. Limitations These power estimates are dependent on form and parameters of the prior distributions. Conclusions Despite the pressing need for a randomized trial to evaluate the utility of CAC testing, conduct of such a trial requires recruiting a large patient population, making efficiency of critical importance. The large sample size is primarily due to targeting a study population at relatively low risk of a CVD event. Our calculations also illustrate the importance of formally considering uncertainty in power calculations of large trials as standard power calculations may tend to overestimate power. PMID:22333998

  5. The albatross plot: A novel graphical tool for presenting results of diversely reported studies in a systematic review

    PubMed Central

    Jones, Hayley E.; Martin, Richard M.; Lewis, Sarah J.; Higgins, Julian P.T.

    2017-01-01

    Abstract Meta‐analyses combine the results of multiple studies of a common question. Approaches based on effect size estimates from each study are generally regarded as the most informative. However, these methods can only be used if comparable effect sizes can be computed from each study, and this may not be the case due to variation in how the studies were done or limitations in how their results were reported. Other methods, such as vote counting, are then used to summarize the results of these studies, but most of these methods are limited in that they do not provide any indication of the magnitude of effect. We propose a novel plot, the albatross plot, which requires only a 1‐sided P value and a total sample size from each study (or equivalently a 2‐sided P value, direction of effect and total sample size). The plot allows an approximate examination of underlying effect sizes and the potential to identify sources of heterogeneity across studies. This is achieved by drawing contours showing the range of effect sizes that might lead to each P value for given sample sizes, under simple study designs. We provide examples of albatross plots using data from previous meta‐analyses, allowing for comparison of results, and an example from when a meta‐analysis was not possible. PMID:28453179

  6. The two-sample problem with induced dependent censorship.

    PubMed

    Huang, Y

    1999-12-01

    Induced dependent censorship is a general phenomenon in health service evaluation studies in which a measure such as quality-adjusted survival time or lifetime medical cost is of interest. We investigate the two-sample problem and propose two classes of nonparametric tests. Based on consistent estimation of the survival function for each sample, the two classes of test statistics examine the cumulative weighted difference in hazard functions and in survival functions. We derive a unified asymptotic null distribution theory and inference procedure. The tests are applied to trial V of the International Breast Cancer Study Group and show that long duration chemotherapy significantly improves time without symptoms of disease and toxicity of treatment as compared with the short duration treatment. Simulation studies demonstrate that the proposed tests, with a wide range of weight choices, perform well under moderate sample sizes.

  7. Experimental study on microsphere assisted nanoscope in non-contact mode

    NASA Astrophysics Data System (ADS)

    Ling, Jinzhong; Li, Dancui; Liu, Xin; Wang, Xiaorui

    2018-07-01

    Microsphere assisted nanoscope was proposed in existing literatures to capture super-resolution images of the nano-structures beneath the microsphere attached on sample surface. In this paper, a microsphere assisted nanoscope working in non-contact mode is designed and demonstrated, in which the microsphere is controlled with a gap separated to sample surface. With a gap, the microsphere is moved in parallel to sample surface non-invasively, so as to observe all the areas of interest. Furthermore, the influence of gap size on image resolution is studied experimentally. Only when the microsphere is close enough to the sample surface, super-resolution image could be obtained. Generally, the resolution decreases when the gap increases as the contribution of evanescent wave disappears. To keep an appropriate gap size, a quantitative method is implemented to estimate the gap variation by observing Newton's rings around the microsphere, serving as a real-time feedback for tuning the gap size. With a constant gap, large-area image with high resolution can be obtained during microsphere scanning. Our study of non-contact mode makes the microsphere assisted nanoscope more practicable and easier to implement.

  8. Generalizing the Network Scale-Up Method: A New Estimator for the Size of Hidden Populations*

    PubMed Central

    Feehan, Dennis M.; Salganik, Matthew J.

    2018-01-01

    The network scale-up method enables researchers to estimate the size of hidden populations, such as drug injectors and sex workers, using sampled social network data. The basic scale-up estimator offers advantages over other size estimation techniques, but it depends on problematic modeling assumptions. We propose a new generalized scale-up estimator that can be used in settings with non-random social mixing and imperfect awareness about membership in the hidden population. Further, the new estimator can be used when data are collected via complex sample designs and from incomplete sampling frames. However, the generalized scale-up estimator also requires data from two samples: one from the frame population and one from the hidden population. In some situations these data from the hidden population can be collected by adding a small number of questions to already planned studies. For other situations, we develop interpretable adjustment factors that can be applied to the basic scale-up estimator. We conclude with practical recommendations for the design and analysis of future studies. PMID:29375167

  9. Estimation of within-stratum variance for sample allocation: Foreign commodity production forecasting

    NASA Technical Reports Server (NTRS)

    Chhikara, R. S.; Perry, C. R., Jr. (Principal Investigator)

    1980-01-01

    The problem of determining the stratum variances required for an optimum sample allocation for remotely sensed crop surveys is investigated with emphasis on an approach based on the concept of stratum variance as a function of the sampling unit size. A methodology using the existing and easily available information of historical statistics is developed for obtaining initial estimates of stratum variances. The procedure is applied to variance for wheat in the U.S. Great Plains and is evaluated based on the numerical results obtained. It is shown that the proposed technique is viable and performs satisfactorily with the use of a conservative value (smaller than the expected value) for the field size and with the use of crop statistics from the small political division level.

  10. Asymmetrical flow field flow fractionation methods to characterize submicron particles: application to carbon-based aggregates and nanoplastics.

    PubMed

    Gigault, Julien; El Hadri, Hind; Reynaud, Stéphanie; Deniau, Elise; Grassl, Bruno

    2017-11-01

    In the last 10 years, asymmetrical flow field flow fractionation (AF4) has been one of the most promising approaches to characterize colloidal particles. Nevertheless, despite its potentialities, it is still considered a complex technique to set up, and the theory is difficult to apply for the characterization of complex samples containing submicron particles and nanoparticles. In the present work, we developed and propose a simple analytical strategy to rapidly determine the presence of several submicron populations in an unknown sample with one programmed AF4 method. To illustrate this method, we analyzed polystyrene particles and fullerene aggregates of size covering the whole colloidal size distribution. A global and fast AF4 method (method O) allowed us to screen the presence of particles with size ranging from 1 to 800 nm. By examination of the fractionating power F d , as proposed in the literature, convenient fractionation resolution was obtained for size ranging from 10 to 400 nm. The global F d values, as well as the steric inversion diameter, for the whole colloidal size distribution correspond to the predicted values obtained by model studies. On the basis of this method and without the channel components or mobile phase composition being changed, four isocratic subfraction methods were performed to achieve further high-resolution separation as a function of different size classes: 10-100 nm, 100-200 nm, 200-450 nm, and 450-800 nm in diameter. Finally, all the methods developed were applied in characterization of nanoplastics, which has received great attention in recent years. Graphical Absract Characterization of the nanoplastics by asymmetrical flow field flow fractionation within the colloidal size range.

  11. Power calculation for overall hypothesis testing with high-dimensional commensurate outcomes.

    PubMed

    Chi, Yueh-Yun; Gribbin, Matthew J; Johnson, Jacqueline L; Muller, Keith E

    2014-02-28

    The complexity of system biology means that any metabolic, genetic, or proteomic pathway typically includes so many components (e.g., molecules) that statistical methods specialized for overall testing of high-dimensional and commensurate outcomes are required. While many overall tests have been proposed, very few have power and sample size methods. We develop accurate power and sample size methods and software to facilitate study planning for high-dimensional pathway analysis. With an account of any complex correlation structure between high-dimensional outcomes, the new methods allow power calculation even when the sample size is less than the number of variables. We derive the exact (finite-sample) and approximate non-null distributions of the 'univariate' approach to repeated measures test statistic, as well as power-equivalent scenarios useful to generalize our numerical evaluations. Extensive simulations of group comparisons support the accuracy of the approximations even when the ratio of number of variables to sample size is large. We derive a minimum set of constants and parameters sufficient and practical for power calculation. Using the new methods and specifying the minimum set to determine power for a study of metabolic consequences of vitamin B6 deficiency helps illustrate the practical value of the new results. Free software implementing the power and sample size methods applies to a wide range of designs, including one group pre-intervention and post-intervention comparisons, multiple parallel group comparisons with one-way or factorial designs, and the adjustment and evaluation of covariate effects. Copyright © 2013 John Wiley & Sons, Ltd.

  12. African Primary Care Research: Performing surveys using questionnaires

    PubMed Central

    Mabuza, Langalibalele H.; Ogunbanjo, Gboyega A.; Mash, Bob

    2014-01-01

    The aim of this article is to provide practical guidance on conducting surveys and the use of questionnaires for postgraduate students at a Masters level who are undertaking primary care research. The article is intended to assist with writing the methods section of the research proposal and thinking through the relevant issues that apply to sample size calculation, sampling strategy, design of a questionnaire and administration of a questionnaire. The article is part of a larger series on primary care research, with other articles in the series focusing on the structure of the research proposal and the literature review, as well as quantitative data analysis. PMID:26245434

  13. African primary care research: performing surveys using questionnaires.

    PubMed

    Govender, Indiran; Mabuza, Langalibalele H; Ogunbanjo, Gboyega A; Mash, Bob

    2014-04-25

    The aim of this article is to provide practical guidance on conducting surveys and the use of questionnaires for postgraduate students at a Masters level who are undertaking primary care research. The article is intended to assist with writing the methods section of the research proposal and thinking through the relevant issues that apply to sample size calculation, sampling strategy, design of a questionnaire and administration of a questionnaire. The articleis part of a larger series on primary care research, with other articles in the series focusing on the structure of the research proposal and the literature review, as well as quantitative data analysis.

  14. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    PubMed

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.

  15. Integrative genetic risk prediction using non-parametric empirical Bayes classification.

    PubMed

    Zhao, Sihai Dave

    2017-06-01

    Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This article proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free non-parametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa. © 2016, The International Biometric Society.

  16. Computed Tomography to Estimate the Representative Elementary Area for Soil Porosity Measurements

    PubMed Central

    Borges, Jaqueline Aparecida Ribaski; Pires, Luiz Fernando; Belmont Pereira, André

    2012-01-01

    Computed tomography (CT) is a technique that provides images of different solid and porous materials. CT could be an ideal tool to study representative sizes of soil samples because of the noninvasive characteristic of this technique. The scrutiny of such representative elementary sizes (RESs) has been the target of attention of many researchers related to soil physics field owing to the strong relationship between physical properties and size of the soil sample. In the current work, data from gamma-ray CT were used to assess RES in measurements of soil porosity (ϕ). For statistical analysis, a study on the full width at a half maximum (FWHM) of the adjustment of distribution of ϕ at different areas (1.2 to 1162.8 mm2) selected inside of tomographic images was proposed herein. The results obtained point out that samples with a section area corresponding to at least 882.1 mm2 were the ones that provided representative values of ϕ for the studied Brazilian tropical soil. PMID:22666133

  17. Integrative Analysis of Cancer Diagnosis Studies with Composite Penalization

    PubMed Central

    Liu, Jin; Huang, Jian; Ma, Shuangge

    2013-01-01

    Summary In cancer diagnosis studies, high-throughput gene profiling has been extensively conducted, searching for genes whose expressions may serve as markers. Data generated from such studies have the “large d, small n” feature, with the number of genes profiled much larger than the sample size. Penalization has been extensively adopted for simultaneous estimation and marker selection. Because of small sample sizes, markers identified from the analysis of single datasets can be unsatisfactory. A cost-effective remedy is to conduct integrative analysis of multiple heterogeneous datasets. In this article, we investigate composite penalization methods for estimation and marker selection in integrative analysis. The proposed methods use the minimax concave penalty (MCP) as the outer penalty. Under the homogeneity model, the ridge penalty is adopted as the inner penalty. Under the heterogeneity model, the Lasso penalty and MCP are adopted as the inner penalty. Effective computational algorithms based on coordinate descent are developed. Numerical studies, including simulation and analysis of practical cancer datasets, show satisfactory performance of the proposed methods. PMID:24578589

  18. Modified Distribution-Free Goodness-of-Fit Test Statistic.

    PubMed

    Chun, So Yeon; Browne, Michael W; Shapiro, Alexander

    2018-03-01

    Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62-83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.

  19. The Effects of Small Sample Size on Identifying Polytomous DIF Using the Liu-Agresti Estimator of the Cumulative Common Odds Ratio

    ERIC Educational Resources Information Center

    Carvajal, Jorge; Skorupski, William P.

    2010-01-01

    This study is an evaluation of the behavior of the Liu-Agresti estimator of the cumulative common odds ratio when identifying differential item functioning (DIF) with polytomously scored test items using small samples. The Liu-Agresti estimator has been proposed by Penfield and Algina as a promising approach for the study of polytomous DIF but no…

  20. On the comparison of the strength of morphological integration across morphometric datasets.

    PubMed

    Adams, Dean C; Collyer, Michael L

    2016-11-01

    Evolutionary morphologists frequently wish to understand the extent to which organisms are integrated, and whether the strength of morphological integration among subsets of phenotypic variables differ among taxa or other groups. However, comparisons of the strength of integration across datasets are difficult, in part because the summary measures that characterize these patterns (RV coefficient and r PLS ) are dependent both on sample size and on the number of variables. As a solution to this issue, we propose a standardized test statistic (a z-score) for measuring the degree of morphological integration between sets of variables. The approach is based on a partial least squares analysis of trait covariation, and its permutation-based sampling distribution. Under the null hypothesis of a random association of variables, the method displays a constant expected value and confidence intervals for datasets of differing sample sizes and variable number, thereby providing a consistent measure of integration suitable for comparisons across datasets. A two-sample test is also proposed to statistically determine whether levels of integration differ between datasets, and an empirical example examining cranial shape integration in Mediterranean wall lizards illustrates its use. Some extensions of the procedure are also discussed. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.

  1. a Novel Deep Convolutional Neural Network for Spectral-Spatial Classification of Hyperspectral Data

    NASA Astrophysics Data System (ADS)

    Li, N.; Wang, C.; Zhao, H.; Gong, X.; Wang, D.

    2018-04-01

    Spatial and spectral information are obtained simultaneously by hyperspectral remote sensing. Joint extraction of these information of hyperspectral image is one of most import methods for hyperspectral image classification. In this paper, a novel deep convolutional neural network (CNN) is proposed, which extracts spectral-spatial information of hyperspectral images correctly. The proposed model not only learns sufficient knowledge from the limited number of samples, but also has powerful generalization ability. The proposed framework based on three-dimensional convolution can extract spectral-spatial features of labeled samples effectively. Though CNN has shown its robustness to distortion, it cannot extract features of different scales through the traditional pooling layer that only have one size of pooling window. Hence, spatial pyramid pooling (SPP) is introduced into three-dimensional local convolutional filters for hyperspectral classification. Experimental results with a widely used hyperspectral remote sensing dataset show that the proposed model provides competitive performance.

  2. Evaluation of single and two-stage adaptive sampling designs for estimation of density and abundance of freshwater mussels in a large river

    USGS Publications Warehouse

    Smith, D.R.; Rogala, J.T.; Gray, B.R.; Zigler, S.J.; Newton, T.J.

    2011-01-01

    Reliable estimates of abundance are needed to assess consequences of proposed habitat restoration and enhancement projects on freshwater mussels in the Upper Mississippi River (UMR). Although there is general guidance on sampling techniques for population assessment of freshwater mussels, the actual performance of sampling designs can depend critically on the population density and spatial distribution at the project site. To evaluate various sampling designs, we simulated sampling of populations, which varied in density and degree of spatial clustering. Because of logistics and costs of large river sampling and spatial clustering of freshwater mussels, we focused on adaptive and non-adaptive versions of single and two-stage sampling. The candidate designs performed similarly in terms of precision (CV) and probability of species detection for fixed sample size. Both CV and species detection were determined largely by density, spatial distribution and sample size. However, designs did differ in the rate that occupied quadrats were encountered. Occupied units had a higher probability of selection using adaptive designs than conventional designs. We used two measures of cost: sample size (i.e. number of quadrats) and distance travelled between the quadrats. Adaptive and two-stage designs tended to reduce distance between sampling units, and thus performed better when distance travelled was considered. Based on the comparisons, we provide general recommendations on the sampling designs for the freshwater mussels in the UMR, and presumably other large rivers.

  3. A Mars Sample Return Sample Handling System

    NASA Technical Reports Server (NTRS)

    Wilson, David; Stroker, Carol

    2013-01-01

    We present a sample handling system, a subsystem of the proposed Dragon landed Mars Sample Return (MSR) mission [1], that can return to Earth orbit a significant mass of frozen Mars samples potentially consisting of: rock cores, subsurface drilled rock and ice cuttings, pebble sized rocks, and soil scoops. The sample collection, storage, retrieval and packaging assumptions and concepts in this study are applicable for the NASA's MPPG MSR mission architecture options [2]. Our study assumes a predecessor rover mission collects samples for return to Earth to address questions on: past life, climate change, water history, age dating, understanding Mars interior evolution [3], and, human safety and in-situ resource utilization. Hence the rover will have "integrated priorities for rock sampling" [3] that cover collection of subaqueous or hydrothermal sediments, low-temperature fluidaltered rocks, unaltered igneous rocks, regolith and atmosphere samples. Samples could include: drilled rock cores, alluvial and fluvial deposits, subsurface ice and soils, clays, sulfates, salts including perchlorates, aeolian deposits, and concretions. Thus samples will have a broad range of bulk densities, and require for Earth based analysis where practical: in-situ characterization, management of degradation such as perchlorate deliquescence and volatile release, and contamination management. We propose to adopt a sample container with a set of cups each with a sample from a specific location. We considered two sample cups sizes: (1) a small cup sized for samples matching those submitted to in-situ characterization instruments, and, (2) a larger cup for 100 mm rock cores [4] and pebble sized rocks, thus providing diverse samples and optimizing the MSR sample mass payload fraction for a given payload volume. We minimize sample degradation by keeping them frozen in the MSR payload sample canister using Peltier chip cooling. The cups are sealed by interference fitted heat activated memory alloy caps [5] if the heating does not affect the sample, or by crimping caps similar to bottle capping. We prefer cap sealing surfaces be external to the cup rim to prevent sample dust inside the cups interfering with sealing, or, contamination of the sample by Teflon seal elements (if adopted). Finally the sample collection rover, or a Fetch rover, selects cups with best choice samples and loads them into a sample tray, before delivering it to the Earth Return Vehicle (ERV) in the MSR Dragon capsule as described in [1] (Fig 1). This ensures best use of the MSR payload mass allowance. A 3 meter long jointed robot arm is extended from the Dragon capsule's crew hatch, retrieves the sample tray and inserts it into the sample canister payload located on the ERV stage. The robot arm has capacity to obtain grab samples in the event of a rover failure. The sample canister has a robot arm capture casting to enable capture by crewed or robot spacecraft when it returns to Earth orbit

  4. ADAPTIVE ANNEALED IMPORTANCE SAMPLING FOR MULTIMODAL POSTERIOR EXPLORATION AND MODEL SELECTION WITH APPLICATION TO EXTRASOLAR PLANET DETECTION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Bin, E-mail: bins@ieee.org

    2014-07-01

    We describe an algorithm that can adaptively provide mixture summaries of multimodal posterior distributions. The parameter space of the involved posteriors ranges in size from a few dimensions to dozens of dimensions. This work was motivated by an astrophysical problem called extrasolar planet (exoplanet) detection, wherein the computation of stochastic integrals that are required for Bayesian model comparison is challenging. The difficulty comes from the highly nonlinear models that lead to multimodal posterior distributions. We resort to importance sampling (IS) to estimate the integrals, and thus translate the problem to be how to find a parametric approximation of the posterior.more » To capture the multimodal structure in the posterior, we initialize a mixture proposal distribution and then tailor its parameters elaborately to make it resemble the posterior to the greatest extent possible. We use the effective sample size (ESS) calculated based on the IS draws to measure the degree of approximation. The bigger the ESS is, the better the proposal resembles the posterior. A difficulty within this tailoring operation lies in the adjustment of the number of mixing components in the mixture proposal. Brute force methods just preset it as a large constant, which leads to an increase in the required computational resources. We provide an iterative delete/merge/add process, which works in tandem with an expectation-maximization step to tailor such a number online. The efficiency of our proposed method is tested via both simulation studies and real exoplanet data analysis.« less

  5. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design.

    PubMed

    Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo

    2017-03-15

    Outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme where one observes the exposure with a probability that depends on the outcome. The well-known such design is the case-control design for binary response, the case-cohort design for the failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies for taking the advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric where all the underlying distributions of covariates are modeled nonparametrically using the empirical likelihood methods. We show that the proposed estimator is consistent and developed the asymptotically normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of polychlorinated biphenyl exposure to hearing loss in children born to the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  6. Declustering of clustered preferential sampling for histogram and semivariogram inference

    USGS Publications Warehouse

    Olea, R.A.

    2007-01-01

    Measurements of attributes obtained more as a consequence of business ventures than sampling design frequently result in samplings that are preferential both in location and value, typically in the form of clusters along the pay. Preferential sampling requires preprocessing for the purpose of properly inferring characteristics of the parent population, such as the cumulative distribution and the semivariogram. Consideration of the distance to the nearest neighbor allows preparation of resampled sets that produce comparable results to those from previously proposed methods. Clustered sampling of size 140, taken from an exhaustive sampling, is employed to illustrate this approach. ?? International Association for Mathematical Geology 2007.

  7. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.

    PubMed

    Hero, Alfred O; Rajaratnam, Bala

    2016-01-01

    When can reliable inference be drawn in fue "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, wifu implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics fue dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than fue number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address fuis gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where fue variable dimension is fixed and fue sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa cale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables fua t are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. we demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

  8. MEPAG Recommendations for a 2018 Mars Sample Return Caching Lander - Sample Types, Number, and Sizes

    NASA Technical Reports Server (NTRS)

    Allen, Carlton C.

    2011-01-01

    The return to Earth of geological and atmospheric samples from the surface of Mars is among the highest priority objectives of planetary science. The MEPAG Mars Sample Return (MSR) End-to-End International Science Analysis Group (MEPAG E2E-iSAG) was chartered to propose scientific objectives and priorities for returned sample science, and to map out the implications of these priorities, including for the proposed joint ESA-NASA 2018 mission that would be tasked with the crucial job of collecting and caching the samples. The E2E-iSAG identified four overarching scientific aims that relate to understanding: (A) the potential for life and its pre-biotic context, (B) the geologic processes that have affected the martian surface, (C) planetary evolution of Mars and its atmosphere, (D) potential for future human exploration. The types of samples deemed most likely to achieve the science objectives are, in priority order: (1A). Subaqueous or hydrothermal sediments (1B). Hydrothermally altered rocks or low temperature fluid-altered rocks (equal priority) (2). Unaltered igneous rocks (3). Regolith, including airfall dust (4). Present-day atmosphere and samples of sedimentary-igneous rocks containing ancient trapped atmosphere Collection of geologically well-characterized sample suites would add considerable value to interpretations of all collected rocks. To achieve this, the total number of rock samples should be about 30-40. In order to evaluate the size of individual samples required to meet the science objectives, the E2E-iSAG reviewed the analytical methods that would likely be applied to the returned samples by preliminary examination teams, for planetary protection (i.e., life detection, biohazard assessment) and, after distribution, by individual investigators. It was concluded that sample size should be sufficient to perform all high-priority analyses in triplicate. In keeping with long-established curatorial practice of extraterrestrial material, at least 40% by mass of each sample should be preserved to support future scientific investigations. Samples of 15-16 grams are considered optimal. The total mass of returned rocks, soils, blanks and standards should be approximately 500 grams. Atmospheric gas samples should be the equivalent of 50 cubic cm at 20 times Mars ambient atmospheric pressure.

  9. A multi-particle crushing apparatus for studying rock fragmentation due to repeated impacts

    NASA Astrophysics Data System (ADS)

    Huang, S.; Mohanty, B.; Xia, K.

    2017-12-01

    Rock crushing is a common process in mining and related operations. Although a number of particle crushing tests have been proposed in the literature, most of them are concerned with single-particle crushing, i.e., a single rock sample is crushed in each test. Considering the realistic scenario in crushers where many fragments are involved, a laboratory crushing apparatus is developed in this study. This device consists of a Hopkinson pressure bar system and a piston-holder system. The Hopkinson pressure bar system is used to apply calibrated dynamic loads to the piston-holder system, and the piston-holder system is used to hold rock samples and to recover fragments for subsequent particle size analysis. The rock samples are subjected to three to seven impacts under three impact velocities (2.2, 3.8, and 5.0 m/s), with the feed size of the rock particle samples limited between 9.5 and 12.7 mm. Several key parameters are determined from this test, including particle size distribution parameters, impact velocity, loading pressure, and total work. The results show that the total work correlates well with resulting fragmentation size distribution, and the apparatus provides a useful tool for studying the mechanism of crushing, which further provides guidelines for the design of commercial crushers.

  10. Analysis of spatial distribution of land cover maps accuracy

    NASA Astrophysics Data System (ADS)

    Khatami, R.; Mountrakis, G.; Stehman, S. V.

    2017-12-01

    Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.

  11. A Sorting Statistic with Application in Neurological Magnetic Resonance Imaging of Autism.

    PubMed

    Levman, Jacob; Takahashi, Emi; Forgeron, Cynthia; MacDonald, Patrick; Stewart, Natalie; Lim, Ashley; Martel, Anne

    2018-01-01

    Effect size refers to the assessment of the extent of differences between two groups of samples on a single measurement. Assessing effect size in medical research is typically accomplished with Cohen's d statistic. Cohen's d statistic assumes that average values are good estimators of the position of a distribution of numbers and also assumes Gaussian (or bell-shaped) underlying data distributions. In this paper, we present an alternative evaluative statistic that can quantify differences between two data distributions in a manner that is similar to traditional effect size calculations; however, the proposed approach avoids making assumptions regarding the shape of the underlying data distribution. The proposed sorting statistic is compared with Cohen's d statistic and is demonstrated to be capable of identifying feature measurements of potential interest for which Cohen's d statistic implies the measurement would be of little use. This proposed sorting statistic has been evaluated on a large clinical autism dataset from Boston Children's Hospital , Harvard Medical School , demonstrating that it can potentially play a constructive role in future healthcare technologies.

  12. A Sorting Statistic with Application in Neurological Magnetic Resonance Imaging of Autism

    PubMed Central

    Takahashi, Emi; Lim, Ashley; Martel, Anne

    2018-01-01

    Effect size refers to the assessment of the extent of differences between two groups of samples on a single measurement. Assessing effect size in medical research is typically accomplished with Cohen's d statistic. Cohen's d statistic assumes that average values are good estimators of the position of a distribution of numbers and also assumes Gaussian (or bell-shaped) underlying data distributions. In this paper, we present an alternative evaluative statistic that can quantify differences between two data distributions in a manner that is similar to traditional effect size calculations; however, the proposed approach avoids making assumptions regarding the shape of the underlying data distribution. The proposed sorting statistic is compared with Cohen's d statistic and is demonstrated to be capable of identifying feature measurements of potential interest for which Cohen's d statistic implies the measurement would be of little use. This proposed sorting statistic has been evaluated on a large clinical autism dataset from Boston Children's Hospital, Harvard Medical School, demonstrating that it can potentially play a constructive role in future healthcare technologies. PMID:29796236

  13. Crystallization to polycrystalline silicon thin film and simultaneous inactivation of electrical defects by underwater laser annealing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Machida, Emi; Research Fellowships of the Japan Society for the Promotion of Science, Japan Society for the Promotion of Science, 1-8 Chiyoda, Tokyo 102-8472; Horita, Masahiro

    2012-12-17

    We propose a low-temperature laser annealing method of a underwater laser annealing (WLA) for polycrystalline silicon (poly-Si) films. We performed crystallization to poly-Si films by laser irradiation in flowing deionized-water where KrF excimer laser was used for annealing. We demonstrated that the maximum value of maximum grain size of WLA samples was 1.5 {mu}m, and that of the average grain size was 2.8 times larger than that of conventional laser annealing in air (LA) samples. Moreover, WLA forms poly-Si films which show lower conductivity and larger carrier life time attributed to fewer electrical defects as compared to LA poly-Si films.

  14. Micro Electron MicroProbe and Sample Analyzer

    NASA Technical Reports Server (NTRS)

    Manohara, Harish; Bearman, Gregory; Douglas, Susanne; Bronikowski, Michael; Urgiles, Eduardo; Kowalczyk, Robert; Bryson, Charles

    2009-01-01

    A proposed, low-power, backpack-sized instrument, denoted the micro electron microprobe and sample analyzer (MEMSA), would serve as a means of rapidly performing high-resolution microscopy and energy-dispersive x-ray spectroscopy (EDX) of soil, dust, and rock particles in the field. The MEMSA would be similar to an environmental scanning electron microscope (ESEM) but would be much smaller and designed specifically for field use in studying effects of geological alteration at the micrometer scale. Like an ESEM, the MEMSA could be used to examine uncoated, electrically nonconductive specimens. In addition to the difference in size, other significant differences between the MEMSA and an ESEM lie in the mode of scanning and the nature of the electron source.

  15. as response to seasonal variability

    PubMed

    Badano, Ernesto I; Labra, Fabio A; Martínez-Pérez, Cecilia G; Vergara, Carlos H

    2016-03-01

    Ecologists have been largely interested in the description and understanding of the power scaling relationships between body size and abundance of organisms. Many studies have focused on estimating the exponents of these functions across taxonomic groups and spatial scales, to draw inferences about the processes underlying this pattern. The exponents of these functions usually approximate -3/4 at geographical scales, but they deviate from this value when smaller spatial extensions are considered. This has led to propose that body size-abundance relationships at small spatial scales may reflect the impact of environmental changes. This study tests this hypothesis by examining body size spectra of benthic shrimps (Decapoda: Caridea) and snails (Gastropoda) in the Tamiahua lagoon, a brackish body water located in the Eastern coast of Mexico. We mea- sured water quality parameters (dissolved oxygen, salinity, pH, water temperature, sediment organic matter and chemical oxygen demand) and sampled benthic macrofauna during three different climatic conditions of the year (cold, dry and rainy season). Given the small size of most individuals in the benthic macrofaunal samples, we used body volume, instead of weight, to estimate their body size. Body size-abundance relationships of both taxonomic groups were described by tabulating data from each season into base-2 logarithmic body size bins. In both taxonomic groups, observed frequencies per body size class in each season were standardized to yield densities (i.e., individuals/m(3)). Nonlinear regression analyses were separately performed for each taxonomic group at each season to assess whether body size spectra followed power scaling functions. Additionally, for each taxonomic group, multiple regression analyses were used to determine whether these relationships varied among seasons. Our results indicated that, while body size-abundance relationships in both taxonomic groups followed power functions, the parameters defining the shape of these relationships varied among seasons. These variations in the parameters of the body size-abundance relationships seems to be related to changes in the abundance of individuals within the different body size classes, which seems to follow the seasonal changes that occur in the environmental conditions of the lagoon. Thus, we propose that these body size-abundance relation- ships are influenced by the frequency and intensity of environmental changes affecting this ecosystem.

  16. 78 FR 26033 - Agency Forms Undergoing Paperwork Reduction Act Review

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-03

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Collection of Qualitative Feedback on Agency Service Delivery--NEW--Epidemiology and Analysis Program Office... Clearance for the Collection of Qualitative Feedback on Agency Service Delivery'' to OMB for approval under...

  17. 77 FR 27062 - Agency Forms Undergoing Paperwork Reduction Act Review

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-08

    ... calculations that justify the proposed sample size, the expected response rate, methods for assessing potential... Project NIOSH Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery--NEW... Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency...

  18. Improving the quality of biomarker discovery research: the right samples and enough of them.

    PubMed

    Pepe, Margaret S; Li, Christopher I; Feng, Ziding

    2015-06-01

    Biomarker discovery research has yielded few biomarkers that validate for clinical use. A contributing factor may be poor study designs. The goal in discovery research is to identify a subset of potentially useful markers from a large set of candidates assayed on case and control samples. We recommend the PRoBE design for selecting samples. We propose sample size calculations that require specifying: (i) a definition for biomarker performance; (ii) the proportion of useful markers the study should identify (Discovery Power); and (iii) the tolerable number of useless markers amongst those identified (False Leads Expected, FLE). We apply the methodology to a study of 9,000 candidate biomarkers for risk of colon cancer recurrence where a useful biomarker has positive predictive value ≥ 30%. We find that 40 patients with recurrence and 160 without recurrence suffice to filter out 98% of useless markers (2% FLE) while identifying 95% of useful biomarkers (95% Discovery Power). Alternative methods for sample size calculation required more assumptions. Biomarker discovery research should utilize quality biospecimen repositories and include sample sizes that enable markers meeting prespecified performance characteristics for well-defined clinical applications to be identified. The scientific rigor of discovery research should be improved. ©2015 American Association for Cancer Research.

  19. MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.

    PubMed

    Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi

    2018-03-05

    Currently a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for the CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, in each genomic position. MSeq-CNV is applied on simulated data and also on real data of six HapMap individuals with high-coverage sequencing, in 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in 1000 Genomes Project and six samples form the Simons Genome Diversity Project.

  20. Systematic review of proposed definitions of nocturnal polyuria and population-based evidence of their diagnostic accuracy.

    PubMed

    Olesen, Tine Kold; Denys, Marie-Astrid; Vande Walle, Johan; Everaert, Karel

    2018-02-06

    Background Evidence of diagnostic accuracy for proposed definitions of nocturnal polyuria is currently unclear. Purpose Systematic review to determine population-based evidence of the diagnostic accuracy of proposed definitions of nocturnal polyuria based on data from frequency-volume charts. Methods Seventeen pre-specified search terms identified 351 unique investigations published from 1990 to 2016 in BIOSIS, Embase, Embase Alerts, International Pharmaceutical Abstract, Medline, and Cochrane. Thirteen original communications were included in this review based on pre-specified exclusion criteria. Data were extracted from each paper regarding subject age, sex, ethnicity, health status, sample size, data collection methods, and diagnostic discrimination of proposed definitions including sensitivity, specificity, positive and negative predictive value. Results The sample size of study cohorts, participant age, sex, ethnicity, and health status varied considerably in 13 studies reporting on the diagnostic performance of seven different definitions of nocturnal polyuria using frequency-volume chart data from 4968 participants. Most study cohorts were small, mono-ethnic, including only Caucasian males aged 50 or higher with primary or secondary polyuria that were compared to a control group of healthy men without nocturia in prospective or retrospective settings. Proposed definitions had poor discriminatory accuracy in evaluations based on data from subjects independent from the original study cohorts with findings being similar regarding the most widely evaluated definition endorsed by ICS. Conclusions Diagnostic performance characteristics for proposed definitions of nocturnal polyuria show poor to modest discrimination and are not based on sufficient level of evidence from representative, multi-ethnic population-based data from both females and males of all adult ages.

  1. Overlap between treatment and control distributions as an effect size measure in experiments.

    PubMed

    Hedges, Larry V; Olkin, Ingram

    2016-03-01

    The proportion π of treatment group observations that exceed the control group mean has been proposed as an effect size measure for experiments that randomly assign independent units into 2 groups. We give the exact distribution of a simple estimator of π based on the standardized mean difference and use it to study the small sample bias of this estimator. We also give the minimum variance unbiased estimator of π under 2 models, one in which the variance of the mean difference is known and one in which the variance is unknown. We show how to use the relation between the standardized mean difference and the overlap measure to compute confidence intervals for π and show that these results can be used to obtain unbiased estimators, large sample variances, and confidence intervals for 3 related effect size measures based on the overlap. Finally, we show how the effect size π can be used in a meta-analysis. (c) 2016 APA, all rights reserved).

  2. Macrostructure from Microstructure: Generating Whole Systems from Ego Networks

    PubMed Central

    Smith, Jeffrey A.

    2014-01-01

    This paper presents a new simulation method to make global network inference from sampled data. The proposed simulation method takes sampled ego network data and uses Exponential Random Graph Models (ERGM) to reconstruct the features of the true, unknown network. After describing the method, the paper presents two validity checks of the approach: the first uses the 20 largest Add Health networks while the second uses the Sociology Coauthorship network in the 1990's. For each test, I take random ego network samples from the known networks and use my method to make global network inference. I find that my method successfully reproduces the properties of the networks, such as distance and main component size. The results also suggest that simpler, baseline models provide considerably worse estimates for most network properties. I end the paper by discussing the bounds/limitations of ego network sampling. I also discuss possible extensions to the proposed approach. PMID:25339783

  3. 2005 Earth-Mars Round Trip

    NASA Technical Reports Server (NTRS)

    2000-01-01

    This paper presents, in viewgraph form, the 2005 Earth-Mars Round Trip. The contents include: 1) Lander; 2) Mars Sample Return Project; 3) Rover; 4) Rover Size Comparison; 5) Mars Ascent Vehicle; 6) Return Orbiter; 7) A New Mars Surveyor Program Architecture; 8) Definition Study Summary Result; 9) Mars Surveyor Proposed Architecture 2003, 2005 Opportunities; 10) Mars Micromissions Using Ariane 5; 11) Potential International Partnerships; 12) Proposed Integrated Architecture; and 13) Mars Exploration Program Report of the Architecture Team.

  4. Computationally efficient algorithm for high sampling-frequency operation of active noise control

    NASA Astrophysics Data System (ADS)

    Rout, Nirmal Kumar; Das, Debi Prasad; Panda, Ganapati

    2015-05-01

    In high sampling-frequency operation of active noise control (ANC) system the length of the secondary path estimate and the ANC filter are very long. This increases the computational complexity of the conventional filtered-x least mean square (FXLMS) algorithm. To reduce the computational complexity of long order ANC system using FXLMS algorithm, frequency domain block ANC algorithms have been proposed in past. These full block frequency domain ANC algorithms are associated with some disadvantages such as large block delay, quantization error due to computation of large size transforms and implementation difficulties in existing low-end DSP hardware. To overcome these shortcomings, the partitioned block ANC algorithm is newly proposed where the long length filters in ANC are divided into a number of equal partitions and suitably assembled to perform the FXLMS algorithm in the frequency domain. The complexity of this proposed frequency domain partitioned block FXLMS (FPBFXLMS) algorithm is quite reduced compared to the conventional FXLMS algorithm. It is further reduced by merging one fast Fourier transform (FFT)-inverse fast Fourier transform (IFFT) combination to derive the reduced structure FPBFXLMS (RFPBFXLMS) algorithm. Computational complexity analysis for different orders of filter and partition size are presented. Systematic computer simulations are carried out for both the proposed partitioned block ANC algorithms to show its accuracy compared to the time domain FXLMS algorithm.

  5. Sparse feature learning for instrument identification: Effects of sampling and pooling methods.

    PubMed

    Han, Yoonchang; Lee, Subin; Nam, Juhan; Lee, Kyogu

    2016-05-01

    Feature learning for music applications has recently received considerable attention from many researchers. This paper reports on the sparse feature learning algorithm for musical instrument identification, and in particular, focuses on the effects of the frame sampling techniques for dictionary learning and the pooling methods for feature aggregation. To this end, two frame sampling techniques are examined that are fixed and proportional random sampling. Furthermore, the effect of using onset frame was analyzed for both of proposed sampling methods. Regarding summarization of the feature activation, a standard deviation pooling method is used and compared with the commonly used max- and average-pooling techniques. Using more than 47 000 recordings of 24 instruments from various performers, playing styles, and dynamics, a number of tuning parameters are experimented including the analysis frame size, the dictionary size, and the type of frequency scaling as well as the different sampling and pooling methods. The results show that the combination of proportional sampling and standard deviation pooling achieve the best overall performance of 95.62% while the optimal parameter set varies among the instrument classes.

  6. Frail or hale: Skeletal frailty indices in Medieval London skeletons

    PubMed Central

    Crews, Douglas E.

    2017-01-01

    To broaden bioarchaeological applicability of skeletal frailty indices (SFIs) and increase sample size, we propose indices with fewer biomarkers (2–11 non-metric biomarkers) and compare these reduced biomarker SFIs to the original metric/non-metric 13-biomarker SFI. From the 2-11-biomarker SFIs, we choose the index with the fewest biomarkers (6-biomarker SFI), which still maintains the statistical robusticity of a 13-biomarker SFI, and apply this index to the same Medieval monastic and nonmonastic populations, albeit with an increased sample size. For this increased monastic and nonmonastic sample, we also propose and implement a 4-biomarker SFI, comprised of biomarkers from each of four stressor categories, and compare these SFI distributions with those of the non-metric biomarker SFIs. From the Museum of London WORD database, we tabulate multiple SFIs (2- to 13-biomarkers) for Medieval monastic and nonmonastic samples (N = 134). We evaluate associations between these ten non-metric SFIs and the 13-biomarker SFI using Spearman’s correlation coefficients. Subsequently, we test non-metric 6-biomarker and 4-biomarker SFI distributions for associations with cemetery, age, and sex using Analysis of Variance/Covariance (ANOVA/ANCOVA) on larger samples from the monastic and nonmonastic cemeteries (N = 517). For Medieval samples, Spearman’s correlation coefficients show a significant association between the 13-biomarker SFI and all non-metric SFIs. Utilizing a 6-biomarker and parsimonious 4-biomarker SFI, we increase the nonmonastic and monastic samples and demonstrate significant lifestyle and sex differences in frailty that were not observed in the original, smaller sample. Results from the 6-biomarker and parsimonious 4-biomarker SFIs generally indicate similarities in means, explained variation (R2), and associated P-values (ANOVA/ANCOVA) within and between nonmonastic and monastic samples. We show that non-metric reduced biomarker SFIs provide alternative indices for application to other bioarchaeological collections. These findings suggest that a SFI, comprised of six or more non-metric biomarkers available for the specific sample, may have greater applicability than, but comparable statistical characteristics to, the originally proposed 13-biomarker SFI. PMID:28467438

  7. Stratum variance estimation for sample allocation in crop surveys. [Great Plains Corridor

    NASA Technical Reports Server (NTRS)

    Perry, C. R., Jr.; Chhikara, R. S. (Principal Investigator)

    1980-01-01

    The problem of determining stratum variances needed in achieving an optimum sample allocation for crop surveys by remote sensing is investigated by considering an approach based on the concept of stratum variance as a function of the sampling unit size. A methodology using the existing and easily available information of historical crop statistics is developed for obtaining initial estimates of tratum variances. The procedure is applied to estimate stratum variances for wheat in the U.S. Great Plains and is evaluated based on the numerical results thus obtained. It is shown that the proposed technique is viable and performs satisfactorily, with the use of a conservative value for the field size and the crop statistics from the small political subdivision level, when the estimated stratum variances were compared to those obtained using the LANDSAT data.

  8. A time-varying effect model for examining group differences in trajectories of zero-inflated count outcomes with applications in substance abuse research.

    PubMed

    Yang, Songshan; Cranford, James A; Jester, Jennifer M; Li, Runze; Zucker, Robert A; Buu, Anne

    2017-02-28

    This study proposes a time-varying effect model for examining group differences in trajectories of zero-inflated count outcomes. The motivating example demonstrates that this zero-inflated Poisson model allows investigators to study group differences in different aspects of substance use (e.g., the probability of abstinence and the quantity of alcohol use) simultaneously. The simulation study shows that the accuracy of estimation of trajectory functions improves as the sample size increases; the accuracy under equal group sizes is only higher when the sample size is small (100). In terms of the performance of the hypothesis testing, the type I error rates are close to their corresponding significance levels under all settings. Furthermore, the power increases as the alternative hypothesis deviates more from the null hypothesis, and the rate of this increasing trend is higher when the sample size is larger. Moreover, the hypothesis test for the group difference in the zero component tends to be less powerful than the test for the group difference in the Poisson component. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  9. Using pilot data to size a two-arm randomized trial to find a nearly optimal personalized treatment strategy.

    PubMed

    Laber, Eric B; Zhao, Ying-Qi; Regh, Todd; Davidian, Marie; Tsiatis, Anastasios; Stanford, Joseph B; Zeng, Donglin; Song, Rui; Kosorok, Michael R

    2016-04-15

    A personalized treatment strategy formalizes evidence-based treatment selection by mapping patient information to a recommended treatment. Personalized treatment strategies can produce better patient outcomes while reducing cost and treatment burden. Thus, among clinical and intervention scientists, there is a growing interest in conducting randomized clinical trials when one of the primary aims is estimation of a personalized treatment strategy. However, at present, there are no appropriate sample size formulae to assist in the design of such a trial. Furthermore, because the sampling distribution of the estimated outcome under an estimated optimal treatment strategy can be highly sensitive to small perturbations in the underlying generative model, sample size calculations based on standard (uncorrected) asymptotic approximations or computer simulations may not be reliable. We offer a simple and robust method for powering a single stage, two-armed randomized clinical trial when the primary aim is estimating the optimal single stage personalized treatment strategy. The proposed method is based on inverting a plugin projection confidence interval and is thereby regular and robust to small perturbations of the underlying generative model. The proposed method requires elicitation of two clinically meaningful parameters from clinical scientists and uses data from a small pilot study to estimate nuisance parameters, which are not easily elicited. The method performs well in simulated experiments and is illustrated using data from a pilot study of time to conception and fertility awareness. Copyright © 2015 John Wiley & Sons, Ltd.

  10. Linear models for airborne-laser-scanning-based operational forest inventory with small field sample size and highly correlated LiDAR data

    USGS Publications Warehouse

    Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.

    2015-01-01

    Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.

  11. Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization.

    PubMed

    Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A

    2017-01-01

    The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the common sense hypothesis that the first six hours comprise the period of peak night activity for several species, thereby resulting in a representative sample for the whole night. To this end, we combined re-sampling techniques, species accumulation curves, threshold analysis, and community concordance of species compositional data, and applied them to datasets of three different Neotropical biomes (Amazonia, Atlantic Forest and Cerrado). We show that the strategy of restricting sampling to only six hours of the night frequently results in incomplete sampling representation of the entire bat community investigated. From a quantitative standpoint, results corroborated the existence of a major Sample Area effect in all datasets, although for the Amazonia dataset the six-hour strategy was significantly less species-rich after extrapolation, and for the Cerrado dataset it was more efficient. From the qualitative standpoint, however, results demonstrated that, for all three datasets, the identity of species that are effectively sampled will be inherently impacted by choices of sub-sampling schedule. We also propose an alternative six-hour sampling strategy (at the beginning and the end of a sample night) which performed better when resampling Amazonian and Atlantic Forest datasets on bat assemblages. Given the observed magnitude of our results, we propose that sample representativeness has to be carefully weighed against study objectives, and recommend that the trade-off between logistical constraints and additional sampling performance should be carefully evaluated.

  12. Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization

    PubMed Central

    Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A.

    2017-01-01

    The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the common sense hypothesis that the first six hours comprise the period of peak night activity for several species, thereby resulting in a representative sample for the whole night. To this end, we combined re-sampling techniques, species accumulation curves, threshold analysis, and community concordance of species compositional data, and applied them to datasets of three different Neotropical biomes (Amazonia, Atlantic Forest and Cerrado). We show that the strategy of restricting sampling to only six hours of the night frequently results in incomplete sampling representation of the entire bat community investigated. From a quantitative standpoint, results corroborated the existence of a major Sample Area effect in all datasets, although for the Amazonia dataset the six-hour strategy was significantly less species-rich after extrapolation, and for the Cerrado dataset it was more efficient. From the qualitative standpoint, however, results demonstrated that, for all three datasets, the identity of species that are effectively sampled will be inherently impacted by choices of sub-sampling schedule. We also propose an alternative six-hour sampling strategy (at the beginning and the end of a sample night) which performed better when resampling Amazonian and Atlantic Forest datasets on bat assemblages. Given the observed magnitude of our results, we propose that sample representativeness has to be carefully weighed against study objectives, and recommend that the trade-off between logistical constraints and additional sampling performance should be carefully evaluated. PMID:28334046

  13. Evolutionary Trends and the Salience Bias (with Apologies to Oil Tankers, Karl Marx, and Others).

    ERIC Educational Resources Information Center

    McShea, Daniel W.

    1994-01-01

    Examines evolutionary trends, specifically trends in size, complexity, and fitness. Notes that documentation of these trends consists of either long lists of cases, or descriptions of a small number of salient cases. Proposes the use of random samples to avoid this "saliency bias." (SR)

  14. 76 FR 36139 - Agency Information Collection Activities: Submission for OMB Review; Comment Request; Generic...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-21

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY: Federal Emergency...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery'' to the Office of...

  15. 76 FR 29763 - Agency Information Collection Activities; Submission for Office of Management and Budget Review...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-23

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Request; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY... ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery.'' Also include...

  16. 77 FR 63798 - Agency Information Collection Activities: Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-17

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Clearance for the Collection of Qualitative Feedback on the Service Delivery of the Consumer Financial... title, ``Generic Clearance for the Collection of Qualitative Feedback on the Service Delivery of the...

  17. 76 FR 37825 - Agency Information Collection Activities; Generic Clearance for the Collection of Qualitative...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-28

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Activities; Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY: U...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery '' to OMB for...

  18. 76 FR 13020 - Agency Information Collection Activities: Comment Request; Generic Clearance for the Collection...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY: Department of... Collection Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency...

  19. 77 FR 28893 - Intent To Request Approval From OMB of One New Public Collection of Information: Generic...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-16

    ... that justify the proposed sample size, the expected response rate, methods for assessing potential non... Qualitative Feedback on Agency Service Delivery AGENCY: Transportation Security Administration, DHS. ACTION... collection activity provides a means to gather qualitative customer and stakeholder feedback in an efficient...

  20. 76 FR 13019 - Agency Information Collection Activities: Comment Request; Generic Clearance for the Collection...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods... Clearance for the Collection of Qualitative Feedback on Agency Service Delivery AGENCY: Department of...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery '' to OMB for...

  1. 76 FR 21800 - Agency Information Collection Activities: Submission for OMB Review; Comment Request; Generic...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-18

    ... precision requirements or power calculations that justify the proposed sample size, the expected response... Activities: Submission for OMB Review; Comment Request; Generic Clearance for the Collection of Qualitative... Request (Generic ICR): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service...

  2. Profiling Local Optima in K-Means Clustering: Developing a Diagnostic Technique

    ERIC Educational Resources Information Center

    Steinley, Douglas

    2006-01-01

    Using the cluster generation procedure proposed by D. Steinley and R. Henson (2005), the author investigated the performance of K-means clustering under the following scenarios: (a) different probabilities of cluster overlap; (b) different types of cluster overlap; (c) varying samples sizes, clusters, and dimensions; (d) different multivariate…

  3. 78 FR 70015 - Proposed Information Collection; Comment Request; Large Pelagic Fishing Survey

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-22

    ...) target sample size from 10,780 to 15,900 interviews for Northeast and Southeast combined. Add up to five questions to the LPTS questionnaire. Add a non-response follow-up survey to the LPTS in the Southeast region... from 1,500 to 1,000 interviews. [[Page 70016

  4. Robustness of methods for blinded sample size re-estimation with overdispersed count data.

    PubMed

    Schneider, Simon; Schmidli, Heinz; Friede, Tim

    2013-09-20

    Counts of events are increasingly common as primary endpoints in randomized clinical trials. With between-patient heterogeneity leading to variances in excess of the mean (referred to as overdispersion), statistical models reflecting this heterogeneity by mixtures of Poisson distributions are frequently employed. Sample size calculation in the planning of such trials requires knowledge on the nuisance parameters, that is, the control (or overall) event rate and the overdispersion parameter. Usually, there is only little prior knowledge regarding these parameters in the design phase resulting in considerable uncertainty regarding the sample size. In this situation internal pilot studies have been found very useful and very recently several blinded procedures for sample size re-estimation have been proposed for overdispersed count data, one of which is based on an EM-algorithm. In this paper we investigate the EM-algorithm based procedure with respect to aspects of their implementation by studying the algorithm's dependence on the choice of convergence criterion and find that the procedure is sensitive to the choice of the stopping criterion in scenarios relevant to clinical practice. We also compare the EM-based procedure to other competing procedures regarding their operating characteristics such as sample size distribution and power. Furthermore, the robustness of these procedures to deviations from the model assumptions is explored. We find that some of the procedures are robust to at least moderate deviations. The results are illustrated using data from the US National Heart, Lung and Blood Institute sponsored Asymptomatic Cardiac Ischemia Pilot study. Copyright © 2013 John Wiley & Sons, Ltd.

  5. Assessment of optimum threshold and particle shape parameter for the image analysis of aggregate size distribution of concrete sections

    NASA Astrophysics Data System (ADS)

    Ozen, Murat; Guler, Murat

    2014-02-01

    Aggregate gradation is one of the key design parameters affecting the workability and strength properties of concrete mixtures. Estimating aggregate gradation from hardened concrete samples can offer valuable insights into the quality of mixtures in terms of the degree of segregation and the amount of deviation from the specified gradation limits. In this study, a methodology is introduced to determine the particle size distribution of aggregates from 2D cross sectional images of concrete samples. The samples used in the study were fabricated from six mix designs by varying the aggregate gradation, aggregate source and maximum aggregate size with five replicates of each design combination. Each sample was cut into three pieces using a diamond saw and then scanned to obtain the cross sectional images using a desktop flatbed scanner. An algorithm is proposed to determine the optimum threshold for the image analysis of the cross sections. A procedure was also suggested to determine a suitable particle shape parameter to be used in the analysis of aggregate size distribution within each cross section. Results of analyses indicated that the optimum threshold hence the pixel distribution functions may be different even for the cross sections of an identical concrete sample. Besides, the maximum ferret diameter is the most suitable shape parameter to estimate the size distribution of aggregates when computed based on the diagonal sieve opening. The outcome of this study can be of practical value for the practitioners to evaluate concrete in terms of the degree of segregation and the bounds of mixture's gradation achieved during manufacturing.

  6. How conservative is Fisher's exact test? A quantitative evaluation of the two-sample comparative binomial trial.

    PubMed

    Crans, Gerald G; Shuster, Jonathan J

    2008-08-15

    The debate as to which statistical methodology is most appropriate for the analysis of the two-sample comparative binomial trial has persisted for decades. Practitioners who favor the conditional methods of Fisher, Fisher's exact test (FET), claim that only experimental outcomes containing the same amount of information should be considered when performing analyses. Hence, the total number of successes should be fixed at its observed level in hypothetical repetitions of the experiment. Using conditional methods in clinical settings can pose interpretation difficulties, since results are derived using conditional sample spaces rather than the set of all possible outcomes. Perhaps more importantly from a clinical trial design perspective, this test can be too conservative, resulting in greater resource requirements and more subjects exposed to an experimental treatment. The actual significance level attained by FET (the size of the test) has not been reported in the statistical literature. Berger (J. R. Statist. Soc. D (The Statistician) 2001; 50:79-85) proposed assessing the conservativeness of conditional methods using p-value confidence intervals. In this paper we develop a numerical algorithm that calculates the size of FET for sample sizes, n, up to 125 per group at the two-sided significance level, alpha = 0.05. Additionally, this numerical method is used to define new significance levels alpha(*) = alpha+epsilon, where epsilon is a small positive number, for each n, such that the size of the test is as close as possible to the pre-specified alpha (0.05 for the current work) without exceeding it. Lastly, a sample size and power calculation example are presented, which demonstrates the statistical advantages of implementing the adjustment to FET (using alpha(*) instead of alpha) in the two-sample comparative binomial trial. 2008 John Wiley & Sons, Ltd

  7. The albatross plot: A novel graphical tool for presenting results of diversely reported studies in a systematic review.

    PubMed

    Harrison, Sean; Jones, Hayley E; Martin, Richard M; Lewis, Sarah J; Higgins, Julian P T

    2017-09-01

    Meta-analyses combine the results of multiple studies of a common question. Approaches based on effect size estimates from each study are generally regarded as the most informative. However, these methods can only be used if comparable effect sizes can be computed from each study, and this may not be the case due to variation in how the studies were done or limitations in how their results were reported. Other methods, such as vote counting, are then used to summarize the results of these studies, but most of these methods are limited in that they do not provide any indication of the magnitude of effect. We propose a novel plot, the albatross plot, which requires only a 1-sided P value and a total sample size from each study (or equivalently a 2-sided P value, direction of effect and total sample size). The plot allows an approximate examination of underlying effect sizes and the potential to identify sources of heterogeneity across studies. This is achieved by drawing contours showing the range of effect sizes that might lead to each P value for given sample sizes, under simple study designs. We provide examples of albatross plots using data from previous meta-analyses, allowing for comparison of results, and an example from when a meta-analysis was not possible. Copyright © 2017 The Authors. Research Synthesis Methods Published by John Wiley & Sons Ltd.

  8. Evaluation of dredged material proposed for ocean disposal from Hudson River, New York

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardiner, W.W.; Barrows, E.S.; Antrim, L.D.

    1996-09-01

    The Hudson River (Federal Project No. 41) was one of seven waterways that the U.S. Army Corps of Engineers-New York District (USACE-NYD) requested the Battelle Marine Sciences Laboratory (MSL) to sample and evaluate for dredging and disposal in March 1994. Sediment samples were collected from the Hudson River. Tests and analyses were conducted on Hudson River sediment core samples. The evaluation of proposed dredged material from the Hudson River included bulk sediment chemical analyses, chemical analyses of site water and elutriate, water-column and benthic acute toxicity tests, and bioaccumulation studies. Individual sediment core samples collected from Hudson River were analyzedmore » for grain size, moisture content, and total organic carbon (TOC). A composite sediment sample, representing the entire area proposed for dredging, was analyzed for bulk density, specific gravity, metals, chlorinated pesticides, polychlorinated biphenyl (PCB) congeners, polynuclear aromatic hydrocarbons (PAH), and 1,4-dichlorobenzene. Site water and elutriate water, prepared from the suspended-particulate phase (SPP) of Hudson River sediment, were analyzed for metals, pesticides, and PCBS. Water-column or SPP toxicity tests were performed with three species. Benthic acute toxicity tests were performed. Bioaccumulation tests were also conducted.« less

  9. Mean estimation in highly skewed samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pederson, S P

    The problem of inference for the mean of a highly asymmetric distribution is considered. Even with large sample sizes, usual asymptotics based on normal theory give poor answers, as the right-hand tail of the distribution is often under-sampled. This paper attempts to improve performance in two ways. First, modifications of the standard confidence interval procedure are examined. Second, diagnostics are proposed to indicate whether or not inferential procedures are likely to be valid. The problems are illustrated with data simulated from an absolute value Cauchy distribution. 4 refs., 2 figs., 1 tab.

  10. One-step estimation of networked population size: Respondent-driven capture-recapture with anonymity.

    PubMed

    Khan, Bilal; Lee, Hsuan-Wei; Fellows, Ian; Dombrowski, Kirk

    2018-01-01

    Size estimation is particularly important for populations whose members experience disproportionate health issues or pose elevated health risks to the ambient social structures in which they are embedded. Efforts to derive size estimates are often frustrated when the population is hidden or hard-to-reach in ways that preclude conventional survey strategies, as is the case when social stigma is associated with group membership or when group members are involved in illegal activities. This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, for use in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. We give provably sufficient conditions for the consistency of these estimators in large configuration networks. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which also perform well on a real-world location-based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable tradeoffs between anonymity guarantees and estimator performance. Taken together, we demonstrate that reasonable population size estimates are derived from anonymous respondent driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. We discuss limitations and future work in the concluding section.

  11. Technical note: Alternatives to reduce adipose tissue sampling bias.

    PubMed

    Cruz, G D; Wang, Y; Fadel, J G

    2014-10-01

    Understanding the mechanisms by which nutritional and pharmaceutical factors can manipulate adipose tissue growth and development in production animals has direct and indirect effects in the profitability of an enterprise. Adipocyte cellularity (number and size) is a key biological response that is commonly measured in animal science research. The variability and sampling of adipocyte cellularity within a muscle has been addressed in previous studies, but no attempt to critically investigate these issues has been proposed in the literature. The present study evaluated 2 sampling techniques (random and systematic) in an attempt to minimize sampling bias and to determine the minimum number of samples from 1 to 15 needed to represent the overall adipose tissue in the muscle. Both sampling procedures were applied on adipose tissue samples dissected from 30 longissimus muscles from cattle finished either on grass or grain. Briefly, adipose tissue samples were fixed with osmium tetroxide, and size and number of adipocytes were determined by a Coulter Counter. These results were then fit in a finite mixture model to obtain distribution parameters of each sample. To evaluate the benefits of increasing number of samples and the advantage of the new sampling technique, the concept of acceptance ratio was used; simply stated, the higher the acceptance ratio, the better the representation of the overall population. As expected, a great improvement on the estimation of the overall adipocyte cellularity parameters was observed using both sampling techniques when sample size number increased from 1 to 15 samples, considering both techniques' acceptance ratio increased from approximately 3 to 25%. When comparing sampling techniques, the systematic procedure slightly improved parameters estimation. The results suggest that more detailed research using other sampling techniques may provide better estimates for minimum sampling.

  12. A hybrid artificial bee colony algorithm and pattern search method for inversion of particle size distribution from spectral extinction data

    NASA Astrophysics Data System (ADS)

    Wang, Li; Li, Feng; Xing, Jian

    2017-10-01

    In this paper, a hybrid artificial bee colony (ABC) algorithm and pattern search (PS) method is proposed and applied for recovery of particle size distribution (PSD) from spectral extinction data. To be more useful and practical, size distribution function is modelled as the general Johnson's ? function that can overcome the difficulty of not knowing the exact type beforehand encountered in many real circumstances. The proposed hybrid algorithm is evaluated through simulated examples involving unimodal, bimodal and trimodal PSDs with different widths and mean particle diameters. For comparison, all examples are additionally validated by the single ABC algorithm. In addition, the performance of the proposed algorithm is further tested by actual extinction measurements with real standard polystyrene samples immersed in water. Simulation and experimental results illustrate that the hybrid algorithm can be used as an effective technique to retrieve the PSDs with high reliability and accuracy. Compared with the single ABC algorithm, our proposed algorithm can produce more accurate and robust inversion results while taking almost comparative CPU time over ABC algorithm alone. The superiority of ABC and PS hybridization strategy in terms of reaching a better balance of estimation accuracy and computation effort increases its potentials as an excellent inversion technique for reliable and efficient actual measurement of PSD.

  13. An Efficient Independence Sampler for Updating Branches in Bayesian Markov chain Monte Carlo Sampling of Phylogenetic Trees.

    PubMed

    Aberer, Andre J; Stamatakis, Alexandros; Ronquist, Fredrik

    2016-01-01

    Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a [Formula: see text] distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  14. Controlling the type I error rate in two-stage sequential adaptive designs when testing for average bioequivalence.

    PubMed

    Maurer, Willi; Jones, Byron; Chen, Ying

    2018-05-10

    In a 2×2 crossover trial for establishing average bioequivalence (ABE) of a generic agent and a currently marketed drug, the recommended approach to hypothesis testing is the two one-sided test (TOST) procedure, which depends, among other things, on the estimated within-subject variability. The power of this procedure, and therefore the sample size required to achieve a minimum power, depends on having a good estimate of this variability. When there is uncertainty, it is advisable to plan the design in two stages, with an interim sample size reestimation after the first stage, using an interim estimate of the within-subject variability. One method and 3 variations of doing this were proposed by Potvin et al. Using simulation, the operating characteristics, including the empirical type I error rate, of the 4 variations (called Methods A, B, C, and D) were assessed by Potvin et al and Methods B and C were recommended. However, none of these 4 variations formally controls the type I error rate of falsely claiming ABE, even though the amount of inflation produced by Method C was considered acceptable. A major disadvantage of assessing type I error rate inflation using simulation is that unless all possible scenarios for the intended design and analysis are investigated, it is impossible to be sure that the type I error rate is controlled. Here, we propose an alternative, principled method of sample size reestimation that is guaranteed to control the type I error rate at any given significance level. This method uses a new version of the inverse-normal combination of p-values test, in conjunction with standard group sequential techniques, that is more robust to large deviations in initial assumptions regarding the variability of the pharmacokinetic endpoints. The sample size reestimation step is based on significance levels and power requirements that are conditional on the first-stage results. This necessitates a discussion and exploitation of the peculiar properties of the power curve of the TOST testing procedure. We illustrate our approach with an example based on a real ABE study and compare the operating characteristics of our proposed method with those of Method B of Povin et al. Copyright © 2018 John Wiley & Sons, Ltd.

  15. Design and implementation of an optical Gaussian noise generator

    NASA Astrophysics Data System (ADS)

    Za~O, Leonardo; Loss, Gustavo; Coelho, Rosângela

    2009-08-01

    A design of a fast and accurate optical Gaussian noise generator is proposed and demonstrated. The noise sample generation is based on the Box-Muller algorithm. The functions implementation was performed on a high-speed Altera Stratix EP1S25 field-programmable gate array (FPGA) development kit. It enabled the generation of 150 million 16-bit noise samples per second. The Gaussian noise generator required only 7.4% of the FPGA logic elements, 1.2% of the RAM memory, 0.04% of the ROM memory, and a laser source. The optical pulses were generated by a laser source externally modulated by the data bit samples using the frequency-shift keying technique. The accuracy of the noise samples was evaluated for different sequences size and confidence intervals. The noise sample pattern was validated by the Bhattacharyya distance (Bd) and the autocorrelation function. The results showed that the proposed design of the optical Gaussian noise generator is very promising to evaluate the performance of optical communications channels with very low bit-error-rate values.

  16. Accounting for treatment by center interaction in sample size determinations and the use of surrogate outcomes in the pessary for the prevention of preterm birth trial: a simulation study.

    PubMed

    Willan, Andrew R

    2016-07-05

    The Pessary for the Prevention of Preterm Birth Study (PS3) is an international, multicenter, randomized clinical trial designed to examine the effectiveness of the Arabin pessary in preventing preterm birth in pregnant women with a short cervix. During the design of the study two methodological issues regarding power and sample size were raised. Since treatment in the Standard Arm will vary between centers, it is anticipated that so too will the probability of preterm birth in that arm. This will likely result in a treatment by center interaction, and the issue of how this will affect the sample size requirements was raised. The sample size requirements to examine the effect of the pessary on the baby's clinical outcome was prohibitively high, so the second issue is how best to examine the effect on clinical outcome. The approaches taken to address these issues are presented. Simulation and sensitivity analysis were used to address the sample size issue. The probability of preterm birth in the Standard Arm was assumed to vary between centers following a Beta distribution with a mean of 0.3 and a coefficient of variation of 0.3. To address the second issue a Bayesian decision model is proposed that combines the information regarding the between-treatment difference in the probability of preterm birth from PS3 with the data from the Multiple Courses of Antenatal Corticosteroids for Preterm Birth Study that relate preterm birth and perinatal mortality/morbidity. The approach provides a between-treatment comparison with respect to the probability of a bad clinical outcome. The performance of the approach was assessed using simulation and sensitivity analysis. Accounting for a possible treatment by center interaction increased the sample size from 540 to 700 patients per arm for the base case. The sample size requirements increase with the coefficient of variation and decrease with the number of centers. Under the same assumptions used for determining the sample size requirements, the simulated mean probability that pessary reduces the risk of perinatal mortality/morbidity is 0.98. The simulated mean decreased with coefficient of variation and increased with the number of clinical sites. Employing simulation and sensitivity analysis is a useful approach for determining sample size requirements while accounting for the additional uncertainty due to a treatment by center interaction. Using a surrogate outcome in conjunction with a Bayesian decision model is an efficient way to compare important clinical outcomes in a randomized clinical trial in situations where the direct approach requires a prohibitively high sample size.

  17. Comparing and Combining Data across Multiple Sources via Integration of Paired-sample Data to Correct for Measurement Error

    PubMed Central

    Huang, Yunda; Huang, Ying; Moodie, Zoe; Li, Sue; Self, Steve

    2014-01-01

    Summary In biomedical research such as the development of vaccines for infectious diseases or cancer, measures from the same assay are often collected from multiple sources or laboratories. Measurement error that may vary between laboratories needs to be adjusted for when combining samples across laboratories. We incorporate such adjustment in comparing and combining independent samples from different labs via integration of external data, collected on paired samples from the same two laboratories. We propose: 1) normalization of individual level data from two laboratories to the same scale via the expectation of true measurements conditioning on the observed; 2) comparison of mean assay values between two independent samples in the Main study accounting for inter-source measurement error; and 3) sample size calculations of the paired-sample study so that hypothesis testing error rates are appropriately controlled in the Main study comparison. Because the goal is not to estimate the true underlying measurements but to combine data on the same scale, our proposed methods do not require that the true values for the errorprone measurements are known in the external data. Simulation results under a variety of scenarios demonstrate satisfactory finite sample performance of our proposed methods when measurement errors vary. We illustrate our methods using real ELISpot assay data generated by two HIV vaccine laboratories. PMID:22764070

  18. Use of a size-resolved 1-D resuspension scheme to evaluate resuspended radioactive material associated with mineral dust particles from the ground surface.

    PubMed

    Ishizuka, Masahide; Mikami, Masao; Tanaka, Taichu Y; Igarashi, Yasuhito; Kita, Kazuyuki; Yamada, Yutaka; Yoshida, Naohiro; Toyoda, Sakae; Satou, Yukihiko; Kinase, Takeshi; Ninomiya, Kazuhiko; Shinohara, Atsushi

    2017-01-01

    A size-resolved, one-dimensional resuspension scheme for soil particles from the ground surface is proposed to evaluate the concentration of radioactivity in the atmosphere due to the secondary emission of radioactive material. The particle size distributions of radioactive particles at a sampling point were measured and compared with the results evaluated by the scheme using four different soil textures: sand, loamy sand, sandy loam, and silty loam. For sandy loam and silty loam, the results were in good agreement with the size-resolved atmospheric radioactivity concentrations observed at a school ground in Tsushima District, Namie Town, Fukushima, which was heavily contaminated after the Fukushima Dai-ichi Nuclear Power Plant accident in March 2011. Though various assumptions were incorporated into both the scheme and evaluation conditions, this study shows that the proposed scheme can be applied to evaluate secondary emissions caused by aeolian resuspension of radioactive materials associated with mineral dust particles from the ground surface. The results underscore the importance of taking soil texture into account when evaluating the concentrations of resuspended, size-resolved atmospheric radioactivity. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. SIproc: an open-source biomedical data processing platform for large hyperspectral images.

    PubMed

    Berisha, Sebastian; Chang, Shengyuan; Saki, Sam; Daeinejad, Davar; He, Ziqi; Mankar, Rupali; Mayerich, David

    2017-04-10

    There has recently been significant interest within the vibrational spectroscopy community to apply quantitative spectroscopic imaging techniques to histology and clinical diagnosis. However, many of the proposed methods require collecting spectroscopic images that have a similar region size and resolution to the corresponding histological images. Since spectroscopic images contain significantly more spectral samples than traditional histology, the resulting data sets can approach hundreds of gigabytes to terabytes in size. This makes them difficult to store and process, and the tools available to researchers for handling large spectroscopic data sets are limited. Fundamental mathematical tools, such as MATLAB, Octave, and SciPy, are extremely powerful but require that the data be stored in fast memory. This memory limitation becomes impractical for even modestly sized histological images, which can be hundreds of gigabytes in size. In this paper, we propose an open-source toolkit designed to perform out-of-core processing of hyperspectral images. By taking advantage of graphical processing unit (GPU) computing combined with adaptive data streaming, our software alleviates common workstation memory limitations while achieving better performance than existing applications.

  20. A comparison of bootstrap methods and an adjusted bootstrap approach for estimating the prediction error in microarray classification.

    PubMed

    Jiang, Wenyu; Simon, Richard

    2007-12-20

    This paper first provides a critical review on some existing methods for estimating the prediction error in classifying microarray data where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate for the prediction error. Even with small samples, it does not suffer from large upward bias as the leave-one-out bootstrap and the 0.632+ bootstrap, and it does not suffer from large variability as the leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.

  1. Effect sizes and cut-off points: a meta-analytical review of burnout in latin American countries.

    PubMed

    García-Arroyo, Jose; Osca Segovia, Amparo

    2018-05-02

    Burnout is a highly prevalent globalized health issue that causes significant physical and psychological health problems. In Latin America research on this topic has increased in recent years, however there are no studies comparing results across countries, nor normative reference cut-offs. The present meta-analysis examines the intensity of burnout (emotional exhaustion, cynicism and personal accomplishment) in 58 adult nonclinical samples from 8 countries (Argentina, Brazil, Chile, Colombia, Ecuador, Mexico, Peru and Venezuela). We found low intensity of burnout but there are significant differences between countries in emotional exhaustion explained by occupation and language. Social and human service professionals (police officers, social workers, public administration staff) are more exhausted than health professionals (physicians, nurses) or teachers. The samples with Portuguese language score higher in emotional exhaustion than Spanish, supporting the theory of cultural relativism. Demographics (sex, age) and study variables (sample size, instrument), were not found significant to predict burnout. The effect size and confidence intervals found are proposed as a useful baseline for research and medical diagnosis of burnout in Latin American countries.

  2. Sample size determination for GEE analyses of stepped wedge cluster randomized trials.

    PubMed

    Li, Fan; Turner, Elizabeth L; Preisser, John S

    2018-06-19

    In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator. © 2018, The International Biometric Society.

  3. Investigation of Polymer Liquid Crystals

    NASA Technical Reports Server (NTRS)

    Han, Kwang S.

    1996-01-01

    The positron annihilation lifetime spectroscopy (PALS) using a low energy flux generator may provide a reasonably accurate technique for measuring molecular weights of linear polymers and characterization of thin polyimide films in terms of their dielectric constants and hydrophobity etc. Among the tested samples are glassy poly arylene Ether Ketone films, epoxy and other polyimide films. One of the proposed techniques relates the free volume cell size (V(sub f)) with sample molecular weight (M) in a manner remarkably similar to that obtained by Mark Houwink (M-H) between the inherent viscosity (eta) and molecular wieght of polymer solution. The PALS has also demonstrated that free-volume cell size in thermoset is a versatile, useful parameter that relates directly to the polymer segmental molecular weight, the cross-link density, and the coefficient of thermal expansion. Thus, a determination of free volume cell size provides a viable basis for complete microstructural characterization of thermoset polyimides and also gives direct information about the cross-link density and coefficient of expansion of the test samples. Seven areas of the research conducted are reported here.

  4. Factors Associated with the Performance and Cost-Effectiveness of Using Lymphatic Filariasis Transmission Assessment Surveys for Monitoring Soil-Transmitted Helminths: A Case Study in Kenya

    PubMed Central

    Smith, Jennifer L.; Sturrock, Hugh J. W.; Assefa, Liya; Nikolay, Birgit; Njenga, Sammy M.; Kihara, Jimmy; Mwandawiro, Charles S.; Brooker, Simon J.

    2015-01-01

    Transmission assessment surveys (TAS) for lymphatic filariasis have been proposed as a platform to assess the impact of mass drug administration (MDA) on soil-transmitted helminths (STHs). This study used computer simulation and field data from pre- and post-MDA settings across Kenya to evaluate the performance and cost-effectiveness of the TAS design for STH assessment compared with alternative survey designs. Variations in the TAS design and different sample sizes and diagnostic methods were also evaluated. The district-level TAS design correctly classified more districts compared with standard STH designs in pre-MDA settings. Aggregating districts into larger evaluation units in a TAS design decreased performance, whereas age group sampled and sample size had minimal impact. The low diagnostic sensitivity of Kato-Katz and mini-FLOTAC methods was found to increase misclassification. We recommend using a district-level TAS among children 8–10 years of age to assess STH but suggest that key consideration is given to evaluation unit size. PMID:25487730

  5. Surrogate and clinical endpoints for studies in peripheral artery occlusive disease: Are statistics the brakes?

    PubMed

    Waliszewski, Matthias W; Redlich, Ulf; Breul, Victor; Tautenhahn, Jörg

    2017-04-30

    The aim of this review is to present the available clinical and surrogate endpoints that may be used in future studies performed in patients with peripheral artery occlusive disease (PAOD). Importantly, we describe statistical limitations of the most commonly used endpoints and offer some guidance with respect to study design for a given sample size. The proposed endpoints may be used in studies using surgical or interventional revascularization and/or drug treatments. Considering recently published study endpoints and designs, the usefulness of these endpoints for reimbursement is evaluated. Based on these potential study endpoints and patient sample size estimates with different non-inferiority or tests for difference hypotheses, a rating relative to their corresponding reimbursement values is attempted. As regards the benefit for the patients and for the payers, walking distance and the ankle brachial index (ABI) are the most feasible endpoints in a relatively small study samples given that other non-vascular impact factors can be controlled. Angiographic endpoints such as minimal lumen diameter (MLD) do not seem useful from a reimbursement standpoint despite their intuitiveness. Other surrogate endpoints, such as transcutaneous oxygen tension measurements, have yet to be established as useful endpoints in reasonably sized studies with patients with critical limb ischemia (CLI). From a reimbursement standpoint, WD and ABI are effective endpoints for a moderate study sample size given that non-vascular confounding factors can be controlled.

  6. SABRE: a method for assessing the stability of gene modules in complex tissues and subject populations.

    PubMed

    Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T

    2016-11-14

    Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how the outputs of these computational methods are sensitive to the input sample set, or stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permutated gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.

  7. Bayes factors for testing inequality constrained hypotheses: Issues with prior specification.

    PubMed

    Mulder, Joris

    2014-02-01

    Several issues are discussed when testing inequality constrained hypotheses using a Bayesian approach. First, the complexity (or size) of the inequality constrained parameter spaces can be ignored. This is the case when using the posterior probability that the inequality constraints of a hypothesis hold, Bayes factors based on non-informative improper priors, and partial Bayes factors based on posterior priors. Second, the Bayes factor may not be invariant for linear one-to-one transformations of the data. This can be observed when using balanced priors which are centred on the boundary of the constrained parameter space with a diagonal covariance structure. Third, the information paradox can be observed. When testing inequality constrained hypotheses, the information paradox occurs when the Bayes factor of an inequality constrained hypothesis against its complement converges to a constant as the evidence for the first hypothesis accumulates while keeping the sample size fixed. This paradox occurs when using Zellner's g prior as a result of too much prior shrinkage. Therefore, two new methods are proposed that avoid these issues. First, partial Bayes factors are proposed based on transformed minimal training samples. These training samples result in posterior priors that are centred on the boundary of the constrained parameter space with the same covariance structure as in the sample. Second, a g prior approach is proposed by letting g go to infinity. This is possible because the Jeffreys-Lindley paradox is not an issue when testing inequality constrained hypotheses. A simulation study indicated that the Bayes factor based on this g prior approach converges fastest to the true inequality constrained hypothesis. © 2013 The British Psychological Society.

  8. 77 FR 16517 - Agency Information Collection Activities: Proposed Collection; Comment Request-Information...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-21

    ... children and provide low cost or free school lunch meals to qualified students through subsidies to schools... records to demonstrate compliance with the meal requirements. To the extent practicable, schools ensure... verification of a required sample size), the number of meals served, and data from required reviews conducted...

  9. 76 FR 31575 - United States Standards for Grades of Frozen Onions

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-01

    ... processors of frozen onions in the United States. The petition provided information on style, sample size... change in the style designations for minced style, and a correction to the text. The members agreed with the proposed section concerning requirements for Styles, Type I, Whole onions and Type II, Pearl...

  10. How to Measure the Onset of Babbling Reliably?

    ERIC Educational Resources Information Center

    Molemans, Inge; van den Berg, Renate; van Severen, Lieve; Gillis, Steven

    2012-01-01

    Various measures for identifying the onset of babbling have been proposed in the literature, but a formal definition of the exact procedure and a thorough validation of the sample size required for reliably establishing babbling onset is lacking. In this paper the reliability of five commonly used measures is assessed using a large longitudinal…

  11. Modeling Valuations from Experience: A Comment on Ashby and Rakow (2014)

    ERIC Educational Resources Information Center

    Wulff, Dirk U.; Pachur, Thorsten

    2016-01-01

    What are the cognitive mechanisms underlying subjective valuations formed on the basis of sequential experiences of an option's possible outcomes? Ashby and Rakow (2014) have proposed a sliding window model (SWIM), according to which people's valuations represent the average of a limited sample of recent experiences (the size of which is estimated…

  12. Estimation of the vortex length scale and intensity from two-dimensional samples

    NASA Technical Reports Server (NTRS)

    Reuss, D. L.; Cheng, W. P.

    1992-01-01

    A method is proposed for estimating flow features that influence flame wrinkling in reciprocating internal combustion engines, where traditional statistical measures of turbulence are suspect. Candidate methods were tested in a computed channel flow where traditional turbulence measures are valid and performance can be rationally evaluated. Two concepts are tested. First, spatial filtering is applied to the two-dimensional velocity distribution and found to reveal structures corresponding to the vorticity field. Decreasing the spatial-frequency cutoff of the filter locally changes the character and size of the flow structures that are revealed by the filter. Second, vortex length scale and intensity is estimated by computing the ensemble-average velocity distribution conditionally sampled on the vorticity peaks. The resulting conditionally sampled 'average vortex' has a peak velocity less than half the rms velocity and a size approximately equal to the two-point-correlation integral-length scale.

  13. Estimation of the diagnostic threshold accounting for decision costs and sampling uncertainty.

    PubMed

    Skaltsa, Konstantina; Jover, Lluís; Carrasco, Josep Lluís

    2010-10-01

    Medical diagnostic tests are used to classify subjects as non-diseased or diseased. The classification rule usually consists of classifying subjects using the values of a continuous marker that is dichotomised by means of a threshold. Here, the optimum threshold estimate is found by minimising a cost function that accounts for both decision costs and sampling uncertainty. The cost function is optimised either analytically in a normal distribution setting or empirically in a free-distribution setting when the underlying probability distributions of diseased and non-diseased subjects are unknown. Inference of the threshold estimates is based on approximate analytically standard errors and bootstrap-based approaches. The performance of the proposed methodology is assessed by means of a simulation study, and the sample size required for a given confidence interval precision and sample size ratio is also calculated. Finally, a case example based on previously published data concerning the diagnosis of Alzheimer's patients is provided in order to illustrate the procedure.

  14. Passive acoustic measurement of bedload grain size distribution using self-generated noise

    NASA Astrophysics Data System (ADS)

    Petrut, Teodor; Geay, Thomas; Gervaise, Cédric; Belleudy, Philippe; Zanker, Sebastien

    2018-01-01

    Monitoring sediment transport processes in rivers is of particular interest to engineers and scientists to assess the stability of rivers and hydraulic structures. Various methods for sediment transport process description were proposed using conventional or surrogate measurement techniques. This paper addresses the topic of the passive acoustic monitoring of bedload transport in rivers and especially the estimation of the bedload grain size distribution from self-generated noise. It discusses the feasibility of linking the acoustic signal spectrum shape to bedload grain sizes involved in elastic impacts with the river bed treated as a massive slab. Bedload grain size distribution is estimated by a regularized algebraic inversion scheme fed with the power spectrum density of river noise estimated from one hydrophone. The inversion methodology relies upon a physical model that predicts the acoustic field generated by the collision between rigid bodies. Here we proposed an analytic model of the acoustic energy spectrum generated by the impacts between a sphere and a slab. The proposed model computes the power spectral density of bedload noise using a linear system of analytic energy spectra weighted by the grain size distribution. The algebraic system of equations is then solved by least square optimization and solution regularization methods. The result of inversion leads directly to the estimation of the bedload grain size distribution. The inversion method was applied to real acoustic data from passive acoustics experiments realized on the Isère River, in France. The inversion of in situ measured spectra reveals good estimations of grain size distribution, fairly close to what was estimated by physical sampling instruments. These results illustrate the potential of the hydrophone technique to be used as a standalone method that could ensure high spatial and temporal resolution measurements for sediment transport in rivers.

  15. Freeway travel speed calculation model based on ETC transaction data.

    PubMed

    Weng, Jiancheng; Yuan, Rongliang; Wang, Ru; Wang, Chang

    2014-01-01

    Real-time traffic flow operation condition of freeway gradually becomes the critical information for the freeway users and managers. In fact, electronic toll collection (ETC) transaction data effectively records operational information of vehicles on freeway, which provides a new method to estimate the travel speed of freeway. First, the paper analyzed the structure of ETC transaction data and presented the data preprocess procedure. Then, a dual-level travel speed calculation model was established under different levels of sample sizes. In order to ensure a sufficient sample size, ETC data of different enter-leave toll plazas pairs which contain more than one road segment were used to calculate the travel speed of every road segment. The reduction coefficient α and reliable weight θ for sample vehicle speed were introduced in the model. Finally, the model was verified by the special designed field experiments which were conducted on several freeways in Beijing at different time periods. The experiments results demonstrated that the average relative error was about 6.5% which means that the freeway travel speed could be estimated by the proposed model accurately. The proposed model is helpful to promote the level of the freeway operation monitoring and the freeway management, as well as to provide useful information for the freeway travelers.

  16. Semiautomatic Segmentation of Glioma on Mobile Devices.

    PubMed

    Wu, Ya-Ping; Lin, Yu-Song; Wu, Wei-Guo; Yang, Cong; Gu, Jian-Qin; Bai, Yan; Wang, Mei-Yun

    2017-01-01

    Brain tumor segmentation is the first and the most critical step in clinical applications of radiomics. However, segmenting brain images by radiologists is labor intense and prone to inter- and intraobserver variability. Stable and reproducible brain image segmentation algorithms are thus important for successful tumor detection in radiomics. In this paper, we propose a supervised brain image segmentation method, especially for magnetic resonance (MR) brain images with glioma. This paper uses hard edge multiplicative intrinsic component optimization to preprocess glioma medical image on the server side, and then, the doctors could supervise the segmentation process on mobile devices in their convenient time. Since the preprocessed images have the same brightness for the same tissue voxels, they have small data size (typically 1/10 of the original image size) and simple structure of 4 types of intensity value. This observation thus allows follow-up steps to be processed on mobile devices with low bandwidth and limited computing performance. Experiments conducted on 1935 brain slices from 129 patients show that more than 30% of the sample can reach 90% similarity; over 60% of the samples can reach 85% similarity, and more than 80% of the sample could reach 75% similarity. The comparisons with other segmentation methods also demonstrate both efficiency and stability of the proposed approach.

  17. A small-plane heat source method for measuring the thermal conductivities of anisotropic materials

    NASA Astrophysics Data System (ADS)

    Cheng, Liang; Yue, Kai; Wang, Jun; Zhang, Xinxin

    2017-07-01

    A new small-plane heat source method was proposed in this study to simultaneously measure the in-plane and cross-plane thermal conductivities of anisotropic insulating materials. In this method the size of the heat source element is smaller than the sample size and the boundary condition is thermal insulation due to no heat flux at the edge of the sample during the experiment. A three-dimensional model in a rectangular coordinate system was established to exactly describe the heat transfer process of the measurement system. Using the Laplace transform, variable separation, and Laplace inverse transform methods, the analytical solution of the temperature rise of the sample was derived. The temperature rises calculated by the analytical solution agree well with the results of numerical calculation. The result of the sensitivity analysis shows that the sensitivity coefficients of the estimated thermal conductivities are high and uncorrelated to each other. At room temperature and in a high-temperature environment, experimental measurements of anisotropic silica aerogel were carried out using the traditional one-dimensional plane heat source method and the proposed method, respectively. The results demonstrate that the measurement method developed in this study is effective and feasible for simultaneously obtaining the in-plane and cross-plane thermal conductivities of the anisotropic materials.

  18. tscvh R Package: Computational of the two samples test on microarray-sequencing data

    NASA Astrophysics Data System (ADS)

    Fajriyah, Rohmatul; Rosadi, Dedi

    2017-12-01

    We present a new R package, a tscvh (two samples cross-variance homogeneity), as we called it. This package is a software of the cross-variance statistical test which has been proposed and introduced by Fajriyah ([3] and [4]), based on the cross-variance concept. The test can be used as an alternative test for the significance difference between two means when sample size is small, the situation which is usually appeared in the bioinformatics research. Based on its statistical distribution, the p-value can be also provided. The package is built under a homogeneity of variance between samples.

  19. A proposed coast-wide reference monitoring system for evaluating Wetland restoration trajectories in Louisiana

    USGS Publications Warehouse

    Steyer, G.D.; Sasser, C.E.; Visser, J.M.; Swenson, E.M.; Nyman, J.A.; Raynie, R.C.

    2003-01-01

    Wetland restoration efforts conducted in Louisiana under the Coastal Wetlands Planning, Protection and Restoration Act require monitoring the effectiveness of individual projects as well as monitoring the cumulative effects of all projects in restoring, creating, enhancing, and protecting the coastal landscape. The effectiveness of the traditional paired-reference monitoring approach in Louisiana has been limited because of difficulty in finding comparable reference sites. A multiple reference approach is proposed that uses aspects of hydrogeomorphic functional assessments and probabilistic sampling. This approach includes a suite of sites that encompass the range of ecological condition for each stratum, with projects placed on a continuum of conditions found for that stratum. Trajectories in reference sites through time are then compared with project trajectories through time. Plant community zonation complicated selection of indicators, strata, and sample size. The approach proposed could serve as a model for evaluating wetland ecosystems.

  20. A proposed coast-wide reference monitoring system for evaluating wetland restoration trajectories in Louisiana.

    PubMed

    Steyer, Gregory D; Sasser, Charles E; Visser, Jenneke M; Swenson, Erick M; Nyman, John A; Raynie, Richard C

    2003-01-01

    Wetland restoration efforts conducted in Louisiana under the Coastal Wetlands Planning, Protection and Restoration Act require monitoring the effectiveness of individual projects as well as monitoring the cumulative effects of all projects in restoring, creating, enhancing, and protecting the coastal landscape. The effectiveness of the traditional paired-reference monitoring approach in Louisiana has been limited because of difficulty in finding comparable reference sites. A multiple reference approach is proposed that uses aspects of hydrogeomorphic functional assessments and probabilistic sampling. This approach includes a suite of sites that encompass the range of ecological condition for each stratum, with projects placed on a continuum of conditions found for that stratum. Trajectories in reference sites through time are then compared with project trajectories through time. Plant community zonation complicated selection of indicators, strata, and sample size. The approach proposed could serve as a model for evaluating wetland ecosystems.

  1. Predicting fractional bed load transport rates: Application of the Wilcock‐Crowe equations to a regulated gravel bed river

    USGS Publications Warehouse

    Gaeuman, David; Andrews, E.D.; Krause, Andreas; Smith, Wes

    2009-01-01

    Bed load samples from four locations in the Trinity River of northern California are analyzed to evaluate the performance of the Wilcock‐Crowe bed load transport equations for predicting fractional bed load transport rates. Bed surface particles become smaller and the fraction of sand on the bed increases with distance downstream from Lewiston Dam. The dimensionless reference shear stress for the mean bed particle size (τ*rm) is largest near the dam, but varies relatively little between the more downstream locations. The relation between τ*rm and the reference shear stresses for other size fractions is constant across all locations. Total bed load transport rates predicted with the Wilcock‐Crowe equations are within a factor of 2 of sampled transport rates for 68% of all samples. The Wilcock‐Crowe equations nonetheless consistently under‐predict the transport of particles larger than 128 mm, frequently by more than an order of magnitude. Accurate prediction of the transport rates of the largest particles is important for models in which the evolution of the surface grain size distribution determines subsequent bed load transport rates. Values of τ*rm estimated from bed load samples are up to 50% larger than those predicted with the Wilcock‐Crowe equations, and sampled bed load transport approximates equal mobility across a wider range of grain sizes than is implied by the equations. Modifications to the Wilcock‐Crowe equation for determining τ*rm and the hiding function used to scale τ*rm to other grain size fractions are proposed to achieve the best fit to observed bed load transport in the Trinity River.

  2. Effect of Study Design on Sample Size in Studies Intended to Evaluate Bioequivalence of Inhaled Short‐Acting β‐Agonist Formulations

    PubMed Central

    Zeng, Yaohui; Singh, Sachinkumar; Wang, Kai

    2017-01-01

    Abstract Pharmacodynamic studies that use methacholine challenge to assess bioequivalence of generic and innovator albuterol formulations are generally designed per published Food and Drug Administration guidance, with 3 reference doses and 1 test dose (3‐by‐1 design). These studies are challenging and expensive to conduct, typically requiring large sample sizes. We proposed 14 modified study designs as alternatives to the Food and Drug Administration–recommended 3‐by‐1 design, hypothesizing that adding reference and/or test doses would reduce sample size and cost. We used Monte Carlo simulation to estimate sample size. Simulation inputs were selected based on published studies and our own experience with this type of trial. We also estimated effects of these modified study designs on study cost. Most of these altered designs reduced sample size and cost relative to the 3‐by‐1 design, some decreasing cost by more than 40%. The most effective single study dose to add was 180 μg of test formulation, which resulted in an estimated 30% relative cost reduction. Adding a single test dose of 90 μg was less effective, producing only a 13% cost reduction. Adding a lone reference dose of either 180, 270, or 360 μg yielded little benefit (less than 10% cost reduction), whereas adding 720 μg resulted in a 19% cost reduction. Of the 14 study design modifications we evaluated, the most effective was addition of both a 90‐μg test dose and a 720‐μg reference dose (42% cost reduction). Combining a 180‐μg test dose and a 720‐μg reference dose produced an estimated 36% cost reduction. PMID:29281130

  3. Tissue recommendations for precision cancer therapy using next generation sequencing: a comprehensive single cancer center’s experiences

    PubMed Central

    Hong, Mineui; Bang, Heejin; Van Vrancken, Michael; Kim, Seungtae; Lee, Jeeyun; Park, Se Hoon; Park, Joon Oh; Park, Young Suk; Lim, Ho Yeong; Kang, Won Ki; Sun, Jong-Mu; Lee, Se Hoon; Ahn, Myung-Ju; Park, Keunchil; Kim, Duk Hwan; Lee, Seunggwan; Park, Woongyang; Kim, Kyoung-Mee

    2017-01-01

    To generate accurate next-generation sequencing (NGS) data, the amount and quality of DNA extracted is critical. We analyzed 1564 tissue samples from patients with metastatic or recurrent solid tumor submitted for NGS according to their sample size, acquisition method, organ, and fixation to propose appropriate tissue requirements. Of the 1564 tissue samples, 481 (30.8%) consisted of fresh-frozen (FF) tissue, and 1,083 (69.2%) consisted of formalin-fixed paraffin-embedded (FFPE) tissue. We obtained successful NGS results in 95.9% of cases. Out of 481 FF biopsies, 262 tissue samples were from lung, and the mean fragment size was 2.4 mm. Compared to lung, GI tract tumor fragments showed a significantly lower DNA extraction failure rate (2.1 % versus 6.1%, p = 0.04). For FFPE biopsy samples, the size of biopsy tissue was similar regardless of tumor type with a mean of 0.8 × 0.3 cm, and the mean DNA yield per one unstained slide was 114 ng. We obtained highest amount of DNA from the colorectum (2353 ng) and the lowest amount from the hepatobiliary tract (760.3 ng) likely due to a relatively smaller biopsy size, extensive hemorrhage and necrosis, and lower tumor volume. On one unstained slide from FFPE operation specimens, the mean size of the specimen was 2.0 × 1.0 cm, and the mean DNA yield per one unstained slide was 1800 ng. In conclusions, we present our experiences on tissue requirements for appropriate NGS workflow: > 1 mm2 for FF biopsy, > 5 unstained slides for FFPE biopsy, and > 1 unstained slide for FFPE operation specimens for successful test results in 95.9% of cases. PMID:28477007

  4. Thermal conductivity of microporous layers: Analytical modeling and experimental validation

    NASA Astrophysics Data System (ADS)

    Andisheh-Tadbir, Mehdi; Kjeang, Erik; Bahrami, Majid

    2015-11-01

    A new compact relationship is developed for the thermal conductivity of the microporous layer (MPL) used in polymer electrolyte fuel cells as a function of pore size distribution, porosity, and compression pressure. The proposed model is successfully validated against experimental data obtained from a transient plane source thermal constants analyzer. The thermal conductivities of carbon paper samples with and without MPL were measured as a function of load (1-6 bars) and the MPL thermal conductivity was found between 0.13 and 0.17 W m-1 K-1. The proposed analytical model predicts the experimental thermal conductivities within 5%. A correlation generated from the analytical model was used in a multi objective genetic algorithm to predict the pore size distribution and porosity for an MPL with optimized thermal conductivity and mass diffusivity. The results suggest that an optimized MPL, in terms of heat and mass transfer coefficients, has an average pore size of 122 nm and 63% porosity.

  5. A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies

    PubMed Central

    Antoneli, Fernando; Passos, Fernando M.; Lopes, Luciano R.

    2018-01-01

    Divergence date estimates are central to understand evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees and well known non-parametric (one-sample and two-sample) Kolmogorov-Smirnov (KS) goodness-of-fit test. In the strict clock case, the method consists in using the one-sample Kolmogorov-Smirnov (KS) test to directly test if the phylogeny is clock-like, in other words, if it follows a Poisson law. The ECD is computed from the discretized branch lengths and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for the auto-correlation in the ensemble of trees and pseudo-replication we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, it is observed that tree topologies with very long or very short branches lead to Poisson mixtures and in this case we propose the use of the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. Moreover, in this second form the test can also be applied to test for relaxed clock models. The use of a statistically equivalent ensemble of phylogenies to obtain the branch lengths ECD, instead of one consensus tree, yields considerable reduction of the effects of small sample size and provides a gain of power. PMID:29300759

  6. Dispersed particle extraction--a new procedure for trace element enrichment from natural aqueous samples with subsequent ICP-OES analysis.

    PubMed

    Bauer, Gerald; Neouze, Marie-Alexandra; Limbeck, Andreas

    2013-01-15

    A novel sample pre-treatment method for multi trace element enrichment from environmental waters prior to optical emission spectrometry analysis with inductively coupled plasma (ICP-OES) is proposed, based on dispersed particle extraction (DPE). This method is based on the use of silica nanoparticles functionalized with strong cation exchange ligands. After separation from the investigated sample solution, the nanoparticles used for the extraction are directly introduced in the ICP for measurement of the adsorbed target analytes. A prerequisite for the successful application of the developed slurry approach is the use of sorbent particles with a mean size of 500 nm instead of commercially available μm sized beads. The proposed method offers the known advantages of common bead-injection (BI) techniques, and further circumvents the elution step required in conventional solid phase extraction procedures. With the use of 14.4 mL sample and addition of ammonium acetate buffer and particle slurry limits of detection (LODs) from 0.03 μg L(-1) for Be to 0.48 μg L(-1) for Fe, with relative standard deviations ranging from 1.7% for Fe and 5.5% for Cr and an average enrichment factor of 10.4 could be achieved. By implementing this method the possibility to access sorbent materials with irreversible bonding mechanisms for sample pre-treatment is established, thus improvements in the selectivity of sample pre-treatment procedures can be achieved. The presented procedure was tested for accuracy with NIST standard reference material 1643e (fresh water) and was applied to drinking water samples from the vicinity of Vienna. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. A sequential test for assessing observed agreement between raters.

    PubMed

    Bersimis, Sotiris; Sachlas, Athanasios; Chakraborti, Subha

    2018-01-01

    Assessing the agreement between two or more raters is an important topic in medical practice. Existing techniques, which deal with categorical data, are based on contingency tables. This is often an obstacle in practice as we have to wait for a long time to collect the appropriate sample size of subjects to construct the contingency table. In this paper, we introduce a nonparametric sequential test for assessing agreement, which can be applied as data accrues, does not require a contingency table, facilitating a rapid assessment of the agreement. The proposed test is based on the cumulative sum of the number of disagreements between the two raters and a suitable statistic representing the waiting time until the cumulative sum exceeds a predefined threshold. We treat the cases of testing two raters' agreement with respect to one or more characteristics and using two or more classification categories, the case where the two raters extremely disagree, and finally the case of testing more than two raters' agreement. The numerical investigation shows that the proposed test has excellent performance. Compared to the existing methods, the proposed method appears to require significantly smaller sample size with equivalent power. Moreover, the proposed method is easily generalizable and brings the problem of assessing the agreement between two or more raters and one or more characteristics under a unified framework, thus providing an easy to use tool to medical practitioners. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. A scalable kernel-based semisupervised metric learning algorithm with out-of-sample generalization ability.

    PubMed

    Yeung, Dit-Yan; Chang, Hong; Dai, Guang

    2008-11-01

    In recent years, metric learning in the semisupervised setting has aroused a lot of research interest. One type of semisupervised metric learning utilizes supervisory information in the form of pairwise similarity or dissimilarity constraints. However, most methods proposed so far are either limited to linear metric learning or unable to scale well with the data set size. In this letter, we propose a nonlinear metric learning method based on the kernel approach. By applying low-rank approximation to the kernel matrix, our method can handle significantly larger data sets. Moreover, our low-rank approximation scheme can naturally lead to out-of-sample generalization. Experiments performed on both artificial and real-world data show very promising results.

  9. Computing physical properties with quantum Monte Carlo methods with statistical fluctuations independent of system size.

    PubMed

    Assaraf, Roland

    2014-12-01

    We show that the recently proposed correlated sampling without reweighting procedure extends the locality (asymptotic independence of the system size) of a physical property to the statistical fluctuations of its estimator. This makes the approach potentially vastly more efficient for computing space-localized properties in large systems compared with standard correlated methods. A proof is given for a large collection of noninteracting fragments. Calculations on hydrogen chains suggest that this behavior holds not only for systems displaying short-range correlations, but also for systems with long-range correlations.

  10. Quenching of the Quantum Hall Effect in Graphene with Scrolled Edges

    NASA Astrophysics Data System (ADS)

    Cresti, Alessandro; Fogler, Michael M.; Guinea, Francisco; Castro Neto, A. H.; Roche, Stephan

    2012-04-01

    Edge nanoscrolls are shown to strongly influence transport properties of suspended graphene in the quantum Hall regime. The relatively long arclength of the scrolls in combination with their compact transverse size results in formation of many nonchiral transport channels in the scrolls. They short circuit the bulk current paths and inhibit the observation of the quantized two-terminal resistance. Unlike competing theoretical proposals, this mechanism of disrupting the Hall quantization in suspended graphene is not caused by ill-chosen placement of the contacts, singular elastic strains, or a small sample size.

  11. An empirical Bayes approach to analyzing recurring animal surveys

    USGS Publications Warehouse

    Johnson, D.H.

    1989-01-01

    Recurring estimates of the size of animal populations are often required by biologists or wildlife managers. Because of cost or other constraints, estimates frequently lack the accuracy desired but cannot readily be improved by additional sampling. This report proposes a statistical method employing empirical Bayes (EB) estimators as alternatives to those customarily used to estimate population size, and evaluates them by a subsampling experiment on waterfowl surveys. EB estimates, especially a simple limited-translation version, were more accurate and provided shorter confidence intervals with greater coverage probabilities than customary estimates.

  12. A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data

    PubMed Central

    Chen, Yi-Hau

    2017-01-01

    Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA. PMID:28622336

  13. A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.

    PubMed

    Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin

    2017-06-01

    Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA.

  14. Optimising a modified free-space permittivity characterisation method for civil engineering applications

    NASA Astrophysics Data System (ADS)

    Muller, Wayne; Scheuermann, Alexander

    2016-04-01

    Measuring the electrical permittivity of civil engineering materials is important for a range of ground penetrating radar (GPR) and pavement moisture measurement applications. Compacted unbound granular (UBG) pavement materials present a number of preparation and measurement challenges using conventional characterisation techniques. As an alternative to these methods, a modified free-space (MFS) characterisation approach has previously been investigated. This paper describes recent work to optimise and validate the MFS technique. The research included finite difference time domain (FDTD) modelling to better understand the nature of wave propagation within material samples and the test apparatus. This research led to improvements in the test approach and optimisation of sample sizes. The influence of antenna spacing and sample thickness on the permittivity results was investigated by a series of experiments separating antennas and measuring samples of nylon and water. Permittivity measurements of samples of nylon and water approximately 100 mm and 170 mm thick were also compared, showing consistent results. These measurements also agreed well with surface probe measurements of the nylon sample and literature values for water. The results indicate permittivity estimates of acceptable accuracy can be obtained using the proposed approach, apparatus and sample sizes.

  15. Study of vesicle size distribution dependence on pH value based on nanopore resistive pulse method

    NASA Astrophysics Data System (ADS)

    Lin, Yuqing; Rudzevich, Yauheni; Wearne, Adam; Lumpkin, Daniel; Morales, Joselyn; Nemec, Kathleen; Tatulian, Suren; Lupan, Oleg; Chow, Lee

    2013-03-01

    Vesicles are low-micron to sub-micron spheres formed by a lipid bilayer shell and serve as potential vehicles for drug delivery. The size of vesicle is proposed to be one of the instrumental variables affecting delivery efficiency since the size is correlated to factors like circulation and residence time in blood, the rate for cell endocytosis, and efficiency in cell targeting. In this work, we demonstrate accessible and reliable detection and size distribution measurement employing a glass nanopore device based on the resistive pulse method. This novel method enables us to investigate the size distribution dependence of pH difference across the membrane of vesicles with very small sample volume and rapid speed. This provides useful information for optimizing the efficiency of drug delivery in a pH sensitive environment.

  16. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hero, Alfred O.; Rajaratnam, Bala

    When can reliable inference be drawn in the ‘‘Big Data’’ context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for ‘‘Big Data.’’ Sample complexity, however, hasmore » received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.« less

  17. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    PubMed Central

    Hero, Alfred O.; Rajaratnam, Bala

    2015-01-01

    When can reliable inference be drawn in fue “Big Data” context? This paper presents a framework for answering this fundamental question in the context of correlation mining, wifu implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics fue dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than fue number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data”. Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address fuis gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where fue variable dimension is fixed and fue sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa cale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables fua t are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. we demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks. PMID:27087700

  18. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    DOE PAGES

    Hero, Alfred O.; Rajaratnam, Bala

    2015-12-09

    When can reliable inference be drawn in the ‘‘Big Data’’ context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for ‘‘Big Data.’’ Sample complexity, however, hasmore » received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.« less

  19. Enhancing the Damping Behavior of Dilute Zn-0.3Al Alloy by Equal Channel Angular Pressing

    NASA Astrophysics Data System (ADS)

    Demirtas, M.; Atli, K. C.; Yanar, H.; Purcek, G.

    2017-06-01

    The effect of grain size on the damping capacity of a dilute Zn-0.3Al alloy was investigated. It was found that there was a critical strain value (≈1 × 10-4) below and above which damping of Zn-0.3Al showed dynamic and static/dynamic hysteresis behavior, respectively. In the dynamic hysteresis region, damping resulted from viscous sliding of phase/grain boundaries, and decreasing grain size increased the damping capacity. While the quenched sample with 100 to 250 µm grain size showed very limited damping capacity with a loss factor tanδ of less than 0.007, decreasing grain size down to 2 µm by equal channel angular pressing (ECAP) increased tanδ to 0.100 in this region. Dynamic recrystallization due to microplasticity at the sample surface was proposed as the damping mechanism for the first time in the region where the alloy showed the combined aspects of dynamic and static hysteresis damping. In this region, tanδ increased with increasing strain amplitude, and ECAPed sample showed a tanδ value of 0.256 at a strain amplitude of 2 × 10-3, the highest recorded so far in the damping capacity-related studies on ZA alloys.

  20. Population clustering based on copy number variations detected from next generation sequencing data.

    PubMed

    Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

    2014-08-01

    Copy number variations (CNVs) can be used as significant bio-markers and next generation sequencing (NGS) provides a high resolution detection of these CNVs. But how to extract features from CNVs and further apply them to genomic studies such as population clustering have become a big challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data set from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results on the first real data analysis show that the proposed method can cluster two family trio with different ancestries into two ethnic groups and the results on the second real data analysis show that the proposed method can be applied to the whole-genome with large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.

  1. Visual search for tropical web spiders: the influence of plot length, sampling effort, and phase of the day on species richness.

    PubMed

    Pinto-Leite, C M; Rocha, P L B

    2012-12-01

    Empirical studies using visual search methods to investigate spider communities were conducted with different sampling protocols, including a variety of plot sizes, sampling efforts, and diurnal periods for sampling. We sampled 11 plots ranging in size from 5 by 10 m to 5 by 60 m. In each plot, we computed the total number of species detected every 10 min during 1 hr during the daytime and during the nighttime (0630 hours to 1100 hours, both a.m. and p.m.). We measured the influence of time effort on the measurement of species richness by comparing the curves produced by sample-based rarefaction and species richness estimation (first-order jackknife). We used a general linear model with repeated measures to assess whether the phase of the day during which sampling occurred and the differences in the plot lengths influenced the number of species observed and the number of species estimated. To measure the differences in species composition between the phases of the day, we used a multiresponse permutation procedure and a graphical representation based on nonmetric multidimensional scaling. After 50 min of sampling, we noted a decreased rate of species accumulation and a tendency of the estimated richness curves to reach an asymptote. We did not detect an effect of plot size on the number of species sampled. However, differences in observed species richness and species composition were found between phases of the day. Based on these results, we propose guidelines for visual search for tropical web spiders.

  2. Accuracy of remotely sensed data: Sampling and analysis procedures

    NASA Technical Reports Server (NTRS)

    Congalton, R. G.; Oderwald, R. G.; Mead, R. A.

    1982-01-01

    A review and update of the discrete multivariate analysis techniques used for accuracy assessment is given. A listing of the computer program written to implement these techniques is given. New work on evaluating accuracy assessment using Monte Carlo simulation with different sampling schemes is given. The results of matrices from the mapping effort of the San Juan National Forest is given. A method for estimating the sample size requirements for implementing the accuracy assessment procedures is given. A proposed method for determining the reliability of change detection between two maps of the same area produced at different times is given.

  3. Enantiospecific Detection of Chiral Nanosamples Using Photoinduced Force

    NASA Astrophysics Data System (ADS)

    Kamandi, Mohammad; Albooyeh, Mohammad; Guclu, Caner; Veysi, Mehdi; Zeng, Jinwei; Wickramasinghe, Kumar; Capolino, Filippo

    2017-12-01

    We propose a high-resolution microscopy technique for enantiospecific detection of chiral samples down to sub-100-nm size based on force measurement. We delve into the differential photoinduced optical force Δ F exerted on an achiral probe in the vicinity of a chiral sample when left and right circularly polarized beams separately excite the sample-probe interactive system. We analytically prove that Δ F is entangled with the enantiomer type of the sample enabling enantiospecific detection of chiral inclusions. Moreover, we demonstrate that Δ F is linearly dependent on both the chiral response of the sample and the electric response of the tip and is inversely related to the quartic power of probe-sample distance. We provide physical insight into the transfer of optical activity from the chiral sample to the achiral tip based on a rigorous analytical approach. We support our theoretical achievements by several numerical examples highlighting the potential application of the derived analytic properties. Lastly, we demonstrate the sensitivity of our method to enantiospecify nanoscale chiral samples with chirality parameter on the order of 0.01 and discuss how the sensitivity of our proposed technique can be further improved.

  4. Driven Boson Sampling.

    PubMed

    Barkhofen, Sonja; Bartley, Tim J; Sansoni, Linda; Kruse, Regina; Hamilton, Craig S; Jex, Igor; Silberhorn, Christine

    2017-01-13

    Sampling the distribution of bosons that have undergone a random unitary evolution is strongly believed to be a computationally hard problem. Key to outperforming classical simulations of this task is to increase both the number of input photons and the size of the network. We propose driven boson sampling, in which photons are input within the network itself, as a means to approach this goal. We show that the mean number of photons entering a boson sampling experiment can exceed one photon per input mode, while maintaining the required complexity, potentially leading to less stringent requirements on the input states for such experiments. When using heralded single-photon sources based on parametric down-conversion, this approach offers an ∼e-fold enhancement in the input state generation rate over scattershot boson sampling, reaching the scaling limit for such sources. This approach also offers a dramatic increase in the signal-to-noise ratio with respect to higher-order photon generation from such probabilistic sources, which removes the need for photon number resolution during the heralding process as the size of the system increases.

  5. Alternative sample sizes for verification dose experiments and dose audits

    NASA Astrophysics Data System (ADS)

    Taylor, W. A.; Hansen, J. M.

    1999-01-01

    ISO 11137 (1995), "Sterilization of Health Care Products—Requirements for Validation and Routine Control—Radiation Sterilization", provides sampling plans for performing initial verification dose experiments and quarterly dose audits. Alternative sampling plans are presented which provide equivalent protection. These sampling plans can significantly reduce the cost of testing. These alternative sampling plans have been included in a draft ISO Technical Report (type 2). This paper examines the rational behind the proposed alternative sampling plans. The protection provided by the current verification and audit sampling plans is first examined. Then methods for identifying equivalent plans are highlighted. Finally, methods for comparing the cost associated with the different plans are provided. This paper includes additional guidance for selecting between the original and alternative sampling plans not included in the technical report.

  6. Radiographic analysis of vocal tract length and its relation to overall body size in two canid species.

    PubMed

    Plotsky, K; Rendall, D; Riede, T; Chase, K

    2013-09-01

    Body size is an important determinant of resource and mate competition in many species. Competition is often mediated by conspicuous vocal displays, which may help to intimidate rivals and attract mates by providing honest cues to signaler size. Fitch proposed that vocal tract resonances (or formants) should provide particularly good, or honest, acoustic cues to signaler size because they are determined by the length of the vocal tract, which in turn, is hypothesized to scale reliably with overall body size. There is some empirical support for this hypothesis, but to date, many of the effects have been either mixed for males compared with females, weaker than expected in one or the other sex, or complicated by sampling issues. In this paper, we undertake a direct test of Fitch's hypothesis in two canid species using large samples that control for age- and sex-related variation. The samples involved radiographic images of 120 Portuguese water dogs Canis lupus familiaris and 121 Russian silver foxes Vulpes vulpes . Direct measurements were made of vocal tract length from X-ray images and compared against independent measures of body size. In adults of both species, and within both sexes, overall vocal tract length was strongly and significantly correlated with body size. Effects were strongest for the oral component of the vocal tract. By contrast, the length of the pharyngeal component was not as consistently related to body size. These outcomes are some of the clearest evidence to date in support of Fitch's hypothesis. At the same time, they highlight the potential for elements of both honest and deceptive body signaling to occur simultaneously via differential acoustic cues provided by the oral versus pharyngeal components of the vocal tract.

  7. Radiographic analysis of vocal tract length and its relation to overall body size in two canid species

    PubMed Central

    Plotsky, K.; Rendall, D.; Riede, T.; Chase, K.

    2013-01-01

    Body size is an important determinant of resource and mate competition in many species. Competition is often mediated by conspicuous vocal displays, which may help to intimidate rivals and attract mates by providing honest cues to signaler size. Fitch proposed that vocal tract resonances (or formants) should provide particularly good, or honest, acoustic cues to signaler size because they are determined by the length of the vocal tract, which in turn, is hypothesized to scale reliably with overall body size. There is some empirical support for this hypothesis, but to date, many of the effects have been either mixed for males compared with females, weaker than expected in one or the other sex, or complicated by sampling issues. In this paper, we undertake a direct test of Fitch’s hypothesis in two canid species using large samples that control for age- and sex-related variation. The samples involved radiographic images of 120 Portuguese water dogs Canis lupus familiaris and 121 Russian silver foxes Vulpes vulpes. Direct measurements were made of vocal tract length from X-ray images and compared against independent measures of body size. In adults of both species, and within both sexes, overall vocal tract length was strongly and significantly correlated with body size. Effects were strongest for the oral component of the vocal tract. By contrast, the length of the pharyngeal component was not as consistently related to body size. These outcomes are some of the clearest evidence to date in support of Fitch’s hypothesis. At the same time, they highlight the potential for elements of both honest and deceptive body signaling to occur simultaneously via differential acoustic cues provided by the oral versus pharyngeal components of the vocal tract. PMID:24363497

  8. A citrus waste-based biorefinery as a source of renewable energy: technical advances and analysis of engineering challenges.

    PubMed

    Rivas-Cantu, Raul C; Jones, Kim D; Mills, Patrick L

    2013-04-01

    An assessment of recent technical advances on pretreatment processes and its effects on enzymatic hydrolysis as the main steps of a proposed citrus processing waste (CPW) biorefinery is presented. Engineering challenges and relevant gaps in scientific and technical information for reliable design, modeling and scale up of a CPW biorefinery are also discussed. Some integrated physico-chemical pretreatments are proposed for testing for CPW, including high speed knife-grinding and simultaneous caustic addition. These new proposed processes and the effect of parameters such as particle size, surface area and morphology, pore volume and chemical composition of the diverse fractions resulting from pretreatment and enzymatic hydrolysis need to be evaluated and compared for pretreated and untreated samples of grapefruit processing waste. This assessment suggests the potential for filling the data gaps, and preliminary results demonstrate that the reduction of particle size and the increased surface area for the CPW will result in higher reaction rates and monosaccharide yields for the pretreated waste material.

  9. Extension of the Haseman-Elston regression model to longitudinal data.

    PubMed

    Won, Sungho; Elston, Robert C; Park, Taesung

    2006-01-01

    We propose an extension to longitudinal data of the Haseman and Elston regression method for linkage analysis. The proposed model is a mixed model having several random effects. As response variable, we investigate the sibship sample mean corrected cross-product (smHE) and the BLUP-mean corrected cross product (pmHE), comparing them with the original squared difference (oHE), the overall mean corrected cross-product (rHE), and the weighted average of the squared difference and the squared mean-corrected sum (wHE). The proposed model allows for the correlation structure of longitudinal data. Also, the model can test for gene x time interaction to discover genetic variation over time. The model was applied in an analysis of the Genetic Analysis Workshop 13 (GAW13) simulated dataset for a quantitative trait simulating systolic blood pressure. Independence models did not preserve the test sizes, while the mixed models with both family and sibpair random effects tended to preserve size well. Copyright 2006 S. Karger AG, Basel.

  10. Determination of Magnetic Parameters of Maghemite (γ-Fe2O3) Core-Shell Nanoparticles from Nonlinear Magnetic Susceptibility Measurements

    NASA Astrophysics Data System (ADS)

    Syvorotka, Ihor I.; Pavlyk, Lyubomyr P.; Ubizskii, Sergii B.; Buryy, Oleg A.; Savytskyy, Hrygoriy V.; Mitina, Nataliya Y.; Zaichenko, Oleksandr S.

    2017-04-01

    Method of determining of magnetic moment and size from measurements of dependence of the nonlinear magnetic susceptibility upon magnetic field is proposed, substantiated and tested for superparamagnetic nanoparticles (SPNP) of the "magnetic core-polymer shell" type which are widely used in biomedical technologies. The model of the induction response of the SPNP ensemble on the combined action of the magnetic harmonic excitation field and permanent bias field is built, and the analysis of possible ways to determine the magnetic moment and size of the nanoparticles as well as the parameters of the distribution of these variables is performed. Experimental verification of the proposed method was implemented on samples of SPNP with maghemite core in dry form as well as in colloidal systems. The results have been compared with the data obtained by other methods. Advantages of the proposed method are analyzed and discussed, particularly in terms of its suitability for routine express testing of SPNP for biomedical technology.

  11. Fused Deposition Modeling Enables the Low-Cost Fabrication of Porous, Customized-Shape Sorbents for Small-Molecule Extraction.

    PubMed

    Belka, Mariusz; Ulenberg, Szymon; Bączek, Tomasz

    2017-04-18

    Fused deposition modeling, one of the most common techniques in three-dimensional printing and additive manufacturing, has many practical applications in the fields of chemistry and pharmacy. We demonstrate that a thermoplastic elastomer-poly(vinyl alcohol) (PVA) composite material (LAY-FOMM 60), which presents porous properties after PVA removal, is useful for the extraction of small-molecule drug-like compounds from water samples. The usefulness of the proposed approach is demonstrated by the extraction of glimepiride from a water sample, followed by LC-MS analysis. The recovery was 82.24%, with a relative standard deviation of less than 5%. The proposed approach can change the way of thinking about extraction and sample preparation due to a shift to the use of sorbents with customizable size, shape, and chemical properties that do not rely on commercial suppliers.

  12. Improvements in sub-grid, microphysics averages using quadrature based approaches

    NASA Astrophysics Data System (ADS)

    Chowdhary, K.; Debusschere, B.; Larson, V. E.

    2013-12-01

    Sub-grid variability in microphysical processes plays a critical role in atmospheric climate models. In order to account for this sub-grid variability, Larson and Schanen (2013) propose placing a probability density function on the sub-grid cloud microphysics quantities, e.g. autoconversion rate, essentially interpreting the cloud microphysics quantities as a random variable in each grid box. Random sampling techniques, e.g. Monte Carlo and Latin Hypercube, can be used to calculate statistics, e.g. averages, on the microphysics quantities, which then feed back into the model dynamics on the coarse scale. We propose an alternate approach using numerical quadrature methods based on deterministic sampling points to compute the statistical moments of microphysics quantities in each grid box. We have performed a preliminary test on the Kessler autoconversion formula, and, upon comparison with Latin Hypercube sampling, our approach shows an increased level of accuracy with a reduction in sample size by almost two orders of magnitude. Application to other microphysics processes is the subject of ongoing research.

  13. Confirmatory factor analysis applied to the Force Concept Inventory

    NASA Astrophysics Data System (ADS)

    Eaton, Philip; Willoughby, Shannon D.

    2018-06-01

    In 1995, Huffman and Heller used exploratory factor analysis to draw into question the factors of the Force Concept Inventory (FCI). Since then several papers have been published examining the factors of the FCI on larger sets of student responses and understandable factors were extracted as a result. However, none of these proposed factor models have been verified to not be unique to their original sample through the use of independent sets of data. This paper seeks to confirm the factor models proposed by Scott et al. in 2012, and Hestenes et al. in 1992, as well as another expert model proposed within this study through the use of confirmatory factor analysis (CFA) and a sample of 20 822 postinstruction student responses to the FCI. Upon application of CFA using the full sample, all three models were found to fit the data with acceptable global fit statistics. However, when CFA was performed using these models on smaller sample sizes the models proposed by Scott et al. and Eaton and Willoughby were found to be far more stable than the model proposed by Hestenes et al. The goodness of fit of these models to the data suggests that the FCI can be scored on factors that are not unique to a single class. These scores could then be used to comment on how instruction methods effect the performance of students along a single factor and more in-depth analyses of curriculum changes may be possible as a result.

  14. Detecting representative data and generating synthetic samples to improve learning accuracy with imbalanced data sets.

    PubMed

    Li, Der-Chiang; Hu, Susan C; Lin, Liang-Sian; Yeh, Chun-Wu

    2017-01-01

    It is difficult for learning models to achieve high classification performances with imbalanced data sets, because with imbalanced data sets, when one of the classes is much larger than the others, most machine learning and data mining classifiers are overly influenced by the larger classes and ignore the smaller ones. As a result, the classification algorithms often have poor learning performances due to slow convergence in the smaller classes. To balance such data sets, this paper presents a strategy that involves reducing the sizes of the majority data and generating synthetic samples for the minority data. In the reducing operation, we use the box-and-whisker plot approach to exclude outliers and the Mega-Trend-Diffusion method to find representative data from the majority data. To generate the synthetic samples, we propose a counterintuitive hypothesis to find the distributed shape of the minority data, and then produce samples according to this distribution. Four real datasets were used to examine the performance of the proposed approach. We used paired t-tests to compare the Accuracy, G-mean, and F-measure scores of the proposed data pre-processing (PPDP) method merging in the D3C method (PPDP+D3C) with those of the one-sided selection (OSS), the well-known SMOTEBoost (SB) study, and the normal distribution-based oversampling (NDO) approach, and the proposed data pre-processing (PPDP) method. The results indicate that the classification performance of the proposed approach is better than that of above-mentioned methods.

  15. Handling limited datasets with neural networks in medical applications: A small-data approach.

    PubMed

    Shaikhina, Torgyn; Khovanova, Natalia A

    2017-01-01

    Single-centre studies in medical domain are often characterised by limited samples due to the complexity and high costs of patient data collection. Machine learning methods for regression modelling of small datasets (less than 10 observations per predictor variable) remain scarce. Our work bridges this gap by developing a novel framework for application of artificial neural networks (NNs) for regression tasks involving small medical datasets. In order to address the sporadic fluctuations and validation issues that appear in regression NNs trained on small datasets, the method of multiple runs and surrogate data analysis were proposed in this work. The approach was compared to the state-of-the-art ensemble NNs; the effect of dataset size on NN performance was also investigated. The proposed framework was applied for the prediction of compressive strength (CS) of femoral trabecular bone in patients suffering from severe osteoarthritis. The NN model was able to estimate the CS of osteoarthritic trabecular bone from its structural and biological properties with a standard error of 0.85MPa. When evaluated on independent test samples, the NN achieved accuracy of 98.3%, outperforming an ensemble NN model by 11%. We reproduce this result on CS data of another porous solid (concrete) and demonstrate that the proposed framework allows for an NN modelled with as few as 56 samples to generalise on 300 independent test samples with 86.5% accuracy, which is comparable to the performance of an NN developed with 18 times larger dataset (1030 samples). The significance of this work is two-fold: the practical application allows for non-destructive prediction of bone fracture risk, while the novel methodology extends beyond the task considered in this study and provides a general framework for application of regression NNs to medical problems characterised by limited dataset sizes. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  16. Ensemble representations: effects of set size and item heterogeneity on average size perception.

    PubMed

    Marchant, Alexander P; Simons, Daniel J; de Fockert, Jan W

    2013-02-01

    Observers can accurately perceive and evaluate the statistical properties of a set of objects, forming what is now known as an ensemble representation. The accuracy and speed with which people can judge the mean size of a set of objects have led to the proposal that ensemble representations of average size can be computed in parallel when attention is distributed across the display. Consistent with this idea, judgments of mean size show little or no decrement in accuracy when the number of objects in the set increases. However, the lack of a set size effect might result from the regularity of the item sizes used in previous studies. Here, we replicate these previous findings, but show that judgments of mean set size become less accurate when set size increases and the heterogeneity of the item sizes increases. This pattern can be explained by assuming that average size judgments are computed using a limited capacity sampling strategy, and it does not necessitate an ensemble representation computed in parallel across all items in a display. Copyright © 2012 Elsevier B.V. All rights reserved.

  17. Statistical Methods in Assembly Quality Management of Multi-Element Products on Automatic Rotor Lines

    NASA Astrophysics Data System (ADS)

    Pries, V. V.; Proskuriakov, N. E.

    2018-04-01

    To control the assembly quality of multi-element mass-produced products on automatic rotor lines, control methods with operational feedback are required. However, due to possible failures in the operation of the devices and systems of automatic rotor line, there is always a real probability of getting defective (incomplete) products into the output process stream. Therefore, a continuous sampling control of the products completeness, based on the use of statistical methods, remains an important element in managing the quality of assembly of multi-element mass products on automatic rotor lines. The feature of continuous sampling control of the multi-element products completeness in the assembly process is its breaking sort, which excludes the possibility of returning component parts after sampling control to the process stream and leads to a decrease in the actual productivity of the assembly equipment. Therefore, the use of statistical procedures for continuous sampling control of the multi-element products completeness when assembled on automatic rotor lines requires the use of such sampling plans that ensure a minimum size of control samples. Comparison of the values of the limit of the average output defect level for the continuous sampling plan (CSP) and for the automated continuous sampling plan (ACSP) shows the possibility of providing lower limit values for the average output defects level using the ACSP-1. Also, the average sample size when using the ACSP-1 plan is less than when using the CSP-1 plan. Thus, the application of statistical methods in the assembly quality management of multi-element products on automatic rotor lines, involving the use of proposed plans and methods for continuous selective control, will allow to automating sampling control procedures and the required level of quality of assembled products while minimizing sample size.

  18. Synthesis of mesoscale, crumpled, reduced graphene oxide roses by water-in-oil emulsion approach

    NASA Astrophysics Data System (ADS)

    Sharma, Shruti; Pham, Viet H.; Boscoboinik, Jorge A.; Camino, Fernando; Dickerson, James H.; Tannenbaum, Rina

    2018-05-01

    Mesoscale crumpled graphene oxide roses (GO roses) were synthesized by using colloidal graphene oxide (GO) variants as precursors for a hybrid emulsification-rapid evaporation approach. This process produced rose-like, spherical, reduced mesostructures of colloidal GO sheets, with corrugated surfaces and particle sizes tunable in the range of ∼800 nm to 15 μm. Excellent reproducibility for particle size distribution is shown for each selected speed of homogenizer rotor among different sample batches. The morphology and chemical structure of these produced GO roses was investigated using electron microscopy and spectroscopy techniques. The proposed synthesis route provides control over particle size, morphology and chemical properties of the synthesized GO roses.

  19. Spatial distribution and sequential sampling plans for Tuta absoluta (Lepidoptera: Gelechiidae) in greenhouse tomato crops.

    PubMed

    Cocco, Arturo; Serra, Giuseppe; Lentini, Andrea; Deliperi, Salvatore; Delrio, Gavino

    2015-09-01

    The within- and between-plant distribution of the tomato leafminer, Tuta absoluta (Meyrick), was investigated in order to define action thresholds based on leaf infestation and to propose enumerative and binomial sequential sampling plans for pest management applications in protected crops. The pest spatial distribution was aggregated between plants, and median leaves were the most suitable sample to evaluate the pest density. Action thresholds of 36 and 48%, 43 and 56% and 60 and 73% infested leaves, corresponding to economic thresholds of 1 and 3% damaged fruits, were defined for tomato cultivars with big, medium and small fruits respectively. Green's method was a more suitable enumerative sampling plan as it required a lower sampling effort. Binomial sampling plans needed lower average sample sizes than enumerative plans to make a treatment decision, with probabilities of error of <0.10. The enumerative sampling plan required 87 or 343 leaves to estimate the population density in extensive or intensive ecological studies respectively. Binomial plans would be more practical and efficient for control purposes, needing average sample sizes of 17, 20 and 14 leaves to take a pest management decision in order to avoid fruit damage higher than 1% in cultivars with big, medium and small fruits respectively. © 2014 Society of Chemical Industry.

  20. 75 FR 46958 - Proposed Fair Market Rents for the Housing Choice Voucher Program and Moderate Rehabilitation...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-04

    ... program staff. Questions on how to conduct FMR surveys or concerning further methodological explanations... insufficient sample sizes. The areas covered by this estimation method had less than the HUD standard of 200...-bedroom FMR for that area's CBSA as calculated using methods employed for past metropolitan area FMR...

  1. The Empirical Selection of Anchor Items Using a Multistage Approach

    ERIC Educational Resources Information Center

    Craig, Brandon

    2017-01-01

    The purpose of this study was to determine if using a multistage approach for the empirical selection of anchor items would lead to more accurate DIF detection rates than the anchor selection methods proposed by Kopf, Zeileis, & Strobl (2015b). A simulation study was conducted in which the sample size, percentage of DIF, and balance of DIF…

  2. 77 FR 47590 - Notice of Request for a Revision to and Extension of Approval of an Information Collection...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-08-09

    ... requirements or power calculations that justify the proposed sample size, the expected response rate, methods...] Notice of Request for a Revision to and Extension of Approval of an Information Collection; Qualitative... associated with qualitative customer and stakeholder feedback on service delivery by the Animal and Plant...

  3. An Investigation of the Raudenbush (1988) Test for Studying Variance Heterogeneity.

    ERIC Educational Resources Information Center

    Harwell, Michael

    1997-01-01

    The meta-analytic method proposed by S. W. Raudenbush (1988) for studying variance heterogeneity was studied. Results of a Monte Carlo study indicate that the Type I error rate of the test is sensitive to even modestly platykurtic score distributions and to the ratio of study sample size to the number of studies. (SLD)

  4. Rollerjaw Rock Crusher

    NASA Technical Reports Server (NTRS)

    Peters, Gregory; Brown, Kyle; Fuerstenau, Stephen

    2009-01-01

    The rollerjaw rock crusher melds the concepts of jaw crushing and roll crushing long employed in the mining and rock-crushing industries. Rollerjaw rock crushers have been proposed for inclusion in geological exploration missions on Mars, where they would be used to pulverize rock samples into powders in the tens of micrometer particle size range required for analysis by scientific instruments.

  5. Mesoporous Akaganeite of Adjustable Pore Size Synthesized using Mixed Templates

    NASA Astrophysics Data System (ADS)

    Zhang, Y.; Ge, D. L.; Ren, H. P.; Fan, Y. J.; Wu, L. M.; Sun, Z. X.

    2017-12-01

    Mesoporous akaganeite with large and adjustable pore size was synthesized through a co-template method, which was achieved by the combined interaction between PEG2000 and alkyl amines with different lengths of the straight carbon chain. The characterized results indicate that the synthesized samples show comparatively narrow BJH pore size distributions and centered at 14.3 nm when PEG and HEPA was used, and it could be enlarged to 16.8 and 19.4 nm respectively through changing the alkyl amines to DDA and HDA. Meanwhile, all the synthesized akaganeite possess relativity high specific surface area ranging from 183 to 281 m2/g and high total pore volume of 0.98 to 1.5 cm3/g. A possible mechanism leading to the pore size changing was also proposed.

  6. Interval estimation and optimal design for the within-subject coefficient of variation for continuous and binary variables

    PubMed Central

    Shoukri, Mohamed M; Elkum, Nasser; Walter, Stephen D

    2006-01-01

    Background In this paper we propose the use of the within-subject coefficient of variation as an index of a measurement's reliability. For continuous variables and based on its maximum likelihood estimation we derive a variance-stabilizing transformation and discuss confidence interval construction within the framework of a one-way random effects model. We investigate sample size requirements for the within-subject coefficient of variation for continuous and binary variables. Methods We investigate the validity of the approximate normal confidence interval by Monte Carlo simulations. In designing a reliability study, a crucial issue is the balance between the number of subjects to be recruited and the number of repeated measurements per subject. We discuss efficiency of estimation and cost considerations for the optimal allocation of the sample resources. The approach is illustrated by an example on Magnetic Resonance Imaging (MRI). We also discuss the issue of sample size estimation for dichotomous responses with two examples. Results For the continuous variable we found that the variance stabilizing transformation improves the asymptotic coverage probabilities on the within-subject coefficient of variation for the continuous variable. The maximum like estimation and sample size estimation based on pre-specified width of confidence interval are novel contribution to the literature for the binary variable. Conclusion Using the sample size formulas, we hope to help clinical epidemiologists and practicing statisticians to efficiently design reliability studies using the within-subject coefficient of variation, whether the variable of interest is continuous or binary. PMID:16686943

  7. Topological Analysis and Gaussian Decision Tree: Effective Representation and Classification of Biosignals of Small Sample Size.

    PubMed

    Zhang, Zhifei; Song, Yang; Cui, Haochen; Wu, Jayne; Schwartz, Fernando; Qi, Hairong

    2017-09-01

    Bucking the trend of big data, in microdevice engineering, small sample size is common, especially when the device is still at the proof-of-concept stage. The small sample size, small interclass variation, and large intraclass variation, have brought biosignal analysis new challenges. Novel representation and classification approaches need to be developed to effectively recognize targets of interests with the absence of a large training set. Moving away from the traditional signal analysis in the spatiotemporal domain, we exploit the biosignal representation in the topological domain that would reveal the intrinsic structure of point clouds generated from the biosignal. Additionally, we propose a Gaussian-based decision tree (GDT), which can efficiently classify the biosignals even when the sample size is extremely small. This study is motivated by the application of mastitis detection using low-voltage alternating current electrokinetics (ACEK) where five categories of bisignals need to be recognized with only two samples in each class. Experimental results demonstrate the robustness of the topological features as well as the advantage of GDT over some conventional classifiers in handling small dataset. Our method reduces the voltage of ACEK to a safe level and still yields high-fidelity results with a short assay time. This paper makes two distinctive contributions to the field of biosignal analysis, including performing signal processing in the topological domain and handling extremely small dataset. Currently, there have been no related works that can efficiently tackle the dilemma between avoiding electrochemical reaction and accelerating assay process using ACEK.

  8. Purification of complex samples: Implementation of a modular and reconfigurable droplet-based microfluidic platform with cascaded deterministic lateral displacement separation modules

    PubMed Central

    Pudda, Catherine; Boizot, François; Verplanck, Nicolas; Revol-Cavalier, Frédéric; Berthier, Jean; Thuaire, Aurélie

    2018-01-01

    Particle separation in microfluidic devices is a common problematic for sample preparation in biology. Deterministic lateral displacement (DLD) is efficiently implemented as a size-based fractionation technique to separate two populations of particles around a specific size. However, real biological samples contain components of many different sizes and a single DLD separation step is not sufficient to purify these complex samples. When connecting several DLD modules in series, pressure balancing at the DLD outlets of each step becomes critical to ensure an optimal separation efficiency. A generic microfluidic platform is presented in this paper to optimize pressure balancing, when DLD separation is connected either to another DLD module or to a different microfluidic function. This is made possible by generating droplets at T-junctions connected to the DLD outlets. Droplets act as pressure controllers, which perform at the same time the encapsulation of DLD sorted particles and the balance of output pressures. The optimized pressures to apply on DLD modules and on T-junctions are determined by a general model that ensures the equilibrium of the entire platform. The proposed separation platform is completely modular and reconfigurable since the same predictive model applies to any cascaded DLD modules of the droplet-based cartridge. PMID:29768490

  9. Substantial Expansion of Detectable Size Range in Ionic Current Sensing through Pores by Using a Microfluidic Bridge Circuit.

    PubMed

    Yasaki, Hirotoshi; Yasui, Takao; Yanagida, Takeshi; Kaji, Noritada; Kanai, Masaki; Nagashima, Kazuki; Kawai, Tomoji; Baba, Yoshinobu

    2017-10-11

    Measuring ionic currents passing through nano- or micropores has shown great promise for the electrical discrimination of various biomolecules, cells, bacteria, and viruses. However, conventional measurements have shown there is an inherent limitation to the detectable particle volume (1% of the pore volume), which critically hinders applications to real mixtures of biomolecule samples with a wide size range of suspended particles. Here we propose a rational methodology that can detect samples with the detectable particle volume of 0.01% of the pore volume by measuring a transient current generated from the potential differences in a microfluidic bridge circuit. Our method substantially suppresses the background ionic current from the μA level to the pA level, which essentially lowers the detectable particle volume limit even for relatively large pore structures. Indeed, utilizing a microscale long pore structure (volume of 5.6 × 10 4 aL; height and width of 2.0 × 2.0 μm; length of 14 μm), we successfully detected various samples including polystyrene nanoparticles (volume: 4 aL), bacteria, cancer cells, and DNA molecules. Our method will expand the applicability of ionic current sensing systems for various mixed biomolecule samples with a wide size range, which have been difficult to measure by previously existing pore technologies.

  10. A similarity based learning framework for interim analysis of outcome prediction of acupuncture for neck pain.

    PubMed

    Zhang, Gang; Liang, Zhaohui; Yin, Jian; Fu, Wenbin; Li, Guo-Zheng

    2013-01-01

    Chronic neck pain is a common morbid disorder in modern society. Acupuncture has been administered for treating chronic pain as an alternative therapy for a long time, with its effectiveness supported by the latest clinical evidence. However, the potential effective difference in different syndrome types is questioned due to the limits of sample size and statistical methods. We applied machine learning methods in an attempt to solve this problem. Through a multi-objective sorting of subjective measurements, outstanding samples are selected to form the base of our kernel-oriented model. With calculation of similarities between the concerned sample and base samples, we are able to make full use of information contained in the known samples, which is especially effective in the case of a small sample set. To tackle the parameters selection problem in similarity learning, we propose an ensemble version of slightly different parameter setting to obtain stronger learning. The experimental result on a real data set shows that compared to some previous well-known methods, the proposed algorithm is capable of discovering the underlying difference among different syndrome types and is feasible for predicting the effective tendency in clinical trials of large samples.

  11. The interrupted power law and the size of shadow banking.

    PubMed

    Fiaschi, Davide; Kondor, Imre; Marsili, Matteo; Volpati, Valerio

    2014-01-01

    Using public data (Forbes Global 2000) we show that the asset sizes for the largest global firms follow a Pareto distribution in an intermediate range, that is "interrupted" by a sharp cut-off in its upper tail, where it is totally dominated by financial firms. This flattening of the distribution contrasts with a large body of empirical literature which finds a Pareto distribution for firm sizes both across countries and over time. Pareto distributions are generally traced back to a mechanism of proportional random growth, based on a regime of constant returns to scale. This makes our findings of an "interrupted" Pareto distribution all the more puzzling, because we provide evidence that financial firms in our sample should operate in such a regime. We claim that the missing mass from the upper tail of the asset size distribution is a consequence of shadow banking activity and that it provides an (upper) estimate of the size of the shadow banking system. This estimate-which we propose as a shadow banking index-compares well with estimates of the Financial Stability Board until 2009, but it shows a sharper rise in shadow banking activity after 2010. Finally, we propose a proportional random growth model that reproduces the observed distribution, thereby providing a quantitative estimate of the intensity of shadow banking activity.

  12. Particle Morphology Analysis of Biomass Material Based on Improved Image Processing Method

    PubMed Central

    Lu, Zhaolin

    2017-01-01

    Particle morphology, including size and shape, is an important factor that significantly influences the physical and chemical properties of biomass material. Based on image processing technology, a method was developed to process sample images, measure particle dimensions, and analyse the particle size and shape distributions of knife-milled wheat straw, which had been preclassified into five nominal size groups using mechanical sieving approach. Considering the great variation of particle size from micrometer to millimeter, the powders greater than 250 μm were photographed by a flatbed scanner without zoom function, and the others were photographed using a scanning electron microscopy (SEM) with high-image resolution. Actual imaging tests confirmed the excellent effect of backscattered electron (BSE) imaging mode of SEM. Particle aggregation is an important factor that affects the recognition accuracy of the image processing method. In sample preparation, the singulated arrangement and ultrasonic dispersion methods were used to separate powders into particles that were larger and smaller than the nominal size of 250 μm. In addition, an image segmentation algorithm based on particle geometrical information was proposed to recognise the finer clustered powders. Experimental results demonstrated that the improved image processing method was suitable to analyse the particle size and shape distributions of ground biomass materials and solve the size inconsistencies in sieving analysis. PMID:28298925

  13. Kidney function endpoints in kidney transplant trials: a struggle for power.

    PubMed

    Ibrahim, A; Garg, A X; Knoll, G A; Akbari, A; White, C A

    2013-03-01

    Kidney function endpoints are commonly used in randomized controlled trials (RCTs) in kidney transplantation (KTx). We conducted this study to estimate the proportion of ongoing RCTs with kidney function endpoints in KTx where the proposed sample size is large enough to detect meaningful differences in glomerular filtration rate (GFR) with adequate statistical power. RCTs were retrieved using the key word "kidney transplantation" from the National Institute of Health online clinical trial registry. Included trials had at least one measure of kidney function tracked for at least 1 month after transplant. We determined the proportion of two-arm parallel trials that had sufficient sample sizes to detect a minimum 5, 7.5 and 10 mL/min difference in GFR between arms. Fifty RCTs met inclusion criteria. Only 7% of the trials were above a sample size of 562, the number needed to detect a minimum 5 mL/min difference between the groups should one exist (assumptions: α = 0.05; power = 80%, 10% loss to follow-up, common standard deviation of 20 mL/min). The result increased modestly to 36% of trials when a minimum 10 mL/min difference was considered. Only a minority of ongoing trials have adequate statistical power to detect between-group differences in kidney function using conventional sample size estimating parameters. For this reason, some potentially effective interventions which ultimately could benefit patients may be abandoned from future assessment. © Copyright 2013 The American Society of Transplantation and the American Society of Transplant Surgeons.

  14. Selecting promising treatments in randomized Phase II cancer trials with an active control.

    PubMed

    Cheung, Ying Kuen

    2009-01-01

    The primary objective of Phase II cancer trials is to evaluate the potential efficacy of a new regimen in terms of its antitumor activity in a given type of cancer. Due to advances in oncology therapeutics and heterogeneity in the patient population, such evaluation can be interpreted objectively only in the presence of a prospective control group of an active standard treatment. This paper deals with the design problem of Phase II selection trials in which several experimental regimens are compared to an active control, with an objective to identify an experimental arm that is more effective than the control or to declare futility if no such treatment exists. Conducting a multi-arm randomized selection trial is a useful strategy to prioritize experimental treatments for further testing when many candidates are available, but the sample size required in such a trial with an active control could raise feasibility concerns. In this study, we extend the sequential probability ratio test for normal observations to the multi-arm selection setting. The proposed methods, allowing frequent interim monitoring, offer high likelihood of early trial termination, and as such enhance enrollment feasibility. The termination and selection criteria have closed form solutions and are easy to compute with respect to any given set of error constraints. The proposed methods are applied to design a selection trial in which combinations of sorafenib and erlotinib are compared to a control group in patients with non-small-cell lung cancer using a continuous endpoint of change in tumor size. The operating characteristics of the proposed methods are compared to that of a single-stage design via simulations: The sample size requirement is reduced substantially and is feasible at an early stage of drug development.

  15. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    NASA Astrophysics Data System (ADS)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers neglect a large amount of data which not have sufficient follow-up information, only work with labeled data. Consequently, the small sample size limits the advancement of design appropriate classifier. In this paper, a transductive learning method which combined with the filtering strategy in transductive framework and progressive labeling strategy is addressed. The progressive labeling strategy does not need to consider the distribution of labeled samples to evaluate the distribution of unlabeled samples, can effective solve the problem of evaluate the proportion of positive and negative samples in work set. Our experiment result demonstrate that the proposed technique have great potential in cancer prediction based on gene expression.

  16. Statistical differences between relative quantitative molecular fingerprints from microbial communities.

    PubMed

    Portillo, M C; Gonzalez, J M

    2008-08-01

    Molecular fingerprints of microbial communities are a common method for the analysis and comparison of environmental samples. The significance of differences between microbial community fingerprints was analyzed considering the presence of different phylotypes and their relative abundance. A method is proposed by simulating coverage of the analyzed communities as a function of sampling size applying a Cramér-von Mises statistic. Comparisons were performed by a Monte Carlo testing procedure. As an example, this procedure was used to compare several sediment samples from freshwater ponds using a relative quantitative PCR-DGGE profiling technique. The method was able to discriminate among different samples based on their molecular fingerprints, and confirmed the lack of differences between aliquots from a single sample.

  17. Determination of the size, concentration, and refractive index of silica nanoparticles from turbidity spectra.

    PubMed

    Khlebtsov, Boris N; Khanadeev, Vitaly A; Khlebtsov, Nikolai G

    2008-08-19

    The size and concentration of silica cores determine the size and concentration of silica/gold nanoshells in final preparations. Until now, the concentration of silica/gold nanoshells with Stober's silica core has been evaluated through the material balance assumption. Here, we describe a method for simultaneous determination of the average size and concentration of silica nanospheres from turbidity spectra measured within the 400-600 nm spectral band. As the refractive index of silica nanoparticles is the key input parameter for optical determination of their concentration, we propose an optical method and provide experimental data on a direct determination of the refractive index of silica particles n = 1.475 +/- 0.005. Finally, we exemplify our method by determining the particle size and concentration for 10 samples and compare the results with transmission electron microscopy (TEM), atomic force microscopy (AFM), and dynamic light scattering data.

  18. A class of Box-Cox transformation models for recurrent event data.

    PubMed

    Sun, Liuquan; Tong, Xingwei; Zhou, Xian

    2011-04-01

    In this article, we propose a class of Box-Cox transformation models for recurrent event data, which includes the proportional means models as special cases. The new model offers great flexibility in formulating the effects of covariates on the mean functions of counting processes while leaving the stochastic structure completely unspecified. For the inference on the proposed models, we apply a profile pseudo-partial likelihood method to estimate the model parameters via estimating equation approaches and establish large sample properties of the estimators and examine its performance in moderate-sized samples through simulation studies. In addition, some graphical and numerical procedures are presented for model checking. An example of application on a set of multiple-infection data taken from a clinic study on chronic granulomatous disease (CGD) is also illustrated.

  19. Detector-unit-dependent calibration for polychromatic projections of rock core CT.

    PubMed

    Li, Mengfei; Zhao, Yunsong; Zhang, Peng

    2017-01-01

    Computed tomography (CT) plays an important role in digital rock analysis, which is a new prospective technique for oil and gas industry. But the artifacts in CT images will influence the accuracy of the digital rock model. In this study, we proposed and demonstrated a novel method to restore detector-unit-dependent functions for polychromatic projection calibration by scanning some simple shaped reference samples. As long as the attenuation coefficients of the reference samples are similar to the scanned object, the size or position is not needed to be exactly known. Both simulated and real data were used to verify the proposed method. The results showed that the new method reduced both beam hardening artifacts and ring artifacts effectively. Moreover, the method appeared to be quite robust.

  20. Effect of particle size and percentages of Boron carbide on the thermal neutron radiation shielding properties of HDPE/B4C composite: Experimental and simulation studies

    NASA Astrophysics Data System (ADS)

    Soltani, Zahra; Beigzadeh, Amirmohammad; Ziaie, Farhood; Asadi, Eskandar

    2016-10-01

    In this paper the effects of particle size and weight percentage of the reinforcement phase on the absorption ability of thermal neutron by HDPE/B4C composites were investigated by means of Monte-Carlo simulation method using MCNP code and experimental studies. The composite samples were prepared using the HDPE filled with different weight percentages of Boron carbide powder in the form of micro and nano particles. Micro and nano composite were prepared under the similar mixing and moulding processes. The samples were subjected to thermal neutron radiation. Neutron shielding efficiency in terms of the neutron transmission fractions of the composite samples were investigated and compared with simulation results. According to the simulation results, the particle size of the radiation shielding material has an important role on the shielding efficiency. By decreasing the particle size of shielding material in each weight percentages of the reinforcement phase, better radiation shielding properties were obtained. It seems that, decreasing the particle size and homogeneous distribution of nano forms of B4C particles, cause to increase the collision probability between the incident thermal neutron and the shielding material which consequently improve the radiation shielding properties. So, this result, propose the feasibility of nano composite as shielding material to have a high performance shielding characteristic, low weight and low thick shielding along with economical benefit.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Fangyan; Zhang, Song; Chung Wong, Pak

    Effectively visualizing large graphs and capturing the statistical properties are two challenging tasks. To aid in these two tasks, many sampling approaches for graph simplification have been proposed, falling into three categories: node sampling, edge sampling, and traversal-based sampling. It is still unknown which approach is the best. We evaluate commonly used graph sampling methods through a combined visual and statistical comparison of graphs sampled at various rates. We conduct our evaluation on three graph models: random graphs, small-world graphs, and scale-free graphs. Initial results indicate that the effectiveness of a sampling method is dependent on the graph model, themore » size of the graph, and the desired statistical property. This benchmark study can be used as a guideline in choosing the appropriate method for a particular graph sampling task, and the results presented can be incorporated into graph visualization and analysis tools.« less

  2. Size and shape of uniform particles precipitated in homogeneous solutions

    NASA Astrophysics Data System (ADS)

    Sevonkaev, Igor V.

    The assembly of nanosize crystals into larger uniform colloids is a fundamental process that plays a critical role in the formation of a very broad range of fine-particles used in numerous applications in technology, medicine, and national security. It is widely accepted that, along with size, in most of these applications the shape of the particles represents a critical factor. In the current research, we investigate the size and shape control of uniform particles prepared by precipitation in homogeneous solutions. In the first---theoretical---part a combinational mechanism of the shape control during particle growth was proposed and analyzed numerically. The main finding of our simulation is that a proper balance of two processes, preferential attachment of transported monomers at the protruding features of the growing cluster and monomer rearrangement at the cluster surface, can yield a well-defined particle shape that persist for sizes much larger than the original seed over a large interval of time. In the experimental part, three chemically simple systems were selected MgF2, NaMgF3, and PbS for defining and evaluating the key parameters of the shape and size control of the precipitates. Thus, uniform dispersions of particles of different morphologies (spherical, cubic, platelet, and prismatic) were prepared by precipitation in aqueous solutions. The mechanisms of the formation of the resulting particles of different shapes are explained by the role of the pH, temperature, solubility, and ionic strength. Stages of particles growth were evaluated on short and long time scales, winch allowed to propose multistage mechanisms of NaMgF3 growth and estimate induction time and critical nuclei size for MgF2. In addition, for prospective numerical modeling the surface tensions of spherical and platelet particles of MgF2 were evaluated from the X-ray data by a lattice parameter change method. Also, a new method for the evaluation of the variation in the density distribution in colloidal spherical particles was proposed. This method utilizes transmission electron microscopy without high resolution mode and processes acquired images. Suggested method eliminates the dependency of the image contrast on sample crystallinity. The advantage of such approach manifested by the short time sample preparation, fast instrument tune-up, rapid image acquisition and analysis, all of which shortens the processing time.

  3. Rock magnetic properties estimated from coercivity - blocking temperature diagram: application to recent volcanic rocks

    NASA Astrophysics Data System (ADS)

    Terada, T.; Sato, M.; Mochizuki, N.; Yamamoto, Y.; Tsunakawa, H.

    2013-12-01

    Magnetic properties of ferromagnetic minerals generally depend on their chemical composition, crystal structure, size, and shape. In the usual paleomagnetic study, we use a bulk sample which is the assemblage of magnetic minerals showing broad distributions of various magnetic properties. Microscopic and Curie-point observations of the bulk sample enable us to identify the constituent magnetic minerals, while other measurements, for example, stepwise thermal and/or alternating field demagnetizations (ThD, AFD) make it possible to estimate size, shape and domain state of the constituent magnetic grains. However, estimation based on stepwise demagnetizations has a limitation that magnetic grains with the same coercivity Hc (or blocking temperature Tb) can be identified as the single population even though they could have different size and shape. Dunlop and West (1969) carried out mapping of grain size and coercivity (Hc) using pTRM. However, it is considered that their mapping method is basically applicable to natural rocks containing only SD grains, since the grain sizes are estimated on the basis of the single domain theory (Neel, 1949). In addition, it is impossible to check thermal alteration due to laboratory heating in their experiment. In the present study we propose a new experimental method which makes it possible to estimate distribution of size and shape of magnetic minerals in a bulk sample. The present method is composed of simple procedures: (1) imparting ARM to a bulk sample, (2) ThD at a certain temperature, (3) stepwise AFD on the remaining ARM, (4) repeating the steps (1) ~ (3) with ThD at elevating temperatures up to the Curie temperature of the sample. After completion of the whole procedures, ARM spectra are calculated and mapped on the HC-Tb plane (hereafter called HC-Tb diagram). We analyze the Hc-Tb diagrams as follows: (1) For uniaxial SD populations, theoretical curve for a certain grain size (or shape anisotropy) is drawn on the Hc-Tb diagram. The curves are calculated using the single domain theory, since coercivity and blocking temperature of uniaxial SD grains can be expressed as a function of size and shape. (2) Boundary between SD and MD grains are calculated and drawn on the Hc-Tb diagram according to the theory by Butler and Banerjee (1975). (3) Theoretical predictions by (1) and (2) are compared with the obtained ARM spectra to estimate quantitive distribution of size, shape and domain state of magnetic grains in the sample. This mapping method has been applied to three samples: Hawaiian basaltic lava extruded in 1995, Ueno basaltic lava formed during Matsuyama chron, and Oshima basaltic lava extruded in 1986. We will discuss physical states of magnetic grains (size, shape, domain state, etc.) and their possible origins.

  4. Performance of small cluster surveys and the clustered LQAS design to estimate local-level vaccination coverage in Mali.

    PubMed

    Minetti, Andrea; Riera-Montes, Margarita; Nackers, Fabienne; Roederer, Thomas; Koudika, Marie Hortense; Sekkenes, Johanne; Taconet, Aurore; Fermon, Florence; Touré, Albouhary; Grais, Rebecca F; Checchi, Francesco

    2012-10-12

    Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local VC, using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes.

  5. Estimating the Latent Number of Types in Growing Corpora with Reduced Cost-Accuracy Trade-Off

    ERIC Educational Resources Information Center

    Hidaka, Shohei

    2016-01-01

    The number of unique words in children's speech is one of most basic statistics indicating their language development. We may, however, face difficulties when trying to accurately evaluate the number of unique words in a child's growing corpus over time with a limited sample size. This study proposes a novel technique to estimate the latent number…

  6. 3D data processing with advanced computer graphics tools

    NASA Astrophysics Data System (ADS)

    Zhang, Song; Ekstrand, Laura; Grieve, Taylor; Eisenmann, David J.; Chumbley, L. Scott

    2012-09-01

    Often, the 3-D raw data coming from an optical profilometer contains spiky noises and irregular grid, which make it difficult to analyze and difficult to store because of the enormously large size. This paper is to address these two issues for an optical profilometer by substantially reducing the spiky noise of the 3-D raw data from an optical profilometer, and by rapidly re-sampling the raw data into regular grids at any pixel size and any orientation with advanced computer graphics tools. Experimental results will be presented to demonstrate the effectiveness of the proposed approach.

  7. Rethinking non-inferiority: a practical trial design for optimising treatment duration.

    PubMed

    Quartagno, Matteo; Walker, A Sarah; Carpenter, James R; Phillips, Patrick Pj; Parmar, Mahesh Kb

    2018-06-01

    Background Trials to identify the minimal effective treatment duration are needed in different therapeutic areas, including bacterial infections, tuberculosis and hepatitis C. However, standard non-inferiority designs have several limitations, including arbitrariness of non-inferiority margins, choice of research arms and very large sample sizes. Methods We recast the problem of finding an appropriate non-inferior treatment duration in terms of modelling the entire duration-response curve within a pre-specified range. We propose a multi-arm randomised trial design, allocating patients to different treatment durations. We use fractional polynomials and spline-based methods to flexibly model the duration-response curve. We call this a 'Durations design'. We compare different methods in terms of a scaled version of the area between true and estimated prediction curves. We evaluate sensitivity to key design parameters, including sample size, number and position of arms. Results A total sample size of ~ 500 patients divided into a moderate number of equidistant arms (5-7) is sufficient to estimate the duration-response curve within a 5% error margin in 95% of the simulations. Fractional polynomials provide similar or better results than spline-based methods in most scenarios. Conclusion Our proposed practical randomised trial 'Durations design' shows promising performance in the estimation of the duration-response curve; subject to a pending careful investigation of its inferential properties, it provides a potential alternative to standard non-inferiority designs, avoiding many of their limitations, and yet being fairly robust to different possible duration-response curves. The trial outcome is the whole duration-response curve, which may be used by clinicians and policymakers to make informed decisions, facilitating a move away from a forced binary hypothesis testing paradigm.

  8. Estimation of critical behavior from the density of states in classical statistical models

    NASA Astrophysics Data System (ADS)

    Malakis, A.; Peratzakis, A.; Fytas, N. G.

    2004-12-01

    We present a simple and efficient approximation scheme which greatly facilitates the extension of Wang-Landau sampling (or similar techniques) in large systems for the estimation of critical behavior. The method, presented in an algorithmic approach, is based on a very simple idea, familiar in statistical mechanics from the notion of thermodynamic equivalence of ensembles and the central limit theorem. It is illustrated that we can predict with high accuracy the critical part of the energy space and by using this restricted part we can extend our simulations to larger systems and improve the accuracy of critical parameters. It is proposed that the extensions of the finite-size critical part of the energy space, determining the specific heat, satisfy a scaling law involving the thermal critical exponent. The method is applied successfully for the estimation of the scaling behavior of specific heat of both square and simple cubic Ising lattices. The proposed scaling law is verified by estimating the thermal critical exponent from the finite-size behavior of the critical part of the energy space. The density of states of the zero-field Ising model on these lattices is obtained via a multirange Wang-Landau sampling.

  9. Test Statistics and Confidence Intervals to Establish Noninferiority between Treatments with Ordinal Categorical Data.

    PubMed

    Zhang, Fanghong; Miyaoka, Etsuo; Huang, Fuping; Tanaka, Yutaka

    2015-01-01

    The problem for establishing noninferiority is discussed between a new treatment and a standard (control) treatment with ordinal categorical data. A measure of treatment effect is used and a method of specifying noninferiority margin for the measure is provided. Two Z-type test statistics are proposed where the estimation of variance is constructed under the shifted null hypothesis using U-statistics. Furthermore, the confidence interval and the sample size formula are given based on the proposed test statistics. The proposed procedure is applied to a dataset from a clinical trial. A simulation study is conducted to compare the performance of the proposed test statistics with that of the existing ones, and the results show that the proposed test statistics are better in terms of the deviation from nominal level and the power.

  10. Effect size and statistical power in the rodent fear conditioning literature - A systematic review.

    PubMed

    Carneiro, Clarissa F D; Moulin, Thiago C; Macleod, Malcolm R; Amaral, Olavo B

    2018-01-01

    Proposals to increase research reproducibility frequently call for focusing on effect sizes instead of p values, as well as for increasing the statistical power of experiments. However, it is unclear to what extent these two concepts are indeed taken into account in basic biomedical science. To study this in a real-case scenario, we performed a systematic review of effect sizes and statistical power in studies on learning of rodent fear conditioning, a widely used behavioral task to evaluate memory. Our search criteria yielded 410 experiments comparing control and treated groups in 122 articles. Interventions had a mean effect size of 29.5%, and amnesia caused by memory-impairing interventions was nearly always partial. Mean statistical power to detect the average effect size observed in well-powered experiments with significant differences (37.2%) was 65%, and was lower among studies with non-significant results. Only one article reported a sample size calculation, and our estimated sample size to achieve 80% power considering typical effect sizes and variances (15 animals per group) was reached in only 12.2% of experiments. Actual effect sizes correlated with effect size inferences made by readers on the basis of textual descriptions of results only when findings were non-significant, and neither effect size nor power correlated with study quality indicators, number of citations or impact factor of the publishing journal. In summary, effect sizes and statistical power have a wide distribution in the rodent fear conditioning literature, but do not seem to have a large influence on how results are described or cited. Failure to take these concepts into consideration might limit attempts to improve reproducibility in this field of science.

  11. Effect size and statistical power in the rodent fear conditioning literature – A systematic review

    PubMed Central

    Macleod, Malcolm R.

    2018-01-01

    Proposals to increase research reproducibility frequently call for focusing on effect sizes instead of p values, as well as for increasing the statistical power of experiments. However, it is unclear to what extent these two concepts are indeed taken into account in basic biomedical science. To study this in a real-case scenario, we performed a systematic review of effect sizes and statistical power in studies on learning of rodent fear conditioning, a widely used behavioral task to evaluate memory. Our search criteria yielded 410 experiments comparing control and treated groups in 122 articles. Interventions had a mean effect size of 29.5%, and amnesia caused by memory-impairing interventions was nearly always partial. Mean statistical power to detect the average effect size observed in well-powered experiments with significant differences (37.2%) was 65%, and was lower among studies with non-significant results. Only one article reported a sample size calculation, and our estimated sample size to achieve 80% power considering typical effect sizes and variances (15 animals per group) was reached in only 12.2% of experiments. Actual effect sizes correlated with effect size inferences made by readers on the basis of textual descriptions of results only when findings were non-significant, and neither effect size nor power correlated with study quality indicators, number of citations or impact factor of the publishing journal. In summary, effect sizes and statistical power have a wide distribution in the rodent fear conditioning literature, but do not seem to have a large influence on how results are described or cited. Failure to take these concepts into consideration might limit attempts to improve reproducibility in this field of science. PMID:29698451

  12. A robust measure of HIV-1 population turnover within chronically infected individuals.

    PubMed

    Achaz, G; Palmer, S; Kearney, M; Maldarelli, F; Mellors, J W; Coffin, J M; Wakeley, J

    2004-10-01

    A simple nonparameteric test for population structure was applied to temporally spaced samples of HIV-1 sequences from the gag-pol region within two chronically infected individuals. The results show that temporal structure can be detected for samples separated by about 22 months or more. The performance of the method, which was originally proposed to detect geographic structure, was tested for temporally spaced samples using neutral coalescent simulations. Simulations showed that the method is robust to variation in samples sizes and mutation rates, to the presence/absence of recombination, and that the power to detect temporal structure is high. By comparing levels of temporal structure in simulations to the levels observed in real data, we estimate the effective intra-individual population size of HIV-1 to be between 10(3) and 10(4) viruses, which is in agreement with some previous estimates. Using this estimate and a simple measure of sequence diversity, we estimate an effective neutral mutation rate of about 5 x 10(-6) per site per generation in the gag-pol region. The definition and interpretation of estimates of such "effective" population parameters are discussed.

  13. A methodology to investigate the intrinsic effect of the pulsed electric current during the spark plasma sintering of electrically conductive powders

    PubMed Central

    Locci, Antonio Mario; Cincotti, Alberto; Todde, Sara; Orrù, Roberto; Cao, Giacomo

    2010-01-01

    A novel methodology is proposed for investigating the effect of the pulsed electric current during the spark plasma sintering (SPS) of electrically conductive powders without potential misinterpretation of experimental results. First, ensemble configurations (geometry, size and material of the powder sample, die, plunger and spacers) are identified where the electric current is forced to flow only through either the sample or the die, so that the sample is heated either through the Joule effect or by thermal conduction, respectively. These ensemble configurations are selected using a recently proposed mathematical model of an SPS apparatus, which, once suitably modified, makes it possible to carry out detailed electrical and thermal analysis. Next, SPS experiments are conducted using the ensemble configurations theoretically identified. Using aluminum powders as a case study, we find that the temporal profiles of sample shrinkage, which indicate densification behavior, as well as the final density of the sample are clearly different when the electric current flows only through the sample or through the die containing it, whereas the temperature cycle and mechanical load are the same in both cases. PMID:27877354

  14. Finite-Time and -Size Scalings in the Evaluation of Large Deviation Functions. Numerical Analysis in Continuous Time

    NASA Astrophysics Data System (ADS)

    Guevara Hidalgo, Esteban; Nemoto, Takahiro; Lecomte, Vivien

    Rare trajectories of stochastic systems are important to understand because of their potential impact. However, their properties are by definition difficult to sample directly. Population dynamics provide a numerical tool allowing their study, by means of simulating a large number of copies of the system, which are subjected to a selection rule that favors the rare trajectories of interest. However, such algorithms are plagued by finite simulation time- and finite population size- effects that can render their use delicate. Using the continuous-time cloning algorithm, we analyze the finite-time and finite-size scalings of estimators of the large deviation functions associated to the distribution of the rare trajectories. We use these scalings in order to propose a numerical approach which allows to extract the infinite-time and infinite-size limit of these estimators.

  15. Measuring firm size distribution with semi-nonparametric densities

    NASA Astrophysics Data System (ADS)

    Cortés, Lina M.; Mora-Valencia, Andrés; Perote, Javier

    2017-11-01

    In this article, we propose a new methodology based on a (log) semi-nonparametric (log-SNP) distribution that nests the lognormal and enables better fits in the upper tail of the distribution through the introduction of new parameters. We test the performance of the lognormal and log-SNP distributions capturing firm size, measured through a sample of US firms in 2004-2015. Taking different levels of aggregation by type of economic activity, our study shows that the log-SNP provides a better fit of the firm size distribution. We also formally introduce the multivariate log-SNP distribution, which encompasses the multivariate lognormal, to analyze the estimation of the joint distribution of the value of the firm's assets and sales. The results suggest that sales are a better firm size measure, as indicated by other studies in the literature.

  16. GEE-based SNP set association test for continuous and discrete traits in family-based association studies.

    PubMed

    Wang, Xuefeng; Lee, Seunggeun; Zhu, Xiaofeng; Redline, Susan; Lin, Xihong

    2013-12-01

    Family-based genetic association studies of related individuals provide opportunities to detect genetic variants that complement studies of unrelated individuals. Most statistical methods for family association studies for common variants are single marker based, which test one SNP a time. In this paper, we consider testing the effect of an SNP set, e.g., SNPs in a gene, in family studies, for both continuous and discrete traits. Specifically, we propose a generalized estimating equations (GEEs) based kernel association test, a variance component based testing method, to test for the association between a phenotype and multiple variants in an SNP set jointly using family samples. The proposed approach allows for both continuous and discrete traits, where the correlation among family members is taken into account through the use of an empirical covariance estimator. We derive the theoretical distribution of the proposed statistic under the null and develop analytical methods to calculate the P-values. We also propose an efficient resampling method for correcting for small sample size bias in family studies. The proposed method allows for easily incorporating covariates and SNP-SNP interactions. Simulation studies show that the proposed method properly controls for type I error rates under both random and ascertained sampling schemes in family studies. We demonstrate through simulation studies that our approach has superior performance for association mapping compared to the single marker based minimum P-value GEE test for an SNP-set effect over a range of scenarios. We illustrate the application of the proposed method using data from the Cleveland Family GWAS Study. © 2013 WILEY PERIODICALS, INC.

  17. Confidence intervals and sample size calculations for the standardized mean difference effect size between two normal populations under heteroscedasticity.

    PubMed

    Shieh, G

    2013-12-01

    The use of effect sizes and associated confidence intervals in all empirical research has been strongly emphasized by journal publication guidelines. To help advance theory and practice in the social sciences, this article describes an improved procedure for constructing confidence intervals of the standardized mean difference effect size between two independent normal populations with unknown and possibly unequal variances. The presented approach has advantages over the existing formula in both theoretical justification and computational simplicity. In addition, simulation results show that the suggested one- and two-sided confidence intervals are more accurate in achieving the nominal coverage probability. The proposed estimation method provides a feasible alternative to the most commonly used measure of Cohen's d and the corresponding interval procedure when the assumption of homogeneous variances is not tenable. To further improve the potential applicability of the suggested methodology, the sample size procedures for precise interval estimation of the standardized mean difference are also delineated. The desired precision of a confidence interval is assessed with respect to the control of expected width and to the assurance probability of interval width within a designated value. Supplementary computer programs are developed to aid in the usefulness and implementation of the introduced techniques.

  18. Seven ways to increase power without increasing N.

    PubMed

    Hansen, W B; Collins, L M

    1994-01-01

    Many readers of this monograph may wonder why a chapter on statistical power was included. After all, by now the issue of statistical power is in many respects mundane. Everyone knows that statistical power is a central research consideration, and certainly most National Institute on Drug Abuse grantees or prospective grantees understand the importance of including a power analysis in research proposals. However, there is ample evidence that, in practice, prevention researchers are not paying sufficient attention to statistical power. If they were, the findings observed by Hansen (1992) in a recent review of the prevention literature would not have emerged. Hansen (1992) examined statistical power based on 46 cohorts followed longitudinally, using nonparametric assumptions given the subjects' age at posttest and the numbers of subjects. Results of this analysis indicated that, in order for a study to attain 80-percent power for detecting differences between treatment and control groups, the difference between groups at posttest would need to be at least 8 percent (in the best studies) and as much as 16 percent (in the weakest studies). In order for a study to attain 80-percent power for detecting group differences in pre-post change, 22 of the 46 cohorts would have needed relative pre-post reductions of greater than 100 percent. Thirty-three of the 46 cohorts had less than 50-percent power to detect a 50-percent relative reduction in substance use. These results are consistent with other review findings (e.g., Lipsey 1990) that have shown a similar lack of power in a broad range of research topics. Thus, it seems that, although researchers are aware of the importance of statistical power (particularly of the necessity for calculating it when proposing research), they somehow are failing to end up with adequate power in their completed studies. This chapter argues that the failure of many prevention studies to maintain adequate statistical power is due to an overemphasis on sample size (N) as the only, or even the best, way to increase statistical power. It is easy to see how this overemphasis has come about. Sample size is easy to manipulate, has the advantage of being related to power in a straight-forward way, and usually is under the direct control of the researcher, except for limitations imposed by finances or subject availability. Another option for increasing power is to increase the alpha used for hypothesis-testing but, as very few researchers seriously consider significance levels much larger than the traditional .05, this strategy seldom is used. Of course, sample size is important, and the authors of this chapter are not recommending that researchers cease choosing sample sizes carefully. Rather, they argue that researchers should not confine themselves to increasing N to enhance power. It is important to take additional measures to maintain and improve power over and above making sure the initial sample size is sufficient. The authors recommend two general strategies. One strategy involves attempting to maintain the effective initial sample size so that power is not lost needlessly. The other strategy is to take measures to maximize the third factor that determines statistical power: effect size.

  19. Comparison of sampling strategies for object-based classification of urban vegetation from Very High Resolution satellite images

    NASA Astrophysics Data System (ADS)

    Rougier, Simon; Puissant, Anne; Stumpf, André; Lachiche, Nicolas

    2016-09-01

    Vegetation monitoring is becoming a major issue in the urban environment due to the services they procure and necessitates an accurate and up to date mapping. Very High Resolution satellite images enable a detailed mapping of the urban tree and herbaceous vegetation. Several supervised classifications with statistical learning techniques have provided good results for the detection of urban vegetation but necessitate a large amount of training data. In this context, this study proposes to investigate the performances of different sampling strategies in order to reduce the number of examples needed. Two windows based active learning algorithms from state-of-art are compared to a classical stratified random sampling and a third combining active learning and stratified strategies is proposed. The efficiency of these strategies is evaluated on two medium size French cities, Strasbourg and Rennes, associated to different datasets. Results demonstrate that classical stratified random sampling can in some cases be just as effective as active learning methods and that it should be used more frequently to evaluate new active learning methods. Moreover, the active learning strategies proposed in this work enables to reduce the computational runtime by selecting multiple windows at each iteration without increasing the number of windows needed.

  20. Design of a Single Channel Modulated Wideband Converter for Wideband Spectrum Sensing: Theory, Architecture and Hardware Implementation

    PubMed Central

    Liu, Weisong; Huang, Zhitao; Wang, Xiang; Sun, Weichao

    2017-01-01

    In a cognitive radio sensor network (CRSN), wideband spectrum sensing devices which aims to effectively exploit temporarily vacant spectrum intervals as soon as possible are of great importance. However, the challenge of increasingly high signal frequency and wide bandwidth requires an extremely high sampling rate which may exceed today’s best analog-to-digital converters (ADCs) front-end bandwidth. Recently, the newly proposed architecture called modulated wideband converter (MWC), is an attractive analog compressed sensing technique that can highly reduce the sampling rate. However, the MWC has high hardware complexity owing to its parallel channel structure especially when the number of signals increases. In this paper, we propose a single channel modulated wideband converter (SCMWC) scheme for spectrum sensing of band-limited wide-sense stationary (WSS) signals. With one antenna or sensor, this scheme can save not only sampling rate but also hardware complexity. We then present a new, SCMWC based, single node CR prototype System, on which the spectrum sensing algorithm was tested. Experiments on our hardware prototype show that the proposed architecture leads to successful spectrum sensing. And the total sampling rate as well as hardware size is only one channel’s consumption of MWC. PMID:28471410

  1. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis.

  2. New atlas of open star clusters

    NASA Astrophysics Data System (ADS)

    Seleznev, Anton F.; Avvakumova, Ekaterina; Kulesh, Maxim; Filina, Julia; Tsaregorodtseva, Polina; Kvashnina, Alvira

    2017-11-01

    Due to numerous new discoveries of open star clusters in the last two decades, astronomers need an easy-touse resource to get visual information on the relative position of clusters in the sky. Therefore we propose a new atlas of open star clusters. It is based on a table compiled from the largest modern cluster catalogues. The atlas shows the positions and sizes of 3291 clusters and associations, and consists of two parts. The first contains 108 maps of 12 by 12 degrees with an overlapping of 2 degrees in three strips along the Galactic equator. The second one is an online web application, which shows a square field of an arbitrary size, either in equatorial coordinates or in galactic coordinates by request. The atlas is proposed for the sampling of clusters and cluster stars for further investigation. Another use is the identification of clusters among overdensities in stellar density maps or among stellar groups in images of the sky.

  3. Conditional estimation using prior information in 2-stage group sequential designs assuming asymptotic normality when the trial terminated early.

    PubMed

    Shimura, Masashi; Maruo, Kazushi; Gosho, Masahiko

    2018-04-23

    Two-stage designs are widely used to determine whether a clinical trial should be terminated early. In such trials, a maximum likelihood estimate is often adopted to describe the difference in efficacy between the experimental and reference treatments; however, this method is known to display conditional bias. To reduce such bias, a conditional mean-adjusted estimator (CMAE) has been proposed, although the remaining bias may be nonnegligible when a trial is stopped for efficacy at the interim analysis. We propose a new estimator for adjusting the conditional bias of the treatment effect by extending the idea of the CMAE. This estimator is calculated by weighting the maximum likelihood estimate obtained at the interim analysis and the effect size prespecified when calculating the sample size. We evaluate the performance of the proposed estimator through analytical and simulation studies in various settings in which a trial is stopped for efficacy or futility at the interim analysis. We find that the conditional bias of the proposed estimator is smaller than that of the CMAE when the information time at the interim analysis is small. In addition, the mean-squared error of the proposed estimator is also smaller than that of the CMAE. In conclusion, we recommend the use of the proposed estimator for trials that are terminated early for efficacy or futility. Copyright © 2018 John Wiley & Sons, Ltd.

  4. Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine

    PubMed Central

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Garshasbi, Masoud

    2018-01-01

    Background: Gene expression data are characteristically high dimensional with a small sample size in contrast to the feature size and variability inherent in biological processes that contribute to difficulties in analysis. Selection of highly discriminative features decreases the computational cost and complexity of the classifier and improves its reliability for prediction of a new class of samples. Methods: The present study used hybrid particle swarm optimization and genetic algorithms for gene selection and a fuzzy support vector machine (SVM) as the classifier. Fuzzy logic is used to infer the importance of each sample in the training phase and decrease the outlier sensitivity of the system to increase the ability to generalize the classifier. A decision-tree algorithm was applied to the most frequent genes to develop a set of rules for each type of cancer. This improved the abilities of the algorithm by finding the best parameters for the classifier during the training phase without the need for trial-and-error by the user. The proposed approach was tested on four benchmark gene expression profiles. Results: Good results have been demonstrated for the proposed algorithm. The classification accuracy for leukemia data is 100%, for colon cancer is 96.67% and for breast cancer is 98%. The results show that the best kernel used in training the SVM classifier is the radial basis function. Conclusions: The experimental results show that the proposed algorithm can decrease the dimensionality of the dataset, determine the most informative gene subset, and improve classification accuracy using the optimal parameters of the classifier with no user interface. PMID:29535919

  5. Simulation Studies as Designed Experiments: The Comparison of Penalized Regression Models in the “Large p, Small n” Setting

    PubMed Central

    Chaibub Neto, Elias; Bare, J. Christopher; Margolin, Adam A.

    2014-01-01

    New algorithms are continuously proposed in computational biology. Performance evaluation of novel methods is important in practice. Nonetheless, the field experiences a lack of rigorous methodology aimed to systematically and objectively evaluate competing approaches. Simulation studies are frequently used to show that a particular method outperforms another. Often times, however, simulation studies are not well designed, and it is hard to characterize the particular conditions under which different methods perform better. In this paper we propose the adoption of well established techniques in the design of computer and physical experiments for developing effective simulation studies. By following best practices in planning of experiments we are better able to understand the strengths and weaknesses of competing algorithms leading to more informed decisions about which method to use for a particular task. We illustrate the application of our proposed simulation framework with a detailed comparison of the ridge-regression, lasso and elastic-net algorithms in a large scale study investigating the effects on predictive performance of sample size, number of features, true model sparsity, signal-to-noise ratio, and feature correlation, in situations where the number of covariates is usually much larger than sample size. Analysis of data sets containing tens of thousands of features but only a few hundred samples is nowadays routine in computational biology, where “omics” features such as gene expression, copy number variation and sequence data are frequently used in the predictive modeling of complex phenotypes such as anticancer drug response. The penalized regression approaches investigated in this study are popular choices in this setting and our simulations corroborate well established results concerning the conditions under which each one of these methods is expected to perform best while providing several novel insights. PMID:25289666

  6. Microgravity Testing of a Surface Sampling System for Sample Return from Small Solar System Bodies

    NASA Technical Reports Server (NTRS)

    Franzen, M. A.; Preble, J.; Schoenoff, M.; Halona, K.; Long, T. E.; Park, T.; Sears, D. W. G.

    2004-01-01

    The return of samples from solar system bodies is becoming an essential element of solar system exploration. The recent National Research Council Solar System Exploration Decadal Survey identified six sample return missions as high priority missions: South-Aitken Basin Sample Return, Comet Surface Sample Return, Comet Surface Sample Return-sample from selected surface sites, Asteroid Lander/Rover/Sample Return, Comet Nucleus Sample Return-cold samples from depth, and Mars Sample Return [1] and the NASA Roadmap also includes sample return missions [2] . Sample collection methods that have been flown on robotic spacecraft to date return subgram quantities, but many scientific issues (like bulk composition, particle size distributions, petrology, chronology) require tens to hundreds of grams of sample. Many complex sample collection devices have been proposed, however, small robotic missions require simplicity. We present here the results of experiments done with a simple but innovative collection system for sample return from small solar system bodies.

  7. Improving size estimates of open animal populations by incorporating information on age

    USGS Publications Warehouse

    Manly, Bryan F.J.; McDonald, Trent L.; Amstrup, Steven C.; Regehr, Eric V.

    2003-01-01

    Around the world, a great deal of effort is expended each year to estimate the sizes of wild animal populations. Unfortunately, population size has proven to be one of the most intractable parameters to estimate. The capture-recapture estimation models most commonly used (of the Jolly-Seber type) are complicated and require numerous, sometimes questionable, assumptions. The derived estimates usually have large variances and lack consistency over time. In capture–recapture studies of long-lived animals, the ages of captured animals can often be determined with great accuracy and relative ease. We show how to incorporate age information into size estimates for open populations, where the size changes through births, deaths, immigration, and emigration. The proposed method allows more precise estimates of population size than the usual models, and it can provide these estimates from two sample occasions rather than the three usually required. Moreover, this method does not require specialized programs for capture-recapture data; researchers can derive their estimates using the logistic regression module in any standard statistical package.

  8. Competitive Deep-Belief Networks for Underwater Acoustic Target Recognition

    PubMed Central

    Shen, Sheng; Yao, Xiaohui; Sheng, Meiping; Wang, Chen

    2018-01-01

    Underwater acoustic target recognition based on ship-radiated noise belongs to the small-sample-size recognition problems. A competitive deep-belief network is proposed to learn features with more discriminative information from labeled and unlabeled samples. The proposed model consists of four stages: (1) A standard restricted Boltzmann machine is pretrained using a large number of unlabeled data to initialize its parameters; (2) the hidden units are grouped according to categories, which provides an initial clustering model for competitive learning; (3) competitive training and back-propagation algorithms are used to update the parameters to accomplish the task of clustering; (4) by applying layer-wise training and supervised fine-tuning, a deep neural network is built to obtain features. Experimental results show that the proposed method can achieve classification accuracy of 90.89%, which is 8.95% higher than the accuracy obtained by the compared methods. In addition, the highest accuracy of our method is obtained with fewer features than other methods. PMID:29570642

  9. Strategies for informed sample size reduction in adaptive controlled clinical trials

    NASA Astrophysics Data System (ADS)

    Arandjelović, Ognjen

    2017-12-01

    Clinical trial adaptation refers to any adjustment of the trial protocol after the onset of the trial. The main goal is to make the process of introducing new medical interventions to patients more efficient. The principal challenge, which is an outstanding research problem, is to be found in the question of how adaptation should be performed so as to minimize the chance of distorting the outcome of the trial. In this paper, we propose a novel method for achieving this. Unlike most of the previously published work, our approach focuses on trial adaptation by sample size adjustment, i.e. by reducing the number of trial participants in a statistically informed manner. Our key idea is to select the sample subset for removal in a manner which minimizes the associated loss of information. We formalize this notion and describe three algorithms which approach the problem in different ways, respectively, using (i) repeated random draws, (ii) a genetic algorithm, and (iii) what we term pair-wise sample compatibilities. Experiments on simulated data demonstrate the effectiveness of all three approaches, with a consistently superior performance exhibited by the pair-wise sample compatibilities-based method.

  10. A Novel Videography Method for Generating Crack-Extension Resistance Curves in Small Bone Samples

    PubMed Central

    Katsamenis, Orestis L.; Jenkins, Thomas; Quinci, Federico; Michopoulou, Sofia; Sinclair, Ian; Thurner, Philipp J.

    2013-01-01

    Assessment of bone quality is an emerging solution for quantifying the effects of bone pathology or treatment. Perhaps one of the most important parameters characterising bone quality is the toughness behaviour of bone. Particularly, fracture toughness, is becoming a popular means for evaluating bone quality. The method is moving from a single value approach that models bone as a linear-elastic material (using the stress intensity factor, K) towards full crack extension resistance curves (R-curves) using a non-linear model (the strain energy release rate in J-R curves). However, for explanted human bone or small animal bones, there are difficulties in measuring crack-extension resistance curves due to size constraints at the millimetre and sub-millimetre scale. This research proposes a novel “whitening front tracking” method that uses videography to generate full fracture resistance curves in small bone samples where crack propagation cannot typically be observed. Here we present this method on sharp edge notched samples (<1 mm×1 mm×Length) prepared from four human femora tested in three-point bending. Each sample was loaded in a mechanical tester with the crack propagation recorded using videography and analysed using an algorithm to track the whitening (damage) zone. Using the “whitening front tracking” method, full R-curves and J-R curves could be generated for these samples. The curves for this antiplane longitudinal orientation were similar to those found in the literature, being between the published longitudinal and transverse orientations. The proposed technique shows the ability to generate full “crack” extension resistance curves by tracking the whitening front propagation to overcome the small size limitations and the single value approach. PMID:23405186

  11. Optimizing adaptive design for Phase 2 dose-finding trials incorporating long-term success and financial considerations: A case study for neuropathic pain.

    PubMed

    Gao, Jingjing; Nangia, Narinder; Jia, Jia; Bolognese, James; Bhattacharyya, Jaydeep; Patel, Nitin

    2017-06-01

    In this paper, we propose an adaptive randomization design for Phase 2 dose-finding trials to optimize Net Present Value (NPV) for an experimental drug. We replace the traditional fixed sample size design (Patel, et al., 2012) by this new design to see if NPV from the original paper can be improved. Comparison of the proposed design to the previous design is made via simulations using a hypothetical example based on a Diabetic Neuropathic Pain Study. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. The influence of subjective norm on intention to use of learning management system among Malaysian higher education students

    NASA Astrophysics Data System (ADS)

    Baleghi-Zadeh, Sousan; Ayub, Ahmad Fauzi Mohd; Mahmud, Rosnaini; Daud, Shaffe Mohd

    2014-12-01

    In recent years, the use of learning management system by universities has been increasingly growing. However, the results of several studies have revealed that students do not fully use the information systems. The present study proposes a model which investigates the influence of three constructs of perceived ease of use, perceived usefulness, and subjective norm on behavior intention to use of learning management system. The sample size was 216 Malaysian undergraduate students. The results of the study revealed that the proposed model accounts for 31.1 % variance of behavior intention to use.

  13. Multiscale Rotation-Invariant Convolutional Neural Networks for Lung Texture Classification.

    PubMed

    Wang, Qiangchang; Zheng, Yuanjie; Yang, Gongping; Jin, Weidong; Chen, Xinjian; Yin, Yilong

    2018-01-01

    We propose a new multiscale rotation-invariant convolutional neural network (MRCNN) model for classifying various lung tissue types on high-resolution computed tomography. MRCNN employs Gabor-local binary pattern that introduces a good property in image analysis-invariance to image scales and rotations. In addition, we offer an approach to deal with the problems caused by imbalanced number of samples between different classes in most of the existing works, accomplished by changing the overlapping size between the adjacent patches. Experimental results on a public interstitial lung disease database show a superior performance of the proposed method to state of the art.

  14. Maraia Capsule Flight Testing and Results for Entry, Descent, and Landing

    NASA Technical Reports Server (NTRS)

    Sostaric, Ronald R.; Strahan, Alan L.

    2016-01-01

    The Maraia concept is a modest size (150 lb., 30" diameter) capsule that has been proposed as an ISS based, mostly autonomous earth return capability to function either as an Entry, Descent, and Landing (EDL) technology test platform or as a small on-demand sample return vehicle. A flight test program has been completed including high altitude balloon testing of the proposed capsule shape, with the purpose of investigating aerodynamics and stability during the latter portion of the entry flight regime, along with demonstrating a potential recovery system. This paper includes description, objectives, and results from the test program.

  15. High frequency mesozooplankton monitoring: Can imaging systems and automated sample analysis help us describe and interpret changes in zooplankton community composition and size structure — An example from a coastal site

    NASA Astrophysics Data System (ADS)

    Romagnan, Jean Baptiste; Aldamman, Lama; Gasparini, Stéphane; Nival, Paul; Aubert, Anaïs; Jamet, Jean Louis; Stemmann, Lars

    2016-10-01

    The present work aims to show that high throughput imaging systems can be useful to estimate mesozooplankton community size and taxonomic descriptors that can be the base for consistent large scale monitoring of plankton communities. Such monitoring is required by the European Marine Strategy Framework Directive (MSFD) in order to ensure the Good Environmental Status (GES) of European coastal and offshore marine ecosystems. Time and cost-effective, automatic, techniques are of high interest in this context. An imaging-based protocol has been applied to a high frequency time series (every second day between April 2003 to April 2004 on average) of zooplankton obtained in a coastal site of the NW Mediterranean Sea, Villefranche Bay. One hundred eighty four mesozooplankton net collected samples were analysed with a Zooscan and an associated semi-automatic classification technique. The constitution of a learning set designed to maximize copepod identification with more than 10,000 objects enabled the automatic sorting of copepods with an accuracy of 91% (true positives) and a contamination of 14% (false positives). Twenty seven samples were then chosen from the total copepod time series for detailed visual sorting of copepods after automatic identification. This method enabled the description of the dynamics of two well-known copepod species, Centropages typicus and Temora stylifera, and 7 other taxonomically broader copepod groups, in terms of size, biovolume and abundance-size distributions (size spectra). Also, total copepod size spectra underwent significant changes during the sampling period. These changes could be partially related to changes in the copepod assemblage taxonomic composition and size distributions. This study shows that the use of high throughput imaging systems is of great interest to extract relevant coarse (i.e. total abundance, size structure) and detailed (i.e. selected species dynamics) descriptors of zooplankton dynamics. Innovative zooplankton analyses are therefore proposed and open the way for further development of zooplankton community indicators of changes.

  16. Effect of Study Design on Sample Size in Studies Intended to Evaluate Bioequivalence of Inhaled Short-Acting β-Agonist Formulations.

    PubMed

    Zeng, Yaohui; Singh, Sachinkumar; Wang, Kai; Ahrens, Richard C

    2018-04-01

    Pharmacodynamic studies that use methacholine challenge to assess bioequivalence of generic and innovator albuterol formulations are generally designed per published Food and Drug Administration guidance, with 3 reference doses and 1 test dose (3-by-1 design). These studies are challenging and expensive to conduct, typically requiring large sample sizes. We proposed 14 modified study designs as alternatives to the Food and Drug Administration-recommended 3-by-1 design, hypothesizing that adding reference and/or test doses would reduce sample size and cost. We used Monte Carlo simulation to estimate sample size. Simulation inputs were selected based on published studies and our own experience with this type of trial. We also estimated effects of these modified study designs on study cost. Most of these altered designs reduced sample size and cost relative to the 3-by-1 design, some decreasing cost by more than 40%. The most effective single study dose to add was 180 μg of test formulation, which resulted in an estimated 30% relative cost reduction. Adding a single test dose of 90 μg was less effective, producing only a 13% cost reduction. Adding a lone reference dose of either 180, 270, or 360 μg yielded little benefit (less than 10% cost reduction), whereas adding 720 μg resulted in a 19% cost reduction. Of the 14 study design modifications we evaluated, the most effective was addition of both a 90-μg test dose and a 720-μg reference dose (42% cost reduction). Combining a 180-μg test dose and a 720-μg reference dose produced an estimated 36% cost reduction. © 2017, The Authors. The Journal of Clinical Pharmacology published by Wiley Periodicals, Inc. on behalf of American College of Clinical Pharmacology.

  17. Evaluation of dredged material proposed for ocean disposal from Bronx River Project Area, New York

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gruendell, B.D.; Gardiner, W.W.; Antrim, L.D.

    1996-12-01

    The objective of the Bronx River project was to evaluate proposed dredged material from the Bronx River project area in Bronx, New York, to determine its suitability for unconfined ocean disposal at the Mud Dump Site. Bronx River was one of five waterways that the US Army Corps of Engineers-New York District (USAGE-NYD) requested the Battelle Marine Sciences Laboratory (MSL) to sample and to evaluate for dredging and disposal. Sediment samples were submitted for physical and chemical analyses, chemical analyses of dredging site water and elutriate, benthic and water-column acute toxicity tests, and bioaccumulation studies. Fifteen individual sediment core samplesmore » collected from the Bronx River project area were analyzed for grain size, moisture content, and total organic carbon (TOC). One composite sediment sample, representing the entire reach of the area proposed for dredging, was analyzed for bulk density, specific gravity, metals, chlorinated pesticides, polychlorinated biphenyl (PCB) congeners, polynuclear aromatic hydrocarbons (PAH), and 1,4- dichlorobenzene. Dredging site water and elutriate water, which was prepared from the suspended-particulate phase (SPP) of the Bronx River sediment composite, were analyzed for metals, pesticides, and PCBS.« less

  18. A Meta-Analysis on Antecedents and Outcomes of Detachment from Work.

    PubMed

    Wendsche, Johannes; Lohmann-Haislah, Andrea

    2016-01-01

    Detachment from work has been proposed as an important non-work experience helping employees to recover from work demands. This meta-analysis (86 publications, k = 91 independent study samples, N = 38,124 employees) examined core antecedents and outcomes of detachment in employee samples. With regard to outcomes, results indicated average positive correlations between detachment and self-reported mental (i.e., less exhaustion, higher life satisfaction, more well-being, better sleep) and physical (i.e., lower physical discomfort) health, state well-being (i.e., less fatigue, higher positive affect, more intensive state of recovery), and task performance (small to medium sized effects). However, average relationships between detachment and physiological stress indicators and work motivation were not significant while associations with contextual performance and creativity were significant, but negative. Concerning work characteristics, as expected, job demands were negatively related and job resources were positively related to detachment (small sized effects). Further, analyses revealed that person characteristics such as negative affectivity/neuroticism (small sized effect) and heavy work investment (medium sized effect) were negatively related to detachment whereas detachment and demographic variables (i.e., age and gender) were not related. Moreover, we found a medium sized average negative relationship between engagement in work-related activities during non-work time and detachment. For most of the examined relationships heterogeneity of effect sizes was moderate to high. We identified study design, samples' gender distribution, and affective valence of work-related thoughts as moderators for some of these aforementioned relationships. The results of this meta-analysis point to detachment as a non-work (recovery) experience that is influenced by work-related and personal characteristics which in turn is relevant for a range of employee outcomes.

  19. A Meta-Analysis on Antecedents and Outcomes of Detachment from Work

    PubMed Central

    Wendsche, Johannes; Lohmann-Haislah, Andrea

    2017-01-01

    Detachment from work has been proposed as an important non-work experience helping employees to recover from work demands. This meta-analysis (86 publications, k = 91 independent study samples, N = 38,124 employees) examined core antecedents and outcomes of detachment in employee samples. With regard to outcomes, results indicated average positive correlations between detachment and self-reported mental (i.e., less exhaustion, higher life satisfaction, more well-being, better sleep) and physical (i.e., lower physical discomfort) health, state well-being (i.e., less fatigue, higher positive affect, more intensive state of recovery), and task performance (small to medium sized effects). However, average relationships between detachment and physiological stress indicators and work motivation were not significant while associations with contextual performance and creativity were significant, but negative. Concerning work characteristics, as expected, job demands were negatively related and job resources were positively related to detachment (small sized effects). Further, analyses revealed that person characteristics such as negative affectivity/neuroticism (small sized effect) and heavy work investment (medium sized effect) were negatively related to detachment whereas detachment and demographic variables (i.e., age and gender) were not related. Moreover, we found a medium sized average negative relationship between engagement in work-related activities during non-work time and detachment. For most of the examined relationships heterogeneity of effect sizes was moderate to high. We identified study design, samples' gender distribution, and affective valence of work-related thoughts as moderators for some of these aforementioned relationships. The results of this meta-analysis point to detachment as a non-work (recovery) experience that is influenced by work-related and personal characteristics which in turn is relevant for a range of employee outcomes. PMID:28133454

  20. The Interrupted Power Law and the Size of Shadow Banking

    PubMed Central

    Fiaschi, Davide; Kondor, Imre; Marsili, Matteo; Volpati, Valerio

    2014-01-01

    Using public data (Forbes Global 2000) we show that the asset sizes for the largest global firms follow a Pareto distribution in an intermediate range, that is “interrupted” by a sharp cut-off in its upper tail, where it is totally dominated by financial firms. This flattening of the distribution contrasts with a large body of empirical literature which finds a Pareto distribution for firm sizes both across countries and over time. Pareto distributions are generally traced back to a mechanism of proportional random growth, based on a regime of constant returns to scale. This makes our findings of an “interrupted” Pareto distribution all the more puzzling, because we provide evidence that financial firms in our sample should operate in such a regime. We claim that the missing mass from the upper tail of the asset size distribution is a consequence of shadow banking activity and that it provides an (upper) estimate of the size of the shadow banking system. This estimate–which we propose as a shadow banking index–compares well with estimates of the Financial Stability Board until 2009, but it shows a sharper rise in shadow banking activity after 2010. Finally, we propose a proportional random growth model that reproduces the observed distribution, thereby providing a quantitative estimate of the intensity of shadow banking activity. PMID:24728096

  1. Wave transmission approach based on modal analysis for embedded mechanical systems

    NASA Astrophysics Data System (ADS)

    Cretu, Nicolae; Nita, Gelu; Ioan Pop, Mihail

    2013-09-01

    An experimental method for determining the phase velocity in small solid samples is proposed. The method is based on measuring the resonant frequencies of a binary or ternary solid elastic system comprising the small sample of interest and a gauge material of manageable size. The wave transmission matrix of the combined system is derived and the theoretical values of its eigenvalues are used to determine the expected eigenfrequencies that, equated with the measured values, allow for the numerical estimation of the phase velocities in both materials. The known phase velocity of the gauge material is then used to asses the accuracy of the method. Using computer simulation and the experimental values for phase velocities, the theoretical values for the eigenfrequencies of the eigenmodes of the embedded elastic system are obtained, to validate the method. We conclude that the proposed experimental method may be reliably used to determine the elastic properties of small solid samples whose geometries do not allow a direct measurement of their resonant frequencies.

  2. Dynamical traps in Wang-Landau sampling of continuous systems: Mechanism and solution

    NASA Astrophysics Data System (ADS)

    Koh, Yang Wei; Sim, Adelene Y. L.; Lee, Hwee Kuan

    2015-08-01

    We study the mechanism behind dynamical trappings experienced during Wang-Landau sampling of continuous systems reported by several authors. Trapping is caused by the random walker coming close to a local energy extremum, although the mechanism is different from that of the critical slowing-down encountered in conventional molecular dynamics or Monte Carlo simulations. When trapped, the random walker misses the entire or even several stages of Wang-Landau modification factor reduction, leading to inadequate sampling of the configuration space and a rough density of states, even though the modification factor has been reduced to very small values. Trapping is dependent on specific systems, the choice of energy bins, and the Monte Carlo step size, making it highly unpredictable. A general, simple, and effective solution is proposed where the configurations of multiple parallel Wang-Landau trajectories are interswapped to prevent trapping. We also explain why swapping frees the random walker from such traps. The efficacy of the proposed algorithm is demonstrated.

  3. Towards a plot size for Canada's national forest inventory

    Treesearch

    Steen Magnussen; P. Boudewyn; M. Gillis

    2000-01-01

    A proposed national forest inventory for Canada is to report on the state and trends of resource attributes gathered mainly from aerial photos of sample plots located on a national grid. A pilot project in New Brunswick indicates it takes about 2,800 square 400-ha plots (10 percent inventoried) to achieve a relative standard error of 10 percent or less on 14 out of 17...

  4. Slurry sampling high-resolution continuum source electrothermal atomic absorption spectrometry for direct beryllium determination in soil and sediment samples after elimination of SiO interference by least-squares background correction.

    PubMed

    Husáková, Lenka; Urbanová, Iva; Šafránková, Michaela; Šídová, Tereza

    2017-12-01

    In this work a simple, efficient, and environmentally-friendly method is proposed for determination of Be in soil and sediment samples employing slurry sampling and high-resolution continuum source electrothermal atomic absorption spectrometry (HR-CS-ETAAS). The spectral effects originating from SiO species were identified and successfully corrected by means of a mathematical correction algorithm. Fractional factorial design has been employed to assess the parameters affecting the analytical results and especially to help in the development of the slurry preparation and optimization of measuring conditions. The effects of seven analytical variables including particle size, concentration of glycerol and HNO 3 for stabilization and analyte extraction, respectively, the effect of ultrasonic agitation for slurry homogenization, concentration of chemical modifier, pyrolysis and atomization temperature were investigated by a 2 7-3 replicate (n = 3) design. Using the optimized experimental conditions, the proposed method allowed the determination of Be with a detection limit being 0.016mgkg -1 and characteristic mass 1.3pg. Optimum results were obtained after preparing the slurries by weighing 100mg of a sample with particle size < 54µm and adding 25mL of 20% w/w glycerol. The use of 1μg Rh and 50μg citric acid was found satisfactory for the analyte stabilization. Accurate data were obtained with the use of matrix-free calibration. The accuracy of the method was confirmed by analysis of two certified reference materials (NIST SRM 2702 Inorganics in Marine Sediment and IGI BIL-1 Baikal Bottom Silt) and by comparison of the results obtained for ten real samples by slurry sampling with those determined after microwave-assisted extraction by inductively coupled plasma time of flight mass spectrometry (TOF-ICP-MS). The reported method has a precision better than 7%. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Correcting intensity loss errors in the absence of texture-free reference samples during pole figure measurement

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saleh, Ahmed A., E-mail: asaleh@uow.edu.au

    Even with the use of X-ray polycapillary lenses, sample tilting during pole figure measurement results in a decrease in the recorded X-ray intensity. The magnitude of this error is affected by the sample size and/or the finite detector size. These errors can be typically corrected by measuring the intensity loss as a function of the tilt angle using a texture-free reference sample (ideally made of the same alloy as the investigated material). Since texture-free reference samples are not readily available for all alloys, the present study employs an empirical procedure to estimate the correction curve for a particular experimental configuration.more » It involves the use of real texture-free reference samples that pre-exist in any X-ray diffraction laboratory to first establish the empirical correlations between X-ray intensity, sample tilt and their Bragg angles and thereafter generate correction curves for any Bragg angle. It will be shown that the empirically corrected textures are in very good agreement with the experimentally corrected ones. - Highlights: •Sample tilting during X-ray pole figure measurement leads to intensity loss errors. •Texture-free reference samples are typically used to correct the pole figures. •An empirical correction procedure is proposed in the absence of reference samples. •The procedure relies on reference samples that pre-exist in any texture laboratory. •Experimentally and empirically corrected textures are in very good agreement.« less

  6. A method of bias correction for maximal reliability with dichotomous measures.

    PubMed

    Penev, Spiridon; Raykov, Tenko

    2010-02-01

    This paper is concerned with the reliability of weighted combinations of a given set of dichotomous measures. Maximal reliability for such measures has been discussed in the past, but the pertinent estimator exhibits a considerable bias and mean squared error for moderate sample sizes. We examine this bias, propose a procedure for bias correction, and develop a more accurate asymptotic confidence interval for the resulting estimator. In most empirically relevant cases, the bias correction and mean squared error correction can be performed simultaneously. We propose an approximate (asymptotic) confidence interval for the maximal reliability coefficient, discuss the implementation of this estimator, and investigate the mean squared error of the associated asymptotic approximation. We illustrate the proposed methods using a numerical example.

  7. The Grain-size Patchiness of Braided Gravel-Bed Streams: Example of the Urumqi River (northeast Tian Shan, China)

    NASA Astrophysics Data System (ADS)

    Guerit, L.; Barrier, L.; Narteau, C.; Métivier, F.; Liu, Y.; Lajeunesse, E.; Gayer, E.; Malverti, L.; Meunier, P.; Ye, B.

    2012-04-01

    In gravel-beds rivers, sediments are sorted into patches of different grain-sizes. For single-thread streams, it has long been shown that this local granulometric sorting is closely linked to the channel morpho-sedimentary elements. For braided streams, this relation is still unclear. In such rivers, many observations of vertical sediment sorting has led to the definition of a surface and a subsurface layers. Because of this common stratification, methods for sampling gravel-bed rivers have been divided in two families. The surface layer is generally sampled by surface methods and the subsurface layer by volumetric methods. Yet, the equivalency between the two kind of techniques is still a key question. In this study, we characterized the grain-size distribution of the surface layer of the Urumqi River, a shallow braided gravel-bed river in China, by surface-count (Wolman grid-by-number) and volumetric (sieve-by-weight) sampling methods. An analysis of two large samples (212 grains and 3226 kg) show that these two methods are equivalent to characterize the river-bed surface layer. Then, we looked at the grain-size distributions of the river-bed morpho-sedimentary elements: (1) chutes at flow constrictions, which pass downstream to (2) anabranches and (3) bars at flow expansions. Using both sampling methods, we measured the diameter of more than 2300 grains and weight more than 6000 kg of grains larger than 4 mm. Our results show that the three morpho-sedimentary elements correspond only to two kinds of grain-size patches: (1) chutes composed of one coarse-grained top layer lying on finer deposits, and (2) anabranches and bars made up of finer-grained deposits more homogeneous in depth. On the basis of these quantitative observations, together with the concave or convex morphology of the different elements, we propose that chute patches form by erosion and transit with size-selective entrainment, whereas anabranch and bar patches rather develop and migrate by transit and deposition. These patch features may be typical of shallow braided gravel-bed rivers and should be considered in future works about on bedload transport processes and their geomorphologic and stratigraphic results.

  8. Finite-time and finite-size scalings in the evaluation of large-deviation functions: Numerical approach in continuous time.

    PubMed

    Guevara Hidalgo, Esteban; Nemoto, Takahiro; Lecomte, Vivien

    2017-06-01

    Rare trajectories of stochastic systems are important to understand because of their potential impact. However, their properties are by definition difficult to sample directly. Population dynamics provides a numerical tool allowing their study, by means of simulating a large number of copies of the system, which are subjected to selection rules that favor the rare trajectories of interest. Such algorithms are plagued by finite simulation time and finite population size, effects that can render their use delicate. In this paper, we present a numerical approach which uses the finite-time and finite-size scalings of estimators of the large deviation functions associated to the distribution of rare trajectories. The method we propose allows one to extract the infinite-time and infinite-size limit of these estimators, which-as shown on the contact process-provides a significant improvement of the large deviation function estimators compared to the standard one.

  9. 3D contour fluorescence spectroscopy with Brus model: Determination of size and band gap of double stranded DNA templated silver nanoclusters

    NASA Astrophysics Data System (ADS)

    Kamalraj, Devaraj; Yuvaraj, Selvaraj; Yoganand, Coimbatore Paramasivam; Jaffer, Syed S.

    2018-01-01

    Here, we propose a new synthetic methodology for silver nanocluster preparation by using a double stranded-DNA (ds-DNA) template which no one has reported yet. A new calculative method was formulated to determine the size of the nanocluster and their band gaps by using steady state 3D contour fluorescence technique with Brus model. Generally, the structure and size of the nanoclusters determine by using High Resolution Transmission Electron Microscopy (HR-TEM). Before imaging the samples by using HR-TEM, they are introduced to drying process which causes aggregation and forms bigger polycrystalline particles. It takes long time duration and expensive methodology. In this current methodology, we found out the size and band gap of the nanocluster in the liquid form without any polycrystalline aggregation for which 3D contour fluorescence technique was used as an alternative approach to the HR-TEM method.

  10. Performance of small cluster surveys and the clustered LQAS design to estimate local-level vaccination coverage in Mali

    PubMed Central

    2012-01-01

    Background Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. Methods We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local VC, using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. Results VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Conclusions Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes. PMID:23057445

  11. Robust Face Recognition via Multi-Scale Patch-Based Matrix Regression.

    PubMed

    Gao, Guangwei; Yang, Jian; Jing, Xiaoyuan; Huang, Pu; Hua, Juliang; Yue, Dong

    2016-01-01

    In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the limited training sample size is the most fundamental problem. By making use of the low-rank structural information of the reconstructed error image, the so-called nuclear norm-based matrix regression has been demonstrated to be effective for robust face recognition with continuous occlusions. However, the recognition performance of nuclear norm-based matrix regression degrades greatly in the face of the small sample size problem. An alternative solution to tackle this problem is performing matrix regression on each patch and then integrating the outputs from all patches. However, it is difficult to set an optimal patch size across different databases. To fully utilize the complementary information from different patch scales for the final decision, we propose a multi-scale patch-based matrix regression scheme based on which the ensemble of multi-scale outputs can be achieved optimally. Extensive experiments on benchmark face databases validate the effectiveness and robustness of our method, which outperforms several state-of-the-art patch-based face recognition algorithms.

  12. Size-Based Separation of Particles and Cells Utilizing Viscoelastic Effects in Straight Microchannels.

    PubMed

    Liu, Chao; Xue, Chundong; Chen, Xiaodong; Shan, Lei; Tian, Yu; Hu, Guoqing

    2015-06-16

    Viscoelasticity-induced particle migration has recently received increasing attention due to its ability to obtain high-quality focusing over a wide range of flow rates. However, its application is limited to low throughput regime since the particles can defocus as flow rate increases. Using an engineered carrier medium with constant and low viscosity and strong elasticity, the sample flow rates are improved to be 1 order of magnitude higher than those in existing studies. Utilizing differential focusing of particles of different sizes, here, we present sheathless particle/cell separation in simple straight microchannels that possess excellent parallelizability for further throughput enhancement. The present method can be implemented over a wide range of particle/cell sizes and flow rates. We successfully separate small particles from larger particles, MCF-7 cells from red blood cells (RBCs), and Escherichia coli (E. coli) bacteria from RBCs in different straight microchannels. The proposed method could broaden the applications of viscoelastic microfluidic devices to particle/cell separation due to the enhanced sample throughput and simple channel design.

  13. Testing homogeneity of proportion ratios for stratified correlated bilateral data in two-arm randomized clinical trials.

    PubMed

    Pei, Yanbo; Tian, Guo-Liang; Tang, Man-Lai

    2014-11-10

    Stratified data analysis is an important research topic in many biomedical studies and clinical trials. In this article, we develop five test statistics for testing the homogeneity of proportion ratios for stratified correlated bilateral binary data based on an equal correlation model assumption. Bootstrap procedures based on these test statistics are also considered. To evaluate the performance of these statistics and procedures, we conduct Monte Carlo simulations to study their empirical sizes and powers under various scenarios. Our results suggest that the procedure based on score statistic performs well generally and is highly recommended. When the sample size is large, procedures based on the commonly used weighted least square estimate and logarithmic transformation with Mantel-Haenszel estimate are recommended as they do not involve any computation of maximum likelihood estimates requiring iterative algorithms. We also derive approximate sample size formulas based on the recommended test procedures. Finally, we apply the proposed methods to analyze a multi-center randomized clinical trial for scleroderma patients. Copyright © 2014 John Wiley & Sons, Ltd.

  14. Volatility measurement with directional change in Chinese stock market: Statistical property and investment strategy

    NASA Astrophysics Data System (ADS)

    Ma, Junjun; Xiong, Xiong; He, Feng; Zhang, Wei

    2017-04-01

    The stock price fluctuation is studied in this paper with intrinsic time perspective. The event, directional change (DC) or overshoot, are considered as time scale of price time series. With this directional change law, its corresponding statistical properties and parameter estimation is tested in Chinese stock market. Furthermore, a directional change trading strategy is proposed for invest in the market portfolio in Chinese stock market, and both in-sample and out-of-sample performance are compared among the different method of model parameter estimation. We conclude that DC method can capture important fluctuations in Chinese stock market and gain profit due to the statistical property that average upturn overshoot size is bigger than average downturn directional change size. The optimal parameter of DC method is not fixed and we obtained 1.8% annual excess return with this DC-based trading strategy.

  15. Complex disease and phenotype mapping in the domestic dog

    PubMed Central

    Hayward, Jessica J.; Castelhano, Marta G.; Oliveira, Kyle C.; Corey, Elizabeth; Balkman, Cheryl; Baxter, Tara L.; Casal, Margret L.; Center, Sharon A.; Fang, Meiying; Garrison, Susan J.; Kalla, Sara E.; Korniliev, Pavel; Kotlikoff, Michael I.; Moise, N. S.; Shannon, Laura M.; Simpson, Kenneth W.; Sutter, Nathan B.; Todhunter, Rory J.; Boyko, Adam R.

    2016-01-01

    The domestic dog is becoming an increasingly valuable model species in medical genetics, showing particular promise to advance our understanding of cancer and orthopaedic disease. Here we undertake the largest canine genome-wide association study to date, with a panel of over 4,200 dogs genotyped at 180,000 markers, to accelerate mapping efforts. For complex diseases, we identify loci significantly associated with hip dysplasia, elbow dysplasia, idiopathic epilepsy, lymphoma, mast cell tumour and granulomatous colitis; for morphological traits, we report three novel quantitative trait loci that influence body size and one that influences fur length and shedding. Using simulation studies, we show that modestly larger sample sizes and denser marker sets will be sufficient to identify most moderate- to large-effect complex disease loci. This proposed design will enable efficient mapping of canine complex diseases, most of which have human homologues, using far fewer samples than required in human studies. PMID:26795439

  16. Multiple defocused coherent diffraction imaging: method for simultaneously reconstructing objects and probe using X-ray free-electron lasers.

    PubMed

    Hirose, Makoto; Shimomura, Kei; Suzuki, Akihiro; Burdet, Nicolas; Takahashi, Yukio

    2016-05-30

    The sample size must be less than the diffraction-limited focal spot size of the incident beam in single-shot coherent X-ray diffraction imaging (CXDI) based on a diffract-before-destruction scheme using X-ray free electron lasers (XFELs). This is currently a major limitation preventing its wider applications. We here propose multiple defocused CXDI, in which isolated objects are sequentially illuminated with a divergent beam larger than the objects and the coherent diffraction pattern of each object is recorded. This method can simultaneously reconstruct both objects and a probe from the coherent X-ray diffraction patterns without any a priori knowledge. We performed a computer simulation of the prposed method and then successfully demonstrated it in a proof-of-principle experiment at SPring-8. The prposed method allows us to not only observe broad samples but also characterize focused XFEL beams.

  17. Burning rate for steel-cased, pressed binderless HMX

    NASA Technical Reports Server (NTRS)

    Fifer, R. A.; Cole, J. E.

    1980-01-01

    The burning behavior of pressed binderless HMX laterally confined in 6.4 mm i.d. steel cases was measured over the pressure range 1.45 to 338 MPa in a constant pressure strand burner. The measured regression rates are compared to those reported previously for unconfined samples. It is shown that lateral confinement results in a several-fold decrease in the regression rate for the coarse particle size HMX above the transition to super fast regression. For class E samples, confinement shifts the transition to super fast regression from low pressure to high pressure. These results are interpreted in terms of the previously proposed progressive deconsolidation mechanism. Preliminary holographic photography and closed bomb tests are also described. Theoretical one dimensional modeling calculations were carried out to predict the expected flame height (particle burn out distance) as a function of particle size and pressure for binderless HMX burning by a progressive deconsolidation mechanism.

  18. Detecting a Weak Association by Testing its Multiple Perturbations: a Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Lo, Min-Tzu; Lee, Wen-Chung

    2014-05-01

    Many risk factors/interventions in epidemiologic/biomedical studies are of minuscule effects. To detect such weak associations, one needs a study with a very large sample size (the number of subjects, n). The n of a study can be increased but unfortunately only to an extent. Here, we propose a novel method which hinges on increasing sample size in a different direction-the total number of variables (p). We construct a p-based `multiple perturbation test', and conduct power calculations and computer simulations to show that it can achieve a very high power to detect weak associations when p can be made very large. As a demonstration, we apply the method to analyze a genome-wide association study on age-related macular degeneration and identify two novel genetic variants that are significantly associated with the disease. The p-based method may set a stage for a new paradigm of statistical tests.

  19. Multilayered nonuniform sampling for three-dimensional scene representation

    NASA Astrophysics Data System (ADS)

    Lin, Huei-Yung; Xiao, Yu-Hua; Chen, Bo-Ren

    2015-09-01

    The representation of a three-dimensional (3-D) scene is essential in multiview imaging technologies. We present a unified geometry and texture representation based on global resampling of the scene. A layered data map representation with a distance-dependent nonuniform sampling strategy is proposed. It is capable of increasing the details of the 3-D structure locally and is compact in size. The 3-D point cloud obtained from the multilayered data map is used for view rendering. For any given viewpoint, image synthesis with different levels of detail is carried out using the quadtree-based nonuniformly sampled 3-D data points. Experimental results are presented using the 3-D models of reconstructed real objects.

  20. Local reconstruction in computed tomography of diffraction enhanced imaging

    NASA Astrophysics Data System (ADS)

    Huang, Zhi-Feng; Zhang, Li; Kang, Ke-Jun; Chen, Zhi-Qiang; Zhu, Pei-Ping; Yuan, Qing-Xi; Huang, Wan-Xia

    2007-07-01

    Computed tomography of diffraction enhanced imaging (DEI-CT) based on synchrotron radiation source has extremely high sensitivity of weakly absorbing low-Z samples in medical and biological fields. The authors propose a modified backprojection filtration(BPF)-type algorithm based on PI-line segments to reconstruct region of interest from truncated refraction-angle projection data in DEI-CT. The distribution of refractive index decrement in the sample can be directly estimated from its reconstruction images, which has been proved by experiments at the Beijing Synchrotron Radiation Facility. The algorithm paves the way for local reconstruction of large-size samples by the use of DEI-CT with small field of view based on synchrotron radiation source.

  1. Sample Size Calculations for Micro-randomized Trials in mHealth

    PubMed Central

    Liao, Peng; Klasnja, Predrag; Tewari, Ambuj; Murphy, Susan A.

    2015-01-01

    The use and development of mobile interventions are experiencing rapid growth. In “just-in-time” mobile interventions, treatments are provided via a mobile device and they are intended to help an individual make healthy decisions “in the moment,” and thus have a proximal, near future impact. Currently the development of mobile interventions is proceeding at a much faster pace than that of associated data science methods. A first step toward developing data-based methods is to provide an experimental design for testing the proximal effects of these just-in-time treatments. In this paper, we propose a “micro-randomized” trial design for this purpose. In a micro-randomized trial, treatments are sequentially randomized throughout the conduct of the study, with the result that each participant may be randomized at the 100s or 1000s of occasions at which a treatment might be provided. Further, we develop a test statistic for assessing the proximal effect of a treatment as well as an associated sample size calculator. We conduct simulation evaluations of the sample size calculator in various settings. Rules of thumb that might be used in designing a micro-randomized trial are discussed. This work is motivated by our collaboration on the HeartSteps mobile application designed to increase physical activity. PMID:26707831

  2. Ferromagnetic linewidth measurements employing electrodynamic model of the magnetic plasmon resonance

    NASA Astrophysics Data System (ADS)

    Krupka, Jerzy; Aleshkevych, Pavlo; Salski, Bartlomiej; Kopyt, Pawel

    2018-02-01

    The mode of uniform precession, or Kittel mode, in a magnetized ferromagnetic sphere, has recently been proven to be the magnetic plasmon resonance. In this paper we show how to apply the electrodynamic model of the magnetic plasmon resonance for accurate measurements of the ferromagnetic resonance linewidth ΔH. Two measurement methods are presented. The first one employs Q-factor measurements of the magnetic plasmon resonance coupled to the resonance of an empty metallic cavity. Such coupled modes are known as magnon-polariton modes, i.e. hybridized modes between the collective spin excitation and the cavity excitation. The second one employs direct Q-factor measurements of the magnetic plasmon resonance in a filter setup with two orthogonal semi-loops used for coupling. Q-factor measurements are performed employing a vector network analyser. The methods presented in this paper allow one to extend the measurement range of the ferromagnetic resonance linewidth ΔH well beyond the limits of the commonly used measurement standards in terms of the size of the samples and the lowest measurable linewidths. Samples that can be measured with the newly proposed methods may have larger size as compared to the size of samples that were used in the standard methods restricted by the limits of perturbation theory.

  3. DRME: Count-based differential RNA methylation analysis at small sample size scenario.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Gao, Fan; Zhang, Yixin; Huang, Yufei; Chen, Runsheng; Meng, Jia

    2016-04-15

    Differential methylation, which concerns difference in the degree of epigenetic regulation via methylation between two conditions, has been formulated as a beta or beta-binomial distribution to address the within-group biological variability in sequencing data. However, a beta or beta-binomial model is usually difficult to infer at small sample size scenario with discrete reads count in sequencing data. On the other hand, as an emerging research field, RNA methylation has drawn more and more attention recently, and the differential analysis of RNA methylation is significantly different from that of DNA methylation due to the impact of transcriptional regulation. We developed DRME to better address the differential RNA methylation problem. The proposed model can effectively describe within-group biological variability at small sample size scenario and handles the impact of transcriptional regulation on RNA methylation. We tested the newly developed DRME algorithm on simulated and 4 MeRIP-Seq case-control studies and compared it with Fisher's exact test. It is in principle widely applicable to several other RNA-related data types as well, including RNA Bisulfite sequencing and PAR-CLIP. The code together with an MeRIP-Seq dataset is available online (https://github.com/lzcyzm/DRME) for evaluation and reproduction of the figures shown in this article. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Kinematic measurement from panned cinematography.

    PubMed

    Gervais, P; Bedingfield, E W; Wronko, C; Kollias, I; Marchiori, G; Kuntz, J; Way, N; Kuiper, D

    1989-06-01

    Traditional 2-D cinematography has used a stationary camera with its optical axis perpendicular to the plane of motion. This method has constrained the size of the object plane or has introduced potential errors from a small subject image size with large object field widths. The purpose of this study was to assess a panning technique that could overcome the inherent limitations of small object field widths, small object image sizes and limited movement samples. The proposed technique used a series of reference targets in the object field that provided the necessary scales and origin translations. A 102 m object field was panned. Comparisons between criterion distances and film measured distances for field widths of 46 m and 22 m resulted in absolute mean differences that were comparable to that of the traditional method.

  5. Breast Reference Set Application: Chris Li-FHCRC (2014) — EDRN Public Portal

    Cancer.gov

    This application proposes to use Reference Set #1. We request access to serum samples collected at the time of breast biopsy from subjects with IC (n=30) or benign disease without atypia (n=30). Statistical power: With 30 BC cases and 30 normal controls, a 25% difference in mean metabolite levels can be detected between groups with 80% power and α=0.05, assuming coefficients of variation of 30%, consistent with our past studies. These sample sizes appear sufficient to enable detection of changes similar in magnitude to those previously reported in pre-clinical (BC recurrence) specimens (20).

  6. Combustion Product Evaluation of Various Charge Sizes and Propellant Formulations. Task 1 Report: Sampling and Analytical Procedures Proposed for Use in Gun Propellant Combustion Product Characterization

    DTIC Science & Technology

    1988-01-01

    ml, 0.01N, NaOH + 1 ml 30% H202 H2S 10 ml. 0.01M CdSO4 + STRactan 10 in H2 Aldehydes 10 ml of 3.1 WM/ml 2.4- dinitrophenylhydrazine (DNPH) soln. in...3 15 IITRI C06673-TR1 I,ý 3. 3.1.2.6 Aldehydes I After sampling, the 11 flasks will be injected with 10 mi of a 3.1 14/mX -2,4- dinitrophenylhydrazine

  7. Tests for informative cluster size using a novel balanced bootstrap scheme.

    PubMed

    Nevalainen, Jaakko; Oja, Hannu; Datta, Somnath

    2017-07-20

    Clustered data are often encountered in biomedical studies, and to date, a number of approaches have been proposed to analyze such data. However, the phenomenon of informative cluster size (ICS) is a challenging problem, and its presence has an impact on the choice of a correct analysis methodology. For example, Dutta and Datta (2015, Biometrics) presented a number of marginal distributions that could be tested. Depending on the nature and degree of informativeness of the cluster size, these marginal distributions may differ, as do the choices of the appropriate test. In particular, they applied their new test to a periodontal data set where the plausibility of the informativeness was mentioned, but no formal test for the same was conducted. We propose bootstrap tests for testing the presence of ICS. A balanced bootstrap method is developed to successfully estimate the null distribution by merging the re-sampled observations with closely matching counterparts. Relying on the assumption of exchangeability within clusters, the proposed procedure performs well in simulations even with a small number of clusters, at different distributions and against different alternative hypotheses, thus making it an omnibus test. We also explain how to extend the ICS test to a regression setting and thereby enhancing its practical utility. The methodologies are illustrated using the periodontal data set mentioned earlier. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Cluster randomised crossover trials with binary data and unbalanced cluster sizes: application to studies of near-universal interventions in intensive care.

    PubMed

    Forbes, Andrew B; Akram, Muhammad; Pilcher, David; Cooper, Jamie; Bellomo, Rinaldo

    2015-02-01

    Cluster randomised crossover trials have been utilised in recent years in the health and social sciences. Methods for analysis have been proposed; however, for binary outcomes, these have received little assessment of their appropriateness. In addition, methods for determination of sample size are currently limited to balanced cluster sizes both between clusters and between periods within clusters. This article aims to extend this work to unbalanced situations and to evaluate the properties of a variety of methods for analysis of binary data, with a particular focus on the setting of potential trials of near-universal interventions in intensive care to reduce in-hospital mortality. We derive a formula for sample size estimation for unbalanced cluster sizes, and apply it to the intensive care setting to demonstrate the utility of the cluster crossover design. We conduct a numerical simulation of the design in the intensive care setting and for more general configurations, and we assess the performance of three cluster summary estimators and an individual-data estimator based on binomial-identity-link regression. For settings similar to the intensive care scenario involving large cluster sizes and small intra-cluster correlations, the sample size formulae developed and analysis methods investigated are found to be appropriate, with the unweighted cluster summary method performing well relative to the more optimal but more complex inverse-variance weighted method. More generally, we find that the unweighted and cluster-size-weighted summary methods perform well, with the relative efficiency of each largely determined systematically from the study design parameters. Performance of individual-data regression is adequate with small cluster sizes but becomes inefficient for large, unbalanced cluster sizes. When outcome prevalences are 6% or less and the within-cluster-within-period correlation is 0.05 or larger, all methods display sub-nominal confidence interval coverage, with the less prevalent the outcome the worse the coverage. As with all simulation studies, conclusions are limited to the configurations studied. We confined attention to detecting intervention effects on an absolute risk scale using marginal models and did not explore properties of binary random effects models. Cluster crossover designs with binary outcomes can be analysed using simple cluster summary methods, and sample size in unbalanced cluster size settings can be determined using relatively straightforward formulae. However, caution needs to be applied in situations with low prevalence outcomes and moderate to high intra-cluster correlations. © The Author(s) 2014.

  9. Speckle pattern sequential extraction metric for estimating the focus spot size on a remote diffuse target.

    PubMed

    Yu, Zhan; Li, Yuanyang; Liu, Lisheng; Guo, Jin; Wang, Tingfeng; Yang, Guoqing

    2017-11-10

    The speckle pattern (line by line) sequential extraction (SPSE) metric is proposed by the one-dimensional speckle intensity level crossing theory. Through the sequential extraction of received speckle information, the speckle metrics for estimating the variation of focusing spot size on a remote diffuse target are obtained. Based on the simulation, we will give some discussions about the SPSE metric range of application under the theoretical conditions, and the aperture size will affect the metric performance of the observation system. The results of the analyses are verified by the experiment. This method is applied to the detection of relative static target (speckled jitter frequency is less than the CCD sampling frequency). The SPSE metric can determine the variation of the focusing spot size over a long distance, moreover, the metric will estimate the spot size under some conditions. Therefore, the monitoring and the feedback of far-field spot will be implemented laser focusing system applications and help the system to optimize the focusing performance.

  10. [Application of asymmetrical flow field-flow fractionation for size characterization of low density lipoprotein in egg yolk plasma].

    PubMed

    Zhang, Wenhui; Cai, Chunxue; Wang, Jing; Mao, Zhen; Li, Yueqiu; Ding, Liang; Shen, Shigang; Dou, Haiyang

    2017-08-08

    Home-made asymmetrical flow field-flow fractionation (AF4) system, online coupled with ultraviolet/visible (UV/Vis) detector was employed for the separation and size characterization of low density lipoprotein (LDL) in egg yolk plasma. At close to natural condition of egg yolk, the effects of cross flow rate, sample loading, and type of membrane on the size distribution of LDL were investigated. Under the optimal operation conditions, AF4-UV/Vis provides the size distribution of LDL. Moreover, the precision of AF4-UV/Vis method proposed in this work for the analysis of LDL in egg yolk plasma was evaluated. The intra-day precisions were 1.3% and 1.9% ( n =7) and the inter-day precisions were 2.4% and 2.3% ( n =7) for the elution peak height and elution peak area of LDL, respectively. Results reveal that AF4-UV/Vis is a useful tool for the separation and size characterization of LDL in egg yolk plasma.

  11. A computational framework for estimating statistical power and planning hypothesis-driven experiments involving one-dimensional biomechanical continua.

    PubMed

    Pataky, Todd C; Robinson, Mark A; Vanrenterghem, Jos

    2018-01-03

    Statistical power assessment is an important component of hypothesis-driven research but until relatively recently (mid-1990s) no methods were available for assessing power in experiments involving continuum data and in particular those involving one-dimensional (1D) time series. The purpose of this study was to describe how continuum-level power analyses can be used to plan hypothesis-driven biomechanics experiments involving 1D data. In particular, we demonstrate how theory- and pilot-driven 1D effect modeling can be used for sample-size calculations for both single- and multi-subject experiments. For theory-driven power analysis we use the minimum jerk hypothesis and single-subject experiments involving straight-line, planar reaching. For pilot-driven power analysis we use a previously published knee kinematics dataset. Results show that powers on the order of 0.8 can be achieved with relatively small sample sizes, five and ten for within-subject minimum jerk analysis and between-subject knee kinematics, respectively. However, the appropriate sample size depends on a priori justifications of biomechanical meaning and effect size. The main advantage of the proposed technique is that it encourages a priori justification regarding the clinical and/or scientific meaning of particular 1D effects, thereby robustly structuring subsequent experimental inquiry. In short, it shifts focus from a search for significance to a search for non-rejectable hypotheses. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Nonparametric analysis of bivariate gap time with competing risks.

    PubMed

    Huang, Chiung-Yu; Wang, Chenguang; Wang, Mei-Cheng

    2016-09-01

    This article considers nonparametric methods for studying recurrent disease and death with competing risks. We first point out that comparisons based on the well-known cumulative incidence function can be confounded by different prevalence rates of the competing events, and that comparisons of the conditional distribution of the survival time given the failure event type are more relevant for investigating the prognosis of different patterns of recurrence disease. We then propose nonparametric estimators for the conditional cumulative incidence function as well as the conditional bivariate cumulative incidence function for the bivariate gap times, that is, the time to disease recurrence and the residual lifetime after recurrence. To quantify the association between the two gap times in the competing risks setting, a modified Kendall's tau statistic is proposed. The proposed estimators for the conditional bivariate cumulative incidence distribution and the association measure account for the induced dependent censoring for the second gap time. Uniform consistency and weak convergence of the proposed estimators are established. Hypothesis testing procedures for two-sample comparisons are discussed. Numerical simulation studies with practical sample sizes are conducted to evaluate the performance of the proposed nonparametric estimators and tests. An application to data from a pancreatic cancer study is presented to illustrate the methods developed in this article. © 2016, The International Biometric Society.

  13. An ensemble predictive modeling framework for breast cancer classification.

    PubMed

    Nagarajan, Radhakrishnan; Upreti, Meenakshi

    2017-12-01

    Molecular changes often precede clinical presentation of diseases and can be useful surrogates with potential to assist in informed clinical decision making. Recent studies have demonstrated the usefulness of modeling approaches such as classification that can predict the clinical outcomes from molecular expression profiles. While useful, a majority of these approaches implicitly use all molecular markers as features in the classification process often resulting in sparse high-dimensional projection of the samples often comparable to that of the sample size. In this study, a variant of the recently proposed ensemble classification approach is used for predicting good and poor-prognosis breast cancer samples from their molecular expression profiles. In contrast to traditional single and ensemble classifiers, the proposed approach uses multiple base classifiers with varying feature sets obtained from two-dimensional projection of the samples in conjunction with a majority voting strategy for predicting the class labels. In contrast to our earlier implementation, base classifiers in the ensembles are chosen based on maximal sensitivity and minimal redundancy by choosing only those with low average cosine distance. The resulting ensemble sets are subsequently modeled as undirected graphs. Performance of four different classification algorithms is shown to be better within the proposed ensemble framework in contrast to using them as traditional single classifier systems. Significance of a subset of genes with high-degree centrality in the network abstractions across the poor-prognosis samples is also discussed. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Analytic Methods for Adjusting Subjective Rating Schemes

    DTIC Science & Technology

    1976-06-01

    individual performance. The approach developed here is a variant of the classical linear regression model. Specifically, it la proposed that...values of y and X. Moreover, this difference la gener- ally independent of sample size, so that LS estimates are different from ML estimates at...baervationa. H^ever, aa T. -. - ,„ aU . th(. Hit (4.10) la aatlafled, and EKV and ML eatlnatea are equlvalent A practical proble, in applying

  15. Integrated Micro-Optics for Microfluidic Detection.

    PubMed

    Kazama, Yuto; Hibara, Akihide

    2016-01-01

    A method of embedding micro-optics into a microfluidic device was proposed and demonstrated. First, the usefulness of embedded right-angle prisms was demonstrated in microscope observation. Lateral-view microscopic observation of an aqueous dye flow in a 100-μm-sized microchannel was demonstrated. Then, the embedded right-angle prisms were utilized for multi-beam laser spectroscopy. Here, crossed-beam thermal lens detection of a liquid sample was applied to glucose detection.

  16. Consistent Alignment of World Embedding Models

    DTIC Science & Technology

    2017-03-02

    propose a solution that aligns variations of the same model (or different models) in a joint low-dimensional la- tent space leveraging carefully...representations of linguistic enti- ties, most often referred to as embeddings. This includes techniques that rely on matrix factoriza- tion (Levy & Goldberg ...higher, the variation is much higher as well. As we increase the size of the neighborhood, or improve the quality of our sample by only picking the most

  17. Bayesian selective response-adaptive design using the historical control.

    PubMed

    Kim, Mi-Ok; Harun, Nusrat; Liu, Chunyan; Khoury, Jane C; Broderick, Joseph P

    2018-06-13

    High quality historical control data, if incorporated, may reduce sample size, trial cost, and duration. A too optimistic use of the data, however, may result in bias under prior-data conflict. Motivated by well-publicized two-arm comparative trials in stroke, we propose a Bayesian design that both adaptively incorporates historical control data and selectively adapt the treatment allocation ratios within an ongoing trial responsively to the relative treatment effects. The proposed design differs from existing designs that borrow from historical controls. As opposed to reducing the number of subjects assigned to the control arm blindly, this design does so adaptively to the relative treatment effects only if evaluation of cumulated current trial data combined with the historical control suggests the superiority of the intervention arm. We used the effective historical sample size approach to quantify borrowed information on the control arm and modified the treatment allocation rules of the doubly adaptive biased coin design to incorporate the quantity. The modified allocation rules were then implemented under the Bayesian framework with commensurate priors addressing prior-data conflict. Trials were also more frequently concluded earlier in line with the underlying truth, reducing trial cost, and duration and yielded parameter estimates with smaller standard errors. © 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons, Ltd.

  18. Two-stage phase II oncology designs using short-term endpoints for early stopping.

    PubMed

    Kunz, Cornelia U; Wason, James Ms; Kieser, Meinhard

    2017-08-01

    Phase II oncology trials are conducted to evaluate whether the tumour activity of a new treatment is promising enough to warrant further investigation. The most commonly used approach in this context is a two-stage single-arm design with binary endpoint. As for all designs with interim analysis, its efficiency strongly depends on the relation between recruitment rate and follow-up time required to measure the patients' outcomes. Usually, recruitment is postponed after the sample size of the first stage is achieved up until the outcomes of all patients are available. This may lead to a considerable increase of the trial length and with it to a delay in the drug development process. We propose a design where an intermediate endpoint is used in the interim analysis to decide whether or not the study is continued with a second stage. Optimal and minimax versions of this design are derived. The characteristics of the proposed design in terms of type I error rate, power, maximum and expected sample size as well as trial duration are investigated. Guidance is given on how to select the most appropriate design. Application is illustrated by a phase II oncology trial in patients with advanced angiosarcoma, which motivated this research.

  19. Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics

    PubMed Central

    Lin, Wei; Feng, Rui; Li, Hongzhe

    2014-01-01

    In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be much larger than the sample size. Motivated by such modern applications, we consider the problem of variable selection and estimation in high-dimensional sparse instrumental variables models. To overcome the difficulty of high dimensionality and unknown optimal instruments, we propose a two-stage regularization framework for identifying and estimating important covariate effects while selecting and estimating optimal instruments. The methodology extends the classical two-stage least squares estimator to high dimensions by exploiting sparsity using sparsity-inducing penalty functions in both stages. The resulting procedure is efficiently implemented by coordinate descent optimization. For the representative L1 regularization and a class of concave regularization methods, we establish estimation, prediction, and model selection properties of the two-stage regularized estimators in the high-dimensional setting where the dimensionality of co-variates and instruments are both allowed to grow exponentially with the sample size. The practical performance of the proposed method is evaluated by simulation studies and its usefulness is illustrated by an analysis of mouse obesity data. Supplementary materials for this article are available online. PMID:26392642

  20. An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies.

    PubMed

    Dai, Hongying; Wu, Guodong; Wu, Michael; Zhi, Degui

    2016-01-01

    Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker-single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency,[Formula: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e. ε →0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Formula: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.

  1. An approach for sample size determination of average bioequivalence based on interval estimation.

    PubMed

    Chiang, Chieh; Hsiao, Chin-Fu

    2017-03-30

    In 1992, the US Food and Drug Administration declared that two drugs demonstrate average bioequivalence (ABE) if the log-transformed mean difference of pharmacokinetic responses lies in (-0.223, 0.223). The most widely used approach for assessing ABE is the two one-sided tests procedure. More specifically, ABE is concluded when a 100(1 - 2α) % confidence interval for mean difference falls within (-0.223, 0.223). As known, bioequivalent studies are usually conducted by crossover design. However, in the case that the half-life of a drug is long, a parallel design for the bioequivalent study may be preferred. In this study, a two-sided interval estimation - such as Satterthwaite's, Cochran-Cox's, or Howe's approximations - is used for assessing parallel ABE. We show that the asymptotic joint distribution of the lower and upper confidence limits is bivariate normal, and thus the sample size can be calculated based on the asymptotic power so that the confidence interval falls within (-0.223, 0.223). Simulation studies also show that the proposed method achieves sufficient empirical power. A real example is provided to illustrate the proposed method. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets.

    PubMed

    Nan, Shengyu; Sun, Lei; Chen, Badong; Lin, Zhiping; Toh, Kar-Ann

    2017-01-01

    Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nyström method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.

  3. Sampling design for the 1980 commercial and multifamily residential building survey

    NASA Astrophysics Data System (ADS)

    Bowen, W. M.; Olsen, A. R.; Nieves, A. L.

    1981-06-01

    The extent to which new building design practices comply with the proposed 1980 energy budget levels for commercial and multifamily residential building designs (DEB-80) can be assessed by: (1) identifying small number of building types which account for the majority of commercial buildings constructed in the U.S.A.; (2) conducting a separate survey for each building type; and (3) including only buildings designed during 1980. For each building, the design energy consumption (DEC-80) will be determined by the DOE2.1 computer program. The quantity X = (DEC-80 - DEB-80). These X quantities can then be used to compute sample statistics. Inferences about nationwide compliance with DEB-80 may then be made for each building type. Details of the population, sampling frame, stratification, sample size, and implementation of the sampling plan are provided.

  4. Chi-Squared Test of Fit and Sample Size-A Comparison between a Random Sample Approach and a Chi-Square Value Adjustment Method.

    PubMed

    Bergh, Daniel

    2015-01-01

    Chi-square statistics are commonly used for tests of fit of measurement models. Chi-square is also sensitive to sample size, which is why several approaches to handle large samples in test of fit analysis have been developed. One strategy to handle the sample size problem may be to adjust the sample size in the analysis of fit. An alternative is to adopt a random sample approach. The purpose of this study was to analyze and to compare these two strategies using simulated data. Given an original sample size of 21,000, for reductions of sample sizes down to the order of 5,000 the adjusted sample size function works as good as the random sample approach. In contrast, when applying adjustments to sample sizes of lower order the adjustment function is less effective at approximating the chi-square value for an actual random sample of the relevant size. Hence, the fit is exaggerated and misfit under-estimated using the adjusted sample size function. Although there are big differences in chi-square values between the two approaches at lower sample sizes, the inferences based on the p-values may be the same.

  5. Gravel-Sand-Clay Mixture Model for Predictions of Permeability and Velocity of Unconsolidated Sediments

    NASA Astrophysics Data System (ADS)

    Konishi, C.

    2014-12-01

    Gravel-sand-clay mixture model is proposed particularly for unconsolidated sediments to predict permeability and velocity from volume fractions of the three components (i.e. gravel, sand, and clay). A well-known sand-clay mixture model or bimodal mixture model treats clay contents as volume fraction of the small particle and the rest of the volume is considered as that of the large particle. This simple approach has been commonly accepted and has validated by many studies before. However, a collection of laboratory measurements of permeability and grain size distribution for unconsolidated samples show an impact of presence of another large particle; i.e. only a few percent of gravel particles increases the permeability of the sample significantly. This observation cannot be explained by the bimodal mixture model and it suggests the necessity of considering the gravel-sand-clay mixture model. In the proposed model, I consider the three volume fractions of each component instead of using only the clay contents. Sand becomes either larger or smaller particles in the three component mixture model, whereas it is always the large particle in the bimodal mixture model. The total porosity of the two cases, one is the case that the sand is smaller particle and the other is the case that the sand is larger particle, can be modeled independently from sand volume fraction by the same fashion in the bimodal model. However, the two cases can co-exist in one sample; thus, the total porosity of the mixed sample is calculated by weighted average of the two cases by the volume fractions of gravel and clay. The effective porosity is distinguished from the total porosity assuming that the porosity associated with clay is zero effective porosity. In addition, effective grain size can be computed from the volume fractions and representative grain sizes for each component. Using the effective porosity and the effective grain size, the permeability is predicted by Kozeny-Carman equation. Furthermore, elastic properties are obtainable by general Hashin-Shtrikman-Walpole bounds. The predicted results by this new mixture model are qualitatively consistent with laboratory measurements and well log obtained for unconsolidated sediments. Acknowledgement: A part of this study was accomplished with a subsidy of River Environment Fund of Japan.

  6. Epistemic belief structures within introductory astronomy

    NASA Astrophysics Data System (ADS)

    Johnson, Keith; Willoughby, Shannon D.

    2018-06-01

    The reliability and validity of inventories should be verified in multiple ways. Although the epistemological beliefs about the physical science survey (EBAPS) has been deemed to be reliable and valid by the authors, the axes or factor structure proposed by the authors has not been independently checked. Using data from a study sample we discussed in previous publications, we performed exploratory factor analysis on 1,258 post-test EBAPS surveys. The students in the sample were from an introductory Astronomy course at a mid-sized western university. Inspection suggested the use of either a three-factor model or a five-factor model. Each of the factors is interpreted and discussed, and the factors are compared to the axes proposed by the authors of the EBAPS. We find that the five-factor model extrapolated from our data partially overlaps with the model put forth by the authors of the EBAPS, and that many of the questions did not load onto any factors.

  7. On the degrees of freedom of reduced-rank estimators in multivariate regression

    PubMed Central

    Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.

    2015-01-01

    Summary We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155

  8. Nanometer-sized ceria-coated silica-iron oxide for the reagentless microextraction/preconcentration of heavy metals in environmental and biological samples followed by slurry introduction to ICP-OES.

    PubMed

    Dados, A; Paparizou, E; Eleftheriou, P; Papastephanou, C; Stalikas, C D

    2014-04-01

    A slurry suspension sampling technique is developed and optimized for the rapid microextraction of heavy metals and analysis using nanometer-sized ceria-coated silica-iron oxide particles and inductively coupled plasma optical emission spectrometry (ICP-OES). Magnetic-silica material is synthesized by a co-precipitation and sol-gel method followed by ceria coating through a precipitation. The large particles are removed using a sedimentation-fractionation procedure and a magnetic homogeneous colloidal suspension of ceria-modified iron oxide-silica is produced for microextraction. The nanometer-sized particles are separated from the sample solution magnetically and analyzed with ICP-OES using a slurry suspension sampling approach. The ceria-modified iron oxide-silica does not contain any organic matter and this probably justifies the absence of matrix effect on plasma atomization capacity, when increased concentrations of slurries are aspirated. The As, Be, Mo, Cr, Cu, Pb, Hg, Sb, Se and V can be preconcentrated by the proposed method at pH 6.0 while Mn, Cd, Co and Ni require a pH ≥ 8.0. Satisfactory values are obtained for the relative standard deviations (2-6%), recoveries (88-102%), enrichment factors (14-19) and regression correlation coefficients as well as detectability, at sub-μg L(-1) levels. The applicability of magnetic ceria for the microextraction of metal ions in combination with the slurry introduction technique using ICP is substantiated by the analysis of environmental water and urine samples. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Accounting for parameter uncertainty in the definition of parametric distributions used to describe individual patient variation in health economic models.

    PubMed

    Degeling, Koen; IJzerman, Maarten J; Koopman, Miriam; Koffijberg, Hendrik

    2017-12-15

    Parametric distributions based on individual patient data can be used to represent both stochastic and parameter uncertainty. Although general guidance is available on how parameter uncertainty should be accounted for in probabilistic sensitivity analysis, there is no comprehensive guidance on reflecting parameter uncertainty in the (correlated) parameters of distributions used to represent stochastic uncertainty in patient-level models. This study aims to provide this guidance by proposing appropriate methods and illustrating the impact of this uncertainty on modeling outcomes. Two approaches, 1) using non-parametric bootstrapping and 2) using multivariate Normal distributions, were applied in a simulation and case study. The approaches were compared based on point-estimates and distributions of time-to-event and health economic outcomes. To assess sample size impact on the uncertainty in these outcomes, sample size was varied in the simulation study and subgroup analyses were performed for the case-study. Accounting for parameter uncertainty in distributions that reflect stochastic uncertainty substantially increased the uncertainty surrounding health economic outcomes, illustrated by larger confidence ellipses surrounding the cost-effectiveness point-estimates and different cost-effectiveness acceptability curves. Although both approaches performed similar for larger sample sizes (i.e. n = 500), the second approach was more sensitive to extreme values for small sample sizes (i.e. n = 25), yielding infeasible modeling outcomes. Modelers should be aware that parameter uncertainty in distributions used to describe stochastic uncertainty needs to be reflected in probabilistic sensitivity analysis, as it could substantially impact the total amount of uncertainty surrounding health economic outcomes. If feasible, the bootstrap approach is recommended to account for this uncertainty.

  10. The impact of sample non-normality on ANOVA and alternative methods.

    PubMed

    Lantz, Björn

    2013-05-01

    In this journal, Zimmerman (2004, 2011) has discussed preliminary tests that researchers often use to choose an appropriate method for comparing locations when the assumption of normality is doubtful. The conceptual problem with this approach is that such a two-stage process makes both the power and the significance of the entire procedure uncertain, as type I and type II errors are possible at both stages. A type I error at the first stage, for example, will obviously increase the probability of a type II error at the second stage. Based on the idea of Schmider et al. (2010), which proposes that simulated sets of sample data be ranked with respect to their degree of normality, this paper investigates the relationship between population non-normality and sample non-normality with respect to the performance of the ANOVA, Brown-Forsythe test, Welch test, and Kruskal-Wallis test when used with different distributions, sample sizes, and effect sizes. The overall conclusion is that the Kruskal-Wallis test is considerably less sensitive to the degree of sample normality when populations are distinctly non-normal and should therefore be the primary tool used to compare locations when it is known that populations are not at least approximately normal. © 2012 The British Psychological Society.

  11. Determination of the origin and texture of marble artifacts using stable isotopes

    NASA Astrophysics Data System (ADS)

    Dotsika, E.; Poutoukis, D.; Zisi, N.; Psomiadis, D.

    2009-04-01

    For the characterization of marble and the identification of the origin of marble artifacts, samples from several ancient monuments of Greece were analyzed using several techniques: stable isotopes of carbonates (13C, 18O), XRD analysis and optical microscopy, from which information can be obtained on the origin and texture of the marble used for the production of the artifacts. The full range of grain sizes and isotopic signatures that occur in a lot of different quarries has been measured and presented. In a δ13C versus δ18O diagram, the fields corresponding to all known ancient quarries (from Penteli, Cyclades, especially Naxos (Mela, Apol, Apir, Senax), Keros, Paros (Parlak, Parlyc) and Asia Minor (Prokon)) are reported. The plots representing the analyzed samples are also shown on the same diagram. The final results of the study indicate the origin of the carbonate material of the artefacts from each of the ancient monument. In cases that the samples plot on overlapping areas, a further study is proposed, using the maximum grain size of the material.

  12. Fast wettability transition from hydrophilic to superhydrophobic laser-textured stainless steel surfaces under low-temperature annealing

    NASA Astrophysics Data System (ADS)

    Ngo, Chi-Vinh; Chun, Doo-Man

    2017-07-01

    Recently, the fabrication of superhydrophobic metallic surfaces by means of pulsed laser texturing has been developed. After laser texturing, samples are typically chemically coated or aged in ambient air for a relatively long time of several weeks to achieve superhydrophobicity. To accelerate the wettability transition from hydrophilicity to superhydrophobicity without the use of additional chemical treatment, a simple annealing post process has been developed. In the present work, grid patterns were first fabricated on stainless steel by a nanosecond pulsed laser, then an additional low-temperature annealing post process at 100 °C was applied. The effect of 100-500 μm step size of the textured grid upon the wettability transition time was also investigated. The proposed post process reduced the transition time from a couple of months to within several hours. All samples showed superhydrophobicity with contact angles greater than 160° and sliding angles smaller than 10° except samples with 500 μm step size, and could be applied in several potential applications such as self-cleaning and control of water adhesion.

  13. Use of units of measurement error in anthropometric comparisons.

    PubMed

    Lucas, Teghan; Henneberg, Maciej

    2017-09-01

    Anthropometrists attempt to minimise measurement errors, however, errors cannot be eliminated entirely. Currently, measurement errors are simply reported. Measurement errors should be included into analyses of anthropometric data. This study proposes a method which incorporates measurement errors into reported values, replacing metric units with 'units of technical error of measurement (TEM)' by applying these to forensics, industrial anthropometry and biological variation. The USA armed forces anthropometric survey (ANSUR) contains 132 anthropometric dimensions of 3982 individuals. Concepts of duplication and Euclidean distance calculations were applied to the forensic-style identification of individuals in this survey. The National Size and Shape Survey of Australia contains 65 anthropometric measurements of 1265 women. This sample was used to show how a woman's body measurements expressed in TEM could be 'matched' to standard clothing sizes. Euclidean distances show that two sets of repeated anthropometric measurements of the same person cannot be matched (> 0) on measurements expressed in millimetres but can in units of TEM (= 0). Only 81 women can fit into any standard clothing size when matched using centimetres, with units of TEM, 1944 women fit. The proposed method can be applied to all fields that use anthropometry. Units of TEM are considered a more reliable unit of measurement for comparisons.

  14. Low Complexity Compression and Speed Enhancement for Optical Scanning Holography

    PubMed Central

    Tsang, P. W. M.; Poon, T.-C.; Liu, J.-P.; Kim, T.; Kim, Y. S.

    2016-01-01

    In this paper we report a low complexity compression method that is suitable for compact optical scanning holography (OSH) systems with different optical settings. Our proposed method can be divided into 2 major parts. First, an automatic decision maker is applied to select the rows of holographic pixels to be scanned. This process enhances the speed of acquiring a hologram, and also lowers the data rate. Second, each row of down-sampled pixels is converted into a one-bit representation with delta modulation (DM). Existing DM-based hologram compression techniques suffers from the disadvantage that a core parameter, commonly known as the step size, has to be determined in advance. However, the correct value of the step size for compressing each row of hologram is dependent on the dynamic range of the pixels, which could deviate significantly with the object scene, as well as OSH systems with different opical settings. We have overcome this problem by incorporating a dynamic step-size adjustment scheme. The proposed method is applied in the compression of holograms that are acquired with 2 different OSH systems, demonstrating a compression ratio of over two orders of magnitude, while preserving favorable fidelity on the reconstructed images. PMID:27708410

  15. Design of pilot studies to inform the construction of composite outcome measures.

    PubMed

    Edland, Steven D; Ard, M Colin; Li, Weiwei; Jiang, Lingjing

    2017-06-01

    Composite scales have recently been proposed as outcome measures for clinical trials. For example, the Prodromal Alzheimer's Cognitive Composite (PACC) is the sum of z-score normed component measures assessing episodic memory, timed executive function, and global cognition. Alternative methods of calculating composite total scores using the weighted sum of the component measures that maximize signal-to-noise of the resulting composite score have been proposed. Optimal weights can be estimated from pilot data, but it is an open question how large a pilot trial is required to calculate reliably optimal weights. In this manuscript, we describe the calculation of optimal weights, and use large-scale computer simulations to investigate the question of how large a pilot study sample is required to inform the calculation of optimal weights. The simulations are informed by the pattern of decline observed in cognitively normal subjects enrolled in the Alzheimer's Disease Cooperative Study (ADCS) Prevention Instrument cohort study, restricting to n=75 subjects age 75 and over with an ApoE E4 risk allele and therefore likely to have an underlying Alzheimer neurodegenerative process. In the context of secondary prevention trials in Alzheimer's disease, and using the components of the PACC, we found that pilot studies as small as 100 are sufficient to meaningfully inform weighting parameters. Regardless of the pilot study sample size used to inform weights, the optimally weighted PACC consistently outperformed the standard PACC in terms of statistical power to detect treatment effects in a clinical trial. Pilot studies of size 300 produced weights that achieved near-optimal statistical power, and reduced required sample size relative to the standard PACC by more than half. These simulations suggest that modestly sized pilot studies, comparable to that of a phase 2 clinical trial, are sufficient to inform the construction of composite outcome measures. Although these findings apply only to the PACC in the context of prodromal AD, the observation that weights only have to approximate the optimal weights to achieve near-optimal performance should generalize. Performing a pilot study or phase 2 trial to inform the weighting of proposed composite outcome measures is highly cost-effective. The net effect of more efficient outcome measures is that smaller trials will be required to test novel treatments. Alternatively, second generation trials can use prior clinical trial data to inform weighting, so that greater efficiency can be achieved as we move forward.

  16. 75 FR 36737 - Self-Regulatory Organizations; New York Stock Exchange LLC; Notice of Filing and Immediate...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-28

    ... Proposed Rule Change Amending Rule 1000 Regarding Order Size Eligible for Automatic Execution June 22, 2010... proposes to amend Exchange Rule 1000 regarding order size eligible for automatic execution. The text of the... Proposed Rule Change 1. Purpose The Exchange proposes to amend Rule 1000 to state that the order size...

  17. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis. PMID:20011037

  18. Accurate and fast multiple-testing correction in eQTL studies.

    PubMed

    Sul, Jae Hoon; Raj, Towfique; de Jong, Simone; de Bakker, Paul I W; Raychaudhuri, Soumya; Ophoff, Roel A; Stranger, Barbara E; Eskin, Eleazar; Han, Buhm

    2015-06-04

    In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. For performing multiple-testing correction, a permutation test is widely used. Because of growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck in eQTL studies. In this paper, we propose an efficient approach for correcting for multiple testing and assess eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  19. Multipinhole SPECT helical scan parameters and imaging volume

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yao, Rutao, E-mail: rutaoyao@buffalo.edu; Deng, Xiao; Wei, Qingyang

    Purpose: The authors developed SPECT imaging capability on an animal PET scanner using a multiple-pinhole collimator and step-and-shoot helical data acquisition protocols. The objective of this work was to determine the preferred helical scan parameters, i.e., the angular and axial step sizes, and the imaging volume, that provide optimal imaging performance. Methods: The authors studied nine helical scan protocols formed by permuting three rotational and three axial step sizes. These step sizes were chosen around the reference values analytically calculated from the estimated spatial resolution of the SPECT system and the Nyquist sampling theorem. The nine helical protocols were evaluatedmore » by two figures-of-merit: the sampling completeness percentage (SCP) and the root-mean-square (RMS) resolution. SCP was an analytically calculated numerical index based on projection sampling. RMS resolution was derived from the reconstructed images of a sphere-grid phantom. Results: The RMS resolution results show that (1) the start and end pinhole planes of the helical scheme determine the axial extent of the effective field of view (EFOV), and (2) the diameter of the transverse EFOV is adequately calculated from the geometry of the pinhole opening, since the peripheral region beyond EFOV would introduce projection multiplexing and consequent effects. The RMS resolution results of the nine helical scan schemes show optimal resolution is achieved when the axial step size is the half, and the angular step size is about twice the corresponding values derived from the Nyquist theorem. The SCP results agree in general with that of RMS resolution but are less critical in assessing the effects of helical parameters and EFOV. Conclusions: The authors quantitatively validated the effective FOV of multiple pinhole helical scan protocols and proposed a simple method to calculate optimal helical scan parameters.« less

  20. Handheld microwave bomb-detecting imaging system

    NASA Astrophysics Data System (ADS)

    Gorwara, Ashok; Molchanov, Pavlo

    2017-05-01

    Proposed novel imaging technique will provide all weather high-resolution imaging and recognition capability for RF/Microwave signals with good penetration through highly scattered media: fog, snow, dust, smoke, even foliage, camouflage, walls and ground. Image resolution in proposed imaging system is not limited by diffraction and will be determined by processor and sampling frequency. Proposed imaging system can simultaneously cover wide field of view, detect multiple targets and can be multi-frequency, multi-function. Directional antennas in imaging system can be close positioned and installed in cell phone size handheld device, on small aircraft or distributed around protected border or object. Non-scanning monopulse system allows dramatically decrease in transmitting power and at the same time provides increased imaging range by integrating 2-3 orders more signals than regular scanning imaging systems.

  1. New Pt/Alumina model catalysts for STM and in situ XPS studies

    NASA Astrophysics Data System (ADS)

    Nartova, Anna V.; Gharachorlou, Amir; Bukhtiyarov, Andrey V.; Kvon, Ren I.; Bukhtiyarov, Valerii I.

    2017-04-01

    The new Pt/alumina model catalysts for STM and in situ XPS studies based on thin alumina film formed over the conductive substrate are proposed. Procedure of platinum deposition developed for porous alumina was adapted for the model alumina support. The set of Pt/AlOx-film samples with the different mean platinum particle size was prepared. Capabilities of in situ XPS investigations of the proposed catalysts were demonstrated in study of NO decomposition on platinum nanoparticles. It is shown that proposed model catalysts behave similarly to Pt/γ-Al2O3 and provide the new opportunities for the instrumental studies of platinum catalysts due to resolving several issues (charging, heating, screening) that are typical for the investigation of the porous oxide supported catalysts.

  2. Deep Recurrent Neural Networks for Human Activity Recognition

    PubMed Central

    Murad, Abdulmajid

    2017-01-01

    Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on miscellaneous benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machine (SVM) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep believe networks (DBNs) and CNNs. PMID:29113103

  3. Deep Recurrent Neural Networks for Human Activity Recognition.

    PubMed

    Murad, Abdulmajid; Pyun, Jae-Young

    2017-11-06

    Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on miscellaneous benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machine (SVM) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep believe networks (DBNs) and CNNs.

  4. Smoothing spline ANOVA frailty model for recurrent event data.

    PubMed

    Du, Pang; Jiang, Yihua; Wang, Yuedong

    2011-12-01

    Gap time hazard estimation is of particular interest in recurrent event data. This article proposes a fully nonparametric approach for estimating the gap time hazard. Smoothing spline analysis of variance (ANOVA) decompositions are used to model the log gap time hazard as a joint function of gap time and covariates, and general frailty is introduced to account for between-subject heterogeneity and within-subject correlation. We estimate the nonparametric gap time hazard function and parameters in the frailty distribution using a combination of the Newton-Raphson procedure, the stochastic approximation algorithm (SAA), and the Markov chain Monte Carlo (MCMC) method. The convergence of the algorithm is guaranteed by decreasing the step size of parameter update and/or increasing the MCMC sample size along iterations. Model selection procedure is also developed to identify negligible components in a functional ANOVA decomposition of the log gap time hazard. We evaluate the proposed methods with simulation studies and illustrate its use through the analysis of bladder tumor data. © 2011, The International Biometric Society.

  5. Cerebral cortex astroglia and the brain of a genius: A propos of A. Einstein's

    PubMed Central

    Colombo, Jorge A.; Reisin, Hernán D.; Miguel-Hidalgo, José J.; Rajkowska, Grazyna

    2010-01-01

    The glial fibrillary acidic protein immunoreactive astroglial layout of the cerebral cortex from Albert Einstein and other four age-matched human cases lacking any known neurological disease was analyzed using quantification of geometrical features mathematically defined. Several parameters (parallelism, relative depth, tortuosity) describing the primate-specific interlaminar glial processes did not show individually distinctive characteristics in any of the samples analyzed. However, A. Einstein's astrocytic processes showed larger sizes and higher numbers of interlaminar terminal masses, reaching sizes of 15 μm in diameter. These bulbous endings are of unknown significance and they have been described occurring in Alzheimer's disease. These observations are placed in the context of the general discussion regarding the proposal – by other authors – that structural, postmortem characteristics of the aged brain of Albert Einstein may serve as markers of his cognitive performance, a proposal to which the authors of this paper do not subscribe, and argue against. PMID:16675021

  6. Use of the ARM Measurement of Spectral Zenith Radiance For Better Understanding Of 3D Cloud-Radiation Processes and Aerosol-Cloud Interaction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chiu, Jui-Yuan

    2010-10-19

    Our proposal focuses on cloud-radiation processes in a general 3D cloud situation, with particular emphasis on cloud optical depth and effective particle size. We also focus on zenith radiance measurements, both active and passive. The proposal has three main parts. Part One exploits the "solar-background" mode of ARM lidars to allow them to retrieve cloud optical depth not just for thin clouds but for all clouds. This also enables the study of aerosol cloud interactions with a single instrument. Part Two exploits the large number of new wavelengths offered by ARM's zenith-pointing ShortWave Spectrometer (SWS), especially during CLASIC, to developmore » better retrievals not only of cloud optical depth but also of cloud particle size. We also propose to take advantage of the SWS's 1 Hz sampling to study the "twilight zone" around clouds where strong aerosol-cloud interactions are taking place. Part Three involves continuing our cloud optical depth and cloud fraction retrieval research with ARM's 2NFOV instrument by, first, analyzing its data from the AMF-COPS/CLOWD deployment, and second, making our algorithms part of ARM's operational data processing.« less

  7. The Power of Low Back Pain Trials: A Systematic Review of Power, Sample Size, and Reporting of Sample Size Calculations Over Time, in Trials Published Between 1980 and 2012.

    PubMed

    Froud, Robert; Rajendran, Dévan; Patel, Shilpa; Bright, Philip; Bjørkli, Tom; Eldridge, Sandra; Buchbinder, Rachelle; Underwood, Martin

    2017-06-01

    A systematic review of nonspecific low back pain trials published between 1980 and 2012. To explore what proportion of trials have been powered to detect different bands of effect size; whether there is evidence that sample size in low back pain trials has been increasing; what proportion of trial reports include a sample size calculation; and whether likelihood of reporting sample size calculations has increased. Clinical trials should have a sample size sufficient to detect a minimally important difference for a given power and type I error rate. An underpowered trial is one within which probability of type II error is too high. Meta-analyses do not mitigate underpowered trials. Reviewers independently abstracted data on sample size at point of analysis, whether a sample size calculation was reported, and year of publication. Descriptive analyses were used to explore ability to detect effect sizes, and regression analyses to explore the relationship between sample size, or reporting sample size calculations, and time. We included 383 trials. One-third were powered to detect a standardized mean difference of less than 0.5, and 5% were powered to detect less than 0.3. The average sample size was 153 people, which increased only slightly (∼4 people/yr) from 1980 to 2000, and declined slightly (∼4.5 people/yr) from 2005 to 2011 (P < 0.00005). Sample size calculations were reported in 41% of trials. The odds of reporting a sample size calculation (compared to not reporting one) increased until 2005 and then declined (Equation is included in full-text article.). Sample sizes in back pain trials and the reporting of sample size calculations may need to be increased. It may be justifiable to power a trial to detect only large effects in the case of novel interventions. 3.

  8. Empirical Bayes Gaussian likelihood estimation of exposure distributions from pooled samples in human biomonitoring.

    PubMed

    Li, Xiang; Kuk, Anthony Y C; Xu, Jinfeng

    2014-12-10

    Human biomonitoring of exposure to environmental chemicals is important. Individual monitoring is not viable because of low individual exposure level or insufficient volume of materials and the prohibitive cost of taking measurements from many subjects. Pooling of samples is an efficient and cost-effective way to collect data. Estimation is, however, complicated as individual values within each pool are not observed but are only known up to their average or weighted average. The distribution of such averages is intractable when the individual measurements are lognormally distributed, which is a common assumption. We propose to replace the intractable distribution of the pool averages by a Gaussian likelihood to obtain parameter estimates. If the pool size is large, this method produces statistically efficient estimates, but regardless of pool size, the method yields consistent estimates as the number of pools increases. An empirical Bayes (EB) Gaussian likelihood approach, as well as its Bayesian analog, is developed to pool information from various demographic groups by using a mixed-effect formulation. We also discuss methods to estimate the underlying mean-variance relationship and to select a good model for the means, which can be incorporated into the proposed EB or Bayes framework. By borrowing strength across groups, the EB estimator is more efficient than the individual group-specific estimator. Simulation results show that the EB Gaussian likelihood estimates outperform a previous method proposed for the National Health and Nutrition Examination Surveys with much smaller bias and better coverage in interval estimation, especially after correction of bias. Copyright © 2014 John Wiley & Sons, Ltd.

  9. Maximizing Sampling Efficiency and Minimizing Uncertainty in Presence/Absence Classification of Rare Salamander Populations

    DTIC Science & Technology

    2008-10-31

    of the Apalachicola River drainage. Although this proposed division in classification appears to be generally accepted by the herpetological community...breeding in small forest ponds. Herpetological Review 33(4):275-280. Carle, F. L. and M. R. Strub. 1978. A new method for estimating population size...gopher frogs (Rana capito) and southern leopard frogs (Rana sphenocephala). Journal of Herpetology 42: 97-103. Grevstad, F.S. 2005. Simulating

  10. 77 FR 55737 - Small Business Size Standards: Finance and Insurance and Management of Companies and Enterprises

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-11

    ...The U.S. Small Business Administration (SBA) proposes to increase small business size standards for 37 industries in North American Industry Classification System (NAICS) Sector 52, Finance and Insurance, and for two industries in NAICS Sector 55, Management of Companies and Enterprises. In addition, SBA proposes to change the measure of size from average assets to average receipts for NAICS 522293, International Trade Financing. As part of its ongoing comprehensive size standards review, SBA evaluated all receipts based and assets based size standards in NAICS Sectors 52 and 55 to determine whether they should be retained or revised. This proposed rule is one of a series of proposed rules that will review size standards of industries grouped by NAICS Sector. SBA issued a White Paper entitled ``Size Standards Methodology'' and published a notice in the October 21, 2009 issue of the Federal Register to advise the public that the document is available on its Web site at www.sba.gov/size for public review and comments. The ``Size Standards Methodology'' White Paper explains how SBA establishes, reviews, and modifies its receipts based and employee based small business size standards. In this proposed rule, SBA has applied its methodology that pertains to establishing, reviewing, and modifying a receipts based size standard.

  11. The Kepler Mission: Search for Habitable Planets

    NASA Technical Reports Server (NTRS)

    Borucki, William; Likins, B.; DeVincenzi, Donald L. (Technical Monitor)

    1998-01-01

    Detecting extrasolar terrestrial planets orbiting main-sequence stars is of great interest and importance. Current ground-based methods are only capable of detecting objects about the size or mass of Jupiter or larger. The difficulties encountered with direct imaging of Earth-size planets from space are expected to be resolved in the next twenty years. Spacebased photometry of planetary transits is currently the only viable method for detection of terrestrial planets (30-600 times less massive than Jupiter). This method searches the extended solar neighborhood, providing a statistically large sample and the detailed characteristics of each individual case. A robust concept has been developed and proposed as a Discovery-class mission. Its capabilities and strengths are presented.

  12. Sexual dimorphism in the human face assessed by euclidean distance matrix analysis.

    PubMed Central

    Ferrario, V F; Sforza, C; Pizzini, G; Vogel, G; Miani, A

    1993-01-01

    The form of any object can be viewed as a combination of size and shape. A recently proposed method (euclidean distance matrix analysis) can differentiate between size and shape differences. It has been applied to analyse the sexual dimorphism in facial form in a sample of 108 healthy young adults (57 men, 51 women). The face was wider and longer in men than in women. A global shape difference was demonstrated, the male face being more rectangular and the female face more square. Gender variations involved especially the lower third of the face and, in particular, the position of the pogonion relative to the other structures. PMID:8300436

  13. Clinical trial designs for testing biomarker-based personalized therapies

    PubMed Central

    Lai, Tze Leung; Lavori, Philip W; Shih, Mei-Chiung I; Sikic, Branimir I

    2014-01-01

    Background Advances in molecular therapeutics in the past decade have opened up new possibilities for treating cancer patients with personalized therapies, using biomarkers to determine which treatments are most likely to benefit them, but there are difficulties and unresolved issues in the development and validation of biomarker-based personalized therapies. We develop a new clinical trial design to address some of these issues. The goal is to capture the strengths of the frequentist and Bayesian approaches to address this problem in the recent literature and to circumvent their limitations. Methods We use generalized likelihood ratio tests of the intersection null and enriched strategy null hypotheses to derive a novel clinical trial design for the problem of advancing promising biomarker-guided strategies toward eventual validation. We also investigate the usefulness of adaptive randomization (AR) and futility stopping proposed in the recent literature. Results Simulation studies demonstrate the advantages of testing both the narrowly focused enriched strategy null hypothesis related to validating a proposed strategy and the intersection null hypothesis that can accommodate to a potentially successful strategy. AR and early termination of ineffective treatments offer increased probability of receiving the preferred treatment and better response rates for patients in the trial, at the expense of more complicated inference under small-to-moderate total sample sizes and some reduction in power. Limitations The binary response used in the development phase may not be a reliable indicator of treatment benefit on long-term clinical outcomes. In the proposed design, the biomarker-guided strategy (BGS) is not compared to ‘standard of care’, such as physician’s choice that may be informed by patient characteristics. Therefore, a positive result does not imply superiority of the BGS to ‘standard of care’. The proposed design and tests are valid asymptotically. Simulations are used to examine small-to-moderate sample properties. Conclusion Innovative clinical trial designs are needed to address the difficulties and issues in the development and validation of biomarker-based personalized therapies. The article shows the advantages of using likelihood inference and interim analysis to meet the challenges in the sample size needed and in the constantly evolving biomarker landscape and genomic and proteomic technologies. PMID:22397801

  14. The influence of base rates on correlations: An evaluation of proposed alternative effect sizes with real-world data.

    PubMed

    Babchishin, Kelly M; Helmus, Leslie-Maaike

    2016-09-01

    Correlations are the simplest and most commonly understood effect size statistic in psychology. The purpose of the current paper was to use a large sample of real-world data (109 correlations with 60,415 participants) to illustrate the base rate dependence of correlations when applied to dichotomous or ordinal data. Specifically, we examined the influence of the base rate on different effect size metrics. Correlations decreased when the dichotomous variable did not have a 50 % base rate. The higher the deviation from a 50 % base rate, the smaller the observed Pearson's point-biserial and Kendall's tau correlation coefficients. In contrast, the relationship between base rate deviations and the more commonly proposed alternatives (i.e., polychoric correlation coefficients, AUCs, Pearson/Thorndike adjusted correlations, and Cohen's d) were less remarkable, with AUCs being most robust to attenuation due to base rates. In other words, the base rate makes a marked difference in the magnitude of the correlation. As such, when using dichotomous data, the correlation may be more sensitive to base rates than is optimal for the researcher's goals. Given the magnitude of the association between the base rate and point-biserial correlations (r = -.81) and Kendall's tau (r = -.80), we recommend that AUCs, Pearson/Thorndike adjusted correlations, Cohen's d, or polychoric correlations should be considered as alternate effect size statistics in many contexts.

  15. A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.

    PubMed

    Lin, Johnny; Bentler, Peter M

    2012-01-01

    Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.

  16. Adaptation of G-TAG Software for Validating Touch-and-Go Comet Surface Sampling Design Methodology

    NASA Technical Reports Server (NTRS)

    Mandic, Milan; Acikmese, Behcet; Blackmore, Lars

    2011-01-01

    The G-TAG software tool was developed under the R&TD on Integrated Autonomous Guidance, Navigation, and Control for Comet Sample Return, and represents a novel, multi-body dynamics simulation software tool for studying TAG sampling. The G-TAG multi-body simulation tool provides a simulation environment in which a Touch-and-Go (TAG) sampling event can be extensively tested. TAG sampling requires the spacecraft to descend to the surface, contact the surface with a sampling collection device, and then to ascend to a safe altitude. The TAG event lasts only a few seconds but is mission-critical with potentially high risk. Consequently, there is a need for the TAG event to be well characterized and studied by simulation and analysis in order for the proposal teams to converge on a reliable spacecraft design. This adaptation of the G-TAG tool was developed to support the Comet Odyssey proposal effort, and is specifically focused to address comet sample return missions. In this application, the spacecraft descends to and samples from the surface of a comet. Performance of the spacecraft during TAG is assessed based on survivability and sample collection performance. For the adaptation of the G-TAG simulation tool to comet scenarios, models are developed that accurately describe the properties of the spacecraft, approach trajectories, and descent velocities, as well as the models of the external forces and torques acting on the spacecraft. The adapted models of the spacecraft, descent profiles, and external sampling forces/torques were more sophisticated and customized for comets than those available in the basic G-TAG simulation tool. Scenarios implemented include the study of variations in requirements, spacecraft design (size, locations, etc. of the spacecraft components), and the environment (surface properties, slope, disturbances, etc.). The simulations, along with their visual representations using G-View, contributed to the Comet Odyssey New Frontiers proposal effort by indicating problems and/or benefits of different approaches and designs.

  17. Multi-level slug tests in highly permeable formations: 2. Hydraulic conductivity identification, method verification, and field applications

    USGS Publications Warehouse

    Zlotnik, V.A.; McGuire, V.L.

    1998-01-01

    Using the developed theory and modified Springer-Gelhar (SG) model, an identification method is proposed for estimating hydraulic conductivity from multi-level slug tests. The computerized algorithm calculates hydraulic conductivity from both monotonic and oscillatory well responses obtained using a double-packer system. Field verification of the method was performed at a specially designed fully penetrating well of 0.1-m diameter with a 10-m screen in a sand and gravel alluvial aquifer (MSEA site, Shelton, Nebraska). During well installation, disturbed core samples were collected every 0.6 m using a split-spoon sampler. Vertical profiles of hydraulic conductivity were produced on the basis of grain-size analysis of the disturbed core samples. These results closely correlate with the vertical profile of horizontal hydraulic conductivity obtained by interpreting multi-level slug test responses using the modified SG model. The identification method was applied to interpret the response from 474 slug tests in 156 locations at the MSEA site. More than 60% of responses were oscillatory. The method produced a good match to experimental data for both oscillatory and monotonic responses using an automated curve matching procedure. The proposed method allowed us to drastically increase the efficiency of each well used for aquifer characterization and to process massive arrays of field data. Recommendations generalizing this experience to massive application of the proposed method are developed.Using the developed theory and modified Springer-Gelhar (SG) model, an identification method is proposed for estimating hydraulic conductivity from multi-level slug tests. The computerized algorithm calculates hydraulic conductivity from both monotonic and oscillatory well responses obtained using a double-packer system. Field verification of the method was performed at a specially designed fully penetrating well of 0.1-m diameter with a 10-m screen in a sand and gravel alluvial aquifer (MSEA site, Shelton, Nebraska). During well installation, disturbed core samples were collected every 0.6 m using a split-spoon sampler. Vertical profiles of hydraulic conductivity were produced on the basis of grain-size analysis of the disturbed core samples. These results closely correlate with the vertical profile of horizontal hydraulic conductivity obtained by interpreting multi-level slug test responses using the modified SG model. The identification method was applied to interpret the response from 474 slug tests in 156 locations at the MSEA site. More than 60% of responses were oscillatory. The method produced a good match to experimental data for both oscillatory and monotonic responses using an automated curve matching procedure. The proposed method allowed us to drastically increase the efficiency of each well used for aquifer characterization and to process massive arrays of field data. Recommendations generalizing this experience to massive application of the proposed method are developed.

  18. Publication Bias in Psychology: A Diagnosis Based on the Correlation between Effect Size and Sample Size

    PubMed Central

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    Background The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion The negative correlation between effect size and samples size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology. PMID:25192357

  19. Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size.

    PubMed

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. We found a negative correlation of r = -.45 [95% CI: -.53; -.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. The negative correlation between effect size and samples size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology.

  20. Two-dimensional imaging via a narrowband MIMO radar system with two perpendicular linear arrays.

    PubMed

    Wang, Dang-wei; Ma, Xiao-yan; Su, Yi

    2010-05-01

    This paper presents a system model and method for the 2-D imaging application via a narrowband multiple-input multiple-output (MIMO) radar system with two perpendicular linear arrays. Furthermore, the imaging formulation for our method is developed through a Fourier integral processing, and the parameters of antenna array including the cross-range resolution, required size, and sampling interval are also examined. Different from the spatial sequential procedure sampling the scattered echoes during multiple snapshot illuminations in inverse synthetic aperture radar (ISAR) imaging, the proposed method utilizes a spatial parallel procedure to sample the scattered echoes during a single snapshot illumination. Consequently, the complex motion compensation in ISAR imaging can be avoided. Moreover, in our array configuration, multiple narrowband spectrum-shared waveforms coded with orthogonal polyphase sequences are employed. The mainlobes of the compressed echoes from the different filter band could be located in the same range bin, and thus, the range alignment in classical ISAR imaging is not necessary. Numerical simulations based on synthetic data are provided for testing our proposed method.

Top