Sample records for random sampling approach

  1. A random sampling approach for robust estimation of tissue-to-plasma ratio from extremely sparse data.

    PubMed

    Chu, Hui-May; Ette, Ene I

    2005-09-02

    This study was performed to develop a new nonparametric approach for the estimation of robust tissue-to-plasma ratio from extremely sparsely sampled paired data (i.e., one sample each from plasma and tissue per subject). Tissue-to-plasma ratio was estimated from paired/unpaired experimental data using the independent time points approach, area under the curve (AUC) values calculated with the naive data averaging approach, and AUC values calculated using sampling-based approaches (e.g., the pseudoprofile-based bootstrap [PpbB] approach and the random sampling approach [our proposed approach]). The random sampling approach involves the use of a 2-phase algorithm. The convergence of the sampling/resampling approaches was investigated, as well as the robustness of the estimates produced by different approaches. To evaluate the latter, new data sets were generated by introducing outlier(s) into the real data set. One to two concentration values were inflated by 10% to 40% from their original values to produce the outliers. Tissue-to-plasma ratios computed using the independent time points approach varied between 0 and 50 across time points. The ratio obtained from AUC values acquired using the naive data averaging approach was not associated with any measure of uncertainty or variability. Calculating the ratio without regard to pairing yielded poorer estimates. The random sampling and pseudoprofile-based bootstrap approaches yielded tissue-to-plasma ratios with uncertainty and variability. However, the random sampling approach, because of the 2-phase nature of its algorithm, yielded more robust estimates and required fewer replications. Therefore, a 2-phase random sampling approach is proposed for the robust estimation of tissue-to-plasma ratio from extremely sparsely sampled data.
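
    To make the resampling idea above concrete, here is a minimal Python sketch of a pseudoprofile-style AUC-ratio estimate from sparse paired data. The example data, constants, and helper names are hypothetical, and the exact 2-phase algorithm of Chu and Ette is not reproduced; this only illustrates the general sampling/resampling idea.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sparse design: at each time point, several subjects each
    # contribute one paired (plasma, tissue) concentration measurement.
    times = np.array([0.5, 1.0, 2.0, 4.0, 8.0])   # hours
    plasma = {t: rng.lognormal(np.log(10.0 / (1 + t)), 0.2, size=4) for t in times}
    tissue = {t: rng.lognormal(np.log(25.0 / (1 + t)), 0.3, size=4) for t in times}

    def one_replicate():
        """Draw one subject per time point (keeping the plasma/tissue pairing)
        to build a pseudoprofile, then return its AUC ratio."""
        idx = {t: rng.integers(len(plasma[t])) for t in times}
        cp = np.array([plasma[t][idx[t]] for t in times])
        ct = np.array([tissue[t][idx[t]] for t in times])
        return np.trapz(ct, times) / np.trapz(cp, times)

    ratios = np.array([one_replicate() for _ in range(2000)])
    print(f"tissue-to-plasma ratio: {ratios.mean():.2f} "
          f"(90% interval {np.percentile(ratios, 5):.2f}-{np.percentile(ratios, 95):.2f})")
    ```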

  2. Chi-Squared Test of Fit and Sample Size-A Comparison between a Random Sample Approach and a Chi-Square Value Adjustment Method.

    PubMed

    Bergh, Daniel

    2015-01-01

    Chi-square statistics are commonly used for tests of fit of measurement models. Chi-square is also sensitive to sample size, which is why several approaches to handle large samples in tests of fit have been developed. One strategy to handle the sample size problem may be to adjust the sample size in the analysis of fit. An alternative is to adopt a random sample approach. The purpose of this study was to analyze and compare these two strategies using simulated data. Given an original sample size of 21,000, for reductions of sample size down to the order of 5,000 the adjusted sample size function works as well as the random sample approach. In contrast, when applying adjustments to sample sizes of lower order, the adjustment function is less effective at approximating the chi-square value for an actual random sample of the relevant size. Hence, the fit is exaggerated and misfit underestimated using the adjusted sample size function. Although there are big differences in chi-square values between the two approaches at lower sample sizes, the inferences based on the p-values may be the same.
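
    A minimal sketch of the sample-size adjustment idea, assuming the commonly used linear rescaling of the chi-square statistic; the exact adjustment function studied by Bergh may differ, and the statistic, sample sizes and degrees of freedom below are hypothetical.

    ```python
    from scipy.stats import chi2

    def adjusted_chi_square(chi2_full, n_full, n_target, df):
        """Rescale a model-fit chi-square from the full sample to a smaller
        notional sample size (a common linear adjustment; the paper's exact
        adjustment function may differ)."""
        chi2_adj = chi2_full * (n_target - 1) / (n_full - 1)
        p_value = chi2.sf(chi2_adj, df)
        return chi2_adj, p_value

    # Full-sample fit statistic from a hypothetical measurement model.
    stat, p = adjusted_chi_square(chi2_full=850.0, n_full=21000, n_target=5000, df=35)
    print(f"adjusted chi-square = {stat:.1f}, p = {p:.4f}")
    ```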

  3. Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST)

    PubMed Central

    Xu, Chonggang; Gertner, George

    2013-01-01

    The Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, FAST analysis has mainly been confined to the estimation of partial variances contributed by the main effects of model parameters, and has not allowed for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements. PMID:24143037

  4. Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST).

    PubMed

    Xu, Chonggang; Gertner, George

    2011-01-01

    The Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, FAST analysis has mainly been confined to the estimation of partial variances contributed by the main effects of model parameters, and has not allowed for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements.
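
    A compact illustration of search-curve based FAST for first-order (main-effect) sensitivity indices, using the Ishigami test function. The integer frequency set here is a hand-checked toy choice; production FAST implementations use tabulated interference-free frequencies, and the paper's extensions to interaction effects are not shown.

    ```python
    import numpy as np

    def ishigami(x1, x2, x3):
        """Test model with known first-order indices (~0.31, 0.44, 0.00)."""
        return np.sin(x1) + 7.0 * np.sin(x2) ** 2 + 0.1 * x3 ** 4 * np.sin(x1)

    # Each input follows a periodic search curve with its own integer frequency.
    omegas = {"x1": 11, "x2": 35, "x3": 21}
    M = 4                                   # harmonics retained per input
    N = 1001                                # >= 2 * M * max(omega) + 1 (Nyquist)
    s = np.linspace(-np.pi, np.pi, N, endpoint=False)

    def to_input(w):
        """Search-curve transform mapping s to an input uniform on [-pi, pi]."""
        return -np.pi + 2.0 * np.pi * (0.5 + np.arcsin(np.sin(w * s)) / np.pi)

    y = ishigami(to_input(omegas["x1"]), to_input(omegas["x2"]), to_input(omegas["x3"]))
    total_var = np.var(y)

    for name, w in omegas.items():
        # Partial variance from the first M harmonics of this input's frequency.
        coeffs = [(np.mean(y * np.cos(p * w * s)), np.mean(y * np.sin(p * w * s)))
                  for p in range(1, M + 1)]
        partial = 2.0 * sum(a * a + b * b for a, b in coeffs)
        print(f"S_{name} = {partial / total_var:.3f}")
    ```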

  5. Systematic versus random sampling in stereological studies.

    PubMed

    West, Mark J

    2012-12-01

    The sampling that takes place at all levels of an experimental design must be random if the estimate is to be unbiased in a statistical sense. There are two fundamental ways by which one can make a random sample of the sections and positions to be probed on the sections. Using a card-sampling analogy, one can pick any card at all out of a deck of cards. This is referred to as independent random sampling because the sampling of any one card is made without reference to the position of the other cards. The other approach to obtaining a random sample would be to pick a card within a set number of cards and others at equal intervals within the deck. Systematic sampling along one axis of many biological structures is more efficient than random sampling, because most biological structures are not randomly organized. This article discusses the merits of systematic versus random sampling in stereological studies.
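
    The card analogy above maps directly onto code. A minimal sketch, assuming 120 sections of which 12 are to be probed, contrasting independent random sampling with systematic uniform random sampling:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n_sections, n_sampled = 120, 12

    # Independent random sampling: any 12 of the 120 sections, like drawing
    # arbitrary cards from anywhere in the deck.
    independent = np.sort(rng.choice(n_sections, size=n_sampled, replace=False))

    # Systematic uniform random sampling: a random start within the first
    # interval, then every k-th section thereafter.
    k = n_sections // n_sampled
    start = rng.integers(k)
    systematic = np.arange(start, n_sections, k)

    print("independent:", independent)
    print("systematic: ", systematic)
    ```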

  6. A comparison of two sampling approaches for assessing the urban forest canopy cover from aerial photography.

    Treesearch

    Ucar Zennure; Pete Bettinger; Krista Merry; Jacek Siry; J.M. Bowker

    2016-01-01

    Two different sampling approaches for estimating urban tree canopy cover were applied to two medium-sized cities in the United States, in conjunction with two freely available remotely sensed imagery products. A random point-based sampling approach, which involved 1000 sample points, was compared against a plot/grid sampling (cluster sampling) approach that involved a...

  7. LOD score exclusion analyses for candidate genes using random population samples.

    PubMed

    Deng, H W; Li, J; Recker, R R

    2001-05-01

    While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes with random population samples. We develop a LOD score approach for exclusion analyses of candidate genes with random population samples. Under this approach, specific genetic effects and inheritance models at candidate genes can be analysed and if a LOD score is < or = - 2.0, the locus can be excluded from having an effect larger than that specified. Computer simulations show that, with sample sizes often employed in association studies, this approach has high power to exclude a gene from having moderate genetic effects. In contrast to regular association analyses, population admixture will not affect the robustness of our analyses; in fact, it renders our analyses more conservative and thus any significant exclusion result is robust. Our exclusion analysis complements association analysis for candidate genes in random population samples and is parallel to the exclusion mapping analyses that may be conducted in linkage analyses with pedigrees or relative pairs. The usefulness of the approach is demonstrated by an application to test the importance of vitamin D receptor and estrogen receptor genes underlying the differential risk to osteoporotic fractures.

  8. Sequential time interleaved random equivalent sampling for repetitive signal.

    PubMed

    Zhao, Yijiu; Liu, Jingjing

    2016-12-01

    Compressed sensing (CS) based sampling techniques exhibit many advantages over other existing approaches for sparse signal spectrum sensing; they are also incorporated into non-uniform sampling signal reconstruction to improve the efficiency, such as random equivalent sampling (RES). However, in CS based RES, only one sample of each acquisition is considered in the signal reconstruction stage, which results in more acquisition runs and a longer sampling time. In this paper, a sampling sequence is taken in each RES acquisition run, and the corresponding block measurement matrix is constructed using a Whittaker-Shannon interpolation formula. All the block matrices are combined into an equivalent measurement matrix with respect to all sampling sequences. We implemented the proposed approach with a multi-core analog-to-digital converter (ADC) whose cores are time interleaved. A prototype realization of this proposed CS based sequential random equivalent sampling method has been developed. It is able to capture an analog waveform at an equivalent sampling rate of 40 GHz while sampled at 1 GHz physically. Experiments indicate that, for a sparse signal, the proposed CS based sequential random equivalent sampling exhibits high efficiency.

  9. Methods for sample size determination in cluster randomized trials

    PubMed Central

    Rutterford, Clare; Copas, Andrew; Eldridge, Sandra

    2015-01-01

    Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515
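
    A minimal sketch of the simplest approach described above: compute the individually randomized sample size and inflate it by the design effect 1 + (m - 1) x ICC. It assumes scipy for the normal quantiles; the effect size, ICC and cluster size are hypothetical, and the many refinements surveyed in the paper (unequal cluster sizes, attrition, repeated measures, etc.) are not covered.

    ```python
    from math import ceil
    from scipy.stats import norm

    def n_individual(delta, sd, alpha=0.05, power=0.8):
        """Per-arm sample size for a two-sample comparison of means."""
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return 2 * (z * sd / delta) ** 2

    def n_cluster_trial(delta, sd, cluster_size, icc, alpha=0.05, power=0.8):
        """Inflate the individually randomized sample size by the design effect
        1 + (m - 1) * ICC for clusters of equal size m."""
        deff = 1 + (cluster_size - 1) * icc
        n_total = n_individual(delta, sd, alpha, power) * deff   # per arm
        return ceil(n_total), ceil(n_total / cluster_size)       # and clusters per arm

    n_per_arm, clusters_per_arm = n_cluster_trial(delta=0.3, sd=1.0, cluster_size=20, icc=0.05)
    print(n_per_arm, "participants per arm in", clusters_per_arm, "clusters")
    ```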

  10. Multilattice sampling strategies for region of interest dynamic MRI.

    PubMed

    Rilling, Gabriel; Tao, Yuehui; Marshall, Ian; Davies, Mike E

    2013-08-01

    A multilattice sampling approach is proposed for dynamic MRI with Cartesian trajectories. It relies on the use of sampling patterns composed of several different lattices and exploits an image model where only some parts of the image are dynamic, whereas the rest is assumed static. Given the parameters of such an image model, the methodology followed for the design of a multilattice sampling pattern adapted to the model is described. The multi-lattice approach is compared to single-lattice sampling, as used by traditional acceleration methods such as UNFOLD (UNaliasing by Fourier-Encoding the Overlaps using the temporal Dimension) or k-t BLAST, and random sampling used by modern compressed sensing-based methods. On the considered image model, it allows more flexibility and higher accelerations than lattice sampling and better performance than random sampling. The method is illustrated on a phase-contrast carotid blood velocity mapping MR experiment. Combining the multilattice approach with the KEYHOLE technique allows up to 12× acceleration factors. Simulation and in vivo undersampling results validate the method. Compared to lattice and random sampling, multilattice sampling provides significant gains at high acceleration factors. © 2012 Wiley Periodicals, Inc.

  11. Adapted random sampling patterns for accelerated MRI.

    PubMed

    Knoll, Florian; Clason, Christian; Diwoky, Clemens; Stollberger, Rudolf

    2011-02-01

    Variable density random sampling patterns have recently become increasingly popular for accelerated imaging strategies, as they lead to incoherent aliasing artifacts. However, the design of these sampling patterns is still an open problem. Current strategies use model assumptions like polynomials of different order to generate a probability density function that is then used to generate the sampling pattern. This approach relies on the optimization of design parameters which is very time consuming and therefore impractical for daily clinical use. This work presents a new approach that generates sampling patterns by making use of power spectra of existing reference data sets and hence requires neither parameter tuning nor an a priori mathematical model of the density of sampling points. The approach is validated with downsampling experiments, as well as with accelerated in vivo measurements. The proposed approach is compared with established sampling patterns, and the generalization potential is tested by using a range of reference images. Quantitative evaluation is performed for the downsampling experiments using RMS differences to the original, fully sampled data set. Our results demonstrate that the image quality of the method presented in this paper is comparable to that of an established model-based strategy when optimization of the model parameter is carried out and yields superior results to non-optimized model parameters. However, no random sampling pattern showed superior performance when compared to conventional Cartesian subsampling for the considered reconstruction strategy.

  12. Variable density randomized stack of spirals (VDR-SoS) for compressive sensing MRI.

    PubMed

    Valvano, Giuseppe; Martini, Nicola; Landini, Luigi; Santarelli, Maria Filomena

    2016-07-01

    To develop a 3D sampling strategy based on a stack of variable density spirals for compressive sensing MRI. A random sampling pattern was obtained by rotating each spiral by a random angle and by delaying the gradient waveforms of the different interleaves by a few time steps. A three-dimensional (3D) variable sampling density was obtained by designing different variable density spirals for each slice encoding. The proposed approach was tested with phantom simulations up to a five-fold undersampling factor. Fully sampled 3D datasets of a human knee and of a human brain were obtained from a healthy volunteer. The proposed approach was tested with off-line reconstructions of the knee dataset up to a four-fold acceleration and compared with other noncoherent trajectories. The proposed approach outperformed the standard stack of spirals for various undersampling factors. The level of coherence and the reconstruction quality of the proposed approach were similar to those of other trajectories that, however, require 3D gridding for the reconstruction. The variable density randomized stack of spirals (VDR-SoS) is an easily implementable trajectory that could represent a valid sampling strategy for 3D compressive sensing MRI. It guarantees low levels of coherence without requiring 3D gridding. Magn Reson Med 76:59-69, 2016. © 2015 Wiley Periodicals, Inc.

  13. Urban tree cover change in Detroit and Atlanta, USA, 1951-2010

    Treesearch

    Krista Merry; Jacek Siry; Pete Bettinger; J.M. Bowker

    2014-01-01

    We assessed tree cover using random points and polygons distributed within the administrative boundaries of Detroit, MI and Atlanta, GA. Two approaches were tested: a point-based approach using 1000 randomly located sample points, and a polygon-based approach using 250 circular areas, 200 m in radius (12.56 ha). In the case of Atlanta, both approaches arrived at similar...

  14. A simple and efficient alternative to implementing systematic random sampling in stereological designs without a motorized microscope stage.

    PubMed

    Melvin, Neal R; Poda, Daniel; Sutherland, Robert J

    2007-10-01

    When properly applied, stereology is a very robust and efficient method to quantify a variety of parameters from biological material. A common sampling strategy in stereology is systematic random sampling, which involves choosing a random start point outside the structure of interest, and sampling relevant objects at sites that are placed at pre-determined, equidistant intervals. This has proven to be a very efficient sampling strategy, and is used widely in stereological designs. At the microscopic level, this is most often achieved through the use of a motorized stage that facilitates the systematic random stepping across the structure of interest. Here, we report a simple, precise and cost-effective software-based alternative to accomplishing systematic random sampling under the microscope. We believe that this approach will facilitate the use of stereological designs that employ systematic random sampling in laboratories that lack the resources to acquire costly, fully automated systems.
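
    As an illustration of the layout such software has to produce (not the authors' program), a short sketch that generates systematic-random field coordinates over a rectangular region: a random offset within the first sampling interval, followed by equidistant steps. All dimensions are hypothetical.

    ```python
    import random

    def systematic_fields(width_um, height_um, step_x_um, step_y_um, seed=None):
        """Return (x, y) stage coordinates of sampling fields laid out on a grid
        with a random offset, i.e. systematic uniform random sampling in 2D."""
        rng = random.Random(seed)
        x0 = rng.uniform(0, step_x_um)
        y0 = rng.uniform(0, step_y_um)
        xs = [x0 + i * step_x_um for i in range(int((width_um - x0) // step_x_um) + 1)]
        ys = [y0 + j * step_y_um for j in range(int((height_um - y0) // step_y_um) + 1)]
        return [(x, y) for y in ys for x in xs]

    for x, y in systematic_fields(5000, 3000, 800, 800, seed=42):
        print(f"move stage to x={x:.0f} um, y={y:.0f} um")
    ```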

  15. The Bootstrap, the Jackknife, and the Randomization Test: A Sampling Taxonomy.

    PubMed

    Rodgers, J L

    1999-10-01

    A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test. Each method has as its goal the creation of an empirical sampling distribution that can be used to test statistical hypotheses, estimate standard errors, and/or create confidence intervals. Distinctions between the methods can be made based on the sampling approach (with replacement versus without replacement) and the sample size (replacing the whole original sample versus replacing a subset of the original sample). The taxonomy is useful for teaching the goals and purposes of resampling schemes. An extension of the taxonomy implies other possible resampling approaches that have not previously been considered. Univariate and multivariate examples are presented.
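
    The taxonomy can be illustrated on a small two-group example: the bootstrap resamples the whole sample size with replacement, the jackknife re-uses subsets drawn without replacement (leave-one-out), and the randomization test permutes the whole sample without replacement. A minimal sketch with made-up data:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    group_a = np.array([4.1, 5.3, 6.0, 5.5, 4.8])
    group_b = np.array([5.9, 6.4, 7.1, 6.8, 6.2])

    def mean_diff(a, b):
        return b.mean() - a.mean()

    observed = mean_diff(group_a, group_b)

    # Bootstrap: whole-size resamples WITH replacement; the spread of the
    # replicated statistic serves as a standard error.
    boot = [mean_diff(rng.choice(group_a, group_a.size), rng.choice(group_b, group_b.size))
            for _ in range(5000)]
    print(f"bootstrap SE of the mean difference = {np.std(boot):.3f}")

    # Jackknife: recompute a statistic leaving one observation out at a time
    # (a subset of the sample, WITHOUT replacement); here, the mean of group B.
    jack = np.array([np.delete(group_b, i).mean() for i in range(group_b.size)])
    n = group_b.size
    jack_se = np.sqrt((n - 1) / n * np.sum((jack - jack.mean()) ** 2))
    print(f"jackknife SE of group B mean = {jack_se:.3f}")

    # Randomization test: permute group labels (whole sample, WITHOUT replacement)
    # to build the null distribution of the statistic.
    pooled = np.concatenate([group_a, group_b])
    perm = [mean_diff(p[:group_a.size], p[group_a.size:])
            for p in (rng.permutation(pooled) for _ in range(5000))]
    print(f"permutation p-value = {np.mean(np.abs(perm) >= abs(observed)):.3f}")
    ```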

  16. LOD score exclusion analyses for candidate QTLs using random population samples.

    PubMed

    Deng, Hong-Wen

    2003-11-01

    While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes as putative QTLs using random population samples. Previously, we developed an LOD score exclusion mapping approach for candidate genes for complex diseases. Here, we extend this LOD score approach for exclusion analyses of candidate genes for quantitative traits. Under this approach, specific genetic effects (as reflected by heritability) and inheritance models at candidate QTLs can be analyzed and if an LOD score is < or = -2.0, the locus can be excluded from having a heritability larger than that specified. Simulations show that this approach has high power to exclude a candidate gene from having moderate genetic effects if it is not a QTL and is robust to population admixture. Our exclusion analysis complements association analysis for candidate genes as putative QTLs in random population samples. The approach is applied to test the importance of Vitamin D receptor (VDR) gene as a potential QTL underlying the variation of bone mass, an important determinant of osteoporosis.

  17. Health plan auditing: 100-percent-of-claims vs. random-sample audits.

    PubMed

    Sillup, George P; Klimberg, Ronald K

    2011-01-01

    The objective of this study was to examine the relative efficacy of two different methodologies for auditing self-funded medical claim expenses: 100-percent-of-claims auditing versus random-sampling auditing. Multiple data sets of claim errors or 'exceptions' from two Fortune-100 corporations were analysed and compared to 100 simulated audits of 300- and 400-claim random samples. Random-sample simulations failed to identify a significant number and amount of the errors that ranged from $200,000 to $750,000. These results suggest that health plan expenses of corporations could be significantly reduced if they audited 100% of claims and embraced a zero-defect approach.
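
    A small simulation sketch of the comparison, with entirely hypothetical claim counts, error rates and error amounts, showing why a 300-claim random sample can only surface a small fraction of total error dollars:

    ```python
    import numpy as np

    rng = np.random.default_rng(2024)

    # Hypothetical claim population: 50,000 claims, roughly 1% contain payment errors.
    n_claims = 50_000
    error_amounts = np.zeros(n_claims)
    is_error = rng.random(n_claims) < 0.01
    error_amounts[is_error] = rng.gamma(shape=2.0, scale=600.0, size=is_error.sum())
    total_errors = error_amounts.sum()

    # A 100%-of-claims audit sees everything; a 300-claim random-sample audit
    # only sees the errors that happen to fall in the sample.
    recovered = [error_amounts[rng.choice(n_claims, 300, replace=False)].sum()
                 for _ in range(100)]
    print(f"total error dollars:      {total_errors:,.0f}")
    print(f"found by 300-claim audit: {np.mean(recovered):,.0f} on average "
          f"({100 * np.mean(recovered) / total_errors:.1f}%)")
    ```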

  18. Binomial leap methods for simulating stochastic chemical kinetics.

    PubMed

    Tian, Tianhai; Burrage, Kevin

    2004-12-01

    This paper discusses efficient simulation methods for stochastic chemical kinetics. Based on the tau-leap and midpoint tau-leap methods of Gillespie [D. T. Gillespie, J. Chem. Phys. 115, 1716 (2001)], binomial random variables are used in these leap methods rather than Poisson random variables. The motivation for this approach is to improve the efficiency of the Poisson leap methods by using larger stepsizes. Unlike Poisson random variables, whose range of sample values is from zero to infinity, binomial random variables have a finite range of sample values. This probabilistic property has been used to restrict possible reaction numbers and to avoid negative molecular numbers in stochastic simulations when a larger stepsize is used. In this approach a binomial random variable is defined for a single reaction channel in order to keep the reaction number of this channel below the number of molecules that undergo this reaction channel. A sampling technique is also designed for the total reaction number of a reactant species that undergoes two or more reaction channels. Samples for the total reaction number are not greater than the molecular number of this species. In addition, probability properties of the binomial random variables provide stepsize conditions for restricting reaction numbers in a chosen time interval. These stepsize conditions are important properties of robust leap control strategies. Numerical results indicate that the proposed binomial leap methods can be applied to a wide range of chemical reaction systems with very good accuracy and significant improvement in efficiency over existing approaches. (c) 2004 American Institute of Physics.
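
    A minimal illustration of the binomial-leap idea for a single first-order channel A -> B (not the full multi-channel algorithm of the paper): the number of firings per step is drawn from a binomial bounded by the current copy number, so molecule counts can never go negative. For this particular first-order channel the binomial draw happens to be exact; for general systems the binomial leap is an approximation.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def binomial_leap_decay(a0, c, tau, t_end):
        """Simulate A -> B with rate constant c using binomial leaps: firings per
        step are Binomial(A, p) with p = 1 - exp(-c * tau), so the draw can never
        exceed the current copy number A (unlike a Poisson draw)."""
        t, a, history = 0.0, a0, [(0.0, a0)]
        p = 1.0 - np.exp(-c * tau)
        while t < t_end and a > 0:
            k = rng.binomial(a, p)     # bounded by a, so A stays non-negative
            a -= k
            t += tau
            history.append((t, a))
        return history

    for t, a in binomial_leap_decay(a0=1000, c=0.5, tau=0.2, t_end=2.0)[::2]:
        print(f"t = {t:.1f}  A = {a}")
    ```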

  19. Midwives Performance in Early Detection of Growth and Development Irregularities of Children Based on Task Commitment

    ERIC Educational Resources Information Center

    Utami, Sri; Nursalam; Hargono, Rachmat; Susilaningrum, Rekawati

    2016-01-01

    The purpose of this study was to analyze the performance of midwives based on task commitment. This was an observational analytic study with a cross-sectional approach. Multistage random sampling was used to determine the public health center, and proportional random sampling to select participants. The samples were 222 midwives in the public health…

  20. [Comparison study on sampling methods of Oncomelania hupensis snail survey in marshland schistosomiasis epidemic areas in China].

    PubMed

    An, Zhao; Wen-Xin, Zhang; Zhong, Yao; Yu-Kuan, Ma; Qing, Liu; Hou-Lang, Duan; Yi-di, Shang

    2016-06-29

    To optimize and simplify the survey method for Oncomelania hupensis snails in marshland endemic regions of schistosomiasis and to increase the precision, efficiency and economy of the snail survey. A 50 m × 50 m quadrat experimental field was selected in Chayegang marshland near Henghu farm in the Poyang Lake region, and a whole-covered method was adopted to survey the snails. The simple random sampling, systematic sampling and stratified random sampling methods were applied to calculate the minimum sample size, relative sampling error and absolute sampling error. The minimum sample sizes of the simple random sampling, systematic sampling and stratified random sampling methods were 300, 300 and 225, respectively. The relative sampling errors of the three methods were all less than 15%. The absolute sampling errors were 0.2217, 0.3024 and 0.0478, respectively. The spatial stratified sampling with altitude as the stratum variable is an efficient approach of lower cost and higher precision for the snail survey.

  21. Sample size determination for bibliographic retrieval studies

    PubMed Central

    Yao, Xiaomei; Wilczynski, Nancy L; Walter, Stephen D; Haynes, R Brian

    2008-01-01

    Background Research for developing search strategies to retrieve high-quality clinical journal articles from MEDLINE is expensive and time-consuming. The objective of this study was to determine the minimal number of high-quality articles in a journal subset that would need to be hand-searched to update or create new MEDLINE search strategies for treatment, diagnosis, and prognosis studies. Methods The desired width of the 95% confidence intervals (W) for the lowest sensitivity among existing search strategies was used to calculate the number of high-quality articles needed to reliably update search strategies. New search strategies were derived in journal subsets formed by 2 approaches: random sampling of journals and top journals (having the most high-quality articles). The new strategies were tested in both the original large journal database and in a low-yielding journal (having few high-quality articles) subset. Results For treatment studies, if W was 10% or less for the lowest sensitivity among our existing search strategies, a subset of 15 randomly selected journals or 2 top journals were adequate for updating search strategies, based on each approach having at least 99 high-quality articles. The new strategies derived in 15 randomly selected journals or 2 top journals performed well in the original large journal database. Nevertheless, the new search strategies developed using the random sampling approach performed better than those developed using the top journal approach in a low-yielding journal subset. For studies of diagnosis and prognosis, no journal subset had enough high-quality articles to achieve the expected W (10%). Conclusion The approach of randomly sampling a small subset of journals that includes sufficient high-quality articles is an efficient way to update or create search strategies for high-quality articles on therapy in MEDLINE. The concentrations of diagnosis and prognosis articles are too low for this approach. PMID:18823538
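
    A minimal sketch of the sample-size logic described above, assuming a standard Wald-type confidence interval for a proportion (sensitivity); the paper's exact calculation and inputs may differ, and the numbers below are illustrative only.

    ```python
    from math import ceil

    def articles_needed(sensitivity, ci_width, z=1.96):
        """Number of high-quality articles needed so that a 95% Wald confidence
        interval for the given sensitivity has total width <= ci_width."""
        return ceil((2 * z) ** 2 * sensitivity * (1 - sensitivity) / ci_width ** 2)

    # e.g. a lowest expected sensitivity of 0.90 and a target CI width of 10 points
    print(articles_needed(sensitivity=0.90, ci_width=0.10))   # -> 139
    ```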

  22. Sample Size Estimation in Cluster Randomized Educational Trials: An Empirical Bayes Approach

    ERIC Educational Resources Information Center

    Rotondi, Michael A.; Donner, Allan

    2009-01-01

    The educational field has now accumulated an extensive literature reporting on values of the intraclass correlation coefficient, a parameter essential to determining the required size of a planned cluster randomized trial. We propose here a simple simulation-based approach including all relevant information that can facilitate this task. An…

  23. Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials.

    PubMed

    Gilbert, Peter B; Yu, Xuesong; Rotnitzky, Andrea

    2014-03-15

    To address the objective in a clinical trial to estimate the mean or mean difference of an expensive endpoint Y, one approach employs a two-phase sampling design, wherein inexpensive auxiliary variables W predictive of Y are measured in everyone, Y is measured in a random sample, and the semiparametric efficient estimator is applied. This approach is made efficient by specifying the phase two selection probabilities as optimal functions of the auxiliary variables and measurement costs. While this approach is familiar to survey samplers, it apparently has seldom been used in clinical trials, and several novel results practicable for clinical trials are developed. We perform simulations to identify settings where the optimal approach significantly improves efficiency compared to approaches in current practice. We provide proofs and R code. The optimality results are developed to design an HIV vaccine trial, with objective to compare the mean 'importance-weighted' breadth (Y) of the T-cell response between randomized vaccine groups. The trial collects an auxiliary response (W) highly predictive of Y and measures Y in the optimal subset. We show that the optimal design-estimation approach can confer anywhere between absent and large efficiency gain (up to 24 % in the examples) compared to the approach with the same efficient estimator but simple random sampling, where greater variability in the cost-standardized conditional variance of Y given W yields greater efficiency gains. Accurate estimation of E[Y | W] is important for realizing the efficiency gain, which is aided by an ample phase two sample and by using a robust fitting method. Copyright © 2013 John Wiley & Sons, Ltd.

  24. Optimal Auxiliary-Covariate Based Two-Phase Sampling Design for Semiparametric Efficient Estimation of a Mean or Mean Difference, with Application to Clinical Trials

    PubMed Central

    Gilbert, Peter B.; Yu, Xuesong; Rotnitzky, Andrea

    2014-01-01

    To address the objective in a clinical trial to estimate the mean or mean difference of an expensive endpoint Y, one approach employs a two-phase sampling design, wherein inexpensive auxiliary variables W predictive of Y are measured in everyone, Y is measured in a random sample, and the semi-parametric efficient estimator is applied. This approach is made efficient by specifying the phase-two selection probabilities as optimal functions of the auxiliary variables and measurement costs. While this approach is familiar to survey samplers, it apparently has seldom been used in clinical trials, and several novel results practicable for clinical trials are developed. Simulations are performed to identify settings where the optimal approach significantly improves efficiency compared to approaches in current practice. Proofs and R code are provided. The optimality results are developed to design an HIV vaccine trial, with objective to compare the mean “importance-weighted” breadth (Y) of the T cell response between randomized vaccine groups. The trial collects an auxiliary response (W) highly predictive of Y, and measures Y in the optimal subset. We show that the optimal design-estimation approach can confer anywhere between absent and large efficiency gain (up to 24% in the examples) compared to the approach with the same efficient estimator but simple random sampling, where greater variability in the cost-standardized conditional variance of Y given W yields greater efficiency gains. Accurate estimation of E[Y∣W] is important for realizing the efficiency gain, which is aided by an ample phase-two sample and by using a robust fitting method. PMID:24123289

  25. Computerized stratified random site-selection approaches for design of a ground-water-quality sampling network

    USGS Publications Warehouse

    Scott, J.C.

    1990-01-01

    Computer software was written to randomly select sites for a ground-water-quality sampling network. The software uses digital cartographic techniques and subroutines from a proprietary geographic information system. The report presents the approaches, computer software, and sample applications. It is often desirable to collect ground-water-quality samples from various areas in a study region that have different values of a spatial characteristic, such as land-use or hydrogeologic setting. A stratified network can be used for testing hypotheses about relations between spatial characteristics and water quality, or for calculating statistical descriptions of water-quality data that account for variations that correspond to the spatial characteristic. In the software described, a study region is subdivided into areal subsets that have a common spatial characteristic to stratify the population into several categories from which sampling sites are selected. Different numbers of sites may be selected from each category of areal subsets. A population of potential sampling sites may be defined by either specifying a fixed population of existing sites, or by preparing an equally spaced population of potential sites. In either case, each site is identified with a single category, depending on the value of the spatial characteristic of the areal subset in which the site is located. Sites are selected from one category at a time. One of two approaches may be used to select sites. Sites may be selected randomly, or the areal subsets in the category can be grouped into cells and sites selected randomly from each cell.
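
    A minimal sketch of the stratified random site-selection idea (without the GIS machinery): candidate sites tagged with a category are grouped, and a specified number of sites is drawn at random from each category in turn. Site names and categories below are hypothetical.

    ```python
    import random
    from collections import defaultdict

    def stratified_site_selection(candidate_sites, n_per_category, seed=None):
        """Randomly select sampling sites within each category of areal subsets
        (e.g. land-use or hydrogeologic setting), one category at a time."""
        rng = random.Random(seed)
        by_category = defaultdict(list)
        for site_id, category in candidate_sites:
            by_category[category].append(site_id)
        return {cat: rng.sample(ids, min(n_per_category[cat], len(ids)))
                for cat, ids in by_category.items()}

    # Hypothetical candidate wells tagged with a land-use category.
    wells = [("W01", "urban"), ("W02", "urban"), ("W03", "agricultural"),
             ("W04", "agricultural"), ("W05", "agricultural"), ("W06", "forest"),
             ("W07", "forest"), ("W08", "urban"), ("W09", "forest"), ("W10", "agricultural")]
    print(stratified_site_selection(wells, {"urban": 2, "agricultural": 2, "forest": 1}, seed=3))
    ```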

  26. Random sampling of elementary flux modes in large-scale metabolic networks.

    PubMed

    Machado, Daniel; Soons, Zita; Patil, Kiran Raosaheb; Ferreira, Eugénio C; Rocha, Isabel

    2012-09-15

    The description of a metabolic network in terms of elementary (flux) modes (EMs) provides an important framework for metabolic pathway analysis. However, their application to large networks has been hampered by the combinatorial explosion in the number of modes. In this work, we develop a method for generating random samples of EMs without computing the whole set. Our algorithm is an adaptation of the canonical basis approach, where we add an additional filtering step which, at each iteration, selects a random subset of the new combinations of modes. In order to obtain an unbiased sample, all candidates are assigned the same probability of getting selected. This approach avoids the exponential growth of the number of modes during computation, thus generating a random sample of the complete set of EMs within reasonable time. We generated samples of different sizes for a metabolic network of Escherichia coli, and observed that they preserve several properties of the full EM set. It is also shown that EM sampling can be used for rational strain design. A well distributed sample, that is representative of the complete set of EMs, should be suitable to most EM-based methods for analysis and optimization of metabolic networks. Source code for a cross-platform implementation in Python is freely available at http://code.google.com/p/emsampler. dmachado@deb.uminho.pt Supplementary data are available at Bioinformatics online.

  27. Evaluation and optimization of sampling errors for the Monte Carlo Independent Column Approximation

    NASA Astrophysics Data System (ADS)

    Räisänen, Petri; Barker, W. Howard

    2004-07-01

    The Monte Carlo Independent Column Approximation (McICA) method for computing domain-average broadband radiative fluxes is unbiased with respect to the full ICA, but its flux estimates contain conditional random noise. McICA's sampling errors are evaluated here using a global climate model (GCM) dataset and a correlated-k distribution (CKD) radiation scheme. Two approaches to reduce McICA's sampling variance are discussed. The first is to simply restrict all of McICA's samples to cloudy regions. This avoids wasting precious few samples on essentially homogeneous clear skies. Clear-sky fluxes need to be computed separately for this approach, but this is usually done in GCMs for diagnostic purposes anyway. Second, accuracy can be improved by repeated sampling, and averaging those CKD terms with large cloud radiative effects. Although this naturally increases computational costs over the standard CKD model, random errors for fluxes and heating rates are reduced by typically 50% to 60%, for the present radiation code, when the total number of samples is increased by 50%. When both variance reduction techniques are applied simultaneously, globally averaged flux and heating rate random errors are reduced by a factor of approximately 3.

  28. A two-stage Monte Carlo approach to the expression of uncertainty with finite sample sizes.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crowder, Stephen Vernon; Moyer, Robert D.

    2005-05-01

    Proposed supplement I to the GUM outlines a 'propagation of distributions' approach to deriving the distribution of a measurand for any non-linear function and for any set of random inputs. The supplement's proposed Monte Carlo approach assumes that the distributions of the random inputs are known exactly. This implies that the sample sizes are effectively infinite. In this case, the mean of the measurand can be determined precisely using a large number of Monte Carlo simulations. In practice, however, the distributions of the inputs will rarely be known exactly, but must be estimated using possibly small samples. If these approximated distributions are treated as exact, the uncertainty in estimating the mean is not properly taken into account. In this paper, we propose a two-stage Monte Carlo procedure that explicitly takes into account the finite sample sizes used to estimate parameters of the input distributions. We will illustrate the approach with a case study involving the efficiency of a thermistor mount power sensor. The performance of the proposed approach will be compared to the standard GUM approach for finite samples using simple non-linear measurement equations. We will investigate performance in terms of coverage probabilities of derived confidence intervals.
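
    A minimal sketch of a two-stage Monte Carlo of this kind, with a hypothetical measurement equation and made-up calibration data (not the authors' thermistor case study): the outer stage resamples each input's distribution parameters consistent with having estimated them from only a finite sample, and the inner stage propagates draws through the measurement equation.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    def measurand(x1, x2):
        """Hypothetical non-linear measurement equation."""
        return x1 / x2 ** 2

    # Finite calibration samples for the two inputs: (sample size, mean, std dev).
    inputs = {"x1": (8, 10.2, 0.30), "x2": (5, 2.05, 0.08)}

    def sample_parameters(n, mean, sd):
        """Outer stage: draw a plausible (mu, sigma) pair consistent with having
        estimated them from only n observations."""
        mu = mean + sd / np.sqrt(n) * rng.standard_t(n - 1)
        sigma = sd * np.sqrt((n - 1) / rng.chisquare(n - 1))
        return mu, sigma

    outer, inner = 500, 400
    y = []
    for _ in range(outer):
        mu1, s1 = sample_parameters(*inputs["x1"])
        mu2, s2 = sample_parameters(*inputs["x2"])
        # Inner stage: propagate draws from the re-sampled input distributions.
        y.append(measurand(rng.normal(mu1, s1, inner), rng.normal(mu2, s2, inner)))
    y = np.concatenate(y)
    lo, hi = np.percentile(y, [2.5, 97.5])
    print(f"estimate = {y.mean():.3f}, 95% coverage interval ({lo:.3f}, {hi:.3f})")
    ```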

  29. Random function representation of stationary stochastic vector processes for probability density evolution analysis of wind-induced structures

    NASA Astrophysics Data System (ADS)

    Liu, Zhangjun; Liu, Zenghui

    2018-06-01

    This paper develops a hybrid approach of spectral representation and random function for simulating stationary stochastic vector processes. In the proposed approach, the high-dimensional random variables included in the original spectral representation (OSR) formula can be effectively reduced to only two elementary random variables by introducing random functions that serve as random constraints. Based on this, a satisfactory simulation accuracy can be guaranteed by selecting a small representative point set of the elementary random variables. The probability information of the stochastic excitations can be fully captured with just several hundred sample functions generated by the proposed approach. Therefore, combined with the probability density evolution method (PDEM), the approach makes it possible to carry out dynamic response analysis and reliability assessment of engineering structures. For illustrative purposes, a stochastic turbulence wind velocity field acting on a frame-shear-wall structure is simulated by constructing three types of random functions to demonstrate the accuracy and efficiency of the proposed approach. Careful and in-depth studies concerning the probability density evolution analysis of the wind-induced structure have been conducted so as to better illustrate the application prospects of the proposed approach. Numerical examples also show that the proposed approach possesses good robustness.

  30. Conic Sampling: An Efficient Method for Solving Linear and Quadratic Programming by Randomly Linking Constraints within the Interior

    PubMed Central

    Serang, Oliver

    2012-01-01

    Linear programming (LP) problems are commonly used in analysis and resource allocation, frequently surfacing as approximations to more difficult problems. Existing approaches to LP have been dominated by a small group of methods, and randomized algorithms have not enjoyed popularity in practice. This paper introduces a novel randomized method of solving LP problems by moving along the facets and within the interior of the polytope along rays randomly sampled from the polyhedral cones defined by the bounding constraints. This conic sampling method is then applied to randomly sampled LPs, and its runtime performance is shown to compare favorably to the simplex and primal affine-scaling algorithms, especially on polytopes with certain characteristics. The conic sampling method is then adapted and applied to solve a certain quadratic program, which computes a projection onto a polytope; the proposed method is shown to outperform the proprietary software Mathematica on large, sparse QP problems constructed from mass spectrometry-based proteomics. PMID:22952741

  31. Random laser action in bovine semen

    NASA Astrophysics Data System (ADS)

    Smuk, Andrei; Lazaro, Edgar; Olson, Leif P.; Lawandy, N. M.

    2011-03-01

    Experiments using bovine semen reveal that the addition of a high-gain water-soluble dye results in random laser action when excited by a Q-switched, frequency-doubled Nd:YAG laser. The data show that the linewidth collapse of the emission is correlated to the sperm count of the individual samples, potentially making this a rapid, low-sample-volume approach to count determination.

  32. Teachers' Methodologies and Sources of Information on HIV/AIDS for Students with Visual Impairments in Selected Residential and Integrated Schools in Ghana

    ERIC Educational Resources Information Center

    Hayford, Samuel K.; Ocansey, Frederick

    2017-01-01

    This study reports part of a national survey on sources of information, education and communication materials on HIV/AIDS available to students with visual impairments in residential, segregated, and integrated schools in Ghana. A multi-staged stratified random sampling procedure and a purposive and simple random sampling approach, where…

  33. Modeling of Academic Achievement of Primary School Students in Ethiopia Using Bayesian Multilevel Approach

    ERIC Educational Resources Information Center

    Sebro, Negusse Yohannes; Goshu, Ayele Taye

    2017-01-01

    This study aims to explore Bayesian multilevel modeling to investigate variations of average academic achievement of grade eight school students. A sample of 636 students is randomly selected from 26 private and government schools by a two-stage stratified sampling design. Bayesian method is used to estimate the fixed and random effects. Input and…

  34. The use of single-date MODIS imagery for estimating large-scale urban impervious surface fraction with spectral mixture analysis and machine learning techniques

    NASA Astrophysics Data System (ADS)

    Deng, Chengbin; Wu, Changshan

    2013-12-01

    Urban impervious surface information is essential for urban and environmental applications at the regional/national scales. As a popular image processing technique, spectral mixture analysis (SMA) has rarely been applied to coarse-resolution imagery due to the difficulty of deriving endmember spectra using traditional endmember selection methods, particularly within heterogeneous urban environments. To address this problem, we derived endmember signatures through a least squares solution (LSS) technique with known abundances of sample pixels, and integrated these endmember signatures into SMA for mapping large-scale impervious surface fraction. In addition, with the same sample set, we carried out objective comparative analyses among SMA (i.e. fully constrained and unconstrained SMA) and machine learning (i.e. Cubist regression tree and Random Forests) techniques. Analysis of results suggests three major conclusions. First, with the extrapolated endmember spectra from stratified random training samples, the SMA approaches performed relatively well, as indicated by small MAE values. Second, Random Forests yields more reliable results than Cubist regression tree, and its accuracy is improved with increased sample sizes. Finally, comparative analyses suggest a tentative guide for selecting an optimal approach for large-scale fractional imperviousness estimation: unconstrained SMA might be a favorable option with a small number of samples, while Random Forests might be preferred if a large number of samples are available.

  35. Improvements in sub-grid, microphysics averages using quadrature based approaches

    NASA Astrophysics Data System (ADS)

    Chowdhary, K.; Debusschere, B.; Larson, V. E.

    2013-12-01

    Sub-grid variability in microphysical processes plays a critical role in atmospheric climate models. In order to account for this sub-grid variability, Larson and Schanen (2013) propose placing a probability density function on the sub-grid cloud microphysics quantities, e.g. autoconversion rate, essentially interpreting the cloud microphysics quantities as a random variable in each grid box. Random sampling techniques, e.g. Monte Carlo and Latin Hypercube, can be used to calculate statistics, e.g. averages, on the microphysics quantities, which then feed back into the model dynamics on the coarse scale. We propose an alternate approach using numerical quadrature methods based on deterministic sampling points to compute the statistical moments of microphysics quantities in each grid box. We have performed a preliminary test on the Kessler autoconversion formula, and, upon comparison with Latin Hypercube sampling, our approach shows an increased level of accuracy with a reduction in sample size by almost two orders of magnitude. Application to other microphysics processes is the subject of ongoing research.
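
    A minimal sketch of the quadrature idea, assuming a Kessler-type autoconversion formula, a normal sub-grid distribution of cloud water, and illustrative constants: a low-order Gauss-Hermite rule replaces random draws when computing the grid-box average process rate.

    ```python
    import numpy as np
    from numpy.polynomial.hermite import hermgauss

    def kessler_autoconversion(q, k=1e-3, q_crit=0.5):
        """Kessler-type autoconversion rate (g/kg/s); zero below the threshold."""
        return k * np.maximum(q - q_crit, 0.0)

    # Sub-grid cloud water mixing ratio assumed normal within the grid box (g/kg).
    mean_q, sd_q = 0.6, 0.25

    # Deterministic Gauss-Hermite quadrature estimate of the grid-box average rate.
    nodes, weights = hermgauss(8)
    quad_avg = np.sum(weights * kessler_autoconversion(mean_q + np.sqrt(2) * sd_q * nodes)) / np.sqrt(np.pi)

    # Monte Carlo estimate with a modest random sample, for comparison.
    rng = np.random.default_rng(0)
    mc_avg = kessler_autoconversion(rng.normal(mean_q, sd_q, 50)).mean()

    print(f"quadrature (8 points): {quad_avg:.3e}   Monte Carlo (50 points): {mc_avg:.3e}")
    ```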

  36. A new approach to evaluate gamma-ray measurements

    NASA Technical Reports Server (NTRS)

    Dejager, O. C.; Swanepoel, J. W. H.; Raubenheimer, B. C.; Vandervalt, D. J.

    1985-01-01

    Misunderstandings about the term random sample and its implications may easily arise. Conditions under which the phases, obtained from arrival times, do not form a random sample, and the dangers involved, are discussed. Watson's U² test for uniformity is recommended for light curves with duty cycles larger than 10%. Under certain conditions, non-parametric density estimation may be used to determine estimates of the true light curve and its parameters.

  37. Improving Ramsey spectroscopy in the extreme-ultraviolet region with a random-sampling approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eramo, R.; Bellini, M.; European Laboratory for Non-linear Spectroscopy

    2011-04-15

    Ramsey-like techniques, based on the coherent excitation of a sample by delayed and phase-correlated pulses, are promising tools for high-precision spectroscopic tests of QED in the extreme-ultraviolet (xuv) spectral region, but currently suffer experimental limitations related to long acquisition times and critical stability issues. Here we propose a random subsampling approach to Ramsey spectroscopy that, by allowing experimentalists to reach a given spectral resolution goal in a fraction of the usual acquisition time, leads to substantial improvements in high-resolution spectroscopy and may open the way to a widespread application of Ramsey-like techniques to precision measurements in the xuv spectral region.

  38. Using auxiliary information to improve wildlife disease surveillance when infected animals are not detected: a Bayesian approach

    USGS Publications Warehouse

    Heisey, Dennis M.; Jennelle, Christopher S.; Russell, Robin E.; Walsh, Daniel P.

    2014-01-01

    There are numerous situations in which it is important to determine whether a particular disease of interest is present in a free-ranging wildlife population. However, adequate disease surveillance can be labor-intensive and expensive, and thus there is substantial motivation to conduct it as efficiently as possible. Surveillance is often based on the assumption of a simple random sample, but this can almost always be improved upon if there is auxiliary information available about disease risk factors. We present a Bayesian approach to disease surveillance when auxiliary risk information is available, which will usually allow for substantial improvements over simple random sampling. Others have employed risk weights in surveillance, but this can result in overly optimistic statements regarding freedom from disease due to not accounting for the uncertainty in the auxiliary information; our approach remedies this. We compare our Bayesian approach to a published example of risk weights applied to chronic wasting disease in deer in Colorado, and we also present calculations to examine when uncertainty in the auxiliary information has a serious impact on the risk weights approach. Our approach allows “apples-to-apples” comparisons of surveillance efficiencies between units where heterogeneous samples were collected.

  39. Image subsampling and point scoring approaches for large-scale marine benthic monitoring programs

    NASA Astrophysics Data System (ADS)

    Perkins, Nicholas R.; Foster, Scott D.; Hill, Nicole A.; Barrett, Neville S.

    2016-07-01

    Benthic imagery is an effective tool for quantitative description of ecologically and economically important benthic habitats and biota. The recent development of autonomous underwater vehicles (AUVs) allows surveying of spatial scales that were previously unfeasible. However, an AUV collects a large number of images, the scoring of which is time and labour intensive. There is a need to optimise the way that subsamples of imagery are chosen and scored to gain meaningful inferences for ecological monitoring studies. We examine the trade-off between the number of images selected within transects and the number of random points scored within images on the percent cover of target biota, the typical output of such monitoring programs. We also investigate the efficacy of various image selection approaches, such as systematic or random, on the bias and precision of cover estimates. We use simulated biotas that have varying size, abundance and distributional patterns. We find that a relatively small sampling effort is required to minimise bias. An increased precision for groups that are likely to be the focus of monitoring programs is best gained through increasing the number of images sampled rather than the number of points scored within images. For rare species, sampling using point count approaches is unlikely to provide sufficient precision, and alternative sampling approaches may need to be employed. The approach by which images are selected (simple random sampling, regularly spaced etc.) had no discernible effect on mean and variance estimates, regardless of the distributional pattern of biota. Field validation of our findings is provided through Monte Carlo resampling analysis of a previously scored benthic survey from temperate waters. We show that point count sampling approaches are capable of providing relatively precise cover estimates for candidate groups that are not overly rare. The amount of sampling required, in terms of both the number of images and number of points, varies with the abundance, size and distributional pattern of target biota. Therefore, we advocate either the incorporation of prior knowledge or the use of baseline surveys to establish key properties of intended target biota in the initial stages of monitoring programs.

  40. Adaptive Peer Sampling with Newscast

    NASA Astrophysics Data System (ADS)

    Tölgyesi, Norbert; Jelasity, Márk

    The peer sampling service is a middleware service that provides random samples from a large decentralized network to support gossip-based applications such as multicast, data aggregation and overlay topology management. Lightweight gossip-based implementations of the peer sampling service have been shown to provide good quality random sampling while also being extremely robust to many failure scenarios, including node churn and catastrophic failure. We identify two problems with these approaches. The first problem is related to message drop failures: if a node experiences a higher-than-average message drop rate then the probability of sampling this node in the network will decrease. The second problem is that the application layer at different nodes might request random samples at very different rates which can result in very poor random sampling especially at nodes with high request rates. We propose solutions for both problems. We focus on Newscast, a robust implementation of the peer sampling service. Our solution is based on simple extensions of the protocol and an adaptive self-control mechanism for its parameters, namely—without involving failure detectors—nodes passively monitor local protocol events using them as feedback for a local control loop for self-tuning the protocol parameters. The proposed solution is evaluated by simulation experiments.

  41. A design of experiments approach to validation sampling for logistic regression modeling with error-prone medical records.

    PubMed

    Ouyang, Liwen; Apley, Daniel W; Mehrotra, Sanjay

    2016-04-01

    Electronic medical record (EMR) databases offer significant potential for developing clinical hypotheses and identifying disease risk associations by fitting statistical models that capture the relationship between a binary response variable and a set of predictor variables that represent clinical, phenotypical, and demographic data for the patient. However, EMR response data may be error prone for a variety of reasons. Performing a manual chart review to validate data accuracy is time consuming, which limits the number of chart reviews in a large database. The authors' objective is to develop a new design-of-experiments-based systematic chart validation and review (DSCVR) approach that is more powerful than the random validation sampling used in existing approaches. The DSCVR approach judiciously and efficiently selects the cases to validate (i.e., validate whether the response values are correct for those cases) for maximum information content, based only on their predictor variable values. The final predictive model will be fit using only the validation sample, ignoring the remainder of the unvalidated and unreliable error-prone data. A Fisher information based D-optimality criterion is used, and an algorithm for optimizing it is developed. The authors' method is tested in a simulation comparison that is based on a sudden cardiac arrest case study with 23 041 patients' records. This DSCVR approach, using the Fisher information based D-optimality criterion, results in a fitted model with much better predictive performance, as measured by the receiver operating characteristic curve and the accuracy in predicting whether a patient will experience the event, than a model fitted using a random validation sample. The simulation comparisons demonstrate that this DSCVR approach can produce predictive models that are significantly better than those produced from random validation sampling, especially when the event rate is low. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  2. Increasing the Precision of Estimates in Follow-Up Surveys: A Case Study. AIR 1983 Annual Forum Paper.

    ERIC Educational Resources Information Center

    Clark, Sheldon B.; Nichols, James O.

    Survey data concerning teacher education program graduates were used to demonstrate the advantages of a stratified random sampling approach, with followup, relative to a one-shot mailing to an entire population. Sampling issues involved in such an approach are addressed, particularly with regard to quantifying the effects of nonresponse on the…

  3. A Comparison of Seventh Grade Thai Students' Reading Comprehension and Motivation to Read English through Applied Instruction Based on the Genre-Based Approach and the Teacher's Manual

    ERIC Educational Resources Information Center

    Sawangsamutchai, Yutthasak; Rattanavich, Saowalak

    2016-01-01

    The objective of this research is to compare the English reading comprehension and motivation to read of seventh grade Thai students taught with applied instruction through the genre-based approach and teachers' manual. A randomized pre-test post-test control group design was used through the cluster random sampling technique. The data were…

  4. An approach to pavement management in Virginia.

    DOT National Transportation Integrated Search

    1981-01-01

    The report summarizes the objectives and benefits of formal pavement management systems and outlines an approach believed by the author to be practical for Virginia. The management of Virginia interstate pavements and a proposed random-sampling plan ...

  5. Macrostructure from Microstructure: Generating Whole Systems from Ego Networks

    PubMed Central

    Smith, Jeffrey A.

    2014-01-01

    This paper presents a new simulation method to make global network inference from sampled data. The proposed simulation method takes sampled ego network data and uses Exponential Random Graph Models (ERGM) to reconstruct the features of the true, unknown network. After describing the method, the paper presents two validity checks of the approach: the first uses the 20 largest Add Health networks while the second uses the Sociology Coauthorship network in the 1990's. For each test, I take random ego network samples from the known networks and use my method to make global network inference. I find that my method successfully reproduces the properties of the networks, such as distance and main component size. The results also suggest that simpler, baseline models provide considerably worse estimates for most network properties. I end the paper by discussing the bounds/limitations of ego network sampling. I also discuss possible extensions to the proposed approach. PMID:25339783

  6. Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli.

    PubMed

    Westfall, Jacob; Kenny, David A; Judd, Charles M

    2014-10-01

    Researchers designing experiments in which a sample of participants responds to a sample of stimuli are faced with difficult questions about optimal study design. The conventional procedures of statistical power analysis fail to provide appropriate answers to these questions because they are based on statistical models in which stimuli are not assumed to be a source of random variation in the data, models that are inappropriate for experiments involving crossed random factors of participants and stimuli. In this article, we present new methods of power analysis for designs with crossed random factors, and we give detailed, practical guidance to psychology researchers planning experiments in which a sample of participants responds to a sample of stimuli. We extensively examine 5 commonly used experimental designs, describe how to estimate statistical power in each, and provide power analysis results based on a reasonable set of default parameter values. We then develop general conclusions and formulate rules of thumb concerning the optimal design of experiments in which a sample of participants responds to a sample of stimuli. We show that in crossed designs, statistical power typically does not approach unity as the number of participants goes to infinity but instead approaches a maximum attainable power value that is possibly small, depending on the stimulus sample. We also consider the statistical merits of designs involving multiple stimulus blocks. Finally, we provide a simple and flexible Web-based power application to aid researchers in planning studies with samples of stimuli.
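    The ceiling on power described above can be reproduced with a simple Monte Carlo sketch (not the authors' analytic method): both participants and stimuli contribute random condition-effect variation, and the analysis is a one-sample t-test on per-participant mean difference scores. The effect size and variance components below are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def power(n_part, n_stim, effect=0.3, sd_part=0.4, sd_stim=0.4, sd_resid=1.0,
          reps=500, alpha=0.05):
    """Monte Carlo power for a condition effect when participants and stimuli
    both contribute random condition-effect variation."""
    hits = 0
    for _ in range(reps):
        b_part = rng.normal(0, sd_part, n_part)            # participant-by-condition effects
        b_stim = rng.normal(0, sd_stim, n_stim)            # stimulus-by-condition effects
        noise = rng.normal(0, sd_resid, (n_part, n_stim))  # trial-level noise in the difference
        diff = effect + b_part[:, None] + b_stim[None, :] + noise
        t, p = stats.ttest_1samp(diff.mean(axis=1), 0.0)
        hits += (p < alpha)
    return hits / reps

# Power plateaus as participants grow while the stimulus sample stays fixed at 16.
for n_part in (20, 80, 320, 1280):
    print(n_part, power(n_part, n_stim=16))
```

    Because the stimulus-by-condition variation is shared by every participant, its contribution to the standard error does not shrink as participants are added, so the printed power values level off below 1.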

  7. A Multilevel, Hierarchical Sampling Technique for Spatially Correlated Random Fields

    DOE PAGES

    Osborn, Sarah; Vassilevski, Panayot S.; Villa, Umberto

    2017-10-26

    In this paper, we propose an alternative method to generate samples of a spatially correlated random field with applications to large-scale problems for forward propagation of uncertainty. A classical approach for generating these samples is the Karhunen--Loève (KL) decomposition. However, the KL expansion requires solving a dense eigenvalue problem and is therefore computationally infeasible for large-scale problems. Sampling methods based on stochastic partial differential equations provide a highly scalable way to sample Gaussian fields, but the resulting parametrization is mesh dependent. We propose a multilevel decomposition of the stochastic field to allow for scalable, hierarchical sampling based on solving a mixed finite element formulation of a stochastic reaction-diffusion equation with a random, white noise source function. Lastly, numerical experiments are presented to demonstrate the scalability of the sampling method as well as numerical results of multilevel Monte Carlo simulations for a subsurface porous media flow application using the proposed sampling method.

  8. A Multilevel, Hierarchical Sampling Technique for Spatially Correlated Random Fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osborn, Sarah; Vassilevski, Panayot S.; Villa, Umberto

    In this paper, we propose an alternative method to generate samples of a spatially correlated random field with applications to large-scale problems for forward propagation of uncertainty. A classical approach for generating these samples is the Karhunen--Loève (KL) decomposition. However, the KL expansion requires solving a dense eigenvalue problem and is therefore computationally infeasible for large-scale problems. Sampling methods based on stochastic partial differential equations provide a highly scalable way to sample Gaussian fields, but the resulting parametrization is mesh dependent. We propose a multilevel decomposition of the stochastic field to allow for scalable, hierarchical sampling based on solving a mixed finite element formulation of a stochastic reaction-diffusion equation with a random, white noise source function. Lastly, numerical experiments are presented to demonstrate the scalability of the sampling method as well as numerical results of multilevel Monte Carlo simulations for a subsurface porous media flow application using the proposed sampling method.

  9. Efficiency enhancement of optimized Latin hypercube sampling strategies: Application to Monte Carlo uncertainty analysis and meta-modeling

    NASA Astrophysics Data System (ADS)

    Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad; Janssen, Hans

    2015-02-01

    The majority of literature regarding optimized Latin hypercube sampling (OLHS) is devoted to increasing the efficiency of these sampling strategies through the development of new algorithms based on the combination of innovative space-filling criteria and specialized optimization schemes. However, little attention has been given to the impact of the initial design that is fed into the optimization algorithm, on the efficiency of OLHS strategies. Previous studies, as well as codes developed for OLHS, have relied on one of the following two approaches for the selection of the initial design in OLHS: (1) the use of random points in the hypercube intervals (random LHS), and (2) the use of midpoints in the hypercube intervals (midpoint LHS). Both approaches have been extensively used, but no attempt has been previously made to compare the efficiency and robustness of their resulting sample designs. In this study we compare the two approaches and show that the space-filling characteristics of OLHS designs are sensitive to the initial design that is fed into the optimization algorithm. It is also illustrated that the space-filling characteristics of OLHS designs based on midpoint LHS are significantly better than those based on random LHS. The two approaches are compared by incorporating their resulting sample designs in Monte Carlo simulation (MCS) for uncertainty propagation analysis, and then, by employing the sample designs in the selection of the training set for constructing non-intrusive polynomial chaos expansion (NIPCE) meta-models which subsequently replace the original full model in MCSs. The analysis is based on two case studies involving numerical simulation of density dependent flow and solute transport in porous media within the context of seawater intrusion in coastal aquifers. We show that the use of midpoint LHS as the initial design increases the efficiency and robustness of the resulting MCSs and NIPCE meta-models. The study also illustrates that this relative improvement decreases with increasing number of sample points and input parameter dimensions. Since the computational time and efforts for generating the sample designs in the two approaches are identical, the use of midpoint LHS as the initial design in OLHS is thus recommended.
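    The difference between the two initial designs can be illustrated without the optimization step: the sketch below builds random-point and midpoint Latin hypercube samples and compares a simple maximin (minimum pairwise distance) space-filling score. The sample size, dimension, and criterion are illustrative choices, not those of the study.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(42)

def lhs(n, d, midpoint=False):
    """Latin hypercube sample of n points in [0,1]^d.
    midpoint=True places points at interval midpoints; otherwise uniformly within intervals."""
    u = 0.5 * np.ones((n, d)) if midpoint else rng.uniform(size=(n, d))
    sample = np.empty((n, d))
    for j in range(d):
        perm = rng.permutation(n)           # one random stratum order per dimension
        sample[:, j] = (perm + u[:, j]) / n
    return sample

def maximin(points):
    """Minimum pairwise distance: larger means better space filling."""
    return pdist(points).min()

n, d = 30, 5
print("random-LHS maximin:  ", np.mean([maximin(lhs(n, d, midpoint=False)) for _ in range(50)]))
print("midpoint-LHS maximin:", np.mean([maximin(lhs(n, d, midpoint=True)) for _ in range(50)]))
```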

  10. Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations.

    PubMed

    NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel

    2017-08-01

    Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.

  11. Single-Phase Mail Survey Design for Rare Population Subgroups

    ERIC Educational Resources Information Center

    Brick, J. Michael; Andrews, William R.; Mathiowetz, Nancy A.

    2016-01-01

    Although using random digit dialing (RDD) telephone samples was the preferred method for conducting surveys of households for many years, declining response and coverage rates have led researchers to explore alternative approaches. The use of address-based sampling (ABS) has been examined for sampling the general population and subgroups, most…

  12. Sampling Methods in Cardiovascular Nursing Research: An Overview.

    PubMed

    Kandola, Damanpreet; Banner, Davina; O'Keefe-McCarthy, Sheila; Jassal, Debbie

    2014-01-01

    Cardiovascular nursing research covers a wide array of topics from health services to psychosocial patient experiences. The selection of specific participant samples is an important part of the research design and process. The sampling strategy employed is of utmost importance to ensure that a representative sample of participants is chosen. There are two main categories of sampling methods: probability and non-probability. Probability sampling is the random selection of elements from the population, where each element of the population has an equal and independent chance of being included in the sample. There are five main types of probability sampling including simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multi-stage sampling. Non-probability sampling methods are those in which elements are chosen through non-random methods for inclusion into the research study and include convenience sampling, purposive sampling, and snowball sampling. Each approach offers distinct advantages and disadvantages and must be considered critically. In this research column, we provide an introduction to these key sampling techniques and draw on examples from cardiovascular research. Understanding the differences in sampling techniques may aid nurses in effective appraisal of research literature and provide a reference point for nurses who engage in cardiovascular research.
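    The probability sampling designs listed above are straightforward to express in code. The following sketch draws simple random, systematic, stratified, and cluster samples from a synthetic population; the strata and cluster labels are hypothetical stand-ins for, say, clinical sites and hospital wards.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 1000
population = np.arange(N)
stratum = rng.integers(0, 4, N)     # e.g., four clinical sites
cluster = rng.integers(0, 50, N)    # e.g., 50 hospital wards
n = 100

# Simple random sampling: every element has the same inclusion probability.
srs = rng.choice(population, size=n, replace=False)

# Systematic sampling: a random start, then every k-th element.
k = N // n
start = rng.integers(0, k)
systematic = population[start::k]

# Stratified sampling: a proportional simple random sample within each stratum.
stratified = np.concatenate([
    rng.choice(population[stratum == s],
               size=int(round(n * (stratum == s).mean())), replace=False)
    for s in range(4)
])

# Cluster sampling: randomly select whole clusters and take everyone in them.
chosen_clusters = rng.choice(50, size=5, replace=False)
cluster_sample = population[np.isin(cluster, chosen_clusters)]

print(len(srs), len(systematic), len(stratified), len(cluster_sample))
```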

  13. Finite-sample corrected generalized estimating equation of population average treatment effects in stepped wedge cluster randomized trials.

    PubMed

    Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B

    2017-04-01

    Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo, and it is logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models that (1) the population-average parameters have an important interpretation for public health applications and (2) they avoid untestable assumptions on latent variable distributions and avoid parametric assumptions about error distributions, therefore, providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equation for stepped wedge cluster randomized trials and for parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.

  14. Scope of Various Random Number Generators in Ant System Approach for TSP

    NASA Technical Reports Server (NTRS)

    Sen, S. K.; Shaykhian, Gholam Ali

    2007-01-01

    Several quasi- and pseudo-random number generators are tested on a heuristic based on an ant system approach for the traveling salesman problem. The experiment explores whether any particular generator is most desirable. Such an experiment on large samples has the potential to rank the performance of the generators for the foregoing heuristic, and seeks an answer to the controversial performance ranking of the generators in a probabilistic/statistical sense.

  15. Inspection of care: Findings from an innovative demonstration

    PubMed Central

    Morris, John N.; Sherwood, Clarence C.; Dreyer, Paul

    1989-01-01

    In this article, information is presented concerning the efficacy of a sample-based approach to completing inspection of care reviews of Medicaid-supported nursing home residents. Massachusetts nursing homes were randomly assigned to full (the control group) or sample (the experimental group) review conditions. The primary research focus was to determine whether the proportion of facilities found to be deficient (based on quality of care and level of care criteria) in the experimental sample was comparable to the proportion in the control sample. The findings supported such a hypothesis: Deficient facilities appear to be equally identifiable using the random or full-sampling protocols, and the process can be completed with a considerable savings of surveyor time. PMID:10313458

  16. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach

    PubMed Central

    Hemming, Karla; Taljaard, Monica

    2016-01-01

    Objectives: To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study Design and Setting: We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results: For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion: Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario. PMID:26344808
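    The paper's new SW-CRT formulas are not reproduced here, but the familiar design effect for a parallel CRT, 1 + (m - 1) * ICC, already illustrates how the required number of clusters grows with cluster size and ICC. The sketch below applies that standard formula; the target of 200 participants per arm is an arbitrary example.

```python
import math

def crt_design_effect(m, icc):
    """Standard design effect for a parallel cluster randomized trial
    with cluster size m and intracluster correlation icc."""
    return 1.0 + (m - 1) * icc

def clusters_needed(n_individual, m, icc):
    """Clusters per arm needed to match an individually randomized trial
    that requires n_individual participants per arm."""
    return math.ceil(n_individual * crt_design_effect(m, icc) / m)

# Example: an individually randomized trial needs 200 participants per arm.
for icc in (0.01, 0.05, 0.10):
    print(f"ICC={icc:.2f}:", {m: clusters_needed(200, m, icc) for m in (10, 25, 50)})
```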

  17. An integrate-over-temperature approach for enhanced sampling.

    PubMed

    Gao, Yi Qin

    2008-02-14

    A simple method is introduced to achieve efficient random walking in the energy space in molecular dynamics simulations which thus enhances the sampling over a large energy range. The approach is closely related to multicanonical and replica exchange simulation methods in that it allows configurations of the system to be sampled in a wide energy range by making use of Boltzmann distribution functions at multiple temperatures. A biased potential is quickly generated using this method and is then used in accelerated molecular dynamics simulations.

  18. On the design of random metasurface based devices.

    PubMed

    Dupré, Matthieu; Hsu, Liyi; Kanté, Boubacar

    2018-05-08

    Metasurfaces are generally designed by placing scatterers in periodic or pseudo-periodic grids. We propose and discuss design rules for functional metasurfaces with randomly placed anisotropic elements that randomly sample a well-defined phase function. By analyzing the focusing performance of random metasurface lenses as a function of their density and the density of the phase-maps used to design them, we find that the performance of 1D metasurfaces is mostly governed by their density while 2D metasurfaces strongly depend on both the density and the near-field coupling configuration of the surface. The proposed approach is used to design all-polarization random metalenses at near infrared frequencies. Challenges, as well as opportunities of random metasurfaces compared to periodic ones are discussed. Our results pave the way to new approaches in the design of nanophotonic structures and devices from lenses to solar energy concentrators.

  19. A study of active learning methods for named entity recognition in clinical text.

    PubMed

    Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu; Denny, Joshua C; Xu, Hua

    2015-12-01

    Named entity recognition (NER), a sequential labeling task, is one of the fundamental tasks for building clinical natural language processing (NLP) systems. Machine learning (ML) based approaches can achieve good performance, but they often require large amounts of annotated samples, which are expensive to build due to the requirement of domain experts in annotation. Active learning (AL), a sample selection approach integrated with supervised ML, aims to minimize the annotation cost while maximizing the performance of ML-based models. In this study, our goal was to develop and evaluate both existing and new AL methods for a clinical NER task to identify concepts of medical problems, treatments, and lab tests from the clinical notes. Using the annotated NER corpus from the 2010 i2b2/VA NLP challenge that contained 349 clinical documents with 20,423 unique sentences, we simulated AL experiments using a number of existing and novel algorithms in three different categories including uncertainty-based, diversity-based, and baseline sampling strategies. They were compared with the passive learning that uses random sampling. Learning curves that plot performance of the NER model against the estimated annotation cost (based on number of sentences or words in the training set) were generated to evaluate different active learning and the passive learning methods and the area under the learning curve (ALC) score was computed. Based on the learning curves of F-measure vs. number of sentences, uncertainty sampling algorithms outperformed all other methods in ALC. Most diversity-based methods also performed better than random sampling in ALC. To achieve an F-measure of 0.80, the best method based on uncertainty sampling could save 66% annotations in sentences, as compared to random sampling. For the learning curves of F-measure vs. number of words, uncertainty sampling methods again outperformed all other methods in ALC. To achieve 0.80 in F-measure, in comparison to random sampling, the best uncertainty based method saved 42% annotations in words. But the best diversity based method reduced only 7% annotation effort. In the simulated setting, AL methods, particularly uncertainty-sampling based approaches, seemed to significantly save annotation cost for the clinical NER task. The actual benefit of active learning in clinical NER should be further evaluated in a real-time setting.
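    As a generic illustration of uncertainty sampling versus random sampling (a toy classifier on synthetic data, not the clinical NER models used in the study), the sketch below repeatedly queries the pool instances about which a logistic regression model is least confident.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X, y = make_classification(n_samples=2000, n_features=20, random_state=3)
labeled = list(rng.choice(len(X), size=20, replace=False))   # small seed set
unlabeled = [i for i in range(len(X)) if i not in set(labeled)]

for round_ in range(10):
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[unlabeled])
    # Least-confidence criterion: query the instances the model is least sure about.
    uncertainty = 1.0 - proba.max(axis=1)
    query = [unlabeled[i] for i in np.argsort(uncertainty)[-20:]]
    labeled += query
    unlabeled = [i for i in unlabeled if i not in set(query)]
    print(f"round {round_}: {len(labeled)} labeled, "
          f"accuracy on remaining pool = {model.score(X[unlabeled], y[unlabeled]):.3f}")
```

    Replacing the uncertainty criterion with a random draw from the pool gives the passive-learning baseline that the study compares against.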

  20. Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

    PubMed Central

    Liu, Bo; Chen, Sanfeng; Li, Shuai; Liang, Yongsheng

    2012-01-01

    In this paper a new framework, called Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making with uncertainty is proposed via incorporating the non-adaptive data-independent Random Projections and nonparametric Kernelized Least-squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensional data are projected onto a random lower-dimension subspace via spherically random rotation and coordination sampling. KLSPI introduces the kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a set of random basis vectors. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, while at lower computational cost. The theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD). Finally, simulation results are exhibited on benchmark MDP domains, which confirm gains both in computation time and in performance in large feature spaces. PMID:22736969
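    The random projection step can be illustrated in a few lines: high-dimensional feature vectors are multiplied by a random Gaussian matrix, which approximately preserves pairwise distances (the Johnson-Lindenstrauss property). The dimensions below are arbitrary, and the kernelized policy iteration itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(5)

def random_projection(features, target_dim):
    """Project high-dimensional features onto a random Gaussian subspace.
    Pairwise distances are approximately preserved (Johnson-Lindenstrauss)."""
    d = features.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(target_dim), size=(d, target_dim))
    return features @ R

phi = rng.normal(size=(1000, 5000))        # e.g., high-dimensional state features
phi_low = random_projection(phi, 200)

# Check distance preservation on one pair of feature vectors.
i, j = 3, 17
print(np.linalg.norm(phi[i] - phi[j]), np.linalg.norm(phi_low[i] - phi_low[j]))
```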

  1. Unbiased feature selection in learning random forests for high-dimensional data.

    PubMed

    Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi

    2015-01-01

    Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.

  2. Latent spatial models and sampling design for landscape genetics

    USGS Publications Warehouse

    Hanks, Ephraim M.; Hooten, Mevin B.; Knick, Steven T.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Cross, Todd B.; Schwartz, Michael K.

    2016-01-01

    We propose a spatially-explicit approach for modeling genetic variation across space and illustrate how this approach can be used to optimize spatial prediction and sampling design for landscape genetic data. We propose a multinomial data model for categorical microsatellite allele data commonly used in landscape genetic studies, and introduce a latent spatial random effect to allow for spatial correlation between genetic observations. We illustrate how modern dimension reduction approaches to spatial statistics can allow for efficient computation in landscape genetic statistical models covering large spatial domains. We apply our approach to propose a retrospective spatial sampling design for greater sage-grouse (Centrocercus urophasianus) population genetics in the western United States.

  3. Accounting for selection bias in association studies with complex survey data.

    PubMed

    Wirth, Kathleen E; Tchetgen Tchetgen, Eric J

    2014-05-01

    Obtaining representative information from hidden and hard-to-reach populations is fundamental to describe the epidemiology of many sexually transmitted diseases, including HIV. Unfortunately, simple random sampling is impractical in these settings, as no registry of names exists from which to sample the population at random. However, complex sampling designs can be used, as members of these populations tend to congregate at known locations, which can be enumerated and sampled at random. For example, female sex workers may be found at brothels and street corners, whereas injection drug users often come together at shooting galleries. Despite the logistical appeal, complex sampling schemes lead to unequal probabilities of selection, and failure to account for this differential selection can result in biased estimates of population averages and relative risks. However, standard techniques to account for selection can lead to substantial losses in efficiency. Consequently, researchers implement a variety of strategies in an effort to balance validity and efficiency. Some researchers fully or partially account for the survey design, whereas others do nothing and treat the sample as a realization of the population of interest. We use directed acyclic graphs to show how certain survey sampling designs, combined with subject-matter considerations unique to individual exposure-outcome associations, can induce selection bias. Finally, we present a novel yet simple maximum likelihood approach for analyzing complex survey data; this approach optimizes statistical efficiency at no cost to validity. We use simulated data to illustrate this method and compare it with other analytic techniques.
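    The sketch below is not the authors' maximum likelihood estimator, but it illustrates the underlying problem and the standard inverse-probability (design-weighted) correction: when selection probabilities depend on the outcome, the unweighted sample mean is biased, while the weighted estimate recovers the population value. The prevalence and selection probabilities are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(11)
N = 100_000
outcome = rng.binomial(1, 0.15, N)                   # e.g., true prevalence of 15%
# Selection probability depends on the outcome (e.g., venue attendance differs).
p_select = np.where(outcome == 1, 0.02, 0.005)
sampled = rng.random(N) < p_select

y = outcome[sampled]
w = 1.0 / p_select[sampled]                          # design (inverse-probability) weights

print("true prevalence:       ", outcome.mean())
print("unweighted estimate:   ", y.mean())           # biased upward
print("weighted (IPW) estimate:", np.sum(w * y) / np.sum(w))
```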

  4. Cluster-randomized Studies in Educational Research: Principles and Methodological Aspects.

    PubMed

    Dreyhaupt, Jens; Mayer, Benjamin; Keis, Oliver; Öchsner, Wolfgang; Muche, Rainer

    2017-01-01

    An increasing number of studies are being performed in educational research to evaluate new teaching methods and approaches. These studies could be performed more efficiently and deliver more convincing results if they more strictly applied and complied with recognized standards of scientific studies. Such an approach could substantially increase the quality in particular of prospective, two-arm (intervention) studies that aim to compare two different teaching methods. A key standard in such studies is randomization, which can minimize systematic bias in study findings; such bias may result if the two study arms are not structurally equivalent. If possible, educational research studies should also achieve this standard, although this is not yet generally the case. Some difficulties and concerns exist, particularly regarding organizational and methodological aspects. An important point to consider in educational research studies is that usually individuals cannot be randomized, because of the teaching situation, and instead whole groups have to be randomized (so-called "cluster randomization"). Compared with studies with individual randomization, studies with cluster randomization normally require (significantly) larger sample sizes and more complex methods for calculating sample size. Furthermore, cluster-randomized studies require more complex methods for statistical analysis. The consequence of the above is that a competent expert with respective special knowledge needs to be involved in all phases of cluster-randomized studies. Studies to evaluate new teaching methods need to make greater use of randomization in order to achieve scientifically convincing results. Therefore, in this article we describe the general principles of cluster randomization and how to implement these principles, and we also outline practical aspects of using cluster randomization in prospective, two-arm comparative educational research studies.

  5. Rationale and design of the research project of the South Florida Center for the Reduction of Cancer Health Disparities (SUCCESS): study protocol for a randomized controlled trial.

    PubMed

    Carrasquillo, Olveen; McCann, Sheila; Amofah, Antony; Pierre, Larry; Rodriguez, Brendaly; Alonzo, Yisel; Ilangovan, Kumar; Gonzalez, Martha; Trevil, Dinah; Byrne, Margaret M; Koru-Sengul, Tulay; Kobetz, Erin

    2014-07-23

    In the United States certain minority groups, such as racial/ethnic immigrant women, are less likely than non-Hispanic White women to be screened for cervical cancer. Barriers to such care include health insurance, cost, knowledge, attitudes, health literacy, and cultural norms and practices. Among the most promising approaches to increase screening in these groups are patient navigators that can link women to sources of appropriate care. Another recent promising approach is using human papilloma virus (HPV) self-sampling. In this manuscript, we describe our National Cancer Institute-sponsored study testing such approaches among immigrant minority women. The South Florida Center for the Reduction of Cancer Health Disparities (SUCCESS) is conducting a three-arm randomized trial among Hispanic, Haitian, and African American women in Miami-Dade County. Community health workers (CHW) based in each of three communities are recruiting 200 women at each site (600 total). Eligibility criteria include women aged 30-65 years who have not had a Pap smear test in the last 3 years. Prior to randomization, all women undergo a standardized structured interview. Women randomized to public health outreach, Group 1, receive culturally tailored educational materials. Women in Group 2 receive an individualized comprehensive cervical cancer CHW-led education session followed by patient navigation to obtain the Pap smear test at community-based facilities. Women in Group 3 have the option of navigation to a Pap smear test or performing HPV self-sampling. The primary outcome is self-report of completed screening through a Pap smear test or HPV self-sampling within 6 months after enrollment. SUCCESS is one of the first trials testing HPV self-sampling as a screening strategy among underserved minority women. If successful, HPV self-sampling may be an important option in community outreach programs aimed at reducing disparities in cervical cancer. Clinical Trials.gov # NCT02121548, registered April 21, 2014.

  6. Balancing a U-Shaped Assembly Line by Applying Nested Partitions Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhagwat, Nikhil V.

    2005-01-01

    In this study, we applied the Nested Partitions method to a U-line balancing problem and conducted experiments to evaluate the application. From the results, it is quite evident that the Nested Partitions method provided near optimal solutions (optimal in some cases). Besides, the execution time is quite short as compared to the Branch and Bound algorithm. However, for larger data sets, the algorithm took significantly longer times for execution. One of the reasons could be the way in which the random samples are generated. In the present study, a random sample is a solution in itself which requires assignment of tasks to various stations. The time taken to assign tasks to stations is directly proportional to the number of tasks. Thus, if the number of tasks increases, the time taken to generate random samples for the different regions also increases. The performance index for the Nested Partitions method in the present study was the number of stations in the random solutions (samples) generated. The total idle time for the samples can be used as another performance index. The ULINO method is known to have used a combination of bounds to come up with good solutions. This approach of combining different performance indices can be used to evaluate the random samples and obtain even better solutions. Here, we used deterministic time values for the tasks. In industries where the majority of tasks are performed manually, the stochastic version of the problem could be of vital importance. Experimenting with different objective functions (the number of stations was used in this study) could be of some significance to industries wherein the cost associated with creation of a new station is not the same. For such industries, the results obtained by using the present approach will not be of much value. Labor costs, task incompletion costs or a combination of those can be effectively used as alternate objective functions.

  7. A sub-sampled approach to extremely low-dose STEM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stevens, A.; Luzi, L.; Yang, H.

    The inpainting of randomly sub-sampled images acquired by scanning transmission electron microscopy (STEM) is an attractive method for imaging under low-dose conditions (≤ 1 e⁻ Å⁻²) without changing either the operation of the microscope or the physics of the imaging process. We show that 1) adaptive sub-sampling increases acquisition speed, resolution, and sensitivity; and 2) random (non-adaptive) sub-sampling is equivalent to, but faster than, traditional low-dose techniques. Adaptive sub-sampling opens numerous possibilities for the analysis of beam sensitive materials and in-situ dynamic processes at the resolution limit of the aberration corrected microscope and is demonstrated here for the analysis of the node distribution in metal-organic frameworks (MOFs).
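    A toy version of random sub-sampling and reconstruction (simple interpolation rather than the compressive-sensing inpainting used for STEM) can convey the idea: acquire a random 10% of the pixels of a smooth image and fill in the rest. The image, sampling fraction, and interpolation method are assumptions made for illustration only.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(2)

# Toy "image": a smooth 2-D signal on a 128x128 grid.
x, y = np.meshgrid(np.linspace(0, 1, 128), np.linspace(0, 1, 128))
image = np.sin(6 * np.pi * x) * np.cos(4 * np.pi * y)

# Random sub-sampling: acquire only 10% of the pixels (roughly 10x lower dose).
mask = rng.random(image.shape) < 0.10
coords = np.argwhere(mask)
values = image[mask]

# "Inpaint" the missing pixels by interpolating from the acquired ones.
grid = np.argwhere(np.ones_like(mask, dtype=bool))
recon = griddata(coords, values, grid, method="cubic").reshape(image.shape)
recon = np.where(np.isnan(recon), 0.0, recon)        # edge pixels outside the convex hull

print("fraction sampled:", mask.mean())
print("reconstruction RMSE:", np.sqrt(np.mean((recon - image) ** 2)))
```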

  8. Sampling considerations for disease surveillance in wildlife populations

    USGS Publications Warehouse

    Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.

    2008-01-01

    Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.

  9. Random whole metagenomic sequencing for forensic discrimination of soils.

    PubMed

    Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.

  10. Collaborative Indoor Access Point Localization Using Autonomous Mobile Robot Swarm.

    PubMed

    Awad, Fahed; Naserllah, Muhammad; Omar, Ammar; Abu-Hantash, Alaa; Al-Taj, Abrar

    2018-01-31

    Localization of access points has become an important research problem due to the wide range of applications it addresses such as dismantling critical security threats caused by rogue access points or optimizing wireless coverage of access points within a service area. Existing proposed solutions have mostly relied on theoretical hypotheses or computer simulation to demonstrate the efficiency of their methods. The techniques that rely on estimating the distance using samples of the received signal strength usually assume prior knowledge of the signal propagation characteristics of the indoor environment in hand and tend to take a relatively large number of uniformly distributed random samples. This paper presents an efficient and practical collaborative approach to detect the location of an access point in an indoor environment without any prior knowledge of the environment. The proposed approach comprises a swarm of wirelessly connected mobile robots that collaboratively and autonomously collect a relatively small number of non-uniformly distributed random samples of the access point's received signal strength. These samples are used to efficiently and accurately estimate the location of the access point. The experimental testing verified that the proposed approach can identify the location of the access point in an accurate and efficient manner.

  11. Collaborative Indoor Access Point Localization Using Autonomous Mobile Robot Swarm

    PubMed Central

    Awad, Fahed; Naserllah, Muhammad; Omar, Ammar; Abu-Hantash, Alaa; Al-Taj, Abrar

    2018-01-01

    Localization of access points has become an important research problem due to the wide range of applications it addresses such as dismantling critical security threats caused by rogue access points or optimizing wireless coverage of access points within a service area. Existing proposed solutions have mostly relied on theoretical hypotheses or computer simulation to demonstrate the efficiency of their methods. The techniques that rely on estimating the distance using samples of the received signal strength usually assume prior knowledge of the signal propagation characteristics of the indoor environment in hand and tend to take a relatively large number of uniformly distributed random samples. This paper presents an efficient and practical collaborative approach to detect the location of an access point in an indoor environment without any prior knowledge of the environment. The proposed approach comprises a swarm of wirelessly connected mobile robots that collaboratively and autonomously collect a relatively small number of non-uniformly distributed random samples of the access point’s received signal strength. These samples are used to efficiently and accurately estimate the location of the access point. The experimental testing verified that the proposed approach can identify the location of the access point in an accurate and efficient manner. PMID:29385042

  12. A statistical approach to selecting and confirming validation targets in -omics experiments

    PubMed Central

    2012-01-01

    Background: Genomic technologies are, by their very nature, designed for hypothesis generation. In some cases, the hypotheses that are generated require that genome scientists confirm findings about specific genes or proteins. But one major advantage of high-throughput technology is that global genetic, genomic, transcriptomic, and proteomic behaviors can be observed. Manual confirmation of every statistically significant genomic result is prohibitively expensive. This has led researchers in genomics to adopt the strategy of confirming only a handful of the most statistically significant results, a small subset chosen for biological interest, or a small random subset. But there is no standard approach for selecting and quantitatively evaluating validation targets. Results: Here we present a new statistical method and approach for statistically validating lists of significant results based on confirming only a small random sample. We apply our statistical method to show that the usual practice of confirming only the most statistically significant results does not statistically validate result lists. We analyze an extensively validated RNA-sequencing experiment to show that confirming a random subset can statistically validate entire lists of significant results. Finally, we analyze multiple publicly available microarray experiments to show that statistically validating random samples can both (i) provide evidence to confirm long gene lists and (ii) save thousands of dollars and hundreds of hours of labor over manual validation of each significant result. Conclusions: For high-throughput -omics studies, statistical validation is a cost-effective and statistically valid approach to confirming lists of significant results. PMID:22738145
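    A minimal version of the validate-a-random-subsample idea can be written with a standard binomial interval (this is only an illustration, not the paper's method): validate a random subset of the significant results, form a Clopper-Pearson confidence interval for the confirmed proportion, and scale it to the full list. The counts below are invented.

```python
from scipy.stats import beta

def clopper_pearson(successes, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided confidence interval for a proportion."""
    lo = 0.0 if successes == 0 else beta.ppf(alpha / 2, successes, n - successes + 1)
    hi = 1.0 if successes == n else beta.ppf(1 - alpha / 2, successes + 1, n - successes)
    return lo, hi

# Suppose 1,200 results were significant and a random subsample of 40 was
# manually validated, of which 36 were confirmed.
n_list, n_checked, n_confirmed = 1200, 40, 36
lo, hi = clopper_pearson(n_confirmed, n_checked)
print(f"confirmed proportion: {n_confirmed / n_checked:.2f} (95% CI {lo:.2f}-{hi:.2f})")
print(f"implied confirmed results in the full list: {lo * n_list:.0f} to {hi * n_list:.0f}")
```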

  13. A distance limited method for sampling downed coarse woody debris

    Treesearch

    Jeffrey H. Gove; Mark J. Ducey; Harry T. Valentine; Michael S. Williams

    2012-01-01

    A new sampling method for down coarse woody debris is proposed based on limiting the perpendicular distance from individual pieces to a randomly chosen sample point. Two approaches are presented that allow different protocols to be used to determine field measurements; estimators for each protocol are also developed. Both protocols are compared via simulation against...

  14. A weighted ℓ1-minimization approach for sparse polynomial chaos expansions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peng, Ji; Hampton, Jerrad; Doostan, Alireza, E-mail: alireza.doostan@colorado.edu

    2014-06-15

    This work proposes a method for sparse polynomial chaos (PC) approximation of high-dimensional stochastic functions based on non-adapted random sampling. We modify the standard ℓ1-minimization algorithm, originally proposed in the context of compressive sampling, using a priori information about the decay of the PC coefficients, when available, and refer to the resulting algorithm as weighted ℓ1-minimization. We provide conditions under which we may guarantee recovery using this weighted scheme. Numerical tests are used to compare the weighted and non-weighted methods for the recovery of solutions to two differential equations with high-dimensional random inputs: a boundary value problem with a random elliptic operator and a 2-D thermally driven cavity flow with random boundary condition.

  15. A probabilistic bridge safety evaluation against floods.

    PubMed

    Liao, Kuo-Wei; Muto, Yasunori; Chen, Wei-Lun; Wu, Bang-Ho

    2016-01-01

    To further capture the influences of uncertain factors on river bridge safety evaluation, a probabilistic approach is adopted. Because this is a systematic and nonlinear problem, MPP-based reliability analyses are not suitable. A sampling approach such as a Monte Carlo simulation (MCS) or importance sampling is often adopted. To enhance the efficiency of the sampling approach, this study utilizes Bayesian least squares support vector machines to construct a response surface followed by an MCS, providing a more precise safety index. Although there are several factors impacting the flood-resistant reliability of a bridge, previous experiences and studies show that the reliability of the bridge itself plays a key role. Thus, the goal of this study is to analyze the system reliability of a selected bridge that includes five limit states. The random variables considered here include the water surface elevation, water velocity, local scour depth, soil property and wind load. Because the first three variables are deeply affected by river hydraulics, a probabilistic HEC-RAS-based simulation is performed to capture the uncertainties in those random variables. The accuracy and variation of our solutions are confirmed by a direct MCS to ensure the applicability of the proposed approach. The results of a numerical example indicate that the proposed approach can efficiently provide an accurate bridge safety evaluation and maintain satisfactory variation.
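    Setting aside the response-surface and HEC-RAS components, the core Monte Carlo step can be sketched as follows: draw the random variables, evaluate a limit-state function g, and estimate the failure probability as the fraction of samples with g <= 0. The distributions and limit state below are hypothetical, not those of the studied bridge.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 200_000

# Hypothetical random variables (illustrative distributions only).
foundation_depth = rng.normal(6.0, 0.5, N)                          # m, pile embedment
local_scour = rng.lognormal(mean=np.log(3.0), sigma=0.35, size=N)   # m
flow_factor = rng.normal(1.0, 0.1, N)                               # flow amplification

# Limit state: g <= 0 means failure (amplified scour exceeds the embedment margin).
g = foundation_depth - flow_factor * local_scour - 1.0              # 1.0 m safety margin

pf = np.mean(g <= 0)
se = np.sqrt(pf * (1 - pf) / N)
print(f"estimated failure probability: {pf:.4f} +/- {1.96 * se:.4f}")
```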

  16. VNIR hyperspectral background characterization methods in adverse weather conditions

    NASA Astrophysics Data System (ADS)

    Romano, João M.; Rosario, Dalton; Roth, Luz

    2009-05-01

    Hyperspectral technology is currently being used by the military to detect regions of interest where potential targets may be located. Weather variability, however, may affect the ability of an algorithm to discriminate possible targets from background clutter. Nonetheless, different background characterization approaches may facilitate the ability of an algorithm to discriminate potential targets over a variety of weather conditions. In a previous paper, we introduced a new autonomous, target-size-invariant background characterization process, the Autonomous Background Characterization (ABC), also known as the Parallel Random Sampling (PRS) method, which features a random sampling stage; a parallel process to mitigate the inclusion by chance of target samples into clutter background classes during random sampling; and a fusion of results at the end. In this paper, we demonstrate how different background characterization approaches are able to improve the performance of algorithms over a variety of challenging weather conditions. Using the Mahalanobis distance as the standard algorithm for this study, we compare the performance of different characterization methods, such as the global information, 2-stage global information, and our proposed method, ABC, using data that were collected under a variety of adverse weather conditions. For this study, we used ARDEC's Hyperspectral VNIR Adverse Weather data collection, comprising heavy, light, and transitional fog, light and heavy rain, and low-light conditions.

  17. Self-reference and random sampling approach for label-free identification of DNA composition using plasmonic nanomaterials.

    PubMed

    Freeman, Lindsay M; Pang, Lin; Fainman, Yeshaiahu

    2018-05-09

    The analysis of DNA has led to revolutionary advancements in the fields of medical diagnostics, genomics, prenatal screening, and forensic science, with the global DNA testing market expected to reach revenues of USD 10.04 billion per year by 2020. However, the current methods for DNA analysis remain dependent on the necessity for fluorophores or conjugated proteins, leading to high costs associated with consumable materials and manual labor. Here, we demonstrate a potential label-free DNA composition detection method using surface-enhanced Raman spectroscopy (SERS) in which we identify the composition of cytosine and adenine within single strands of DNA. This approach depends on the fact that there is one phosphate backbone per nucleotide, which we use as a reference to compensate for systematic measurement variations. We utilize plasmonic nanomaterials with random Raman sampling to perform label-free detection of the nucleotide composition within DNA strands, generating a calibration curve from standard samples of DNA and demonstrating the capability of resolving the nucleotide composition. The work represents an innovative way for detection of the DNA composition within DNA strands without the necessity of attached labels, offering a highly sensitive and reproducible method that factors in random sampling to minimize error.

  18. Cluster-randomized Studies in Educational Research: Principles and Methodological Aspects

    PubMed Central

    Dreyhaupt, Jens; Mayer, Benjamin; Keis, Oliver; Öchsner, Wolfgang; Muche, Rainer

    2017-01-01

    An increasing number of studies are being performed in educational research to evaluate new teaching methods and approaches. These studies could be performed more efficiently and deliver more convincing results if they more strictly applied and complied with recognized standards of scientific studies. Such an approach could substantially increase the quality in particular of prospective, two-arm (intervention) studies that aim to compare two different teaching methods. A key standard in such studies is randomization, which can minimize systematic bias in study findings; such bias may result if the two study arms are not structurally equivalent. If possible, educational research studies should also achieve this standard, although this is not yet generally the case. Some difficulties and concerns exist, particularly regarding organizational and methodological aspects. An important point to consider in educational research studies is that usually individuals cannot be randomized, because of the teaching situation, and instead whole groups have to be randomized (so-called “cluster randomization”). Compared with studies with individual randomization, studies with cluster randomization normally require (significantly) larger sample sizes and more complex methods for calculating sample size. Furthermore, cluster-randomized studies require more complex methods for statistical analysis. The consequence of the above is that a competent expert with respective special knowledge needs to be involved in all phases of cluster-randomized studies. Studies to evaluate new teaching methods need to make greater use of randomization in order to achieve scientifically convincing results. Therefore, in this article we describe the general principles of cluster randomization and how to implement these principles, and we also outline practical aspects of using cluster randomization in prospective, two-arm comparative educational research studies. PMID:28584874

  19. Sparsely sampling the sky: Regular vs. random sampling

    NASA Astrophysics Data System (ADS)

    Paykari, P.; Pires, S.; Starck, J.-L.; Jaffe, A. H.

    2015-09-01

    Aims: The next generation of galaxy surveys, aiming to observe millions of galaxies, are expensive both in time and money. This raises questions regarding the optimal investment of this time and money for future surveys. In a previous work, we have shown that a sparse sampling strategy could be a powerful substitute for the - usually favoured - contiguous observation of the sky. In our previous paper, regular sparse sampling was investigated, where the sparse observed patches were regularly distributed on the sky. The regularity of the mask introduces a periodic pattern in the window function, which induces periodic correlations at specific scales. Methods: In this paper, we use a Bayesian experimental design to investigate a "random" sparse sampling approach, where the observed patches are randomly distributed over the total sparsely sampled area. Results: We find that in this setting, the induced correlation is evenly distributed amongst all scales as there is no preferred scale in the window function. Conclusions: This is desirable when we are interested in any specific scale in the galaxy power spectrum, such as the matter-radiation equality scale. As the figure of merit shows, however, there is no preference between regular and random sampling to constrain the overall galaxy power spectrum and the cosmological parameters.

  20. Hyperspectral analysis of clay minerals

    NASA Astrophysics Data System (ADS)

    Janaki Rama Suresh, G.; Sreenivas, K.; Sivasamy, R.

    2014-11-01

    A study was carried out by collecting soil samples from parts of Gwalior and Shivpuri district, Madhya Pradesh in order to assess the dominant clay mineral of these soils using hyperspectral data, as 0.4 to 2.5 μm spectral range provides abundant and unique information about many important earth-surface minerals. Understanding the spectral response along with the soil chemical properties can provide important clues for retrieval of mineralogical soil properties. The soil samples were collected based on stratified random sampling approach and dominant clay minerals were identified through XRD analysis. The absorption feature parameters like depth, width, area and asymmetry of the absorption peaks were derived from spectral profile of soil samples through DISPEC tool. The derived absorption feature parameters were used as inputs for modelling the dominant soil clay mineral present in the unknown samples using Random forest approach which resulted in kappa accuracy of 0.795. Besides, an attempt was made to classify the Hyperion data using Spectral Angle Mapper (SAM) algorithm with an overall accuracy of 68.43 %. Results showed that kaolinite was the dominant mineral present in the soils followed by montmorillonite in the study area.

  1. Indirect estimation of signal-dependent noise with nonadaptive heterogeneous samples.

    PubMed

    Azzari, Lucio; Foi, Alessandro

    2014-08-01

    We consider the estimation of signal-dependent noise from a single image. Unlike conventional algorithms that build a scatterplot of local mean-variance pairs from either small or adaptively selected homogeneous data samples, our proposed approach relies on arbitrarily large patches of heterogeneous data extracted at random from the image. We demonstrate the feasibility of our approach through an extensive theoretical analysis based on a mixture of Gaussian distributions. A prototype algorithm is also developed in order to validate the approach on simulated data as well as on real camera raw images.

  2. Sampling designs for HIV molecular epidemiology with application to Honduras.

    PubMed

    Shepherd, Bryan E; Rossini, Anthony J; Soto, Ramon Jeremias; De Rivera, Ivette Lorenzana; Mullins, James I

    2005-11-01

    Proper sampling is essential to characterize the molecular epidemiology of human immunodeficiency virus (HIV). HIV sampling frames are difficult to identify, so most studies use convenience samples. We discuss statistically valid and feasible sampling techniques that overcome some of the potential for bias due to convenience sampling and ensure better representation of the study population. We employ a sampling design called stratified cluster sampling. This first divides the population into geographical and/or social strata. Within each stratum, a population of clusters is chosen from groups, locations, or facilities where HIV-positive individuals might be found. Some clusters are randomly selected within strata and individuals are randomly selected within clusters. Variation and cost help determine the number of clusters and the number of individuals within clusters that are to be sampled. We illustrate the approach through a study designed to survey the heterogeneity of subtype B strains in Honduras.
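
    The two-stage design described above can be sketched in a few lines. The strata, cluster names, and sample sizes below are hypothetical; the point is only to show random selection of clusters within strata and of individuals within clusters.

    ```python
    # Illustrative two-stage stratified cluster sampling (hypothetical population).
    import random

    random.seed(42)

    # stratum -> {cluster name -> list of individual IDs}
    population = {
        "north": {f"clinic_{i}": [f"N{i}_{j}" for j in range(50)] for i in range(8)},
        "south": {f"clinic_{i}": [f"S{i}_{j}" for j in range(50)] for i in range(12)},
    }

    clusters_per_stratum = 3      # first stage: random clusters within each stratum
    individuals_per_cluster = 10  # second stage: random individuals within each cluster

    sample = []
    for stratum, clusters in population.items():
        chosen_clusters = random.sample(sorted(clusters), clusters_per_stratum)
        for cluster in chosen_clusters:
            sample.extend(random.sample(clusters[cluster], individuals_per_cluster))

    print(len(sample), "individuals sampled across", len(population), "strata")
    ```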

  3. TemperSAT: A new efficient fair-sampling random k-SAT solver

    NASA Astrophysics Data System (ADS)

    Fang, Chao; Zhu, Zheng; Katzgraber, Helmut G.

    The set membership problem is of great importance to many applications and, in particular, database searches for target groups. Recently, an approach to speed up set membership searches based on the NP-hard constraint-satisfaction problem (random k-SAT) has been developed. However, the bottleneck of the approach lies in finding the solution to a large SAT formula efficiently and, in particular, a large number of independent solutions is needed to reduce the probability of false positives. Unfortunately, traditional random k-SAT solvers such as WalkSAT are biased when seeking solutions to the Boolean formulas. By porting parallel tempering Monte Carlo to the sampling of binary optimization problems, we introduce a new algorithm (TemperSAT) whose performance is comparable to current state-of-the-art SAT solvers for large k with the added benefit that theoretically it can find many independent solutions quickly. We illustrate our results by comparing to the currently fastest implementation of WalkSAT, WalkSATlm.

  4. Individualizing drug dosage with longitudinal data.

    PubMed

    Zhu, Xiaolu; Qu, Annie

    2016-10-30

    We propose a two-step procedure to personalize drug dosage over time under the framework of a log-linear mixed-effect model. We model patients' heterogeneity using subject-specific random effects, which are treated as the realizations of an unspecified stochastic process. We extend the conditional quadratic inference function to estimate both fixed-effect coefficients and individual random effects on a longitudinal training data sample in the first step and propose an adaptive procedure to estimate new patients' random effects and provide dosage recommendations for new patients in the second step. An advantage of our approach is that we do not impose any distribution assumption on estimating random effects. Moreover, the new approach can accommodate more general time-varying covariates corresponding to random effects. We show in theory and numerical studies that the proposed method is more efficient compared with existing approaches, especially when covariates are time varying. In addition, a real data example of a clozapine study confirms that our two-step procedure leads to more accurate drug dosage recommendations. Copyright © 2016 John Wiley & Sons, Ltd.

  5. Network Sampling with Memory: A proposal for more efficient sampling from social networks.

    PubMed

    Mouw, Ted; Verdery, Ashton M

    2012-08-01

    Techniques for sampling from networks have grown into an important area of research across several fields. For sociologists, the possibility of sampling from a network is appealing for two reasons: (1) A network sample can yield substantively interesting data about network structures and social interactions, and (2) it is useful in situations where study populations are difficult or impossible to survey with traditional sampling approaches because of the lack of a sampling frame. Despite its appeal, methodological concerns about the precision and accuracy of network-based sampling methods remain. In particular, recent research has shown that sampling from a network using a random walk based approach such as Respondent Driven Sampling (RDS) can result in high design effects (DE)-the ratio of the sampling variance to the sampling variance of simple random sampling (SRS). A high design effect means that more cases must be collected to achieve the same level of precision as SRS. In this paper we propose an alternative strategy, Network Sampling with Memory (NSM), which collects network data from respondents in order to reduce design effects and, correspondingly, the number of interviews needed to achieve a given level of statistical power. NSM combines a "List" mode, where all individuals on the revealed network list are sampled with the same cumulative probability, with a "Search" mode, which gives priority to bridge nodes connecting the current sample to unexplored parts of the network. We test the relative efficiency of NSM compared to RDS and SRS on 162 school and university networks from Add Health and Facebook that range in size from 110 to 16,278 nodes. The results show that the average design effect for NSM on these 162 networks is 1.16, which is very close to the efficiency of a simple random sample (DE=1), and 98.5% lower than the average DE we observed for RDS.
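
    A much-simplified sketch of the List/Search idea is given below. It is not the authors' NSM algorithm; the toy graph, the alternating mode schedule, and the bridge-node heuristic are illustrative assumptions only.

    ```python
    # Simplified illustration of alternating "list" and "search" modes on a
    # hypothetical network; not the published NSM procedure.
    import random

    random.seed(1)

    graph = {  # adjacency list of a small hypothetical network
        0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4, 5],
        4: [3, 5, 6], 5: [3, 4], 6: [4, 7, 8], 7: [6, 8], 8: [6, 7],
    }

    def nsm_like_sample(graph, seed_node, n, search_every=2):
        interviewed = []        # nodes already sampled/interviewed
        listed = {seed_node}    # revealed by respondents but not yet interviewed
        while listed and len(interviewed) < n:
            if len(interviewed) % search_every == 1:
                # "search" mode: favour bridge-like nodes with many unrevealed neighbours
                node = max(listed, key=lambda v: sum(u not in listed and u not in interviewed
                                                     for u in graph[v]))
            else:
                # "list" mode: uniform draw from the revealed list
                node = random.choice(sorted(listed))
            interviewed.append(node)
            listed.discard(node)
            listed.update(u for u in graph[node] if u not in interviewed)
        return interviewed

    print(nsm_like_sample(graph, seed_node=0, n=6))
    ```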

  6. Network Sampling with Memory: A proposal for more efficient sampling from social networks

    PubMed Central

    Mouw, Ted; Verdery, Ashton M.

    2013-01-01

    Techniques for sampling from networks have grown into an important area of research across several fields. For sociologists, the possibility of sampling from a network is appealing for two reasons: (1) A network sample can yield substantively interesting data about network structures and social interactions, and (2) it is useful in situations where study populations are difficult or impossible to survey with traditional sampling approaches because of the lack of a sampling frame. Despite its appeal, methodological concerns about the precision and accuracy of network-based sampling methods remain. In particular, recent research has shown that sampling from a network using a random walk based approach such as Respondent Driven Sampling (RDS) can result in high design effects (DE)—the ratio of the sampling variance to the sampling variance of simple random sampling (SRS). A high design effect means that more cases must be collected to achieve the same level of precision as SRS. In this paper we propose an alternative strategy, Network Sampling with Memory (NSM), which collects network data from respondents in order to reduce design effects and, correspondingly, the number of interviews needed to achieve a given level of statistical power. NSM combines a “List” mode, where all individuals on the revealed network list are sampled with the same cumulative probability, with a “Search” mode, which gives priority to bridge nodes connecting the current sample to unexplored parts of the network. We test the relative efficiency of NSM compared to RDS and SRS on 162 school and university networks from Add Health and Facebook that range in size from 110 to 16,278 nodes. The results show that the average design effect for NSM on these 162 networks is 1.16, which is very close to the efficiency of a simple random sample (DE=1), and 98.5% lower than the average DE we observed for RDS. PMID:24159246

  7. Multidimensional Normalization to Minimize Plate Effects of Suspension Bead Array Data.

    PubMed

    Hong, Mun-Gwan; Lee, Woojoo; Nilsson, Peter; Pawitan, Yudi; Schwenk, Jochen M

    2016-10-07

    Enhanced by the growing number of biobanks, biomarker studies can now be performed with reasonable statistical power by using large sets of samples. Antibody-based proteomics by means of suspension bead arrays offers one attractive approach to analyze serum, plasma, or CSF samples for such studies in microtiter plates. To expand measurements beyond single batches, with either 96 or 384 samples per plate, suitable normalization methods are required to minimize the variation between plates. Here we propose two normalization approaches utilizing MA coordinates. The multidimensional MA (multi-MA) and MA-loess both consider all samples of a microtiter plate per suspension bead array assay and thus do not require any external reference samples. We demonstrate the performance of the two MA normalization methods with data obtained from the analysis of 384 samples including both serum and plasma. Samples were randomized across 96-well sample plates, processed, and analyzed in assay plates, respectively. Using principal component analysis (PCA), we could show that plate-wise clusters found in the first two components were eliminated by multi-MA normalization as compared with other normalization methods. Furthermore, we studied the correlation profiles between random pairs of antibodies and found that both MA normalization methods substantially reduced the inflated correlation introduced by plate effects. Normalization approaches using multi-MA and MA-loess minimized batch effects arising from the analysis of several assay plates with antibody suspension bead arrays. In a simulated biomarker study, multi-MA restored associations lost due to plate effects. Our normalization approaches, which are available as R package MDimNormn, could also be useful in studies using other types of high-throughput assay data.

  8. Analysis and Recognition of Traditional Chinese Medicine Pulse Based on the Hilbert-Huang Transform and Random Forest in Patients with Coronary Heart Disease

    PubMed Central

    Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie

    2015-01-01

    Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulse by treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish the classification model. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, the energy features, sample entropy features, and their combination were inputted as pulse feature vectors; the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria on disease diagnosis or Zheng differentiation. PMID:26180536
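
    A hedged sketch of the classification step only is shown below: a random forest on two pre-extracted features standing in for the HHT energy and sample-entropy features. The synthetic feature values and forest settings are placeholders, not the study's data or parameters.

    ```python
    # Random-forest classification of (already extracted) pulse features;
    # the features here are synthetic placeholders.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    n_per_group = 100

    # columns: [energy-like feature, sample-entropy-like feature] per pulse recording
    chd = np.column_stack([rng.normal(5.0, 1.0, n_per_group), rng.normal(1.2, 0.2, n_per_group)])
    normal = np.column_stack([rng.normal(4.0, 1.0, n_per_group), rng.normal(1.5, 0.2, n_per_group)])
    X = np.vstack([chd, normal])
    y = np.array([1] * n_per_group + [0] * n_per_group)   # 1 = CHD, 0 = normal

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("recognition rate:", accuracy_score(y_te, clf.predict(X_te)))
    ```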

  9. Analysis and Recognition of Traditional Chinese Medicine Pulse Based on the Hilbert-Huang Transform and Random Forest in Patients with Coronary Heart Disease.

    PubMed

    Guo, Rui; Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie

    2015-01-01

    Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulse by treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish the classification model. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, the energy features, sample entropy features, and their combination were inputted as pulse feature vectors; the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria on disease diagnosis or Zheng differentiation.

  10. Women Secondary Principals in Texas 1998 and 2011: Movement toward Equity

    ERIC Educational Resources Information Center

    Marczynski, Jean C.; Gates, Gordon S.

    2013-01-01

    Purpose: The purpose of this paper is to analyze data gathered in 1998 and 2011 from representative samples of women secondary school principals in Texas to identify differences in personal, professional, leadership, and school characteristics. Design/methodology/approach: Two proportionate, random samples were drawn of women secondary principals…

  11. Neither fixed nor random: weighted least squares meta-analysis.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2015-06-15

    This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how and explain why an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects. Copyright © 2015 John Wiley & Sons, Ltd.
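
    The unrestricted weighted least squares average described above can be sketched as a weighted mean whose multiplicative variance term is estimated from the data rather than fixed at 1. The effect sizes and standard errors below are invented for illustration, and the formulas are a plain reading of that idea, not a reproduction of the authors' code.

    ```python
    # Unrestricted WLS weighted average vs. the fixed-effect standard error
    # (hypothetical effect sizes and standard errors).
    import numpy as np

    effects = np.array([0.12, 0.25, 0.05, 0.31, 0.18, 0.22])   # hypothetical study effects
    se = np.array([0.10, 0.15, 0.08, 0.20, 0.12, 0.09])        # their standard errors

    w = 1.0 / se**2
    estimate = np.sum(w * effects) / np.sum(w)              # same point estimate as fixed effect
    k = len(effects)
    mse = np.sum(w * (effects - estimate) ** 2) / (k - 1)   # multiplicative variance estimate
    se_wls = np.sqrt(mse / np.sum(w))                       # unrestricted WLS standard error
    se_fixed = np.sqrt(1.0 / np.sum(w))                     # fixed-effect standard error

    print(f"estimate={estimate:.3f}, WLS SE={se_wls:.3f}, fixed-effect SE={se_fixed:.3f}")
    ```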

  12. Stemflow estimation in a redwood forest using model-based stratified random sampling

    Treesearch

    Jack Lewis

    2003-01-01

    Model-based stratified sampling is illustrated by a case study of stemflow volume in a redwood forest. The approach is actually a model-assisted sampling design in which auxiliary information (tree diameter) is utilized in the design of stratum boundaries to optimize the efficiency of a regression or ratio estimator. The auxiliary information is utilized in both the...

  13. An Intrinsic Algorithm for Parallel Poisson Disk Sampling on Arbitrary Surfaces.

    PubMed

    Ying, Xiang; Xin, Shi-Qing; Sun, Qian; He, Ying

    2013-03-08

    Poisson disk sampling plays an important role in a variety of visual computing applications, due to its useful statistical distribution properties and the absence of aliasing artifacts. While many effective techniques have been proposed to generate Poisson disk distributions in Euclidean space, relatively little work has been reported on the surface counterpart. This paper presents an intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces. We propose a new technique for parallelizing the dart throwing. Rather than the conventional approaches that explicitly partition the spatial domain to generate the samples in parallel, our approach assigns each sample candidate a random and unique priority that is unbiased with regard to the distribution. Hence, multiple threads can process the candidates simultaneously and resolve conflicts by checking the given priority values. It is worth noting that our algorithm is accurate as the generated Poisson disks are uniformly and randomly distributed without bias. Our method is intrinsic in that all the computations are based on the intrinsic metric and are independent of the embedding space. This intrinsic feature allows us to generate Poisson disk distributions on arbitrary surfaces. Furthermore, by manipulating the spatially varying density function, we can obtain adaptive sampling easily.
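
    A planar, sequentially simulated sketch of the priority idea is shown below; the paper's method is surface-intrinsic and parallel, whereas this toy version only illustrates priority-based conflict resolution. The candidate count, radius, and conflict rule are illustrative assumptions.

    ```python
    # Priority-based dart throwing in the unit square (illustrative only):
    # a candidate is accepted once no conflicting candidate with higher
    # priority remains and it conflicts with no accepted disk.
    import math
    import random

    random.seed(0)
    radius = 0.08
    candidates = [(random.random(), random.random(), random.random())  # (x, y, priority)
                  for _ in range(600)]

    def conflict(a, b, r=radius):
        return math.hypot(a[0] - b[0], a[1] - b[1]) < r

    accepted = []
    active = candidates[:]
    while active:
        still_active = []
        for c in active:
            if any(conflict(c, a) for a in accepted):
                continue                              # removed: conflicts with an accepted disk
            rivals = [d for d in active if d is not c and conflict(c, d)]
            if all(c[2] > d[2] for d in rivals):
                accepted.append(c)                    # highest priority in its neighbourhood
            else:
                still_active.append(c)
        active = still_active

    print(len(accepted), "Poisson-disk samples accepted")
    ```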

  14. Application of random effects to the study of resource selection by animals

    USGS Publications Warehouse

    Gillies, C.S.; Hebblewhite, M.; Nielsen, S.E.; Krawchuk, M.A.; Aldridge, Cameron L.; Frair, J.L.; Saher, D.J.; Stevens, C.E.; Jerde, C.L.

    2006-01-01

    1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence.2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability.3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed.4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects.5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection.6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.

  15. Application of random effects to the study of resource selection by animals.

    PubMed

    Gillies, Cameron S; Hebblewhite, Mark; Nielsen, Scott E; Krawchuk, Meg A; Aldridge, Cameron L; Frair, Jacqueline L; Saher, D Joanne; Stevens, Cameron E; Jerde, Christopher L

    2006-07-01

    1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence. 2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability. 3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed. 4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects. 5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection. 6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.

  16. Focus on Function – a randomized controlled trial comparing two rehabilitation interventions for young children with cerebral palsy

    PubMed Central

    Law, Mary; Darrah, Johanna; Pollock, Nancy; Rosenbaum, Peter; Russell, Dianne; Walter, Stephen D; Petrenchik, Theresa; Wilson, Brenda; Wright, Virginia

    2007-01-01

    Background Children with cerebral palsy receive a variety of long-term physical and occupational therapy interventions to facilitate development and to enhance functional independence in movement, self-care, play, school activities and leisure. Considerable human and financial resources are directed at the "intervention" of the problems of cerebral palsy, although the available evidence supporting current interventions is inconclusive. A considerable degree of uncertainty remains about the appropriate therapeutic approaches to manage the habilitation of children with cerebral palsy. The primary objective of this project is to conduct a multi-site randomized clinical trial to evaluate the efficacy of a task/context-focused approach compared to a child-focused remediation approach in improving performance of functional tasks and mobility, increasing participation in everyday activities, and improving quality of life in children 12 months to 5 years of age who have cerebral palsy. Method/Design A multi-centred randomized controlled trial research design will be used. Children will be recruited from a representative sample of children attending publicly-funded regional children's rehabilitation centers serving children with disabilities in Ontario and Alberta in Canada. Target sample size is 220 children with cerebral palsy aged 12 months to 5 years at recruitment date. Therapists are randomly assigned to deliver either a context-focused approach or a child-focused approach. Children follow their therapist into their treatment arm. Outcomes will be evaluated at baseline, after 6 months of treatment and at a 3-month follow-up period. Outcomes represent the components of the International Classification of Functioning, Disability and Health, including body function and structure (range of motion), activities (performance of functional tasks, motor function), participation (involvement in formal and informal activities), and environment (parent perceptions of care, parental empowerment). Discussion This paper presents the background information, design and protocol for a randomized controlled trial comparing a task/context-focused approach to a child-focused remediation approach in improving functional outcomes for young children with cerebral palsy. Trial registration [clinical trial registration #: NCT00469872] PMID:17900362

  17. An intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces.

    PubMed

    Ying, Xiang; Xin, Shi-Qing; Sun, Qian; He, Ying

    2013-09-01

    Poisson disk sampling has excellent spatial and spectral properties, and plays an important role in a variety of visual computing applications. Although many promising algorithms have been proposed for multidimensional sampling in Euclidean space, very few studies have been reported with regard to the problem of generating Poisson disks on surfaces due to the complicated nature of the surface. This paper presents an intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces. In sharp contrast to the conventional parallel approaches, our method neither partitions the given surface into small patches nor uses any spatial data structure to maintain the voids in the sampling domain. Instead, our approach assigns each sample candidate a random and unique priority that is unbiased with regard to the distribution. Hence, multiple threads can process the candidates simultaneously and resolve conflicts by checking the given priority values. Our algorithm guarantees that the generated Poisson disks are uniformly and randomly distributed without bias. It is worth noting that our method is intrinsic and independent of the embedding space. This intrinsic feature allows us to generate Poisson disk patterns on arbitrary surfaces in R^n. To our knowledge, this is the first intrinsic, parallel, and accurate algorithm for surface Poisson disk sampling. Furthermore, by manipulating the spatially varying density function, we can obtain adaptive sampling easily.

  18. Sampling pig farms at the abattoir in a cross-sectional study - Evaluation of a sampling method.

    PubMed

    Birkegård, Anna Camilla; Halasa, Tariq; Toft, Nils

    2017-09-15

    A cross-sectional study design is relatively inexpensive, fast and easy to conduct when compared to other study designs. Careful planning is essential to obtaining a representative sample of the population, and the recommended approach is to use simple random sampling from an exhaustive list of units in the target population. This approach is rarely feasible in practice, and other sampling procedures must often be adopted. For example, when slaughter pigs are the target population, sampling the pigs on the slaughter line may be an alternative to on-site sampling at a list of farms. However, it is difficult to sample a large number of farms from an exact predefined list, due to the logistics and workflow of an abattoir. Therefore, it is necessary to have a systematic sampling procedure and to evaluate the obtained sample with respect to the study objective. We propose a method for 1) planning, 2) conducting, and 3) evaluating the representativeness and reproducibility of a cross-sectional study when simple random sampling is not possible. We used an example of a cross-sectional study with the aim of quantifying the association of antimicrobial resistance and antimicrobial consumption in Danish slaughter pigs. It was not possible to visit farms within the designated timeframe. Therefore, it was decided to use convenience sampling at the abattoir. Our approach was carried out in three steps: 1) planning: using data from meat inspection to plan at which abattoirs and how many farms to sample; 2) conducting: sampling was carried out at five abattoirs; 3) evaluation: representativeness was evaluated by comparing sampled and non-sampled farms, and the reproducibility of the study was assessed through simulated sampling based on meat inspection data from the period where the actual data collection was carried out. In the cross-sectional study samples were taken from 681 Danish pig farms, during five weeks from February to March 2015. The evaluation showed that the sampling procedure was reproducible with results comparable to the collected sample. However, the sampling procedure favoured sampling of large farms. Furthermore, both under-sampled and over-sampled areas were found using scan statistics. In conclusion, sampling conducted at abattoirs can provide a spatially representative sample. Hence it is a possible cost-effective alternative to simple random sampling. However, it is important to assess the properties of the resulting sample so that any potential selection bias can be addressed when reporting the findings. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Improving Riverine Constituent Concentration and Flux Estimation by Accounting for Antecedent Discharge Conditions

    NASA Astrophysics Data System (ADS)

    Zhang, Q.; Ball, W. P.

    2016-12-01

    Regression-based approaches are often employed to estimate riverine constituent concentrations and fluxes based on typically sparse concentration observations. One such approach is the WRTDS ("Weighted Regressions on Time, Discharge, and Season") method, which has been shown to provide more accurate estimates than prior approaches. Centered on WRTDS, this work was aimed at developing improved models for constituent concentration and flux estimation by accounting for antecedent discharge conditions. Twelve modified models were developed and tested, each of which contains one additional variable to represent antecedent conditions. High-resolution (daily) data at nine monitoring sites were used to evaluate the relative merits of the models for estimation of six constituents - chloride (Cl), nitrate-plus-nitrite (NOx), total Kjeldahl nitrogen (TKN), total phosphorus (TP), soluble reactive phosphorus (SRP), and suspended sediment (SS). For each site-constituent combination, 30 concentration subsets were generated from the original data through Monte Carlo sub-sampling and then used to evaluate model performance. For the sub-sampling, three sampling strategies were adopted: (A) 1 random sample each month (12/year), (B) 12 random monthly samples plus an additional 8 random samples per year (20/year), and (C) 12 regular (non-storm) and 8 storm samples per year (20/year). The modified models show general improvement over the original model under all three sampling strategies. Major improvements were achieved for NOx by the long-term flow-anomaly model and for Cl by the ADF (average discounted flow) model and the short-term flow-anomaly model. Moderate improvements were achieved for SS, TP, and TKN by the ADF model. By contrast, no such improvement was achieved for SRP by any proposed model. In terms of sampling strategy, performance of all models was generally best using strategy C and worst using strategy A, and especially so for SS, TP, and SRP, confirming the value of routinely collecting storm-flow samples. Overall, this work provides a comprehensive set of statistical evidence supporting the incorporation of antecedent discharge conditions into WRTDS for constituent concentration and flux estimation, thereby combining the advantages of two recent developments in water quality modeling.
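
    Sub-sampling strategy A above (one random sample per month) can be sketched as follows on a synthetic daily record; the WRTDS modelling step itself is not reproduced, and the series below is invented.

    ```python
    # Monte Carlo sub-sampling, strategy A: one random observation per month
    # drawn from a synthetic daily concentration record.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    days = pd.date_range("2010-01-01", "2014-12-31", freq="D")
    daily = pd.DataFrame({"date": days,
                          "concentration": rng.lognormal(mean=0.0, sigma=0.5, size=len(days))})

    monthly_subset = (daily.groupby([daily["date"].dt.year, daily["date"].dt.month])
                           .sample(n=1, random_state=0))

    print(len(monthly_subset), "samples drawn (12 per year over 5 years)")
    ```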

  20. Comparing the efficiency of digital and conventional soil mapping to predict soil types in a semi-arid region in Iran

    NASA Astrophysics Data System (ADS)

    Zeraatpisheh, Mojtaba; Ayoubi, Shamsollah; Jafari, Azam; Finke, Peter

    2017-05-01

    The efficiency of different digital and conventional soil mapping approaches to produce categorical maps of soil types is determined by cost, sample size, accuracy and the selected taxonomic level. The efficiency of digital and conventional soil mapping approaches was examined in the semi-arid region of Borujen, central Iran. This research aimed to (i) compare two digital soil mapping approaches, multinomial logistic regression and random forest, with the conventional soil mapping approach at four soil taxonomic levels (order, suborder, great group and subgroup levels), (ii) validate the predicted soil maps by the same validation data set to determine the best method for producing the soil maps, and (iii) select the best soil taxonomic level by different approaches at three sample sizes (100, 80, and 60 point observations), in two scenarios with and without a geomorphology map as a spatial covariate. In most predicted maps, using both digital soil mapping approaches, the best results were obtained using the combination of terrain attributes and the geomorphology map, although differences between the scenarios with and without the geomorphology map were not significant. Employing the geomorphology map increased map purity and the Kappa index, and led to a decrease in the 'noisiness' of soil maps. Multinomial logistic regression had better performance at higher taxonomic levels (order and suborder levels); however, random forest showed better performance at lower taxonomic levels (great group and subgroup levels). Multinomial logistic regression was less sensitive than random forest to a decrease in the number of training observations. The conventional soil mapping method produced a map with a larger minimum polygon size because of traditional cartographic criteria used to make the geological map 1:100,000 (on which the conventional soil mapping map was largely based). Likewise, the conventional soil mapping map also had a larger average polygon size, resulting in a lower level of detail. Multinomial logistic regression at the order level (map purity of 0.80), random forest at the suborder (map purity of 0.72) and great group level (map purity of 0.60), and conventional soil mapping at the subgroup level (map purity of 0.48) produced the most accurate maps in the study area. The multinomial logistic regression method was identified as the most effective approach based on a combined index of map purity, map information content, and map production cost. The combined index also showed that smaller sample size led to a preference for the order level, while a larger sample size led to a preference for the great group level.
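
    The model-comparison step can be sketched with off-the-shelf classifiers on synthetic covariates standing in for terrain attributes. The data, features, and settings below are assumptions, not the study's inputs, and the geomorphology covariate and map-cost index are not reproduced.

    ```python
    # Multinomial logistic regression vs. random forest on synthetic
    # "terrain covariate" data, scored with Cohen's kappa.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import cohen_kappa_score

    X, y = make_classification(n_samples=100, n_features=6, n_informative=4,
                               n_classes=4, n_clusters_per_class=1, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

    mlr = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
    rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

    for name, model in [("multinomial logistic regression", mlr), ("random forest", rf)]:
        print(name, "kappa:", round(cohen_kappa_score(y_te, model.predict(X_te)), 3))
    ```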

  1. Toward a Principled Sampling Theory for Quasi-Orders

    PubMed Central

    Ünlü, Ali; Schrepp, Martin

    2016-01-01

    Quasi-orders, that is, reflexive and transitive binary relations, have numerous applications. In educational theories, the dependencies of mastery among the problems of a test can be modeled by quasi-orders. Methods such as item tree or Boolean analysis that mine for quasi-orders in empirical data are sensitive to the underlying quasi-order structure. These data mining techniques have to be compared based on extensive simulation studies, with unbiased samples of randomly generated quasi-orders at their basis. In this paper, we develop techniques that can provide the required quasi-order samples. We introduce a discrete doubly inductive procedure for incrementally constructing the set of all quasi-orders on a finite item set. A randomization of this deterministic procedure allows us to generate representative samples of random quasi-orders. With an outer level inductive algorithm, we consider the uniform random extensions of the trace quasi-orders to higher dimension. This is combined with an inner level inductive algorithm to correct the extensions that violate the transitivity property. The inner level correction step entails sampling biases. We propose three algorithms for bias correction and investigate them in simulation. It is evident that, on even up to 50 items, the new algorithms create close to representative quasi-order samples within acceptable computing time. Hence, the principled approach is a significant improvement to existing methods that are used to draw quasi-orders uniformly at random but cannot cope with reasonably large item sets. PMID:27965601
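
    For readers unfamiliar with the objects involved, the naive construction below draws a random relation, adds reflexivity, and takes its transitive closure to obtain a quasi-order. Unlike the inductive procedure described above, this simple approach does not sample uniformly from all quasi-orders; it only makes the objects concrete.

    ```python
    # Naive random quasi-order: random relation + reflexivity + Warshall closure.
    # Biased relative to uniform sampling; shown for illustration only.
    import random

    def random_quasi_order(n_items, edge_prob=0.2, seed=0):
        random.seed(seed)
        rel = [[i == j or random.random() < edge_prob for j in range(n_items)]
               for i in range(n_items)]
        # Warshall transitive closure
        for k in range(n_items):
            for i in range(n_items):
                if rel[i][k]:
                    for j in range(n_items):
                        if rel[k][j]:
                            rel[i][j] = True
        return rel

    for row in random_quasi_order(5):
        print(["1" if v else "0" for v in row])
    ```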

  2. Toward a Principled Sampling Theory for Quasi-Orders.

    PubMed

    Ünlü, Ali; Schrepp, Martin

    2016-01-01

    Quasi-orders, that is, reflexive and transitive binary relations, have numerous applications. In educational theories, the dependencies of mastery among the problems of a test can be modeled by quasi-orders. Methods such as item tree or Boolean analysis that mine for quasi-orders in empirical data are sensitive to the underlying quasi-order structure. These data mining techniques have to be compared based on extensive simulation studies, with unbiased samples of randomly generated quasi-orders at their basis. In this paper, we develop techniques that can provide the required quasi-order samples. We introduce a discrete doubly inductive procedure for incrementally constructing the set of all quasi-orders on a finite item set. A randomization of this deterministic procedure allows us to generate representative samples of random quasi-orders. With an outer level inductive algorithm, we consider the uniform random extensions of the trace quasi-orders to higher dimension. This is combined with an inner level inductive algorithm to correct the extensions that violate the transitivity property. The inner level correction step entails sampling biases. We propose three algorithms for bias correction and investigate them in simulation. It is evident that, on even up to 50 items, the new algorithms create close to representative quasi-order samples within acceptable computing time. Hence, the principled approach is a significant improvement to existing methods that are used to draw quasi-orders uniformly at random but cannot cope with reasonably large item sets.

  3. Censoring approach to the detection limits in X-ray fluorescence analysis

    NASA Astrophysics Data System (ADS)

    Pajek, M.; Kubala-Kukuś, A.

    2004-10-01

    We demonstrate that the effect of detection limits in X-ray fluorescence analysis (XRF), which limits the determination of very low concentrations of trace elements and results in the appearance of the so-called "nondetects", can be accounted for using the statistical concept of censoring. More precisely, the results of such measurements can be viewed as left randomly censored data, which can further be analyzed using the Kaplan-Meier method correcting the data for the presence of nondetects. Using this approach, the results of measured, detection limit censored concentrations can be interpreted in a nonparametric manner including the correction for the nondetects, i.e. the measurements in which the concentrations were found to be below the actual detection limits. Moreover, using the Monte Carlo simulation technique we show that by using the Kaplan-Meier approach the corrected mean concentrations for a population of the samples can be estimated within a few percent uncertainty with respect to the simulated, uncensored data. This practically means that the final uncertainties of estimated mean values are limited in fact by the number of studied samples and not by the correction procedure itself. The discussed random-left censoring approach was applied to analyze the XRF detection-limit-censored concentration measurements of trace elements in biomedical samples.
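
    A self-contained sketch of the Kaplan-Meier idea for left-censored concentrations is given below, using the usual flip Y = C - X so that nondetects (reported at their detection limits) become right-censored. The data are invented and this is not the authors' implementation.

    ```python
    # Kaplan-Meier mean for concentrations with nondetects, via the flip trick.
    import numpy as np

    def km_mean_left_censored(values, detected):
        """KM estimate of the mean; nondetects carry their detection limit as value."""
        values = np.asarray(values, dtype=float)
        detected = np.asarray(detected, dtype=bool)
        C = values.max() + 1.0                 # any constant above the largest value
        y = C - values                         # left-censored X becomes right-censored Y
        order = np.argsort(y)
        y, d = y[order], detected[order]
        at_risk, surv, mean_y, prev_t = len(y), 1.0, 0.0, 0.0
        for t, obs in zip(y, d):
            mean_y += surv * (t - prev_t)      # area under the survival curve of Y
            prev_t = t
            if obs:
                surv *= 1.0 - 1.0 / at_risk
            at_risk -= 1
        return C - mean_y                      # E[X] = C - E[Y] (restricted mean)

    conc = np.array([0.8, 1.2, 0.5, 0.5, 2.1, 0.9])   # the two 0.5 entries are detection limits
    det = np.array([True, True, False, False, True, True])
    print("KM-corrected mean concentration:", km_mean_left_censored(conc, det))
    ```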

  4. Factors Influencing Mathematic Problem-Solving Ability of Sixth Grade Students

    ERIC Educational Resources Information Center

    Pimta, Sakorn; Tayraukham, Sombat; Nuangchalerm, Prasart

    2009-01-01

    Problem statement: This study aims to investigate factors influencing mathematic problem-solving ability of sixth grade students. One thousand and twenty-eight sixth grade students, studying in the second semester of the 2007 academic year, were sampled using a stratified random sampling technique. Approach: The research instruments used in the study…

  5. Simulation-Extrapolation for Estimating Means and Causal Effects with Mismeasured Covariates

    ERIC Educational Resources Information Center

    Lockwood, J. R.; McCaffrey, Daniel F.

    2015-01-01

    Regression, weighting and related approaches to estimating a population mean from a sample with nonrandom missing data often rely on the assumption that conditional on covariates, observed samples can be treated as random. Standard methods using this assumption generally will fail to yield consistent estimators when covariates are measured with…

  6. Relationships among Taiwanese Children's Computer Game Use, Academic Achievement and Parental Governing Approach

    ERIC Educational Resources Information Center

    Yeh, Duen-Yian; Cheng, Ching-Hsue

    2016-01-01

    This study examined the relationships among children's computer game use, academic achievement and parental governing approach to propose probable answers for the doubts of Taiwanese parents. 355 children (ages 11-14) were randomly sampled from 20 elementary schools in a typically urbanised county in Taiwan. Questionnaire survey (five questions)…

  7. The role of different sampling methods in improving biological activity prediction using deep belief network.

    PubMed

    Ghasemi, Fahimeh; Fassihi, Afshin; Pérez-Sánchez, Horacio; Mehri Dehnavi, Alireza

    2017-02-05

    Thousands of molecules and descriptors are available for a medicinal chemist thanks to the technological advancements in different branches of chemistry. This fact as well as the correlation between them has raised new problems in quantitative structure activity relationship studies. Proper parameter initialization in statistical modeling has emerged as another challenge in recent years. Random selection of parameters leads to poor performance of deep neural networks (DNNs). In this research, a deep belief network (DBN) was applied to initialize DNNs. A DBN is composed of stacked restricted Boltzmann machines, an energy-based method that requires computing the log-likelihood gradient for all samples. Three different sampling approaches were suggested to estimate this gradient. In this respect, DBNs based on the different sampling approaches mentioned above were used to initialize the DNN architecture for predicting the biological activity of all fifteen Kaggle targets, which contain more than 70k molecules. As in other fields of data-processing research, the outputs of these models were significantly superior to those of a DNN with random parameters. © 2016 Wiley Periodicals, Inc.
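
    One common sampling approach for the RBM log-likelihood gradient is contrastive divergence. The minimal CD-1 update below is a generic sketch with illustrative sizes and data; it is not the study's code and does not reproduce its specific sampling variants.

    ```python
    # One contrastive-divergence (CD-1) update for a small restricted Boltzmann
    # machine; all sizes, data, and the learning rate are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    n_visible, n_hidden, lr = 20, 8, 0.05
    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

    v0 = (rng.random((64, n_visible)) < 0.3).astype(float)   # a batch of binary inputs

    # positive phase
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # negative phase: one Gibbs step
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)

    # CD-1 gradient estimate and parameter update
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / len(v0)
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    print("reconstruction error:", np.mean((v0 - v1) ** 2))
    ```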

  8. DS/LPI autocorrelation detection in noise plus random-tone interference. [Direct Sequence Low-Probabilty of Intercept

    NASA Technical Reports Server (NTRS)

    Hinedi, S.; Polydoros, A.

    1988-01-01

    The authors present and analyze a frequency-noncoherent two-lag autocorrelation statistic for the wideband detection of random BPSK signals in noise-plus-random-multitone interference. It is shown that this detector is quite robust to the presence or absence of interference and its specific parameter values, contrary to the case of an energy detector. The rule assumes knowledge of the data rate and the active scenario under H0. It is concluded that the real-time autocorrelation domain and its samples (lags) are a viable approach for detecting random signals in dense environments.

  9. Computational methods for efficient structural reliability and reliability sensitivity analysis

    NASA Technical Reports Server (NTRS)

    Wu, Y.-T.

    1993-01-01

    This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.
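
    A non-adaptive importance-sampling sketch is shown below for a failure probability P[g(X) < 0]; the limit state, the shifted sampling density, and the sample size are illustrative assumptions, and the adaptive refinement of the AIS method itself is not reproduced.

    ```python
    # Importance sampling for a small failure probability: sample from a
    # density shifted toward the failure region and reweight by the
    # likelihood ratio (illustrative limit state and shift).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    g = lambda x: 4.0 - x            # failure when x > 4, with X ~ standard normal

    n = 20000
    shift = 4.0                      # centre the sampling density near the failure region
    x = rng.normal(loc=shift, scale=1.0, size=n)
    weights = stats.norm.pdf(x, 0, 1) / stats.norm.pdf(x, shift, 1)   # likelihood ratio
    pf_is = np.mean((g(x) < 0) * weights)

    print(f"importance sampling P_f ~ {pf_is:.2e}  (exact {stats.norm.sf(4.0):.2e})")
    ```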

  10. Scope of Various Random Number Generators in ant System Approach for TSP

    NASA Technical Reports Server (NTRS)

    Sen, S. K.; Shaykhian, Gholam Ali

    2007-01-01

    Several quasi- and pseudo-random number generators are tested in a heuristic based on an ant system approach to the traveling salesman problem. The experiment explores whether any particular generator is most desirable. Such an experiment on large samples has the potential to rank the performance of the generators for the foregoing heuristic. This is mainly to seek an answer to the controversial issue "which generator is the best in terms of quality of the result (accuracy) as well as cost of producing the result (time/computational complexity) in a probabilistic/statistical sense."

  11. Reducing seed dependent variability of non-uniformly sampled multidimensional NMR data

    NASA Astrophysics Data System (ADS)

    Mobli, Mehdi

    2015-07-01

    The application of NMR spectroscopy to study the structure, dynamics and function of macromolecules requires the acquisition of several multidimensional spectra. The one-dimensional NMR time-response from the spectrometer is extended to additional dimensions by introducing incremented delays in the experiment that cause oscillation of the signal along "indirect" dimensions. For a given dimension the delay is incremented at twice the rate of the maximum frequency (Nyquist rate). To achieve high-resolution requires acquisition of long data records sampled at the Nyquist rate. This is typically a prohibitive step due to time constraints, resulting in sub-optimal data records to the detriment of subsequent analyses. The multidimensional NMR spectrum itself is typically sparse, and it has been shown that in such cases it is possible to use non-Fourier methods to reconstruct a high-resolution multidimensional spectrum from a random subset of non-uniformly sampled (NUS) data. For a given acquisition time, NUS has the potential to improve the sensitivity and resolution of a multidimensional spectrum, compared to traditional uniform sampling. The improvements in sensitivity and/or resolution achieved by NUS are heavily dependent on the distribution of points in the random subset acquired. Typically, random points are selected from a probability density function (PDF) weighted according to the NMR signal envelope. In extreme cases as little as 1% of the data is subsampled. The heavy under-sampling can result in poor reproducibility, i.e. when two experiments are carried out where the same number of random samples is selected from the same PDF but using different random seeds. Here, a jittered sampling approach is introduced that is shown to improve random seed dependent reproducibility of multidimensional spectra generated from NUS data, compared to commonly applied NUS methods. It is shown that this is achieved due to the low variability of the inherent sensitivity of the random subset chosen from a given PDF. Finally, it is demonstrated that metrics used to find optimal NUS distributions are heavily dependent on the inherent sensitivity of the random subset, and such optimisation is therefore less critical when using the proposed sampling scheme.
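
    The contrast between purely random and jittered (stratified) selection from a signal-envelope PDF can be sketched as follows; the envelope, grid size, and number of increments are illustrative assumptions rather than the paper's acquisition parameters.

    ```python
    # Purely random vs. jittered non-uniform sampling from the same weighted PDF:
    # the jittered scheme draws one increment per equal-probability stratum of the CDF.
    import numpy as np

    rng = np.random.default_rng(7)
    n_grid, n_samples = 256, 32
    t = np.arange(n_grid)
    pdf = np.exp(-t / 80.0)                  # decaying signal envelope as sampling PDF
    pdf /= pdf.sum()

    # purely random: draws without replacement with probability ~ pdf
    random_pts = np.sort(rng.choice(n_grid, size=n_samples, replace=False, p=pdf))

    # jittered: one draw per equal-probability stratum of the cumulative PDF
    cdf = np.cumsum(pdf)
    u = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
    jittered_pts = np.unique(np.searchsorted(cdf, u))

    print("random:  ", random_pts[:10], "...")
    print("jittered:", jittered_pts[:10], "...")
    ```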

  12. A Method for Estimating View Transformations from Image Correspondences Based on the Harmony Search Algorithm.

    PubMed

    Cuevas, Erik; Díaz, Margarita

    2015-01-01

    In this paper, a new method for robustly estimating multiple view relations from point correspondences is presented. The approach combines the popular random sampling consensus (RANSAC) algorithm and the evolutionary method harmony search (HS). With this combination, the proposed method adopts a different sampling strategy than RANSAC to generate putative solutions. Under the new mechanism, at each iteration, new candidate solutions are built taking into account the quality of the models generated by previous candidate solutions, rather than purely random as it is the case of RANSAC. The rules for the generation of candidate solutions (samples) are motivated by the improvisation process that occurs when a musician searches for a better state of harmony. As a result, the proposed approach can substantially reduce the number of iterations still preserving the robust capabilities of RANSAC. The method is generic and its use is illustrated by the estimation of homographies, considering synthetic and real images. Additionally, in order to demonstrate the performance of the proposed approach within a real engineering application, it is employed to solve the problem of position estimation in a humanoid robot. Experimental results validate the efficiency of the proposed method in terms of accuracy, speed, and robustness.
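
    For context, the plain RANSAC loop that the proposed method modifies can be sketched as below for robust line fitting. The data, inlier threshold, and iteration budget are illustrative, and the harmony-search candidate generation is not reproduced here.

    ```python
    # Basic RANSAC for a 2D line: fit from random minimal subsets and keep the
    # model with the most inliers (illustrative data and thresholds).
    import numpy as np

    rng = np.random.default_rng(3)
    n = 200
    x = rng.uniform(0, 10, n)
    y = 2.0 * x + 1.0 + rng.normal(0, 0.2, n)
    outlier_idx = rng.choice(n, 60, replace=False)
    y[outlier_idx] += rng.uniform(-15, 15, 60)        # contaminate with gross outliers

    best_inliers, best_model = 0, None
    for _ in range(500):                              # fixed iteration budget
        i, j = rng.choice(n, 2, replace=False)        # minimal sample for a line
        if x[i] == x[j]:
            continue
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        residuals = np.abs(y - (slope * x + intercept))
        inliers = np.sum(residuals < 0.5)
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (slope, intercept)

    print("estimated line:", best_model, "with", best_inliers, "inliers")
    ```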

  13. Modeling and Simulation of High Dimensional Stochastic Multiscale PDE Systems at the Exascale

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kevrekidis, Ioannis

    2017-03-22

    The thrust of the proposal was to exploit modern data-mining tools in a way that will create a systematic, computer-assisted approach to the representation of random media -- and also to the representation of the solutions of an array of important physicochemical processes that take place in/on such media. A parsimonious representation/parametrization of the random media links directly (via uncertainty quantification tools) to good sampling of the distribution of random media realizations. It also links directly to modern multiscale computational algorithms (like the equation-free approach that has been developed in our group) and plays a crucial role in accelerating the scientific computation of solutions of nonlinear PDE models (deterministic or stochastic) in such media – both solutions in particular realizations of the random media, and estimation of the statistics of the solutions over multiple realizations (e.g. expectations).

  14. True Randomness from Big Data.

    PubMed

    Papakonstantinou, Periklis A; Woodruff, David P; Yang, Guang

    2016-09-26

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests.

  15. True Randomness from Big Data

    NASA Astrophysics Data System (ADS)

    Papakonstantinou, Periklis A.; Woodruff, David P.; Yang, Guang

    2016-09-01

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests.

  16. True Randomness from Big Data

    PubMed Central

    Papakonstantinou, Periklis A.; Woodruff, David P.; Yang, Guang

    2016-01-01

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests. PMID:27666514

  17. Confidence intervals for a difference between lognormal means in cluster randomization trials.

    PubMed

    Poirier, Julia; Zou, G Y; Koval, John

    2017-04-01

    Cluster randomization trials, in which intact social units are randomized to different interventions, have become popular in the last 25 years. Outcomes from these trials in many cases are positively skewed, following approximately lognormal distributions. When inference is focused on the difference between treatment arm arithmetic means, existent confidence interval procedures either make restricting assumptions or are complex to implement. We approach this problem by assuming log-transformed outcomes from each treatment arm follow a one-way random effects model. The treatment arm means are functions of multiple parameters for which separate confidence intervals are readily available, suggesting that the method of variance estimates recovery may be applied to obtain closed-form confidence intervals. A simulation study showed that this simple approach performs well in small sample sizes in terms of empirical coverage, relatively balanced tail errors, and interval widths as compared to existing methods. The methods are illustrated using data arising from a cluster randomization trial investigating a critical pathway for the treatment of community acquired pneumonia.

  18. Unsupervised Metric Fusion Over Multiview Data by Graph Random Walk-Based Cross-View Diffusion.

    PubMed

    Wang, Yang; Zhang, Wenjie; Wu, Lin; Lin, Xuemin; Zhao, Xiang

    2017-01-01

    Learning an ideal metric is crucial to many tasks in computer vision. Diverse feature representations may address this problem from different aspects: visual data objects described by multiple features can be decomposed into multiple views, which often provide complementary information. In this paper, we propose a cross-view fusion algorithm that leads to a similarity metric for multiview data by systematically fusing multiple similarity measures. Unlike existing paradigms, we focus on learning a distance measure by exploiting a graph structure of data samples, where an input similarity matrix can be improved through a propagation of graph random walk. In particular, we construct multiple graphs, each corresponding to an individual view, and a cross-view fusion approach based on graph random walk is presented to derive an optimal distance measure by fusing multiple metrics. Our method is scalable to a large amount of data by enforcing sparsity through an anchor graph representation. To adaptively control the effects of different views, we dynamically learn view-specific coefficients, which are leveraged into the graph random walk to balance the views. However, such a strategy may lead to an over-smooth similarity metric in which affinities between dissimilar samples are enlarged by excessive cross-view fusion. Thus, we adopt a heuristic approach to control the number of fusion iterations and avoid over-smoothing. Extensive experiments conducted on real-world data sets validate the effectiveness and efficiency of our approach.

  19. Evaluation of a Class of Simple and Effective Uncertainty Methods for Sparse Samples of Random Variables and Functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin

    When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities -- without being overly conservative -- when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a class of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics used to assess method performance for conservative estimation of two representative quantities: central 95% of response; and 10^-4 probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depend on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large database and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.

  20. Pre-Survey Text Messages (SMS) Improve Participation Rate in an Australian Mobile Telephone Survey: An Experimental Study.

    PubMed

    Dal Grande, Eleonora; Chittleborough, Catherine Ruth; Campostrini, Stefano; Dollard, Maureen; Taylor, Anne Winifred

    2016-01-01

    Mobile telephone numbers are increasingly being included in household survey samples. As approach letters cannot be sent because many numbers have no address details, alternative approaches have been considered. This study assesses the effectiveness of sending a short message service (SMS) to a random sample of mobile telephone numbers to increase response rates. A simple random sample of 9000 Australian mobile telephone numbers was drawn: 4500 were randomly assigned to be sent a pre-notification SMS, and the remaining 4500 were not sent an SMS. Adults aged 18 years and over, and currently in paid employment, were eligible to participate. American Association for Public Opinion Research formulas were used to calculate response, cooperation, and refusal rates. Response and cooperation rates were higher for the SMS group (12.4% and 28.6%, respectively) than for the group with no SMS (7.7% and 16.0%). Refusal rates were lower for the SMS group (27.3%) than for the group with no SMS (35.9%). When asked, 85.8% of the pre-notification group indicated they remembered receiving an SMS about the study. Sending a pre-notification SMS is effective in improving participation in population-based surveys. Response rates were increased by 60% and cooperation rates by 79%.
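
    The rates quoted above follow AAPOR definitions; the sketch below computes simplified versions of response, cooperation, and refusal rates from aggregate disposition counts. The exact AAPOR formulas distinguish more disposition codes, and the counts used here are made up, not the study's.

    ```python
    def survey_rates(completes, refusals, noncontacts, unknown_eligibility, e=1.0):
        """Simplified outcome rates in the spirit of the AAPOR definitions.
        e = assumed fraction of unknown-eligibility numbers that are eligible."""
        eligible = completes + refusals + noncontacts + e * unknown_eligibility
        response = completes / eligible
        cooperation = completes / (completes + refusals)
        refusal = refusals / eligible
        return response, cooperation, refusal

    # Illustrative counts only (not the study's actual dispositions).
    print(survey_rates(completes=450, refusals=1100, noncontacts=1500, unknown_eligibility=1450))
    ```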

  1. Meditation and Yoga for Posttraumatic Stress Disorder: A Meta-Analytic Review of Randomized Controlled Trials

    PubMed Central

    Gallegos, Autumn M.; Crean, Hugh F.; Pigeon, Wilfred R.; Heffner, Kathi L.

    2018-01-01

    Posttraumatic stress disorder (PTSD) is a chronic and debilitating disorder that affects the lives of 7-8% of adults in the U.S. Although several interventions demonstrate clinical effectiveness for treating PTSD, many patients continue to have residual symptoms and ask for a variety of treatment options. Complementary health approaches, such as meditation and yoga, hold promise for treating symptoms of PTSD. This meta-analysis evaluates the effect size (ES) of yoga and meditation on PTSD outcomes in adult patients. We also examined whether the intervention type, PTSD outcome measure, study population, sample size, or control condition moderated the effects of complementary approaches on PTSD outcomes. The studies included were 19 randomized control trials with data on 1,173 participants. A random effects model yielded a statistically significant ES in the small to medium range (ES = −.39, p < .001, 95% CI [−.57, −.22]). There were no appreciable differences between intervention types, study population, outcome measures, or control condition. There was, however, a marginally significant higher ES for sample size ≤ 30 (ES = −.78, k = 5). These findings suggest that meditation and yoga are promising complementary approaches in the treatment of PTSD among adults and warrant further study. PMID:29100863
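
    A random-effects pooled effect size of the kind reported above can be computed with the standard DerSimonian-Laird estimator; the sketch below is a generic implementation with made-up effect sizes and variances, not the meta-analysis data.

    ```python
    import numpy as np

    def dersimonian_laird(effects, variances):
        """Pool study effect sizes under a random-effects model (DerSimonian-Laird)."""
        y, v = np.asarray(effects, float), np.asarray(variances, float)
        w = 1.0 / v                                   # fixed-effect weights
        y_fixed = np.sum(w * y) / np.sum(w)
        Q = np.sum(w * (y - y_fixed) ** 2)            # heterogeneity statistic
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (Q - (len(y) - 1)) / c)       # between-study variance
        w_star = 1.0 / (v + tau2)                     # random-effects weights
        pooled = np.sum(w_star * y) / np.sum(w_star)
        se = np.sqrt(1.0 / np.sum(w_star))
        return pooled, se, tau2

    # Illustrative standardized mean differences and variances (not the trial data).
    effects = [-0.55, -0.20, -0.45, -0.35, -0.60]
    variances = [0.04, 0.05, 0.03, 0.06, 0.08]
    print(dersimonian_laird(effects, variances))
    ```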

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Fangyan; Zhang, Song; Chung Wong, Pak

    Effectively visualizing large graphs and capturing the statistical properties are two challenging tasks. To aid in these two tasks, many sampling approaches for graph simplification have been proposed, falling into three categories: node sampling, edge sampling, and traversal-based sampling. It is still unknown which approach is the best. We evaluate commonly used graph sampling methods through a combined visual and statistical comparison of graphs sampled at various rates. We conduct our evaluation on three graph models: random graphs, small-world graphs, and scale-free graphs. Initial results indicate that the effectiveness of a sampling method is dependent on the graph model, the size of the graph, and the desired statistical property. This benchmark study can be used as a guideline in choosing the appropriate method for a particular graph sampling task, and the results presented can be incorporated into graph visualization and analysis tools.
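
    A minimal sketch of the three sampling families compared above (node, edge, and traversal-based sampling), using networkx on a synthetic scale-free graph; the sample sizes and the restart rule in the random walk are illustrative choices, not the benchmark's settings.

    ```python
    import random
    import networkx as nx

    def node_sample(G, k):
        """Node sampling: keep k random nodes and their induced edges."""
        return G.subgraph(random.sample(list(G.nodes), k)).copy()

    def edge_sample(G, k):
        """Edge sampling: keep k random edges (and their endpoints)."""
        return nx.Graph(random.sample(list(G.edges), k))

    def random_walk_sample(G, k, start=None):
        """Traversal-based sampling: collect k nodes along a random walk."""
        node = start if start is not None else random.choice(list(G.nodes))
        visited = {node}
        while len(visited) < k:
            nbrs = list(G.neighbors(node))
            if not nbrs:                              # restart if the walk gets stuck
                node = random.choice(list(G.nodes))
                continue
            node = random.choice(nbrs)
            visited.add(node)
        return G.subgraph(visited).copy()

    G = nx.barabasi_albert_graph(1000, 3)             # scale-free test graph
    for sampler in (node_sample, edge_sample, random_walk_sample):
        H = sampler(G, 100)
        print(sampler.__name__, H.number_of_nodes(), H.number_of_edges())
    ```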

  3. Calculating Confidence, Uncertainty, and Numbers of Samples When Using Statistical Sampling Approaches to Characterize and Clear Contaminated Areas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Piepel, Gregory F.; Matzke, Brett D.; Sego, Landon H.

    2013-04-27

    This report discusses the methodology, formulas, and inputs needed to make characterization and clearance decisions for Bacillus anthracis-contaminated and uncontaminated (or decontaminated) areas using a statistical sampling approach. Specifically, the report includes the methods and formulas for calculating (1) the number of samples required to achieve a specified confidence in characterization and clearance decisions, and (2) the confidence in making characterization and clearance decisions for a specified number of samples, for two common statistically based environmental sampling approaches. In particular, the report addresses an issue raised by the Government Accountability Office by providing methods and formulas to calculate the confidence that a decision area is uncontaminated (or successfully decontaminated) if all samples collected according to a statistical sampling approach have negative results. Key to addressing this topic is the probability that an individual sample result is a false negative, which is commonly referred to as the false negative rate (FNR). The two statistical sampling approaches currently discussed in this report are 1) hotspot sampling to detect small isolated contaminated locations during the characterization phase, and 2) combined judgment and random (CJR) sampling during the clearance phase. Typically, if contamination is widely distributed in a decision area, it will be detectable via judgment sampling during the characterization phase. Hotspot sampling is appropriate for characterization situations where contamination is not widely distributed and may not be detected by judgment sampling. CJR sampling is appropriate during the clearance phase when it is desired to augment judgment samples with statistical (random) samples. The hotspot and CJR statistical sampling approaches are discussed in the report for four situations: (1) qualitative data (detect and non-detect) when the FNR = 0 or when using statistical sampling methods that account for FNR > 0; (2) qualitative data when the FNR > 0 but statistical sampling methods are used that assume the FNR = 0; (3) quantitative data (e.g., contaminant concentrations expressed as CFU/cm2) when the FNR = 0 or when using statistical sampling methods that account for FNR > 0; and (4) quantitative data when the FNR > 0 but statistical sampling methods are used that assume the FNR = 0. For Situation 2, the hotspot sampling approach provides for stating with Z% confidence that a hotspot of specified shape and size with detectable contamination will be found. Also for Situation 2, the CJR approach provides for stating with X% confidence that at least Y% of the decision area does not contain detectable contamination. Forms of these statements for the other three situations are discussed in Section 2.2. Statistical methods that account for FNR > 0 currently only exist for the hotspot sampling approach with qualitative data (or quantitative data converted to qualitative data). This report documents the current status of methods and formulas for the hotspot and CJR sampling approaches. Limitations of these methods are identified. Extensions of the methods that are applicable when FNR = 0 to account for FNR > 0, or to address other limitations, will be documented in future revisions of this report if future funding supports the development of such extensions. For quantitative data, this report also presents statistical methods and formulas for (1) quantifying the uncertainty in measured sample results, (2) estimating the true surface concentration corresponding to a surface sample, and (3) quantifying the uncertainty of the estimate of the true surface concentration. All of the methods and formulas discussed in the report were applied to example situations to illustrate application of the methods and interpretation of the results.
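
    For the qualitative-data case with FNR = 0, the number of random samples needed to state with a given confidence that at least a given fraction of a decision area is free of detectable contamination (when all results are negative) follows a standard acceptance-sampling relation. The sketch below illustrates only that calculation; it is not a substitute for the report's hotspot or CJR formulas, and it ignores judgment samples.

    ```python
    import math

    def n_random_samples(confidence, clean_fraction):
        """Smallest n such that, if all n random samples are negative, one can state
        with the given confidence that at least `clean_fraction` of the decision
        area is free of detectable contamination (FNR assumed to be 0)."""
        return math.ceil(math.log(1.0 - confidence) / math.log(clean_fraction))

    # e.g. 95% confidence that at least 99% of the area is uncontaminated
    print(n_random_samples(0.95, 0.99))   # 299 samples
    ```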

  4. Diversity of Approaches to Problems by Students Enrolled in a Non-Calculus College Physics Course.

    ERIC Educational Resources Information Center

    Cohen, Anna Foner

    This study was undertaken to determine whether hierarchically structured problem situations could be developed and used to identify where individual solutions to problems in elementary physics are interrupted and whether examination of student approaches to problem situations would be useful to physics teaching. A random sample of 26 students…

  5. Patterns and Impact of Comorbidity and Multimorbidity among Community-Resident American Indian Elders

    ERIC Educational Resources Information Center

    John, Robert; Kerby, Dave S.; Hennessy, Catherine Hagan

    2003-01-01

    Purpose: The purpose of this study is to suggest a new approach to identifying patterns of comorbidity and multimorbidity. Design and Methods: A random sample of 1,039 rural community-resident American Indian elders aged 60 years and older was surveyed. Comorbidity was investigated with four standard approaches, and with cluster analysis. Results:…

  6. Cluster Randomized Test-Negative Design (CR-TND) Trials: A Novel and Efficient Method to Assess the Efficacy of Community Level Dengue Interventions.

    PubMed

    Anders, Katherine L; Cutcher, Zoe; Kleinschmidt, Immo; Donnelly, Christl A; Ferguson, Neil M; Indriani, Citra; O'Neill, Scott L; Jewell, Nicholas P; Simmons, Cameron P

    2018-05-07

    Cluster randomized trials are the gold standard for assessing efficacy of community-level interventions, such as vector control strategies against dengue. We describe a novel cluster randomized trial methodology with a test-negative design, which offers advantages over traditional approaches. It utilizes outcome-based sampling of patients presenting with a syndrome consistent with the disease of interest, who are subsequently classified as test-positive cases or test-negative controls on the basis of diagnostic testing. We use simulations of a cluster trial to demonstrate validity of efficacy estimates under the test-negative approach. This demonstrates that, provided study arms are balanced for both test-negative and test-positive illness at baseline and that other test-negative design assumptions are met, the efficacy estimates closely match true efficacy. We also briefly discuss analytical considerations for an odds ratio-based effect estimate arising from clustered data, and outline potential approaches to analysis. We conclude that application of the test-negative design to certain cluster randomized trials could increase their efficiency and ease of implementation.
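
    A naive sketch of the odds-ratio-based efficacy estimate described above, computed from aggregate test-positive/test-negative counts; it ignores the between-cluster correlation that the trial analysis must account for, and the counts are illustrative.

    ```python
    def tnd_efficacy(cases_int, controls_int, cases_ctl, controls_ctl):
        """Intervention efficacy from a test-negative design:
        OR = odds of being a test-positive case (vs. a test-negative control) in the
        intervention arm relative to the control arm; efficacy = 1 - OR."""
        odds_ratio = (cases_int / controls_int) / (cases_ctl / controls_ctl)
        return odds_ratio, 1.0 - odds_ratio

    # Illustrative aggregate counts (test-positive cases vs. test-negative controls).
    print(tnd_efficacy(cases_int=120, controls_int=800, cases_ctl=300, controls_ctl=750))
    ```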

  7. Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats

    USGS Publications Warehouse

    Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.

    2012-01-01

    This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCP from moving boats from three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.

  8. Systematic random sampling of the comet assay.

    PubMed

    McArt, Darragh G; Wasson, Gillian R; McKerr, George; Saetzler, Kurt; Reed, Matt; Howard, C Vyvyan

    2009-07-01

    The comet assay is a technique used to quantify DNA damage and repair at a cellular level. In the assay, cells are embedded in agarose and the cellular content is stripped away leaving only the DNA trapped in an agarose cavity which can then be electrophoresed. The damaged DNA can enter the agarose and migrate while the undamaged DNA cannot and is retained. DNA damage is measured as the proportion of the migratory 'tail' DNA compared to the total DNA in the cell. The fundamental basis of these arbitrary values is obtained in the comet acquisition phase using fluorescence microscopy with a stoichiometric stain in tandem with image analysis software. Current methods deployed in such an acquisition are expected to yield comets that are both objectively and randomly obtained. In this paper we examine the 'randomness' of the acquisition phase and suggest an alternative method that offers both objective and unbiased comet selection. In order to achieve this, we have adopted a survey sampling approach widely used in stereology, which offers a method of systematic random sampling (SRS). This is desirable as it offers an impartial and reproducible method of comet analysis that can be used either manually or in automated workflows. By making use of an unbiased sampling frame and using microscope verniers, we are able to increase the precision of estimates of DNA damage. Results obtained from a multiple-user pooled variation experiment showed that the SRS technique attained a lower variability than that of the traditional approach. The single-user repetition experiment showed greater individual variances while not being detrimental to overall averages. This would suggest that the SRS method offers a better reflection of DNA damage for a given slide and also offers better user reproducibility.
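
    A generic systematic random sampling routine of the kind described above (one random start, then a fixed sampling interval); it does not reproduce the authors' microscope-vernier protocol, and the item list is illustrative.

    ```python
    import random

    def systematic_random_sample(items, n):
        """Systematic random sampling: one random start, then a fixed interval."""
        interval = len(items) / n
        start = random.uniform(0, interval)
        return [items[int(start + i * interval)] for i in range(n)]

    fields = list(range(500))                 # e.g. candidate microscope field positions
    print(systematic_random_sample(fields, 10))
    ```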

  9. A weighted belief-propagation algorithm for estimating volume-related properties of random polytopes

    NASA Astrophysics Data System (ADS)

    Font-Clos, Francesc; Massucci, Francesco Alessandro; Pérez Castillo, Isaac

    2012-11-01

    In this work we introduce a novel weighted message-passing algorithm based on the cavity method for estimating volume-related properties of random polytopes, properties that are relevant in research fields ranging from metabolic networks to neural networks to compressed sensing. Rather than adopting the usual approach of approximating the real-valued cavity marginal distributions by a few parameters, we propose an algorithm that faithfully represents the entire marginal distribution. We describe several alternatives for implementing the algorithm and benchmark the theoretical findings with concrete applications to random polytopes. The results obtained with our approach agree very well with the estimates produced by the Hit-and-Run algorithm, which is known to produce uniform sampling.

  10. Bayesian estimation of Karhunen–Loève expansions; A random subspace approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chowdhary, Kenny; Najm, Habib N.

    One of the most widely-used statistical procedures for dimensionality reduction of high dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Loève expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L2 sense, i.e., the one that minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition) on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build probabilistic Karhunen-Loève expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite dimensional stochastic process inspired by Brownian motion.

  11. Bayesian estimation of Karhunen–Loève expansions; A random subspace approach

    DOE PAGES

    Chowdhary, Kenny; Najm, Habib N.

    2016-04-13

    One of the most widely-used statistical procedures for dimensionality reduction of high dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Loève expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L2 sense, i.e., the one that minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition) on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build probabilistic Karhunen-Loève expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite dimensional stochastic process inspired by Brownian motion.
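
    The non-Bayesian baseline step described above (the empirical KLE obtained by SVD of the centered data matrix) can be sketched as follows; the Bayesian extension with a matrix Bingham posterior is not reproduced here, and the Brownian-motion-like test data are illustrative.

    ```python
    import numpy as np

    def empirical_kle(samples, n_modes):
        """Empirical Karhunen-Loeve expansion via SVD of the centered data matrix.
        samples: array of shape (n_samples, n_points), realizations of a random field."""
        mean = samples.mean(axis=0)
        X = samples - mean
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        basis = Vt[:n_modes]                              # KLE basis functions (principal components)
        coeffs = X @ basis.T                              # per-sample expansion coefficients
        eigvals = s[:n_modes] ** 2 / (len(samples) - 1)   # sample covariance eigenvalues
        return mean, basis, coeffs, eigvals

    # Illustrative: realizations of a Brownian-motion-like process on a 1D grid.
    rng = np.random.default_rng(2)
    paths = np.cumsum(rng.normal(size=(50, 200)) * np.sqrt(1 / 200), axis=1)
    mean, basis, coeffs, eigvals = empirical_kle(paths, n_modes=5)
    print(eigvals)
    ```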

  12. Sampled-Data Consensus of Linear Multi-agent Systems With Packet Losses.

    PubMed

    Zhang, Wenbing; Tang, Yang; Huang, Tingwen; Kurths, Jurgen

    In this paper, the consensus problem is studied for a class of multi-agent systems with sampled data and packet losses, where random and deterministic packet losses are considered, respectively. For random packet losses, a Bernoulli-distributed white sequence is used to describe packet dropouts among agents in a stochastic way. For deterministic packet losses, a switched system with stable and unstable subsystems is employed to model packet dropouts in a deterministic way. The purpose of this paper is to derive consensus criteria, such that linear multi-agent systems with sampled-data and packet losses can reach consensus. By means of the Lyapunov function approach and the decomposition method, the design problem of a distributed controller is solved in terms of convex optimization. The interplay among the allowable bound of the sampling interval, the probability of random packet losses, and the rate of deterministic packet losses are explicitly derived to characterize consensus conditions. The obtained criteria are closely related to the maximum eigenvalue of the Laplacian matrix versus the second minimum eigenvalue of the Laplacian matrix, which reveals the intrinsic effect of communication topologies on consensus performance. Finally, simulations are given to show the effectiveness of the proposed results.
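
    The sketch below is only a toy simulation of the random (Bernoulli) packet-loss model described above, showing agents averaging over links that randomly drop packets; it does not reproduce the paper's Lyapunov-based controller design, and the topology, gain, and loss probability are illustrative.

    ```python
    import numpy as np

    def sampled_data_consensus(A, x0, eps=0.1, p_loss=0.3, steps=200, seed=0):
        """Simulate discrete-time consensus with Bernoulli packet losses: at each
        sampling instant, every link delivers its state with probability 1 - p_loss."""
        rng = np.random.default_rng(seed)
        x = np.array(x0, dtype=float)
        for _ in range(steps):
            delivered = rng.random(A.shape) < (1 - p_loss)     # random packet losses
            W = A * delivered
            x = x + eps * (W @ x - W.sum(axis=1) * x)          # local averaging update
        return x

    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)                  # ring topology adjacency
    print(sampled_data_consensus(A, x0=[1.0, 3.0, -2.0, 5.0])) # states move toward agreement
    ```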

  13. Land use and pollution patterns on the Great Lakes. [eastern wisconsin

    NASA Technical Reports Server (NTRS)

    Haugen, R. K. (Principal Investigator); Mckim, H. L.; Marlar, T. L.

    1975-01-01

    The author has identified the following significant results. The final mapping of the large watersheds of the Manitowoc and the Oconto was done using the 25% sampling approach. Comparisons were made with earlier strip mapping efforts of the Oconto and Manitowoc watersheds. Regional differences were noted. Strip mapping of the Oconto resulted in overestimation of the amount of agricultural land compared to the random sampling method. For the Manitowoc, the strip mapping approach produced a slight underestimate of agricultural land, and an overestimate of the forest category.

  14. Design and implementation of a dental caries prevention trial in remote Canadian Aboriginal communities.

    PubMed

    Harrison, Rosamund; Veronneau, Jacques; Leroux, Brian

    2010-05-13

    The goal of this cluster randomized trial is to test the effectiveness of a counseling approach, Motivational Interviewing, to control dental caries in young Aboriginal children. Motivational Interviewing, a client-centred, directive counseling style, has not yet been evaluated as an approach for promotion of behaviour change in indigenous communities in remote settings. Aboriginal women were hired from the 9 communities to recruit expectant and new mothers to the trial, administer questionnaires and deliver the counseling to mothers in the test communities. The goal is for mothers to receive the intervention during pregnancy and at their child's immunization visits. Data on children's dental health status and family dental health practices will be collected when children are 30-months of age. The communities were randomly allocated to test or control group by a random "draw" over community radio. Sample size and power were determined based on an anticipated 20% reduction in caries prevalence. Randomization checks were conducted between groups. In the 5 test and 4 control communities, 272 of the original target sample size of 309 mothers have been recruited over a two-and-a-half year period. A power calculation using the actual attained sample size showed power to be 79% to detect a treatment effect. If an attrition fraction of 4% per year is maintained, power will remain at 80%. Power will still be > 90% to detect a 25% reduction in caries prevalence. The distribution of most baseline variables was similar for the two randomized groups of mothers. However, despite the random assignment of communities to treatment conditions, group differences exist for stage of pregnancy and prior tooth extractions in the family. Because of the group imbalances on certain variables, control of baseline variables will be done in the analyses of treatment effects. This paper explains the challenges of conducting randomized trials in remote settings, the importance of thorough community collaboration, and also illustrates the likelihood that some baseline variables that may be clinically important will be unevenly split in group-randomized trials when the number of groups is small. This trial is registered as ISRCTN41467632.

  15. Design and implementation of a dental caries prevention trial in remote Canadian Aboriginal communities

    PubMed Central

    2010-01-01

    Background The goal of this cluster randomized trial is to test the effectiveness of a counseling approach, Motivational Interviewing, to control dental caries in young Aboriginal children. Motivational Interviewing, a client-centred, directive counseling style, has not yet been evaluated as an approach for promotion of behaviour change in indigenous communities in remote settings. Methods/design Aboriginal women were hired from the 9 communities to recruit expectant and new mothers to the trial, administer questionnaires and deliver the counseling to mothers in the test communities. The goal is for mothers to receive the intervention during pregnancy and at their child's immunization visits. Data on children's dental health status and family dental health practices will be collected when children are 30-months of age. The communities were randomly allocated to test or control group by a random "draw" over community radio. Sample size and power were determined based on an anticipated 20% reduction in caries prevalence. Randomization checks were conducted between groups. Discussion In the 5 test and 4 control communities, 272 of the original target sample size of 309 mothers have been recruited over a two-and-a-half year period. A power calculation using the actual attained sample size showed power to be 79% to detect a treatment effect. If an attrition fraction of 4% per year is maintained, power will remain at 80%. Power will still be > 90% to detect a 25% reduction in caries prevalence. The distribution of most baseline variables was similar for the two randomized groups of mothers. However, despite the random assignment of communities to treatment conditions, group differences exist for stage of pregnancy and prior tooth extractions in the family. Because of the group imbalances on certain variables, control of baseline variables will be done in the analyses of treatment effects. This paper explains the challenges of conducting randomized trials in remote settings, the importance of thorough community collaboration, and also illustrates the likelihood that some baseline variables that may be clinically important will be unevenly split in group-randomized trials when the number of groups is small. Trial registration This trial is registered as ISRCTN41467632. PMID:20465831

  16. Using maximum entropy modeling for optimal selection of sampling sites for monitoring networks

    USGS Publications Warehouse

    Stohlgren, Thomas J.; Kumar, Sunil; Barnett, David T.; Evangelista, Paul H.

    2011-01-01

    Environmental monitoring programs must efficiently describe state shifts. We propose using maximum entropy modeling to select dissimilar sampling sites to capture environmental variability at low cost, and demonstrate a specific application: sample site selection for the Central Plains domain (453,490 km2) of the National Ecological Observatory Network (NEON). We relied on four environmental factors: mean annual temperature and precipitation, elevation, and vegetation type. A “sample site” was defined as a 20 km × 20 km area (equal to NEON’s airborne observation platform [AOP] footprint), within which each 1 km2 cell was evaluated for each environmental factor. After each model run, the most environmentally dissimilar site was selected from all potential sample sites. The iterative selection of eight sites captured approximately 80% of the environmental envelope of the domain, an improvement over stratified random sampling and simple random designs for sample site selection. This approach can be widely used for cost-efficient selection of survey and monitoring sites.
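
    As a rough surrogate for the iterative selection of environmentally dissimilar sites described above, the sketch below uses a greedy max-min distance rule on standardized environmental variables; the actual study used maximum entropy modeling, and the candidate-site data here are simulated.

    ```python
    import numpy as np

    def greedy_dissimilar_sites(env, n_sites):
        """Greedy max-min selection: standardize environmental variables, then
        repeatedly pick the candidate farthest (in environmental space) from the
        sites already selected."""
        X = (env - env.mean(axis=0)) / env.std(axis=0)
        chosen = [0]                                         # arbitrary first site
        while len(chosen) < n_sites:
            d = np.linalg.norm(X[:, None, :] - X[None, chosen, :], axis=2)
            min_d = d.min(axis=1)                            # distance to nearest chosen site
            min_d[chosen] = -np.inf
            chosen.append(int(min_d.argmax()))
        return chosen

    rng = np.random.default_rng(3)
    env = rng.normal(size=(500, 4))   # candidate sites x illustrative environmental variables
    print(greedy_dissimilar_sites(env, 8))
    ```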

  17. Improving riverine constituent concentration and flux estimation by accounting for antecedent discharge conditions

    NASA Astrophysics Data System (ADS)

    Zhang, Qian; Ball, William P.

    2017-04-01

    Regression-based approaches are often employed to estimate riverine constituent concentrations and fluxes based on typically sparse concentration observations. One such approach is the recently developed WRTDS ("Weighted Regressions on Time, Discharge, and Season") method, which has been shown to provide more accurate estimates than prior approaches in a wide range of applications. Centered on WRTDS, this work was aimed at developing improved models for constituent concentration and flux estimation by accounting for antecedent discharge conditions. Twelve modified models were developed and tested, each of which contains one additional flow variable to represent antecedent conditions and which can be directly derived from the daily discharge record. High-resolution (∼daily) data at nine diverse monitoring sites were used to evaluate the relative merits of the models for estimation of six constituents - chloride (Cl), nitrate-plus-nitrite (NOx), total Kjeldahl nitrogen (TKN), total phosphorus (TP), soluble reactive phosphorus (SRP), and suspended sediment (SS). For each site-constituent combination, 30 concentration subsets were generated from the original data through Monte Carlo subsampling and then used to evaluate model performance. For the subsampling, three sampling strategies were adopted: (A) 1 random sample each month (12/year), (B) 12 random monthly samples plus an additional 8 random samples per year (20/year), and (C) flow-stratified sampling with 12 regular (non-storm) and 8 storm samples per year (20/year). Results reveal that estimation performance varies with both model choice and sampling strategy. In terms of model choice, the modified models show general improvement over the original model under all three sampling strategies. Major improvements were achieved for NOx by the long-term flow-anomaly model and for Cl by the ADF (average discounted flow) model and the short-term flow-anomaly model. Moderate improvements were achieved for SS, TP, and TKN by the ADF model. By contrast, no such improvement was achieved for SRP by any proposed model. In terms of sampling strategy, performance of all models (including the original) was generally best using strategy C and worst using strategy A, and especially so for SS, TP, and SRP, confirming the value of routinely collecting stormflow samples. Overall, this work provides a comprehensive set of statistical evidence for supporting the incorporation of antecedent discharge conditions into the WRTDS model for estimation of constituent concentration and flux, thereby combining the advantages of two recent developments in water quality modeling.
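
    The Monte Carlo subsampling strategies A and C described above can be sketched as follows with pandas, using a synthetic daily record; the storm threshold (top decile of discharge) and the column names are assumptions for illustration, not the study's definitions.

    ```python
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(4)
    dates = pd.date_range("2000-01-01", "2004-12-31", freq="D")
    daily = pd.DataFrame({"discharge": rng.lognormal(3, 1, len(dates)),
                          "conc": rng.lognormal(0, 0.5, len(dates))}, index=dates)

    def strategy_a(df, seed):
        """Strategy A: one random sample per calendar month (12/year)."""
        return (df.groupby([df.index.year, df.index.month], group_keys=False)
                  .apply(lambda g: g.sample(1, random_state=seed)))

    def strategy_c(df, seed, n_storm=8):
        """Strategy C: 12 monthly non-storm samples plus 8 storm samples per year,
        with 'storm' days taken here as the top decile of daily discharge."""
        threshold = df["discharge"].quantile(0.9)
        storm = df[df["discharge"] >= threshold]
        base = strategy_a(df[df["discharge"] < threshold], seed)
        storms = (storm.groupby(storm.index.year, group_keys=False)
                       .apply(lambda g: g.sample(min(n_storm, len(g)), random_state=seed)))
        return pd.concat([base, storms]).sort_index()

    print(strategy_a(daily, seed=0).shape, strategy_c(daily, seed=0).shape)
    ```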

  18. A multiple-objective optimal exploration strategy

    USGS Publications Warehouse

    Christakos, G.; Olea, R.A.

    1988-01-01

    Exploration for natural resources is accomplished through partial sampling of extensive domains. Such imperfect knowledge is subject to sampling error. Complex systems of equations resulting from modelling based on the theory of correlated random fields are reduced to simple analytical expressions providing global indices of estimation variance. The indices are utilized by multiple objective decision criteria to find the best sampling strategies. The approach is not limited by the geometric nature of the sampling, covers a wide range in spatial continuity and leads to a step-by-step procedure. © 1988.

  19. WE-AB-207A-04: Random Undersampled Cone Beam CT: Theoretical Analysis and a Novel Reconstruction Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen, C; Chen, L; Jia, X

    2016-06-15

    Purpose: Reducing x-ray exposure and speeding up data acquisition motivated studies on projection data undersampling. An important open question is which undersampling approach is optimal for a given undersampling ratio. In this study, we propose a new undersampling scheme: random-ray undersampling. We will mathematically analyze its projection matrix properties and demonstrate its advantages. We will also propose a new reconstruction method that simultaneously performs CT image reconstruction and projection domain data restoration. Methods: By representing the projection operator in the basis of singular vectors of the full projection operator, matrix representations for an undersampling case can be generated and numerical singular value decomposition can be performed. We compared properties of matrices among three undersampling approaches: regular-view undersampling, regular-ray undersampling, and the proposed random-ray undersampling. To accomplish CT reconstruction for random undersampling, we developed a novel method that iteratively performs CT reconstruction and missing projection data restoration via regularization approaches. Results: For a given undersampling ratio, random-ray undersampling preserved mathematical properties of the full projection operator better than the other two approaches. This translates into the ability to reconstruct CT images with lower errors. Different types of image artifacts were observed depending on undersampling strategies, which were ascribed to the unique singular vectors of the sampling operators in the image domain. We tested the proposed reconstruction algorithm on a FORBILD phantom with only 30% of the projection data randomly acquired. Reconstructed image error was reduced from 9.4% in a TV method to 7.6% in the proposed method. Conclusion: The proposed random-ray undersampling is mathematically advantageous over other typical undersampling approaches. It may permit better image reconstruction at the same undersampling ratio. The novel algorithm suitable for this random-ray undersampling was able to reconstruct high-quality images.
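
    A sketch of how the two undersampling patterns compared above might be generated as boolean masks on a sinogram array (regular-view versus random-ray); it is not the paper's reconstruction algorithm, and the array sizes and keep fraction are illustrative.

    ```python
    import numpy as np

    def regular_view_mask(n_views, n_rays, keep_fraction):
        """Keep every k-th projection view entirely (regular-view undersampling)."""
        mask = np.zeros((n_views, n_rays), dtype=bool)
        step = int(round(1.0 / keep_fraction))
        mask[::step, :] = True
        return mask

    def random_ray_mask(n_views, n_rays, keep_fraction, seed=0):
        """Keep a random subset of individual rays (random-ray undersampling)."""
        rng = np.random.default_rng(seed)
        return rng.random((n_views, n_rays)) < keep_fraction

    for m in (regular_view_mask(360, 512, 0.3), random_ray_mask(360, 512, 0.3)):
        print(m.mean())   # fraction of rays kept by each mask
    ```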

  20. A System Approach to Navy Medical Education and Training. Appendix 18. Radiation Technician.

    DTIC Science & Technology

    1974-08-31

    attrition was forecast to approximate twenty percent, final sample and sub-sample sizes were adjusted accordingly. Stratified random sampling… [The remainder of this record is garbled OCR of a radiologic technician task inventory, e.g., hypertensive intravenous pyelograms, renal split function tests, portal films of areas treated with cobalt, informing the doctor of unexpected x-ray findings, and reading x-ray films for technical adequacy.]

  1. Influence of Head Teachers' General and Instructional Supervisory Practices on Teachers' Work Performance in Secondary Schools in Entebbe Municipality, Wakiso District, Uganda

    ERIC Educational Resources Information Center

    Jared, Nzabonimpa Buregeya

    2011-01-01

    The study examined the Influence of Secondary School Head Teachers' General and Instructional Supervisory Practices on Teachers' Work Performance. Quantitative and qualitative methods with a descriptive-correlational research approach were used in the study. Purposive sampling technique alongside random sampling technique was used to select the…

  2. Effects of Cooperative Concept Mapping Teaching Approach on Secondary School Students' Motivation in Biology in Gucha District, Kenya

    ERIC Educational Resources Information Center

    Keraro, Fred Nyabuti; Wachanga, Samuel W.; Orora, William

    2007-01-01

    This study investigated the effects of using the cooperative concept mapping (CCM) teaching approach on secondary school students' motivation in biology. A nonequivalent control group quasi-experimental design was used, in which a random sample of four co-educational secondary schools participated. The four schools were randomly…

  3. Algorithmic Approaches for Place Recognition in Featureless, Walled Environments

    DTIC Science & Technology

    2015-01-01

    inertial measurement unit; LIDAR, light detection and ranging; RANSAC, random sample consensus; SLAM, simultaneous localization and mapping; SUSAN, smallest… [Figure-list residue omitted; recoverable captions include "Typical input image for general junction based algorithm" and "Short exposure image of hallway junction taken by LIDAR".] The discipline of simultaneous localization and mapping (SLAM) has been studied intensively over the past several years. Many technical approaches

  4. Effect of Self Regulated Learning Approach on Junior Secondary School Students' Achievement in Basic Science

    ERIC Educational Resources Information Center

    Nwafor, Chika E.; Obodo, Abigail Chikaodinaka; Okafor, Gabriel

    2015-01-01

    This study explored the effect of a self-regulated learning approach on junior secondary school students' achievement in basic science. A quasi-experimental design was used for the study. Two co-educational schools were drawn for the study through a simple random sampling technique. One school was assigned to the treatment group while the other was…

  5. Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies1

    PubMed Central

    Bowden, Jack; Relton, Caroline; Davey Smith, George

    2016-01-01

    Mendelian randomization (MR) is an increasingly important tool for appraising causality in observational epidemiology. The technique exploits the principle that genotypes are not generally susceptible to reverse causation bias and confounding, reflecting their fixed nature and Mendel’s first and second laws of inheritance. The approach is, however, subject to important limitations and assumptions that, if unaddressed or compounded by poor study design, can lead to erroneous conclusions. Nevertheless, the advent of 2-sample approaches (in which exposure and outcome are measured in separate samples) and the increasing availability of open-access data from large consortia of genome-wide association studies and population biobanks mean that the approach is likely to become routine practice in evidence synthesis and causal inference research. In this article we provide an overview of the design, analysis, and interpretation of MR studies, with a special emphasis on assumptions and limitations. We also consider different analytic strategies for strengthening causal inference. Although impossible to prove causality with any single approach, MR is a highly cost-effective strategy for prioritizing intervention targets for disease prevention and for strengthening the evidence base for public health policy. PMID:26961927
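
    For the two-sample setting mentioned above, the basic inverse-variance-weighted (IVW) estimate combines per-variant Wald ratios; the sketch below shows only that calculation (none of the sensitivity analyses the review emphasizes), with made-up summary statistics.

    ```python
    import numpy as np

    def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
        """Two-sample Mendelian randomization, inverse-variance-weighted estimate.
        Each variant contributes a Wald ratio beta_outcome / beta_exposure, weighted
        by the inverse variance of that ratio (first-order approximation)."""
        bx, by, sy = map(np.asarray, (beta_exposure, beta_outcome, se_outcome))
        ratio = by / bx
        weights = (bx / sy) ** 2
        estimate = np.sum(weights * ratio) / np.sum(weights)
        se = np.sqrt(1.0 / np.sum(weights))
        return estimate, se

    # Illustrative SNP-level summary statistics (not from a real GWAS).
    bx = np.array([0.10, 0.08, 0.12, 0.05])
    by = np.array([0.020, 0.018, 0.027, 0.009])
    sy = np.array([0.005, 0.006, 0.005, 0.004])
    print(ivw_estimate(bx, by, sy))
    ```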

  6. Wireless Sensor Array Network DoA Estimation from Compressed Array Data via Joint Sparse Representation.

    PubMed

    Yu, Kai; Yin, Ming; Luo, Ji-An; Wang, Yingguan; Bao, Ming; Hu, Yu-Hen; Wang, Zhi

    2016-05-23

    A compressive sensing joint sparse representation direction of arrival estimation (CSJSR-DoA) approach is proposed for wireless sensor array networks (WSAN). By exploiting the joint spatial and spectral correlations of acoustic sensor array data, the CSJSR-DoA approach provides reliable DoA estimation using randomly-sampled acoustic sensor data. Since random sampling is performed at remote sensor arrays, less data need to be transmitted over lossy wireless channels to the fusion center (FC), and the expensive source coding operation at sensor nodes can be avoided. To investigate the spatial sparsity, an upper bound of the coherence of incoming sensor signals is derived assuming a linear sensor array configuration. This bound provides a theoretical constraint on the angular separation of acoustic sources to ensure the spatial sparsity of the received acoustic sensor array signals. The Cramér-Rao bound of the CSJSR-DoA estimator that quantifies the theoretical DoA estimation performance is also derived. The potential performance of the CSJSR-DoA approach is validated using both simulations and field experiments on a prototype WSAN platform. Compared to existing compressive sensing-based DoA estimation methods, the CSJSR-DoA approach shows significant performance improvement.

  7. Longitudinal analysis of bioaccumulative contaminants in freshwater fishes

    USGS Publications Warehouse

    Sun, Jielun; Kim, Y.; Schmitt, C.J.

    2003-01-01

    The National Contaminant Biomonitoring Program (NCBP) was initiated in 1967 as a component of the National Pesticide Monitoring program. It consists of periodic collection of freshwater fish and other samples and the analysis of the concentrations of persistent environmental contaminants in these samples. For the analysis, the common approach has been to apply the mixed two-way ANOVA model to combined data. A main disadvantage of this method is that it cannot give a detailed temporal trend of the concentrations since the data are grouped. In this paper, we present an alternative approach that performs a longitudinal analysis of the information using random effects models. In the new approach, no grouping is needed and the data are treated as samples from continuous stochastic processes, which seems more appropriate than ANOVA for the problem.

  8. Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.

    PubMed

    You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary

    2011-02-01

    The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the numbers of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using a t-test, and use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for an unequal cluster sizes trial to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that the relative efficiency defined here is greater than the relative efficiency in the literature under some conditions, and may be smaller than the measure in the literature under other conditions, underestimating the relative efficiency. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.
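
    The sketch below illustrates the two-step idea described above (fix the number of clusters, then enlarge the mean cluster size to recover the effective sample size), but it uses the widely cited coefficient-of-variation approximation to the design effect rather than the noncentrality-parameter measure the authors define; the inputs are illustrative.

    ```python
    def required_mean_cluster_size(m_equal, icc, cv):
        """Given the cluster size m_equal that achieves the target power with equal
        cluster sizes, return the mean cluster size needed for the same effective
        sample size when cluster sizes vary with coefficient of variation `cv`,
        keeping the number of clusters fixed. Uses the common design-effect
        approximation DEFF = 1 + ((1 + cv**2) * m - 1) * icc."""
        eff = m_equal / (1 + (m_equal - 1) * icc)     # effective size per cluster, equal sizes
        denom = 1 - eff * (1 + cv**2) * icc
        if denom <= 0:
            raise ValueError("target effective size unattainable by enlarging clusters")
        return eff * (1 - icc) / denom

    # e.g. equal-size requirement of 20 per cluster, ICC = 0.05, cluster-size CV = 0.6
    print(required_mean_cluster_size(m_equal=20, icc=0.05, cv=0.6))   # about 32 per cluster
    ```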

  9. Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR)

    PubMed Central

    Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

    2007-01-01

    Background Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. Methods We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. Application We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. Conclusion This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy. PMID:17543100

  10. Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR).

    PubMed

    Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

    2007-06-01

    Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy.

  11. Sampling Strategies and Processing of Biobank Tissue Samples from Porcine Biomedical Models.

    PubMed

    Blutke, Andreas; Wanke, Rüdiger

    2018-03-06

    In translational medical research, porcine models have steadily become more popular. Considering the high value of individual animals, particularly of genetically modified pig models, and the often-limited number of available animals of these models, establishment of (biobank) collections of adequately processed tissue samples suited for a broad spectrum of subsequent analysis methods, including analyses not specified at the time point of sampling, represents a meaningful approach to take full advantage of the translational value of the model. With respect to the peculiarities of porcine anatomy, comprehensive guidelines have recently been established for standardized generation of representative, high-quality samples from different porcine organs and tissues. These guidelines are essential prerequisites for the reproducibility of results and their comparability between different studies and investigators. The recording of basic data, such as organ weights and volumes, the determination of the sampling locations and of the numbers of tissue samples to be generated, as well as their orientation, size, processing and trimming directions, are relevant factors determining the generalizability and usability of the specimen for molecular, qualitative, and quantitative morphological analyses. Here, an illustrative, practical, step-by-step demonstration of the most important techniques for generation of representative, multi-purpose biobank specimen from porcine tissues is presented. The methods described here include determination of organ/tissue volumes and densities, the application of a volume-weighted systematic random sampling procedure for parenchymal organs by point-counting, determination of the extent of tissue shrinkage related to histological embedding of samples, and generation of randomly oriented samples for quantitative stereological analyses, such as isotropic uniform random (IUR) sections generated by the "Orientator" and "Isector" methods, and vertical uniform random (VUR) sections.

  12. Investigating Factorial Invariance of Latent Variables Across Populations When Manifest Variables Are Missing Completely

    PubMed Central

    Widaman, Keith F.; Grimm, Kevin J.; Early, Dawnté R.; Robins, Richard W.; Conger, Rand D.

    2013-01-01

    Difficulties arise in multiple-group evaluations of factorial invariance if particular manifest variables are missing completely in certain groups. Ad hoc analytic alternatives can be used in such situations (e.g., deleting manifest variables), but some common approaches, such as multiple imputation, are not viable. At least 3 solutions to this problem are viable: analyzing differing sets of variables across groups, using pattern mixture approaches, and a new method using random number generation. The latter solution, proposed in this article, is to generate pseudo-random normal deviates for all observations for manifest variables that are missing completely in a given sample and then to specify multiple-group models in a way that respects the random nature of these values. An empirical example is presented in detail comparing the 3 approaches. The proposed solution can enable quantitative comparisons at the latent variable level between groups using programs that require the same number of manifest variables in each group. PMID:24019738

  13. Data splitting for artificial neural networks using SOM-based stratified sampling.

    PubMed

    May, R J; Maier, H R; Dandy, G C

    2010-03-01

    Data splitting is an important consideration during artificial neural network (ANN) development where hold-out cross-validation is commonly employed to ensure generalization. Even for a moderate sample size, the sampling methodology used for data splitting can have a significant effect on the quality of the subsets used for training, testing and validating an ANN. Poor data splitting can result in inaccurate and highly variable model performance; however, the choice of sampling methodology is rarely given due consideration by ANN modellers. Increased confidence in the sampling is of paramount importance, since the hold-out sampling is generally performed only once during ANN development. This paper considers the variability in the quality of subsets that are obtained using different data splitting approaches. A novel approach to stratified sampling, based on Neyman sampling of the self-organizing map (SOM), is developed, with several guidelines identified for setting the SOM size and sample allocation in order to minimize the bias and variance in the datasets. Using an example ANN function approximation task, the SOM-based approach is evaluated in comparison to random sampling, DUPLEX, systematic stratified sampling, and trial-and-error sampling to minimize the statistical differences between data sets. Of these approaches, DUPLEX is found to provide benchmark performance with good model performance, with no variability. The results show that the SOM-based approach also reliably generates high-quality samples and can therefore be used with greater confidence than other approaches, especially in the case of non-uniform datasets, with the benefit of scalability to perform data splitting on large datasets. Copyright 2009 Elsevier Ltd. All rights reserved.
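
    The general idea of allocating a training sample across strata of the input space can be sketched as follows; note that k-means is used here only as a stand-in for the self-organizing map, and the Neyman allocation n_h ∝ N_h·s_h (stratum size times within-stratum output standard deviation) follows the principle described in the abstract rather than the authors' exact procedure. Function and parameter names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def neyman_stratified_split(X, y, n_train, n_clusters=20, seed=0):
    """Split data by clustering the inputs (k-means standing in for a SOM) and
    allocating the training budget across clusters with Neyman allocation:
    n_h proportional to N_h * s_h. X and y are assumed to be NumPy arrays."""
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    sizes = np.bincount(labels, minlength=n_clusters)
    stds = np.array([y[labels == h].std() if sizes[h] > 1 else 0.0
                     for h in range(n_clusters)])
    alloc = sizes * stds
    if alloc.sum() == 0:                     # degenerate case: fall back to
        alloc = sizes.astype(float)          # proportional allocation
    alloc = np.maximum(1, np.round(n_train * alloc / alloc.sum())).astype(int)
    train_idx = []
    for h in range(n_clusters):
        members = np.where(labels == h)[0]
        take = min(alloc[h], members.size)
        train_idx.extend(rng.choice(members, take, replace=False))
    train_idx = np.asarray(train_idx)
    rest_idx = np.setdiff1d(np.arange(len(y)), train_idx)
    return train_idx, rest_idx
```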

  14. The Impact of Individual Differences on a Bilingual Vocabulary Approach for Latino Preschoolers.

    PubMed

    Méndez, Lucía I; Crais, Elizabeth R; Kainz, Kirsten

    2018-04-17

    The purpose of this study was twofold: First, we replicated in a new sample our previous findings that a culturally and linguistically responsive (CLR) bilingual approach for English vocabulary instruction for preschool Latino dual language learners was effective. Subsequently, we investigated whether the positive effect of CLR instruction varies as a function of individual child characteristics, including baseline vocabulary levels and gender. Using a randomized pretest-posttest follow-up group design, we first replicated our previous study (N = 42) with a new sample by randomly assigning 35 Spanish-speaking Latino preschoolers to a CLR bilingual group or an English-only group. The preschoolers received small-group evidence-informed shared readings targeting 30 English words 3 times a week for 5 weeks in their preschools. Vocabulary outcomes were measured using both standardized and researcher-developed measures. We subsequently conducted further studies with the combined sample size of 77 children to examine the variability in intervention effects related to child gender and baseline vocabulary levels. The direct replication study confirmed findings of our earlier work suggesting that the CLR bilingual approach promoted greater gains in L1 and L2 vocabulary than in an English-only approach. The extension studies revealed that the effect of the CLR bilingual vocabulary approach on English and Spanish vocabulary outcomes was not impacted by gender or vocabulary status at baseline. This study provides additional evidence of the benefits of strategically combining L1 and L2 for vocabulary instruction over an English-only approach. Our findings also suggest that preschool Latino dual language learners can benefit from a bilingual vocabulary instructional approach regardless of gender or baseline vocabulary levels in L1.

  15. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy many demands for resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning a random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR imagery produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements to the RF classifier are made: a voting-distribution ranked rule for reducing the influence of imbalanced samples on classification accuracy, and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is conducted for the city of Beijing, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful for studying many environmental and social problems.
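
    The voting-distribution ranked rule and the feature importance measurement are specific to the paper; the sketch below only illustrates the generic ingredients they build on, namely a random forest trained on imbalanced, high-dimensional object samples with per-class reweighting and the standard impurity-based feature importances. The function name and data layout are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# X: (n_objects, n_features) object features from the two-level segmentation.
# y: building category labels with strongly imbalanced class counts.
def train_building_classifier(X, y, seed=0):
    rf = RandomForestClassifier(
        n_estimators=500,
        class_weight="balanced_subsample",   # reweight classes within each bootstrap
        random_state=seed,
        n_jobs=-1,
    )
    rf.fit(X, y)
    # Rank features by their (impurity-based) contribution to the classification.
    ranking = np.argsort(rf.feature_importances_)[::-1]
    return rf, ranking
```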

  16. A Method for Estimating View Transformations from Image Correspondences Based on the Harmony Search Algorithm

    PubMed Central

    Cuevas, Erik; Díaz, Margarita

    2015-01-01

    In this paper, a new method for robustly estimating multiple view relations from point correspondences is presented. The approach combines the popular random sampling consensus (RANSAC) algorithm and the evolutionary method harmony search (HS). With this combination, the proposed method adopts a different sampling strategy than RANSAC to generate putative solutions. Under the new mechanism, at each iteration, new candidate solutions are built taking into account the quality of the models generated by previous candidate solutions, rather than purely at random, as is the case with RANSAC. The rules for the generation of candidate solutions (samples) are motivated by the improvisation process that occurs when a musician searches for a better state of harmony. As a result, the proposed approach can substantially reduce the number of iterations while still preserving the robust capabilities of RANSAC. The method is generic and its use is illustrated by the estimation of homographies, considering synthetic and real images. Additionally, in order to demonstrate the performance of the proposed approach within a real engineering application, it is employed to solve the problem of position estimation in a humanoid robot. Experimental results validate the efficiency of the proposed method in terms of accuracy, speed, and robustness. PMID:26339228

  17. Censoring: a new approach for detection limits in total-reflection X-ray fluorescence

    NASA Astrophysics Data System (ADS)

    Pajek, M.; Kubala-Kukuś, A.; Braziewicz, J.

    2004-08-01

    It is shown that the detection limits in total-reflection X-ray fluorescence (TXRF), which restrict quantification of very low concentrations of trace elements in samples, can be accounted for using the statistical concept of censoring. We demonstrate that incomplete TXRF measurements containing so-called "nondetects", i.e. non-measured concentrations falling below the detection limits and represented by the estimated detection limit values, can be viewed as left random-censored data, which can be further analyzed using the Kaplan-Meier (KM) method correcting for nondetects. Within this approach, which uses the Kaplan-Meier product-limit estimator to obtain the cumulative distribution function corrected for the nondetects, the mean and median of the detection-limit-censored concentrations can be estimated in a non-parametric way. The Monte Carlo simulations performed show that the Kaplan-Meier approach yields highly accurate estimates of the mean and median concentrations, lying within a few percent of the simulated, uncensored data. This means that the uncertainties of the KM-estimated mean and median are in fact limited only by the number of studied samples and not by the applied correction procedure for nondetects itself. On the other hand, it is observed that, in cases where the concentration of a given element is not measured in all samples, simple approaches to estimating a mean concentration from the data yield erroneous, systematically biased results. The discussed random left-censoring approach was applied to analyze TXRF detection-limit-censored concentration measurements of trace elements in biomedical samples. We emphasize that the Kaplan-Meier approach allows one to estimate mean concentrations lying substantially below the mean level of the detection limits. Consequently, this approach provides a new way to lower the effective detection limits of the TXRF method, which is of prime interest for the investigation of metallic impurities on silicon wafers.
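
    A compact sketch of the Kaplan-Meier treatment of nondetects outlined above: left-censored concentrations are turned into right-censored data by negation, the product-limit estimator is applied, and the mean is recovered from the jumps of the estimated distribution function at the detected values. The handling of ties and of the probability mass below the smallest detect follows a common convention and is an assumption, not the authors' exact implementation.

```python
import numpy as np

def km_mean_nondetects(conc, detected):
    """Kaplan-Meier estimate of the mean concentration for data with nondetects.

    conc     : concentrations; for nondetects, the reported detection limit
    detected : True where the value is an actual measurement, False for nondetects
    """
    conc = np.asarray(conc, float)
    detected = np.asarray(detected, bool)
    y = -conc                                   # flip: left- becomes right-censoring
    order = np.argsort(y, kind="mergesort")     # stable sort
    y, d, x = y[order], detected[order], conc[order]
    n = len(y)
    at_risk = n - np.arange(n)                  # observations with y_j >= y_i
    surv = np.cumprod(np.where(d, 1 - 1 / at_risk, 1.0))   # S_Y after each point
    prev = np.concatenate(([1.0], surv[:-1]))
    jumps = prev - surv                         # CDF jumps of X at detected values
    mean = np.sum(jumps * x)
    leftover = surv[-1]                         # mass below the smallest observation
    if detected.any():
        mean += leftover * conc[detected].min() # common convention for residual mass
    return mean
```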

  18. Study protocol for a cluster randomized trial of the Community of Voices choir intervention to promote the health and well-being of diverse older adults.

    PubMed

    Johnson, Julene K; Nápoles, Anna M; Stewart, Anita L; Max, Wendy B; Santoyo-Olsson, Jasmine; Freyre, Rachel; Allison, Theresa A; Gregorich, Steven E

    2015-10-13

    Older adults are the fastest growing segment of the United States population. There is an immediate need to identify novel, cost-effective community-based approaches that promote health and well-being for older adults, particularly those from diverse racial/ethnic and socioeconomic backgrounds. Because choral singing is multi-modal (requires cognitive, physical, and psychosocial engagement), it has the potential to improve health outcomes across several dimensions to help older adults remain active and independent. The purpose of this study is to examine the effect of a community choir program (Community of Voices) on health and well-being and to examine its costs and cost-effectiveness in a large sample of diverse, community-dwelling older adults. In this cluster randomized controlled trial, diverse adults age 60 and older were enrolled at Administration on Aging-supported senior centers and completed baseline assessments. The senior centers were randomly assigned to either start the choir immediately (intervention group) or wait 6 months to start (control). Community of Voices is a culturally tailored choir program delivered at the senior centers by professional music conductors that reflects three components of engagement (cognitive, physical, and psychosocial). We describe the nature of the study including the cluster randomized trial study design, sampling frame, sample size calculation, methods of recruitment and assessment, and primary and secondary outcomes. The study involves conducting a randomized trial of an intervention as delivered in "real-world" settings. The choir program was designed using a novel translational approach that integrated evidence-based research on the benefits of singing for older adults, community best practices related to community choirs for older adults, and the perspective of the participating communities. The practicality and relatively low cost of the choir intervention means it can be incorporated into a variety of community settings and adapted to diverse cultures and languages. If successful, this program will be a practical and acceptable community-based approach for promoting health and well-being of older adults. ClinicalTrials.gov NCT01869179 registered 9 January 2013.

  19. Using GIS to generate spatially balanced random survey designs for natural resource applications.

    PubMed

    Theobald, David M; Stevens, Don L; White, Denis; Urquhart, N Scott; Olsen, Anthony R; Norman, John B

    2007-07-01

    Sampling of a population is frequently required to understand trends and patterns in natural resource management because financial and time constraints preclude a complete census. A rigorous probability-based survey design specifies where to sample so that inferences from the sample apply to the entire population. Probability survey designs should be used in natural resource and environmental management situations because they provide the mathematical foundation for statistical inference. Development of long-term monitoring designs demands survey designs that achieve statistical rigor and are efficient but remain flexible to inevitable logistical or practical constraints during field data collection. Here we describe an approach to probability-based survey design, called the Reversed Randomized Quadrant-Recursive Raster, based on the concept of spatially balanced sampling and implemented in a geographic information system. This provides environmental managers with a practical tool to generate flexible and efficient survey designs for natural resource applications. Factors commonly used to modify sampling intensity, such as categories, gradients, or accessibility, can be readily incorporated into the spatially balanced sample design.

  20. Eddy-covariance data with low signal-to-noise ratio: time-lag determination, uncertainties and limit of detection

    NASA Astrophysics Data System (ADS)

    Langford, B.; Acton, W.; Ammann, C.; Valach, A.; Nemitz, E.

    2015-10-01

    All eddy-covariance flux measurements are associated with random uncertainties which are a combination of sampling error due to natural variability in turbulence and sensor noise. The former is the principal error for systems where the signal-to-noise ratio of the analyser is high, as is usually the case when measuring fluxes of heat, CO2 or H2O. Where signal is limited, which is often the case for measurements of other trace gases and aerosols, instrument uncertainties dominate. Here, we are applying a consistent approach based on auto- and cross-covariance functions to quantify the total random flux error and the random error due to instrument noise separately. As with previous approaches, the random error quantification assumes that the time lag between wind and concentration measurement is known. However, if combined with commonly used automated methods that identify the individual time lag by looking for the maximum in the cross-covariance function of the two entities, analyser noise additionally leads to a systematic bias in the fluxes. Combining data sets from several analysers and using simulations, we show that the method of time-lag determination becomes increasingly important as the magnitude of the instrument error approaches that of the sampling error. The flux bias can be particularly significant for disjunct data, whereas using a prescribed time lag eliminates these effects (provided the time lag does not fluctuate unduly over time). We also demonstrate that when sampling at higher elevations, where low frequency turbulence dominates and covariance peaks are broader, both the probability and magnitude of bias are magnified. We show that the statistical significance of noisy flux data can be increased (limit of detection can be decreased) by appropriate averaging of individual fluxes, but only if systematic biases are avoided by using a prescribed time lag. Finally, we make recommendations for the analysis and reporting of data with low signal-to-noise and their associated errors.
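
    A numerical sketch of the covariance-based quantities discussed above: the cross-covariance between vertical wind speed and concentration as a function of lag, the flux evaluated at a prescribed time lag (avoiding the noise-induced bias that arises when the lag is chosen at the covariance maximum), and a random-error estimate taken from the cross-covariance far away from the peak. The away-from-peak lag window and the function names are illustrative assumptions rather than the exact formulation of the paper.

```python
import numpy as np

def cross_covariance(w, c, max_lag):
    """Cross-covariance between vertical wind w and concentration c for lags
    -max_lag..+max_lag (in samples); both series are de-meaned first."""
    w = w - w.mean()
    c = c - c.mean()
    lags = np.arange(-max_lag, max_lag + 1)
    cov = np.empty(lags.size)
    for k, lag in enumerate(lags):
        if lag >= 0:
            cov[k] = np.mean(w[:len(w) - lag] * c[lag:])
        else:
            cov[k] = np.mean(w[-lag:] * c[:len(c) + lag])
    return lags, cov

def flux_at_lag(w, c, lag):
    """Flux with a prescribed time lag, which avoids the bias introduced by
    searching for the covariance maximum when the signal-to-noise ratio is low."""
    lags, cov = cross_covariance(w, c, abs(lag))
    return cov[lags == lag][0]

def random_flux_error(w, c, max_lag, window=(150, 180)):
    """Random flux error estimated from the RMS of the cross-covariance in
    away-from-peak lag windows (window given in samples, illustrative values)."""
    lags, cov = cross_covariance(w, c, max_lag)
    away = (np.abs(lags) >= window[0]) & (np.abs(lags) <= window[1])
    return np.sqrt(np.mean(cov[away] ** 2))
```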

  2. Using Non-experimental Data to Estimate Treatment Effects

    PubMed Central

    Stuart, Elizabeth A.; Marcus, Sue M.; Horvitz-Lennon, Marcela V.; Gibbons, Robert D.; Normand, Sharon-Lise T.

    2009-01-01

    While much psychiatric research is based on randomized controlled trials (RCTs), where patients are randomly assigned to treatments, sometimes RCTs are not feasible. This paper describes propensity score approaches, which are increasingly used for estimating treatment effects in non-experimental settings. The primary goal of propensity score methods is to create sets of treated and comparison subjects who look as similar as possible, in essence replicating a randomized experiment, at least with respect to observed patient characteristics. A study to estimate the metabolic effects of antipsychotic medication in a sample of Florida Medicaid beneficiaries with schizophrenia illustrates methods. PMID:20563313
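
    As a concrete illustration of the propensity score idea (not the analysis carried out in the Medicaid study), the sketch below fits a logistic propensity model and performs 1:1 nearest-neighbour matching on the logit with a caliper; the caliper width, matching with replacement, and the variable names are conventional choices assumed for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def propensity_match(X, treated, caliper=0.2):
    """1:1 nearest-neighbour matching (with replacement) on the logit of the
    propensity score.

    X       : (n, p) array of observed covariates
    treated : boolean array, True for treated subjects
    Returns index arrays (matched treated, matched controls).
    """
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    logit = np.log(ps / (1 - ps))
    cal = caliper * logit.std()                     # caliper as a fraction of the SD
    t_idx, c_idx = np.where(treated)[0], np.where(~treated)[0]
    nn = NearestNeighbors(n_neighbors=1).fit(logit[c_idx, None])
    dist, pos = nn.kneighbors(logit[t_idx, None])
    keep = dist.ravel() <= cal                      # drop matches outside the caliper
    return t_idx[keep], c_idx[pos.ravel()[keep]]
```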

  3. Designing clinical trials for amblyopia

    PubMed Central

    Holmes, Jonathan M.

    2015-01-01

    Randomized clinical trial (RCT) study design leads to one of the highest levels of evidence, and is a preferred study design over cohort studies, because randomization reduces bias and maximizes the chance that even unknown confounding factors will be balanced between treatment groups. Recent randomized clinical trials and observational studies in amblyopia can be taken together to formulate an evidence-based approach to amblyopia treatment, which is presented in this review. When designing future clinical studies of amblyopia treatment, issues such as regression to the mean, sample size and trial duration must be considered, since each may impact study results and conclusions. PMID:25752747

  4. Generalized essential energy space random walks to more effectively accelerate solute sampling in aqueous environment

    NASA Astrophysics Data System (ADS)

    Lv, Chao; Zheng, Lianqing; Yang, Wei

    2012-01-01

    Molecular dynamics sampling can be enhanced by promoting potential energy fluctuations, for instance with a Hamiltonian modified by the addition of a potential-energy-dependent biasing term. To overcome the diffusion sampling issue, namely that enlarging event-irrelevant energy fluctuations can destroy sampling efficiency, the essential energy space random walk (EESRW) approach was proposed earlier. To more effectively accelerate the sampling of solute conformations in an aqueous environment, in the current work we generalized the EESRW method to a two-dimensional EESRW (2D-EESRW) strategy. Specifically, the essential internal energy component of a focused region and the essential interaction energy component between the focused region and the environmental region are employed to define the two-dimensional essential energy space. This proposal is motivated by the general observation that the two essential energy components have distinctive interplays in different conformational events. Model studies on the alanine dipeptide and the aspartate-arginine peptide demonstrate sampling improvement over the original one-dimensional EESRW strategy; with the same biasing level, the present generalization allows more effective acceleration of the sampling of conformational transitions in aqueous solution. The 2D-EESRW generalization is readily extended to higher-dimensional schemes and can be employed in more advanced enhanced-sampling schemes, such as the recent orthogonal space random walk method.

  5. Quantum random bit generation using energy fluctuations in stimulated Raman scattering.

    PubMed

    Bustard, Philip J; England, Duncan G; Nunn, Josh; Moffatt, Doug; Spanner, Michael; Lausten, Rune; Sussman, Benjamin J

    2013-12-02

    Random number sequences are a critical resource in modern information processing systems, with applications in cryptography, numerical simulation, and data sampling. We introduce a quantum random number generator based on the measurement of pulse energy quantum fluctuations in Stokes light generated by spontaneously-initiated stimulated Raman scattering. Bright Stokes pulse energy fluctuations up to five times the mean energy are measured with fast photodiodes and converted to unbiased random binary strings. Since the pulse energy is a continuous variable, multiple bits can be extracted from a single measurement. Our approach can be generalized to a wide range of Raman active materials; here we demonstrate a prototype using the optical phonon line in bulk diamond.
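
    The conversion of a continuous pulse-energy record into unbiased bits can be illustrated with generic randomness-extraction steps: a rank transform makes the samples approximately uniform so several bits can be taken from each measurement, and a von Neumann extractor removes residual bias. These are standard techniques used here for illustration, not necessarily the post-processing applied in the experiment.

```python
import numpy as np

def energies_to_bits(energies, bits_per_sample=4):
    """Map continuous pulse-energy measurements to raw bits via the empirical CDF
    (rank transform), so each sample yields several approximately uniform bits."""
    ranks = np.argsort(np.argsort(energies))                 # 0..n-1
    u = (ranks + 0.5) / len(energies)                        # roughly uniform in (0, 1)
    levels = np.floor(u * 2 ** bits_per_sample).astype(int)
    raw = ((levels[:, None] >> np.arange(bits_per_sample)[::-1]) & 1).ravel()
    return raw

def von_neumann_debias(bits):
    """Classic von Neumann extractor: 01 -> 0, 10 -> 1, 00 and 11 discarded."""
    pairs = bits[: len(bits) // 2 * 2].reshape(-1, 2)
    keep = pairs[:, 0] != pairs[:, 1]
    return pairs[keep, 0]
```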

  6. Robust reliable sampled-data control for switched systems with application to flight control

    NASA Astrophysics Data System (ADS)

    Sakthivel, R.; Joby, Maya; Shi, P.; Mathiyalagan, K.

    2016-11-01

    This paper addresses the robust reliable stabilisation problem for a class of uncertain switched systems with random delays and norm-bounded uncertainties. The main aim is to obtain a reliable robust sampled-data control design, involving random time delays and an appropriate control gain matrix, that achieves robust exponential stabilisation of the uncertain switched system against actuator failures. In particular, the involved delays are assumed to be randomly time-varying and to obey certain mutually uncorrelated Bernoulli-distributed white noise sequences. By constructing an appropriate Lyapunov-Krasovskii functional (LKF) and employing an average dwell-time approach, a new set of criteria is derived for ensuring the robust exponential stability of the closed-loop switched system. More precisely, the Schur complement and Jensen's integral inequality are used in the derivation of the stabilisation criteria. By considering the relationship among the random time-varying delay and its lower and upper bounds, a new set of sufficient conditions is established for the existence of a reliable robust sampled-data control in terms of the solution to linear matrix inequalities (LMIs). Finally, an illustrative example based on the F-18 aircraft model is provided to show the effectiveness of the proposed design procedures.

  7. The Politics of Speaking: An Approach to Evaluating Bilingual-Bicultural Schools. Bilingual Education Paper Series, Vol. 1, No. 6.

    ERIC Educational Resources Information Center

    Erickson, Frederick

    A method of evaluating bilingual-bicultural education programs that has a sociolinguistic basis uses samples of the language spoken by a number of bilingual program students as they go through their school day. A random sample of the child's speech would be continuously recorded for an hour, with a bilingual observer taking running notes on where…

  8. Application of a Monte Carlo framework with bootstrapping for quantification of uncertainty in baseline map of carbon emissions from deforestation in Tropical Regions

    Treesearch

    William Salas; Steve Hagen

    2013-01-01

    This presentation will provide an overview of an approach for quantifying uncertainty in spatial estimates of carbon emission from land use change. We generate uncertainty bounds around our final emissions estimate using a randomized, Monte Carlo (MC)-style sampling technique. This approach allows us to combine uncertainty from different sources without making...

  9. A Comparative Study of a Team Vs. a Non-Team Teaching Approach in High School Biology.

    ERIC Educational Resources Information Center

    Monaco, William J.; Szabo, Michael

    The purpose of this study was to obtain empirical data for the evaluation of a team approach for teaching biology. A sample of 147 sophomore students was divided into six sections; each section was randomly assigned as control or treatment. The team consisted of three instructors. Each member taught a control group for an entire year. After an…

  10. Effects of Eclectic Learning Approach on Students' Academic Achievement and Retention in English at Elementary Level

    ERIC Educational Resources Information Center

    Suleman, Qaiser; Hussain, Ishtiaq

    2016-01-01

    The purpose of the research paper was to investigate the effect of eclectic learning approach on the academic achievement and retention of students in English at elementary level. A sample of forty students of 8th grade randomly selected from Government Boys High School Khurram District Karak was used. It was an experimental study and that's why…

  11. Quantifying the impact of time-varying baseline risk adjustment in the self-controlled risk interval design.

    PubMed

    Li, Lingling; Kulldorff, Martin; Russek-Cohen, Estelle; Kawai, Alison Tse; Hua, Wei

    2015-12-01

    The self-controlled risk interval design is commonly used to assess the association between an acute exposure and an adverse event of interest, implicitly adjusting for fixed, non-time-varying covariates. Explicit adjustment needs to be made for time-varying covariates, for example, age in young children. It can be performed via either a fixed or random adjustment. The random-adjustment approach can provide valid point and interval estimates but requires access to individual-level data for an unexposed baseline sample. The fixed-adjustment approach does not have this requirement and will provide a valid point estimate but may underestimate the variance. We conducted a comprehensive simulation study to evaluate their performance. We designed the simulation study using empirical data from the Food and Drug Administration-sponsored Mini-Sentinel Post-licensure Rapid Immunization Safety Monitoring Rotavirus Vaccines and Intussusception study in children 5-36.9 weeks of age. The time-varying confounder is age. We considered a variety of design parameters including sample size, relative risk, time-varying baseline risks, and risk interval length. The random-adjustment approach has very good performance in almost all considered settings. The fixed-adjustment approach can be used as a good alternative when the number of events used to estimate the time-varying baseline risks is at least the number of events used to estimate the relative risk, which is almost always the case. We successfully identified settings in which the fixed-adjustment approach can be used as a good alternative and provided guidelines on the selection and implementation of appropriate analyses for the self-controlled risk interval design. Copyright © 2015 John Wiley & Sons, Ltd.

  12. Self-balanced real-time photonic scheme for ultrafast random number generation

    NASA Astrophysics Data System (ADS)

    Li, Pu; Guo, Ya; Guo, Yanqiang; Fan, Yuanlong; Guo, Xiaomin; Liu, Xianglian; Shore, K. Alan; Dubrova, Elena; Xu, Bingjie; Wang, Yuncai; Wang, Anbang

    2018-06-01

    We propose a real-time self-balanced photonic method for extracting ultrafast random numbers from broadband randomness sources. In place of electronic analog-to-digital converters (ADCs), the balanced photo-detection technology is used to directly quantize optically sampled chaotic pulses into a continuous random number stream. Benefitting from ultrafast photo-detection, our method can efficiently eliminate the generation rate bottleneck from electronic ADCs which are required in nearly all the available fast physical random number generators. A proof-of-principle experiment demonstrates that using our approach 10 Gb/s real-time and statistically unbiased random numbers are successfully extracted from a bandwidth-enhanced chaotic source. The generation rate achieved experimentally here is being limited by the bandwidth of the chaotic source. The method described has the potential to attain a real-time rate of 100 Gb/s.

  13. Simple-random-sampling-based multiclass text classification algorithm.

    PubMed

    Liu, Wuying; Wang, Lin; Yi, Mianzhu

    2014-01-01

    Multiclass text classification (MTC) is a challenging issue and the corresponding MTC algorithms can be used in many applications. The space-time overhead of such algorithms is a major concern in the era of big data. Through an investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm. Supported by a token-level memory that stores labeled documents, the SRSMTC algorithm uses a text retrieval approach to solve text classification problems. The experimental results on the TanCorp data set show that the SRSMTC algorithm can achieve state-of-the-art performance with greatly reduced space-time requirements.

  14. Pseudo-Random Sequence Modifications for Ion Mobility Orthogonal Time of Flight Mass Spectrometry

    PubMed Central

    Clowers, Brian H.; Belov, Mikhail E.; Prior, David C.; Danielson, William F.; Ibrahim, Yehia; Smith, Richard D.

    2008-01-01

    Due to the inherently low duty cycle of ion mobility spectrometry (IMS) experiments that sample from continuous ion sources, a range of experimental advances have been developed to maximize ion utilization efficiency. The use of ion trapping mechanisms prior to the ion mobility drift tube has demonstrated significant gains over discrete sampling from continuous sources; however, these technologies have traditionally relied upon signal averaging to attain analytically relevant signal-to-noise ratios (SNR). Multiplexed (MP) techniques based upon the Hadamard transform offer an alternative experimental approach by which ion utilization efficiency can be elevated to ∼50%. Recently, our research group demonstrated a unique multiplexed ion mobility time-of-flight (MP-IMS-TOF) approach that incorporates ion trapping and can extend ion utilization efficiency beyond 50%. However, the spectral reconstruction of the multiplexed signal using this experimental approach requires the use of sample-specific weighing designs. Though general weighing designs have been shown to significantly enhance ion utilization efficiency using this MP technique, such weighing designs cannot be applied to all samples. By modifying both the ion funnel trap and the pseudo-random sequence (PRS) used for the MP experiment, we have eliminated the need for complex weighing matrices. For both simple and complex mixtures, SNR enhancements of up to 13 were routinely observed as compared to the SA-IMS-TOF experiment. In addition, this new class of PRS provides a twofold enhancement in ion throughput compared to the traditional HT-IMS experiment. PMID:18311942
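
    For reference, the kind of maximal-length pseudo-random sequence that underlies Hadamard-type multiplexing can be produced with a linear-feedback shift register; the sketch below uses the standard taps for a 127-element m-sequence and is illustrative only, since the paper's modified PRS and trap timing are specific to the instrument.

```python
import numpy as np

def m_sequence(taps=(7, 6), seed=1):
    """Maximal-length pseudo-random sequence (m-sequence) from a Fibonacci LFSR.
    taps are 1-indexed register stages; (7, 6) corresponds to x^7 + x^6 + 1 and
    yields a sequence of period 2**7 - 1 = 127."""
    n = max(taps)
    state = [(seed >> i) & 1 for i in range(n)]
    if not any(state):
        raise ValueError("seed must be nonzero")
    out = []
    for _ in range(2 ** n - 1):
        out.append(state[-1])                   # output the last register stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]         # shift and insert the feedback bit
    return np.array(out)

prs = m_sequence()
print(prs.size, prs.sum())   # 127 elements, 64 ones for a maximal-length sequence
```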

  15. Validation of optical codes based on 3D nanostructures

    NASA Astrophysics Data System (ADS)

    Carnicer, Artur; Javidi, Bahram

    2017-05-01

    Image information encoding using random phase masks produces speckle-like noise distributions when the sample is propagated in the Fresnel domain. As a result, the information cannot be accessed by simple visual inspection. Phase masks can be easily implemented in practice by attaching cello-tape to the plain-text message. Conventional 2D phase masks can be generalized to 3D by combining glass and diffusers, resulting in a more complex physical unclonable function. In this communication, we model the behavior of a 3D phase mask using a simple approach: light is propagated through glass using the angular spectrum of plane waves, whereas the diffuser is described as a random phase mask together with a blurring effect on the amplitude of the propagated wave. Using different designs for the 3D phase mask and multiple samples, we demonstrate that classification is possible using the k-nearest neighbors and random forests machine learning algorithms.
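
    The simple model described above, a random phase mask followed by free-space propagation via the angular spectrum of plane waves, can be sketched numerically. The wavelength, pixel size, propagation distance and toy message below are illustrative values, and the diffuser's amplitude blurring is omitted for brevity.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a complex field a distance z using the angular spectrum of plane
    waves (evanescent components are suppressed)."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, dx)
    fy = np.fft.fftfreq(ny, dx)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0))
    H = np.exp(1j * kz * z) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Random phase mask applied to a toy plain-text amplitude image, then propagated.
rng = np.random.default_rng(0)
img = np.zeros((256, 256))
img[96:160, 96:160] = 1.0                                    # toy "message"
mask = np.exp(2j * np.pi * rng.random(img.shape))            # random phase mask
speckle = angular_spectrum_propagate(img * mask, 0.6328e-6, 4e-6, 2e-3)
intensity = np.abs(speckle) ** 2                             # speckle-like output
```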

  16. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    PubMed Central

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105
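
    A minimal sketch of the comparison between uncertainty-based querying and random sampling on a binary task; the least-confidence rule, the logistic-regression learner and the loop structure are generic choices for illustration, not the specific algorithms evaluated in the i2b2/VA study, and the initial random batch is assumed to contain both classes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def active_learning_run(X_pool, y_pool, X_test, y_test, strategy="uncertainty",
                        n_init=10, n_queries=50, seed=0):
    """One active-learning run; returns the test AUC after each query so that
    'uncertainty' (least-confidence) querying can be compared with 'random'."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), n_init, replace=False))
    aucs = []
    for _ in range(n_queries):
        clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
        aucs.append(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
        unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled)
        if strategy == "uncertainty":
            p = clf.predict_proba(X_pool[unlabeled])[:, 1]
            pick = unlabeled[np.argmin(np.abs(p - 0.5))]   # closest to the boundary
        else:                                              # passive baseline
            pick = rng.choice(unlabeled)
        labeled.append(int(pick))
    return aucs
```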

  17. Constraining Thermal Histories by Monte Carlo Simulation of Mg-Fe Isotopic Profiles in Olivine

    NASA Astrophysics Data System (ADS)

    Sio, C. K. I.; Dauphas, N.

    2016-12-01

    In thermochronology, random time-temperature (t-T) paths are generated and used as inputs to model fission track data. This random search method is used to identify a range of acceptable thermal histories that can describe the data. We have extended this modeling approach to magmatic systems. This approach utilizes both the chemical and stable isotope profiles measured in crystals as model constraints. Specifically, the isotopic profiles are used to determine the relative contribution of crystal growth vs. diffusion in generating chemical profiles, and to detect changes in melt composition. With this information, tighter constraints can be placed on the thermal evolution of magmatic bodies. We use an olivine phenocryst from the Kilauea Iki lava lake, HI, to demonstrate proof of concept. We treat this sample as one with little geologic context, then compare our modeling results to the known thermal history experienced by that sample. To complete forward modeling, we use MELTS to estimate the boundary condition, initial and quench temperatures. We also assume a simple relationship between crystal growth and cooling rate. Another important parameter is the isotopic effect for diffusion (i.e., the relative diffusivity of the light vs. heavy isotope of an element). The isotopic effects for Mg and Fe diffusion in olivine have been estimated based on natural samples; experiments to better constrain these parameters are underway. We find that 40% of the random t-T paths can be used to fit the Mg-Fe chemical profiles. However, only a few can be used to simultaneously fit the Mg-Fe isotopic profiles. These few t-T paths are close to the independently determined t-T history of the sample. This modeling approach can be further extended to other igneous and metamorphic systems where data exist for diffusion rates, crystal growth rates, and isotopic effects for diffusion.

  18. Effects of unstratified and centre-stratified randomization in multi-centre clinical trials.

    PubMed

    Anisimov, Vladimir V

    2011-01-01

    This paper deals with the analysis of randomization effects in multi-centre clinical trials. The two randomization schemes most often used in clinical trials are considered: unstratified and centre-stratified block-permuted randomization. The prediction of the number of patients randomized to different treatment arms in different regions during the recruitment period accounting for the stochastic nature of the recruitment and effects of multiple centres is investigated. A new analytic approach using a Poisson-gamma patient recruitment model (patients arrive at different centres according to Poisson processes with rates sampled from a gamma distributed population) and its further extensions is proposed. Closed-form expressions for corresponding distributions of the predicted number of the patients randomized in different regions are derived. In the case of two treatments, the properties of the total imbalance in the number of patients on treatment arms caused by using centre-stratified randomization are investigated and for a large number of centres a normal approximation of imbalance is proved. The impact of imbalance on the power of the study is considered. It is shown that the loss of statistical power is practically negligible and can be compensated by a minor increase in sample size. The influence of patient dropout is also investigated. The impact of randomization on predicted drug supply overage is discussed. Copyright © 2010 John Wiley & Sons, Ltd.
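
    The Poisson-gamma recruitment model and the residual imbalance produced by centre-stratified block-permuted randomization lend themselves to a short simulation; the centre count, rate parameters, block size and recruitment period below are illustrative assumptions, and the paper's closed-form results are not reproduced here.

```python
import numpy as np

def simulate_recruitment(n_centres=50, mean_rate=0.5, cv=0.8,
                         duration_days=180, block=4, seed=0):
    """Poisson-gamma recruitment: each centre's daily rate is drawn from a gamma
    distribution, patients arrive as a Poisson process, and within each centre
    patients are assigned to two arms by block-permuted randomization."""
    rng = np.random.default_rng(seed)
    shape = 1 / cv ** 2
    scale = mean_rate / shape
    rates = rng.gamma(shape, scale, n_centres)           # centre-specific daily rates
    counts = rng.poisson(rates * duration_days)          # patients per centre
    imbalance = 0
    for n in counts:
        # Complete blocks are perfectly balanced; only the residual block is not.
        r = n % block
        arm_a = rng.hypergeometric(block // 2, block // 2, r) if r else 0
        imbalance += arm_a - (r - arm_a)
    return counts.sum(), imbalance

total, imbalance = simulate_recruitment()
print(f"patients recruited: {total}, imbalance between arms: {imbalance}")
```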

  19. Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs.

    PubMed

    Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael

    2013-12-01

    Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.

  20. Knowledge-based nonuniform sampling in multidimensional NMR.

    PubMed

    Schuyler, Adam D; Maciejewski, Mark W; Arthanari, Haribabu; Hoch, Jeffrey C

    2011-07-01

    The full resolution afforded by high-field magnets is rarely realized in the indirect dimensions of multidimensional NMR experiments because of the time cost of uniformly sampling to long evolution times. Emerging methods utilizing nonuniform sampling (NUS) enable high resolution along indirect dimensions by sampling long evolution times without sampling at every multiple of the Nyquist sampling interval. While the earliest NUS approaches matched the decay of sampling density to the decay of the signal envelope, recent approaches based on coupled evolution times attempt to optimize sampling by choosing projection angles that increase the likelihood of resolving closely-spaced resonances. These approaches employ knowledge about chemical shifts to predict optimal projection angles, whereas prior applications of tailored sampling employed only knowledge of the decay rate. In this work we adapt the matched filter approach as a general strategy for knowledge-based nonuniform sampling that can exploit prior knowledge about chemical shifts and is not restricted to sampling projections. Based on several measures of performance, we find that exponentially weighted random sampling (envelope matched sampling) performs better than shift-based sampling (beat matched sampling). While shift-based sampling can yield small advantages in sensitivity, the gains are generally outweighed by diminished robustness. Our observation that more robust sampling schemes are only slightly less sensitive than schemes highly optimized using prior knowledge about chemical shifts has broad implications for any multidimensional NMR study employing NUS. The results derived from simulated data are demonstrated with a sample application to PfPMT, the phosphoethanolamine methyltransferase of the human malaria parasite Plasmodium falciparum.
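
    A sketch of the envelope-matched (exponentially weighted) random sampling idea: points on the Nyquist grid of an indirect dimension are drawn without replacement with probability proportional to exp(-t/T2). The grid size, number of samples and decay constant are illustrative, and forcing the first increment to be kept is a common convention rather than a detail of the paper.

```python
import numpy as np

def envelope_matched_schedule(n_max=256, n_samples=64, t2_points=80.0, seed=0):
    """Nonuniform sampling schedule for an indirect dimension in which the sampling
    density decays like the signal envelope exp(-t/T2) (T2 given in grid points)."""
    rng = np.random.default_rng(seed)
    grid = np.arange(n_max)
    weights = np.exp(-grid / t2_points)
    weights /= weights.sum()
    picked = rng.choice(grid, size=n_samples, replace=False, p=weights)
    picked.sort()
    if picked[0] != 0:                  # always keep the first increment (convention)
        picked[0] = 0
    return picked

print(envelope_matched_schedule()[:10])   # the densely sampled early evolution times
```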

  1. Comparative study of feature selection with ensemble learning using SOM variants

    NASA Astrophysics Data System (ADS)

    Filali, Ameni; Jlassi, Chiraz; Arous, Najet

    2017-03-01

    Ensemble learning has improved the stability and accuracy of clustering, but its runtime prevents it from scaling up to real-world applications. This study addresses the problem of selecting a subset of the most pertinent features for every cluster in a dataset. The proposed method is an extension of the Random Forests approach, using self-organizing map (SOM) variants on unlabeled data, that estimates out-of-bag feature importance from a set of partitions. Every partition is created using a different bootstrap sample and a random subset of the features. We then show that the internal estimates used to measure variable importance in Random Forests are also applicable to feature selection in unsupervised learning. The approach aims at dimensionality reduction, visualization and cluster characterization at the same time. We provide empirical results on nineteen benchmark data sets indicating that RFS can lead to significant improvements in clustering accuracy, over several state-of-the-art unsupervised methods, with a very limited subset of features. The approach shows promise for very broad domains.

  2. Analysis of creative mathematic thinking ability in problem based learning model based on self-regulation learning

    NASA Astrophysics Data System (ADS)

    Munahefi, D. N.; Waluya, S. B.; Rochmad

    2018-03-01

    The purpose of this research was to identify the effectiveness of a Problem Based Learning (PBL) model based on Self Regulation Learning (SRL) on mathematical creative thinking ability and to analyse the mathematical creative thinking of high school students in solving mathematical problems. The population of this study was grade X students of SMA N 3 Klaten. The research method used was sequential explanatory. In the quantitative stage, a simple random sampling technique was used in which two classes were selected at random: the experimental class was taught with the SRL-based PBL model and the control class with an expository model. In the qualitative stage, a non-probability sampling technique was used in which three students were selected from each of the high, medium, and low academic levels. The PBL model with the SRL approach was effective for students' mathematical creative thinking ability. Low academic level students taught with the SRL-based PBL model achieved the aspects of fluency and flexibility. Students at the medium academic level achieved the fluency and flexibility aspects well, but their originality was not yet well structured. Students at the high academic level could reach the aspect of originality.

  3. Estimating Sampling Selection Bias in Human Genetics: A Phenomenological Approach

    PubMed Central

    Risso, Davide; Taglioli, Luca; De Iasio, Sergio; Gueresi, Paola; Alfani, Guido; Nelli, Sergio; Rossi, Paolo; Paoli, Giorgio; Tofanelli, Sergio

    2015-01-01

    This research is the first empirical attempt to calculate the various components of the hidden bias associated with the sampling strategies routinely-used in human genetics, with special reference to surname-based strategies. We reconstructed surname distributions of 26 Italian communities with different demographic features across the last six centuries (years 1447–2001). The degree of overlapping between "reference founding core" distributions and the distributions obtained from sampling the present day communities by probabilistic and selective methods was quantified under different conditions and models. When taking into account only one individual per surname (low kinship model), the average discrepancy was 59.5%, with a peak of 84% by random sampling. When multiple individuals per surname were considered (high kinship model), the discrepancy decreased by 8–30% at the cost of a larger variance. Criteria aimed at maximizing locally-spread patrilineages and long-term residency appeared to be affected by recent gene flows much more than expected. Selection of the more frequent family names following low kinship criteria proved to be a suitable approach only for historically stable communities. In any other case true random sampling, despite its high variance, did not return more biased estimates than other selective methods. Our results indicate that the sampling of individuals bearing historically documented surnames (founders' method) should be applied, especially when studying the male-specific genome, to prevent an over-stratification of ancient and recent genetic components that heavily biases inferences and statistics. PMID:26452043

  5. Contextual Approach with Guided Discovery Learning and Brain Based Learning in Geometry Learning

    NASA Astrophysics Data System (ADS)

    Kartikaningtyas, V.; Kusmayadi, T. A.; Riyadi

    2017-09-01

    The aim of this study was to combine the contextual approach with Guided Discovery Learning (GDL) and Brain Based Learning (BBL) in junior high school geometry learning, and to analyse the effect of the contextual approach with GDL and BBL on geometry learning. GDL-contextual and BBL-contextual models were built from the steps of GDL and BBL combined with the principles of the contextual approach. To validate the models, a quasi-experimental design with two experimental groups was used. The sample, chosen by stratified cluster random sampling, consisted of 150 grade 8 junior high school students. The data were collected through a mathematics achievement test given to each group after treatment and were analysed using one-way ANOVA with unequal cells. The results show that GDL-contextual does not have a different effect from BBL-contextual on mathematics achievement in geometry learning, meaning that both models can be used as innovative approaches to geometry learning.

  6. USING HISTORICAL BIOLOGICAL DATA TO EVALUATE STATUS AND TRENDS IN THE BIG DARBY CREEK WATERSHED (OHIO, USA)

    EPA Science Inventory

    Assessment of watershed ecological status and trends is challenging for managers who lack randomly or consistently sampled data, or monitoring programs developed from a watershed perspective. This study investigated analytical approaches for assessment of status and trends using ...

  7. Optimal sampling with prior information of the image geometry in microfluidic MRI.

    PubMed

    Han, S H; Cho, H; Paulsen, J L

    2015-03-01

    Recent advances in MRI acquisition for microscopic flows enable unprecedented sensitivity and speed in a portable NMR/MRI microfluidic analysis platform. However, the application of MRI to microfluidics usually suffers from prolonged acquisition times owing to the combination of the required high resolution and wide field of view necessary to resolve details within microfluidic channels. When prior knowledge of the image geometry is available as a binarized image, such as for microfluidic MRI, it is possible to reduce sampling requirements by incorporating this information into the reconstruction algorithm. The current approach to the design of the partial weighted random sampling schemes is to bias toward the high signal energy portions of the binarized image geometry after Fourier transformation (i.e. in its k-space representation). Although this sampling prescription is frequently effective, it can be far from optimal in certain limiting cases, such as for a 1D channel, or more generally yield inefficient sampling schemes at low degrees of sub-sampling. This work explores the tradeoff between signal acquisition and incoherent sampling on image reconstruction quality given prior knowledge of the image geometry for weighted random sampling schemes, finding that optimal distribution is not robustly determined by maximizing the acquired signal but from interpreting its marginal change with respect to the sub-sampling rate. We develop a corresponding sampling design methodology that deterministically yields a near optimal sampling distribution for image reconstructions incorporating knowledge of the image geometry. The technique robustly identifies optimal weighted random sampling schemes and provides improved reconstruction fidelity for multiple 1D and 2D images, when compared to prior techniques for sampling optimization given knowledge of the image geometry. Copyright © 2015 Elsevier Inc. All rights reserved.
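
    The baseline prescription referred to above, biasing a weighted random sampling mask toward the high signal energy portions of the prior geometry's k-space representation, can be sketched as follows; the refinement based on the marginal change of acquired signal with sub-sampling rate, which is the paper's contribution, is not reproduced here. The probability floor and the use of the raw k-space magnitude as the weight are illustrative assumptions.

```python
import numpy as np

def geometry_weighted_mask(geometry, n_keep, floor=1e-3, seed=0):
    """Weighted random k-space sampling mask derived from a binarized image of the
    channel geometry: sampling probability follows the k-space magnitude of the
    prior, with a small floor so every location keeps a nonzero chance."""
    rng = np.random.default_rng(seed)
    kspace = np.fft.fftshift(np.fft.fft2(geometry.astype(float)))
    weights = np.abs(kspace).ravel()
    weights = weights / weights.max() + floor
    p = weights / weights.sum()
    idx = rng.choice(weights.size, size=n_keep, replace=False, p=p)
    mask = np.zeros(weights.size, bool)
    mask[idx] = True
    return mask.reshape(geometry.shape)

# Example: a toy 1D-channel geometry sampled at 25% of the full k-space grid.
geom = np.zeros((128, 128))
geom[60:68, :] = 1.0
mask = geometry_weighted_mask(geom, n_keep=128 * 128 // 4)
```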

  8. An in silico approach helped to identify the best experimental design, population, and outcome for future randomized clinical trials.

    PubMed

    Bajard, Agathe; Chabaud, Sylvie; Cornu, Catherine; Castellan, Anne-Charlotte; Malik, Salma; Kurbatova, Polina; Volpert, Vitaly; Eymard, Nathalie; Kassai, Behrouz; Nony, Patrice

    2016-01-01

    The main objective of our work was to compare different randomized clinical trial (RCT) experimental designs in terms of power, accuracy of the estimation of treatment effect, and number of patients receiving active treatment using in silico simulations. A virtual population of patients was simulated and randomized in potential clinical trials. Treatment effect was modeled using a dose-effect relation for quantitative or qualitative outcomes. Different experimental designs were considered, and performances between designs were compared. One thousand clinical trials were simulated for each design based on an example of modeled disease. According to simulation results, the number of patients needed to reach 80% power was 50 for crossover, 60 for parallel or randomized withdrawal, 65 for drop the loser (DL), and 70 for early escape or play the winner (PW). For a given sample size, each design had its own advantage: low duration (parallel, early escape), high statistical power and precision (crossover), and higher number of patients receiving the active treatment (PW and DL). Our approach can help to identify the best experimental design, population, and outcome for future RCTs. This may be particularly useful for drug development in rare diseases, theragnostic approaches, or personalized medicine. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Multilevel covariance regression with correlated random effects in the mean and variance structure.

    PubMed

    Quintero, Adrian; Lesaffre, Emmanuel

    2017-09-01

    Multivariate regression methods generally assume a constant covariance matrix for the observations. In case a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches can be restrictive in the literature. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can be different across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for skewedly distributed responses. Furthermore, it permits us to analyse directly the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. The Effect of Educational Modules Strategy on the Direct and Postponed Study's Achievement of Seventh Primary Grade Students in Science, in Comparison with the Conventional Approach

    ERIC Educational Resources Information Center

    Alelaimat, Abeer Rashed; Ghoneem, Khowla Abd Al Raheem

    2012-01-01

    This study aimed at revealing the effect of educational modules strategy on the direct and postponed study's achievement of seventh primary grade students in science, in comparison with the conventional approach. The sample of the study consists of (174) male and female students randomly chosen from schools in the city of Mafraq, students are…

  11. Driven Boson Sampling.

    PubMed

    Barkhofen, Sonja; Bartley, Tim J; Sansoni, Linda; Kruse, Regina; Hamilton, Craig S; Jex, Igor; Silberhorn, Christine

    2017-01-13

    Sampling the distribution of bosons that have undergone a random unitary evolution is strongly believed to be a computationally hard problem. Key to outperforming classical simulations of this task is to increase both the number of input photons and the size of the network. We propose driven boson sampling, in which photons are input within the network itself, as a means to approach this goal. We show that the mean number of photons entering a boson sampling experiment can exceed one photon per input mode, while maintaining the required complexity, potentially leading to less stringent requirements on the input states for such experiments. When using heralded single-photon sources based on parametric down-conversion, this approach offers an ∼e-fold enhancement in the input state generation rate over scattershot boson sampling, reaching the scaling limit for such sources. This approach also offers a dramatic increase in the signal-to-noise ratio with respect to higher-order photon generation from such probabilistic sources, which removes the need for photon number resolution during the heralding process as the size of the system increases.

  12. A fosmid cloning strategy for detecting the widest possible spectrum of microbes from the international space station drinking water system.

    PubMed

    Choi, Sangdun; Chang, Mi Sook; Stuecker, Tara; Chung, Christine; Newcombe, David A; Venkateswaran, Kasthuri

    2012-12-01

    In this study, fosmid cloning strategies were used to assess the microbial populations in water from the International Space Station (ISS) drinking water system (henceforth referred to as Prebiocide and Tank A water samples). The goals of this study were: to compare the sensitivity of the fosmid cloning strategy with that of traditional culture-based and 16S rRNA-based approaches and to detect the widest possible spectrum of microbial populations during the water purification process. Initially, microbes could not be cultivated, and conventional PCR failed to amplify 16S rDNA fragments from these low biomass samples. Therefore, randomly primed rolling-circle amplification was used to amplify any DNA that might be present in the samples, followed by size selection by using pulsed-field gel electrophoresis. The amplified high-molecular-weight DNA from both samples was cloned into fosmid vectors. Several hundred clones were randomly selected for sequencing, followed by Blastn/Blastx searches. Sequences encoding specific genes from Burkholderia, a species abundant in the soil and groundwater, were found in both samples. Bradyrhizobium and Mesorhizobium, which belong to rhizobia, a large community of nitrogen fixers often found in association with plant roots, were present in the Prebiocide samples. Ralstonia, which is prevalent in soils with a high heavy metal content, was detected in the Tank A samples. The detection of many unidentified sequences suggests the presence of potentially novel microbial fingerprints. The bacterial diversity detected in this pilot study using a fosmid vector approach was higher than that detected by conventional 16S rRNA gene sequencing.

  13. School Climate, Principal Support and Collaboration among Portuguese Teachers

    ERIC Educational Resources Information Center

    Castro Silva, José; Amante, Lúcia; Morgado, José

    2017-01-01

    This article analyses the relationship between school principal support and teacher collaboration among Portuguese teachers. Data were collected from a random sample of 234 teachers in middle and secondary schools. The use of a combined approach using linear and multiple regression tests concluded that the school principal support, through the…

  14. A Random Forest Approach to Predict the Spatial Distribution of Sediment Pollution in an Estuarine System

    EPA Science Inventory

    Modeling the magnitude and distribution of sediment-bound pollutants in estuaries is often limited by incomplete knowledge of the site and inadequate sample density. To address these modeling limitations, a decision-support tool framework was conceived that predicts sediment cont...

  15. Predicting the Spatial Distribution of Organic Contaminants in an Estuarine System using a Random Forest Approach

    EPA Science Inventory

    Modeling the magnitude and distribution of estuarine sediment contamination by pollutants of historic (e.g. PCB) and emerging concern (e.g., personal care products, PCP) is often limited by incomplete site knowledge and inadequate sediment contamination sampling. We tested a mode...

  16. Beliefs about Child Support Modification Following Remarriage and Subsequent Childbirth

    ERIC Educational Resources Information Center

    Hans, Jason D.

    2009-01-01

    Framed by equity theory, fairness beliefs regarding child support modification to account for the financial impact of remarriage and subsequent childbirth were assessed. Based on a random sample of 407 Kentucky residents using a multiple segment factorial vignette approach, modification was supported by 57% of respondents following remarriage, but…

  17. International Education: A Case Study from the University of Jordan

    ERIC Educational Resources Information Center

    Jabbar, Sinaria Kamil Abdel

    2012-01-01

    This paper describes international education at the University of Jordan (UJ). Specifically it investigates a random sample of international students comprising Americans, Europeans and Asians. A field survey approach with qualitative and quantitative dimensions was used. Questionnaires were used to solicit information from the students. In…

  18. EMP: A Diagnostic and Therapeutic Approach.

    ERIC Educational Resources Information Center

    Stephens, Jill

    A 2-year study examined the relevance of English-as-a-second-language instructional materials for students in health care fields. The study gathered specific information on common medical activities requiring the use of English and identified problem areas. A random sample of students of medicine, pharmacy, and research science was surveyed, the…

  19. The Advantages of Hierarchical Linear Modeling. ERIC/AE Digest.

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    This digest introduces hierarchical data structure, describes how hierarchical models work, and presents three approaches to analyzing hierarchical data. Hierarchical, or nested data, present several problems for analysis. People or creatures that exist within hierarchies tend to be more similar to each other than people randomly sampled from the…

  20. Recommendations for choosing an analysis method that controls Type I error for unbalanced cluster sample designs with Gaussian outcomes.

    PubMed

    Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J; Murray, David M; Muller, Keith E; Glueck, Deborah H

    2015-11-30

    We used theoretical and simulation-based approaches to study Type I error rates for one-stage and two-stage analytic methods for cluster-randomized designs. The one-stage approach uses the observed data as outcomes and accounts for within-cluster correlation using a general linear mixed model. The two-stage model uses the cluster specific means as the outcomes in a general linear univariate model. We demonstrate analytically that both one-stage and two-stage models achieve exact Type I error rates when cluster sizes are equal. With unbalanced data, an exact size α test does not exist, and Type I error inflation may occur. Via simulation, we compare the Type I error rates for four one-stage and six two-stage hypothesis testing approaches for unbalanced data. With unbalanced data, the two-stage model, weighted by the inverse of the estimated theoretical variance of the cluster means, and with variance constrained to be positive, provided the best Type I error control for studies having at least six clusters per arm. The one-stage model with Kenward-Roger degrees of freedom and unconstrained variance performed well for studies having at least 14 clusters per arm. The popular analytic method of using a one-stage model with denominator degrees of freedom appropriate for balanced data performed poorly for small sample sizes and low intracluster correlation. Because small sample sizes and low intracluster correlation are common features of cluster-randomized trials, the Kenward-Roger method is the preferred one-stage approach. Copyright © 2015 John Wiley & Sons, Ltd.

  1. Methods and analysis of realizing randomized grouping.

    PubMed

    Hu, Liang-Ping; Bao, Xiao-Lei; Wang, Qi

    2011-07-01

    Randomization is one of the four basic principles of research design. The meaning of randomization includes two aspects: one is to randomly select samples from the population, which is known as random sampling; the other is to randomly group all the samples, which is called randomized grouping. Randomized grouping can be subdivided into three categories: completely randomized, stratified randomized, and dynamically randomized grouping. This article mainly introduces the steps of complete randomization, the definition of dynamic randomization, and the realization of random sampling and grouping with SAS software.
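
    The article works in SAS; the Python sketch below simply illustrates the two operations it distinguishes: random sampling of subjects from a population, followed by complete randomization of the sampled subjects into two groups. Population size and group sizes are arbitrary.

    ```python
    # Random sampling followed by complete randomization into two groups.
    import numpy as np

    rng = np.random.default_rng(2024)

    population = [f"subject_{i:03d}" for i in range(500)]

    # Step 1: random sampling -- draw 40 subjects without replacement.
    sample = rng.choice(population, size=40, replace=False)

    # Step 2: complete randomization -- shuffle and split into two equal groups.
    shuffled = rng.permutation(sample)
    group_a, group_b = shuffled[:20], shuffled[20:]
    print(len(group_a), len(group_b))
    ```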

  2. Estimation of treatment effect in a subpopulation: An empirical Bayes approach.

    PubMed

    Shen, Changyu; Li, Xiaochun; Jeong, Jaesik

    2016-01-01

    It is well recognized that the benefit of a medical intervention may not be distributed evenly in the target population due to patient heterogeneity, and conclusions based on conventional randomized clinical trials may not apply to every person. Given the increasing cost of randomized trials and difficulties in recruiting patients, there is a strong need to develop analytical approaches to estimate treatment effect in subpopulations. In particular, due to limited sample size for subpopulations and the need for multiple comparisons, standard analysis tends to yield wide confidence intervals of the treatment effect that are often noninformative. We propose an empirical Bayes approach to combine both information embedded in a target subpopulation and information from other subjects to construct confidence intervals of the treatment effect. The method is appealing in its simplicity and tangibility in characterizing the uncertainty about the true treatment effect. Simulation studies and a real data analysis are presented.
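
    As a hedged illustration of borrowing information across subjects, the sketch below applies a generic normal-normal empirical Bayes shrinkage of subgroup treatment-effect estimates toward the pooled effect. It is not necessarily the exact estimator developed in the paper, and the subgroup estimates and standard errors are invented.

    ```python
    # Generic normal-normal empirical Bayes shrinkage of subgroup effects.
    import numpy as np

    effects = np.array([1.2, 0.1, 0.6, -0.4, 0.5])   # subgroup estimates
    se = np.array([0.3, 0.3, 0.35, 0.4, 0.3])        # their standard errors

    overall = np.average(effects, weights=1.0 / se**2)
    # Method-of-moments estimate of between-subgroup variance (truncated at 0).
    tau2 = max(np.mean((effects - overall) ** 2 - se**2), 0.0)

    # Posterior means and variances under the normal-normal model.
    w = tau2 / (tau2 + se**2)
    eb_mean = w * effects + (1 - w) * overall
    eb_sd = np.sqrt(w * se**2)        # approximate; ignores uncertainty in tau2
    for m, lo, hi in zip(eb_mean, eb_mean - 1.96 * eb_sd, eb_mean + 1.96 * eb_sd):
        print(f"{m:+.2f}  [{lo:+.2f}, {hi:+.2f}]")
    ```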

  3. A Two-Stage Estimation Method for Random Coefficient Differential Equation Models with Application to Longitudinal HIV Dynamic Data.

    PubMed

    Fang, Yun; Wu, Hulin; Zhu, Li-Xing

    2011-07-01

    We propose a two-stage estimation method for random coefficient ordinary differential equation (ODE) models. A maximum pseudo-likelihood estimator (MPLE) is derived based on a mixed-effects modeling approach and its asymptotic properties for population parameters are established. The proposed method does not require repeatedly solving ODEs, and is computationally efficient although it does pay a price with the loss of some estimation efficiency. However, the method does offer an alternative approach when the exact likelihood approach fails due to model complexity and high-dimensional parameter space, and it can also serve as a method to obtain the starting estimates for more accurate estimation methods. In addition, the proposed method does not need to specify the initial values of state variables and preserves all the advantages of the mixed-effects modeling approach. The finite sample properties of the proposed estimator are studied via Monte Carlo simulations and the methodology is also illustrated with application to an AIDS clinical data set.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anheier, Norman C.; Cannon, Bret D.; Martinez, Alonzo

    The International Atomic Energy Agency’s (IAEA’s) long-term research and development plan calls for more cost-effective and efficient safeguard methods to detect and deter misuse of gaseous centrifuge enrichment plants (GCEPs). The IAEA’s current safeguards approaches at GCEPs are based on a combination of routine and random inspections that include environmental sampling and destructive assay (DA) sample collection from UF6 in-process material and selected cylinders. Samples are then shipped offsite for subsequent laboratory analysis. In this paper, a new DA sample collection and onsite analysis approach that could help to meet challenges in transportation and chain of custody for UF6 DA samples is introduced. This approach uses a handheld sampler concept and a Laser Ablation, Laser Absorbance Spectrometry (LAARS) analysis instrument, both currently under development at the Pacific Northwest National Laboratory. A LAARS analysis instrument could be temporarily or permanently deployed in the IAEA control room of the facility, in the IAEA data acquisition cabinet, for example. The handheld PNNL DA sampler design collects and stabilizes a much smaller DA sample mass compared to current sampling methods. The significantly lower uranium mass reduces the sample radioactivity and the stabilization approach diminishes the risk of uranium and hydrogen fluoride release. These attributes enable safe sample handling needed during onsite LAARS assay and may help ease shipping challenges for samples to be processed at the IAEA’s offsite laboratory. The LAARS and DA sampler implementation concepts will be described and preliminary technical viability results presented.

  5. Multi-factor challenge/response approach for remote biometric authentication

    NASA Astrophysics Data System (ADS)

    Al-Assam, Hisham; Jassim, Sabah A.

    2011-06-01

    Although biometric authentication is perceived to be more reliable than traditional authentication schemes, it becomes vulnerable to many attacks when it comes to remote authentication over open networks and raises serious privacy concerns. This paper proposes a biometric-based challenge-response approach to be used for remote authentication between two parties A and B over open networks. In the proposed approach, a remote authenticator system B (e.g. a bank) challenges its client A, who wants to authenticate themselves to the system, by sending a one-time public random challenge. The client A responds by employing the random challenge along with secret information obtained from a password and a token to produce a one-time cancellable representation of a freshly captured biometric sample. The one-time biometric representation, which is based on multiple factors, is then sent back to B for matching. Here, we argue that eavesdropping of the one-time random challenge and/or the resulting one-time biometric representation does not compromise the security of the system, and no information about the original biometric data is leaked. In addition to securing biometric templates, the proposed protocol offers a practical solution for the replay attack on biometric systems. Moreover, we propose a new scheme for generating password-based pseudo-random numbers/permutations to be used as a building block in the proposed approach. The proposed scheme is also designed to provide protection against repudiation. We illustrate the viability and effectiveness of the proposed approach by experimental results based on two biometric modalities: fingerprint and face biometrics.
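
    The sketch below is only a toy rendering of the message flow (a one-time challenge issued by B, a multi-factor one-time response from A); the paper's actual construction uses a cancellable biometric transform that tolerates sample-to-sample noise, which the plain keyed hash used here does not. All function names and the feature encoding are illustrative.

    ```python
    # Toy challenge-response flow; NOT the paper's cancellable-biometric scheme.
    import hashlib, hmac, os

    def server_issue_challenge() -> bytes:
        return os.urandom(32)                      # one-time public random challenge

    def client_response(biometric_features: bytes, password: str,
                        token_secret: bytes, challenge: bytes) -> bytes:
        # Derive a client-side secret from the password and token (multi-factor).
        secret = hashlib.pbkdf2_hmac("sha256", password.encode(), token_secret, 100_000)
        # Bind the fresh biometric sample to the challenge and the secret.
        # (A real scheme needs an error-tolerant, revocable transform here.)
        return hmac.new(secret + challenge, biometric_features, hashlib.sha256).digest()

    challenge = server_issue_challenge()
    reply = client_response(b"fresh-face-feature-vector", "pa55word", b"token-seed", challenge)
    print(reply.hex())
    ```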

  6. Is using multiple imputation better than complete case analysis for estimating a prevalence (risk) difference in randomized controlled trials when binary outcome observations are missing?

    PubMed

    Mukaka, Mavuto; White, Sarah A; Terlouw, Dianne J; Mwapasa, Victor; Kalilani-Phiri, Linda; Faragher, E Brian

    2016-07-22

    Missing outcomes can seriously impair the ability to make correct inferences from randomized controlled trials (RCTs). Complete case (CC) analysis is commonly used, but it reduces sample size and is perceived to lead to reduced statistical efficiency of estimates while increasing the potential for bias. As multiple imputation (MI) methods preserve sample size, they are generally viewed as the preferred analytical approach. We examined this assumption, comparing the performance of CC and MI methods to determine risk difference (RD) estimates in the presence of missing binary outcomes. We conducted simulation studies of 5000 simulated data sets with 50 imputations of RCTs with one primary follow-up endpoint at different underlying levels of RD (3-25 %) and missing outcomes (5-30 %). For missing at random (MAR) or missing completely at random (MCAR) outcomes, CC method estimates generally remained unbiased and achieved precision similar to or better than MI methods, and high statistical coverage. Missing not at random (MNAR) scenarios yielded invalid inferences with both methods. Effect size estimate bias was reduced in MI methods by always including group membership even if this was unrelated to missingness. Surprisingly, under MAR and MCAR conditions in the assessed scenarios, MI offered no statistical advantage over CC methods. While MI must inherently accompany CC methods for intention-to-treat analyses, these findings endorse CC methods for per protocol risk difference analyses in these conditions. These findings provide an argument for the use of the CC approach to always complement MI analyses, with the usual caveat that the validity of the mechanism for missingness be thoroughly discussed. More importantly, researchers should strive to collect as much data as possible.
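
    A minimal sketch of the MCAR scenario only: simulate a two-arm trial with a binary outcome, delete outcomes completely at random, and check that the complete-case risk-difference estimate remains close to the truth. The multiple-imputation arm of the comparison and the paper's full grid of scenarios are omitted; event rates and the missingness fraction are invented.

    ```python
    # Complete-case risk difference under MCAR missingness (illustrative only).
    import numpy as np

    rng = np.random.default_rng(7)
    n, p_control, p_active, p_missing = 300, 0.30, 0.15, 0.20
    true_rd = p_control - p_active

    estimates = []
    for _ in range(2000):
        y_c = rng.binomial(1, p_control, n)
        y_a = rng.binomial(1, p_active, n)
        keep_c = rng.random(n) > p_missing          # MCAR missingness
        keep_a = rng.random(n) > p_missing
        estimates.append(y_c[keep_c].mean() - y_a[keep_a].mean())

    print("true RD:", true_rd, "mean CC estimate:", round(np.mean(estimates), 3))
    ```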

  7. Ensemble Feature Learning of Genomic Data Using Support Vector Machine

    PubMed Central

    Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R.; Braytee, Ali; Kennedy, Paul J.

    2016-01-01

    The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest, which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention, and mostly for classification rather than gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy which is the rationale of the RFE algorithm. The rationale behind this is that building ensemble SVM models using randomly drawn bootstrap samples from the training set will produce different feature rankings, which are subsequently aggregated into one feature ranking. As a result, the decision to eliminate features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach will address the problem of imbalanced datasets by constructing a nearly balanced bootstrap sample. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over the random forest based approach. The genes selected by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD), which reveals significant clusters with the selected data. PMID:27304923
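
    A compact sketch of the ensemble idea under stated assumptions: run SVM-RFE on repeated bootstrap samples and aggregate the per-feature rankings, so the elimination decision reflects many models rather than one. The synthetic dataset, number of bootstrap models, and the simple rank-averaging rule are illustrative, and class balance is handled here by stratified resampling rather than the paper's balanced-bootstrap construction.

    ```python
    # Bootstrap ensemble of SVM-RFE rankings, aggregated by rank averaging.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.svm import SVC
    from sklearn.utils import resample

    X, y = make_classification(n_samples=120, n_features=200, n_informative=10,
                               random_state=0)

    n_models, rank_sum = 25, np.zeros(X.shape[1])
    for b in range(n_models):
        Xb, yb = resample(X, y, random_state=b, stratify=y)   # bootstrap sample
        rfe = RFE(SVC(kernel="linear"), n_features_to_select=10, step=0.1)
        rfe.fit(Xb, yb)
        rank_sum += rfe.ranking_                              # 1 = retained feature

    ensemble_rank = rank_sum / n_models
    selected = np.argsort(ensemble_rank)[:10]                 # consensus top-10 "genes"
    print(selected)
    ```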

  8. A remote sensing and geographic information system approach to sampling malaria vector habitats in Chiapas, Mexico

    NASA Astrophysics Data System (ADS)

    Beck, L.; Wood, B.; Whitney, S.; Rossi, R.; Spanner, M.; Rodriguez, M.; Rodriguez-Ramirez, A.; Salute, J.; Legters, L.; Roberts, D.; Rejmankova, E.; Washino, R.

    1993-08-01

    This paper describes a procedure whereby remote sensing and geographic information system (GIS) technologies are used in a sample design to study the habitat of Anopheles albimanus, one of the principal vectors of malaria in Central America. This procedure incorporates Landsat-derived land cover maps with digital elevation and road network data to identify a random selection of larval habitats accessible for field sampling. At the conclusion of the sampling season, the larval counts will be used to determine habitat productivity, and then integrated with information on human settlement to assess where people are at high risk of malaria. This approach would be appropriate in areas where land cover information is lacking and problems of access constrain field sampling. The use of a GIS also permits other data (such as insecticide spraying data) to be incorporated in the sample design as they arise. This approach would also be pertinent for other tropical vector-borne diseases, particularly where human activities impact disease vector habitat.

  9. Preliminary Characterization of Erythrocytes Deformability on the Entropy-Complexity Plane

    PubMed Central

    Korol, Ana M; D’Arrigo, Mabel; Foresto, Patricia; Pérez, Susana; Martín, Maria T; Rosso, Osualdo A

    2010-01-01

    We present an application of wavelet-based Information Theory quantifiers (Normalized Total Shannon Entropy, MPR-Statistical Complexity and the Entropy-Complexity plane) to red blood cell membrane viscoelasticity characterization. These quantifiers exhibit important localization advantages provided by the Wavelet Theory. The present approach produces a clear characterization of this dynamical system, revealing an evident manifestation of a random process in the red cell samples of healthy individuals and a sharp reduction of randomness when analyzing a human haematological disease, such as β-thalassaemia minor. PMID:21611139
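
    For a generic discrete probability distribution P (e.g., derived from wavelet energies), the two quantifiers named above can be computed as sketched below: the normalized Shannon entropy H[P] and the MPR statistical complexity C[P] = Q_J[P] * H[P], where Q_J is the Jensen-Shannon disequilibrium between P and the uniform distribution, normalized by its maximum value. The wavelet decomposition of the red-cell signal is not reproduced here, and the example distribution is invented.

    ```python
    # Normalized Shannon entropy and MPR statistical complexity of a distribution P.
    import numpy as np

    def shannon(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def quantifiers(p):
        n = p.size
        h = shannon(p) / np.log(n)                   # normalized entropy H[P]
        uniform = np.full(n, 1.0 / n)
        js = shannon(0.5 * (p + uniform)) - 0.5 * shannon(p) - 0.5 * shannon(uniform)
        # Normalize by the maximum JS value, reached when P is a delta distribution.
        delta = np.zeros(n)
        delta[0] = 1.0
        js_max = shannon(0.5 * (delta + uniform)) - 0.5 * shannon(uniform)
        return h, (js / js_max) * h                  # (entropy, complexity)

    p = np.array([0.5, 0.2, 0.15, 0.1, 0.05])
    print(quantifiers(p))
    ```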

  10. Combining in silico and in cerebro approaches for virtual screening and pose prediction in SAMPL4.

    PubMed

    Voet, Arnout R D; Kumar, Ashutosh; Berenger, Francois; Zhang, Kam Y J

    2014-04-01

    The SAMPL challenges provide an ideal opportunity for unbiased evaluation and comparison of different approaches used in computational drug design. During the fourth round of this SAMPL challenge, we participated in the virtual screening and binding pose prediction on inhibitors targeting the HIV-1 integrase enzyme. For virtual screening, we used well known and widely used in silico methods combined with personal in cerebro insights and experience. Regular docking only performed slightly better than random selection, but the performance was significantly improved upon incorporation of additional filters based on pharmacophore queries and electrostatic similarities. The best performance was achieved when logical selection was added. For the pose prediction, we utilized a similar consensus approach that amalgamated the results of the Glide-XP docking with structural knowledge and rescoring. The pose prediction results revealed that docking displayed reasonable performance in predicting the binding poses. However, prediction performance can be improved utilizing scientific experience and rescoring approaches. In both the virtual screening and pose prediction challenges, the top performance was achieved by our approaches. Here we describe the methods and strategies used in our approaches and discuss the rationale of their performances.

  11. Combining in silico and in cerebro approaches for virtual screening and pose prediction in SAMPL4

    NASA Astrophysics Data System (ADS)

    Voet, Arnout R. D.; Kumar, Ashutosh; Berenger, Francois; Zhang, Kam Y. J.

    2014-04-01

    The SAMPL challenges provide an ideal opportunity for unbiased evaluation and comparison of different approaches used in computational drug design. During the fourth round of this SAMPL challenge, we participated in the virtual screening and binding pose prediction on inhibitors targeting the HIV-1 integrase enzyme. For virtual screening, we used well known and widely used in silico methods combined with personal in cerebro insights and experience. Regular docking only performed slightly better than random selection, but the performance was significantly improved upon incorporation of additional filters based on pharmacophore queries and electrostatic similarities. The best performance was achieved when logical selection was added. For the pose prediction, we utilized a similar consensus approach that amalgamated the results of the Glide-XP docking with structural knowledge and rescoring. The pose prediction results revealed that docking displayed reasonable performance in predicting the binding poses. However, prediction performance can be improved utilizing scientific experience and rescoring approaches. In both the virtual screening and pose prediction challenges, the top performance was achieved by our approaches. Here we describe the methods and strategies used in our approaches and discuss the rationale of their performances.

  12. [Krigle estimation and its simulated sampling of Chilo suppressalis population density].

    PubMed

    Yuan, Zheming; Bai, Lianyang; Wang, Kuiwu; Hu, Xiangyue

    2004-07-01

    In order to draw up a rational sampling plan for the larval population of Chilo suppressalis, an original population and its two derivative populations, a random population and a sequence population, were sampled and compared using random sampling, gap-range-random sampling, and a new systematic sampling that integrated Krigle interpolation with a random origin position. For the original population, whose distribution was aggregated and whose dependence range in the line direction was 115 cm (6.9 units), gap-range-random sampling in the line direction was more precise than random sampling. Distinguishing the population pattern correctly is the key to obtaining better precision. Gap-range-random sampling and random sampling are suited to aggregated populations and random populations, respectively, but both are difficult to apply in practice. Therefore, a new systematic sampling scheme, the Krigle sample (n = 441), was developed to estimate the density of a partial sample (partial estimation, n = 441) and the population (overall estimation, N = 1500). For the original population, the estimation precision of the Krigle sample for both the partial sample and the population was better than that of the investigation sample. As the aggregation intensity of the population increased, the Krigle sample was more effective than the investigation sample in both partial and overall estimation, given an appropriate sampling gap chosen according to the dependence range.

  13. Species conservation profiles of a random sample of world spiders I: Agelenidae to Filistatidae

    PubMed Central

    Seppälä, Sini; Henriques, Sérgio; Draney, Michael L; Foord, Stefan; Gibbons, Alastair T; Gomez, Luz A; Kariko, Sarah; Malumbres-Olarte, Jagoba; Milne, Marc; Vink, Cor J

    2018-01-01

    Background: The IUCN Red List of Threatened Species is the most widely used information source on the extinction risk of species. One of the uses of the Red List is to evaluate and monitor the state of biodiversity and a possible approach for this purpose is the Red List Index (RLI). For many taxa, mainly hyperdiverse groups, it is not possible within available resources to assess all known species. In such cases, a random sample of species might be selected for assessment and the results derived from it extrapolated for the entire group - the Sampled Red List Index (SRLI). With the current contribution and the three following papers, we intend to create the first point in time of a future spider SRLI encompassing 200 species distributed across the world. New information: A sample of 200 species of spiders was randomly selected from the World Spider Catalogue, an updated global database containing all recognised species names for the group. The 200 selected species were divided taxonomically at the family level and the families were ordered alphabetically. In this publication, we present the conservation profiles of 46 species belonging to the families alphabetically arranged between Agelenidae and Filistatidae, which encompassed Agelenidae, Amaurobiidae, Anyphaenidae, Araneidae, Archaeidae, Barychelidae, Clubionidae, Corinnidae, Ctenidae, Ctenizidae, Cyatholipidae, Dictynidae, Dysderidae, Eresidae and Filistatidae. PMID:29725239

  14. Species conservation profiles of a random sample of world spiders I: Agelenidae to Filistatidae.

    PubMed

    Seppälä, Sini; Henriques, Sérgio; Draney, Michael L; Foord, Stefan; Gibbons, Alastair T; Gomez, Luz A; Kariko, Sarah; Malumbres-Olarte, Jagoba; Milne, Marc; Vink, Cor J; Cardoso, Pedro

    2018-01-01

    The IUCN Red List of Threatened Species is the most widely used information source on the extinction risk of species. One of the uses of the Red List is to evaluate and monitor the state of biodiversity and a possible approach for this purpose is the Red List Index (RLI). For many taxa, mainly hyperdiverse groups, it is not possible within available resources to assess all known species. In such cases, a random sample of species might be selected for assessment and the results derived from it extrapolated for the entire group - the Sampled Red List Index (SRLI). With the current contribution and the three following papers, we intend to create the first point in time of a future spider SRLI encompassing 200 species distributed across the world. A sample of 200 species of spiders was randomly selected from the World Spider Catalogue, an updated global database containing all recognised species names for the group. The 200 selected species were divided taxonomically at the family level and the families were ordered alphabetically. In this publication, we present the conservation profiles of 46 species belonging to the families alphabetically arranged between Agelenidae and Filistatidae, which encompassed Agelenidae, Amaurobiidae, Anyphaenidae, Araneidae, Archaeidae, Barychelidae, Clubionidae, Corinnidae, Ctenidae, Ctenizidae, Cyatholipidae, Dictynidae, Dysderidae, Eresidae and Filistatidae.

  15. An approach for addressing hard-to-detect hot spots.

    PubMed

    Abelquist, Eric W; King, David A; Miller, Laurence F; Viars, James A

    2013-05-01

    The Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) survey approach is comprised of systematic random sampling coupled with radiation scanning to assess acceptability of potential hot spots. Hot spot identification for some radionuclides may not be possible due to the very weak gamma or x-ray radiation they emit; these hard-to-detect nuclides are unlikely to be identified by field scans. Similarly, scanning technology is not yet available for chemical contamination. For both hard-to-detect nuclides and chemical contamination, hot spots are only identified via volumetric sampling. The remedial investigation and cleanup of sites under the Comprehensive Environmental Response, Compensation, and Liability Act typically includes the collection of samples over relatively large exposure units, and concentration limits are applied assuming the contamination is more or less uniformly distributed. However, data collected from contaminated sites demonstrate that contamination is often highly localized. These highly localized areas, or hot spots, will only be identified if sample densities are high or if the environmental characterization program happens to sample directly from the hot spot footprint. This paper describes a Bayesian approach for addressing hard-to-detect nuclides and chemical hot spots. The approach begins using available data (e.g., as collected using the standard approach) to predict the probability that an unacceptable hot spot is present somewhere in the exposure unit. This Bayesian approach may even be coupled with the graded sampling approach to optimize hot spot characterization. Once the investigator concludes that the presence of hot spots is likely, the surveyor should use the data quality objectives process to generate an appropriate sample campaign that optimizes the identification of risk-relevant hot spots.
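
    A deliberately simplified Bayesian sketch of the idea (not the paper's full methodology): given a prior probability that an exposure unit contains an unacceptable hot spot of some fractional area, update that probability after n randomly located samples all come back clean. The prior, hot-spot size, and sample counts below are invented.

    ```python
    # Posterior probability that a hot spot is present despite non-detection.
    def posterior_hot_spot_prob(prior=0.10, hot_spot_fraction=0.02, n_samples=30):
        # Probability that every sample misses a hot spot covering this fraction
        # of the exposure unit, assuming independent uniform sample locations.
        p_all_miss = (1.0 - hot_spot_fraction) ** n_samples
        # Bayes' rule: P(hot spot present | no detection).
        return prior * p_all_miss / (prior * p_all_miss + (1.0 - prior))

    for n in (10, 30, 60, 100):
        print(n, round(posterior_hot_spot_prob(n_samples=n), 3))
    ```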

  16. Parts on Demand: Evaluation of Approaches to Achieve Flexible Manufacturing Systems for Navy Partson Demand. Volume 1

    DTIC Science & Technology

    1984-02-01

    measurable impact if changed. The following items were included in the sample: * Mark Zero Items - Low demand insurance items which represent about three... R&D efforts reviewed. The resulting assessment highlighted the generic enabling technologies and cross-cutting R&D projects required to focus current... supplied by spot buys, and which may generate Navy Inventory Control Numbers (NICN). Random samples of data were extracted from the Master Data File (MDF

  17. Robust Learning Control Design for Quantum Unitary Transformations.

    PubMed

    Wu, Chengzhi; Qi, Bo; Chen, Chunlin; Dong, Daoyi

    2017-12-01

    Robust control design for quantum unitary transformations has been recognized as a fundamental and challenging task in the development of quantum information processing due to unavoidable decoherence or operational errors in the experimental implementation of quantum operations. In this paper, we extend the systematic methodology of sampling-based learning control (SLC) approach with a gradient flow algorithm for the design of robust quantum unitary transformations. The SLC approach first uses a "training" process to find an optimal control strategy robust against certain ranges of uncertainties. Then a number of randomly selected samples are tested and the performance is evaluated according to their average fidelity. The approach is applied to three typical examples of robust quantum transformation problems including robust quantum transformations in a three-level quantum system, in a superconducting quantum circuit, and in a spin chain system. Numerical results demonstrate the effectiveness of the SLC approach and show its potential applications in various implementation of quantum unitary transformations.

  18. Securing image information using double random phase encoding and parallel compressive sensing with updated sampling processes

    NASA Astrophysics Data System (ADS)

    Hu, Guiqiang; Xiao, Di; Wang, Yong; Xiang, Tao; Zhou, Qing

    2017-11-01

    Recently, a new kind of image encryption approach using compressive sensing (CS) and double random phase encoding has received much attention due to the advantages such as compressibility and robustness. However, this approach is found to be vulnerable to chosen plaintext attack (CPA) if the CS measurement matrix is re-used. Therefore, designing an efficient measurement matrix updating mechanism that ensures resistance to CPA is of practical significance. In this paper, we provide a novel solution to update the CS measurement matrix by altering the secret sparse basis with the help of counter mode operation. Particularly, the secret sparse basis is implemented by a reality-preserving fractional cosine transform matrix. Compared with the conventional CS-based cryptosystem that totally generates all the random entries of measurement matrix, our scheme owns efficiency superiority while guaranteeing resistance to CPA. Experimental and analysis results show that the proposed scheme has a good security performance and has robustness against noise and occlusion.

  19. Active music therapy approach in amyotrophic lateral sclerosis: a randomized-controlled trial.

    PubMed

    Raglio, Alfredo; Giovanazzi, Elena; Pain, Debora; Baiardi, Paola; Imbriani, Chiara; Imbriani, Marcello; Mora, Gabriele

    2016-12-01

    This randomized controlled study assessed the efficacy of active music therapy (AMT) on anxiety, depression, and quality of life in amyotrophic lateral sclerosis (ALS). Communication and relationship during AMT treatment were also evaluated. Thirty patients were assigned randomly to experimental [AMT plus standard of care (SC)] or control (SC) groups. AMT consisted of 12 sessions (three times a week), whereas the SC treatment was based on physical and speech rehabilitation sessions, occupational therapy, and psychological support. ALS Functional Rating Scale-Revised, Hospital Anxiety and Depression Scale, McGill Quality of Life Questionnaire, and Music Therapy Rating Scale were administered to assess functional, psychological, and music therapy outcomes. The AMT group improved significantly in McGill Quality of Life Questionnaire global scores (P=0.035) and showed a positive trend in nonverbal and sonorous-music relationship during the treatment. Further studies involving larger samples in a longer AMT intervention are needed to confirm the effectiveness of this approach in ALS.

  20. A Markov random field based approach to the identification of meat and bone meal in feed by near-infrared spectroscopic imaging.

    PubMed

    Jiang, Xunpeng; Yang, Zengling; Han, Lujia

    2014-07-01

    Contaminated meat and bone meal (MBM) in animal feedstuff has been the source of bovine spongiform encephalopathy (BSE) disease in cattle, leading to a ban in its use, so methods for its detection are essential. In this study, five pure feed and five pure MBM samples were used to prepare two sets of sample arrangements: set A for investigating the discrimination of individual feed/MBM particles and set B for larger numbers of overlapping particles. The two sets were used to test a Markov random field (MRF)-based approach. A Fourier transform infrared (FT-IR) imaging system was used for data acquisition. The spatial resolution of the near-infrared (NIR) spectroscopic image was 25 μm × 25 μm. Each spectrum was the average of 16 scans across the wavenumber range 7,000-4,000 cm(-1), at intervals of 8 cm(-1). This study introduces an innovative approach to analyzing NIR spectroscopic images: an MRF-based approach has been developed using the iterated conditional mode (ICM) algorithm, integrating initial labeling-derived results from support vector machine discriminant analysis (SVMDA) and observation data derived from the results of principal component analysis (PCA). The results showed that MBM covered by feed could be successfully recognized with an overall accuracy of 86.59% and a Kappa coefficient of 0.68. Compared with conventional methods, the MRF-based approach is capable of extracting spectral information combined with spatial information from NIR spectroscopic images. This new approach enhances the identification of MBM using NIR spectroscopic imaging.

  1. Generalized optimal design for two-arm, randomized phase II clinical trials with endpoints from the exponential dispersion family.

    PubMed

    Jiang, Wei; Mahnken, Jonathan D; He, Jianghua; Mayo, Matthew S

    2016-11-01

    For two-arm randomized phase II clinical trials, previous literature proposed an optimal design that minimizes the total sample sizes subject to multiple constraints on the standard errors of the estimated event rates and their difference. The original design is limited to trials with dichotomous endpoints. This paper extends the original approach to be applicable to phase II clinical trials with endpoints from the exponential dispersion family distributions. The proposed optimal design minimizes the total sample sizes needed to provide estimates of population means of both arms and their difference with pre-specified precision. Its applications on data from specific distribution families are discussed under multiple design considerations. Copyright © 2016 John Wiley & Sons, Ltd.

  2. Feasibility and Integrity of a Parent-Teacher Consultation Intervention for ADHD Students

    ERIC Educational Resources Information Center

    Murray, Desiree W.; Rabiner, David; Schulte, Ann; Newitt, Kristy

    2008-01-01

    This study examined the feasibility and integrity of a daily report card (DRC) intervention in a small sample of randomly assigned elementary students with previously diagnosed ADHD and classroom impairment. In order to enhance implementation, a conjoint behavioral consultation approach was used in which parents were engaged as active participants…

  3. Linking Teacher Competences to Organizational Citizenship Behaviour: The Role of Empowerment

    ERIC Educational Resources Information Center

    Kasekende, Francis; Munene, John C.; Otengei, Samson Omuudu; Ntayi, Joseph Mpeera

    2016-01-01

    Purpose: The purpose of this paper is to examine relationship between teacher competences and organizational citizenship behavior (OCB) with empowerment as a mediating factor. Design/methodology/approach: The study took a cross-sectional descriptive and analytical design. Using cluster and random sampling procedures, data were obtained from 383…

  4. Approaches to Building Teacher-Parent Cooperation

    ERIC Educational Resources Information Center

    Cankar, Franc; Deutsch, Tomi; Sentocnik, Sonja

    2012-01-01

    The purpose of this study was to explore the areas of cooperation in which parent and teacher expectations were the same and where they differed. Data were obtained from a sample of 55 randomly selected primary schools. We analyzed school-to home communications, parental influence on school decisions, and parent involvement in different school…

  5. Motivation among Public Primary School Teachers in Mauritius

    ERIC Educational Resources Information Center

    Seebaluck, Ashley Keshwar; Seegum, Trisha Devi

    2013-01-01

    Purpose: The purpose of this study was to critically analyse the factors that affect the motivation of public primary school teachers and also to investigate if there is any relationship between teacher motivation and job satisfaction in Mauritius. Design/methodology/approach: Simple random sampling method was used to collect data from 250 primary…

  6. Perceptions of Online Credentials for School Principals

    ERIC Educational Resources Information Center

    Richardson, Jayson W.; McLeod, Scott; Dikkers, Amy Garrett

    2011-01-01

    Purpose: The purpose of this study is to investigate the perceptions of human resource directors in the USA about online credentials earned by K-12 school principals and principal candidates. Design/methodology/approach: In this mixed methods study, a survey was sent to a random sample of 500 human resource directors in K-12 school districts…

  7. Preferences Regarding School Sexuality Education among Elementary Schoolchildren's Parents

    ERIC Educational Resources Information Center

    Dake, Joseph A.; Price, James H.; Baksovich, Christine M.; Wielinski, Margaret

    2014-01-01

    Background: A comprehensive review of the literature failed to find any studies to assess elementary school parents' preferred philosophical approach to teaching sexuality education and sexuality education topics discussed by parents. All previous research reported parent data for grades K-12 or grades 9-12 only. Methods: A random sample of 2400…

  8. Assessing Professional Openness to a Novel Publication Approach.

    ERIC Educational Resources Information Center

    DeFiore, Roberta M.; Kramer, Thomas J.

    In response to criticism of the peer review publication process, a study surveyed a random sample of 600 members of the American Psychological Association (APA) to determine (1) whether professional openness exists regarding the publication of a journal highlighting studies that, due to a difficulty in the experimental procedure, would usually be…

  9. Encouraging Sixth-Grade Students' Problem-Solving Performance by Teaching through Problem Solving

    ERIC Educational Resources Information Center

    Bostic, Jonathan D.; Pape, Stephen J.; Jacobbe, Tim

    2016-01-01

    This teaching experiment provided students with continuous engagement in a problem-solving based instructional approach during one mathematics unit. Three sections of sixth-grade mathematics were sampled from a school in Florida, U.S.A. and one section was randomly assigned to experience teaching through problem solving. Students' problem-solving…

  10. Cigarette Smoking and Anti-Smoking Counseling Practices among Physicians in Wuhan, China

    ERIC Educational Resources Information Center

    Gong, Jie; Zhang, Zhifeng; Zhu, Zhaoyang; Wan, Jun; Yang, Niannian; Li, Fang; Sun, Huiling; Li, Weiping; Xia, Jiang; Zhou, Dunjin; Chen, Xinguang

    2012-01-01

    Purpose: The paper seeks to report data on cigarette smoking, anti-smoking practices, physicians' receipt of anti-smoking training, and the association between receipt of the training and anti-smoking practice among physicians in Wuhan, China. Design/methodology/approach: Participants were selected through the stratified random sampling method.…

  11. PRELIMINARY STUDIES ON THE POPULATION GENETICS OF THE CENTRAL STONEROLLER (CAMPOSTOMA ANOMALUM) FROM THE GREAT MIAMI RIVER BASIN, OHIO

    EPA Science Inventory

    Molecular approaches are particularly useful for measuring genetic diversity and were applied to samples of central stonerollers obtained from sites along tributaries to the Great Miami River in Ohio. We used Random Amplified Polymorphic DNA (RAPD) analysis to assess the level of...

  12. PRELIMINARY STUDIES ON THE POPULATION GENETICS OF THE CENTRAL STONEROLLER (CAMPOSTOMA ANOMALUM) FROM THE GREAT MIAMI RIVER BASIN, OHIO

    EPA Science Inventory

    Molecular approaches are particularly useful for measuring genetic diversity and were applied to samples of central stonerollers obtained from sites along tributaries to the Great Miami River in Ohio. We used Random Amplified Polymorphic DNA (RAPD) analysis to assess the level o...

  13. Testing the Efficacy of Brief Multicultural Education Interventions in White College Students

    ERIC Educational Resources Information Center

    Garriott, Patton O.; Reiter, Stephanie; Brownfield, Jenna

    2016-01-01

    This pilot study tested the overall and relative efficacy of 3 common approaches to multicultural education in a sample (N = 52) of White undergraduate students using a randomized controlled trial. Brief multicultural education interventions were delivered in the form of educational, social norming, and entertainment conditions. Affective (i.e.,…

  14. Relationships between Teacher Organizational Commitment, Psychological Hardiness and Some Demographic Variables in Turkish Primary Schools

    ERIC Educational Resources Information Center

    Sezgin, Ferudun

    2009-01-01

    Purpose: The purpose of this paper is to examine the relationships between teachers' organizational commitment perceptions and both their psychological hardiness and some demographic variables in a sample of Turkish primary schools. Design/methodology/approach: A total of 405 randomly selected teachers working at primary schools in Ankara…

  15. Nonparametric Bayesian predictive distributions for future order statistics

    Treesearch

    Richard A. Johnson; James W. Evans; David W. Green

    1999-01-01

    We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
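
    The paper derives the predictive distribution analytically; as a hedged numerical counterpart, the sketch below approximates the predictive distribution of a future order statistic under a Dirichlet process prior by simulating future samples from the Polya-urn representation of the posterior predictive. The concentration parameter, base measure, and the stand-in "lumber strength" data are invented.

    ```python
    # Monte Carlo approximation of the predictive distribution of the r-th
    # order statistic of a future sample of size m under a DP(alpha, G0) prior.
    import numpy as np

    rng = np.random.default_rng(3)

    def future_order_statistic(data, alpha, g0_sampler, m, r, n_rep=5000):
        draws = np.empty(n_rep)
        for k in range(n_rep):
            pool = list(data)
            future = []
            for _ in range(m):
                n = len(pool)
                if rng.random() < alpha / (alpha + n):
                    x = g0_sampler()            # new value from the base measure
                else:
                    x = pool[rng.integers(n)]   # repeat an existing value (Polya urn)
                pool.append(x)
                future.append(x)
            draws[k] = np.sort(future)[r - 1]   # r-th smallest of the future sample
        return draws

    observed = rng.normal(5.0, 1.0, size=30)            # stand-in strength data
    draws = future_order_statistic(observed, alpha=2.0,
                                   g0_sampler=lambda: rng.normal(5.0, 1.0),
                                   m=20, r=1)
    print("predictive mean of the future minimum:", round(draws.mean(), 3))
    ```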

  16. Hamiltonian Monte Carlo acceleration using surrogate functions with random bases.

    PubMed

    Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai

    2017-11-01

    For big data analysis, the high computational cost of Bayesian methods often limits their application in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo method, namely Hamiltonian Monte Carlo. The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.

  17. A systematic examination of a random sampling strategy for source apportionment calculations.

    PubMed

    Andersson, August

    2011-12-15

    Estimating the relative contributions from multiple potential sources of a specific component in a mixed environmental matrix is a general challenge in diverse fields such as atmospheric, environmental and earth sciences. Perhaps the most common strategy for tackling such problems is to set up a system of linear equations for the fractional influence of different sources. Even though an algebraic solution of this approach is possible for the common situation with N+1 sources and N source markers, such methodology introduces a bias, since it is implicitly assumed that the calculated fractions and the corresponding uncertainties are independent of the variability of the source distributions. Here, a random sampling (RS) strategy for accounting for such statistical bias is examined by investigating rationally designed synthetic data sets. This random sampling methodology is found to be robust and accurate with respect to reproducibility and predictability. This method is also compared to a numerical integration solution for a two-source situation where source variability is also included. A general observation from this examination is that the variability of the source profiles not only affects the calculated precision but also the mean/median source contributions. Copyright © 2011 Elsevier B.V. All rights reserved.
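
    A generic sketch of the random sampling strategy for an N+1 = 3 source, N = 2 marker problem: draw the source signatures from assumed variability distributions, solve the linear system (two marker balances plus the unit-sum constraint) for the fractions, and repeat to build up their distributions. All end-member values and variabilities are invented.

    ```python
    # Random-sampling propagation of source variability into source fractions.
    import numpy as np

    rng = np.random.default_rng(11)

    mixture = np.array([0.45, 0.30])        # measured marker values in the sample
    src_mean = np.array([[0.9, 0.1],        # marker signatures of sources 1..3
                         [0.2, 0.7],
                         [0.1, 0.2]])
    src_sd = 0.05 * np.ones_like(src_mean)  # assumed source variability

    fractions = []
    for _ in range(10000):
        S = rng.normal(src_mean, src_sd)                  # one random signature set
        A = np.vstack([S.T, np.ones(3)])                  # marker balances + sum-to-1
        b = np.append(mixture, 1.0)
        f = np.linalg.solve(A, b)
        if np.all(f >= 0):                                # keep physically valid draws
            fractions.append(f)

    fractions = np.array(fractions)
    print("median fractions:", np.median(fractions, axis=0))
    print("2.5-97.5 percentiles:", np.percentile(fractions, [2.5, 97.5], axis=0))
    ```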

  18. A Bayesian Approach to the Paleomagnetic Conglomerate Test

    NASA Astrophysics Data System (ADS)

    Heslop, David; Roberts, Andrew P.

    2018-02-01

    The conglomerate test has served the paleomagnetic community for over 60 years as a means to detect remagnetizations. The test states that if a suite of clasts within a bed have uniformly random paleomagnetic directions, then the conglomerate cannot have experienced a pervasive event that remagnetized the clasts in the same direction. The current form of the conglomerate test is based on null hypothesis testing, which results in a binary "pass" (uniformly random directions) or "fail" (nonrandom directions) outcome. We have recast the conglomerate test in a Bayesian framework with the aim of providing more information concerning the level of support a given data set provides for a hypothesis of uniformly random paleomagnetic directions. Using this approach, we place the conglomerate test in a fully probabilistic framework that allows for inconclusive results when insufficient information is available to draw firm conclusions concerning the randomness or nonrandomness of directions. With our method, sample sets larger than those typically employed in paleomagnetism may be required to achieve strong support for a hypothesis of random directions. Given the potentially detrimental effect of unrecognized remagnetizations on paleomagnetic reconstructions, it is important to provide a means to draw statistically robust data-driven inferences. Our Bayesian analysis provides a means to do this for the conglomerate test.

  19. Wastewater testing compared to random urinalyses for the surveillance of illicit drug use in prisons

    PubMed Central

    Brewer, Alex J.; Banta-Green, Caleb J.; Ort, Christoph; Robel, Alix E.

    2015-01-01

    Introduction and Aims: Illicit drug use is known to occur among inmate populations of correctional (prison) facilities. Conventional approaches to monitor illicit drug use in prisons include random urinalyses (RUAs). Conventional approaches are expected to be prone to bias because prisoners may be aware of which days of the week RUAs are conducted. Therefore, we wanted to compare wastewater loads for methamphetamine and cocaine during days with RUA testing and without. Design and Methods: We collected daily 24-hour composite samples of wastewater by continuous sampling, computed daily loads for one month and compared the frequency of illicit drug detection to the number of positive RUAs. Diurnal data also were collected for three days in order to determine within-day patterns of illicit drugs excretion. Results: Methamphetamine was observed in each sample of prison wastewater with no significant difference in daily mass loads between RUA testing and non-testing days. Cocaine and its major metabolite, benzoylecgonine, were observed only at levels below quantification in prison wastewater. Six RUAs were positive for methamphetamine during the month while none were positive for cocaine out of the 243 RUAs conducted. Discussion and Conclusions: Wastewater analyses offer data regarding the frequency of illicit drug excretion inside the prison that RUAs alone could not detect. PMID:25100044

  20. Are there Benefits to Combining Regional Probabilistic Survey and Historic Targeted Environmental Monitoring Data to Improve Our Understanding of Overall Regional Estuary Environmental Status?

    NASA Astrophysics Data System (ADS)

    Dasher, D. H.; Lomax, T. J.; Bethe, A.; Jewett, S.; Hoberg, M.

    2016-02-01

    A regional probabilistic survey of 20 randomly selected stations, where water and sediments were sampled, was conducted over an area of Simpson Lagoon and Gwydyr Bay in the Beaufort Sea adjacent to Prudhoe Bay, Alaska, in 2014. Sampling parameters included the water column for temperature, salinity, dissolved oxygen, chlorophyll a, and nutrients, and sediments for macroinvertebrates, chemistry (i.e., trace metals and hydrocarbons), and grain size. The 2014 probabilistic survey design allows for inferences to be made about environmental status, for instance the spatial or areal distribution of sediment trace metals within the design area sampled. Historically, since the 1970s, a number of monitoring studies have been conducted in this estuary area using a targeted rather than a regional probabilistic design. Targeted non-random designs were utilized to assess specific points of interest and cannot be used to make inferences about distributions of environmental parameters. Due to differences in the environmental monitoring objectives between probabilistic and targeted designs, there has been limited assessment of whether there are benefits to combining the two approaches. This study evaluates whether a combined approach using the 2014 probabilistic survey sediment trace metal and macroinvertebrate results and historical targeted monitoring data can provide a new perspective on better understanding the environmental status of these estuaries.

  1. Modeling and Simulation of Upset-Inducing Disturbances for Digital Systems in an Electromagnetic Reverberation Chamber

    NASA Technical Reports Server (NTRS)

    Torres-Pomales, Wilfredo

    2014-01-01

    This report describes a modeling and simulation approach for disturbance patterns representative of the environment experienced by a digital system in an electromagnetic reverberation chamber. The disturbance is modeled by a multi-variate statistical distribution based on empirical observations. Extended versions of the Rejection Sampling and Inverse Transform Sampling techniques are developed to generate multi-variate random samples of the disturbance. The results show that Inverse Transform Sampling returns samples with higher fidelity relative to the empirical distribution. This work is part of an ongoing effort to develop a resilience assessment methodology for complex safety-critical distributed systems.
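
    A one-dimensional sketch of the Inverse Transform Sampling ingredient only: build the empirical CDF of observed disturbance amplitudes and map uniform random numbers through its interpolated inverse. The report's multi-variate extension and the actual reverberation-chamber data are not reproduced; the "observations" below are synthetic.

    ```python
    # Inverse transform sampling from an empirical distribution.
    import numpy as np

    rng = np.random.default_rng(5)
    observations = rng.gamma(shape=2.0, scale=1.5, size=2000)   # stand-in disturbance data

    # Empirical CDF: sorted observations vs. their cumulative probabilities.
    xs = np.sort(observations)
    cdf = np.arange(1, xs.size + 1) / xs.size

    def inverse_transform_sample(n):
        u = rng.random(n)                       # U(0, 1) draws
        return np.interp(u, cdf, xs)            # inverse CDF by linear interpolation

    new_samples = inverse_transform_sample(10000)
    print("empirical mean:", observations.mean(), "generated mean:", new_samples.mean())
    ```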

  2. A tale of two "forests": random forest machine learning AIDS tropical forest carbon mapping.

    PubMed

    Mascaro, Joseph; Asner, Gregory P; Knapp, David E; Kennedy-Bowdoin, Ty; Martin, Roberta E; Anderson, Christopher; Higgins, Mark; Chadwick, K Dana

    2014-01-01

    Accurate and spatially-explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reduced Deforestation and Degradation Plus). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely-sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates compared to the stratification approach over a 16-million hectare focal area of the Western Amazon. We considered two runs of Random Forest, both with and without spatial contextual modeling by including--in the latter case--x and y position directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (i.e., called "out-of-bag"), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best performing run of Random Forest, and explained 59% of the variation in LiDAR-based carbon estimates within the validation area, compared to 37% for stratification or 43% by Random Forest without spatial context. With the 60% improvement in explained variation, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha(-1) when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation.
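
    A minimal sketch of the general idea of adding spatial context (x and y position) as extra predictors in a Random Forest upscaling model, using scikit-learn. This is not the authors' pipeline: the covariates, coordinates and "LiDAR carbon" response below are all synthetic and chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5000
x, y = rng.uniform(0, 100, n), rng.uniform(0, 100, n)   # plot coordinates (hypothetical)
elev = rng.normal(300, 50, n)                            # stand-in environmental covariates
ndvi = rng.uniform(0.2, 0.9, n)
# Synthetic "LiDAR carbon" with a smooth spatial trend plus covariate effects
carbon = 0.5 * x + 0.3 * y + 40 * ndvi + 0.05 * elev + rng.normal(0, 10, n)

covars = np.column_stack([elev, ndvi])                   # no spatial context
covars_xy = np.column_stack([elev, ndvi, x, y])          # with x and y position added

for name, X in [("no spatial context", covars), ("with x, y", covars_xy)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, carbon, test_size=0.5, random_state=0)
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print(name, "held-out R^2 =", round(rf.score(X_te, y_te), 3))
```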

  3. A Tale of Two “Forests”: Random Forest Machine Learning Aids Tropical Forest Carbon Mapping

    PubMed Central

    Mascaro, Joseph; Asner, Gregory P.; Knapp, David E.; Kennedy-Bowdoin, Ty; Martin, Roberta E.; Anderson, Christopher; Higgins, Mark; Chadwick, K. Dana

    2014-01-01

    Accurate and spatially-explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reduced Deforestation and Degradation Plus). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely-sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates compared to the stratification approach over a 16-million hectare focal area of the Western Amazon. We considered two runs of Random Forest, both with and without spatial contextual modeling by including—in the latter case—x and y position directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (i.e., called “out-of-bag”), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best performing run of Random Forest, and explained 59% of the variation in LiDAR-based carbon estimates within the validation area, compared to 37% for stratification or 43% by Random Forest without spatial context. With the 60% improvement in explained variation, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha−1 when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation. PMID:24489686

  4. Probability machines: consistent probability estimation using nonparametric learning machines.

    PubMed

    Malley, J D; Kruppa, J; Dasgupta, A; Malley, K G; Ziegler, A

    2012-01-01

    Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications.
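
    The paper's sample code is in R; the following is a hedged Python sketch of the same probability-machine idea, treating a 0/1 outcome as a regression target so that the forest's prediction is an individual probability estimate. The dataset and settings are illustrative only, not those used in the paper.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)      # stand-in binary-outcome data (0/1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Treat the 0/1 response as a regression target; the forest's prediction is then
# a direct estimate of P(y = 1 | x) for each individual.
rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
probs = np.clip(rf.predict(X_te), 0.0, 1.0)

print("first five estimated probabilities:", np.round(probs[:5], 3))
```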

  5. Meta-analysis of multiple outcomes: a multilevel approach.

    PubMed

    Van den Noortgate, Wim; López-López, José Antonio; Marín-Martínez, Fulgencio; Sánchez-Meca, Julio

    2015-12-01

    In meta-analysis, dependent effect sizes are very common. An example is where in one or more studies the effect of an intervention is evaluated on multiple outcome variables for the same sample of participants. In this paper, we evaluate a three-level meta-analytic model to account for this kind of dependence, extending the simulation results of Van den Noortgate, López-López, Marín-Martínez, and Sánchez-Meca (Behavior Research Methods, 45, 576-594, 2013) by allowing for a variation in the number of effect sizes per study, in the between-study variance, in the correlations between pairs of outcomes, and in the sample size of the studies. At the same time, we explore the performance of the approach if the outcomes used in a study can be regarded as a random sample from a population of outcomes. We conclude that although this approach is relatively simple and does not require prior estimates of the sampling covariances between effect sizes, it gives appropriate mean effect size estimates, standard error estimates, and confidence interval coverage proportions in a variety of realistic situations.

  6. A Fosmid Cloning Strategy for Detecting the Widest Possible Spectrum of Microbes from the International Space Station Drinking Water System

    PubMed Central

    Choi, Sangdun; Chang, Mi Sook; Stuecker, Tara; Chung, Christine; Newcombe, David A.; Venkateswaran, Kasthuri

    2012-01-01

    In this study, fosmid cloning strategies were used to assess the microbial populations in water from the International Space Station (ISS) drinking water system (henceforth referred to as Prebiocide and Tank A water samples). The goals of this study were: to compare the sensitivity of the fosmid cloning strategy with that of traditional culture-based and 16S rRNA-based approaches and to detect the widest possible spectrum of microbial populations during the water purification process. Initially, microbes could not be cultivated, and conventional PCR failed to amplify 16S rDNA fragments from these low biomass samples. Therefore, randomly primed rolling-circle amplification was used to amplify any DNA that might be present in the samples, followed by size selection by using pulsed-field gel electrophoresis. The amplified high-molecular-weight DNA from both samples was cloned into fosmid vectors. Several hundred clones were randomly selected for sequencing, followed by Blastn/Blastx searches. Sequences encoding specific genes from Burkholderia, a species abundant in the soil and groundwater, were found in both samples. Bradyrhizobium and Mesorhizobium, which belong to rhizobia, a large community of nitrogen fixers often found in association with plant roots, were present in the Prebiocide samples. Ralstonia, which is prevalent in soils with a high heavy metal content, was detected in the Tank A samples. The detection of many unidentified sequences suggests the presence of potentially novel microbial fingerprints. The bacterial diversity detected in this pilot study using a fosmid vector approach was higher than that detected by conventional 16S rRNA gene sequencing. PMID:23346038

  7. A robust and hierarchical approach for the automatic co-registration of intensity and visible images

    NASA Astrophysics Data System (ADS)

    González-Aguilera, Diego; Rodríguez-Gonzálvez, Pablo; Hernández-López, David; Luis Lerma, José

    2012-09-01

    This paper presents a new robust approach for integrating intensity and visible images acquired with a terrestrial laser scanner and a calibrated digital camera, respectively. In particular, an automatic and hierarchical method for the co-registration of both sensors is developed. The approach combines several existing solutions to improve the performance of the co-registration between range-based and visible images: the Affine Scale-Invariant Feature Transform (A-SIFT), epipolar geometry, the collinearity equations, the Groebner basis solution and RANdom SAmple Consensus (RANSAC), together with a voting scheme. The approach presented herein improves on existing co-registration approaches in automation, robustness, reliability and accuracy.
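
    The full co-registration pipeline (A-SIFT, epipolar geometry, collinearity equations, Groebner basis) is beyond a short example, but the RANSAC step it relies on can be sketched generically. The line-fitting toy below, with hypothetical data, shows the random-sample-and-consensus loop; it is not the paper's implementation.

```python
import numpy as np

def ransac_line(points, n_iters=500, inlier_tol=1.0, rng=None):
    """Minimal RANSAC: repeatedly fit a line to two random points and keep the
    model with the largest consensus set of inliers."""
    rng = np.random.default_rng(rng)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:                     # degenerate sample, skip
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        residuals = np.abs(points[:, 1] - (slope * points[:, 0] + intercept))
        n_inliers = int(np.sum(residuals < inlier_tol))
        if n_inliers > best_inliers:
            best_model, best_inliers = (slope, intercept), n_inliers
    return best_model, best_inliers

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
pts = np.column_stack([x, 2.0 * x + 1.0 + rng.normal(0, 0.3, 200)])
pts[:40, 1] = rng.uniform(0, 30, 40)     # contaminate with gross outliers
print(ransac_line(pts, rng=1))           # recovered slope/intercept close to (2.0, 1.0)
```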

  8. Simulation methods to estimate design power: an overview for applied research.

    PubMed

    Arnold, Benjamin F; Hogan, Daniel R; Colford, John M; Hubbard, Alan E

    2011-06-20

    Estimating the required sample size and statistical power for a study is an integral part of study design. For standard designs, power equations provide an efficient solution to the problem, but they are unavailable for many complex study designs that arise in practice. For such complex study designs, computer simulation is a useful alternative for estimating study power. Although this approach is well known among statisticians, in our experience many epidemiologists and social scientists are unfamiliar with the technique. This article aims to address this knowledge gap. We review an approach to estimate study power for individual- or cluster-randomized designs using computer simulation. This flexible approach arises naturally from the model used to derive conventional power equations, but extends those methods to accommodate arbitrarily complex designs. The method is universally applicable to a broad range of designs and outcomes, and we present the material in a way that is approachable for quantitative, applied researchers. We illustrate the method using two examples (one simple, one complex) based on sanitation and nutritional interventions to improve child growth. We first show how simulation reproduces conventional power estimates for simple randomized designs over a broad range of sample scenarios to familiarize the reader with the approach. We then demonstrate how to extend the simulation approach to more complex designs. Finally, we discuss extensions to the examples in the article, and provide computer code to efficiently run the example simulations in both R and Stata. Simulation methods offer a flexible option to estimate statistical power for standard and non-traditional study designs and parameters of interest. The approach we have described is universally applicable for evaluating study designs used in epidemiologic and social science research.
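
    The article provides R and Stata code; as a language-agnostic illustration of the same simulation idea, the sketch below estimates power for a simple two-arm, individually randomized design by repeatedly simulating trials and counting rejections. The effect size, standard deviation and sample sizes are arbitrary placeholders, not values from the article.

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_arm, effect=0.3, sd=1.0, alpha=0.05, n_sims=2000, seed=0):
    """Estimate power of a two-arm individually randomized trial by simulating
    trials and counting how often a t-test rejects the null."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, sd, n_per_arm)
        treated = rng.normal(effect, sd, n_per_arm)
        _, p = stats.ttest_ind(treated, control)
        rejections += p < alpha
    return rejections / n_sims

for n in (100, 175, 250):
    print(n, "per arm -> simulated power:", simulated_power(n))
```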

  9. Simulation methods to estimate design power: an overview for applied research

    PubMed Central

    2011-01-01

    Background Estimating the required sample size and statistical power for a study is an integral part of study design. For standard designs, power equations provide an efficient solution to the problem, but they are unavailable for many complex study designs that arise in practice. For such complex study designs, computer simulation is a useful alternative for estimating study power. Although this approach is well known among statisticians, in our experience many epidemiologists and social scientists are unfamiliar with the technique. This article aims to address this knowledge gap. Methods We review an approach to estimate study power for individual- or cluster-randomized designs using computer simulation. This flexible approach arises naturally from the model used to derive conventional power equations, but extends those methods to accommodate arbitrarily complex designs. The method is universally applicable to a broad range of designs and outcomes, and we present the material in a way that is approachable for quantitative, applied researchers. We illustrate the method using two examples (one simple, one complex) based on sanitation and nutritional interventions to improve child growth. Results We first show how simulation reproduces conventional power estimates for simple randomized designs over a broad range of sample scenarios to familiarize the reader with the approach. We then demonstrate how to extend the simulation approach to more complex designs. Finally, we discuss extensions to the examples in the article, and provide computer code to efficiently run the example simulations in both R and Stata. Conclusions Simulation methods offer a flexible option to estimate statistical power for standard and non-traditional study designs and parameters of interest. The approach we have described is universally applicable for evaluating study designs used in epidemiologic and social science research. PMID:21689447

  10. Improved Compressive Sensing of Natural Scenes Using Localized Random Sampling

    PubMed Central

    Barranca, Victor J.; Kovačič, Gregor; Zhou, Douglas; Cai, David

    2016-01-01

    Compressive sensing (CS) theory demonstrates that by using uniformly-random sampling, rather than uniformly-spaced sampling, higher quality image reconstructions are often achievable. Considering that the structure of sampling protocols has such a profound impact on the quality of image reconstructions, we formulate a new sampling scheme motivated by physiological receptive field structure, localized random sampling, which yields significantly improved CS image reconstructions. For each set of localized image measurements, our sampling method first randomly selects an image pixel and then measures its nearby pixels with probability depending on their distance from the initially selected pixel. We compare the uniformly-random and localized random sampling methods over a large space of sampling parameters, and show that, for the optimal parameter choices, higher quality image reconstructions can be consistently obtained by using localized random sampling. In addition, we argue that the localized random CS optimal parameter choice is stable with respect to diverse natural images, and scales with the number of samples used for reconstruction. We expect that the localized random sampling protocol helps to explain the evolutionarily advantageous nature of receptive field structure in visual systems and suggests several future research areas in CS theory and its application to brain imaging. PMID:27555464
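
    A rough sketch of what a localized random sampling pattern could look like: a random center pixel is chosen and nearby pixels are then sampled with probability decaying with distance (here via Gaussian-distributed offsets). The exact probability kernel and parameters used by the authors may differ; everything below is illustrative.

```python
import numpy as np

def localized_random_samples(image_shape, n_centers, n_neighbors, scale, rng=None):
    """Pick random 'center' pixels, then sample nearby pixels with probability
    decaying with distance from the center (Gaussian-distributed offsets)."""
    rng = np.random.default_rng(rng)
    rows, cols = image_shape
    samples = []
    for _ in range(n_centers):
        r0, c0 = int(rng.integers(rows)), int(rng.integers(cols))
        samples.append((r0, c0))
        offsets = rng.normal(0.0, scale, size=(n_neighbors, 2))  # closer pixels more likely
        for dr, dc in offsets:
            r = int(np.clip(np.rint(r0 + dr), 0, rows - 1))
            c = int(np.clip(np.rint(c0 + dc), 0, cols - 1))
            samples.append((r, c))
    return samples

coords = localized_random_samples((128, 128), n_centers=50, n_neighbors=8, scale=3.0, rng=0)
print(len(coords), "pixel measurements, e.g.", coords[:3])
```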

  11. Prevalence Incidence Mixture Models

    Cancer.gov

    The R package and web tool fit Prevalence Incidence Mixture models to left-censored and irregularly interval-censored time-to-event data of the kind commonly found in screening cohorts assembled from electronic health records. Absolute and relative risks can be estimated for simple random sampling and for stratified sampling (both superpopulation and finite-population approaches are supported for target populations). Non-parametric (absolute risks only), semi-parametric, weakly parametric (using B-splines), and some fully parametric (such as the logistic-Weibull) models are supported.

  12. The mean and variance of phylogenetic diversity under rarefaction

    PubMed Central

    Matsen, Frederick A.

    2013-01-01

    Summary Phylogenetic diversity (PD) depends on sampling depth, which complicates the comparison of PD between samples of different depth. One approach to dealing with differing sample depth for a given diversity statistic is to rarefy, which means to take a random subset of a given size of the original sample. Exact analytical formulae for the mean and variance of species richness under rarefaction have existed for some time but no such solution exists for PD. We have derived exact formulae for the mean and variance of PD under rarefaction. We confirm that these formulae are correct by comparing exact solution mean and variance to that calculated by repeated random (Monte Carlo) subsampling of a dataset of stem counts of woody shrubs of Toohey Forest, Queensland, Australia. We also demonstrate the application of the method using two examples: identifying hotspots of mammalian diversity in Australasian ecoregions, and characterising the human vaginal microbiome. There is a very high degree of correspondence between the analytical and random subsampling methods for calculating mean and variance of PD under rarefaction, although the Monte Carlo method requires a large number of random draws to converge on the exact solution for the variance. Rarefaction of mammalian PD of ecoregions in Australasia to a common standard of 25 species reveals very different rank orderings of ecoregions, indicating quite different hotspots of diversity than those obtained for unrarefied PD. The application of these methods to the vaginal microbiome shows that a classical score used to quantify bacterial vaginosis is correlated with the shape of the rarefaction curve. The analytical formulae for the mean and variance of PD under rarefaction are both exact and more efficient than repeated subsampling. Rarefaction of PD allows for many applications where comparison of samples of different depths is required. PMID:23833701
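
    The exact analytical formulae are given in the paper; the sketch below reproduces only the Monte Carlo baseline they are compared against, applied to species richness rather than PD, with hypothetical stem counts.

```python
import numpy as np

def rarefied_richness(abundances, depth, n_draws=1000, rng=None):
    """Monte Carlo rarefaction: repeatedly draw `depth` individuals without
    replacement and record how many species appear in each draw."""
    rng = np.random.default_rng(rng)
    # Expand the abundance vector into individual-level species labels
    individuals = np.repeat(np.arange(len(abundances)), abundances)
    richness = np.empty(n_draws)
    for k in range(n_draws):
        draw = rng.choice(individuals, size=depth, replace=False)
        richness[k] = np.unique(draw).size
    return richness.mean(), richness.var(ddof=1)

counts = [120, 60, 30, 15, 8, 4, 2, 1, 1]   # hypothetical stem counts per species
mean_s, var_s = rarefied_richness(counts, depth=50, rng=0)
print(f"rarefied richness at depth 50: mean={mean_s:.2f}, variance={var_s:.2f}")
```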

  13. The mean and variance of phylogenetic diversity under rarefaction.

    PubMed

    Nipperess, David A; Matsen, Frederick A

    2013-06-01

    Phylogenetic diversity (PD) depends on sampling depth, which complicates the comparison of PD between samples of different depth. One approach to dealing with differing sample depth for a given diversity statistic is to rarefy, which means to take a random subset of a given size of the original sample. Exact analytical formulae for the mean and variance of species richness under rarefaction have existed for some time but no such solution exists for PD. We have derived exact formulae for the mean and variance of PD under rarefaction. We confirm that these formulae are correct by comparing exact solution mean and variance to that calculated by repeated random (Monte Carlo) subsampling of a dataset of stem counts of woody shrubs of Toohey Forest, Queensland, Australia. We also demonstrate the application of the method using two examples: identifying hotspots of mammalian diversity in Australasian ecoregions, and characterising the human vaginal microbiome. There is a very high degree of correspondence between the analytical and random subsampling methods for calculating mean and variance of PD under rarefaction, although the Monte Carlo method requires a large number of random draws to converge on the exact solution for the variance. Rarefaction of mammalian PD of ecoregions in Australasia to a common standard of 25 species reveals very different rank orderings of ecoregions, indicating quite different hotspots of diversity than those obtained for unrarefied PD. The application of these methods to the vaginal microbiome shows that a classical score used to quantify bacterial vaginosis is correlated with the shape of the rarefaction curve. The analytical formulae for the mean and variance of PD under rarefaction are both exact and more efficient than repeated subsampling. Rarefaction of PD allows for many applications where comparison of samples of different depths is required.

  14. Assessing the Relationship of Ancient and Modern Populations

    PubMed Central

    Schraiber, Joshua G.

    2018-01-01

    Genetic material sequenced from ancient samples is revolutionizing our understanding of the recent evolutionary past. However, ancient DNA is often degraded, resulting in low coverage, error-prone sequencing. Several solutions to this problem exist, ranging from simple approaches, such as selecting a read at random at each site, to more complicated approaches involving genotype likelihoods. In this work, we present a novel method for assessing the relationship of an ancient sample with a modern population, while accounting for sequencing error and postmortem damage by analyzing raw reads from multiple ancient individuals simultaneously. We show that, when analyzing SNP data, it is better to sequence more ancient samples to low coverage: two samples sequenced to 0.5× coverage provide better resolution than a single sample sequenced to 2× coverage. We also examined the power to detect whether an ancient sample is directly ancestral to a modern population, finding that, with even a few high-coverage individuals, ancient samples that are only slightly diverged from the modern population can be detected with ease. When we applied our approach to European samples, we found that no ancient samples represent direct ancestors of modern Europeans. We also found that, as shown previously, the most ancient Europeans appear to have had the smallest effective population sizes, indicating a role for agriculture in modern population growth. PMID:29167200

  15. Total-reflection X-ray fluorescence studies of trace elements in biomedical samples

    NASA Astrophysics Data System (ADS)

    Kubala-Kukuś, A.; Braziewicz, J.; Pajek, M.

    2004-08-01

    Application of total-reflection X-ray fluorescence (TXRF) analysis to studies of trace element contents in biomedical samples is discussed with respect to the following aspects: (i) the nature of trace element concentration distributions, (ii) a censoring approach to detection limits, and (iii) the comparison of two sets of censored data. The paper summarizes recent results on these topics, in particular the lognormal, or more generally log-stable, nature of trace element concentration distributions, random left-censoring and the Kaplan-Meier approach to accounting for detection limits, and, finally, the application of the logrank test to compare censored concentrations measured for two groups. These aspects, which are important for applications of TXRF in different fields, are discussed here in the context of TXRF studies of trace elements in various samples of medical interest.

  16. Dynamic laser speckle analyzed considering inhomogeneities in the biological sample

    NASA Astrophysics Data System (ADS)

    Braga, Roberto A.; González-Peña, Rolando J.; Viana, Dimitri Campos; Rivera, Fernando Pujaico

    2017-04-01

    The dynamic laser speckle phenomenon allows a contactless and nondestructive way to monitor biological changes, which are quantified by second-order statistics applied to the images over time using a secondary matrix known as the time history of the speckle pattern (THSP). To limit processing time, the traditional way of building the THSP restricts the data to a single line or column. Our hypothesis is that this spatial restriction of the information could compromise the results, particularly when undesirable and unexpected optical inhomogeneities occur, such as in cell culture media. We tested a spatially random approach to selecting the points that form the THSP. Cells in a culture medium and drying paint, representing samples with different levels of homogeneity, were tested, and a comparison with the traditional method was carried out. An alternative random selection based on a Gaussian distribution around a desired position was also presented. The results showed that the traditional protocol presented higher variation than the outcomes of the random method. The higher the inhomogeneity of the activity map, the higher the efficiency of the proposed method using random points. The Gaussian distribution proved to be useful when there was a well-defined area to monitor.
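
    A minimal sketch, under assumed array shapes, of how THSP pixel coordinates could be drawn either uniformly at random over the image or from a Gaussian distribution around a chosen position. The speckle image stack here is synthetic and the selection parameters are illustrative, not those of the study.

```python
import numpy as np

def thsp_points(shape, n_points, mode="uniform", center=None, sigma=10.0, rng=None):
    """Select the pixel coordinates used to build the time history of the speckle
    pattern (THSP): either uniformly at random over the image, or drawn from a
    Gaussian distribution around a chosen position."""
    rng = np.random.default_rng(rng)
    rows, cols = shape
    if mode == "uniform":
        r = rng.integers(0, rows, n_points)
        c = rng.integers(0, cols, n_points)
    else:  # "gaussian" around `center`
        r0, c0 = center
        r = np.clip(np.rint(rng.normal(r0, sigma, n_points)), 0, rows - 1).astype(int)
        c = np.clip(np.rint(rng.normal(c0, sigma, n_points)), 0, cols - 1).astype(int)
    return np.column_stack([r, c])

stack = np.random.default_rng(0).random((200, 256, 256))   # stand-in image stack (time, y, x)
pts = thsp_points((256, 256), n_points=64, rng=1)           # spatially random selection
thsp = stack[:, pts[:, 0], pts[:, 1]]                        # THSP: time x selected pixels
print(thsp.shape)
```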

  17. Quantifying uncertainty in discharge measurements: A new approach

    USGS Publications Warehouse

    Kiang, J.E.; Cohn, T.A.; Mason, R.R.

    2009-01-01

    The accuracy of discharge measurements using velocity meters and the velocity-area method is typically assessed based on empirical studies that may not correspond to conditions encountered in practice. In this paper, a statistical approach for assessing uncertainty based on interpolated variance estimation (IVE) is introduced. The IVE method quantifies all sources of random uncertainty in the measured data. This paper presents results employing data from sites where substantial over-sampling allowed for the comparison of IVE-estimated uncertainty and observed variability among repeated measurements. These results suggest that the IVE approach can provide approximate estimates of measurement uncertainty. The use of IVE to estimate the uncertainty of a discharge measurement would provide the hydrographer an immediate determination of uncertainty and help determine whether there is a need for additional sampling in problematic river cross sections. © 2009 ASCE.

  18. Expedite random structure searching using objects from Wyckoff positions

    NASA Astrophysics Data System (ADS)

    Wang, Shu-Wei; Hsing, Cheng-Rong; Wei, Ching-Ming

    2018-02-01

    Random structure searching has proved to be a powerful approach for finding the global minimum and metastable structures. A truly random sampling is in principle needed, yet it would be highly time-consuming and/or practically impossible to find the global minimum for complicated systems in their high-dimensional configuration space. Thus the implementation of reasonable constraints, such as adopting system symmetries to reduce the independent dimensions of structural space and/or imposing chemical information to reach and relax into low-energy regions, is the most essential issue in the approach. In this paper, we propose the concept of an "object", which is either an atom or a set of atoms (such as molecules or carbonates) carrying a symmetry defined by one of the Wyckoff positions of a space group; this allows the search for the global minimum of a complicated system to be confined to a greatly reduced structural space and to become accessible in practice. We examined several representative materials, including the Cd3As2 crystal, solid methanol, high-pressure carbonates (FeCO3), and the Si(111)-7 × 7 reconstructed surface, to demonstrate the power and the advantages of using the "object" concept in random structure searching.

  19. Mendelian Randomization.

    PubMed

    Grover, Sandeep; Del Greco M, Fabiola; Stein, Catherine M; Ziegler, Andreas

    2017-01-01

    Confounding and reverse causality have prevented us from drawing meaningful clinical interpretations even in well-powered observational studies. Confounding may be attributed to our inability to randomize the exposure variable in observational studies. Mendelian randomization (MR) is one approach to overcoming confounding. It utilizes one or more genetic polymorphisms as a proxy for the exposure variable of interest. Polymorphisms are randomly distributed in a population and are static throughout an individual's lifetime, and may thus help in inferring directionality in exposure-outcome associations. Genome-wide association studies (GWAS) or meta-analyses of GWAS are characterized by large sample sizes and the availability of many single nucleotide polymorphisms (SNPs), making GWAS-based MR an attractive approach. GWAS-based MR comes with specific challenges, including multiple causality. Despite these shortcomings, it remains one of the most powerful techniques for inferring causality. With MR still an evolving concept with complex statistical challenges, the literature is relatively scarce in working examples that incorporate real datasets. In this chapter, we provide a step-by-step guide for causal inference based on the principles of MR with a real dataset, using both individual and summary data from unrelated individuals. We suggest best practices and give recommendations based on the current literature.

  20. Strategies for informed sample size reduction in adaptive controlled clinical trials

    NASA Astrophysics Data System (ADS)

    Arandjelović, Ognjen

    2017-12-01

    Clinical trial adaptation refers to any adjustment of the trial protocol after the onset of the trial. The main goal is to make the process of introducing new medical interventions to patients more efficient. The principal challenge, which remains an open research problem, is how adaptation should be performed so as to minimize the chance of distorting the outcome of the trial. In this paper, we propose a novel method for achieving this. Unlike most previously published work, our approach focuses on trial adaptation by sample size adjustment, i.e. by reducing the number of trial participants in a statistically informed manner. Our key idea is to select the sample subset for removal in a manner which minimizes the associated loss of information. We formalize this notion and describe three algorithms which approach the problem in different ways, respectively, using (i) repeated random draws, (ii) a genetic algorithm, and (iii) what we term pair-wise sample compatibilities. Experiments on simulated data demonstrate the effectiveness of all three approaches, with a consistently superior performance exhibited by the pair-wise sample compatibilities-based method.
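
    Of the three algorithms, the repeated-random-draws variant is the simplest to sketch. The toy below proposes many candidate subsets for removal and keeps the one whose removal least perturbs simple summary statistics; the paper's actual information-loss criterion is more elaborate, and all data and parameters here are hypothetical.

```python
import numpy as np

def least_disruptive_removal(data, n_remove, n_draws=2000, rng=None):
    """Repeated random draws: propose many candidate subsets to drop and keep the
    one whose removal changes the retained sample's mean and variance least."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data, dtype=float)
    full_mean, full_var = data.mean(), data.var(ddof=1)
    best_idx, best_loss = None, np.inf
    for _ in range(n_draws):
        drop = rng.choice(data.size, size=n_remove, replace=False)
        kept = np.delete(data, drop)
        # Proxy for "information loss": shift in summary statistics after removal
        loss = abs(kept.mean() - full_mean) + abs(kept.var(ddof=1) - full_var)
        if loss < best_loss:
            best_idx, best_loss = drop, loss
    return best_idx, best_loss

outcomes = np.random.default_rng(0).normal(5.0, 2.0, size=120)   # hypothetical trial arm
drop, loss = least_disruptive_removal(outcomes, n_remove=20, rng=1)
print("participants flagged for removal:", np.sort(drop), "loss =", round(loss, 4))
```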

  1. Non-target time trend screening: a data reduction strategy for detecting emerging contaminants in biological samples.

    PubMed

    Plassmann, Merle M; Tengstrand, Erik; Åberg, K Magnus; Benskin, Jonathan P

    2016-06-01

    Non-targeted mass spectrometry-based approaches for detecting novel xenobiotics in biological samples are hampered by the occurrence of naturally fluctuating endogenous substances, which are difficult to distinguish from environmental contaminants. Here, we investigate a data reduction strategy for datasets derived from a biological time series. The objective is to flag reoccurring peaks in the time series based on increasing peak intensities, thereby reducing peak lists to only those which may be associated with emerging bioaccumulative contaminants. As a result, compounds with increasing concentrations are flagged while compounds displaying random, decreasing, or steady-state time trends are removed. As an initial proof of concept, we created artificial time trends by fortifying human whole blood samples with isotopically labelled standards. Different scenarios were investigated: eight model compounds had a continuously increasing trend in the last two to nine time points, and four model compounds had a trend that reached steady state after an initial increase. Each time series was investigated at three fortification levels and one unfortified series. Following extraction, analysis by ultra performance liquid chromatography high-resolution mass spectrometry, and data processing, a total of 21,700 aligned peaks were obtained. Peaks displaying an increasing trend were filtered from randomly fluctuating peaks using time trend ratios and Spearman's rank correlation coefficients. The first approach was successful in flagging model compounds spiked at only two to three time points, while the latter approach resulted in all model compounds ranking in the top 11 % of the peak lists. Compared to initial peak lists, a combination of both approaches reduced the size of datasets by 80-85 %. Overall, non-target time trend screening represents a promising data reduction strategy for identifying emerging bioaccumulative contaminants in biological samples. Graphical abstract Using time trends to filter out emerging contaminants from large peak lists.
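
    A minimal sketch of the Spearman-rank-correlation filter described above, applied to a synthetic peak-intensity matrix in which one peak is given an artificial increasing trend; the cutoff and data are illustrative, not the study's settings.

```python
import numpy as np
from scipy.stats import spearmanr

def flag_increasing_peaks(intensity_matrix, rho_cutoff=0.8):
    """Flag peaks whose intensities increase over the time series, using
    Spearman's rank correlation between intensity and sampling time."""
    times = np.arange(intensity_matrix.shape[1])
    flagged = []
    for peak_id, series in enumerate(intensity_matrix):
        rho, _ = spearmanr(times, series)
        if rho >= rho_cutoff:
            flagged.append((peak_id, round(float(rho), 2)))
    return flagged

rng = np.random.default_rng(0)
peaks = rng.lognormal(mean=10, sigma=0.1, size=(1000, 9))   # randomly fluctuating peaks
peaks[7] *= np.linspace(1, 5, 9)                             # one emerging "contaminant"
print(flag_increasing_peaks(peaks))                          # peak 7 plus any chance hits
```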

  2. Combining counts and incidence data: an efficient approach for estimating the log-normal species abundance distribution and diversity indices.

    PubMed

    Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G

    2012-10-01

    Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires the presence of species in a sample to be assessed, while counts of the number of individuals per species are required for only a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample, at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.

  3. Evaluating Bayesian spatial methods for modelling species distributions with clumped and restricted occurrence data.

    PubMed

    Redding, David W; Lucas, Tim C D; Blackburn, Tim M; Jones, Kate E

    2017-01-01

    Statistical approaches for inferring the spatial distribution of taxa (Species Distribution Models, SDMs) commonly rely on available occurrence data, which are often clumped and geographically restricted. Although available SDM methods address some of these factors, they could be more directly and accurately modelled using a spatially-explicit approach. Software to fit models with spatial autocorrelation parameters in SDMs is now widely available, but whether such approaches for inferring SDMs aid predictions compared to other methodologies is unknown. Here, within a simulated environment using 1000 generated species' ranges, we compared the performance of two commonly used non-spatial SDM methods (Maximum Entropy Modelling, MAXENT, and boosted regression trees, BRT) to a spatial Bayesian SDM method (fitted using R-INLA), when the underlying data exhibit varying combinations of clumping and geographic restriction. Finally, we tested how any recommended methodological settings designed to account for spatially non-random patterns in the data impact inference. The spatial Bayesian SDM method was the most consistently accurate, ranking among the top two methods in 7 out of 8 data sampling scenarios. Within high-coverage sample datasets, all methods performed fairly similarly. When sampling points were randomly spread, BRT had 1-3% greater accuracy than the other methods, and when samples were clumped, the spatial Bayesian SDM method had a 4%-8% better AUC score. In contrast, when sampling points were restricted to a small section of the true range, all methods were on average 10-12% less accurate, with greater variation among the methods. Model inference under the recommended settings to account for autocorrelation was not impacted by clumping or restriction of data, except for the complexity of the spatial regression term in the spatial Bayesian model. Methods, such as those made available by R-INLA, can be successfully used to account for spatial autocorrelation in an SDM context and, by taking account of random effects, produce outputs that can better elucidate the role of covariates in predicting species occurrence. Given that it is often unclear what the drivers are behind data clumping in an empirical occurrence dataset, or indeed how geographically restricted these data are, spatially-explicit Bayesian SDMs may be the better choice when modelling the spatial distribution of target species.

  4. Accounting for treatment by center interaction in sample size determinations and the use of surrogate outcomes in the pessary for the prevention of preterm birth trial: a simulation study.

    PubMed

    Willan, Andrew R

    2016-07-05

    The Pessary for the Prevention of Preterm Birth Study (PS3) is an international, multicenter, randomized clinical trial designed to examine the effectiveness of the Arabin pessary in preventing preterm birth in pregnant women with a short cervix. During the design of the study, two methodological issues regarding power and sample size were raised. Since treatment in the Standard Arm will vary between centers, it is anticipated that so too will the probability of preterm birth in that arm. This will likely result in a treatment by center interaction, and the issue of how this will affect the sample size requirements was raised. The sample size requirements to examine the effect of the pessary on the baby's clinical outcome were prohibitively high, so the second issue is how best to examine the effect on clinical outcome. The approaches taken to address these issues are presented. Simulation and sensitivity analysis were used to address the sample size issue. The probability of preterm birth in the Standard Arm was assumed to vary between centers following a Beta distribution with a mean of 0.3 and a coefficient of variation of 0.3. To address the second issue, a Bayesian decision model is proposed that combines the information regarding the between-treatment difference in the probability of preterm birth from PS3 with the data from the Multiple Courses of Antenatal Corticosteroids for Preterm Birth Study that relate preterm birth and perinatal mortality/morbidity. The approach provides a between-treatment comparison with respect to the probability of a bad clinical outcome. The performance of the approach was assessed using simulation and sensitivity analysis. Accounting for a possible treatment by center interaction increased the sample size from 540 to 700 patients per arm for the base case. The sample size requirements increase with the coefficient of variation and decrease with the number of centers. Under the same assumptions used for determining the sample size requirements, the simulated mean probability that the pessary reduces the risk of perinatal mortality/morbidity is 0.98. The simulated mean decreased with the coefficient of variation and increased with the number of clinical sites. Employing simulation and sensitivity analysis is a useful approach for determining sample size requirements while accounting for the additional uncertainty due to a treatment by center interaction. Using a surrogate outcome in conjunction with a Bayesian decision model is an efficient way to compare important clinical outcomes in a randomized clinical trial in situations where the direct approach requires a prohibitively high sample size.
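
    A hedged sketch of the simulation idea described above: the Standard-Arm event probability is drawn per center from a Beta distribution with the stated mean (0.3) and coefficient of variation (0.3), and power is estimated by repeated simulation. The risk ratio, number of centers and test used below are assumptions for illustration, not the trial's actual design parameters, so the printed power values will not reproduce the study's calculations.

```python
import numpy as np

def power_with_center_variation(n_per_arm, n_centers, mean_p=0.30, cv=0.30,
                                risk_ratio=0.7, n_sims=2000, seed=0):
    """Simulate a two-arm trial in which the Standard-Arm event probability varies
    between centers following a Beta distribution with the given mean and
    coefficient of variation; estimate power of a two-sided z-test (alpha = 0.05)."""
    rng = np.random.default_rng(seed)
    sd = cv * mean_p
    common = mean_p * (1 - mean_p) / sd**2 - 1      # Beta(a, b) by method of moments
    a, b = mean_p * common, (1 - mean_p) * common
    rejections = 0
    for _ in range(n_sims):
        p_ctrl = rng.beta(a, b, n_centers)                    # center-specific control risks
        ctrl_centers = rng.integers(0, n_centers, n_per_arm)  # random center membership
        trt_centers = rng.integers(0, n_centers, n_per_arm)
        control = rng.random(n_per_arm) < p_ctrl[ctrl_centers]
        treated = rng.random(n_per_arm) < risk_ratio * p_ctrl[trt_centers]
        pooled = (control.sum() + treated.sum()) / (2 * n_per_arm)
        se = np.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
        z = (control.mean() - treated.mean()) / se
        rejections += abs(z) > 1.96
    return rejections / n_sims

for n in (540, 700):   # the per-arm sizes discussed in the abstract
    print(n, "per arm -> simulated power:", power_with_center_variation(n, n_centers=30))
```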

  5. Prevalence of substandard and falsified artemisinin-based combination antimalarial medicines on Bioko Island, Equatorial Guinea.

    PubMed

    Kaur, Harparkash; Allan, Elizabeth Louise; Mamadu, Ibrahim; Hall, Zoe; Green, Michael D; Swamidos, Isabel; Dwivedi, Prabha; Culzoni, Maria Julia; Fernandez, Facundo M; Garcia, Guillermo; Hergott, Dianna; Monti, Feliciano

    2017-01-01

    Poor-quality artemisinin-containing antimalarials (ACAs), including falsified and substandard formulations, pose serious health concerns in malaria-endemic countries. They can harm patients, contribute to the rise in drug resistance and increase the public's mistrust of health systems. Systematic assessment of drug quality is needed to gain knowledge on the prevalence of the problem and to provide Ministries of Health with evidence on which local regulators can take action. We used three sampling approaches to purchase 677 ACAs from 278 outlets on Bioko Island, Equatorial Guinea, as follows: a convenience survey using mystery clients (n=16 outlets, 31 samples), a full island-wide survey using mystery clients (n=174 outlets, 368 samples) and a randomised survey using an overt sampling approach (n=88 outlets, 278 samples). The stated active pharmaceutical ingredients (SAPIs) were assessed using high-performance liquid chromatography and confirmed by mass spectrometry at three independent laboratories. Content analysis showed 91.0% of ACAs were of acceptable quality, 1.6% were substandard and 7.4% falsified. No degraded medicines were detected. The prevalence of medicines without the SAPIs was higher for ACAs purchased in the convenience survey compared with the estimates obtained using the full island-wide survey (mystery client) and randomised-overt sampling approaches. Comparable results were obtained for the full island-wide survey (mystery client) and randomised overt approaches. However, the availability of purchased artesunate monotherapies differed substantially according to the sampling approach used (convenience, 45.2%; full island-wide survey-mystery client, 32.6%; random-overt sampling approach, 21.9%). Of concern is that 37.1% (n=62) of these were falsified. Falsified ACAs were found on Bioko Island, with the prevalence ranging between 6.1% and 16.1%, depending on the sampling method used. These findings underscore the vital need for national authorities to track the scale of ineffective medicines that jeopardise treatment of life-threatening diseases, and the value of a representative sampling approach for measuring the true prevalence of poor-quality medicines.

  6. A Comparative Study of Factors Influencing Male and Female Lecturers' Job Satisfaction in Ghanaian Higher Education

    ERIC Educational Resources Information Center

    Amos, Patricia Mawusi; Acquah, Sakina; Antwi, Theresa; Adzifome, Nixon Saba

    2015-01-01

    The study sought to compare factors influencing male and female lecturers' job satisfaction. Cross-sectional survey designs employing both quantitative and qualitative approaches were adopted for the study. Simple random sampling was used to select 163 lecturers from the four oldest public universities in Ghana. Celep's (2000) Organisational…

  7. Measuring Organizational Learning Capability in Indian Managers and Establishing Firm Performance Linkage: An Empirical Analysis

    ERIC Educational Resources Information Center

    Bhatnagar, Jyotsna

    2006-01-01

    Purpose: The purpose of this research is to measure Organizational Learning Capability (OLC) perception in the managers of public, private and multinational organizations and establish the link between OLC and firm performance. Design/methodology/approach: The data were collected from a sample of 612 managers randomly drawn from Indian industry,…

  8. The Use and Validation of Qualitative Methods Used in Program Evaluation.

    ERIC Educational Resources Information Center

    Plucker, Frank E.

    When conducting a two-year college program review, there are several advantages to supplementing the standard quantitative research approach with qualitative measures. Qualitative research does not depend on a large number of random samples, it uses a flexible design which can be refined as the research is executed, and it generates findings in a…

  9. Causal Factors Influencing Adversity Quotient of Twelfth Grade and Third-Year Vocational Students

    ERIC Educational Resources Information Center

    Pangma, Rachapoom; Tayraukham, Sombat; Nuangchalerm, Prasart

    2009-01-01

    Problem statement: The aim of this research was to study the causal factors influencing the adversity quotient of twelfth grade and third-year vocational students in Sisaket province, Thailand. Six hundred and seventy-two twelfth grade students and 376 third-year vocational students were selected by multi-stage random sampling techniques. Approach:…

  10. School Performance Trajectories and the Challenges for Principal Succession

    ERIC Educational Resources Information Center

    Lee, Linda C.

    2015-01-01

    Purpose: The purpose of this paper is to use empirical data on new principals to clarify the connection between different succession situations and the challenges their successor principals face. Design/methodology/approach: The study draws on two waves of interview data from a random sample of 16 new elementary school principals in a major urban…

  11. Demographic trends in Claremont California’s street tree population

    Treesearch

    Natalie S. van Doorn; E. Gregory McPherson

    2018-01-01

    The aim of this study was to quantify street tree population dynamics in the city of Claremont, CA. A repeated measures survey (2000 and 2014) based on a stratified random sampling approach across size classes and for the 21 most abundant species was analyzed to calculate removal, growth, and replacement planting rates. Demographic rates were estimated using a...

  12. A Needs-Based Approach to the Development of a Diagnostic College English Speaking Test

    ERIC Educational Resources Information Center

    Zhao, Zhongbao

    2014-01-01

    This paper investigated the current situation of oral English teaching, learning, and assessment at the tertiary level in China through needs analysis and explored the implications for the development of a diagnostic speaking test. Through random sampling, the researcher administered both a student questionnaire and a teacher questionnaire to over…

  13. Using Empirical Data to Set Cutoff Scores.

    ERIC Educational Resources Information Center

    Hills, John R.

    Six experimental approaches to the problems of setting cutoff scores and choosing proper test length are briefly mentioned. Most of these methods share the premise that a test is a random sample of items from a domain associated with a carefully specified objective. Each item is independent and is scored zero or one, with no provision for…

  14. Analysis of the Special Studies Program Based on the Interviews of Its Students.

    ERIC Educational Resources Information Center

    Esp, Barbarann; Torelli, Alexis

    The special studies program at Hofstra University is designed for high school graduates applying to the university whose educational backgrounds require a more personalized approach to introductory college work. An attempt is made to minimize the risk of poor academic performance during the first year in college. A random sample of 24 students in…

  15. Teacher-Reported Use of Empirically Validated and Standards-Based Instructional Approaches in Secondary Mathematics

    ERIC Educational Resources Information Center

    Gagnon, Joseph Calvin; Maccini, Paula

    2007-01-01

    A random sample of 167 secondary special and general educators who taught math to students with emotional and behavioral disorders (EBD) and learning disabilities (LD) responded to a mail survey. The survey examined teacher perceptions of (a) definition of math; (b) familiarity with course topics; (c) effectiveness of methods courses; (d)…

  16. An unconventional approach to ecosystem unit classification in western North Carolina, USA

    Treesearch

    W. Henry McNab; Sara A. Browning; Steven A. Simon; Penelope E. Fouts

    1999-01-01

    The authors used an unconventional combination of data transformation and multivariate analyses to reduce subjectivity in identification of ecosystem units in a mountainous region of western North Carolina, USA. Vegetative cover and environmental variables were measured on 79 stratified, randomly located, 0.1 ha sample plots in a 4000 ha watershed. Binary...

  17. Social-Economic Factors, Student Factors, Student Academic Goals and Performance of Students in Institutions of Higher Learning in Uganda

    ERIC Educational Resources Information Center

    Ngoma, Muhammed; Ntale, Peter Dithan; Abaho, Earnest

    2017-01-01

    This article evaluates the relationship between social-economic factors, students' factors, student academic goals and performance of students. The study adopts a cross-sectional survey, with largely quantitative approaches. A sample of 950 students was randomly and proportionately drawn from undergraduates in four institutions of higher learning.…

  18. Developing a "Social Presence Scale" for E-Learning Environments

    ERIC Educational Resources Information Center

    Kilic Cakmak, Ebru; Cebi, Ayça; Kan, Adnan

    2014-01-01

    The purpose of the current study is to develop a "social presence scale" for e-learning environments. A systematic approach was followed for developing the scale. The scale was applied to 461 students registered in seven different programs at Gazi University. The sample was split into two subsamples on a random basis (n1 = 261; n2 =…

  19. Managing Human Resource Capabilities for Sustainable Competitive Advantage: An Empirical Analysis from Indian Global Organisations

    ERIC Educational Resources Information Center

    Khandekar, Aradhana; Sharma, Anuradha

    2005-01-01

    Purpose: The purpose of this article is to examine the role of human resource capability (HRC) in organisational performance and sustainable competitive advantage (SCA) in Indian global organisations. Design/Methodology/Approach: To carry out the present study, an empirical research on a random sample of 300 line or human resource managers from…

  20. The Reasons for the Reluctance of Princess Alia University College Students' from Practicing Sports Activities

    ERIC Educational Resources Information Center

    Odat, Jebril

    2015-01-01

    This study aimed to investigate the reasons behind the reluctance of Alia Princess College female students to participate in sport activities, using a descriptive approach. The population of the study consisted of 2,000 female students, whereas the sample consisted of 200 students. They were randomly selected and a questionnaire of 31…

  1. An Examination of Pay Facets and Referent Groups for Assessing Pay Satisfaction of Male Elementary School Principals

    ERIC Educational Resources Information Center

    Young, I. Phillip; Young, Karen Holsey; Okhremtchouk, Irina; Castaneda, Jose Moreno

    2009-01-01

    Pay satisfaction was assessed according to different facets (pay level, benefits, pay structure, and pay raises) and potential referent groups (teachers and elementary school principals) for a random sample of male elementary school principals. A structural model approach was used that considers facets of the pay process, potential others as…

  2. Variability And Uncertainty Analysis Of Contaminant Transport Model Using Fuzzy Latin Hypercube Sampling Technique

    NASA Astrophysics Data System (ADS)

    Kumar, V.; Nayagum, D.; Thornton, S.; Banwart, S.; Schuhmacher, M.; Lerner, D.

    2006-12-01

    Characterization of the uncertainty associated with groundwater quality models is often of critical importance, for example where environmental models are employed in risk assessment. Insufficient data, inherent variability and estimation errors of environmental model parameters introduce uncertainty into model predictions. However, uncertainty analysis using conventional methods such as standard Monte Carlo sampling (MCS) may not be efficient, or even suitable, for complex, computationally demanding models involving parametric variability and uncertainty of different natures. General MCS, or variants of MCS such as Latin Hypercube Sampling (LHS), treat variability and uncertainty as a single random entity, and the generated samples are treated as crisp, with vagueness assumed to be randomness. Also, when the models are used as purely predictive tools, uncertainty and variability lead to the need to assess the plausible range of model outputs. An improved, systematic variability and uncertainty analysis can provide insight into the level of confidence in model estimates, and can aid in assessing how various possible model estimates should be weighed. The present study aims to introduce Fuzzy Latin Hypercube Sampling (FLHS), a hybrid approach for incorporating cognitive and noncognitive uncertainties. Noncognitive uncertainty, such as physical randomness or statistical uncertainty due to limited information, can be described by its own probability density function (PDF), whereas cognitive uncertainty, such as estimation error, can be described by a membership function for its fuzziness and by confidence intervals via α-cuts. An important property of this theory is its ability to merge the inexact generated data of the LHS approach to increase the quality of information. The FLHS technique ensures that the entire range of each variable is sampled with proper incorporation of uncertainty and variability. A fuzzified statistical summary of the model results produces indices of sensitivity and uncertainty that relate the effects of heterogeneity and uncertainty of input variables to model predictions. The feasibility of the method is validated by assessing the uncertainty propagation of parameter values in estimating the contamination level of a drinking water supply well due to transport of dissolved phenolics from a contaminated site in the UK.
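
    The fuzzy (α-cut) layer of FLHS is not reproduced here; the sketch below shows only the conventional Latin Hypercube Sampling half, using SciPy's qmc module to stratify the unit cube and then mapping the points onto assumed marginal distributions for two illustrative, hypothetical model inputs.

```python
from scipy.stats import norm, qmc, uniform

# Stratify the unit square with a Latin hypercube, then map each column onto an
# assumed marginal distribution for a model input (input names and distributions
# are hypothetical, chosen only to illustrate the mechanics).
sampler = qmc.LatinHypercube(d=2, seed=0)
unit_sample = sampler.random(n=100)                   # stratified points in [0, 1)^2

hydraulic_cond = norm.ppf(unit_sample[:, 0], loc=1e-5, scale=2e-6)    # m/s, assumed normal
source_conc = uniform.ppf(unit_sample[:, 1], loc=50.0, scale=150.0)   # mg/L, assumed uniform on [50, 200]

print(hydraulic_cond[:3])
print(source_conc[:3])
```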

  3. Randomly picked cosmid clones overlap the pyrB and oriC gap in the physical map of the E. coli chromosome.

    PubMed Central

    Knott, V; Rees, D J; Cheng, Z; Brownlee, G G

    1988-01-01

    Sets of overlapping cosmid clones generated by random sampling and fingerprinting methods complement data at pyrB (96.5') and oriC (84') in the published physical map of E. coli. A new cloning strategy using sheared DNA, and a low copy, inducible cosmid vector were used in order to reduce bias in libraries, in conjunction with micro-methods for preparing cosmid DNA from a large number of clones. Our results are relevant to the design of the best approach to the physical mapping of large genomes. PMID:2834694

  4. Digital simulation of two-dimensional random fields with arbitrary power spectra and non-Gaussian probability distribution functions.

    PubMed

    Yura, Harold T; Hanson, Steen G

    2012-04-01

    Methods for simulation of two-dimensional signals with arbitrary power spectral densities and signal amplitude probability density functions are disclosed. The method relies on initially transforming a white noise sample set of random Gaussian distributed numbers into a corresponding set with the desired spectral distribution, after which this colored Gaussian probability distribution is transformed via an inverse transform into the desired probability distribution. In most cases the method provides satisfactory results and can thus be considered an engineering approach. Several illustrative examples with relevance for optics are given.
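
    A simplified sketch of the two-step idea described above: white Gaussian noise is coloured in the Fourier domain to follow a prescribed power spectrum, and the Gaussian marginal is then mapped onto a non-Gaussian one by a rank-preserving transform. Note that the paper's method additionally adjusts the Gaussian spectrum so that the final non-Gaussian field retains the target spectrum; that refinement, and the specific spectra, are not reproduced here, and the example spectrum and target distribution below are arbitrary.

```python
import numpy as np
from scipy.stats import expon, norm

def gaussian_field_with_spectrum(n, psd_func, rng=None):
    """Generate an n-by-n zero-mean, unit-variance Gaussian random field whose
    power spectral density follows psd_func(k), by colouring white noise in the
    Fourier domain."""
    rng = np.random.default_rng(rng)
    white = rng.normal(size=(n, n))
    kx = np.fft.fftfreq(n)
    ky = np.fft.fftfreq(n)
    k = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    amplitude = np.sqrt(psd_func(k))
    amplitude[0, 0] = 0.0                      # drop the DC component
    field = np.fft.ifft2(np.fft.fft2(white) * amplitude).real
    return (field - field.mean()) / field.std()

# Example: power-law (fractal-like) spectrum, then a rank-preserving mapping of
# the Gaussian marginal onto a non-Gaussian (exponential) one.
g = gaussian_field_with_spectrum(256, lambda k: np.where(k > 0, k, np.inf) ** -2.5, rng=0)
non_gaussian = expon.ppf(norm.cdf(g), scale=1.0)
print(g.shape, round(float(non_gaussian.mean()), 3))   # mean of an Exp(1) field is about 1
```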

  5. A review of accuracy assessment for object-based image analysis: From per-pixel to per-polygon approaches

    NASA Astrophysics Data System (ADS)

    Ye, Su; Pontius, Robert Gilmore; Rakshit, Rahul

    2018-07-01

    Object-based image analysis (OBIA) has gained widespread popularity for creating maps from remotely sensed data. Researchers routinely claim that OBIA procedures outperform pixel-based procedures; however, it is not immediately obvious how to evaluate the degree to which an OBIA map compares to reference information in a manner that accounts for the fact that the OBIA map consists of objects that vary in size and shape. Our study reviews 209 journal articles concerning OBIA published between 2003 and 2017. We focus on the three stages of accuracy assessment: (1) sampling design, (2) response design and (3) accuracy analysis. First, we report the literature's overall characteristics concerning OBIA accuracy assessment. Simple random sampling was the most frequently used probability sampling strategy, slightly ahead of stratified sampling. Office-interpreted remotely sensed data were the dominant reference source. The literature reported accuracies ranging from 42% to 96%, with an average of 85%. A third of the articles failed to give sufficient information concerning accuracy methodology, such as sampling scheme and sample size. We found few studies that focused specifically on the accuracy of the segmentation. Second, we identify a recent increase in OBIA articles using per-polygon rather than per-pixel approaches for accuracy assessment. We clarify the respective impacts of the per-pixel and per-polygon approaches on sampling, response design and accuracy analysis. Our review defines the technical and methodological needs in the current per-polygon approaches, such as polygon-based sampling, analysis of mixed polygons, matching of mapped with reference polygons and assessment of segmentation accuracy. Our review summarizes and discusses the current issues in object-based accuracy assessment to provide guidance for improved accuracy assessments for OBIA.

  6. Comparison of NMR simulations of porous media derived from analytical and voxelized representations.

    PubMed

    Jin, Guodong; Torres-Verdín, Carlos; Toumelin, Emmanuel

    2009-10-01

    We develop and compare two formulations of the random-walk method, grain-based and voxel-based, to simulate the nuclear-magnetic-resonance (NMR) response of fluids contained in various models of porous media. The grain-based approach uses a spherical grain pack as input, where the solid surface is analytically defined without an approximation. In the voxel-based approach, the input is a computer-tomography or computer-generated image of reconstructed porous media. Implementation of the two approaches is largely the same, except for the representation of porous media. For comparison, both approaches are applied to various analytical and digitized models of porous media: isolated spherical pore, simple cubic packing of spheres, and random packings of monodisperse and polydisperse spheres. We find that spin magnetization decays much faster in the digitized models than in their analytical counterparts. The difference in decay rate relates to the overestimation of surface area due to the discretization of the sample; it cannot be eliminated even if the voxel size decreases. However, once the effect of the surface-area increase is accounted for in the simulation of surface relaxation, good quantitative agreement is found between the two approaches. Different grain or pore shapes entail different rates of increase of surface area, and we therefore emphasize that the value of the "surface-area-corrected" coefficient may not be universal. Using an X-ray CT image of a Fontainebleau rock sample as an example, we show that voxel size has a significant effect on the calculated surface area and, therefore, on the numerically simulated magnetization response.

  7. The impact of traffic sign deficit on road traffic accidents in Nigeria.

    PubMed

    Ezeibe, Christian; Ilo, Chukwudi; Oguonu, Chika; Ali, Alphonsus; Abada, Ifeanyi; Ezeibe, Ezinwanne; Oguonu, Chukwunonso; Abada, Felicia; Izueke, Edwin; Agbo, Humphrey

    2018-04-04

    This study assesses the impact of traffic sign deficit on road traffic accidents in Nigeria. The participants were 720 commercial vehicle drivers. While simple random sampling was used to select 6 out of 137 federal highways, stratified random sampling was used to select six categories of commercial vehicle drivers. The study used a qual-dominant mixed-methods approach comprising key informant interviews, group interviews, field observation, policy appraisal and secondary literature on traffic signs. Results show that the failure of the government to provide and maintain traffic signs to guide road users through the numerous accident black spots on the highways is the major cause of road accidents in Nigeria. The study argues that the provision and maintenance of traffic signs present an opportunity to promote safety on the highways and to achieve the sustainable development goals.

  8. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference

    PubMed Central

    Maples, Brian K.; Gravel, Simon; Kenny, Eimear E.; Bustamante, Carlos D.

    2013-01-01

    Local-ancestry inference is an important step in the genetic analysis of fully sequenced human genomes. Current methods can only detect continental-level ancestry (i.e., European versus African versus Asian) accurately even when using millions of markers. Here, we present RFMix, a powerful discriminative modeling approach that is faster (∼30×) and more accurate than existing methods. We accomplish this by using a conditional random field parameterized by random forests trained on reference panels. RFMix is capable of learning from the admixed samples themselves to boost performance and autocorrect phasing errors. RFMix shows high sensitivity and specificity in simulated Hispanics/Latinos and African Americans and admixed Europeans, Africans, and Asians. Finally, we demonstrate that African Americans in HapMap contain modest (but nonzero) levels of Native American ancestry (∼0.4%). PMID:23910464

  9. From medium heterogeneity to flow and transport: A time-domain random walk approach

    NASA Astrophysics Data System (ADS)

    Hakoun, V.; Comolli, A.; Dentz, M.

    2017-12-01

    The prediction of flow and transport processes in heterogeneous porous media is based on the qualitative and quantitative understanding of the interplay between 1) spatial variability of hydraulic conductivity, 2) groundwater flow and 3) solute transport. Using a stochastic modeling approach, we study this interplay through direct numerical simulations of Darcy flow and advective transport in heterogeneous media. First, we study flow in correlated hydraulic permeability fields and shed light on the relationship between the statistics of log-hydraulic conductivity, a medium attribute, and the flow statistics. Second, we determine relationships between Eulerian and Lagrangian velocity statistics, that is, between flow and transport attributes. We show how Lagrangian statistics and thus transport behaviors such as late particle arrival times are influenced by the medium heterogeneity on one hand and the initial particle velocities on the other. We find that equidistantly sampled Lagrangian velocities can be described by a Markov process that evolves on the characteristic heterogeneity length scale. We employ a stochastic relaxation model for the equidistantly sampled particle velocities, which is parametrized by the velocity correlation length. This description results in a time-domain random walk model for the particle motion, whose spatial transitions are characterized by the velocity correlation length and temporal transitions by the particle velocities. This approach relates the statistical medium and flow properties to large-scale transport, and allows for conditioning on the initial particle velocities and thus on the medium properties in the injection region. The approach is tested against direct numerical simulations.
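
    The time-domain random walk described above can be sketched with a toy velocity Markov chain (illustrative parameters, not the study's calibrated statistics): particles advance by equidistant spatial steps, the log-velocity relaxes over the correlation length, and the time increment of each step is the step length divided by the current velocity.

    ```python
    import numpy as np

    def tdrw_arrival_times(n_particles=5000, n_steps=400, ds=1.0,
                           corr_len=10.0, sigma_lnv=1.0, seed=1):
        """Time-domain random walk: equidistant space steps, Markovian velocities."""
        rng = np.random.default_rng(seed)
        a = np.exp(-ds / corr_len)        # relaxation factor over one step
        # Initial log-velocities; these could be conditioned on the injection region.
        lnv = sigma_lnv * rng.standard_normal(n_particles)
        t = np.zeros(n_particles)
        for _ in range(n_steps):
            t += ds / np.exp(lnv)         # temporal transition: dt = ds / v
            # Spatial-Markov (Ornstein-Uhlenbeck-type) update of the log-velocity.
            lnv = a * lnv + np.sqrt(1 - a**2) * sigma_lnv * rng.standard_normal(n_particles)
        return t                          # arrival times at distance n_steps * ds

    times = tdrw_arrival_times()
    print("median arrival time:", np.median(times))
    print("99th percentile (late arrivals):", np.percentile(times, 99))
    ```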

  10. Estimation of river and stream temperature trends under haphazard sampling

    USGS Publications Warehouse

    Gray, Brian R.; Lyubchich, Vyacheslav; Gel, Yulia R.; Rogala, James T.; Robertson, Dale M.; Wei, Xiaoqiao

    2015-01-01

    Long-term temporal trends in water temperature in rivers and streams are typically estimated under the assumption of evenly-spaced space-time measurements. However, sampling times and dates associated with historical water temperature datasets and some sampling designs may be haphazard. As a result, trends in temperature may be confounded with trends in time or space of sampling which, in turn, may yield biased trend estimators and thus unreliable conclusions. We address this concern using multilevel (hierarchical) linear models, where time effects are allowed to vary randomly by day and date effects by year. We evaluate the proposed approach by Monte Carlo simulations with imbalance, sparse data and confounding by trend in time and date of sampling. Simulation results indicate unbiased trend estimators while results from a case study of temperature data from the Illinois River, USA conform to river thermal assumptions. We also propose a new nonparametric bootstrap inference on multilevel models that allows for a relatively flexible and distribution-free quantification of uncertainties. The proposed multilevel modeling approach may be elaborated to accommodate nonlinearities within days and years when sampling times or dates typically span temperature extremes.

  11. State estimator for multisensor systems with irregular sampling and time-varying delays

    NASA Astrophysics Data System (ADS)

    Peñarrocha, I.; Sanchis, R.; Romero, J. A.

    2012-08-01

    This article addresses state estimation in linear time-varying systems with several sensors of different availability, randomly sampled in time and whose measurements have a time-varying delay. The approach is based on a modification of the Kalman filter with the negative-time measurement update strategy, avoiding running back the full standard Kalman filter, the use of full augmented-order models or the use of reorganisation techniques, leading to an algorithm with a lower implementation cost. The update equations are run every time a new measurement is available, independently of the time when it was taken. The approach is useful for networked control systems, systems with long delays and scarce measurements, and for out-of-sequence measurements.

  12. Assessing map accuracy in a remotely sensed, ecoregion-scale cover map

    USGS Publications Warehouse

    Edwards, T.C.; Moisen, Gretchen G.; Cutler, D.R.

    1998-01-01

    Landscape- and ecoregion-based conservation efforts increasingly use a spatial component to organize data for analysis and interpretation. A challenge particular to remotely sensed cover maps generated from these efforts is how best to assess the accuracy of the cover maps, especially when they can exceed 1000s of km2 in size. Here we develop and describe a methodological approach for assessing the accuracy of large-area cover maps, using as a test case the 21.9 million ha cover map developed for Utah Gap Analysis. As part of our design process, we first reviewed the effect of intracluster correlation and a simple cost function on the relative efficiency of cluster sample designs compared with simple random designs. Our design ultimately combined clustered and subsampled field data stratified by ecological modeling unit and accessibility (hereafter a mixed design). We next outline estimation formulas for simple map accuracy measures under our mixed design and report results for eight major cover types and the three ecoregions mapped as part of the Utah Gap Analysis. Overall accuracy of the map was 83.2% (SE=1.4). Within ecoregions, accuracy ranged from 78.9% to 85.0%. Accuracy by cover type varied, ranging from a low of 50.4% for barren to a high of 90.6% for man-modified. In addition, we examined gains in efficiency of our mixed design compared with a simple random sample approach. Given fixed sample costs, our mixed design was more precise than a simple random design. We close with a discussion of the logistical constraints facing attempts to assess the accuracy of large-area, remotely sensed cover maps.

  13. Improving the evidence base in palliative care to inform practice and policy: thinking outside the box.

    PubMed

    Aoun, Samar M; Nekolaichuk, Cheryl

    2014-12-01

    The adoption of evidence-based hierarchies and research methods from other disciplines may not completely translate to complex palliative care settings. The heterogeneity of the palliative care population, complexity of clinical presentations, and fluctuating health states present significant research challenges. The aim of this narrative review was to explore the debate about the use of current evidence-based approaches for conducting research, such as randomized controlled trials and other study designs, in palliative care, and more specifically to (1) describe key myths about palliative care research; (2) highlight substantive challenges of conducting palliative care research, using case illustrations; and (3) propose specific strategies to address some of these challenges. Myths about research in palliative care revolve around evidence hierarchies, sample heterogeneity, random assignment, participant burden, and measurement issues. Challenges arise because of the complex physical, psychological, existential, and spiritual problems faced by patients, families, and service providers. These challenges can be organized according to six general domains: patient, system/organization, context/setting, study design, research team, and ethics. A number of approaches for dealing with challenges in conducting research fall into five separate domains: study design, sampling, conceptual, statistical, and measures and outcomes. Although randomized controlled trials have their place whenever possible, alternative designs may offer more feasible research protocols that can be successfully implemented in palliative care. Therefore, this article highlights "outside the box" approaches that would benefit both clinicians and researchers in the palliative care field. Ultimately, the selection of research designs is dependent on a clearly articulated research question, which drives the research process. Copyright © 2014 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  14. Compressive sensing based wireless sensor for structural health monitoring

    NASA Astrophysics Data System (ADS)

    Bao, Yuequan; Zou, Zilong; Li, Hui

    2014-03-01

    Data loss is a common problem for monitoring systems based on wireless sensors. Reliable communication protocols, which enhance communication reliability by repetitively transmitting unreceived packets, are one approach to tackling the problem of data loss. An alternative approach allows data loss to some extent and seeks to recover the lost data from an algorithmic point of view. Compressive sensing (CS) provides such a data loss recovery technique. This technique can be embedded into smart wireless sensors and effectively increases wireless communication reliability without retransmitting the data. The basic idea of the CS-based approach is that, instead of transmitting the raw signal acquired by the sensor, a transformed signal generated by projecting the raw signal onto a random matrix is transmitted. Some data loss may occur during the transmission of this transformed signal. However, according to the theory of CS, the raw signal can be effectively reconstructed from the received incomplete transformed signal given that the raw signal is compressible in some basis and the data loss ratio is low. This CS-based technique is implemented on the Imote2 smart sensor platform using the foundation of the Illinois Structural Health Monitoring Project (ISHMP) Service Tool-suite. To overcome the constraints of limited onboard resources of wireless sensor nodes, a method called the random demodulator (RD) is employed to provide memory- and power-efficient construction of the random sampling matrix. The RD sampling matrix is adapted to accommodate data loss in wireless transmission and to meet the objectives of the data recovery. The embedded program is tested in a series of sensing and communication experiments. Examples and a parametric study are presented to demonstrate the applicability of the embedded program as well as to show the efficacy of CS-based data loss recovery for real wireless SHM systems.
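
    The project-then-recover idea can be illustrated on a small synthetic example. The sketch below is hypothetical and is not the Imote2/ISHMP implementation: it uses a dense Gaussian random matrix instead of a random demodulator, assumes the raw signal is sparse in the identity basis, and recovers it with orthogonal matching pursuit after simulated packet loss.

    ```python
    import numpy as np

    def omp(Phi, y, k):
        """Orthogonal matching pursuit: recover a k-sparse x from y = Phi @ x."""
        residual, support = y.copy(), []
        for _ in range(k):
            j = int(np.argmax(np.abs(Phi.T @ residual)))  # most correlated column
            if j not in support:
                support.append(j)
            x_s, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
            residual = y - Phi[:, support] @ x_s
        x_hat = np.zeros(Phi.shape[1])
        x_hat[support] = x_s
        return x_hat

    rng = np.random.default_rng(0)
    n, m, k = 256, 100, 5                      # signal length, measurements, sparsity
    x = np.zeros(n)
    x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # random projection (transmitted)
    y = Phi @ x
    keep = rng.random(m) > 0.2                 # simulate ~20% packet loss
    x_hat = omp(Phi[keep], y[keep], k)
    print("relative error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
    ```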

  15. Evaluation of different approaches for identifying optimal sites to predict mean hillslope soil moisture content

    NASA Astrophysics Data System (ADS)

    Liao, Kaihua; Zhou, Zhiwen; Lai, Xiaoming; Zhu, Qing; Feng, Huihui

    2017-04-01

    The identification of representative soil moisture sampling sites is important for the validation of remotely sensed mean soil moisture in a certain area and ground-based soil moisture measurements in catchment or hillslope hydrological studies. Numerous approaches have been developed to identify optimal sites for predicting mean soil moisture. Each method has certain advantages and disadvantages, but they have rarely been evaluated and compared. In our study, surface (0-20 cm) soil moisture data from January 2013 to March 2016 (a total of 43 sampling days) were collected at 77 sampling sites on a mixed land-use (tea and bamboo) hillslope in the hilly area of Taihu Lake Basin, China. A total of 10 methods (temporal stability (TS) analyses based on 2 indices, K-means clustering based on 6 kinds of inputs and 2 random sampling strategies) were evaluated for determining optimal sampling sites for mean soil moisture estimation. They were TS analyses based on the smallest index of temporal stability (ITS, a combination of the mean relative difference and standard deviation of relative difference (SDRD)) and based on the smallest SDRD, K-means clustering based on soil properties and terrain indices (EFs), repeated soil moisture measurements (Theta), EFs plus one-time soil moisture data (EFsTheta), and the principal components derived from EFs (EFs-PCA), Theta (Theta-PCA), and EFsTheta (EFsTheta-PCA), and global and stratified random sampling strategies. Results showed that the TS based on the smallest ITS was better (RMSE = 0.023 m3 m-3) than that based on the smallest SDRD (RMSE = 0.034 m3 m-3). The K-means clustering based on EFsTheta (-PCA) was better (RMSE <0.020 m3 m-3) than those based on EFs (-PCA) and Theta (-PCA). The sampling design stratified by land use was more efficient than the global random method. Forty and 60 sampling sites are needed for stratified and global sampling, respectively, to make their performance comparable to the best K-means method (EFsTheta-PCA). Overall, TS required only one site, but its accuracy was limited. The best K-means method required <8 sites and yielded high accuracy, but extra soil and terrain information is necessary when using this method. The stratified sampling strategy can only be used if no pre-knowledge about soil moisture variation is available. This information will help in selecting the optimal methods for estimating the area mean soil moisture.
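
    The K-means site-selection idea can be sketched as follows (synthetic site attributes and moisture values, not the study's data, and no claim about which inputs work best): cluster candidate sites on their attributes, take the site nearest each cluster centre as a monitoring site, and average the selected sites to estimate the hillslope mean.

    ```python
    import numpy as np

    def kmeans(X, k, n_iter=100, seed=0):
        """A small k-means implementation (Lloyd's algorithm)."""
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), k, replace=False)]
        for _ in range(n_iter):
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        return labels, centers

    rng = np.random.default_rng(42)
    n_sites, n_days, k = 77, 43, 6
    attrs = rng.standard_normal((n_sites, 3))        # e.g. soil/terrain attributes
    moisture = 0.30 + 0.05 * attrs[:, :1] + 0.01 * rng.standard_normal((n_sites, n_days))

    labels, centers = kmeans(attrs, k)
    # Representative site for each cluster = the site nearest the cluster centre.
    rep = [int(np.argmin(((attrs - c) ** 2).sum(axis=1) + 1e9 * (labels != j)))
           for j, c in enumerate(centers)]
    estimate = moisture[rep].mean(axis=0)            # mean over the selected sites
    truth = moisture.mean(axis=0)                    # "true" hillslope mean
    print("RMSE:", np.sqrt(np.mean((estimate - truth) ** 2)))
    ```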

  16. Model's sparse representation based on reduced mixed GMsFE basis methods

    NASA Astrophysics Data System (ADS)

    Jiang, Lijian; Li, Qiuqi

    2017-06-01

    In this paper, we propose a model's sparse representation based on reduced mixed generalized multiscale finite element (GMsFE) basis methods for elliptic PDEs with random inputs. A typical application for the elliptic PDEs is the flow in heterogeneous random porous media. The mixed generalized multiscale finite element method (GMsFEM) is one of the accurate and efficient approaches to solve the flow problem in a coarse grid and obtain the velocity with local mass conservation. When the inputs of the PDEs are parameterized by the random variables, the GMsFE basis functions usually depend on the random parameters. This leads to a large number of degrees of freedom for the mixed GMsFEM and substantially impacts the computational efficiency. In order to overcome the difficulty, we develop reduced mixed GMsFE basis methods such that the multiscale basis functions are independent of the random parameters and span a low-dimensional space. To this end, a greedy algorithm is used to find a set of optimal samples from a training set scattered in the parameter space. Reduced mixed GMsFE basis functions are constructed based on the optimal samples using two optimal sampling strategies: basis-oriented cross-validation and proper orthogonal decomposition. Although the dimension of the space spanned by the reduced mixed GMsFE basis functions is much smaller than the dimension of the original full order model, the online computation still depends on the number of coarse degrees of freedom. To significantly improve the online computation, we integrate the reduced mixed GMsFE basis methods with sparse tensor approximation and obtain a sparse representation for the model's outputs. The sparse representation is very efficient for evaluating the model's outputs for many instances of parameters. To illustrate the efficacy of the proposed methods, we present a few numerical examples for elliptic PDEs with multiscale and random inputs. In particular, a two-phase flow model in random porous media is simulated by the proposed sparse representation method.

  17. Model's sparse representation based on reduced mixed GMsFE basis methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Lijian, E-mail: ljjiang@hnu.edu.cn; Li, Qiuqi, E-mail: qiuqili@hnu.edu.cn

    2017-06-01

    In this paper, we propose a model's sparse representation based on reduced mixed generalized multiscale finite element (GMsFE) basis methods for elliptic PDEs with random inputs. A typical application for the elliptic PDEs is the flow in heterogeneous random porous media. The mixed generalized multiscale finite element method (GMsFEM) is one of the accurate and efficient approaches to solve the flow problem in a coarse grid and obtain the velocity with local mass conservation. When the inputs of the PDEs are parameterized by the random variables, the GMsFE basis functions usually depend on the random parameters. This leads to a large number of degrees of freedom for the mixed GMsFEM and substantially impacts the computational efficiency. In order to overcome the difficulty, we develop reduced mixed GMsFE basis methods such that the multiscale basis functions are independent of the random parameters and span a low-dimensional space. To this end, a greedy algorithm is used to find a set of optimal samples from a training set scattered in the parameter space. Reduced mixed GMsFE basis functions are constructed based on the optimal samples using two optimal sampling strategies: basis-oriented cross-validation and proper orthogonal decomposition. Although the dimension of the space spanned by the reduced mixed GMsFE basis functions is much smaller than the dimension of the original full order model, the online computation still depends on the number of coarse degrees of freedom. To significantly improve the online computation, we integrate the reduced mixed GMsFE basis methods with sparse tensor approximation and obtain a sparse representation for the model's outputs. The sparse representation is very efficient for evaluating the model's outputs for many instances of parameters. To illustrate the efficacy of the proposed methods, we present a few numerical examples for elliptic PDEs with multiscale and random inputs. In particular, a two-phase flow model in random porous media is simulated by the proposed sparse representation method.

  18. A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution.

    PubMed

    Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme

    2013-07-01

    The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and, more importantly, does not allow the user to explicitly control the nucleotide distribution, such as the GC-content, of the sequences. However, the latter is an important criterion for large-scale applications as it could presumably be used to design sequences with better transcription rates and/or structural plasticity. In this article, we introduce IncaRNAtion, a novel algorithm to design RNA sequences folding into target secondary structures with a predefined nucleotide distribution. IncaRNAtion uses a global sampling approach and weighted sampling techniques. We show that our approach is fast (i.e. running time comparable to or better than local search methods), seedless (we remove the bias of the seed in local search heuristics) and successfully generates high-quality sequences (i.e. thermodynamically stable) for any GC-content. To complete this study, we develop a hybrid method combining our global sampling approach with local search strategies. Remarkably, our glocal methodology outperforms both local and global approaches for sampling sequences with a specific GC-content and target structure. IncaRNAtion is available at csb.cs.mcgill.ca/incarnation/. Supplementary data are available at Bioinformatics online.

  19. Data-driven probability concentration and sampling on manifold

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu

    2016-09-15

    A new methodology is proposed for generating realizations of a random vector with values in a finite-dimensional Euclidean space that are statistically consistent with a dataset of observations of this vector. The probability distribution of this random vector, while a priori not known, is presumed to be concentrated on an unknown subset of the Euclidean space. A random matrix is introduced whose columns are independent copies of the random vector and for which the number of columns is the number of data points in the dataset. The approach is based on the use of (i) the multidimensional kernel-density estimation method for estimating the probability distribution of the random matrix, (ii) a MCMC method for generating realizations for the random matrix, (iii) the diffusion-maps approach for discovering and characterizing the geometry and the structure of the dataset, and (iv) a reduced-order representation of the random matrix, which is constructed using the diffusion-maps vectors associated with the first eigenvalues of the transition matrix relative to the given dataset. The convergence aspects of the proposed methodology are analyzed and a numerical validation is explored through three applications of increasing complexity. The proposed method is found to be robust to noise levels and data complexity as well as to the intrinsic dimension of data and the size of experimental datasets. Both the methodology and the underlying mathematical framework presented in this paper contribute new capabilities and perspectives at the interface of uncertainty quantification, statistical data analysis, stochastic modeling and associated statistical inverse problems.

  20. Incorporating uncertainty and motion in Intensity Modulated Radiation Therapy treatment planning

    NASA Astrophysics Data System (ADS)

    Martin, Benjamin Charles

    In radiation therapy, one seeks to destroy a tumor while minimizing the damage to surrounding healthy tissue. Intensity Modulated Radiation Therapy (IMRT) uses overlapping beams of x-rays that add up to a high dose within the target and a lower dose in the surrounding healthy tissue. IMRT relies on optimization techniques to create high quality treatments. Unfortunately, the possible conformality is limited by the need to ensure coverage even if there is organ movement or deformation. Currently, margins are added around the tumor to ensure coverage based on an assumed motion range. This approach does not ensure high quality treatments. In the standard IMRT optimization problem, an objective function measures the deviation of the dose from the clinical goals. The optimization then finds the beamlet intensities that minimize the objective function. When modeling uncertainty, the dose delivered from a given set of beamlet intensities is a random variable. Thus the objective function is also a random variable. In our stochastic formulation we minimize the expected value of this objective function. We developed a problem formulation that is both flexible and fast enough for use on real clinical cases. While working on accelerating the stochastic optimization, we developed a technique of voxel sampling. Voxel sampling is a randomized algorithms approach to a steepest descent problem based on estimating the gradient by only calculating the dose to a fraction of the voxels within the patient. When combined with an automatic sampling rate adaptation technique, voxel sampling produced an order of magnitude speed up in IMRT optimization. We also develop extensions of our results to Intensity Modulated Proton Therapy (IMPT). Due to the physics of proton beams the stochastic formulation yields visibly different and better plans than normal optimization. The results of our research have been incorporated into a software package OPT4D, which is an IMRT and IMPT optimization tool that we developed.
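
    The voxel-sampling idea can be illustrated on a toy quadratic dose objective. The dose-influence matrix, prescription and step size below are synthetic and hypothetical, and the sketch is not the OPT4D implementation: at each descent step the gradient is estimated from a random fraction of the voxels only.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_voxels, n_beamlets = 20000, 200
    D = rng.random((n_voxels, n_beamlets)) * 0.01    # synthetic dose-influence matrix
    target = np.ones(n_voxels)                       # prescribed dose per voxel

    def objective(w):
        return 0.5 * np.mean((D @ w - target) ** 2)

    def sampled_gradient(w, frac=0.05):
        """Estimate the objective gradient from a random 5% subset of voxels."""
        idx = rng.choice(n_voxels, int(frac * n_voxels), replace=False)
        r = D[idx] @ w - target[idx]
        return D[idx].T @ r / len(idx)

    w = np.zeros(n_beamlets)
    step = 20.0                                      # illustrative fixed step size
    for _ in range(200):
        w = np.maximum(w - step * sampled_gradient(w), 0.0)  # keep intensities >= 0
    print("objective before/after:", objective(np.zeros(n_beamlets)), objective(w))
    ```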

  1. Group investigation with scientific approach in mathematics learning

    NASA Astrophysics Data System (ADS)

    Indarti, D.; Mardiyana; Pramudya, I.

    2018-03-01

    The aim of this research is to determine the effect of the learning model on mathematics achievement. This is a quasi-experimental study. The population comprises all grade VII students of Karanganyar regency in the 2016/2017 academic year. The sample was taken using a stratified cluster random sampling technique. Data were collected with a mathematics achievement test. The data were analysed with one-way ANOVA, following a normality test with the Lilliefors method and a homogeneity test with Bartlett's method. The results show that mathematics learning using the Group Investigation model with a scientific approach produces better achievement than learning with the conventional model on the quadrilateral material. The Group Investigation model with a scientific approach can therefore be used by teachers in mathematics learning, especially for the quadrilateral material, to improve mathematics achievement.

  2. Evaluation of three sampling methods to monitor outcomes of antiretroviral treatment programmes in low- and middle-income countries.

    PubMed

    Tassie, Jean-Michel; Malateste, Karen; Pujades-Rodríguez, Mar; Poulet, Elisabeth; Bennett, Diane; Harries, Anthony; Mahy, Mary; Schechter, Mauro; Souteyrand, Yves; Dabis, François

    2010-11-10

    Retention of patients on antiretroviral therapy (ART) over time is a proxy for quality of care and an outcome indicator to monitor ART programs. Using existing databases (Antiretroviral in Lower Income Countries of the International Databases to Evaluate AIDS and Médecins Sans Frontières), we evaluated three sampling approaches to simplify the generation of outcome indicators. We used individual patient data from 27 ART sites and included 27,201 ART-naive adults (≥15 years) who initiated ART in 2005. For each site, we generated two outcome indicators at 12 months, retention on ART and proportion of patients lost to follow-up (LFU), first using all patient data and then within a smaller group of patients selected using three sampling methods (random, systematic and consecutive sampling). For each method and each site, 500 samples were generated, and the average result was compared with the unsampled value. The 95% sampling distribution (SD) was expressed as the 2.5th and 97.5th percentile values from the 500 samples. Overall, retention on ART was 76.5% (range 58.9-88.6) and the proportion of patients LFU, 13.5% (range 0.8-31.9). Estimates of retention from sampling (n = 5696) were 76.5% (SD 75.4-77.7) for random, 76.5% (75.3-77.5) for systematic and 76.0% (74.1-78.2) for the consecutive method. Estimates for the proportion of patients LFU were 13.5% (12.6-14.5), 13.5% (12.6-14.3) and 14.0% (12.5-15.5), respectively. With consecutive sampling, 50% of sites had SD within ±5% of the unsampled site value. Our results suggest that random, systematic or consecutive sampling methods are feasible for monitoring ART indicators at national level. However, sampling may not produce precise estimates in some sites.
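
    A toy simulation of the three schemes on a synthetic cohort (retention probability, cohort size and per-site sample size chosen for illustration; not the study data) shows how the point estimate and the 95% sampling distribution are obtained from 500 repeated samples. In this synthetic cohort patients are exchangeable, so consecutive sampling loses nothing; in real registries, enrolment-order effects are what widen its sampling distribution.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    cohort = rng.random(27201) < 0.765        # True = retained on ART at 12 months
    n = 200                                   # per-site sample size (illustrative)

    def random_sample(x):
        return rng.choice(x, n, replace=False)

    def systematic_sample(x):
        step = len(x) // n
        return x[rng.integers(step)::step][:n]

    def consecutive_sample(x):
        start = rng.integers(len(x) - n)
        return x[start:start + n]

    for name, sampler in [("random", random_sample),
                          ("systematic", systematic_sample),
                          ("consecutive", consecutive_sample)]:
        est = np.array([sampler(cohort).mean() for _ in range(500)])
        lo, hi = np.percentile(est, [2.5, 97.5])
        print(f"{name:11s} mean={est.mean():.3f}  95% sampling distribution=({lo:.3f}, {hi:.3f})")
    ```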

  3. 3D statistical shape models incorporating 3D random forest regression voting for robust CT liver segmentation

    NASA Astrophysics Data System (ADS)

    Norajitra, Tobias; Meinzer, Hans-Peter; Maier-Hein, Klaus H.

    2015-03-01

    During image segmentation, 3D Statistical Shape Models (SSM) usually conduct a limited search for target landmarks within one-dimensional search profiles perpendicular to the model surface. In addition, landmark appearance is modeled only locally based on linear profiles and weak learners, altogether leading to segmentation errors from landmark ambiguities and limited search coverage. We present a new method for 3D SSM segmentation based on 3D Random Forest Regression Voting. For each surface landmark, a Random Regression Forest is trained that learns a 3D spatial displacement function between the corresponding reference landmark and a set of surrounding sample points, based on an infinite set of non-local randomized 3D Haar-like features. Landmark search is then conducted omni-directionally within 3D search spaces, where voxelwise forest predictions on landmark position contribute to a common voting map which reflects the overall position estimate. Segmentation experiments were conducted on a set of 45 CT volumes of the human liver, of which 40 images were randomly chosen for training and 5 for testing. Without parameter optimization, using a simple candidate selection and a single resolution approach, excellent results were achieved, while faster convergence and better concavity segmentation were observed, altogether underlining the potential of our approach in terms of increased robustness from distinct landmark detection and from better search coverage.

  4. Effects of a Multicomponent Life-Style Intervention on Weight, Glycemic Control, Depressive Symptoms, and Renal Function in Low-Income, Minority Patients With Type 2 Diabetes: Results of the Community Approach to Lifestyle Modification for Diabetes Randomized Controlled Trial.

    PubMed

    Moncrieft, Ashley E; Llabre, Maria M; McCalla, Judith Rey; Gutt, Miriam; Mendez, Armando J; Gellman, Marc D; Goldberg, Ronald B; Schneiderman, Neil

    2016-09-01

    Few interventions have combined life-style and psychosocial approaches in the context of Type 2 diabetes management. The purpose of this study was to determine the effect of a multicomponent behavioral intervention on weight, glycemic control, renal function, and depressive symptoms in a sample of overweight/obese adults with Type 2 diabetes and marked depressive symptoms. A sample of 111 adults with Type 2 diabetes was randomly assigned to a 1-year intervention (n = 57) or usual care (n = 54) in a parallel groups design. Primary outcomes included weight, glycosylated hemoglobin, and Beck Depression Inventory II score. Estimated glomerular filtration rate served as a secondary outcome. All measures were assessed at baseline and 6 and 12 months after randomization by assessors blind to randomization. Latent growth modeling was used to examine intervention effects on each outcome. The intervention resulted in decreases in weight (mean [M] = 0.322 kg, standard error [SE] = 0.124 kg, p = .010), glycosylated hemoglobin (M = 0.066%, SE = 0.028%, p = .017), and Beck Depression Inventory II scores (M = 1.009, SE = 0.226, p < .001), and improved estimated glomerular filtration rate (M = 0.742 mL/min/1.73 m2, SE = 0.318 mL/min/1.73 m2, p = .020) each month during the first 6 months relative to usual care. Multicomponent behavioral interventions targeting weight loss and depressive symptoms as well as diet and physical activity are efficacious in the management of Type 2 diabetes. This study is registered at Clinicaltrials.gov ID: NCT01739205.

  5. Robustness-Based Design Optimization Under Data Uncertainty

    NASA Technical Reports Server (NTRS)

    Zaman, Kais; McDonald, Mark; Mahadevan, Sankaran; Green, Lawrence

    2010-01-01

    This paper proposes formulations and algorithms for design optimization under both aleatory (i.e., natural or physical variability) and epistemic uncertainty (i.e., imprecise probabilistic information), from the perspective of system robustness. The proposed formulations deal with epistemic uncertainty arising from both sparse and interval data without any assumption about the probability distributions of the random variables. A decoupled approach is proposed in this paper to un-nest the robustness-based design from the analysis of non-design epistemic variables to achieve computational efficiency. The proposed methods are illustrated for the upper stage design problem of a two-stage-to-orbit (TSTO) vehicle, where the information on the random design inputs is only available as sparse point and/or interval data. As collecting more data reduces uncertainty but increases cost, the effect of sample size on the optimality and robustness of the solution is also studied. A method is developed to determine the optimal sample size for sparse point data that leads to solutions of the design problem that are least sensitive to variations in the input random variables.

  6. A new method for reconstruction of solar irradiance

    NASA Astrophysics Data System (ADS)

    Privalsky, Victor

    2018-07-01

    The purpose of this research is to show how time series should be reconstructed, using as an example the data on the total solar irradiance (TSI) of the Earth and on sunspot numbers (SSN) since 1749. The traditional approach through regression equation(s) is designed for time-invariant vectors of random variables and is not applicable to time series, which are random functions of time. The autoregressive reconstruction (ARR) method suggested here requires fitting a multivariate stochastic difference equation to the target/proxy time series. The reconstruction is done through the scalar equation for the target time series with the white noise term excluded. The time series approach is shown to provide a better reconstruction of TSI than the correlation/regression method. A reconstruction criterion is introduced which allows one to define in advance the achievable level of success in the reconstruction. The conclusion is that time series, including the total solar irradiance, cannot be reconstructed properly if the data are not treated as sample records of random processes and analyzed in both time and frequency domains.
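
    A minimal sketch of the autoregressive-reconstruction idea (synthetic series and AR order 1 assumed; this is not the paper's TSI/SSN analysis): fit the scalar difference equation for the target on a calibration period by least squares, then run it forward with the white-noise term excluded, driven by the proxy.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    T = 300
    proxy = np.zeros(T)
    target = np.zeros(T)
    for t in range(1, T):
        proxy[t] = 0.9 * proxy[t - 1] + rng.standard_normal()
        target[t] = 0.7 * target[t - 1] + 0.2 * proxy[t - 1] + 0.3 * rng.standard_normal()

    # Fit the scalar AR equation for the target on a calibration period by OLS:
    #   target[t] ~ a * target[t-1] + b * proxy[t-1]
    X = np.column_stack([target[:199], proxy[:199]])
    y = target[1:200]
    a, b = np.linalg.lstsq(X, y, rcond=None)[0]

    # Reconstruct the hold-out period with the white-noise term excluded,
    # starting from one known target value and driven by the proxy alone.
    recon = target.copy()
    for t in range(200, T):
        recon[t] = a * recon[t - 1] + b * proxy[t - 1]
    rmse = np.sqrt(np.mean((recon[200:] - target[200:]) ** 2))
    print("fitted coefficients:", round(float(a), 3), round(float(b), 3), "RMSE:", round(float(rmse), 3))
    ```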

  7. Evaluation of an integrated treatment for active duty service members with comorbid posttraumatic stress disorder and major depressive disorder: Study protocol for a randomized controlled trial.

    PubMed

    Walter, Kristen H; Glassman, Lisa H; Michael Hunt, W; Otis, Nicholas P; Thomsen, Cynthia J

    2018-01-01

    Posttraumatic stress disorder (PTSD) commonly co-occurs with major depressive disorder (MDD) in both civilian and military/veteran populations. Existing, evidence-based PTSD treatments, such as cognitive processing therapy (CPT), often reduce symptoms of both PTSD and depression; however, findings related to the influence of comorbid MDD on PTSD treatment outcomes are mixed, and few studies use samples of individuals with both conditions. Behavioral activation (BA), an approach that relies on behavioral principles, is an effective treatment for depression. We have integrated BA into CPT (BA+CPT), a more cognitive approach, to address depressive symptoms among active duty service members with both PTSD and comorbid MDD. We describe an ongoing randomized controlled trial investigating the efficacy of our innovative, integrated BA+CPT intervention, compared with standard CPT, for active duty service members with PTSD and comorbid MDD. We detail the development of this integrated treatment, as well as the design and implementation of the randomized controlled trial, to evaluate its effect on symptoms. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Low rank magnetic resonance fingerprinting.

    PubMed

    Mazor, Gal; Weizman, Lior; Tal, Assaf; Eldar, Yonina C

    2016-08-01

    Magnetic Resonance Fingerprinting (MRF) is a relatively new approach that provides quantitative MRI using randomized acquisition. Extraction of physical quantitative tissue values is performed off-line, based on acquisition with varying parameters and a dictionary generated according to the Bloch equations. MRF uses hundreds of radio frequency (RF) excitation pulses for acquisition, and therefore a high under-sampling ratio in the sampling domain (k-space) is required. This under-sampling causes spatial artifacts that hamper the ability to accurately estimate the quantitative tissue values. In this work, we introduce a new approach for quantitative MRI using MRF, called Low Rank MRF. We exploit the low rank property of the temporal domain, on top of the well-known sparsity of the MRF signal in the generated dictionary domain. We present an iterative scheme that consists of a gradient step followed by a low rank projection using the singular value decomposition. Experiments on real MRI data demonstrate superior results compared to a conventional implementation of compressed sensing for MRF at 15% sampling ratio.
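
    The gradient-step-plus-low-rank-projection iteration can be sketched on a generic under-sampled linear model, with a random sampling mask standing in for the k-space trajectory. This is purely illustrative matrix completion, not the MRF reconstruction pipeline (which also exploits dictionary-domain sparsity):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    m, n, r = 60, 50, 3
    X_true = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # low-rank truth
    mask = rng.random((m, n)) < 0.35         # random 35% sampling pattern
    y = mask * X_true                        # observed, under-sampled data

    def low_rank_project(X, r):
        """Truncate to the best rank-r approximation via the SVD."""
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        return (U[:, :r] * s[:r]) @ Vt[:r]

    X = np.zeros((m, n))
    for _ in range(300):
        grad = mask * X - y                  # gradient of 0.5 * ||mask*X - y||^2
        X = low_rank_project(X - grad, r)    # gradient step, then rank-r projection
    print("relative error:", np.linalg.norm(X - X_true) / np.linalg.norm(X_true))
    ```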

  9. A causal framework for understanding the effect of losses to follow-up on epidemiologic analyses in clinic-based cohorts: the case of HIV-infected patients on antiretroviral therapy in Africa.

    PubMed

    Geng, Elvin H; Glidden, David V; Bangsberg, David R; Bwana, Mwebesa Bosco; Musinguzi, Nicholas; Nash, Denis; Metcalfe, John Z; Yiannoutsos, Constantin T; Martin, Jeffrey N; Petersen, Maya L

    2012-05-15

    Although clinic-based cohorts are most representative of the "real world," they are susceptible to loss to follow-up. Strategies for managing the impact of loss to follow-up are therefore needed to maximize the value of studies conducted in these cohorts. The authors evaluated adult patients starting antiretroviral therapy at an HIV/AIDS clinic in Uganda, where 29% of patients were lost to follow-up after 2 years (January 1, 2004-September 30, 2007). Unweighted, inverse probability of censoring weighted (IPCW), and sampling-based approaches (using supplemental data from a sample of lost patients subsequently tracked in the community) were used to identify the predictive value of sex on mortality. Directed acyclic graphs (DAGs) were used to explore the structural basis for bias in each approach. Among 3,628 patients, unweighted and IPCW analyses found men to have higher mortality than women, whereas the sampling-based approach did not. DAGs encoding knowledge about the data-generating process, including the fact that death is a cause of being classified as lost to follow-up in this setting, revealed "collider" bias in the unweighted and IPCW approaches. In a clinic-based cohort in Africa, unweighted and IPCW approaches-which rely on the "missing at random" assumption-yielded biased estimates. A sampling-based approach can in general strengthen epidemiologic analyses conducted in many clinic-based cohorts, including those examining other diseases.

  10. Out-of-School Care and Problem Behavior Trajectories Among Low-Income Adolescents: Individual, Family, and Neighborhood Characteristics as Added Risks

    ERIC Educational Resources Information Center

    Coley, Rebekah Levine; Morris, Jodi Eileen; Hernandez, Daphne

    2004-01-01

    Using a developmental systems approach, this study considered longitudinal links between adolescents' out-of-school care experiences and behavioral trajectories within a random sample of 819 adolescents ages 10 to 14 years at Wave 1 from low-income, urban families. Multiple aspects of context were considered, including the location, supervision,…

  11. The Private School Market in Kuwait: A Field Study on Educational Investment Behavior of Kuwaiti Families

    ERIC Educational Resources Information Center

    Alqahtani, Abdulmuhsen Ayedh

    2014-01-01

    The current study aims at exploring Kuwaiti families' educational investment behavior pursuant to the selection of a specific private school for their children from the private school market. Using the quantitative approach and the principles of marketing research, a survey was administered to a randomly selected sample of Kuwaiti families (n =…

  12. Measuring, Understanding, and Responding to Covert Social Networks: Passive and Active Tomography

    DTIC Science & Technology

    2017-11-29

    Methods for generating a random sample of networks with desired properties are important tools for the analysis of social, biological, and information... [abstract truncated; the remaining fragments reference a programme on Theoretical Foundations for Statistical Network Analysis at the Isaac Newton Institute for Mathematical Sciences, Cambridge, and note that the problems span three disciplines (social sciences, statistics, EECS), with scientific focus needed at their interfaces.]

  13. Awareness of Message Source and Its Association with the Impacts of Sun Protection Campaigns in Australia

    ERIC Educational Resources Information Center

    Smith, Ben J.; Bauman, Adrian E.; McKenzie, Jeanie; Thomas, Margaret

    2005-01-01

    Purpose: To examine whether awareness of the source of sun protection campaigns in New South Wales, Australia was associated with message recall and sun protection knowledge and behaviours. Design/methodology/approach: Telephone surveys of random samples (n = 800) of parents and other carers of children under 12 years of age were conducted before…

  14. Internet Access and Usage by Secondary School Students in Morogoro Municipality, Tanzania

    ERIC Educational Resources Information Center

    Tarimo, Ronald; Kavishe, George

    2017-01-01

    The purpose of this paper was to report results of a study on the investigation of the Internet access and usage by secondary school students in Morogoro municipality in Tanzania. A simple random sampling technique was used to select 120 students from six schools. The data was collected through a questionnaire. A quantitative approach using the…

  15. Human Capital Planning in Higher Education Institutions: A Strategic Human Resource Development Initiative in Jordan

    ERIC Educational Resources Information Center

    Khasawneh, Samer

    2011-01-01

    Purpose: The primary purpose of this study is to determine the status of human capital planning in higher education institutions in Jordan. Design/methodology/approach: A random sample of 120 faculty members (in administrative positions) responded to a human capital planning (HCP) survey. The survey consisted of a pool of 38 items distributed over…

  16. Response rate, response time, and economic costs of survey research: A randomized trial of practicing pharmacists.

    PubMed

    Hardigan, Patrick C; Popovici, Ioana; Carvajal, Manuel J

    2016-01-01

    There is a gap between increasing demands from pharmacy journals, publishers, and reviewers for high survey response rates and the actual responses often obtained in the field by survey researchers. Presumably demands have been set high because response rates, times, and costs affect the validity and reliability of survey results. Explore the extent to which survey response rates, average response times, and economic costs are affected by conditions under which pharmacist workforce surveys are administered. A random sample of 7200 U.S. practicing pharmacists was selected. The sample was stratified by delivery method, questionnaire length, item placement, and gender of respondent for a total of 300 observations within each subgroup. A job satisfaction survey was administered during March-April 2012. Delivery method was the only classification showing significant differences in response rates and average response times. The postal mail procedure accounted for the highest response rates of completed surveys, but the email method exhibited the quickest turnaround. A hybrid approach, consisting of a combination of postal and electronic means, showed the least favorable results. Postal mail was 2.9 times more cost effective than the email approach and 4.6 times more cost effective than the hybrid approach. Researchers seeking to increase practicing pharmacists' survey participation and reduce response time and related costs can benefit from the analytical procedures tested here. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Probabilistic Structures Analysis Methods (PSAM) for select space propulsion system components

    NASA Technical Reports Server (NTRS)

    1991-01-01

    The basic formulation for probabilistic finite element analysis is described and demonstrated on a few sample problems. This formulation is based on iterative perturbation that uses the factorized stiffness of the unperturbed system as the iteration preconditioner for obtaining the solution to the perturbed problem. This approach eliminates the need to compute, store and manipulate explicit partial derivatives of the element matrices and force vector, which not only reduces memory usage considerably, but also greatly simplifies the coding and validation tasks. All aspects of the proposed formulation were combined in a demonstration problem using a simplified model of a curved turbine blade discretized with 48 shell elements, and having random pressure and temperature fields with partial correlation, random uniform thickness, and random stiffness at the root.
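
    The iterative-perturbation idea can be sketched on a small synthetic system (illustrative matrices, not the turbine-blade model): the unperturbed stiffness is factorized once and reused as the preconditioner, so each perturbed solution is obtained by repeated back-substitution without forming explicit derivatives.

    ```python
    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n))
    K0 = A @ A.T + n * np.eye(n)             # unperturbed stiffness (SPD)
    B = rng.standard_normal((n, n))
    dK = 0.05 * (B + B.T)                    # small random (symmetric) perturbation
    f = rng.standard_normal(n)

    factor = cho_factor(K0)                  # factorize once, reuse every iteration
    u = cho_solve(factor, f)                 # unperturbed solution as starting guess
    for it in range(50):
        u_new = cho_solve(factor, f - dK @ u)    # K0 u_{k+1} = f - dK u_k
        if np.linalg.norm(u_new - u) < 1e-12 * np.linalg.norm(u_new):
            u = u_new
            break
        u = u_new

    exact = np.linalg.solve(K0 + dK, f)
    print("iterations:", it + 1,
          "relative error:", np.linalg.norm(u - exact) / np.linalg.norm(exact))
    ```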

  18. Nonuniform sampling theorems for random signals in the linear canonical transform domain

    NASA Astrophysics Data System (ADS)

    Shuiqing, Xu; Congmei, Jiang; Yi, Chai; Youqiang, Hu; Lei, Huang

    2018-06-01

    Nonuniform sampling can be encountered in various practical processes because of random events or a poor timebase. The analysis and applications of nonuniform sampling for deterministic signals related to the linear canonical transform (LCT) have been studied extensively, but up to now no papers have been published regarding nonuniform sampling theorems for random signals related to the LCT. The aim of this article is to explore the nonuniform sampling and reconstruction of random signals associated with the LCT. First, some special nonuniform sampling models are briefly introduced. Second, based on these models, reconstruction theorems for random signals from various nonuniform samples associated with the LCT are derived. Finally, simulations are presented to verify the accuracy of the sampling theorems. In addition, potential practical applications of nonuniform sampling for random signals are also discussed.

  19. Sample-Based Surface Coloring

    PubMed Central

    Bürger, Kai; Krüger, Jens; Westermann, Rüdiger

    2011-01-01

    In this paper, we present a sample-based approach for surface coloring, which is independent of the original surface resolution and representation. To achieve this, we introduce the Orthogonal Fragment Buffer (OFB)—an extension of the Layered Depth Cube—as a high-resolution view-independent surface representation. The OFB is a data structure that stores surface samples at a nearly uniform distribution over the surface, and it is specifically designed to support efficient random read/write access to these samples. The data access operations have a complexity that is logarithmic in the depth complexity of the surface. Thus, compared to data access operations in tree data structures like octrees, data-dependent memory access patterns are greatly reduced. Due to the particular sampling strategy that is employed to generate an OFB, it also maintains sample coherence, and thus, exhibits very good spatial access locality. Therefore, OFB-based surface coloring performs significantly faster than sample-based approaches using tree structures. In addition, since in an OFB, the surface samples are internally stored in uniform 2D grids, OFB-based surface coloring can efficiently be realized on the GPU to enable interactive coloring of high-resolution surfaces. On the OFB, we introduce novel algorithms for color painting using volumetric and surface-aligned brushes, and we present new approaches for particle-based color advection along surfaces in real time. Due to the intermediate surface representation we choose, our method can be used to color polygonal surfaces as well as any other type of surface that can be sampled. PMID:20616392

  20. On grey levels in random CAPTCHA generation

    NASA Astrophysics Data System (ADS)

    Newton, Fraser; Kouritzin, Michael A.

    2011-06-01

    A CAPTCHA is an automatically generated test designed to distinguish between humans and computer programs; specifically, they are designed to be easy for humans but difficult for computer programs to pass in order to prevent the abuse of resources by automated bots. They are commonly seen guarding webmail registration forms, online auction sites, and preventing brute force attacks on passwords. In the following, we address the question: How does adding a grey level to random CAPTCHA generation affect the utility of the CAPTCHA? We treat the problem of generating the random CAPTCHA as one of random field simulation: An initial state of background noise is evolved over time using Gibbs sampling and an efficient algorithm for generating correlated random variables. This approach has already been found to yield highly-readable yet difficult-to-crack CAPTCHAs. We detail how the requisite parameters for introducing grey levels are estimated and how we generate the random CAPTCHA. The resulting CAPTCHA will be evaluated in terms of human readability as well as its resistance to automated attacks in the forms of character segmentation and optical character recognition.
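
    A minimal sketch of the random-field step (a Potts-style field with a handful of grey levels; the coupling strength, grid size and number of levels are illustrative, and this is not the authors' CAPTCHA generator): starting from independent noise, Gibbs sweeps update each pixel from its conditional distribution given its four neighbours, producing a correlated grey-level texture.

    ```python
    import numpy as np

    def gibbs_grey_field(shape=(64, 64), levels=4, beta=1.2, sweeps=30, seed=0):
        """Gibbs-sample a Potts-like field whose neighbours prefer equal grey levels."""
        rng = np.random.default_rng(seed)
        field = rng.integers(levels, size=shape)
        H, W = shape
        for _ in range(sweeps):
            for i in range(H):
                for j in range(W):
                    # Count how many of the four neighbours take each grey level.
                    neigh = [field[(i - 1) % H, j], field[(i + 1) % H, j],
                             field[i, (j - 1) % W], field[i, (j + 1) % W]]
                    counts = np.bincount(neigh, minlength=levels)
                    p = np.exp(beta * counts)        # conditional distribution
                    field[i, j] = rng.choice(levels, p=p / p.sum())
        return field

    field = gibbs_grey_field()
    print("grey-level histogram:", np.bincount(field.ravel(), minlength=4))
    ```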

  1. Generalized Ensemble Sampling of Enzyme Reaction Free Energy Pathways

    PubMed Central

    Wu, Dongsheng; Fajer, Mikolai I.; Cao, Liaoran; Cheng, Xiaolin; Yang, Wei

    2016-01-01

    Free energy path sampling plays an essential role in computational understanding of chemical reactions, particularly those occurring in enzymatic environments. Among a variety of molecular dynamics simulation approaches, the generalized ensemble sampling strategy is uniquely attractive for the fact that it not only can enhance the sampling of rare chemical events but also can naturally ensure consistent exploration of environmental degrees of freedom. In this review, we provide a tutorial-like tour of an emerging topic: generalized ensemble sampling of enzyme reaction free energy paths. The discussion is largely focused on our own studies, particularly ones based on the metadynamics free energy sampling method and the on-the-path random walk path sampling method. We hope that this mini-review will provide interested practitioners with some meaningful guidance for future algorithm formulation and application studies. PMID:27498634

  2. Effects of open marsh water management on numbers of larval salt marsh mosquitoes

    USGS Publications Warehouse

    James-Pirri, Mary-Jane; Ginsberg, Howard S.; Erwin, R. Michael; Taylor, Janith

    2009-01-01

    Open marsh water management (OMWM) is a commonly used approach to manage salt marsh mosquitoes that can obviate the need for pesticide application and, at the same time, partially restore natural functions of grid-ditched marshes. OMWM includes a variety of hydrologic manipulations, often tailored to the specific conditions on individual marshes, so the overall effectiveness of this approach is difficult to assess. Here, we report the results of controlled field trials to assess the effects of two approaches to OMWM on larval mosquito production at National Wildlife Refuges (NWR). A traditional OMWM approach, using pond construction and radial ditches, was used at Edwin B. Forsythe NWR in New Jersey, and a ditch-plugging approach was used at Parker River NWR in Massachusetts. Mosquito larvae were sampled from randomly placed stations on paired treatment and control marshes at each refuge. The proportion of sampling stations that were wet declined after OMWM at the Forsythe site, but not at the Parker River site. The proportion of samples with larvae present and the mean larval densities declined significantly at the treatment sites on both refuges relative to the control marshes. Percentage of control for the 2 yr posttreatment, compared with the 2 yr pretreatment, was >90% at both treatment sites.

  3. System reliability of randomly vibrating structures: Computational modeling and laboratory testing

    NASA Astrophysics Data System (ADS)

    Sundar, V. S.; Ammanagi, S.; Manohar, C. S.

    2015-09-01

    The problem of determination of the system reliability of randomly vibrating structures arises in many application areas of engineering. In this paper we discuss approaches based on Monte Carlo simulations and laboratory testing for tackling problems of time-variant system reliability estimation. The strategy we adopt is based on the application of Girsanov's transformation to the governing stochastic differential equations, which enables estimation of the probability of failure with a significantly smaller number of samples than is needed in a direct simulation study. Notably, we show that the ideas from Girsanov's transformation based Monte Carlo simulations can be extended to conduct laboratory testing to assess the system reliability of engineering structures with a reduced number of samples and hence with reduced testing times. Illustrative examples include computational studies on a 10-degree-of-freedom nonlinear system model and laboratory/computational investigations on the road load response of an automotive system tested on a four-post test rig.

  4. Quantifying Adventitious Error in a Covariance Structure as a Random Effect

    PubMed Central

    Wu, Hao; Browne, Michael W.

    2017-01-01

    We present an approach to quantifying errors in covariance structures in which adventitious error, identified as the process underlying the discrepancy between the population and the structured model, is explicitly modeled as a random effect with a distribution, and the estimated dispersion parameter of this distribution gives a measure of misspecification. Analytical properties of the resultant procedure are investigated and the measure of misspecification is found to be related to the RMSEA. An algorithm is developed for numerical implementation of the procedure. The consistency and asymptotic sampling distributions of the estimators are established under a new asymptotic paradigm and an assumption weaker than the standard Pitman drift assumption. Simulations validate the asymptotic sampling distributions and demonstrate the importance of accounting for the variations in the parameter estimates due to adventitious error. Two examples are also given as illustrations. PMID:25813463

  5. Auxiliary Parameter MCMC for Exponential Random Graph Models

    NASA Astrophysics Data System (ADS)

    Byshkin, Maksym; Stivala, Alex; Mira, Antonietta; Krause, Rolf; Robins, Garry; Lomi, Alessandro

    2016-11-01

    Exponential random graph models (ERGMs) are a well-established family of statistical models for analyzing social networks. Computational complexity has so far limited the appeal of ERGMs for the analysis of large social networks. Efficient computational methods are highly desirable in order to extend the empirical scope of ERGMs. In this paper we report results of a research project on the development of snowball sampling methods for ERGMs. We propose an auxiliary parameter Markov chain Monte Carlo (MCMC) algorithm for sampling from the relevant probability distributions. The method is designed to decrease the number of allowed network states without worsening the mixing of the Markov chains, and suggests a new approach for the development of MCMC samplers for ERGMs. We demonstrate the method on both simulated and actual (empirical) network data and show that it reduces CPU time for parameter estimation by an order of magnitude compared to current MCMC methods.

  6. On Adding Structure to Unstructured Overlay Networks

    NASA Astrophysics Data System (ADS)

    Leitão, João; Carvalho, Nuno A.; Pereira, José; Oliveira, Rui; Rodrigues, Luís

    Unstructured peer-to-peer overlay networks are very resilient to churn and topology changes, while requiring little maintenance cost. They are therefore a suitable infrastructure for building highly scalable, large-scale services in dynamic networks. Typically, the overlay topology is defined by a peer sampling service that aims at maintaining, in each process, a random partial view of the peers in the system. The resulting random unstructured topology is suboptimal when a specific performance metric is considered. On the other hand, structured approaches (for instance, a spanning tree) may optimize a given target performance metric but are highly fragile. In fact, the cost of maintaining structures with strong constraints may easily become prohibitive in highly dynamic networks. This chapter discusses different techniques that aim at combining the advantages of unstructured and structured networks. Namely, we focus on two distinct approaches, one based on optimizing the overlay and another based on optimizing the gossip mechanism itself.

  7. High energy X-ray phase and dark-field imaging using a random absorption mask.

    PubMed

    Wang, Hongchang; Kashyap, Yogesh; Cai, Biao; Sawhney, Kawal

    2016-07-28

    High energy X-ray imaging has a unique advantage over conventional X-ray imaging, since it enables higher penetration into materials with significantly reduced radiation damage. However, the absorption contrast in the high energy region is considerably low due to the reduced X-ray absorption cross section of most materials. Even though X-ray phase and dark-field imaging techniques can provide substantially increased contrast and complementary information, fabricating dedicated optics for high energies still remains a challenge. To address this issue, we present an alternative X-ray imaging approach that produces transmission, phase and scattering signals at high X-ray energies by using a random absorption mask. Importantly, in addition to the synchrotron radiation source, this approach has been demonstrated for practical imaging applications with a laboratory-based microfocus X-ray source. This new imaging method could be potentially useful for studying thick samples or heavy materials in advanced materials science research.

  8. Product demand forecasts using wavelet kernel support vector machine and particle swarm optimization in manufacture system

    NASA Astrophysics Data System (ADS)

    Wu, Qi

    2010-03-01

    Demand forecasts play a crucial role in supply chain management. The future demand for a certain product is the basis for the respective replenishment systems. For demand series with small samples, seasonality, nonlinearity, randomness and fuzziness, existing support vector kernels do not approximate the random curve of the sales time series well in the quadratic continuous integral space. In this paper, we present a hybrid intelligent system combining the wavelet kernel support vector machine and particle swarm optimization for demand forecasting. The results of application to car sale series forecasting show that the forecasting approach based on the hybrid PSOWv-SVM model is effective and feasible. A comparison between the method proposed in this paper and other methods is also given, which shows that, for the discussed example, this method outperforms the hybrid PSOv-SVM and other traditional methods.
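
    As a concrete illustration of the wavelet-kernel idea, the sketch below plugs a Morlet-type wavelet kernel, a common construction in the wavelet kernel SVM literature, into a standard support vector regressor on a toy seasonal demand series. The kernel form, lag structure and parameters are assumptions for illustration; the paper's exact kernel and its PSO-based parameter tuning are not reproduced.

```python
import numpy as np
from sklearn.svm import SVR

def wavelet_kernel(X, Y, a=1.0):
    # Morlet-type wavelet kernel: K(x, y) = prod_i cos(1.75*d_i/a) * exp(-d_i^2/(2*a^2)),
    # with d = x - y; one common choice, not necessarily the paper's exact kernel.
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    d = X[:, None, :] - Y[None, :, :]
    return np.prod(np.cos(1.75 * d / a) * np.exp(-d**2 / (2 * a**2)), axis=-1)

# Toy seasonal demand series turned into a supervised problem with 4 lagged inputs.
t = np.arange(60)
demand = 10 + 2 * np.sin(2 * np.pi * t / 12) + np.random.default_rng(0).normal(0, 0.3, 60)
X = np.column_stack([demand[i:i + 48] for i in range(4)])
y = demand[4:52]

model = SVR(kernel=wavelet_kernel, C=10.0).fit(X, y)
print(model.predict(X[:3]))
```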

  9. Aqueous Cleaning and Validation for Space Shuttle Propulsion Hardware at the White Sands Test Facility

    NASA Technical Reports Server (NTRS)

    Hornung, Steven D.; Biesinger, Paul; Kirsch, Mike; Beeson, Harold; Leuders, Kathy

    1999-01-01

    The NASA White Sands Test Facility (WSTF) has developed an entirely aqueous final cleaning and verification process to replace the current chlorofluorocarbon (CFC) 113 based process. This process has been accepted for final cleaning and cleanliness verification of WSTF ground support equipment. The aqueous process relies on ultrapure water at 50 C (323 K) and ultrasonic agitation for the removal of organic compounds and particulates. Cleanliness is verified by determining the total organic carbon (TOC) content and by filtration with particulate counting. The effectiveness of the aqueous methods for detecting hydrocarbon contamination and particulates was compared with that of the accepted CFC 113 sampling procedures. Testing with known contaminants, such as hydraulic fluid and cutting and lubricating oils, was performed to establish a correlation between aqueous TOC and CFC 113 nonvolatile residue (NVR). Particulate sampling was also performed on cleaned batches of hardware that were randomly separated and sampled by the two methods. This paper presents the approach and results, and discusses the issues in establishing the equivalence of aqueous sampling to CFC 113 sampling, while describing the approach for implementing aqueous techniques on Space Shuttle Propulsion hardware.

  10. The effect of Think Pair Share (TPS) using scientific approach on students’ self-confidence and mathematical problem-solving

    NASA Astrophysics Data System (ADS)

    Rifa’i, A.; Lestari, H. P.

    2018-03-01

    This study was designed to determine the effects of Think Pair Share (TPS) using the Scientific Approach on students' self-confidence and mathematical problem-solving. A quasi-experimental pre-test post-test non-equivalent group design was used as the basis for this study. A self-confidence questionnaire and a problem-solving test were used to measure the two variables. Two classes of the first grade in a religious senior high school (MAN) in Indonesia were randomly selected for this study. The control group was taught the sequence and series material from the mathematics textbook in the traditional way, while the experimental group was taught with the TPS using scientific approach learning method. For the analysis of data on students' problem-solving skill and self-confidence, the One-Sample t-Test, Independent Samples t-Test, and Multivariate Analysis of Variance (MANOVA) were used. The results showed that (1) TPS using a scientific approach and traditional learning both had positive effects, and (2) TPS using scientific approach learning had a more significant effect on students' self-confidence and problem-solving skill than traditional learning.

  11. Problem Posing with Realistic Mathematics Education Approach in Geometry Learning

    NASA Astrophysics Data System (ADS)

    Mahendra, R.; Slamet, I.; Budiyono

    2017-09-01

    One of the difficulties students face in learning geometry is the topic of planes, which requires them to understand abstract material. The aim of this research is to determine the effect of the Problem Posing learning model with a Realistic Mathematics Education Approach on geometry learning. This quasi-experimental research was conducted in one of the junior high schools in Karanganyar, Indonesia. The sample was taken using a stratified cluster random sampling technique. The results of this research indicate that the Problem Posing learning model with a Realistic Mathematics Education Approach can significantly improve students' conceptual understanding in geometry learning, especially on plane topics. This is because, in Problem Posing with a Realistic Mathematics Education Approach, students become active in constructing their knowledge, posing problems, and solving problems in realistic contexts, so it is easier for them to understand concepts and solve the problems. Therefore, the Problem Posing learning model with a Realistic Mathematics Education Approach is appropriate for mathematics learning, especially for geometry material. Furthermore, it can improve student achievement.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Lin, E-mail: godyalin@163.com; Singh, Uttam, E-mail: uttamsingh@hri.res.in; Pati, Arun K., E-mail: akpati@hri.res.in

    Compact expressions for the average subentropy and coherence are obtained for random mixed states that are generated via various probability measures. Surprisingly, our results show that the average subentropy of random mixed states approaches the maximum value of the subentropy, which is attained for the maximally mixed state, as we increase the dimension. In the special case of random mixed states sampled from the induced measure via partial tracing of random bipartite pure states, we establish the typicality of the relative entropy of coherence for random mixed states by invoking the concentration of measure phenomenon. Our results also indicate that mixed quantum states are less useful than pure quantum states in higher dimensions when we extract quantum coherence as a resource. This is because the average coherence of random mixed states is bounded uniformly, whereas the average coherence of random pure states increases with increasing dimension. As an important application, we establish the typicality of the relative entropy of entanglement and distillable entanglement for a specific class of random bipartite mixed states. In particular, most of the random states in this specific class have relative entropy of entanglement and distillable entanglement equal to some fixed number (to within an arbitrarily small error), thereby greatly reducing the complexity of computing these entanglement measures for this specific class of mixed states.
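
    The induced-measure construction mentioned above (partial tracing of Haar-random bipartite pure states) and the relative entropy of coherence are both straightforward to simulate numerically. The sketch below, with hypothetical dimensions and sample counts, illustrates the concentration-of-measure behaviour: coherence values of independently drawn random mixed states cluster tightly around their mean.

```python
import numpy as np

def random_mixed_state(d, d_env, rng):
    # Partial trace of a Haar-random bipartite pure state of dimension d x d_env.
    psi = rng.normal(size=(d, d_env)) + 1j * rng.normal(size=(d, d_env))
    psi /= np.linalg.norm(psi)
    return psi @ psi.conj().T

def entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

def rel_entropy_coherence(rho):
    # C_r(rho) = S(diag(rho)) - S(rho), the relative entropy of coherence.
    return entropy(np.diag(np.diag(rho))) - entropy(rho)

rng = np.random.default_rng(1)
d = 8   # hypothetical subsystem dimension
samples = [rel_entropy_coherence(random_mixed_state(d, d, rng)) for _ in range(200)]
print(f"mean coherence {np.mean(samples):.3f}, spread {np.std(samples):.3f}")
```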

  13. Study of Dynamic Characteristics of Aeroelastic Systems Utilizing Randomdec Signatures

    NASA Technical Reports Server (NTRS)

    Chang, C. S.

    1975-01-01

    The feasibility of utilizing the random decrement method in conjunction with a signature analysis procedure to determine the dynamic characteristics of an aeroelastic system, for the purpose of on-line prediction of the potential onset of flutter, was examined. Digital computer programs were developed to simulate sampled response signals of a two-mode aeroelastic system. Simulated response data were used to test the random decrement method. A special curve-fit approach was developed for analyzing the resulting signatures. A number of numerical 'experiments' were conducted on the combined processes. The method is capable of determining frequency and damping values accurately from randomdec signatures of carefully selected lengths.

  14. A rational approach to legacy data validation when transitioning between electronic health record systems.

    PubMed

    Pageler, Natalie M; Grazier G'Sell, Max Jacob; Chandler, Warren; Mailes, Emily; Yang, Christine; Longhurst, Christopher A

    2016-09-01

    The objective of this project was to use statistical techniques to determine the completeness and accuracy of data migrated during an electronic health record conversion. Data validation during migration consists of mapped record testing and validation of a sample of the data for completeness and accuracy. We statistically determined a randomized sample size for each data type based on the desired confidence level and error limits. The only error identified in the post go-live period was a failure to migrate some clinical notes, which was unrelated to the validation process. No errors in the migrated data were found during the 12-month post-implementation period. Compared to the typical industry approach, we have demonstrated that a statistical approach to sample size for data validation can ensure consistent confidence levels while maximizing the efficiency of the validation process during a major electronic health record conversion. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
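
    The abstract does not give the sample-size formula used; a standard choice consistent with "desired confidence level and error limits" is the normal-approximation sample size for estimating an error proportion. The helper below is a plausible reading of that criterion, not the authors' exact calculation, and its default values are illustrative.

```python
from math import ceil
from statistics import NormalDist

def validation_sample_size(confidence=0.95, error_limit=0.02, expected_error_rate=0.5):
    """Records to sample per data type so the error rate is estimated to +/- error_limit."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    p = expected_error_rate                              # 0.5 is the most conservative choice
    return ceil(z ** 2 * p * (1 - p) / error_limit ** 2)

print(validation_sample_size())   # 2401 records at 95% confidence and +/-2% error
```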

  15. Sample size calculation in cost-effectiveness cluster randomized trials: optimal and maximin approaches.

    PubMed

    Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F

    2014-07-10

    In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials in which the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters usually is not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.
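
    For intuition on the kind of trade-off being optimized, the classical single-outcome formula for the optimal number of subjects per cluster balances the cost of recruiting a cluster against the cost of measuring a subject. This is only a simplified illustration: the paper's cost-effectiveness setting additionally involves ICCs for both costs and effects, their correlations and the variance ratio, which the sketch below omits, and the cost figures are hypothetical.

```python
from math import sqrt

def optimal_cluster_size(cost_per_cluster, cost_per_subject, icc):
    # Classical optimum for a single continuous outcome:
    # n* = sqrt(cost_per_cluster * (1 - icc) / (cost_per_subject * icc)).
    return sqrt(cost_per_cluster * (1 - icc) / (cost_per_subject * icc))

# Hypothetical costs: a cluster costs 10x a subject; ICC = 0.05 -> about 14 subjects/cluster.
print(round(optimal_cluster_size(cost_per_cluster=1000, cost_per_subject=100, icc=0.05)))
```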

  16. A random effects meta-analysis model with Box-Cox transformation.

    PubMed

    Yamaguchi, Yusuke; Maruo, Kazushi; Partlett, Christopher; Riley, Richard D

    2017-07-19

    In a random effects meta-analysis model, true treatment effects for each study are routinely assumed to follow a normal distribution. However, normality is a restrictive assumption, and misspecification of the random effects distribution may result in a misleading estimate of the overall mean treatment effect, an inappropriate quantification of heterogeneity across studies and a wrongly symmetric prediction interval. We focus on problems caused by an inappropriate normality assumption for the random effects distribution, and propose a novel random effects meta-analysis model in which a Box-Cox transformation is applied to the observed treatment effect estimates. The proposed model aims to normalise the overall distribution of observed treatment effect estimates, which is the sum of the within-study sampling distributions and the random effects distribution. When sampling distributions are approximately normal, non-normality in the overall distribution will be mainly due to the random effects distribution, especially when the between-study variation is large relative to the within-study variation. The Box-Cox transformation addresses this flexibly according to the observed departure from normality. We use a Bayesian approach for estimating parameters in the proposed model, and suggest summarising the meta-analysis results by an overall median, an interquartile range and a prediction interval. The model can be applied to any kind of variable once the treatment effect estimate is defined from the variable. A simulation study suggested that when the overall distribution of treatment effect estimates is skewed, the overall mean and conventional I2 from the normal random effects model could be inappropriate summaries, and the proposed model helped reduce this issue. We illustrated the proposed model using two examples, which revealed some important differences in summary results, heterogeneity measures and prediction intervals from the normal random effects model. The random effects meta-analysis with the Box-Cox transformation may be an important tool for examining the robustness of traditional meta-analysis results against skewness in the observed treatment effect estimates. Further critical evaluation of the method is needed.
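
    The transformation at the heart of the model is the ordinary Box-Cox family, indexed by a parameter lambda that the paper estimates within its Bayesian fit. The sketch below only shows the transformation itself on made-up skewed effect estimates; it assumes positive values (in practice a shift may be needed) and does not reproduce the paper's model fitting or back-transformation of summaries.

```python
import numpy as np

def box_cox(y, lam):
    # Box-Cox transformation: log(y) when lambda = 0, else (y**lambda - 1) / lambda.
    y = np.asarray(y, dtype=float)   # values must be positive (shift beforehand if not)
    return np.log(y) if np.isclose(lam, 0.0) else (y ** lam - 1.0) / lam

effects = np.array([0.4, 0.6, 0.7, 0.9, 1.5, 2.8, 4.1])   # made-up, right-skewed estimates
for lam in (1.0, 0.5, 0.0):
    transformed = box_cox(effects, lam)
    # mean minus median as a crude skewness indicator; it shrinks as lambda decreases
    print(lam, round(float(np.mean(transformed) - np.median(transformed)), 3))
```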

  17. Empirical likelihood-based tests for stochastic ordering

    PubMed Central

    BARMI, HAMMOU EL; MCKEAGUE, IAN W.

    2013-01-01

    This paper develops an empirical likelihood approach to testing for the presence of stochastic ordering among univariate distributions based on independent random samples from each distribution. The proposed test statistic is formed by integrating a localized empirical likelihood statistic with respect to the empirical distribution of the pooled sample. The asymptotic null distribution of this test statistic is found to have a simple distribution-free representation in terms of standard Brownian bridge processes. The approach is used to compare the lengths of rule of Roman Emperors over various historical periods, including the “decline and fall” phase of the empire. In a simulation study, the power of the proposed test is found to improve substantially upon that of a competing test due to El Barmi and Mukerjee. PMID:23874142

  18. A novel approach to non-biased systematic random sampling: a stereologic estimate of Purkinje cells in the human cerebellum.

    PubMed

    Agashiwala, Rajiv M; Louis, Elan D; Hof, Patrick R; Perl, Daniel P

    2008-10-21

    Non-biased systematic sampling using the principles of stereology provides accurate quantitative estimates of objects within neuroanatomic structures. However, the basic principles of stereology are not optimally suited for counting objects that selectively exist within a limited but complex and convoluted portion of the sample, such as occurs when counting cerebellar Purkinje cells. In an effort to quantify Purkinje cells in association with certain neurodegenerative disorders, we developed a new method for stereologic sampling of the cerebellar cortex, involving calculating the volume of the cerebellar tissues, identifying and isolating the Purkinje cell layer and using this information to extrapolate non-biased systematic sampling data to estimate the total number of Purkinje cells in the tissues. Using this approach, we counted Purkinje cells in the right cerebella of four human male control specimens, aged 41, 67, 70 and 84 years, and estimated the total Purkinje cell number for the four entire cerebella to be 27.03, 19.74, 20.44 and 22.03 million cells, respectively. The precision of the method is seen when comparing the density of the cells within the tissue: 266,274, 173,166, 167,603 and 183,575 cells/cm3, respectively. Prior literature documents Purkinje cell counts ranging from 14.8 to 30.5 million cells. These data demonstrate the accuracy of our approach. Our novel approach, which offers an improvement over previous methodologies, is of value for quantitative work of this nature. This approach could be applied to morphometric studies of other similarly complex tissues as well.
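
    The extrapolation step described above is simple arithmetic: the estimated total is the measured Purkinje cell density multiplied by the reference volume of the sampled tissue. The back-of-envelope check below reuses the published density and total for the youngest specimen; the implied volume is a derived figure, not a value reported in the abstract.

```python
# total cells ~= density (cells/cm^3) x reference volume (cm^3)
density_cells_per_cm3 = 266_274     # specimen aged 41 years, from the abstract
estimated_total = 27.03e6           # published total for the same specimen
implied_volume_cm3 = estimated_total / density_cells_per_cm3
print(f"implied reference volume: {implied_volume_cm3:.1f} cm^3")   # about 101.5 cm^3
```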

  19. A novel approach to non-biased systematic random sampling: A stereologic estimate of Purkinje cells in the human cerebellum

    PubMed Central

    Agashiwala, Rajiv M.; Louis, Elan D.; Hof, Patrick R.; Perl, Daniel P.

    2010-01-01

    Non-biased systematic sampling using the principles of stereology provides accurate quantitative estimates of objects within neuroanatomic structures. However, the basic principles of stereology are not optimally suited for counting objects that selectively exist within a limited but complex and convoluted portion of the sample, such as occurs when counting cerebellar Purkinje cells. In an effort to quantify Purkinje cells in association with certain neurodegenerative disorders, we developed a new method for stereologic sampling of the cerebellar cortex, involving calculating the volume of the cerebellar tissues, identifying and isolating the Purkinje cell layer and using this information to extrapolate non-biased systematic sampling data to estimate the total number of Purkinje cells in the tissues. Using this approach, we counted Purkinje cells in the right cerebella of four human male control specimens, aged 41, 67, 70 and 84 years, and estimated the total Purkinje cell number for the four entire cerebella to be 27.03, 19.74, 20.44 and 22.03 million cells, respectively. The precision of the method is seen when comparing the density of the cells within the tissue: 266,274, 173,166, 167,603 and 183,575 cells/cm3, respectively. Prior literature documents Purkinje cell counts ranging from 14.8 to 30.5 million cells. These data demonstrate the accuracy of our approach. Our novel approach, which offers an improvement over previous methodologies, is of value for quantitative work of this nature. This approach could be applied to morphometric studies of other similarly complex tissues as well. PMID:18725208

  20. A sub-sampled approach to extremely low-dose STEM

    DOE PAGES

    Stevens, A.; Luzi, L.; Yang, H.; ...

    2018-01-22

    The inpainting of deliberately and randomly sub-sampled images offers a potential means to image specimens at high resolution and under extremely low-dose conditions (≤1 e⁻/Å²) using a scanning transmission electron microscope. We show that deliberate sub-sampling acquires images at least an order of magnitude faster than conventional low-dose methods for an equivalent electron dose. More importantly, when adaptive sub-sampling is implemented to acquire the images, there is a significant increase in the resolution and sensitivity which accompanies the increase in imaging speed. Lastly, we demonstrate the potential of this method for beam sensitive materials and in-situ observations by experimentally imaging the node distribution in a metal-organic framework.

  1. A sub-sampled approach to extremely low-dose STEM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stevens, A.; Luzi, L.; Yang, H.

    The inpainting of deliberately and randomly sub-sampled images offers a potential means to image specimens at high resolution and under extremely low-dose conditions (≤1 e⁻/Å²) using a scanning transmission electron microscope. We show that deliberate sub-sampling acquires images at least an order of magnitude faster than conventional low-dose methods for an equivalent electron dose. More importantly, when adaptive sub-sampling is implemented to acquire the images, there is a significant increase in the resolution and sensitivity which accompanies the increase in imaging speed. Lastly, we demonstrate the potential of this method for beam sensitive materials and in-situ observations by experimentally imaging the node distribution in a metal-organic framework.

  2. Assessing the quality of reproductive health services in Egypt via exit interviews.

    PubMed

    Zaky, Hassan H M; Khattab, Hind A S; Galal, Dina

    2007-05-01

    This study assesses the quality of reproductive health services using client satisfaction exit interviews among three groups of primary health care units run by the Ministry of Health and Population of Egypt. Each group applied a different model of intervention. The Ministry will use the results in assessing the reproductive health component of its health sector reform program and in benefiting from the strengths of other models of intervention. The sample was selected in two stages. First, a stratified random sampling procedure was used to select the health units. Then the sample of female clients in each health unit was selected using a systematic random approach, whereby one in every two women visiting the unit was approached. All women in the sample coming for reproductive health services were included in the analysis. The results showed that reproductive health beneficiaries at the units implementing the new health sector reform program were more satisfied with the quality of services. Still, there were various areas where clients showed significant dissatisfaction, such as waiting time, interior furnishings, cleanliness of the units and consultation time. The study showed that the staff of these units did not provide as conducive a social environment as the other interventions did. A significant proportion of women expressed their intention to go to private physicians owing to their flexible working hours and variety of specializations. Beneficiaries were generally more satisfied with the quality of health services after attending the reformed units than the other types of units, but the generalization did not fully apply. Areas of weakness are identified.

  3. Small Sample Performance of Bias-corrected Sandwich Estimators for Cluster-Randomized Trials with Binary Outcomes

    PubMed Central

    Li, Peng; Redden, David T.

    2014-01-01

    SUMMARY The sandwich estimator in the generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With the t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size at nominal levels even when the number of clusters is as low as 10, and is robust to moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for CRTs with binary outcomes. The power levels predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that, with appropriate control of type I error rates under small sample sizes, the GEE approach is recommended for CRTs with binary outcomes due to its fewer assumptions and robustness to misspecification of the covariance structure. PMID:25345738

  4. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption.

    PubMed

    Hartwig, Fernando Pires; Davey Smith, George; Bowden, Jack

    2017-12-01

    Mendelian randomization (MR) is being increasingly used to strengthen causal inference in observational studies. Availability of summary data of genetic associations for a variety of phenotypes from large genome-wide association studies (GWAS) allows straightforward application of MR using summary data methods, typically in a two-sample design. In addition to the conventional inverse variance weighting (IVW) method, recently developed summary data MR methods, such as the MR-Egger and weighted median approaches, allow a relaxation of the instrumental variable assumptions. Here, a new method - the mode-based estimate (MBE) - is proposed to obtain a single causal effect estimate from multiple genetic instruments. The MBE is consistent when the largest number of similar (identical in infinite samples) individual-instrument causal effect estimates comes from valid instruments, even if the majority of instruments are invalid. We evaluate the performance of the method in simulations designed to mimic the two-sample summary data setting, and demonstrate its use by investigating the causal effect of plasma lipid fractions and urate levels on coronary heart disease risk. The MBE presented less bias and lower type-I error rates than other methods under the null in many situations. Its power to detect a causal effect was smaller compared with the IVW and weighted median methods, but was larger than that of MR-Egger regression, with sample size requirements typically smaller than those available from GWAS consortia. The MBE relaxes the instrumental variable assumptions, and should be used in combination with other approaches in sensitivity analyses. © The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association
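
    At its core, the mode-based estimate is the mode of a kernel-smoothed density of the per-instrument Wald ratios. The sketch below shows that core idea on made-up summary statistics; the published MBE additionally uses weighting and a bandwidth tuning parameter, which are omitted here, so this is only a simplified illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

def mode_based_estimate(beta_exposure, beta_outcome, grid_size=10_000):
    # Simplified MBE: mode of a kernel density over the per-instrument Wald ratios.
    ratios = np.asarray(beta_outcome) / np.asarray(beta_exposure)
    kde = gaussian_kde(ratios)
    grid = np.linspace(ratios.min(), ratios.max(), grid_size)
    return grid[np.argmax(kde(grid))]

# Made-up summary data: most instruments imply an effect near 0.3; two are invalid.
bx = np.array([0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.12])
by = np.array([0.030, 0.037, 0.023, 0.034, 0.026, 0.080, 0.090])
print(round(float(mode_based_estimate(bx, by)), 2))   # close to 0.3 despite the outliers
```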

  5. Prevalence of substandard and falsified artemisinin-based combination antimalarial medicines on Bioko Island, Equatorial Guinea

    PubMed Central

    Kaur, Harparkash; Allan, Elizabeth Louise; Mamadu, Ibrahim; Hall, Zoe; Green, Michael D; Swamidos, Isabel; Dwivedi, Prabha; Culzoni, Maria Julia; Fernandez, Facundo M; Garcia, Guillermo; Hergott, Dianna; Monti, Feliciano

    2017-01-01

    Introduction Poor-quality artemisinin-containing antimalarials (ACAs), including falsified and substandard formulations, pose serious health concerns in malaria endemic countries. They can harm patients, contribute to the rise in drug resistance and increase the public’s mistrust of health systems. Systematic assessment of drug quality is needed to gain knowledge on the prevalence of the problem and to provide Ministries of Health with evidence on which local regulators can take action. Methods We used three sampling approaches to purchase 677 ACAs from 278 outlets on Bioko Island, Equatorial Guinea as follows: a convenience survey using mystery clients (n=16 outlets, 31 samples), a full island-wide survey using mystery clients (n=174 outlets, 368 samples) and a randomised survey using an overt sampling approach (n=88 outlets, 278 samples). The stated active pharmaceutical ingredients (SAPIs) were assessed using high-performance liquid chromatography and confirmed by mass spectrometry at three independent laboratories. Results Content analysis showed 91.0% of ACAs were of acceptable quality, 1.6% were substandard and 7.4% falsified. No degraded medicines were detected. The prevalence of medicines without the SAPIs was higher for ACAs purchased in the convenience survey than for the estimates obtained using the full island-wide survey-mystery client and randomised-overt sampling approaches, which gave comparable results. However, the availability of purchased artesunate monotherapies differed substantially according to the sampling approach used (convenience, 45.2%; full island-wide survey-mystery client, 32.6%; randomised-overt sampling approach, 21.9%). Of concern is that 37.1% (n=62) of these were falsified. Conclusion Falsified ACAs were found on Bioko Island, with the prevalence ranging between 6.1% and 16.1%, depending on the sampling method used. These findings underscore the vital need for national authorities to track the scale of ineffective medicines that jeopardise treatment of life-threatening diseases, and the value of a representative sampling approach for measuring the true prevalence of poor-quality medicines. PMID:29082025

  6. The Ambivalence of the Israeli Academic Profession: Research vs. Teaching. The Academic Profession Approaches the Twenty-First Century: the Carnegie Foundation International Survey, Symposium.

    ERIC Educational Resources Information Center

    Gottlieb, Esther E.; And Others

    Attitudes of Israeli senior faculty concerning research and teaching were evaluated using the Carnegie international questionnaire. Approximately one third of the total faculty population in Israel was randomly sampled, but stratified by institutional size. The questionnaire was sent to 2,225 faculty and 502 returned completed forms (22.56…

  7. Creating Message Strategies for an AIDS Campaign: A Survey of the Basis of Student Awareness.

    ERIC Educational Resources Information Center

    Brislin, Tom; Miyamoto, Craig T.

    A study assessed the general knowledge of AIDS and its prevention among college students, and determined the source of that knowledge. A class of 17 senior-level journalism students used a focus group approach to select the most useful questions about AIDS and its prevention. The survey was administered by telephone to a random sample of 372…

  8. An Investigation Toward the Possible Expansion of the Adult Continuing Education Program in Whitley County, Indiana.

    ERIC Educational Resources Information Center

    Haworth, Richard Julius

    The major objectives of this study were to determine the adult education needs of Whitley County, Indiana; to ascertain how well these needs were being met; and to propose one or more approaches to enhance already existing efforts. Of the 352 randomly selected homes included in the sample, 268 returned completed questionnaires. Of the 268, 72%…

  9. Occupational Safety & Health. Inspectors' Opinions on Improving OSHA Effectiveness. Fact Sheet for Subcommittee on Health and Safety, Committee on Education and Labor, House of Representatives.

    ERIC Educational Resources Information Center

    General Accounting Office, Washington, DC. Div. of Human Resources.

    Questionnaires gathered opinions of all Occupational Safety and Health Administration (OSHA) field supervisors and a randomly selected sample of one-third of the compliance officers about OSHA's approach to improving workplace safety and health. Major topics addressed were enforcement, safety and health standards, education and training, employer…

  10. High resolution satellite remote sensing used in a stratified random sampling scheme to quantify the constituent land cover components of the shifting cultivation mosaic of the Democratic Republic of Congo

    NASA Astrophysics Data System (ADS)

    Molinario, G.; Hansen, M.; Potapov, P.

    2016-12-01

    High resolution satellite imagery obtained from the National Geospatial Intelligence Agency through NASA was used to photo-interpret sample areas within the DRC. The area sampled is a stratification of the forest cover loss from circa 2014 that occurred either completely within the previously mapped homogeneous area of the Rural Complex, at its interface with primary forest, or in isolated forest perforations. Previous research resulted in a map of these areas that contextualizes forest loss depending on where it occurs and with what spatial density, leading to a better understanding of the real impacts of livelihood shifting cultivation on forest degradation. The stratified random sampling approach of these areas allows the characterization of the constituent land cover types within these areas, and their variability throughout the DRC. Shifting cultivation has a variable forest degradation footprint in the DRC depending on the many factors that drive it, but its role in forest degradation and deforestation has been disputed, leading us to investigate and quantify the clearing and reuse rates within the strata throughout the country.

  11. Accounting for randomness in measurement and sampling in studying cancer cell population dynamics.

    PubMed

    Ghavami, Siavash; Wolkenhauer, Olaf; Lahouti, Farshad; Ullah, Mukhtar; Linnebacher, Michael

    2014-10-01

    Knowing the expected temporal evolution of the proportion of different cell types in sample tissues gives an indication of the progression of the disease and its possible response to drugs. Such systems have been modelled using Markov processes. We here consider an experimentally realistic scenario in which transition probabilities are estimated from noisy cell population size measurements. Using aggregated data from FACS measurements, we develop MMSE and ML estimators and formulate two problems to find the minimum number of required samples and measurements that guarantee the accuracy of predicted population sizes. Our numerical results show that the transition probabilities and steady states obtained with the standard deterministic approach differ widely from the real values when the measurements are noisy. This provides support for our argument that, for the analysis of FACS data, one should consider the observed state as a random variable. The second problem we address concerns the consequences of estimating the probability of a cell being in a particular state from measurements of a small population of cells. We show how the uncertainty arising from small sample sizes can be captured by a distribution for the state probability.

  12. Urn models for response-adaptive randomized designs: a simulation study based on a non-adaptive randomized trial.

    PubMed

    Ghiglietti, Andrea; Scarale, Maria Giovanna; Miceli, Rosalba; Ieva, Francesca; Mariani, Luigi; Gavazzi, Cecilia; Paganoni, Anna Maria; Edefonti, Valeria

    2018-03-22

    Recently, response-adaptive designs have been proposed in randomized clinical trials to achieve ethical and/or cost advantages by using sequential accrual information collected during the trial to dynamically update the probabilities of treatment assignments. In this context, urn models-where the probability to assign patients to treatments is interpreted as the proportion of balls of different colors available in a virtual urn-have been used as response-adaptive randomization rules. We propose the use of Randomly Reinforced Urn (RRU) models in a simulation study based on a published randomized clinical trial on the efficacy of home enteral nutrition in cancer patients after major gastrointestinal surgery. We compare results with the RRU design with those previously published with the non-adaptive approach. We also provide a code written with the R software to implement the RRU design in practice. In detail, we simulate 10,000 trials based on the RRU model in three set-ups of different total sample sizes. We report information on the number of patients allocated to the inferior treatment and on the empirical power of the t-test for the treatment coefficient in the ANOVA model. We carry out a sensitivity analysis to assess the effect of different urn compositions. For each sample size, in approximately 75% of the simulation runs, the number of patients allocated to the inferior treatment by the RRU design is lower, as compared to the non-adaptive design. The empirical power of the t-test for the treatment effect is similar in the two designs.
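
    The allocation mechanism of a Randomly Reinforced Urn is short enough to sketch directly: draw a colour with probability proportional to the current urn composition, assign the patient to that arm, and reinforce the urn by the observed response. The reinforcement rule, response models and initial urn composition below are illustrative assumptions and do not reproduce the R code referred to in the abstract.

```python
import random

def simulate_rru(n_patients, response_a, response_b, init=(1.0, 1.0)):
    """Minimal sketch of a Randomly Reinforced Urn (RRU) allocation rule."""
    urn = {"A": init[0], "B": init[1]}
    allocations = []
    for _ in range(n_patients):
        p_a = urn["A"] / (urn["A"] + urn["B"])
        arm = "A" if random.random() < p_a else "B"
        response = response_a() if arm == "A" else response_b()
        urn[arm] += max(response, 0.0)    # reinforce the drawn colour by the response
        allocations.append(arm)
    return allocations, urn

random.seed(0)
alloc, final_urn = simulate_rru(
    200,
    response_a=lambda: random.gauss(2.0, 1.0),   # hypothetical superior arm
    response_b=lambda: random.gauss(1.0, 1.0),
)
# Allocation probabilities drift toward the better-performing arm over time,
# which is the ethical advantage the abstract describes.
print(alloc.count("A"), alloc.count("B"))
```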

  13. Speeding up Coarse Point Cloud Registration by Threshold-Independent Baysac Match Selection

    NASA Astrophysics Data System (ADS)

    Kang, Z.; Lindenbergh, R.; Pu, S.

    2016-06-01

    This paper presents an algorithm for the automatic registration of terrestrial point clouds by match selection using an efficient conditional sampling method -- threshold-independent BaySAC (BAYes SAmpling Consensus) -- and employs the error metric of the average point-to-surface residual to reduce the random measurement error and thereby approach the real registration error. BaySAC and other basic sampling algorithms usually require an artificially determined threshold by which inlier points are identified, which leads to a threshold-dependent verification process. Therefore, we applied the LMedS method to construct the cost function used to determine the optimum model, reducing the influence of human factors and improving the robustness of the model estimate. Point-to-point and point-to-surface error metrics are the most commonly used. However, the point-to-point error in general consists of at least two components: random measurement error and systematic error resulting from a remaining error in the estimated rigid body transformation. Thus we employ the average point-to-surface residual to evaluate the registration accuracy. The proposed approaches, together with a traditional RANSAC approach, are tested on four data sets acquired by three different scanners in terms of their computational efficiency and the quality of the final registration. The registration results show that the standard deviation of the average point-to-surface residuals is reduced from 1.4 cm (plain RANSAC) to 0.5 cm (threshold-independent BaySAC). The results also show that, compared to the performance of RANSAC, our BaySAC strategies require fewer iterations and lower computational cost when the hypothesis set is contaminated with more outliers.

  14. High resolution 4-D spectroscopy with sparse concentric shell sampling and FFT-CLEAN.

    PubMed

    Coggins, Brian E; Zhou, Pei

    2008-12-01

    Recent efforts to reduce the measurement time for multidimensional NMR experiments have fostered the development of a variety of new procedures for sampling and data processing. We recently described concentric ring sampling for 3-D NMR experiments, which is superior to radial sampling as input for processing by a multidimensional discrete Fourier transform. Here, we report the extension of this approach to 4-D spectroscopy as Randomized Concentric Shell Sampling (RCSS), where sampling points for the indirect dimensions are positioned on concentric shells, and where random rotations in the angular space are used to avoid coherent artifacts. With simulations, we show that RCSS produces a very low level of artifacts, even with a very limited number of sampling points. The RCSS sampling patterns can be adapted to fine rectangular grids to permit use of the Fast Fourier Transform in data processing, without an apparent increase in the artifact level. These artifacts can be further reduced to the noise level using the iterative CLEAN algorithm developed in radioastronomy. We demonstrate these methods on the high resolution 4-D HCCH-TOCSY spectrum of protein G's B1 domain, using only 1.2% of the sampling that would be needed conventionally for this resolution. The use of a multidimensional FFT instead of the slow DFT for initial data processing and for subsequent CLEAN significantly reduces the calculation time, yielding an artifact level that is on par with the level of the true spectral noise.

  15. High Resolution 4-D Spectroscopy with Sparse Concentric Shell Sampling and FFT-CLEAN

    PubMed Central

    Coggins, Brian E.; Zhou, Pei

    2009-01-01

    SUMMARY Recent efforts to reduce the measurement time for multidimensional NMR experiments have fostered the development of a variety of new procedures for sampling and data processing. We recently described concentric ring sampling for 3-D NMR experiments, which is superior to radial sampling as input for processing by a multidimensional discrete Fourier transform. Here, we report the extension of this approach to 4-D spectroscopy as Randomized Concentric Shell Sampling (RCSS), where sampling points for the indirect dimensions are positioned on concentric shells, and where random rotations in the angular space are used to avoid coherent artifacts. With simulations, we show that RCSS produces a very low level of artifacts, even with a very limited number of sampling points. The RCSS sampling patterns can be adapted to fine rectangular grids to permit use of the Fast Fourier Transform in data processing, without an apparent increase in the artifact level. These artifacts can be further reduced to the noise level using the iterative CLEAN algorithm developed in radioastronomy. We demonstrate these methods on the high resolution 4-D HCCH-TOCSY spectrum of protein G's B1 domain, using only 1.2% of the sampling that would be needed conventionally for this resolution. The use of a multidimensional FFT instead of the slow DFT for initial data processing and for subsequent CLEAN significantly reduces the calculation time, yielding an artifact level that is on par with the level of the true spectral noise. PMID:18853260

  16. A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data

    PubMed Central

    Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming

    2018-01-01

    The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully explored yet due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases, e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size but without losing important information becomes a more and more pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally-guided dictionary learning and sparse coding of whole brain's fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve more than 15 times speed-up without sacrificing the accuracy in identifying task-evoked functional brain networks. PMID:29706880

  17. A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data.

    PubMed

    Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming

    2018-01-01

    The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully explored yet due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases, e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size but without losing important information becomes a more and more pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally-guided dictionary learning and sparse coding of whole brain's fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve more than 15 times speed-up without sacrificing the accuracy in identifying task-evoked functional brain networks.

  18. Influence function based variance estimation and missing data issues in case-cohort studies.

    PubMed

    Mark, S D; Katki, H

    2001-12-01

    Recognizing that the efficiency of relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design, in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when the known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.

  19. Preventing smoking relapse via Web-based computer-tailored feedback: a randomized controlled trial.

    PubMed

    Elfeddali, Iman; Bolman, Catherine; Candel, Math J J M; Wiers, Reinout W; de Vries, Hein

    2012-08-20

    Web-based computer-tailored approaches have the potential to be successful in supporting smoking cessation. However, the potential effects of such approaches for relapse prevention and the value of incorporating action planning strategies to effectively prevent smoking relapse have not been fully explored. The Stay Quit for You (SQ4U) study compared two Web-based computer-tailored smoking relapse prevention programs with different types of planning strategies versus a control group. To assess the efficacy of two Web-based computer-tailored programs in preventing smoking relapse compared with a control group. The action planning (AP) program provided tailored feedback at baseline and invited respondents to do 6 preparatory and coping planning assignments (the first 3 assignments prior to quit date and the final 3 assignments after quit date). The action planning plus (AP+) program was an extended version of the AP program that also provided tailored feedback at 11 time points after the quit attempt. Respondents in the control group only filled out questionnaires. The study also assessed possible dose-response relationships between abstinence and adherence to the programs. The study was a randomized controlled trial with three conditions: the control group, the AP program, and the AP+ program. Respondents were daily smokers (N = 2031), aged 18 to 65 years, who were motivated and willing to quit smoking within 1 month. The primary outcome was self-reported continued abstinence 12 months after baseline. Logistic regression analyses were conducted using three samples: (1) all respondents as randomly assigned, (2) a modified sample that excluded respondents who did not make a quit attempt in conformance with the program protocol, and (3) a minimum dose sample that also excluded respondents who did not adhere to at least one of the intervention elements. Observed case analyses and conservative analyses were conducted. In the observed case analysis of the randomized sample, abstinence rates were 22% (45/202) in the control group versus 33% (63/190) in the AP program and 31% (53/174) in the AP+ program. The AP program (odds ratio 1.95, P = .005) and the AP+ program (odds ratio 1.61, P = .049) were significantly more effective than the control condition. Abstinence rates and effects differed per sample. Finally, the results suggest a dose-response relationship between abstinence and the number of program elements completed by the respondents. Despite the differences in results caused by the variation in our analysis approaches, we can conclude that Web-based computer-tailored programs combined with planning strategy assignments and feedback after the quit attempt can be effective in preventing relapse 12 months after baseline. However, adherence to the intervention seems critical for effectiveness. Finally, our results also suggest that more research is needed to assess the optimum intervention dose. Dutch Trial Register: NTR1892; http://www.trialregister.nl/trialreg/admin/rctview.asp?TC=1892 (Archived by WebCite at http://www.webcitation.org/693S6uuPM).

  20. Methodology Series Module 5: Sampling Strategies.

    PubMed

    Setia, Maninder Singh

    2016-01-01

    Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the 'sampling method'. There are essentially two types of sampling methods: 1) probability sampling - based on chance events (such as random numbers, flipping a coin etc.); and 2) non-probability sampling - based on the researcher's choice and on the population that is accessible and available. Some of the non-probability sampling methods are: purposive sampling, convenience sampling, or quota sampling. Random sampling methods (such as a simple random sample or a stratified random sample) are forms of probability sampling. It is important to understand the different sampling methods used in clinical studies and to mention the method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term 'random sample' when the researcher has used a convenience sample). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the 'generalizability' of the results. In such a scenario, the researcher may want to use 'purposive sampling' for the study.
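
    The distinction between a simple random sample and a stratified random sample, both mentioned above as probability sampling methods, can be made concrete with a few lines of code. The population, strata and sample size below are invented for illustration.

```python
import random

population = [{"id": i, "site": "A" if i < 600 else "B"} for i in range(1000)]
rng = random.Random(42)

# Simple random sample: every subject has the same chance of selection.
simple_sample = rng.sample(population, k=100)

# Stratified random sample: draw separately within each stratum (here, clinic site),
# keeping the sample proportional to stratum size.
def stratified_sample(pop, key, k):
    strata = {}
    for person in pop:
        strata.setdefault(key(person), []).append(person)
    chosen = []
    for members in strata.values():
        n = round(k * len(members) / len(pop))
        chosen.extend(rng.sample(members, n))
    return chosen

strat = stratified_sample(population, key=lambda p: p["site"], k=100)
print(len(simple_sample),
      sum(p["site"] == "A" for p in strat),
      sum(p["site"] == "B" for p in strat))   # 100, 60, 40
```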

  1. Neither fixed nor random: weighted least squares meta-regression.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2017-03-01

    Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
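
    In practice, the unrestricted WLS meta-regression described above amounts to an ordinary weighted least squares fit with inverse-variance weights in which the multiplicative error scale is estimated from the data rather than fixed at one. The sketch below, on invented effect estimates, standard errors and a single moderator, illustrates that mechanic; it is not the authors' code.

```python
import numpy as np
import statsmodels.api as sm

# Invented study-level data: effect estimates, their standard errors, one moderator.
effects = np.array([0.12, 0.25, 0.31, 0.05, 0.40, 0.22])
se = np.array([0.10, 0.08, 0.15, 0.20, 0.07, 0.12])
moderator = np.array([1.0, 2.0, 3.0, 1.5, 2.5, 2.0])

X = sm.add_constant(moderator)
wls = sm.WLS(effects, X, weights=1.0 / se**2).fit()
# statsmodels estimates the error scale from the residuals, so the standard errors
# are not constrained to the fixed-effect value of 1 -- the "unrestricted" aspect.
print(wls.params, wls.bse)
```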

  2. Predicting general criminal recidivism in mentally disordered offenders using a random forest approach.

    PubMed

    Pflueger, Marlon O; Franke, Irina; Graf, Marc; Hachtel, Henning

    2015-03-29

    Psychiatric expert opinions are supposed to assess the accused individual's risk of reoffending based on a valid scientific foundation. In contrast to specific recidivism, general recidivism has only been poorly considered in Continental Europe; we therefore aimed to develop a valid instrument for assessing the risk of general criminal recidivism of mentally ill offenders. Data of 259 mentally ill offenders with a median time at risk of 107 months were analyzed and combined with the individuals' criminal records. We derived risk factors for general criminal recidivism and classified re-offences by using a random forest approach. In our sample of mentally ill offenders, 51% were reconvicted. The most important predictive factors for general criminal recidivism were: number of prior convictions, age, type of index offence, diversity of criminal history, and substance abuse. With our statistical approach we were able to correctly identify 58-95% of all reoffenders and 65-97% of all committed offences (AUC = .90). Our study presents a new statistical approach to forensic-psychiatric risk-assessment, allowing experts to evaluate general risk of reoffending in mentally disordered individuals, with a special focus on high-risk groups. This approach might serve not only for expert opinions in court, but also for risk management strategies and therapeutic interventions.
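
    For readers less familiar with the random forest approach mentioned above, the sketch below fits a forest to synthetic data with the same predictor set the abstract names and evaluates it by cross-validated AUC. The data, coding and hyperparameters are invented; only the general workflow corresponds to the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 259                                   # same cohort size as the study; data are synthetic
X = np.column_stack([
    rng.poisson(3, n),                    # number of prior convictions
    rng.integers(18, 70, n),              # age
    rng.integers(0, 5, n),                # type of index offence (coded)
    rng.integers(1, 6, n),                # diversity of criminal history
    rng.integers(0, 2, n),                # substance abuse (yes/no)
])
# Synthetic outcome loosely linked to priors and substance abuse, for illustration only.
risk = 0.4 * X[:, 0] + 1.0 * X[:, 4] + rng.normal(0, 1, n)
y = (risk > np.median(risk)).astype(int)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```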

  3. A cost-effective approach to the development of printed materials: a randomized controlled trial of three strategies.

    PubMed

    Paul, C L; Redman, S; Sanson-Fisher, R W

    2004-12-01

    Printed materials have been a primary mode of communication in public health education. Three major approaches to the development of these materials--the application of characteristics identified in the literature, behavioral strategies and marketing strategies--have major implications for both the effectiveness and cost of materials. However, little attention has been directed towards the cost-effectiveness of such approaches. In the present study, three pamphlets were developed using successive addition of each approach: first literature characteristics only ('C' pamphlet), then behavioral strategies ('C + B' pamphlet) and then marketing strategies ('C + B + M' pamphlet). Each pamphlet encouraged women to join a Pap Test Reminder Service (PTRS). Each pamphlet was mailed to a randomly selected sample of 2700 women aged 50-69 years. Registrations with the PTRS were monitored and 420 women in each pamphlet group were surveyed by telephone. It was reported that the 'C + B' and 'C + B + M' pamphlets were significantly more effective than the 'C' pamphlet. The 'C + B' pamphlet was the most cost-effective of the three pamphlets. There were no significant differences between any of the pamphlet groups on acceptability, knowledge or attitudes. It was suggested that the inclusion of behavioral strategies is likely to be a cost-effective approach to the development of printed health education materials.

  4. Mixing a Grounded Theory Approach with a Randomized Controlled Trial Related to Intimate Partner Violence: What Challenges Arise for Mixed Methods Research?

    PubMed Central

    Catallo, Cristina; Jack, Susan M.; Ciliska, Donna; MacMillan, Harriet L.

    2013-01-01

    Little is known about how to systematically integrate complex qualitative studies within the context of randomized controlled trials. A two-phase sequential explanatory mixed methods study was conducted in Canada to understand how women decide to disclose intimate partner violence in emergency department settings. Mixing an RCT (with a subanalysis of data) with a grounded theory approach required methodological modifications to maintain the overall rigour of this mixed methods study. Modifications were made to the following areas of the grounded theory approach to support the overall integrity of the mixed methods study design: recruitment of participants, maximum variation and negative case sampling, data collection, and analysis methods. Recommendations for future studies include: (1) planning at the outset to incorporate a qualitative approach with an RCT and to determine logical points during the RCT to integrate the qualitative component and (2) consideration for the time needed to carry out an RCT and a grounded theory approach, especially to support recruitment, data collection, and analysis. Data mixing strategies should be considered during early stages of the study, so that appropriate measures can be developed and used in the RCT to support initial coding structures and data analysis needs of the grounded theory phase. PMID:23577245

  5. Value of information analysis optimizing future trial design from a pilot study on catheter securement devices.

    PubMed

    Tuffaha, Haitham W; Reynolds, Heather; Gordon, Louisa G; Rickard, Claire M; Scuffham, Paul A

    2014-12-01

    Value of information analysis has been proposed as an alternative to the standard hypothesis testing approach, which is based on type I and type II errors, in determining sample sizes for randomized clinical trials. However, in addition to sample size calculation, value of information analysis can optimize other aspects of research design such as possible comparator arms and alternative follow-up times, by considering trial designs that maximize the expected net benefit of research, which is the difference between the expected value of additional information and the expected cost of the trial. To apply value of information methods to the results of a pilot study on catheter securement devices to determine the optimal design of a future larger clinical trial. An economic evaluation was performed using data from a multi-arm randomized controlled pilot study comparing the efficacy of four types of catheter securement devices: standard polyurethane, tissue adhesive, bordered polyurethane and sutureless securement device. Probabilistic Monte Carlo simulation was used to characterize uncertainty surrounding the study results and to calculate the expected value of additional information. To guide the optimal future trial design, the expected costs and benefits of the alternative trial designs were estimated and compared. Analysis of the value of further information indicated that a randomized controlled trial on catheter securement devices is potentially worthwhile. Among the possible designs for the future trial, a four-arm study with 220 patients/arm would provide the highest expected net benefit corresponding to 130% return-on-investment. The initially considered design of 388 patients/arm, based on hypothesis testing calculations, would provide lower net benefit with return-on-investment of 79%. Cost-effectiveness and value of information analyses were based on the data from a single pilot trial which might affect the accuracy of our uncertainty estimation. Another limitation was that different follow-up durations for the larger trial were not evaluated. The value of information approach allows efficient trial design by maximizing the expected net benefit of additional research. This approach should be considered early in the design of randomized clinical trials. © The Author(s) 2014.

  6. Efficient sampling of complex network with modified random walk strategies

    NASA Astrophysics Data System (ADS)

    Xie, Yunya; Chang, Shuhua; Zhang, Zhipeng; Zhang, Mi; Yang, Lei

    2018-02-01

    We present two novel random walk strategies, choosing seed node (CSN) random walk and no-retracing (NR) random walk. Different from classical random walk sampling, the CSN and NR strategies focus on the influences of the seed node choice and of path overlap, respectively. The three random walk samplings are applied to the Erdös-Rényi (ER), Barabási-Albert (BA), Watts-Strogatz (WS), and weighted USAir networks. The major properties of the sampled subnets, such as sampling efficiency, degree distributions, average degree and average clustering coefficient, are then studied. Similar conclusions can be reached with all three random walk strategies. Firstly, networks with small scales and simple structures are conducive to sampling. Secondly, the average degree and the average clustering coefficient of the sampled subnet converge to the corresponding values of the original networks within a limited number of steps. Thirdly, the degree distributions of the subnets are all slightly biased toward the high-degree side. The NR strategy, however, performs better for the average clustering coefficient of the subnet. In the real weighted USAir network, obvious characteristics such as the larger clustering coefficient and the fluctuation of the degree distribution are reproduced well by these random walk strategies.
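
    The no-retracing idea can be illustrated with a short sketch: a walker that never immediately returns along the edge it just traversed. The code below (Python with networkx; the graph and parameters are arbitrary, and this is only one interpretation of the NR strategy, not the authors' implementation) compares the clustering coefficient of the sampled subnet with that of the original network.

        import random
        import networkx as nx

        def nr_random_walk(G, seed_node, steps, rng=random):
            """Random walk that avoids immediately retracing the edge just used."""
            visited = [seed_node]
            prev, current = None, seed_node
            for _ in range(steps):
                neighbors = list(G.neighbors(current))
                # Avoid stepping straight back to the previous node when possible.
                choices = [v for v in neighbors if v != prev] or neighbors
                if not choices:
                    break  # isolated node / dead end
                prev, current = current, rng.choice(choices)
                visited.append(current)
            return visited

        G = nx.barabasi_albert_graph(1000, 3, seed=1)      # BA test network
        walk = nr_random_walk(G, seed_node=0, steps=200)
        subnet = G.subgraph(set(walk))

        print("Sampled nodes:", subnet.number_of_nodes())
        print("Subnet average clustering:", round(nx.average_clustering(subnet), 3))
        print("Original average clustering:", round(nx.average_clustering(G), 3))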

  7. Improving semi-automated segmentation by integrating learning with active sampling

    NASA Astrophysics Data System (ADS)

    Huo, Jing; Okada, Kazunori; Brown, Matthew

    2012-02-01

    Interactive segmentation algorithms such as GrowCut usually require quite a few user interactions to perform well, and have poor repeatability. In this study, we developed a novel technique to boost the performance of the interactive segmentation method GrowCut involving: 1) a novel "focused sampling" approach for supervised learning, as opposed to conventional random sampling; 2) boosting GrowCut using the machine learned results. We applied the proposed technique to the glioblastoma multiforme (GBM) brain tumor segmentation, and evaluated on a dataset of ten cases from a multiple center pharmaceutical drug trial. The results showed that the proposed system has the potential to reduce user interaction while maintaining similar segmentation accuracy.

  8. Methodology Series Module 5: Sampling Strategies

    PubMed Central

    Setia, Maninder Singh

    2016-01-01

    Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the ‘Sampling Method’. There are essentially two types of sampling methods: 1) probability sampling – based on chance events (such as random numbers, flipping a coin etc.); and 2) non-probability sampling – based on the researcher's choice and populations that are accessible and available. Some of the non-probability sampling methods are: purposive sampling, convenience sampling, or quota sampling. Random sampling methods (such as simple random sampling or stratified random sampling) are forms of probability sampling. It is important to understand the different sampling methods used in clinical studies and to state the method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term ‘random sample’ when a convenience sample was actually used). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the ‘generalizability’ of the results. In such a scenario, the researcher may want to use ‘purposive sampling’ for the study. PMID:27688438

  9. Sampling Strategies for Evaluating the Rate of Adventitious Transgene Presence in Non-Genetically Modified Crop Fields.

    PubMed

    Makowski, David; Bancal, Rémi; Bensadoun, Arnaud; Monod, Hervé; Messéan, Antoine

    2017-09-01

    According to E.U. regulations, the maximum allowable rate of adventitious transgene presence in non-genetically modified (GM) crops is 0.9%. We compared four sampling methods for the detection of transgenic material in agricultural non-GM maize fields: random sampling, stratified sampling, random sampling + ratio reweighting, random sampling + regression reweighting. Random sampling involves simply sampling maize grains from different locations selected at random from the field concerned. The stratified and reweighting sampling methods make use of an auxiliary variable corresponding to the output of a gene-flow model (a zero-inflated Poisson model) simulating cross-pollination as a function of wind speed, wind direction, and distance to the closest GM maize field. With the stratified sampling method, an auxiliary variable is used to define several strata with contrasting transgene presence rates, and grains are then sampled at random from each stratum. With the two methods involving reweighting, grains are first sampled at random from various locations within the field, and the observations are then reweighted according to the auxiliary variable. Data collected from three maize fields were used to compare the four sampling methods, and the results were used to determine the extent to which transgene presence rate estimation was improved by the use of stratified and reweighting sampling methods. We found that transgene rate estimates were more accurate and that substantially smaller samples could be used with sampling strategies based on an auxiliary variable derived from a gene-flow model. © 2017 Society for Risk Analysis.
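
    The ratio-reweighting idea, using a model-based auxiliary variable to adjust an estimate from a random sample, can be sketched as follows (Python with numpy; the simulated field, auxiliary predictions, and sample sizes are invented for illustration and do not reproduce the paper's gene-flow model).

        import numpy as np

        rng = np.random.default_rng(3)

        # Simulated field of 10,000 sampling locations.
        # x: auxiliary variable from a (hypothetical) gene-flow model, e.g. predicted
        #    cross-pollination rate; y: true adventitious presence rate, correlated with x.
        N = 10_000
        x = rng.gamma(shape=0.5, scale=0.004, size=N)
        y = np.clip(x * rng.lognormal(0.0, 0.3, size=N), 0, 1)

        # Simple random sample of n locations.
        n = 200
        idx = rng.choice(N, size=n, replace=False)

        srs_estimate = y[idx].mean()

        # Ratio reweighting: rescale the sample mean according to how representative
        # the sample is with respect to the auxiliary variable (known everywhere).
        ratio_estimate = y[idx].mean() / x[idx].mean() * x.mean()

        print("True rate:      ", round(y.mean(), 5))
        print("SRS estimate:   ", round(srs_estimate, 5))
        print("Ratio estimate: ", round(ratio_estimate, 5))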

  10. Stable and efficient retrospective 4D-MRI using non-uniformly distributed quasi-random numbers

    NASA Astrophysics Data System (ADS)

    Breuer, Kathrin; Meyer, Cord B.; Breuer, Felix A.; Richter, Anne; Exner, Florian; Weng, Andreas M.; Ströhle, Serge; Polat, Bülent; Jakob, Peter M.; Sauer, Otto A.; Flentje, Michael; Weick, Stefan

    2018-04-01

    The purpose of this work is the development of a robust and reliable three-dimensional (3D) Cartesian imaging technique for fast and flexible retrospective 4D abdominal MRI during free breathing. To this end, a non-uniform quasi random (NU-QR) reordering of the phase encoding (ky–kz) lines was incorporated into 3D Cartesian acquisition. The proposed sampling scheme allocates more phase encoding points near the k-space origin while reducing the sampling density in the outer part of the k-space. Respiratory self-gating in combination with SPIRiT-reconstruction is used for the reconstruction of abdominal data sets in different respiratory phases (4D-MRI). Six volunteers and three patients were examined at 1.5 T during free breathing. Additionally, data sets with conventional two-dimensional (2D) linear and 2D quasi random phase encoding order were acquired for the volunteers for comparison. A quantitative evaluation of image quality versus scan times (from 70 s to 626 s) for the given sampling schemes was obtained by calculating the normalized mutual information (NMI) for all volunteers. Motion estimation was accomplished by calculating the maximum derivative of a signal intensity profile of a transition (e.g. tumor or diaphragm). The 2D non-uniform quasi-random distribution of phase encoding lines in Cartesian 3D MRI yields more efficient undersampling patterns for parallel imaging compared to conventional uniform quasi-random and linear sampling. Median NMI values of NU-QR sampling are the highest for all scan times. Therefore, within the same scan time 4D imaging could be performed with improved image quality. The proposed method allows for the reconstruction of motion-artifact-reduced 4D data sets with isotropic spatial resolution of 2.1 × 2.1 × 2.1 mm³ in a short scan time, e.g. 10 respiratory phases in only 3 min. Cranio-caudal tumor displacements between 23 and 46 mm could be observed. NU-QR sampling enables stable 4D-MRI with high temporal and spatial resolution within short scan time for visualization of organ or tumor motion during free breathing. Further studies, e.g. the application of the method for radiotherapy planning, are needed to investigate the clinical applicability and diagnostic value of the approach.
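
    One simple way to produce a variable-density quasi-random pattern of the kind described above is to start from a low-discrepancy sequence and apply a transform that concentrates points near the k-space centre. The sketch below (Python with scipy's quasi-Monte Carlo module; the matrix size, number of lines, and density exponent are arbitrary, and this is only an illustration of the idea, not the authors' reordering scheme):

        import numpy as np
        from scipy.stats import qmc

        ny, nz = 128, 64          # hypothetical phase-encoding matrix (ky x kz)
        n_lines = 2000            # number of phase-encoding lines to acquire
        alpha = 2.0               # >1 concentrates samples near the k-space centre

        # 2D Halton (quasi-random) points in [0, 1)^2.
        u = qmc.Halton(d=2, seed=7).random(n_lines)

        # Map to [-1, 1) and apply a variable-density warp toward the origin.
        c = 2.0 * u - 1.0
        c = np.sign(c) * np.abs(c) ** alpha

        # Convert to integer (ky, kz) indices and build the sampling mask.
        ky = np.clip(((c[:, 0] + 1) / 2 * ny).astype(int), 0, ny - 1)
        kz = np.clip(((c[:, 1] + 1) / 2 * nz).astype(int), 0, nz - 1)

        mask = np.zeros((ny, nz), dtype=bool)
        mask[ky, kz] = True
        print("Distinct phase-encoding lines selected:", mask.sum())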

  11. Sampling Large Graphs for Anticipatory Analytics

    DTIC Science & Technology

    2015-05-15

    Random area sampling [8] is a “snowball” sampling method in which a set of random seed vertices are selected and areas... Lauren Edwards, Luke Johnson, Maja Milosavljevic, Vijay Gadepally, Benjamin A. Miller, Lincoln... systems, greater human-in-the-loop involvement, or through complex algorithms. We are investigating the use of sampling to mitigate these challenges

  12. Electromagnetic Scattering by Fully Ordered and Quasi-Random Rigid Particulate Samples

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Dlugach, Janna M.; Mackowski, Daniel W.

    2016-01-01

    In this paper we have analyzed circumstances under which a rigid particulate sample can behave optically as a true discrete random medium consisting of particles randomly moving relative to each other during measurement. To this end, we applied the numerically exact superposition T-matrix method to model far-field scattering characteristics of fully ordered and quasi-randomly arranged rigid multiparticle groups in fixed and random orientations. We have shown that, in and of itself, averaging optical observables over movements of a rigid sample as a whole is insufficient unless it is combined with a quasi-random arrangement of the constituent particles in the sample. Otherwise, certain scattering effects typical of discrete random media (including some manifestations of coherent backscattering) may not be accurately replicated.

  13. Extraction of linear features on SAR imagery

    NASA Astrophysics Data System (ADS)

    Liu, Junyi; Li, Deren; Mei, Xin

    2006-10-01

    Linear features are usually extracted from SAR imagery by edge detectors derived from the contrast ratio edge detector with a constant probability of false alarm. The Hough Transform, on the other hand, is an elegant way of extracting global features such as curve segments from binary edge images. The Randomized Hough Transform can drastically reduce the computation time and memory usage of the HT, but it also leaves a great number of accumulator cells invalid during the random sampling. In this paper, we propose a new approach to extract linear features from SAR imagery: an almost automatic algorithm based on edge detection and the Randomized Hough Transform. The improved method makes full use of the directional information of each candidate edge point so as to solve the invalid accumulation problem. The applied results are in good agreement with the theoretical study, and the main linear features on SAR imagery have been extracted automatically. The method saves storage space and computational time, which shows its effectiveness and applicability.
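
    The core of the Randomized Hough Transform is to pick pairs of edge points at random, compute the line they define, and accumulate a vote for the resulting (rho, theta) cell. The sketch below (Python with numpy on synthetic edge points; the thresholds and bin sizes are arbitrary, and it does not include the directional-information refinement proposed in the paper) shows that basic loop.

        import numpy as np
        from collections import defaultdict

        rng = np.random.default_rng(1)

        # Synthetic edge points: a noisy line y = 0.5 x + 20 plus random clutter.
        xs = rng.uniform(0, 200, 300)
        line_pts = np.column_stack([xs, 0.5 * xs + 20 + rng.normal(0, 1, xs.size)])
        clutter = rng.uniform(0, 200, (300, 2))
        points = np.vstack([line_pts, clutter])

        accumulator = defaultdict(int)
        n_trials = 5000
        rho_bin, theta_bin = 2.0, np.deg2rad(1.0)

        for _ in range(n_trials):
            i, j = rng.choice(len(points), size=2, replace=False)
            (x1, y1), (x2, y2) = points[i], points[j]
            # Line through the two points in normal form: x cos(t) + y sin(t) = rho.
            theta = np.arctan2(x1 - x2, y2 - y1)      # direction of the line normal
            rho = x1 * np.cos(theta) + y1 * np.sin(theta)
            if rho < 0:                               # normalize so rho >= 0
                rho, theta = -rho, theta + np.pi
            theta = (theta + np.pi) % (2 * np.pi) - np.pi
            cell = (round(rho / rho_bin), round(theta / theta_bin))
            accumulator[cell] += 1

        best_cell, votes = max(accumulator.items(), key=lambda kv: kv[1])
        print("Detected line: rho ~", best_cell[0] * rho_bin,
              "theta(deg) ~", round(np.degrees(best_cell[1] * theta_bin), 1),
              "votes:", votes)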

  14. List randomization for soliciting experience of intimate partner violence: Application to the evaluation of Zambia's unconditional child grant program.

    PubMed

    Peterman, Amber; Palermo, Tia M; Handa, Sudhanshu; Seidenfeld, David

    2018-03-01

    Social scientists have increasingly invested in understanding how to improve data quality and measurement of sensitive topics in household surveys. We utilize the technique of list randomization to collect measures of physical intimate partner violence in an experimental impact evaluation of the Government of Zambia's Child Grant Program. The Child Grant Program is an unconditional cash transfer, which targeted female caregivers of children under the age of 5 in rural areas to receive the equivalent of US $24 as a bimonthly stipend. The implementation results show that the list randomization methodology functioned as planned, with approximately 15% of the sample identifying 12-month prevalence of physical intimate partner violence. According to this measure, after 4 years, the program had no measurable effect on partner violence. List randomization is a promising approach to incorporate sensitive measures into multitopic evaluations; however, more research is needed to improve upon methodology for application to measurement of violence. Copyright © 2017 John Wiley & Sons, Ltd.
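
    In a list experiment of this kind, prevalence of the sensitive item is typically estimated as the difference in mean item counts between the treatment list (which includes the sensitive item) and the control list. A minimal sketch (Python with numpy; the counts are simulated, and the 15% figure is only used to generate the fake data):

        import numpy as np

        rng = np.random.default_rng(5)
        n_per_arm = 1500

        # Control respondents report how many of 4 non-sensitive items apply to them.
        control_counts = rng.binomial(4, 0.4, size=n_per_arm)

        # Treatment respondents see the same 4 items plus the sensitive item
        # (here simulated with a true prevalence of 15%).
        treatment_counts = rng.binomial(4, 0.4, size=n_per_arm) + rng.binomial(1, 0.15, size=n_per_arm)

        # Difference-in-means estimator of the sensitive-item prevalence.
        prevalence = treatment_counts.mean() - control_counts.mean()
        se = np.sqrt(treatment_counts.var(ddof=1) / n_per_arm + control_counts.var(ddof=1) / n_per_arm)

        print(f"Estimated prevalence: {prevalence:.3f} (SE {se:.3f})")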

  15. Robustly Aligning a Shape Model and Its Application to Car Alignment of Unknown Pose.

    PubMed

    Li, Yan; Gu, Leon; Kanade, Takeo

    2011-09-01

    Precisely localizing in an image a set of feature points that form a shape of an object, such as car or face, is called alignment. Previous shape alignment methods attempted to fit a whole shape model to the observed data, based on the assumption of Gaussian observation noise and the associated regularization process. However, such an approach, though able to deal with Gaussian noise in feature detection, turns out not to be robust or precise because it is vulnerable to gross feature detection errors or outliers resulting from partial occlusions or spurious features from the background or neighboring objects. We address this problem by adopting a randomized hypothesis-and-test approach. First, a Bayesian inference algorithm is developed to generate a shape-and-pose hypothesis of the object from a partial shape or a subset of feature points. For alignment, a large number of hypotheses are generated by randomly sampling subsets of feature points, and then evaluated to find the one that minimizes the shape prediction error. This method of randomized subset-based matching can effectively handle outliers and recover the correct object shape. We apply this approach on a challenging data set of over 5,000 different-posed car images, spanning a wide variety of car types, lighting, background scenes, and partial occlusions. Experimental results demonstrate favorable improvements over previous methods on both accuracy and robustness.
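
    The randomized hypothesis-and-test idea, generating many candidate fits from small random subsets of feature points and keeping the one with the smallest prediction error, follows the same logic as RANSAC. In the sketch below (Python with numpy), a simple 2D scale-plus-translation model fitted to synthetic points with gross outliers stands in for the full shape-and-pose Bayesian model of the paper.

        import numpy as np

        rng = np.random.default_rng(2)

        # Model points of a "shape" and their noisy detections, 30% of which are outliers.
        model = rng.uniform(0, 100, (40, 2))
        true_scale, true_shift = 1.3, np.array([15.0, -8.0])
        detected = true_scale * model + true_shift + rng.normal(0, 0.5, model.shape)
        outliers = rng.choice(len(model), size=12, replace=False)
        detected[outliers] = rng.uniform(0, 200, (12, 2))      # gross detection errors

        def fit_scale_shift(m, d):
            """Least-squares scale and translation mapping model points m to detections d."""
            mc, dc = m - m.mean(0), d - d.mean(0)
            s = (mc * dc).sum() / (mc * mc).sum()
            return s, d.mean(0) - s * m.mean(0)

        best_inliers, best_params = None, None
        for _ in range(500):                                    # randomized hypotheses
            idx = rng.choice(len(model), size=3, replace=False) # small random subset
            s, t = fit_scale_shift(model[idx], detected[idx])
            err = np.linalg.norm(s * model + t - detected, axis=1)
            inliers = err < 2.0                                 # hypothesis-test threshold
            if best_inliers is None or inliers.sum() > best_inliers.sum():
                best_inliers, best_params = inliers, (s, t)

        # Refit using all inliers of the best hypothesis.
        s, t = fit_scale_shift(model[best_inliers], detected[best_inliers])
        print("Recovered scale:", round(s, 3), "shift:", np.round(t, 2), "inliers:", best_inliers.sum())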

  16. Secondary outcome analysis for data from an outcome-dependent sampling design.

    PubMed

    Pan, Yinghao; Cai, Jianwen; Longnecker, Matthew P; Zhou, Haibo

    2018-04-22

    An outcome-dependent sampling (ODS) scheme is a cost-effective way to conduct a study. For a study with a continuous primary outcome, an ODS scheme can be implemented where the expensive exposure is only measured on a simple random sample and supplemental samples selected from the 2 tails of the primary outcome variable. With the tremendous cost invested in collecting the primary exposure information, investigators often would like to use the available data to study the relationship between a secondary outcome and the obtained exposure variable. This is referred to as secondary analysis. Secondary analysis in ODS designs can be tricky, as the ODS sample is not a random sample from the general population. In this article, we use the inverse probability weighted and augmented inverse probability weighted estimating equations to analyze the secondary outcome for data obtained from the ODS design. We do not make any parametric assumptions on the primary and secondary outcome and only specify the form of the regression mean models, thus allowing an arbitrary error distribution. Our approach is robust to second- and higher-order moment misspecification. It also leads to more precise estimates of the parameters by effectively using all the available participants. Through simulation studies, we show that the proposed estimator is consistent and asymptotically normal. Data from the Collaborative Perinatal Project are analyzed to illustrate our method. Copyright © 2018 John Wiley & Sons, Ltd.
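
    In the simplest linear-model case, the inverse probability weighted estimating equation reduces to a weighted regression in which each subject is weighted by the inverse of their (known) probability of selection into the ODS sample. A toy sketch (Python with numpy; the selection probabilities, cut-offs, and data are fabricated, and the augmentation term is omitted):

        import numpy as np

        rng = np.random.default_rng(11)
        N = 20_000

        # Full cohort: expensive exposure x, primary outcome y1, secondary outcome y2.
        x = rng.normal(size=N)
        e1 = rng.normal(size=N)
        y1 = 1.0 * x + e1                                # primary outcome
        y2 = 0.5 * x + 0.6 * e1 + rng.normal(size=N)     # secondary outcome, errors correlated with y1

        # ODS design: a small simple random sample plus supplemental samples
        # from the two tails of the primary outcome (design probabilities known).
        lo, hi = np.quantile(y1, [0.1, 0.9])
        p_select = np.where((y1 < lo) | (y1 > hi), 0.30, 0.02)
        selected = rng.random(N) < p_select

        # Inverse probability weighted least squares for the secondary outcome model.
        w = 1.0 / p_select[selected]
        X = np.column_stack([np.ones(selected.sum()), x[selected]])
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y2[selected]))

        # Unweighted regression on the ODS sample is biased here because selection
        # depends on y1, whose error is correlated with the secondary outcome.
        naive = np.polyfit(x[selected], y2[selected], 1)[0]
        print("IPW slope estimate:", round(beta[1], 3), "naive (unweighted) slope:", round(naive, 3))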

  17. The non-equilibrium allele frequency spectrum in a Poisson random field framework.

    PubMed

    Kaj, Ingemar; Mugal, Carina F

    2016-10-01

    In population genetic studies, the allele frequency spectrum (AFS) efficiently summarizes genome-wide polymorphism data and shapes a variety of allele frequency-based summary statistics. While existing theory typically features equilibrium conditions, emerging methodology requires an analytical understanding of the build-up of the allele frequencies over time. In this work, we use the framework of Poisson random fields to derive new representations of the non-equilibrium AFS for the case of a Wright-Fisher population model with selection. In our approach, the AFS is a scaling-limit of the expectation of a Poisson stochastic integral and the representation of the non-equilibrium AFS arises in terms of a fixation time probability distribution. The known duality between the Wright-Fisher diffusion process and a birth and death process generalizing Kingman's coalescent yields an additional representation. The results carry over to the setting of a random sample drawn from the population and provide the non-equilibrium behavior of sample statistics. Our findings are consistent with and extend a previous approach where the non-equilibrium AFS solves a partial differential forward equation with a non-traditional boundary condition. Moreover, we provide a bridge to previous coalescent-based work, and hence tie several frameworks together. Since frequency-based summary statistics are widely used in population genetics, for example, to identify candidate loci of adaptive evolution, to infer the demographic history of a population, or to improve our understanding of the underlying mechanics of speciation events, the presented results are potentially useful for a broad range of topics. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. Bridging the gap between formal and experience-based knowledge for context-aware laparoscopy.

    PubMed

    Katić, Darko; Schuck, Jürgen; Wekerle, Anna-Laura; Kenngott, Hannes; Müller-Stich, Beat Peter; Dillmann, Rüdiger; Speidel, Stefanie

    2016-06-01

    Computer assistance is increasingly common in surgery. However, the amount of information is bound to overload the processing abilities of surgeons. We propose methods to recognize the current phase of a surgery for context-aware information filtering. The purpose is to select the most suitable subset of information for surgical situations which require special assistance. We combine formal knowledge, represented by an ontology, and experience-based knowledge, represented by training samples, to recognize phases. For this purpose, we have developed two different methods. Firstly, we use formal knowledge about possible phase transitions to create a composition of random forests. Secondly, we propose a method based on cultural optimization to infer formal rules from experience to recognize phases. The proposed methods are compared with a purely formal knowledge-based approach using rules and a purely experience-based one using regular random forests. The comparative evaluation on laparoscopic pancreas resections and adrenalectomies employs a consistent set of quality criteria on clean and noisy input. The rule-based approaches proved best with noise-free data. The random forest-based ones were more robust in the presence of noise. Formal and experience-based knowledge can be successfully combined for robust phase recognition.

  19. Assessment of wadeable stream resources in the driftless area ecoregion in Western Wisconsin using a probabilistic sampling design.

    PubMed

    Miller, Michael A; Colby, Alison C C; Kanehl, Paul D; Blocksom, Karen

    2009-03-01

    The Wisconsin Department of Natural Resources (WDNR), with support from the U.S. EPA, conducted an assessment of wadeable streams in the Driftless Area ecoregion in western Wisconsin using a probabilistic sampling design. This ecoregion encompasses 20% of Wisconsin's land area and contains 8,800 miles of perennial streams. Randomly-selected stream sites (n = 60) equally distributed among stream orders 1-4 were sampled. Watershed land use, riparian and in-stream habitat, water chemistry, macroinvertebrate, and fish assemblage data were collected at each true random site and an associated "modified-random" site on each stream that was accessed via a road crossing nearest to the true random site. Targeted least-disturbed reference sites (n = 22) were also sampled to develop reference conditions for various physical, chemical, and biological measures. Cumulative distribution function plots of various measures collected at the true random sites evaluated with reference condition thresholds, indicate that high proportions of the random sites (and by inference the entire Driftless Area wadeable stream population) show some level of degradation. Study results show no statistically significant differences between the true random and modified-random sample sites for any of the nine physical habitat, 11 water chemistry, seven macroinvertebrate, or eight fish metrics analyzed. In Wisconsin's Driftless Area, 79% of wadeable stream lengths were accessible via road crossings. While further evaluation of the statistical rigor of using a modified-random sampling design is warranted, sampling randomly-selected stream sites accessed via the nearest road crossing may provide a more economical way to apply probabilistic sampling in stream monitoring programs.

  20. Family physicians' approach to psychotherapy and counseling. Perceptions and practices.

    PubMed Central

    Swanson, J. G.

    1994-01-01

    To determine how family physicians perceive the support they get for psychotherapy and counseling, we surveyed a random sample of Ontario College of Family Physicians members. Of 100 physicians who had family medicine residency training with psychotherapy experience, 43% indicated that such training was inadequate for their current needs. Because family physicians often provide psychotherapy and counseling, their training should reflect the needs found in practice. PMID:8080505

  1. A Description of the Building Materials Data Base for Portland, Maine.

    DTIC Science & Technology

    1986-06-01

    Keywords: Acid precipitation, Data bases, Damage assessment, Environmental...protection, Damage from acid deposition, Portland, Maine, Damage to buildings, Statistical analysis. ...types and amounts of building surface materials exposed to acid deposition. The stratified, systematic, unaligned random sampling approach was used

  2. Meta-analysis with missing study-level sample variance data.

    PubMed

    Chowdhry, Amit K; Dworkin, Robert H; McDermott, Michael P

    2016-07-30

    We consider a study-level meta-analysis with a normally distributed outcome variable and possibly unequal study-level variances, where the object of inference is the difference in means between a treatment and control group. A common complication in such an analysis is missing sample variances for some studies. A frequently used approach is to impute the weighted (by sample size) mean of the observed variances (mean imputation). Another approach is to include only those studies with variances reported (complete case analysis). Both mean imputation and complete case analysis are only valid under the missing-completely-at-random assumption, and even then the inverse variance weights produced are not necessarily optimal. We propose a multiple imputation method employing gamma meta-regression to impute the missing sample variances. Our method takes advantage of study-level covariates that may be used to provide information about the missing data. Through simulation studies, we show that multiple imputation, when the imputation model is correctly specified, is superior to competing methods in terms of confidence interval coverage probability and type I error probability when testing a specified group difference. Finally, we describe a similar approach to handling missing variances in cross-over studies. Copyright © 2016 John Wiley & Sons, Ltd.
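
    The mean-imputation comparator described above, filling each missing study variance with the sample-size-weighted mean of the observed variances, is simple to state in code. The sketch below (Python with numpy; the study data are invented, and the inverse-variance weighting is only a rough illustration) shows it together with the resulting pooled estimate; it does not implement the proposed gamma meta-regression multiple imputation.

        import numpy as np

        # Hypothetical study-level data: mean difference, sample size, and sample variance
        # (np.nan where the variance was not reported).
        effect = np.array([0.20, 0.35, 0.10, 0.50, 0.05])
        n      = np.array([40,   120,  60,   25,   200])
        var    = np.array([0.8,  np.nan, 1.1, np.nan, 0.9])

        observed = ~np.isnan(var)

        # Mean imputation: replace missing variances with the sample-size-weighted
        # mean of the observed variances.
        imputed = var.copy()
        imputed[~observed] = np.average(var[observed], weights=n[observed])

        # Approximate inverse-variance weights (variance of a study's mean difference
        # scales roughly as the sample variance divided by the study size).
        w = n / imputed
        pooled = np.average(effect, weights=w)
        print("Imputed variances:", np.round(imputed, 3))
        print("Pooled estimate (mean imputation):", round(pooled, 3))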

  3. Multiple imputation methods for nonparametric inference on cumulative incidence with missing cause of failure

    PubMed Central

    Lee, Minjung; Dignam, James J.; Han, Junhee

    2014-01-01

    We propose a nonparametric approach for cumulative incidence estimation when causes of failure are unknown or missing for some subjects. Under the missing at random assumption, we estimate the cumulative incidence function using multiple imputation methods. We develop asymptotic theory for the cumulative incidence estimators obtained from multiple imputation methods. We also discuss how to construct confidence intervals for the cumulative incidence function and perform a test for comparing the cumulative incidence functions in two samples with missing cause of failure. Through simulation studies, we show that the proposed methods perform well. The methods are illustrated with data from a randomized clinical trial in early stage breast cancer. PMID:25043107

  4. Random anisotropy model approach on ion beam sputtered Co20Cu80 granular alloy

    NASA Astrophysics Data System (ADS)

    Errahmani, H.; Hassanaı̈n, N.; Berrada, A.; Abid, M.; Lassri, H.; Schmerber, G.; Dinia, A.

    2002-03-01

    The Co20Cu80 granular film has been prepared using the ion beam sputtering technique. The magnetic properties of the sample were studied in the temperature range 5-300 K at H⩽50 kOe. From the thermomagnetisation curve, which is found to obey the Bloch law, we have extracted the spin wave stiffness constant D and the exchange constant A. The experimental magnetic results have been interpreted in the framework of the random anisotropy model. We have determined the local anisotropy constant KL and the local correlation length of the anisotropy axis Ra, which is compared to the experimental grain size obtained by transmission electron microscopy.

  5. The fidelity of Kepler eclipsing binary parameters inferred by the neural network

    NASA Astrophysics Data System (ADS)

    Holanda, N.; da Silva, J. R. P.

    2018-04-01

    This work aims to test the fidelity and efficiency of obtaining automatic orbital elements of eclipsing binary systems, from light curves using neural network models. We selected a random sample of 78 systems, from over 1400 detached eclipsing binaries obtained from the Kepler Eclipsing Binaries Catalog, processed using the neural network approach. The orbital parameters of the sample systems were measured applying the traditional method of light curve adjustment with uncertainties calculated by the bootstrap method, employing the JKTEBOP code. These estimated parameters were compared with those obtained by the neural network approach for the same systems. The results reveal a good agreement between techniques for the sum of the fractional radii and moderate agreement for e cos ω and e sin ω, but orbital inclination is clearly underestimated in neural network tests.

  6. Understanding the Effects of Sampling on Healthcare Risk Modeling for the Prediction of Future High-Cost Patients

    NASA Astrophysics Data System (ADS)

    Moturu, Sai T.; Liu, Huan; Johnson, William G.

    Rapidly rising healthcare costs represent one of the major issues plaguing the healthcare system. Data from the Arizona Health Care Cost Containment System, Arizona's Medicaid program provide a unique opportunity to exploit state-of-the-art machine learning and data mining algorithms to analyze data and provide actionable findings that can aid cost containment. Our work addresses specific challenges in this real-life healthcare application with respect to data imbalance in the process of building predictive risk models for forecasting high-cost patients. We survey the literature and propose novel data mining approaches customized for this compelling application with specific focus on non-random sampling. Our empirical study indicates that the proposed approach is highly effective and can benefit further research on cost containment in the healthcare industry.

  7. The fidelity of Kepler eclipsing binary parameters inferred by the neural network

    NASA Astrophysics Data System (ADS)

    Holanda, N.; da Silva, J. R. P.

    2018-07-01

    This work aims to test the fidelity and efficiency of obtaining automatic orbital elements of eclipsing binary systems, from light curves using neural network models. We selected a random sample with 78 systems, from over 1400 detached eclipsing binaries obtained from the Kepler Eclipsing Binaries Catalog, processed using the neural network approach. The orbital parameters of the sample systems were measured applying the traditional method of light-curve adjustment with uncertainties calculated by the bootstrap method, employing the JKTEBOP code. These estimated parameters were compared with those obtained by the neural network approach for the same systems. The results reveal a good agreement between techniques for the sum of the fractional radii and moderate agreement for e cosω and e sinω, but orbital inclination is clearly underestimated in neural network tests.

  8. Using Bayesian Adaptive Trial Designs for Comparative Effectiveness Research: A Virtual Trial Execution.

    PubMed

    Luce, Bryan R; Connor, Jason T; Broglio, Kristine R; Mullins, C Daniel; Ishak, K Jack; Saunders, Elijah; Davis, Barry R

    2016-09-20

    Bayesian and adaptive clinical trial designs offer the potential for more efficient processes that result in lower sample sizes and shorter trial durations than traditional designs. To explore the use and potential benefits of Bayesian adaptive clinical trial designs in comparative effectiveness research. Virtual execution of ALLHAT (Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial) as if it had been done according to a Bayesian adaptive trial design. Comparative effectiveness trial of antihypertensive medications. Patient data sampled from the more than 42 000 patients enrolled in ALLHAT with publicly available data. Number of patients randomly assigned between groups, trial duration, observed numbers of events, and overall trial results and conclusions. The Bayesian adaptive approach and original design yielded similar overall trial conclusions. The Bayesian adaptive trial randomly assigned more patients to the better-performing group and would probably have ended slightly earlier. This virtual trial execution required limited resampling of ALLHAT patients for inclusion in RE-ADAPT (REsearch in ADAptive methods for Pragmatic Trials). Involvement of a data monitoring committee and other trial logistics were not considered. In a comparative effectiveness research trial, Bayesian adaptive trial designs are a feasible approach and potentially generate earlier results and allocate more patients to better-performing groups. National Heart, Lung, and Blood Institute.
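
    A common building block of Bayesian adaptive designs of this kind is response-adaptive randomization, in which allocation probabilities are updated from the posterior as outcomes accrue. The sketch below (Python with numpy; a Beta-Binomial model with made-up event rates, batch sizes, and stopping rule, not the RE-ADAPT design itself) shows the basic loop.

        import numpy as np

        rng = np.random.default_rng(7)

        true_event_rate = {"A": 0.20, "B": 0.14}      # hypothetical arm-specific event rates
        alpha = {"A": 1.0, "B": 1.0}                  # Beta posterior parameters (events)
        beta  = {"A": 1.0, "B": 1.0}                  # Beta posterior parameters (non-events)
        enrolled = {"A": 0, "B": 0}

        for interim in range(10):                     # ten interim looks
            # Thompson-style allocation: randomize the next batch in proportion to the
            # posterior probability that each arm has the lower event rate.
            draws = {a: rng.beta(alpha[a], beta[a], 2000) for a in ("A", "B")}
            p_B_better = np.mean(draws["B"] < draws["A"])
            batch = 100
            n_B = rng.binomial(batch, p_B_better)
            n_A = batch - n_B

            for arm, n_arm in (("A", n_A), ("B", n_B)):
                events = rng.binomial(n_arm, true_event_rate[arm])
                alpha[arm] += events
                beta[arm] += n_arm - events
                enrolled[arm] += n_arm

            if p_B_better > 0.975 or p_B_better < 0.025:   # simple stopping rule
                break

        print("Enrolled per arm:", enrolled, "P(B better) at last look:", round(p_B_better, 3))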

  9. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    DOE PAGES

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; ...

    2015-05-12

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes. However, the current strategies for TnSeq are too laborious to be applied to hundreds of experimental conditions across multiple bacteria. Here, we describe an approach, random bar code transposon-site sequencing (RB-TnSeq), which greatly simplifies the measurement of gene fitness by using bar code sequencing (BarSeq) to monitor the abundance of mutants. We performed 387 genome-wide fitness assays across five bacteria and identified phenotypes for over 5,000 genes. RB-TnSeq can be applied to diverse bacteria and is a powerful tool to annotate uncharacterized genes using phenotype data.

  10. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes. However, the current strategies for TnSeq are too laborious to be applied to hundreds of experimental conditions across multiple bacteria. Here, we describe an approach, random bar code transposon-site sequencing (RB-TnSeq), which greatly simplifies the measurement of gene fitness by using bar code sequencing (BarSeq) to monitor the abundance of mutants. We performed 387 genome-wide fitness assays across five bacteria and identified phenotypes for over 5,000 genes. RB-TnSeq can be applied to diverse bacteria and is a powerful tool to annotate uncharacterized genes using phenotype data.

  11. Sample size requirements for the design of reliability studies: precision consideration.

    PubMed

    Shieh, Gwowen

    2014-09-01

    In multilevel modeling, the intraclass correlation coefficient based on the one-way random-effects model is routinely employed to measure the reliability or degree of resemblance among group members. To facilitate the advocated practice of reporting confidence intervals in future reliability studies, this article presents exact sample size procedures for precise interval estimation of the intraclass correlation coefficient under various allocation and cost structures. Although the suggested approaches do not admit explicit sample size formulas and require special algorithms for carrying out iterative computations, they are more accurate than the closed-form formulas constructed from large-sample approximations with respect to the expected width and assurance probability criteria. This investigation notes the deficiency of existing methods and expands the sample size methodology for the design of reliability studies that have not previously been discussed in the literature.

  12. Outcome-Dependent Sampling with Interval-Censored Failure Time Data

    PubMed Central

    Zhou, Qingning; Cai, Jianwen; Zhou, Haibo

    2017-01-01

    Summary Epidemiologic studies and disease prevention trials often seek to relate an exposure variable to a failure time that suffers from interval-censoring. When the failure rate is low and the time intervals are wide, a large cohort is often required so as to yield reliable precision on the exposure-failure-time relationship. However, large cohort studies with simple random sampling could be prohibitive for investigators with a limited budget, especially when the exposure variables are expensive to obtain. Alternative cost-effective sampling designs and inference procedures are therefore desirable. We propose an outcome-dependent sampling (ODS) design with interval-censored failure time data, where we enrich the observed sample by selectively including certain more informative failure subjects. We develop a novel sieve semiparametric maximum empirical likelihood approach for fitting the proportional hazards model to data from the proposed interval-censoring ODS design. This approach employs the empirical likelihood and sieve methods to deal with the infinite-dimensional nuisance parameters, which greatly reduces the dimensionality of the estimation problem and eases the computation difficulty. The consistency and asymptotic normality of the resulting regression parameter estimator are established. The results from our extensive simulation study show that the proposed design and method works well for practical situations and is more efficient than the alternative designs and competing approaches. An example from the Atherosclerosis Risk in Communities (ARIC) study is provided for illustration. PMID:28771664

  13. Effective surveillance strategies following a potential classical Swine Fever incursion in a remote wild pig population in North-Western Australia.

    PubMed

    Leslie, E; Cowled, B; Graeme Garner, M; Toribio, J-A L M L; Ward, M P

    2014-10-01

    Early disease detection and efficient methods of proving disease freedom can substantially improve the response to incursions of important transboundary animal diseases in previously free regions. We used a spatially explicit, stochastic disease spread model to simulate the spread of classical swine fever in wild pigs in a remote region of northern Australia and to assess the performance of disease surveillance strategies to detect infection at different time points and to delineate the size of the resulting outbreak. Although disease would likely be detected, simple random sampling was suboptimal. Radial and leapfrog sampling improved the effectiveness of surveillance at various stages of the simulated disease incursion. This work indicates that at earlier stages, radial sampling can reduce epidemic length and achieve faster outbreak delineation and control, but at later stages leapfrog sampling will outperform radial sampling in relation to supporting faster disease control with a less-extensive outbreak area. Due to the complexity of wildlife population dynamics and group behaviour, a targeted approach to surveillance needs to be implemented for the efficient use of resources and time. Using a more situation-based surveillance approach and accounting for disease distribution and the time period over which an epidemic has occurred is the best way to approach the selection of an appropriate surveillance strategy. © 2013 Blackwell Verlag GmbH.

  14. A low-rank matrix recovery approach for energy efficient EEG acquisition for a wireless body area network.

    PubMed

    Majumdar, Angshul; Gogna, Anupriya; Ward, Rabab

    2014-08-25

    We address the problem of acquiring and transmitting EEG signals in Wireless Body Area Networks (WBAN) in an energy efficient fashion. In WBANs, the energy is consumed by three operations: sensing (sampling), processing and transmission. Previous studies only addressed the problem of reducing the transmission energy. For the first time, in this work, we propose a technique to reduce sensing and processing energy as well: this is achieved by randomly under-sampling the EEG signal. We depart from previous Compressed Sensing based approaches and formulate signal recovery (from under-sampled measurements) as a matrix completion problem. A new algorithm to solve the matrix completion problem is derived here. We test our proposed method and find that the reconstruction accuracy of our method is significantly better than state-of-the-art techniques; and we achieve this while saving sensing, processing and transmission energy. Simple power analysis shows that our proposed methodology consumes considerably less power compared to previous CS based techniques.
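
    Matrix completion from randomly under-sampled entries can be illustrated with a basic singular value thresholding (soft-impute style) iteration. The sketch below (Python with numpy; a synthetic low-rank "EEG-like" matrix with an arbitrary rank, sampling rate, and shrinkage threshold; this is a generic algorithm, not the authors' method) alternates between an SVD shrinkage step and re-imposing the observed samples.

        import numpy as np

        rng = np.random.default_rng(4)

        # Synthetic low-rank "multi-channel signal" matrix (channels x time samples).
        channels, T, rank = 32, 400, 5
        M = rng.normal(size=(channels, rank)) @ rng.normal(size=(rank, T))

        # Random under-sampling: only 40% of the entries are actually acquired.
        mask = rng.random(M.shape) < 0.4
        Y = np.where(mask, M, 0.0)

        # Soft-impute / singular value thresholding iteration.
        X = np.zeros_like(M)
        tau = 5.0                                 # shrinkage threshold (tuning parameter)
        for _ in range(200):
            # Keep the acquired samples, fill unobserved entries with the current estimate.
            Z = np.where(mask, Y, X)
            U, s, Vt = np.linalg.svd(Z, full_matrices=False)
            X = (U * np.maximum(s - tau, 0.0)) @ Vt   # soft-threshold the singular values

        rel_err = np.linalg.norm((X - M)[~mask]) / np.linalg.norm(M[~mask])
        print("Relative error on unobserved entries:", round(rel_err, 4))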

  15. Stalkers and harassers of British royalty: an exploration of proxy behaviours for violence.

    PubMed

    James, David V; Mullen, Paul E; Meloy, J Reid; Pathé, Michele T; Preston, Lulu; Darnley, Brian; Farnham, Frank R; Scalora, Mario J

    2011-01-01

    Study of risk factors for violence to prominent people is difficult because of low base rates. This study of harassers of the royal family examined factors suggested in the literature as proxies for violence--breaching security barriers, achieving proximity, approach with a weapon, and approach with homicidal ideation. A stratified sample of different types of approach behaviour was randomly extracted from 2,332 Royalty Protection Police files, which had been divided into behavioural types. The final sample size was 275. Significant differences in illness symptomatology and motivation were found for each proxy group. Querulants were significantly over-represented in three of the four groups. There was generally little overlap between the proxy groups. There is no evidence of the proxy items examined being part of a "pathway to violence". Different motivations may be associated with different patterns of risk. Risk assessment must incorporate knowledge of the interactions between motivation, mental state, and behaviour. Copyright © 2010 John Wiley & Sons, Ltd.

  16. Geographic Information Systems to Assess External Validity in Randomized Trials.

    PubMed

    Savoca, Margaret R; Ludwig, David A; Jones, Stedman T; Jason Clodfelter, K; Sloop, Joseph B; Bollhalter, Linda Y; Bertoni, Alain G

    2017-08-01

    To support claims that RCTs can reduce health disparities (i.e., are translational), it is imperative that methodologies exist to evaluate the tenability of external validity in RCTs when probabilistic sampling of participants is not employed. Typically, attempts at establishing post hoc external validity are limited to a few comparisons across convenience variables, which must be available in both sample and population. A Type 2 diabetes RCT was used as an example of a method that uses a geographic information system to assess external validity in the absence of a priori probabilistic community-wide diabetes risk sampling strategy. A geographic information system, 2009-2013 county death certificate records, and 2013-2014 electronic medical records were used to identify community-wide diabetes prevalence. Color-coded diabetes density maps provided visual representation of these densities. Chi-square goodness of fit statistic/analysis tested the degree to which distribution of RCT participants varied across density classes compared to what would be expected, given simple random sampling of the county population. Analyses were conducted in 2016. Diabetes prevalence areas as represented by death certificate and electronic medical records were distributed similarly. The simple random sample model was not a good fit for death certificate record (chi-square, 17.63; p=0.0001) and electronic medical record data (chi-square, 28.92; p<0.0001). Generally, RCT participants were oversampled in high-diabetes density areas. Location is a highly reliable "principal variable" associated with health disparities. It serves as a directly measurable proxy for high-risk underserved communities, thus offering an effective and practical approach for examining external validity of RCTs. Copyright © 2017 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
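
    The chi-square goodness of fit comparison described above, observed counts of trial participants across diabetes-density classes versus the counts expected under simple random sampling of the county, takes a few lines with scipy. A minimal sketch with invented counts and shares:

        from scipy.stats import chisquare

        # Hypothetical counts of RCT participants falling in low / medium / high
        # diabetes-density areas, and the county population share of each class.
        observed = [45, 90, 165]                   # trial participants per density class
        population_share = [0.45, 0.35, 0.20]      # share of county residents per class

        n = sum(observed)
        expected = [n * p for p in population_share]

        stat, p_value = chisquare(f_obs=observed, f_exp=expected)
        print(f"Chi-square = {stat:.2f}, p = {p_value:.2g}")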

  17. Treating stimuli as a random factor in social psychology: a new and comprehensive solution to a pervasive but largely ignored problem.

    PubMed

    Judd, Charles M; Westfall, Jacob; Kenny, David A

    2012-07-01

    Throughout social and cognitive psychology, participants are routinely asked to respond in some way to experimental stimuli that are thought to represent categories of theoretical interest. For instance, in measures of implicit attitudes, participants are primed with pictures of specific African American and White stimulus persons sampled in some way from possible stimuli that might have been used. Yet seldom is the sampling of stimuli taken into account in the analysis of the resulting data, in spite of numerous warnings about the perils of ignoring stimulus variation (Clark, 1973; Kenny, 1985; Wells & Windschitl, 1999). Part of this failure to attend to stimulus variation is due to the demands imposed by traditional analysis of variance procedures for the analysis of data when both participants and stimuli are treated as random factors. In this article, we present a comprehensive solution using mixed models for the analysis of data with crossed random factors (e.g., participants and stimuli). We show the substantial biases inherent in analyses that ignore one or the other of the random factors, and we illustrate the substantial advantages of the mixed models approach with both hypothetical and actual, well-known data sets in social psychology (Bem, 2011; Blair, Chapleau, & Judd, 2005; Correll, Park, Judd, & Wittenbrink, 2002). PsycINFO Database Record (c) 2012 APA, all rights reserved

  18. A random spatial sampling method in a rural developing nation

    Treesearch

    Michelle C. Kondo; Kent D.W. Bream; Frances K. Barg; Charles C. Branas

    2014-01-01

    Nonrandom sampling of populations in developing nations has limitations and can inaccurately estimate health phenomena, especially among hard-to-reach populations such as rural residents. However, random sampling of rural populations in developing nations can be challenged by incomplete enumeration of the base population. We describe a stratified random sampling method...
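
    A simple form of stratified spatial random sampling, dividing the study area into grid cells (strata) and drawing random point locations within each cell, can be sketched as follows (Python with numpy; the study-area extent, grid, and points-per-cell values are arbitrary and this is only a generic illustration, not the authors' field protocol).

        import numpy as np

        rng = np.random.default_rng(9)

        # Hypothetical rectangular study area (units could be metres or degrees).
        xmin, xmax, ymin, ymax = 0.0, 10.0, 0.0, 6.0
        nx, ny = 5, 3                 # grid of 5 x 3 spatial strata
        points_per_cell = 2           # sample size within each stratum

        xs = np.linspace(xmin, xmax, nx + 1)
        ys = np.linspace(ymin, ymax, ny + 1)

        samples = []
        for i in range(nx):
            for j in range(ny):
                # Uniform random locations within this grid cell.
                px = rng.uniform(xs[i], xs[i + 1], points_per_cell)
                py = rng.uniform(ys[j], ys[j + 1], points_per_cell)
                samples.append(np.column_stack([px, py]))
        samples = np.vstack(samples)

        print("Total sampled locations:", len(samples))
        print(samples[:5])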

  19. Reduction of display artifacts by random sampling

    NASA Technical Reports Server (NTRS)

    Ahumada, A. J., Jr.; Nagel, D. C.; Watson, A. B.; Yellott, J. I., Jr.

    1983-01-01

    The application of random-sampling techniques to remove visible artifacts (such as flicker, moire patterns, and paradoxical motion) introduced in TV-type displays by discrete sequential scanning is discussed and demonstrated. Sequential-scanning artifacts are described; the window of visibility defined in spatiotemporal frequency space by Watson and Ahumada (1982 and 1983) and Watson et al. (1983) is explained; the basic principles of random sampling are reviewed and illustrated by the case of the human retina; and it is proposed that the sampling artifacts can be replaced by random noise, which can then be shifted to frequency-space regions outside the window of visibility. Vertical sequential, single-random-sequence, and continuously renewed random-sequence plotting displays generating 128 points at update rates up to 130 Hz are applied to images of stationary and moving lines, and best results are obtained with the single random sequence for the stationary lines and with the renewed random sequence for the moving lines.

  20. Implementing an Accurate and Rapid Sparse Sampling Approach for Low-Dose Atomic Resolution STEM Imaging

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kovarik, Libor; Stevens, Andrew J.; Liyu, Andrey V.

    Aberration correction for scanning transmission electron microscopes (STEM) has dramatically increased spatial image resolution for beam-stable materials, but it is the sample stability rather than the microscope that often limits the practical resolution of STEM images. To extract physical information from images of beam sensitive materials it is becoming clear that there is a critical dose/dose-rate below which the images can be interpreted as representative of the pristine material, while above it the observation is dominated by beam effects. Here we describe an experimental approach for sparse sampling in the STEM and in-painting image reconstruction in order to reduce the electron dose/dose-rate to the sample during imaging. By characterizing the induction limited rise-time and hysteresis in scan coils, we show that sparse line-hopping approach to scan randomization can be implemented that optimizes both the speed of the scan and the amount of the sample that needs to be illuminated by the beam. The dose and acquisition time for the sparse sampling is shown to be effectively decreased by factor of 5x relative to conventional acquisition, permitting imaging of beam sensitive materials to be obtained without changing the microscope operating parameters. As a result, the use of sparse line-hopping scan to acquire STEM images is demonstrated with atomic resolution aberration corrected Z-contrast images of CaCO3, a material that is traditionally difficult to image by TEM/STEM because of dose issues.

  1. Implementing an Accurate and Rapid Sparse Sampling Approach for Low-Dose Atomic Resolution STEM Imaging

    DOE PAGES

    Kovarik, Libor; Stevens, Andrew J.; Liyu, Andrey V.; ...

    2016-10-17

    Aberration correction for scanning transmission electron microscopes (STEM) has dramatically increased spatial image resolution for beam-stable materials, but it is the sample stability rather than the microscope that often limits the practical resolution of STEM images. To extract physical information from images of beam-sensitive materials it is becoming clear that there is a critical dose/dose-rate below which the images can be interpreted as representative of the pristine material, while above it the observation is dominated by beam effects. Here we describe an experimental approach for sparse sampling in the STEM and in-painting image reconstruction in order to reduce the electron dose/dose-rate to the sample during imaging. By characterizing the induction-limited rise-time and hysteresis in scan coils, we show that a sparse line-hopping approach to scan randomization can be implemented that optimizes both the speed of the scan and the amount of the sample that needs to be illuminated by the beam. The dose and acquisition time for the sparse sampling is shown to be effectively decreased by a factor of 5 relative to conventional acquisition, permitting imaging of beam-sensitive materials without changing the microscope operating parameters. The use of a sparse line-hopping scan to acquire STEM images is demonstrated with atomic resolution aberration-corrected Z-contrast images of CaCO3, a material that is traditionally difficult to image by TEM/STEM because of dose issues.
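
    The line-hopping idea can be illustrated with a short sketch (not the authors' implementation): randomly select a subset of scan lines covering roughly 20% of the field of view, the kind of mask that an in-painting reconstruction would then fill in. The 20% fraction and the 512 x 512 frame size are illustrative assumptions.

    ```python
    import numpy as np

    def sparse_line_mask(n_rows, n_cols, fraction=0.2, rng=None):
        """Boolean mask that keeps a random ~`fraction` of whole scan lines.

        This mimics the spirit of a line-hopping scan: rows are either visited
        or skipped entirely, so the beam never dwells on most of the sample.
        """
        rng = np.random.default_rng() if rng is None else rng
        n_keep = max(1, int(round(fraction * n_rows)))
        keep_rows = rng.choice(n_rows, size=n_keep, replace=False)
        mask = np.zeros((n_rows, n_cols), dtype=bool)
        mask[np.sort(keep_rows), :] = True
        return mask

    mask = sparse_line_mask(512, 512, fraction=0.2, rng=np.random.default_rng(0))
    print(f"illuminated fraction: {mask.mean():.2f}")   # ~0.20, i.e. ~5x dose reduction
    ```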

  2. FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.

    PubMed

    El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant

    2016-01-01

    A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational effort needed for generating PSSMs severely limits the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that randomly sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile, the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data, and the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences yet accepts hundreds of protein sequences per submission. Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein-DNA interfaces.
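
    A minimal sketch of the uniform-random database reduction just described (the file paths, the FASTA parser, and the 1% fraction are illustrative assumptions, not the FastRNABindR pipeline itself): each record is kept independently with probability 0.01, which is convenient when the source database, such as UniRef100, is too large to hold in memory.

    ```python
    import random

    def iter_fasta(path):
        """Yield (header, sequence) records from a FASTA file, one at a time."""
        header, seq = None, []
        with open(path) as fh:
            for line in fh:
                line = line.rstrip()
                if line.startswith(">"):
                    if header is not None:
                        yield header, "".join(seq)
                    header, seq = line, []
                elif line:
                    seq.append(line)
            if header is not None:
                yield header, "".join(seq)

    def sample_fasta(src, dst, fraction=0.01, seed=42):
        """Copy each record of `src` to `dst` independently with probability `fraction`."""
        rng = random.Random(seed)
        kept = 0
        with open(dst, "w") as out:
            for header, seq in iter_fasta(src):
                if rng.random() < fraction:
                    out.write(f"{header}\n{seq}\n")
                    kept += 1
        return kept

    # Example call (paths are placeholders):
    # n_kept = sample_fasta("uniref100.fasta", "uniref100_1pct.fasta", fraction=0.01)
    ```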

  3. Evaluation of bias and logistics in a survey of adults at increased risk for oral health decrements.

    PubMed

    Gilbert, G H; Duncan, R P; Kulley, A M; Coward, R T; Heft, M W

    1997-01-01

    Designing research to include sufficient respondents in groups at highest risk for oral health decrements can present unique challenges. Our purpose was to evaluate bias and logistics in this survey of adults at increased risk for oral health decrements. We used a telephone survey methodology that employed both listed numbers and random digit dialing to identify dentate persons 45 years old or older and to oversample blacks, poor persons, and residents of nonmetropolitan counties. At a second stage, a subsample of the respondents to the initial telephone screening was selected for further study, which consisted of a baseline in-person interview and a clinical examination. We assessed bias due to: (1) limiting the sample to households with telephones, (2) using predominantly listed numbers instead of random digit dialing, and (3) nonresponse at two stages of data collection. While this approach apparently created some biases in the sample, they were small in magnitude. Specifically, limiting the sample to households with telephones biased the sample overall toward more females, larger households, and fewer functionally impaired persons. Using predominantly listed numbers led to a modest bias toward selection of persons more likely to be younger, healthier, female, have had a recent dental visit, and reside in smaller households. Blacks who were selected randomly at a second stage were more likely to participate in baseline data gathering than their white counterparts. Comparisons of the data obtained in this survey with those from recent national surveys suggest that this methodology for sampling high-risk groups did not substantively bias the sample with respect to two important dental parameters, prevalence of edentulousness and dental care use, nor were conclusions about multivariate associations with dental care recency substantively affected. This method of sampling persons at high risk for oral health decrements resulted in only modest bias with respect to the population of interest.

  4. Social network recruitment for Yo Puedo: an innovative sexual health intervention in an underserved urban neighborhood—sample and design implications.

    PubMed

    Minnis, Alexandra M; vanDommelen-Gonzalez, Evan; Luecke, Ellen; Cheng, Helen; Dow, William; Bautista-Arredondo, Sergio; Padian, Nancy S

    2015-02-01

    Most existing evidence-based sexual health interventions focus on individual-level behavior, even though there is substantial evidence that highlights the influential role of social environments in shaping adolescents' behaviors and reproductive health outcomes. We developed Yo Puedo, a combined conditional cash transfer and life skills intervention for youth to promote educational attainment, job training, and reproductive health wellness that we then evaluated for feasibility among 162 youth aged 16-21 years in a predominantly Latino community in San Francisco, CA. The intervention targeted youth's social networks and involved recruitment and randomization of small social network clusters. In this paper we describe the design of the feasibility study and report participants' baseline characteristics. Furthermore, we examined the sample and design implications of recruiting social network clusters as the unit of randomization. Baseline data provide evidence that we successfully enrolled high risk youth using a social network recruitment approach in community and school-based settings. Nearly all participants (95%) were high risk for adverse educational and reproductive health outcomes based on multiple measures of low socioeconomic status (81%) and/or reported high risk behaviors (e.g., gang affiliation, past pregnancy, recent unprotected sex, frequent substance use; 62%). We achieved variability in the study sample through heterogeneity in recruitment of the index participants, whereas the individuals within the small social networks of close friends demonstrated substantial homogeneity across sociodemographic and risk profile characteristics. Social networks recruitment was feasible and yielded a sample of high risk youth willing to enroll in a randomized study to evaluate a novel sexual health intervention.

  5. The positive deviance/hearth approach to reducing child malnutrition: systematic review.

    PubMed

    Bisits Bullen, Piroska A

    2011-11-01

    The Positive Deviance/Hearth approach aims to rehabilitate malnourished children using practices from mothers in the community who have well-nourished children despite living in poverty. This study assesses its effectiveness in a range of settings. Systematic review of peer reviewed intervention trials and grey literature evaluation reports of child malnutrition programs using the Positive Deviance/Hearth approach. Ten peer reviewed studies and 14 grey literature reports met the inclusion criteria. These described results for 17 unique Positive Deviance/Hearth programs in 12 countries. Nine programs used a pre- and post-test design without a control, which limited the conclusions that could be drawn. Eight used more robust designs such as non-randomized trials, non-randomized cross-sectional sibling studies and randomized controlled trials (RCTs). Of the eight programs that reported nutritional outcomes, five reported some type of positive result in terms of nutritional status - although the improvement was not always as large as predicted, or across the entire target population. Both RCTs demonstrated improvements in carer feeding practices. Qualitative results unanimously reported high levels of satisfaction from participants and recipient communities. Overall this study shows mixed results in terms of program effectiveness, although some Positive Deviance/Hearth programs have clearly been successful in particular settings. Sibling studies suggest that the Positive Deviance/Hearth approach may have a role in preventing malnutrition, not just rehabilitation. Further research is needed using more robust study designs and larger sample sizes. Issues related to community participation and consistency in reporting results need to be addressed. © 2011 Blackwell Publishing Ltd.

  6. Optimal estimation for discrete time jump processes

    NASA Technical Reports Server (NTRS)

    Vaca, M. V.; Tretter, S. A.

    1977-01-01

    Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are obtained. The approach is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. A general representation for optimum estimates and recursive equations for minimum mean squared error (MMSE) estimates are obtained. MMSE estimates are nonlinear functions of the observations. The problem of estimating the rate of a DTJP is considered for the case in which the rate is a random variable with a probability density function of the form cx^k(1-x)^m (a beta density), and it is shown that the MMSE estimates are linear in this case. This class of density functions explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.

  7. Optimal estimation for discrete time jump processes

    NASA Technical Reports Server (NTRS)

    Vaca, M. V.; Tretter, S. A.

    1978-01-01

    Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are derived. The approach used is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. Thus a general representation is obtained for optimum estimates, and recursive equations are derived for minimum mean-squared error (MMSE) estimates. In general, MMSE estimates are nonlinear functions of the observations. The problem is considered of estimating the rate of a DTJP when the rate is a random variable with a beta probability density function and the jump amplitudes are binomially distributed. It is shown that the MMSE estimates are linear. The class of beta density functions is rather rich and explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.
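
    The linearity result can be checked numerically in a simplified conjugate setting (an illustration consistent with, but not taken from, the paper): if the rate has a Beta(a, b) prior and the observed jump count in n trials is binomial in that rate, the posterior-mean (MMSE) estimate is (a + k)/(a + b + n), an affine function of the observed count k.

    ```python
    import numpy as np

    def mmse_rate_estimate(k, n, a=2.0, b=3.0):
        """Posterior mean of a Beta(a, b)-distributed rate after k jumps in n trials.

        For the conjugate Beta prior / binomial likelihood the MMSE estimate is
        (a + k) / (a + b + n), i.e. linear (affine) in the observation k.
        """
        return (a + k) / (a + b + n)

    n = 20
    ks = np.arange(n + 1)
    estimates = mmse_rate_estimate(ks, n)
    # Constant increments confirm linearity of the estimate in the observed count.
    print(np.allclose(np.diff(estimates), 1.0 / (2.0 + 3.0 + n)))   # True
    ```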

  8. Subrandom methods for multidimensional nonuniform sampling.

    PubMed

    Worley, Bradley

    2016-08-01

    Methods of nonuniform sampling that utilize pseudorandom number sequences to select points from a weighted Nyquist grid are commonplace in biomolecular NMR studies, due to the beneficial incoherence introduced by pseudorandom sampling. However, these methods require the specification of a non-arbitrary seed number in order to initialize a pseudorandom number generator. Because the performance of pseudorandom sampling schedules can substantially vary based on seed number, this can complicate the task of routine data collection. Approaches such as jittered sampling and stochastic gap sampling are effective at reducing random seed dependence of nonuniform sampling schedules, but still require the specification of a seed number. This work formalizes the use of subrandom number sequences in nonuniform sampling as a means of seed-independent sampling, and compares the performance of three subrandom methods to their pseudorandom counterparts using commonly applied schedule performance metrics. Reconstruction results using experimental datasets are also provided to validate claims made using these performance metrics. Copyright © 2016 Elsevier Inc. All rights reserved.
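
    As a concrete, deliberately simplified illustration of seed-free scheduling, the additive golden-ratio sequence below fills the unit interval with low discrepancy and is pushed through the cumulative weights of an exponentially weighted grid; it is one of several subrandom constructions and not necessarily the one evaluated in the paper (the grid size, decay and sample count are arbitrary).

    ```python
    import numpy as np

    def golden_ratio_sequence(n):
        """Subrandom (low-discrepancy) points in [0, 1): frac(k * (sqrt(5) - 1) / 2)."""
        phi_conj = (np.sqrt(5.0) - 1.0) / 2.0
        return np.mod(np.arange(1, n + 1) * phi_conj, 1.0)

    def subrandom_schedule(grid_size, n_samples, decay=2.0):
        """Pick `n_samples` distinct grid indices, biased toward early grid points."""
        i = np.arange(grid_size)
        weights = np.exp(-decay * i / grid_size)        # exponentially weighted Nyquist grid
        cdf = np.cumsum(weights) / np.sum(weights)
        chosen, seen = [], set()
        for u in golden_ratio_sequence(50 * grid_size): # plenty of candidate points
            j = int(np.searchsorted(cdf, u))
            if j not in seen:
                seen.add(j)
                chosen.append(j)
            if len(chosen) == n_samples:
                break
        return np.array(sorted(chosen))

    print(subrandom_schedule(grid_size=128, n_samples=32))
    ```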

  9. A Random Finite Set Approach to Space Junk Tracking and Identification

    DTIC Science & Technology

    2014-09-03

    Final report covering 31 January 2013 to 29 April 2014, under contract FA2386-13... Title: A Random Finite Set Approach to Space Junk Tracking and Identification. Authors: Ba-Ngu Vo and Ba-Tuong Vo. No abstract text is available beyond the report documentation page.

  10. Honest Importance Sampling with Multiple Markov Chains

    PubMed Central

    Tan, Aixin; Doss, Hani; Hobert, James P.

    2017-01-01

    Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection. PMID:28701855

  11. Honest Importance Sampling with Multiple Markov Chains.

    PubMed

    Tan, Aixin; Doss, Hani; Hobert, James P

    2015-01-01

    Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection.
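
    The iid version of the estimator and its CLT-based standard error take only a few lines (a generic textbook sketch, not the authors' regenerative multiple-chain machinery; the target, proposal and test function below are arbitrary choices): draw from π1, reweight by π/π1, and use the sample standard deviation of the weighted terms for a standard error.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)

    # Target pi: N(2, 1); proposal pi_1: N(0, 3^2); estimate E_pi[h(X)] with h(x) = x**2.
    n = 100_000
    x = rng.normal(loc=0.0, scale=3.0, size=n)            # random sample from pi_1
    weights = norm.pdf(x, loc=2.0, scale=1.0) / norm.pdf(x, loc=0.0, scale=3.0)
    terms = weights * x**2                                # h(x) * pi(x) / pi_1(x)

    estimate = terms.mean()                               # strongly consistent estimator
    std_err = terms.std(ddof=1) / np.sqrt(n)              # CLT-based standard error
    print(f"E_pi[X^2] ~= {estimate:.3f} +/- {1.96 * std_err:.3f}  (exact value: 5.0)")
    ```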

  12. VIGAN: Missing View Imputation with Generative Adversarial Networks.

    PubMed

    Shang, Chao; Palmer, Aaron; Sun, Jiangwen; Chen, Ko-Shin; Lu, Jin; Bi, Jinbo

    2017-01-01

    In an era when big data are becoming the norm, there is less concern with the quantity of data and more with their quality and completeness. In many disciplines, data are collected from heterogeneous sources, resulting in multi-view or multi-modal datasets. The missing data problem has been challenging to address in multi-view data analysis. In particular, when certain samples are missing an entire view of the data, this creates the missing view problem. Classic multiple imputation or matrix completion methods are of little use here, because there is no information in the missing view on which to base the imputation for such samples. The commonly used simple method of removing samples with a missing view can dramatically reduce sample size, thus diminishing the statistical power of a subsequent analysis. In this paper, we propose a novel approach for view imputation via generative adversarial networks (GANs), which we name VIGAN. This approach first treats each view as a separate domain and identifies domain-to-domain mappings via a GAN using randomly sampled data from each view, and then employs a multi-modal denoising autoencoder (DAE) to reconstruct the missing view from the GAN outputs based on paired data across the views. By optimizing the GAN and DAE jointly, our model integrates the knowledge of domain mappings and view correspondences to effectively recover the missing view. Empirical results on benchmark datasets validate the VIGAN approach by comparing against the state of the art. The evaluation of VIGAN in a genetic study of substance use disorders further demonstrates the effectiveness and usability of this approach in the life sciences.

  13. Revisiting sample size: are big trials the answer?

    PubMed

    Lurati Buse, Giovanna A L; Botto, Fernando; Devereaux, P J

    2012-07-18

    The superiority of the evidence generated in randomized controlled trials over observational data is not only conditional to randomization. Randomized controlled trials require proper design and implementation to provide a reliable effect estimate. Adequate random sequence generation, allocation implementation, analyses based on the intention-to-treat principle, and sufficient power are crucial to the quality of a randomized controlled trial. Power, or the probability of the trial to detect a difference when a real difference between treatments exists, strongly depends on sample size. The quality of orthopaedic randomized controlled trials is frequently threatened by a limited sample size. This paper reviews basic concepts and pitfalls in sample-size estimation and focuses on the importance of large trials in the generation of valid evidence.
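
    As a concrete reminder of how strongly power depends on sample size, the standard normal-approximation calculation for comparing two proportions is sketched below (a generic formula, not taken from the paper; the 15% versus 10% event rates are made-up inputs).

    ```python
    from math import ceil
    from statistics import NormalDist

    def n_per_group(p1, p2, alpha=0.05, power=0.80):
        """Approximate per-group sample size for a two-sided test of two proportions."""
        z = NormalDist()
        z_alpha = z.inv_cdf(1 - alpha / 2)
        z_beta = z.inv_cdf(power)
        variance = p1 * (1 - p1) + p2 * (1 - p2)
        return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

    # Detecting a drop in event rate from 15% to 10% with 80% power and alpha = 0.05
    print(n_per_group(0.15, 0.10))   # about 683 patients per arm
    ```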

  14. Behavioral parent training to address sleep disturbances in young children with autism spectrum disorder: a pilot trial

    PubMed Central

    Johnson, Cynthia R.; Turner, Kylan S.; Foldes, Emily; Brooks, Maria M.; Kronk, Rebecca; Wiggs, Luci

    2013-01-01

    Objectives A large percentage of children with autism spectrum disorders (ASD) have bedtime and sleep disturbances. However, the treatment of these disturbances has been understudied. The purpose of our study was to develop a manualized behavioral parent training (BPT) program for parents of young children with ASD and sleep disturbances and to test the feasibility, fidelity, and initial efficacy of the treatment in a small randomized controlled trial (RCT). Participants and methods Parents of a sample of 40 young children diagnosed with ASD with an average age of 3.5 years were enrolled in our study. Participants were randomized to either the BPT program group or a comparison group who were given nonsleep-related parent education. Each was individually administered a 5-session program delivered over the 8-week study. Outcome measures of feasibility, fidelity, and efficacy were collected at weeks 4 and 8 after the baseline time point. Children’s sleep was assessed by parent report and objectively by actigraphy. Results Of the 20 participants in each group, data were available for 15 participants randomized to BPT and 18 participants randomized to the comparison condition. Results supported the feasibility of the manualized parent training program and the comparison program. Treatment fidelity was high for both groups. The BPT program group significantly improved more than the comparison group based on the primary sleep outcome of parent report. There were no objective changes in sleep detected by actigraphy. Conclusions Our study is one of few RCTs of a BPT program to specifically target sleep disturbances in a well-characterized sample of young children with ASD and to demonstrate the feasibility of the approach. Initial efficacy favored the BPT program over the comparison group and suggested that this manualized parent training approach is worthy of further examination of the efficacy within a larger RCT. PMID:23993773

  15. Behavioral parent training to address sleep disturbances in young children with autism spectrum disorder: a pilot trial.

    PubMed

    Johnson, Cynthia R; Turner, Kylan S; Foldes, Emily; Brooks, Maria M; Kronk, Rebecca; Wiggs, Luci

    2013-10-01

    A large percentage of children with autism spectrum disorders (ASD) have bedtime and sleep disturbances. However, the treatment of these disturbances has been understudied. The purpose of our study was to develop a manualized behavioral parent training (BPT) program for parents of young children with ASD and sleep disturbances and to test the feasibility, fidelity, and initial efficacy of the treatment in a small randomized controlled trial (RCT). Parents of a sample of 40 young children diagnosed with ASD with an average age of 3.5 years were enrolled in our study. Participants were randomized to either the BPT program group or a comparison group who were given nonsleep-related parent education. Each participant was individually administered a 5-session program delivered over the 8-week study. Outcome measures of feasibility, fidelity, and efficacy were collected at weeks 4 and 8 after the baseline time point. Children's sleep was assessed by parent report and objectively by actigraphy. Of the 20 participants in each group, data were available for 15 participants randomized to BPT and 18 participants randomized to the comparison condition. Results supported the feasibility of the manualized parent training program and the comparison program. Treatment fidelity was high for both groups. The BPT program group significantly improved more than the comparison group based on the primary sleep outcome of parent report. There were no objective changes in sleep detected by actigraphy. Our study is one of few RCTs of a BPT program to specifically target sleep disturbances in a well-characterized sample of young children with ASD and to demonstrate the feasibility of the approach. Initial efficacy favored the BPT program over the comparison group and suggested that this manualized parent training approach is worthy of further examination of the efficacy within a larger RCT. Copyright © 2013 Elsevier B.V. All rights reserved.

  16. A Comparison of Techniques for Scheduling Earth-Observing Satellites

    NASA Technical Reports Server (NTRS)

    Globus, Al; Crawford, James; Lohn, Jason; Pryor, Anna

    2004-01-01

    Scheduling observations by coordinated fleets of Earth Observing Satellites (EOS) involves large search spaces, complex constraints and poorly understood bottlenecks, conditions where evolutionary and related algorithms are often effective. However, there are many such algorithms and the best one to use is not clear. Here we compare multiple variants of the genetic algorithm: stochastic hill climbing, simulated annealing, squeaky wheel optimization and iterated sampling on ten realistically-sized EOS scheduling problems. Schedules are represented by a permutation (non-temporal ordering) of the observation requests. A simple deterministic scheduler assigns times and resources to each observation request in the order indicated by the permutation, discarding those that violate the constraints created by previously scheduled observations. Simulated annealing performs best. Random mutation outperforms a more 'intelligent' mutator. Furthermore, the best mutator, by a small margin, was a novel approach we call temperature dependent random sampling that makes large changes in the early stages of evolution and smaller changes towards the end of search.
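
    The "temperature dependent random sampling" mutator can be sketched as follows (an interpretation of the description above, with the linear cooling schedule and swap fraction chosen arbitrarily): early in the search many positions of the permutation are reshuffled, and the perturbation shrinks as the run progresses.

    ```python
    import random

    def temperature_mutate(perm, progress, max_frac=0.5, rng=random):
        """Mutate a permutation by reshuffling a temperature-dependent subset of it.

        `progress` runs from 0.0 (start of search, big changes) to 1.0 (end of
        search, small changes); the number of disturbed positions decays linearly.
        """
        n = len(perm)
        n_moves = max(2, int(round(max_frac * n * (1.0 - progress))))
        positions = rng.sample(range(n), n_moves)
        values = [perm[i] for i in positions]
        rng.shuffle(values)
        child = list(perm)
        for i, v in zip(positions, values):
            child[i] = v
        return child

    requests = list(range(20))           # observation requests, identified by index
    print(temperature_mutate(requests, progress=0.1))   # large perturbation early on
    print(temperature_mutate(requests, progress=0.9))   # small perturbation near the end
    ```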

  17. Video-feedback intervention increases sensitive parenting in ethnic minority mothers: a randomized control trial.

    PubMed

    Yagmur, Sengul; Mesman, Judi; Malda, Maike; Bakermans-Kranenburg, Marian J; Ekmekci, Hatice

    2014-01-01

    Using a randomized control trial design we tested the effectiveness of a culturally sensitive adaptation of the Video-feedback Intervention to promote Positive Parenting and Sensitive Discipline (VIPP-SD) in a sample of 76 Turkish minority families in the Netherlands. The VIPP-SD was adapted based on a pilot with feedback of the target mothers, resulting in the VIPP-TM (VIPP-Turkish Minorities). The sample included families with 20-47-month-old children with high levels of externalizing problems. Maternal sensitivity, nonintrusiveness, and discipline strategies were observed during pretest and posttest home visits. The VIPP-TM was effective in increasing maternal sensitivity and nonintrusiveness, but not in enhancing discipline strategies. Applying newly learned sensitivity skills in discipline situations may take more time, especially in a cultural context that favors more authoritarian strategies. We conclude that the VIPP-SD program and its video-feedback approach can be successfully applied in immigrant families with a non-Western cultural background, with demonstrated effects on parenting sensitivity and nonintrusiveness.

  18. Value stream mapping of the Pap test processing procedure: a lean approach to improve quality and efficiency.

    PubMed

    Michael, Claire W; Naik, Kalyani; McVicker, Michael

    2013-05-01

    We developed a value stream map (VSM) of the Papanicolaou test procedure to identify opportunities to reduce waste and errors, created a new VSM, and implemented a new process emphasizing Lean tools. Preimplementation data revealed the following: (1) processing time (PT) for 1,140 samples averaged 54 hours; (2) 27 accessioning errors were detected on review of 357 random requisitions (7.6%); (3) 5 of the 20,060 tests had labeling errors that had gone undetected in the processing stage. Four were detected later during specimen processing but 1 reached the reporting stage. Postimplementation data were as follows: (1) PT for 1,355 samples averaged 31 hours; (2) 17 accessioning errors were detected on review of 385 random requisitions (4.4%); and (3) no labeling errors were undetected. Our results demonstrate that implementation of Lean methods, such as first-in first-out processes and minimizing batch size by staff actively participating in the improvement process, allows for higher quality, greater patient safety, and improved efficiency.

  19. Patient decision-making preference and physician decision-making style for contraceptive method choice in an Asian culture: does concordance matter?

    PubMed

    Alden, Dana Latham; Merz, Miwa Yamazaki; Thi, Le Minh

    2010-12-01

    This study investigates preferences for patient-physician decision-making in an emerging economy with an Asian culture. A survey of 445 randomly sampled women, aged 20-40 in Hanoi, Vietnam, revealed that pre-consultation attitudes were most positive toward a "shared" decision-making approach with the physician for contraceptive method choice. However, following random assignment to one of three vignettes (passive, shared or autonomous) featuring a young Vietnamese woman reaching a contraceptive method decision with her physician, preference was highest for the "autonomous" approach. Furthermore, discordance between pre-consultation preference for decision-making style and the physician's decision-making style negatively impacted evaluations under some but not all circumstances. This study demonstrates that, despite living in a hierarchic Asian culture, active participation in contraceptive method choice is desired by many urban Vietnamese women. However, there is variation on this dimension and adjusting the physician's style to be concordant with patient preference appears important to maximizing patient satisfaction.

  20. Environmental Health Practice: Statistically Based Performance Measurement

    PubMed Central

    Enander, Richard T.; Gagnon, Ronald N.; Hanumara, R. Choudary; Park, Eugene; Armstrong, Thomas; Gute, David M.

    2007-01-01

    Objectives. State environmental and health protection agencies have traditionally relied on a facility-by-facility inspection-enforcement paradigm to achieve compliance with government regulations. We evaluated the effectiveness of a new approach that uses a self-certification random sampling design. Methods. Comprehensive environmental and occupational health data from a 3-year statewide industry self-certification initiative were collected from representative automotive refinishing facilities located in Rhode Island. Statistical comparisons between baseline and postintervention data facilitated a quantitative evaluation of statewide performance. Results. The analysis of field data collected from 82 randomly selected automotive refinishing facilities showed statistically significant improvements (P<.05, Fisher exact test) in 4 major performance categories: occupational health and safety, air pollution control, hazardous waste management, and wastewater discharge. Statistical significance was also shown when a modified Bonferroni adjustment for multiple comparisons was performed. Conclusions. Our findings suggest that the new self-certification approach to environmental and worker protection is effective and can be used as an adjunct to further enhance state and federal enforcement programs. PMID:17267709
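
    The statistical comparison described above can be reproduced in outline with SciPy (the compliant/non-compliant counts below are invented placeholders, not the Rhode Island data): one Fisher exact test per performance category, with a Bonferroni-style adjustment of the significance threshold across the four comparisons.

    ```python
    from scipy.stats import fisher_exact

    # Hypothetical (compliant, non-compliant) counts at baseline and post-intervention
    categories = {
        "occupational health and safety": ((45, 37), (70, 12)),
        "air pollution control":          ((50, 32), (68, 14)),
        "hazardous waste management":     ((40, 42), (63, 19)),
        "wastewater discharge":           ((55, 27), (72, 10)),
    }

    alpha = 0.05
    alpha_adj = alpha / len(categories)          # Bonferroni-adjusted threshold

    for name, (baseline, post) in categories.items():
        table = [list(baseline), list(post)]     # 2x2 table: rows = period, cols = compliant / not
        _, p = fisher_exact(table)
        flag = "significant" if p < alpha_adj else "not significant"
        print(f"{name}: p = {p:.4f} ({flag} at adjusted alpha = {alpha_adj:.4f})")
    ```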

  1. A comparison of observation-level random effect and Beta-Binomial models for modelling overdispersion in Binomial data in ecology & evolution.

    PubMed

    Harrison, Xavier A

    2015-01-01

    Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed poorly when models contained <5 levels of the random intercept term, especially for estimating variance components, and this effect appeared independent of total sample size. These results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.

  2. The Self-Adapting Focused Review System. Probability sampling of medical records to monitor utilization and quality of care.

    PubMed

    Ash, A; Schwartz, M; Payne, S M; Restuccia, J D

    1990-11-01

    Medical record review is increasing in importance as the need to identify and monitor utilization and quality of care problems grow. To conserve resources, reviews are usually performed on a subset of cases. If judgment is used to identify subgroups for review, this raises the following questions: How should subgroups be determined, particularly since the locus of problems can change over time? What standard of comparison should be used in interpreting rates of problems found in subgroups? How can population problem rates be estimated from observed subgroup rates? How can the bias be avoided that arises because reviewers know that selected cases are suspected of having problems? How can changes in problem rates over time be interpreted when evaluating intervention programs? Simple random sampling, an alternative to subgroup review, overcomes the problems implied by these questions but is inefficient. The Self-Adapting Focused Review System (SAFRS), introduced and described here, provides an adaptive approach to record selection that is based upon model-weighted probability sampling. It retains the desirable inferential properties of random sampling while allowing reviews to be concentrated on cases currently thought most likely to be problematic. Model development and evaluation are illustrated using hospital data to predict inappropriate admissions.
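
    The core idea, concentrating review effort on high-risk records while keeping design-based inference valid, can be sketched with Poisson sampling and a Horvitz-Thompson estimator (a generic construction in the spirit of model-weighted probability sampling, not the actual SAFRS algorithm; the risk scores and review budget are simulated).

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    # Simulated population of admissions with a model-predicted risk of being problematic
    N = 5000
    risk = rng.beta(2, 8, size=N)                 # predicted problem probability per record
    truth = rng.random(N) < risk                  # unobserved "true" problem indicator

    # Poisson sampling: inclusion probability proportional to predicted risk,
    # scaled so that the expected number of reviewed charts equals `budget`.
    budget = 400
    pi = np.clip(budget * risk / risk.sum(), 1e-6, 1.0)
    sampled = rng.random(N) < pi

    # Horvitz-Thompson estimate of the population problem rate from reviewed charts only
    ht_total = np.sum(truth[sampled] / pi[sampled])
    print(f"reviewed {sampled.sum()} charts; "
          f"estimated problem rate = {ht_total / N:.3f}; true rate = {truth.mean():.3f}")
    ```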

  3. Frequency position modulation using multi-spectral projections

    NASA Astrophysics Data System (ADS)

    Goodman, Joel; Bertoncini, Crystal; Moore, Michael; Nousain, Bryan; Cowart, Gregory

    2012-10-01

    In this paper we present an approach to harness multi-spectral projections (MSPs) to carefully shape and locate tones in the spectrum, enabling a new and robust modulation in which a signal's discrete frequency support is used to represent symbols. This method, called Frequency Position Modulation (FPM), is an innovative extension to MT-FSK and OFDM and can be non-uniformly spread over many GHz of instantaneous bandwidth (IBW), resulting in a communications system that is difficult to intercept and jam. The FPM symbols are recovered using adaptive projections that in part employ an analog polynomial nonlinearity paired with an analog-to-digital converter (ADC) sampling at a rate that is only a fraction of the IBW of the signal. MSPs also facilitate using commercial off-the-shelf (COTS) ADCs with uniform sampling, standing in sharp contrast to random linear projections by random sampling, which requires a full Nyquist-rate sample-and-hold. Our novel communication system concept provides an order of magnitude improvement in processing gain over conventional LPI/LPD communications (e.g., FH- or DS-CDMA) and facilitates the ability to operate in interference-laden environments where conventional compressed sensing receivers would fail. We quantitatively analyze the bit error rate (BER) and processing gain (PG) for a maximum likelihood based FPM demodulator and demonstrate its performance in interference-laden conditions.

  4. Estimation of regionalized compositions: A comparison of three methods

    USGS Publications Warehouse

    Pawlowsky, V.; Olea, R.A.; Davis, J.C.

    1995-01-01

    A regionalized composition is a random vector function whose components are positive and sum to a constant at every point of the sampling region. Consequently, the components of a regionalized composition are necessarily spatially correlated. This spatial dependence, induced by the constant-sum constraint, is a spurious spatial correlation and may lead to misinterpretations of statistical analyses. Furthermore, the cross-covariance matrices of the regionalized composition are singular, as is the coefficient matrix of the cokriging system of equations. Three methods of performing estimation or prediction of a regionalized composition at unsampled points are discussed: (1) the direct approach of estimating each variable separately; (2) the basis method, which is applicable only when a random function is available that can be regarded as the size of the regionalized composition under study; (3) the logratio approach, using the additive-log-ratio transformation proposed by J. Aitchison, which allows statistical analysis of compositional data. We present a brief theoretical review of these three methods and compare them using compositional data from the Lyons West Oil Field in Kansas (USA). It is shown that, although there are no important numerical differences, the direct approach leads to invalid results, whereas the basis method and the additive-log-ratio approach are comparable. © 1995 International Association for Mathematical Geology.
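
    Aitchison's additive log-ratio (alr) transform and its inverse take only a few lines (a generic illustration; the cokriging machinery compared in the paper is not reproduced here): compositions are mapped to unconstrained log-ratios, analysed or interpolated in that space, and mapped back so that the parts are again positive and sum to one.

    ```python
    import numpy as np

    def alr(x):
        """Additive log-ratio transform; the last component is the reference part."""
        x = np.asarray(x, dtype=float)
        return np.log(x[..., :-1] / x[..., -1:])

    def alr_inverse(y):
        """Map log-ratios back to a composition that is positive and sums to 1."""
        y = np.asarray(y, dtype=float)
        expanded = np.concatenate([np.exp(y), np.ones(y.shape[:-1] + (1,))], axis=-1)
        return expanded / expanded.sum(axis=-1, keepdims=True)

    composition = np.array([0.60, 0.25, 0.15])       # e.g. three mineral fractions
    z = alr(composition)
    print(z)                                         # unconstrained values, safe to average or krige
    print(alr_inverse(z))                            # back to [0.60, 0.25, 0.15]
    ```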

  5. Using regression methods to estimate stream phosphorus loads at the Illinois River, Arkansas

    USGS Publications Warehouse

    Haggard, B.E.; Soerens, T.S.; Green, W.R.; Richards, R.P.

    2003-01-01

    The development of total maximum daily loads (TMDLs) requires evaluating existing constituent loads in streams. Accurate estimates of constituent loads are needed to calibrate watershed and reservoir models for TMDL development. The best approach to estimate constituent loads is high frequency sampling, particularly during storm events, and mass integration of constituents passing a point in a stream. Most often, resources are limited and discrete water quality samples are collected on fixed intervals and sometimes supplemented with directed sampling during storm events. When resources are limited, mass integration is not an accurate means to determine constituent loads and other load estimation techniques such as regression models are used. The objective of this work was to determine a minimum number of water-quality samples needed to provide constituent concentration data adequate to estimate constituent loads at a large stream. Twenty sets of water quality samples with and without supplemental storm samples were randomly selected at various fixed intervals from a database at the Illinois River, northwest Arkansas. The random sets were used to estimate total phosphorus (TP) loads using regression models. The regression-based annual TP loads were compared to the integrated annual TP load estimated using all the data. At a minimum, monthly sampling plus supplemental storm samples (six samples per year) was needed to produce a root mean square error of less than 15%. Water quality samples should be collected at least semi-monthly (every 15 days) in studies less than two years if seasonal time factors are to be used in the regression models. Annual TP loads estimated from independently collected discrete water quality samples further demonstrated the utility of using regression models to estimate annual TP loads in this stream system.
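
    A minimal version of the regression (rating-curve) approach reads as follows: a log-log model of concentration against discharge is fitted to the sampled days, retransformed with a standard smearing correction, and integrated over the daily discharge record. The discharge and concentration values below are synthetic, and the published studies use richer regressions with seasonal and time terms.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    # Synthetic daily discharge (m^3/s) for one year and a sparse set of sampled days
    q_daily = np.exp(rng.normal(2.0, 0.6, size=365))
    sampled_days = np.sort(rng.choice(365, size=24, replace=False))   # about 2 samples per month

    # Synthetic total phosphorus concentrations (mg/L) that increase with discharge
    c_sampled = np.exp(-3.0 + 0.7 * np.log(q_daily[sampled_days])
                       + rng.normal(0.0, 0.2, size=sampled_days.size))

    # Fit ln(C) = a + b * ln(Q) using the sampled days only
    b, a = np.polyfit(np.log(q_daily[sampled_days]), np.log(c_sampled), 1)
    resid = np.log(c_sampled) - (a + b * np.log(q_daily[sampled_days]))
    smearing = np.mean(np.exp(resid))        # Duan's correction for retransformation bias

    # Predict concentration for every day and integrate the load
    c_hat = smearing * np.exp(a + b * np.log(q_daily))
    daily_load_kg = c_hat * q_daily * 86.4   # (mg/L) * (m^3/s) * 86.4 = kg/day
    print(f"estimated annual TP load: {daily_load_kg.sum():.0f} kg")
    ```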

  6. Surface Warfare Junior Officer Retention; Spouses’ Influence on Career Decisions

    DTIC Science & Technology

    1981-08-01

    The survey instruments included, among other scales, a measure of self-esteem (Rosenberg, 1965). In November 1978, the survey questionnaire was mailed to a random sample of 691 male surface warfare... The report suggests that support services be made available to spouses with no loss in status or self-esteem, and that a separate branch of Family Services may be required if this approach is not successful.

  7. A Multidisciplinary Approach to Health Monitoring and Materials Damage Prognosis for Metallic Aerospace Systems

    DTIC Science & Technology

    2013-03-01

    Research goals included quantifying the effects of temperature on damage behavior and damage monitoring within a framework of orientation distribution functions and crack-induced texture. A measurement model was obtained from hidden Markov modeling (HMM) of joint time-frequency (TF) features extracted from PZT sensor signals recorded from a bolted aluminum plate; only about 20% of the samples of a signal were first randomly selected...

  8. Smoothing the redshift distributions of random samples for the baryon acoustic oscillations: applications to the SDSS-III BOSS DR12 and QPM mock samples

    NASA Astrophysics Data System (ADS)

    Wang, Shao-Jiang; Guo, Qi; Cai, Rong-Gen

    2017-12-01

    We investigate the impact of different redshift distributions of random samples on the baryon acoustic oscillations (BAO) measurements of D_V(z) r_d^fid/r_d from the two-point correlation functions of galaxies in the Data Release 12 of the Baryon Oscillation Spectroscopic Survey (BOSS). Big surveys, such as BOSS, usually assign redshifts to the random samples by randomly drawing values from the measured redshift distributions of the data, which would necessarily introduce fiducial signals of fluctuations into the random samples, weakening the signals of BAO, if the cosmic variance cannot be ignored. We propose a smooth function of redshift distribution that fits the data well to populate the random galaxy samples. The resulting cosmological parameters match the input parameters of the mock catalogue very well. The significance of BAO signals has been improved by 0.33σ for a low-redshift sample and by 0.03σ for a constant-stellar-mass sample, though the absolute values do not change significantly. Given the precision of current cosmological parameter measurements, this improvement will be valuable for future measurements of galaxy clustering.

  9. Differential expression analysis for RNAseq using Poisson mixed models

    PubMed Central

    Sun, Shiquan; Hood, Michelle; Scott, Laura; Peng, Qinke; Mukherjee, Sayan; Tung, Jenny

    2017-01-01

    Identifying differentially expressed (DE) genes from RNA sequencing (RNAseq) studies is among the most common analyses in genomics. However, RNAseq DE analysis presents several statistical and computational challenges, including over-dispersed read counts and, in some settings, sample non-independence. Previous count-based methods rely on simple hierarchical Poisson models (e.g. negative binomial) to model independent over-dispersion, but do not account for sample non-independence due to relatedness, population structure and/or hidden confounders. Here, we present a Poisson mixed model with two random effects terms that account for both independent over-dispersion and sample non-independence. We also develop a scalable sampling-based inference algorithm using a latent variable representation of the Poisson distribution. With simulations, we show that our method properly controls for type I error and is generally more powerful than other widely used approaches, except in small samples (n < 15) with other unfavorable properties (e.g. small effect sizes). We also apply our method to three real datasets that contain related individuals, population stratification or hidden confounders. Our results show that our method increases power in all three datasets compared to other approaches, though the power gain is smallest in the smallest sample (n = 6). Our method is implemented in MACAU, freely available at www.xzlab.org/software.html. PMID:28369632

  10. Housing First Reduces Re-offending among Formerly Homeless Adults with Mental Disorders: Results of a Randomized Controlled Trial

    PubMed Central

    Somers, Julian M.; Rezansoff, Stefanie N.; Moniruzzaman, Akm; Palepu, Anita; Patterson, Michelle

    2013-01-01

    Background Homelessness and mental illness have a strong association with public disorder and criminality. Experimental evidence indicates that Housing First (HF) increases housing stability and perceived choice among those experiencing chronic homelessness and mental disorders. HF is also associated with lower residential costs than common alternative approaches. Few studies have examined the effect of HF on criminal behavior. Methods Individuals meeting criteria for homelessness and a current mental disorder were randomized to one of three conditions: treatment as usual (reference); scattered site HF; and congregate HF. Administrative data concerning justice system events were linked in order to study prior histories of offending and to test the relationship between housing status and offending following randomization for up to two years. Results The majority of the sample (67%) was involved with the justice system, with a mean of 8.07 convictions per person in the ten years prior to recruitment. The most common category of crime was “property offences” (mean = 4.09). Following randomization, the scattered site HF condition was associated with significantly lower numbers of sentences than treatment as usual (Adjusted IRR = 0.29; 95% CI 0.12–0.72). Congregate HF was associated with a marginally significant reduction in sentences compared to treatment as usual (Adjusted IRR = 0.55; 95% CI: 0.26–1.14). Conclusions This study is the first randomized controlled trial to demonstrate benefits of HF among a homeless sample with mental illness in the domain of public safety and crime. Our sample was frequently involved with the justice system, with great personal and societal costs. Further implementation of HF is strongly indicated, particularly in the scattered site format. Research examining interdependencies between housing, health, and the justice system is indicated. Trial registration ISRCTN57595077 PMID:24023796

  11. Gossip-Based Dissemination

    NASA Astrophysics Data System (ADS)

    Friedman, Roy; Kermarrec, Anne-Marie; Miranda, Hugo; Rodrigues, Luís

    Gossip-based networking has emerged as a viable approach to disseminate information reliably and efficiently in large-scale systems. Initially introduced for database replication [222], the applicability of the approach extends much further now. For example, it has been applied for data aggregation [415], peer sampling [416] and publish/subscribe systems [845]. Gossip-based protocols rely on a periodic peer-wise exchange of information in wired systems. By changing the way each peer is selected for the gossip communication, and which data are exchanged and processed [451], gossip systems can be used to perform different distributed tasks, such as, among others: overlay maintenance, distributed computation, and information dissemination (a collection of papers on gossip can be found in [451]). In a wired setting, the peer sampling service, allowing for a random or specific peer selection, is often provided as an independent service, able to operate independently from other gossip-based services [416].

  12. Bayesian hierarchical models for smoothing in two-phase studies, with application to small area estimation.

    PubMed

    Ross, Michelle; Wakefield, Jon

    2015-10-01

    Two-phase study designs are appealing since they allow for the oversampling of rare sub-populations which improves efficiency. In this paper we describe a Bayesian hierarchical model for the analysis of two-phase data. Such a model is particularly appealing in a spatial setting in which random effects are introduced to model between-area variability. In such a situation, one may be interested in estimating regression coefficients or, in the context of small area estimation, in reconstructing the population totals by strata. The efficiency gains of the two-phase sampling scheme are compared to standard approaches using 2011 birth data from the research triangle area of North Carolina. We show that the proposed method can overcome small sample difficulties and improve on existing techniques. We conclude that the two-phase design is an attractive approach for small area estimation.

  13. Assessing differential gene expression with small sample sizes in oligonucleotide arrays using a mean-variance model.

    PubMed

    Hu, Jianhua; Wright, Fred A

    2007-03-01

    The identification of the genes that are differentially expressed in two-sample microarray experiments remains a difficult problem when the number of arrays is very small. We discuss the implications of using ordinary t-statistics and examine other commonly used variants. For oligonucleotide arrays with multiple probes per gene, we introduce a simple model relating the mean and variance of expression, possibly with gene-specific random effects. Parameter estimates from the model have natural shrinkage properties that guard against inappropriately small variance estimates, and the model is used to obtain a differential expression statistic. A limiting value to the positive false discovery rate (pFDR) for ordinary t-tests provides motivation for our use of the data structure to improve variance estimates. Our approach performs well compared to other proposed approaches in terms of the false discovery rate.

  14. Uncertainty characterization of HOAPS 3.3 latent heat-flux-related parameters

    NASA Astrophysics Data System (ADS)

    Liman, Julian; Schröder, Marc; Fennig, Karsten; Andersson, Axel; Hollmann, Rainer

    2018-03-01

    Latent heat flux (LHF) is one of the main contributors to the global energy budget. As the density of in situ LHF measurements over the global oceans is generally poor, the potential of remotely sensed LHF for meteorological applications is enormous. However, to date none of the available satellite products have included estimates of systematic, random, and sampling uncertainties, all of which are essential for assessing their quality. Here, the challenge is taken on by matching LHF-related pixel-level data of the Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite (HOAPS) climatology (version 3.3) to in situ measurements originating from a high-quality data archive of buoys and selected ships. Assuming the ground reference to be bias-free, this allows for deriving instantaneous systematic uncertainties as a function of four atmospheric predictor variables. The approach is regionally independent and therefore overcomes the issue of sparse in situ data densities over large oceanic areas. Likewise, random uncertainties are derived, which include not only a retrieval component but also contributions from in situ measurement noise and the collocation procedure. A recently published random uncertainty decomposition approach is applied to isolate the random retrieval uncertainty of all LHF-related HOAPS parameters. It makes use of two combinations of independent data triplets of both satellite and in situ data, which are analysed in terms of their pairwise variances of differences. Instantaneous uncertainties are finally aggregated, allowing for uncertainty characterizations on monthly to multi-annual timescales. Results show that systematic LHF uncertainties range between 15 and 50 W m-2 with a global mean of 25 W m-2. Local maxima are mainly found over the subtropical ocean basins as well as along the western boundary currents. Investigations indicate that contributions from qa (U) to the overall LHF uncertainty are on the order of 60 % (25 %). From an instantaneous point of view, random retrieval uncertainties are specifically large over the subtropics with a global average of 37 W m-2. In a climatological sense, their magnitudes become negligible, as do respective sampling uncertainties. Regional and seasonal analyses suggest that largest total LHF uncertainties are seen over the Gulf Stream and the Indian monsoon region during boreal winter. In light of the uncertainty measures, the observed continuous global mean LHF increase up to 2009 needs to be treated with caution. The demonstrated approach can easily be transferred to other satellite retrievals, which increases the significance of the present work.
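
    The "pairwise variances of differences of independent triplets" idea is closely related to triple collocation; the compact sketch below (with simulated satellite, in situ and third-source series rather than HOAPS values) shows how the random error variance of each data source falls out of the covariance of its two paired differences, assuming independent zero-mean errors.

    ```python
    import numpy as np

    rng = np.random.default_rng(11)

    # Simulated "truth" (e.g. instantaneous LHF in W/m^2) plus three independent-error estimates
    n = 50_000
    truth = rng.normal(100.0, 30.0, size=n)
    sat   = truth + rng.normal(0.0, 20.0, size=n)    # satellite retrieval
    buoy  = truth + rng.normal(0.0,  8.0, size=n)    # in situ reference
    model = truth + rng.normal(0.0, 15.0, size=n)    # third, independent estimate

    def error_variance(x, y, z):
        """Random error variance of x, assuming independent zero-mean errors in x, y and z."""
        return np.cov(x - y, x - z)[0, 1]

    for name, (x, y, z) in {"satellite": (sat, buoy, model),
                            "buoy": (buoy, sat, model),
                            "model": (model, sat, buoy)}.items():
        print(f"{name}: estimated random error std = {np.sqrt(error_variance(x, y, z)):.1f} W/m^2")
    ```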

  15. The effect of group counseling based on self-awareness skill on sexual risk-taking among girl students in Gorgan, Iran: a randomized trial.

    PubMed

    Kabiri, Golnoosh; Ziaei, Tayebe; Aval, Masumeh Rezaei; Vakili, Mohammad Ali

    2017-09-15

    Background Sexual puberty in adolescents occurs before their mental and emotional maturity and exposes them to high-risk sexual behaviors. Because sexual risk-taking occurs before adolescents become involved in a sexual relationship, this study was conducted to identify the effect of group counseling based on self-awareness skill on sexual risk-taking among female high school students in Gorgan, in order to suggest preventative measures. Methods The present parallel study is a randomized field trial conducted on 96 female students in grades 10, 11 and 12 of high school, aged 14-18 years. Sampling followed a multi-stage process. In the first stage, four of six health centers were selected through a randomized clustering approach. In the second stage, 96 participants were recruited through consecutive sampling. Finally, the participants were divided into intervention and control groups (48 subjects each) by simple randomization. It has to be noted that no blinding was done in the present study. The data were collected using a demographic specifications form and the Iranian Adolescents Risk-Taking Scale (IARS). The counseling sessions based on self-awareness skill were delivered to the intervention group in 60-min sessions over 7 weeks. The pretest was conducted for both groups, and the posttest was completed by both the intervention and control groups 1 week and 1 month after the intervention. After loss to follow-up/dropout, a total of 80 subjects remained in the study: 42 in the intervention group and 38 in the control group. Data analyses were done using SPSS v.16 along with the Friedman non-parametric test and the Mann-Whitney and Wilcoxon tests. Results The sexual risk-taking mean scores in the intervention group (10.54 ± 15.64) decreased at the 1-week (8.03 ± 12.82) and 1-month (4.91 ± 10.10) follow-ups after the intervention. This reduction was statistically significant (p = 14%). However, no statistically significant difference was observed in the control group. Conclusion Group counseling based on self-awareness skill decreased sexual risk-taking among female high school students. As prevention takes priority over treatment, this method could be proposed for the prevention of high-risk sexual behavior to healthcare centers, educational environments and non-governmental organizations (NGOs) working with adolescents.

  16. Constrained sampling experiments reveal principles of detection in natural scenes.

    PubMed

    Sebastian, Stephen; Abrams, Jared; Geisler, Wilson S

    2017-07-11

    A fundamental everyday visual task is to detect target objects within a background scene. Using relatively simple stimuli, vision science has identified several major factors that affect detection thresholds, including the luminance of the background, the contrast of the background, the spatial similarity of the background to the target, and uncertainty due to random variations in the properties of the background and in the amplitude of the target. Here we use an experimental approach based on constrained sampling from multidimensional histograms of natural stimuli, together with a theoretical analysis based on signal detection theory, to discover how these factors affect detection in natural scenes. We sorted a large collection of natural image backgrounds into multidimensional histograms, where each bin corresponds to a particular luminance, contrast, and similarity. Detection thresholds were measured for a subset of bins spanning the space, where a natural background was randomly sampled from a bin on each trial. In low-uncertainty conditions, both the background bin and the amplitude of the target were fixed, and, in high-uncertainty conditions, they varied randomly on each trial. We found that thresholds increase approximately linearly along all three dimensions and that detection accuracy is unaffected by background bin and target amplitude uncertainty. The results are predicted from first principles by a normalized matched-template detector, where the dynamic normalizing gain factor follows directly from the statistical properties of the natural backgrounds. The results provide an explanation for classic laws of psychophysics and their underlying neural mechanisms.

  17. Risk of bias reporting in the recent animal focal cerebral ischaemia literature.

    PubMed

    Bahor, Zsanett; Liao, Jing; Macleod, Malcolm R; Bannach-Brown, Alexandra; McCann, Sarah K; Wever, Kimberley E; Thomas, James; Ottavi, Thomas; Howells, David W; Rice, Andrew; Ananiadou, Sophia; Sena, Emily

    2017-10-15

    Findings from in vivo research may be less reliable where studies do not report measures to reduce risks of bias. The experimental stroke community has been at the forefront of implementing changes to improve reporting, but it is not known whether these efforts are associated with continuous improvements. Our aims here were firstly to validate an automated tool to assess risks of bias in published works, and secondly to assess the reporting of measures taken to reduce the risk of bias within recent literature for two experimental models of stroke. We developed and used text analytic approaches to automatically ascertain reporting of measures to reduce risk of bias from full-text articles describing animal experiments inducing middle cerebral artery occlusion (MCAO) or modelling lacunar stroke. Compared with previous assessments, there were improvements in the reporting of measures taken to reduce risks of bias in the MCAO literature but not in the lacunar stroke literature. Accuracy of automated annotation of risk of bias in the MCAO literature was 86% (randomization), 94% (blinding) and 100% (sample size calculation); and in the lacunar stroke literature accuracy was 67% (randomization), 91% (blinding) and 96% (sample size calculation). There remains substantial opportunity for improvement in the reporting of animal research modelling stroke, particularly in the lacunar stroke literature. Further, automated tools perform sufficiently well to identify whether studies report blinded assessment of outcome, but improvements are required in the tools to ascertain whether randomization and a sample size calculation were reported. © 2017 The Author(s).

  18. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    NASA Astrophysics Data System (ADS)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. The Bayesian approach involves two distributions, the prior and the posterior; the posterior distribution is influenced by the choice of prior. Jeffreys' prior is a non-informative prior distribution, used when no information about the parameters is available. The non-informative Jeffreys' prior is combined with the sample information to obtain the posterior distribution, which is then used to estimate the parameters. The purpose of this research is to estimate the parameters of the multivariate regression model using the Bayesian method with the non-informative Jeffreys' prior. Based on the results and discussion, the estimates of β and Σ are obtained as the expected values of their marginal posterior distributions, which are multivariate normal and inverse Wishart, respectively. However, calculating these expected values involves integrals that are difficult to evaluate analytically. Therefore, an approximation is needed: random samples are generated according to the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
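
    A minimal Gibbs-sampling sketch for this setting, the multivariate regression Y = XB + E with rows of E ~ N(0, Σ) and the Jeffreys prior p(B, Σ) ∝ |Σ|^-(p+1)/2: the conditionals are a matrix-normal draw for B and an inverse-Wishart draw for Σ. The function names and the use of NumPy/SciPy are our own illustration, not the authors' code:

        import numpy as np
        from scipy.stats import invwishart

        def gibbs_mv_regression(Y, X, n_iter=2000, seed=0):
            """Gibbs sampler for Y = X B + E under the Jeffreys prior.
            Returns posterior draws of B (k x p) and Sigma (p x p)."""
            rng = np.random.default_rng(seed)
            n, k = X.shape
            p = Y.shape[1]
            XtX_inv = np.linalg.inv(X.T @ X)
            B_hat = XtX_inv @ X.T @ Y                    # OLS estimate / conditional mean of B
            L_row = np.linalg.cholesky(XtX_inv)          # row-covariance factor
            B = B_hat.copy()
            draws_B, draws_S = [], []
            for _ in range(n_iter):
                resid = Y - X @ B
                Sigma = invwishart.rvs(df=n, scale=resid.T @ resid)
                L_col = np.linalg.cholesky(Sigma)        # column-covariance factor
                B = B_hat + L_row @ rng.standard_normal((k, p)) @ L_col.T   # matrix-normal draw
                draws_B.append(B.copy())
                draws_S.append(Sigma)
            return np.array(draws_B), np.array(draws_S)

        # posterior (Bayes) estimates are the means of the draws after burn-in, e.g.
        # B_draws, S_draws = gibbs_mv_regression(Y, X); B_est = B_draws[500:].mean(axis=0)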

  19. Study protocol of the CHOiCE trial: a three-armed, randomized, controlled trial of home-based HPV self-sampling for non-participants in an organized cervical cancer screening program.

    PubMed

    Tranberg, Mette; Bech, Bodil Hammer; Blaakær, Jan; Jensen, Jørgen Skov; Svanholm, Hans; Andersen, Berit

    2016-11-03

    The effectiveness of cervical cancer screening programs is challenged by suboptimal participation and coverage. Offering cervico-vaginal self-sampling for human papillomavirus testing (HPV self-sampling) to non-participants can increase screening participation. However, the effect varies substantially among studies, especially depending on the approach used to offer HPV self-sampling. The present trial evaluates the effect on participation in an organized screening program of a HPV self-sampling kit mailed directly to the home of the woman or mailed to the woman's home on demand only, compared with the standard second reminder for regular screening. The CHOiCE trial is a parallel, randomized, controlled, open-label trial. It will include 9327 women aged 30-64 years who are living in the Central Denmark Region and who have not participated in cervical cancer screening after an invitation and one reminder. The women will be equally randomized into three arms: 1) Directly mailed a second reminder including a HPV self-sampling kit; 2) Mailed a second reminder offering a HPV self-sampling kit, to be ordered by e-mail, text message, phone, or through a webpage; and 3) Mailed a second reminder for a practitioner-collected sample (control group). The primary outcome will be the proportion of women in the intervention groups who participate by returning their HPV self-sampling kit or have a practitioner-collected sample compared with the proportion of women who have a practitioner-collected sample in the control group at 90 and 180 days after mail out of the second reminders. Per-protocol and intention-to-treat analyses will be performed. The secondary outcome will be the proportion of women with a positive HPV self-collected sample who attend follow-up testing at 30, 60, or 90 days after mail out of the results. The CHOiCE trial will provide strong and important evidence allowing us to determine if and how HPV self-sampling can be used to increase participation in cervical cancer screening. This trial therefore has the potential to improve prevention and reduce the number of deaths caused by cervical cancer. Current Controlled Trials NCT02680262 . Registered 10 February 2016.

  20. Point and interval estimation of pollinator importance: a study using pollination data of Silene caroliniana.

    PubMed

    Reynolds, Richard J; Fenster, Charles B

    2008-05-01

    Pollinator importance, the product of visitation rate and pollinator effectiveness, is a descriptive parameter of the ecology and evolution of plant-pollinator interactions. Naturally, sources of its variation should be investigated, but the SE of pollinator importance has never been properly reported. Here, a Monte Carlo simulation study and a result from mathematical statistics on the variance of the product of two random variables are used to estimate the mean and confidence limits of pollinator importance for three visitor species of the wildflower Silene caroliniana. Both methods provided similar estimates of mean pollinator importance and its interval when the sample sizes of the visitation and effectiveness datasets were comparatively large. These approaches allowed us to determine that bumblebee importance was significantly greater than that of the clearwing hawkmoth, which in turn was significantly greater than that of the beefly. The methods could be used to statistically quantify temporal and spatial variation in the pollinator importance of particular visitor species. The approaches may be extended to estimate the variance of a product of more than two random variables. However, unless the distribution function of the resulting statistic is known, the simulation approach is preferable for calculating the parameter's confidence limits.
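
    The two approaches can be illustrated for a generic product of two independently estimated means (hypothetical function and variable names, not the authors' code; it uses Goodman's exact variance formula for a product of independent random variables together with a normal-approximation Monte Carlo):

        import numpy as np

        def importance_estimate(visits, effectiveness, n_sim=100_000, alpha=0.05, seed=1):
            """Point estimate, analytic SE, and Monte Carlo CI for
            importance = mean(visits) * mean(effectiveness),
            treating the two sample means as independent random variables."""
            rng = np.random.default_rng(seed)
            mx, my = visits.mean(), effectiveness.mean()
            vx = visits.var(ddof=1) / visits.size            # variance of the mean
            vy = effectiveness.var(ddof=1) / effectiveness.size
            var_product = mx**2 * vy + my**2 * vx + vx * vy  # Goodman (1960), independence
            sims = (rng.normal(mx, np.sqrt(vx), n_sim) *
                    rng.normal(my, np.sqrt(vy), n_sim))      # simulate the product of the means
            lo, hi = np.percentile(sims, [100 * alpha / 2, 100 * (1 - alpha / 2)])
            return mx * my, np.sqrt(var_product), (lo, hi)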

  1. A Spatio-Temporally Explicit Random Encounter Model for Large-Scale Population Surveys

    PubMed Central

    Jousimo, Jussi; Ovaskainen, Otso

    2016-01-01

    Random encounter models can be used to estimate population abundance from indirect data collected by non-invasive sampling methods, such as track counts or camera-trap data. The classical Formozov–Malyshev–Pereleshin (FMP) estimator converts track counts into an estimate of mean population density, assuming that data on the daily movement distances of the animals are available. We utilize generalized linear models with spatio-temporal error structures to extend the FMP estimator into a flexible Bayesian modelling approach that estimates not only total population size, but also spatio-temporal variation in population density. We also introduce a weighting scheme to estimate density on habitats that are not covered by survey transects, assuming that movement data on a subset of individuals is available. We test the performance of spatio-temporal and temporal approaches by a simulation study mimicking the Finnish winter track count survey. The results illustrate how the spatio-temporal modelling approach is able to borrow information from observations made on neighboring locations and times when estimating population density, and that spatio-temporal and temporal smoothing models can provide improved estimates of total population size compared to the FMP method. PMID:27611683

  2. Machine Learning Algorithms for prediction of regions of high Reynolds Averaged Navier Stokes Uncertainty

    NASA Astrophysics Data System (ADS)

    Mishra, Aashwin; Iaccarino, Gianluca

    2017-11-01

    In spite of their deficiencies, RANS models represent the workhorse for industrial investigations into turbulent flows. In this context, it is essential to provide diagnostic measures to assess the quality of RANS predictions. To this end, the primary step is to identify feature importances amongst massive sets of potentially descriptive and discriminative flow features. This aids the physical interpretability of the resultant discrepancy model and its extensibility to similar problems. Recent investigations have utilized approaches such as Random Forests, Support Vector Machines and the Least Absolute Shrinkage and Selection Operator for feature selection. With examples, we exhibit how such methods may not be suitable for turbulent flow datasets. The underlying rationale, such as the correlation bias and the required conditions for the success of penalized algorithms, are discussed with illustrative examples. Finally, we provide alternate approaches using convex combinations of regularized regression approaches and randomized sub-sampling in combination with feature selection algorithms, to infer model structure from data. This research was supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo).

  3. Evolutionary Ensemble for In Silico Prediction of Ames Test Mutagenicity

    NASA Astrophysics Data System (ADS)

    Chen, Huanhuan; Yao, Xin

    Driven by new regulations and animal welfare concerns, the need to develop in silico models as alternative approaches for assessing the safety of chemicals without animal testing has increased recently. This paper describes a novel machine learning ensemble approach to building an in silico model for predicting Ames test mutagenicity, one of a battery of the most commonly used experimental in vitro and in vivo genotoxicity tests for the safety evaluation of chemicals. Evolutionary random neural ensemble with negative correlation learning (ERNE) [1] was developed based on neural networks and evolutionary algorithms. ERNE combines bootstrap sampling of the training data with random subspace feature selection to ensure diversity when creating the individuals of the initial ensemble. Furthermore, while evolving individuals within the ensemble, it makes use of negative correlation learning, enabling the individual NNs to be trained to be as accurate as possible while remaining as diverse as possible. The resulting individuals in the final ensemble are therefore capable of cooperating collectively to achieve better generalization in prediction. The empirical experiments suggest that ERNE is an effective ensemble approach for predicting the Ames test mutagenicity of chemicals.
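
    The diversity-creation step of such an ensemble, bootstrap sampling of the training data combined with random subspace feature selection, can be sketched as below (a simplified stand-in using scikit-learn multilayer perceptrons; the evolutionary optimization and negative correlation learning of ERNE are not reproduced, and all names are ours):

        import numpy as np
        from sklearn.neural_network import MLPClassifier

        def bootstrap_subspace_ensemble(X, y, n_members=10, subspace_frac=0.5, seed=0):
            """Train each member on a bootstrap sample and a random feature subspace."""
            rng = np.random.default_rng(seed)
            n, d = X.shape
            members = []
            for _ in range(n_members):
                rows = rng.integers(0, n, n)                                  # bootstrap rows
                cols = rng.choice(d, max(1, int(subspace_frac * d)), replace=False)
                net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000)
                net.fit(X[rows][:, cols], y[rows])
                members.append((net, cols))
            return members

        def ensemble_proba(members, X):
            """Fuse members by averaging their predicted class-1 probabilities."""
            return np.mean([net.predict_proba(X[:, cols])[:, 1] for net, cols in members], axis=0)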

  4. Entropy-Based TOA Estimation and SVM-Based Ranging Error Mitigation in UWB Ranging Systems

    PubMed Central

    Yin, Zhendong; Cui, Kai; Wu, Zhilu; Yin, Liang

    2015-01-01

    The major challenges for Ultra-wide Band (UWB) indoor ranging systems are the dense multipath and non-line-of-sight (NLOS) problems of the indoor environment. To precisely estimate the time of arrival (TOA) of the first path (FP) in such a poor environment, a novel approach combining entropy-based TOA estimation with support vector machine (SVM) regression-based ranging error mitigation is proposed in this paper. The proposed method can estimate the TOA precisely by measuring the randomness of the received signals and can mitigate the ranging error without recognition of the channel conditions. Entropy is used to measure the randomness of the received signals, and the FP is identified as the sample that is followed by a large decrease in entropy. SVM regression is employed to mitigate the ranging error by modeling the relationship between the characteristics of the received signals and the ranging error. The presented numerical simulation results show that the proposed approach achieves significant performance improvements in the CM1 to CM4 channels of the IEEE 802.15.4a standard, as compared to conventional approaches. PMID:26007726

  5. Sampling maternal care behaviour in domestic dogs: What's the best approach?

    PubMed

    Czerwinski, Veronika H; Smith, Bradley P; Hynd, Philip I; Hazel, Susan J

    2017-07-01

    Our understanding of the frequency and duration of maternal care behaviours in the domestic dog during the first two postnatal weeks is limited, largely due to inconsistencies in the sampling methodologies that have been employed. In order to develop a more concise picture of maternal care behaviour during this period, and to help establish which sampling method best represents these behaviours, we compared a variety of time sampling methods. Six litters were continuously observed for a total of 96 h over postnatal days 3, 6, 9 and 12 (24 h per day). Frequent (dam presence, nursing duration, contact duration) and infrequent maternal behaviours (anogenital licking duration and frequency) were coded using five different time sampling methods: 12-h night (1800-0600 h), 12-h day (0600-1800 h), one hour during the night (1800-0600 h), one hour during the day (0600-1800 h) and one hour at any time. Each one-hour time sampling method consisted of four randomly chosen 15-min periods. Two random sets of four 15-min periods were also analysed to ensure reliability. We then determined which of the time sampling methods, averaged over the three 24-h periods, best represented the frequency and duration of behaviours. As might be expected, frequently occurring behaviours were adequately represented by short (one hour) sampling periods; however, this was not the case for the infrequent behaviour. Thus, we argue that the time sampling methodology employed must match the behaviour of interest. This caution applies to maternal behaviour in altricial species, such as canids, as well as to all systematic behavioural observations utilising time sampling methodology. Copyright © 2017. Published by Elsevier B.V.

  6. The Danish National Health Survey 2010. Study design and respondent characteristics.

    PubMed

    Christensen, Anne Illemann; Ekholm, Ola; Glümer, Charlotte; Andreasen, Anne Helms; Hvidberg, Michael Falk; Kristensen, Peter Lund; Larsen, Finn Breinholt; Ortiz, Britta; Juel, Knud

    2012-06-01

    In 2010 the five Danish regions and the National Institute of Public Health at the University of Southern Denmark conducted a national representative health survey among the adult population in Denmark. This paper describes the study design and the sample and study population as well as the content of the questionnaire. The survey was based on five regional stratified random samples and one national random sample. The samples were mutually exclusive. A total of 298,550 individuals (16 years or older) were invited to participate. Information was collected using a mixed mode approach (paper and web questionnaires). A questionnaire with a minimum of 52 core questions was used in all six subsamples. Calibrated weights were computed in order to take account of the complex survey design and reduce non-response bias. In all, 177,639 individuals completed the questionnaire (59.5%). The response rate varied from 52.3% in the Capital Region of Denmark sample to 65.5% in the North Denmark Region sample. The response rate was particularly low among young men, unmarried people and among individuals with a different ethnic background than Danish. The survey was a result of extensive national cooperation across sectors, which makes it unique in its field of application, e.g. health surveillance, planning and prioritizing public health initiatives and research. However, the low response rate in some subgroups of the study population can pose problems in generalizing data, and efforts to increase the response rate will be important in the forthcoming surveys.

  7. Contrasting lexical similarity and formal definitions in SNOMED CT: consistency and implications.

    PubMed

    Agrawal, Ankur; Elhanan, Gai

    2014-02-01

    To quantify the presence of, and evaluate an approach for detecting, inconsistencies in the formal definitions of SNOMED CT (SCT) concepts utilizing a lexical method. Utilizing SCT's Procedure hierarchy, we algorithmically formulated similarity sets: groups of concepts with similar lexical structure of their fully specified names. We formulated five random samples, each with 50 similarity sets, based on the same parameters: number of parents, attributes, groups, all of the former, as well as a randomly selected control sample. All samples' sets were reviewed for types of formal definition inconsistencies: hierarchical, attribute assignment, attribute target values, groups, and definitional. For the Procedure hierarchy, 2111 similarity sets were formulated, covering 18.1% of eligible concepts. The evaluation revealed that 38% (control) to 70% (different relationships) of similarity sets within the samples exhibited significant inconsistencies. The rate of inconsistencies for the sample with different relationships was highly significant compared to the control, as was the number of attribute assignment and hierarchical inconsistencies within their respective samples. While, at this point in the HITECH initiative, the formal definitions of SCT are only a minor consideration, they are essential for sophisticated, meaningful use of captured clinical data. However, a significant portion of the concepts in the most semantically complex hierarchy of SCT, the Procedure hierarchy, are modeled inconsistently in a manner that affects their computability. Lexical methods can efficiently identify such inconsistencies and possibly allow for their algorithmic resolution. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. Determination of the influence of dispersion pattern of pesticide-resistant individuals on the reliability of resistance estimates using different sampling plans.

    PubMed

    Shah, R; Worner, S P; Chapman, R B

    2012-10-01

    Pesticide resistance monitoring includes resistance detection and subsequent documentation/measurement. Resistance detection would require at least one (≥1) resistant individual(s) to be present in a sample to initiate management strategies. Resistance documentation, on the other hand, would attempt to get an estimate of the entire population (≥90%) of the resistant individuals. A computer simulation model was used to compare the efficiency of simple random and systematic sampling plans to detect resistant individuals and to document their frequencies when the resistant individuals were randomly or patchily distributed. A patchy dispersion pattern of resistant individuals influenced the sampling efficiency of systematic sampling plans, while the efficiency of random sampling was independent of such patchiness. When resistant individuals were randomly distributed, sample sizes required to detect at least one resistant individual (resistance detection) with a probability of 0.95 were 300 (1%) and 50 (10% and 20%); whereas, when resistant individuals were patchily distributed, using systematic sampling, sample sizes required for such detection were 6000 (1%), 600 (10%) and 300 (20%). Sample sizes of 900 and 400 would be required to detect ≥90% of resistant individuals (resistance documentation) with a probability of 0.95 when resistant individuals were randomly dispersed and present at a frequency of 10% and 20%, respectively; whereas, when resistant individuals were patchily distributed, using systematic sampling, a sample size of 3000 and 1500, respectively, was necessary. Small sample sizes either underestimated or overestimated the resistance frequency. A simple random sampling plan is, therefore, recommended for insecticide resistance detection and subsequent documentation.
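
    Under random dispersion and simple random sampling, the detection sample sizes follow directly from the binomial probability of missing every resistant individual; a minimal calculation (our own illustration, which reproduces the order of magnitude of the figures quoted above):

        import math

        def sample_size_for_detection(freq, prob_detect=0.95):
            """Smallest random sample giving probability >= prob_detect of containing
            at least one resistant individual when resistance frequency is `freq`."""
            return math.ceil(math.log(1.0 - prob_detect) / math.log(1.0 - freq))

        # sample_size_for_detection(0.01) -> 299; sample_size_for_detection(0.10) -> 29
        # (the paper quotes ~300 and ~50 from its simulations)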

  9. How to derive biological information from the value of the normalization constant in allometric equations.

    PubMed

    Kaitaniemi, Pekka

    2008-04-09

    Allometric equations are widely used in many branches of biological science. The potential information content of the normalization constant b in allometric equations of the form Y = bX^a has, however, remained largely neglected. To demonstrate the potential for utilizing this information, I generated a large number of artificial datasets that resembled those that are frequently encountered in biological studies, i.e., relatively small samples including measurement error or uncontrolled variation. The value of X was allowed to vary randomly within the limits describing different data ranges, and a was set to a fixed theoretical value. The constant b was set to a range of values describing the effect of a continuous environmental variable. In addition, a normally distributed random error was added to the values of both X and Y. Two different approaches were then used to model the data. The traditional approach estimated both a and b using a regression model, whereas an alternative approach set the exponent a at its theoretical value and only estimated the value of b. Both approaches produced virtually the same model fit with less than 0.3% difference in the coefficient of determination. Only the alternative approach was able to precisely reproduce the effect of the environmental variable, which was largely lost among noise variation when using the traditional approach. The results show how the value of b can be used as a source of valuable biological information if an appropriate regression model is selected.
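
    The two fitting strategies compared here differ only in whether the exponent is estimated or fixed at its theoretical value; a minimal sketch on the log scale (hypothetical function names, not the author's code):

        import numpy as np

        def fit_b_fixed_exponent(X, Y, a_theory):
            """Alternative approach: fix a and estimate only b in Y = b * X**a,
            i.e. log Y = log b + a_theory * log X + error."""
            return np.exp(np.mean(np.log(Y) - a_theory * np.log(X)))

        def fit_a_and_b(X, Y):
            """Traditional approach: estimate both a and b by log-log regression."""
            a, log_b = np.polyfit(np.log(X), np.log(Y), 1)
            return a, np.exp(log_b)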

  10. Reality Check for the Chinese Microblog Space: A Random Sampling Approach

    PubMed Central

    Fu, King-wa; Chau, Michael

    2013-01-01

    Chinese microblogs have drawn global attention to this online application's potential impact on the country's social and political environment. However, representative and reliable statistics on Chinese microbloggers are limited. Using a random sampling approach, this study collected Chinese microblog data from the service provider, analyzing the profile and the pattern of usage for 29,998 microblog accounts. From our analysis, 57.4% (95% CI 56.9%, 58.0%) of the accounts' timelines were empty. Among the 12,774 sampled accounts with non-zero statuses, 86.9% (95% CI 86.2%, 87.4%) did not make an original post during the 7-day study period. By contrast, 0.51% (95% CI 0.4%, 0.65%) wrote twenty or more original posts and 0.45% (95% CI 0.35%, 0.60%) reposted more than 40 unique messages within the 7-day period. A small group of microbloggers created the majority of content and drew other users' attention. About 4.8% (95% CI 4.4%, 5.2%) of the 12,774 users contributed more than 80% (95% CI 78.6%, 80.3%) of the original posts, and about 4.8% (95% CI 4.5%, 5.2%) managed to create posts that were reposted or received comments at least once. Moreover, a regression analysis revealed that volume of followers is a key determinant of creating original microblog posts, reposting messages, being reposted, and receiving comments. Volume of friends is found to be linked only with the number of reposts. Gender differences and regional disparities in using microblogs in China are also observed. PMID:23520502

  11. A pilot cluster randomized controlled trial of structured goal-setting following stroke.

    PubMed

    Taylor, William J; Brown, Melanie; Levack, William; McPherson, Kathryn M; Reed, Kirk; Dean, Sarah G; Weatherall, Mark

    2012-04-01

    To determine the feasibility, the cluster design effect, and the variance and minimal clinically important difference of the primary outcome in a pilot study of a structured approach to goal-setting. A cluster randomized controlled trial. Inpatient rehabilitation facilities. People who were admitted to inpatient rehabilitation following stroke and who had sufficient cognition to engage in structured goal-setting and complete the primary outcome measure. Structured goal elicitation using the Canadian Occupational Performance Measure. Quality of life at 12 weeks using the Schedule for Individualised Quality of Life (SEIQOL-DW), Functional Independence Measure, Short Form 36 and Patient Perception of Rehabilitation (measuring satisfaction with rehabilitation). Assessors were blinded to the intervention. Four rehabilitation services and 41 patients were randomized. We found high values of the intraclass correlation for the outcome measures (ranging from 0.03 to 0.40) and high variance of the SEIQOL-DW (SD 19.6) relative to the minimally important difference of 2.1, leading to impractically large sample size requirements for a cluster randomized design. A cluster randomized design is not a practical means of avoiding contamination effects in studies of inpatient rehabilitation goal-setting. Other techniques for coping with contamination effects are necessary.

  12. Stepped Care to Optimize Pain care Effectiveness (SCOPE) trial study design and sample characteristics.

    PubMed

    Kroenke, Kurt; Krebs, Erin; Wu, Jingwei; Bair, Matthew J; Damush, Teresa; Chumbler, Neale; York, Tish; Weitlauf, Sharon; McCalley, Stephanie; Evans, Erica; Barnd, Jeffrey; Yu, Zhangsheng

    2013-03-01

    Pain is the most common physical symptom in primary care, accounting for an enormous burden in terms of patient suffering, quality of life, work and social disability, and health care and societal costs. Although collaborative care interventions are well-established for conditions such as depression, fewer systems-based interventions have been tested for chronic pain. This paper describes the study design and baseline characteristics of the enrolled sample for the Stepped Care to Optimize Pain care Effectiveness (SCOPE) study, a randomized clinical effectiveness trial conducted in five primary care clinics. SCOPE has enrolled 250 primary care veterans with persistent (3 months or longer) musculoskeletal pain of moderate severity and randomized them to either the stepped care intervention or usual care control group. Using a telemedicine collaborative care approach, the intervention couples automated symptom monitoring with a telephone-based, nurse care manager/physician pain specialist team to treat pain. The goal is to optimize analgesic management using a stepped care approach to drug selection, symptom monitoring, dose adjustment, and switching or adding medications. All subjects undergo comprehensive outcome assessments at baseline, 1, 3, 6 and 12 months by interviewers blinded to treatment group. The primary outcome is pain severity/disability, and secondary outcomes include pain beliefs and behaviors, psychological functioning, health-related quality of life and treatment satisfaction. Innovations of SCOPE include optimized analgesic management (including a stepped care approach, opioid risk stratification, and criteria-based medication adjustment), automated monitoring, and centralized care management that can cover multiple primary care practices. Published by Elsevier Inc.

  13. Investigating the Randomness of Numbers

    ERIC Educational Resources Information Center

    Pendleton, Kenn L.

    2009-01-01

    The use of random numbers is pervasive in today's world. Random numbers have practical applications in such far-flung arenas as computer simulations, cryptography, gambling, the legal system, statistical sampling, and even the war on terrorism. Evaluating the randomness of extremely large samples is a complex, intricate process. However, the…

  14. Comparison of prevalence estimation of Mycobacterium avium subsp. paratuberculosis infection by sampling slaughtered cattle with macroscopic lesions vs. systematic sampling.

    PubMed

    Elze, J; Liebler-Tenorio, E; Ziller, M; Köhler, H

    2013-07-01

    The objective of this study was to identify the most reliable approach for prevalence estimation of Mycobacterium avium ssp. paratuberculosis (MAP) infection in clinically healthy slaughtered cattle. Sampling of macroscopically suspect tissue was compared to systematic sampling. Specimens of ileum, jejunum, mesenteric and caecal lymph nodes were examined for MAP infection using bacterial microscopy, culture, histopathology and immunohistochemistry. MAP was found most frequently in caecal lymph nodes, but sampling more tissues optimized the detection rate. Examination by culture was most efficient while combination with histopathology increased the detection rate slightly. MAP was detected in 49/50 animals with macroscopic lesions representing 1.35% of the slaughtered cattle examined. Of 150 systematically sampled macroscopically non-suspect cows, 28.7% were infected with MAP. This indicates that the majority of MAP-positive cattle are slaughtered without evidence of macroscopic lesions and before clinical signs occur. For reliable prevalence estimation of MAP infection in slaughtered cattle, systematic random sampling is essential.

  15. Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics

    NASA Technical Reports Server (NTRS)

    Pohorille, Andrew

    2006-01-01

    The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described by rate constants. These problems are isomorphic with chemical kinetics problems. Recently, several efficient techniques for this purpose have been developed based on the approach originally proposed by Gillespie. Although the utility of the techniques mentioned above for Bayesian problems has not been determined, further research along these lines is warranted.
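
    Of the techniques listed, parallel tempering is perhaps the simplest to sketch: several Metropolis chains run at different temperatures and occasionally exchange states, so the cold chain can escape well-separated modes. The following is a minimal one-dimensional illustration of our own, not tied to any particular application above:

        import numpy as np

        def parallel_tempering(log_pdf, temps, n_steps=20000, step=0.5, seed=0):
            """Random-walk Metropolis at each temperature plus neighbour swaps;
            returns samples from the coldest replica (temps[0] should be 1.0
            so that the cold chain targets the pdf itself)."""
            rng = np.random.default_rng(seed)
            x = np.zeros(len(temps))                       # one state per replica
            cold_samples = []
            for i in range(n_steps):
                for k, T in enumerate(temps):              # within-replica Metropolis update
                    prop = x[k] + step * rng.standard_normal()
                    if np.log(rng.random()) < (log_pdf(prop) - log_pdf(x[k])) / T:
                        x[k] = prop
                if i % 10 == 0:                            # attempt one neighbour swap
                    k = rng.integers(0, len(temps) - 1)
                    delta = (1/temps[k] - 1/temps[k+1]) * (log_pdf(x[k+1]) - log_pdf(x[k]))
                    if np.log(rng.random()) < delta:
                        x[k], x[k+1] = x[k+1], x[k]
                cold_samples.append(x[0])
            return np.array(cold_samples)

        # example: a bimodal pdf whose modes a single Metropolis chain rarely crosses
        # log_pdf = lambda v: np.log(np.exp(-(v + 4.0)**2) + np.exp(-(v - 4.0)**2))
        # draws = parallel_tempering(log_pdf, temps=[1.0, 2.0, 4.0, 8.0])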

  16. Comparing the performance of cluster random sampling and integrated threshold mapping for targeting trachoma control, using computer simulation.

    PubMed

    Smith, Jennifer L; Sturrock, Hugh J W; Olives, Casey; Solomon, Anthony W; Brooker, Simon J

    2013-01-01

    Implementation of trachoma control strategies requires reliable district-level estimates of trachomatous inflammation-follicular (TF), generally collected using the recommended gold-standard cluster randomized surveys (CRS). Integrated Threshold Mapping (ITM) has been proposed as an integrated and cost-effective means of rapidly surveying trachoma in order to classify districts according to treatment thresholds. ITM differs from CRS in a number of important ways, including the use of a school-based sampling platform for children aged 1-9 and a different age distribution of participants. This study uses computerised sampling simulations to compare the performance of these survey designs and evaluate the impact of varying key parameters. Realistic pseudo gold standard data for 100 districts were generated that maintained the relative risk of disease between important sub-groups and incorporated empirical estimates of disease clustering at the household, village and district level. To simulate the different sampling approaches, 20 clusters were selected from each district, with individuals sampled according to the protocol for ITM and CRS. Results showed that ITM generally under-estimated the true prevalence of TF over a range of epidemiological settings and introduced more district misclassification according to treatment thresholds than did CRS. However, the extent of underestimation and resulting misclassification was found to be dependent on three main factors: (i) the district prevalence of TF; (ii) the relative risk of TF between enrolled and non-enrolled children within clusters; and (iii) the enrollment rate in schools. Although in some contexts the two methodologies may be equivalent, ITM can introduce a bias-dependent shift as prevalence of TF increases, resulting in a greater risk of misclassification around treatment thresholds. In addition to strengthening the evidence base around choice of trachoma survey methodologies, this study illustrates the use of a simulated approach in addressing operational research questions for trachoma but also other NTDs.

  17. Fitting parametric random effects models in very large data sets with application to VHA national data

    PubMed Central

    2012-01-01

    Background With the current focus on personalized medicine, patient/subject level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient level inference. However, for very large data sets that are characterized by large sample size, it can be difficult to fit REM using commonly available statistical software such as SAS, since they require inordinate amounts of computer time and memory allocations beyond what are available, preventing model convergence. For example, in a retrospective cohort study of over 800,000 Veterans with type 2 diabetes with longitudinal data over 5 years, fitting REM via generalized linear mixed modeling using currently available standard procedures in SAS (e.g. PROC GLIMMIX) was very difficult, and the same problems exist in Stata's gllamm and R's lme packages. Thus, this study proposes a meta regression approach, assesses its performance, and compares it with methods based on sampling of the full data. Data We use both simulated and real data from a national cohort of Veterans with type 2 diabetes (n=890,394), which was created by linking multiple patient and administrative files, resulting in a cohort with longitudinal data collected over 5 years. Methods and results The outcome of interest was mean annual HbA1c measured over a 5-year period. Using this outcome, we compared parameter estimates from the proposed random effects meta regression (REMR) with estimates based on simple random sampling and VISN (Veterans Integrated Service Networks) based stratified sampling of the full data. Our results indicate that REMR provides parameter estimates that are less likely to be biased, with tighter confidence intervals, when the VISN level estimates are homogeneous. Conclusion When the interest is to fit REM in repeated measures data with very large sample size, REMR can be used as a good alternative. It leads to reasonable inference for both Gaussian and non-Gaussian responses if parameter estimates are homogeneous across VISNs. PMID:23095325

  18. Minimization of required model runs in the Random Mixing approach to inverse groundwater flow and transport modeling

    NASA Astrophysics Data System (ADS)

    Hoerning, Sebastian; Bardossy, Andras; du Plessis, Jaco

    2017-04-01

    Most geostatistical inverse groundwater flow and transport modelling approaches utilize a numerical solver to minimize the discrepancy between observed and simulated hydraulic heads and/or concentration values. The optimization procedure often requires many model runs, which for complex models lead to long run times. Random Mixing is a promising new geostatistical technique for inverse modelling. The method is an extension of the gradual deformation approach. It works by finding a field which preserves the covariance structure and maintains observed hydraulic conductivities. This field is perturbed by mixing it with new fields that fulfill the homogeneous conditions. This mixing is expressed as an optimization problem which aims to minimize the difference between the observed and simulated hydraulic heads and/or concentration values. To preserve the spatial structure, the mixing weights must lie on the unit hyper-sphere. We present a modification to the Random Mixing algorithm which significantly reduces the number of model runs required. The approach involves taking n equally spaced points on the unit circle as weights for mixing conditional random fields. Each of these mixtures provides a solution to the forward model at the conditioning locations. For each of the locations the solutions are then interpolated around the circle to provide solutions for additional mixing weights at very low computational cost. The interpolated solutions are used to search for a mixture which maximally reduces the objective function. This is in contrast to other approaches which evaluate the objective function for the n mixtures and then interpolate the obtained values. Keeping the mixture on the unit circle makes it easy to generate equidistant sampling points in the space; however, this means that only two fields are mixed at a time. Once the optimal mixture for two fields has been found, they are combined to form the input to the next iteration of the algorithm. This process is repeated until a threshold in the objective function is met or insufficient changes are produced in successive iterations.
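
    The essence of the run-saving step, evaluating the forward model only at a few equally spaced mixing weights on the unit circle and interpolating the simulated values in between, can be sketched as follows (a strongly simplified illustration with hypothetical names; it mixes only two fields and omits the covariance-preserving construction of the conditional fields themselves):

        import numpy as np
        from scipy.interpolate import CubicSpline

        def best_mixing_angle(field_a, field_b, forward_model, observed, n_angles=8):
            """Run the forward model for n_angles weights (cos t, sin t) on the unit
            circle, periodically interpolate the simulated values over t, and return
            the angle minimizing the squared misfit to the observations."""
            thetas = np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False)
            sims = np.array([forward_model(np.cos(t) * field_a + np.sin(t) * field_b)
                             for t in thetas])                    # only n_angles model runs
            spline = CubicSpline(np.append(thetas, 2 * np.pi),
                                 np.vstack([sims, sims[:1]]),
                                 bc_type='periodic', axis=0)      # interpolate the solutions
            fine = np.linspace(0.0, 2 * np.pi, 3600)
            misfit = ((spline(fine) - observed) ** 2).sum(axis=1)
            return fine[np.argmin(misfit)]

        # forward_model is assumed to return the simulated heads/concentrations at the
        # observation points as a 1-D array matching `observed`.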

  19. Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling

    PubMed Central

    2006-01-01

    Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling. PMID:16937083
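
    That rule of thumb amounts to applying a design effect of about 2 to the usual sample-size formula for a proportion; a minimal calculation (illustrative names and default values are ours):

        import math

        def rds_sample_size(p=0.5, margin=0.05, z=1.96, design_effect=2.0):
            """Sample size to estimate a proportion p within +/- margin at ~95%
            confidence, inflated by the respondent-driven-sampling design effect."""
            n_srs = z**2 * p * (1.0 - p) / margin**2     # simple random sampling size
            return math.ceil(design_effect * n_srs)

        # rds_sample_size() -> 769, versus 385 under simple random sampling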

  20. A Simulation Approach to Assessing Sampling Strategies for Insect Pests: An Example with the Balsam Gall Midge

    PubMed Central

    Carleton, R. Drew; Heard, Stephen B.; Silk, Peter J.

    2013-01-01

    Estimation of pest density is a basic requirement for integrated pest management in agriculture and forestry, and efficiency in density estimation is a common goal. Sequential sampling techniques promise efficient sampling, but their application can involve cumbersome mathematics and/or intensive warm-up sampling when pests have complex within- or between-site distributions. We provide tools for assessing the efficiency of sequential sampling and of alternative, simpler sampling plans, using computer simulation with “pre-sampling” data. We illustrate our approach using data for balsam gall midge (Paradiplosis tumifex) attack in Christmas tree farms. Paradiplosis tumifex proved recalcitrant to sequential sampling techniques. Midge distributions could not be fit by a common negative binomial distribution across sites. Local parameterization, using warm-up samples to estimate the clumping parameter k for each site, performed poorly: k estimates were unreliable even for samples of n∼100 trees. These methods were further confounded by significant within-site spatial autocorrelation. Much simpler sampling schemes, involving random or belt-transect sampling to preset sample sizes, were effective and efficient for P. tumifex. Sampling via belt transects (through the longest dimension of a stand) was the most efficient, with sample means converging on true mean density for sample sizes of n∼25–40 trees. Pre-sampling and simulation techniques provide a simple method for assessing sampling strategies for estimating insect infestation. We suspect that many pests will resemble P. tumifex in challenging the assumptions of sequential sampling methods. Our software will allow practitioners to optimize sampling strategies before they are brought to real-world applications, while potentially avoiding the need for the cumbersome calculations required for sequential sampling methods. PMID:24376556

  1. RECAL: A Computer Program for Selecting Sample Days for Recreation Use Estimation

    Treesearch

    D.L. Erickson; C.J. Liu; H. Ken Cordell; W.L. Chen

    1980-01-01

    Recreation Calendar (RECAL) is a computer program in PL/I for drawing a sample of days for estimating recreation use. With RECAL, a sampling period of any length may be chosen; simple random, stratified random, and factorial designs can be accommodated. The program randomly allocates days to strata and locations.

  2. Sample Selection in Randomized Experiments: A New Method Using Propensity Score Stratified Sampling

    ERIC Educational Resources Information Center

    Tipton, Elizabeth; Hedges, Larry; Vaden-Kiernan, Michael; Borman, Geoffrey; Sullivan, Kate; Caverly, Sarah

    2014-01-01

    Randomized experiments are often seen as the "gold standard" for causal research. Despite the fact that experiments use random assignment to treatment conditions, units are seldom selected into the experiment using probability sampling. Very little research on experimental design has focused on how to make generalizations to well-defined…

  3. Uncertainty Of Stream Nutrient Transport Estimates Using Random Sampling Of Storm Events From High Resolution Water Quality And Discharge Data

    NASA Astrophysics Data System (ADS)

    Scholefield, P. A.; Arnscheidt, J.; Jordan, P.; Beven, K.; Heathwaite, L.

    2007-12-01

    The uncertainties associated with stream nutrient transport estimates are frequently overlooked and the sampling strategy is rarely if ever investigated. Indeed, the impact of sampling strategy and estimation method on the bias and precision of stream phosphorus (P) transport calculations is little understood, despite the use of such values in the calibration and testing of models of phosphorus transport. The objectives of this research were to investigate the variability and uncertainty in the estimates of total phosphorus transfers at an intensively monitored agricultural catchment. The Oona Water, which is located in the Irish border region, is part of a long-term monitoring program focusing on water quality. The Oona Water is a rural river catchment with grassland agriculture and scattered dwelling houses and has been monitored for total phosphorus (TP) at 10 min resolution for several years (Jordan et al., 2007). Concurrent sensitive measurements of discharge are also collected. The water quality and discharge data were provided at 1 hour resolution (averaged), which meant that a robust estimate of the annual flow-weighted concentration could be obtained by simple interpolation between points. A two-strata approach (Kronvang and Bruhn, 1996) was used to estimate flow-weighted concentrations using randomly sampled storm events from the 400 identified within the time series and also baseflow concentrations. Using a random stratified sampling approach for the selection of events, a series ranging from 10 through to the full 400 were used, each time generating a flow-weighted mean using a load-discharge relationship identified through log-log regression and Monte Carlo simulation. These values were then compared to the observed total phosphorus concentration for the catchment. Analysis of these results shows the impact of sampling strategy, the inherent bias in any estimate of phosphorus concentrations and the uncertainty associated with such estimates. The estimates generated using the full time series underestimate the flow-weighted mean concentration of total phosphorus. This work complements other contemporary work in the area of load estimation uncertainty in the UK (Johnes, 2007). Johnes, P. J., 2007. Uncertainties in annual riverine phosphorus load estimation: impact of load estimation methodology, sampling frequency, baseflow index and catchment population density. Journal of Hydrology 332(1-2): 241-258. Jordan, P., Arnscheidt, J., McGrogan, H. & McCormick, S., 2007. Characterising phosphorus transfers in rural catchments using a continuous bank-side analyser. Hydrology and Earth System Sciences 11, 372-381. Kronvang, B. & Bruhn, A. J., 1996. Choice of sampling strategy and estimation method for calculating nitrogen and phosphorus transport in small lowland streams. Hydrological Processes 10(11): 1483-1501.
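
    The quantity being estimated, and the effect of sampling only a random subset of storm events, can be illustrated as follows (a simplified sketch with hypothetical variable names; it represents each sampled event by a single concentration-discharge pair and omits the log-log rating-curve and Monte Carlo regression steps used in the study):

        import numpy as np

        def flow_weighted_mean(conc, discharge):
            """Flow-weighted mean concentration, sum(C*Q)/sum(Q)."""
            return np.sum(conc * discharge) / np.sum(discharge)

        def subsampled_estimates(event_c, event_q, base_c, base_q, n_events, n_rep=1000, seed=0):
            """Distribution of flow-weighted mean estimates when only n_events storm
            events are randomly selected and combined with the baseflow samples."""
            rng = np.random.default_rng(seed)
            out = np.empty(n_rep)
            for r in range(n_rep):
                idx = rng.choice(event_c.size, n_events, replace=False)
                c = np.concatenate([event_c[idx], base_c])
                q = np.concatenate([event_q[idx], base_q])
                out[r] = flow_weighted_mean(c, q)
            return out          # compare spread and bias against the full-series value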

  4. Chronic obstructive pulmonary disease self-management activation research trial (COPD-SMART): results of recruitment and baseline patient characteristics.

    PubMed

    Russo, Rennie; Coultas, David; Ashmore, Jamile; Peoples, Jennifer; Sloan, John; Jackson, Bradford E; Uhm, Minyong; Singh, Karan P; Blair, Steven N; Bae, Sejong

    2015-03-01

    To describe the recruitment methods, study participation rate, and baseline characteristics of a representative sample of outpatients with COPD eligible for pulmonary rehabilitation participating in a trial of a lifestyle behavioral intervention to increase physical activity. A patient registry was developed for recruitment using an administrative database from primary care and specialty clinics of an academic medical center in northeast Texas for a parallel group randomized trial. The registry was comprised of 5582 patients and over the course of the 30 month recruitment period 325 patients were enrolled for an overall study participation rate of 35.1%. After a 6-week COPD self-management education period provided to all enrolled patients, 305 patients were randomized into either usual care (UC; n=156) or the physical activity self-management intervention (PASM; n=149). There were no clinically significant differences in demographics, clinical characteristics, or health status indicators between the randomized groups. The results of this recruitment process demonstrate the successful use of a patient registry for enrolling a representative sample of outpatients eligible for pulmonary rehabilitation with COPD from primary and specialty care. Moreover, this approach to patient recruitment provides a model for future studies utilizing administrative databases and electronic health records. Published by Elsevier Inc.

  5. A direct observation method for auditing large urban centers using stratified sampling, mobile GIS technology and virtual environments.

    PubMed

    Lafontaine, Sean J V; Sawada, M; Kristjansson, Elizabeth

    2017-02-16

    With the expansion and growth of research on neighbourhood characteristics, there is an increased need for direct observational field audits. Herein, we introduce a novel direct observational audit method and systematic social observation instrument (SSOI) for efficiently assessing neighbourhood aesthetics over large urban areas. Our audit method uses spatial random sampling stratified by residential zoning and incorporates both mobile geographic information systems technology and virtual environments. The reliability of our method was tested in two ways: first, in 15 Ottawa neighbourhoods, we compared results at audited locations over two subsequent years; and second, we audited every residential block (167 blocks) in one neighbourhood and compared the distribution of SSOI aesthetics index scores with results from the randomly audited locations. Finally, we present interrater reliability and consistency results on all observed items. The observed neighbourhood average aesthetics index score estimated from four or five stratified random audit locations is sufficient to characterize the average neighbourhood aesthetics. The SSOI was internally consistent and demonstrated good to excellent interrater reliability. At the neighbourhood level, aesthetics is positively related to SES and physical activity and negatively correlated with BMI. The proposed approach to direct neighbourhood auditing performs sufficiently and has the advantage of financial and temporal efficiency when auditing a large city.

  6. Continuous representation of tumor microvessel density and detection of angiogenic hotspots in histological whole-slide images.

    PubMed

    Kather, Jakob Nikolas; Marx, Alexander; Reyes-Aldasoro, Constantino Carlos; Schad, Lothar R; Zöllner, Frank Gerrit; Weis, Cleo-Aron

    2015-08-07

    Blood vessels in solid tumors are not randomly distributed, but are clustered in angiogenic hotspots. Tumor microvessel density (MVD) within these hotspots correlates with patient survival and is widely used both in diagnostic routine and in clinical trials. Still, these hotspots are usually subjectively defined. There is no unbiased, continuous and explicit representation of tumor vessel distribution in histological whole slide images. This shortcoming distorts angiogenesis measurements and may account for ambiguous results in the literature. In the present study, we describe and evaluate a new method that eliminates this bias and makes angiogenesis quantification more objective and more efficient. Our approach involves automatic slide scanning, automatic image analysis and spatial statistical analysis. By comparing a continuous MVD function of the actual sample to random point patterns, we introduce an objective criterion for hotspot detection: An angiogenic hotspot is defined as a clustering of blood vessels that is very unlikely to occur randomly. We evaluate the proposed method in N=11 images of human colorectal carcinoma samples and compare the results to a blinded human observer. For the first time, we demonstrate the existence of statistically significant hotspots in tumor images and provide a tool to accurately detect these hotspots.

  7. Spline methods for approximating quantile functions and generating random samples

    NASA Technical Reports Server (NTRS)

    Schiess, J. R.; Matthews, C. G.

    1985-01-01

    Two cubic spline formulations are presented for representing the quantile function (inverse cumulative distribution function) of a random sample of data. Both B-spline and rational spline approximations are compared with analytic representations of the quantile function. It is also shown how these representations can be used to generate random samples for use in simulation studies. Comparisons are made on samples generated from known distributions and a sample of experimental data. The spline representations are more accurate for multimodal and skewed samples and require much less time to generate samples than the analytic representation.
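
    Generating random samples from a spline representation of the quantile function is ordinary inverse-transform sampling; a minimal cubic-spline sketch (our own illustration using SciPy; the B-spline and rational-spline formulations compared in the paper are not reproduced here):

        import numpy as np
        from scipy.interpolate import CubicSpline

        def spline_quantile_sampler(data):
            """Fit a cubic spline to the empirical quantile function of `data` and
            return a function mapping uniform deviates to new samples."""
            x = np.sort(np.asarray(data, dtype=float))
            p = (np.arange(1, x.size + 1) - 0.5) / x.size     # plotting positions
            Q = CubicSpline(p, x)                             # quantile function Q(p)
            def draw(size, rng=None):
                rng = rng or np.random.default_rng()
                u = rng.uniform(p[0], p[-1], size)            # stay inside the fitted range
                return Q(u)
            return draw

        # usage: sampler = spline_quantile_sampler(observed_values); new_values = sampler(1000)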

  8. pSuc-Lys: Predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach.

    PubMed

    Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen

    2016-04-07

    As one type of post-translational modification (PTM), protein lysine succinylation is important in regulating a variety of biological processes; it is also involved in some diseases. Consequently, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence containing many Lys residues, which ones can be succinylated, and which ones cannot? To address this problem, we have developed a predictor called pSuc-Lys through (1) incorporating the sequence-coupled information into the general pseudo amino acid composition, (2) balancing out the skewed training dataset by random sampling, and (3) constructing an ensemble predictor by fusing a series of individual random forest classifiers. Rigorous cross-validations indicated that it remarkably outperformed the existing methods. A user-friendly web-server for pSuc-Lys has been established at http://www.jci-bioinfo.cn/pSuc-Lys, by which users can easily obtain their desired results without needing to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can also be used to analyze many other problems in computational proteomics. Copyright © 2016 Elsevier Ltd. All rights reserved.
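
    The dataset-balancing and ensemble-fusion ideas (steps 2 and 3 above) can be sketched with scikit-learn random forests (a simplified stand-in with hypothetical names; the pseudo amino acid composition features and the exact fusion rule of pSuc-Lys are not reproduced):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def balanced_rf_ensemble(X, y, n_members=10, seed=0):
            """Balance a skewed binary dataset (labels 0/1) by randomly undersampling
            the majority class, then train one random forest per balanced subset."""
            rng = np.random.default_rng(seed)
            pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
            minority, majority = (pos, neg) if pos.size < neg.size else (neg, pos)
            members = []
            for m in range(n_members):
                picked = rng.choice(majority, minority.size, replace=False)
                idx = np.concatenate([minority, picked])
                forest = RandomForestClassifier(n_estimators=200, random_state=m)
                forest.fit(X[idx], y[idx])
                members.append(forest)
            return members

        def ensemble_score(members, X):
            """Fuse the individual forests by averaging predicted class-1 probabilities."""
            return np.mean([f.predict_proba(X)[:, 1] for f in members], axis=0)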

  9. Empirical likelihood inference in randomized clinical trials.

    PubMed

    Zhang, Biao

    2017-01-01

    In individually randomized controlled trials, in addition to the primary outcome, information is often available on a number of covariates prior to randomization. This information is frequently utilized to undertake adjustment for baseline characteristics in order to increase the precision of estimated average treatment effects; such adjustment is usually performed via covariate adjustment in outcome regression models. Although the use of covariate adjustment is widely seen as desirable for making treatment effect estimates more precise and the corresponding hypothesis tests more powerful, there are considerable concerns that objective inference in randomized clinical trials can potentially be compromised. In this paper, we study an empirical likelihood approach to covariate adjustment and propose two unbiased estimating functions that automatically decouple evaluation of average treatment effects from regression modeling of covariate-outcome relationships. The resulting empirical likelihood estimator of the average treatment effect is as efficient as the existing efficient adjusted estimators [1] when separate treatment-specific working regression models are correctly specified, and is at least as efficient as the existing efficient adjusted estimators [1] for any given treatment-specific working regression models, whether or not they coincide with the true treatment-specific covariate-outcome relationships. We present a simulation study to compare the finite sample performance of various methods, along with some results on the analysis of a data set from an HIV clinical trial. The simulation results indicate that the proposed empirical likelihood approach is more efficient and powerful than its competitors when the working covariate-outcome relationships by treatment status are misspecified.

  10. Modeling Training Site Vegetation Coverage Probability with a Random Optimizing Procedure: An Artificial Neural Network Approach.

    DTIC Science & Technology

    1998-05-01

    Modeling Training Site Vegetation Coverage Probability with a Random Optimizing Procedure: An Artificial Neural Network Approach, by Biing T. Guan, George Z. Gertner, and Alan B... coverage based on past coverage. Approach: A literature survey was conducted to identify artificial neural network analysis techniques applicable for

  11. DSM-5 field trials in the United States and Canada, Part I: study design, sampling strategy, implementation, and analytic approaches.

    PubMed

    Clarke, Diana E; Narrow, William E; Regier, Darrel A; Kuramoto, S Janet; Kupfer, David J; Kuhl, Emily A; Greiner, Lisa; Kraemer, Helena C

    2013-01-01

    This article discusses the design, sampling strategy, implementation, and data analytic processes of the DSM-5 Field Trials. The DSM-5 Field Trials were conducted by using a test-retest reliability design with a stratified sampling approach across six adult and four pediatric sites in the United States and one adult site in Canada. A stratified random sampling approach was used to enhance precision in the estimation of the reliability coefficients. A web-based research electronic data capture system was used for simultaneous data collection from patients and clinicians across sites and for centralized data management. Weighted descriptive analyses, intraclass kappa and intraclass correlation coefficients for stratified samples, and receiver operating characteristic curves were computed. The DSM-5 Field Trials capitalized on advances since DSM-III and DSM-IV in statistical measures of reliability (i.e., intraclass kappa for stratified samples) and other recently developed measures to determine confidence intervals around kappa estimates. Diagnostic interviews using DSM-5 criteria were conducted by 279 clinicians of varied disciplines who received training comparable to what would be available to any clinician after publication of DSM-5. Overall, 2,246 patients with various diagnoses and levels of comorbidity were enrolled, of which over 86% were seen for two diagnostic interviews. A range of reliability coefficients were observed for the categorical diagnoses and dimensional measures. Multisite field trials and training comparable to what would be available to any clinician after publication of DSM-5 provided “real-world” testing of DSM-5 proposed diagnoses.

  12. SABRE: a method for assessing the stability of gene modules in complex tissues and subject populations.

    PubMed

    Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T

    2016-11-14

    Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules, from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how sensitive the outputs of these computational methods are to the input sample set, i.e., their stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased, and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permuted gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.

  13. Assessment of DoD Wounded Warrior Matters - Wounded Warrior Battalion - West Headquarters and Southern California Units

    DTIC Science & Technology

    2012-08-22

    urine drug tests (UDTs). A BUMED publication will follow the Interim Guidance for sustainment. Two training programs have also been created to... determined how many Service members were required to be interviewed, then we applied a simple random sample approach to determine the Service members we

  14. Stochastic Control and Numerical Methods with Applications to Communications. Game Theoretic/Subsolution to Importance Sampling for Rare Event Simulation

    DTIC Science & Technology

    2008-11-01

    support to the value of the approach. 9. Scheduling and Control of Mobile Communications Networks with Randomly Time Varying Channels by Stability ...biological systems. Many examples arise in communications and queueing, due to the finite speed of signal transmission, the nonnegligible time required...without delays, the system state takes values in a subset of some finite-dimensional Euclidean space, and the control is a functional of the current

  15. Calibrating random forests for probability estimation.

    PubMed

    Dankowski, Theresa; Ziegler, Andreas

    2016-09-30

    Probabilities can be consistently estimated using random forests. It is, however, unclear how random forests should be updated to make predictions for other centers or at different time points. In this work, we present two approaches for updating random forests for probability estimation. The first method has been proposed by Elkan and may be used for updating any machine learning approach yielding consistent probabilities, so-called probability machines. The second approach is a new strategy specifically developed for random forests. Using the terminal nodes, which represent conditional probabilities, the random forest is first translated to logistic regression models. These are, in turn, used for re-calibration. The two updating strategies were compared in a simulation study and are illustrated with data from the German Stroke Study Collaboration. In most simulation scenarios, both methods led to similar improvements. In the simulation scenario in which the stricter assumptions of Elkan's method were not met, the logistic regression-based re-calibration approach for random forests outperformed Elkan's method. It also performed better on the stroke data than Elkan's method. The strength of Elkan's method is its general applicability to any probability machine. However, if the strict assumptions underlying this approach are not met, the logistic regression-based approach is preferable for updating random forests for probability estimation. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
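
    The following sketch conveys the flavour of logistic re-calibration, though in a simplified form: instead of translating terminal nodes into logistic models as the second strategy does, it fits a single logistic model to the logit of the forest's predicted probabilities using data from the new center (a Platt-style recalibration). All names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def fit_recalibrator(rf, X_new, y_new, eps=1e-6):
    """Fit a logistic re-calibration model on data from the new center, using
    the logit of the forest's predicted probability as the single covariate."""
    p = np.clip(rf.predict_proba(X_new)[:, 1], eps, 1 - eps)
    logit = np.log(p / (1 - p)).reshape(-1, 1)
    return LogisticRegression().fit(logit, y_new)

def recalibrated_proba(rf, recal, X, eps=1e-6):
    """Probabilities for new cases after re-calibration."""
    p = np.clip(rf.predict_proba(X)[:, 1], eps, 1 - eps)
    logit = np.log(p / (1 - p)).reshape(-1, 1)
    return recal.predict_proba(logit)[:, 1]
```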

  16. Social network recruitment for Yo Puedo - an innovative sexual health intervention in an underserved urban neighborhood: sample and design implications

    PubMed Central

    Minnis, Alexandra M.; vanDommelen-Gonzalez, Evan; Luecke, Ellen; Cheng, Helen; Dow, William; Bautista-Arredondo, Sergio; Padian, Nancy S.

    2016-01-01

    Most existing evidence-based sexual health interventions focus on individual-level behavior, even though there is substantial evidence that highlights the influential role of social environments in shaping adolescents’ behaviors and reproductive health outcomes. We developed Yo Puedo, a combined conditional cash transfer (CCT) and life skills intervention for youth to promote educational attainment, job training, and reproductive health wellness that we then evaluated for feasibility among 162 youth aged 16–21 years in a predominantly Latino community in San Francisco, CA. The intervention targeted youth’s social networks and involved recruitment and randomization of small social network clusters. In this paper we describe the design of the feasibility study and report participants’ baseline characteristics. Furthermore, we examined the sample and design implications of recruiting social network clusters as the unit of randomization. Baseline data provide evidence that we successfully enrolled high risk youth using a social network recruitment approach in community and school-based settings. Nearly all participants (95%) were high risk for adverse educational and reproductive health outcomes based on multiple measures of low socioeconomic status (81%) and/or reported high risk behaviors (e.g., gang affiliation, past pregnancy, recent unprotected sex, frequent substance use) (62%). We achieved variability in the study sample through heterogeneity in recruitment of the index participants, whereas the individuals within the small social networks of close friends demonstrated substantial homogeneity across sociodemographic and risk profile characteristics. Social networks recruitment was feasible and yielded a sample of high risk youth willing to enroll in a randomized study to evaluate a novel sexual health intervention. PMID:25358834

  17. An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics

    PubMed Central

    Torii, Manabu; Yin, Lanlan; Nguyen, Thang; Mazumdar, Chand T.; Liu, Hongfang; Hartley, David M.; Nelson, Noele P.

    2014-01-01

    Purpose Early detection of infectious disease outbreaks is crucial to protecting the public health of a society. Online news articles provide timely information on disease outbreaks worldwide. In this study, we investigated automated detection of articles relevant to disease outbreaks using machine learning classifiers. In a real-life setting, it is expensive to prepare a training data set for classifiers, which usually consists of manually labeled relevant and irrelevant articles. To mitigate this challenge, we examined the use of randomly sampled unlabeled articles as well as labeled relevant articles. Methods Naïve Bayes and Support Vector Machine (SVM) classifiers were trained on 149 relevant and 149 or more randomly sampled unlabeled articles. Diverse classifiers were trained by varying the number of sampled unlabeled articles and also the number of word features. The trained classifiers were applied to 15 thousand articles published over 15 days. Top-ranked articles from each classifier were pooled and the resulting set of 1337 articles was reviewed by an expert analyst to evaluate the classifiers. Results Daily averages of areas under ROC curves (AUCs) over the 15-day evaluation period were 0.841 and 0.836, respectively, for the naïve Bayes and SVM classifier. We referenced a database of disease outbreak reports to confirm that the evaluation data set resulting from the pooling method indeed covered incidents recorded in the database during the evaluation period. Conclusions The proposed text classification framework utilizing randomly sampled unlabeled articles can facilitate a cost-effective approach to training machine learning classifiers in a real-life Internet-based biosurveillance project. We plan to examine this framework further using larger data sets and using articles in non-English languages. PMID:21134784
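
    A rough sketch of the framework, training classifiers on labeled relevant articles against a random sample of unlabeled articles treated as the negative class and then ranking new articles by score, is given below. It uses scikit-learn's naïve Bayes and linear SVM with TF-IDF features; the feature settings and function names are assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

def train_relevance_models(relevant_texts, unlabeled_texts, n_unlabeled=149,
                           max_features=5000, rng=None):
    """Treat a random sample of unlabeled articles as the negative class and
    train two classifiers on the combined set."""
    rng = np.random.default_rng(rng)
    sampled = list(rng.choice(unlabeled_texts, size=n_unlabeled, replace=False))
    texts = list(relevant_texts) + sampled
    labels = np.array([1] * len(relevant_texts) + [0] * len(sampled))
    vec = TfidfVectorizer(max_features=max_features, stop_words="english")
    X = vec.fit_transform(texts)
    return vec, MultinomialNB().fit(X, labels), LinearSVC().fit(X, labels)

def rank_articles(vec, model, new_texts):
    """Score unseen articles; higher scores are more likely outbreak-relevant."""
    X = vec.transform(new_texts)
    scores = (model.decision_function(X) if hasattr(model, "decision_function")
              else model.predict_proba(X)[:, 1])
    return np.argsort(scores)[::-1], scores
```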

  18. Estimation of reference intervals from small samples: an example using canine plasma creatinine.

    PubMed

    Geffré, A; Braun, J P; Trumel, C; Concordet, D

    2009-12-01

    According to international recommendations, reference intervals should be determined from at least 120 reference individuals, which often are impossible to achieve in veterinary clinical pathology, especially for wild animals. When only a small number of reference subjects is available, the possible bias cannot be known and the normality of the distribution cannot be evaluated. A comparison of reference intervals estimated by different methods could be helpful. The purpose of this study was to compare reference limits determined from a large set of canine plasma creatinine reference values, and large subsets of this data, with estimates obtained from small samples selected randomly. Twenty sets each of 120 and 27 samples were randomly selected from a set of 1439 plasma creatinine results obtained from healthy dogs in another study. Reference intervals for the whole sample and for the large samples were determined by a nonparametric method. The estimated reference limits for the small samples were minimum and maximum, mean +/- 2 SD of native and Box-Cox-transformed values, 2.5th and 97.5th percentiles by a robust method on native and Box-Cox-transformed values, and estimates from diagrams of cumulative distribution functions. The whole sample had a heavily skewed distribution, which approached Gaussian after Box-Cox transformation. The reference limits estimated from small samples were highly variable. The closest estimates to the 1439-result reference interval for 27-result subsamples were obtained by both parametric and robust methods after Box-Cox transformation but were grossly erroneous in some cases. For small samples, it is recommended that all values be reported graphically in a dot plot or histogram and that estimates of the reference limits be compared using different methods.
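
    The kinds of small-sample estimates compared in the study can be sketched as follows. This is only an illustration (it omits the robust and cumulative-distribution-diagram methods); it assumes strictly positive values so the Box-Cox transformation applies, and all function names are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

def reference_interval_estimates(sample):
    """Several simple reference-limit estimates from one (possibly small) sample."""
    x = np.asarray(sample, dtype=float)
    out = {}
    out["min_max"] = (x.min(), x.max())
    out["mean_2sd"] = (x.mean() - 2 * x.std(ddof=1), x.mean() + 2 * x.std(ddof=1))
    out["percentile_2.5_97.5"] = tuple(np.percentile(x, [2.5, 97.5]))
    # mean +/- 2 SD on the Box-Cox scale, back-transformed (requires x > 0)
    xt, lam = stats.boxcox(x)
    lo, hi = xt.mean() - 2 * xt.std(ddof=1), xt.mean() + 2 * xt.std(ddof=1)
    out["boxcox_mean_2sd"] = tuple(inv_boxcox(np.array([lo, hi]), lam))
    return out
```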

  19. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

    PubMed Central

    Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

    2016-01-01

    Outcome-dependent sampling (ODS) is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known such designs are the case-control design for a binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric, where all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and derive its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design, together with the proposed estimator, provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of PCB exposure with hearing loss in children in the Collaborative Perinatal Study. PMID:27966260

  20. Bias modification training can alter approach bias and chocolate consumption.

    PubMed

    Schumacher, Sophie E; Kemps, Eva; Tiggemann, Marika

    2016-01-01

    Recent evidence has demonstrated that bias modification training has potential to reduce cognitive biases for attractive targets and affect health behaviours. The present study investigated whether cognitive bias modification training could be applied to reduce approach bias for chocolate and affect subsequent chocolate consumption. A sample of 120 women (18-27 years) were randomly assigned to an approach-chocolate condition or avoid-chocolate condition, in which they were trained to approach or avoid pictorial chocolate stimuli, respectively. Training had the predicted effect on approach bias, such that participants trained to approach chocolate demonstrated an increased approach bias to chocolate stimuli whereas participants trained to avoid such stimuli showed a reduced bias. Further, participants trained to avoid chocolate ate significantly less of a chocolate muffin in a subsequent taste test than participants trained to approach chocolate. Theoretically, results provide support for the dual process model's conceptualisation of consumption as being driven by implicit processes such as approach bias. In practice, approach bias modification may be a useful component of interventions designed to curb the consumption of unhealthy foods. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Random forests, a novel approach for discrimination of fish populations using parasites as biological tags.

    PubMed

    Perdiguero-Alonso, Diana; Montero, Francisco E; Kostadinova, Aneta; Raga, Juan Antonio; Barrett, John

    2008-10-01

    Due to the complexity of host-parasite relationships, discrimination between fish populations using parasites as biological tags is difficult. This study introduces, to our knowledge for the first time, random forests (RF) as a new modelling technique in the application of parasite community data as biological markers for population assignment of fish. This novel approach is applied to a dataset with a complex structure comprising 763 parasite infracommunities in population samples of Atlantic cod, Gadus morhua, from the spawning/feeding areas in five regions in the North East Atlantic (Baltic, Celtic, Irish and North seas and Icelandic waters). The learning behaviour of RF is evaluated in comparison with two other algorithms applied to class assignment problems, linear discriminant function analysis (LDA) and artificial neural networks (ANN). The three algorithms are used to develop predictive models applying three cross-validation procedures in a series of experiments (252 models in total). The comparative approach to RF, LDA and ANN algorithms applied to the same datasets demonstrates the competitive potential of RF for developing predictive models, since RF exhibited better accuracy of prediction and outperformed LDA and ANN in the assignment of fish to their regions of sampling using parasite community data. The comparative analyses and the validation experiment with a 'blind' sample confirmed that RF models performed more effectively with a large and diverse training set and a large number of variables. The discrimination results obtained for a migratory fish species with largely overlapping parasite communities reflect the high potential of RF for developing predictive models using data that are both complex and noisy, and indicate that it is a promising tool for parasite tag studies. Our results suggest that parasite community data can be used successfully to discriminate individual cod from the five different regions of the North East Atlantic studied using RF.
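
    A minimal version of the RF-versus-LDA comparison, assuming a parasite-community feature matrix X and region labels y, might look like the sketch below (stratified cross-validated accuracy only; the study's full design used several cross-validation procedures and also ANN models).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score

def compare_rf_lda(X, y, n_splits=5, seed=0):
    """Cross-validated assignment accuracy of random forests vs. linear
    discriminant analysis on the same feature matrix."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    rf = RandomForestClassifier(n_estimators=500, random_state=seed)
    lda = LinearDiscriminantAnalysis()
    return {
        "random_forest": cross_val_score(rf, X, y, cv=cv).mean(),
        "lda": cross_val_score(lda, X, y, cv=cv).mean(),
    }
```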

  2. Hankin and Reeves' approach to estimating fish abundance in small streams: Limitations and alternatives

    USGS Publications Warehouse

    Thompson, W.L.

    2003-01-01

    Hankin and Reeves' (1988) approach to estimating fish abundance in small streams has been applied in stream fish studies across North America. However, their population estimator relies on two key assumptions: (1) removal estimates are equal to the true numbers of fish, and (2) removal estimates are highly correlated with snorkel counts within a subset of sampled stream units. Violations of these assumptions may produce suspect results. To determine possible sources of the assumption violations, I used data on the abundance of steelhead Oncorhynchus mykiss from Hankin and Reeves' (1988) in a simulation composed of 50,000 repeated, stratified systematic random samples from a spatially clustered distribution. The simulation was used to investigate effects of a range of removal estimates, from 75% to 100% of true fish abundance, on overall stream fish population estimates. The effects of various categories of removal-estimates-to-snorkel-count correlation levels (r = 0.75-1.0) on fish population estimates were also explored. Simulation results indicated that Hankin and Reeves' approach may produce poor results unless removal estimates exceed at least 85% of the true number of fish within sampled units and unless correlations between removal estimates and snorkel counts are at least 0.90. A potential modification to Hankin and Reeves' approach is the inclusion of environmental covariates that affect detection rates of fish into the removal model or other mark-recapture model. A potential alternative approach is to use snorkeling combined with line transect sampling to estimate fish densities within stream units. As with any method of population estimation, a pilot study should be conducted to evaluate its usefulness, which requires a known (or nearly so) population of fish to serve as a benchmark for evaluating bias and precision of estimators.

  3. Methodological quality of behavioural weight loss studies: a systematic review

    PubMed Central

    Lemon, S. C.; Wang, M. L.; Haughton, C. F.; Estabrook, D. P.; Frisard, C. F.; Pagoto, S. L.

    2018-01-01

    Summary This systematic review assessed the methodological quality of behavioural weight loss intervention studies conducted among adults and associations between quality and statistically significant weight loss outcome, strength of intervention effectiveness and sample size. Searches for trials published between January 2009 and December 2014 were conducted using PUBMED, MEDLINE and PSYCINFO and identified ninety studies. Methodological quality indicators included study design, anthropometric measurement approach, sample size calculations, intent-to-treat (ITT) analysis, loss to follow-up rate, missing data strategy, sampling strategy, report of treatment receipt and report of intervention fidelity (mean = 6.3). Indicators most commonly utilized included randomized design (100%), objectively measured anthropometrics (96.7%), ITT analysis (86.7%) and reporting treatment adherence (76.7%). Most studies (62.2%) had a follow-up rate >75% and reported a loss to follow-up analytic strategy or minimal missing data (69.9%). Describing intervention fidelity (34.4%) and sampling from a known population (41.1%) were least common. Methodological quality was not associated with reporting a statistically significant result, effect size or sample size. This review found the published literature of behavioural weight loss trials to be of high quality for specific indicators, including study design and measurement. Areas identified for improvement include the use of more rigorous statistical approaches to loss to follow-up and better fidelity reporting. PMID:27071775

  4. Protein Multiplexed Immunoassay Analysis with R.

    PubMed

    Breen, Edmond J

    2017-01-01

    Plasma samples from 177 control and type 2 diabetes patients collected at three Australian hospitals are screened for 14 analytes using six custom-made multiplex kits across 60 96-well plates. In total, 354 samples were collected from the patients, representing one baseline and one end point sample from each patient. R methods and source code for analyzing the analyte fluorescence response obtained from these samples by Luminex Bio-Plex® xMap multiplexed immunoassay technology are disclosed. Techniques and R procedures for reading Bio-Plex® result files for statistical analysis and data visualization are also presented. The need for technical replicates and the number of technical replicates are addressed, as well as plate layout design strategies. Multinomial regression is used to determine plate-to-sample covariate balance. Methods for matching clinical covariate information to Bio-Plex® results and vice versa are given, as are methods for measuring and inspecting the quality of the fluorescence responses. Both fixed and mixed-effect approaches for immunoassay statistical differential analysis are presented and discussed. A random effect approach to outlier analysis and detection is also shown. The bioinformatics R methodology presented here provides a foundation for rigorous and reproducible analysis of the fluorescence response obtained from multiplexed immunoassays.

  5. A Mixed Kijima Model Using the Weibull-Based Generalized Renewal Processes

    PubMed Central

    2015-01-01

    Generalized Renewal Processes are useful for modeling the rejuvenation of dynamical systems resulting from planned or unplanned interventions. We present new perspectives for the Generalized Renewal Processes in general and for the Weibull-based Generalized Renewal Processes in particular. Departing from the existing literature, we present a mixed Generalized Renewal Processes approach involving Kijima Type I and II models, allowing one to infer the impact of distinct interventions on the performance of the system under study. The first and second theoretical moments of this model are introduced, as well as its maximum likelihood estimation and random sampling approaches. In order to illustrate the usefulness of the proposed Weibull-based Generalized Renewal Processes model, some real data sets involving improving, stable, and deteriorating systems are used. PMID:26197222
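
    A sketch of random sampling from a Weibull-based GRP with Kijima virtual ages is shown below. It assumes a Weibull baseline with shape beta and scale eta and a single rejuvenation parameter q, and uses inverse-transform sampling of the conditional failure distribution given the current virtual age; it is an illustration, not the authors' estimation code.

```python
import numpy as np

def sample_grp_failures(n_events, beta, eta, q, kijima_type=1, rng=None):
    """Simulate successive failure times under a Weibull-based Generalized
    Renewal Process with Kijima Type I or Type II virtual age."""
    rng = np.random.default_rng(rng)
    v = 0.0            # virtual age just after the previous repair
    t = 0.0            # calendar time
    times = []
    for _ in range(n_events):
        u = rng.uniform()
        # inverse of the conditional Weibull CDF given virtual age v
        x = eta * ((v / eta) ** beta - np.log(u)) ** (1.0 / beta) - v
        t += x
        times.append(t)
        v = v + q * x if kijima_type == 1 else q * (v + x)
    return np.array(times)

# q = 0 reduces to perfect repair (renewal); q = 1 to minimal repair (NHPP-like)
times = sample_grp_failures(10, beta=1.8, eta=100.0, q=0.3, kijima_type=1)
```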

  6. [Burnout prevalence in pediatricians of general hospitals].

    PubMed

    Gil-Monte, Pedro R; Marucco, Mariana A

    2008-06-01

    To assess the prevalence of burnout in pediatricians of general hospitals. Non-randomized cross-sectional study carried out in the city of Buenos Aires, Argentina, in 2006. The study sample comprised 123 pediatricians working in pediatrics services of general hospitals, 89 women (72.4%) and 34 men (27.6%). Data were gathered through an anonymous, self-administered questionnaire. Burnout was measured using the Maslach Burnout Inventory, and different approaches were employed to estimate burnout prevalence. The prevalence of burnout was different according to the approach used: the prevalence was 10.6% by the United States criteria; 24.4% by the Spanish criteria; 37.4% by the Argentinean criteria and 3.25% by the Dutch clinical criteria. Burnout prevalences varied significantly depending on the approach used due to cross-cultural influences.

  7. Influences of sampling size and pattern on the uncertainty of correlation estimation between soil water content and its influencing factors

    NASA Astrophysics Data System (ADS)

    Lai, Xiaoming; Zhu, Qing; Zhou, Zhiwen; Liao, Kaihua

    2017-12-01

    In this study, seven random combination sampling strategies were applied to investigate the uncertainties in estimating the hillslope mean soil water content (SWC) and the correlation coefficients between the SWC and soil/terrain properties on a tea + bamboo hillslope. One of the sampling strategies is global random sampling and the other six are stratified random sampling on the top, middle, toe, top + mid, top + toe and mid + toe slope positions. When each sampling strategy was applied, sample sizes were gradually reduced and each sampling size contained 3000 replicates. Under each sampling size of each sampling strategy, the relative errors (REs) and coefficients of variation (CVs) of the estimated hillslope mean SWC and correlation coefficients between the SWC and soil/terrain properties were calculated to quantify the accuracy and uncertainty. The results showed that the uncertainty of the estimations decreased as the sampling size increased. However, larger sample sizes were required to reduce the uncertainty in correlation coefficient estimation than in hillslope mean SWC estimation. Under global random sampling, 12 randomly sampled sites on this hillslope were adequate to estimate the hillslope mean SWC with RE and CV ≤10%. However, at least 72 randomly sampled sites were needed to ensure the estimated correlation coefficients with REs and CVs ≤10%. Compared with the other sampling strategies, reducing sampling sites on the middle slope had the least influence on the estimation of hillslope mean SWC and correlation coefficients. Under this strategy, 60 sites (10 on the middle slope and 50 on the top and toe slopes) were enough to ensure the estimated correlation coefficients with REs and CVs ≤10%. This suggested that when designing the SWC sampling, the proportion of sites on the middle slope can be reduced to 16.7% of the total number of sites. Findings of this study will be useful for optimal SWC sampling design.
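
    The global-random-sampling part of such an uncertainty analysis can be mimicked with a simple Monte Carlo loop: repeatedly draw subsamples of a given size, estimate the mean, and summarize the relative error and coefficient of variation. The definitions of RE and CV used here (mean absolute relative error and CV of the replicate estimates, both in percent) are assumptions for illustration, as are the function names.

```python
import numpy as np

def subsample_uncertainty(values, sizes, n_rep=3000, rng=None):
    """Relative error and coefficient of variation of the estimated mean for
    random subsamples of different sizes (global random sampling)."""
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    truth = values.mean()                         # "true" mean from all sites
    results = {}
    for m in sizes:
        est = np.array([values[rng.choice(len(values), size=m, replace=False)].mean()
                        for _ in range(n_rep)])
        re = np.abs(est - truth).mean() / truth * 100      # mean relative error, %
        cv = est.std(ddof=1) / est.mean() * 100            # coefficient of variation, %
        results[m] = (re, cv)
    return results
```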

  8. Comprehensive evaluation of direct injection mass spectrometry for the quantitative profiling of volatiles in food samples

    PubMed Central

    2016-01-01

    Although qualitative strategies based on direct injection mass spectrometry (DIMS) have recently emerged as an alternative for the rapid classification of food samples, the potential of these approaches in quantitative tasks has scarcely been addressed to date. In this paper, the applicability of different multivariate regression procedures to data collected by DIMS from simulated mixtures has been evaluated. The most relevant factors affecting quantitation, such as random noise, the number of calibration samples, type of validation, mixture complexity and similarity of mass spectra, were also considered and comprehensively discussed. Based on the conclusions drawn from simulated data, and as an example of application, experimental mass spectral fingerprints collected by direct thermal desorption coupled to mass spectrometry were used for the quantitation of major volatiles in Thymus zygis subsp. zygis chemotypes. The results obtained, validated with the direct thermal desorption coupled to gas chromatography–mass spectrometry method here used as a reference, show the potential of DIMS approaches for the fast and precise quantitative profiling of volatiles in foods. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644978

  9. Sampling large random knots in a confined space

    NASA Astrophysics Data System (ADS)

    Arsuaga, J.; Blackstone, T.; Diao, Y.; Hinson, K.; Karadayi, E.; Saito, M.

    2007-09-01

    DNA knots formed under extreme conditions of condensation, as in bacteriophage P4, are difficult to analyze experimentally and theoretically. In this paper, we propose to use the uniform random polygon model as a supplementary method to the existing methods for generating random knots in confinement. The uniform random polygon model allows us to sample knots with large crossing numbers and also to generate large diagrammatically prime knot diagrams. We show numerically that uniform random polygons sample knots with large minimum crossing numbers and certain complicated knot invariants (such as those observed experimentally). We do this in terms of the knot determinants or colorings. Our numerical results suggest that the average determinant of a uniform random polygon of n vertices grows faster than O(e^{n^2}). We also investigate the complexity of prime knot diagrams. We show rigorously that the probability that a randomly selected 2D uniform random polygon of n vertices is almost diagrammatically prime goes to 1 as n goes to infinity. Furthermore, the average number of crossings in such a diagram is on the order of O(n^2). Therefore, the two-dimensional uniform random polygons offer an effective way of sampling large (prime) knots, which can be useful in various applications.

  10. Random bit generation at tunable rates using a chaotic semiconductor laser under distributed feedback.

    PubMed

    Li, Xiao-Zhou; Li, Song-Sui; Zhuang, Jun-Ping; Chan, Sze-Chun

    2015-09-01

    A semiconductor laser with distributed feedback from a fiber Bragg grating (FBG) is investigated for random bit generation (RBG). The feedback perturbs the laser to emit chaotically with the intensity being sampled periodically. The samples are then converted into random bits by a simple postprocessing of self-differencing and selecting bits. Unlike a conventional mirror that provides localized feedback, the FBG provides distributed feedback which effectively suppresses the information of the round-trip feedback delay time. Randomness is ensured even when the sampling period is commensurate with the feedback delay between the laser and the grating. Consequently, in RBG, the FBG feedback enables continuous tuning of the output bit rate, reduces the minimum sampling period, and increases the number of bits selected per sample. RBG is experimentally investigated at a sampling period continuously tunable from over 16 ns down to 50 ps, while the feedback delay is fixed at 7.7 ns. By selecting 5 least-significant bits per sample, output bit rates from 0.3 to 100 Gbps are achieved with randomness examined by the National Institute of Standards and Technology test suite.
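
    The post-processing step, self-differencing consecutive samples and keeping a few least-significant bits, can be sketched as below. The 8-bit quantization, the modular difference, and the function name are illustrative assumptions, not the experimental signal chain.

```python
import numpy as np

def bits_from_samples(samples, n_lsb=5, adc_bits=8):
    """Convert periodically sampled chaotic intensities into bits by
    self-differencing consecutive samples and keeping the least-significant bits."""
    x = np.asarray(samples, dtype=float)
    # quantize to an ADC range (here 8-bit), then difference to remove slow bias
    q = np.interp(x, (x.min(), x.max()), (0, 2 ** adc_bits - 1)).astype(np.int64)
    diff = (q[1:] - q[:-1]) % (2 ** adc_bits)
    keep = diff & ((1 << n_lsb) - 1)              # select n_lsb least-significant bits
    bits = ((keep[:, None] >> np.arange(n_lsb)) & 1).astype(np.uint8)
    return bits.ravel()

# e.g., 5 bits per sample at a 20 GS/s sampling rate corresponds to 100 Gbps
```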

  11. Citywide cluster randomized trial to restore blighted vacant land and its effects on violence, crime, and fear

    PubMed Central

    Branas, Charles C.; South, Eugenia; Kondo, Michelle C.; Hohl, Bernadette C.; Bourgois, Philippe; Wiebe, Douglas J.; MacDonald, John M.

    2018-01-01

    Vacant and blighted urban land is a widespread and potentially risky environmental condition encountered by millions of people on a daily basis. About 15% of the land in US cities is deemed vacant or abandoned, an area roughly the size of Switzerland. In a citywide cluster randomized controlled trial, we investigated the effects of standardized, reproducible interventions that restore vacant land on the commission of violence, crime, and the perceptions of fear and safety. Quantitative and ethnographic analyses were included in a mixed-methods approach to more fully test and explicate our findings. A total of 541 randomly sampled vacant lots were randomly assigned into treatment and control study arms; outcomes from police and 445 randomly sampled participants were analyzed over a 38-month study period. Participants living near treated vacant lots reported significantly reduced perceptions of crime (−36.8%, P < 0.05), vandalism (−39.3%, P < 0.05), and safety concerns when going outside their homes (−57.8%, P < 0.05), as well as significantly increased use of outside spaces for relaxing and socializing (75.7%, P < 0.01). Significant reductions in crime overall (−13.3%, P < 0.01), gun violence (−29.1%, P < 0.001), burglary (−21.9%, P < 0.001), and nuisances (−30.3%, P < 0.05) were also found after the treatment of vacant lots in neighborhoods below the poverty line. Blighted and vacant urban land affects people’s perceptions of safety, and their actual, physical safety. Restoration of this land can be an effective and scalable infrastructure intervention for gun violence, crime, and fear in urban neighborhoods. PMID:29483246

  12. Detection of mastitis in dairy cattle by use of mixture models for repeated somatic cell scores: a Bayesian approach via Gibbs sampling.

    PubMed

    Odegård, J; Jensen, J; Madsen, P; Gianola, D; Klemetsdal, G; Heringstad, B

    2003-11-01

    The distribution of somatic cell scores could be regarded as a mixture of at least two components depending on a cow's udder health status. A heteroscedastic two-component Bayesian normal mixture model with random effects was developed and implemented via Gibbs sampling. The model was evaluated using datasets consisting of simulated somatic cell score records. Somatic cell score was simulated as a mixture representing two alternative udder health statuses ("healthy" or "diseased"). Animals were assigned randomly to the two components according to the probability of group membership (Pm). Random effects (additive genetic and permanent environment), when included, had identical distributions across mixture components. Posterior probabilities of putative mastitis were estimated for all observations, and model adequacy was evaluated using measures of sensitivity, specificity, and posterior probability of misclassification. Fitting different residual variances in the two mixture components caused some bias in estimation of parameters. When the components were difficult to disentangle, so were their residual variances, causing bias in estimation of Pm and of location parameters of the two underlying distributions. When all variance components were identical across mixture components, the mixture model analyses returned parameter estimates essentially without bias and with a high degree of precision. Including random effects in the model increased the probability of correct classification substantially. No sizable differences in probability of correct classification were found between models in which a single cow effect (ignoring relationships) was fitted and models where this effect was split into genetic and permanent environmental components, utilizing relationship information. When genetic and permanent environmental effects were fitted, the between-replicate variance of estimates of posterior means was smaller because the model accounted for random genetic drift.
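
    A much-reduced sketch of such a sampler is shown below: a two-component normal mixture with component-specific variances, fitted by Gibbs sampling with vague priors, but without the additive genetic and permanent environmental random effects of the full model. The priors, starting values, and names are assumptions for illustration only.

```python
import numpy as np

def norm_pdf(x, m, v):
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

def gibbs_two_component(y, n_iter=2000, rng=None):
    """Gibbs sampler for a heteroscedastic two-component normal mixture
    (simplified: no genetic or permanent environmental random effects)."""
    rng = np.random.default_rng(rng)
    y = np.asarray(y, dtype=float)
    n = len(y)
    mu = np.array([y.min(), y.max()])          # crude starting values
    var = np.array([y.var(), y.var()])
    pi = 0.5                                   # probability of the "diseased" component
    draws = []
    for _ in range(n_iter):
        # 1. allocate each record to a component given the current parameters
        dens = np.stack([(1 - pi) * norm_pdf(y, mu[0], var[0]),
                         pi * norm_pdf(y, mu[1], var[1])]) + 1e-300
        z = rng.uniform(size=n) < dens[1] / dens.sum(axis=0)
        # 2. update component means and variances (vague priors)
        for k, mask in enumerate([~z, z]):
            nk = max(int(mask.sum()), 1)
            ybar = y[mask].mean() if mask.any() else mu[k]
            mu[k] = rng.normal(ybar, np.sqrt(var[k] / nk))
            ss = ((y[mask] - mu[k]) ** 2).sum() if mask.any() else var[k]
            var[k] = 1.0 / rng.gamma(shape=nk / 2.0 + 1.0, scale=2.0 / (ss + 1e-6))
        # 3. update the mixing proportion (Beta(1, 1) prior)
        pi = rng.beta(1 + z.sum(), 1 + n - z.sum())
        draws.append((mu.copy(), var.copy(), pi))
    return draws
```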

  13. Inadequacy of Conventional Grab Sampling for Remediation Decision-Making for Metal Contamination at Small-Arms Ranges.

    PubMed

    Clausen, J L; Georgian, T; Gardner, K H; Douglas, T A

    2018-01-01

    Research shows grab sampling is inadequate for evaluating military ranges contaminated with energetics because of their highly heterogeneous distribution. Similar studies assessing the heterogeneous distribution of metals at small-arms ranges (SARs) are lacking. To address this, we evaluated whether grab sampling provides appropriate data for performing risk analysis at metal-contaminated SARs characterized with 30-48 grab samples. We evaluated the extractable metal content (Cu, Pb, Sb, and Zn) of the field data using a Monte Carlo random resampling with replacement (bootstrapping) simulation approach. Results indicate the 95% confidence interval of the mean for Pb (432 mg/kg) at one site was 200-700 mg/kg, with a data range of 5-4500 mg/kg. Considering the U.S. Environmental Protection Agency screening level for lead is 400 mg/kg, the necessity of cleanup at this site is unclear. Resampling based on populations of 7 and 15 samples, sample sizes more realistic for the area, yielded high false-negative rates.
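
    The bootstrap evaluation of a site mean can be sketched in a few lines: resample the grab-sample results with replacement, recompute the mean, and take percentile limits. The function name and the percentile-interval choice are illustrative.

```python
import numpy as np

def bootstrap_mean_ci(values, n_boot=10000, alpha=0.05, rng=None):
    """Bootstrap (resampling with replacement) confidence interval for the
    site mean concentration."""
    rng = np.random.default_rng(rng)
    x = np.asarray(values, dtype=float)
    means = np.array([rng.choice(x, size=len(x), replace=True).mean()
                      for _ in range(n_boot)])
    lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return x.mean(), (lo, hi)

# e.g., Pb grab-sample results in mg/kg; compare the interval to a 400 mg/kg screening level
```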

  14. Replica Exchange Improves Sampling in Low-Resolution Docking Stage of RosettaDock

    PubMed Central

    Zhang, Zhe; Lange, Oliver F.

    2013-01-01

    Many protein-protein docking protocols are based on a shotgun approach, in which thousands of independent random-start trajectories minimize the rigid-body degrees of freedom. Another strategy is enumerative sampling as used in ZDOCK. Here, we introduce an alternative strategy, ReplicaDock, using a small number of long trajectories of temperature replica exchange. We compare replica exchange sampling as low-resolution stage of RosettaDock with RosettaDock's original shotgun sampling as well as with ZDOCK. A benchmark of 30 complexes starting from structures of the unbound binding partners shows improved performance for ReplicaDock and ZDOCK when compared to shotgun sampling at equal or less computational expense. ReplicaDock and ZDOCK consistently reach lower energies and generate significantly more near-native conformations than shotgun sampling. Accordingly, they both improve typical metrics of prediction quality of complex structures after refinement. Additionally, the refined ReplicaDock ensembles reach significantly lower interface energies and many previously hidden features of the docking energy landscape become visible when ReplicaDock is applied. PMID:24009670
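
    A toy illustration of temperature replica exchange (parallel tempering) on a one-dimensional energy function is given below; it conveys the sampling strategy only and has none of RosettaDock's rigid-body degrees of freedom or scoring. The temperature ladder, step size, and test energy are made-up assumptions.

```python
import numpy as np

def replica_exchange(energy, x0, temperatures, n_steps=5000, step=0.5, rng=None):
    """Toy replica-exchange Monte Carlo on a 1-D energy function."""
    rng = np.random.default_rng(rng)
    T = np.asarray(temperatures, dtype=float)
    x = np.full(len(T), float(x0))
    E = np.array([energy(xi) for xi in x])
    best = (E.min(), x[E.argmin()])
    for _ in range(n_steps):
        # Metropolis move within each replica at its own temperature
        prop = x + rng.normal(0.0, step, size=len(T))
        Ep = np.array([energy(p) for p in prop])
        accept = rng.uniform(size=len(T)) < np.exp(np.minimum(0.0, -(Ep - E) / T))
        x[accept], E[accept] = prop[accept], Ep[accept]
        # attempt to swap a random pair of neighbouring replicas
        i = rng.integers(len(T) - 1)
        delta = (1.0 / T[i] - 1.0 / T[i + 1]) * (E[i] - E[i + 1])
        if np.log(rng.uniform()) < min(0.0, delta):
            x[[i, i + 1]], E[[i, i + 1]] = x[[i + 1, i]], E[[i + 1, i]]
        if E.min() < best[0]:
            best = (E.min(), x[E.argmin()])
    return best

# rugged test energy with several local minima
best_E, best_x = replica_exchange(lambda z: (z ** 2 - 4) ** 2 + 3 * np.sin(5 * z),
                                  3.0, temperatures=[0.2, 0.5, 1.0, 2.0, 5.0])
```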

  15. Optimal spatial sampling techniques for ground truth data in microwave remote sensing of soil moisture

    NASA Technical Reports Server (NTRS)

    Rao, R. G. S.; Ulaby, F. T.

    1977-01-01

    The paper examines optimal sampling techniques for obtaining accurate spatial averages of soil moisture, at various depths and for cell sizes in the range 2.5-40 acres, with a minimum number of samples. Both simple random sampling and stratified sampling procedures are used to reach a set of recommended sample sizes for each depth and for each cell size. Major conclusions from statistical sampling test results are that (1) the number of samples required decreases with increasing depth; (2) when the total number of samples cannot be prespecified or the moisture in only one single layer is of interest, then a simple random sample procedure should be used which is based on the observed mean and SD for data from a single field; (3) when the total number of samples can be prespecified and the objective is to measure the soil moisture profile with depth, then stratified random sampling based on optimal allocation should be used; and (4) decreasing the sensor resolution cell size leads to fairly large decreases in samples sizes with stratified sampling procedures, whereas only a moderate decrease is obtained in simple random sampling procedures.
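
    For the stratified design with optimal allocation, the textbook Neyman rule allocates samples to strata in proportion to stratum size times stratum standard deviation. A sketch, assuming stratum standard deviations are available from pilot data, follows; the rounding is approximate and the names are illustrative.

```python
import numpy as np

def neyman_allocation(strata_sizes, strata_sds, total_n):
    """Allocate a fixed total number of samples to strata in proportion to
    N_h * S_h (Neyman optimal allocation for stratified random sampling)."""
    N = np.asarray(strata_sizes, dtype=float)
    S = np.asarray(strata_sds, dtype=float)
    weights = N * S
    n_h = np.round(total_n * weights / weights.sum()).astype(int)
    return np.maximum(n_h, 1)        # at least one sample per stratum; sum may differ slightly from total_n

# e.g., three depth strata of equal extent with prior SDs from a pilot survey
print(neyman_allocation([100, 100, 100], [4.2, 2.8, 1.5], total_n=30))
```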

  16. The study of combining Latin Hypercube Sampling method and LU decomposition method (LULHS method) for constructing spatial random field

    NASA Astrophysics Data System (ADS)

    WANG, P. T.

    2015-12-01

    Groundwater modeling requires assigning hydrogeological properties to every numerical grid cell. Due to the lack of detailed information and the inherent spatial heterogeneity, geological properties can be treated as random variables. A hydrogeological property is assumed to follow a multivariate distribution with spatial correlations. By sampling random numbers from a given statistical distribution and assigning a value to each grid cell, a random field for modeling can be completed. Therefore, statistical sampling plays an important role in the efficiency of the modeling procedure. Latin Hypercube Sampling (LHS) is a stratified random sampling procedure that provides an efficient way to sample variables from their multivariate distributions. This study combines the stratified random procedure from LHS with simulation by LU decomposition to form LULHS. Both conditional and unconditional simulations of LULHS were developed. The simulation efficiency and spatial correlation of LULHS are compared to three other simulation methods. The results show that for both conditional and unconditional simulation, the LULHS method is more efficient in terms of computational effort. Fewer realizations are required to achieve the required statistical accuracy and spatial correlation.
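
    A rough sketch of the LULHS idea, stratified (Latin Hypercube) standard-normal deviates combined with a triangular factor of the spatial covariance, is shown below. It uses a Cholesky factor as the LU-type decomposition and an exponential covariance model; the grid, covariance model, and function names are assumptions, not the study's implementation, and this is the unconditional case only.

```python
import numpy as np
from scipy.stats import norm
from scipy.spatial.distance import cdist

def lhs_normal(n_vars, n_real, rng):
    """For each variable (grid cell), draw one standard-normal value from each
    of n_real equal-probability strata, in random order across realizations."""
    strata = rng.permuted(np.tile(np.arange(n_real), (n_vars, 1)), axis=1)
    u = (strata + rng.uniform(size=(n_vars, n_real))) / n_real
    return norm.ppf(u)

def lulhs_field(coords, corr_length, sigma, n_real=100, rng=None):
    """Unconditional Gaussian random fields from LHS deviates and a Cholesky
    (LU-type) factor of an exponential covariance model."""
    rng = np.random.default_rng(rng)
    d = cdist(coords, coords)
    cov = sigma ** 2 * np.exp(-d / corr_length) + 1e-10 * np.eye(len(coords))
    L = np.linalg.cholesky(cov)
    W = lhs_normal(len(coords), n_real, rng)     # independent stratified deviates
    return L @ W                                 # each column is one correlated field

# a 20 x 20 grid of property deviates (e.g., log-conductivity), 50 realizations
xy = np.array([(i, j) for i in range(20) for j in range(20)], dtype=float)
fields = lulhs_field(xy, corr_length=5.0, sigma=1.0, n_real=50, rng=0)
```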

  17. Approximating prediction uncertainty for random forest regression models

    Treesearch

    John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

    2016-01-01

    Machine learning approaches such as random forest are increasingly used for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

  18. Optimal sample sizes for the design of reliability studies: power consideration.

    PubMed

    Shieh, Gwowen

    2014-09-01

    Intraclass correlation coefficients are used extensively to measure the reliability or degree of resemblance among group members in multilevel research. This study concerns the problem of the necessary sample size to ensure adequate statistical power for hypothesis tests concerning the intraclass correlation coefficient in the one-way random-effects model. In view of the incomplete and problematic numerical results in the literature, the approximate sample size formula constructed from Fisher's transformation is reevaluated and compared with an exact approach across a wide range of model configurations. These comprehensive examinations showed that the Fisher transformation method is appropriate only under limited circumstances, and therefore it is not recommended as a general method in practice. For advance design planning of reliability studies, the exact sample size procedures are fully described and illustrated for various allocation and cost schemes. Corresponding computer programs are also developed to implement the suggested algorithms.

  19. Experimental scattershot boson sampling

    PubMed Central

    Bentivegna, Marco; Spagnolo, Nicolò; Vitelli, Chiara; Flamini, Fulvio; Viggianiello, Niko; Latmiral, Ludovico; Mataloni, Paolo; Brod, Daniel J.; Galvão, Ernesto F.; Crespi, Andrea; Ramponi, Roberta; Osellame, Roberto; Sciarrino, Fabio

    2015-01-01

    Boson sampling is a computational task strongly believed to be hard for classical computers, but efficiently solvable by orchestrated bosonic interference in a specialized quantum computer. Current experimental schemes, however, are still insufficient for a convincing demonstration of the advantage of quantum over classical computation. A new variation of this task, scattershot boson sampling, leads to an exponential increase in speed of the quantum device, using a larger number of photon sources based on parametric down-conversion. This is achieved by having multiple heralded single photons being sent, shot by shot, into different random input ports of the interferometer. We report the first scattershot boson sampling experiments, where six different photon-pair sources are coupled to integrated photonic circuits. We use recently proposed statistical tools to analyze our experimental data, providing strong evidence that our photonic quantum simulator works as expected. This approach represents an important leap toward a convincing experimental demonstration of the quantum computational supremacy. PMID:26601164

  20. Experimental scattershot boson sampling.

    PubMed

    Bentivegna, Marco; Spagnolo, Nicolò; Vitelli, Chiara; Flamini, Fulvio; Viggianiello, Niko; Latmiral, Ludovico; Mataloni, Paolo; Brod, Daniel J; Galvão, Ernesto F; Crespi, Andrea; Ramponi, Roberta; Osellame, Roberto; Sciarrino, Fabio

    2015-04-01

    Boson sampling is a computational task strongly believed to be hard for classical computers, but efficiently solvable by orchestrated bosonic interference in a specialized quantum computer. Current experimental schemes, however, are still insufficient for a convincing demonstration of the advantage of quantum over classical computation. A new variation of this task, scattershot boson sampling, leads to an exponential increase in speed of the quantum device, using a larger number of photon sources based on parametric down-conversion. This is achieved by having multiple heralded single photons being sent, shot by shot, into different random input ports of the interferometer. We report the first scattershot boson sampling experiments, where six different photon-pair sources are coupled to integrated photonic circuits. We use recently proposed statistical tools to analyze our experimental data, providing strong evidence that our photonic quantum simulator works as expected. This approach represents an important leap toward a convincing experimental demonstration of the quantum computational supremacy.
