Sample records for random sampling

  1. Methods and analysis of realizing randomized grouping.

    PubMed

    Hu, Liang-Ping; Bao, Xiao-Lei; Wang, Qi

    2011-07-01

    Randomization is one of the four basic principles of research design. The meaning of randomization includes two aspects: one is to randomly select samples from the population, which is known as random sampling; the other is to randomly group all the samples, which is called randomized grouping. Randomized grouping can be subdivided into three categories: completely, stratified and dynamically randomized grouping. This article mainly introduces the steps of complete randomization, the definition of dynamic randomization and the realization of random sampling and grouping by SAS software.

  2. [Krigle estimation and its simulated sampling of Chilo suppressalis population density].

    PubMed

    Yuan, Zheming; Bai, Lianyang; Wang, Kuiwu; Hu, Xiangyue

    2004-07-01

    In order to draw up a rational sampling plan for the larvae population of Chilo suppressalis, an original population and its two derivative populations, random population and sequence population, were sampled and compared with random sampling, gap-range-random sampling, and a new systematic sampling integrated Krigle interpolation and random original position. As for the original population whose distribution was up to aggregative and dependence range in line direction was 115 cm (6.9 units), gap-range-random sampling in line direction was more precise than random sampling. Distinguishing the population pattern correctly is the key to get a better precision. Gap-range-random sampling and random sampling are fit for aggregated population and random population, respectively, but both of them are difficult to apply in practice. Therefore, a new systematic sampling named as Krigle sample (n = 441) was developed to estimate the density of partial sample (partial estimation, n = 441) and population (overall estimation, N = 1500). As for original population, the estimated precision of Krigle sample to partial sample and population was better than that of investigation sample. With the increase of the aggregation intensity of population, Krigel sample was more effective than investigation sample in both partial estimation and overall estimation in the appropriate sampling gap according to the dependence range.

  3. Systematic versus random sampling in stereological studies.

    PubMed

    West, Mark J

    2012-12-01

    The sampling that takes place at all levels of an experimental design must be random if the estimate is to be unbiased in a statistical sense. There are two fundamental ways by which one can make a random sample of the sections and positions to be probed on the sections. Using a card-sampling analogy, one can pick any card at all out of a deck of cards. This is referred to as independent random sampling because the sampling of any one card is made without reference to the position of the other cards. The other approach to obtaining a random sample would be to pick a card within a set number of cards and others at equal intervals within the deck. Systematic sampling along one axis of many biological structures is more efficient than random sampling, because most biological structures are not randomly organized. This article discusses the merits of systematic versus random sampling in stereological studies.

  4. Improved Compressive Sensing of Natural Scenes Using Localized Random Sampling

    PubMed Central

    Barranca, Victor J.; Kovačič, Gregor; Zhou, Douglas; Cai, David

    2016-01-01

    Compressive sensing (CS) theory demonstrates that by using uniformly-random sampling, rather than uniformly-spaced sampling, higher quality image reconstructions are often achievable. Considering that the structure of sampling protocols has such a profound impact on the quality of image reconstructions, we formulate a new sampling scheme motivated by physiological receptive field structure, localized random sampling, which yields significantly improved CS image reconstructions. For each set of localized image measurements, our sampling method first randomly selects an image pixel and then measures its nearby pixels with probability depending on their distance from the initially selected pixel. We compare the uniformly-random and localized random sampling methods over a large space of sampling parameters, and show that, for the optimal parameter choices, higher quality image reconstructions can be consistently obtained by using localized random sampling. In addition, we argue that the localized random CS optimal parameter choice is stable with respect to diverse natural images, and scales with the number of samples used for reconstruction. We expect that the localized random sampling protocol helps to explain the evolutionarily advantageous nature of receptive field structure in visual systems and suggests several future research areas in CS theory and its application to brain imaging. PMID:27555464

  5. Nonuniform sampling theorems for random signals in the linear canonical transform domain

    NASA Astrophysics Data System (ADS)

    Shuiqing, Xu; Congmei, Jiang; Yi, Chai; Youqiang, Hu; Lei, Huang

    2018-06-01

    Nonuniform sampling can be encountered in various practical processes because of random events or poor timebase. The analysis and applications of the nonuniform sampling for deterministic signals related to the linear canonical transform (LCT) have been well considered and researched, but up to now no papers have been published regarding the various nonuniform sampling theorems for random signals related to the LCT. The aim of this article is to explore the nonuniform sampling and reconstruction of random signals associated with the LCT. First, some special nonuniform sampling models are briefly introduced. Second, based on these models, some reconstruction theorems for random signals from various nonuniform samples associated with the LCT have been derived. Finally, the simulation results are made to prove the accuracy of the sampling theorems. In addition, the latent real practices of the nonuniform sampling for random signals have been also discussed.

  6. Methodology Series Module 5: Sampling Strategies.

    PubMed

    Setia, Maninder Singh

    2016-01-01

    Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the ' Sampling Method'. There are essentially two types of sampling methods: 1) probability sampling - based on chance events (such as random numbers, flipping a coin etc.); and 2) non-probability sampling - based on researcher's choice, population that accessible & available. Some of the non-probability sampling methods are: purposive sampling, convenience sampling, or quota sampling. Random sampling method (such as simple random sample or stratified random sample) is a form of probability sampling. It is important to understand the different sampling methods used in clinical studies and mention this method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term ' random sample' when the researcher has used convenience sample). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the ' generalizability' of these results. In such a scenario, the researcher may want to use ' purposive sampling' for the study.

  7. A simple and efficient alternative to implementing systematic random sampling in stereological designs without a motorized microscope stage.

    PubMed

    Melvin, Neal R; Poda, Daniel; Sutherland, Robert J

    2007-10-01

    When properly applied, stereology is a very robust and efficient method to quantify a variety of parameters from biological material. A common sampling strategy in stereology is systematic random sampling, which involves choosing a random sampling [corrected] start point outside the structure of interest, and sampling relevant objects at [corrected] sites that are placed at pre-determined, equidistant intervals. This has proven to be a very efficient sampling strategy, and is used widely in stereological designs. At the microscopic level, this is most often achieved through the use of a motorized stage that facilitates the systematic random stepping across the structure of interest. Here, we report a simple, precise and cost-effective software-based alternative to accomplishing systematic random sampling under the microscope. We believe that this approach will facilitate the use of stereological designs that employ systematic random sampling in laboratories that lack the resources to acquire costly, fully automated systems.

  8. Efficient sampling of complex network with modified random walk strategies

    NASA Astrophysics Data System (ADS)

    Xie, Yunya; Chang, Shuhua; Zhang, Zhipeng; Zhang, Mi; Yang, Lei

    2018-02-01

    We present two novel random walk strategies, choosing seed node (CSN) random walk and no-retracing (NR) random walk. Different from the classical random walk sampling, the CSN and NR strategies focus on the influences of the seed node choice and path overlap, respectively. Three random walk samplings are applied in the Erdös-Rényi (ER), Barabási-Albert (BA), Watts-Strogatz (WS), and the weighted USAir networks, respectively. Then, the major properties of sampled subnets, such as sampling efficiency, degree distributions, average degree and average clustering coefficient, are studied. The similar conclusions can be reached with these three random walk strategies. Firstly, the networks with small scales and simple structures are conducive to the sampling. Secondly, the average degree and the average clustering coefficient of the sampled subnet tend to the corresponding values of original networks with limited steps. And thirdly, all the degree distributions of the subnets are slightly biased to the high degree side. However, the NR strategy performs better for the average clustering coefficient of the subnet. In the real weighted USAir networks, some obvious characters like the larger clustering coefficient and the fluctuation of degree distribution are reproduced well by these random walk strategies.

  9. Methodology Series Module 5: Sampling Strategies

    PubMed Central

    Setia, Maninder Singh

    2016-01-01

    Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the ‘ Sampling Method’. There are essentially two types of sampling methods: 1) probability sampling – based on chance events (such as random numbers, flipping a coin etc.); and 2) non-probability sampling – based on researcher's choice, population that accessible & available. Some of the non-probability sampling methods are: purposive sampling, convenience sampling, or quota sampling. Random sampling method (such as simple random sample or stratified random sample) is a form of probability sampling. It is important to understand the different sampling methods used in clinical studies and mention this method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term ‘ random sample’ when the researcher has used convenience sample). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the ‘ generalizability’ of these results. In such a scenario, the researcher may want to use ‘ purposive sampling’ for the study. PMID:27688438

  10. Sampling Strategies for Evaluating the Rate of Adventitious Transgene Presence in Non-Genetically Modified Crop Fields.

    PubMed

    Makowski, David; Bancal, Rémi; Bensadoun, Arnaud; Monod, Hervé; Messéan, Antoine

    2017-09-01

    According to E.U. regulations, the maximum allowable rate of adventitious transgene presence in non-genetically modified (GM) crops is 0.9%. We compared four sampling methods for the detection of transgenic material in agricultural non-GM maize fields: random sampling, stratified sampling, random sampling + ratio reweighting, random sampling + regression reweighting. Random sampling involves simply sampling maize grains from different locations selected at random from the field concerned. The stratified and reweighting sampling methods make use of an auxiliary variable corresponding to the output of a gene-flow model (a zero-inflated Poisson model) simulating cross-pollination as a function of wind speed, wind direction, and distance to the closest GM maize field. With the stratified sampling method, an auxiliary variable is used to define several strata with contrasting transgene presence rates, and grains are then sampled at random from each stratum. With the two methods involving reweighting, grains are first sampled at random from various locations within the field, and the observations are then reweighted according to the auxiliary variable. Data collected from three maize fields were used to compare the four sampling methods, and the results were used to determine the extent to which transgene presence rate estimation was improved by the use of stratified and reweighting sampling methods. We found that transgene rate estimates were more accurate and that substantially smaller samples could be used with sampling strategies based on an auxiliary variable derived from a gene-flow model. © 2017 Society for Risk Analysis.

  11. Sampling Large Graphs for Anticipatory Analytics

    DTIC Science & Technology

    2015-05-15

    low. C. Random Area Sampling Random area sampling [8] is a “ snowball ” sampling method in which a set of random seed vertices are selected and areas... Sampling Large Graphs for Anticipatory Analytics Lauren Edwards, Luke Johnson, Maja Milosavljevic, Vijay Gadepally, Benjamin A. Miller Lincoln...systems, greater human-in-the-loop involvement, or through complex algorithms. We are investigating the use of sampling to mitigate these challenges

  12. Electromagnetic Scattering by Fully Ordered and Quasi-Random Rigid Particulate Samples

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Dlugach, Janna M.; Mackowski, Daniel W.

    2016-01-01

    In this paper we have analyzed circumstances under which a rigid particulate sample can behave optically as a true discrete random medium consisting of particles randomly moving relative to each other during measurement. To this end, we applied the numerically exact superposition T-matrix method to model far-field scattering characteristics of fully ordered and quasi-randomly arranged rigid multiparticle groups in fixed and random orientations. We have shown that, in and of itself, averaging optical observables over movements of a rigid sample as a whole is insufficient unless it is combined with a quasi-random arrangement of the constituent particles in the sample. Otherwise, certain scattering effects typical of discrete random media (including some manifestations of coherent backscattering) may not be accurately replicated.

  13. Assessment of wadeable stream resources in the driftless area ecoregion in Western Wisconsin using a probabilistic sampling design.

    PubMed

    Miller, Michael A; Colby, Alison C C; Kanehl, Paul D; Blocksom, Karen

    2009-03-01

    The Wisconsin Department of Natural Resources (WDNR), with support from the U.S. EPA, conducted an assessment of wadeable streams in the Driftless Area ecoregion in western Wisconsin using a probabilistic sampling design. This ecoregion encompasses 20% of Wisconsin's land area and contains 8,800 miles of perennial streams. Randomly-selected stream sites (n = 60) equally distributed among stream orders 1-4 were sampled. Watershed land use, riparian and in-stream habitat, water chemistry, macroinvertebrate, and fish assemblage data were collected at each true random site and an associated "modified-random" site on each stream that was accessed via a road crossing nearest to the true random site. Targeted least-disturbed reference sites (n = 22) were also sampled to develop reference conditions for various physical, chemical, and biological measures. Cumulative distribution function plots of various measures collected at the true random sites evaluated with reference condition thresholds, indicate that high proportions of the random sites (and by inference the entire Driftless Area wadeable stream population) show some level of degradation. Study results show no statistically significant differences between the true random and modified-random sample sites for any of the nine physical habitat, 11 water chemistry, seven macroinvertebrate, or eight fish metrics analyzed. In Wisconsin's Driftless Area, 79% of wadeable stream lengths were accessible via road crossings. While further evaluation of the statistical rigor of using a modified-random sampling design is warranted, sampling randomly-selected stream sites accessed via the nearest road crossing may provide a more economical way to apply probabilistic sampling in stream monitoring programs.

  14. A random spatial sampling method in a rural developing nation

    Treesearch

    Michelle C. Kondo; Kent D.W. Bream; Frances K. Barg; Charles C. Branas

    2014-01-01

    Nonrandom sampling of populations in developing nations has limitations and can inaccurately estimate health phenomena, especially among hard-to-reach populations such as rural residents. However, random sampling of rural populations in developing nations can be challenged by incomplete enumeration of the base population. We describe a stratified random sampling method...

  15. Reduction of display artifacts by random sampling

    NASA Technical Reports Server (NTRS)

    Ahumada, A. J., Jr.; Nagel, D. C.; Watson, A. B.; Yellott, J. I., Jr.

    1983-01-01

    The application of random-sampling techniques to remove visible artifacts (such as flicker, moire patterns, and paradoxical motion) introduced in TV-type displays by discrete sequential scanning is discussed and demonstrated. Sequential-scanning artifacts are described; the window of visibility defined in spatiotemporal frequency space by Watson and Ahumada (1982 and 1983) and Watson et al. (1983) is explained; the basic principles of random sampling are reviewed and illustrated by the case of the human retina; and it is proposed that the sampling artifacts can be replaced by random noise, which can then be shifted to frequency-space regions outside the window of visibility. Vertical sequential, single-random-sequence, and continuously renewed random-sequence plotting displays generating 128 points at update rates up to 130 Hz are applied to images of stationary and moving lines, and best results are obtained with the single random sequence for the stationary lines and with the renewed random sequence for the moving lines.

  16. Revisiting sample size: are big trials the answer?

    PubMed

    Lurati Buse, Giovanna A L; Botto, Fernando; Devereaux, P J

    2012-07-18

    The superiority of the evidence generated in randomized controlled trials over observational data is not only conditional to randomization. Randomized controlled trials require proper design and implementation to provide a reliable effect estimate. Adequate random sequence generation, allocation implementation, analyses based on the intention-to-treat principle, and sufficient power are crucial to the quality of a randomized controlled trial. Power, or the probability of the trial to detect a difference when a real difference between treatments exists, strongly depends on sample size. The quality of orthopaedic randomized controlled trials is frequently threatened by a limited sample size. This paper reviews basic concepts and pitfalls in sample-size estimation and focuses on the importance of large trials in the generation of valid evidence.

  17. [Comparison study on sampling methods of Oncomelania hupensis snail survey in marshland schistosomiasis epidemic areas in China].

    PubMed

    An, Zhao; Wen-Xin, Zhang; Zhong, Yao; Yu-Kuan, Ma; Qing, Liu; Hou-Lang, Duan; Yi-di, Shang

    2016-06-29

    To optimize and simplify the survey method of Oncomelania hupensis snail in marshland endemic region of schistosomiasis and increase the precision, efficiency and economy of the snail survey. A quadrate experimental field was selected as the subject of 50 m×50 m size in Chayegang marshland near Henghu farm in the Poyang Lake region and a whole-covered method was adopted to survey the snails. The simple random sampling, systematic sampling and stratified random sampling methods were applied to calculate the minimum sample size, relative sampling error and absolute sampling error. The minimum sample sizes of the simple random sampling, systematic sampling and stratified random sampling methods were 300, 300 and 225, respectively. The relative sampling errors of three methods were all less than 15%. The absolute sampling errors were 0.221 7, 0.302 4 and 0.047 8, respectively. The spatial stratified sampling with altitude as the stratum variable is an efficient approach of lower cost and higher precision for the snail survey.

  18. Smoothing the redshift distributions of random samples for the baryon acoustic oscillations: applications to the SDSS-III BOSS DR12 and QPM mock samples

    NASA Astrophysics Data System (ADS)

    Wang, Shao-Jiang; Guo, Qi; Cai, Rong-Gen

    2017-12-01

    We investigate the impact of different redshift distributions of random samples on the baryon acoustic oscillations (BAO) measurements of D_V(z)r_d^fid/r_d from the two-point correlation functions of galaxies in the Data Release 12 of the Baryon Oscillation Spectroscopic Survey (BOSS). Big surveys, such as BOSS, usually assign redshifts to the random samples by randomly drawing values from the measured redshift distributions of the data, which would necessarily introduce fiducial signals of fluctuations into the random samples, weakening the signals of BAO, if the cosmic variance cannot be ignored. We propose a smooth function of redshift distribution that fits the data well to populate the random galaxy samples. The resulting cosmological parameters match the input parameters of the mock catalogue very well. The significance of BAO signals has been improved by 0.33σ for a low-redshift sample and by 0.03σ for a constant-stellar-mass sample, though the absolute values do not change significantly. Given the precision of the measurements of current cosmological parameters, it would be appreciated for the future improvements on the measurements of galaxy clustering.

  19. Determination of the influence of dispersion pattern of pesticide-resistant individuals on the reliability of resistance estimates using different sampling plans.

    PubMed

    Shah, R; Worner, S P; Chapman, R B

    2012-10-01

    Pesticide resistance monitoring includes resistance detection and subsequent documentation/ measurement. Resistance detection would require at least one (≥1) resistant individual(s) to be present in a sample to initiate management strategies. Resistance documentation, on the other hand, would attempt to get an estimate of the entire population (≥90%) of the resistant individuals. A computer simulation model was used to compare the efficiency of simple random and systematic sampling plans to detect resistant individuals and to document their frequencies when the resistant individuals were randomly or patchily distributed. A patchy dispersion pattern of resistant individuals influenced the sampling efficiency of systematic sampling plans while the efficiency of random sampling was independent of such patchiness. When resistant individuals were randomly distributed, sample sizes required to detect at least one resistant individual (resistance detection) with a probability of 0.95 were 300 (1%) and 50 (10% and 20%); whereas, when resistant individuals were patchily distributed, using systematic sampling, sample sizes required for such detection were 6000 (1%), 600 (10%) and 300 (20%). Sample sizes of 900 and 400 would be required to detect ≥90% of resistant individuals (resistance documentation) with a probability of 0.95 when resistant individuals were randomly dispersed and present at a frequency of 10% and 20%, respectively; whereas, when resistant individuals were patchily distributed, using systematic sampling, a sample size of 3000 and 1500, respectively, was necessary. Small sample sizes either underestimated or overestimated the resistance frequency. A simple random sampling plan is, therefore, recommended for insecticide resistance detection and subsequent documentation.

  20. Investigating the Randomness of Numbers

    ERIC Educational Resources Information Center

    Pendleton, Kenn L.

    2009-01-01

    The use of random numbers is pervasive in today's world. Random numbers have practical applications in such far-flung arenas as computer simulations, cryptography, gambling, the legal system, statistical sampling, and even the war on terrorism. Evaluating the randomness of extremely large samples is a complex, intricate process. However, the…

  1. RECAL: A Computer Program for Selecting Sample Days for Recreation Use Estimation

    Treesearch

    D.L. Erickson; C.J. Liu; H. Ken Cordell; W.L. Chen

    1980-01-01

    Recreation Calendar (RECAL) is a computer program in PL/I for drawing a sample of days for estimating recreation use. With RECAL, a sampling period of any length may be chosen; simple random, stratified random, and factorial designs can be accommodated. The program randomly allocates days to strata and locations.

  2. Sample Selection in Randomized Experiments: A New Method Using Propensity Score Stratified Sampling

    ERIC Educational Resources Information Center

    Tipton, Elizabeth; Hedges, Larry; Vaden-Kiernan, Michael; Borman, Geoffrey; Sullivan, Kate; Caverly, Sarah

    2014-01-01

    Randomized experiments are often seen as the "gold standard" for causal research. Despite the fact that experiments use random assignment to treatment conditions, units are seldom selected into the experiment using probability sampling. Very little research on experimental design has focused on how to make generalizations to well-defined…

  3. Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST)

    PubMed Central

    Xu, Chonggang; Gertner, George

    2013-01-01

    Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, the FAST analysis is mainly confined to the estimation of partial variances contributed by the main effects of model parameters, but does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements. PMID:24143037

  4. Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST).

    PubMed

    Xu, Chonggang; Gertner, George

    2011-01-01

    Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, the FAST analysis is mainly confined to the estimation of partial variances contributed by the main effects of model parameters, but does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements.

  5. Spline methods for approximating quantile functions and generating random samples

    NASA Technical Reports Server (NTRS)

    Schiess, J. R.; Matthews, C. G.

    1985-01-01

    Two cubic spline formulations are presented for representing the quantile function (inverse cumulative distribution function) of a random sample of data. Both B-spline and rational spline approximations are compared with analytic representations of the quantile function. It is also shown how these representations can be used to generate random samples for use in simulation studies. Comparisons are made on samples generated from known distributions and a sample of experimental data. The spline representations are more accurate for multimodal and skewed samples and to require much less time to generate samples than the analytic representation.

  6. Methods for sample size determination in cluster randomized trials

    PubMed Central

    Rutterford, Clare; Copas, Andrew; Eldridge, Sandra

    2015-01-01

    Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515

  7. Influences of sampling size and pattern on the uncertainty of correlation estimation between soil water content and its influencing factors

    NASA Astrophysics Data System (ADS)

    Lai, Xiaoming; Zhu, Qing; Zhou, Zhiwen; Liao, Kaihua

    2017-12-01

    In this study, seven random combination sampling strategies were applied to investigate the uncertainties in estimating the hillslope mean soil water content (SWC) and correlation coefficients between the SWC and soil/terrain properties on a tea + bamboo hillslope. One of the sampling strategies is the global random sampling and the other six are the stratified random sampling on the top, middle, toe, top + mid, top + toe and mid + toe slope positions. When each sampling strategy was applied, sample sizes were gradually reduced and each sampling size contained 3000 replicates. Under each sampling size of each sampling strategy, the relative errors (REs) and coefficients of variation (CVs) of the estimated hillslope mean SWC and correlation coefficients between the SWC and soil/terrain properties were calculated to quantify the accuracy and uncertainty. The results showed that the uncertainty of the estimations decreased as the sampling size increasing. However, larger sample sizes were required to reduce the uncertainty in correlation coefficient estimation than in hillslope mean SWC estimation. Under global random sampling, 12 randomly sampled sites on this hillslope were adequate to estimate the hillslope mean SWC with RE and CV ≤10%. However, at least 72 randomly sampled sites were needed to ensure the estimated correlation coefficients with REs and CVs ≤10%. Comparing with all sampling strategies, reducing sampling sites on the middle slope had the least influence on the estimation of hillslope mean SWC and correlation coefficients. Under this strategy, 60 sites (10 on the middle slope and 50 on the top and toe slopes) were enough to ensure the estimated correlation coefficients with REs and CVs ≤10%. This suggested that when designing the SWC sampling, the proportion of sites on the middle slope can be reduced to 16.7% of the total number of sites. Findings of this study will be useful for the optimal SWC sampling design.

  8. Sequential time interleaved random equivalent sampling for repetitive signal.

    PubMed

    Zhao, Yijiu; Liu, Jingjing

    2016-12-01

    Compressed sensing (CS) based sampling techniques exhibit many advantages over other existing approaches for sparse signal spectrum sensing; they are also incorporated into non-uniform sampling signal reconstruction to improve the efficiency, such as random equivalent sampling (RES). However, in CS based RES, only one sample of each acquisition is considered in the signal reconstruction stage, and it will result in more acquisition runs and longer sampling time. In this paper, a sampling sequence is taken in each RES acquisition run, and the corresponding block measurement matrix is constructed using a Whittaker-Shannon interpolation formula. All the block matrices are combined into an equivalent measurement matrix with respect to all sampling sequences. We implemented the proposed approach with a multi-cores analog-to-digital converter (ADC), whose ADC cores are time interleaved. A prototype realization of this proposed CS based sequential random equivalent sampling method has been developed. It is able to capture an analog waveform at an equivalent sampling rate of 40 GHz while sampled at 1 GHz physically. Experiments indicate that, for a sparse signal, the proposed CS based sequential random equivalent sampling exhibits high efficiency.

  9. Sampling large random knots in a confined space

    NASA Astrophysics Data System (ADS)

    Arsuaga, J.; Blackstone, T.; Diao, Y.; Hinson, K.; Karadayi, E.; Saito, M.

    2007-09-01

    DNA knots formed under extreme conditions of condensation, as in bacteriophage P4, are difficult to analyze experimentally and theoretically. In this paper, we propose to use the uniform random polygon model as a supplementary method to the existing methods for generating random knots in confinement. The uniform random polygon model allows us to sample knots with large crossing numbers and also to generate large diagrammatically prime knot diagrams. We show numerically that uniform random polygons sample knots with large minimum crossing numbers and certain complicated knot invariants (as those observed experimentally). We do this in terms of the knot determinants or colorings. Our numerical results suggest that the average determinant of a uniform random polygon of n vertices grows faster than O(e^{n^2}) . We also investigate the complexity of prime knot diagrams. We show rigorously that the probability that a randomly selected 2D uniform random polygon of n vertices is almost diagrammatically prime goes to 1 as n goes to infinity. Furthermore, the average number of crossings in such a diagram is at the order of O(n2). Therefore, the two-dimensional uniform random polygons offer an effective way in sampling large (prime) knots, which can be useful in various applications.

  10. Random bit generation at tunable rates using a chaotic semiconductor laser under distributed feedback.

    PubMed

    Li, Xiao-Zhou; Li, Song-Sui; Zhuang, Jun-Ping; Chan, Sze-Chun

    2015-09-01

    A semiconductor laser with distributed feedback from a fiber Bragg grating (FBG) is investigated for random bit generation (RBG). The feedback perturbs the laser to emit chaotically with the intensity being sampled periodically. The samples are then converted into random bits by a simple postprocessing of self-differencing and selecting bits. Unlike a conventional mirror that provides localized feedback, the FBG provides distributed feedback which effectively suppresses the information of the round-trip feedback delay time. Randomness is ensured even when the sampling period is commensurate with the feedback delay between the laser and the grating. Consequently, in RBG, the FBG feedback enables continuous tuning of the output bit rate, reduces the minimum sampling period, and increases the number of bits selected per sample. RBG is experimentally investigated at a sampling period continuously tunable from over 16 ns down to 50 ps, while the feedback delay is fixed at 7.7 ns. By selecting 5 least-significant bits per sample, output bit rates from 0.3 to 100 Gbps are achieved with randomness examined by the National Institute of Standards and Technology test suite.

  11. Optimal spatial sampling techniques for ground truth data in microwave remote sensing of soil moisture

    NASA Technical Reports Server (NTRS)

    Rao, R. G. S.; Ulaby, F. T.

    1977-01-01

    The paper examines optimal sampling techniques for obtaining accurate spatial averages of soil moisture, at various depths and for cell sizes in the range 2.5-40 acres, with a minimum number of samples. Both simple random sampling and stratified sampling procedures are used to reach a set of recommended sample sizes for each depth and for each cell size. Major conclusions from statistical sampling test results are that (1) the number of samples required decreases with increasing depth; (2) when the total number of samples cannot be prespecified or the moisture in only one single layer is of interest, then a simple random sample procedure should be used which is based on the observed mean and SD for data from a single field; (3) when the total number of samples can be prespecified and the objective is to measure the soil moisture profile with depth, then stratified random sampling based on optimal allocation should be used; and (4) decreasing the sensor resolution cell size leads to fairly large decreases in samples sizes with stratified sampling procedures, whereas only a moderate decrease is obtained in simple random sampling procedures.

  12. Chi-Squared Test of Fit and Sample Size-A Comparison between a Random Sample Approach and a Chi-Square Value Adjustment Method.

    PubMed

    Bergh, Daniel

    2015-01-01

    Chi-square statistics are commonly used for tests of fit of measurement models. Chi-square is also sensitive to sample size, which is why several approaches to handle large samples in test of fit analysis have been developed. One strategy to handle the sample size problem may be to adjust the sample size in the analysis of fit. An alternative is to adopt a random sample approach. The purpose of this study was to analyze and to compare these two strategies using simulated data. Given an original sample size of 21,000, for reductions of sample sizes down to the order of 5,000 the adjusted sample size function works as good as the random sample approach. In contrast, when applying adjustments to sample sizes of lower order the adjustment function is less effective at approximating the chi-square value for an actual random sample of the relevant size. Hence, the fit is exaggerated and misfit under-estimated using the adjusted sample size function. Although there are big differences in chi-square values between the two approaches at lower sample sizes, the inferences based on the p-values may be the same.

  13. Health plan auditing: 100-percent-of-claims vs. random-sample audits.

    PubMed

    Sillup, George P; Klimberg, Ronald K

    2011-01-01

    The objective of this study was to examine the relative efficacy of two different methodologies for auditing self-funded medical claim expenses: 100-percent-of-claims auditing versus random-sampling auditing. Multiple data sets of claim errors or 'exceptions' from two Fortune-100 corporations were analysed and compared to 100 simulated audits of 300- and 400-claim random samples. Random-sample simulations failed to identify a significant number and amount of the errors that ranged from $200,000 to $750,000. These results suggest that health plan expenses of corporations could be significantly reduced if they audited 100% of claims and embraced a zero-defect approach.

  14. The study of combining Latin Hypercube Sampling method and LU decomposition method (LULHS method) for constructing spatial random field

    NASA Astrophysics Data System (ADS)

    WANG, P. T.

    2015-12-01

    Groundwater modeling requires to assign hydrogeological properties to every numerical grid. Due to the lack of detailed information and the inherent spatial heterogeneity, geological properties can be treated as random variables. Hydrogeological property is assumed to be a multivariate distribution with spatial correlations. By sampling random numbers from a given statistical distribution and assigning a value to each grid, a random field for modeling can be completed. Therefore, statistics sampling plays an important role in the efficiency of modeling procedure. Latin Hypercube Sampling (LHS) is a stratified random sampling procedure that provides an efficient way to sample variables from their multivariate distributions. This study combines the the stratified random procedure from LHS and the simulation by using LU decomposition to form LULHS. Both conditional and unconditional simulations of LULHS were develpoed. The simulation efficiency and spatial correlation of LULHS are compared to the other three different simulation methods. The results show that for the conditional simulation and unconditional simulation, LULHS method is more efficient in terms of computational effort. Less realizations are required to achieve the required statistical accuracy and spatial correlation.

  15. 40 CFR 761.130 - Sampling requirements.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... sampling scheme and the guidance document are available on EPA's PCB Web site at http://www.epa.gov/pcb, or... § 761.125(c) (2) through (4). Using its best engineering judgment, EPA may sample a statistically valid random or grid sampling technique, or both. When using engineering judgment or random “grab” samples, EPA...

  16. 40 CFR 761.130 - Sampling requirements.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... sampling scheme and the guidance document are available on EPA's PCB Web site at http://www.epa.gov/pcb, or... § 761.125(c) (2) through (4). Using its best engineering judgment, EPA may sample a statistically valid random or grid sampling technique, or both. When using engineering judgment or random “grab” samples, EPA...

  17. Misrepresenting random sampling? A systematic review of research papers in the Journal of Advanced Nursing.

    PubMed

    Williamson, Graham R

    2003-11-01

    This paper discusses the theoretical limitations of the use of random sampling and probability theory in the production of a significance level (or P-value) in nursing research. Potential alternatives, in the form of randomization tests, are proposed. Research papers in nursing, medicine and psychology frequently misrepresent their statistical findings, as the P-values reported assume random sampling. In this systematic review of studies published between January 1995 and June 2002 in the Journal of Advanced Nursing, 89 (68%) studies broke this assumption because they used convenience samples or entire populations. As a result, some of the findings may be questionable. The key ideas of random sampling and probability theory for statistical testing (for generating a P-value) are outlined. The result of a systematic review of research papers published in the Journal of Advanced Nursing is then presented, showing how frequently random sampling appears to have been misrepresented. Useful alternative techniques that might overcome these limitations are then discussed. REVIEW LIMITATIONS: This review is limited in scope because it is applied to one journal, and so the findings cannot be generalized to other nursing journals or to nursing research in general. However, it is possible that other nursing journals are also publishing research articles based on the misrepresentation of random sampling. The review is also limited because in several of the articles the sampling method was not completely clearly stated, and in this circumstance a judgment has been made as to the sampling method employed, based on the indications given by author(s). Quantitative researchers in nursing should be very careful that the statistical techniques they use are appropriate for the design and sampling methods of their studies. If the techniques they employ are not appropriate, they run the risk of misinterpreting findings by using inappropriate, unrepresentative and biased samples.

  18. The Expected Sample Variance of Uncorrelated Random Variables with a Common Mean and Some Applications in Unbalanced Random Effects Models

    ERIC Educational Resources Information Center

    Vardeman, Stephen B.; Wendelberger, Joanne R.

    2005-01-01

    There is a little-known but very simple generalization of the standard result that for uncorrelated random variables with common mean [mu] and variance [sigma][superscript 2], the expected value of the sample variance is [sigma][superscript 2]. The generalization justifies the use of the usual standard error of the sample mean in possibly…

  19. Assessment of the effect of population and diary sampling methods on estimation of school-age children exposure to fine particles.

    PubMed

    Che, W W; Frey, H Christopher; Lau, Alexis K H

    2014-12-01

    Population and diary sampling methods are employed in exposure models to sample simulated individuals and their daily activity on each simulation day. Different sampling methods may lead to variations in estimated human exposure. In this study, two population sampling methods (stratified-random and random-random) and three diary sampling methods (random resampling, diversity and autocorrelation, and Markov-chain cluster [MCC]) are evaluated. Their impacts on estimated children's exposure to ambient fine particulate matter (PM2.5 ) are quantified via case studies for children in Wake County, NC for July 2002. The estimated mean daily average exposure is 12.9 μg/m(3) for simulated children using the stratified population sampling method, and 12.2 μg/m(3) using the random sampling method. These minor differences are caused by the random sampling among ages within census tracts. Among the three diary sampling methods, there are differences in the estimated number of individuals with multiple days of exposures exceeding a benchmark of concern of 25 μg/m(3) due to differences in how multiday longitudinal diaries are estimated. The MCC method is relatively more conservative. In case studies evaluated here, the MCC method led to 10% higher estimation of the number of individuals with repeated exposures exceeding the benchmark. The comparisons help to identify and contrast the capabilities of each method and to offer insight regarding implications of method choice. Exposure simulation results are robust to the two population sampling methods evaluated, and are sensitive to the choice of method for simulating longitudinal diaries, particularly when analyzing results for specific microenvironments or for exposures exceeding a benchmark of concern. © 2014 Society for Risk Analysis.

  20. A random sampling approach for robust estimation of tissue-to-plasma ratio from extremely sparse data.

    PubMed

    Chu, Hui-May; Ette, Ene I

    2005-09-02

    his study was performed to develop a new nonparametric approach for the estimation of robust tissue-to-plasma ratio from extremely sparsely sampled paired data (ie, one sample each from plasma and tissue per subject). Tissue-to-plasma ratio was estimated from paired/unpaired experimental data using independent time points approach, area under the curve (AUC) values calculated with the naïve data averaging approach, and AUC values calculated using sampling based approaches (eg, the pseudoprofile-based bootstrap [PpbB] approach and the random sampling approach [our proposed approach]). The random sampling approach involves the use of a 2-phase algorithm. The convergence of the sampling/resampling approaches was investigated, as well as the robustness of the estimates produced by different approaches. To evaluate the latter, new data sets were generated by introducing outlier(s) into the real data set. One to 2 concentration values were inflated by 10% to 40% from their original values to produce the outliers. Tissue-to-plasma ratios computed using the independent time points approach varied between 0 and 50 across time points. The ratio obtained from AUC values acquired using the naive data averaging approach was not associated with any measure of uncertainty or variability. Calculating the ratio without regard to pairing yielded poorer estimates. The random sampling and pseudoprofile-based bootstrap approaches yielded tissue-to-plasma ratios with uncertainty and variability. However, the random sampling approach, because of the 2-phase nature of its algorithm, yielded more robust estimates and required fewer replications. Therefore, a 2-phase random sampling approach is proposed for the robust estimation of tissue-to-plasma ratio from extremely sparsely sampled data.

  1. Conic Sampling: An Efficient Method for Solving Linear and Quadratic Programming by Randomly Linking Constraints within the Interior

    PubMed Central

    Serang, Oliver

    2012-01-01

    Linear programming (LP) problems are commonly used in analysis and resource allocation, frequently surfacing as approximations to more difficult problems. Existing approaches to LP have been dominated by a small group of methods, and randomized algorithms have not enjoyed popularity in practice. This paper introduces a novel randomized method of solving LP problems by moving along the facets and within the interior of the polytope along rays randomly sampled from the polyhedral cones defined by the bounding constraints. This conic sampling method is then applied to randomly sampled LPs, and its runtime performance is shown to compare favorably to the simplex and primal affine-scaling algorithms, especially on polytopes with certain characteristics. The conic sampling method is then adapted and applied to solve a certain quadratic program, which compute a projection onto a polytope; the proposed method is shown to outperform the proprietary software Mathematica on large, sparse QP problems constructed from mass spectometry-based proteomics. PMID:22952741

  2. Point-Sampling and Line-Sampling Probability Theory, Geometric Implications, Synthesis

    Treesearch

    L.R. Grosenbaugh

    1958-01-01

    Foresters concerned with measuring tree populations on definite areas have long employed two well-known methods of representative sampling. In list or enumerative sampling the entire tree population is tallied with a known proportion being randomly selected and measured for volume or other variables. In area sampling all trees on randomly located plots or strips...

  3. Randomized branch sampling

    Treesearch

    Harry T. Valentine

    2002-01-01

    Randomized branch sampling (RBS) is a special application of multistage probability sampling (see Sampling, environmental), which was developed originally by Jessen [3] to estimate fruit counts on individual orchard trees. In general, the method can be used to obtain estimates of many different attributes of trees or other branched plants. The usual objective of RBS is...

  4. Deterministic multidimensional nonuniform gap sampling.

    PubMed

    Worley, Bradley; Powers, Robert

    2015-12-01

    Born from empirical observations in nonuniformly sampled multidimensional NMR data relating to gaps between sampled points, the Poisson-gap sampling method has enjoyed widespread use in biomolecular NMR. While the majority of nonuniform sampling schemes are fully randomly drawn from probability densities that vary over a Nyquist grid, the Poisson-gap scheme employs constrained random deviates to minimize the gaps between sampled grid points. We describe a deterministic gap sampling method, based on the average behavior of Poisson-gap sampling, which performs comparably to its random counterpart with the additional benefit of completely deterministic behavior. We also introduce a general algorithm for multidimensional nonuniform sampling based on a gap equation, and apply it to yield a deterministic sampling scheme that combines burst-mode sampling features with those of Poisson-gap schemes. Finally, we derive a relationship between stochastic gap equations and the expectation value of their sampling probability densities. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Observational studies of patients in the emergency department: a comparison of 4 sampling methods.

    PubMed

    Valley, Morgan A; Heard, Kennon J; Ginde, Adit A; Lezotte, Dennis C; Lowenstein, Steven R

    2012-08-01

    We evaluate the ability of 4 sampling methods to generate representative samples of the emergency department (ED) population. We analyzed the electronic records of 21,662 consecutive patient visits at an urban, academic ED. From this population, we simulated different models of study recruitment in the ED by using 2 sample sizes (n=200 and n=400) and 4 sampling methods: true random, random 4-hour time blocks by exact sample size, random 4-hour time blocks by a predetermined number of blocks, and convenience or "business hours." For each method and sample size, we obtained 1,000 samples from the population. Using χ(2) tests, we measured the number of statistically significant differences between the sample and the population for 8 variables (age, sex, race/ethnicity, language, triage acuity, arrival mode, disposition, and payer source). Then, for each variable, method, and sample size, we compared the proportion of the 1,000 samples that differed from the overall ED population to the expected proportion (5%). Only the true random samples represented the population with respect to sex, race/ethnicity, triage acuity, mode of arrival, language, and payer source in at least 95% of the samples. Patient samples obtained using random 4-hour time blocks and business hours sampling systematically differed from the overall ED patient population for several important demographic and clinical variables. However, the magnitude of these differences was not large. Common sampling strategies selected for ED-based studies may affect parameter estimates for several representative population variables. However, the potential for bias for these variables appears small. Copyright © 2012. Published by Mosby, Inc.

  6. A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.

    PubMed

    Yu, Qingzhao; Zhu, Lin; Zhu, Han

    2017-11-01

    Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potentials to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently attribute newly recruited patients to different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence on design by changing the prior distributions. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when total sample size is fixed, the proposed design can obtain greater power and/or cost smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce required sample size. Copyright © 2017 John Wiley & Sons, Ltd.

  7. Breaking through the bandwidth barrier in distributed fiber vibration sensing by sub-Nyquist randomized sampling

    NASA Astrophysics Data System (ADS)

    Zhang, Jingdong; Zhu, Tao; Zheng, Hua; Kuang, Yang; Liu, Min; Huang, Wei

    2017-04-01

    The round trip time of the light pulse limits the maximum detectable frequency response range of vibration in phase-sensitive optical time domain reflectometry (φ-OTDR). We propose a method to break the frequency response range restriction of φ-OTDR system by modulating the light pulse interval randomly which enables a random sampling for every vibration point in a long sensing fiber. This sub-Nyquist randomized sampling method is suits for detecting sparse-wideband- frequency vibration signals. Up to MHz resonance vibration signal with over dozens of frequency components and 1.153MHz single frequency vibration signal are clearly identified for a sensing range of 9.6km with 10kHz maximum sampling rate.

  8. 78 FR 57033 - United States Standards for Condition of Food Containers

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-17

    ... containers during production. Stationary lot sampling is the process of randomly selecting sample units from.... * * * * * Stationary lot sampling. The process of randomly selecting sample units from a lot whose production has been... less than \\1/16\\-inch Stringy seal (excessive plastic threads showing at edge of seal 222 area...

  9. Health indicators: eliminating bias from convenience sampling estimators.

    PubMed

    Hedt, Bethany L; Pagano, Marcello

    2011-02-28

    Public health practitioners are often called upon to make inference about a health indicator for a population at large when the sole available information are data gathered from a convenience sample, such as data gathered on visitors to a clinic. These data may be of the highest quality and quite extensive, but the biases inherent in a convenience sample preclude the legitimate use of powerful inferential tools that are usually associated with a random sample. In general, we know nothing about those who do not visit the clinic beyond the fact that they do not visit the clinic. An alternative is to take a random sample of the population. However, we show that this solution would be wasteful if it excluded the use of available information. Hence, we present a simple annealing methodology that combines a relatively small, and presumably far less expensive, random sample with the convenience sample. This allows us to not only take advantage of powerful inferential tools, but also provides more accurate information than that available from just using data from the random sample alone. Copyright © 2011 John Wiley & Sons, Ltd.

  10. A Systematic Review of Surgical Randomized Controlled Trials: Part 2. Funding Source, Conflict of Interest, and Sample Size in Plastic Surgery.

    PubMed

    Voineskos, Sophocles H; Coroneos, Christopher J; Ziolkowski, Natalia I; Kaur, Manraj N; Banfield, Laura; Meade, Maureen O; Chung, Kevin C; Thoma, Achilleas; Bhandari, Mohit

    2016-02-01

    The authors examined industry support, conflict of interest, and sample size in plastic surgery randomized controlled trials that compared surgical interventions. They hypothesized that industry-funded trials demonstrate statistically significant outcomes more often, and randomized controlled trials with small sample sizes report statistically significant results more frequently. An electronic search identified randomized controlled trials published between 2000 and 2013. Independent reviewers assessed manuscripts and performed data extraction. Funding source, conflict of interest, primary outcome direction, and sample size were examined. Chi-squared and independent-samples t tests were used in the analysis. The search identified 173 randomized controlled trials, of which 100 (58 percent) did not acknowledge funding status. A relationship between funding source and trial outcome direction was not observed. Both funding status and conflict of interest reporting improved over time. Only 24 percent (six of 25) of industry-funded randomized controlled trials reported authors to have independent control of data and manuscript contents. The mean number of patients randomized was 73 per trial (median, 43, minimum, 3, maximum, 936). Small trials were not found to be positive more often than large trials (p = 0.87). Randomized controlled trials with small sample size were common; however, this provides great opportunity for the field to engage in further collaboration and produce larger, more definitive trials. Reporting of trial funding and conflict of interest is historically poor, but it greatly improved over the study period. Underreporting at author and journal levels remains a limitation when assessing the relationship between funding source and trial outcomes. Improved reporting and manuscript control should be goals that both authors and journals can actively achieve.

  11. Random phase detection in multidimensional NMR.

    PubMed

    Maciejewski, Mark W; Fenwick, Matthew; Schuyler, Adam D; Stern, Alan S; Gorbatyuk, Vitaliy; Hoch, Jeffrey C

    2011-10-04

    Despite advances in resolution accompanying the development of high-field superconducting magnets, biomolecular applications of NMR require multiple dimensions in order to resolve individual resonances, and the achievable resolution is typically limited by practical constraints on measuring time. In addition to the need for measuring long evolution times to obtain high resolution, the need to distinguish the sign of the frequency constrains the ability to shorten measuring times. Sign discrimination is typically accomplished by sampling the signal with two different receiver phases or by selecting a reference frequency outside the range of frequencies spanned by the signal and then sampling at a higher rate. In the parametrically sampled (indirect) time dimensions of multidimensional NMR experiments, either method imposes an additional factor of 2 sampling burden for each dimension. We demonstrate that by using a single detector phase at each time sample point, but randomly altering the phase for different points, the sign ambiguity that attends fixed single-phase detection is resolved. Random phase detection enables a reduction in experiment time by a factor of 2 for each indirect dimension, amounting to a factor of 8 for a four-dimensional experiment, albeit at the cost of introducing sampling artifacts. Alternatively, for fixed measuring time, random phase detection can be used to double resolution in each indirect dimension. Random phase detection is complementary to nonuniform sampling methods, and their combination offers the potential for additional benefits. In addition to applications in biomolecular NMR, random phase detection could be useful in magnetic resonance imaging and other signal processing contexts.

  12. High-speed imaging using CMOS image sensor with quasi pixel-wise exposure

    NASA Astrophysics Data System (ADS)

    Sonoda, T.; Nagahara, H.; Endo, K.; Sugiyama, Y.; Taniguchi, R.

    2017-02-01

    Several recent studies in compressive video sensing have realized scene capture beyond the fundamental trade-off limit between spatial resolution and temporal resolution using random space-time sampling. However, most of these studies showed results for higher frame rate video that were produced by simulation experiments or using an optically simulated random sampling camera, because there are currently no commercially available image sensors with random exposure or sampling capabilities. We fabricated a prototype complementary metal oxide semiconductor (CMOS) image sensor with quasi pixel-wise exposure timing that can realize nonuniform space-time sampling. The prototype sensor can reset exposures independently by columns and fix these amount of exposure by rows for each 8x8 pixel block. This CMOS sensor is not fully controllable via the pixels, and has line-dependent controls, but it offers flexibility when compared with regular CMOS or charge-coupled device sensors with global or rolling shutters. We propose a method to realize pseudo-random sampling for high-speed video acquisition that uses the flexibility of the CMOS sensor. We reconstruct the high-speed video sequence from the images produced by pseudo-random sampling using an over-complete dictionary.

  13. ADAPTIVE MATCHING IN RANDOMIZED TRIALS AND OBSERVATIONAL STUDIES

    PubMed Central

    van der Laan, Mark J.; Balzer, Laura B.; Petersen, Maya L.

    2014-01-01

    SUMMARY In many randomized and observational studies the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, complicating estimation and posing challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, the observed data can neither be represented as the realization of n independent random variables, nor, contrary to current practice, as the realization of n/2 independent random variables (treating the matched pair as the independent sampling unit). In this paper we study estimation of the average causal effect of a treatment under experimental designs in which treatment allocation potentially depends on the pre-intervention covariates of all units included in the sample. We define efficient targeted minimum loss based estimators for this general design, present a theorem that establishes the desired asymptotic normality of these estimators and allows for asymptotically valid statistical inference, and discuss implementation of these estimators. We further investigate the relative asymptotic efficiency of this design compared with a design in which unit-specific treatment assignment depends only on the units’ covariates. Our findings have practical implications for the optimal design and analysis of pair matched cluster randomized trials, as well as for observational studies in which treatment decisions may depend on characteristics of the entire sample. PMID:25097298

  14. Adaptive Peer Sampling with Newscast

    NASA Astrophysics Data System (ADS)

    Tölgyesi, Norbert; Jelasity, Márk

    The peer sampling service is a middleware service that provides random samples from a large decentralized network to support gossip-based applications such as multicast, data aggregation and overlay topology management. Lightweight gossip-based implementations of the peer sampling service have been shown to provide good quality random sampling while also being extremely robust to many failure scenarios, including node churn and catastrophic failure. We identify two problems with these approaches. The first problem is related to message drop failures: if a node experiences a higher-than-average message drop rate then the probability of sampling this node in the network will decrease. The second problem is that the application layer at different nodes might request random samples at very different rates which can result in very poor random sampling especially at nodes with high request rates. We propose solutions for both problems. We focus on Newscast, a robust implementation of the peer sampling service. Our solution is based on simple extensions of the protocol and an adaptive self-control mechanism for its parameters, namely—without involving failure detectors—nodes passively monitor local protocol events using them as feedback for a local control loop for self-tuning the protocol parameters. The proposed solution is evaluated by simulation experiments.

  15. Final report : sampling plan for pavement condition ratings of secondary roads.

    DOT National Transportation Integrated Search

    1984-01-01

    The purpose of this project was to develop a random sampling plan for use in selecting segments of the secondary highway system for evaluation under the Department's PMS. The plan developed is described here. It is a simple, workable, random sampling...

  16. Generating Random Samples of a Given Size Using Social Security Numbers.

    ERIC Educational Resources Information Center

    Erickson, Richard C.; Brauchle, Paul E.

    1984-01-01

    The purposes of this article are (1) to present a method by which social security numbers may be used to draw cluster samples of a predetermined size and (2) to describe procedures used to validate this method of drawing random samples. (JOW)

  17. RandomSpot: A web-based tool for systematic random sampling of virtual slides.

    PubMed

    Wright, Alexander I; Grabsch, Heike I; Treanor, Darren E

    2015-01-01

    This paper describes work presented at the Nordic Symposium on Digital Pathology 2014, Linköping, Sweden. Systematic random sampling (SRS) is a stereological tool, which provides a framework to quickly build an accurate estimation of the distribution of objects or classes within an image, whilst minimizing the number of observations required. RandomSpot is a web-based tool for SRS in stereology, which systematically places equidistant points within a given region of interest on a virtual slide. Each point can then be visually inspected by a pathologist in order to generate an unbiased sample of the distribution of classes within the tissue. Further measurements can then be derived from the distribution, such as the ratio of tumor to stroma. RandomSpot replicates the fundamental principle of traditional light microscope grid-shaped graticules, with the added benefits associated with virtual slides, such as facilitated collaboration and automated navigation between points. Once the sample points have been added to the region(s) of interest, users can download the annotations and view them locally using their virtual slide viewing software. Since its introduction, RandomSpot has been used extensively for international collaborative projects, clinical trials and independent research projects. So far, the system has been used to generate over 21,000 sample sets, and has been used to generate data for use in multiple publications, identifying significant new prognostic markers in colorectal, upper gastro-intestinal and breast cancer. Data generated using RandomSpot also has significant value for training image analysis algorithms using sample point coordinates and pathologist classifications.

  18. SNP selection and classification of genome-wide SNP data using stratified sampling random forests.

    PubMed

    Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K

    2012-09-01

    For high dimensional genome-wide association (GWA) case-control data of complex disease, there are usually a large portion of single-nucleotide polymorphisms (SNPs) that are irrelevant with the disease. A simple random sampling method in random forest using default mtry parameter to choose feature subspace, will select too many subspaces without informative SNPs. Exhaustive searching an optimal mtry is often required in order to include useful and relevant SNPs and get rid of vast of non-informative SNPs. However, it is too time-consuming and not favorable in GWA for high-dimensional data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for GWA high-dimensional data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. The advantage of this stratified sampling procedure can make sure each subspace contains enough useful SNPs, but can avoid a very high computational cost of exhaustive search of an optimal mtry, and maintain the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprised of 408 803 SNPs and Alzheimer case-control data comprised of 380 157 SNPs) to demonstrate that the proposed stratified sampling method is effective, and it can generate better random forest with higher accuracy and lower error bound than those by Breiman's random forest generation method. For Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders for further biological investigations.

  19. Steinhaus’ Geometric Location Problem for Random Samples in the Plane.

    DTIC Science & Technology

    1982-05-11

    NAL 411R A1, ’I 7 - I STEINHAUS ’ GEOMETRIC LOCATION PROBLEM FOR RANDOM SAMPLES IN THE PLANE By Dorit Hochbaum and J. Michael Steele TECHNICAL REPORT...DEPARTMENT OF STATISTICS -Dltrib’ytion/ STANFORD UNIVERSITY A-I.abilty Codes STANFORD, CALIFORNIA Dist Spciat ecial Steinhaus ’ Geometric Location Problem for...Random Samples in the Plane By Dorit Hochbaum and J. Michael Steele I. Introduction. The work of H. Steinhaus U wf94 as apparently the first explicit

  20. Reducing seed dependent variability of non-uniformly sampled multidimensional NMR data

    NASA Astrophysics Data System (ADS)

    Mobli, Mehdi

    2015-07-01

    The application of NMR spectroscopy to study the structure, dynamics and function of macromolecules requires the acquisition of several multidimensional spectra. The one-dimensional NMR time-response from the spectrometer is extended to additional dimensions by introducing incremented delays in the experiment that cause oscillation of the signal along "indirect" dimensions. For a given dimension the delay is incremented at twice the rate of the maximum frequency (Nyquist rate). To achieve high-resolution requires acquisition of long data records sampled at the Nyquist rate. This is typically a prohibitive step due to time constraints, resulting in sub-optimal data records to the detriment of subsequent analyses. The multidimensional NMR spectrum itself is typically sparse, and it has been shown that in such cases it is possible to use non-Fourier methods to reconstruct a high-resolution multidimensional spectrum from a random subset of non-uniformly sampled (NUS) data. For a given acquisition time, NUS has the potential to improve the sensitivity and resolution of a multidimensional spectrum, compared to traditional uniform sampling. The improvements in sensitivity and/or resolution achieved by NUS are heavily dependent on the distribution of points in the random subset acquired. Typically, random points are selected from a probability density function (PDF) weighted according to the NMR signal envelope. In extreme cases as little as 1% of the data is subsampled. The heavy under-sampling can result in poor reproducibility, i.e. when two experiments are carried out where the same number of random samples is selected from the same PDF but using different random seeds. Here, a jittered sampling approach is introduced that is shown to improve random seed dependent reproducibility of multidimensional spectra generated from NUS data, compared to commonly applied NUS methods. It is shown that this is achieved due to the low variability of the inherent sensitivity of the random subset chosen from a given PDF. Finally, it is demonstrated that metrics used to find optimal NUS distributions are heavily dependent on the inherent sensitivity of the random subset, and such optimisation is therefore less critical when using the proposed sampling scheme.

  1. Use of Matrix Sampling Procedures to Assess Achievement in Solving Open Addition and Subtraction Sentences.

    ERIC Educational Resources Information Center

    Montague, Margariete A.

    This study investigated the feasibility of concurrently and randomly sampling examinees and items in order to estimate group achievement. Seven 32-item tests reflecting a 640-item universe of simple open sentences were used such that item selection (random, systematic) and assignment (random, systematic) of items (four, eight, sixteen) to forms…

  2. Improving Generalizations from Experiments Using Propensity Score Subclassification: Assumptions, Properties, and Contexts

    ERIC Educational Resources Information Center

    Tipton, Elizabeth

    2013-01-01

    As a result of the use of random assignment to treatment, randomized experiments typically have high internal validity. However, units are very rarely randomly selected from a well-defined population of interest into an experiment; this results in low external validity. Under nonrandom sampling, this means that the estimate of the sample average…

  3. 40 CFR 761.306 - Sampling 1 meter square surfaces by random selection of halves.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 31 2014-07-01 2014-07-01 false Sampling 1 meter square surfaces by...(b)(3) § 761.306 Sampling 1 meter square surfaces by random selection of halves. (a) Divide each 1 meter square portion where it is necessary to collect a surface wipe test sample into two equal (or as...

  4. 40 CFR 761.306 - Sampling 1 meter square surfaces by random selection of halves.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 32 2013-07-01 2013-07-01 false Sampling 1 meter square surfaces by...(b)(3) § 761.306 Sampling 1 meter square surfaces by random selection of halves. (a) Divide each 1 meter square portion where it is necessary to collect a surface wipe test sample into two equal (or as...

  5. 40 CFR 761.306 - Sampling 1 meter square surfaces by random selection of halves.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 32 2012-07-01 2012-07-01 false Sampling 1 meter square surfaces by...(b)(3) § 761.306 Sampling 1 meter square surfaces by random selection of halves. (a) Divide each 1 meter square portion where it is necessary to collect a surface wipe test sample into two equal (or as...

  6. 40 CFR 761.306 - Sampling 1 meter square surfaces by random selection of halves.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 31 2011-07-01 2011-07-01 false Sampling 1 meter square surfaces by...(b)(3) § 761.306 Sampling 1 meter square surfaces by random selection of halves. (a) Divide each 1 meter square portion where it is necessary to collect a surface wipe test sample into two equal (or as...

  7. 40 CFR 761.306 - Sampling 1 meter square surfaces by random selection of halves.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 30 2010-07-01 2010-07-01 false Sampling 1 meter square surfaces by...(b)(3) § 761.306 Sampling 1 meter square surfaces by random selection of halves. (a) Divide each 1 meter square portion where it is necessary to collect a surface wipe test sample into two equal (or as...

  8. Repeated Random Sampling in Year 5

    ERIC Educational Resources Information Center

    Watson, Jane M.; English, Lyn D.

    2016-01-01

    As an extension to an activity introducing Year 5 students to the practice of statistics, the software "TinkerPlots" made it possible to collect repeated random samples from a finite population to informally explore students' capacity to begin reasoning with a distribution of sample statistics. This article provides background for the…

  9. Latin Hypercube Sampling (LHS) UNIX Library/Standalone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2004-05-13

    The LHS UNIX Library/Standalone software provides the capability to draw random samples from over 30 distribution types. It performs the sampling by a stratified sampling method called Latin Hypercube Sampling (LHS). Multiple distributions can be sampled simultaneously, with user-specified correlations amongst the input distributions, LHS UNIX Library/ Standalone provides a way to generate multi-variate samples. The LHS samples can be generated either as a callable library (e.g., from within the DAKOTA software framework) or as a standalone capability. LHS UNIX Library/Standalone uses the Latin Hypercube Sampling method (LHS) to generate samples. LHS is a constrained Monte Carlo sampling scheme. Inmore » LHS, the range of each variable is divided into non-overlapping intervals on the basis of equal probability. A sample is selected at random with respect to the probability density in each interval, If multiple variables are sampled simultaneously, then values obtained for each are paired in a random manner with the n values of the other variables. In some cases, the pairing is restricted to obtain specified correlations amongst the input variables. Many simulation codes have input parameters that are uncertain and can be specified by a distribution, To perform uncertainty analysis and sensitivity analysis, random values are drawn from the input parameter distributions, and the simulation is run with these values to obtain output values. If this is done repeatedly, with many input samples drawn, one can build up a distribution of the output as well as examine correlations between input and output variables.« less

  10. Reply to Jackson, O'Keefe, and Jacobs.

    ERIC Educational Resources Information Center

    Morley, Donald Dean

    1988-01-01

    Replies to Sally Jackson, Daniel O'Keefe, and Scott Jacobs' article (same issue), maintaining that randomness requirements can not be relaxed for generalizing from message samples, since systematic samples are not truly random. (MS)

  11. Sampling Methods in Cardiovascular Nursing Research: An Overview.

    PubMed

    Kandola, Damanpreet; Banner, Davina; O'Keefe-McCarthy, Sheila; Jassal, Debbie

    2014-01-01

    Cardiovascular nursing research covers a wide array of topics from health services to psychosocial patient experiences. The selection of specific participant samples is an important part of the research design and process. The sampling strategy employed is of utmost importance to ensure that a representative sample of participants is chosen. There are two main categories of sampling methods: probability and non-probability. Probability sampling is the random selection of elements from the population, where each element of the population has an equal and independent chance of being included in the sample. There are five main types of probability sampling including simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multi-stage sampling. Non-probability sampling methods are those in which elements are chosen through non-random methods for inclusion into the research study and include convenience sampling, purposive sampling, and snowball sampling. Each approach offers distinct advantages and disadvantages and must be considered critically. In this research column, we provide an introduction to these key sampling techniques and draw on examples from the cardiovascular research. Understanding the differences in sampling techniques may aid nurses in effective appraisal of research literature and provide a reference pointfor nurses who engage in cardiovascular research.

  12. Relations among questionnaire and experience sampling measures of inner speech: a smartphone app study

    PubMed Central

    Alderson-Day, Ben; Fernyhough, Charles

    2015-01-01

    Inner speech is often reported to be a common and central part of inner experience, but its true prevalence is unclear. Many questionnaire-based measures appear to lack convergent validity and it has been claimed that they overestimate inner speech in comparison to experience sampling methods (which involve collecting data at random timepoints). The present study compared self-reporting of inner speech collected via a general questionnaire and experience sampling, using data from a custom-made smartphone app (Inner Life). Fifty-one university students completed a generalized self-report measure of inner speech (the Varieties of Inner Speech Questionnaire, VISQ) and responded to at least seven random alerts to report on incidences of inner speech over a 2-week period. Correlations and pairwise comparisons were used to compare generalized endorsements and randomly sampled scores for each VISQ subscale. Significant correlations were observed between general and randomly sampled measures for only two of the four VISQ subscales, and endorsements of inner speech with evaluative or motivational characteristics did not correlate at all across different measures. Endorsement of inner speech items was significantly lower for random sampling compared to generalized self-report, for all VISQ subscales. Exploratory analysis indicated that specific inner speech characteristics were also related to anxiety and future-oriented thinking. PMID:25964773

  13. Sample size considerations when groups are the appropriate unit of analyses

    PubMed Central

    Sadler, Georgia Robins; Ko, Celine Marie; Alisangco, Jennifer; Rosbrook, Bradley P.; Miller, Eric; Fullerton, Judith

    2007-01-01

    This paper discusses issues to be considered by nurse researchers when groups should be used as a unit of randomization. Advantages and disadvantages are presented, with statistical calculations needed to determine effective sample size. Examples of these concepts are presented using data from the Black Cosmetologists Promoting Health Program. Different hypothetical scenarios and their impact on sample size are presented. Given the complexity of calculating sample size when using groups as a unit of randomization, it’s advantageous for researchers to work closely with statisticians when designing and implementing studies that anticipate the use of groups as the unit of randomization. PMID:17693219

  14. Practical quantum random number generator based on measuring the shot noise of vacuum states

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen Yong; Zou Hongxin; Tian Liang

    2010-06-15

    The shot noise of vacuum states is a kind of quantum noise and is totally random. In this paper a nondeterministic random number generation scheme based on measuring the shot noise of vacuum states is presented and experimentally demonstrated. We use a homodyne detector to measure the shot noise of vacuum states. Considering that the frequency bandwidth of our detector is limited, we derive the optimal sampling rate so that sampling points have the least correlation with each other. We also choose a method to extract random numbers from sampling values, and prove that the influence of classical noise canmore » be avoided with this method so that the detector does not have to be shot-noise limited. The random numbers generated with this scheme have passed ent and diehard tests.« less

  15. A Comparison of Single Sample and Bootstrap Methods to Assess Mediation in Cluster Randomized Trials

    ERIC Educational Resources Information Center

    Pituch, Keenan A.; Stapleton, Laura M.; Kang, Joo Youn

    2006-01-01

    A Monte Carlo study examined the statistical performance of single sample and bootstrap methods that can be used to test and form confidence interval estimates of indirect effects in two cluster randomized experimental designs. The designs were similar in that they featured random assignment of clusters to one of two treatment conditions and…

  16. Performance of Random Effects Model Estimators under Complex Sampling Designs

    ERIC Educational Resources Information Center

    Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan

    2011-01-01

    In this article, we consider estimation of parameters of random effects models from samples collected via complex multistage designs. Incorporation of sampling weights is one way to reduce estimation bias due to unequal probabilities of selection. Several weighting methods have been proposed in the literature for estimating the parameters of…

  17. Resource Input, Service Process and Resident Activity Indicators in a Welsh National Random Sample of Staffed Housing Services for People with Intellectual Disabilities

    ERIC Educational Resources Information Center

    Felce, David; Perry, Jonathan

    2004-01-01

    Background: The aims were to: (i) explore the association between age and size of setting and staffing per resident; and (ii) report resident and setting characteristics, and indicators of service process and resident activity for a national random sample of staffed housing provision. Methods: Sixty settings were selected randomly from those…

  18. [Exploration of the concept of genetic drift in genetics teaching of undergraduates].

    PubMed

    Wang, Chun-ming

    2016-01-01

    Genetic drift is one of the difficulties in teaching genetics due to its randomness and probability which could easily cause conceptual misunderstanding. The “sampling error" in its definition is often misunderstood because of the research method of “sampling", which disturbs the results and causes the random changes in allele frequency. I analyzed and compared the definitions of genetic drift in domestic and international genetic textbooks, and found that the definitions containing “sampling error" are widely adopted but are interpreted correctly in only a few textbooks. Here, the history of research on genetic drift, i.e., the contributions of Wright, Fisher and Kimura, is introduced. Moreover, I particularly describe two representative articles recently published about genetic drift teaching of undergraduates, which point out that misconceptions are inevitable for undergraduates during the studying process and also provide a preliminary solution. Combined with my own teaching practice, I suggest that the definition of genetic drift containing “sampling error" can be adopted with further interpretation, i.e., “sampling error" is random sampling among gametes when generating the next generation of alleles which is equivalent to a random sampling of all gametes participating in mating in gamete pool and has no relationship with artificial sampling in general genetics studies. This article may provide some help in genetics teaching.

  19. 40 CFR 761.355 - Third level of sample selection.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...

  20. 40 CFR 761.355 - Third level of sample selection.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...

  1. 40 CFR 761.355 - Third level of sample selection.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...

  2. 40 CFR 761.355 - Third level of sample selection.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...

  3. 40 CFR 761.355 - Third level of sample selection.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...

  4. Differences in Mathematics Teachers' Perceived Preparedness to Demonstrate Competence in Secondary School Mathematics Content by Teacher Characteristics

    ERIC Educational Resources Information Center

    Ng'eno, J. K.; Chesimet, M. C.

    2016-01-01

    A sample of 300 mathematics teachers drawn from a population of 1500 participated in this study. The participants were selected using systematic random sampling and stratified random sampling (stratified by qualification and gender). The data was collected using self-report questionnaires for mathematics teachers. One tool was used to collect…

  5. Testing the Feasibility of Developmental Asset Measures on College Students to Guide Health Promotion Efforts

    ERIC Educational Resources Information Center

    Zullig, Keith J.; Ward, Rose Marie; King, Keith A.; Patton, Jon M.; Murray, Karen A.

    2009-01-01

    The purpose of this investigation was to assess the reliability and validity of eight developmental asset measures among a stratified, random sample (N = 540) of college students to guide health promotion efforts. The sample was randomly split to produce exploratory and confirmatory samples for factor analysis using principal axis factoring and…

  6. Use of LANDSAT imagery for wildlife habitat mapping in northeast and eastcentral Alaska

    NASA Technical Reports Server (NTRS)

    Lent, P. C. (Principal Investigator)

    1976-01-01

    The author has identified the following significant results. There is strong indication that spatially rare feature classes may be missed in clustering classifications based on 2% random sampling. Therefore, it seems advisable to augment random sampling for cluster analysis with directed sampling of any spatially rare features which are relevant to the analysis.

  7. Workforce Readiness: A Study of University Students' Fluency with Information Technology

    ERIC Educational Resources Information Center

    Kaminski, Karen; Switzer, Jamie; Gloeckner, Gene

    2009-01-01

    This study with data collected from a large sample of freshmen in 2001 and a random stratified sample of seniors in 2005 examined students perceived FITness (fluency with Information Technology). In the fall of 2001 freshmen at a medium sized research-one institution completed a survey and in spring 2005 a random sample of graduating seniors…

  8. Gender Differentiation in the New York "Times": 1885 and 1985.

    ERIC Educational Resources Information Center

    Jolliffe, Lee

    A study examined the descriptive language and sex-linked roles ascribed to women and men in articles of the New York "Times" from 1885 and 1985. Seven content analysis methods were applied to four random samples from the "Times"; one sample each for women and men from both years. Samples were drawn using randomly constructed…

  9. Errors in radial velocity variance from Doppler wind lidar

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, H.; Barthelmie, R. J.; Doubrawa, P.

    A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, themore » systematic error is negligible but the random error exceeds about 10%.« less

  10. Errors in radial velocity variance from Doppler wind lidar

    DOE PAGES

    Wang, H.; Barthelmie, R. J.; Doubrawa, P.; ...

    2016-08-29

    A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, themore » systematic error is negligible but the random error exceeds about 10%.« less

  11. A Multilevel, Hierarchical Sampling Technique for Spatially Correlated Random Fields

    DOE PAGES

    Osborn, Sarah; Vassilevski, Panayot S.; Villa, Umberto

    2017-10-26

    In this paper, we propose an alternative method to generate samples of a spatially correlated random field with applications to large-scale problems for forward propagation of uncertainty. A classical approach for generating these samples is the Karhunen--Loève (KL) decomposition. However, the KL expansion requires solving a dense eigenvalue problem and is therefore computationally infeasible for large-scale problems. Sampling methods based on stochastic partial differential equations provide a highly scalable way to sample Gaussian fields, but the resulting parametrization is mesh dependent. We propose a multilevel decomposition of the stochastic field to allow for scalable, hierarchical sampling based on solving amore » mixed finite element formulation of a stochastic reaction-diffusion equation with a random, white noise source function. Lastly, numerical experiments are presented to demonstrate the scalability of the sampling method as well as numerical results of multilevel Monte Carlo simulations for a subsurface porous media flow application using the proposed sampling method.« less

  12. A Multilevel, Hierarchical Sampling Technique for Spatially Correlated Random Fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osborn, Sarah; Vassilevski, Panayot S.; Villa, Umberto

    In this paper, we propose an alternative method to generate samples of a spatially correlated random field with applications to large-scale problems for forward propagation of uncertainty. A classical approach for generating these samples is the Karhunen--Loève (KL) decomposition. However, the KL expansion requires solving a dense eigenvalue problem and is therefore computationally infeasible for large-scale problems. Sampling methods based on stochastic partial differential equations provide a highly scalable way to sample Gaussian fields, but the resulting parametrization is mesh dependent. We propose a multilevel decomposition of the stochastic field to allow for scalable, hierarchical sampling based on solving amore » mixed finite element formulation of a stochastic reaction-diffusion equation with a random, white noise source function. Lastly, numerical experiments are presented to demonstrate the scalability of the sampling method as well as numerical results of multilevel Monte Carlo simulations for a subsurface porous media flow application using the proposed sampling method.« less

  13. Evaluation of Bayesian Sequential Proportion Estimation Using Analyst Labels

    NASA Technical Reports Server (NTRS)

    Lennington, R. K.; Abotteen, K. M. (Principal Investigator)

    1980-01-01

    The author has identified the following significant results. A total of ten Large Area Crop Inventory Experiment Phase 3 blind sites and analyst-interpreter labels were used in a study to compare proportional estimates obtained by the Bayes sequential procedure with estimates obtained from simple random sampling and from Procedure 1. The analyst error rate using the Bayes technique was shown to be no greater than that for the simple random sampling. Also, the segment proportion estimates produced using this technique had smaller bias and mean squared errors than the estimates produced using either simple random sampling or Procedure 1.

  14. Midwives Performance in Early Detection of Growth and Development Irregularities of Children Based on Task Commitment

    ERIC Educational Resources Information Center

    Utami, Sri; Nursalam; Hargono, Rachmat; Susilaningrum, Rekawati

    2016-01-01

    The purpose of this study was to analyze the performance of midwives based on the task commitment. This was an observational analytic with cross sectional approach. Multistage random sampling was used to determine the public health center, proportional random sampling to selected participants. The samples were 222 midwives in the public health…

  15. An improved sampling method of complex network

    NASA Astrophysics Data System (ADS)

    Gao, Qi; Ding, Xintong; Pan, Feng; Li, Weixing

    2014-12-01

    Sampling subnet is an important topic of complex network research. Sampling methods influence the structure and characteristics of subnet. Random multiple snowball with Cohen (RMSC) process sampling which combines the advantages of random sampling and snowball sampling is proposed in this paper. It has the ability to explore global information and discover the local structure at the same time. The experiments indicate that this novel sampling method could keep the similarity between sampling subnet and original network on degree distribution, connectivity rate and average shortest path. This method is applicable to the situation where the prior knowledge about degree distribution of original network is not sufficient.

  16. Exact intervals and tests for median when one sample value possibly an outliner

    NASA Technical Reports Server (NTRS)

    Keller, G. J.; Walsh, J. E.

    1973-01-01

    Available are independent observations (continuous data) that are believed to be a random sample. Desired are distribution-free confidence intervals and significance tests for the population median. However, there is the possibility that either the smallest or the largest observation is an outlier. Then, use of a procedure for rejection of an outlying observation might seem appropriate. Such a procedure would consider that two alternative situations are possible and would select one of them. Either (1) the n observations are truly a random sample, or (2) an outlier exists and its removal leaves a random sample of size n-1. For either situation, confidence intervals and tests are desired for the median of the population yielding the random sample. Unfortunately, satisfactory rejection procedures of a distribution-free nature do not seem to be available. Moreover, all rejection procedures impose undesirable conditional effects on the observations, and also, can select the wrong one of the two above situations. It is found that two-sided intervals and tests based on two symmetrically located order statistics (not the largest and smallest) of the n observations have this property.

  17. Sparsely sampling the sky: Regular vs. random sampling

    NASA Astrophysics Data System (ADS)

    Paykari, P.; Pires, S.; Starck, J.-L.; Jaffe, A. H.

    2015-09-01

    Aims: The next generation of galaxy surveys, aiming to observe millions of galaxies, are expensive both in time and money. This raises questions regarding the optimal investment of this time and money for future surveys. In a previous work, we have shown that a sparse sampling strategy could be a powerful substitute for the - usually favoured - contiguous observation of the sky. In our previous paper, regular sparse sampling was investigated, where the sparse observed patches were regularly distributed on the sky. The regularity of the mask introduces a periodic pattern in the window function, which induces periodic correlations at specific scales. Methods: In this paper, we use a Bayesian experimental design to investigate a "random" sparse sampling approach, where the observed patches are randomly distributed over the total sparsely sampled area. Results: We find that in this setting, the induced correlation is evenly distributed amongst all scales as there is no preferred scale in the window function. Conclusions: This is desirable when we are interested in any specific scale in the galaxy power spectrum, such as the matter-radiation equality scale. As the figure of merit shows, however, there is no preference between regular and random sampling to constrain the overall galaxy power spectrum and the cosmological parameters.

  18. The Performance of the Date-Randomization Test in Phylogenetic Analyses of Time-Structured Virus Data.

    PubMed

    Duchêne, Sebastián; Duchêne, David; Holmes, Edward C; Ho, Simon Y W

    2015-07-01

    Rates and timescales of viral evolution can be estimated using phylogenetic analyses of time-structured molecular sequences. This involves the use of molecular-clock methods, calibrated by the sampling times of the viral sequences. However, the spread of these sampling times is not always sufficient to allow the substitution rate to be estimated accurately. We conducted Bayesian phylogenetic analyses of simulated virus data to evaluate the performance of the date-randomization test, which is sometimes used to investigate whether time-structured data sets have temporal signal. An estimate of the substitution rate passes this test if its mean does not fall within the 95% credible intervals of rate estimates obtained using replicate data sets in which the sampling times have been randomized. We find that the test sometimes fails to detect rate estimates from data with no temporal signal. This error can be minimized by using a more conservative criterion, whereby the 95% credible interval of the estimate with correct sampling times should not overlap with those obtained with randomized sampling times. We also investigated the behavior of the test when the sampling times are not uniformly distributed throughout the tree, which sometimes occurs in empirical data sets. The test performs poorly in these circumstances, such that a modification to the randomization scheme is needed. Finally, we illustrate the behavior of the test in analyses of nucleotide sequences of cereal yellow dwarf virus. Our results validate the use of the date-randomization test and allow us to propose guidelines for interpretation of its results. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Evaluation of a Class of Simple and Effective Uncertainty Methods for Sparse Samples of Random Variables and Functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin

    When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities -- without being overly conservative -- when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a classmore » of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics use d to assess method performance for conservative estimation of two representative quantities: central 95% of response; and 10 -4 probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depends on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large data base and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.« less

  20. Statistical methods for efficient design of community surveys of response to noise: Random coefficients regression models

    NASA Technical Reports Server (NTRS)

    Tomberlin, T. J.

    1985-01-01

    Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.

  1. Optimal sampling design for estimating spatial distribution and abundance of a freshwater mussel population

    USGS Publications Warehouse

    Pooler, P.S.; Smith, D.R.

    2005-01-01

    We compared the ability of simple random sampling (SRS) and a variety of systematic sampling (SYS) designs to estimate abundance, quantify spatial clustering, and predict spatial distribution of freshwater mussels. Sampling simulations were conducted using data obtained from a census of freshwater mussels in a 40 X 33 m section of the Cacapon River near Capon Bridge, West Virginia, and from a simulated spatially random population generated to have the same abundance as the real population. Sampling units that were 0.25 m 2 gave more accurate and precise abundance estimates and generally better spatial predictions than 1-m2 sampling units. Systematic sampling with ???2 random starts was more efficient than SRS. Estimates of abundance based on SYS were more accurate when the distance between sampling units across the stream was less than or equal to the distance between sampling units along the stream. Three measures for quantifying spatial clustering were examined: Hopkins Statistic, the Clumping Index, and Morisita's Index. Morisita's Index was the most reliable, and the Hopkins Statistic was prone to false rejection of complete spatial randomness. SYS designs with units spaced equally across and up stream provided the most accurate predictions when estimating the spatial distribution by kriging. Our research indicates that SYS designs with sampling units equally spaced both across and along the stream would be appropriate for sampling freshwater mussels even if no information about the true underlying spatial distribution of the population were available to guide the design choice. ?? 2005 by The North American Benthological Society.

  2. Fatigue crack growth model RANDOM2 user manual, appendix 1

    NASA Technical Reports Server (NTRS)

    Boyce, Lola; Lovelace, Thomas B.

    1989-01-01

    The FORTRAN program RANDOM2 is documented. RANDOM2 is based on fracture mechanics using a probabilistic fatigue crack growth model. It predicts the random lifetime of an engine component to reach a given crack size. Included in this user manual are details regarding the theoretical background of RANDOM2, input data, instructions and a sample problem illustrating the use of RANDOM2. Appendix A gives information on the physical quantities, their symbols, FORTRAN names, and both SI and U.S. Customary units. Appendix B includes photocopies of the actual computer printout corresponding to the sample problem. Appendices C and D detail the IMSL, Ver. 10(1), subroutines and functions called by RANDOM2 and a SAS/GRAPH(2) program that can be used to plot both the probability density function (p.d.f.) and the cumulative distribution function (c.d.f.).

  3. Toward a Principled Sampling Theory for Quasi-Orders

    PubMed Central

    Ünlü, Ali; Schrepp, Martin

    2016-01-01

    Quasi-orders, that is, reflexive and transitive binary relations, have numerous applications. In educational theories, the dependencies of mastery among the problems of a test can be modeled by quasi-orders. Methods such as item tree or Boolean analysis that mine for quasi-orders in empirical data are sensitive to the underlying quasi-order structure. These data mining techniques have to be compared based on extensive simulation studies, with unbiased samples of randomly generated quasi-orders at their basis. In this paper, we develop techniques that can provide the required quasi-order samples. We introduce a discrete doubly inductive procedure for incrementally constructing the set of all quasi-orders on a finite item set. A randomization of this deterministic procedure allows us to generate representative samples of random quasi-orders. With an outer level inductive algorithm, we consider the uniform random extensions of the trace quasi-orders to higher dimension. This is combined with an inner level inductive algorithm to correct the extensions that violate the transitivity property. The inner level correction step entails sampling biases. We propose three algorithms for bias correction and investigate them in simulation. It is evident that, on even up to 50 items, the new algorithms create close to representative quasi-order samples within acceptable computing time. Hence, the principled approach is a significant improvement to existing methods that are used to draw quasi-orders uniformly at random but cannot cope with reasonably large item sets. PMID:27965601

  4. Toward a Principled Sampling Theory for Quasi-Orders.

    PubMed

    Ünlü, Ali; Schrepp, Martin

    2016-01-01

    Quasi-orders, that is, reflexive and transitive binary relations, have numerous applications. In educational theories, the dependencies of mastery among the problems of a test can be modeled by quasi-orders. Methods such as item tree or Boolean analysis that mine for quasi-orders in empirical data are sensitive to the underlying quasi-order structure. These data mining techniques have to be compared based on extensive simulation studies, with unbiased samples of randomly generated quasi-orders at their basis. In this paper, we develop techniques that can provide the required quasi-order samples. We introduce a discrete doubly inductive procedure for incrementally constructing the set of all quasi-orders on a finite item set. A randomization of this deterministic procedure allows us to generate representative samples of random quasi-orders. With an outer level inductive algorithm, we consider the uniform random extensions of the trace quasi-orders to higher dimension. This is combined with an inner level inductive algorithm to correct the extensions that violate the transitivity property. The inner level correction step entails sampling biases. We propose three algorithms for bias correction and investigate them in simulation. It is evident that, on even up to 50 items, the new algorithms create close to representative quasi-order samples within acceptable computing time. Hence, the principled approach is a significant improvement to existing methods that are used to draw quasi-orders uniformly at random but cannot cope with reasonably large item sets.

  5. A novel 3D Cartesian random sampling strategy for Compressive Sensing Magnetic Resonance Imaging.

    PubMed

    Valvano, Giuseppe; Martini, Nicola; Santarelli, Maria Filomena; Chiappino, Dante; Landini, Luigi

    2015-01-01

    In this work we propose a novel acquisition strategy for accelerated 3D Compressive Sensing Magnetic Resonance Imaging (CS-MRI). This strategy is based on a 3D cartesian sampling with random switching of the frequency encoding direction with other K-space directions. Two 3D sampling strategies are presented. In the first strategy, the frequency encoding direction is randomly switched with one of the two phase encoding directions. In the second strategy, the frequency encoding direction is randomly chosen between all the directions of the K-Space. These strategies can lower the coherence of the acquisition, in order to produce reduced aliasing artifacts and to achieve a better image quality after Compressive Sensing (CS) reconstruction. Furthermore, the proposed strategies can reduce the typical smoothing of CS due to the limited sampling of high frequency locations. We demonstrated by means of simulations that the proposed acquisition strategies outperformed the standard Compressive Sensing acquisition. This results in a better quality of the reconstructed images and in a greater achievable acceleration.

  6. The Association between Childhood and Adolescent Sexual Abuse and Proxies for Sexual Risk Behavior: A Random Sample of the General Population of Sweden

    ERIC Educational Resources Information Center

    Steel, Jennifer L.; Herlitz, Claes A.

    2005-01-01

    Objective: Several studies with small and ''high risk'' samples have demonstrated that a history of childhood or adolescent sexual abuse (CASA) is associated with sexual risk behaviors (SRBs). However, few studies with large random samples from the general population have specifically examined the relationship between CASA and SRBs with a…

  7. A high speed implementation of the random decrement algorithm

    NASA Technical Reports Server (NTRS)

    Kiraly, L. J.

    1982-01-01

    The algorithm is useful for measuring net system damping levels in stochastic processes and for the development of equivalent linearized system response models. The algorithm works by summing together all subrecords which occur after predefined threshold level is crossed. The random decrement signature is normally developed by scanning stored data and adding subrecords together. The high speed implementation of the random decrement algorithm exploits the digital character of sampled data and uses fixed record lengths of 2(n) samples to greatly speed up the process. The contributions to the random decrement signature of each data point was calculated only once and in the same sequence as the data were taken. A hardware implementation of the algorithm using random logic is diagrammed and the process is shown to be limited only by the record size and the threshold crossing frequency of the sampled data. With a hardware cycle time of 200 ns and 1024 point signature, a threshold crossing frequency of 5000 Hertz can be processed and a stably averaged signature presented in real time.

  8. Estimation After a Group Sequential Trial.

    PubMed

    Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert

    2015-10-01

    Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even, unbiased linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the the random sample size can take only two values, N = n or N = 2 n . In this paper, we consider the more practically useful setting of sample sizes in a the finite set { n 1 , n 2 , …, n L }. It is shown that the sample average is then a justifiable estimator , in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.

  9. Recording 2-D Nutation NQR Spectra by Random Sampling Method

    PubMed Central

    Sinyavsky, Nikolaj; Jadzyn, Maciej; Ostafin, Michal; Nogaj, Boleslaw

    2010-01-01

    The method of random sampling was introduced for the first time in the nutation nuclear quadrupole resonance (NQR) spectroscopy where the nutation spectra show characteristic singularities in the form of shoulders. The analytic formulae for complex two-dimensional (2-D) nutation NQR spectra (I = 3/2) were obtained and the condition for resolving the spectral singularities for small values of an asymmetry parameter η was determined. Our results show that the method of random sampling of a nutation interferogram allows significant reduction of time required to perform a 2-D nutation experiment and does not worsen the spectral resolution. PMID:20949121

  10. Navigation Using Orthogonal Frequency Division Multiplexed Signals of Opportunity

    DTIC Science & Technology

    2007-09-01

    transmits a 32,767 bit pseudo -random “short” code that repeats 37.5 times per second. Since the pseudo -random bit pattern and modulation scheme are... correlation process takes two “ sample windows,” both of which are ν = 16 samples wide and are spaced N = 64 samples apart, and compares them. When the...technique in (3.4) is a necessary step in order to get a more accurate estimate of the sample shift from the symbol boundary correlator in (3.1). Figure

  11. Multilattice sampling strategies for region of interest dynamic MRI.

    PubMed

    Rilling, Gabriel; Tao, Yuehui; Marshall, Ian; Davies, Mike E

    2013-08-01

    A multilattice sampling approach is proposed for dynamic MRI with Cartesian trajectories. It relies on the use of sampling patterns composed of several different lattices and exploits an image model where only some parts of the image are dynamic, whereas the rest is assumed static. Given the parameters of such an image model, the methodology followed for the design of a multilattice sampling pattern adapted to the model is described. The multi-lattice approach is compared to single-lattice sampling, as used by traditional acceleration methods such as UNFOLD (UNaliasing by Fourier-Encoding the Overlaps using the temporal Dimension) or k-t BLAST, and random sampling used by modern compressed sensing-based methods. On the considered image model, it allows more flexibility and higher accelerations than lattice sampling and better performance than random sampling. The method is illustrated on a phase-contrast carotid blood velocity mapping MR experiment. Combining the multilattice approach with the KEYHOLE technique allows up to 12× acceleration factors. Simulation and in vivo undersampling results validate the method. Compared to lattice and random sampling, multilattice sampling provides significant gains at high acceleration factors. © 2012 Wiley Periodicals, Inc.

  12. A method for determining the weak statistical stationarity of a random process

    NASA Technical Reports Server (NTRS)

    Sadeh, W. Z.; Koper, C. A., Jr.

    1978-01-01

    A method for determining the weak statistical stationarity of a random process is presented. The core of this testing procedure consists of generating an equivalent ensemble which approximates a true ensemble. Formation of an equivalent ensemble is accomplished through segmenting a sufficiently long time history of a random process into equal, finite, and statistically independent sample records. The weak statistical stationarity is ascertained based on the time invariance of the equivalent-ensemble averages. Comparison of these averages with their corresponding time averages over a single sample record leads to a heuristic estimate of the ergodicity of a random process. Specific variance tests are introduced for evaluating the statistical independence of the sample records, the time invariance of the equivalent-ensemble autocorrelations, and the ergodicity. Examination and substantiation of these procedures were conducted utilizing turbulent velocity signals.

  13. Solving a mixture of many random linear equations by tensor decomposition and alternating minimization.

    DOT National Transportation Integrated Search

    2016-09-01

    We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...

  14. HABITAT ASSESSMENT USING A RANDOM PROBABILITY BASED SAMPLING DESIGN: ESCAMBIA RIVER DELTA, FLORIDA

    EPA Science Inventory

    Smith, Lisa M., Darrin D. Dantin and Steve Jordan. In press. Habitat Assessment Using a Random Probability Based Sampling Design: Escambia River Delta, Florida (Abstract). To be presented at the SWS/GERS Fall Joint Society Meeting: Communication and Collaboration: Coastal Systems...

  15. On the repeated measures designs and sample sizes for randomized controlled trials.

    PubMed

    Tango, Toshiro

    2016-04-01

    For the analysis of longitudinal or repeated measures data, generalized linear mixed-effects models provide a flexible and powerful tool to deal with heterogeneity among subject response profiles. However, the typical statistical design adopted in usual randomized controlled trials is an analysis of covariance type analysis using a pre-defined pair of "pre-post" data, in which pre-(baseline) data are used as a covariate for adjustment together with other covariates. Then, the major design issue is to calculate the sample size or the number of subjects allocated to each treatment group. In this paper, we propose a new repeated measures design and sample size calculations combined with generalized linear mixed-effects models that depend not only on the number of subjects but on the number of repeated measures before and after randomization per subject used for the analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are (1) it can easily handle missing data by applying the likelihood-based ignorable analyses under the missing at random assumption and (2) it may lead to a reduction in sample size, compared with the simple pre-post design. The proposed designs and the sample size calculations are illustrated with real data arising from randomized controlled trials. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. Randomized subspace-based robust principal component analysis for hyperspectral anomaly detection

    NASA Astrophysics Data System (ADS)

    Sun, Weiwei; Yang, Gang; Li, Jialin; Zhang, Dianfa

    2018-01-01

    A randomized subspace-based robust principal component analysis (RSRPCA) method for anomaly detection in hyperspectral imagery (HSI) is proposed. The RSRPCA combines advantages of randomized column subspace and robust principal component analysis (RPCA). It assumes that the background has low-rank properties, and the anomalies are sparse and do not lie in the column subspace of the background. First, RSRPCA implements random sampling to sketch the original HSI dataset from columns and to construct a randomized column subspace of the background. Structured random projections are also adopted to sketch the HSI dataset from rows. Sketching from columns and rows could greatly reduce the computational requirements of RSRPCA. Second, the RSRPCA adopts the columnwise RPCA (CWRPCA) to eliminate negative effects of sampled anomaly pixels and that purifies the previous randomized column subspace by removing sampled anomaly columns. The CWRPCA decomposes the submatrix of the HSI data into a low-rank matrix (i.e., background component), a noisy matrix (i.e., noise component), and a sparse anomaly matrix (i.e., anomaly component) with only a small proportion of nonzero columns. The algorithm of inexact augmented Lagrange multiplier is utilized to optimize the CWRPCA problem and estimate the sparse matrix. Nonzero columns of the sparse anomaly matrix point to sampled anomaly columns in the submatrix. Third, all the pixels are projected onto the complemental subspace of the purified randomized column subspace of the background and the anomaly pixels in the original HSI data are finally exactly located. Several experiments on three real hyperspectral images are carefully designed to investigate the detection performance of RSRPCA, and the results are compared with four state-of-the-art methods. Experimental results show that the proposed RSRPCA outperforms four comparison methods both in detection performance and in computational time.

  17. Representative Sampling: Follow-up of Spring 1972 and Spring 1973 Students. TEX-SIS FOLLOW-UP SC3.

    ERIC Educational Resources Information Center

    Wilkinson, Larry; And Others

    This report presents the findings of a research study, conducted by the College of the Mainland (COM) as a subcontractor for Project FOLLOW-UP, designed to test the accuracy of random sampling and to measure non-response bias in mail surveys. In 1975, a computer-generated random sample of 500 students was drawn from a population of 1,256 students…

  18. Gradient-free MCMC methods for dynamic causal modelling

    DOE PAGES

    Sengupta, Biswa; Friston, Karl J.; Penny, Will D.

    2015-03-14

    Here, we compare the performance of four gradient-free MCMC samplers (random walk Metropolis sampling, slice-sampling, adaptive MCMC sampling and population-based MCMC sampling with tempering) in terms of the number of independent samples they can produce per unit computational time. For the Bayesian inversion of a single-node neural mass model, both adaptive and population-based samplers are more efficient compared with random walk Metropolis sampler or slice-sampling; yet adaptive MCMC sampling is more promising in terms of compute time. Slice-sampling yields the highest number of independent samples from the target density -- albeit at almost 1000% increase in computational time, in comparisonmore » to the most efficient algorithm (i.e., the adaptive MCMC sampler).« less

  19. A random cluster survey and a convenience sample give comparable estimates of immunity to vaccine preventable diseases in children of school age in Victoria, Australia.

    PubMed

    Kelly, Heath; Riddell, Michaela A; Gidding, Heather F; Nolan, Terry; Gilbert, Gwendolyn L

    2002-08-19

    We compared estimates of the age-specific population immunity to measles, mumps, rubella, hepatitis B and varicella zoster viruses in Victorian school children obtained by a national sero-survey, using a convenience sample of residual sera from diagnostic laboratories throughout Australia, with those from a three-stage random cluster survey. When grouped according to school age (primary or secondary school) there was no significant difference in the estimates of immunity to measles, mumps, hepatitis B or varicella. Compared with the convenience sample, the random cluster survey estimated higher immunity to rubella in samples from both primary (98.7% versus 93.6%, P = 0.002) and secondary school students (98.4% versus 93.2%, P = 0.03). Despite some limitations, this study suggests that the collection of a convenience sample of sera from diagnostic laboratories is an appropriate sampling strategy to provide population immunity data that will inform Australia's current and future immunisation policies. Copyright 2002 Elsevier Science Ltd.

  20. Random sampling of elementary flux modes in large-scale metabolic networks.

    PubMed

    Machado, Daniel; Soons, Zita; Patil, Kiran Raosaheb; Ferreira, Eugénio C; Rocha, Isabel

    2012-09-15

    The description of a metabolic network in terms of elementary (flux) modes (EMs) provides an important framework for metabolic pathway analysis. However, their application to large networks has been hampered by the combinatorial explosion in the number of modes. In this work, we develop a method for generating random samples of EMs without computing the whole set. Our algorithm is an adaptation of the canonical basis approach, where we add an additional filtering step which, at each iteration, selects a random subset of the new combinations of modes. In order to obtain an unbiased sample, all candidates are assigned the same probability of getting selected. This approach avoids the exponential growth of the number of modes during computation, thus generating a random sample of the complete set of EMs within reasonable time. We generated samples of different sizes for a metabolic network of Escherichia coli, and observed that they preserve several properties of the full EM set. It is also shown that EM sampling can be used for rational strain design. A well distributed sample, that is representative of the complete set of EMs, should be suitable to most EM-based methods for analysis and optimization of metabolic networks. Source code for a cross-platform implementation in Python is freely available at http://code.google.com/p/emsampler. dmachado@deb.uminho.pt Supplementary data are available at Bioinformatics online.

  1. Statistical design and analysis of environmental studies for plutonium and other transuranics at NAEG ''safety-shot'' sites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, R.O.; Eberhardt, L.L.; Fowler, E.B.

    This paper is centered around the use of stratified random sampling for estimating the total amount (inventory) of $sup 239-240$Pu and uranium in surface soil at ten ''safety-shot'' sites on the Nevada Test Site (NTS) and Tonopah Test Range (TTR) that are currently being studied by the Nevada Applied Ecology Group (NAEG). The use of stratified random sampling has resulted in estimates of inventory at these desert study sites that have smaller standard errors than would have been the case had simple random sampling (no stratification) been used. Estimates of inventory are given for $sup 235$U, $sup 238$U, and $supmore » 239-240$Pu in soil at A Site of Area 11 on the NTS. Other results presented include average concentrations of one or more of these isotopes in soil and vegetation and in soil profile samples at depths to 25 cm. The regression relationship between soil and vegetation concentrations of $sup 235$U and $sup 238$U at adjacent sampling locations is also examined using three different models. The applicability of stratified random sampling to the estimation of concentration contours of $sup 239-240$Pu in surface soil using computer algorithms is also investigated. Estimates of such contours are obtained using several different methods. The planning of field sampling plans for estimating inventory and distribution is discussed. (auth)« less

  2. Describing Typical Capstone Course Experiences from a National Random Sample

    ERIC Educational Resources Information Center

    Grahe, Jon E.; Hauhart, Robert C.

    2013-01-01

    The pedagogical value of capstones has been regularly discussed within psychology. This study presents results from an examination of a national random sample of department webpages and an online survey that characterized the typical capstone course in terms of classroom activities and course administration. The department webpages provide an…

  3. Differential Cost Avoidance and Successful Criminal Careers: Random or Rational?

    ERIC Educational Resources Information Center

    Kazemian, Lila; Le Blanc, Marc

    2007-01-01

    Using a sample of adjudicated French Canadian males from the Montreal Two Samples Longitudinal Study, this article investigates individual and social characteristics associated with differential cost avoidance. The main objective of this study is to determine whether such traits are randomly distributed across differential degrees of cost…

  4. Aircraft adaptive learning control

    NASA Technical Reports Server (NTRS)

    Lee, P. S. T.; Vanlandingham, H. F.

    1979-01-01

    The optimal control theory of stochastic linear systems is discussed in terms of the advantages of distributed-control systems, and the control of randomly-sampled systems. An optimal solution to longitudinal control is derived and applied to the F-8 DFBW aircraft. A randomly-sampled linear process model with additive process and noise is developed.

  5. LOD score exclusion analyses for candidate genes using random population samples.

    PubMed

    Deng, H W; Li, J; Recker, R R

    2001-05-01

    While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes with random population samples. We develop a LOD score approach for exclusion analyses of candidate genes with random population samples. Under this approach, specific genetic effects and inheritance models at candidate genes can be analysed and if a LOD score is < or = - 2.0, the locus can be excluded from having an effect larger than that specified. Computer simulations show that, with sample sizes often employed in association studies, this approach has high power to exclude a gene from having moderate genetic effects. In contrast to regular association analyses, population admixture will not affect the robustness of our analyses; in fact, it renders our analyses more conservative and thus any significant exclusion result is robust. Our exclusion analysis complements association analysis for candidate genes in random population samples and is parallel to the exclusion mapping analyses that may be conducted in linkage analyses with pedigrees or relative pairs. The usefulness of the approach is demonstrated by an application to test the importance of vitamin D receptor and estrogen receptor genes underlying the differential risk to osteoporotic fractures.

  6. Essential energy space random walk via energy space metadynamics method to accelerate molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Li, Hongzhi; Min, Donghong; Liu, Yusong; Yang, Wei

    2007-09-01

    To overcome the possible pseudoergodicity problem, molecular dynamic simulation can be accelerated via the realization of an energy space random walk. To achieve this, a biased free energy function (BFEF) needs to be priori obtained. Although the quality of BFEF is essential for sampling efficiency, its generation is usually tedious and nontrivial. In this work, we present an energy space metadynamics algorithm to efficiently and robustly obtain BFEFs. Moreover, in order to deal with the associated diffusion sampling problem caused by the random walk in the total energy space, the idea in the original umbrella sampling method is generalized to be the random walk in the essential energy space, which only includes the energy terms determining the conformation of a region of interest. This essential energy space generalization allows the realization of efficient localized enhanced sampling and also offers the possibility of further sampling efficiency improvement when high frequency energy terms irrelevant to the target events are free of activation. The energy space metadynamics method and its generalization in the essential energy space for the molecular dynamics acceleration are demonstrated in the simulation of a pentanelike system, the blocked alanine dipeptide model, and the leucine model.

  7. Reaching Asian Americans: sampling strategies and incentives.

    PubMed

    Lee, Soo-Kyung; Cheng, Yu-Yao

    2006-07-01

    Reaching and recruiting representative samples of minority populations is often challenging. This study examined in Chinese and Korean Americans: 1) whether using two different sampling strategies (random sampling vs. convenience sampling) significantly affected characteristics of recruited participants and 2) whether providing different incentives in the mail survey produced different response rates. We found statistically significant, however mostly not remarkable, differences between random and convenience samples. Offering monetary incentives in the mail survey improved response rates among Chinese Americans, while offering a small gift did not improve response rates among either Chinese or Korean Americans. This information will be useful for researchers and practitioners working with Asian Americans.

  8. Bayesian estimation of Karhunen–Loève expansions; A random subspace approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chowdhary, Kenny; Najm, Habib N.

    One of the most widely-used statistical procedures for dimensionality reduction of high dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Lo eve expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L 2 sense, i.e, which minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition)more » on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build a probabilistic Karhunen-Lo eve expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite dimensional stochastic process inspired by Brownian motion.« less

  9. Bayesian estimation of Karhunen–Loève expansions; A random subspace approach

    DOE PAGES

    Chowdhary, Kenny; Najm, Habib N.

    2016-04-13

    One of the most widely-used statistical procedures for dimensionality reduction of high dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Lo eve expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L 2 sense, i.e, which minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition)more » on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build a probabilistic Karhunen-Lo eve expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite dimensional stochastic process inspired by Brownian motion.« less

  10. Sampling in epidemiological research: issues, hazards and pitfalls.

    PubMed

    Tyrer, Stephen; Heyman, Bob

    2016-04-01

    Surveys of people's opinions are fraught with difficulties. It is easier to obtain information from those who respond to text messages or to emails than to attempt to obtain a representative sample. Samples of the population that are selected non-randomly in this way are termed convenience samples as they are easy to recruit. This introduces a sampling bias. Such non-probability samples have merit in many situations, but an epidemiological enquiry is of little value unless a random sample is obtained. If a sufficient number of those selected actually complete a survey, the results are likely to be representative of the population. This editorial describes probability and non-probability sampling methods and illustrates the difficulties and suggested solutions in performing accurate epidemiological research.

  11. Sampling in epidemiological research: issues, hazards and pitfalls

    PubMed Central

    Tyrer, Stephen; Heyman, Bob

    2016-01-01

    Surveys of people's opinions are fraught with difficulties. It is easier to obtain information from those who respond to text messages or to emails than to attempt to obtain a representative sample. Samples of the population that are selected non-randomly in this way are termed convenience samples as they are easy to recruit. This introduces a sampling bias. Such non-probability samples have merit in many situations, but an epidemiological enquiry is of little value unless a random sample is obtained. If a sufficient number of those selected actually complete a survey, the results are likely to be representative of the population. This editorial describes probability and non-probability sampling methods and illustrates the difficulties and suggested solutions in performing accurate epidemiological research. PMID:27087985

  12. Investigation of spectral analysis techniques for randomly sampled velocimetry data

    NASA Technical Reports Server (NTRS)

    Sree, Dave

    1993-01-01

    It is well known that velocimetry (LV) generates individual realization velocity data that are randomly or unevenly sampled in time. Spectral analysis of such data to obtain the turbulence spectra, and hence turbulence scales information, requires special techniques. The 'slotting' technique of Mayo et al, also described by Roberts and Ajmani, and the 'Direct Transform' method of Gaster and Roberts are well known in the LV community. The slotting technique is faster than the direct transform method in computation. There are practical limitations, however, as to how a high frequency and accurate estimate can be made for a given mean sampling rate. These high frequency estimates are important in obtaining the microscale information of turbulence structure. It was found from previous studies that reliable spectral estimates can be made up to about the mean sampling frequency (mean data rate) or less. If the data were evenly samples, the frequency range would be half the sampling frequency (i.e. up to Nyquist frequency); otherwise, aliasing problem would occur. The mean data rate and the sample size (total number of points) basically limit the frequency range. Also, there are large variabilities or errors associated with the high frequency estimates from randomly sampled signals. Roberts and Ajmani proposed certain pre-filtering techniques to reduce these variabilities, but at the cost of low frequency estimates. The prefiltering acts as a high-pass filter. Further, Shapiro and Silverman showed theoretically that, for Poisson sampled signals, it is possible to obtain alias-free spectral estimates far beyond the mean sampling frequency. But the question is, how far? During his tenure under 1993 NASA-ASEE Summer Faculty Fellowship Program, the author investigated from his studies on the spectral analysis techniques for randomly sampled signals that the spectral estimates can be enhanced or improved up to about 4-5 times the mean sampling frequency by using a suitable prefiltering technique. But, this increased bandwidth comes at the cost of the lower frequency estimates. The studies further showed that large data sets of the order of 100,000 points, or more, high data rates, and Poisson sampling are very crucial for obtaining reliable spectral estimates from randomly sampled data, such as LV data. Some of the results of the current study are presented.

  13. Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much.

    PubMed

    He, Bryan; De Sa, Christopher; Mitliagkas, Ioannis; Ré, Christopher

    2016-01-01

    Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.

  14. Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much

    PubMed Central

    He, Bryan; De Sa, Christopher; Mitliagkas, Ioannis; Ré, Christopher

    2016-01-01

    Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance. PMID:28344429

  15. Estimating the encounter rate variance in distance sampling

    USGS Publications Warehouse

    Fewster, R.M.; Buckland, S.T.; Burnham, K.P.; Borchers, D.L.; Jupp, P.E.; Laake, J.L.; Thomas, L.

    2009-01-01

    The dominant source of variance in line transect sampling is usually the encounter rate variance. Systematic survey designs are often used to reduce the true variability among different realizations of the design, but estimating the variance is difficult and estimators typically approximate the variance by treating the design as a simple random sample of lines. We explore the properties of different encounter rate variance estimators under random and systematic designs. We show that a design-based variance estimator improves upon the model-based estimator of Buckland et al. (2001, Introduction to Distance Sampling. Oxford: Oxford University Press, p. 79) when transects are positioned at random. However, if populations exhibit strong spatial trends, both estimators can have substantial positive bias under systematic designs. We show that poststratification is effective in reducing this bias. ?? 2008, The International Biometric Society.

  16. Surface plasmon enhanced cell microscopy with blocked random spatial activation

    NASA Astrophysics Data System (ADS)

    Son, Taehwang; Oh, Youngjin; Lee, Wonju; Yang, Heejin; Kim, Donghyun

    2016-03-01

    We present surface plasmon enhanced fluorescence microscopy with random spatial sampling using patterned block of silver nanoislands. Rigorous coupled wave analysis was performed to confirm near-field localization on nanoislands. Random nanoislands were fabricated in silver by temperature annealing. By analyzing random near-field distribution, average size of localized fields was found to be on the order of 135 nm. Randomly localized near-fields were used to spatially sample F-actin of J774 cells (mouse macrophage cell-line). Image deconvolution algorithm based on linear imaging theory was established for stochastic estimation of fluorescent molecular distribution. The alignment between near-field distribution and raw image was performed by the patterned block. The achieved resolution is dependent upon factors including the size of localized fields and estimated to be 100-150 nm.

  17. Fast generation of sparse random kernel graphs

    DOE PAGES

    Hagberg, Aric; Lemons, Nathan; Du, Wen -Bo

    2015-09-10

    The development of kernel-based inhomogeneous random graphs has provided models that are flexible enough to capture many observed characteristics of real networks, and that are also mathematically tractable. We specify a class of inhomogeneous random graph models, called random kernel graphs, that produces sparse graphs with tunable graph properties, and we develop an efficient generation algorithm to sample random instances from this model. As real-world networks are usually large, it is essential that the run-time of generation algorithms scales better than quadratically in the number of vertices n. We show that for many practical kernels our algorithm runs in timemore » at most ο(n(logn)²). As an example, we show how to generate samples of power-law degree distribution graphs with tunable assortativity.« less

  18. Generalized and synthetic regression estimators for randomized branch sampling

    Treesearch

    David L. R. Affleck; Timothy G. Gregoire

    2015-01-01

    In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling (...

  19. A Semantic Differential Evaluation of Attitudinal Outcomes of Introductory Physical Science.

    ERIC Educational Resources Information Center

    Hecht, Alfred Roland

    This study was designed to assess the attitudinal outcomes of Introductory Physical Science (IPS) curriculum materials used in schools. Random samples of 240 students receiving IPS instruction and 240 non-science students were assigned to separate Solomon four-group designs with non-equivalent control groups. Random samples of 60 traditional…

  20. A Genetic Linkage Map of Longleaf Pine (Pinus palustris Mill.) Based on Random Amplified Polymorphic DNAs

    Treesearch

    C.D. Nelson; Thomas L. Kubisiak; M. Stine; W.L. Nance

    1994-01-01

    Eight megagametophyte DNA samples from a single longleaf pine (Pinus palustris Mill.) tree were used to screen 576 oligonucleotide primers for random amplified polymorphic DNA (RAPD) fragments. Primers amplifying repeatable polymorphic fragments were further characterized within a sample of 72 megagametophytes from the same tree. Fragments...

  1. Exploratory and Confirmatory Analysis of the Trauma Practices Questionnaire

    ERIC Educational Resources Information Center

    Craig, Carlton D.; Sprang, Ginny

    2009-01-01

    Objective: The present study provides psychometric data for the Trauma Practices Questionnaire (TPQ). Method: A nationally randomized sample of 2,400 surveys was sent to self-identified trauma treatment specialists, and 711 (29.6%) were returned. Results: An exploratory factor analysis (N = 319) conducted on a randomly split sample (RSS) revealed…

  2. P-Type Factor Analyses of Individuals' Thought Sampling Data.

    ERIC Educational Resources Information Center

    Hurlburt, Russell T.; Melancon, Susan M.

    Recently, interest in research measuring stream of consciousness or thought has increased. A study was conducted, based on a previous study by Hurlburt, Lech, and Saltman, in which subjects were randomly interrupted to rate their thoughts and moods on a Likert-type scale. Thought samples were collected from 27 subjects who carried random-tone…

  3. Emotional Intelligence and Life Adjustment for Nigerian Secondary Students

    ERIC Educational Resources Information Center

    Ogoemeka, Obioma Helen

    2013-01-01

    In the process of educating adolescents, good emotional development and life adjustment are two significant factors for teachers to know. This study employed random cluster sampling of senior secondary school students in Ondo and Oyo States in south-western Nigeria. The Random sampling was employed to select 1,070 students. The data collected were…

  4. 1977 Survey of the American Professoriate. Technical Report.

    ERIC Educational Resources Information Center

    Ladd, Everett Carll, Jr.; And Others

    The development and data validation of the 1977 Ladd-Lipset national survey of the American professoriate are described. The respondents were selected from a random sample of colleges and universities and from a random sample of individual faculty members from the universities. The 158 institutions in the 1977 survey were selected from 2,406…

  5. A cluster randomized trial of strategies to increase uptake amongst young women invited for their first cervical screen: The STRATEGIC trial.

    PubMed

    Kitchener, H; Gittins, M; Cruickshank, M; Moseley, C; Fletcher, S; Albrow, R; Gray, A; Brabin, L; Torgerson, D; Crosbie, E J; Sargent, A; Roberts, C

    2018-06-01

    Objectives To measure the feasibility and effectiveness of interventions to increase cervical screening uptake amongst young women. Methods A two-phase cluster randomized trial conducted in general practices in the NHS Cervical Screening Programme. In Phase 1, women in practices randomized to intervention due for their first invitation to cervical screening received a pre-invitation leaflet and, separately, access to online booking. In Phase 2, non-attenders at six months were randomized to one of: vaginal self-sample kits sent unrequested or offered; timed appointments; nurse navigator; or the choice between nurse navigator or self-sample kits. Primary outcome was uplift in intervention vs. control practices, at 3 and 12 months post invitation. Results Phase 1 randomized 20,879 women. Neither pre-invitation leaflet nor online booking increased screening uptake by three months (18.8% pre-invitation leaflet vs. 19.2% control and 17.8% online booking vs. 17.2% control). Uptake was higher amongst human papillomavirus vaccinees at three months (OR 2.07, 95% CI 1.69-2.53, p < 0.001). Phase 2 randomized 10,126 non-attenders, with 32-34 clusters for each intervention and 100 clusters as controls. Sending self-sample kits increased uptake at 12 months (OR 1.51, 95% CI 1.20-1.91, p = 0.001), as did timed appointments (OR 1.41, 95% CI 1.14-1.74, p = 0.001). The offer of a nurse navigator, a self-sample kits on request, and choice between timed appointments and nurse navigator were ineffective. Conclusions Amongst non-attenders, self-sample kits sent and timed appointments achieved an uplift in screening over the short term; longer term impact is less certain. Prior human papillomavirus vaccination was associated with increased screening uptake.

  6. Sampling Strategies and Processing of Biobank Tissue Samples from Porcine Biomedical Models.

    PubMed

    Blutke, Andreas; Wanke, Rüdiger

    2018-03-06

    In translational medical research, porcine models have steadily become more popular. Considering the high value of individual animals, particularly of genetically modified pig models, and the often-limited number of available animals of these models, establishment of (biobank) collections of adequately processed tissue samples suited for a broad spectrum of subsequent analyses methods, including analyses not specified at the time point of sampling, represent meaningful approaches to take full advantage of the translational value of the model. With respect to the peculiarities of porcine anatomy, comprehensive guidelines have recently been established for standardized generation of representative, high-quality samples from different porcine organs and tissues. These guidelines are essential prerequisites for the reproducibility of results and their comparability between different studies and investigators. The recording of basic data, such as organ weights and volumes, the determination of the sampling locations and of the numbers of tissue samples to be generated, as well as their orientation, size, processing and trimming directions, are relevant factors determining the generalizability and usability of the specimen for molecular, qualitative, and quantitative morphological analyses. Here, an illustrative, practical, step-by-step demonstration of the most important techniques for generation of representative, multi-purpose biobank specimen from porcine tissues is presented. The methods described here include determination of organ/tissue volumes and densities, the application of a volume-weighted systematic random sampling procedure for parenchymal organs by point-counting, determination of the extent of tissue shrinkage related to histological embedding of samples, and generation of randomly oriented samples for quantitative stereological analyses, such as isotropic uniform random (IUR) sections generated by the "Orientator" and "Isector" methods, and vertical uniform random (VUR) sections.

  7. Gradient-free MCMC methods for dynamic causal modelling.

    PubMed

    Sengupta, Biswa; Friston, Karl J; Penny, Will D

    2015-05-15

    In this technical note we compare the performance of four gradient-free MCMC samplers (random walk Metropolis sampling, slice-sampling, adaptive MCMC sampling and population-based MCMC sampling with tempering) in terms of the number of independent samples they can produce per unit computational time. For the Bayesian inversion of a single-node neural mass model, both adaptive and population-based samplers are more efficient compared with random walk Metropolis sampler or slice-sampling; yet adaptive MCMC sampling is more promising in terms of compute time. Slice-sampling yields the highest number of independent samples from the target density - albeit at almost 1000% increase in computational time, in comparison to the most efficient algorithm (i.e., the adaptive MCMC sampler). Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Binomial leap methods for simulating stochastic chemical kinetics.

    PubMed

    Tian, Tianhai; Burrage, Kevin

    2004-12-01

    This paper discusses efficient simulation methods for stochastic chemical kinetics. Based on the tau-leap and midpoint tau-leap methods of Gillespie [D. T. Gillespie, J. Chem. Phys. 115, 1716 (2001)], binomial random variables are used in these leap methods rather than Poisson random variables. The motivation for this approach is to improve the efficiency of the Poisson leap methods by using larger stepsizes. Unlike Poisson random variables whose range of sample values is from zero to infinity, binomial random variables have a finite range of sample values. This probabilistic property has been used to restrict possible reaction numbers and to avoid negative molecular numbers in stochastic simulations when larger stepsize is used. In this approach a binomial random variable is defined for a single reaction channel in order to keep the reaction number of this channel below the numbers of molecules that undergo this reaction channel. A sampling technique is also designed for the total reaction number of a reactant species that undergoes two or more reaction channels. Samples for the total reaction number are not greater than the molecular number of this species. In addition, probability properties of the binomial random variables provide stepsize conditions for restricting reaction numbers in a chosen time interval. These stepsize conditions are important properties of robust leap control strategies. Numerical results indicate that the proposed binomial leap methods can be applied to a wide range of chemical reaction systems with very good accuracy and significant improvement on efficiency over existing approaches. (c) 2004 American Institute of Physics.

  9. Diagnostic performance of random urine samples using albumin concentration vs ratio of albumin to creatinine for microalbuminuria screening in patients with diabetes mellitus: a systematic review and meta-analysis.

    PubMed

    Wu, Hon-Yen; Peng, Yu-Sen; Chiang, Chih-Kang; Huang, Jenq-Wen; Hung, Kuan-Yu; Wu, Kwan-Dun; Tu, Yu-Kang; Chien, Kuo-Liong

    2014-07-01

    A random urine sample measuring the albumin concentration (UAC) without simultaneously measuring the urinary creatinine is less expensive than measuring the ratio of albumin to creatinine (ACR), but comparisons of their diagnostic performance for microalbuminuria screening among patients with diabetes mellitus (DM) have not been undertaken in previous meta-analyses. To compare the diagnostic performance of the UAC vs the ACR in random urine samples for microalbuminuria screening among patients with DM. Electronic literature searches of PubMed, MEDLINE, and Scopus for English-language publications from the earliest available date of indexing through July 31, 2012. Clinical studies assessing the UAC or the ACR of random urine samples in detecting the presence of microalbuminuria among patients with DM using a urinary albumin excretion rate of 30 to 300 mg/d in 24-hour timed urine collections as the criterion standard. Bivariate random-effects models for analysis and pooling of the diagnostic performance measures across studies, as well as comparisons between different screening tests. The primary end point was the diagnostic performance measures of the UAC or the ACR in random urine samples, as well as comparisons between them. We identified 14 studies, with a total of 2078 patients; 9 studies reported on the UAC, and 12 studies reported on the ACR. Meta-analysis showed pooled sensitivities of 0.85 and 0.87 for the UAC and the ACR, respectively, and pooled specificities of 0.88 and 0.88, respectively. No differences in sensitivity (P = .70), specificity (P = .63), or diagnostic odds ratios (P = .59) between the UAC and the ACR were found. The time point of urine collection did not affect the diagnostic performance of either test. The UAC and the ACR yielded high sensitivity and specificity for the detection of microalbuminuria. Because the diagnostic performance of the UAC is comparable to that of the ACR, our findings indicate that the UAC of random urine samples may become the screening tool of choice for the population with DM, considering the rising incidence of DM and the constrained health care resources in many countries.

  10. Accelerated 1 H MRSI using randomly undersampled spiral-based k-space trajectories.

    PubMed

    Chatnuntawech, Itthi; Gagoski, Borjan; Bilgic, Berkin; Cauley, Stephen F; Setsompop, Kawin; Adalsteinsson, Elfar

    2014-07-30

    To develop and evaluate the performance of an acquisition and reconstruction method for accelerated MR spectroscopic imaging (MRSI) through undersampling of spiral trajectories. A randomly undersampled spiral acquisition and sensitivity encoding (SENSE) with total variation (TV) regularization, random SENSE+TV, is developed and evaluated on single-slice numerical phantom, in vivo single-slice MRSI, and in vivo three-dimensional (3D)-MRSI at 3 Tesla. Random SENSE+TV was compared with five alternative methods for accelerated MRSI. For the in vivo single-slice MRSI, random SENSE+TV yields up to 2.7 and 2 times reduction in root-mean-square error (RMSE) of reconstructed N-acetyl aspartate (NAA), creatine, and choline maps, compared with the denoised fully sampled and uniformly undersampled SENSE+TV methods with the same acquisition time, respectively. For the in vivo 3D-MRSI, random SENSE+TV yields up to 1.6 times reduction in RMSE, compared with uniform SENSE+TV. Furthermore, by using random SENSE+TV, we have demonstrated on the in vivo single-slice and 3D-MRSI that acceleration factors of 4.5 and 4 are achievable with the same quality as the fully sampled data, as measured by RMSE of reconstructed NAA map, respectively. With the same scan time, random SENSE+TV yields lower RMSEs of metabolite maps than other methods evaluated. Random SENSE+TV achieves up to 4.5-fold acceleration with comparable data quality as the fully sampled acquisition. Magn Reson Med, 2014. © 2014 Wiley Periodicals, Inc. © 2014 Wiley Periodicals, Inc.

  11. Quantitative comparison of randomization designs in sequential clinical trials based on treatment balance and allocation randomness.

    PubMed

    Zhao, Wenle; Weng, Yanqiu; Wu, Qi; Palesch, Yuko

    2012-01-01

    To evaluate the performance of randomization designs under various parameter settings and trial sample sizes, and identify optimal designs with respect to both treatment imbalance and allocation randomness, we evaluate 260 design scenarios from 14 randomization designs under 15 sample sizes range from 10 to 300, using three measures for imbalance and three measures for randomness. The maximum absolute imbalance and the correct guess (CG) probability are selected to assess the trade-off performance of each randomization design. As measured by the maximum absolute imbalance and the CG probability, we found that performances of the 14 randomization designs are located in a closed region with the upper boundary (worst case) given by Efron's biased coin design (BCD) and the lower boundary (best case) from the Soares and Wu's big stick design (BSD). Designs close to the lower boundary provide a smaller imbalance and a higher randomness than designs close to the upper boundary. Our research suggested that optimization of randomization design is possible based on quantified evaluation of imbalance and randomness. Based on the maximum imbalance and CG probability, the BSD, Chen's biased coin design with imbalance tolerance method, and Chen's Ehrenfest urn design perform better than popularly used permuted block design, EBCD, and Wei's urn design. Copyright © 2011 John Wiley & Sons, Ltd.

  12. A nonparametric method to generate synthetic populations to adjust for complex sampling design features.

    PubMed

    Dong, Qi; Elliott, Michael R; Raghunathan, Trivellore E

    2014-06-01

    Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to develop the statistical methods to analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, making adjustment on the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered unequal-probability of selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS), and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered unequal-probability of selection sample designs.

  13. A nonparametric method to generate synthetic populations to adjust for complex sampling design features

    PubMed Central

    Dong, Qi; Elliott, Michael R.; Raghunathan, Trivellore E.

    2017-01-01

    Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to develop the statistical methods to analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, making adjustment on the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered unequal-probability of selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS), and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered unequal-probability of selection sample designs. PMID:29200608

  14. Random sampling and validation of covariance matrices of resonance parameters

    NASA Astrophysics Data System (ADS)

    Plevnik, Lucijan; Zerovnik, Gašper

    2017-09-01

    Analytically exact methods for random sampling of arbitrary correlated parameters are presented. Emphasis is given on one hand on the possible inconsistencies in the covariance data, concentrating on the positive semi-definiteness and consistent sampling of correlated inherently positive parameters, and on the other hand on optimization of the implementation of the methods itself. The methods have been applied in the program ENDSAM, written in the Fortran language, which from a file from a nuclear data library of a chosen isotope in ENDF-6 format produces an arbitrary number of new files in ENDF-6 format which contain values of random samples of resonance parameters (in accordance with corresponding covariance matrices) in places of original values. The source code for the program ENDSAM is available from the OECD/NEA Data Bank. The program works in the following steps: reads resonance parameters and their covariance data from nuclear data library, checks whether the covariance data is consistent, and produces random samples of resonance parameters. The code has been validated with both realistic and artificial data to show that the produced samples are statistically consistent. Additionally, the code was used to validate covariance data in existing nuclear data libraries. A list of inconsistencies, observed in covariance data of resonance parameters in ENDF-VII.1, JEFF-3.2 and JENDL-4.0 is presented. For now, the work has been limited to resonance parameters, however the methods presented are general and can in principle be extended to sampling and validation of any nuclear data.

  15. Experimental Investigations of Non-Stationary Properties In Radiometer Receivers Using Measurements of Multiple Calibration References

    NASA Technical Reports Server (NTRS)

    Racette, Paul; Lang, Roger; Zhang, Zhao-Nan; Zacharias, David; Krebs, Carolyn A. (Technical Monitor)

    2002-01-01

    Radiometers must be periodically calibrated because the receiver response fluctuates. Many techniques exist to correct for the time varying response of a radiometer receiver. An analytical technique has been developed that uses generalized least squares regression (LSR) to predict the performance of a wide variety of calibration algorithms. The total measurement uncertainty including the uncertainty of the calibration can be computed using LSR. The uncertainties of the calibration samples used in the regression are based upon treating the receiver fluctuations as non-stationary processes. Signals originating from the different sources of emission are treated as simultaneously existing random processes. Thus, the radiometer output is a series of samples obtained from these random processes. The samples are treated as random variables but because the underlying processes are non-stationary the statistics of the samples are treated as non-stationary. The statistics of the calibration samples depend upon the time for which the samples are to be applied. The statistics of the random variables are equated to the mean statistics of the non-stationary processes over the interval defined by the time of calibration sample and when it is applied. This analysis opens the opportunity for experimental investigation into the underlying properties of receiver non stationarity through the use of multiple calibration references. In this presentation we will discuss the application of LSR to the analysis of various calibration algorithms, requirements for experimental verification of the theory, and preliminary results from analyzing experiment measurements.

  16. Testing how voluntary participation requirements in an environmental study affect the planned random sample design outcomes: implications for the predictions of values and their uncertainty.

    NASA Astrophysics Data System (ADS)

    Ander, Louise; Lark, Murray; Smedley, Pauline; Watts, Michael; Hamilton, Elliott; Fletcher, Tony; Crabbe, Helen; Close, Rebecca; Studden, Mike; Leonardi, Giovanni

    2015-04-01

    Random sampling design is optimal in order to be able to assess outcomes, such as the mean of a given variable across an area. However, this optimal sampling design may be compromised to an unknown extent by unavoidable real-world factors: the extent to which the study design can still be considered random, and the influence this may have on the choice of appropriate statistical data analysis is examined in this work. We take a study which relied on voluntary participation for the sampling of private water tap chemical composition in England, UK. This study was designed and implemented as a categorical, randomised study. The local geological classes were grouped into 10 types, which were considered to be most important in likely effects on groundwater chemistry (the source of all the tap waters sampled). Locations of the users of private water supplies were made available to the study group from the Local Authority in the area. These were then assigned, based on location, to geological groups 1 to 10 and randomised within each group. However, the permission to collect samples then required active, voluntary participation by householders and thus, unlike many environmental studies, could not always follow the initial sample design. Impediments to participation ranged from 'willing but not available' during the designated sampling period, to a lack of response to requests to sample (assumed to be wholly unwilling or unable to participate). Additionally, a small number of unplanned samples were collected via new participants making themselves known to the sampling teams, during the sampling period. Here we examine the impact this has on the 'random' nature of the resulting data distribution, by comparison with the non-participating known supplies. We consider the implications this has on choice of statistical analysis methods to predict values and uncertainty at un-sampled locations.

  17. The use of single-date MODIS imagery for estimating large-scale urban impervious surface fraction with spectral mixture analysis and machine learning techniques

    NASA Astrophysics Data System (ADS)

    Deng, Chengbin; Wu, Changshan

    2013-12-01

    Urban impervious surface information is essential for urban and environmental applications at the regional/national scales. As a popular image processing technique, spectral mixture analysis (SMA) has rarely been applied to coarse-resolution imagery due to the difficulty of deriving endmember spectra using traditional endmember selection methods, particularly within heterogeneous urban environments. To address this problem, we derived endmember signatures through a least squares solution (LSS) technique with known abundances of sample pixels, and integrated these endmember signatures into SMA for mapping large-scale impervious surface fraction. In addition, with the same sample set, we carried out objective comparative analyses among SMA (i.e. fully constrained and unconstrained SMA) and machine learning (i.e. Cubist regression tree and Random Forests) techniques. Analysis of results suggests three major conclusions. First, with the extrapolated endmember spectra from stratified random training samples, the SMA approaches performed relatively well, as indicated by small MAE values. Second, Random Forests yields more reliable results than Cubist regression tree, and its accuracy is improved with increased sample sizes. Finally, comparative analyses suggest a tentative guide for selecting an optimal approach for large-scale fractional imperviousness estimation: unconstrained SMA might be a favorable option with a small number of samples, while Random Forests might be preferred if a large number of samples are available.

  18. GEOSTATISTICAL SAMPLING DESIGNS FOR HAZARDOUS WASTE SITES

    EPA Science Inventory

    This chapter discusses field sampling design for environmental sites and hazardous waste sites with respect to random variable sampling theory, Gy's sampling theory, and geostatistical (kriging) sampling theory. The literature often presents these sampling methods as an adversari...

  19. The Bootstrap, the Jackknife, and the Randomization Test: A Sampling Taxonomy.

    PubMed

    Rodgers, J L

    1999-10-01

    A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test. Each method has as its goal the creation of an empirical sampling distribution that can be used to test statistical hypotheses, estimate standard errors, and/or create confidence intervals. Distinctions between the methods can be made based on the sampling approach (with replacement versus without replacement) and the sample size (replacing the whole original sample versus replacing a subset of the original sample). The taxonomy is useful for teaching the goals and purposes of resampling schemes. An extension of the taxonomy implies other possible resampling approaches that have not previously been considered. Univariate and multivariate examples are presented.

  20. Urine sampling techniques in symptomatic primary-care patients: a diagnostic accuracy review.

    PubMed

    Holm, Anne; Aabenhus, Rune

    2016-06-08

    Choice of urine sampling technique in urinary tract infection may impact diagnostic accuracy and thus lead to possible over- or undertreatment. Currently no evidencebased consensus exists regarding correct sampling technique of urine from women with symptoms of urinary tract infection in primary care. The aim of this study was to determine the accuracy of urine culture from different sampling-techniques in symptomatic non-pregnant women in primary care. A systematic review was conducted by searching Medline and Embase for clinical studies conducted in primary care using a randomized or paired design to compare the result of urine culture obtained with two or more collection techniques in adult, female, non-pregnant patients with symptoms of urinary tract infection. We evaluated quality of the studies and compared accuracy based on dichotomized outcomes. We included seven studies investigating urine sampling technique in 1062 symptomatic patients in primary care. Mid-stream-clean-catch had a positive predictive value of 0.79 to 0.95 and a negative predictive value close to 1 compared to sterile techniques. Two randomized controlled trials found no difference in infection rate between mid-stream-clean-catch, mid-stream-urine and random samples. At present, no evidence suggests that sampling technique affects the accuracy of the microbiological diagnosis in non-pregnant women with symptoms of urinary tract infection in primary care. However, the evidence presented is in-direct and the difference between mid-stream-clean-catch, mid-stream-urine and random samples remains to be investigated in a paired design to verify the present findings.

  1. Sparse sampling and reconstruction for electron and scanning probe microscope imaging

    DOEpatents

    Anderson, Hyrum; Helms, Jovana; Wheeler, Jason W.; Larson, Kurt W.; Rohrer, Brandon R.

    2015-07-28

    Systems and methods for conducting electron or scanning probe microscopy are provided herein. In a general embodiment, the systems and methods for conducting electron or scanning probe microscopy with an undersampled data set include: driving an electron beam or probe to scan across a sample and visit a subset of pixel locations of the sample that are randomly or pseudo-randomly designated; determining actual pixel locations on the sample that are visited by the electron beam or probe; and processing data collected by detectors from the visits of the electron beam or probe at the actual pixel locations and recovering a reconstructed image of the sample.

  2. Attenuation of species abundance distributions by sampling

    PubMed Central

    Shimadzu, Hideyasu; Darnell, Ross

    2015-01-01

    Quantifying biodiversity aspects such as species presence/ absence, richness and abundance is an important challenge to answer scientific and resource management questions. In practice, biodiversity can only be assessed from biological material taken by surveys, a difficult task given limited time and resources. A type of random sampling, or often called sub-sampling, is a commonly used technique to reduce the amount of time and effort for investigating large quantities of biological samples. However, it is not immediately clear how (sub-)sampling affects the estimate of biodiversity aspects from a quantitative perspective. This paper specifies the effect of (sub-)sampling as attenuation of the species abundance distribution (SAD), and articulates how the sampling bias is induced to the SAD by random sampling. The framework presented also reveals some confusion in previous theoretical studies. PMID:26064626

  3. Assessing the use of existing data to compare plains fish assemblages collected from random and fixed sites in Colorado

    USGS Publications Warehouse

    Zuellig, Robert E.; Crockett, Harry J.

    2013-01-01

    The U.S. Geological Survey, in cooperation with Colorado Parks and Wildlife, assessed the potential use of combining recently (2007 to 2010) and formerly (1992 to 1996) collected data to compare plains fish assemblages sampled from random and fixed sites located in the South Platte and Arkansas River Basins in Colorado. The first step was to determine if fish assemblages collected between 1992 and 1996 were comparable to samples collected at the same sites between 2007 and 2010. If samples from the two time periods were comparable, then it was considered reasonable that the combined time-period data could be used to make comparisons between random and fixed sites. In contrast, if differences were found between the two time periods, then it was considered unreasonable to use these data to make comparisons between random and fixed sites. One-hundred samples collected during the 1990s and 2000s from 50 sites dispersed among 19 streams in both basins were compiled from a database maintained by Colorado Parks and Wildlife. Nonparametric multivariate two-way analysis of similarities was used to test for fish-assemblage differences between time periods while accounting for stream-to-stream differences. Results indicated relatively weak but significant time-period differences in fish assemblages. Weak time-period differences in this case possibly were related to changes in fish assemblages associated with environmental factors; however, it is difficult to separate other possible explanations such as limited replication of paired time-period samples in many of the streams or perhaps differences in sampling efficiency and effort between the time periods. Regardless, using the 1990s data to fill data gaps to compare random and fixed-site fish-assemblage data is ill advised based on the significant separation in fish assemblages between time periods and the inability to determine conclusive explanations for these results. These findings indicated that additional sampling will be necessary before unbiased comparisons can be made between fish assemblages collected from random and fixed sites in the South Platte and Arkansas River Basins.

  4. Random forests ensemble classifier trained with data resampling strategy to improve cardiac arrhythmia diagnosis.

    PubMed

    Ozçift, Akin

    2011-05-01

    Supervised classification algorithms are commonly used in the designing of computer-aided diagnosis systems. In this study, we present a resampling strategy based Random Forests (RF) ensemble classifier to improve diagnosis of cardiac arrhythmia. Random forests is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the class's output by individual trees. In this way, an RF ensemble classifier performs better than a single tree from classification performance point of view. In general, multiclass datasets having unbalanced distribution of sample sizes are difficult to analyze in terms of class discrimination. Cardiac arrhythmia is such a dataset that has multiple classes with small sample sizes and it is therefore adequate to test our resampling based training strategy. The dataset contains 452 samples in fourteen types of arrhythmias and eleven of these classes have sample sizes less than 15. Our diagnosis strategy consists of two parts: (i) a correlation based feature selection algorithm is used to select relevant features from cardiac arrhythmia dataset. (ii) RF machine learning algorithm is used to evaluate the performance of selected features with and without simple random sampling to evaluate the efficiency of proposed training strategy. The resultant accuracy of the classifier is found to be 90.0% and this is a quite high diagnosis performance for cardiac arrhythmia. Furthermore, three case studies, i.e., thyroid, cardiotocography and audiology, are used to benchmark the effectiveness of the proposed method. The results of experiments demonstrated the efficiency of random sampling strategy in training RF ensemble classification algorithm. Copyright © 2011 Elsevier Ltd. All rights reserved.

  5. Report for Colorado: Background & Visuals, Math 2005. The Nation's Report Card

    ERIC Educational Resources Information Center

    Sandoval, Pam A.

    2005-01-01

    The National Assessment of Educational Progress (NAEP) 2005 assessment was administered to a stratified random sample of fourth-, eighth-, and twelfth-graders at the national level and to a stratified random sample of fourth- and eighth-graders at the state level. The Mathematics Framework for NAEP was revised in 1996 and again in 2005. The new…

  6. Estimation of population mean in the presence of measurement error and non response under stratified random sampling

    PubMed Central

    Shabbir, Javid

    2018-01-01

    In the present paper we propose an improved class of estimators in the presence of measurement error and non-response under stratified random sampling for estimating the finite population mean. The theoretical and numerical studies reveal that the proposed class of estimators performs better than other existing estimators. PMID:29401519

  7. The Dirichlet-Multinomial Model for Multivariate Randomized Response Data and Small Samples

    ERIC Educational Resources Information Center

    Avetisyan, Marianna; Fox, Jean-Paul

    2012-01-01

    In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The beta-binomial model for binary RR data will be generalized…

  8. Nonmanufacturing Businesses. U.S. Metric Study Interim Report.

    ERIC Educational Resources Information Center

    Cornog, June R.; Bunten, Elaine D.

    In this fifth interim report on the feasibility of a United States changeover to a metric system stems from the U.S. Metric Study, a primary stratified sample of 2,828 nonmanufacturing firms was randomly selected from 28,184 businesses taken from Social Security files, a secondary sample of 2,258 firms was randomly selected for replacement…

  9. Reinforcing Sampling Distributions through a Randomization-Based Activity for Introducing ANOVA

    ERIC Educational Resources Information Center

    Taylor, Laura; Doehler, Kirsten

    2015-01-01

    This paper examines the use of a randomization-based activity to introduce the ANOVA F-test to students. The two main goals of this activity are to successfully teach students to comprehend ANOVA F-tests and to increase student comprehension of sampling distributions. Four sections of students in an advanced introductory statistics course…

  10. A Nationwide Random Sampling Survey of Potential Complicated Grief in Japan

    ERIC Educational Resources Information Center

    Mizuno, Yasunao; Kishimoto, Junji; Asukai, Nozomu

    2012-01-01

    To investigate the prevalence of significant loss, potential complicated grief (CG), and its contributing factors, we conducted a nationwide random sampling survey of Japanese adults aged 18 or older (N = 1,343) using a self-rating Japanese-language version of the Complicated Grief Brief Screen. Among them, 37.0% experienced their most significant…

  11. The Effect of Departmentalization on the Reading Achievement of Sixth-Grade Students.

    ERIC Educational Resources Information Center

    Harris, Mary B.

    A study examined whether departmentalization affected the reading achievement of sixth-grade students attending a Chicago public school. A random sample of 30 students was chosen from a group of 53 who received instruction in a departmentalized program. A second random sample of 30 was selected from a total of 54 students who received instruction…

  12. A Model for Predicting Behavioural Sleep Problems in a Random Sample of Australian Pre-Schoolers

    ERIC Educational Resources Information Center

    Hall, Wendy A.; Zubrick, Stephen R.; Silburn, Sven R.; Parsons, Deborah E.; Kurinczuk, Jennifer J.

    2007-01-01

    Behavioural sleep problems (childhood insomnias) can cause distress for both parents and children. This paper reports a model describing predictors of high sleep problem scores in a representative population-based random sample survey of non-Aboriginal singleton children born in 1995 and 1996 (1085 girls and 1129 boys) in Western Australia.…

  13. The Evaluation of Bias of the Weighted Random Effects Model Estimators. Research Report. ETS RR-11-13

    ERIC Educational Resources Information Center

    Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan

    2011-01-01

    Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have been proposed various weighting methods (Korn, & Graubard, 2003; Pfeffermann, Skinner,…

  14. The Relationship between Teachers Commitment and Female Students Academic Achievements in Some Selected Secondary School in Wolaita Zone, Southern Ethiopia

    ERIC Educational Resources Information Center

    Bibiso, Abyot; Olango, Menna; Bibiso, Mesfin

    2017-01-01

    The purpose of this study was to investigate the relationship between teacher's commitment and female students academic achievement in selected secondary school of Wolaita zone, Southern Ethiopia. The research method employed was survey study and the sampling techniques were purposive, simple random and stratified random sampling. Questionnaire…

  15. Randomization of grab-sampling strategies for estimating the annual exposure of U miners to Rn daughters.

    PubMed

    Borak, T B

    1986-04-01

    Periodic grab sampling in combination with time-of-occupancy surveys has been the accepted procedure for estimating the annual exposure of underground U miners to Rn daughters. Temporal variations in the concentration of potential alpha energy in the mine generate uncertainties in this process. A system to randomize the selection of locations for measurement is described which can reduce uncertainties and eliminate systematic biases in the data. In general, a sample frequency of 50 measurements per year is sufficient to satisfy the criteria that the annual exposure be determined in working level months to within +/- 50% of the true value with a 95% level of confidence. Suggestions for implementing this randomization scheme are presented.

  16. Effect of alignment of easy axes on dynamic magnetization of immobilized magnetic nanoparticles

    NASA Astrophysics Data System (ADS)

    Yoshida, Takashi; Matsugi, Yuki; Tsujimura, Naotaka; Sasayama, Teruyoshi; Enpuku, Keiji; Viereck, Thilo; Schilling, Meinhard; Ludwig, Frank

    2017-04-01

    In some biomedical applications of magnetic nanoparticles (MNPs), the particles are physically immobilized. In this study, we explore the effect of the alignment of the magnetic easy axes on the dynamic magnetization of immobilized MNPs under an AC excitation field. We prepared three immobilized MNP samples: (1) a sample in which easy axes are randomly oriented, (2) a parallel-aligned sample in which easy axes are parallel to the AC field, and (3) an orthogonally aligned sample in which easy axes are perpendicular to the AC field. First, we show that the parallel-aligned sample has the largest hysteresis in the magnetization curve and the largest harmonic magnetization spectra, followed by the randomly oriented and orthogonally aligned samples. For example, 1.6-fold increase was observed in the area of the hysteresis loop of the parallel-aligned sample compared to that of the randomly oriented sample. To quantitatively discuss the experimental results, we perform a numerical simulation based on a Fokker-Planck equation, in which probability distributions for the directions of the easy axes are taken into account in simulating the prepared MNP samples. We obtained quantitative agreement between experiment and simulation. These results indicate that the dynamic magnetization of immobilized MNPs is significantly affected by the alignment of the easy axes.

  17. Lensless digital holography with diffuse illumination through a pseudo-random phase mask.

    PubMed

    Bernet, Stefan; Harm, Walter; Jesacher, Alexander; Ritsch-Marte, Monika

    2011-12-05

    Microscopic imaging with a setup consisting of a pseudo-random phase mask, and an open CMOS camera, without an imaging objective, is demonstrated. The pseudo random phase mask acts as a diffuser for an incoming laser beam, scattering a speckle pattern to a CMOS chip, which is recorded once as a reference. A sample which is afterwards inserted somewhere in the optical beam path changes the speckle pattern. A single (non-iterative) image processing step, comparing the modified speckle pattern with the previously recorded one, generates a sharp image of the sample. After a first calibration the method works in real-time and allows quantitative imaging of complex (amplitude and phase) samples in an extended three-dimensional volume. Since no lenses are used, the method is free from lens abberations. Compared to standard inline holography the diffuse sample illumination improves the axial sectioning capability by increasing the effective numerical aperture in the illumination path, and it suppresses the undesired so-called twin images. For demonstration, a high resolution spatial light modulator (SLM) is programmed to act as the pseudo-random phase mask. We show experimental results, imaging microscopic biological samples, e.g. insects, within an extended volume at a distance of 15 cm with a transverse and longitudinal resolution of about 60 μm and 400 μm, respectively.

  18. A comparison of two sampling designs for fish assemblage assessment in a large river

    USGS Publications Warehouse

    Kiraly, Ian A.; Coghlan, Stephen M.; Zydlewski, Joseph D.; Hayes, Daniel

    2014-01-01

    We compared the efficiency of stratified random and fixed-station sampling designs to characterize fish assemblages in anticipation of dam removal on the Penobscot River, the largest river in Maine. We used boat electrofishing methods in both sampling designs. Multiple 500-m transects were selected randomly and electrofished in each of nine strata within the stratified random sampling design. Within the fixed-station design, up to 11 transects (1,000 m) were electrofished, all of which had been sampled previously. In total, 88 km of shoreline were electrofished during summer and fall in 2010 and 2011, and 45,874 individuals of 34 fish species were captured. Species-accumulation and dissimilarity curve analyses indicated that all sampling effort, other than fall 2011 under the fixed-station design, provided repeatable estimates of total species richness and proportional abundances. Overall, our sampling designs were similar in precision and efficiency for sampling fish assemblages. The fixed-station design was negatively biased for estimating the abundance of species such as Common Shiner Luxilus cornutus and Fallfish Semotilus corporalis and was positively biased for estimating biomass for species such as White Sucker Catostomus commersonii and Atlantic Salmon Salmo salar. However, we found no significant differences between the designs for proportional catch and biomass per unit effort, except in fall 2011. The difference observed in fall 2011 was due to limitations on the number and location of fixed sites that could be sampled, rather than an inherent bias within the design. Given the results from sampling in the Penobscot River, application of the stratified random design is preferable to the fixed-station design due to less potential for bias caused by varying sampling effort, such as what occurred in the fall 2011 fixed-station sample or due to purposeful site selection.

  19. Response Rates in Random-Digit-Dialed Telephone Surveys: Estimation vs. Measurement.

    ERIC Educational Resources Information Center

    Franz, Jennifer D.

    The efficacy of the random digit dialing method in telephone surveys was examined. Random digit dialing (RDD) generates a pure random sample and provides the advantage of including unlisted phone numbers, as well as numbers which are too new to be listed. Its disadvantage is that it generates a major proportion of nonworking and business…

  20. Lead as a legendary pollutant with emerging concern: Survey of lead in tap water in an old campus building using four sampling methods.

    PubMed

    Ng, Ding-Quan; Liu, Shu-Wei; Lin, Yi-Pin

    2018-09-15

    In this study, a sampling campaign with a total of nine sampling events investigating lead in drinking water was conducted at 7 sampling locations in an old building with lead pipes in service in part of the building on the National Taiwan University campus. This study aims to assess the effectiveness of four different sampling methods, namely first draw sampling, sequential sampling, random daytime sampling and flush sampling, in lead contamination detection. In 3 out of the 7 sampling locations without lead pipe, lead could not be detected (<1.1 μg/L) in most samples regardless of the sampling methods. On the other hand, in the 4 sampling locations where lead pipes still existed, total lead concentrations >10 μg/L were consistently observed in 3 locations using any of the four sampling methods while the remaining location was identified to be contaminated using sequential sampling. High lead levels were consistently measured by the four sampling methods in the 3 locations in which particulate lead was either predominant or comparable to soluble lead. Compared to first draw and random daytime samplings, although flush sampling had a high tendency to reduce total lead in samples in lead-contaminated sites, the extent of lead reduction was location-dependent and not dependent on flush durations between 5 and 10 min. Overall, first draw sampling and random daytime sampling were reliable and effective in determining lead contamination in this study. Flush sampling could reveal the contamination if the extent is severe but tends to underestimate lead exposure risk. Copyright © 2018 Elsevier B.V. All rights reserved.

  1. Ground-water quality data in the north San Francisco Bay hydrologic provinces, California, 2004: Results from the California Ground-water Ambient Monitoring and Assessment (GAMA) program

    USGS Publications Warehouse

    Kulongoski, Justin T.; Belitz, Kenneth; Dawson, Barbara J.

    2006-01-01

    Ground-water samples were analyzed for major and minor ions, trace elements, nutrients, volatile organic compounds, pesticides and pesticide degradates, waste-water indicators, dissolved methane, nitrogen, carbon dioxide and noble gases (in collaboration with Lawrence Livermore National Laboratory). Naturally occurring isotopes (tritium, carbon-14, oxygen-18, deuterium and helium-4) also were measured in the samples to help identify the source and age of the ground water. Results show that no anthropogenic constituents were detected at concentrations higher than those levels set for regulatory purposes, and relatively few naturally-occurring constituents were detected at concentrations greater than regulatory levels. In this study, 21 of the 88 volatile organic compounds (VOCs) and gasoline additives and (or) oxygenates investigated were detected in ground-water samples, however, detected concentrations were one-half to one-forty-thousandth the maximum contaminant levels (MCL). Thirty-two percent of the randomized wells sampled had at least a single detection of a VOC or gasoline additive and (or) oxygenate. The most frequently detected compounds were chloroform, found in 12 of the 84 randomized wells; carbon disulfide, found in 8 of the 84 randomized wells; and toluene, found in 4 of the 84 randomized wells. Trihalomethanes were the most frequently detected class of VOCs. Nine of the 122 pesticides and (or) pesticide degradates investigated were detected in ground-water samples, however, concentrations were one-seventieth to one-eight-hundredth the MCLs. Seventeen percent of the randomized wells sampled had at least a single detection of pesticide and pesticide degradate. Herbicides were the most frequently detected class of pesticides. The most frequently detected compound was simazine, found in 8 of the 84 of the randomized wells. Chlordiamino-s-triazine and deisopropyl atrazine were both found in 2 of the 84 randomized wells sampled. Thirteen out of 63 compounds that may be indicative of the prescence of waste-water were detected in ground-water samples. Twenty-six percent of the randomized wells sampled for waste-water indicators had at least one detection. Isophorone was the most frequently detected in 6 of the 84 randomized wells. Bisphenol-A, caffeine, and indole each were detected in 3 of the 84 randomized wells. Major and minor ions and dissolved solids (DS) samples were collected at 33 public-supply wells; 3 samples had DS concentrations above the secondary maximum contaminant level (SMCL) of 500 mg/L. Ground-water samples from 32 public-supply wells were analyzed for trace elements. Arsenic concentrations above the MCL of 10 μg/L were measured at 4 public-supply wells, boron concentrations above the detection level for the purpose of reporting (DLR) of 100 μg/L were measured at 19 wells. Iron concentrations above the SMCL of 300 μg/L were measured at 7 wells, a lead concentration above the California notification level (NL) of 15 μg/L at one well, and manganese concentrations above the SMCL of 50 μg/L were measured at 17 wells. Vanadium concentrations above the DLR of 3 μg/L were measured at 9 public-supply wells; and chromium(VI) concentrations above the DLR of 1 μg/L were measured at 48 public-supply wells. Major and minor ions and dissolved solids (DS) samples were collected at 33 public-supply wells; 3 samples had DS concentrations above the secondary maximum contaminant level (SMCL) of 500 mg/L. Ground-water samples from 32 public-supply wells were analyzed for trace elements. Arsenic concentrations above the MCL of 10 μg/L were measured at 4 public-supply wells, boron concentrations above the detection level for the purpose of reporting (DLR) of 100 μg/L were measured at 19 wells. Iron concentrations above the SMCL of 300 μg/L were measured at 7 wells, a lead concentration above the California notification level (NL) of 15 μg/L at one well, and manganese concentrations above the SMCL of 50 μg/L were measured at 17 wells. Vanadium concentrations above the DLR of 3 μg/L were measured at 9 public-supply wells; and chromium(VI) concentrations above the DLR of 1 μg/L were measured at 48 public-supply wells. Microbial constituents were analyzed in 22 ground-water samples. Total coliform was detected in three wells. Counts ranged from 2 colonies per 100 mL to 20 colonies per 100 mL. MCLs for microbial constituents are based on reoccurring detection, and will be monitored during future sampling.

  2. Evaluation and optimization of sampling errors for the Monte Carlo Independent Column Approximation

    NASA Astrophysics Data System (ADS)

    Räisänen, Petri; Barker, W. Howard

    2004-07-01

    The Monte Carlo Independent Column Approximation (McICA) method for computing domain-average broadband radiative fluxes is unbiased with respect to the full ICA, but its flux estimates contain conditional random noise. McICA's sampling errors are evaluated here using a global climate model (GCM) dataset and a correlated-k distribution (CKD) radiation scheme. Two approaches to reduce McICA's sampling variance are discussed. The first is to simply restrict all of McICA's samples to cloudy regions. This avoids wasting precious few samples on essentially homogeneous clear skies. Clear-sky fluxes need to be computed separately for this approach, but this is usually done in GCMs for diagnostic purposes anyway. Second, accuracy can be improved by repeated sampling, and averaging those CKD terms with large cloud radiative effects. Although this naturally increases computational costs over the standard CKD model, random errors for fluxes and heating rates are reduced by typically 50% to 60%, for the present radiation code, when the total number of samples is increased by 50%. When both variance reduction techniques are applied simultaneously, globally averaged flux and heating rate random errors are reduced by a factor of #3.

  3. Creating ensembles of decision trees through sampling

    DOEpatents

    Kamath, Chandrika; Cantu-Paz, Erick

    2005-08-30

    A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.

  4. Characteristics of a random sample of emergency food program users in New York: II. Soup kitchens.

    PubMed Central

    Bowering, J; Clancy, K L; Poppendieck, J

    1991-01-01

    A random sample of soup kitchen clients in New York City was studied and specific comparisons made on various parameters including homelessness. Compared with the general population of low income persons, soup kitchen users were overwhelmingly male, disproportionately African-American, and more likely to live alone. The homeless (41 percent of the sample) were less likely to receive food stamps or free food, or to use food pantries. Fewer of them received Medicaid or had health insurance. Forty-seven percent had no income in contrast to 29 percent of the total sample. PMID:2053673

  5. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    PubMed

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Due to large exploration of uncharacterized protein sequences in databases, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are taken, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree, REP (Reduced Error Pruning) tree, ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest are analysed. Among the various decision tree classifiers Random forest performs well in less time with good accuracy of 96.35%. Another inference is RUS boost decision tree classifier is able to classify one or two samples in the class with very less samples while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive for the classes with fewer samples. Also the performance of decision tree classifiers is compared with SVM (Support Vector Machine) and Naive Bayes classifier. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. D Semantic Labeling of ALS Data Based on Domain Adaption by Transferring and Fusing Random Forest Models

    NASA Astrophysics Data System (ADS)

    Wu, J.; Yao, W.; Zhang, J.; Li, Y.

    2018-04-01

    Labeling 3D point cloud data with traditional supervised learning methods requires considerable labelled samples, the collection of which is cost and time expensive. This work focuses on adopting domain adaption concept to transfer existing trained random forest classifiers (based on source domain) to new data scenes (target domain), which aims at reducing the dependence of accurate 3D semantic labeling in point clouds on training samples from the new data scene. Firstly, two random forest classifiers were firstly trained with existing samples previously collected for other data. They were different from each other by using two different decision tree construction algorithms: C4.5 with information gain ratio and CART with Gini index. Secondly, four random forest classifiers adapted to the target domain are derived through transferring each tree in the source random forest models with two types of operations: structure expansion and reduction-SER and structure transfer-STRUT. Finally, points in target domain are labelled by fusing the four newly derived random forest classifiers using weights of evidence based fusion model. To validate our method, experimental analysis was conducted using 3 datasets: one is used as the source domain data (Vaihingen data for 3D Semantic Labelling); another two are used as the target domain data from two cities in China (Jinmen city and Dunhuang city). Overall accuracies of 85.5 % and 83.3 % for 3D labelling were achieved for Jinmen city and Dunhuang city data respectively, with only 1/3 newly labelled samples compared to the cases without domain adaption.

  7. Quantum Random Number Generation Using a Quanta Image Sensor

    PubMed Central

    Amri, Emna; Felk, Yacine; Stucki, Damien; Ma, Jiaju; Fossum, Eric R.

    2016-01-01

    A new quantum random number generation method is proposed. The method is based on the randomness of the photon emission process and the single photon counting capability of the Quanta Image Sensor (QIS). It has the potential to generate high-quality random numbers with remarkable data output rate. In this paper, the principle of photon statistics and theory of entropy are discussed. Sample data were collected with QIS jot device, and its randomness quality was analyzed. The randomness assessment method and results are discussed. PMID:27367698

  8. Provider-related barriers to rapid HIV testing in U.S. urban non-profit community clinics, community-based organizations (CBOs) and hospitals.

    PubMed

    Bogart, Laura M; Howerton, Devery; Lange, James; Setodji, Claude Messan; Becker, Kirsten; Klein, David J; Asch, Steven M

    2010-06-01

    We examined provider-reported barriers to rapid HIV testing in U.S. urban non-profit community clinics, community-based organizations (CBOs), and hospitals. 12 primary metropolitan statistical areas (PMSAs; three per region) were sampled randomly, with sampling weights proportional to AIDS case reports. Across PMSAs, all 671 hospitals and a random sample of 738 clinics/CBOs were telephoned for a survey on rapid HIV test availability. Of the 671 hospitals, 172 hospitals were randomly selected for barriers questions, for which 158 laboratory and 136 department staff were eligible and interviewed in 2005. Of the 738 clinics/CBOs, 276 were randomly selected for barriers questions, 206 were reached, and 118 were eligible and interviewed in 2005-2006. In multivariate models, barriers regarding translation of administrative/quality assurance policies into practice were significantly associated with rapid HIV testing availability. For greater rapid testing diffusion, policies are needed to reduce administrative barriers and provide quality assurance training to non-laboratory staff.

  9. Measuring larval nematode contamination on cattle pastures: Comparing two herbage sampling methods.

    PubMed

    Verschave, S H; Levecke, B; Duchateau, L; Vercruysse, J; Charlier, J

    2015-06-15

    Assessing levels of pasture larval contamination is frequently used to study the population dynamics of the free-living stages of parasitic nematodes of livestock. Direct quantification of infective larvae (L3) on herbage is the most applied method to measure pasture larval contamination. However, herbage collection remains labour intensive and there is a lack of studies addressing the variation induced by the sampling method and the required sample size. The aim of this study was (1) to compare two different sampling methods in terms of pasture larval count results and time required to sample, (2) to assess the amount of variation in larval counts at the level of sample plot, pasture and season, respectively and (3) to calculate the required sample size to assess pasture larval contamination with a predefined precision using random plots across pasture. Eight young stock pastures of different commercial dairy herds were sampled in three consecutive seasons during the grazing season (spring, summer and autumn). On each pasture, herbage samples were collected through both a double-crossed W-transect with samples taken every 10 steps (method 1) and four random located plots of 0.16 m(2) with collection of all herbage within the plot (method 2). The average (± standard deviation (SD)) pasture larval contamination using sampling methods 1 and 2 was 325 (± 479) and 305 (± 444)L3/kg dry herbage (DH), respectively. Large discrepancies in pasture larval counts of the same pasture and season were often seen between methods, but no significant difference (P = 0.38) in larval counts between methods was found. Less time was required to collect samples with method 2. This difference in collection time between methods was most pronounced for pastures with a surface area larger than 1 ha. The variation in pasture larval counts from samples generated by random plot sampling was mainly due to the repeated measurements on the same pasture in the same season (residual variance component = 6.2), rather than due to pasture (variance component = 0.55) or season (variance component = 0.15). Using the observed distribution of L3, the required sample size (i.e. number of plots per pasture) for sampling a pasture through random plots with a particular precision was simulated. A higher relative precision was acquired when estimating PLC on pastures with a high larval contamination and a low level of aggregation compared to pastures with a low larval contamination when the same sample size was applied. In the future, herbage sampling through random plots across pasture (method 2) seems a promising method to develop further as no significant difference in counts between the methods was found and this method was less time consuming. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Neither fixed nor random: weighted least squares meta-analysis.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2015-06-15

    This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how and explain why an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects. Copyright © 2015 John Wiley & Sons, Ltd.

  11. THE SELECTION OF A NATIONAL RANDOM SAMPLE OF TEACHERS FOR EXPERIMENTAL CURRICULUM EVALUATION.

    ERIC Educational Resources Information Center

    WELCH, WAYNE W.; AND OTHERS

    MEMBERS OF THE EVALUATION SECTION OF HARVARD PROJECT PHYSICS, DESCRIBING WHAT IS SAID TO BE THE FIRST ATTEMPT TO SELECT A NATIONAL RANDOM SAMPLE OF (HIGH SCHOOL PHYSICS) TEACHERS, LIST THE STEPS AS (1) PURCHASE OF A LIST OF PHYSICS TEACHERS FROM THE NATIONAL SCIENCE TEACHERS ASSOCIATION (MOST COMPLETE AVAILABLE), (2) SELECTION OF 136 NAMES BY A…

  12. Random laser action in bovine semen

    NASA Astrophysics Data System (ADS)

    Smuk, Andrei; Lazaro, Edgar; Olson, Leif P.; Lawandy, N. M.

    2011-03-01

    Experiments using bovine semen reveal that the addition of a high-gain water soluble dye results in random laser action when excited by a Q-switched, frequency doubled, Nd:Yag laser. The data shows that the linewidth collapse of the emission is correlated to the sperm count of the individual samples, potentially making this a rapid, low sample volume approach to count determination.

  13. Teachers' Methodologies and Sources of Information on HIV/AIDS for Students with Visual Impairments in Selected Residential and Integrated Schools in Ghana

    ERIC Educational Resources Information Center

    Hayford, Samuel K.; Ocansey, Frederick

    2017-01-01

    This study reports part of a national survey on sources of information, education and communication materials on HIV/AIDS available to students with visual impairments in residential, segregated, and integrated schools in Ghana. A multi-staged stratified random sampling procedure and a purposive and simple random sampling approach, where…

  14. 40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... generation on any two-dimensional square grid. 761.308 Section 761.308 Protection of Environment... § 761.79(b)(3) § 761.308 Sample selection by random number generation on any two-dimensional square grid. (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...

  15. 40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... generation on any two-dimensional square grid. 761.308 Section 761.308 Protection of Environment... § 761.79(b)(3) § 761.308 Sample selection by random number generation on any two-dimensional square grid. (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...

  16. 40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... generation on any two-dimensional square grid. 761.308 Section 761.308 Protection of Environment... § 761.79(b)(3) § 761.308 Sample selection by random number generation on any two-dimensional square grid. (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...

  17. 40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... generation on any two-dimensional square grid. 761.308 Section 761.308 Protection of Environment... § 761.79(b)(3) § 761.308 Sample selection by random number generation on any two-dimensional square grid. (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...

  18. 40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... generation on any two-dimensional square grid. 761.308 Section 761.308 Protection of Environment... § 761.79(b)(3) § 761.308 Sample selection by random number generation on any two-dimensional square grid. (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...

  19. Modeling of Academic Achievement of Primary School Students in Ethiopia Using Bayesian Multilevel Approach

    ERIC Educational Resources Information Center

    Sebro, Negusse Yohannes; Goshu, Ayele Taye

    2017-01-01

    This study aims to explore Bayesian multilevel modeling to investigate variations of average academic achievement of grade eight school students. A sample of 636 students is randomly selected from 26 private and government schools by a two-stage stratified sampling design. Bayesian method is used to estimate the fixed and random effects. Input and…

  20. PubMed Central

    King-Shier, Kathryn M; Hemmelgarn, Brenda R; Musto, Richard; Quan, Hude

    2014-01-01

    Background: Francophones who live outside the primarily French-speaking province of Quebec, Canada, risk being excluded from research by lack of a sampling frame. We examined the adequacy of random sampling, advertising, and respondent-driven sampling for recruitment of francophones for survey research. Methods: We recruited francophones residing in the city of Calgary, Alberta, through advertising and respondentdriven sampling. These 2 samples were then compared with a random subsample of Calgary francophones derived from the 2006 Canadian Community Health Survey (CCHS). We assessed the effectiveness of advertising and respondent-driven sampling in relation to the CCHS sample by comparing demographic characteristics and selected items from the CCHS (specifically self-reported general health status, perceived weight, and having a family doctor). Results: We recruited 120 francophones through advertising and 145 through respondent-driven sampling; the random sample from the CCHS consisted of 259 records. The samples derived from advertising and respondentdriven sampling differed from the CCHS in terms of age (mean ages 41.0, 37.6, and 42.5 years, respectively), sex (proportion of males 26.1%, 40.6%, and 56.6%, respectively), education (college or higher 86.7% , 77.9% , and 59.1%, respectively), place of birth (immigrants accounting for 45.8%, 55.2%, and 3.7%, respectively), and not having a regular medical doctor (16.7%, 34.5%, and 16.6%, respectively). Differences were not tested statistically because of limitations on the analysis of CCHS data imposed by Statistics Canada. Interpretation: The samples generated exclusively through advertising and respondent-driven sampling were not representative of the gold standard sample from the CCHS. Use of such biased samples for research studies could generate misleading results. PMID:25426180

  1. At convenience and systematic random sampling: effects on the prognostic value of nuclear area assessments in breast cancer patients.

    PubMed

    Jannink, I; Bennen, J N; Blaauw, J; van Diest, P J; Baak, J P

    1995-01-01

    This study compares the influence of two different nuclear sampling methods on the prognostic value of assessments of mean and standard deviation of nuclear area (MNA, SDNA) in 191 consecutive invasive breast cancer patients with long term follow up. The first sampling method used was 'at convenience' sampling (ACS); the second, systematic random sampling (SRS). Both sampling methods were tested with a sample size of 50 nuclei (ACS-50 and SRS-50). To determine whether, besides the sampling methods, sample size had impact on prognostic value as well, the SRS method was also tested using a sample size of 100 nuclei (SRS-100). SDNA values were systematically lower for ACS, obviously due to (unconsciously) not including small and large nuclei. Testing prognostic value of a series of cut off points, MNA and SDNA values assessed by the SRS method were prognostically significantly stronger than the values obtained by the ACS method. This was confirmed in Cox regression analysis. For the MNA, the Mantel-Cox p-values from SRS-50 and SRS-100 measurements were not significantly different. However, for the SDNA, SRS-100 yielded significantly lower p-values than SRS-50. In conclusion, compared with the 'at convenience' nuclear sampling method, systematic random sampling of nuclei is not only superior with respect to reproducibility of results, but also provides a better prognostic value in patients with invasive breast cancer.

  2. Random and independent sampling of endogenous tryptic peptides from normal human EDTA plasma by liquid chromatography micro electrospray ionization and tandem mass spectrometry.

    PubMed

    Dufresne, Jaimie; Florentinus-Mefailoski, Angelique; Ajambo, Juliet; Ferwa, Ammara; Bowden, Peter; Marshall, John

    2017-01-01

    Normal human EDTA plasma samples were collected on ice, processed ice cold, and stored in a freezer at - 80 °C prior to experiments. Plasma test samples from the - 80 °C freezer were thawed on ice or intentionally warmed to room temperature. Protein content was measured by CBBR binding and the release of alcohol soluble amines by the Cd ninhydrin assay. Plasma peptides released over time were collected over C18 for random and independent sampling by liquid chromatography micro electrospray ionization and tandem mass spectrometry (LC-ESI-MS/MS) and correlated with X!TANDEM. Fully tryptic peptides by X!TANDEM returned a similar set of proteins, but was more computationally efficient, than "no enzyme" correlations. Plasma samples maintained on ice, or ice with a cocktail of protease inhibitors, showed lower background amounts of plasma peptides compared to samples incubated at room temperature. Regression analysis indicated that warming plasma to room temperature, versus ice cold, resulted in a ~ twofold increase in the frequency of peptide identification over hours-days of incubation at room temperature. The type I error rate of the protein identification from the X!TANDEM algorithm combined was estimated to be low compared to a null model of computer generated random MS/MS spectra. The peptides of human plasma were identified and quantified with low error rates by random and independent sampling that revealed 1000s of peptides from hundreds of human plasma proteins from endogenous tryptic peptides.

  3. True Randomness from Big Data.

    PubMed

    Papakonstantinou, Periklis A; Woodruff, David P; Yang, Guang

    2016-09-26

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests.

  4. True Randomness from Big Data

    NASA Astrophysics Data System (ADS)

    Papakonstantinou, Periklis A.; Woodruff, David P.; Yang, Guang

    2016-09-01

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests.

  5. True Randomness from Big Data

    PubMed Central

    Papakonstantinou, Periklis A.; Woodruff, David P.; Yang, Guang

    2016-01-01

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests. PMID:27666514

  6. Inspection of care: Findings from an innovative demonstration

    PubMed Central

    Morris, John N.; Sherwood, Clarence C.; Dreyer, Paul

    1989-01-01

    In this article, information is presented concerning the efficacy of a sample-based approach to completing inspection of care reviews of Medicaid-supported nursing home residents. Massachusetts nursing homes were randomly assigned to full (the control group) or sample (the experimental group) review conditions. The primary research focus was to determine whether the proportion of facilities found to be deficient (based on quality of care and level of care criteria) in the experimental sample was comparable to the proportion in the control sample. The findings supported such a hypothesis: Deficient facilities appear to be equally identifiable using the random or full-sampling protocols, and the process can be completed with a considerable savings of surveyor time. PMID:10313458

  7. Imbalanced Learning for Functional State Assessment

    NASA Technical Reports Server (NTRS)

    Li, Feng; McKenzie, Frederick; Li, Jiang; Zhang, Guangfan; Xu, Roger; Richey, Carl; Schnell, Tom

    2011-01-01

    This paper presents results of several imbalanced learning techniques applied to operator functional state assessment where the data is highly imbalanced, i.e., some function states (majority classes) have much more training samples than other states (minority classes). Conventional machine learning techniques usually tend to classify all data samples into majority classes and perform poorly for minority classes. In this study, we implemented five imbalanced learning techniques, including random undersampling, random over-sampling, synthetic minority over-sampling technique (SMOTE), borderline-SMOTE and adaptive synthetic sampling (ADASYN) to solve this problem. Experimental results on a benchmark driving lest dataset show thai accuracies for minority classes could be improved dramatically with a cost of slight performance degradations for majority classes,

  8. Effect of randomness on multi-frequency aeroelastic responses resolved by Unsteady Adaptive Stochastic Finite Elements

    NASA Astrophysics Data System (ADS)

    Witteveen, Jeroen A. S.; Bijl, Hester

    2009-10-01

    The Unsteady Adaptive Stochastic Finite Elements (UASFE) method resolves the effect of randomness in numerical simulations of single-mode aeroelastic responses with a constant accuracy in time for a constant number of samples. In this paper, the UASFE framework is extended to multi-frequency responses and continuous structures by employing a wavelet decomposition pre-processing step to decompose the sampled multi-frequency signals into single-frequency components. The effect of the randomness on the multi-frequency response is then obtained by summing the results of the UASFE interpolation at constant phase for the different frequency components. Results for multi-frequency responses and continuous structures show a three orders of magnitude reduction of computational costs compared to crude Monte Carlo simulations in a harmonically forced oscillator, a flutter panel problem, and the three-dimensional transonic AGARD 445.6 wing aeroelastic benchmark subject to random fields and random parameters with various probability distributions.

  9. Sampling designs for HIV molecular epidemiology with application to Honduras.

    PubMed

    Shepherd, Bryan E; Rossini, Anthony J; Soto, Ramon Jeremias; De Rivera, Ivette Lorenzana; Mullins, James I

    2005-11-01

    Proper sampling is essential to characterize the molecular epidemiology of human immunodeficiency virus (HIV). HIV sampling frames are difficult to identify, so most studies use convenience samples. We discuss statistically valid and feasible sampling techniques that overcome some of the potential for bias due to convenience sampling and ensure better representation of the study population. We employ a sampling design called stratified cluster sampling. This first divides the population into geographical and/or social strata. Within each stratum, a population of clusters is chosen from groups, locations, or facilities where HIV-positive individuals might be found. Some clusters are randomly selected within strata and individuals are randomly selected within clusters. Variation and cost help determine the number of clusters and the number of individuals within clusters that are to be sampled. We illustrate the approach through a study designed to survey the heterogeneity of subtype B strains in Honduras.

  10. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis.

  11. Flexible sampling large-scale social networks by self-adjustable random walk

    NASA Astrophysics Data System (ADS)

    Xu, Xiao-Ke; Zhu, Jonathan J. H.

    2016-12-01

    Online social networks (OSNs) have become an increasingly attractive gold mine for academic and commercial researchers. However, research on OSNs faces a number of difficult challenges. One bottleneck lies in the massive quantity and often unavailability of OSN population data. Sampling perhaps becomes the only feasible solution to the problems. How to draw samples that can represent the underlying OSNs has remained a formidable task because of a number of conceptual and methodological reasons. Especially, most of the empirically-driven studies on network sampling are confined to simulated data or sub-graph data, which are fundamentally different from real and complete-graph OSNs. In the current study, we propose a flexible sampling method, called Self-Adjustable Random Walk (SARW), and test it against with the population data of a real large-scale OSN. We evaluate the strengths of the sampling method in comparison with four prevailing methods, including uniform, breadth-first search (BFS), random walk (RW), and revised RW (i.e., MHRW) sampling. We try to mix both induced-edge and external-edge information of sampled nodes together in the same sampling process. Our results show that the SARW sampling method has been able to generate unbiased samples of OSNs with maximal precision and minimal cost. The study is helpful for the practice of OSN research by providing a highly needed sampling tools, for the methodological development of large-scale network sampling by comparative evaluations of existing sampling methods, and for the theoretical understanding of human networks by highlighting discrepancies and contradictions between existing knowledge/assumptions of large-scale real OSN data.

  12. Comparison of sampling methods for hard-to-reach francophone populations: yield and adequacy of advertisement and respondent-driven sampling.

    PubMed

    Ngwakongnwi, Emmanuel; King-Shier, Kathryn M; Hemmelgarn, Brenda R; Musto, Richard; Quan, Hude

    2014-01-01

    Francophones who live outside the primarily French-speaking province of Quebec, Canada, risk being excluded from research by lack of a sampling frame. We examined the adequacy of random sampling, advertising, and respondent-driven sampling for recruitment of francophones for survey research. We recruited francophones residing in the city of Calgary, Alberta, through advertising and respondentdriven sampling. These 2 samples were then compared with a random subsample of Calgary francophones derived from the 2006 Canadian Community Health Survey (CCHS). We assessed the effectiveness of advertising and respondent-driven sampling in relation to the CCHS sample by comparing demographic characteristics and selected items from the CCHS (specifically self-reported general health status, perceived weight, and having a family doctor). We recruited 120 francophones through advertising and 145 through respondent-driven sampling; the random sample from the CCHS consisted of 259 records. The samples derived from advertising and respondentdriven sampling differed from the CCHS in terms of age (mean ages 41.0, 37.6, and 42.5 years, respectively), sex (proportion of males 26.1%, 40.6%, and 56.6%, respectively), education (college or higher 86.7% , 77.9% , and 59.1%, respectively), place of birth (immigrants accounting for 45.8%, 55.2%, and 3.7%, respectively), and not having a regular medical doctor (16.7%, 34.5%, and 16.6%, respectively). Differences were not tested statistically because of limitations on the analysis of CCHS data imposed by Statistics Canada. The samples generated exclusively through advertising and respondent-driven sampling were not representative of the gold standard sample from the CCHS. Use of such biased samples for research studies could generate misleading results.

  13. Adapted random sampling patterns for accelerated MRI.

    PubMed

    Knoll, Florian; Clason, Christian; Diwoky, Clemens; Stollberger, Rudolf

    2011-02-01

    Variable density random sampling patterns have recently become increasingly popular for accelerated imaging strategies, as they lead to incoherent aliasing artifacts. However, the design of these sampling patterns is still an open problem. Current strategies use model assumptions like polynomials of different order to generate a probability density function that is then used to generate the sampling pattern. This approach relies on the optimization of design parameters which is very time consuming and therefore impractical for daily clinical use. This work presents a new approach that generates sampling patterns by making use of power spectra of existing reference data sets and hence requires neither parameter tuning nor an a priori mathematical model of the density of sampling points. The approach is validated with downsampling experiments, as well as with accelerated in vivo measurements. The proposed approach is compared with established sampling patterns, and the generalization potential is tested by using a range of reference images. Quantitative evaluation is performed for the downsampling experiments using RMS differences to the original, fully sampled data set. Our results demonstrate that the image quality of the method presented in this paper is comparable to that of an established model-based strategy when optimization of the model parameter is carried out and yields superior results to non-optimized model parameters. However, no random sampling pattern showed superior performance when compared to conventional Cartesian subsampling for the considered reconstruction strategy.

  14. An occupational exposure assessment of a perfluorooctanesulfonyl fluoride production site: biomonitoring.

    PubMed

    Olsen, Geary W; Logan, Perry W; Hansen, Kristen J; Simpson, Cathy A; Burris, Jean M; Burlew, Michele M; Vorarath, Phanasouk P; Venkateswarlu, Pothapragada; Schumpert, John C; Mandel, Jeffrey H

    2003-01-01

    This investigation randomly sampled a fluorochemical manufacturing employee population to determine the distribution of serum fluorochemical levels according to employees' jobs and work areas. Previous analyses of medical surveillance data have not shown significant associations between fluorochemical production employees' clinical chemistry and hematology tests and their serum PFOS and perfluorooctanoate (PFOA, C(7)F(15)COO(-)) concentrations, but may have been subject to nonparticipation bias. A random sample of the on-site film plant employee population, where fluorochemicals are not produced, determined their serum concentrations also. Of the 232 employees randomly selected for serum sampling, 186 (80%) employees participated (n=126 chemical plant; n=60 film plant). Sera samples were extracted using an ion-pairing extraction procedure and were quantitatively analyzed for seven fluorochemicals using high-pressure liquid chromatography electrospray tandem mass spectrometry methods. Geometric means (in parts per million) and 95% confidence intervals (in parentheses) of the random sample of 126 chemical plant employees were: PFOS 0.941 (0.787-1.126); PFOA 0.899 (0.722-1.120); perfluorohexanesulfonate 0.180 (0.145-0.223); N-ethyl perfluorooctanesulfonamidoacetate 0.008 (0.006-0.011); N-methyl perfluorooctanesulfonamidoacetate 0.081 (0.067-0.098); perfluorooctanesulfonamide 0.013 (0.009-0.018); and perfluorooctanesulfonamidoacetate 0.022 (0.018-0.029). These geometric means were approximately one order of magnitude higher than those observed for the film plant employees.

  15. Variable density randomized stack of spirals (VDR-SoS) for compressive sensing MRI.

    PubMed

    Valvano, Giuseppe; Martini, Nicola; Landini, Luigi; Santarelli, Maria Filomena

    2016-07-01

    To develop a 3D sampling strategy based on a stack of variable density spirals for compressive sensing MRI. A random sampling pattern was obtained by rotating each spiral by a random angle and by delaying for few time steps the gradient waveforms of the different interleaves. A three-dimensional (3D) variable sampling density was obtained by designing different variable density spirals for each slice encoding. The proposed approach was tested with phantom simulations up to a five-fold undersampling factor. Fully sampled 3D dataset of a human knee, and of a human brain, were obtained from a healthy volunteer. The proposed approach was tested with off-line reconstructions of the knee dataset up to a four-fold acceleration and compared with other noncoherent trajectories. The proposed approach outperformed the standard stack of spirals for various undersampling factors. The level of coherence and the reconstruction quality of the proposed approach were similar to those of other trajectories that, however, require 3D gridding for the reconstruction. The variable density randomized stack of spirals (VDR-SoS) is an easily implementable trajectory that could represent a valid sampling strategy for 3D compressive sensing MRI. It guarantees low levels of coherence without requiring 3D gridding. Magn Reson Med 76:59-69, 2016. © 2015 Wiley Periodicals, Inc. © 2015 Wiley Periodicals, Inc.

  16. Stanford GEMS phase 2 obesity prevention trial for low-income African-American girls: design and sample baseline characteristics.

    PubMed

    Robinson, Thomas N; Kraemer, Helena C; Matheson, Donna M; Obarzanek, Eva; Wilson, Darrell M; Haskell, William L; Pruitt, Leslie A; Thompson, Nikko S; Haydel, K Farish; Fujimoto, Michelle; Varady, Ann; McCarthy, Sally; Watanabe, Connie; Killen, Joel D

    2008-01-01

    African-American girls and women are at high risk of obesity and its associated morbidities. Few studies have tested obesity prevention strategies specifically designed for African-American girls. This report describes the design and baseline findings of the Stanford GEMS (Girls health Enrichment Multi-site Studies) trial to test the effect of a two-year community- and family-based intervention to reduce weight gain in low-income, pre-adolescent African-American girls. Randomized controlled trial with measurements scheduled in girls' homes at baseline, 6, 12, 18 and 24 month post-randomization. Low-income areas of Oakland, CA. Eight, nine and ten year old African-American girls and their parents/caregivers. Girls are randomized to a culturally-tailored after-school dance program and a home/family-based intervention to reduce screen media use versus an information-based community health education Active-Placebo Comparison intervention. Interventions last for 2 years for each participant. Change in body mass index over the two-year study. Recruitment and enrollment successfully produced a predominately low-socioeconomic status sample. Two-hundred sixty one (261) families were randomized. One girl per family is randomly chosen for the analysis sample. Randomization produced comparable experimental groups with only a few statistically significant differences. The sample had a mean body mass index (BMI) at the 74 th percentile on the 2000 CDC BMI reference, and one-third of the analysis sample had a BMI at the 95th percentile or above. Average fasting total cholesterol and LDL cholesterol were above NCEP thresholds for borderline high classifications. Girls averaged low levels of moderate to vigorous physical activity, more than 3 h per day of screen media use, and diets high in energy from fat. The Stanford GEMS trial is testing the benefits of culturally-tailored after-school dance and screen-time reduction interventions for obesity prevention in low-income, pre-adolescent African-American girls.

  17. Fatigue strength reduction model: RANDOM3 and RANDOM4 user manual. Appendix 2: Development of advanced methodologies for probabilistic constitutive relationships of material strength models

    NASA Technical Reports Server (NTRS)

    Boyce, Lola; Lovelace, Thomas B.

    1989-01-01

    FORTRAN programs RANDOM3 and RANDOM4 are documented in the form of a user's manual. Both programs are based on fatigue strength reduction, using a probabilistic constitutive model. The programs predict the random lifetime of an engine component to reach a given fatigue strength. The theoretical backgrounds, input data instructions, and sample problems illustrating the use of the programs are included.

  18. Estimates of Intraclass Correlation for Variables Related to Behavioral HIV/STD Prevention in a Predominantly African American and Hispanic Sample of Young Women

    ERIC Educational Resources Information Center

    Pals, Sherri L.; Beaty, Brenda L.; Posner, Samuel F.; Bull, Sheana S.

    2009-01-01

    Studies designed to evaluate HIV and STD prevention interventions often involve random assignment of groups such as neighborhoods or communities to study conditions (e.g., to intervention or control). Investigators who design group-randomized trials (GRTs) must take the expected intraclass correlation coefficient (ICC) into account in sample size…

  19. Summer School Effects in a Randomized Field Trial

    ERIC Educational Resources Information Center

    Zvoch, Keith; Stevens, Joseph J.

    2013-01-01

    This field-based randomized trial examined the effect of assignment to and participation in summer school for two moderately at-risk samples of struggling readers. Application of multiple regression models to difference scores capturing the change in summer reading fluency revealed that kindergarten students randomly assigned to summer school…

  20. Simulation of the Effects of Random Measurement Errors

    ERIC Educational Resources Information Center

    Kinsella, I. A.; Hannaidh, P. B. O.

    1978-01-01

    Describes a simulation method for measurement of errors that requires calculators and tables of random digits. Each student simulates the random behaviour of the component variables in the function and by combining the results of all students, the outline of the sampling distribution of the function can be obtained. (GA)

  1. Improved sampling and analysis of images in corneal confocal microscopy.

    PubMed

    Schaldemose, E L; Fontain, F I; Karlsson, P; Nyengaard, J R

    2017-10-01

    Corneal confocal microscopy (CCM) is a noninvasive clinical method to analyse and quantify corneal nerve fibres in vivo. Although the CCM technique is in constant progress, there are methodological limitations in terms of sampling of images and objectivity of the nerve quantification. The aim of this study was to present a randomized sampling method of the CCM images and to develop an adjusted area-dependent image analysis. Furthermore, a manual nerve fibre analysis method was compared to a fully automated method. 23 idiopathic small-fibre neuropathy patients were investigated using CCM. Corneal nerve fibre length density (CNFL) and corneal nerve fibre branch density (CNBD) were determined in both a manual and automatic manner. Differences in CNFL and CNBD between (1) the randomized and the most common sampling method, (2) the adjusted and the unadjusted area and (3) the manual and automated quantification method were investigated. The CNFL values were significantly lower when using the randomized sampling method compared to the most common method (p = 0.01). There was not a statistical significant difference in the CNBD values between the randomized and the most common sampling method (p = 0.85). CNFL and CNBD values were increased when using the adjusted area compared to the standard area. Additionally, the study found a significant increase in the CNFL and CNBD values when using the manual method compared to the automatic method (p ≤ 0.001). The study demonstrated a significant difference in the CNFL values between the randomized and common sampling method indicating the importance of clear guidelines for the image sampling. The increase in CNFL and CNBD values when using the adjusted cornea area is not surprising. The observed increases in both CNFL and CNBD values when using the manual method of nerve quantification compared to the automatic method are consistent with earlier findings. This study underlines the importance of improving the analysis of the CCM images in order to obtain more objective corneal nerve fibre measurements. © 2017 The Authors Journal of Microscopy © 2017 Royal Microscopical Society.

  2. Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations.

    PubMed

    NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel

    2017-08-01

    Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.

  3. Survey of rural, private wells. Statistical design

    USGS Publications Warehouse

    Mehnert, Edward; Schock, Susan C.; ,

    1991-01-01

    Half of Illinois' 38 million acres were planted in corn and soybeans in 1988. On the 19 million acres planted in corn and soybeans, approximately 1 million tons of nitrogen fertilizer and 50 million pounds of pesticides were applied. Because groundwater is the water supply for over 90 percent of rural Illinois, the occurrence of agricultural chemicals in groundwater in Illinois is of interest to the agricultural community, the public, and regulatory agencies. The occurrence of agricultural chemicals in groundwater is well documented. However, the extent of this contamination still needs to be defined. This can be done by randomly sampling wells across a geographic area. Key elements of a random, water-well sampling program for regional groundwater quality include the overall statistical design of the program, definition of the sample population, selection of wells to be sampled, and analysis of survey results. These elements must be consistent with the purpose for conducting the program; otherwise, the program will not provide the desired information. The need to carefully design and conduct a sampling program becomes readily apparent when one considers the high cost of collecting and analyzing a sample. For a random sampling program conducted in Illinois, the key elements, as well as the limitations imposed by available information, are described.

  4. Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

    PubMed

    Sztepanacz, Jacqueline L; Blows, Mark W

    2017-07-01

    The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdispersion of the leading eigenvalues of sample covariance matrices have been demonstrated to conform to the Tracy-Widom (TW) distribution. Here we show that genetic eigenvalues estimated using restricted maximum likelihood (REML) in a multivariate random effects model with an unconstrained genetic covariance structure will also conform to the TW distribution after empirical scaling and centering. However, where estimation procedures using either REML or MCMC impose boundary constraints, the resulting genetic eigenvalues tend not be TW distributed. We show how using confidence intervals from sampling distributions of genetic eigenvalues without reference to the TW distribution is insufficient protection against mistaking sampling error as genetic variance, particularly when eigenvalues are small. By scaling such sampling distributions to the appropriate TW distribution, the critical value of the TW statistic can be used to determine if the magnitude of a genetic eigenvalue exceeds the sampling error for each eigenvalue in the spectral distribution of a given genetic covariance matrix. Copyright © 2017 by the Genetics Society of America.

  5. Improved high-dimensional prediction with Random Forests by the use of co-data.

    PubMed

    Te Beest, Dennis E; Mes, Steven W; Wilting, Saskia M; Brakenhoff, Ruud H; van de Wiel, Mark A

    2017-12-28

    Prediction in high dimensional settings is difficult due to the large number of variables relative to the sample size. We demonstrate how auxiliary 'co-data' can be used to improve the performance of a Random Forest in such a setting. Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities that are used to draw candidate variables by co-data moderated sampling probabilities. Co-data here are defined as any type information that is available on the variables of the primary data, but does not use its response labels. These moderated sampling probabilities are, inspired by empirical Bayes, learned from the data at hand. We demonstrate the co-data moderated Random Forest (CoRF) with two examples. In the first example we aim to predict the presence of a lymph node metastasis with gene expression data. We demonstrate how a set of external p-values, a gene signature, and the correlation between gene expression and DNA copy number can improve the predictive performance. In the second example we demonstrate how the prediction of cervical (pre-)cancer with methylation data can be improved by including the location of the probe relative to the known CpG islands, the number of CpG sites targeted by a probe, and a set of p-values from a related study. The proposed method is able to utilize auxiliary co-data to improve the performance of a Random Forest.

  6. A weighted ℓ{sub 1}-minimization approach for sparse polynomial chaos expansions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peng, Ji; Hampton, Jerrad; Doostan, Alireza, E-mail: alireza.doostan@colorado.edu

    2014-06-15

    This work proposes a method for sparse polynomial chaos (PC) approximation of high-dimensional stochastic functions based on non-adapted random sampling. We modify the standard ℓ{sub 1}-minimization algorithm, originally proposed in the context of compressive sampling, using a priori information about the decay of the PC coefficients, when available, and refer to the resulting algorithm as weightedℓ{sub 1}-minimization. We provide conditions under which we may guarantee recovery using this weighted scheme. Numerical tests are used to compare the weighted and non-weighted methods for the recovery of solutions to two differential equations with high-dimensional random inputs: a boundary value problem with amore » random elliptic operator and a 2-D thermally driven cavity flow with random boundary condition.« less

  7. Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs.

    PubMed

    Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael

    2013-12-01

    Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.

  8. Precision of systematic and random sampling in clustered populations: habitat patches and aggregating organisms.

    PubMed

    McGarvey, Richard; Burch, Paul; Matthews, Janet M

    2016-01-01

    Natural populations of plants and animals spatially cluster because (1) suitable habitat is patchy, and (2) within suitable habitat, individuals aggregate further into clusters of higher density. We compare the precision of random and systematic field sampling survey designs under these two processes of species clustering. Second, we evaluate the performance of 13 estimators for the variance of the sample mean from a systematic survey. Replicated simulated surveys, as counts from 100 transects, allocated either randomly or systematically within the study region, were used to estimate population density in six spatial point populations including habitat patches and Matérn circular clustered aggregations of organisms, together and in combination. The standard one-start aligned systematic survey design, a uniform 10 x 10 grid of transects, was much more precise. Variances of the 10 000 replicated systematic survey mean densities were one-third to one-fifth of those from randomly allocated transects, implying transect sample sizes giving equivalent precision by random survey would need to be three to five times larger. Organisms being restricted to patches of habitat was alone sufficient to yield this precision advantage for the systematic design. But this improved precision for systematic sampling in clustered populations is underestimated by standard variance estimators used to compute confidence intervals. True variance for the survey sample mean was computed from the variance of 10 000 simulated survey mean estimates. Testing 10 published and three newly proposed variance estimators, the two variance estimators (v) that corrected for inter-transect correlation (ν₈ and ν(W)) were the most accurate and also the most precise in clustered populations. These greatly outperformed the two "post-stratification" variance estimators (ν₂ and ν₃) that are now more commonly applied in systematic surveys. Similar variance estimator performance rankings were found with a second differently generated set of spatial point populations, ν₈ and ν(W) again being the best performers in the longer-range autocorrelated populations. However, no systematic variance estimators tested were free from bias. On balance, systematic designs bring more narrow confidence intervals in clustered populations, while random designs permit unbiased estimates of (often wider) confidence interval. The search continues for better estimators of sampling variance for the systematic survey mean.

  9. Slowdowns in diversification rates from real phylogenies may not be real.

    PubMed

    Cusimano, Natalie; Renner, Susanne S

    2010-07-01

    Studies of diversification patterns often find a slowing in lineage accumulation toward the present. This seemingly pervasive pattern of rate downturns has been taken as evidence for adaptive radiations, density-dependent regulation, and metacommunity species interactions. The significance of rate downturns is evaluated with statistical tests (the gamma statistic and Monte Carlo constant rates (MCCR) test; birth-death likelihood models and Akaike Information Criterion [AIC] scores) that rely on null distributions, which assume that the included species are a random sample of the entire clade. Sampling in real phylogenies, however, often is nonrandom because systematists try to include early-diverging species or representatives of previous intrataxon classifications. We studied the effects of biased sampling, structured sampling, and random sampling by experimentally pruning simulated trees (60 and 150 species) as well as a completely sampled empirical tree (58 species) and then applying the gamma statistic/MCCR test and birth-death likelihood models/AIC scores to assess rate changes. For trees with random species sampling, the true model (i.e., the one fitting the complete phylogenies) could be inferred in most cases. Oversampling deep nodes, however, strongly biases inferences toward downturns, with simulations of structured and biased sampling suggesting that this occurs when sampling percentages drop below 80%. The magnitude of the effect and the sensitivity of diversification rate models is such that a useful rule of thumb may be not to infer rate downturns from real trees unless they have >80% species sampling.

  10. Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli.

    PubMed

    Westfall, Jacob; Kenny, David A; Judd, Charles M

    2014-10-01

    Researchers designing experiments in which a sample of participants responds to a sample of stimuli are faced with difficult questions about optimal study design. The conventional procedures of statistical power analysis fail to provide appropriate answers to these questions because they are based on statistical models in which stimuli are not assumed to be a source of random variation in the data, models that are inappropriate for experiments involving crossed random factors of participants and stimuli. In this article, we present new methods of power analysis for designs with crossed random factors, and we give detailed, practical guidance to psychology researchers planning experiments in which a sample of participants responds to a sample of stimuli. We extensively examine 5 commonly used experimental designs, describe how to estimate statistical power in each, and provide power analysis results based on a reasonable set of default parameter values. We then develop general conclusions and formulate rules of thumb concerning the optimal design of experiments in which a sample of participants responds to a sample of stimuli. We show that in crossed designs, statistical power typically does not approach unity as the number of participants goes to infinity but instead approaches a maximum attainable power value that is possibly small, depending on the stimulus sample. We also consider the statistical merits of designs involving multiple stimulus blocks. Finally, we provide a simple and flexible Web-based power application to aid researchers in planning studies with samples of stimuli.

  11. A sub-sampled approach to extremely low-dose STEM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stevens, A.; Luzi, L.; Yang, H.

    The inpainting of randomly sub-sampled images acquired by scanning transmission electron microscopy (STEM) is an attractive method for imaging under low-dose conditions (≤ 1 e -Å 2) without changing either the operation of the microscope or the physics of the imaging process. We show that 1) adaptive sub-sampling increases acquisition speed, resolution, and sensitivity; and 2) random (non-adaptive) sub-sampling is equivalent, but faster than, traditional low-dose techniques. Adaptive sub-sampling opens numerous possibilities for the analysis of beam sensitive materials and in-situ dynamic processes at the resolution limit of the aberration corrected microscope and is demonstrated here for the analysis ofmore » the node distribution in metal-organic frameworks (MOFs).« less

  12. Provable classically intractable sampling with measurement-based computation in constant time

    NASA Astrophysics Data System (ADS)

    Sanders, Stephen; Miller, Jacob; Miyake, Akimasa

    We present a constant-time measurement-based quantum computation (MQC) protocol to perform a classically intractable sampling problem. We sample from the output probability distribution of a subclass of the instantaneous quantum polynomial time circuits introduced by Bremner, Montanaro and Shepherd. In contrast with the usual circuit model, our MQC implementation includes additional randomness due to byproduct operators associated with the computation. Despite this additional randomness we show that our sampling task cannot be efficiently simulated by a classical computer. We extend previous results to verify the quantum supremacy of our sampling protocol efficiently using only single-qubit Pauli measurements. Center for Quantum Information and Control, Department of Physics and Astronomy, University of New Mexico, Albuquerque, NM 87131, USA.

  13. Finite-sample corrected generalized estimating equation of population average treatment effects in stepped wedge cluster randomized trials.

    PubMed

    Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B

    2017-04-01

    Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo, and it is logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models that (1) the population-average parameters have an important interpretation for public health applications and (2) they avoid untestable assumptions on latent variable distributions and avoid parametric assumptions about error distributions, therefore, providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equation for stepped wedge cluster randomized trials and for parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.

  14. Introductory Statistics Students' Conceptual Understanding of Study Design and Conclusions

    NASA Astrophysics Data System (ADS)

    Fry, Elizabeth Brondos

    Recommended learning goals for students in introductory statistics courses include the ability to recognize and explain the key role of randomness in designing studies and in drawing conclusions from those studies involving generalizations to a population or causal claims (GAISE College Report ASA Revision Committee, 2016). The purpose of this study was to explore introductory statistics students' understanding of the distinct roles that random sampling and random assignment play in study design and the conclusions that can be made from each. A study design unit lasting two and a half weeks was designed and implemented in four sections of an undergraduate introductory statistics course based on modeling and simulation. The research question that this study attempted to answer is: How does introductory statistics students' conceptual understanding of study design and conclusions (in particular, unbiased estimation and establishing causation) change after participating in a learning intervention designed to promote conceptual change in these areas? In order to answer this research question, a forced-choice assessment called the Inferences from Design Assessment (IDEA) was developed as a pretest and posttest, along with two open-ended assignments, a group quiz and a lab assignment. Quantitative analysis of IDEA results and qualitative analysis of the group quiz and lab assignment revealed that overall, students' mastery of study design concepts significantly increased after the unit, and the great majority of students successfully made the appropriate connections between random sampling and generalization, and between random assignment and causal claims. However, a small, but noticeable portion of students continued to demonstrate misunderstandings, such as confusion between random sampling and random assignment.

  15. Generalized essential energy space random walks to more effectively accelerate solute sampling in aqueous environment

    NASA Astrophysics Data System (ADS)

    Lv, Chao; Zheng, Lianqing; Yang, Wei

    2012-01-01

    Molecular dynamics sampling can be enhanced via the promoting of potential energy fluctuations, for instance, based on a Hamiltonian modified with the addition of a potential-energy-dependent biasing term. To overcome the diffusion sampling issue, which reveals the fact that enlargement of event-irrelevant energy fluctuations may abolish sampling efficiency, the essential energy space random walk (EESRW) approach was proposed earlier. To more effectively accelerate the sampling of solute conformations in aqueous environment, in the current work, we generalized the EESRW method to a two-dimension-EESRW (2D-EESRW) strategy. Specifically, the essential internal energy component of a focused region and the essential interaction energy component between the focused region and the environmental region are employed to define the two-dimensional essential energy space. This proposal is motivated by the general observation that in different conformational events, the two essential energy components have distinctive interplays. Model studies on the alanine dipeptide and the aspartate-arginine peptide demonstrate sampling improvement over the original one-dimension-EESRW strategy; with the same biasing level, the present generalization allows more effective acceleration of the sampling of conformational transitions in aqueous solution. The 2D-EESRW generalization is readily extended to higher dimension schemes and employed in more advanced enhanced-sampling schemes, such as the recent orthogonal space random walk method.

  16. The coalescent of a sample from a binary branching process.

    PubMed

    Lambert, Amaury

    2018-04-25

    At time 0, start a time-continuous binary branching process, where particles give birth to a single particle independently (at a possibly time-dependent rate) and die independently (at a possibly time-dependent and age-dependent rate). A particular case is the classical birth-death process. Stop this process at time T>0. It is known that the tree spanned by the N tips alive at time T of the tree thus obtained (called a reduced tree or coalescent tree) is a coalescent point process (CPP), which basically means that the depths of interior nodes are independent and identically distributed (iid). Now select each of the N tips independently with probability y (Bernoulli sample). It is known that the tree generated by the selected tips, which we will call the Bernoulli sampled CPP, is again a CPP. Now instead, select exactly k tips uniformly at random among the N tips (a k-sample). We show that the tree generated by the selected tips is a mixture of Bernoulli sampled CPPs with the same parent CPP, over some explicit distribution of the sampling probability y. An immediate consequence is that the genealogy of a k-sample can be obtained by the realization of k random variables, first the random sampling probability Y and then the k-1 node depths which are iid conditional on Y=y. Copyright © 2018. Published by Elsevier Inc.

  17. Visualizing the Sample Standard Deviation

    ERIC Educational Resources Information Center

    Sarkar, Jyotirmoy; Rashid, Mamunur

    2017-01-01

    The standard deviation (SD) of a random sample is defined as the square-root of the sample variance, which is the "mean" squared deviation of the sample observations from the sample mean. Here, we interpret the sample SD as the square-root of twice the mean square of all pairwise half deviations between any two sample observations. This…

  18. Balancing a U-Shaped Assembly Line by Applying Nested Partitions Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhagwat, Nikhil V.

    2005-01-01

    In this study, we applied the Nested Partitions method to a U-line balancing problem and conducted experiments to evaluate the application. From the results, it is quite evident that the Nested Partitions method provided near optimal solutions (optimal in some cases). Besides, the execution time is quite short as compared to the Branch and Bound algorithm. However, for larger data sets, the algorithm took significantly longer times for execution. One of the reasons could be the way in which the random samples are generated. In the present study, a random sample is a solution in itself which requires assignment ofmore » tasks to various stations. The time taken to assign tasks to stations is directly proportional to the number of tasks. Thus, if the number of tasks increases, the time taken to generate random samples for the different regions also increases. The performance index for the Nested Partitions method in the present study was the number of stations in the random solutions (samples) generated. The total idle time for the samples can be used as another performance index. ULINO method is known to have used a combination of bounds to come up with good solutions. This approach of combining different performance indices can be used to evaluate the random samples and obtain even better solutions. Here, we used deterministic time values for the tasks. In industries where majority of tasks are performed manually, the stochastic version of the problem could be of vital importance. Experimenting with different objective functions (No. of stations was used in this study) could be of some significance to some industries where in the cost associated with creation of a new station is not the same. For such industries, the results obtained by using the present approach will not be of much value. Labor costs, task incompletion costs or a combination of those can be effectively used as alternate objective functions.« less

  19. Evaluating effectiveness of down-sampling for stratified designs and unbalanced prevalence in Random Forest models of tree species distributions in Nevada

    Treesearch

    Elizabeth A. Freeman; Gretchen G. Moisen; Tracy S. Frescino

    2012-01-01

    Random Forests is frequently used to model species distributions over large geographic areas. Complications arise when data used to train the models have been collected in stratified designs that involve different sampling intensity per stratum. The modeling process is further complicated if some of the target species are relatively rare on the landscape leading to an...

  20. From Planning to Implementation: An Examination of Changes in the Research Design, Sample Size, and Precision of Group Randomized Trials Launched by the Institute of Education Sciences

    ERIC Educational Resources Information Center

    Spybrook, Jessaca; Puente, Anne Cullen; Lininger, Monica

    2013-01-01

    This article examines changes in the research design, sample size, and precision between the planning phase and implementation phase of group randomized trials (GRTs) funded by the Institute of Education Sciences. Thirty-eight GRTs funded between 2002 and 2006 were examined. Three studies revealed changes in the experimental design. Ten studies…

  1. The Distribution and Abundance of Leaf Litter Arthropods in MOFEP Sites 1, 2, and 3

    Treesearch

    Jan Weaver; Sarah Heyman

    1997-01-01

    In June 1993, we collected 144 leaf litter samples from 36 plots (4 samples/plot) located in MOFEP forest sites 1, 2 and 3. Half of the plots were placed randomly on northeast-facing stands (ELT 18), and half randomly placed on southwest-facing stands (ELT 17). Arthropods were extracted using Tullgren funnels, and then sorted into morpho-species, counted, and measured...

  2. A new approach to evaluate gamma-ray measurements

    NASA Technical Reports Server (NTRS)

    Dejager, O. C.; Swanepoel, J. W. H.; Raubenheimer, B. C.; Vandervalt, D. J.

    1985-01-01

    Misunderstandings about the term random samples its implications may easily arise. Conditions under which the phases, obtained from arrival times, do not form a random sample and the dangers involved are discussed. Watson's U sup 2 test for uniformity is recommended for light curves with duty cycles larger than 10%. Under certain conditions, non-parametric density estimation may be used to determine estimates of the true light curve and its parameters.

  3. A Letter on Ancaiani et al. "Evaluating Scientific Research in Italy: The 2004-10 Research Evaluation Exercise"

    ERIC Educational Resources Information Center

    Baccini, Alberto; De Nicolao, Giuseppe

    2017-01-01

    This letter documents some problems in Ancaiani et al. (2015). Namely the evaluation of concordance, based on Cohen's kappa, reported by Ancaiani et al. was not computed on the whole random sample of 9,199 articles, but on a subset of 7,597 articles. The kappas relative to the whole random sample were in the range 0.07-0.15, indicating an…

  4. Computerized stratified random site-selection approaches for design of a ground-water-quality sampling network

    USGS Publications Warehouse

    Scott, J.C.

    1990-01-01

    Computer software was written to randomly select sites for a ground-water-quality sampling network. The software uses digital cartographic techniques and subroutines from a proprietary geographic information system. The report presents the approaches, computer software, and sample applications. It is often desirable to collect ground-water-quality samples from various areas in a study region that have different values of a spatial characteristic, such as land-use or hydrogeologic setting. A stratified network can be used for testing hypotheses about relations between spatial characteristics and water quality, or for calculating statistical descriptions of water-quality data that account for variations that correspond to the spatial characteristic. In the software described, a study region is subdivided into areal subsets that have a common spatial characteristic to stratify the population into several categories from which sampling sites are selected. Different numbers of sites may be selected from each category of areal subsets. A population of potential sampling sites may be defined by either specifying a fixed population of existing sites, or by preparing an equally spaced population of potential sites. In either case, each site is identified with a single category, depending on the value of the spatial characteristic of the areal subset in which the site is located. Sites are selected from one category at a time. One of two approaches may be used to select sites. Sites may be selected randomly, or the areal subsets in the category can be grouped into cells and sites selected randomly from each cell.

  5. Sample size re-estimation and other midcourse adjustments with sequential parallel comparison design.

    PubMed

    Silverman, Rachel K; Ivanova, Anastasia

    2017-01-01

    Sequential parallel comparison design (SPCD) was proposed to reduce placebo response in a randomized trial with placebo comparator. Subjects are randomized between placebo and drug in stage 1 of the trial, and then, placebo non-responders are re-randomized in stage 2. Efficacy analysis includes all data from stage 1 and all placebo non-responding subjects from stage 2. This article investigates the possibility to re-estimate the sample size and adjust the design parameters, allocation proportion to placebo in stage 1 of SPCD, and weight of stage 1 data in the overall efficacy test statistic during an interim analysis.

  6. Randomized Controlled Trial of a Preventive Intervention for Perinatal Depression in High-Risk Latinas

    ERIC Educational Resources Information Center

    Le, Huynh-Nhu; Perry, Deborah F.; Stuart, Elizabeth A.

    2011-01-01

    Objective: A randomized controlled trial was conducted to evaluate the efficacy of a cognitive-behavioral (CBT) intervention to prevent perinatal depression in high-risk Latinas. Method: A sample of 217 participants, predominantly low-income Central American immigrants who met demographic and depression risk criteria, were randomized into usual…

  7. Pigeons' Choices between Fixed-Interval and Random-Interval Schedules: Utility of Variability?

    ERIC Educational Resources Information Center

    Andrzejewski, Matthew E.; Cardinal, Claudia D.; Field, Douglas P.; Flannery, Barbara A.; Johnson, Michael; Bailey, Kathleen; Hineline, Philip N.

    2005-01-01

    Pigeons' choosing between fixed-interval and random-interval schedules of reinforcement was investigated in three experiments using a discrete-trial procedure. In all three experiments, the random-interval schedule was generated by sampling a probability distribution at an interval (and in multiples of the interval) equal to that of the…

  8. Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies

    PubMed Central

    Theis, Fabian J.

    2017-01-01

    Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on nonstratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different behaviors between the random forest and other classifiers. In conclusion, we provide guidance for choosing correction methods when training classifiers on biased samples. For random forests, our method outperforms state-of-the-art procedures if distribution assumptions are roughly fulfilled. We provide our implementation in the R package sambia. PMID:29312464

  9. Sample Size Calculations for Micro-randomized Trials in mHealth

    PubMed Central

    Liao, Peng; Klasnja, Predrag; Tewari, Ambuj; Murphy, Susan A.

    2015-01-01

    The use and development of mobile interventions are experiencing rapid growth. In “just-in-time” mobile interventions, treatments are provided via a mobile device and they are intended to help an individual make healthy decisions “in the moment,” and thus have a proximal, near future impact. Currently the development of mobile interventions is proceeding at a much faster pace than that of associated data science methods. A first step toward developing data-based methods is to provide an experimental design for testing the proximal effects of these just-in-time treatments. In this paper, we propose a “micro-randomized” trial design for this purpose. In a micro-randomized trial, treatments are sequentially randomized throughout the conduct of the study, with the result that each participant may be randomized at the 100s or 1000s of occasions at which a treatment might be provided. Further, we develop a test statistic for assessing the proximal effect of a treatment as well as an associated sample size calculator. We conduct simulation evaluations of the sample size calculator in various settings. Rules of thumb that might be used in designing a micro-randomized trial are discussed. This work is motivated by our collaboration on the HeartSteps mobile application designed to increase physical activity. PMID:26707831

  10. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    PubMed Central

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105

  11. The coverage of a random sample from a biological community.

    PubMed

    Engen, S

    1975-03-01

    A taxonomic group will frequently have a large number of species with small abundances. When a sample is drawn at random from this group, one is therefore faced with the problem that a large proportion of the species will not be discovered. A general definition of quantitative measures of "sample coverage" is proposed, and the problem of statistical inference is considered for two special cases, (1) the actual total relative abundance of those species that are represented in the sample, and (2) their relative contribution to the information index of diversity. The analysis is based on a extended version of the negative binomial species frequency model. The results are tabulated.

  12. Optimal sampling with prior information of the image geometry in microfluidic MRI.

    PubMed

    Han, S H; Cho, H; Paulsen, J L

    2015-03-01

    Recent advances in MRI acquisition for microscopic flows enable unprecedented sensitivity and speed in a portable NMR/MRI microfluidic analysis platform. However, the application of MRI to microfluidics usually suffers from prolonged acquisition times owing to the combination of the required high resolution and wide field of view necessary to resolve details within microfluidic channels. When prior knowledge of the image geometry is available as a binarized image, such as for microfluidic MRI, it is possible to reduce sampling requirements by incorporating this information into the reconstruction algorithm. The current approach to the design of the partial weighted random sampling schemes is to bias toward the high signal energy portions of the binarized image geometry after Fourier transformation (i.e. in its k-space representation). Although this sampling prescription is frequently effective, it can be far from optimal in certain limiting cases, such as for a 1D channel, or more generally yield inefficient sampling schemes at low degrees of sub-sampling. This work explores the tradeoff between signal acquisition and incoherent sampling on image reconstruction quality given prior knowledge of the image geometry for weighted random sampling schemes, finding that optimal distribution is not robustly determined by maximizing the acquired signal but from interpreting its marginal change with respect to the sub-sampling rate. We develop a corresponding sampling design methodology that deterministically yields a near optimal sampling distribution for image reconstructions incorporating knowledge of the image geometry. The technique robustly identifies optimal weighted random sampling schemes and provides improved reconstruction fidelity for multiple 1D and 2D images, when compared to prior techniques for sampling optimization given knowledge of the image geometry. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. A new mosaic method for three-dimensional surface

    NASA Astrophysics Data System (ADS)

    Yuan, Yun; Zhu, Zhaokun; Ding, Yongjun

    2011-08-01

    Three-dimensional (3-D) data mosaic is a indispensable link in surface measurement and digital terrain map generation. With respect to the mosaic problem of the local unorganized cloud points with rude registration and mass mismatched points, a new mosaic method for 3-D surface based on RANSAC is proposed. Every circular of this method is processed sequentially by random sample with additional shape constraint, data normalization of cloud points, absolute orientation, data denormalization of cloud points, inlier number statistic, etc. After N random sample trials the largest consensus set is selected, and at last the model is re-estimated using all the points in the selected subset. The minimal subset is composed of three non-colinear points which form a triangle. The shape of triangle is considered in random sample selection in order to make the sample selection reasonable. A new coordinate system transformation algorithm presented in this paper is used to avoid the singularity. The whole rotation transformation between the two coordinate systems can be solved by twice rotations expressed by Euler angle vector, each rotation has explicit physical means. Both simulation and real data are used to prove the correctness and validity of this mosaic method. This method has better noise immunity due to its robust estimation property, and has high accuracy as the shape constraint is added to random sample and the data normalization added to the absolute orientation. This method is applicable for high precision measurement of three-dimensional surface and also for the 3-D terrain mosaic.

  14. Is a 'convenience' sample useful for estimating immunization coverage in a small population?

    PubMed

    Weir, Jean E; Jones, Carrie

    2008-01-01

    Rapid survey methodologies are widely used for assessing immunization coverage in developing countries, approximating true stratified random sampling. Non-random ('convenience') sampling is not considered appropriate for estimating immunization coverage rates but has the advantages of low cost and expediency. We assessed the validity of a convenience sample of children presenting to a travelling clinic by comparing the coverage rate in the convenience sample to the true coverage established by surveying each child in three villages in rural Papua New Guinea. The rate of DTF immunization coverage as estimated by the convenience sample was within 10% of the true coverage when the proportion of children in the sample was two-thirds or when only children over the age of one year were counted, but differed by 11% when the sample included only 53% of the children and when all eligible children were included. The convenience sample may be sufficiently accurate for reporting purposes and is useful for identifying areas of low coverage.

  15. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis. PMID:20011037

  16. Random sampling of constrained phylogenies: conducting phylogenetic analyses when the phylogeny is partially known.

    PubMed

    Housworth, E A; Martins, E P

    2001-01-01

    Statistical randomization tests in evolutionary biology often require a set of random, computer-generated trees. For example, earlier studies have shown how large numbers of computer-generated trees can be used to conduct phylogenetic comparative analyses even when the phylogeny is uncertain or unknown. These methods were limited, however, in that (in the absence of molecular sequence or other data) they allowed users to assume that no phylogenetic information was available or that all possible trees were known. Intermediate situations where only a taxonomy or other limited phylogenetic information (e.g., polytomies) are available are technically more difficult. The current study describes a procedure for generating random samples of phylogenies while incorporating limited phylogenetic information (e.g., four taxa belong together in a subclade). The procedure can be used to conduct comparative analyses when the phylogeny is only partially resolved or can be used in other randomization tests in which large numbers of possible phylogenies are needed.

  17. Baseline Serum Estradiol and Fracture Reduction During Treatment With Hormone Therapy: The Women’s Health Initiative Randomized Trial

    PubMed Central

    Cauley, Jane A.; LaCroix, Andrea Z.; Robbins, John A.; Larson, Joseph; Wallace, Robert; Wactawski-Wende, Jean; Chen, Zhao; Bauer, Douglas C.; Cummings, Steven R.; Jackson, Rebecca

    2009-01-01

    Purpose To test the hypothesis that the reduction in fractures with hormone therapy (HT) is greater in women with lower estradiol levels. Methods We conducted a nested case-control study within the Women’s Health Initiative HT Trials. The sample included 231 hip fracture case-control pairs and a random sample of 519 all fracture case-control pairs. Cases and controls were matched for age, ethnicity, randomization date, fracture history and hysterectomy status. Hormones were measured prior to randomization. Incident cases of fracture identified over an average follow-up of 6.53 years. Results There was no evidence that the effect of HT on fracture differed by baseline estradiol (E2) or sex hormone binding globulin (SHBG). Across all quartiles of E2 and SHBG, women randomized to HT had about a 50% lower risk of fracture including hip fracture, compared to placebo. Conclusion The effect of HT on fracture reduction is independent of estradiol and SHBG levels. PMID:19436934

  18. Accelerating IMRT optimization by voxel sampling

    NASA Astrophysics Data System (ADS)

    Martin, Benjamin C.; Bortfeld, Thomas R.; Castañon, David A.

    2007-12-01

    This paper presents a new method for accelerating intensity-modulated radiation therapy (IMRT) optimization using voxel sampling. Rather than calculating the dose to the entire patient at each step in the optimization, the dose is only calculated for some randomly selected voxels. Those voxels are then used to calculate estimates of the objective and gradient which are used in a randomized version of a steepest descent algorithm. By selecting different voxels on each step, we are able to find an optimal solution to the full problem. We also present an algorithm to automatically choose the best sampling rate for each structure within the patient during the optimization. Seeking further improvements, we experimented with several other gradient-based optimization algorithms and found that the delta-bar-delta algorithm performs well despite the randomness. Overall, we were able to achieve approximately an order of magnitude speedup on our test case as compared to steepest descent.

  19. Demythologizing sex education in Oklahoma: an attitudinal study.

    PubMed

    Turner, N H

    1983-08-01

    A randomized study was conducted to determine the distribution of attitudes among Oklahomans of voting age toward sex education and to analyze the relationship of demographic, sociocultural, and attitudinal factors. The state was stratified into six regions. Forty-five percent of the sample lived in urban areas, and 55% in rural areas. Random digit dialing and random selection within households were utilized to ensure a representative sample of the population. Eighty percent of the sample was found to be favorable toward sex education in the public schools, while 20% was unfavorable. A majority of respondents in all religious groups including "fundamentalists" were favorable. Seventeen variables were found to be significant in the univariate analysis of the data; eight were not significant. In a multivariate analysis, three variables, age, Protestant denominational type and female employment, were shown to have predictive ability in determining favorability and unfavorability. Implications for building community support for sex education also are discussed.

  20. Variances and uncertainties of the sample laboratory-to-laboratory variance (S(L)2) and standard deviation (S(L)) associated with an interlaboratory study.

    PubMed

    McClure, Foster D; Lee, Jung K

    2012-01-01

    The validation process for an analytical method usually employs an interlaboratory study conducted as a balanced completely randomized model involving a specified number of randomly chosen laboratories, each analyzing a specified number of randomly allocated replicates. For such studies, formulas to obtain approximate unbiased estimates of the variance and uncertainty of the sample laboratory-to-laboratory (lab-to-lab) STD (S(L)) have been developed primarily to account for the uncertainty of S(L) when there is a need to develop an uncertainty budget that includes the uncertainty of S(L). For the sake of completeness on this topic, formulas to estimate the variance and uncertainty of the sample lab-to-lab variance (S(L)2) were also developed. In some cases, it was necessary to derive the formulas based on an approximate distribution for S(L)2.

  1. The Coalescent Process in Models with Selection

    PubMed Central

    Kaplan, N. L.; Darden, T.; Hudson, R. R.

    1988-01-01

    Statistical properties of the process describing the genealogical history of a random sample of genes are obtained for a class of population genetics models with selection. For models with selection, in contrast to models without selection, the distribution of this process, the coalescent process, depends on the distribution of the frequencies of alleles in the ancestral generations. If the ancestral frequency process can be approximated by a diffusion, then the mean and the variance of the number of segregating sites due to selectively neutral mutations in random samples can be numerically calculated. The calculations are greatly simplified if the frequencies of the alleles are tightly regulated. If the mutation rates between alleles maintained by balancing selection are low, then the number of selectively neutral segregating sites in a random sample of genes is expected to substantially exceed the number predicted under a neutral model. PMID:3066685

  2. Optimal random search for a single hidden target.

    PubMed

    Snider, Joseph

    2011-01-01

    A single target is hidden at a location chosen from a predetermined probability distribution. Then, a searcher must find a second probability distribution from which random search points are sampled such that the target is found in the minimum number of trials. Here it will be shown that if the searcher must get very close to the target to find it, then the best search distribution is proportional to the square root of the target distribution regardless of dimension. For a Gaussian target distribution, the optimum search distribution is approximately a Gaussian with a standard deviation that varies inversely with how close the searcher must be to the target to find it. For a network where the searcher randomly samples nodes and looks for the fixed target along edges, the optimum is either to sample a node with probability proportional to the square root of the out-degree plus 1 or not to do so at all.

  3. Reweighting of the primary sampling units in the National Automotive Sampling System

    DOT National Transportation Integrated Search

    1997-09-01

    The original design of hte National Automotive Sampling System - formerly the National Accident Sampling System - called for 75 PSUs randomly selected from PSUs which were grouped into various strata across the U.S. The implementation of the PSU samp...

  4. Does Fidelity of Implementation Account for Changes in Teacher-Child Interactions in a Randomized Controlled Trial of Banking Time?

    ERIC Educational Resources Information Center

    LoCasale-Crouch, Jennifer; Williford, Amanda; Whittaker, Jessica; DeCoster, Jamie; Alamos, Pilar

    2018-01-01

    This study examined fidelity of implementation in a randomized trial of Banking Time, a classroom-based intervention intended to improve children's behavior, specifically for those at risk for developing externalizing behavior problems, through improving the quality of teacher-child interactions. The study sample comes from a randomized controlled…

  5. Improving Classroom Learning Environments by Cultivating Awareness and Resilience in Education (CARE): Results of a Randomized Controlled Trial

    ERIC Educational Resources Information Center

    Jennings, Patricia A.; Frank, Jennifer L.; Snowberg, Karin E.; Coccia, Michael A.; Greenberg, Mark T.

    2013-01-01

    Cultivating Awareness and Resilience in Education (CARE for Teachers) is a mindfulness-based professional development program designed to reduce stress and improve teachers' performance and classroom learning environments. A randomized controlled trial examined program efficacy and acceptability among a sample of 50 teachers randomly assigned to…

  6. ENHANCEMENT OF LEARNING ON SAMPLE SIZE CALCULATION WITH A SMARTPHONE APPLICATION: A CLUSTER-RANDOMIZED CONTROLLED TRIAL.

    PubMed

    Ngamjarus, Chetta; Chongsuvivatwong, Virasakdi; McNeil, Edward; Holling, Heinz

    2017-01-01

    Sample size determination usually is taught based on theory and is difficult to understand. Using a smartphone application to teach sample size calculation ought to be more attractive to students than using lectures only. This study compared levels of understanding of sample size calculations for research studies between participants attending a lecture only versus lecture combined with using a smartphone application to calculate sample sizes, to explore factors affecting level of post-test score after training sample size calculation, and to investigate participants’ attitude toward a sample size application. A cluster-randomized controlled trial involving a number of health institutes in Thailand was carried out from October 2014 to March 2015. A total of 673 professional participants were enrolled and randomly allocated to one of two groups, namely, 341 participants in 10 workshops to control group and 332 participants in 9 workshops to intervention group. Lectures on sample size calculation were given in the control group, while lectures using a smartphone application were supplied to the test group. Participants in the intervention group had better learning of sample size calculation (2.7 points out of maximnum 10 points, 95% CI: 24 - 2.9) than the participants in the control group (1.6 points, 95% CI: 1.4 - 1.8). Participants doing research projects had a higher post-test score than those who did not have a plan to conduct research projects (0.9 point, 95% CI: 0.5 - 1.4). The majority of the participants had a positive attitude towards the use of smartphone application for learning sample size calculation.

  7. Optimizing Urine Processing Protocols for Protein and Metabolite Detection.

    PubMed

    Siddiqui, Nazema Y; DuBois, Laura G; St John-Williams, Lisa; Will, Thompson J; Grenier, Carole; Burke, Emily; Fraser, Matthew O; Amundsen, Cindy L; Murphy, Susan K

    In urine, factors such as timing of voids, and duration at room temperature (RT) may affect the quality of recovered protein and metabolite data. Additives may aid with detection, but can add more complexity in sample collection or analysis. We aimed to identify the optimal urine processing protocol for clinically-obtained urine samples that allows for the highest protein and metabolite yields with minimal degradation. Healthy women provided multiple urine samples during the same day. Women collected their first morning (1 st AM) void and another "random void". Random voids were aliquotted with: 1) no additive; 2) boric acid (BA); 3) protease inhibitor (PI); or 4) both BA + PI. Of these aliquots, some were immediately stored at 4°C, and some were left at RT for 4 hours. Proteins and individual metabolites were quantified, normalized to creatinine concentrations, and compared across processing conditions. Sample pools corresponding to each processing condition were analyzed using mass spectrometry to assess protein degradation. Ten Caucasian women between 35-65 years of age provided paired 1 st morning and random voided urine samples. Normalized protein concentrations were slightly higher in 1 st AM compared to random "spot" voids. The addition of BA did not significantly change proteins, while PI significantly improved normalized protein concentrations, regardless of whether samples were immediately cooled or left at RT for 4 hours. In pooled samples, there were minimal differences in protein degradation under the various conditions we tested. In metabolite analyses, there were significant differences in individual amino acids based on the timing of the void. For comparative translational research using urine, information about void timing should be collected and standardized. For urine samples processed in the same day, BA does not appear to be necessary while the addition of PI enhances protein yields, regardless of 4°C or RT storage temperature.

  8. Determination of reference values for elevated fasting and random insulin levels and their associations with metabolic risk factors among rural Pakistanis from Sindh Province.

    PubMed

    Ahmadani, Muhammad Yakoob; Hakeem, Rubina; Fawwad, Asher; Basit, Abdul; Shera, A Samad

    2008-06-01

    To assess insulin levels and their association with metabolic risk factors (family history of diabetes, abnormal glucose tolerance, hypertension, overweight and android obesity) among a representative group of Pakistan. The study data was taken from the database of a population-based survey conducted in Sindh Province, Pakistan, in 1994 to assess the prevalence of diabetes mellitus and impaired glucose tolerance (IGT). Through stratified random sampling; oral glucose tolerance tests were performed in 967 adults; every fifth sample was estimated for fasting and random (2-hour post-75 gm glucose load) insulin levels. The total number of metabolic risk factors was counted for each subject, and their association with insulin levels studied. Of the 130 subjects, 56.1% were females and 95.4% were Sindhi. The mean age of males and females was 43.84 and 40.61 years, respectively. Family history for diabetes and frequency of overweight had significant positive associations with both fasting and random insulin levels (P < 0.05). Association between hypertension and insulin levels was significant only for random insulin levels, and between android obesity, abnormal glucose tolerance, or male gender and insulin levels only for fasting insulin levels (P < 0.05). Metabolic risk factors had significant positive associations with both fasting (r = 0.351 P = 0.000) as well as random insulin levels (r = 0.364 P = 0.000). This paper provides baseline pioneering information applicable to the Pakistani population. Furthermore, the observations made in this study about differences in association of fasting or random insulin levels with various metabolic risk factors highlight the possibility of using either of them for risk assessment. This finding needs to be assessed in a larger and nationally representative sample.

  9. Cluster-randomized Studies in Educational Research: Principles and Methodological Aspects.

    PubMed

    Dreyhaupt, Jens; Mayer, Benjamin; Keis, Oliver; Öchsner, Wolfgang; Muche, Rainer

    2017-01-01

    An increasing number of studies are being performed in educational research to evaluate new teaching methods and approaches. These studies could be performed more efficiently and deliver more convincing results if they more strictly applied and complied with recognized standards of scientific studies. Such an approach could substantially increase the quality in particular of prospective, two-arm (intervention) studies that aim to compare two different teaching methods. A key standard in such studies is randomization, which can minimize systematic bias in study findings; such bias may result if the two study arms are not structurally equivalent. If possible, educational research studies should also achieve this standard, although this is not yet generally the case. Some difficulties and concerns exist, particularly regarding organizational and methodological aspects. An important point to consider in educational research studies is that usually individuals cannot be randomized, because of the teaching situation, and instead whole groups have to be randomized (so-called "cluster randomization"). Compared with studies with individual randomization, studies with cluster randomization normally require (significantly) larger sample sizes and more complex methods for calculating sample size. Furthermore, cluster-randomized studies require more complex methods for statistical analysis. The consequence of the above is that a competent expert with respective special knowledge needs to be involved in all phases of cluster-randomized studies. Studies to evaluate new teaching methods need to make greater use of randomization in order to achieve scientifically convincing results. Therefore, in this article we describe the general principles of cluster randomization and how to implement these principles, and we also outline practical aspects of using cluster randomization in prospective, two-arm comparative educational research studies.

  10. Network Sampling with Memory: A proposal for more efficient sampling from social networks.

    PubMed

    Mouw, Ted; Verdery, Ashton M

    2012-08-01

    Techniques for sampling from networks have grown into an important area of research across several fields. For sociologists, the possibility of sampling from a network is appealing for two reasons: (1) A network sample can yield substantively interesting data about network structures and social interactions, and (2) it is useful in situations where study populations are difficult or impossible to survey with traditional sampling approaches because of the lack of a sampling frame. Despite its appeal, methodological concerns about the precision and accuracy of network-based sampling methods remain. In particular, recent research has shown that sampling from a network using a random walk based approach such as Respondent Driven Sampling (RDS) can result in high design effects (DE)-the ratio of the sampling variance to the sampling variance of simple random sampling (SRS). A high design effect means that more cases must be collected to achieve the same level of precision as SRS. In this paper we propose an alternative strategy, Network Sampling with Memory (NSM), which collects network data from respondents in order to reduce design effects and, correspondingly, the number of interviews needed to achieve a given level of statistical power. NSM combines a "List" mode, where all individuals on the revealed network list are sampled with the same cumulative probability, with a "Search" mode, which gives priority to bridge nodes connecting the current sample to unexplored parts of the network. We test the relative efficiency of NSM compared to RDS and SRS on 162 school and university networks from Add Health and Facebook that range in size from 110 to 16,278 nodes. The results show that the average design effect for NSM on these 162 networks is 1.16, which is very close to the efficiency of a simple random sample (DE=1), and 98.5% lower than the average DE we observed for RDS.

  11. Network Sampling with Memory: A proposal for more efficient sampling from social networks

    PubMed Central

    Mouw, Ted; Verdery, Ashton M.

    2013-01-01

    Techniques for sampling from networks have grown into an important area of research across several fields. For sociologists, the possibility of sampling from a network is appealing for two reasons: (1) A network sample can yield substantively interesting data about network structures and social interactions, and (2) it is useful in situations where study populations are difficult or impossible to survey with traditional sampling approaches because of the lack of a sampling frame. Despite its appeal, methodological concerns about the precision and accuracy of network-based sampling methods remain. In particular, recent research has shown that sampling from a network using a random walk based approach such as Respondent Driven Sampling (RDS) can result in high design effects (DE)—the ratio of the sampling variance to the sampling variance of simple random sampling (SRS). A high design effect means that more cases must be collected to achieve the same level of precision as SRS. In this paper we propose an alternative strategy, Network Sampling with Memory (NSM), which collects network data from respondents in order to reduce design effects and, correspondingly, the number of interviews needed to achieve a given level of statistical power. NSM combines a “List” mode, where all individuals on the revealed network list are sampled with the same cumulative probability, with a “Search” mode, which gives priority to bridge nodes connecting the current sample to unexplored parts of the network. We test the relative efficiency of NSM compared to RDS and SRS on 162 school and university networks from Add Health and Facebook that range in size from 110 to 16,278 nodes. The results show that the average design effect for NSM on these 162 networks is 1.16, which is very close to the efficiency of a simple random sample (DE=1), and 98.5% lower than the average DE we observed for RDS. PMID:24159246

  12. Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats

    USGS Publications Warehouse

    Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.

    2012-01-01

    This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCP from moving boats from three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.

  13. Random Numbers and Monte Carlo Methods

    NASA Astrophysics Data System (ADS)

    Scherer, Philipp O. J.

    Many-body problems often involve the calculation of integrals of very high dimension which cannot be treated by standard methods. For the calculation of thermodynamic averages Monte Carlo methods are very useful which sample the integration volume at randomly chosen points. After summarizing some basic statistics, we discuss algorithms for the generation of pseudo-random numbers with given probability distribution which are essential for all Monte Carlo methods. We show how the efficiency of Monte Carlo integration can be improved by sampling preferentially the important configurations. Finally the famous Metropolis algorithm is applied to classical many-particle systems. Computer experiments visualize the central limit theorem and apply the Metropolis method to the traveling salesman problem.

  14. Fatigue crack growth model RANDOM2 user manual. Appendix 1: Development of advanced methodologies for probabilistic constitutive relationships of material strength models

    NASA Technical Reports Server (NTRS)

    Boyce, Lola; Lovelace, Thomas B.

    1989-01-01

    FORTRAN program RANDOM2 is presented in the form of a user's manual. RANDOM2 is based on fracture mechanics using a probabilistic fatigue crack growth model. It predicts the random lifetime of an engine component to reach a given crack size. Details of the theoretical background, input data instructions, and a sample problem illustrating the use of the program are included.

  15. Effects of unstratified and centre-stratified randomization in multi-centre clinical trials.

    PubMed

    Anisimov, Vladimir V

    2011-01-01

    This paper deals with the analysis of randomization effects in multi-centre clinical trials. The two randomization schemes most often used in clinical trials are considered: unstratified and centre-stratified block-permuted randomization. The prediction of the number of patients randomized to different treatment arms in different regions during the recruitment period accounting for the stochastic nature of the recruitment and effects of multiple centres is investigated. A new analytic approach using a Poisson-gamma patient recruitment model (patients arrive at different centres according to Poisson processes with rates sampled from a gamma distributed population) and its further extensions is proposed. Closed-form expressions for corresponding distributions of the predicted number of the patients randomized in different regions are derived. In the case of two treatments, the properties of the total imbalance in the number of patients on treatment arms caused by using centre-stratified randomization are investigated and for a large number of centres a normal approximation of imbalance is proved. The impact of imbalance on the power of the study is considered. It is shown that the loss of statistical power is practically negligible and can be compensated by a minor increase in sample size. The influence of patient dropout is also investigated. The impact of randomization on predicted drug supply overage is discussed. Copyright © 2010 John Wiley & Sons, Ltd.

  16. Underestimating extreme events in power-law behavior due to machine-dependent cutoffs

    NASA Astrophysics Data System (ADS)

    Radicchi, Filippo

    2014-11-01

    Power-law distributions are typical macroscopic features occurring in almost all complex systems observable in nature. As a result, researchers in quantitative analyses must often generate random synthetic variates obeying power-law distributions. The task is usually performed through standard methods that map uniform random variates into the desired probability space. Whereas all these algorithms are theoretically solid, in this paper we show that they are subject to severe machine-dependent limitations. As a result, two dramatic consequences arise: (i) the sampling in the tail of the distribution is not random but deterministic; (ii) the moments of the sample distribution, which are theoretically expected to diverge as functions of the sample sizes, converge instead to finite values. We provide quantitative indications for the range of distribution parameters that can be safely handled by standard libraries used in computational analyses. Whereas our findings indicate possible reinterpretations of numerical results obtained through flawed sampling methodologies, they also pave the way for the search for a concrete solution to this central issue shared by all quantitative sciences dealing with complexity.

  17. An evaluation of flow-stratified sampling for estimating suspended sediment loads

    Treesearch

    Robert B. Thomas; Jack Lewis

    1995-01-01

    Abstract - Flow-stratified sampling is a new method for sampling water quality constituents such as suspended sediment to estimate loads. As with selection-at-list-time (SALT) and time-stratified sampling, flow-stratified sampling is a statistical method requiring random sampling, and yielding unbiased estimates of load and variance. It can be used to estimate event...

  18. Toward cost-efficient sampling methods

    NASA Astrophysics Data System (ADS)

    Luo, Peng; Li, Yongli; Wu, Chong; Zhang, Guijie

    2015-09-01

    The sampling method has been paid much attention in the field of complex network in general and statistical physics in particular. This paper proposes two new sampling methods based on the idea that a small part of vertices with high node degree could possess the most structure information of a complex network. The two proposed sampling methods are efficient in sampling high degree nodes so that they would be useful even if the sampling rate is low, which means cost-efficient. The first new sampling method is developed on the basis of the widely used stratified random sampling (SRS) method and the second one improves the famous snowball sampling (SBS) method. In order to demonstrate the validity and accuracy of two new sampling methods, we compare them with the existing sampling methods in three commonly used simulation networks that are scale-free network, random network, small-world network, and also in two real networks. The experimental results illustrate that the two proposed sampling methods perform much better than the existing sampling methods in terms of achieving the true network structure characteristics reflected by clustering coefficient, Bonacich centrality and average path length, especially when the sampling rate is low.

  19. Sampled-Data Consensus of Linear Multi-agent Systems With Packet Losses.

    PubMed

    Zhang, Wenbing; Tang, Yang; Huang, Tingwen; Kurths, Jurgen

    In this paper, the consensus problem is studied for a class of multi-agent systems with sampled data and packet losses, where random and deterministic packet losses are considered, respectively. For random packet losses, a Bernoulli-distributed white sequence is used to describe packet dropouts among agents in a stochastic way. For deterministic packet losses, a switched system with stable and unstable subsystems is employed to model packet dropouts in a deterministic way. The purpose of this paper is to derive consensus criteria, such that linear multi-agent systems with sampled-data and packet losses can reach consensus. By means of the Lyapunov function approach and the decomposition method, the design problem of a distributed controller is solved in terms of convex optimization. The interplay among the allowable bound of the sampling interval, the probability of random packet losses, and the rate of deterministic packet losses are explicitly derived to characterize consensus conditions. The obtained criteria are closely related to the maximum eigenvalue of the Laplacian matrix versus the second minimum eigenvalue of the Laplacian matrix, which reveals the intrinsic effect of communication topologies on consensus performance. Finally, simulations are given to show the effectiveness of the proposed results.In this paper, the consensus problem is studied for a class of multi-agent systems with sampled data and packet losses, where random and deterministic packet losses are considered, respectively. For random packet losses, a Bernoulli-distributed white sequence is used to describe packet dropouts among agents in a stochastic way. For deterministic packet losses, a switched system with stable and unstable subsystems is employed to model packet dropouts in a deterministic way. The purpose of this paper is to derive consensus criteria, such that linear multi-agent systems with sampled-data and packet losses can reach consensus. By means of the Lyapunov function approach and the decomposition method, the design problem of a distributed controller is solved in terms of convex optimization. The interplay among the allowable bound of the sampling interval, the probability of random packet losses, and the rate of deterministic packet losses are explicitly derived to characterize consensus conditions. The obtained criteria are closely related to the maximum eigenvalue of the Laplacian matrix versus the second minimum eigenvalue of the Laplacian matrix, which reveals the intrinsic effect of communication topologies on consensus performance. Finally, simulations are given to show the effectiveness of the proposed results.

  20. Characterization of Friction Joints Subjected to High Levels of Random Vibration

    NASA Technical Reports Server (NTRS)

    deSantos, Omar; MacNeal, Paul

    2012-01-01

    This paper describes the test program in detail including test sample description, test procedures, and vibration test results of multiple test samples. The material pairs used in the experiment were Aluminum-Aluminum, Aluminum- Dicronite coated Aluminum, and Aluminum-Plasmadize coated Aluminum. Levels of vibration for each set of twelve samples of each material pairing were gradually increased until all samples experienced substantial displacement. Data was collected on 1) acceleration in all three axes, 2) relative static displacement between vibration runs utilizing photogrammetry techniques, and 3) surface galling and contaminant generation. This data was used to estimate the values of static friction during random vibratory motion when "stick-slip" occurs and compare these to static friction coefficients measured before and after vibration testing.

  1. Reproduction and optical analysis of Morpho-inspired polymeric nanostructures

    NASA Astrophysics Data System (ADS)

    Tippets, Cary A.; Fu, Yulan; Jackson, Anne-Martine; Donev, Eugenii U.; Lopez, Rene

    2016-06-01

    The brilliant blue coloration of the Morpho rhetenor butterfly originates from complex nanostructures found on the surface of its wings. The Morpho butterfly exhibits strong short-wavelength reflection and a unique two-lobe optical signature in the incident (θ) and reflected (ϕ) angular space. Here, we report the large-area fabrication of a Morpho-like structure and its reproduction in perfluoropolyether. Reflection comparisons of periodic and quasi-random ‘polymer butterfly’ nanostructures show similar normal-incidence spectra but differ in the angular θ-ϕ dependence. The periodic sample shows strong specular reflection and simple diffraction. However, the quasi-random sample produces a two-lobe angular reflection pattern with minimal specular refection, approximating the real butterfly’s optical behavior. Finite-difference time-domain simulations confirm that this pattern results from the quasi-random periodicity and highlights the significance of the inherent randomness in the Morpho’s photonic structure.

  2. Random Effects: Variance Is the Spice of Life.

    PubMed

    Jupiter, Daniel C

    Covariates in regression analyses allow us to understand how independent variables of interest impact our dependent outcome variable. Often, we consider fixed effects covariates (e.g., gender or diabetes status) for which we examine subjects at each value of the covariate. We examine both men and women and, within each gender, examine both diabetic and nondiabetic patients. Occasionally, however, we consider random effects covariates for which we do not examine subjects at every value. For example, we examine patients from only a sample of hospitals and, within each hospital, examine both diabetic and nondiabetic patients. The random sampling of hospitals is in contrast to the complete coverage of all genders. In this column I explore the differences in meaning and analysis when thinking about fixed and random effects variables. Copyright © 2016 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  3. CIDR

    Science.gov Websites

    * Minimum # Experimental Samples DNA Volume (ul) Genomic DNA Concentration (ng/ul) Low Input DNA Volume (ul . **Please inquire about additional cost for low input option. Genotyping Minimum # Experimental Samples DNA sample quality. If you do submit WGA samples, you should anticipate a higher non-random missing data rate

  4. CTEPP STANDARD OPERATING PROCEDURE FOR TELEPHONE SAMPLE SUBJECTS RECRUITMENT (SOP-1.12)

    EPA Science Inventory

    The subject recruitment procedures for the telephone sample component are described in the SOP. A random telephone sample list is ordered from a commercial survey sampling firm. Using this list, introductory letters are sent to targeted homes prior to making initial telephone c...

  5. Uncertainty in monitoring E. coli concentrations in streams and stormwater runoff

    NASA Astrophysics Data System (ADS)

    Harmel, R. D.; Hathaway, J. M.; Wagner, K. L.; Wolfe, J. E.; Karthikeyan, R.; Francesconi, W.; McCarthy, D. T.

    2016-03-01

    Microbial contamination of surface waters, a substantial public health concern throughout the world, is typically identified by fecal indicator bacteria such as Escherichia coli. Thus, monitoring E. coli concentrations is critical to evaluate current conditions, determine restoration effectiveness, and inform model development and calibration. An often overlooked component of these monitoring and modeling activities is understanding the inherent random and systematic uncertainty present in measured data. In this research, a review and subsequent analysis was performed to identify, document, and analyze measurement uncertainty of E. coli data collected in stream flow and stormwater runoff as individual discrete samples or throughout a single runoff event. Data on the uncertainty contributed by sample collection, sample preservation/storage, and laboratory analysis in measured E. coli concentrations were compiled and analyzed, and differences in sampling method and data quality scenarios were compared. The analysis showed that: (1) manual integrated sampling produced the lowest random and systematic uncertainty in individual samples, but automated sampling typically produced the lowest uncertainty when sampling throughout runoff events; (2) sample collection procedures often contributed the highest amount of uncertainty, although laboratory analysis introduced substantial random uncertainty and preservation/storage introduced substantial systematic uncertainty under some scenarios; and (3) the uncertainty in measured E. coli concentrations was greater than that of sediment and nutrients, but the difference was not as great as may be assumed. This comprehensive analysis of uncertainty in E. coli concentrations measured in streamflow and runoff should provide valuable insight for designing E. coli monitoring projects, reducing uncertainty in quality assurance efforts, regulatory and policy decision making, and fate and transport modeling.

  6. Data-Division-Specific Robustness and Power of Randomization Tests for ABAB Designs

    ERIC Educational Resources Information Center

    Manolov, Rumen; Solanas, Antonio; Bulte, Isis; Onghena, Patrick

    2010-01-01

    This study deals with the statistical properties of a randomization test applied to an ABAB design in cases where the desirable random assignment of the points of change in phase is not possible. To obtain information about each possible data division, the authors carried out a conditional Monte Carlo simulation with 100,000 samples for each…

  7. A Facility Specialist Model for Improving Retention of Nursing Home Staff: Results from a Randomized, Controlled Study

    ERIC Educational Resources Information Center

    Pillemer, Karl; Meador, Rhoda; Henderson, Charles, Jr.; Robison, Julie; Hegeman, Carol; Graham, Edwin; Schultz, Leslie

    2008-01-01

    Purpose: This article reports on a randomized, controlled intervention study designed to reduce employee turnover by creating a retention specialist position in nursing homes. Design and Methods: We collected data three times over a 1-year period in 30 nursing homes, sampled in stratified random manner from facilities in New York State and…

  8. SMERFS: Stochastic Markov Evaluation of Random Fields on the Sphere

    NASA Astrophysics Data System (ADS)

    Creasey, Peter; Lang, Annika

    2018-04-01

    SMERFS (Stochastic Markov Evaluation of Random Fields on the Sphere) creates large realizations of random fields on the sphere. It uses a fast algorithm based on Markov properties and fast Fourier Transforms in 1d that generates samples on an n X n grid in O(n2 log n) and efficiently derives the necessary conditional covariance matrices.

  9. MicroRNA array normalization: an evaluation using a randomized dataset as the benchmark.

    PubMed

    Qin, Li-Xuan; Zhou, Qin

    2014-01-01

    MicroRNA arrays possess a number of unique data features that challenge the assumption key to many normalization methods. We assessed the performance of existing normalization methods using two microRNA array datasets derived from the same set of tumor samples: one dataset was generated using a blocked randomization design when assigning arrays to samples and hence was free of confounding array effects; the second dataset was generated without blocking or randomization and exhibited array effects. The randomized dataset was assessed for differential expression between two tumor groups and treated as the benchmark. The non-randomized dataset was assessed for differential expression after normalization and compared against the benchmark. Normalization improved the true positive rate significantly in the non-randomized data but still possessed a false discovery rate as high as 50%. Adding a batch adjustment step before normalization further reduced the number of false positive markers while maintaining a similar number of true positive markers, which resulted in a false discovery rate of 32% to 48%, depending on the specific normalization method. We concluded the paper with some insights on possible causes of false discoveries to shed light on how to improve normalization for microRNA arrays.

  10. MicroRNA Array Normalization: An Evaluation Using a Randomized Dataset as the Benchmark

    PubMed Central

    Qin, Li-Xuan; Zhou, Qin

    2014-01-01

    MicroRNA arrays possess a number of unique data features that challenge the assumption key to many normalization methods. We assessed the performance of existing normalization methods using two microRNA array datasets derived from the same set of tumor samples: one dataset was generated using a blocked randomization design when assigning arrays to samples and hence was free of confounding array effects; the second dataset was generated without blocking or randomization and exhibited array effects. The randomized dataset was assessed for differential expression between two tumor groups and treated as the benchmark. The non-randomized dataset was assessed for differential expression after normalization and compared against the benchmark. Normalization improved the true positive rate significantly in the non-randomized data but still possessed a false discovery rate as high as 50%. Adding a batch adjustment step before normalization further reduced the number of false positive markers while maintaining a similar number of true positive markers, which resulted in a false discovery rate of 32% to 48%, depending on the specific normalization method. We concluded the paper with some insights on possible causes of false discoveries to shed light on how to improve normalization for microRNA arrays. PMID:24905456

  11. Using random telephone sampling to recruit generalizable samples for family violence studies.

    PubMed

    Slep, Amy M Smith; Heyman, Richard E; Williams, Mathew C; Van Dyke, Cheryl E; O'Leary, Susan G

    2006-12-01

    Convenience sampling methods predominate in recruiting for laboratory-based studies within clinical and family psychology. The authors used random digit dialing (RDD) to determine whether they could feasibly recruit generalizable samples for 2 studies (a parenting study and an intimate partner violence study). RDD screen response rate was 42-45%; demographics matched those in the 2000 U.S. Census, with small- to medium-sized differences on race, age, and income variables. RDD respondents who qualified for, but did not participate in, the laboratory study of parents showed small differences on income, couple conflicts, and corporal punishment. Time and cost are detailed, suggesting that RDD may be a feasible, effective method by which to recruit more generalizable samples for in-laboratory studies of family violence when those studies have sufficient resources. (c) 2006 APA, all rights reserved.

  12. Computational methods for efficient structural reliability and reliability sensitivity analysis

    NASA Technical Reports Server (NTRS)

    Wu, Y.-T.

    1993-01-01

    This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.

  13. Application of random effects to the study of resource selection by animals

    USGS Publications Warehouse

    Gillies, C.S.; Hebblewhite, M.; Nielsen, S.E.; Krawchuk, M.A.; Aldridge, Cameron L.; Frair, J.L.; Saher, D.J.; Stevens, C.E.; Jerde, C.L.

    2006-01-01

    1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence.2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability.3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed.4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects.5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection.6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.

  14. Application of random effects to the study of resource selection by animals.

    PubMed

    Gillies, Cameron S; Hebblewhite, Mark; Nielsen, Scott E; Krawchuk, Meg A; Aldridge, Cameron L; Frair, Jacqueline L; Saher, D Joanne; Stevens, Cameron E; Jerde, Christopher L

    2006-07-01

    1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence. 2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability. 3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed. 4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects. 5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection. 6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.

  15. Robust reliable sampled-data control for switched systems with application to flight control

    NASA Astrophysics Data System (ADS)

    Sakthivel, R.; Joby, Maya; Shi, P.; Mathiyalagan, K.

    2016-11-01

    This paper addresses the robust reliable stabilisation problem for a class of uncertain switched systems with random delays and norm bounded uncertainties. The main aim of this paper is to obtain the reliable robust sampled-data control design which involves random time delay with an appropriate gain control matrix for achieving the robust exponential stabilisation for uncertain switched system against actuator failures. In particular, the involved delays are assumed to be randomly time-varying which obeys certain mutually uncorrelated Bernoulli distributed white noise sequences. By constructing an appropriate Lyapunov-Krasovskii functional (LKF) and employing an average-dwell time approach, a new set of criteria is derived for ensuring the robust exponential stability of the closed-loop switched system. More precisely, the Schur complement and Jensen's integral inequality are used in derivation of stabilisation criteria. By considering the relationship among the random time-varying delay and its lower and upper bounds, a new set of sufficient condition is established for the existence of reliable robust sampled-data control in terms of solution to linear matrix inequalities (LMIs). Finally, an illustrative example based on the F-18 aircraft model is provided to show the effectiveness of the proposed design procedures.

  16. Theory and implementation of a very high throughput true random number generator in field programmable gate array

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Yonggang, E-mail: wangyg@ustc.edu.cn; Hui, Cong; Liu, Chong

    The contribution of this paper is proposing a new entropy extraction mechanism based on sampling phase jitter in ring oscillators to make a high throughput true random number generator in a field programmable gate array (FPGA) practical. Starting from experimental observation and analysis of the entropy source in FPGA, a multi-phase sampling method is exploited to harvest the clock jitter with a maximum entropy and fast sampling speed. This parametrized design is implemented in a Xilinx Artix-7 FPGA, where the carry chains in the FPGA are explored to realize the precise phase shifting. The generator circuit is simple and resource-saving,more » so that multiple generation channels can run in parallel to scale the output throughput for specific applications. The prototype integrates 64 circuit units in the FPGA to provide a total output throughput of 7.68 Gbps, which meets the requirement of current high-speed quantum key distribution systems. The randomness evaluation, as well as its robustness to ambient temperature, confirms that the new method in a purely digital fashion can provide high-speed high-quality random bit sequences for a variety of embedded applications.« less

  17. Theory and implementation of a very high throughput true random number generator in field programmable gate array.

    PubMed

    Wang, Yonggang; Hui, Cong; Liu, Chong; Xu, Chao

    2016-04-01

    The contribution of this paper is proposing a new entropy extraction mechanism based on sampling phase jitter in ring oscillators to make a high throughput true random number generator in a field programmable gate array (FPGA) practical. Starting from experimental observation and analysis of the entropy source in FPGA, a multi-phase sampling method is exploited to harvest the clock jitter with a maximum entropy and fast sampling speed. This parametrized design is implemented in a Xilinx Artix-7 FPGA, where the carry chains in the FPGA are explored to realize the precise phase shifting. The generator circuit is simple and resource-saving, so that multiple generation channels can run in parallel to scale the output throughput for specific applications. The prototype integrates 64 circuit units in the FPGA to provide a total output throughput of 7.68 Gbps, which meets the requirement of current high-speed quantum key distribution systems. The randomness evaluation, as well as its robustness to ambient temperature, confirms that the new method in a purely digital fashion can provide high-speed high-quality random bit sequences for a variety of embedded applications.

  18. AUTOCLASSIFICATION OF THE VARIABLE 3XMM SOURCES USING THE RANDOM FOREST MACHINE LEARNING ALGORITHM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Farrell, Sean A.; Murphy, Tara; Lo, Kitty K., E-mail: s.farrell@physics.usyd.edu.au

    In the current era of large surveys and massive data sets, autoclassification of astrophysical sources using intelligent algorithms is becoming increasingly important. In this paper we present the catalog of variable sources in the Third XMM-Newton Serendipitous Source catalog (3XMM) autoclassified using the Random Forest machine learning algorithm. We used a sample of manually classified variable sources from the second data release of the XMM-Newton catalogs (2XMMi-DR2) to train the classifier, obtaining an accuracy of ∼92%. We also evaluated the effectiveness of identifying spurious detections using a sample of spurious sources, achieving an accuracy of ∼95%. Manual investigation of amore » random sample of classified sources confirmed these accuracy levels and showed that the Random Forest machine learning algorithm is highly effective at automatically classifying 3XMM sources. Here we present the catalog of classified 3XMM variable sources. We also present three previously unidentified unusual sources that were flagged as outlier sources by the algorithm: a new candidate supergiant fast X-ray transient, a 400 s X-ray pulsar, and an eclipsing 5 hr binary system coincident with a known Cepheid.« less

  19. Limits to randomness in paleobiologic models: the case of Phanerozoic species diversity

    NASA Technical Reports Server (NTRS)

    Sepkoski, J. J. Jr; Sepkoski JJ, J. r. (Principal Investigator)

    1994-01-01

    The question of how random, or unconstrained, paleobiologic models should be is examined with a case study: Signor's (1982, 1985) inverse calculation of levels of marine species diversity through the Phanerozoic. His calculation involved an ingenious model that estimated species numbers and species abundances in the world oceans of the past by correcting known numbers of fossil species for variations in sedimentary rocks available for sampling and in effort paleontologists might devote to sampling. The model proves robust to changes in possible shapes of species-abundance distributions, but it is sensitive to alterations in the assumption that paleontologists collect fossils at random. If it is assumed that ease of collecting varies with age of sediment (with the Cenozoic offering easy sampling) or that paleontologists tend to seek out rarer fossils, results of the inverse calculation change. In particular, the magnitude of the calculated Cenozoic diversity increase always declines from the factor of about seven as originally reported to something considerably smaller. This leaves open the problem of the magnitude of Cenozoic increase in marine species diversity, awaiting better empirical data and, perhaps, more exacting models, random or otherwise.

  20. Study Design Rigor in Animal-Experimental Research Published in Anesthesia Journals.

    PubMed

    Hoerauf, Janine M; Moss, Angela F; Fernandez-Bustamante, Ana; Bartels, Karsten

    2018-01-01

    Lack of reproducibility of preclinical studies has been identified as an impediment for translation of basic mechanistic research into effective clinical therapies. Indeed, the National Institutes of Health has revised its grant application process to require more rigorous study design, including sample size calculations, blinding procedures, and randomization steps. We hypothesized that the reporting of such metrics of study design rigor has increased over time for animal-experimental research published in anesthesia journals. PubMed was searched for animal-experimental studies published in 2005, 2010, and 2015 in primarily English-language anesthesia journals. A total of 1466 publications were graded on the performance of sample size estimation, randomization, and blinding. Cochran-Armitage test was used to assess linear trends over time for the primary outcome of whether or not a metric was reported. Interrater agreement for each of the 3 metrics (power, randomization, and blinding) was assessed using the weighted κ coefficient in a 10% random sample of articles rerated by a second investigator blinded to the ratings of the first investigator. A total of 1466 manuscripts were analyzed. Reporting for all 3 metrics of experimental design rigor increased over time (2005 to 2010 to 2015): for power analysis, from 5% (27/516), to 12% (59/485), to 17% (77/465); for randomization, from 41% (213/516), to 50% (243/485), to 54% (253/465); and for blinding, from 26% (135/516), to 38% (186/485), to 47% (217/465). The weighted κ coefficients and 98.3% confidence interval indicate almost perfect agreement between the 2 raters beyond that which occurs by chance alone (power, 0.93 [0.85, 1.0], randomization, 0.91 [0.85, 0.98], and blinding, 0.90 [0.84, 0.96]). Our hypothesis that reported metrics of rigor in animal-experimental studies in anesthesia journals have increased during the past decade was confirmed. More consistent reporting, or explicit justification for absence, of sample size calculations, blinding techniques, and randomization procedures could better enable readers to evaluate potential sources of bias in animal-experimental research manuscripts. Future studies should assess whether such steps lead to improved translation of animal-experimental anesthesia research into successful clinical trials.

  1. Wave Propagation inside Random Media

    NASA Astrophysics Data System (ADS)

    Cheng, Xiaojun

    This thesis presents results of studies of wave scattering within and transmission through random and periodic systems. The main focus is on energy profiles inside quasi-1D and 1D random media. The connection between transport and the states of the medium is manifested in the equivalence of the dimensionless conductance, g, and the Thouless number which is the ratio of the average linewidth and spacing of energy levels. This equivalence and theories regarding the energy profiles inside random media are based on the assumption that LDOS is uniform throughout the samples. We have conducted microwave measurements of the longitudinal energy profiles within disordered samples contained in a copper tube supporting multiple waveguide channels with an antenna moving along a slit on the tube. These measurements allow us to determine the local density of states (LDOS) at a location which is the sum of energy from all incoming channels on both sides. For diffusive samples, the LDOS is uniform and the energy profile decays linearly as expected. However, for localized samples, we find that the LDOS drops sharply towards the middle of the sample and the energy profile does not follow the result of the local diffusion theory where the LDOS is assumed to be uniform. We analyze the field spectra into quasi-normal modes and found that the mode linewidth and the number of modes saturates as the sample length increases. Thus the Thouless number saturates while the dimensionless conductance g continues to fall with increasing length, indicating that the modes are localized near the boundaries. This is in contrast to the general believing that g and Thouless number follow the same scaling behavior. Previous measurements show that single parameter scaling (SPS) still holds in the same sample where the LDOS is suppressed te{shi2014microwave}. We explore the extension of SPS to the interior of the sample by analyzing statistics of the logrithm of the energy density ln W(x) and found that =-x/l where l is the transport mean free path. The result does not depend on the sample length, which is counterintuitive yet remarkably simple. More supprisingly, the linear fall-off of energy profile holds for totally disordered random 1D layered samples in simulations where the LDOS is uniform as well as for single mode random waveguide experiments and 1D nearly periodic samples where the LDOS is suppressed in the middle of the sample. The generalization of the transmission matrix to the interior of quasi-1D random samples, which is defined as the field matrix, and its eigenvalues statistics are also discussed. The maximum energy deposition at a location is not the intensity of the first transmission eigenchannel but the eigenvalue of the first energy density eigenchannels at that cross section, which can be much greater than the average value. The contrast, which is the ratio of the intensity at the focused point to the background intensity, in optimal focusing is determined by the participation number of the energy density eigenvalues and its inverse gives the variance of the energy density at that cross section in a single configuration. We have also studied topological states in photonic structures. We have demonstrated robust propagation of electromagnetic waves along reconfigurable pathways within a topological photonic metacrystal. Since the wave is confined within the domain wall, which is the boundary between two distinct topological insulating systems, we can freely steer the wave by reconstructing the photonic structure. Other topics, such as speckle pattern evolutions and the effects of boundary conditions on the statistics of transmission eigenvalues and energy profiles are also discussed.

  2. Designing a national soil erosion monitoring network for England and Wales

    NASA Astrophysics Data System (ADS)

    Lark, Murray; Rawlins, Barry; Anderson, Karen; Evans, Martin; Farrow, Luke; Glendell, Miriam; James, Mike; Rickson, Jane; Quine, Timothy; Quinton, John; Brazier, Richard

    2014-05-01

    Although soil erosion is recognised as a significant threat to sustainable land use and may be a priority for action in any forthcoming EU Soil Framework Directive, those responsible for setting national policy with respect to erosion are constrained by a lack of robust, representative, data at large spatial scales. This reflects the process-orientated nature of much soil erosion research. Recognising this limitation, The UK Department for Environment, Food and Rural Affairs (Defra) established a project to pilot a cost-effective framework for monitoring of soil erosion in England and Wales (E&W). The pilot will compare different soil erosion monitoring methods at a site scale and provide statistical information for the final design of the full national monitoring network that will: provide unbiased estimates of the spatial mean of soil erosion rate across E&W (tonnes ha-1 yr-1) for each of three land-use classes - arable and horticultural grassland upland and semi-natural habitats quantify the uncertainty of these estimates with confidence intervals. Probability (design-based) sampling provides most efficient unbiased estimates of spatial means. In this study, a 16 hectare area (a square of 400 x 400 m) positioned at the centre of a 1-km grid cell, selected at random from mapped land use across E&W, provided the sampling support for measurement of erosion rates, with at least 94% of the support area corresponding to the target land use classes. Very small or zero erosion rates likely to be encountered at many sites reduce the sampling efficiency and make it difficult to compare different methods of soil erosion monitoring. Therefore, to increase the proportion of samples with larger erosion rates without biasing our estimates, we increased the inclusion probability density in areas where the erosion rate is likely to be large by using stratified random sampling. First, each sampling domain (land use class in E&W) was divided into strata; e.g. two sub-domains within which, respectively, small or no erosion rates, and moderate or larger erosion rates are expected. Each stratum was then sampled independently and at random. The sample density need not be equal in the two strata, but is known and is accounted for in the estimation of the mean and its standard error. To divide the domains into strata we used information on slope angle, previous interpretation of erosion susceptibility of the soil associations that correspond to the soil map of E&W at 1:250 000 (Soil Survey of England and Wales, 1983), and visual interpretation of evidence of erosion from aerial photography. While each domain could be stratified on the basis of the first two criteria, air photo interpretation across the whole country was not feasible. For this reason we used a two-phase random sampling for stratification (TPRS) design (de Gruijter et al., 2006). First, we formed an initial random sample of 1-km grid cells from the target domain. Second, each cell was then allocated to a stratum on the basis of the three criteria. A subset of the selected cells from each stratum were then selected for field survey at random, with a specified sampling density for each stratum so as to increase the proportion of cells where moderate or larger erosion rates were expected. Once measurements of erosion have been made, an estimate of the spatial mean of the erosion rate over the target domain, its standard error and associated uncertainty can be calculated by an expression which accounts for the estimated proportions of the two strata within the initial random sample. de Gruijter, J.J., Brus, D.J., Biekens, M.F.P. & Knotters, M. 2006. Sampling for Natural Resource Monitoring. Springer, Berlin. Soil Survey of England and Wales. 1983 National Soil Map NATMAP Vector 1:250,000. National Soil Research Institute, Cranfield University.

  3. Cluster-randomized Studies in Educational Research: Principles and Methodological Aspects

    PubMed Central

    Dreyhaupt, Jens; Mayer, Benjamin; Keis, Oliver; Öchsner, Wolfgang; Muche, Rainer

    2017-01-01

    An increasing number of studies are being performed in educational research to evaluate new teaching methods and approaches. These studies could be performed more efficiently and deliver more convincing results if they more strictly applied and complied with recognized standards of scientific studies. Such an approach could substantially increase the quality in particular of prospective, two-arm (intervention) studies that aim to compare two different teaching methods. A key standard in such studies is randomization, which can minimize systematic bias in study findings; such bias may result if the two study arms are not structurally equivalent. If possible, educational research studies should also achieve this standard, although this is not yet generally the case. Some difficulties and concerns exist, particularly regarding organizational and methodological aspects. An important point to consider in educational research studies is that usually individuals cannot be randomized, because of the teaching situation, and instead whole groups have to be randomized (so-called “cluster randomization”). Compared with studies with individual randomization, studies with cluster randomization normally require (significantly) larger sample sizes and more complex methods for calculating sample size. Furthermore, cluster-randomized studies require more complex methods for statistical analysis. The consequence of the above is that a competent expert with respective special knowledge needs to be involved in all phases of cluster-randomized studies. Studies to evaluate new teaching methods need to make greater use of randomization in order to achieve scientifically convincing results. Therefore, in this article we describe the general principles of cluster randomization and how to implement these principles, and we also outline practical aspects of using cluster randomization in prospective, two-arm comparative educational research studies. PMID:28584874

  4. Field-based random sampling without a sampling frame: control selection for a case-control study in rural Africa.

    PubMed

    Crampin, A C; Mwinuka, V; Malema, S S; Glynn, J R; Fine, P E

    2001-01-01

    Selection bias, particularly of controls, is common in case-control studies and may materially affect the results. Methods of control selection should be tailored both for the risk factors and disease under investigation and for the population being studied. We present here a control selection method devised for a case-control study of tuberculosis in rural Africa (Karonga, northern Malawi) that selects an age/sex frequency-matched random sample of the population, with a geographical distribution in proportion to the population density. We also present an audit of the selection process, and discuss the potential of this method in other settings.

  5. Synchronous scattering and diffraction from gold nanotextured surfaces with structure factors

    NASA Astrophysics Data System (ADS)

    Gu, Min-Jhong; Lee, Ming-Tsang; Huang, Chien-Hsun; Wu, Chi-Chun; Chen, Yu-Bin

    2018-05-01

    Synchronous scattering and diffraction were demonstrated using reflectance from gold nanotextured surfaces at oblique (θi = 15° and 60°) incidence of wavelength λ = 405 nm. Two samples of unique auto-correlation functions were cost-effectively fabricated. Multiple structure factors of their profiles were confirmed with Fourier expansions. Bi-directional reflectance function (BRDF) from these samples provided experimental proofs. On the other hand, standard deviation of height and unique auto-correlation function of each sample were used to generate surfaces numerically. Comparing their BRDF with those of totally random rough surfaces further suggested that structure factors in profile could reduce specular reflection more than totally random roughness.

  6. Porphyrins - urine test

    MedlinePlus

    Urine uroporphyrin; Urine coproporphyrin; Porphyria - uroporphyrin ... After you provide a urine sample, it is tested in the lab. This is called a random urine sample. If needed, your health care provider ...

  7. A primer on stand and forest inventory designs

    Treesearch

    H. Gyde Lund; Charles E. Thomas

    1989-01-01

    Covers designs for the inventory of stands and forests in detail and with worked-out examples. For stands, random sampling, line transects, ricochet plot, systematic sampling, single plot, cluster, subjective sampling and complete enumeration are discussed. For forests inventory, the main categories are subjective sampling, inventories without prior stand mapping,...

  8. Accuracy Sampling Design Bias on Coarse Spatial Resolution Land Cover Data in the Great Lakes Region (United States and Canada)

    EPA Science Inventory

    A number of articles have investigated the impact of sampling design on remotely sensed landcover accuracy estimates. Gong and Howarth (1990) found significant differences for Kappa accuracy values when comparing purepixel sampling, stratified random sampling, and stratified sys...

  9. Using maximum entropy modeling for optimal selection of sampling sites for monitoring networks

    USGS Publications Warehouse

    Stohlgren, Thomas J.; Kumar, Sunil; Barnett, David T.; Evangelista, Paul H.

    2011-01-01

    Environmental monitoring programs must efficiently describe state shifts. We propose using maximum entropy modeling to select dissimilar sampling sites to capture environmental variability at low cost, and demonstrate a specific application: sample site selection for the Central Plains domain (453,490 km2) of the National Ecological Observatory Network (NEON). We relied on four environmental factors: mean annual temperature and precipitation, elevation, and vegetation type. A “sample site” was defined as a 20 km × 20 km area (equal to NEON’s airborne observation platform [AOP] footprint), within which each 1 km2 cell was evaluated for each environmental factor. After each model run, the most environmentally dissimilar site was selected from all potential sample sites. The iterative selection of eight sites captured approximately 80% of the environmental envelope of the domain, an improvement over stratified random sampling and simple random designs for sample site selection. This approach can be widely used for cost-efficient selection of survey and monitoring sites.

  10. Hierarchical model analysis of the Atlantic Flyway Breeding Waterfowl Survey

    USGS Publications Warehouse

    Sauer, John R.; Zimmerman, Guthrie S.; Klimstra, Jon D.; Link, William A.

    2014-01-01

    We used log-linear hierarchical models to analyze data from the Atlantic Flyway Breeding Waterfowl Survey. The survey has been conducted by state biologists each year since 1989 in the northeastern United States from Virginia north to New Hampshire and Vermont. Although yearly population estimates from the survey are used by the United States Fish and Wildlife Service for estimating regional waterfowl population status for mallards (Anas platyrhynchos), black ducks (Anas rubripes), wood ducks (Aix sponsa), and Canada geese (Branta canadensis), they are not routinely adjusted to control for time of day effects and other survey design issues. The hierarchical model analysis permits estimation of year effects and population change while accommodating the repeated sampling of plots and controlling for time of day effects in counting. We compared population estimates from the current stratified random sample analysis to population estimates from hierarchical models with alternative model structures that describe year to year changes as random year effects, a trend with random year effects, or year effects modeled as 1-year differences. Patterns of population change from the hierarchical model results generally were similar to the patterns described by stratified random sample estimates, but significant visibility differences occurred between twilight to midday counts in all species. Controlling for the effects of time of day resulted in larger population estimates for all species in the hierarchical model analysis relative to the stratified random sample analysis. The hierarchical models also provided a convenient means of estimating population trend as derived statistics from the analysis. We detected significant declines in mallard and American black ducks and significant increases in wood ducks and Canada geese, a trend that had not been significant for 3 of these 4 species in the prior analysis. We recommend using hierarchical models for analysis of the Atlantic Flyway Breeding Waterfowl Survey.

  11. Early morning urine collection to improve urinary lateral flow LAM assay sensitivity in hospitalised patients with HIV-TB co-infection.

    PubMed

    Gina, Phindile; Randall, Philippa J; Muchinga, Tapuwa E; Pooran, Anil; Meldau, Richard; Peter, Jonny G; Dheda, Keertan

    2017-05-12

    Urine LAM testing has been approved by the WHO for use in hospitalised patients with advanced immunosuppression. However, sensitivity remains suboptimal. We therefore examined the incremental diagnostic sensitivity of early morning urine (EMU) versus random urine sampling using the Determine® lateral flow lipoarabinomannan assay (LF-LAM) in HIV-TB co-infected patients. Consenting HIV-infected inpatients, screened as part of a larger prospective randomized controlled trial, that were treated for TB, and could donate matched random and EMU samples were included. Thus paired sample were collected from the same patient, LF-LAM was graded using the pre-January 2014, with grade 1 and 2 manufacturer-designated cut-points (the latter designated grade 1 after January 2014). Single sputum Xpert-MTB/RIF and/or TB culture positivity served as the reference standard (definite TB). Those treated for TB but not meeting this standard were designated probable TB. 123 HIV-infected patients commenced anti-TB treatment and provided matched random and EMU samples. 33% (41/123) and 67% (82/123) had definite and probable TB, respectively. Amongst those with definite TB LF-LAM sensitivity (95%CI), using the grade 2 cut-point, increased from 12% (5-24; 5/43) to 39% (26-54; 16/41) with random versus EMU, respectively (p = 0.005). Similarly, amongst probable TB, LF-LAM sensitivity increased from 10% (5-17; 8/83) to 24% (16-34; 20/82) (p = 0.001). LF-LAM specificity was not determined. This proof of concept study indicates that EMU could improve the sensitivity of LF-LAM in hospitalised TB-HIV co-infected patients. These data have implications for clinical practice.

  12. A comparison of two sampling approaches for assessing the urban forest canopy cover from aerial photography.

    Treesearch

    Ucar Zennure; Pete Bettinger; Krista Merry; Jacek Siry; J.M. Bowker

    2016-01-01

    Two different sampling approaches for estimating urban tree canopy cover were applied to two medium-sized cities in the United States, in conjunction with two freely available remotely sensed imagery products. A random point-based sampling approach, which involved 1000 sample points, was compared against a plot/grid sampling (cluster sampling) approach that involved a...

  13. Fatigue strength reduction model: RANDOM3 and RANDOM4 user manual, appendix 2

    NASA Technical Reports Server (NTRS)

    Boyce, Lola; Lovelace, Thomas B.

    1989-01-01

    The FORTRAN programs RANDOM3 and RANDOM4 are documented. They are based on fatigue strength reduction, using a probabilistic constitutive model. They predict the random lifetime of an engine component to reach a given fatigue strength. Included in this user manual are details regarding the theoretical backgrounds of RANDOM3 and RANDOM4. Appendix A gives information on the physical quantities, their symbols, FORTRAN names, and both SI and U.S. Customary units. Appendix B and C include photocopies of the actual computer printout corresponding to the sample problems. Appendices D and E detail the IMSL, Version 10(1), subroutines and functions called by RANDOM3 and RANDOM4 and SAS/GRAPH(2) programs that can be used to plot both the probability density functions (p.d.f.) and the cumulative distribution functions (c.d.f.).

  14. (I Can't Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research.

    PubMed

    van Rijnsoever, Frank J

    2017-01-01

    I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: "random chance," which is based on probability sampling, "minimal information," which yields at least one new code per sampling step, and "maximum information," which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario.

  15. Random Sampling with Interspike-Intervals of the Exponential Integrate and Fire Neuron: A Computational Interpretation of UP-States.

    PubMed

    Steimer, Andreas; Schindler, Kaspar

    2015-01-01

    Oscillations between high and low values of the membrane potential (UP and DOWN states respectively) are an ubiquitous feature of cortical neurons during slow wave sleep and anesthesia. Nevertheless, a surprisingly small number of quantitative studies have been conducted only that deal with this phenomenon's implications for computation. Here we present a novel theory that explains on a detailed mathematical level the computational benefits of UP states. The theory is based on random sampling by means of interspike intervals (ISIs) of the exponential integrate and fire (EIF) model neuron, such that each spike is considered a sample, whose analog value corresponds to the spike's preceding ISI. As we show, the EIF's exponential sodium current, that kicks in when balancing a noisy membrane potential around values close to the firing threshold, leads to a particularly simple, approximative relationship between the neuron's ISI distribution and input current. Approximation quality depends on the frequency spectrum of the current and is improved upon increasing the voltage baseline towards threshold. Thus, the conceptually simpler leaky integrate and fire neuron that is missing such an additional current boost performs consistently worse than the EIF and does not improve when voltage baseline is increased. For the EIF in contrast, the presented mechanism is particularly effective in the high-conductance regime, which is a hallmark feature of UP-states. Our theoretical results are confirmed by accompanying simulations, which were conducted for input currents of varying spectral composition. Moreover, we provide analytical estimations of the range of ISI distributions the EIF neuron can sample from at a given approximation level. Such samples may be considered by any algorithmic procedure that is based on random sampling, such as Markov Chain Monte Carlo or message-passing methods. Finally, we explain how spike-based random sampling relates to existing computational theories about UP states during slow wave sleep and present possible extensions of the model in the context of spike-frequency adaptation.

  16. A topological analysis of large-scale structure, studied using the CMASS sample of SDSS-III

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parihar, Prachi; Gott, J. Richard III; Vogeley, Michael S.

    2014-12-01

    We study the three-dimensional genus topology of large-scale structure using the northern region of the CMASS Data Release 10 (DR10) sample of the SDSS-III Baryon Oscillation Spectroscopic Survey. We select galaxies with redshift 0.452 < z < 0.625 and with a stellar mass M {sub stellar} > 10{sup 11.56} M {sub ☉}. We study the topology at two smoothing lengths: R {sub G} = 21 h {sup –1} Mpc and R {sub G} = 34 h {sup –1} Mpc. The genus topology studied at the R {sub G} = 21 h {sup –1} Mpc scale results in the highest genusmore » amplitude observed to date. The CMASS sample yields a genus curve that is characteristic of one produced by Gaussian random phase initial conditions. The data thus support the standard model of inflation where random quantum fluctuations in the early universe produced Gaussian random phase initial conditions. Modest deviations in the observed genus from random phase are as expected from shot noise effects and the nonlinear evolution of structure. We suggest the use of a fitting formula motivated by perturbation theory to characterize the shift and asymmetries in the observed genus curve with a single parameter. We construct 54 mock SDSS CMASS surveys along the past light cone from the Horizon Run 3 (HR3) N-body simulations, where gravitationally bound dark matter subhalos are identified as the sites of galaxy formation. We study the genus topology of the HR3 mock surveys with the same geometry and sampling density as the observational sample and find the observed genus topology to be consistent with ΛCDM as simulated by the HR3 mock samples. We conclude that the topology of the large-scale structure in the SDSS CMASS sample is consistent with cosmological models having primordial Gaussian density fluctuations growing in accordance with general relativity to form galaxies in massive dark matter halos.« less

  17. The impact of privacy protections on recruitment in a multicenter stroke genetics study

    PubMed Central

    Chen, D.T.; Worrall, B.B.; Brown, R.D.; Brott, T.G.; Kissela, B.M.; Olson, T.S.; Rich, S.S.; Meschia, J.F.

    2006-01-01

    The authors reviewed the recruitment of stroke-affected sibling pairs using a letter-based, proband-initiated contact strategy. The authors randomly sampled 99 proband enrollment forms (Phase 1) and randomly sampled 50 sibling reply cards (Phase 2). The sibling response rate was 30.6%, for a pedigree response rate of 58%. Of the siblings who replied, 96% authorized further contact. Median time from proband enrollment to pedigree DNA banking, which required 3+ probands, was 134 days. PMID:15728301

  18. The Quality of Reporting of Abstracts in Physical Therapy Literature is Suboptimal: Cross-Sectional, Bibliographic Analysis.

    PubMed

    Richter, Randy R; Sebelski, Chris A; Austin, Tricia M

    2016-09-01

    The quality of abstract reporting in physical therapy literature is unknown. The purpose of this study was to provide baseline data for judging the future impact of the 2010 Consolidated Standards of Reporting Trials statement specifically referencing the 2008 Consolidated Standards of Reporting Trials statement for reporting of abstracts of randomized controlled trials across and between a broad sample and a core sample of physical therapy literature. A cross-sectional, bibliographic analysis was conducted. Abstracts of randomized controlled trials from 2009 were retrieved from PubMed, PEDro, and CENTRAL. Eligibility was determined using PEDro criteria. For outcomes measures, items from the Consolidated Standards of Reporting Trials statement for abstract reporting were used for assessment. Raters were not blinded to citation details. Using a computer-generated set of random numbers, 150 abstracts from 112 journals comprised the broad sample. A total of 53 abstracts comprised the core sample. Fourteen of 20 Consolidated Standards of Reporting Trials items for both samples were reported in less than 50% of the abstracts. Significantly more abstracts in the core sample reported (% difference core - broad; 95% confidence interval) title (28.4%; 12.9%-41.2%), blinding (15.2%; 1.6%-29.8%), setting (47.6%; 32.4%-59.4%), and confidence intervals (13.1%; 5.0%-25.1%). These findings provide baseline data for determining if continuing efforts to improve abstract reporting are heeded.

  19. Simultaneous escaping of explicit and hidden free energy barriers: application of the orthogonal space random walk strategy in generalized ensemble based conformational sampling.

    PubMed

    Zheng, Lianqing; Chen, Mengen; Yang, Wei

    2009-06-21

    To overcome the pseudoergodicity problem, conformational sampling can be accelerated via generalized ensemble methods, e.g., through the realization of random walks along prechosen collective variables, such as spatial order parameters, energy scaling parameters, or even system temperatures or pressures, etc. As usually observed, in generalized ensemble simulations, hidden barriers are likely to exist in the space perpendicular to the collective variable direction and these residual free energy barriers could greatly abolish the sampling efficiency. This sampling issue is particularly severe when the collective variable is defined in a low-dimension subset of the target system; then the "Hamiltonian lagging" problem, which reveals the fact that necessary structural relaxation falls behind the move of the collective variable, may be likely to occur. To overcome this problem in equilibrium conformational sampling, we adopted the orthogonal space random walk (OSRW) strategy, which was originally developed in the context of free energy simulation [L. Zheng, M. Chen, and W. Yang, Proc. Natl. Acad. Sci. U.S.A. 105, 20227 (2008)]. Thereby, generalized ensemble simulations can simultaneously escape both the explicit barriers along the collective variable direction and the hidden barriers that are strongly coupled with the collective variable move. As demonstrated in our model studies, the present OSRW based generalized ensemble treatments show improved sampling capability over the corresponding classical generalized ensemble treatments.

  20. Sampling error in timber surveys

    Treesearch

    Austin Hasel

    1938-01-01

    Various sampling strategies are evaluated for efficiency in an interior ponderosa pine forest. In a 5760 acre tract, efficiency was gained by stratifying into quarter acre blocks and sampling randomly from within. A systematic cruise was found to be superior for volume estimation.

  1. 10 CFR 74.45 - Measurements and measurement control.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... measurements, obtaining samples, and performing laboratory analyses for element concentration and isotope... of random error behavior. On a predetermined schedule, the program shall include, as appropriate: (i) Replicate analyses of individual samples; (ii) Analysis of replicate process samples; (iii) Replicate volume...

  2. 10 CFR 74.45 - Measurements and measurement control.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... measurements, obtaining samples, and performing laboratory analyses for element concentration and isotope... of random error behavior. On a predetermined schedule, the program shall include, as appropriate: (i) Replicate analyses of individual samples; (ii) Analysis of replicate process samples; (iii) Replicate volume...

  3. 10 CFR 74.45 - Measurements and measurement control.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... measurements, obtaining samples, and performing laboratory analyses for element concentration and isotope... of random error behavior. On a predetermined schedule, the program shall include, as appropriate: (i) Replicate analyses of individual samples; (ii) Analysis of replicate process samples; (iii) Replicate volume...

  4. Local polynomial chaos expansion for linear differential equations with high dimensional random inputs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yi; Jakeman, John; Gittelson, Claude

    2015-01-08

    In this paper we present a localized polynomial chaos expansion for partial differential equations (PDE) with random inputs. In particular, we focus on time independent linear stochastic problems with high dimensional random inputs, where the traditional polynomial chaos methods, and most of the existing methods, incur prohibitively high simulation cost. Furthermore, the local polynomial chaos method employs a domain decomposition technique to approximate the stochastic solution locally. In each subdomain, a subdomain problem is solved independently and, more importantly, in a much lower dimensional random space. In a postprocesing stage, accurate samples of the original stochastic problems are obtained frommore » the samples of the local solutions by enforcing the correct stochastic structure of the random inputs and the coupling conditions at the interfaces of the subdomains. Overall, the method is able to solve stochastic PDEs in very large dimensions by solving a collection of low dimensional local problems and can be highly efficient. In our paper we present the general mathematical framework of the methodology and use numerical examples to demonstrate the properties of the method.« less

  5. A Practical Methodology for Quantifying Random and Systematic Components of Unexplained Variance in a Wind Tunnel

    NASA Technical Reports Server (NTRS)

    Deloach, Richard; Obara, Clifford J.; Goodman, Wesley L.

    2012-01-01

    This paper documents a check standard wind tunnel test conducted in the Langley 0.3-Meter Transonic Cryogenic Tunnel (0.3M TCT) that was designed and analyzed using the Modern Design of Experiments (MDOE). The test designed to partition the unexplained variance of typical wind tunnel data samples into two constituent components, one attributable to ordinary random error, and one attributable to systematic error induced by covariate effects. Covariate effects in wind tunnel testing are discussed, with examples. The impact of systematic (non-random) unexplained variance on the statistical independence of sequential measurements is reviewed. The corresponding correlation among experimental errors is discussed, as is the impact of such correlation on experimental results generally. The specific experiment documented herein was organized as a formal test for the presence of unexplained variance in representative samples of wind tunnel data, in order to quantify the frequency with which such systematic error was detected, and its magnitude relative to ordinary random error. Levels of systematic and random error reported here are representative of those quantified in other facilities, as cited in the references.

  6. Accounting for selection bias in association studies with complex survey data.

    PubMed

    Wirth, Kathleen E; Tchetgen Tchetgen, Eric J

    2014-05-01

    Obtaining representative information from hidden and hard-to-reach populations is fundamental to describe the epidemiology of many sexually transmitted diseases, including HIV. Unfortunately, simple random sampling is impractical in these settings, as no registry of names exists from which to sample the population at random. However, complex sampling designs can be used, as members of these populations tend to congregate at known locations, which can be enumerated and sampled at random. For example, female sex workers may be found at brothels and street corners, whereas injection drug users often come together at shooting galleries. Despite the logistical appeal, complex sampling schemes lead to unequal probabilities of selection, and failure to account for this differential selection can result in biased estimates of population averages and relative risks. However, standard techniques to account for selection can lead to substantial losses in efficiency. Consequently, researchers implement a variety of strategies in an effort to balance validity and efficiency. Some researchers fully or partially account for the survey design, whereas others do nothing and treat the sample as a realization of the population of interest. We use directed acyclic graphs to show how certain survey sampling designs, combined with subject-matter considerations unique to individual exposure-outcome associations, can induce selection bias. Finally, we present a novel yet simple maximum likelihood approach for analyzing complex survey data; this approach optimizes statistical efficiency at no cost to validity. We use simulated data to illustrate this method and compare it with other analytic techniques.

  7. Assessment on Experimental Bacterial Biofilms and in Clinical Practice of the Efficacy of Sampling Solutions for Microbiological Testing of Endoscopes

    PubMed Central

    Aumeran, C.; Thibert, E.; Chapelle, F. A.; Hennequin, C.; Lesens, O.

    2012-01-01

    Opinions differ on the value of microbiological testing of endoscopes, which varies according to the technique used. We compared the efficacy on bacterial biofilms of sampling solutions used for the surveillance of the contamination of endoscope channels. To compare efficacy, we used an experimental model of a 48-h Pseudomonas biofilm grown on endoscope internal tubing. Sampling of this experimental biofilm was performed with a Tween 80-lecithin-based solution, saline, and sterile water. We also performed a randomized prospective study during routine clinical practice in our hospital sampling randomly with two different solutions the endoscopes after reprocessing. Biofilm recovery expressed as a logarithmic ratio of bacteria recovered on bacteria initially present in biofilm was significantly more effective with the Tween 80-lecithin-based solution than with saline solution (P = 0.002) and sterile water (P = 0.002). There was no significant difference between saline and sterile water. In the randomized clinical study, the rates of endoscopes that were contaminated with the Tween 80-lecithin-based sampling solution and the saline were 8/25 and 1/25, respectively (P = 0.02), and the mean numbers of bacteria recovered were 281 and 19 CFU/100 ml (P = 0.001), respectively. In conclusion, the efficiency and therefore the value of the monitoring of endoscope reprocessing by microbiological cultures is dependent on the sampling solutions used. A sampling solution with a tensioactive action is more efficient than saline in detecting biofilm contamination of endoscopes. PMID:22170930

  8. Random versus fixed-site sampling when monitoring relative abundance of fishes in headwater streams of the upper Colorado River basin

    USGS Publications Warehouse

    Quist, M.C.; Gerow, K.G.; Bower, M.R.; Hubert, W.A.

    2006-01-01

    Native fishes of the upper Colorado River basin (UCRB) have declined in distribution and abundance due to habitat degradation and interactions with normative fishes. Consequently, monitoring populations of both native and nonnative fishes is important for conservation of native species. We used data collected from Muddy Creek, Wyoming (2003-2004), to compare sample size estimates using a random and a fixed-site sampling design to monitor changes in catch per unit effort (CPUE) of native bluehead suckers Catostomus discobolus, flannelmouth suckers C. latipinnis, roundtail chub Gila robusta, and speckled dace Rhinichthys osculus, as well as nonnative creek chub Semotilus atromaculatus and white suckers C. commersonii. When one-pass backpack electrofishing was used, detection of 10% or 25% changes in CPUE (fish/100 m) at 60% statistical power required 50-1,000 randomly sampled reaches among species regardless of sampling design. However, use of a fixed-site sampling design with 25-50 reaches greatly enhanced the ability to detect changes in CPUE. The addition of seining did not appreciably reduce required effort. When detection of 25-50% changes in CPUE of native and nonnative fishes is acceptable, we recommend establishment of 25-50 fixed reaches sampled by one-pass electrofishing in Muddy Creek. Because Muddy Creek has habitat and fish assemblages characteristic of other headwater streams in the UCRB, our results are likely to apply to many other streams in the basin. ?? Copyright by the American Fisheries Society 2006.

  9. Decision tree modeling using R.

    PubMed

    Zhang, Zhongheng

    2016-08-01

    In machine learning field, decision tree learner is powerful and easy to interpret. It employs recursive binary partitioning algorithm that splits the sample in partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. While growing a single tree is subject to small changes in the training data, random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and restricted set of input variables to be selected. Finally, I introduce R functions to perform model based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.

  10. Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review.

    PubMed

    Rutterford, Clare; Taljaard, Monica; Dixon, Stephanie; Copas, Andrew; Eldridge, Sandra

    2015-06-01

    To assess the quality of reporting and accuracy of a priori estimates used in sample size calculations for cluster randomized trials (CRTs). We reviewed 300 CRTs published between 2000 and 2008. The prevalence of reporting sample size elements from the 2004 CONSORT recommendations was evaluated and a priori estimates compared with those observed in the trial. Of the 300 trials, 166 (55%) reported a sample size calculation. Only 36 of 166 (22%) reported all recommended descriptive elements. Elements specific to CRTs were the worst reported: a measure of within-cluster correlation was specified in only 58 of 166 (35%). Only 18 of 166 articles (11%) reported both a priori and observed within-cluster correlation values. Except in two cases, observed within-cluster correlation values were either close to or less than a priori values. Even with the CONSORT extension for cluster randomization, the reporting of sample size elements specific to these trials remains below that necessary for transparent reporting. Journal editors and peer reviewers should implement stricter requirements for authors to follow CONSORT recommendations. Authors should report observed and a priori within-cluster correlation values to enable comparisons between these over a wider range of trials. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  11. A study of active learning methods for named entity recognition in clinical text.

    PubMed

    Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu; Denny, Joshua C; Xu, Hua

    2015-12-01

    Named entity recognition (NER), a sequential labeling task, is one of the fundamental tasks for building clinical natural language processing (NLP) systems. Machine learning (ML) based approaches can achieve good performance, but they often require large amounts of annotated samples, which are expensive to build due to the requirement of domain experts in annotation. Active learning (AL), a sample selection approach integrated with supervised ML, aims to minimize the annotation cost while maximizing the performance of ML-based models. In this study, our goal was to develop and evaluate both existing and new AL methods for a clinical NER task to identify concepts of medical problems, treatments, and lab tests from the clinical notes. Using the annotated NER corpus from the 2010 i2b2/VA NLP challenge that contained 349 clinical documents with 20,423 unique sentences, we simulated AL experiments using a number of existing and novel algorithms in three different categories including uncertainty-based, diversity-based, and baseline sampling strategies. They were compared with the passive learning that uses random sampling. Learning curves that plot performance of the NER model against the estimated annotation cost (based on number of sentences or words in the training set) were generated to evaluate different active learning and the passive learning methods and the area under the learning curve (ALC) score was computed. Based on the learning curves of F-measure vs. number of sentences, uncertainty sampling algorithms outperformed all other methods in ALC. Most diversity-based methods also performed better than random sampling in ALC. To achieve an F-measure of 0.80, the best method based on uncertainty sampling could save 66% annotations in sentences, as compared to random sampling. For the learning curves of F-measure vs. number of words, uncertainty sampling methods again outperformed all other methods in ALC. To achieve 0.80 in F-measure, in comparison to random sampling, the best uncertainty based method saved 42% annotations in words. But the best diversity based method reduced only 7% annotation effort. In the simulated setting, AL methods, particularly uncertainty-sampling based approaches, seemed to significantly save annotation cost for the clinical NER task. The actual benefit of active learning in clinical NER should be further evaluated in a real-time setting. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Predictors of Suicide Ideation in a Random Digit Dial Study: Exposure to Suicide Matters.

    PubMed

    van de Venne, Judy; Cerel, Julie; Moore, Melinda; Maple, Myfanwy

    2017-07-03

    Suicide is an important public health concern requiring ongoing research to understand risk factors for suicide ideation. A dual-frame, random digit dial survey was utilized to identify demographic and suicide-related factors associated with suicide ideation in a statewide sample of 1,736 adults. The PH-Q 9 Depression scale suicide ideation question was used to assess current suicide ideation in both the full sample and suicide exposed sub-sample. Being non-married and having previous suicide exposure were separately associated with higher risks of suicide ideation in the full sample. Being male, having increased suicide exposures, and having increased perceptions of closeness to the decedent increased risks, while older age decreased risks for the suicide exposed. Implications for future screening and research are discussed.

  13. Random glucose is useful for individual prediction of type 2 diabetes: results of the Study of Health in Pomerania (SHIP).

    PubMed

    Kowall, Bernd; Rathmann, Wolfgang; Giani, Guido; Schipf, Sabine; Baumeister, Sebastian; Wallaschofski, Henri; Nauck, Matthias; Völzke, Henry

    2013-04-01

    Random glucose is widely used in routine clinical practice. We investigated whether this non-standardized glycemic measure is useful for individual diabetes prediction. The Study of Health in Pomerania (SHIP), a population-based cohort study in north-east Germany, included 3107 diabetes-free persons aged 31-81 years at baseline in 1997-2001. 2475 persons participated at 5-year follow-up and gave self-reports of incident diabetes. For the total sample and for subjects aged ≥50 years, statistical properties of prediction models with and without random glucose were compared. A basic model (including age, sex, diabetes of parents, hypertension and waist circumference) and a comprehensive model (additionally including various lifestyle variables and blood parameters, but not HbA1c) performed statistically significantly better after adding random glucose (e.g., the area under the receiver-operating curve (AROC) increased from 0.824 to 0.856 after adding random glucose to the comprehensive model in the total sample). Likewise, adding random glucose to prediction models which included HbA1c led to significant improvements of predictive ability (e.g., for subjects ≥50 years, AROC increased from 0.824 to 0.849 after adding random glucose to the comprehensive model+HbA1c). Random glucose is useful for individual diabetes prediction, and improves prediction models including HbA1c. Copyright © 2012 Primary Care Diabetes Europe. Published by Elsevier Ltd. All rights reserved.

  14. Linking species richness curves from non-contiguous sampling to contiguous-nested SAR: An empirical study

    NASA Astrophysics Data System (ADS)

    Lazarina, Maria; Kallimanis, Athanasios S.; Pantis, John D.; Sgardelis, Stefanos P.

    2014-11-01

    The species-area relationship (SAR) is one of the few generalizations in ecology. However, many different relationships are denoted as SARs. Here, we empirically evaluated the differences between SARs derived from nested-contiguous and non-contiguous sampling designs, using plants, birds and butterflies datasets from Great Britain, Greece, Massachusetts, New York and San Diego. The shape of SAR depends on the sampling scheme, but there is little empirical documentation on the magnitude of the deviation between different types of SARs and the factors affecting it. We implemented a strictly nested sampling design to construct nested-contiguous SAR (SACR), and systematic nested but non-contiguous, and random designs to construct non-contiguous species richness curves (SASRs for systematic and SACs for random designs) per dataset. The SACR lay below any SASR and most of the SACs. The deviation between them was related to the exponent f of the power law relationship between sampled area and extent. The lower the exponent f, the higher was the deviation between the curves. We linked SACR to SASR and SAC through the concept of "effective" area (Ae), i.e. the nested-contiguous area containing equal number of species with the accumulated sampled area (AS) of a non-contiguous sampling. The relationship between effective and sampled area was modeled as log(Ae) = klog(AS). A Generalized Linear Model was used to estimate the values of k from sampling design and dataset properties. The parameter k increased with the average distance between samples and with beta diversity, while k decreased with f. For both systematic and random sampling, the model performed well in predicting effective area in both the training set and in the test set which was totally independent from the training one. Through effective area, we can link different types of species richness curves based on sampling design properties, sampling effort, spatial scale and beta diversity patterns.

  15. Insulation workers in Belfast. 1. Comparison of a random sample with a control population1

    PubMed Central

    Wallace, William F. M.; Langlands, Jean H. M.

    1971-01-01

    Wallace, W. F. M., and Langlands, J. H. M. (1971).Brit. J. industr. Med.,28, 211-216. Insulation workers in Belfast. 1. Comparison of a random sample with a control population. A sample of 50 men was chosen at random from the population of asbestos insulators in Belfast and matched with a control series of men of similar occupational group with respect to age, height, and smoking habit. Significantly more of the insulators complained of cough and sputum and had basal rales on examination. Clubbing was assessed by means of measurements of the hyponychial angle of both index fingers. These angles were significantly greater in the group of insulators. Twenty-one insulators had ϰ-rays which showed pleural calcification with or without pulmonary fibrosis; one control ϰ-ray showed pulmonary fibrosis. The insulators had no evidence of airways obstruction but static lung volume was reduced and their arterial oxygen tension was lower than that of the controls and their alveolar-arterial oxygen gradient was greater. PMID:5557841

  16. CONSISTENCY UNDER SAMPLING OF EXPONENTIAL RANDOM GRAPH MODELS.

    PubMed

    Shalizi, Cosma Rohilla; Rinaldo, Alessandro

    2013-04-01

    The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which is what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling , or, in terms of the theory of stochastic processes, that it defines a projective family. Focusing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits ERGM's expressive power. These results are actually special cases of more general results about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses.

  17. Dynamical traps in Wang-Landau sampling of continuous systems: Mechanism and solution

    NASA Astrophysics Data System (ADS)

    Koh, Yang Wei; Sim, Adelene Y. L.; Lee, Hwee Kuan

    2015-08-01

    We study the mechanism behind dynamical trappings experienced during Wang-Landau sampling of continuous systems reported by several authors. Trapping is caused by the random walker coming close to a local energy extremum, although the mechanism is different from that of the critical slowing-down encountered in conventional molecular dynamics or Monte Carlo simulations. When trapped, the random walker misses the entire or even several stages of Wang-Landau modification factor reduction, leading to inadequate sampling of the configuration space and a rough density of states, even though the modification factor has been reduced to very small values. Trapping is dependent on specific systems, the choice of energy bins, and the Monte Carlo step size, making it highly unpredictable. A general, simple, and effective solution is proposed where the configurations of multiple parallel Wang-Landau trajectories are interswapped to prevent trapping. We also explain why swapping frees the random walker from such traps. The efficacy of the proposed algorithm is demonstrated.

  18. CONSISTENCY UNDER SAMPLING OF EXPONENTIAL RANDOM GRAPH MODELS

    PubMed Central

    Shalizi, Cosma Rohilla; Rinaldo, Alessandro

    2015-01-01

    The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which is what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling, or, in terms of the theory of stochastic processes, that it defines a projective family. Focusing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits ERGM’s expressive power. These results are actually special cases of more general results about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses. PMID:26166910

  19. Quality of different in-clinic test systems for feline immunodeficiency virus and feline leukaemia virus infection.

    PubMed

    Hartmann, Katrin; Griessmayr, Pascale; Schulz, Bianka; Greene, Craig E; Vidyashankar, Anand N; Jarrett, Os; Egberink, Herman F

    2007-12-01

    Many new diagnostic in-house tests for identification of feline immunodeficiency virus (FIV) and feline leukaemia virus (FeLV) infection have been licensed for use in veterinary practice, and the question of the relative merits of these kits has prompted comparative studies. This study was designed to define the strengths and weaknesses of seven FIV and eight FeLV tests that are commercially available. In this study, 536 serum samples from randomly selected cats were tested. Those samples reacting FIV-positive in at least one of the tests were confirmed by Western blot, and those reacting FeLV-positive were confirmed by virus isolation. In addition, a random selection of samples testing negative in all test systems was re-tested by Western blot (100 samples) and by virus isolation (81 samples). Specificity, sensitivity, positive and negative predictive values of each test and the quality of the results were compared.

  20. A Short Research Note on Calculating Exact Distribution Functions and Random Sampling for the 3D NFW Profile

    NASA Astrophysics Data System (ADS)

    Robotham, A. S. G.; Howlett, Cullan

    2018-06-01

    In this short note we publish the analytic quantile function for the Navarro, Frenk & White (NFW) profile. All known published and coded methods for sampling from the 3D NFW PDF use either accept-reject, or numeric interpolation (sometimes via a lookup table) for projecting random Uniform samples through the quantile distribution function to produce samples of the radius. This is a common requirement in N-body initial condition (IC), halo occupation distribution (HOD), and semi-analytic modelling (SAM) work for correctly assigning particles or galaxies to positions given an assumed concentration for the NFW profile. Using this analytic description allows for much faster and cleaner code to solve a common numeric problem in modern astronomy. We release R and Python versions of simple code that achieves this sampling, which we note is trivial to reproduce in any modern programming language.

  1. A tale of two "forests": random forest machine learning AIDS tropical forest carbon mapping.

    PubMed

    Mascaro, Joseph; Asner, Gregory P; Knapp, David E; Kennedy-Bowdoin, Ty; Martin, Roberta E; Anderson, Christopher; Higgins, Mark; Chadwick, K Dana

    2014-01-01

    Accurate and spatially-explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reduced Deforestation and Degradation Plus). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely-sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates compared to the stratification approach over a 16-million hectare focal area of the Western Amazon. We considered two runs of Random Forest, both with and without spatial contextual modeling by including--in the latter case--x, and y position directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (i.e., called "out-of-bag"), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best preforming run of Random Forest, and explained 59% of LiDAR-based carbon estimates within the validation area, compared to 37% for stratification or 43% by Random Forest without spatial context. With the 60% improvement in explained variation, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha(-1) when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation.

  2. A Tale of Two “Forests”: Random Forest Machine Learning Aids Tropical Forest Carbon Mapping

    PubMed Central

    Mascaro, Joseph; Asner, Gregory P.; Knapp, David E.; Kennedy-Bowdoin, Ty; Martin, Roberta E.; Anderson, Christopher; Higgins, Mark; Chadwick, K. Dana

    2014-01-01

    Accurate and spatially-explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reduced Deforestation and Degradation Plus). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely-sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates compared to the stratification approach over a 16-million hectare focal area of the Western Amazon. We considered two runs of Random Forest, both with and without spatial contextual modeling by including—in the latter case—x, and y position directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (i.e., called “out-of-bag”), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best preforming run of Random Forest, and explained 59% of LiDAR-based carbon estimates within the validation area, compared to 37% for stratification or 43% by Random Forest without spatial context. With the 60% improvement in explained variation, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha−1 when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation. PMID:24489686

  3. THREE-PEE SAMPLING THEORY and program 'THRP' for computer generation of selection criteria

    Treesearch

    L. R. Grosenbaugh

    1965-01-01

    Theory necessary for sampling with probability proportional to prediction ('three-pee,' or '3P,' sampling) is first developed and then exemplified by numerical comparisons of several estimators. Program 'T RP' for computer generation of appropriate 3P-sample-selection criteria is described, and convenient random integer dispensers are...

  4. 21 CFR 601.15 - Foreign establishments and products: samples for each importation.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 7 2010-04-01 2010-04-01 false Foreign establishments and products: samples for each importation. 601.15 Section 601.15 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF... establishments and products: samples for each importation. Random samples of each importation, obtained by the...

  5. 21 CFR 601.15 - Foreign establishments and products: samples for each importation.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 7 2011-04-01 2010-04-01 true Foreign establishments and products: samples for each importation. 601.15 Section 601.15 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF... establishments and products: samples for each importation. Random samples of each importation, obtained by the...

  6. Gram-negative and -positive bacteria differentiation in blood culture samples by headspace volatile compound analysis.

    PubMed

    Dolch, Michael E; Janitza, Silke; Boulesteix, Anne-Laure; Graßmann-Lichtenauer, Carola; Praun, Siegfried; Denzer, Wolfgang; Schelling, Gustav; Schubert, Sören

    2016-12-01

    Identification of microorganisms in positive blood cultures still relies on standard techniques such as Gram staining followed by culturing with definite microorganism identification. Alternatively, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry or the analysis of headspace volatile compound (VC) composition produced by cultures can help to differentiate between microorganisms under experimental conditions. This study assessed the efficacy of volatile compound based microorganism differentiation into Gram-negatives and -positives in unselected positive blood culture samples from patients. Headspace gas samples of positive blood culture samples were transferred to sterilized, sealed, and evacuated 20 ml glass vials and stored at -30 °C until batch analysis. Headspace gas VC content analysis was carried out via an auto sampler connected to an ion-molecule reaction mass spectrometer (IMR-MS). Measurements covered a mass range from 16 to 135 u including CO2, H2, N2, and O2. Prediction rules for microorganism identification based on VC composition were derived using a training data set and evaluated using a validation data set within a random split validation procedure. One-hundred-fifty-two aerobic samples growing 27 Gram-negatives, 106 Gram-positives, and 19 fungi and 130 anaerobic samples growing 37 Gram-negatives, 91 Gram-positives, and two fungi were analysed. In anaerobic samples, ten discriminators were identified by the random forest method allowing for bacteria differentiation into Gram-negative and -positive (error rate: 16.7 % in validation data set). For aerobic samples the error rate was not better than random. In anaerobic blood culture samples of patients IMR-MS based headspace VC composition analysis facilitates bacteria differentiation into Gram-negative and -positive.

  7. Extending cluster Lot Quality Assurance Sampling designs for surveillance programs

    PubMed Central

    Hund, Lauren; Pagano, Marcello

    2014-01-01

    Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance based on the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible non-parametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. PMID:24633656

  8. Using variance components to estimate power in a hierarchically nested sampling design improving monitoring of larval Devils Hole pupfish

    USGS Publications Warehouse

    Dzul, Maria C.; Dixon, Philip M.; Quist, Michael C.; Dinsomore, Stephen J.; Bower, Michael R.; Wilson, Kevin P.; Gaines, D. Bailey

    2013-01-01

    We used variance components to assess allocation of sampling effort in a hierarchically nested sampling design for ongoing monitoring of early life history stages of the federally endangered Devils Hole pupfish (DHP) (Cyprinodon diabolis). Sampling design for larval DHP included surveys (5 days each spring 2007–2009), events, and plots. Each survey was comprised of three counting events, where DHP larvae on nine plots were counted plot by plot. Statistical analysis of larval abundance included three components: (1) evaluation of power from various sample size combinations, (2) comparison of power in fixed and random plot designs, and (3) assessment of yearly differences in the power of the survey. Results indicated that increasing the sample size at the lowest level of sampling represented the most realistic option to increase the survey's power, fixed plot designs had greater power than random plot designs, and the power of the larval survey varied by year. This study provides an example of how monitoring efforts may benefit from coupling variance components estimation with power analysis to assess sampling design.

  9. The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival

    PubMed Central

    Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas

    2016-01-01

    Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561

  10. Extending cluster lot quality assurance sampling designs for surveillance programs.

    PubMed

    Hund, Lauren; Pagano, Marcello

    2014-07-20

    Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance on the basis of the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible nonparametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. Copyright © 2014 John Wiley & Sons, Ltd.

  11. Scalable hierarchical PDE sampler for generating spatially correlated random fields using nonmatching meshes: Scalable hierarchical PDE sampler using nonmatching meshes

    DOE PAGES

    Osborn, Sarah; Zulian, Patrick; Benson, Thomas; ...

    2018-01-30

    This work describes a domain embedding technique between two nonmatching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general and unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction–diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on anmore » embedded domain with a structured mesh, and then, the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data from the structured and unstructured meshes at coarse levels, assuming that this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. Here, we demonstrate the scalability of the sampling method with nonmatching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to 1.9·109 unknowns.« less

  12. Scalable hierarchical PDE sampler for generating spatially correlated random fields using nonmatching meshes: Scalable hierarchical PDE sampler using nonmatching meshes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osborn, Sarah; Zulian, Patrick; Benson, Thomas

    This work describes a domain embedding technique between two nonmatching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general and unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction–diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on anmore » embedded domain with a structured mesh, and then, the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data from the structured and unstructured meshes at coarse levels, assuming that this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. Here, we demonstrate the scalability of the sampling method with nonmatching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to 1.9·109 unknowns.« less

  13. Linear combinations come alive in crossover designs.

    PubMed

    Shuster, Jonathan J

    2017-10-30

    Before learning anything about statistical inference in beginning service courses in biostatistics, students learn how to calculate the mean and variance of linear combinations of random variables. Practical precalculus examples of the importance of these exercises can be helpful for instructors, the target audience of this paper. We shall present applications to the "1-sample" and "2-sample" methods for randomized short-term 2-treatment crossover studies, where patients experience both treatments in random order with a "washout" between the active treatment periods. First, we show that the 2-sample method is preferred as it eliminates "conditional bias" when sample sizes by order differ and produces a smaller variance. We also demonstrate that it is usually advisable to use the differences in posttests (ignoring baseline and post washout values) rather than the differences between the changes in treatment from the start of the period to the end of the period ("delta of delta"). Although the intent is not to provide a definitive discussion of crossover designs, we provide a section and references to excellent alternative methods, where instructors can provide motivation to students to explore the topic in greater detail in future readings or courses. Copyright © 2017 John Wiley & Sons, Ltd.

  14. Random On-Board Pixel Sampling (ROPS) X-Ray Camera

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Zhehui; Iaroshenko, O.; Li, S.

    Recent advances in compressed sensing theory and algorithms offer new possibilities for high-speed X-ray camera design. In many CMOS cameras, each pixel has an independent on-board circuit that includes an amplifier, noise rejection, signal shaper, an analog-to-digital converter (ADC), and optional in-pixel storage. When X-ray images are sparse, i.e., when one of the following cases is true: (a.) The number of pixels with true X-ray hits is much smaller than the total number of pixels; (b.) The X-ray information is redundant; or (c.) Some prior knowledge about the X-ray images exists, sparse sampling may be allowed. Here we first illustratemore » the feasibility of random on-board pixel sampling (ROPS) using an existing set of X-ray images, followed by a discussion about signal to noise as a function of pixel size. Next, we describe a possible circuit architecture to achieve random pixel access and in-pixel storage. The combination of a multilayer architecture, sparse on-chip sampling, and computational image techniques, is expected to facilitate the development and applications of high-speed X-ray camera technology.« less

  15. The Bayesian group lasso for confounded spatial data

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.

    2017-01-01

    Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.

  16. Systematic sampling of discrete and continuous populations: sample selection and the choice of estimator

    Treesearch

    Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire

    2009-01-01

    Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...

  17. Generation of kth-order random toposequences

    NASA Astrophysics Data System (ADS)

    Odgers, Nathan P.; McBratney, Alex. B.; Minasny, Budiman

    2008-05-01

    The model presented in this paper derives toposequences from a digital elevation model (DEM). It is written in ArcInfo Macro Language (AML). The toposequences are called kth-order random toposequences, because they take a random path uphill to the top of a hill and downhill to a stream or valley bottom from a randomly selected seed point, and they are located in a streamshed of order k according to a particular stream-ordering system. We define a kth-order streamshed as the area of land that drains directly to a stream segment of stream order k. The model attempts to optimise the spatial configuration of a set of derived toposequences iteratively by using simulated annealing to maximise the total sum of distances between each toposequence hilltop in the set. The user is able to select the order, k, of the derived toposequences. Toposequences are useful for determining soil sampling locations for use in collecting soil data for digital soil mapping applications. Sampling locations can be allocated according to equal elevation or equal-distance intervals along the length of the toposequence, for example. We demonstrate the use of this model for a study area in the Hunter Valley of New South Wales, Australia. Of the 64 toposequences derived, 32 were first-order random toposequences according to Strahler's stream-ordering system, and 32 were second-order random toposequences. The model that we present in this paper is an efficient method for sampling soil along soil toposequences. The soils along a toposequence are related to each other by the topography they are found in, so soil data collected by this method is useful for establishing soil-landscape rules for the preparation of digital soil maps.

  18. Implementing collaborative primary care for depression and posttraumatic stress disorder: design and sample for a randomized trial in the U.S. military health system.

    PubMed

    Engel, Charles C; Bray, Robert M; Jaycox, Lisa H; Freed, Michael C; Zatzick, Doug; Lane, Marian E; Brambilla, Donald; Rae Olmsted, Kristine; Vandermaas-Peeler, Russ; Litz, Brett; Tanielian, Terri; Belsher, Bradley E; Evatt, Daniel P; Novak, Laura A; Unützer, Jürgen; Katon, Wayne J

    2014-11-01

    War-related trauma, posttraumatic stress disorder (PTSD), depression and suicide are common in US military members. Often, those affected do not seek treatment due to stigma and barriers to care. When care is sought, it often fails to meet quality standards. A randomized trial is assessing whether collaborative primary care improves quality and outcomes of PTSD and depression care in the US military health system. The aim of this study is to describe the design and sample for a randomized effectiveness trial of collaborative care for PTSD and depression in military members attending primary care. The STEPS-UP Trial (STepped Enhancement of PTSD Services Using Primary Care) is a 6 installation (18 clinic) randomized effectiveness trial in the US military health system. Study rationale, design, enrollment and sample characteristics are summarized. Military members attending primary care with suspected PTSD, depression or both were referred to care management and recruited for the trial (2592), and 1041 gave permission to contact for research participation. Of those, 666 (64%) met eligibility criteria, completed baseline assessments, and were randomized to 12 months of usual collaborative primary care versus STEPS-UP collaborative care. Implementation was locally managed for usual collaborative care and centrally managed for STEPS-UP. Research reassessments occurred at 3-, 6-, and 12-months. Baseline characteristics were similar across the two intervention groups. STEPS-UP will be the first large scale randomized effectiveness trial completed in the US military health system, assessing how an implementation model affects collaborative care impact on mental health outcomes. It promises lessons for health system change. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. Estimating the efficacy of Alcoholics Anonymous without self-selection bias: An instrumental variables re-analysis of randomized clinical trials

    PubMed Central

    Humphreys, Keith; Blodgett, Janet C.; Wagner, Todd H.

    2014-01-01

    Background Observational studies of Alcoholics Anonymous’ (AA) effectiveness are vulnerable to self-selection bias because individuals choose whether or not to attend AA. The present study therefore employed an innovative statistical technique to derive a selection bias-free estimate of AA’s impact. Methods Six datasets from 5 National Institutes of Health-funded randomized trials (one with two independent parallel arms) of AA facilitation interventions were analyzed using instrumental variables models. Alcohol dependent individuals in one of the datasets (n = 774) were analyzed separately from the rest of sample (n = 1582 individuals pooled from 5 datasets) because of heterogeneity in sample parameters. Randomization itself was used as the instrumental variable. Results Randomization was a good instrument in both samples, effectively predicting increased AA attendance that could not be attributed to self-selection. In five of the six data sets, which were pooled for analysis, increased AA attendance that was attributable to randomization (i.e., free of self-selection bias) was effective at increasing days of abstinence at 3-month (B = .38, p = .001) and 15-month (B = 0.42, p = .04) follow-up. However, in the remaining dataset, in which pre-existing AA attendance was much higher, further increases in AA involvement caused by the randomly assigned facilitation intervention did not affect drinking outcome. Conclusions For most individuals seeking help for alcohol problems, increasing AA attendance leads to short and long term decreases in alcohol consumption that cannot be attributed to self-selection. However, for populations with high pre-existing AA involvement, further increases in AA attendance may have little impact. PMID:25421504

  20. Estimating the efficacy of Alcoholics Anonymous without self-selection bias: an instrumental variables re-analysis of randomized clinical trials.

    PubMed

    Humphreys, Keith; Blodgett, Janet C; Wagner, Todd H

    2014-11-01

    Observational studies of Alcoholics Anonymous' (AA) effectiveness are vulnerable to self-selection bias because individuals choose whether or not to attend AA. The present study, therefore, employed an innovative statistical technique to derive a selection bias-free estimate of AA's impact. Six data sets from 5 National Institutes of Health-funded randomized trials (1 with 2 independent parallel arms) of AA facilitation interventions were analyzed using instrumental variables models. Alcohol-dependent individuals in one of the data sets (n = 774) were analyzed separately from the rest of sample (n = 1,582 individuals pooled from 5 data sets) because of heterogeneity in sample parameters. Randomization itself was used as the instrumental variable. Randomization was a good instrument in both samples, effectively predicting increased AA attendance that could not be attributed to self-selection. In 5 of the 6 data sets, which were pooled for analysis, increased AA attendance that was attributable to randomization (i.e., free of self-selection bias) was effective at increasing days of abstinence at 3-month (B = 0.38, p = 0.001) and 15-month (B = 0.42, p = 0.04) follow-up. However, in the remaining data set, in which preexisting AA attendance was much higher, further increases in AA involvement caused by the randomly assigned facilitation intervention did not affect drinking outcome. For most individuals seeking help for alcohol problems, increasing AA attendance leads to short- and long-term decreases in alcohol consumption that cannot be attributed to self-selection. However, for populations with high preexisting AA involvement, further increases in AA attendance may have little impact. Copyright © 2014 by the Research Society on Alcoholism.

  1. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments

    PubMed Central

    2013-01-01

    Background Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. Results To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. Conclusions We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs. PMID:24160725

  2. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments.

    PubMed

    Hedt-Gauthier, Bethany L; Mitsunaga, Tisha; Hund, Lauren; Olives, Casey; Pagano, Marcello

    2013-10-26

    Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations.The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs.

  3. The mean and variance of phylogenetic diversity under rarefaction

    PubMed Central

    Matsen, Frederick A.

    2013-01-01

    Summary Phylogenetic diversity (PD) depends on sampling depth, which complicates the comparison of PD between samples of different depth. One approach to dealing with differing sample depth for a given diversity statistic is to rarefy, which means to take a random subset of a given size of the original sample. Exact analytical formulae for the mean and variance of species richness under rarefaction have existed for some time but no such solution exists for PD.We have derived exact formulae for the mean and variance of PD under rarefaction. We confirm that these formulae are correct by comparing exact solution mean and variance to that calculated by repeated random (Monte Carlo) subsampling of a dataset of stem counts of woody shrubs of Toohey Forest, Queensland, Australia. We also demonstrate the application of the method using two examples: identifying hotspots of mammalian diversity in Australasian ecoregions, and characterising the human vaginal microbiome.There is a very high degree of correspondence between the analytical and random subsampling methods for calculating mean and variance of PD under rarefaction, although the Monte Carlo method requires a large number of random draws to converge on the exact solution for the variance.Rarefaction of mammalian PD of ecoregions in Australasia to a common standard of 25 species reveals very different rank orderings of ecoregions, indicating quite different hotspots of diversity than those obtained for unrarefied PD. The application of these methods to the vaginal microbiome shows that a classical score used to quantify bacterial vaginosis is correlated with the shape of the rarefaction curve.The analytical formulae for the mean and variance of PD under rarefaction are both exact and more efficient than repeated subsampling. Rarefaction of PD allows for many applications where comparisons of samples of different depth is required. PMID:23833701

  4. The mean and variance of phylogenetic diversity under rarefaction.

    PubMed

    Nipperess, David A; Matsen, Frederick A

    2013-06-01

    Phylogenetic diversity (PD) depends on sampling depth, which complicates the comparison of PD between samples of different depth. One approach to dealing with differing sample depth for a given diversity statistic is to rarefy, which means to take a random subset of a given size of the original sample. Exact analytical formulae for the mean and variance of species richness under rarefaction have existed for some time but no such solution exists for PD.We have derived exact formulae for the mean and variance of PD under rarefaction. We confirm that these formulae are correct by comparing exact solution mean and variance to that calculated by repeated random (Monte Carlo) subsampling of a dataset of stem counts of woody shrubs of Toohey Forest, Queensland, Australia. We also demonstrate the application of the method using two examples: identifying hotspots of mammalian diversity in Australasian ecoregions, and characterising the human vaginal microbiome.There is a very high degree of correspondence between the analytical and random subsampling methods for calculating mean and variance of PD under rarefaction, although the Monte Carlo method requires a large number of random draws to converge on the exact solution for the variance.Rarefaction of mammalian PD of ecoregions in Australasia to a common standard of 25 species reveals very different rank orderings of ecoregions, indicating quite different hotspots of diversity than those obtained for unrarefied PD. The application of these methods to the vaginal microbiome shows that a classical score used to quantify bacterial vaginosis is correlated with the shape of the rarefaction curve.The analytical formulae for the mean and variance of PD under rarefaction are both exact and more efficient than repeated subsampling. Rarefaction of PD allows for many applications where comparisons of samples of different depth is required.

  5. Attitudes towards smoking restrictions and tobacco advertisement bans in Georgia.

    PubMed

    Bakhturidze, George D; Mittelmark, Maurice B; Aarø, Leif E; Peikrishvili, Nana T

    2013-11-25

    This study aims to provide data on a public level of support for restricting smoking in public places and banning tobacco advertisements. A nationally representative multistage sampling design, with sampling strata defined by region (sampling quotas proportional to size) and substrata defined by urban/rural and mountainous/lowland settlement, within which census enumeration districts were randomly sampled, within which households were randomly sampled, within which a randomly selected respondent was interviewed. The country of Georgia, population 4.7 million, located in the Caucasus region of Eurasia. One household member aged between 13 and 70 was selected as interviewee. In households with more than one age-eligible person, selection was carried out at random. Of 1588 persons selected, 14 refused to participate and interviews were conducted with 915 women and 659 men. Respondents were interviewed about their level of agreement with eight possible smoking restrictions/bans, used to calculate a single dichotomous (agree/do not agree) opinion indicator. The level of agreement with restrictions was analysed in bivariate and multivariate analyses by age, gender, education, income and tobacco use status. Overall, 84.9% of respondents indicated support for smoking restrictions and tobacco advertisement bans. In all demographic segments, including tobacco users, the majority of respondents indicated agreement with restrictions, ranging from a low of 51% in the 13-25 age group to a high of 98% in the 56-70 age group. Logistic regression with all demographic variables entered showed that agreement with restrictions was higher with age, and was significantly higher among never smokers as compared to daily smokers. Georgian public opinion is normatively supportive of more stringent tobacco-control measures in the form of smoking restrictions and tobacco advertisement bans.

  6. The revised Temperament and Character Inventory: normative data by sex and age from a Spanish normal randomized sample

    PubMed Central

    Labad, Javier; Martorell, Lourdes; Gaviria, Ana; Bayón, Carmen; Vilella, Elisabet; Cloninger, C. Robert

    2015-01-01

    Objectives. The psychometric properties regarding sex and age for the revised version of the Temperament and Character Inventory (TCI-R) and its derived short version, the Temperament and Character Inventory (TCI-140), were evaluated with a randomized sample from the community. Methods. A randomized sample of 367 normal adult subjects from a Spanish municipality, who were representative of the general population based on sex and age, participated in the current study. Descriptive statistics and internal consistency according to α coefficient were obtained for all of the dimensions and facets. T-tests and univariate analyses of variance, followed by Bonferroni tests, were conducted to compare the distributions of the TCI-R dimension scores by age and sex. Results. On both the TCI-R and TCI-140, women had higher scores for Harm Avoidance, Reward Dependence and Cooperativeness than men, whereas men had higher scores for Persistence. Age correlated negatively with Novelty Seeking, Reward Dependence and Cooperativeness and positively with Harm Avoidance and Self-transcendence. Young subjects between 18 and 35 years had higher scores than older subjects in NS and RD. Subjects between 51 and 77 years scored higher in both HA and ST. The alphas for the dimensions were between 0.74 and 0.87 for the TCI-R and between 0.63 and 0.83 for the TCI-140. Conclusion. Results, which were obtained with a randomized sample, suggest that there are specific distributions of personality traits by sex and age. Overall, both the TCI-R and the abbreviated TCI-140 were reliable in the ‘good-to-excellent’ range. A strength of the current study is the representativeness of the sample. PMID:26713237

  7. An exploratory study of Taiwanese consumers' experiences of using health-related websites.

    PubMed

    Hsu, Li-Ling

    2005-06-01

    It is manifest that the rapid growth of Internet use and improvement of information technology have changed our lifestyles. In recent years, Internet use in Taiwan has increased dramatically, from 3 million users in 1998 to approximately 8.6 million by the end of 2002. The statistics imply that not only health care professionals but also laypersons rely on the Internet for health information. The purpose of this study was to explore Taiwan consumers' preferences and information needs, and the problems they encountered when getting information from medical websites. Using simple random sampling and systematic random sampling, a survey was conducted in Taipei from August 26, 2002 to October 30, 2002. Using simple random sampling and systematic random sampling, 28 boroughs (Li) were selected; the total sample number was 1043. Over one-quarter (26.8 %) of the respondents reported having never accessed the Internet, while 763 (73.2%) reported having accessed the Internet. Of the Internet users, only 396 (51.9%) had accessed health-related websites, and 367 (48.1%) reported having never accessed health-related websites. The most popular topics were disease information (46.5%), followed by diet consultation (34.8%), medical news (28.5%), and cosmetology (28.5%). The results of the survey show that a large percentage of people in Taiwan have never made good use of health information available on the websites. The reasons for not using the websites included a lack of time or Internet access skills, no motivation, dissatisfaction with the information, unreliable information be provided, and inability to locate the information needed. The author recommends to enhance health information access skills, understand the needs and preferences of consumers, promote the quality of medical websites, and improve the functions of medical websites.

  8. Random sequential adsorption of cubes

    NASA Astrophysics Data System (ADS)

    Cieśla, Michał; Kubala, Piotr

    2018-01-01

    Random packings built of cubes are studied numerically using a random sequential adsorption algorithm. To compare the obtained results with previous reports, three different models of cube orientation sampling were used. Also, three different cube-cube intersection algorithms were tested to find the most efficient one. The study focuses on the mean saturated packing fraction as well as kinetics of packing growth. Microstructural properties of packings were analyzed using density autocorrelation function.

  9. Characteristics of men with substance use disorder consequent to illicit drug use: comparison of a random sample and volunteers.

    PubMed

    Reynolds, Maureen D; Tarter, Ralph E; Kirisci, Levent

    2004-09-06

    Men qualifying for substance use disorder (SUD) consequent to consumption of an illicit drug were compared according to recruitment method. It was hypothesized that volunteers would be more self-disclosing and exhibit more severe disturbances compared to randomly recruited subjects. Personal, demographic, family, social, substance use, psychiatric, and SUD characteristics of volunteers (N = 146) were compared to randomly recruited (N = 102) subjects. Volunteers had lower socioceconomic status, were more likely to be African American, and had lower IQ than randomly recruited subjects. Volunteers also evidenced greater social and family maladjustment and more frequently had received treatment for substance abuse. In addition, lower social desirability response bias was observed in the volunteers. SUD was not more severe in the volunteers; however, they reported a higher lifetime rate of opiate, diet, depressant, and analgesic drug use. Volunteers and randomly recruited subjects qualifying for SUD consequent to illicit drug use are similar in SUD severity but differ in terms of severity of psychosocial disturbance and history of drug involvement. The factors discriminating volunteers and randomly recruited subjects are well known to impact on outcome, hence they need to be considered in research design, especially when selecting a sampling strategy in treatment research.

  10. Influence of Random DC Offsets on Burst-Mode Receiver Sensitivity

    NASA Astrophysics Data System (ADS)

    Ossieur, Peter; de Ridder, Tine; Qiu, Xing-Zhi; Vandewege, Jan

    2006-03-01

    This paper presents the influence of random direct current (dc) offsets on the sensitivity of dc-coupled burst-mode receivers (BMRxs). It is well known that a BMRx exhibits a noisy decision threshold, resulting in a sensitivity penalty. If the BMRx is dc coupled, an additional penalty is incurred by random dc offsets. This penalty can only be determined for a statistically significant number of fabricated BMRx samples. Using Monte Carlo (MC) simulations and a detailed BMRx model, the relationship between the variance of this random dc offset, the resulting sensitivity penalty, and BMRx yield (the fraction of fabricated BMRx samples that meets a given sensitivity specification) is evaluated as a function of various receiver parameters. The obtained curves can be used to trade off BMRx die area against sensitivity for a given yield. It is demonstrated that a thorough understanding of the relationship between BMRx sensitivity, BMRx yield, and the variance of the random dc offsets is needed to optimize a dc-coupled BMRx with respect to sensitivity and die area for a given yield. It is shown that compensation of dc offsets with a resolution of 8 bits results in a sensitivity penalty of 1 dB for a wide range of random dc offsets.

  11. Using GIS to generate spatially balanced random survey designs for natural resource applications.

    PubMed

    Theobald, David M; Stevens, Don L; White, Denis; Urquhart, N Scott; Olsen, Anthony R; Norman, John B

    2007-07-01

    Sampling of a population is frequently required to understand trends and patterns in natural resource management because financial and time constraints preclude a complete census. A rigorous probability-based survey design specifies where to sample so that inferences from the sample apply to the entire population. Probability survey designs should be used in natural resource and environmental management situations because they provide the mathematical foundation for statistical inference. Development of long-term monitoring designs demand survey designs that achieve statistical rigor and are efficient but remain flexible to inevitable logistical or practical constraints during field data collection. Here we describe an approach to probability-based survey design, called the Reversed Randomized Quadrant-Recursive Raster, based on the concept of spatially balanced sampling and implemented in a geographic information system. This provides environmental managers a practical tool to generate flexible and efficient survey designs for natural resource applications. Factors commonly used to modify sampling intensity, such as categories, gradients, or accessibility, can be readily incorporated into the spatially balanced sample design.

  12. The Microbial Detection Array Combined with Random Phi29-Amplification Used as a Diagnostic Tool for Virus Detection in Clinical Samples

    PubMed Central

    Erlandsson, Lena; Rosenstierne, Maiken W.; McLoughlin, Kevin; Jaing, Crystal; Fomsgaard, Anders

    2011-01-01

    A common technique used for sensitive and specific diagnostic virus detection in clinical samples is PCR that can identify one or several viruses in one assay. However, a diagnostic microarray containing probes for all human pathogens could replace hundreds of individual PCR-reactions and remove the need for a clear clinical hypothesis regarding a suspected pathogen. We have established such a diagnostic platform for random amplification and subsequent microarray identification of viral pathogens in clinical samples. We show that Phi29 polymerase-amplification of a diverse set of clinical samples generates enough viral material for successful identification by the Microbial Detection Array, demonstrating the potential of the microarray technique for broad-spectrum pathogen detection. We conclude that this method detects both DNA and RNA virus, present in the same sample, as well as differentiates between different virus subtypes. We propose this assay for diagnostic analysis of viruses in clinical samples. PMID:21853040

  13. Unbiased feature selection in learning random forests for high-dimensional data.

    PubMed

    Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi

    2015-01-01

    Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.

  14. Survival distributions impact the power of randomized placebo-phase design and parallel groups randomized clinical trials.

    PubMed

    Abrahamyan, Lusine; Li, Chuan Silvia; Beyene, Joseph; Willan, Andrew R; Feldman, Brian M

    2011-03-01

    The study evaluated the power of the randomized placebo-phase design (RPPD)-a new design of randomized clinical trials (RCTs), compared with the traditional parallel groups design, assuming various response time distributions. In the RPPD, at some point, all subjects receive the experimental therapy, and the exposure to placebo is for only a short fixed period of time. For the study, an object-oriented simulation program was written in R. The power of the simulated trials was evaluated using six scenarios, where the treatment response times followed the exponential, Weibull, or lognormal distributions. The median response time was assumed to be 355 days for the placebo and 42 days for the experimental drug. Based on the simulation results, the sample size requirements to achieve the same level of power were different under different response time to treatment distributions. The scenario where the response times followed the exponential distribution had the highest sample size requirement. In most scenarios, the parallel groups RCT had higher power compared with the RPPD. The sample size requirement varies depending on the underlying hazard distribution. The RPPD requires more subjects to achieve a similar power to the parallel groups design. Copyright © 2011 Elsevier Inc. All rights reserved.

  15. Bias, Confounding, and Interaction: Lions and Tigers, and Bears, Oh My!

    PubMed

    Vetter, Thomas R; Mascha, Edward J

    2017-09-01

    Epidemiologists seek to make a valid inference about the causal effect between an exposure and a disease in a specific population, using representative sample data from a specific population. Clinical researchers likewise seek to make a valid inference about the association between an intervention and outcome(s) in a specific population, based upon their randomly collected, representative sample data. Both do so by using the available data about the sample variable to make a valid estimate about its corresponding or underlying, but unknown population parameter. Random error in an experiment can be due to the natural, periodic fluctuation or variation in the accuracy or precision of virtually any data sampling technique or health measurement tool or scale. In a clinical research study, random error can be due to not only innate human variability but also purely chance. Systematic error in an experiment arises from an innate flaw in the data sampling technique or measurement instrument. In the clinical research setting, systematic error is more commonly referred to as systematic bias. The most commonly encountered types of bias in anesthesia, perioperative, critical care, and pain medicine research include recall bias, observational bias (Hawthorne effect), attrition bias, misclassification or informational bias, and selection bias. A confounding variable is a factor associated with both the exposure of interest and the outcome of interest. A confounding variable (confounding factor or confounder) is a variable that correlates (positively or negatively) with both the exposure and outcome. Confounding is typically not an issue in a randomized trial because the randomized groups are sufficiently balanced on all potential confounding variables, both observed and nonobserved. However, confounding can be a major problem with any observational (nonrandomized) study. Ignoring confounding in an observational study will often result in a "distorted" or incorrect estimate of the association or treatment effect. Interaction among variables, also known as effect modification, exists when the effect of 1 explanatory variable on the outcome depends on the particular level or value of another explanatory variable. Bias and confounding are common potential explanations for statistically significant associations between exposure and outcome when the true relationship is noncausal. Understanding interactions is vital to proper interpretation of treatment effects. These complex concepts should be consistently and appropriately considered whenever one is not only designing but also analyzing and interpreting data from a randomized trial or observational study.

  16. Model's sparse representation based on reduced mixed GMsFE basis methods

    NASA Astrophysics Data System (ADS)

    Jiang, Lijian; Li, Qiuqi

    2017-06-01

    In this paper, we propose a model's sparse representation based on reduced mixed generalized multiscale finite element (GMsFE) basis methods for elliptic PDEs with random inputs. A typical application for the elliptic PDEs is the flow in heterogeneous random porous media. Mixed generalized multiscale finite element method (GMsFEM) is one of the accurate and efficient approaches to solve the flow problem in a coarse grid and obtain the velocity with local mass conservation. When the inputs of the PDEs are parameterized by the random variables, the GMsFE basis functions usually depend on the random parameters. This leads to a large number degree of freedoms for the mixed GMsFEM and substantially impacts on the computation efficiency. In order to overcome the difficulty, we develop reduced mixed GMsFE basis methods such that the multiscale basis functions are independent of the random parameters and span a low-dimensional space. To this end, a greedy algorithm is used to find a set of optimal samples from a training set scattered in the parameter space. Reduced mixed GMsFE basis functions are constructed based on the optimal samples using two optimal sampling strategies: basis-oriented cross-validation and proper orthogonal decomposition. Although the dimension of the space spanned by the reduced mixed GMsFE basis functions is much smaller than the dimension of the original full order model, the online computation still depends on the number of coarse degree of freedoms. To significantly improve the online computation, we integrate the reduced mixed GMsFE basis methods with sparse tensor approximation and obtain a sparse representation for the model's outputs. The sparse representation is very efficient for evaluating the model's outputs for many instances of parameters. To illustrate the efficacy of the proposed methods, we present a few numerical examples for elliptic PDEs with multiscale and random inputs. In particular, a two-phase flow model in random porous media is simulated by the proposed sparse representation method.

  17. Model's sparse representation based on reduced mixed GMsFE basis methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Lijian, E-mail: ljjiang@hnu.edu.cn; Li, Qiuqi, E-mail: qiuqili@hnu.edu.cn

    2017-06-01

    In this paper, we propose a model's sparse representation based on reduced mixed generalized multiscale finite element (GMsFE) basis methods for elliptic PDEs with random inputs. A typical application for the elliptic PDEs is the flow in heterogeneous random porous media. Mixed generalized multiscale finite element method (GMsFEM) is one of the accurate and efficient approaches to solve the flow problem in a coarse grid and obtain the velocity with local mass conservation. When the inputs of the PDEs are parameterized by the random variables, the GMsFE basis functions usually depend on the random parameters. This leads to a largemore » number degree of freedoms for the mixed GMsFEM and substantially impacts on the computation efficiency. In order to overcome the difficulty, we develop reduced mixed GMsFE basis methods such that the multiscale basis functions are independent of the random parameters and span a low-dimensional space. To this end, a greedy algorithm is used to find a set of optimal samples from a training set scattered in the parameter space. Reduced mixed GMsFE basis functions are constructed based on the optimal samples using two optimal sampling strategies: basis-oriented cross-validation and proper orthogonal decomposition. Although the dimension of the space spanned by the reduced mixed GMsFE basis functions is much smaller than the dimension of the original full order model, the online computation still depends on the number of coarse degree of freedoms. To significantly improve the online computation, we integrate the reduced mixed GMsFE basis methods with sparse tensor approximation and obtain a sparse representation for the model's outputs. The sparse representation is very efficient for evaluating the model's outputs for many instances of parameters. To illustrate the efficacy of the proposed methods, we present a few numerical examples for elliptic PDEs with multiscale and random inputs. In particular, a two-phase flow model in random porous media is simulated by the proposed sparse representation method.« less

  18. Auto Regressive Moving Average (ARMA) Modeling Method for Gyro Random Noise Using a Robust Kalman Filter

    PubMed Central

    Huang, Lei

    2015-01-01

    To solve the problem in which the conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using a robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of observation noise are used to achieve the estimated mean and variance of the observation noise. Using the robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of a rapid convergence and high accuracy. Thus, the required sample size is reduced. It can be applied to modeling applications for gyro random noise in which a fast and accurate ARMA modeling method is required. PMID:26437409

  19. Sampling considerations for disease surveillance in wildlife populations

    USGS Publications Warehouse

    Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.

    2008-01-01

    Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.

  20. Influence of item distribution pattern and abundance on efficiency of benthic core sampling

    USGS Publications Warehouse

    Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.

    2014-01-01

    ore sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm2), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m2). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.

  1. Compressive sampling of polynomial chaos expansions: Convergence analysis and sampling strategies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hampton, Jerrad; Doostan, Alireza, E-mail: alireza.doostan@colorado.edu

    2015-01-01

    Sampling orthogonal polynomial bases via Monte Carlo is of interest for uncertainty quantification of models with random inputs, using Polynomial Chaos (PC) expansions. It is known that bounding a probabilistic parameter, referred to as coherence, yields a bound on the number of samples necessary to identify coefficients in a sparse PC expansion via solution to an ℓ{sub 1}-minimization problem. Utilizing results for orthogonal polynomials, we bound the coherence parameter for polynomials of Hermite and Legendre type under their respective natural sampling distribution. In both polynomial bases we identify an importance sampling distribution which yields a bound with weaker dependence onmore » the order of the approximation. For more general orthonormal bases, we propose the coherence-optimal sampling: a Markov Chain Monte Carlo sampling, which directly uses the basis functions under consideration to achieve a statistical optimality among all sampling schemes with identical support. We demonstrate these different sampling strategies numerically in both high-order and high-dimensional, manufactured PC expansions. In addition, the quality of each sampling method is compared in the identification of solutions to two differential equations, one with a high-dimensional random input and the other with a high-order PC expansion. In both cases, the coherence-optimal sampling scheme leads to similar or considerably improved accuracy.« less

  2. CTEPP STANDARD OPERATING PROCEDURE FOR SAMPLE SELECTION (SOP-1.10)

    EPA Science Inventory

    The procedures for selecting CTEPP study subjects are described in the SOP. The primary, county-level stratification is by region and urbanicity. Six sample counties in each of the two states (North Carolina and Ohio) are selected using stratified random sampling and reflect ...

  3. What Do Lead and Copper Sampling Protocols Mean, and Which Is Right for You?

    EPA Science Inventory

    this presentation will provide a short review of the explicit and implicit concepts behind most of the currently-used regulatory and diagnostic sampling schemes for lead, such as: random daytime sampling; automated proportional sampler; 30 minute first draw stagnation; Sequential...

  4. Acute stress symptoms during the second Lebanon war in a random sample of Israeli citizens.

    PubMed

    Cohen, Miri; Yahav, Rivka

    2008-02-01

    The aims of this study were to assess prevalence of acute stress disorder (ASD) and acute stress symptoms (ASS) in Israel during the second Lebanon war. A telephone survey was conducted in July 2006 of a random sample of 235 residents of northern Israel, who were subjected to missile attacks, and of central Israel, who were not subjected to missile attacks. Results indicate that ASS scores were higher in the northern respondents; 6.8% of the northern sample and 3.9% of the central sample met ASD criteria. Appearance of each symptom ranged from 15.4% for dissociative to 88.4% for reexperiencing, with significant differences between northern and central respondents only for reexperiencing and arousal. A low ASD rate and a moderate difference between areas subjected and not subjected to attack were found.

  5. Compressed sensing: Radar signal detection and parameter measurement for EW applications

    NASA Astrophysics Data System (ADS)

    Rao, M. Sreenivasa; Naik, K. Krishna; Reddy, K. Maheshwara

    2016-09-01

    State of the art system development is very much required for UAVs (Unmanned Aerial Vehicle) and other airborne applications, where miniature, lightweight and low-power specifications are essential. Currently, the airborne Electronic Warfare (EW) systems are developed with digital receiver technology using Nyquist sampling. The detection of radar signals and parameter measurement is a necessary requirement in EW digital receivers. The Random Modulator Pre-Integrator (RMPI) can be used for matched detection of signals using smashed filter. RMPI hardware eliminates the high sampling rate analog to digital computer and reduces the number of samples using random sampling and detection of sparse orthonormal basis vectors. RMPI explore the structural and geometrical properties of the signal apart from traditional time and frequency domain analysis for improved detection. The concept has been proved with the help of MATLAB and LabVIEW simulations.

  6. Demonstration of Multi- and Single-Reader Sample Size Program for Diagnostic Studies software.

    PubMed

    Hillis, Stephen L; Schartz, Kevin M

    2015-02-01

    The recently released software Multi- and Single-Reader Sample Size Sample Size Program for Diagnostic Studies , written by Kevin Schartz and Stephen Hillis, performs sample size computations for diagnostic reader-performance studies. The program computes the sample size needed to detect a specified difference in a reader performance measure between two modalities, when using the analysis methods initially proposed by Dorfman, Berbaum, and Metz (DBM) and Obuchowski and Rockette (OR), and later unified and improved by Hillis and colleagues. A commonly used reader performance measure is the area under the receiver-operating-characteristic curve. The program can be used with typical common reader-performance measures which can be estimated parametrically or nonparametrically. The program has an easy-to-use step-by-step intuitive interface that walks the user through the entry of the needed information. Features of the software include the following: (1) choice of several study designs; (2) choice of inputs obtained from either OR or DBM analyses; (3) choice of three different inference situations: both readers and cases random, readers fixed and cases random, and readers random and cases fixed; (4) choice of two types of hypotheses: equivalence or noninferiority; (6) choice of two output formats: power for specified case and reader sample sizes, or a listing of case-reader combinations that provide a specified power; (7) choice of single or multi-reader analyses; and (8) functionality in Windows, Mac OS, and Linux.

  7. Piecewise SALT sampling for estimating suspended sediment yields

    Treesearch

    Robert B. Thomas

    1989-01-01

    A probability sampling method called SALT (Selection At List Time) has been developed for collecting and summarizing data on delivery of suspended sediment in rivers. It is based on sampling and estimating yield using a suspended-sediment rating curve for high discharges and simple random sampling for low flows. The method gives unbiased estimates of total yield and...

  8. Sampling estimators of total mill receipts for use in timber product output studies

    Treesearch

    John P. Brown; Richard G. Oderwald

    2012-01-01

    Data from the 2001 timber product output study for Georgia was explored to determine new methods for stratifying mills and finding suitable sampling estimators. Estimators for roundwood receipts totals comprised several types: simple random sample, ratio, stratified sample, and combined ratio. Two stratification methods were examined: the Dalenius-Hodges (DH) square...

  9. Statistical Sampling Handbook for Student Aid Programs: A Reference for Non-Statisticians. Winter 1984.

    ERIC Educational Resources Information Center

    Office of Student Financial Assistance (ED), Washington, DC.

    A manual on sampling is presented to assist audit and program reviewers, project officers, managers, and program specialists of the U.S. Office of Student Financial Assistance (OSFA). For each of the following types of samples, definitions and examples are provided, along with information on advantages and disadvantages: simple random sampling,…

  10. Pedagogical Simulation of Sampling Distributions and the Central Limit Theorem

    ERIC Educational Resources Information Center

    Hagtvedt, Reidar; Jones, Gregory Todd; Jones, Kari

    2007-01-01

    Students often find the fact that a sample statistic is a random variable very hard to grasp. Even more mysterious is why a sample mean should become ever more Normal as the sample size increases. This simulation tool is meant to illustrate the process, thereby giving students some intuitive grasp of the relationship between a parent population…

  11. (I Can’t Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research

    PubMed Central

    2017-01-01

    I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: “random chance,” which is based on probability sampling, “minimal information,” which yields at least one new code per sampling step, and “maximum information,” which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario. PMID:28746358

  12. Evaluation of three sampling methods to monitor outcomes of antiretroviral treatment programmes in low- and middle-income countries.

    PubMed

    Tassie, Jean-Michel; Malateste, Karen; Pujades-Rodríguez, Mar; Poulet, Elisabeth; Bennett, Diane; Harries, Anthony; Mahy, Mary; Schechter, Mauro; Souteyrand, Yves; Dabis, François

    2010-11-10

    Retention of patients on antiretroviral therapy (ART) over time is a proxy for quality of care and an outcome indicator to monitor ART programs. Using existing databases (Antiretroviral in Lower Income Countries of the International Databases to Evaluate AIDS and Médecins Sans Frontières), we evaluated three sampling approaches to simplify the generation of outcome indicators. We used individual patient data from 27 ART sites and included 27,201 ART-naive adults (≥15 years) who initiated ART in 2005. For each site, we generated two outcome indicators at 12 months, retention on ART and proportion of patients lost to follow-up (LFU), first using all patient data and then within a smaller group of patients selected using three sampling methods (random, systematic and consecutive sampling). For each method and each site, 500 samples were generated, and the average result was compared with the unsampled value. The 95% sampling distribution (SD) was expressed as the 2.5(th) and 97.5(th) percentile values from the 500 samples. Overall, retention on ART was 76.5% (range 58.9-88.6) and the proportion of patients LFU, 13.5% (range 0.8-31.9). Estimates of retention from sampling (n = 5696) were 76.5% (SD 75.4-77.7) for random, 76.5% (75.3-77.5) for systematic and 76.0% (74.1-78.2) for the consecutive method. Estimates for the proportion of patients LFU were 13.5% (12.6-14.5), 13.5% (12.6-14.3) and 14.0% (12.5-15.5), respectively. With consecutive sampling, 50% of sites had SD within ±5% of the unsampled site value. Our results suggest that random, systematic or consecutive sampling methods are feasible for monitoring ART indicators at national level. However, sampling may not produce precise estimates in some sites.

  13. On-field measurement trial of 4×128 Gbps PDM-QPSK signals by linear optical sampling

    NASA Astrophysics Data System (ADS)

    Bin Liu; Wu, Zhichao; Fu, Songnian; Feng, Yonghua; Liu, Deming

    2017-02-01

    Linear optical sampling is a promising characterization technique for advanced modulation formats, together with digital signal processing (DSP) and software-synchronized algorithm. We theoretically investigate the acquisition of optical sampling, when the high-speed signal under test is either periodic or random. Especially, when the profile of optical sampling pulse is asymmetrical, the repetition frequency of sampling pulse needs careful adjustment in order to obtain correct waveform. Then, we demonstrate on-field measurement trial of commercial four-channel 128 Gbps polarization division multiplexing quadrature phase shift keying (PDM-QPSK) signals with truly random characteristics by self-developed equipment. A passively mode-locked fiber laser (PMFL) with a repetition frequency of 95.984 MHz is used as optical sampling source, meanwhile four balanced photo detectors (BPDs) with 400 MHz bandwidth and four-channel analog-to-digital convertor (ADC) with 1.25 GS/s sampling rate are used for data acquisition. The performance comparison with conventional optical modulation analyzer (OMA) verifies that the self-developed equipment has the advantages of low cost, easy implementation, and fast response.

  14. Electrical conductivity and total dissolved solids in urine.

    PubMed

    Fazil Marickar, Y M

    2010-08-01

    The objective of this paper is to study the relevance of electrical conductivity (EC) and total dissolved solids (TDS) in early morning and random samples of urine of urinary stone patients; 2,000 urine samples were studied. The two parameters were correlated with the extent of various urinary concrements. The early morning urine (EMU) and random samples of the patients who attended the urinary stone clinic were analysed routinely. The pH, specific gravity, EC, TDS, redox potential, albumin, sugar and microscopic study of the urinary sediments including red blood cells (RBC), pus cells (PC), crystals, namely calcium oxalate monohydrate (COM), calcium oxalate dihydrate (COD), uric acid (UA), and phosphates and epithelial cells were assessed. The extent of RBC, PC, COM, COD, UA and phosphates was correlated with EC and TDS. The values of EC ranged from 1.1 to 33.9 mS, the mean value being 21.5 mS. TDS ranged from 3,028 to 18,480 ppm, the mean value being 7,012 ppm. The TDS levels corresponded with EC of urine. Both values were significantly higher (P < 0.05) in the EMU samples than the random samples. There was a statistically significant correlation between the level of abnormality in the urinary deposits (r = +0.27, P < 0.05). In samples, where the TDS were more than 12,000 ppm, there were more crystals than those samples containing TDS less than 12,000 ppm. However, there were certain urine samples, where the TDS were over 12,000, which did not contain any urinary crystals. It is concluded that the value of TDS has relevance in the process of stone formation.

  15. Design and implementation of a dental caries prevention trial in remote Canadian Aboriginal communities.

    PubMed

    Harrison, Rosamund; Veronneau, Jacques; Leroux, Brian

    2010-05-13

    The goal of this cluster randomized trial is to test the effectiveness of a counseling approach, Motivational Interviewing, to control dental caries in young Aboriginal children. Motivational Interviewing, a client-centred, directive counseling style, has not yet been evaluated as an approach for promotion of behaviour change in indigenous communities in remote settings. Aboriginal women were hired from the 9 communities to recruit expectant and new mothers to the trial, administer questionnaires and deliver the counseling to mothers in the test communities. The goal is for mothers to receive the intervention during pregnancy and at their child's immunization visits. Data on children's dental health status and family dental health practices will be collected when children are 30-months of age. The communities were randomly allocated to test or control group by a random "draw" over community radio. Sample size and power were determined based on an anticipated 20% reduction in caries prevalence. Randomization checks were conducted between groups. In the 5 test and 4 control communities, 272 of the original target sample size of 309 mothers have been recruited over a two-and-a-half year period. A power calculation using the actual attained sample size showed power to be 79% to detect a treatment effect. If an attrition fraction of 4% per year is maintained, power will remain at 80%. Power will still be > 90% to detect a 25% reduction in caries prevalence. The distribution of most baseline variables was similar for the two randomized groups of mothers. However, despite the random assignment of communities to treatment conditions, group differences exist for stage of pregnancy and prior tooth extractions in the family. Because of the group imbalances on certain variables, control of baseline variables will be done in the analyses of treatment effects. This paper explains the challenges of conducting randomized trials in remote settings, the importance of thorough community collaboration, and also illustrates the likelihood that some baseline variables that may be clinically important will be unevenly split in group-randomized trials when the number of groups is small. This trial is registered as ISRCTN41467632.

  16. Design and implementation of a dental caries prevention trial in remote Canadian Aboriginal communities

    PubMed Central

    2010-01-01

    Background The goal of this cluster randomized trial is to test the effectiveness of a counseling approach, Motivational Interviewing, to control dental caries in young Aboriginal children. Motivational Interviewing, a client-centred, directive counseling style, has not yet been evaluated as an approach for promotion of behaviour change in indigenous communities in remote settings. Methods/design Aboriginal women were hired from the 9 communities to recruit expectant and new mothers to the trial, administer questionnaires and deliver the counseling to mothers in the test communities. The goal is for mothers to receive the intervention during pregnancy and at their child's immunization visits. Data on children's dental health status and family dental health practices will be collected when children are 30-months of age. The communities were randomly allocated to test or control group by a random "draw" over community radio. Sample size and power were determined based on an anticipated 20% reduction in caries prevalence. Randomization checks were conducted between groups. Discussion In the 5 test and 4 control communities, 272 of the original target sample size of 309 mothers have been recruited over a two-and-a-half year period. A power calculation using the actual attained sample size showed power to be 79% to detect a treatment effect. If an attrition fraction of 4% per year is maintained, power will remain at 80%. Power will still be > 90% to detect a 25% reduction in caries prevalence. The distribution of most baseline variables was similar for the two randomized groups of mothers. However, despite the random assignment of communities to treatment conditions, group differences exist for stage of pregnancy and prior tooth extractions in the family. Because of the group imbalances on certain variables, control of baseline variables will be done in the analyses of treatment effects. This paper explains the challenges of conducting randomized trials in remote settings, the importance of thorough community collaboration, and also illustrates the likelihood that some baseline variables that may be clinically important will be unevenly split in group-randomized trials when the number of groups is small. Trial registration This trial is registered as ISRCTN41467632. PMID:20465831

  17. Random sphere packing model of heterogeneous propellants

    NASA Astrophysics Data System (ADS)

    Kochevets, Sergei Victorovich

    It is well recognized that combustion of heterogeneous propellants is strongly dependent on the propellant morphology. Recent developments in computing systems make it possible to start three-dimensional modeling of heterogeneous propellant combustion. A key component of such large scale computations is a realistic model of industrial propellants which retains the true morphology---a goal never achieved before. The research presented develops the Random Sphere Packing Model of heterogeneous propellants and generates numerical samples of actual industrial propellants. This is done by developing a sphere packing algorithm which randomly packs a large number of spheres with a polydisperse size distribution within a rectangular domain. First, the packing code is developed, optimized for performance, and parallelized using the OpenMP shared memory architecture. Second, the morphology and packing fraction of two simple cases of unimodal and bimodal packs are investigated computationally and analytically. It is shown that both the Loose Random Packing and Dense Random Packing limits are not well defined and the growth rate of the spheres is identified as the key parameter controlling the efficiency of the packing. For a properly chosen growth rate, computational results are found to be in excellent agreement with experimental data. Third, two strategies are developed to define numerical samples of polydisperse heterogeneous propellants: the Deterministic Strategy and the Random Selection Strategy. Using these strategies, numerical samples of industrial propellants are generated. The packing fraction is investigated and it is shown that the experimental values of the packing fraction can be achieved computationally. It is strongly believed that this Random Sphere Packing Model of propellants is a major step forward in the realistic computational modeling of heterogeneous propellant of combustion. In addition, a method of analysis of the morphology of heterogeneous propellants is developed which uses the concept of multi-point correlation functions. A set of intrinsic length scales of local density fluctuations in random heterogeneous propellants is identified by performing a Monte-Carlo study of the correlation functions. This method of analysis shows great promise for understanding the origins of the combustion instability of heterogeneous propellants, and is believed to become a valuable tool for the development of safe and reliable rocket engines.

  18. Methodological reporting quality of randomized controlled trials: A survey of seven core journals of orthopaedics from Mainland China over 5 years following the CONSORT statement.

    PubMed

    Zhang, J; Chen, X; Zhu, Q; Cui, J; Cao, L; Su, J

    2016-11-01

    In recent years, the number of randomized controlled trials (RCTs) in the field of orthopaedics is increasing in Mainland China. However, randomized controlled trials (RCTs) are inclined to bias if they lack methodological quality. Therefore, we performed a survey of RCT to assess: (1) What about the quality of RCTs in the field of orthopedics in Mainland China? (2) Whether there is difference between the core journals of the Chinese department of orthopedics and Orthopaedics Traumatology Surgery & Research (OTSR). This research aimed to evaluate the methodological reporting quality according to the CONSORT statement of randomized controlled trials (RCTs) in seven key orthopaedic journals published in Mainland China over 5 years from 2010 to 2014. All of the articles were hand researched on Chongqing VIP database between 2010 and 2014. Studies were considered eligible if the words "random", "randomly", "randomization", "randomized" were employed to describe the allocation way. Trials including animals, cadavers, trials published as abstracts and case report, trials dealing with subgroups analysis, or trials without the outcomes were excluded. In addition, eight articles selected from Orthopaedics Traumatology Surgery & Research (OTSR) between 2010 and 2014 were included in this study for comparison. The identified RCTs are analyzed using a modified version of the Consolidated Standards of Reporting Trials (CONSORT), including the sample size calculation, allocation sequence generation, allocation concealment, blinding and handling of dropouts. A total of 222 RCTs were identified in seven core orthopaedic journals. No trials reported adequate sample size calculation, 74 (33.4%) reported adequate allocation generation, 8 (3.7%) trials reported adequate allocation concealment, 18 (8.1%) trials reported adequate blinding and 16 (7.2%) trials reported handling of dropouts. In OTSR, 1 (12.5%) trial reported adequate sample size calculation, 4 (50.0%) reported adequate allocation generation, 1 (12.5%) trials reported adequate allocation concealment, 2 (25.0%) trials reported adequate blinding and 5 (62.5%) trials reported handling of dropouts. There were statistical differences as for sample size calculation and handling of dropouts between papers from Mainland China and OTSR (P<0.05). The findings of this study show that the methodological reporting quality of RCTs in seven core orthopaedic journals from the Mainland China is far from satisfaction and it needs to further improve to keep up with the standards of the CONSORT statement. Level III case control. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  19. Random Forest-Based Recognition of Isolated Sign Language Subwords Using Data from Accelerometers and Surface Electromyographic Sensors.

    PubMed

    Su, Ruiliang; Chen, Xiang; Cao, Shuai; Zhang, Xu

    2016-01-14

    Sign language recognition (SLR) has been widely used for communication amongst the hearing-impaired and non-verbal community. This paper proposes an accurate and robust SLR framework using an improved decision tree as the base classifier of random forests. This framework was used to recognize Chinese sign language subwords using recordings from a pair of portable devices worn on both arms consisting of accelerometers (ACC) and surface electromyography (sEMG) sensors. The experimental results demonstrated the validity of the proposed random forest-based method for recognition of Chinese sign language (CSL) subwords. With the proposed method, 98.25% average accuracy was obtained for the classification of a list of 121 frequently used CSL subwords. Moreover, the random forests method demonstrated a superior performance in resisting the impact of bad training samples. When the proportion of bad samples in the training set reached 50%, the recognition error rate of the random forest-based method was only 10.67%, while that of a single decision tree adopted in our previous work was almost 27.5%. Our study offers a practical way of realizing a robust and wearable EMG-ACC-based SLR systems.

  20. Experiences of being a control group: lessons from a UK-based randomized controlled trial of group singing as a health promotion initiative for older people.

    PubMed

    Skingley, Ann; Bungay, Hilary; Clift, Stephen; Warden, June

    2014-12-01

    Existing randomized controlled trials within the health field suggest that the concept of randomization is not always well understood and that feelings of disappointment may occur when participants are not placed in their preferred arm. This may affect a study's rigour and ethical integrity if not addressed. We aimed to test whether these issues apply to a healthy volunteer sample within a health promotion trial of singing for older people. Written comments from control group participants at two points during the trial were analysed, together with individual semi-structured interviews with a small sample (n = 11) of this group. We found that motivation to participate in the trial was largely due to the appeal of singing and disappointment resulted from allocation to the control group. Understanding of randomization was generally good and feelings of disappointment lessened over time and with a post-research opportunity to sing. Findings suggest that measures should be put in place to minimize the potential negative impacts of randomized controlled trials in health promotion research. © The Author (2013). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  1. Random walks and diffusion on networks

    NASA Astrophysics Data System (ADS)

    Masuda, Naoki; Porter, Mason A.; Lambiotte, Renaud

    2017-11-01

    Random walks are ubiquitous in the sciences, and they are interesting from both theoretical and practical perspectives. They are one of the most fundamental types of stochastic processes; can be used to model numerous phenomena, including diffusion, interactions, and opinions among humans and animals; and can be used to extract information about important entities or dense groups of entities in a network. Random walks have been studied for many decades on both regular lattices and (especially in the last couple of decades) on networks with a variety of structures. In the present article, we survey the theory and applications of random walks on networks, restricting ourselves to simple cases of single and non-adaptive random walkers. We distinguish three main types of random walks: discrete-time random walks, node-centric continuous-time random walks, and edge-centric continuous-time random walks. We first briefly survey random walks on a line, and then we consider random walks on various types of networks. We extensively discuss applications of random walks, including ranking of nodes (e.g., PageRank), community detection, respondent-driven sampling, and opinion models such as voter models.

  2. Transcutaneous bilirubinometry reduces the need for blood sampling in neonates with visible jaundice.

    PubMed

    Mishra, S; Chawla, D; Agarwal, R; Deorari, A K; Paul, V K; Bhutani, V K

    2009-12-01

    We determined usefulness of transcutaneous bilirubinometry to decrease the need for blood sampling to assay serum total bilirubin (STB) in the management of jaundiced healthy Indian neonates. Newborns, > or =35 weeks' gestation, with clinical evidence of jaundice were enrolled in an institutional approved randomized clinical trial. The severity of hyperbilirubinaemia was determined by two non-invasive methods: i) protocol-based visual assessment of bilirubin (VaB) and ii) transcutaneous bilirubin (TcB) determination (BiliCheck). By a random allocation, either method was used to decide the need for blood sampling, which was defined to be present if assessed STB by allocated method exceeded 80% of hour-specific threshold values for phototherapy (2004 AAP Guidelines). A total of 617 neonates were randomized to either TcB (n = 314) or VaB (n = 303) groups with comparable gestation, birth weight and postnatal age. Need for blood sampling to assay STB was 34% lower (95% CI: 10% to 51%) in the TcB group compared with VaB group (17.5% vs 26.4% assessments; risk difference: -8.9%, 95% CI: -2.4% to -15.4%; p = 0.008). Routine use of transcutaneous bilirubinometry compared with systematic visual assessment of bilirubin significantly reduced the need for blood sampling to assay STB in jaundiced term and late-preterm neonates. (ClinicalTrials.gov number, NCT00653874).

  3. Compressive Sampling based Image Coding for Resource-deficient Visual Communication.

    PubMed

    Liu, Xianming; Zhai, Deming; Zhou, Jiantao; Zhang, Xinfeng; Zhao, Debin; Gao, Wen

    2016-04-14

    In this paper, a new compressive sampling based image coding scheme is developed to achieve competitive coding efficiency at lower encoder computational complexity, while supporting error resilience. This technique is particularly suitable for visual communication with resource-deficient devices. At the encoder, compact image representation is produced, which is a polyphase down-sampled version of the input image; but the conventional low-pass filter prior to down-sampling is replaced by a local random binary convolution kernel. The pixels of the resulting down-sampled pre-filtered image are local random measurements and placed in the original spatial configuration. The advantages of local random measurements are two folds: 1) preserve high-frequency image features that are otherwise discarded by low-pass filtering; 2) remain a conventional image and can therefore be coded by any standardized codec to remove statistical redundancy of larger scales. Moreover, measurements generated by different kernels can be considered as multiple descriptions of the original image and therefore the proposed scheme has the advantage of multiple description coding. At the decoder, a unified sparsity-based soft-decoding technique is developed to recover the original image from received measurements in a framework of compressive sensing. Experimental results demonstrate that the proposed scheme is competitive compared with existing methods, with a unique strength of recovering fine details and sharp edges at low bit-rates.

  4. Estimating the Size of a Large Network and its Communities from a Random Sample

    PubMed Central

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W.

    2017-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924

  5. Estimating the Size of a Large Network and its Communities from a Random Sample.

    PubMed

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W

    2016-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = ( V, E ) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G ( W ) be the induced subgraph in G of the vertices in W . In addition to G ( W ), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K , and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.

  6. Influence of population versus convenience sampling on sample characteristics in studies of cognitive aging.

    PubMed

    Brodaty, Henry; Mothakunnel, Annu; de Vel-Palumbo, Melissa; Ames, David; Ellis, Kathryn A; Reppermund, Simone; Kochan, Nicole A; Savage, Greg; Trollor, Julian N; Crawford, John; Sachdev, Perminder S

    2014-01-01

    We examined whether differences in findings of studies examining mild cognitive impairment (MCI) were associated with recruitment methods by comparing sample characteristics in two contemporaneous Australian studies, using population-based and convenience sampling. The Sydney Memory and Aging Study invited participants randomly from the electoral roll in defined geographic areas in Sydney. The Australian Imaging, Biomarkers and Lifestyle Study of Ageing recruited cognitively normal (CN) individuals via media appeals and MCI participants via referrals from clinicians in Melbourne and Perth. Demographic and cognitive variables were harmonized, and similar diagnostic criteria were applied to both samples retrospectively. CN participants recruited via convenience sampling were younger, better educated, more likely to be married and have a family history of dementia, and performed better cognitively than those recruited via population-based sampling. MCI participants recruited via population-based sampling had better memory performance and were less likely to carry the apolipoprotein E ε4 allele than clinically referred participants but did not differ on other demographic variables. A convenience sample of normal controls is likely to be younger and better functioning and that of an MCI group likely to perform worse than a purportedly random sample. Sampling bias should be considered when interpreting findings. Copyright © 2014 Elsevier Inc. All rights reserved.

  7. Developing Sampling Frame for Case Study: Challenges and Conditions

    ERIC Educational Resources Information Center

    Ishak, Noriah Mohd; Abu Bakar, Abu Yazid

    2014-01-01

    Due to statistical analysis, the issue of random sampling is pertinent to any quantitative study. Unlike quantitative study, the elimination of inferential statistical analysis, allows qualitative researchers to be more creative in dealing with sampling issue. Since results from qualitative study cannot be generalized to the bigger population,…

  8. Single-Phase Mail Survey Design for Rare Population Subgroups

    ERIC Educational Resources Information Center

    Brick, J. Michael; Andrews, William R.; Mathiowetz, Nancy A.

    2016-01-01

    Although using random digit dialing (RDD) telephone samples was the preferred method for conducting surveys of households for many years, declining response and coverage rates have led researchers to explore alternative approaches. The use of address-based sampling (ABS) has been examined for sampling the general population and subgroups, most…

  9. A MORE COST-EFFECTIVE EMAP-ESTUARIES BENTHIC MACROFAUNAL SAMPLING PROTOCOL

    EPA Science Inventory

    The standard benthic macrofaunal sampling protocol in the U.S. Environmental Protection Agency's Pacific Coast Environmental Monitoring and Assessment Program (EMAP) is to collect a minimum of 30 random benthic samples per reporting unit (e.g., estuary) using a 0.1 m2 grab and to...

  10. 40 CFR 204.51 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... tests. (m) Test sample means the collection of compressors from the same category or configuration which is randomly drawn from the batch sample and which will receive emissions tests. (n) Batch size means... category or configuration in a batch. (o) Test sample size means the number of compressors of the same...

  11. Efficiency of analytical and sampling-based uncertainty propagation in intensity-modulated proton therapy

    NASA Astrophysics Data System (ADS)

    Wahl, N.; Hennig, P.; Wieser, H. P.; Bangert, M.

    2017-07-01

    The sensitivity of intensity-modulated proton therapy (IMPT) treatment plans to uncertainties can be quantified and mitigated with robust/min-max and stochastic/probabilistic treatment analysis and optimization techniques. Those methods usually rely on sparse random, importance, or worst-case sampling. Inevitably, this imposes a trade-off between computational speed and accuracy of the uncertainty propagation. Here, we investigate analytical probabilistic modeling (APM) as an alternative for uncertainty propagation and minimization in IMPT that does not rely on scenario sampling. APM propagates probability distributions over range and setup uncertainties via a Gaussian pencil-beam approximation into moments of the probability distributions over the resulting dose in closed form. It supports arbitrary correlation models and allows for efficient incorporation of fractionation effects regarding random and systematic errors. We evaluate the trade-off between run-time and accuracy of APM uncertainty computations on three patient datasets. Results are compared against reference computations facilitating importance and random sampling. Two approximation techniques to accelerate uncertainty propagation and minimization based on probabilistic treatment plan optimization are presented. Runtimes are measured on CPU and GPU platforms, dosimetric accuracy is quantified in comparison to a sampling-based benchmark (5000 random samples). APM accurately propagates range and setup uncertainties into dose uncertainties at competitive run-times (GPU ≤slant {5} min). The resulting standard deviation (expectation value) of dose show average global γ{3% / {3}~mm} pass rates between 94.2% and 99.9% (98.4% and 100.0%). All investigated importance sampling strategies provided less accuracy at higher run-times considering only a single fraction. Considering fractionation, APM uncertainty propagation and treatment plan optimization was proven to be possible at constant time complexity, while run-times of sampling-based computations are linear in the number of fractions. Using sum sampling within APM, uncertainty propagation can only be accelerated at the cost of reduced accuracy in variance calculations. For probabilistic plan optimization, we were able to approximate the necessary pre-computations within seconds, yielding treatment plans of similar quality as gained from exact uncertainty propagation. APM is suited to enhance the trade-off between speed and accuracy in uncertainty propagation and probabilistic treatment plan optimization, especially in the context of fractionation. This brings fully-fledged APM computations within reach of clinical application.

  12. Efficiency of analytical and sampling-based uncertainty propagation in intensity-modulated proton therapy.

    PubMed

    Wahl, N; Hennig, P; Wieser, H P; Bangert, M

    2017-06-26

    The sensitivity of intensity-modulated proton therapy (IMPT) treatment plans to uncertainties can be quantified and mitigated with robust/min-max and stochastic/probabilistic treatment analysis and optimization techniques. Those methods usually rely on sparse random, importance, or worst-case sampling. Inevitably, this imposes a trade-off between computational speed and accuracy of the uncertainty propagation. Here, we investigate analytical probabilistic modeling (APM) as an alternative for uncertainty propagation and minimization in IMPT that does not rely on scenario sampling. APM propagates probability distributions over range and setup uncertainties via a Gaussian pencil-beam approximation into moments of the probability distributions over the resulting dose in closed form. It supports arbitrary correlation models and allows for efficient incorporation of fractionation effects regarding random and systematic errors. We evaluate the trade-off between run-time and accuracy of APM uncertainty computations on three patient datasets. Results are compared against reference computations facilitating importance and random sampling. Two approximation techniques to accelerate uncertainty propagation and minimization based on probabilistic treatment plan optimization are presented. Runtimes are measured on CPU and GPU platforms, dosimetric accuracy is quantified in comparison to a sampling-based benchmark (5000 random samples). APM accurately propagates range and setup uncertainties into dose uncertainties at competitive run-times (GPU [Formula: see text] min). The resulting standard deviation (expectation value) of dose show average global [Formula: see text] pass rates between 94.2% and 99.9% (98.4% and 100.0%). All investigated importance sampling strategies provided less accuracy at higher run-times considering only a single fraction. Considering fractionation, APM uncertainty propagation and treatment plan optimization was proven to be possible at constant time complexity, while run-times of sampling-based computations are linear in the number of fractions. Using sum sampling within APM, uncertainty propagation can only be accelerated at the cost of reduced accuracy in variance calculations. For probabilistic plan optimization, we were able to approximate the necessary pre-computations within seconds, yielding treatment plans of similar quality as gained from exact uncertainty propagation. APM is suited to enhance the trade-off between speed and accuracy in uncertainty propagation and probabilistic treatment plan optimization, especially in the context of fractionation. This brings fully-fledged APM computations within reach of clinical application.

  13. Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding.

    PubMed

    Leray, Matthieu; Knowlton, Nancy

    2017-01-01

    DNA metabarcoding, the PCR-based profiling of natural communities, is becoming the method of choice for biodiversity monitoring because it circumvents some of the limitations inherent to traditional ecological surveys. However, potential sources of bias that can affect the reproducibility of this method remain to be quantified. The interpretation of differences in patterns of sequence abundance and the ecological relevance of rare sequences remain particularly uncertain. Here we used one artificial mock community to explore the significance of abundance patterns and disentangle the effects of two potential biases on data reproducibility: indexed PCR primers and random sampling during Illumina MiSeq sequencing. We amplified a short fragment of the mitochondrial Cytochrome c Oxidase Subunit I (COI) for a single mock sample containing equimolar amounts of total genomic DNA from 34 marine invertebrates belonging to six phyla. We used seven indexed broad-range primers and sequenced the resulting library on two consecutive Illumina MiSeq runs. The total number of Operational Taxonomic Units (OTUs) was ∼4 times higher than expected based on the composition of the mock sample. Moreover, the total number of reads for the 34 components of the mock sample differed by up to three orders of magnitude. However, 79 out of 86 of the unexpected OTUs were represented by <10 sequences that did not appear consistently across replicates. Our data suggest that random sampling of rare OTUs (e.g., small associated fauna such as parasites) accounted for most of variation in OTU presence-absence, whereas biases associated with indexed PCRs accounted for a larger amount of variation in relative abundance patterns. These results suggest that random sampling during sequencing leads to the low reproducibility of rare OTUs. We suggest that the strategy for handling rare OTUs should depend on the objectives of the study. Systematic removal of rare OTUs may avoid inflating diversity based on common β descriptors but will exclude positive records of taxa that are functionally important. Our results further reinforce the need for technical replicates (parallel PCR and sequencing from the same sample) in metabarcoding experimental designs. Data reproducibility should be determined empirically as it will depend upon the sequencing depth, the type of sample, the sequence analysis pipeline, and the number of replicates. Moreover, estimating relative biomasses or abundances based on read counts remains elusive at the OTU level.

  14. Using re-randomization to increase the recruitment rate in clinical trials - an assessment of three clinical areas.

    PubMed

    Kahan, Brennan C

    2016-12-13

    Patient recruitment in clinical trials is often challenging, and as a result, many trials are stopped early due to insufficient recruitment. The re-randomization design allows patients to be re-enrolled and re-randomized for each new treatment episode that they experience. Because it allows multiple enrollments for each patient, this design has been proposed as a way to increase the recruitment rate in clinical trials. However, it is unknown to what extent recruitment could be increased in practice. We modelled the expected recruitment rate for parallel-group and re-randomization trials in different settings based on estimates from real trials and datasets. We considered three clinical areas: in vitro fertilization, severe asthma exacerbations, and acute sickle cell pain crises. We compared the two designs in terms of the expected time to complete recruitment, and the sample size recruited over a fixed recruitment period. Across the different scenarios we considered, we estimated that re-randomization could reduce the expected time to complete recruitment by between 4 and 22 months (relative reductions of 19% and 45%), or increase the sample size recruited over a fixed recruitment period by between 29% and 171%. Re-randomization can increase recruitment most for trials with a short follow-up period, a long trial recruitment duration, and patients with high rates of treatment episodes. Re-randomization has the potential to increase the recruitment rate in certain settings, and could lead to quicker and more efficient trials in these scenarios.

  15. Analysis of histological findings obtained combining US/mp-MRI fusion-guided biopsies with systematic US biopsies: mp-MRI role in prostate cancer detection and false negative.

    PubMed

    Faiella, Eliodoro; Santucci, Domiziana; Greco, Federico; Frauenfelder, Giulia; Giacobbe, Viola; Muto, Giovanni; Zobel, Bruno Beomonte; Grasso, Rosario Francesco

    2018-02-01

    To evaluate the diagnostic accuracy of mp-MRI correlating US/mp-MRI fusion-guided biopsy with systematic random US-guided biopsy in prostate cancer diagnosis. 137 suspected prostatic abnormalities were identified on mp-MRI (1.5T) in 96 patients and classified according to PI-RADS score v2. All target lesions underwent US/mp-MRI fusion biopsy and prostatic sampling was completed by US-guided systematic random 12-core biopsies. Histological analysis and Gleason score were established for all the samples, both target lesions defined by mp-MRI, and random biopsies. PI-RADS score was correlated with the histological results, divided in three groups (benign tissue, atypia and carcinoma) and with Gleason groups, divided in four categories considering the new Grading system of the ISUP 2014, using t test. Multivariate analysis was used to correlate PI-RADS and Gleason categories to PSA level and abnormalities axial diameter. When the random core biopsies showed carcinoma (mp-MRI false-negatives), PSA value and lesions Gleason median value were compared with those of carcinomas identified by mp-MRI (true-positives), using t test. There was statistically significant difference between PI-RADS score in carcinoma, atypia and benign lesions groups (4.41, 3.61 and 3.24, respectively) and between PI-RADS score in Gleason < 7 group and Gleason > 7 group (4.14 and 4.79, respectively). mp-MRI performance was more accurate for lesions > 15 mm and in patients with PSA > 6 ng/ml. In systematic sampling, 130 (11.25%) mp-MRI false-negative were identified. There was no statistic difference in Gleason median value (7.0 vs 7.06) between this group and the mp-MRI true-positives, but a significant lower PSA median value was demonstrated (7.08 vs 7.53 ng/ml). mp-MRI remains the imaging modality of choice to identify PCa lesions. Integrating US-guided random sampling with US/mp-MRI fusion target lesions sampling, 3.49% of false-negative were identified.

  16. On the importance of incorporating sampling weights in ...

    EPA Pesticide Factsheets

    Occupancy models are used extensively to assess wildlife-habitat associations and to predict species distributions across large geographic regions. Occupancy models were developed as a tool to properly account for imperfect detection of a species. Current guidelines on survey design requirements for occupancy models focus on the number of sample units and the pattern of revisits to a sample unit within a season. We focus on the sampling design or how the sample units are selected in geographic space (e.g., stratified, simple random, unequal probability, etc). In a probability design, each sample unit has a sample weight which quantifies the number of sample units it represents in the finite (oftentimes areal) sampling frame. We demonstrate the importance of including sampling weights in occupancy model estimation when the design is not a simple random sample or equal probability design. We assume a finite areal sampling frame as proposed for a national bat monitoring program. We compare several unequal and equal probability designs and varying sampling intensity within a simulation study. We found the traditional single season occupancy model produced biased estimates of occupancy and lower confidence interval coverage rates compared to occupancy models that accounted for the sampling design. We also discuss how our findings inform the analyses proposed for the nascent North American Bat Monitoring Program and other collaborative synthesis efforts that propose h

  17. Stable and efficient retrospective 4D-MRI using non-uniformly distributed quasi-random numbers

    NASA Astrophysics Data System (ADS)

    Breuer, Kathrin; Meyer, Cord B.; Breuer, Felix A.; Richter, Anne; Exner, Florian; Weng, Andreas M.; Ströhle, Serge; Polat, Bülent; Jakob, Peter M.; Sauer, Otto A.; Flentje, Michael; Weick, Stefan

    2018-04-01

    The purpose of this work is the development of a robust and reliable three-dimensional (3D) Cartesian imaging technique for fast and flexible retrospective 4D abdominal MRI during free breathing. To this end, a non-uniform quasi random (NU-QR) reordering of the phase encoding (k y –k z ) lines was incorporated into 3D Cartesian acquisition. The proposed sampling scheme allocates more phase encoding points near the k-space origin while reducing the sampling density in the outer part of the k-space. Respiratory self-gating in combination with SPIRiT-reconstruction is used for the reconstruction of abdominal data sets in different respiratory phases (4D-MRI). Six volunteers and three patients were examined at 1.5 T during free breathing. Additionally, data sets with conventional two-dimensional (2D) linear and 2D quasi random phase encoding order were acquired for the volunteers for comparison. A quantitative evaluation of image quality versus scan times (from 70 s to 626 s) for the given sampling schemes was obtained by calculating the normalized mutual information (NMI) for all volunteers. Motion estimation was accomplished by calculating the maximum derivative of a signal intensity profile of a transition (e.g. tumor or diaphragm). The 2D non-uniform quasi-random distribution of phase encoding lines in Cartesian 3D MRI yields more efficient undersampling patterns for parallel imaging compared to conventional uniform quasi-random and linear sampling. Median NMI values of NU-QR sampling are the highest for all scan times. Therefore, within the same scan time 4D imaging could be performed with improved image quality. The proposed method allows for the reconstruction of motion artifact reduced 4D data sets with isotropic spatial resolution of 2.1  ×  2.1  ×  2.1 mm3 in a short scan time, e.g. 10 respiratory phases in only 3 min. Cranio-caudal tumor displacements between 23 and 46 mm could be observed. NU-QR sampling enables for stable 4D-MRI with high temporal and spatial resolution within short scan time for visualization of organ or tumor motion during free breathing. Further studies, e.g. the application of the method for radiotherapy planning are needed to investigate the clinical applicability and diagnostic value of the approach.

  18. Monte Carlo Sampling in Fractal Landscapes

    NASA Astrophysics Data System (ADS)

    Leitão, Jorge C.; Lopes, J. M. Viana Parente; Altmann, Eduardo G.

    2013-05-01

    We design a random walk to explore fractal landscapes such as those describing chaotic transients in dynamical systems. We show that the random walk moves efficiently only when its step length depends on the height of the landscape via the largest Lyapunov exponent of the chaotic system. We propose a generalization of the Wang-Landau algorithm which constructs not only the density of states (transient time distribution) but also the correct step length. As a result, we obtain a flat-histogram Monte Carlo method which samples fractal landscapes in polynomial time, a dramatic improvement over the exponential scaling of traditional uniform-sampling methods. Our results are not limited by the dimensionality of the landscape and are confirmed numerically in chaotic systems with up to 30 dimensions.

  19. Comparisons of molecular karyotype and RAPD patterns of anuran trypanosome isolates during long-term in vitro cultivation.

    PubMed

    Lun, Z R; Desser, S S

    1996-01-01

    The patterns of random amplified fragments and molecular karyotypes of 12 isolates of anuran trypanosomes continuously cultured in vitro were compared by random amplified polymorphic DNA (RAPD) analysis and pulsed field gradient gel electrophoresis (PFGE). The time interval between preparation of two series of samples was one year. Changes were not observed in the number and size of sharp, amplified fragments of DNA samples from both series examined with the ten primers used. Likewise, changes in the molecular karyotypes were not detected between the two samples of these isolates. These results suggest that the molecular karyotype and the RAPD patterns of the anuran trypanosomes remain stable after being cultured continuously in vitro for one year.

  20. Alcohol- and Drug-Involved Driving in the United States: Methodology for the 2007 National Roadside Survey

    PubMed Central

    Lacey, John H.; Kelley-Baker, Tara; Voas, Robert B.; Romano, Eduardo; Furr-Holden, C. Debra; Torres, Pedro; Berning, Amy

    2013-01-01

    This article describes the methodology used in the 2007 U.S. National Roadside Survey to estimate the prevalence of alcohol- and drug-impaired driving and alcohol- and drug-involved driving. This study involved randomly stopping drivers at 300 locations across the 48 continental U.S. states at sites selected through a stratified random sampling procedure. Data were collected during a 2-hour Friday daytime session at 60 locations and during 2-hour nighttime weekend periods at 240 locations. Both self-report and biological measures were taken. Biological measures included breath alcohol measurements from 9,413 respondents, oral fluid samples from 7,719 respondents, and blood samples from 3,276 respondents. PMID:21997324

  1. Observed intra-cluster correlation coefficients in a cluster survey sample of patient encounters in general practice in Australia

    PubMed Central

    Knox, Stephanie A; Chondros, Patty

    2004-01-01

    Background Cluster sample study designs are cost effective, however cluster samples violate the simple random sample assumption of independence of observations. Failure to account for the intra-cluster correlation of observations when sampling through clusters may lead to an under-powered study. Researchers therefore need estimates of intra-cluster correlation for a range of outcomes to calculate sample size. We report intra-cluster correlation coefficients observed within a large-scale cross-sectional study of general practice in Australia, where the general practitioner (GP) was the primary sampling unit and the patient encounter was the unit of inference. Methods Each year the Bettering the Evaluation and Care of Health (BEACH) study recruits a random sample of approximately 1,000 GPs across Australia. Each GP completes details of 100 consecutive patient encounters. Intra-cluster correlation coefficients were estimated for patient demographics, morbidity managed and treatments received. Intra-cluster correlation coefficients were estimated for descriptive outcomes and for associations between outcomes and predictors and were compared across two independent samples of GPs drawn three years apart. Results Between April 1999 and March 2000, a random sample of 1,047 Australian general practitioners recorded details of 104,700 patient encounters. Intra-cluster correlation coefficients for patient demographics ranged from 0.055 for patient sex to 0.451 for language spoken at home. Intra-cluster correlations for morbidity variables ranged from 0.005 for the management of eye problems to 0.059 for management of psychological problems. Intra-cluster correlation for the association between two variables was smaller than the descriptive intra-cluster correlation of each variable. When compared with the April 2002 to March 2003 sample (1,008 GPs) the estimated intra-cluster correlation coefficients were found to be consistent across samples. Conclusions The demonstrated precision and reliability of the estimated intra-cluster correlations indicate that these coefficients will be useful for calculating sample sizes in future general practice surveys that use the GP as the primary sampling unit. PMID:15613248

  2. Views of United States physicians and members of the American Medical Association House of Delegates on physician-assisted suicide.

    PubMed

    Whitney, S N; Brown, B W; Brody, H; Alcser, K H; Bachman, J G; Greely, H T

    2001-05-01

    To ascertain the views of physicians and physician leaders toward the legalization of physician-assisted suicide. Confidential mail questionnaire. A nationwide random sample of physicians of all ages and specialties, and all members of the American Medical Association (AMA) House of Delegates as of April 1996. Demographic and practice characteristics and attitude toward legalization of physician-assisted suicide. Usable questionnaires were returned by 658 of 930 eligible physicians in the nationwide random sample (71%) and 315 of 390 eligible physicians in the House of Delegates (81%). In the nationwide random sample, 44.5% favored legalization (16.4% definitely and 28.1% probably), 33.9% opposed legalization (20.4% definitely and 13.5% probably), and 22% were unsure. Opposition to legalization was strongly associated with self-defined politically conservative beliefs, religious affiliation, and the importance of religion to the respondent (P <.001). Among members of the AMA House of Delegates, 23.5% favored legalization (7.3% definitely and 16.2% probably), 61.6% opposed legalization (43.5% definitely and 18.1% probably), and 15% were unsure; their views differed significantly from those of the nationwide random sample (P <.001). Given the choice, a majority of both groups would prefer no law at all, with physician-assisted suicide being neither legal nor illegal. Members of the AMA House of Delegates strongly oppose physician-assisted suicide, but rank-and-file physicians show no consensus either for or against its legalization. Although the debate is sometimes adversarial, most physicians in the United States are uncertain or endorse moderate views on assisted suicide.

  3. Trends in glaucoma surgery incidence and reimbursement for physician services in the Medicare population from 1995 to 1998.

    PubMed

    Paikal, David; Yu, Fei; Coleman, Anne L

    2002-07-01

    To better understand the relationship between glaucoma management and economic incentives, we examined the volume and the reimbursement of argon laser trabeculoplasty (ALT) and trabeculectomy in a 5% random sample of the Medicare population from 1995 to 1998. Retrospective cohort study. Subjects in a 5% random sample of the Medicare population who had ALT and trabeculectomy from 1995 to 1998. Using the Health Care Financing Administration (HCFA) Physician/Supplier Part-B files for a 5% random sample of the Medicare population, we identified all subjects who had ALT and trabeculectomy from 1995 to 1998. Descriptive summaries (the number of surgeries and the mean and the standard deviation of reimbursement per surgery) were calculated for each year. Analysis of variance was used to test for differences in reimbursement per surgery across years. Chi-square tests were used to assess any associations between the changing numbers of ALTs and trabeculectomies over the study period and both age and race. We assessed the number of ALTs and trabeculectomies and the allowed charges for each surgery in the 5% random sample of the Medicare population from 1995 to 1998. The volume of both ALTs and trabeculectomies declined during the study period. Reimbursement per surgery for both ALT and trabeculectomy varied significantly across years (P < 0.001). Significant associations were found between the changing number of ALTs and both age and race. Changing numbers of ALT and trabeculectomy seem unrelated to reimbursement rates. Rather, these changes are more likely driven by new developments in the clinical management of glaucoma, among other factors.

  4. Endothelial Nitric Oxide Synthase (G894T) Gene Polymorphism in a Random Sample of the Egyptian Population: Comparison with Myocardial Infarction Patients

    PubMed Central

    Abdel Rahman, Mohamed F.; Hashad, Ingy M.; Abdel-Maksoud, Sahar M.; Farag, Nabil M.; Abou-Aisha, Khaled

    2012-01-01

    Aim: The aim of this study was to detect endothelial nitric oxide synthase (eNOS) Glu298Asp gene variants in a random sample of the Egyptian population, compare it with those from other populations, and attempt to correlate these variants with serum levels of nitric oxide (NO). The association of eNOS genotypes or serum NO levels with the incidence of acute myocardial infarction (AMI) was also examined. Methods: One hundred one unrelated healthy subjects and 104 unrelated AMI patients were recruited randomly from the 57357 Hospital and intensive care units of El Demerdash Hospital and National Heart Institute, Cairo, Egypt. eNOS genotypes were determined by polymerase chain reaction–restriction fragment length polymorphism. Serum NO was determined spectrophotometrically. Results: The genotype distribution of eNOS Glu298Asp polymorphism determined for our sample was 58.42% GG (wild type), 33.66% GT, and 7.92% TT genotypes while allele frequencies were 75.25% and 24.75% for G and T alleles, respectively. No significant association between serum NO and specific eNOS genotype could be detected. No significant correlation between eNOS genotype distribution or allele frequencies and the incidence of AMI was observed. Conclusion: The present study demonstrated the predominance of the homozygous genotype GG over the heterozygous GT and homozygous TT in random samples of Egyptian population. It also showed the lack of association between eNOS genotypes and mean serum levels of NO, as well as the incidence of AMI. PMID:22731641

  5. Do children with grapheme-colour synaesthesia show cognitive benefits?

    PubMed

    Simner, Julia; Bain, Angela E

    2018-02-01

    Grapheme-colour synaesthesia is characterized by conscious and consistent associations between letters and colours, or between numbers and colours (e.g., synaesthetes might see A as red, 7 as green). Our study explored the development of this condition in a group of randomly sampled child synaesthetes. Two previous studies (Simner & Bain, 2013, Frontiers in Human Neuroscience, 7, 603; Simner, Harrold, Creed, Monro, & Foulkes, 2009, Brain, 132, 57) had screened over 600 primary school children to find the first randomly sampled cohort of child synaesthetes. In this study, we evaluate this cohort to ask whether their synaesthesia is associated with a particular cognitive profile of strengths and/or weaknesses. We tested our child synaesthetes at age 10-11 years in a series of cognitive tests, in comparison with matched controls and baseline norms. One previous study (Green & Goswami, 2008, Cognition, 106, 463) had suggested that child synaesthetes might perform differently to non-synaesthetes in such tasks, although those participants may have been a special type of population independent of their synaesthesia. In our own study of randomly sampled child synaesthetes, we found no significant advantages or disadvantages in a receptive vocabulary test and a memory matrix task. However, we found that synaesthetes demonstrated above-average performance in a processing-speed task and a near-significant advantage in a letter-span task (i.e., memory/recall task of letters). Our findings point to advantages for synaesthetes that go beyond those expected from enhanced coding accounts and we present the first picture of the broader cognitive profile of a randomly sampled population of child synaesthetes. © 2017 The British Psychological Society.

  6. A Dexterous Optional Randomized Response Model

    ERIC Educational Resources Information Center

    Tarray, Tanveer A.; Singh, Housila P.; Yan, Zaizai

    2017-01-01

    This article addresses the problem of estimating the proportion Pi[subscript S] of the population belonging to a sensitive group using optional randomized response technique in stratified sampling based on Mangat model that has proportional and Neyman allocation and larger gain in efficiency. Numerically, it is found that the suggested model is…

  7. Group Matching: Is This a Research Technique to Be Avoided?

    ERIC Educational Resources Information Center

    Ross, Donald C.; Klein, Donald F.

    1988-01-01

    The variance of the sample difference and the power of the "F" test for mean differences were studied under group matching on covariates and also under random assignment. Results shed light on systematic assignment procedures advocated to provide more precise estimates of treatment effects than simple random assignment. (TJH)

  8. Intraclass Correlation Values for Planning Group-Randomized Trials in Education

    ERIC Educational Resources Information Center

    Hedges, Larry V.; Hedberg, E. C.

    2007-01-01

    Experiments that assign intact groups to treatment conditions are increasingly common in social research. In educational research, the groups assigned are often schools. The design of group-randomized experiments requires knowledge of the intraclass correlation structure to compute statistical power and sample sizes required to achieve adequate…

  9. A Randomized Violence Prevention Trial with Comparison: Responses by Gender

    ERIC Educational Resources Information Center

    Griffin, James P., Jr.; Chen, Dungtsa; Eubanks, Adriane; Brantley, Katrina M.; Willis, Leigh A.

    2007-01-01

    Using random assignment of students to two intervention groups and a comparison school sample, the researchers evaluated a three-group school-based violence prevention program. The three groups were (1) a whole-school intervention, (2) whole-school, cognitive-behavioral and cultural enrichment training, and (3) no violence prevention. The…

  10. Random Sampling with Interspike-Intervals of the Exponential Integrate and Fire Neuron: A Computational Interpretation of UP-States

    PubMed Central

    Steimer, Andreas; Schindler, Kaspar

    2015-01-01

    Oscillations between high and low values of the membrane potential (UP and DOWN states respectively) are an ubiquitous feature of cortical neurons during slow wave sleep and anesthesia. Nevertheless, a surprisingly small number of quantitative studies have been conducted only that deal with this phenomenon’s implications for computation. Here we present a novel theory that explains on a detailed mathematical level the computational benefits of UP states. The theory is based on random sampling by means of interspike intervals (ISIs) of the exponential integrate and fire (EIF) model neuron, such that each spike is considered a sample, whose analog value corresponds to the spike’s preceding ISI. As we show, the EIF’s exponential sodium current, that kicks in when balancing a noisy membrane potential around values close to the firing threshold, leads to a particularly simple, approximative relationship between the neuron’s ISI distribution and input current. Approximation quality depends on the frequency spectrum of the current and is improved upon increasing the voltage baseline towards threshold. Thus, the conceptually simpler leaky integrate and fire neuron that is missing such an additional current boost performs consistently worse than the EIF and does not improve when voltage baseline is increased. For the EIF in contrast, the presented mechanism is particularly effective in the high-conductance regime, which is a hallmark feature of UP-states. Our theoretical results are confirmed by accompanying simulations, which were conducted for input currents of varying spectral composition. Moreover, we provide analytical estimations of the range of ISI distributions the EIF neuron can sample from at a given approximation level. Such samples may be considered by any algorithmic procedure that is based on random sampling, such as Markov Chain Monte Carlo or message-passing methods. Finally, we explain how spike-based random sampling relates to existing computational theories about UP states during slow wave sleep and present possible extensions of the model in the context of spike-frequency adaptation. PMID:26203657

  11. Sampling pig farms at the abattoir in a cross-sectional study - Evaluation of a sampling method.

    PubMed

    Birkegård, Anna Camilla; Halasa, Tariq; Toft, Nils

    2017-09-15

    A cross-sectional study design is relatively inexpensive, fast and easy to conduct when compared to other study designs. Careful planning is essential to obtaining a representative sample of the population, and the recommended approach is to use simple random sampling from an exhaustive list of units in the target population. This approach is rarely feasible in practice, and other sampling procedures must often be adopted. For example, when slaughter pigs are the target population, sampling the pigs on the slaughter line may be an alternative to on-site sampling at a list of farms. However, it is difficult to sample a large number of farms from an exact predefined list, due to the logistics and workflow of an abattoir. Therefore, it is necessary to have a systematic sampling procedure and to evaluate the obtained sample with respect to the study objective. We propose a method for 1) planning, 2) conducting, and 3) evaluating the representativeness and reproducibility of a cross-sectional study when simple random sampling is not possible. We used an example of a cross-sectional study with the aim of quantifying the association of antimicrobial resistance and antimicrobial consumption in Danish slaughter pigs. It was not possible to visit farms within the designated timeframe. Therefore, it was decided to use convenience sampling at the abattoir. Our approach was carried out in three steps: 1) planning: using data from meat inspection to plan at which abattoirs and how many farms to sample; 2) conducting: sampling was carried out at five abattoirs; 3) evaluation: representativeness was evaluated by comparing sampled and non-sampled farms, and the reproducibility of the study was assessed through simulated sampling based on meat inspection data from the period where the actual data collection was carried out. In the cross-sectional study samples were taken from 681 Danish pig farms, during five weeks from February to March 2015. The evaluation showed that the sampling procedure was reproducible with results comparable to the collected sample. However, the sampling procedure favoured sampling of large farms. Furthermore, both under-sampled and over-sampled areas were found using scan statistics. In conclusion, sampling conducted at abattoirs can provide a spatially representative sample. Hence it is a possible cost-effective alternative to simple random sampling. However, it is important to assess the properties of the resulting sample so that any potential selection bias can be addressed when reporting the findings. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Survey research with a random digit dial national mobile phone sample in Ghana: Methods and sample quality.

    PubMed

    L'Engle, Kelly; Sefa, Eunice; Adimazoya, Edward Akolgo; Yartey, Emmanuel; Lenzi, Rachel; Tarpo, Cindy; Heward-Mills, Nii Lante; Lew, Katherine; Ampeh, Yvonne

    2018-01-01

    Generating a nationally representative sample in low and middle income countries typically requires resource-intensive household level sampling with door-to-door data collection. High mobile phone penetration rates in developing countries provide new opportunities for alternative sampling and data collection methods, but there is limited information about response rates and sample biases in coverage and nonresponse using these methods. We utilized data from an interactive voice response, random-digit dial, national mobile phone survey in Ghana to calculate standardized response rates and assess representativeness of the obtained sample. The survey methodology was piloted in two rounds of data collection. The final survey included 18 demographic, media exposure, and health behavior questions. Call outcomes and response rates were calculated according to the American Association of Public Opinion Research guidelines. Sample characteristics, productivity, and costs per interview were calculated. Representativeness was assessed by comparing data to the Ghana Demographic and Health Survey and the National Population and Housing Census. The survey was fielded during a 27-day period in February-March 2017. There were 9,469 completed interviews and 3,547 partial interviews. Response, cooperation, refusal, and contact rates were 31%, 81%, 7%, and 39% respectively. Twenty-three calls were dialed to produce an eligible contact: nonresponse was substantial due to the automated calling system and dialing of many unassigned or non-working numbers. Younger, urban, better educated, and male respondents were overrepresented in the sample. The innovative mobile phone data collection methodology yielded a large sample in a relatively short period. Response rates were comparable to other surveys, although substantial coverage bias resulted from fewer women, rural, and older residents completing the mobile phone survey in comparison to household surveys. Random digit dialing of mobile phones offers promise for future data collection in Ghana and may be suitable for other developing countries.

  13. Survey research with a random digit dial national mobile phone sample in Ghana: Methods and sample quality

    PubMed Central

    Sefa, Eunice; Adimazoya, Edward Akolgo; Yartey, Emmanuel; Lenzi, Rachel; Tarpo, Cindy; Heward-Mills, Nii Lante; Lew, Katherine; Ampeh, Yvonne

    2018-01-01

    Introduction Generating a nationally representative sample in low and middle income countries typically requires resource-intensive household level sampling with door-to-door data collection. High mobile phone penetration rates in developing countries provide new opportunities for alternative sampling and data collection methods, but there is limited information about response rates and sample biases in coverage and nonresponse using these methods. We utilized data from an interactive voice response, random-digit dial, national mobile phone survey in Ghana to calculate standardized response rates and assess representativeness of the obtained sample. Materials and methods The survey methodology was piloted in two rounds of data collection. The final survey included 18 demographic, media exposure, and health behavior questions. Call outcomes and response rates were calculated according to the American Association of Public Opinion Research guidelines. Sample characteristics, productivity, and costs per interview were calculated. Representativeness was assessed by comparing data to the Ghana Demographic and Health Survey and the National Population and Housing Census. Results The survey was fielded during a 27-day period in February-March 2017. There were 9,469 completed interviews and 3,547 partial interviews. Response, cooperation, refusal, and contact rates were 31%, 81%, 7%, and 39% respectively. Twenty-three calls were dialed to produce an eligible contact: nonresponse was substantial due to the automated calling system and dialing of many unassigned or non-working numbers. Younger, urban, better educated, and male respondents were overrepresented in the sample. Conclusions The innovative mobile phone data collection methodology yielded a large sample in a relatively short period. Response rates were comparable to other surveys, although substantial coverage bias resulted from fewer women, rural, and older residents completing the mobile phone survey in comparison to household surveys. Random digit dialing of mobile phones offers promise for future data collection in Ghana and may be suitable for other developing countries. PMID:29351349

  14. Work Sampling Study of an Engineering Professor during a Regular Contract Period

    ERIC Educational Resources Information Center

    Brink, Jan; McDonald, Dale B.

    2015-01-01

    Work sampling is a technique that has been employed in industry and fields such as healthcare for some time. It is a powerful technique, and an alternative to conventional stop watch time studies, used by industrial engineers to focus upon random work sampling observations. This study applies work sampling to the duties performed by an individual…

  15. The Accuracy of Estimated Total Test Statistics. Final Report.

    ERIC Educational Resources Information Center

    Kleinke, David J.

    In a post-mortem study of item sampling, 1,050 examinees were divided into ten groups 50 times. Each time, their papers were scored on four different sets of item samples from a 150-item test of academic aptitude. These samples were selected using (a) unstratified random sampling and stratification on (b) content, (c) difficulty, and (d) both.…

  16. Remote Sensing, Sampling and Simulation Applications in Analyses of Insect Dispersion and Abundance in Cotton

    Treesearch

    J. L. Willers; J. M. McKinion; J. N. Jenkins

    2006-01-01

    Simulation was employed to create stratified simple random samples of different sample unit sizes to represent tarnished plant bug abundance at different densities within various habitats of simulated cotton fields. These samples were used to investigate dispersion patterns of this cotton insect. It was found that the assessment of spatial pattern varied as a function...

  17. Statistical Power and Optimum Sample Allocation Ratio for Treatment and Control Having Unequal Costs Per Unit of Randomization

    ERIC Educational Resources Information Center

    Liu, Xiaofeng

    2003-01-01

    This article considers optimal sample allocation between the treatment and control condition in multilevel designs when the costs per sampling unit vary due to treatment assignment. Optimal unequal allocation may reduce the cost from that of a balanced design without sacrificing any power. The optimum sample allocation ratio depends only on the…

  18. Fitting distributions to microbial contamination data collected with an unequal probability sampling design.

    PubMed

    Williams, M S; Ebel, E D; Cao, Y

    2013-01-01

    The fitting of statistical distributions to microbial sampling data is a common application in quantitative microbiology and risk assessment applications. An underlying assumption of most fitting techniques is that data are collected with simple random sampling, which is often times not the case. This study develops a weighted maximum likelihood estimation framework that is appropriate for microbiological samples that are collected with unequal probabilities of selection. A weighted maximum likelihood estimation framework is proposed for microbiological samples that are collected with unequal probabilities of selection. Two examples, based on the collection of food samples during processing, are provided to demonstrate the method and highlight the magnitude of biases in the maximum likelihood estimator when data are inappropriately treated as a simple random sample. Failure to properly weight samples to account for how data are collected can introduce substantial biases into inferences drawn from the data. The proposed methodology will reduce or eliminate an important source of bias in inferences drawn from the analysis of microbial data. This will also make comparisons between studies and the combination of results from different studies more reliable, which is important for risk assessment applications. © 2012 No claim to US Government works.

  19. The effect of sampling rate on observed statistics in a correlated random walk

    PubMed Central

    Rosser, G.; Fletcher, A. G.; Maini, P. K.; Baker, R. E.

    2013-01-01

    Tracking the movement of individual cells or animals can provide important information about their motile behaviour, with key examples including migrating birds, foraging mammals and bacterial chemotaxis. In many experimental protocols, observations are recorded with a fixed sampling interval and the continuous underlying motion is approximated as a series of discrete steps. The size of the sampling interval significantly affects the tracking measurements, the statistics computed from observed trajectories, and the inferences drawn. Despite the widespread use of tracking data to investigate motile behaviour, many open questions remain about these effects. We use a correlated random walk model to study the variation with sampling interval of two key quantities of interest: apparent speed and angle change. Two variants of the model are considered, in which reorientations occur instantaneously and with a stationary pause, respectively. We employ stochastic simulations to study the effect of sampling on the distributions of apparent speeds and angle changes, and present novel mathematical analysis in the case of rapid sampling. Our investigation elucidates the complex nature of sampling effects for sampling intervals ranging over many orders of magnitude. Results show that inclusion of a stationary phase significantly alters the observed distributions of both quantities. PMID:23740484

  20. Distribution of pesticide residues in soil and uncertainty of sampling.

    PubMed

    Suszter, Gabriela K; Ambrus, Árpád

    2017-08-03

    Pesticide residues were determined in about 120 soil cores taken randomly from the top 15 cm layer of two sunflower fields about 30 days after preemergence herbicide treatments. Samples were extracted with acetone-ethyl acetate mixture and the residues were determined with GC-TSD. Residues of dimethenamid, pendimethalin, and prometryn ranged from 0.005 to 2.97 mg/kg. Their relative standard deviations (CV) were between 0.66 and 1.13. The relative frequency distributions of residues in soil cores were very similar to those observed in root and tuber vegetables grown in pesticide treated soils. Based on all available information, a typical CV of 1.00 was estimated for pesticide residues in primary soil samples (soil cores). The corresponding expectable relative uncertainty of sampling is 20% when composite samples of size 25 are taken. To obtain a reliable estimate of the average residues in the top 15 cm layer of soil of a field up to 8 independent replicate random samples should be taken. To obtain better estimate of the actual residue level of the sampled filed would be marginal if larger number of samples were taken.

  1. DS/LPI autocorrelation detection in noise plus random-tone interference. [Direct Sequence Low-Probabilty of Intercept

    NASA Technical Reports Server (NTRS)

    Hinedi, S.; Polydoros, A.

    1988-01-01

    The authors present and analyze a frequency-noncoherent two-lag autocorrelation statistic for the wideband detection of random BPSK signals in noise-plus-random-multitone interference. It is shown that this detector is quite robust to the presence or absence of interference and its specific parameter values, contrary to the case of an energy detector. The rule assumes knowledge of the data rate and the active scenario under H0. It is concluded that the real-time autocorrelation domain and its samples (lags) are a viable approach for detecting random signals in dense environments.

  2. LOD score exclusion analyses for candidate QTLs using random population samples.

    PubMed

    Deng, Hong-Wen

    2003-11-01

    While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes as putative QTLs using random population samples. Previously, we developed an LOD score exclusion mapping approach for candidate genes for complex diseases. Here, we extend this LOD score approach for exclusion analyses of candidate genes for quantitative traits. Under this approach, specific genetic effects (as reflected by heritability) and inheritance models at candidate QTLs can be analyzed and if an LOD score is < or = -2.0, the locus can be excluded from having a heritability larger than that specified. Simulations show that this approach has high power to exclude a candidate gene from having moderate genetic effects if it is not a QTL and is robust to population admixture. Our exclusion analysis complements association analysis for candidate genes as putative QTLs in random population samples. The approach is applied to test the importance of Vitamin D receptor (VDR) gene as a potential QTL underlying the variation of bone mass, an important determinant of osteoporosis.

  3. Sample size determination for GEE analyses of stepped wedge cluster randomized trials.

    PubMed

    Li, Fan; Turner, Elizabeth L; Preisser, John S

    2018-06-19

    In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator. © 2018, The International Biometric Society.

  4. Sample size determination for bibliographic retrieval studies

    PubMed Central

    Yao, Xiaomei; Wilczynski, Nancy L; Walter, Stephen D; Haynes, R Brian

    2008-01-01

    Background Research for developing search strategies to retrieve high-quality clinical journal articles from MEDLINE is expensive and time-consuming. The objective of this study was to determine the minimal number of high-quality articles in a journal subset that would need to be hand-searched to update or create new MEDLINE search strategies for treatment, diagnosis, and prognosis studies. Methods The desired width of the 95% confidence intervals (W) for the lowest sensitivity among existing search strategies was used to calculate the number of high-quality articles needed to reliably update search strategies. New search strategies were derived in journal subsets formed by 2 approaches: random sampling of journals and top journals (having the most high-quality articles). The new strategies were tested in both the original large journal database and in a low-yielding journal (having few high-quality articles) subset. Results For treatment studies, if W was 10% or less for the lowest sensitivity among our existing search strategies, a subset of 15 randomly selected journals or 2 top journals were adequate for updating search strategies, based on each approach having at least 99 high-quality articles. The new strategies derived in 15 randomly selected journals or 2 top journals performed well in the original large journal database. Nevertheless, the new search strategies developed using the random sampling approach performed better than those developed using the top journal approach in a low-yielding journal subset. For studies of diagnosis and prognosis, no journal subset had enough high-quality articles to achieve the expected W (10%). Conclusion The approach of randomly sampling a small subset of journals that includes sufficient high-quality articles is an efficient way to update or create search strategies for high-quality articles on therapy in MEDLINE. The concentrations of diagnosis and prognosis articles are too low for this approach. PMID:18823538

  5. Polynomial chaos representation of databases on manifolds

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu

    2017-04-15

    Characterizing the polynomial chaos expansion (PCE) of a vector-valued random variable with probability distribution concentrated on a manifold is a relevant problem in data-driven settings. The probability distribution of such random vectors is multimodal in general, leading to potentially very slow convergence of the PCE. In this paper, we build on a recent development for estimating and sampling from probabilities concentrated on a diffusion manifold. The proposed methodology constructs a PCE of the random vector together with an associated generator that samples from the target probability distribution which is estimated from data concentrated in the neighborhood of the manifold. Themore » method is robust and remains efficient for high dimension and large datasets. The resulting polynomial chaos construction on manifolds permits the adaptation of many uncertainty quantification and statistical tools to emerging questions motivated by data-driven queries.« less

  6. Recruitment of healthy participants for studies on risks for alcoholism: effectiveness of random digit dialling.

    PubMed

    Sorocco, Kristen H; Vincent, Andrea S; Collins, Frank L; Johnson, Christine A; Lovallo, William R

    2006-01-01

    To compare the effectiveness of two strategies for recruiting healthy research volunteers. Demographic characteristics and recruitment costs of participants who completed a laboratory study examining risk factors for alcoholism recruited through random digit dialling (N = 11) and community advertisements (N = 102) were compared. Advertisement yielded a more representative sample [76% Caucasian, less well educated (M = 15.2 years, SEM = 0.2; P < 0.05), more equally divided by family history of alcoholism (43% FH- and 57% FH+), and lower in SES (M = 42.8, SEM = 1.3; P < 0.05)] and was more cost effective (72 dollars vs 2272 dollars per participant) than random digit dialling. Findings are relevant to alcohol researchers trying to determine the recruitment strategy that will yield the most representative sample at the lowest cost.

  7. Distributed fiber sparse-wideband vibration sensing by sub-Nyquist additive random sampling

    NASA Astrophysics Data System (ADS)

    Zhang, Jingdong; Zheng, Hua; Zhu, Tao; Yin, Guolu; Liu, Min; Bai, Yongzhong; Qu, Dingrong; Qiu, Feng; Huang, Xianbing

    2018-05-01

    The round trip time of the light pulse limits the maximum detectable vibration frequency response range of phase-sensitive optical time domain reflectometry ({\\phi}-OTDR). Unlike the uniform laser pulse interval in conventional {\\phi}-OTDR, we randomly modulate the pulse interval, so that an equivalent sub-Nyquist additive random sampling (sNARS) is realized for every sensing point of the long interrogation fiber. For an {\\phi}-OTDR system with 10 km sensing length, the sNARS method is optimized by theoretical analysis and Monte Carlo simulation, and the experimental results verify that a wide-band spars signal can be identified and reconstructed. Such a method can broaden the vibration frequency response range of {\\phi}-OTDR, which is of great significance in sparse-wideband-frequency vibration signal detection, such as rail track monitoring and metal defect detection.

  8. Estimating the occupancy of spotted owl habitat areas by sampling and adjusting for bias

    Treesearch

    David L. Azuma; James A. Baldwin; Barry R. Noon

    1990-01-01

    A basic sampling scheme is proposed to estimate the proportion of sampled units (Spotted Owl Habitat Areas (SOHAs) or randomly sampled 1000-acre polygon areas (RSAs)) occupied by spotted owl pairs. A bias adjustment for the possibility of missing a pair given its presence on a SOHA or RSA is suggested. The sampling scheme is based on a fixed number of visits to a...

  9. A Randomized Clinical Trial of the Collaborative Assessment and Management of Suicidality vs. Enhanced Care as Usual for Sucidal Soldiers

    DTIC Science & Technology

    2017-06-01

    Research Projects : Patient Perspectives on Successful Management of Suicide Risk in Military and Civilian Samples Masters Student: Kaitlyn R. Schuler...Award Number: W81XWH-11-1-0164 TITLE: "A Randomized Clinical Trial of the Collaborative Assessment & Management of Suicidality vs. Enhanced Care...REPORT TYPE Final 3. DATES COVERED (From - To) 4. TITLE AND SUBTITLE "A Randomized Clinical Trial of the Collaborative Assessment & Management

  10. Polarized ensembles of random pure states

    NASA Astrophysics Data System (ADS)

    Deelan Cunden, Fabio; Facchi, Paolo; Florio, Giuseppe

    2013-08-01

    A new family of polarized ensembles of random pure states is presented. These ensembles are obtained by linear superposition of two random pure states with suitable distributions, and are quite manageable. We will use the obtained results for two purposes: on the one hand we will be able to derive an efficient strategy for sampling states from isopurity manifolds. On the other, we will characterize the deviation of a pure quantum state from separability under the influence of noise.

  11. Studies in astronomical time series analysis: Modeling random processes in the time domain

    NASA Technical Reports Server (NTRS)

    Scargle, J. D.

    1979-01-01

    Random process models phased in the time domain are used to analyze astrophysical time series data produced by random processes. A moving average (MA) model represents the data as a sequence of pulses occurring randomly in time, with random amplitudes. An autoregressive (AR) model represents the correlations in the process in terms of a linear function of past values. The best AR model is determined from sampled data and transformed to an MA for interpretation. The randomness of the pulse amplitudes is maximized by a FORTRAN algorithm which is relatively stable numerically. Results of test cases are given to study the effects of adding noise and of different distributions for the pulse amplitudes. A preliminary analysis of the optical light curve of the quasar 3C 273 is given.

  12. Fluorescent Random Amplified Microsatellites (F-RAMS) analysis of mushrooms as a forensic investigative tool.

    PubMed

    Kallifatidis, Beatrice; Borovička, Jan; Stránská, Jana; Drábek, Jiří; Mills, Deetta K

    2014-03-01

    The capability of Fluorescent Random Amplified Microsatellites (F-RAMS) to profile hallucinogenic mushrooms to species and sub-species level was assessed. Fifteen samples of Amanita rubescens and 22 samples of other hallucinogenic and non-hallucinogenic mushrooms of the genera Amanita and Psilocybe were profiled using two fluorescently-labeled, 5'degenerate primers, 5'-6FAM-SpC3-DD (CCA)5 and 5'-6FAM-SpC3-DHB (CGA)5, which target different microsatellite repeat regions. Among the two primers, 5'-6FAM-SpC3-DHB (CGA)5 provided more reliable data for identification purposes, by grouping samples of the same species and clustering closely related species together in a dendrogram based on amplicon similarities. A high degree of intra-specific variation between the 15 A. rubescens samples was shown with both primers and the amplicons generated for all A. rubescens samples were organized into three classes of amplicons (discriminant, private, and marker) based on their individualizing potential. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  13. Boosting association rule mining in large datasets via Gibbs sampling.

    PubMed

    Qian, Guoqi; Rao, Calyampudi Radhakrishna; Sun, Xiaoying; Wu, Yuehua

    2016-05-03

    Current algorithms for association rule mining from transaction data are mostly deterministic and enumerative. They can be computationally intractable even for mining a dataset containing just a few hundred transaction items, if no action is taken to constrain the search space. In this paper, we develop a Gibbs-sampling-induced stochastic search procedure to randomly sample association rules from the itemset space, and perform rule mining from the reduced transaction dataset generated by the sample. Also a general rule importance measure is proposed to direct the stochastic search so that, as a result of the randomly generated association rules constituting an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. In the simulation study and a real genomic data example, we show how to boost association rule mining by an integrated use of the stochastic search and the Apriori algorithm.

  14. Macrostructure from Microstructure: Generating Whole Systems from Ego Networks

    PubMed Central

    Smith, Jeffrey A.

    2014-01-01

    This paper presents a new simulation method to make global network inference from sampled data. The proposed simulation method takes sampled ego network data and uses Exponential Random Graph Models (ERGM) to reconstruct the features of the true, unknown network. After describing the method, the paper presents two validity checks of the approach: the first uses the 20 largest Add Health networks while the second uses the Sociology Coauthorship network in the 1990's. For each test, I take random ego network samples from the known networks and use my method to make global network inference. I find that my method successfully reproduces the properties of the networks, such as distance and main component size. The results also suggest that simpler, baseline models provide considerably worse estimates for most network properties. I end the paper by discussing the bounds/limitations of ego network sampling. I also discuss possible extensions to the proposed approach. PMID:25339783

  15. THE RHETORICAL USE OF RANDOM SAMPLING: CRAFTING AND COMMUNICATING THE PUBLIC IMAGE OF POLLS AS A SCIENCE (1935-1948).

    PubMed

    Lusinchi, Dominic

    2017-03-01

    The scientific pollsters (Archibald Crossley, George H. Gallup, and Elmo Roper) emerged onto the American news media scene in 1935. Much of what they did in the following years (1935-1948) was to promote both the political and scientific legitimacy of their enterprise. They sought to be recognized as the sole legitimate producers of public opinion. In this essay I examine the, mostly overlooked, rhetorical work deployed by the pollsters to publicize the scientific credentials of their polling activities, and the central role the concept of sampling has had in that pursuit. First, they distanced themselves from the failed straw poll by claiming that their sampling methodology based on quotas was informed by science. Second, although in practice they did not use random sampling, they relied on it rhetorically to derive the symbolic benefits of being associated with the "laws of probability." © 2017 Wiley Periodicals, Inc.

  16. VNIR hyperspectral background characterization methods in adverse weather conditions

    NASA Astrophysics Data System (ADS)

    Romano, João M.; Rosario, Dalton; Roth, Luz

    2009-05-01

    Hyperspectral technology is currently being used by the military to detect regions of interest where potential targets may be located. Weather variability, however, may affect the ability for an algorithm to discriminate possible targets from background clutter. Nonetheless, different background characterization approaches may facilitate the ability for an algorithm to discriminate potential targets over a variety of weather conditions. In a previous paper, we introduced a new autonomous target size invariant background characterization process, the Autonomous Background Characterization (ABC) or also known as the Parallel Random Sampling (PRS) method, features a random sampling stage, a parallel process to mitigate the inclusion by chance of target samples into clutter background classes during random sampling; and a fusion of results at the end. In this paper, we will demonstrate how different background characterization approaches are able to improve performance of algorithms over a variety of challenging weather conditions. By using the Mahalanobis distance as the standard algorithm for this study, we compare the performance of different characterization methods such as: the global information, 2 stage global information, and our proposed method, ABC, using data that was collected under a variety of adverse weather conditions. For this study, we used ARDEC's Hyperspectral VNIR Adverse Weather data collection comprised of heavy, light, and transitional fog, light and heavy rain, and low light conditions.

  17. Species conservation profiles of a random sample of world spiders I: Agelenidae to Filistatidae

    PubMed Central

    Seppälä, Sini; Henriques, Sérgio; Draney, Michael L; Foord, Stefan; Gibbons, Alastair T; Gomez, Luz A; Kariko, Sarah; Malumbres-Olarte, Jagoba; Milne, Marc; Vink, Cor J

    2018-01-01

    Abstract Background The IUCN Red List of Threatened Species is the most widely used information source on the extinction risk of species. One of the uses of the Red List is to evaluate and monitor the state of biodiversity and a possible approach for this purpose is the Red List Index (RLI). For many taxa, mainly hyperdiverse groups, it is not possible within available resources to assess all known species. In such cases, a random sample of species might be selected for assessment and the results derived from it extrapolated for the entire group - the Sampled Red List Index (SRLI). With the current contribution and the three following papers, we intend to create the first point in time of a future spider SRLI encompassing 200 species distributed across the world. New information A sample of 200 species of spiders were randomly selected from the World Spider Catalogue, an updated global database containing all recognised species names for the group. The 200 selected species where divided taxonomically at the family level and the familes were ordered alphabetically. In this publication, we present the conservation profiles of 46 species belonging to the famillies alphabetically arranged between Agelenidae and Filistatidae, which encompassed Agelenidae, Amaurobiidae, Anyphaenidae, Araneidae, Archaeidae, Barychelidae, Clubionidae, Corinnidae, Ctenidae, Ctenizidae, Cyatholipidae, Dictynidae, Dysderidae, Eresidae and Filistatidae. PMID:29725239

  18. Self-reference and random sampling approach for label-free identification of DNA composition using plasmonic nanomaterials.

    PubMed

    Freeman, Lindsay M; Pang, Lin; Fainman, Yeshaiahu

    2018-05-09

    The analysis of DNA has led to revolutionary advancements in the fields of medical diagnostics, genomics, prenatal screening, and forensic science, with the global DNA testing market expected to reach revenues of USD 10.04 billion per year by 2020. However, the current methods for DNA analysis remain dependent on the necessity for fluorophores or conjugated proteins, leading to high costs associated with consumable materials and manual labor. Here, we demonstrate a potential label-free DNA composition detection method using surface-enhanced Raman spectroscopy (SERS) in which we identify the composition of cytosine and adenine within single strands of DNA. This approach depends on the fact that there is one phosphate backbone per nucleotide, which we use as a reference to compensate for systematic measurement variations. We utilize plasmonic nanomaterials with random Raman sampling to perform label-free detection of the nucleotide composition within DNA strands, generating a calibration curve from standard samples of DNA and demonstrating the capability of resolving the nucleotide composition. The work represents an innovative way for detection of the DNA composition within DNA strands without the necessity of attached labels, offering a highly sensitive and reproducible method that factors in random sampling to minimize error.

  19. Quantifying errors without random sampling.

    PubMed

    Phillips, Carl V; LaPole, Luwanna M

    2003-06-12

    All quantifications of mortality, morbidity, and other health measures involve numerous sources of error. The routine quantification of random sampling error makes it easy to forget that other sources of error can and should be quantified. When a quantification does not involve sampling, error is almost never quantified and results are often reported in ways that dramatically overstate their precision. We argue that the precision implicit in typical reporting is problematic and sketch methods for quantifying the various sources of error, building up from simple examples that can be solved analytically to more complex cases. There are straightforward ways to partially quantify the uncertainty surrounding a parameter that is not characterized by random sampling, such as limiting reported significant figures. We present simple methods for doing such quantifications, and for incorporating them into calculations. More complicated methods become necessary when multiple sources of uncertainty must be combined. We demonstrate that Monte Carlo simulation, using available software, can estimate the uncertainty resulting from complicated calculations with many sources of uncertainty. We apply the method to the current estimate of the annual incidence of foodborne illness in the United States. Quantifying uncertainty from systematic errors is practical. Reporting this uncertainty would more honestly represent study results, help show the probability that estimated values fall within some critical range, and facilitate better targeting of further research.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    V Yashchuk; R Conley; E Anderson

    Verification of the reliability of metrology data from high quality X-ray optics requires that adequate methods for test and calibration of the instruments be developed. For such verification for optical surface profilometers in the spatial frequency domain, a modulation transfer function (MTF) calibration method based on binary pseudo-random (BPR) gratings and arrays has been suggested [1] and [2] and proven to be an effective calibration method for a number of interferometric microscopes, a phase shifting Fizeau interferometer, and a scatterometer [5]. Here we describe the details of development of binary pseudo-random multilayer (BPRML) test samples suitable for characterization of scanningmore » (SEM) and transmission (TEM) electron microscopes. We discuss the results of TEM measurements with the BPRML test samples fabricated from a WiSi2/Si multilayer coating with pseudo-randomly distributed layers. In particular, we demonstrate that significant information about the metrological reliability of the TEM measurements can be extracted even when the fundamental frequency of the BPRML sample is smaller than the Nyquist frequency of the measurements. The measurements demonstrate a number of problems related to the interpretation of the SEM and TEM data. Note that similar BPRML test samples can be used to characterize X-ray microscopes. Corresponding work with X-ray microscopes is in progress.« less

  1. Molecular analysis of the microbial diversity present in the colonic wall, colonic lumen, and cecal lumen of a pig.

    PubMed

    Pryde, S E; Richardson, A J; Stewart, C S; Flint, H J

    1999-12-01

    Random clones of 16S ribosomal DNA gene sequences were isolated after PCR amplification with eubacterial primers from total genomic DNA recovered from samples of the colonic lumen, colonic wall, and cecal lumen from a pig. Sequences were also obtained for cultures isolated anaerobically from the same colonic-wall sample. Phylogenetic analysis showed that many sequences were related to those of Lactobacillus or Streptococcus spp. or fell into clusters IX, XIVa, and XI of gram-positive bacteria. In addition, 59% of randomly cloned sequences showed less than 95% similarity to database entries or sequences from cultivated organisms. Cultivation bias is also suggested by the fact that the majority of isolates (54%) recovered from the colon wall by culturing were related to Lactobacillus and Streptococcus, whereas this group accounted for only one-third of the sequence variation for the same sample from random cloning. The remaining cultured isolates were mainly Selenomonas related. A higher proportion of Lactobacillus reuteri-related sequences than of Lactobacillus acidophilus- and Lactobacillus amylovorus-related sequences were present in the colonic-wall sample. Since the majority of bacterial ribosomal sequences recovered from the colon wall are less than 95% related to known organisms, the roles of many of the predominant wall-associated bacteria remain to be defined.

  2. Molecular Analysis of the Microbial Diversity Present in the Colonic Wall, Colonic Lumen, and Cecal Lumen of a Pig

    PubMed Central

    Pryde, Susan E.; Richardson, Anthony J.; Stewart, Colin S.; Flint, Harry J.

    1999-01-01

    Random clones of 16S ribosomal DNA gene sequences were isolated after PCR amplification with eubacterial primers from total genomic DNA recovered from samples of the colonic lumen, colonic wall, and cecal lumen from a pig. Sequences were also obtained for cultures isolated anaerobically from the same colonic-wall sample. Phylogenetic analysis showed that many sequences were related to those of Lactobacillus or Streptococcus spp. or fell into clusters IX, XIVa, and XI of gram-positive bacteria. In addition, 59% of randomly cloned sequences showed less than 95% similarity to database entries or sequences from cultivated organisms. Cultivation bias is also suggested by the fact that the majority of isolates (54%) recovered from the colon wall by culturing were related to Lactobacillus and Streptococcus, whereas this group accounted for only one-third of the sequence variation for the same sample from random cloning. The remaining cultured isolates were mainly Selenomonas related. A higher proportion of Lactobacillus reuteri-related sequences than of Lactobacillus acidophilus- and Lactobacillus amylovorus-related sequences were present in the colonic-wall sample. Since the majority of bacterial ribosomal sequences recovered from the colon wall are less than 95% related to known organisms, the roles of many of the predominant wall-associated bacteria remain to be defined. PMID:10583991

  3. Species conservation profiles of a random sample of world spiders I: Agelenidae to Filistatidae.

    PubMed

    Seppälä, Sini; Henriques, Sérgio; Draney, Michael L; Foord, Stefan; Gibbons, Alastair T; Gomez, Luz A; Kariko, Sarah; Malumbres-Olarte, Jagoba; Milne, Marc; Vink, Cor J; Cardoso, Pedro

    2018-01-01

    The IUCN Red List of Threatened Species is the most widely used information source on the extinction risk of species. One of the uses of the Red List is to evaluate and monitor the state of biodiversity and a possible approach for this purpose is the Red List Index (RLI). For many taxa, mainly hyperdiverse groups, it is not possible within available resources to assess all known species. In such cases, a random sample of species might be selected for assessment and the results derived from it extrapolated for the entire group - the Sampled Red List Index (SRLI). With the current contribution and the three following papers, we intend to create the first point in time of a future spider SRLI encompassing 200 species distributed across the world. A sample of 200 species of spiders were randomly selected from the World Spider Catalogue, an updated global database containing all recognised species names for the group. The 200 selected species where divided taxonomically at the family level and the familes were ordered alphabetically. In this publication, we present the conservation profiles of 46 species belonging to the famillies alphabetically arranged between Agelenidae and Filistatidae, which encompassed Agelenidae, Amaurobiidae, Anyphaenidae, Araneidae, Archaeidae, Barychelidae, Clubionidae, Corinnidae, Ctenidae, Ctenizidae, Cyatholipidae, Dictynidae, Dysderidae, Eresidae and Filistatidae.

  4. Randomized comparison of 3 different-sized biopsy forceps for quality of sampling in Barrett’s esophagus

    PubMed Central

    Gonzalez, Susana; Yu, Woojin M.; Smith, Michael S.; Slack, Kristen N.; Rotterdam, Heidrun; Abrams, Julian A.; Lightdale, Charles J.

    2011-01-01

    Background Several types of forceps are available for use in sampling Barrett’s esophagus (BE). Few data exist with regard to biopsy quality for histologic assessment. Objective To evaluate sampling quality of 3 different forceps in patients with BE. Design Single-center, randomized clinical trial. Patients Consecutive patients with BE undergoing upper endoscopy. Interventions Patients randomized to have biopsy specimens taken with 1 of 3 types of forceps: standard, large capacity, or jumbo. Main Outcome Measurements Specimen adequacy was defined a priori as a well-oriented biopsy sample 2 mm or greater in diameter and with at least muscularis mucosa present. Results A total of 65 patients were enrolled and analyzed (standard forceps, n = 21; large-capacity forceps, n = 21; jumbo forceps, n = 23). Compared with jumbo forceps, a significantly higher proportion of biopsy samples with large-capacity forceps were adequate (37.8% vs 25.2%, P = .002). Of the standard forceps biopsy samples, 31.9% were adequate, which was not significantly different from specimens taken with large-capacity (P = .20) or jumbo (P = .09) forceps. Biopsy specimens taken with jumbo forceps had the largest diameter (median, 3.0 mm vs 2.5 mm [standard] vs 2.8 mm [large capacity]; P = .0001). However, jumbo forceps had the lowest proportion of specimens that were well oriented (overall P = .001). Limitations Heterogeneous patient population precluded dysplasia detection analyses. Conclusions Our results challenge the requirement of jumbo forceps and therapeutic endoscopes to properly perform the Seattle protocol. We found that standard and large-capacity forceps used with standard upper endoscopes produced biopsy samples at least as adequate as those obtained with jumbo forceps and therapeutic endoscopes in patients with BE. PMID:21034895

  5. Is Knowledge Random? Introducing Sampling and Bias through Outdoor Inquiry

    ERIC Educational Resources Information Center

    Stier, Sam

    2010-01-01

    Sampling, very generally, is the process of learning about something by selecting and assessing representative parts of that population or object. In the inquiry activity described here, students learned about sampling techniques as they estimated the number of trees greater than 12 cm dbh (diameter at breast height) in a wooded, discrete area…

  6. Investigating Test Equating Methods in Small Samples through Various Factors

    ERIC Educational Resources Information Center

    Asiret, Semih; Sünbül, Seçil Ömür

    2016-01-01

    In this study, equating methods for random group design using small samples through factors such as sample size, difference in difficulty between forms, and guessing parameter was aimed for comparison. Moreover, which method gives better results under which conditions was also investigated. In this study, 5,000 dichotomous simulated data…

  7. Rational Variability in Children's Causal Inferences: The Sampling Hypothesis

    ERIC Educational Resources Information Center

    Denison, Stephanie; Bonawitz, Elizabeth; Gopnik, Alison; Griffiths, Thomas L.

    2013-01-01

    We present a proposal--"The Sampling Hypothesis"--suggesting that the variability in young children's responses may be part of a rational strategy for inductive inference. In particular, we argue that young learners may be randomly sampling from the set of possible hypotheses that explain the observed data, producing different hypotheses with…

  8. Convenience Samples and Caregiving Research: How Generalizable Are the Findings?

    ERIC Educational Resources Information Center

    Pruchno, Rachel A.; Brill, Jonathan E.; Shands, Yvonne; Gordon, Judith R.; Genderson, Maureen Wilson; Rose, Miriam; Cartwright, Francine

    2008-01-01

    Purpose: We contrast characteristics of respondents recruited using convenience strategies with those of respondents recruited by random digit dial (RDD) methods. We compare sample variances, means, and interrelationships among variables generated from the convenience and RDD samples. Design and Methods: Women aged 50 to 64 who work full time and…

  9. The Recruitment and Retention of People with Disabilities. Report 301.

    ERIC Educational Resources Information Center

    Dench, S.; And Others

    A British survey of employers examined the recruitment and retention of people with disabilities (PWDs). Telephone interviews were conducted with two samples of employers: a random sample of 1,250 and a sample of 250 registered users of the Employment Service's "Disability Symbol," which sets a good practice standard for the employment…

  10. Evaluation of Students' Perceptions about Efficiency of Educational Club Practices in Primary Schools

    ERIC Educational Resources Information Center

    Gelen, Ismail; Onay, Ihsan; Varol, Volkan

    2014-01-01

    The purpose of this study is to examine the efficiency of "Educational Club Practices" that has been in Elementary School program since 2005-2006, by examining the attitudes of students about "Educational Club Practices". Sample was selected in two steps. First, stratified sampling was employed and then random sampling was…

  11. A MORE COST-EFFECTIVE EMAP-W BENTHIC MACROFAUNAL SAMPLE UNIT

    EPA Science Inventory

    The standard EPA West Coast Environmental Monitoring and Assessment Program (EMAP-W) benthic macrofaunal sampling protocol is to collect 30-50 random benthic samples per reporting unit (e.g., estuary, region) using a 0.1 m2 grab and to sort out macrofauna using a 1.0 mm mesh scre...

  12. 45 CFR 1356.71 - Federal review of the eligibility of children in foster care and the eligibility of foster care...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... by ACF statistical staff from the Adoption and Foster Care Analysis and Reporting System (AFCARS... primary review utilizing probability sampling methodologies. Usually, the chosen methodology will be simple random sampling, but other probability samples may be utilized, when necessary and appropriate. (3...

  13. [Environmental Education Units.

    ERIC Educational Resources Information Center

    Minneapolis Independent School District 275, Minn.

    Two of these three pamphlets describe methods of teaching young elementary school children the principles of sampling. Tiles of five colors are added to a tub and children sample these randomly; using the tiles as units for a graph, they draw a representation of the population. Pooling results leads to a more reliable sample. Practice is given in…

  14. Adaptive detection of noise signal according to Neumann-Pearson criterion

    NASA Astrophysics Data System (ADS)

    Padiryakov, Y. A.

    1985-03-01

    Optimum detection according to the Neumann-Pearson criterion is considered in the case of a random Gaussian noise signal, stationary during measurement, and a stationary random Gaussian background interference. Detection is based on two samples, their statistics characterized by estimates of their spectral densities, it being a priori known that sample A from the signal channel is either the sum of signal and interference or interference alone and sample B from the reference interference channel is an interference with the same spectral density as that of the interference in sample A for both hypotheses. The probability of correct detection is maximized on the average, first in the 2N-dimensional space of signal spectral density and interference spectral density readings, by fixing the probability of false alarm at each point so as to stabilize it at a constant level against variation of the interference spectral density. Deterministic decision rules are established. The algorithm is then reduced to equivalent detection in the N-dimensional space of the ratio of sample A readings to sample B readings.

  15. Probability of coincidental similarity among the orbits of small bodies - I. Pairing

    NASA Astrophysics Data System (ADS)

    Jopek, Tadeusz Jan; Bronikowska, Małgorzata

    2017-09-01

    Probability of coincidental clustering among orbits of comets, asteroids and meteoroids depends on many factors like: the size of the orbital sample searched for clusters or the size of the identified group, it is different for groups of 2,3,4,… members. Probability of coincidental clustering is assessed by the numerical simulation, therefore, it depends also on the method used for the synthetic orbits generation. We have tested the impact of some of these factors. For a given size of the orbital sample we have assessed probability of random pairing among several orbital populations of different sizes. We have found how these probabilities vary with the size of the orbital samples. Finally, keeping fixed size of the orbital sample we have shown that the probability of random pairing can be significantly different for the orbital samples obtained by different observation techniques. Also for the user convenience we have obtained several formulae which, for given size of the orbital sample can be used to calculate the similarity threshold corresponding to the small value of the probability of coincidental similarity among two orbits.

  16. A New Stratified Sampling Procedure which Decreases Error Estimation of Varroa Mite Number on Sticky Boards.

    PubMed

    Kretzschmar, A; Durand, E; Maisonnasse, A; Vallon, J; Le Conte, Y

    2015-06-01

    A new procedure of stratified sampling is proposed in order to establish an accurate estimation of Varroa destructor populations on sticky bottom boards of the hive. It is based on the spatial sampling theory that recommends using regular grid stratification in the case of spatially structured process. The distribution of varroa mites on sticky board being observed as spatially structured, we designed a sampling scheme based on a regular grid with circles centered on each grid element. This new procedure is then compared with a former method using partially random sampling. Relative error improvements are exposed on the basis of a large sample of simulated sticky boards (n=20,000) which provides a complete range of spatial structures, from a random structure to a highly frame driven structure. The improvement of varroa mite number estimation is then measured by the percentage of counts with an error greater than a given level. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  17. A random effects meta-analysis model with Box-Cox transformation.

    PubMed

    Yamaguchi, Yusuke; Maruo, Kazushi; Partlett, Christopher; Riley, Richard D

    2017-07-19

    In a random effects meta-analysis model, true treatment effects for each study are routinely assumed to follow a normal distribution. However, normality is a restrictive assumption and the misspecification of the random effects distribution may result in a misleading estimate of overall mean for the treatment effect, an inappropriate quantification of heterogeneity across studies and a wrongly symmetric prediction interval. We focus on problems caused by an inappropriate normality assumption of the random effects distribution, and propose a novel random effects meta-analysis model where a Box-Cox transformation is applied to the observed treatment effect estimates. The proposed model aims to normalise an overall distribution of observed treatment effect estimates, which is sum of the within-study sampling distributions and the random effects distribution. When sampling distributions are approximately normal, non-normality in the overall distribution will be mainly due to the random effects distribution, especially when the between-study variation is large relative to the within-study variation. The Box-Cox transformation addresses this flexibly according to the observed departure from normality. We use a Bayesian approach for estimating parameters in the proposed model, and suggest summarising the meta-analysis results by an overall median, an interquartile range and a prediction interval. The model can be applied for any kind of variables once the treatment effect estimate is defined from the variable. A simulation study suggested that when the overall distribution of treatment effect estimates are skewed, the overall mean and conventional I 2 from the normal random effects model could be inappropriate summaries, and the proposed model helped reduce this issue. We illustrated the proposed model using two examples, which revealed some important differences on summary results, heterogeneity measures and prediction intervals from the normal random effects model. The random effects meta-analysis with the Box-Cox transformation may be an important tool for examining robustness of traditional meta-analysis results against skewness on the observed treatment effect estimates. Further critical evaluation of the method is needed.

  18. Unit Costs of Interlibrary Loans and Photocopies at a Regional Medical Library: Preliminary Report *

    PubMed Central

    Spencer, Carol C.

    1970-01-01

    Unit costs of providing interlibrary loans and photocopies were determined by a method not previously used for library cost studies: random time sampling with self-observation. The working time of all appropriate personnel was sampled using Random Alarm Mechanisms and a structured checklist of tasks. The total lender's unit cost per request received, including direct labor, materials, fringe benefits, and overhead, was $1.526 for originals mailed postpaid by lender, and $1.534 for photocopies mailed. Corresponding unit costs per request filled were: originals, $1.932, and photocopies, $1.763. PMID:5439910

  19. Ventilatory Function in Relation to Mining Experience and Smoking in a Random Sample of Miners and Non-miners in a Witwatersrand Town1

    PubMed Central

    Sluis-Cremer, G. K.; Walters, L. G.; Sichel, H. S.

    1967-01-01

    The ventilatory capacity of a random sample of men over the age of 35 years in the town of Carletonville was estimated by the forced expiratory volume and the peak expiratory flow rate. Five hundred and sixty-two persons were working or had worked in gold-mines and 265 had never worked in gold-mines. No difference in ventilatory function was found between the miners and non-miners other than that due to the excess of chronic bronchitis in miners. PMID:6017134

  20. Archaeological Investigations at Sites 45-OK-250 and 45-OK-4, Chief Joseph Dam Project, Washington.

    DTIC Science & Technology

    1984-01-01

    level behind Chief Joseph Dam. Systematic random sampling using I x I x ’ - 0.1-m collection units in 1 x 1, 1 x 2, or 2 x 2 m cel Is disclosed...ft to the operating pool . level behind Chief Joseph Dam. Systematic random sampling using 1 x 1 x 0.1-i collection units In 1 x 1, 1 x 2, or 2 x 2 m...contingencies under which data were collected , describe data collection and analysis, and organize and summarize data In a torm useful to the widest

  1. Method of multiplexed analysis using ion mobility spectrometer

    DOEpatents

    Belov, Mikhail E [Richland, WA; Smith, Richard D [Richland, WA

    2009-06-02

    A method for analyzing analytes from a sample introduced into a Spectrometer by generating a pseudo random sequence of a modulation bins, organizing each modulation bin as a series of submodulation bins, thereby forming an extended pseudo random sequence of submodulation bins, releasing the analytes in a series of analyte packets into a Spectrometer, thereby generating an unknown original ion signal vector, detecting the analytes at a detector, and characterizing the sample using the plurality of analyte signal subvectors. The method is advantageously applied to an Ion Mobility Spectrometer, and an Ion Mobility Spectrometer interfaced with a Time of Flight Mass Spectrometer.

  2. Motivating smokers at outdoor public smoking hotspots to have a quit attempt with a nicotine replacement therapy sample: study protocol for a randomized controlled trial.

    PubMed

    Cheung, Yee Tak Derek; Leung, Jessica Pui Kei; Cheung, Chelsia Ka Ching; Li, William Ho Cheung; Wang, Man Ping; Lam, Tai Hing

    2016-07-26

    About half of the daily smokers in Hong Kong have never tried and have no intention to quit smoking. More than one-third (37.9 %) of daily smokers have attempted to quit but failed. Nicotine replacement therapy (NRT) is a safe and effective pharmacotherapy to increase abstinence by reducing withdrawal symptoms during the early stage of smoking abstinence. However, the prevalence of NRT use in Hong Kong is lower than in most developed countries. The proposed study aims to assess the effectiveness of providing free NRT samples to smokers on increasing quit attempts and the quit rate. Trained university undergraduate students as ambassadors will invite smokers at outdoor public smoking hotspots to participate in the randomized controlled trial, in which eligible smokers will be randomized to receive a 1-week free NRT sample and medication counselling (intervention) or advice to purchase NRT on their own (control). The primary outcome is self-reported quit attempts (no smoking for at least 24 hours) in the past 30 days at 1-month and 3-month telephone follow-up. The findings will inform the effectiveness of delivering free NRT samples at outdoor public smoking hotspots to increase quit attempts and abstinence. The study will also provide information on smokers' adherence to the NRT sample, side effects and safety issues related to the usage. This will improve the design of a large trial to test the effect of the NRT sample. ClinicalTrials.gov NCT02491086 . Registered on 7 July 2015.

  3. Maximum type I error rate inflation from sample size reassessment when investigators are blind to treatment labels.

    PubMed

    Żebrowska, Magdalena; Posch, Martin; Magirr, Dominic

    2016-05-30

    Consider a parallel group trial for the comparison of an experimental treatment to a control, where the second-stage sample size may depend on the blinded primary endpoint data as well as on additional blinded data from a secondary endpoint. For the setting of normally distributed endpoints, we demonstrate that this may lead to an inflation of the type I error rate if the null hypothesis holds for the primary but not the secondary endpoint. We derive upper bounds for the inflation of the type I error rate, both for trials that employ random allocation and for those that use block randomization. We illustrate the worst-case sample size reassessment rule in a case study. For both randomization strategies, the maximum type I error rate increases with the effect size in the secondary endpoint and the correlation between endpoints. The maximum inflation increases with smaller block sizes if information on the block size is used in the reassessment rule. Based on our findings, we do not question the well-established use of blinded sample size reassessment methods with nuisance parameter estimates computed from the blinded interim data of the primary endpoint. However, we demonstrate that the type I error rate control of these methods relies on the application of specific, binding, pre-planned and fully algorithmic sample size reassessment rules and does not extend to general or unplanned sample size adjustments based on blinded data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  4. What Are Probability Surveys used by the National Aquatic Resource Surveys?

    EPA Pesticide Factsheets

    The National Aquatic Resource Surveys (NARS) use probability-survey designs to assess the condition of the nation’s waters. In probability surveys (also known as sample-surveys or statistical surveys), sampling sites are selected randomly.

  5. Sample Size Estimation in Cluster Randomized Educational Trials: An Empirical Bayes Approach

    ERIC Educational Resources Information Center

    Rotondi, Michael A.; Donner, Allan

    2009-01-01

    The educational field has now accumulated an extensive literature reporting on values of the intraclass correlation coefficient, a parameter essential to determining the required size of a planned cluster randomized trial. We propose here a simple simulation-based approach including all relevant information that can facilitate this task. An…

  6. An Efficient Alternative Mixed Randomized Response Procedure

    ERIC Educational Resources Information Center

    Singh, Housila P.; Tarray, Tanveer A.

    2015-01-01

    In this article, we have suggested a new modified mixed randomized response (RR) model and studied its properties. It is shown that the proposed mixed RR model is always more efficient than the Kim and Warde's mixed RR model. The proposed mixed RR model has also been extended to stratified sampling. Numerical illustrations and graphical…

  7. What Does a Random Line Look Like: An Experimental Study

    ERIC Educational Resources Information Center

    Turner, Nigel E.; Liu, Eleanor; Toneatto, Tony

    2011-01-01

    The study examined the perception of random lines by people with gambling problems compared to people without gambling problems. The sample consisted of 67 probable pathological gamblers and 46 people without gambling problems. Participants completed a number of questionnaires about their gambling and were then presented with a series of random…

  8. Adiposity and Quality of Life: A Case Study from an Urban Center in Nigeria

    ERIC Educational Resources Information Center

    Akinpelu, Aderonke O.; Akinola, Odunayo T.; Gbiri, Caleb A.

    2009-01-01

    Objective: To determine relationship between adiposity indices and quality of life (QOL) of residents of a housing estate in Lagos, Nigeria. Design: Cross-sectional survey employing multistep random sampling method. Setting: Urban residential estate. Participants: This study involved 900 randomly selected residents of Abesan Housing Estate, Lagos,…

  9. A Strategy to Use Soft Data Effectively in Randomized Controlled Clinical Trials.

    ERIC Educational Resources Information Center

    Kraemer, Helena Chmura; Thiemann, Sue

    1989-01-01

    Sees soft data, measures having substantial intrasubject variability due to errors of measurement or response inconsistency, as important measures of response in randomized clinical trials. Shows that using intensive design and slope of response on time as outcome measure maximizes sample retention and decreases within-group variability, thus…

  10. Is There a Silent Hearing Loss among Children in Jordan?

    ERIC Educational Resources Information Center

    Alaqrabawi, Wala' S.; Alshawabka, Amneh Z.; Al-Addasi, Zainab M.

    2016-01-01

    This study measured the prevalence of hearing loss among school children in Jordan. A random sample of 1649 children (990 males and 659 females) was collected from randomly chosen 40 schools in Amman. Screening was conducted between November 2010 and October 2014. Otoscopic examination, tympanometry, and audiometry were used for screening. Based…

  11. The Role of Religiosity in Influencing Adolescent and Adult Alcohol Use in Trinidad

    ERIC Educational Resources Information Center

    Rollocks, Steve C. T.; Dass, Natasha; Seepersad, Randy; Mohammed, Linda

    2008-01-01

    This study examined the role of religiosity among adolescents' and adults' alcohol use in Trinidad. A stratified random sample design of 369 adolescents and 210 adult parents belonging to the various religious groups in Trinidad was employed. Participants were randomly selected from various educational districts across Trinidad. Adolescent…

  12. Interpretation Training in Individuals with Generalized Social Anxiety Disorder: A Randomized Controlled Trial

    ERIC Educational Resources Information Center

    Amir, Nader; Taylor, Charles T.

    2012-01-01

    Objective: To examine the efficacy of a multisession computerized interpretation modification program (IMP) in the treatment of generalized social anxiety disorder (GSAD). Method: The sample comprised 49 individuals meeting diagnostic criteria for GSAD who were enrolled in a randomized, double-blind placebo-controlled trial comparing IMP (n = 23)…

  13. Teachers' Attitude towards Implementation of Learner-Centered Methodology in Science Education in Kenya

    ERIC Educational Resources Information Center

    Ndirangu, Caroline

    2017-01-01

    This study aims to evaluate teachers' attitude towards implementation of learner-centered methodology in science education in Kenya. The study used a survey design methodology, adopting the purposive, stratified random and simple random sampling procedures and hypothesised that there was no significant relationship between the head teachers'…

  14. Disk Density Tuning of a Maximal Random Packing

    PubMed Central

    Ebeida, Mohamed S.; Rushdi, Ahmad A.; Awad, Muhammad A.; Mahmoud, Ahmed H.; Yan, Dong-Ming; English, Shawn A.; Owens, John D.; Bajaj, Chandrajit L.; Mitchell, Scott A.

    2016-01-01

    We introduce an algorithmic framework for tuning the spatial density of disks in a maximal random packing, without changing the sizing function or radii of disks. Starting from any maximal random packing such as a Maximal Poisson-disk Sampling (MPS), we iteratively relocate, inject (add), or eject (remove) disks, using a set of three successively more-aggressive local operations. We may achieve a user-defined density, either more dense or more sparse, almost up to the theoretical structured limits. The tuned samples are conflict-free, retain coverage maximality, and, except in the extremes, retain the blue noise randomness properties of the input. We change the density of the packing one disk at a time, maintaining the minimum disk separation distance and the maximum domain coverage distance required of any maximal packing. These properties are local, and we can handle spatially-varying sizing functions. Using fewer points to satisfy a sizing function improves the efficiency of some applications. We apply the framework to improve the quality of meshes, removing non-obtuse angles; and to more accurately model fiber reinforced polymers for elastic and failure simulations. PMID:27563162

  15. Disk Density Tuning of a Maximal Random Packing.

    PubMed

    Ebeida, Mohamed S; Rushdi, Ahmad A; Awad, Muhammad A; Mahmoud, Ahmed H; Yan, Dong-Ming; English, Shawn A; Owens, John D; Bajaj, Chandrajit L; Mitchell, Scott A

    2016-08-01

    We introduce an algorithmic framework for tuning the spatial density of disks in a maximal random packing, without changing the sizing function or radii of disks. Starting from any maximal random packing such as a Maximal Poisson-disk Sampling (MPS), we iteratively relocate, inject (add), or eject (remove) disks, using a set of three successively more-aggressive local operations. We may achieve a user-defined density, either more dense or more sparse, almost up to the theoretical structured limits. The tuned samples are conflict-free, retain coverage maximality, and, except in the extremes, retain the blue noise randomness properties of the input. We change the density of the packing one disk at a time, maintaining the minimum disk separation distance and the maximum domain coverage distance required of any maximal packing. These properties are local, and we can handle spatially-varying sizing functions. Using fewer points to satisfy a sizing function improves the efficiency of some applications. We apply the framework to improve the quality of meshes, removing non-obtuse angles; and to more accurately model fiber reinforced polymers for elastic and failure simulations.

  16. The unbiasedness of a generalized mirage boundary correction method for Monte Carlo integration estimators of volume

    Treesearch

    Thomas B. Lynch; Jeffrey H. Gove

    2014-01-01

    The typical "double counting" application of the mirage method of boundary correction cannot be applied to sampling systems such as critical height sampling (CHS) that are based on a Monte Carlo sample of a tree (or debris) attribute because the critical height (or other random attribute) sampled from a mirage point is generally not equal to the critical...

  17. Event-triggered synchronization for reaction-diffusion complex networks via random sampling

    NASA Astrophysics Data System (ADS)

    Dong, Tao; Wang, Aijuan; Zhu, Huiyun; Liao, Xiaofeng

    2018-04-01

    In this paper, the synchronization problem of the reaction-diffusion complex networks (RDCNs) with Dirichlet boundary conditions is considered, where the data is sampled randomly. An event-triggered controller based on the sampled data is proposed, which can reduce the number of controller and the communication load. Under this strategy, the synchronization problem of the diffusion complex network is equivalently converted to the stability of a of reaction-diffusion complex dynamical systems with time delay. By using the matrix inequality technique and Lyapunov method, the synchronization conditions of the RDCNs are derived, which are dependent on the diffusion term. Moreover, it is found the proposed control strategy can get rid of the Zeno behavior naturally. Finally, a numerical example is given to verify the obtained results.

  18. Linear discriminant analysis with misallocation in training samples

    NASA Technical Reports Server (NTRS)

    Chhikara, R. (Principal Investigator); Mckeon, J.

    1982-01-01

    Linear discriminant analysis for a two-class case is studied in the presence of misallocation in training samples. A general appraoch to modeling of mislocation is formulated, and the mean vectors and covariance matrices of the mixture distributions are derived. The asymptotic distribution of the discriminant boundary is obtained and the asymptotic first two moments of the two types of error rate given. Certain numerical results for the error rates are presented by considering the random and two non-random misallocation models. It is shown that when the allocation procedure for training samples is objectively formulated, the effect of misallocation on the error rates of the Bayes linear discriminant rule can almost be eliminated. If, however, this is not possible, the use of Fisher rule may be preferred over the Bayes rule.

  19. Performance of the Fecal Immunochemical Test for Colorectal Cancer Screening Using Different Stool-Collection Devices: Preliminary Results from a Randomized Controlled Trial.

    PubMed

    Shin, Hye Young; Suh, Mina; Baik, Hyung Won; Choi, Kui Son; Park, Boyoung; Jun, Jae Kwan; Hwang, Sang-Hyun; Kim, Byung Chang; Lee, Chan Wha; Oh, Jae Hwan; Lee, You Kyoung; Han, Dong Soo; Lee, Do-Hoon

    2016-11-15

    We are in the process of conducting a randomized trial to determine whether compliance with the fecal immunochemical test (FIT) for colorectal cancer screening differs according to the stool-collection method. This study was an interim analysis of the performance of two stool-collection devices (sampling bottle vs conventional container). In total, 1,701 individuals (age range, 50 to 74 years) were randomized into the sampling bottle group (intervention arm) or the conventional container group (control arm). In both groups, we evaluated the FIT positivity rate, the positive predictive value for advanced neoplasia, and the detection rate for advanced neoplasia. The FIT positivity rates were 4.1% for the sampling bottles and 2.0% for the conventional containers; these values were significantly different. The positive predictive values for advanced neoplasia in the sampling bottles and conventional containers were 11.1% (95% confidence interval [CI], -3.4 to 25.6) and 12.0% (95% CI, -0.7 to 24.7), respectively. The detection rates for advanced neoplasia in the sampling bottles and conventional containers were 4.5 per 1,000 persons (95% CI, 2.0 to 11.0) and 2.4 per 1,000 persons (95% CI, 0.0 to 5.0), respectively. The impact of these findings on FIT screening performance was unclear in this interim analysis. This impact should therefore be evaluated in the final analysis following the final enrollment period.

  20. A model-based 'varimax' sampling strategy for a heterogeneous population.

    PubMed

    Akram, Nuzhat A; Farooqi, Shakeel R

    2014-01-01

    Sampling strategies are planned to enhance the homogeneity of a sample, hence to minimize confounding errors. A sampling strategy was developed to minimize the variation within population groups. Karachi, the largest urban agglomeration in Pakistan, was used as a model population. Blood groups ABO and Rh factor were determined for 3000 unrelated individuals selected through simple random sampling. Among them five population groups, namely Balochi, Muhajir, Pathan, Punjabi and Sindhi, based on paternal ethnicity were identified. An index was designed to measure the proportion of admixture at parental and grandparental levels. Population models based on index score were proposed. For validation, 175 individuals selected through stratified random sampling were genotyped for the three STR loci CSF1PO, TPOX and TH01. ANOVA showed significant differences across the population groups for blood groups and STR loci distribution. Gene diversity was higher across the sub-population model than in the agglomerated population. At parental level gene diversities are significantly higher across No admixture models than Admixture models. At grandparental level the difference was not significant. A sub-population model with no admixture at parental level was justified for sampling the heterogeneous population of Karachi.

Top