Fourcade, Yoan; Engler, Jan O.; Rödder, Dennis; Secondi, Jean
2014-01-01
MAXENT is now a common species distribution modeling (SDM) tool used by conservation practitioners for predicting the distribution of a species from a set of records and environmental predictors. However, datasets of species occurrence used to train the model are often biased in geographical space because of unequal sampling effort across the study area. This bias may be a source of strong inaccuracy in the resulting model and could lead to incorrect predictions. Although a number of sampling bias correction methods have been proposed, there is no consensus guideline for accounting for it. Here we compared the performance of five methods of bias correction on three datasets of species occurrence: one "virtual" dataset derived from a land cover map, and two actual datasets for a turtle (Chrysemys picta) and a salamander (Plethodon cylindraceus). We subjected these datasets to four types of sampling bias corresponding to potential types of empirical bias. We applied five correction methods to the biased samples and compared the outputs of distribution models to unbiased datasets to assess the overall correction performance of each method. The results revealed that the ability of methods to correct the initial sampling bias varied greatly depending on bias type, bias intensity and species. However, simple systematic sampling of records consistently ranked among the best-performing methods across the range of conditions tested, whereas the other methods performed more poorly in most cases. The strong effect of initial conditions on correction performance highlights the need for further research to develop a step-by-step guideline for accounting for sampling bias. Nevertheless, systematic sampling appears to be the most efficient correction and should be advised in most cases.
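The winning correction above, systematic sampling of records, is commonly implemented as spatial thinning on a regular grid: keep at most one record per cell so that densely surveyed areas no longer dominate training. A minimal sketch follows; the cell size, coordinate format, and random tie-breaking are illustrative assumptions, not specifics of the study.

```python
import numpy as np

def systematic_sample(lon, lat, cell_size=0.5, rng=None):
    """Thin occurrence records to at most one per grid cell.

    lon, lat : 1D arrays of record coordinates (decimal degrees).
    cell_size: grid resolution in degrees (illustrative default).
    Returns sorted indices of the retained records.
    """
    rng = np.random.default_rng() if rng is None else rng
    ix = np.floor(np.asarray(lon) / cell_size).astype(int)
    iy = np.floor(np.asarray(lat) / cell_size).astype(int)
    kept = {}
    order = rng.permutation(len(ix))       # pick a random record per cell
    for i in order:
        kept.setdefault((ix[i], iy[i]), i)
    return np.sort(np.fromiter(kept.values(), dtype=int))

# Example: 1000 clustered records collapse to one per 0.5-degree cell
lon = np.random.default_rng(1).normal(10, 2, 1000)
lat = np.random.default_rng(2).normal(45, 2, 1000)
idx = systematic_sample(lon, lat)
print(len(idx), "records kept for MAXENT training")
```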
Jeffrey H. Gove
2003-01-01
Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a...
Bannerman, J A; Costamagna, A C; McCornack, B P; Ragsdale, D W
2015-06-01
Generalist natural enemies play an important role in controlling soybean aphid, Aphis glycines (Hemiptera: Aphididae), in North America. Several sampling methods are used to monitor natural enemy populations in soybean, but there has been little work investigating their relative bias, precision, and efficiency. We compare five sampling methods: quadrats, whole-plant counts, sweep-netting, walking transects, and yellow sticky cards to determine the most practical methods for sampling the three most prominent species, which included Harmonia axyridis (Pallas), Coccinella septempunctata L. (Coleoptera: Coccinellidae), and Orius insidiosus (Say) (Hemiptera: Anthocoridae). We show an important time by sampling method interaction indicated by diverging community similarities within and between sampling methods as the growing season progressed. Similarly, correlations between sampling methods for the three most abundant species over multiple time periods indicated differences in relative bias between sampling methods and suggests that bias is not consistent throughout the growing season, particularly for sticky cards and whole-plant samples. Furthermore, we show that sticky cards produce strongly biased capture rates relative to the other four sampling methods. Precision and efficiency differed between sampling methods and sticky cards produced the most precise (but highly biased) results for adult natural enemies, while walking transects and whole-plant counts were the most efficient methods for detecting coccinellids and O. insidiosus, respectively. Based on bias, precision, and efficiency considerations, the most practical sampling methods for monitoring in soybean include walking transects for coccinellid detection and whole-plant counts for detection of small predators like O. insidiosus. Sweep-netting and quadrat samples are also useful for some applications, when efficiency is not paramount.
Efficient global biopolymer sampling with end-transfer configurational bias Monte Carlo
NASA Astrophysics Data System (ADS)
Arya, Gaurav; Schlick, Tamar
2007-01-01
We develop an "end-transfer configurational bias Monte Carlo" method for efficient thermodynamic sampling of complex biopolymers and assess its performance on a mesoscale model of chromatin (oligonucleosome) at different salt conditions compared to other Monte Carlo moves. Our method extends traditional configurational bias by deleting a repeating motif (monomer) from one end of the biopolymer and regrowing it at the opposite end using the standard Rosenbluth scheme. The method's sampling efficiency compared to local moves, pivot rotations, and standard configurational bias is assessed by parameters relating to translational, rotational, and internal degrees of freedom of the oligonucleosome. Our results show that the end-transfer method is superior in sampling every degree of freedom of the oligonucleosomes over other methods at high salt concentrations (weak electrostatics) but worse than the pivot rotations in terms of sampling internal and rotational sampling at low-to-moderate salt concentrations (strong electrostatics). Under all conditions investigated, however, the end-transfer method is several orders of magnitude more efficient than the standard configurational bias approach. This is because the characteristic sampling time of the innermost oligonucleosome motif scales quadratically with the length of the oligonucleosomes for the end-transfer method while it scales exponentially for the traditional configurational-bias method. Thus, the method we propose can significantly improve performance for global biomolecular applications, especially in condensed systems with weak nonbonded interactions and may be combined with local enhancements to improve local sampling.
Neural Network and Nearest Neighbor Algorithms for Enhancing Sampling of Molecular Dynamics.
Galvelis, Raimondas; Sugita, Yuji
2017-06-13
Free energy calculations of complex chemical and biological systems with molecular dynamics (MD) are inefficient due to multiple local minima separated by high-energy barriers. The minima can be escaped using an enhanced sampling method such as metadynamics, which applies a bias (i.e., importance sampling) along a set of collective variables (CVs), but the maximum number of CVs (or dimensions) is severely limited. We propose a high-dimensional bias potential method (NN2B) based on two machine learning algorithms: the nearest neighbor density estimator (NNDE) and the artificial neural network (ANN) for the bias potential approximation. The bias potential is constructed iteratively from short biased MD simulations, accounting for correlations among CVs. Our method is capable of achieving ergodic sampling and calculating the free energy of polypeptides with up to an 8-dimensional bias potential.
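The two ingredients above can be sketched schematically: a k-nearest-neighbor density estimate of the visited CV values defines a bias that pushes the walker out of well-sampled regions, and a small neural network fits that bias as a smooth function. The snippet assumes a 2D CV space, kT = 1, and a single construction iteration; the hyperparameters and the exact update rule are illustrative guesses, not the published NN2B recipe.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.neural_network import MLPRegressor

kT, k = 1.0, 32
cv = np.random.default_rng(0).normal(size=(5000, 2))  # CVs from a short biased MD run

# Nearest-neighbor density estimate in 2D: rho ~ k / (n * pi * r_k^2)
nn = NearestNeighbors(n_neighbors=k).fit(cv)
r_k = nn.kneighbors(cv)[0][:, -1]          # distance to the k-th neighbor
rho = k / (len(cv) * np.pi * r_k**2)

# Bias update: raise the potential where sampling has been dense
v_bias = kT * np.log(rho + 1e-12)

# Fit an ANN so the bias (and its gradient) is smooth and defined everywhere
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(cv, v_bias)
print("bias at CV origin:", net.predict([[0.0, 0.0]])[0])
```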
Nelson, Jennifer Clark; Marsh, Tracey; Lumley, Thomas; Larson, Eric B; Jackson, Lisa A; Jackson, Michael L
2013-08-01
Estimates of treatment effectiveness in epidemiologic studies using large observational health care databases may be biased owing to inaccurate or incomplete information on important confounders. Study methods that collect and incorporate more comprehensive confounder data on a validation cohort may reduce confounding bias. We applied two such methods, namely imputation and reweighting, to Group Health administrative data (full sample) supplemented by more detailed confounder data from the Adult Changes in Thought study (validation sample). We used influenza vaccination effectiveness (with an unexposed comparator group) as an example and evaluated each method's ability to reduce bias using the control time period before influenza circulation. Both methods reduced, but did not completely eliminate, the bias compared with traditional effectiveness estimates that do not use the validation sample confounders. Although these results support the use of validation sampling methods to improve the accuracy of comparative effectiveness findings from health care database studies, they also illustrate that the success of such methods depends on many factors, including the ability to measure important confounders in a representative and large enough validation sample, the comparability of the full sample and validation sample, and the accuracy with which the data can be imputed or reweighted using the additional validation sample information.
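A hedged sketch of the reweighting idea: validation members are weighted by the inverse of their (here, known) inclusion probabilities so that a confounder-complete model fitted on them represents the full cohort. All variable names, effect sizes, and the age-dependent sampling scheme are fabricated for illustration; frequency weights stand in for inverse-probability weights, which is adequate for the point estimate.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 20000
frailty = rng.binomial(1, 0.3, n)          # confounder known only in the validation cohort
age = rng.normal(75, 6, n)
p_vax = 1 / (1 + np.exp(-(-0.5 - 1.5 * frailty + 0.02 * (age - 75))))
vax = rng.binomial(1, p_vax)               # frail subjects are vaccinated less often
p_death = 1 / (1 + np.exp(-(-3.0 + 2.0 * frailty)))
death = rng.binomial(1, p_death)           # true vaccine effect is null (OR = 1)

p_incl = np.where(age > 75, 0.15, 0.05)    # age-dependent validation sampling
in_val = rng.random(n) < p_incl
w = 1.0 / p_incl                           # inverse inclusion probabilities

# Naive full-sample model (frailty unavailable) vs. reweighted validation model
naive = sm.GLM(death, sm.add_constant(np.column_stack([vax, age])),
               family=sm.families.Binomial()).fit()
rew = sm.GLM(death[in_val],
             sm.add_constant(np.column_stack([vax, age, frailty])[in_val]),
             family=sm.families.Binomial(), freq_weights=w[in_val]).fit()
print("naive OR:", np.exp(naive.params[1]), "reweighted OR:", np.exp(rew.params[1]))
```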
Kazmerski, Lawrence L.
1990-01-01
A method and apparatus for differential spectroscopic atomic imaging is disclosed that spatially resolves and images for display not only individual atoms on a sample surface but also their bonds and the specific atomic species in each bond. The apparatus includes a scanning tunneling microscope (STM) that is modified to include photon biasing, preferably by a tuneable laser; modulated electronic surface biasing for the sample; and temperature biasing, preferably by a vibration-free refrigerated sample mounting stage. Computer control, data processing, and visual display components are also included. The method includes modulating the electronic bias voltage, with and without selected photon wavelength and frequency biasing, under a stabilizing (usually cold) bias temperature to detect bonds and the specific atomic species in those bonds as the STM rasters the sample. These data are processed along with the atomic spatial topography data obtained from the STM raster scan to create a real-time visual image of the atoms on the sample surface.
Bias Assessment of General Chemistry Analytes using Commutable Samples.
Koerbin, Gus; Tate, Jillian R; Ryan, Julie; Jones, Graham Rd; Sikaris, Ken A; Kanowski, David; Reed, Maxine; Gill, Janice; Koumantakis, George; Yen, Tina; St John, Andrew; Hickman, Peter E; Simpson, Aaron; Graham, Peter
2014-11-01
Harmonisation of reference intervals for routine general chemistry analytes has been a goal for many years. Analytical bias may prevent this harmonisation. To determine whether analytical bias is present when comparing methods, commutable samples (samples that have the same properties as the clinical samples routinely analysed) should be used as reference samples to eliminate the possibility of matrix effects. The use of commutable samples has improved the identification of unacceptable analytical performance in the Netherlands and Spain. The International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) has undertaken a pilot study using commutable samples in an attempt not only to determine country-specific reference intervals but to make them comparable between countries. Australia and New Zealand, through the Australasian Association of Clinical Biochemists (AACB), have also undertaken an assessment of analytical bias using commutable samples and determined that, of the 27 general chemistry analytes studied, 19 showed between-method biases sufficiently small as not to prevent harmonisation of reference intervals. Evidence-based approaches, including the determination of analytical bias using commutable material, are necessary when seeking to harmonise reference intervals.
Roh, Min K; Gillespie, Dan T; Petzold, Linda R
2010-11-07
The weighted stochastic simulation algorithm (wSSA) was developed by Kuwahara and Mura [J. Chem. Phys. 129, 165101 (2008)] to efficiently estimate the probabilities of rare events in discrete stochastic systems. The wSSA uses importance sampling to enhance the statistical accuracy in the estimation of the probability of the rare event. The original algorithm biases the reaction selection step with a fixed importance sampling parameter. In this paper, we introduce a novel method where the biasing parameter is state-dependent. The new method features improved accuracy, efficiency, and robustness.
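The core of the weighted SSA is easy to state in code: the next-reaction choice is made from biased propensities while the trajectory weight absorbs the likelihood ratio. In the state-dependent variant proposed here, the biasing factors gamma are functions of the current state. The sketch below assumes, as in the original wSSA of Kuwahara and Mura, that the time step itself is drawn from the unbiased total propensity.

```python
import numpy as np

def weighted_ssa_step(x, t, propensity, stoich, gamma, w, rng):
    """One step of a weighted SSA (importance sampling of reaction selection).

    propensity(x) -> array a of reaction propensities; gamma(x) -> biasing
    factors (state-dependent, as in the variant described above).
    Returns updated (state, time, weight); w carries the likelihood ratio.
    """
    a = propensity(x)
    a0 = a.sum()
    if a0 == 0.0:
        return x, np.inf, w                # absorbing state: nothing can fire
    tau = rng.exponential(1.0 / a0)        # time advance is left unbiased
    b = a * gamma(x)                       # biased selection propensities
    b0 = b.sum()
    j = rng.choice(len(a), p=b / b0)       # pick a reaction under the bias
    w *= (a[j] / a0) / (b[j] / b0)         # correct the estimator's weight
    return x + stoich[j], t + tau, w
```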
NASA Astrophysics Data System (ADS)
Cao, Youfang; Liang, Jie
2013-07-01
Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape.
Biased Brownian dynamics for rate constant calculation.
Zou, G; Skeel, R D; Subramaniam, S
2000-08-01
An enhanced sampling method, biased Brownian dynamics, is developed for the calculation of diffusion-limited biomolecular association reaction rates with high energy or entropy barriers. Biased Brownian dynamics introduces a biasing force in addition to the electrostatic force between the reactants, and it associates a probability weight with each trajectory. A simulation loses weight when movement is along the biasing force and gains weight when movement is against the biasing force. The sampling of trajectories is then biased, but the sampling is unbiased when the trajectory outcomes are multiplied by their weights. With a suitable choice of the biasing force, more reacted trajectories are sampled. As a consequence, the variance of the estimate is reduced. In our test case, biased Brownian dynamics gives a sevenfold improvement in central processing unit (CPU) time with the choice of a simple centripetal biasing force.
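A minimal 1D sketch of the weighting rule described above: each step is proposed under the biased drift, and the trajectory weight is updated by the ratio of the unbiased to the biased Gaussian transition densities, so movement along the biasing force loses weight and movement against it gains weight. The force functions and parameters are placeholders.

```python
import numpy as np

def biased_bd_step(x, w, force, bias_force, D=1.0, kT=1.0, dt=1e-3, rng=None):
    """One overdamped step with a biasing force and an importance weight.

    The displacement is drawn from the *biased* dynamics; w is multiplied by
    the ratio of unbiased to biased Gaussian step densities, so weighted
    trajectory outcomes remain unbiased estimators.
    """
    rng = np.random.default_rng() if rng is None else rng
    var = 2.0 * D * dt
    mu_unb = x + D / kT * force(x) * dt
    mu_b = x + D / kT * (force(x) + bias_force(x)) * dt
    x_new = mu_b + np.sqrt(var) * rng.standard_normal()
    # log of N(x_new; mu_unb, var) / N(x_new; mu_b, var)
    log_ratio = (-(x_new - mu_unb) ** 2 + (x_new - mu_b) ** 2) / (2.0 * var)
    return x_new, w * np.exp(log_ratio)
```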
Investigating the Stability of Four Methods for Estimating Item Bias.
ERIC Educational Resources Information Center
Perlman, Carole L.; And Others
The reliability of item bias estimates was studied for four methods: (1) the transformed delta method; (2) Shepard's modified delta method; (3) Rasch's one-parameter residual analysis; and (4) the Mantel-Haenszel procedure. Bias statistics were computed for each sample using all methods. Data were from administration of multiple-choice items from…
Trutschel, Diana; Palm, Rebecca; Holle, Bernhard; Simon, Michael
2017-11-01
Because not every scientific question on effectiveness can be answered with randomised controlled trials, research methods that minimise bias in observational studies are required. Two major concerns influence the internal validity of effect estimates: selection bias and clustering. Hence, to reduce the bias of the effect estimates, more sophisticated statistical methods are needed. The aim is to introduce statistical approaches such as propensity score matching and mixed models into representative real-world analysis, and to present their implementation in the statistical software R so that the results can be reproduced. We perform a two-level analytic strategy to address the problems of bias and clustering: (i) generalised models with different abilities to adjust for dependencies are used to analyse binary data and (ii) the genetic matching and covariate adjustment methods are used to adjust for selection bias. Hence, we analyse the data from two population samples, the sample produced by the matching method and the full sample. The different analysis methods in this article produce different results but still point in the same direction. In our example, the estimate of the probability of receiving a case conference is higher in the treatment group than in the control group. Both strategies, genetic matching and covariate adjustment, have their limitations but complement each other to provide the whole picture. The statistical approaches were feasible for reducing bias but were nevertheless limited by the sample used. For each study and obtained sample, the pros and cons of the different methods have to be weighed.
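Although the paper implements the analysis in R, the propensity-score step can be sketched compactly (here in Python, with simple 1-nearest-neighbor matching standing in for the genetic matching the authors use; data and effect sizes are simulated for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))                                   # observed covariates
p_treat = 1 / (1 + np.exp(-(X @ np.array([1.0, 0.5, -0.5]))))
treat = rng.binomial(1, p_treat)                              # non-random selection
p_y = 1 / (1 + np.exp(-(-1.0 + 0.4 * treat + X @ np.array([0.8, 0.3, -0.6]))))
y = rng.binomial(1, p_y)                                      # binary outcome

# 1) Propensity scores from a logistic model of treatment on covariates
ps = LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]

# 2) Match each treated unit to the nearest control on the propensity score
treated = np.where(treat == 1)[0]
controls = np.where(treat == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[controls, None])
matched = controls[nn.kneighbors(ps[treated, None])[1][:, 0]]

# 3) Effect in the matched sample (risk difference among the treated)
print("ATT risk difference:", y[treated].mean() - y[matched].mean())
```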
Adaptive enhanced sampling by force-biasing using neural networks
NASA Astrophysics Data System (ADS)
Guo, Ashley Z.; Sevgen, Emre; Sidky, Hythem; Whitmer, Jonathan K.; Hubbell, Jeffrey A.; de Pablo, Juan J.
2018-04-01
A machine learning assisted method is presented for molecular simulation of systems with rugged free energy landscapes. The method is general and can be combined with other advanced sampling techniques. In the particular implementation proposed here, it is illustrated in the context of an adaptive biasing force approach where, rather than relying on discrete force estimates, one can resort to a self-regularizing artificial neural network to generate continuous, estimated generalized forces. By doing so, the proposed approach addresses several shortcomings common to adaptive biasing force and other algorithms. Specifically, the neural network enables (1) smooth estimates of generalized forces in sparsely sampled regions, (2) force estimates in previously unexplored regions, and (3) continuous force estimates with which to bias the simulation, as opposed to biases generated at specific points of a discrete grid. The usefulness of the method is illustrated with three different examples, chosen to highlight the wide range of applicability of the underlying concepts. In all three cases, the new method is found to enhance considerably the underlying traditional adaptive biasing force approach. The method is also found to provide improvements over previous implementations of neural network assisted algorithms.
Unconstrained Enhanced Sampling for Free Energy Calculations of Biomolecules: A Review
Miao, Yinglong; McCammon, J. Andrew
2016-01-01
Free energy calculations are central to understanding the structure, dynamics and function of biomolecules. Yet insufficient sampling of biomolecular configurations is often regarded as one of the main sources of error. Many enhanced sampling techniques have been developed to address this issue. Notably, enhanced sampling methods based on biasing collective variables (CVs), including the widely used umbrella sampling, adaptive biasing force and metadynamics, have been discussed in a recent excellent review (Abrams and Bussi, Entropy, 2014). Here, we aim to review enhanced sampling methods that do not require predefined system-dependent CVs for biomolecular simulations and as such do not suffer from the hidden energy barrier problem as encountered in the CV-biasing methods. These methods include, but are not limited to, replica exchange/parallel tempering, self-guided molecular/Langevin dynamics, essential energy space random walk and accelerated molecular dynamics. While it is overwhelming to describe all details of each method, we provide a summary of the methods along with the applications and offer our perspectives. We conclude with challenges and prospects of the unconstrained enhanced sampling methods for accurate biomolecular free energy calculations.
Nomura, Yuki; Yamamoto, Kazuo; Hirayama, Tsukasa; Saitoh, Koh
2018-06-01
We developed a novel sample preparation method for transmission electron microscopy (TEM) to suppress superfluous electric fields leaking from biased TEM samples. In this method, a thin TEM sample is first coated with an insulating amorphous aluminum oxide (AlOx) film with a thickness of about 20 nm. Then, the sample is coated with a conductive amorphous carbon film with a thickness of about 10 nm, and the film is grounded. This technique was applied to a model sample of a metal electrode/Li-ion-conductive solid electrolyte/metal electrode for biasing electron holography. We found that AlOx film with a thickness of 10 nm has a large withstand voltage of about 8 V and that the double layers of AlOx and carbon act as a 'nano-shield' that suppresses 99% of the electric fields outside of the sample. We also found an asymmetric potential distribution between the high and low potential electrodes in the biased solid electrolyte, indicating different accumulation behaviors of lithium ions (Li+) and lithium-ion vacancies (VLi-) in the biased solid electrolyte.
Nonlinear vs. linear biasing in Trp-cage folding simulations
NASA Astrophysics Data System (ADS)
Spiwok, Vojtěch; Oborský, Pavel; Pazúriková, Jana; Křenek, Aleš; Králová, Blanka
2015-03-01
Biased simulations have great potential for the study of slow processes, including protein folding. Atomic motions in molecules are nonlinear, which suggests that simulations with enhanced sampling of collective motions traced by nonlinear dimensionality reduction methods may perform better than linear ones. In this study, we compare an unbiased folding simulation of the Trp-cage miniprotein with metadynamics simulations using both linear (principal component analysis) and nonlinear (Isomap) low-dimensional embeddings as collective variables. Folding of the miniprotein was successfully simulated in 200 ns simulations with both linear and nonlinear motion biasing. The folded state was correctly predicted as the free energy minimum in both simulations. We found that the advantage of linear motion biasing is that it can sample a larger conformational space, whereas the advantage of nonlinear motion biasing lies in a slightly better resolution of the resulting free energy surface. In terms of sampling efficiency, both methods are comparable.
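The metadynamics machinery itself is independent of whether the collective variable comes from a linear or a nonlinear embedding. A toy 1D version, with a double well standing in for the folding free energy profile and illustrative hill parameters, shows the essential loop: evolve the CV, periodically deposit a repulsive Gaussian, and read the free energy off the accumulated bias.

```python
import numpy as np

rng = np.random.default_rng(0)
kT, dt = 1.0, 1e-3
height, width, stride = 0.12, 0.2, 300     # Gaussian hill parameters (illustrative)

def free_energy(x):                        # double-well stand-in for a folding profile
    return (x**2 - 1.0)**2

centers = []
def v_bias(x):
    if not centers:
        return 0.0
    c = np.array(centers)
    return float(np.sum(height * np.exp(-(x - c)**2 / (2 * width**2))))

def total_force(x, h=1e-5):                # -d/dx [F(x) + V_bias(x)], numerically
    e = lambda z: free_energy(z) + v_bias(z)
    return -(e(x + h) - e(x - h)) / (2 * h)

x = -1.0
for step in range(60000):                  # overdamped Langevin walk on the CV
    x += total_force(x) * dt + np.sqrt(2 * kT * dt) * rng.standard_normal()
    if step % stride == 0:
        centers.append(x)                  # deposit a repulsive Gaussian hill

# After the wells are filled, F(s) is estimated as -V_bias(s) + const
grid = np.linspace(-1.8, 1.8, 73)
F_est = np.array([-v_bias(s) for s in grid])
print("estimated barrier:", F_est[len(grid) // 2] - F_est.min())
```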
NASA Astrophysics Data System (ADS)
Peter, Emanuel K.
2017-12-01
In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path-variable L based on the unbiased momenta p and displacements dq for the definition of the bias s applied to the system and derive 3 algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method which combines the first 2 methodologies. Through the analysis of the correlations between the bias and the unbiased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and acceleration in the sampling of the phase space. We apply our method to SPC/E water, where we find a conservation of the average water structure. We then use our method to sample dialanine and the folding of Trp-cage, where we find good agreement with simulation data reported in the literature. Finally, we apply our methodologies to the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers, while we speculate that the conformational entropy in particular plays a major role in the formation of the fibril as a rate-limiting factor.
Efficiently estimating salmon escapement uncertainty using systematically sampled data
Reynolds, Joel H.; Woody, Carol Ann; Gove, Nancy E.; Fair, Lowell F.
2007-01-01
Fish escapement is generally monitored using nonreplicated systematic sampling designs (e.g., via visual counts from towers or hydroacoustic counts). These sampling designs support a variety of methods for estimating the variance of the total escapement. Unfortunately, all the methods give biased results, with the magnitude of the bias being determined by the underlying process patterns. Fish escapement commonly exhibits positive autocorrelation and nonlinear patterns, such as diurnal and seasonal patterns. For these patterns, poor choice of variance estimator can needlessly increase the uncertainty managers have to deal with in sustaining fish populations. We illustrate the effect of sampling design and variance estimator choice on variance estimates of total escapement for anadromous salmonids from systematic samples of fish passage. Using simulated tower counts of sockeye salmon Oncorhynchus nerka escapement on the Kvichak River, Alaska, five variance estimators for nonreplicated systematic samples were compared to determine the least biased. Using the least biased variance estimator, four confidence interval estimators were compared for expected coverage and mean interval width. Finally, five systematic sampling designs were compared to determine the design giving the smallest average variance estimate for total annual escapement. For nonreplicated systematic samples of fish escapement, all variance estimators were positively biased. Compared to the other estimators, the least biased estimator reduced bias by, on average, from 12% to 98%. All confidence intervals gave effectively identical results. Replicated systematic sampling designs consistently provided the smallest average estimated variance among those compared.
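The contrast between a naive estimator and a successive-difference estimator for a nonreplicated systematic sample can be sketched as follows. The simulated diurnal passage pattern and expansion factors are illustrative, and the two estimators shown are the standard textbook forms, not necessarily the exact five compared in the study.

```python
import numpy as np

def var_total_srs(y, N):
    """Variance of the estimated total, treating the systematic sample as SRS."""
    n = len(y)
    return N**2 * (1 - n / N) * np.var(y, ddof=1) / n

def var_total_sucdiff(y, N):
    """Successive-difference estimator: local differences, so smooth
    diurnal/seasonal trends inflate the variance estimate far less."""
    n = len(y)
    sd2 = np.sum(np.diff(y)**2) / (2 * (n - 1))
    return N**2 * (1 - n / N) * sd2 / n

# Hourly passage with a diurnal pattern, counted 10 minutes of every hour
hours = np.arange(0, 24 * 30)                        # a 30-day counting season
rate = 500 + 400 * np.sin(2 * np.pi * hours / 24)    # autocorrelated, nonlinear
counts = np.random.default_rng(0).poisson(rate / 6)  # 1-in-6 systematic sample
N = len(hours) * 6                                   # total 10-minute periods
print("SRS-style:", var_total_srs(counts, N),
      "successive-difference:", var_total_sucdiff(counts, N))
```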
Constructing a multidimensional free energy surface like a spider weaving a web.
Chen, Changjun
2017-10-15
A complete free energy surface in the collective variable space provides important information on the reaction mechanisms of a molecule. But sufficient sampling of the collective variable space is not easy: the space expands quickly with the number of collective variables. To solve the problem, many methods utilize artificial biasing potentials to flatten out the original free energy surface of the molecule in the simulation. Their performance is sensitive to the definition of the biasing potential: a fast-growing biasing potential accelerates sampling but decreases the accuracy of the free energy result, whereas a slow-growing biasing potential gives an optimized result but needs more simulation time. In this article, we propose an alternative method. It adds the biasing potential to a representative point of the molecule in the collective variable space to improve conformational sampling, and the free energy surface is calculated from the free energy gradient in the constrained simulation, not given by the negative of the biasing potential as in previous methods. The presented method therefore does not require the biasing potential to remove all the barriers and basins on the free energy surface exactly. Practical applications show that the method is able to produce accurate free energy surfaces for different molecules in a short time period, with small free energy errors for various biasing potentials.
Ensemble-Biased Metadynamics: A Molecular Simulation Method to Sample Experimental Distributions
Marinelli, Fabrizio; Faraldo-Gómez, José D.
2015-01-01
We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can be therefore readily used with multiple MD engines.
Taniguchi, Hidetaka; Sato, Hiroshi; Shirakawa, Tomohiro
2018-05-09
Human learners can generalize a new concept from a small number of samples. In contrast, conventional machine learning methods require large amounts of data to address the same types of problems. Humans have cognitive biases that promote fast learning. Here, we developed a method to reduce the gap between human beings and machines in this type of inference by utilizing cognitive biases. We implemented a human cognitive model into machine learning algorithms and compared their performance with the currently most popular methods, naïve Bayes, support vector machine, neural networks, logistic regression and random forests. We focused on the task of spam classification, which has been studied for a long time in the field of machine learning and often requires a large amount of data to obtain high accuracy. Our models achieved superior performance with small and biased samples in comparison with other representative machine learning methods.
Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley
2013-12-15
The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches-the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation-with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
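The study applies Firth's penalization to conditional Poisson models for the SCCS design; the penalty itself is most easily illustrated for ordinary logistic regression, where the modified score has a closed form. The sketch below is that textbook version with a toy rare-event dataset, not the authors' SCCS implementation.

```python
import numpy as np

def firth_logistic(X, y, n_iter=50, tol=1e-8):
    """Firth-penalized logistic regression (Jeffreys-prior penalty).

    Newton iterations on the modified score U*(b) = X'(y - p + h*(0.5 - p)),
    where h is the diagonal of the hat matrix; this reduces small-sample
    bias and yields finite estimates even under separation.
    """
    n, k = X.shape
    b = np.zeros(k)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        W = p * (1.0 - p)
        XtWX = X.T @ (W[:, None] * X)
        # hat diagonal: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = W * np.einsum('ij,jk,ik->i', X, np.linalg.inv(XtWX), X)
        U = X.T @ (y - p + h * (0.5 - p))
        step = np.linalg.solve(XtWX, U)
        b += step
        if np.max(np.abs(step)) < tol:
            break
    return b

# Tiny, rare-event sample where plain maximum likelihood would be badly biased
X = np.column_stack([np.ones(12), [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]])
y = np.array([0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1])
print("Firth log odds ratio:", firth_logistic(X, y)[1])
```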
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. Linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B), as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here, in which each galaxy observation is weighted by its sampling volume. The results of correlations and regressions among the sample are significantly changed in the anticipated sense: the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
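The correction of Verter (1988) can be sketched directly: each galaxy is weighted by the inverse of the volume within which it would have entered the flux-limited sample. The luminosities, flux limit, and units below are invented for illustration; only the weighting mechanics matter.

```python
import numpy as np

def vmax_weights(L, S_lim):
    """Weight each galaxy by the inverse of its sampling volume.

    A source of luminosity L in a flux-limited survey (flux >= S_lim) is
    detectable out to d_max = sqrt(L / (4*pi*S_lim)), hence within a volume
    V_max proportional to d_max**3; weighting by 1/V_max removes the
    over-representation of intrinsically bright objects (Malmquist bias).
    """
    d_max = np.sqrt(L / (4.0 * np.pi * S_lim))
    return 1.0 / ((4.0 / 3.0) * np.pi * d_max**3)

def weighted_corr(x, y, w):
    mx, my = np.average(x, weights=w), np.average(y, weights=w)
    cov = np.average((x - mx) * (y - my), weights=w)
    return cov / np.sqrt(np.average((x - mx)**2, weights=w) *
                         np.average((y - my)**2, weights=w))

# Example: correlation of log L_FIR with log L_B, unweighted vs. weighted
rng = np.random.default_rng(0)
logLB = rng.normal(10, 0.5, 400)
logLFIR = 0.7 * logLB + rng.normal(0, 0.4, 400)
w = vmax_weights(10**logLFIR, S_lim=1e6)   # illustrative flux limit, arbitrary units
print(weighted_corr(logLB, logLFIR, np.ones_like(w)),
      weighted_corr(logLB, logLFIR, w))
```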
Sampling of temporal networks: Methods and biases
NASA Astrophysics Data System (ADS)
Rocha, Luis E. C.; Masuda, Naoki; Holme, Petter
2017-11-01
Temporal networks have been increasingly used to model a diversity of systems that evolve in time; for example, human contact structures over which dynamic processes such as epidemics take place. A fundamental aspect of real-life networks is that they are sampled within temporal and spatial frames. Furthermore, one might wish to subsample networks to reduce their size for better visualization or to perform computationally intensive simulations. The sampling method may affect the network structure and thus caution is necessary to generalize results based on samples. In this paper, we study four sampling strategies applied to a variety of real-life temporal networks. We quantify the biases generated by each sampling strategy on a number of relevant statistics such as link activity, temporal paths and epidemic spread. We find that some biases are common in a variety of networks and statistics, but one strategy, uniform sampling of nodes, shows improved performance in most scenarios. Given the particularities of temporal network data and the variety of network structures, we recommend that the choice of sampling methods be problem oriented to minimize the potential biases for the specific research questions on hand. Our results help researchers to better design network data collection protocols and to understand the limitations of sampled temporal network data.
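Of the four strategies compared, the best-performing one, uniform sampling of nodes, is also the simplest to express: draw a node subset uniformly at random and keep the induced events. A sketch on a toy event list:

```python
import random

def sample_nodes_uniform(events, fraction=0.5, seed=0):
    """Induced temporal subnetwork from a uniform sample of nodes.

    events: list of (t, u, v) contact events; keeps only events whose
    endpoints both fall in the sampled node set.
    """
    nodes = {u for _, u, v in events} | {v for _, u, v in events}
    rng = random.Random(seed)
    kept = set(rng.sample(sorted(nodes), int(fraction * len(nodes))))
    return [e for e in events if e[1] in kept and e[2] in kept]

events = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'a', 'c'), (4, 'c', 'd')]
sub = sample_nodes_uniform(events, fraction=0.75)
# Compare a statistic such as link activity before and after sampling
print(len(sub) / len(events), "of events retained")
```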
Hypothesis Testing Using Factor Score Regression
Devlieger, Ines; Mayer, Axel; Rosseel, Yves
2015-01-01
In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and with structural equation modeling (SEM) by using analytic calculations and two Monte Carlo simulation studies to examine their finite sample characteristics. Several performance criteria are used, such as the bias using the unstandardized and standardized parameterization, efficiency, mean square error, standard error bias, type I error rate, and power. The results show that the bias correcting method, with the newly developed standard error, is the only suitable alternative for SEM. While it has a higher standard error bias than SEM, it has a comparable bias, efficiency, mean square error, power, and type I error rate.
Yang, Mingjun; Huang, Jing; MacKerell, Alexander D
2015-06-09
Replica exchange (REX) is a powerful computational tool for overcoming the quasi-ergodic sampling problem of complex molecular systems. Recently, several multidimensional extensions of this method have been developed to realize exchanges in both temperature and biasing potential space or the use of multiple biasing potentials to improve sampling efficiency. However, the increased computational cost due to the multidimensionality of exchanges becomes challenging for use on complex systems under explicit solvent conditions. In this study, we develop a one-dimensional (1D) REX algorithm to concurrently combine the advantages of overall enhanced sampling from Hamiltonian solute scaling and the specific enhancement of collective variables using Hamiltonian biasing potentials. In the present Hamiltonian replica exchange method, termed HREST-BP, Hamiltonian solute scaling is applied to the solute subsystem and its interactions with the environment to enhance overall conformational transitions, and biasing potentials are added along selected collective variables associated with specific conformational transitions, thereby balancing the sampling of different hierarchical degrees of freedom. The two enhanced sampling approaches are implemented concurrently, allowing for the use of a small number of replicas (e.g., 6 to 8) in 1D, thus greatly reducing the computational cost in complex system simulations. The present method is applied to conformational sampling of two nitrogen-linked glycans (N-glycans) found on the HIV gp120 envelope protein. Considering the general importance of the conformational sampling problem, HREST-BP represents an efficient procedure for the study of complex saccharides, and, more generally, the method is anticipated to be of general utility for conformational sampling in a wide range of macromolecular systems.
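At the heart of any Hamiltonian REX scheme, including the one described here, is the pairwise Metropolis swap test: configurations are exchanged with probability min(1, exp(-Delta)), where Delta collects the cross-evaluations of the two Hamiltonians. A minimal sketch with two toy potentials follows; the solute-scaling and biasing-potential details of HREST-BP are not modeled.

```python
import numpy as np

def hrex_swap(U_i, U_j, x_i, x_j, beta, rng):
    """Metropolis exchange between neighboring Hamiltonian replicas.

    U_i, U_j: potential functions of replicas i and j (e.g., different
    solute scalings and/or biasing potentials); x_i, x_j: coordinates.
    Returns the (possibly swapped) configurations.
    """
    delta = beta * ((U_i(x_j) + U_j(x_i)) - (U_i(x_i) + U_j(x_j)))
    if rng.random() < np.exp(-delta):
        return x_j, x_i            # accept: configurations trade Hamiltonians
    return x_i, x_j

# Example with two harmonic "replicas" differing in a scaling parameter
rng = np.random.default_rng(0)
U1 = lambda x: 0.5 * x**2          # unscaled replica
U2 = lambda x: 0.25 * x**2         # solute-scaled (softer) replica
print(hrex_swap(U1, U2, 0.3, 2.0, beta=1.0, rng=rng))
```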
Peng, Yaguang; Li, Wei; Wang, Yang; Chen, Hui; Bo, Jian; Wang, Xingyu; Liu, Lisheng
2016-01-01
24-h urinary sodium excretion is the gold standard for evaluating dietary sodium intake, but it is often not feasible in large epidemiological studies due to high participant burden and cost. Three methods (Kawasaki, INTERSALT, and Tanaka) have been proposed to estimate 24-h urinary sodium excretion from a spot urine sample, but these methods have not been validated in the general Chinese population. The aim of this study was to assess the validity of the three methods for estimating 24-h urinary sodium excretion from spot urine samples against measured 24-h urinary sodium excretion in a Chinese sample population. Data are from a substudy of the Prospective Urban Rural Epidemiology (PURE) study that enrolled 120 participants aged 35 to 70 years and collected their morning fasting urine and 24-h urine specimens. Bias calculations (estimated values minus measured values) and Bland-Altman plots were used to assess the validity of the three estimation methods. 116 participants were included in the final analysis. Mean bias for the Kawasaki method was -740 mg/day (95% CI: -1219, -262 mg/day), the lowest among the three methods. Mean bias for the Tanaka method was -2305 mg/day (95% CI: -2735, -1875 mg/day). Mean bias for the INTERSALT method was -2797 mg/day (95% CI: -3245, -2349 mg/day), the highest of the three methods. Bland-Altman plots indicated that all three methods underestimated 24-h urinary sodium excretion. The Kawasaki, INTERSALT and Tanaka methods thus all underestimated true 24-h urinary sodium excretion in this sample of Chinese adults. Among the three, the Kawasaki method was the least biased, but was still relatively inaccurate. A more accurate method is needed to estimate 24-h urinary sodium excretion from spot urine for assessment of dietary sodium intake in China.
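The validity assessment reported above (mean bias with its 95% CI, plus Bland-Altman limits of agreement) is straightforward to reproduce on any paired estimated/measured data; the numbers below are invented, not the study data.

```python
import numpy as np

def bland_altman(estimated, measured):
    """Mean bias (estimated - measured), its 95% CI, and limits of agreement.

    Uses the normal approximation (1.96); a t-quantile would be more exact
    for very small samples.
    """
    d = np.asarray(estimated, dtype=float) - np.asarray(measured, dtype=float)
    n, mean, sd = len(d), d.mean(), d.std(ddof=1)
    ci = 1.96 * sd / np.sqrt(n)
    return {"mean_bias": mean,
            "bias_95ci": (mean - ci, mean + ci),
            "loa_95": (mean - 1.96 * sd, mean + 1.96 * sd)}

# Illustrative values (mg/day), not the PURE substudy data
measured = np.array([3500, 4200, 2900, 5100, 3800])
kawasaki = np.array([3100, 3300, 2500, 4200, 3000])
print(bland_altman(kawasaki, measured))
```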
Variational Approach to Enhanced Sampling and Free Energy Calculations
NASA Astrophysics Data System (ADS)
Valsson, Omar; Parrinello, Michele
2014-08-01
The ability of widely used sampling methods, such as molecular dynamics or Monte Carlo simulations, to explore complex free energy landscapes is severely hampered by the presence of kinetic bottlenecks. A large number of solutions have been proposed to alleviate this problem. Many are based on the introduction of a bias potential which is a function of a small number of collective variables. However constructing such a bias is not simple. Here we introduce a functional of the bias potential and an associated variational principle. The bias that minimizes the functional relates in a simple way to the free energy surface. This variational principle can be turned into a practical, efficient, and flexible sampling method. A number of numerical examples are presented which include the determination of a three-dimensional free energy surface. We argue that, beside being numerically advantageous, our variational approach provides a convenient and novel standpoint for looking at the sampling problem.
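The functional and its minimizer can be stated compactly (notation follows the published variational approach: F(s) is the free energy along the collective variables s, p(s) a chosen target distribution, and beta the inverse temperature):

```latex
\Omega[V] \;=\; \frac{1}{\beta}\,
  \ln \frac{\int \! ds\; e^{-\beta\,[F(s) + V(s)]}}
           {\int \! ds\; e^{-\beta F(s)}}
\;+\; \int \! ds\; p(s)\, V(s),
\qquad
V_{\min}(s) \;=\; -F(s) \;-\; \frac{1}{\beta}\,\ln p(s) \;+\; \text{const}.
```

Because the functional is convex, minimizing it over a flexible representation of V(s) recovers the free energy surface directly from the optimal bias.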
Comparing State SAT Scores: Problems, Biases, and Corrections.
ERIC Educational Resources Information Center
Gohmann, Stephen F.
1988-01-01
One method to correct for selection bias in comparing Scholastic Aptitude Test (SAT) scores among states is presented, which is a modification of J. J. Heckman's Selection Bias Correction (1976, 1979). Empirical results suggest that sample selection bias is present in SAT score regressions. (SLD)
Wickenberg-Bolin, Ulrika; Göransson, Hanna; Fryknäs, Mårten; Gustafsson, Mats G; Isaksson, Anders
2006-03-13
Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust with good generalization performance to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval of the error rate using repeated design and test sets selected from available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore different methods for small sample performance estimation such as a recently proposed procedure called Repeated Random Sampling (RSS) is also expected to result in heavily biased estimates, which in turn translates into biased confidence intervals. Here we explore such biases and develop a refined algorithm called Repeated Independent Design and Test (RIDT). Our simulations reveal that repeated designs and tests based on resampling in a fixed bag of samples yield a biased variance estimate. We also demonstrate that it is possible to obtain an improved variance estimate by means of a procedure that explicitly models how this bias depends on the number of samples used for testing. For the special case of repeated designs and tests using new samples for each design and test, we present an exact analytical expression for how the expected value of the bias decreases with the size of the test set. We show that via modeling and subsequent reduction of the small sample bias, it is possible to obtain an improved estimate of the variance of classifier performance between design sets. However, the uncertainty of the variance estimate is large in the simulations performed indicating that the method in its present form cannot be directly applied to small data sets.
NASA Astrophysics Data System (ADS)
Ghorbani, A.; Farahani, M. Mahmoodi; Rabbani, M.; Aflaki, F.; Waqifhosain, Syed
2008-01-01
In this paper we propose an uncertainty estimation for the analytical results obtained from the determination of Ni, Pb and Al by solid-phase extraction and inductively coupled plasma optical emission spectrometry (SPE-ICP-OES). The procedure is based on the retention of analytes in the form of 8-hydroxyquinoline (8-HQ) complexes on a mini column of XAD-4 resin and subsequent elution with nitric acid. The influence of various analytical parameters, including the amount of solid phase, pH, elution factors (concentration and volume of the eluting solution), volume of the sample solution, and amount of ligand, on the extraction efficiency of the analytes was investigated. To estimate the uncertainty of the analytical results, we propose assessing trueness using spiked samples. Two types of bias are calculated in the assessment of trueness: a proportional bias and a constant bias. We applied a nested design to calculate the proportional bias and the Youden method to calculate the constant bias. The proportional bias is calculated from spiked samples: the concentration found is plotted against the concentration added, and the slope of the standard-addition curve is an estimate of the method recovery. The estimated average recovery in Karaj river water is (1.004±0.0085) for Ni, (0.999±0.010) for Pb and (0.987±0.008) for Al.
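A minimal sketch of the proportional-bias estimate described above: recovery is the slope of found versus added concentration in spiked samples. The values are illustrative, and the intercept shown is a simple stand-in for a constant-bias check, whereas the paper uses the Youden method for that component.

```python
import numpy as np
from scipy.stats import linregress

# Spiked-sample recovery: concentration found vs. concentration added.
# Values are illustrative, not taken from the paper.
added = np.array([0.0, 5.0, 10.0, 20.0, 40.0])   # e.g. ug/L of Ni spiked
found = np.array([1.1, 6.0, 11.2, 21.0, 41.3])   # measured by SPE-ICP-OES

fit = linregress(added, found)
print(f"proportional bias (recovery slope): {fit.slope:.3f} +/- {fit.stderr:.3f}")
print(f"constant offset (intercept):        {fit.intercept:.3f}")
```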
Biro, Peter A
2013-02-01
Sampling animals from the wild for study is something nearly every biologist has done, but despite our best efforts to obtain random samples of animals, 'hidden' trait biases may still exist. For example, consistent behavioral traits can affect trappability/catchability, independent of obvious factors such as size and gender, and these traits are often correlated with other repeatable physiological and/or life history traits. If so, systematic sampling bias may exist for any of these traits. The extent to which this is a problem, of course, depends on the magnitude of bias, which is presently unknown because the underlying trait distributions in populations are usually unknown, or unknowable. Indeed, our present knowledge about sampling bias comes from samples (not complete population censuses), which can possess bias to begin with. I had the unique opportunity to create naturalized populations of fish by seeding each of four small fishless lakes with equal densities of slow-, intermediate-, and fast-growing fish. Using sampling methods that are not size-selective, I observed that fast-growing fish were up to two times more likely to be sampled than slower-growing fish. This indicates substantial and systematic bias with respect to an important life history trait (growth rate). If correlations between behavioral, physiological and life-history traits are as widespread as the literature suggests, then many animal samples may be systematically biased with respect to these traits (e.g., when collecting animals for laboratory use), and this may affect our inferences about population structure and abundance. I conclude with a discussion on ways to minimize sampling bias for particular physiological/behavioral/life-history types within animal populations.
de Muinck, Eric J; Trosvik, Pål; Gilfillan, Gregor D; Hov, Johannes R; Sundaram, Arvind Y M
2017-07-06
Advances in sequencing technologies and bioinformatics have made the analysis of microbial communities almost routine. Nonetheless, the need remains to improve on the techniques used for gathering such data, including increasing throughput while lowering cost and benchmarking the techniques so that potential sources of bias can be better characterized. We present a triple-index amplicon sequencing strategy to sequence large numbers of samples at significantly lower cost and in a shorter timeframe compared to existing methods. The design employs a two-stage PCR protocol, incorporating three barcodes to each sample, with the possibility of adding a fourth index. It also includes heterogeneity spacers to overcome low complexity issues faced when sequencing amplicons on Illumina platforms. The library preparation method was extensively benchmarked through analysis of a mock community in order to assess biases introduced by sample indexing, number of PCR cycles, and template concentration. We further evaluated the method through re-sequencing of a standardized environmental sample. Finally, we evaluated our protocol on a set of fecal samples from a small cohort of healthy adults, demonstrating good performance in a realistic experimental setting. Between-sample variation was mainly related to batch effects, such as DNA extraction, while sample indexing was also a significant source of bias. PCR cycle number strongly influenced chimera formation and affected relative abundance estimates of species with high GC content. Libraries were sequenced using the Illumina HiSeq and MiSeq platforms to demonstrate that this protocol is highly scalable to sequence thousands of samples at a very low cost. Here, we provide the most comprehensive study of performance and bias inherent to a 16S rRNA gene amplicon sequencing method to date. Triple-indexing greatly reduces the number of long custom DNA oligos required for library preparation, while the inclusion of variable length heterogeneity spacers minimizes the need for PhiX spike-in. This design results in a significant cost reduction of highly multiplexed amplicon sequencing. The biases we characterize highlight the need for highly standardized protocols. Reassuringly, we find that the biological signal is a far stronger structuring factor than the various sources of bias.
NASA Astrophysics Data System (ADS)
Gatti, M.; Vielzeuf, P.; Davis, C.; Cawthon, R.; Rau, M. M.; DeRose, J.; De Vicente, J.; Alarcon, A.; Rozo, E.; Gaztanaga, E.; Hoyle, B.; Miquel, R.; Bernstein, G. M.; Bonnett, C.; Carnero Rosell, A.; Castander, F. J.; Chang, C.; da Costa, L. N.; Gruen, D.; Gschwend, J.; Hartley, W. G.; Lin, H.; MacCrann, N.; Maia, M. A. G.; Ogando, R. L. C.; Roodman, A.; Sevilla-Noarbe, I.; Troxel, M. A.; Wechsler, R. H.; Asorey, J.; Davis, T. M.; Glazebrook, K.; Hinton, S. R.; Lewis, G.; Lidman, C.; Macaulay, E.; Möller, A.; O'Neill, C. R.; Sommer, N. E.; Uddin, S. A.; Yuan, F.; Zhang, B.; Abbott, T. M. C.; Allam, S.; Annis, J.; Bechtol, K.; Brooks, D.; Burke, D. L.; Carollo, D.; Carrasco Kind, M.; Carretero, J.; Cunha, C. E.; D'Andrea, C. B.; DePoy, D. L.; Desai, S.; Eifler, T. F.; Evrard, A. E.; Flaugher, B.; Fosalba, P.; Frieman, J.; García-Bellido, J.; Gerdes, D. W.; Goldstein, D. A.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Hoormann, J. K.; Jain, B.; James, D. J.; Jarvis, M.; Jeltema, T.; Johnson, M. W. G.; Johnson, M. D.; Krause, E.; Kuehn, K.; Kuhlmann, S.; Kuropatkin, N.; Li, T. S.; Lima, M.; Marshall, J. L.; Melchior, P.; Menanteau, F.; Nichol, R. C.; Nord, B.; Plazas, A. A.; Reil, K.; Rykoff, E. S.; Sako, M.; Sanchez, E.; Scarpine, V.; Schubnell, M.; Sheldon, E.; Smith, M.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Tucker, B. E.; Tucker, D. L.; Vikram, V.; Walker, A. R.; Weller, J.; Wester, W.; Wolf, R. C.
2018-06-01
We use numerical simulations to characterize the performance of a clustering-based method to calibrate photometric redshift biases. In particular, we cross-correlate the weak lensing source galaxies from the Dark Energy Survey Year 1 sample with redMaGiC galaxies (luminous red galaxies with secure photometric redshifts) to estimate the redshift distribution of the former sample. The recovered redshift distributions are used to calibrate the photometric redshift bias of standard photo-z methods applied to the same source galaxy sample. We apply the method to two photo-z codes run in our simulated data: Bayesian Photometric Redshift and Directional Neighbourhood Fitting. We characterize the systematic uncertainties of our calibration procedure, and find that these systematic uncertainties dominate our error budget. The dominant systematics are due to our assumption of unevolving bias and clustering across each redshift bin, and to differences between the shapes of the redshift distributions derived by clustering versus photo-zs. The systematic uncertainty in the mean redshift bias of the source galaxy sample is Δz ≲ 0.02, though the precise value depends on the redshift bin under consideration. We discuss possible ways to mitigate the impact of our dominant systematics in future analyses.
Rus, David L.; Patton, Charles J.; Mueller, David K.; Crawford, Charles G.
2013-01-01
The characterization of total-nitrogen (TN) concentrations is an important component of many surface-water-quality programs. However, three widely used methods for the determination of total nitrogen—(1) derived from the alkaline-persulfate digestion of whole-water samples (TN-A); (2) calculated as the sum of total Kjeldahl nitrogen and dissolved nitrate plus nitrite (TN-K); and (3) calculated as the sum of dissolved nitrogen and particulate nitrogen (TN-C)—all include inherent limitations. A digestion process is intended to convert multiple species of nitrogen that are present in the sample into one measurable species, but this process may introduce bias. TN-A results can be negatively biased in the presence of suspended sediment, and TN-K data can be positively biased in the presence of elevated nitrate because some nitrate is reduced to ammonia and is therefore counted twice in the computation of total nitrogen. Furthermore, TN-C may not be subject to bias but is comparatively imprecise. In this study, the effects of suspended-sediment and nitrate concentrations on the performance of these TN methods were assessed using synthetic samples developed in a laboratory as well as a series of stream samples. A 2007 laboratory experiment measured TN-A and TN-K in nutrient-fortified solutions that had been mixed with varying amounts of sediment-reference materials. This experiment identified a connection between suspended sediment and negative bias in TN-A and detected positive bias in TN-K in the presence of elevated nitrate. A 2009–10 synoptic-field study used samples from 77 stream-sampling sites to confirm that these biases were present in the field samples and evaluated the precision and bias of TN methods. The precision of TN-C and TN-K depended on the precision and relative amounts of the TN-component species used in their respective TN computations. Particulate nitrogen had an average variability (as determined by the relative standard deviation) of 13 percent. However, because particulate nitrogen constituted only 14 percent, on average, of TN-C, the precision of the TN-C method approached that of the method for dissolved nitrogen (2.3 percent). On the other hand, total Kjeldahl nitrogen (having a variability of 7.6 percent) constituted an average of 40 percent of TN-K, suggesting that the reduced precision of the Kjeldahl digestion may affect precision of the TN-K estimates. For most samples, the precision of TN computed as TN-C would be better (lower variability) than the precision of TN-K. In general, TN-A precision (having a variability of 2.1 percent) was superior to TN-C and TN-K methods. The laboratory experiment indicated that negative bias in TN-A was present across the entire range of sediment concentration and increased as sediment concentration increased. This suggested that reagent limitation was not the predominant cause of observed bias in TN-A. Furthermore, analyses of particulate nitrogen present in digest residues provided an almost complete accounting for the nitrogen that was underestimated by alkaline-persulfate digestion. This experiment established that, for the reference materials at least, negative bias in TN-A was caused primarily by the sequestration of some particulate nitrogen that was refractory to the digestion process. TN-K biases varied between positive and negative values in the laboratory experiment. 
Positive bias in TN-K is likely the result of the unintended reduction of a small and variable amount of nitrate to ammonia during the Kjeldahl digestion process. Negative TN-K bias may be the result of the sequestration of a portion of particulate nitrogen during the digestion process. Negative bias in TN-A was present across the entire range of suspended-sediment concentration (1 to 14,700 milligrams per liter [mg/L]) in the synoptic-field study, with relative bias being nearly as great at sediment concentrations below 10 mg/L (median of -3.5 percent) as that observed at sediment concentrations up to 750 mg/L (median of -4.4 percent). This lent support to the laboratory-experiment finding that some particulate nitrogen is sequestered during the digestion process, and demonstrated that negative TN-A bias was present in samples with very low suspended-sediment concentrations. At sediment concentrations above 750 mg/L, the negative TN-A bias became more likely and larger (median of -13.2 percent), suggesting a secondary mechanism of bias, such as reagent limitation. From a geospatial perspective, trends in TN-A bias were not explained by selected basin characteristics. Though variable, TN-K bias generally was positive in the synoptic-field study (median of 3.1 percent), probably as a result of the reduction of nitrate. Three alternative approaches for assessing TN in surface water were evaluated for their impacts on existing and future sampling programs. Replacing TN-A with TN-C would remove the bias from subsequent data, but this approach also would introduce discontinuity in historical records. Replacing TN-K with TN-C would lead to the removal of positive bias in TN-K in the presence of elevated nitrate. However, in addition to the issues that may arise from a discontinuity in the data record, this approach may not be applicable to regulatory programs that require the use of total Kjeldahl nitrogen for stream assessment. By adding TN-C to existing TN-A or TN-K analyses, historical-data continuity would be preserved and the transitional period could be used to minimize the impact of bias on data analyses. This approach, however, imposes the greatest burdens on field operations and in terms of analytical costs. The variation in these impacts on different sampling programs will challenge U.S. Geological Survey scientists attempting to establish uniform standards for TN sample collection and analytical determinations.
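The precision ranking reported above follows from standard error propagation for a sum of components. In this sketch the component fractions and RSDs are taken from the abstract, the nitrate-plus-nitrite RSD is an assumed value, and independence between components is an assumption.

```python
import numpy as np

def rsd_of_sum(fractions, rsds):
    """RSD (%) of a sum of independent components, each given as its average
    fractional contribution to the total and its own RSD (%)."""
    fractions, rsds = np.asarray(fractions), np.asarray(rsds)
    return np.sqrt(np.sum((fractions * rsds) ** 2))

# TN-C = dissolved N (86% of total, RSD 2.3%) + particulate N (14%, RSD 13%)
print(f"TN-C RSD ~ {rsd_of_sum([0.86, 0.14], [2.3, 13.0]):.1f} %")   # ~2.7 %

# TN-K = TKN (40%, RSD 7.6%) + nitrate+nitrite (60%, RSD assumed 2.3%)
print(f"TN-K RSD ~ {rsd_of_sum([0.40, 0.60], [7.6, 2.3]):.1f} %")    # ~3.3 %
```

The small particulate-nitrogen fraction is why TN-C precision approaches that of the dissolved-nitrogen method, while the large TKN fraction drags TN-K precision down.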
Evaluation of respondent-driven sampling.
McCreesh, Nicky; Frost, Simon D W; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda N; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G
2012-01-01
Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data. Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion. Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required when interpreting findings based on the sampling method.
Ullah, Md Ahsan; Kim, Ki-Hyun; Szulejko, Jan E; Cho, Jinwoo
2014-04-11
The production of short-chain volatile fatty acids (VFAs) by the anaerobic bacterial digestion of sewage (wastewater) offers an excellent route to greener alternative bio-energy (e.g., microbial fuel cells). VFAs in wastewater (sewage) samples are commonly quantified through direct injection (DI) into a gas chromatograph with a flame ionization detector (GC-FID). In this study, the reliability of VFA analysis by the DI-GC method was examined against a thermal desorption (TD-GC) method. The results indicate that the VFA concentrations determined from an aliquot of each wastewater sample by the DI-GC method were generally underestimated, e.g., reductions of 7% (acetic acid) to 93.4% (hexanoic acid) relative to the TD-GC method. The observed differences between the two methods suggest that matrix effects play a possibly important role in giving rise to the negative biases in DI-GC analysis. To further explore this possibility, an ancillary experiment was performed to examine the bias patterns of three DI-GC approaches. For instance, the results of the standard addition (SA) method confirm the definite role of matrix effects when analyzing wastewater samples by DI-GC. More importantly, the biases tend to increase systematically with increasing molecular weight and decreasing VFA concentration. As such, the DI-GC method, if applied to the analysis of samples with a complicated matrix, needs thorough validation to improve the reliability of data acquisition. Copyright © 2014 Elsevier B.V. All rights reserved.
Nah, Hyunjin; Lee, Sang-Guk; Lee, Kyeong-Seob; Won, Jae-Hee; Kim, Hyun Ok; Kim, Jeong-Ho
2016-02-01
The aim of this study was to estimate bilirubin interference and the accuracy of six routine methods for measuring creatinine compared with isotope dilution-liquid chromatography mass spectrometry (ID-LC/MS). A total of 40 clinical serum samples from 31 patients with serum total bilirubin concentration >68.4 μmol/L were collected. Serum creatinine was measured using two enzymatic reagents and four Jaffe reagents as well as ID-LC/MS. Correlations between bilirubin concentration and percent difference in creatinine compared with ID-LC/MS were analyzed to investigate bilirubin interference. Bias estimations between the six reagents and ID-LC/MS were performed. Recovery tests using National Institute of Standards and Technology (NIST) Standard Reference Material (SRM) 967a were also performed. Both the enzymatic methods showed no bilirubin interference. However, three of the four Jaffe methods demonstrated significant bilirubin concentration-dependent interference in samples with creatinine levels <53 μmol/L, and two of them showed significant bilirubin interference in samples with creatinine levels ranging from 53.0 to 97.2 μmol/L. Comparison of these methods with ID-LC/MS using patients' samples with elevated bilirubin revealed that the tested methods failed to achieve the bias goal, especially at low levels of creatinine. In addition, the recovery test using NIST SRM 967a showed that the bias of one Jaffe method and two enzymatic methods did not meet the bias goal at either low or high levels of creatinine, indicating calibration bias. One enzymatic method failed to achieve all the bias goals in both the comparison experiment and the recovery test. It is important to understand that both bilirubin interference and calibration traceability to ID-LC/MS should be considered to improve the accuracy of creatinine measurement. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling
Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.
2012-01-01
This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
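For context, the tilt that length-biased (prevalent) sampling induces has the standard form

$$f_{LB}(t) = \frac{t\, f(t)}{\mu}, \qquad \mu = \int_0^\infty u\, f(u)\, du,$$

so subjects with longer failure times are over-represented in proportion to their length; estimators such as the one above must undo this tilt while also handling right censoring.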
Can quantile mapping improve precipitation extremes from regional climate models?
NASA Astrophysics Data System (ADS)
Tani, Satyanarayana; Gobiet, Andreas
2015-04-01
The ability of quantile mapping to accurately bias-correct precipitation extremes is investigated in this study. We developed new methods by extending standard quantile mapping (QMα) to improve the quality of bias-corrected extreme precipitation events simulated by regional climate model (RCM) output. The new QM version (QMβ) was developed by combining parametric and nonparametric bias correction methods. The new nonparametric method is tested with and without a controlling shape parameter (QMβ1 and QMβ0, respectively). Bias corrections are applied to hindcast simulations for a small ensemble of RCMs at six different locations over Europe. We examined the quality of the extremes through split-sample and cross-validation approaches for these three bias correction methods; the split-sample approach mimics the application to future climate scenarios. A cross-validation framework with particular focus on new extremes was developed. Error characteristics, q-q plots and Mean Absolute Error (MAEx) skill scores are used for evaluation. We demonstrate the unstable behaviour of the correction function at higher quantiles with QMα, whereas the correction functions for QMβ0 and QMβ1 are smoother, with QMβ1 providing the most reasonable correction values. The q-q plots demonstrate that all bias correction methods are capable of producing new extremes, but QMβ1 reproduces new extremes with low biases in all seasons compared to QMα and QMβ0. Our results clearly demonstrate the inherent limitations of empirical bias correction methods employed for extremes, particularly new extremes, and our findings reveal that the new bias correction method (QMβ1) produces more reliable climate scenarios for new extremes. These findings present a methodology that can better capture future extreme precipitation events, which is necessary to improve regional climate change impact studies.
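The QMβ variants are not specified in enough detail here to reproduce, but the sketch below shows the generic empirical quantile mapping that QMα-type corrections build on, including the tail clamping that underlies the unstable behaviour at high quantiles noted above. All data are synthetic.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_future):
    """Empirical quantile mapping: correct each future model value by the
    observed-vs-model quantile relationship in the calibration period."""
    q = np.linspace(0.01, 0.99, 99)
    mq = np.quantile(model_hist, q)   # model calibration quantiles
    oq = np.quantile(obs_hist, q)     # observed calibration quantiles
    # locate each future value's quantile in the model distribution, then
    # read off the corresponding observed value; values beyond the
    # calibration range are clamped, which is exactly where "new
    # extremes" stress the method
    ranks = np.interp(model_future, mq, q)
    return np.interp(ranks, q, oq)

rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 4.0, 3000)   # pseudo-observed daily precipitation
mod = rng.gamma(2.0, 5.5, 3000)   # biased RCM hindcast
fut = rng.gamma(2.0, 6.5, 3000)   # future run containing new extremes
corr = quantile_map(mod, obs, fut)
print(f"obs mean {obs.mean():.2f}, raw future mean {fut.mean():.2f}, "
      f"corrected future mean {corr.mean():.2f}")
```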
Fithian, William; Elith, Jane; Hastie, Trevor; Keith, David A
2015-04-01
Presence-only records may provide data on the distributions of rare species, but commonly suffer from large, unknown biases due to their typically haphazard collection schemes. Presence-absence or count data collected in systematic, planned surveys are more reliable but typically less abundant. We proposed a probabilistic model to allow for joint analysis of presence-only and survey data to exploit their complementary strengths. Our method pools presence-only and presence-absence data for many species and maximizes a joint likelihood, simultaneously estimating and adjusting for the sampling bias affecting the presence-only data. By assuming that the sampling bias is the same for all species, we can borrow strength across species to efficiently estimate the bias and improve our inference from presence-only data. We evaluate our model's performance on data for 36 eucalypt species in south-eastern Australia. We find that presence-only records exhibit a strong sampling bias towards the coast and towards Sydney, the largest city. Our data-pooling technique substantially improves the out-of-sample predictive performance of our model when the amount of available presence-absence data for a given species is scarce. If we have only presence-only data and no presence-absence data for a given species, but both types of data for several other species that suffer from the same spatial sampling bias, then our method can obtain an unbiased estimate of the first species' geographic range.
Fithian, William; Elith, Jane; Hastie, Trevor; Keith, David A.
2016-01-01
Presence-only records may provide data on the distributions of rare species, but commonly suffer from large, unknown biases due to their typically haphazard collection schemes. Presence–absence or count data collected in systematic, planned surveys are more reliable but typically less abundant. We proposed a probabilistic model to allow for joint analysis of presence-only and survey data to exploit their complementary strengths. Our method pools presence-only and presence–absence data for many species and maximizes a joint likelihood, simultaneously estimating and adjusting for the sampling bias affecting the presence-only data. By assuming that the sampling bias is the same for all species, we can borrow strength across species to efficiently estimate the bias and improve our inference from presence-only data. We evaluate our model’s performance on data for 36 eucalypt species in south-eastern Australia. We find that presence-only records exhibit a strong sampling bias towards the coast and towards Sydney, the largest city. Our data-pooling technique substantially improves the out-of-sample predictive performance of our model when the amount of available presence–absence data for a given species is scarce. If we have only presence-only data and no presence–absence data for a given species, but both types of data for several other species that suffer from the same spatial sampling bias, then our method can obtain an unbiased estimate of the first species' geographic range. PMID:27840673
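Schematically (notation mine, not quoted from the paper), the pooling idea treats the records of each species k as a thinned Poisson point process:

$$\log \lambda_k(s) = \alpha_k + \beta_k^{\top} x(s), \qquad \log \tilde{\lambda}_k(s) = \log \lambda_k(s) + \gamma^{\top} z(s),$$

where $\lambda_k$ is the true intensity underlying the survey data, $\tilde{\lambda}_k$ is the biased intensity generating the presence-only records, and the bias coefficients $\gamma$ are shared across all species. Sharing $\gamma$ is what lets the many-species presence-only data identify and remove the sampling bias.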
Price, A.; Peterson, James T.
2010-01-01
Stream fish managers often use fish sample data to inform management decisions affecting fish populations. Fish sample data, however, can be biased by the same factors affecting fish populations. To minimize the effect of sample biases on decision making, biologists need information on the effectiveness of fish sampling methods. We evaluated single-pass backpack electrofishing and seining combined with electrofishing by following a dual-gear, mark–recapture approach in 61 block-netted sample units within first- to third-order streams. We also estimated fish movement out of unblocked units during sampling. Capture efficiency and fish abundances were modeled for 50 fish species by use of conditional multinomial capture–recapture models. The best-approximating models indicated that capture efficiencies were generally low and differed among species groups based on family or genus. Efficiencies of single-pass electrofishing and seining combined with electrofishing were greatest for Catostomidae and lowest for Ictaluridae. Fish body length and stream habitat characteristics (mean cross-sectional area, wood density, mean current velocity, and turbidity) also were related to capture efficiency of both methods, but the effects differed among species groups. We estimated that, on average, 23% of fish left the unblocked sample units, but net movement varied among species. Our results suggest that (1) common warmwater stream fish sampling methods have low capture efficiency and (2) failure to adjust for incomplete capture may bias estimates of fish abundance. We suggest that managers minimize bias from incomplete capture by adjusting data for site- and species-specific capture efficiency and by choosing sampling gears that provide estimates with minimal bias and variance. Furthermore, if block nets are not used, we recommend that managers adjust the data based on unconditional capture efficiency.
Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian
2015-01-01
Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort; only rarefaction was able to remove this effect, and we therefore recommend this method for estimating species richness with “big data” collections. PMID:25692000
Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian
2015-02-01
Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort; only rarefaction was able to remove this effect, and we therefore recommend this method for estimating species richness with "big data" collections.
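Rarefaction, the method recommended above, can be sketched as repeated subsampling to a common number of specimens so that richness is compared at equal effort. The abundance vector below is illustrative, not real data.

```python
import numpy as np

def rarefy(counts, n_sub, n_reps=200, seed=3):
    """Expected species richness in random subsamples of n_sub specimens
    drawn without replacement from the full collection."""
    rng = np.random.default_rng(seed)
    pool = np.repeat(np.arange(len(counts)), counts)  # one entry per specimen
    richness = [
        len(np.unique(rng.choice(pool, size=n_sub, replace=False)))
        for _ in range(n_reps)
    ]
    return np.mean(richness)

counts = np.array([500, 200, 90, 40, 20, 10, 5, 2, 1, 1])  # specimens/species
for n in (50, 200, 800):
    print(f"expected richness at n={n}: {rarefy(counts, n):.1f}")
```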
Forester, James D; Im, Hae Kyung; Rathouz, Paul J
2009-12-01
Patterns of resource selection by animal populations emerge as a result of the behavior of many individuals. Statistical models that describe these population-level patterns of habitat use can miss important interactions between individual animals and characteristics of their local environment; however, identifying these interactions is difficult. One approach to this problem is to incorporate models of individual movement into resource selection models. To do this, we propose a model for step selection functions (SSF) that is composed of a resource-independent movement kernel and a resource selection function (RSF). We show that standard case-control logistic regression may be used to fit the SSF; however, the sampling scheme used to generate control points (i.e., the definition of availability) must be accommodated. We used three sampling schemes to analyze simulated movement data and found that ignoring sampling and the resource-independent movement kernel yielded biased estimates of selection. The level of bias depended on the method used to generate control locations, the strength of selection, and the spatial scale of the resource map. Using empirical or parametric methods to sample control locations produced biased estimates under stronger selection; however, we show that the addition of a distance function to the analysis substantially reduced that bias. Assuming a uniform availability within a fixed buffer yielded strongly biased selection estimates that could be corrected by including the distance function but remained inefficient relative to the empirical and parametric sampling methods. As a case study, we used location data collected from elk in Yellowstone National Park, USA, to show that selection and bias may be temporally variable. Because under constant selection the amount of bias depends on the scale at which a resource is distributed in the landscape, we suggest that distance always be included as a covariate in SSF analyses. This approach to modeling resource selection is easily implemented using common statistical tools and promises to provide deeper insight into the movement ecology of animals.
Rater Perceptions of Bias Using the Multiple Mini-Interview Format: A Qualitative Study
ERIC Educational Resources Information Center
Alweis, Richard L.; Fitzpatrick, Caroline; Donato, Anthony A.
2015-01-01
Introduction: The Multiple Mini-Interview (MMI) format appears to mitigate individual rater biases. However, the format itself may introduce structural systematic bias, favoring extroverted personality types. This study aimed to gain a better understanding of these biases from the perspective of the interviewer. Methods: A sample of MMI…
Adaptively biased molecular dynamics: An umbrella sampling method with a time-dependent potential
NASA Astrophysics Data System (ADS)
Babin, Volodymyr; Karpusenka, Vadzim; Moradi, Mahmoud; Roland, Christopher; Sagui, Celeste
We discuss an adaptively biased molecular dynamics (ABMD) method for the computation of a free energy surface for a set of reaction coordinates. The ABMD method belongs to the general category of umbrella sampling methods with an evolving biasing potential. It is characterized by a small number of control parameters and an O(t) numerical cost with simulation time t. The method naturally allows for extensions based on multiple walkers and a replica-exchange mechanism. The workings of the method are illustrated with a number of examples, including sugar puckering, free energy landscapes for polymethionine and polyproline peptides, and a short β-turn peptide. ABMD has been implemented into the latest version (Case et al., AMBER 10; University of California: San Francisco, 2008) of the AMBER software package and is freely available to the simulation community.
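A toy one-dimensional illustration of an evolving umbrella bias, with Gaussian hills deposited along a collective variable during overdamped Langevin dynamics. This is a generic sketch of the class of methods described, not the ABMD implementation in AMBER; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Double-well free energy 5*(s^2 - 1)^2 along one collective variable s (kT)
dF = lambda s: 20.0 * s * (s**2 - 1.0)   # its derivative

centers = []                             # deposited hill centers
w, sigma = 0.1, 0.15                     # hill height (kT) and width

def dbias(s):
    """Derivative of the history-dependent bias (a sum of Gaussians)."""
    if not centers:
        return 0.0
    c = np.asarray(centers)
    return np.sum(-w * (s - c) / sigma**2 * np.exp(-(s - c)**2 / (2 * sigma**2)))

s, dt, kT = -1.0, 1e-3, 1.0
n_steps, visits_right = 100_000, 0
for step in range(n_steps):
    # overdamped Langevin step on the biased landscape
    s += -(dF(s) + dbias(s)) * dt + np.sqrt(2 * kT * dt) * rng.normal()
    if step % 500 == 0:
        centers.append(s)                # grow the bias over time
    visits_right += s > 0

print(f"fraction of time in the right-hand well: {visits_right / n_steps:.2f}")
```

As the accumulated hills fill the starting well, barrier crossings that would be rare in unbiased dynamics become routine, which is the essential mechanism shared by this family of methods.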
Evaluation of Respondent-Driven Sampling
McCreesh, Nicky; Frost, Simon; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda Ndagire; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G
2012-01-01
Background Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex-workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total-population data. Methods Total-population data on age, tribe, religion, socioeconomic status, sexual activity and HIV status were available on a population of 2402 male household-heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, employing current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). Results We recruited 927 household-heads. Full and small RDS samples were largely representative of the total population, but both samples under-represented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven-sampling statistical-inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven-sampling bootstrap 95% confidence intervals included the population proportion. Conclusions Respondent-driven sampling produced a generally representative sample of this well-connected non-hidden population. However, current respondent-driven-sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience-sampling method, and caution is required when interpreting findings based on the sampling method. PMID:22157309
Driven Metadynamics: Reconstructing Equilibrium Free Energies from Driven Adaptive-Bias Simulations
2013-01-01
We present a novel free-energy calculation method that constructively integrates two distinct classes of nonequilibrium sampling techniques, namely, driven (e.g., steered molecular dynamics) and adaptive-bias (e.g., metadynamics) methods. By employing nonequilibrium work relations, we design a biasing protocol with an explicitly time- and history-dependent bias that uses on-the-fly work measurements to gradually flatten the free-energy surface. The asymptotic convergence of the method is discussed, and several relations are derived for free-energy reconstruction and error estimation. Isomerization reaction of an atomistic polyproline peptide model is used to numerically illustrate the superior efficiency and faster convergence of the method compared with its adaptive-bias and driven components in isolation. PMID:23795244
Correction of bias in belt transect studies of immotile objects
Anderson, D.R.; Pospahala, R.S.
1970-01-01
Unless a correction is made, population estimates derived from a sample of belt transects will be biased if a fraction of the individuals on the sample transects is not counted. An approach useful for correcting this bias when sampling immotile populations using transects of a fixed width is presented. The method assumes that a searcher's ability to find objects near the center of the transect is nearly perfect. The method uses a mathematical equation, estimated from the data, to represent the searcher's declining ability to find objects at increasing distances from the center of the transect. An example of the analysis of data, formation of the equation, and application is presented using waterfowl nesting data collected in Colorado.
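The abstract describes fitting an equation for declining detectability away from the centerline, assuming near-perfect detection at the center. A standard half-normal detection-function sketch of that idea follows (not the authors' exact equation; data are simulated).

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import erf

rng = np.random.default_rng(5)
W = 50.0                                   # half-width of the belt (m)
true = rng.uniform(0, W, 400)              # 400 objects actually present
keep = rng.random(400) < np.exp(-true**2 / (2 * 15.0**2))
x = true[keep]                             # perpendicular distances detected

# Half-normal detection function g(x) = exp(-x^2 / (2 sigma^2)), g(0) = 1.
def negloglik(sigma):
    # density of detected distances on [0, W] is g(x) / integral of g
    Z = sigma * np.sqrt(np.pi / 2) * erf(W / (sigma * np.sqrt(2)))
    return -np.sum(-x**2 / (2 * sigma**2) - np.log(Z))

sigma = minimize_scalar(negloglik, bounds=(1.0, 100.0), method="bounded").x
esw = sigma * np.sqrt(np.pi / 2) * erf(W / (sigma * np.sqrt(2)))
print(f"sigma = {sigma:.1f} m, effective half-width = {esw:.1f} m")
print(f"corrected count on the belt: {len(x) * W / esw:.0f} (true: 400)")
```

Dividing the raw count by the mean detectability (esw/W) is what removes the undercounting bias the paper addresses.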
Backfitting in Smoothing Spline Anova, with Application to Historical Global Temperature Data
NASA Astrophysics Data System (ADS)
Luo, Zhen
In the attempt to estimate the temperature history of the earth using surface observations, various biases can exist. An important source of bias is the incompleteness of sampling over both time and space. A few methods have been proposed to deal with this problem. Although they can correct some biases resulting from incomplete sampling, they ignore some other significant biases. In this dissertation, a smoothing spline ANOVA approach, a multivariate function estimation method, is proposed to deal simultaneously with various biases resulting from incomplete sampling. An additional advantage of this method is that the various components of the estimated temperature history can be obtained with a limited amount of information stored. The method can also be used to detect erroneous observations in the data base. The method is illustrated through an example of modeling winter surface air temperature as a function of year and location. Extensions to more complicated models are discussed. The linear system associated with the smoothing spline ANOVA estimates is too large to be solved by full matrix decomposition methods. A computational procedure combining the backfitting (Gauss-Seidel) algorithm and the iterative imputation algorithm is proposed. This procedure takes advantage of the tensor product structure in the data to make the computation feasible in an environment of limited memory. Various related issues are discussed, e.g., the computation of confidence intervals and techniques to speed up the convergence of the backfitting algorithm, such as collapsing and successive over-relaxation.
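A minimal sketch of the backfitting (Gauss-Seidel) idea for an additive model with two components, standing in for the dissertation's smoothing spline ANOVA estimates; the smoother and data below are illustrative choices, not the author's.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(6)
n = 500
t = rng.uniform(0, 10, n)                 # e.g. year
u = rng.uniform(-3, 3, n)                 # e.g. a location coordinate
y = np.sin(t) + 0.5 * u**2 + rng.normal(scale=0.3, size=n)

# Backfitting for y = mu + f1(t) + f2(u) + noise: cycle through the
# components, smoothing the partial residuals of each in turn.
mu, f1, f2 = y.mean(), np.zeros(n), np.zeros(n)
for _ in range(20):
    r1 = y - mu - f2
    s1 = UnivariateSpline(np.sort(t), r1[np.argsort(t)], s=n * 0.1)
    f1 = s1(t) - s1(t).mean()             # center to keep mu identifiable
    r2 = y - mu - f1
    s2 = UnivariateSpline(np.sort(u), r2[np.argsort(u)], s=n * 0.1)
    f2 = s2(u) - s2(u).mean()

print("residual SD after backfitting:", np.std(y - mu - f1 - f2))
```

Because each pass only ever smooths one component against partial residuals, the full design matrix never has to be assembled, which is the memory advantage the dissertation exploits via the tensor product structure.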
Elloumi, Fathi; Hu, Zhiyuan; Li, Yan; Parker, Joel S; Gulley, Margaret L; Amos, Keith D; Troester, Melissa A
2011-06-30
Genomic tests are available to predict breast cancer recurrence and to guide clinical decision making. These predictors provide recurrence risk scores along with a measure of uncertainty, usually a confidence interval. The confidence interval conveys random error and not systematic bias. Standard tumor sampling methods make this problematic, as it is common to have a substantial proportion (typically 30-50%) of a tumor sample composed of histologically benign tissue. This "normal" tissue could represent a source of non-random error or systematic bias in genomic classification. To assess the sensitivity of genomic classification to systematic error from normal contamination, we collected 55 tumor samples and paired tumor-adjacent normal tissue. Using genomic signatures from the tumor and paired normal, we evaluated how increasing normal contamination altered recurrence risk scores for various genomic predictors. Simulations of normal tissue contamination caused misclassification of tumors in all predictors evaluated, but different breast cancer predictors showed different types of vulnerability to normal tissue bias. While two predictors had unpredictable direction of bias (either higher or lower risk of relapse resulted from normal contamination), one signature showed a predictable direction of normal tissue effects. Due to this predictable direction of effect, this signature (the PAM50) was adjusted for normal tissue contamination, and these corrections improved sensitivity and negative predictive value. For all three assays, quality control standards and/or appropriate bias adjustment strategies can be used to improve assay reliability. Normal tissue sampled concurrently with tumor is an important source of bias in breast genomic predictors. All genomic predictors show some sensitivity to normal tissue contamination, and ideal strategies for mitigating this bias vary depending upon the particular genes and computational methods used in the predictor.
Cheung, Kei Long; Ten Klooster, Peter M; Smit, Cees; de Vries, Hein; Pieterse, Marcel E
2017-03-23
In public health monitoring of young people, it is critical to understand the effects of selective non-response, in particular when a controversial topic such as substance abuse or sexual behaviour is involved. Research that is dependent upon voluntary subject participation is particularly vulnerable to sampling bias. As respondents whose participation is hardest to elicit on a voluntary basis are also more likely to report risk behaviour, this potentially leads to underestimation of risk factor prevalence. Inviting adolescents to participate in a home-sent postal survey is a typical voluntary recruitment strategy with high non-response, as opposed to mandatory participation during school time. This study examines the extent to which prevalence estimates of adolescent health-related characteristics are biased due to different sampling methods, and whether this also biases within-subject analyses. Cross-sectional datasets collected in 2011 in Twente and IJsselland, two similar and adjacent regions in the Netherlands, were used. In total, 9360 youngsters in a mandatory sample (Twente) and 1952 youngsters in a voluntary sample (IJsselland) participated in the study. To test whether the samples differed on health-related variables, we conducted both univariate and multivariable logistic regression analyses controlling for any demographic difference between the samples. Additional multivariable logistic regressions were conducted to examine moderating effects of sampling method on associations between health-related variables. As expected, females, older individuals, as well as individuals with higher education levels, were over-represented in the voluntary sample, compared to the mandatory sample. Respondents in the voluntary sample tended to smoke less, consume less alcohol (ever, lifetime, and past four weeks), have better mental health, have better subjective health status, have more positive school experiences and have less sexual intercourse than respondents in the mandatory sample. No moderating effects were found for sampling method on associations between variables. This is one of the first studies to provide strong evidence that voluntary recruitment may lead to a strong non-response bias in health-related prevalence estimates in adolescents, as compared to mandatory recruitment. The resulting underestimation in prevalence of health behaviours and well-being measures appeared large, up to a four-fold lower proportion for self-reported alcohol consumption. Correlations between variables, though, appeared to be insensitive to sampling bias.
Comparing interval estimates for small sample ordinal CFA models
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis (CFA) models for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative, were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positively biased than negatively biased; that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected, with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002
Comparing interval estimates for small sample ordinal CFA models.
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis (CFA) models for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative, were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positively biased than negatively biased; that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected, with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.
Density estimation in wildlife surveys
Bart, Jonathan; Droege, Sam; Geissler, Paul E.; Peterjohn, Bruce G.; Ralph, C. John
2004-01-01
Several authors have recently discussed the problems with using index methods to estimate trends in population size. Some have expressed the view that index methods should virtually never be used. Others have responded by defending index methods and questioning whether better alternatives exist. We suggest that index methods are often a cost-effective component of valid wildlife monitoring but that double-sampling or another procedure that corrects for bias or establishes bounds on bias is essential. The common assertion that index methods require constant detection rates for trend estimation is mathematically incorrect; the requirement is no long-term trend in detection "ratios" (index result/parameter of interest), a requirement that is probably approximately met by many well-designed index surveys. We urge that more attention be given to defining bird density rigorously and in ways useful to managers. Once this is done, 4 sources of bias in density estimates may be distinguished: coverage, closure, surplus birds, and detection rates. Distance, double-observer, and removal methods do not reduce bias due to coverage, closure, or surplus birds. These methods may yield unbiased estimates of the number of birds present at the time of the survey, but only if their required assumptions are met, which we doubt occurs very often in practice. Double-sampling, in contrast, produces unbiased density estimates if the plots are randomly selected and estimates on the intensive surveys are unbiased. More work is needed, however, to determine the feasibility of double-sampling in different populations and habitats. We believe the tension that has developed over appropriate survey methods can best be resolved through increased appreciation of the mathematical aspects of indices, especially the effects of bias, and through studies in which candidate methods are evaluated against known numbers determined through intensive surveys.
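A minimal sketch of the double-sampling correction advocated above: an intensive survey on a random subsample calibrates the index, and the estimated detection ratio then corrects the index-only plots. All numbers below are simulated.

```python
import numpy as np

rng = np.random.default_rng(7)

# 200 plots surveyed with a cheap index method; true density unknown.
true_density = rng.gamma(4.0, 2.5, 200)
detect_ratio = 0.6                         # index counts a fraction of birds
index = rng.poisson(detect_ratio * true_density)

# Double sampling: an intensive (assumed unbiased) survey on a random
# subset calibrates the index.
sub = rng.choice(200, 30, replace=False)
intensive = rng.poisson(true_density[sub])
ratio = intensive.mean() / index[sub].mean()

print(f"index-only mean:     {index.mean():.2f}")
print(f"calibrated estimate: {index.mean() * ratio:.2f}")
print(f"true mean density:   {true_density.mean():.2f}")
```

Consistent with the authors' point, the index alone needs no constant detection rate; what matters is that the calibration ratio has no long-term trend.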
Double propensity-score adjustment: A solution to design bias or bias due to incomplete matching.
Austin, Peter C
2017-02-01
Propensity-score matching is frequently used to reduce the effects of confounding when using observational data to estimate the effects of treatments. Matching allows one to estimate the average effect of treatment in the treated. Rosenbaum and Rubin coined the term "bias due to incomplete matching" to describe the bias that can occur when some treated subjects are excluded from the matched sample because no appropriate control subject was available. The presence of incomplete matching raises important questions around the generalizability of estimated treatment effects to the entire population of treated subjects. We describe an analytic solution to address the bias due to incomplete matching. Our method is based on using optimal or nearest neighbor matching, rather than caliper matching (which frequently results in the exclusion of some treated subjects). Within the sample matched on the propensity score, covariate adjustment using the propensity score is then employed to impute missing potential outcomes under lack of treatment for each treated subject. Using Monte Carlo simulations, we found that the proposed method resulted in estimates of treatment effect that were essentially unbiased. This method resulted in decreased bias compared to caliper matching alone and compared to either optimal matching or nearest neighbor matching alone. Caliper matching alone resulted in design bias or bias due to incomplete matching, while optimal matching or nearest neighbor matching alone resulted in bias due to residual confounding. The proposed method also tended to result in estimates with decreased mean squared error compared to when caliper matching was used.
Double propensity-score adjustment: A solution to design bias or bias due to incomplete matching
2016-01-01
Propensity-score matching is frequently used to reduce the effects of confounding when using observational data to estimate the effects of treatments. Matching allows one to estimate the average effect of treatment in the treated. Rosenbaum and Rubin coined the term “bias due to incomplete matching” to describe the bias that can occur when some treated subjects are excluded from the matched sample because no appropriate control subject was available. The presence of incomplete matching raises important questions around the generalizability of estimated treatment effects to the entire population of treated subjects. We describe an analytic solution to address the bias due to incomplete matching. Our method is based on using optimal or nearest neighbor matching, rather than caliper matching (which frequently results in the exclusion of some treated subjects). Within the sample matched on the propensity score, covariate adjustment using the propensity score is then employed to impute missing potential outcomes under lack of treatment for each treated subject. Using Monte Carlo simulations, we found that the proposed method resulted in estimates of treatment effect that were essentially unbiased. This method resulted in decreased bias compared to caliper matching alone and compared to either optimal matching or nearest neighbor matching alone. Caliper matching alone resulted in design bias or bias due to incomplete matching, while optimal matching or nearest neighbor matching alone resulted in bias due to residual confounding. The proposed method also tended to result in estimates with decreased mean squared error compared to when caliper matching was used. PMID:25038071
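A rough sketch of the mechanics described above, on assumed simulated data: propensity scores by logistic regression, 1-nearest-neighbour matching without a caliper so that no treated subject is dropped, then covariate adjustment on the propensity score within the matched sample to impute untreated outcomes. Using the logit of the score as the regressor is my choice for a better linear fit, not a detail from the paper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(8)
n = 4000
X = rng.normal(size=(n, 3))
z = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1]))))
y = X @ np.array([1.0, -1.0, 0.5]) + 2.0 * z + rng.normal(size=n)  # ATT = 2

# Step 1: estimated propensity score and its logit
ps = LogisticRegression(max_iter=1000).fit(X, z).predict_proba(X)[:, 1]
lp = np.log(ps / (1 - ps))

# Step 2: 1-NN matching on the score, no caliper, so every treated
# subject is retained (avoiding bias due to incomplete matching)
nn = NearestNeighbors(n_neighbors=1).fit(lp[z == 0].reshape(-1, 1))
idx = nn.kneighbors(lp[z == 1].reshape(-1, 1))[1].ravel()

# Step 3: within the matched controls, model outcome as a function of the
# score and impute each treated subject's untreated potential outcome
reg = LinearRegression().fit(lp[z == 0][idx].reshape(-1, 1), y[z == 0][idx])
y0_hat = reg.predict(lp[z == 1].reshape(-1, 1))
print("ATT estimate:", (y[z == 1] - y0_hat).mean())
```

The regression step mops up the residual confounding that nearest-neighbour matching alone would leave, which is the "double" adjustment the title refers to.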
Nonparametric and Semiparametric Regression Estimation for Length-biased Survival Data
Shen, Yu; Ning, Jing; Qin, Jing
2016-01-01
For the past several decades, nonparametric and semiparametric modeling for conventional right-censored survival data has been investigated intensively under a noninformative censoring mechanism. However, these methods may not be applicable for analyzing right-censored survival data that arise from prevalent cohorts when the failure times are subject to length-biased sampling. This review article is intended to provide a summary of some newly developed methods as well as established methods for analyzing length-biased data. PMID:27086362
Lean Keng, Soon; AlQudah, Hani Nawaf Ibrahim
2017-02-01
Aim: To raise awareness of critical care nurses' cognitive bias in decision-making, its relationship with leadership styles and its impact on care delivery. Background: The relationship between critical care nurses' decision-making and leadership styles in hospitals has been widely studied, but the influence of cognitive bias on decision-making and leadership styles in critical care environments remains poorly understood, particularly in Jordan. Design: Two-phase mixed-methods sequential explanatory design and grounded theory. Setting: Critical care unit, Prince Hamza Hospital, Jordan. Participant sampling: convenience sampling in Phase 1 (quantitative, n = 96), purposive sampling in Phase 2 (qualitative, n = 20). Methods: Pilot-tested quantitative survey of 96 critical care nurses in 2012; qualitative in-depth interviews, informed by the quantitative results, with 20 critical care nurses in 2013; descriptive and simple linear regression quantitative data analyses; thematic (constant comparative) qualitative data analysis. Results: Quantitative analyses found correlations between rationality and cognitive bias, rationality and task-oriented leadership styles, cognitive bias and democratic communication styles, and cognitive bias and task-oriented leadership styles. Qualitative analysis identified 'being competent', 'organizational structures', 'feeling self-confident' and 'being supported' in the work environment as key factors influencing critical care nurses' cognitive bias in decision-making and leadership styles. Cognitive bias in decision-making and leadership styles had a two-way impact (strengthening and weakening) on critical care nurses' practice performance. Conclusion: There is a need to heighten critical care nurses' consciousness of cognitive bias in decision-making and leadership styles and its impact, and to develop organization-level strategies to increase non-biased decision-making. © 2016 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Oranje, Andreas
2006-01-01
A multitude of methods has been proposed to estimate the sampling variance of ratio estimates in complex samples (Wolter, 1985). Hansen and Tepping (1985) studied some of those variance estimators and found that a high coefficient of variation (CV) of the denominator of a ratio estimate is indicative of a biased estimate of the standard error of a…
Consistent Adjoint Driven Importance Sampling using Space, Energy and Angle
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peplow, Douglas E.; Mosher, Scott W; Evans, Thomas M
2012-08-01
For challenging radiation transport problems, hybrid methods combine the accuracy of Monte Carlo methods with the global information present in deterministic methods. One of the most successful hybrid methods is CADIS (Consistent Adjoint Driven Importance Sampling). This method uses a deterministic adjoint solution to construct a biased source distribution and consistent weight windows to optimize a specific tally in a Monte Carlo calculation. The method has been implemented into transport codes using just the spatial and energy information from the deterministic adjoint and has been used in many applications to compute tallies with much higher figures-of-merit than analog calculations. CADIS also outperforms user-supplied importance values, which usually take long periods of user time to develop. This work extends CADIS to develop weight windows that are a function of the position, energy, and direction of the Monte Carlo particle. Two types of consistent source biasing are presented: one method that biases the source in space and energy while preserving the original directional distribution and one method that biases the source in space, energy, and direction. Seven simple example problems are presented which compare the use of the standard space/energy CADIS with the new space/energy/angle treatments.
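The core CADIS relations can be written compactly: the biased source is q̂ = qφ†/R with R = Σ qφ†, and the consistent weight-window centers are w = R/φ†, so particles are born at exactly their window weight. A schematic one-dimensional sketch under these assumptions (toy numbers, not an actual transport code):

```python
# Sketch of the CADIS source-biasing and weight-window relations on a toy
# 1-D spatial grid; real implementations use multi-dimensional deterministic
# adjoint solutions in space, energy, and (in this work) angle.
import numpy as np

q = np.array([1.0, 0.5, 0.2, 0.0, 0.0])        # true source by cell
adj = np.array([1e-4, 1e-3, 1e-2, 1e-1, 1.0])  # adjoint flux (importance)

R = np.sum(q * adj)                  # estimated detector response
q_biased = q * adj / R               # biased source distribution (sums to 1)
w_target = R / adj                   # consistent weight-window centers

# Particles born from q_biased carry weight q/q_biased = R/adj, which
# matches the weight-window target, so no splitting or rouletting is
# needed at birth -- the "consistency" in CADIS.
print(q_biased, w_target)
```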
Microcoulometric measurement of water in minerals
Cremer, M.; Elsheimer, H.N.; Escher, E.E.
1972-01-01
A DuPont Moisture Analyzer is used in a microcoulometric method for determining water in minerals. Certain modifications, which include the heating of the sample outside the instrument, protect the system from acid gases and ensure the conversion of all hydrogen to water vapor. Moisture analyzer data are compared to concurrent data obtained by a modified Penfield method. In general, there is a positive bias of 0.1 to 0.2% in the moisture analyzer results and a similarity of bias in minerals of the same kind. Inhomogeneity, sample size, and moisture pick-up are invoked to explain deviations. The method is particularly applicable to small samples. © 1972.
Clare, John; McKinney, Shawn T.; DePue, John E.; Loftin, Cynthia S.
2017-01-01
It is common to use multiple field sampling methods when implementing wildlife surveys to compare method efficacy or cost efficiency, integrate distinct pieces of information provided by separate methods, or evaluate method-specific biases and misclassification error. Existing models that combine information from multiple field methods or sampling devices permit rigorous comparison of method-specific detection parameters, enable estimation of additional parameters such as false-positive detection probability, and improve occurrence or abundance estimates, but with the assumption that the separate sampling methods produce detections independently of one another. This assumption is tenuous if methods are paired or deployed in close proximity simultaneously, a common practice that reduces the additional effort required to implement multiple methods and reduces the risk that differences between method-specific detection parameters are confounded by other environmental factors. We develop occupancy and spatial capture–recapture models that permit covariance between the detections produced by different methods, use simulation to compare estimator performance of the new models to models assuming independence, and provide an empirical application based on American marten (Martes americana) surveys using paired remote cameras, hair catches, and snow tracking. Simulation results indicate existing models that assume that methods independently detect organisms produce biased parameter estimates and substantially understate estimate uncertainty when this assumption is violated, while our reformulated models are robust to either methodological independence or covariance. Empirical results suggested that remote cameras and snow tracking had comparable probability of detecting present martens, but that snow tracking also produced false-positive marten detections that could potentially substantially bias distribution estimates if not corrected for. Remote cameras detected marten individuals more readily than passive hair catches. Inability to photographically distinguish individual sex did not appear to induce negative bias in camera density estimates; instead, hair catches appeared to produce detection competition between individuals that may have been a source of negative bias. Our model reformulations broaden the range of circumstances in which analyses incorporating multiple sources of information can be robustly used, and our empirical results demonstrate that using multiple field-methods can enhance inferences regarding ecological parameters of interest and improve understanding of how reliably survey methods sample these parameters.
Carpenter, Danielle; Walker, Susan; Prescott, Natalie; Schalkwijk, Joost; Armour, John Al
2011-08-18
Copy number variation (CNV) contributes to the variation observed between individuals and can influence human disease progression, but the accurate measurement of individual copy numbers is technically challenging. In the work presented here we describe a modification to a previously described paralogue ratio test (PRT) method for genotyping the CCL3L1/CCL4L1 copy variable region, which we use to ascertain CCL3L1/CCL4L1 copy number in 1581 European samples. As the products of CCL3L1 and CCL4L1 potentially play a role in autoimmunity we performed case control association studies with Crohn's disease, rheumatoid arthritis and psoriasis clinical cohorts. We evaluate the PRT methodology used, paying particular attention to accuracy and precision, and highlight the problems of differential bias in copy number measurements. Our PRT methods for measuring copy number were of sufficient precision to detect very slight but systematic differential bias between results from case and control DNA samples in one study. We find no evidence for an association between CCL3L1 copy number and Crohn's disease, rheumatoid arthritis or psoriasis. Differential bias of this small magnitude, but applied systematically across large numbers of samples, would create a serious risk of false positive associations in copy number, if measured using methods of lower precision, or methods relying on single uncorroborated measurements. In this study the small differential bias detected by PRT in one sample set was resolved by a simple pre-treatment by restriction enzyme digestion. PMID:21851606
Internal Standards: A Source of Analytical Bias For Volatile Organic Analyte Determinations
The use of internal standards in the determination of volatile organic compounds as described in SW-846 Method 8260C introduces a potential for bias in results once the internal standards (ISTDs) are added to a sample for analysis. The bias is relative to the dissimilarity betw...
Uses and biases of volunteer water quality data
Loperfido, J.V.; Beyer, P.; Just, C.L.; Schnoor, J.L.
2010-01-01
State water quality monitoring has been augmented by volunteer monitoring programs throughout the United States. Although a significant effort has been put forth by volunteers, questions remain as to whether volunteer data are accurate and can be used by regulators. In this study, typical volunteer water quality measurements from laboratory and environmental samples in Iowa were analyzed for error and bias. Volunteer measurements of nitrate+nitrite were significantly lower (about 2-fold) than concentrations determined via standard methods in both laboratory-prepared and environmental samples. Total reactive phosphorus concentrations analyzed by volunteers were similar to measurements determined via standard methods in laboratory-prepared samples and environmental samples, but were statistically lower than the actual concentration in four of the five laboratory-prepared samples. Volunteer water quality measurements were successful in identifying and classifying most of the waters which violate United States Environmental Protection Agency recommended water quality criteria for total nitrogen (66%) and for total phosphorus (52%) with the accuracy improving when accounting for error and biases in the volunteer data. An understanding of the error and bias in volunteer water quality measurements can allow regulators to incorporate volunteer water quality data into total maximum daily load planning or state water quality reporting. © 2010 American Chemical Society.
A study examining the bias of albumin and albumin/creatinine ratio measurements in urine.
Jacobson, Beryl E; Seccombe, David W; Katayev, Alex; Levin, Adeera
2015-10-01
The objective of the study was to examine the bias of albumin and albumin/creatinine (ACR) measurements in urine. Pools of normal human urine were augmented with purified human serum albumin to generate a series of 12 samples covering the clinical range of interest for the measurement of ACR. Albumin and creatinine concentrations in these samples were analyzed three times on each of 3 days by 24 accredited laboratories in Canada and the USA. Reference values (RV) for albumin measurements were assigned by a liquid chromatography-tandem mass spectrometry (LC-MS/MS) comparative method and gravimetrically. Ten random urine samples (check samples) were analyzed as singlets and albumin and ACR values reported according to the routine practices of each laboratory. Augmented urine pools were shown to be commutable. Gravimetrically assigned target values were corrected for the presence of endogenous albumin using the LC-MS/MS comparative method. There was excellent agreement between the RVs as assigned by these two methods. All laboratory medians demonstrated a negative bias for the measurement of albumin in urine over the concentration range examined. The magnitude of this bias tended to decrease with increasing albumin concentrations. At baseline, only 10% of the patient ACR values met a performance limit of RV ± 15%. This increased to 84% and 86% following post-analytical correction for albumin and creatinine calibration bias, respectively. International organizations should take a leading role in the standardization of albumin measurements in urine. In the interim, accuracy based urine quality control samples may be used by clinical laboratories for monitoring the accuracy of their urinary albumin measurements.
Gatti, M.
2018-02-22
We use numerical simulations to characterize the performance of a clustering-based method to calibrate photometric redshift biases. In particular, we cross-correlate the weak lensing (WL) source galaxies from the Dark Energy Survey Year 1 (DES Y1) sample with redMaGiC galaxies (luminous red galaxies with secure photometric redshifts) to estimate the redshift distribution of the former sample. The recovered redshift distributions are used to calibrate the photometric redshift bias of standard photo-z methods applied to the same source galaxy sample. We also apply the method to three photo-z codes run in our simulated data: Bayesian Photometric Redshift (BPZ), Directional Neighborhood Fitting (DNF), and Random Forest-based photo-z (RF). We characterize the systematic uncertainties of our calibration procedure, and find that these systematic uncertainties dominate our error budget. The dominant systematics are due to our assumption of unevolving bias and clustering across each redshift bin, and to differences between the shapes of the redshift distributions derived by clustering vs photo-z's. The systematic uncertainty in the mean redshift bias of the source galaxy sample is Δz ≲ 0.02, though the precise value depends on the redshift bin under consideration. Here, we discuss possible ways to mitigate the impact of our dominant systematics in future analyses.
Good, Nicholas; Mölter, Anna; Peel, Jennifer L; Volckens, John
2017-07-01
The AE51 micro-Aethalometer (microAeth) is a popular and useful tool for assessing personal exposure to particulate black carbon (BC). However, few users of the AE51 are aware that its measurements are biased low (by up to 70%) due to the accumulation of BC on the filter substrate over time; previous studies of personal black carbon exposure are likely to have suffered from this bias. Although methods to correct for bias in micro-Aethalometer measurements of particulate black carbon have been proposed, these methods have not been verified in the context of personal exposure assessment. Here, five Aethalometer loading correction equations based on published methods were evaluated. Laboratory-generated aerosols of varying black carbon content (ammonium sulfate, Aquadag and NIST diesel particulate matter) were used to assess the performance of these methods. Filters from a personal exposure assessment study were also analyzed to determine how the correction methods performed for real-world samples. Standard correction equations produced correction factors with root mean square errors of 0.10 to 0.13 and mean bias within ±0.10. An optimized correction equation is also presented, along with sampling recommendations for minimizing bias when assessing personal exposure to BC using the AE51 micro-Aethalometer.
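A hedged sketch of what such a loading correction looks like in practice, using the generic linear form BC_corrected = (1 + k·ATN)·BC_raw that several published corrections share; the value of k and the variable names here are illustrative assumptions, not the optimized equation from the study above:

```python
# Sketch: generic linear filter-loading correction for micro-Aethalometer
# black carbon data. k is an assumed, illustrative parameter.
import numpy as np

def correct_loading(bc_raw, atn, k=0.01):
    """Correct micro-Aethalometer BC readings for filter loading.

    bc_raw : measured BC concentrations (ng/m^3)
    atn    : attenuation values logged with each measurement
    k      : empirical loading-correction parameter (assumed here)
    """
    return np.asarray(bc_raw) * (1.0 + k * np.asarray(atn))

bc = np.array([1500.0, 1450.0, 1300.0])   # raw readings, increasingly biased
atn = np.array([10.0, 40.0, 80.0])        # filter darkens over the run
print(correct_loading(bc, atn))
```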
Mixed Model Association with Family-Biased Case-Control Ascertainment.
Hayeck, Tristan J; Loh, Po-Ru; Pollack, Samuela; Gusev, Alexander; Patterson, Nick; Zaitlen, Noah A; Price, Alkes L
2017-01-05
Mixed models have become the tool of choice for genetic association studies; however, standard mixed model methods may be poorly calibrated or underpowered under family sampling bias and/or case-control ascertainment. Previously, we introduced a liability threshold-based mixed model association statistic (LTMLM) to address case-control ascertainment in unrelated samples. Here, we consider family-biased case-control ascertainment, where case and control subjects are ascertained non-randomly with respect to family relatedness. Previous work has shown that this type of ascertainment can severely bias heritability estimates; we show here that it also impacts mixed model association statistics. We introduce a family-based association statistic (LT-Fam) that is robust to this problem. Similar to LTMLM, LT-Fam is computed from posterior mean liabilities (PML) under a liability threshold model; however, LT-Fam uses published narrow-sense heritability estimates to avoid the problem of biased heritability estimation, enabling correct calibration. In simulations with family-biased case-control ascertainment, LT-Fam was correctly calibrated (average χ² = 1.00-1.02 for null SNPs), whereas the Armitage trend test (ATT), standard mixed model association (MLM), and case-control retrospective association test (CARAT) were mis-calibrated (e.g., average χ² = 0.50-1.22 for MLM, 0.89-2.65 for CARAT). LT-Fam also attained higher power than other methods in some settings. In 1,259 type 2 diabetes-affected case subjects and 5,765 control subjects from the CARe cohort, downsampled to induce family-biased ascertainment, LT-Fam was correctly calibrated whereas ATT, MLM, and CARAT were again mis-calibrated. Our results highlight the importance of modeling family sampling bias in case-control datasets with related samples. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
The late Neandertal supraorbital fossils from Vindija Cave, Croatia: a biased sample?
Ahern, James C M; Lee, Sang-Hee; Hawks, John D
2002-09-01
The late Neandertal sample from Vindija (Croatia) has been described as transitional between the earlier Central European Neandertals from Krapina (Croatia) and modern humans. However, the morphological differences indicating this transition may rather be the result of different sex and/or age compositions between the samples. This study tests the hypothesis that the metric differences between the Krapina and Vindija supraorbital samples are due to sampling bias. We focus upon the supraorbital region because past studies have posited this region as particularly indicative of the Vindija sample's transitional nature. Furthermore, the supraorbital region varies significantly with both age and sex. We analyzed four chords and two derived indices of supraorbital torus form as defined by Smith & Ranyard (1980, Am. J. Phys. Anthrop. 93, pp. 589-610). For each variable, we analyzed relative sample bias of the Krapina and Vindija samples using three sampling methods. In order to test the hypothesis that the Vindija sample contains an over-representation of females and/or young while the Krapina sample is normal or also female/young biased, we determined the probability of drawing a sample of the same size as and with a mean equal to or less than Vindija's from a Krapina-based population. In order to test the hypothesis that the Vindija sample is female/young biased while the Krapina sample is male/old biased, we determined the probability of drawing a sample of the same size as and with a mean equal or less than Vindija's from a generated population whose mean is halfway between Krapina's and Vindija's. Finally, in order to test the hypothesis that the Vindija sample is normal while the Krapina sample contains an over-representation of males and/or old, we determined the probability of drawing a sample of the same size as and with a mean equal to or greater than Krapina's from a Vindija-based population. Unless we assume that the Vindija sample is female/young and the Krapina sample is male/old biased, our results falsify the hypothesis that the metric differences between the Krapina and Vindija samples are due to sample bias.
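The sampling tests described above amount to Monte Carlo tail probabilities. A minimal sketch with hypothetical numbers (not the published measurements): draw many samples of Vindija's size from a population parameterized by the Krapina sample and record how often the simulated mean falls at or below Vindija's.

```python
# Sketch: Monte Carlo probability that a Vindija-sized sample drawn from a
# Krapina-based population has a mean <= the observed Vindija mean.
# All measurement values below are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(2)
krapina = np.array([12.1, 13.4, 11.8, 14.0, 12.9, 13.7, 12.5])  # torus chords (hypothetical)
vindija_mean, n_vindija = 11.2, 5                               # hypothetical

# Resample from a normal population parameterized by the Krapina sample.
sims = rng.normal(krapina.mean(), krapina.std(ddof=1),
                  size=(100_000, n_vindija)).mean(axis=1)
p = np.mean(sims <= vindija_mean)
print(f"P(mean <= Vindija | Krapina population) = {p:.3f}")
```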
Estimating Sampling Selection Bias in Human Genetics: A Phenomenological Approach
Risso, Davide; Taglioli, Luca; De Iasio, Sergio; Gueresi, Paola; Alfani, Guido; Nelli, Sergio; Rossi, Paolo; Paoli, Giorgio; Tofanelli, Sergio
2015-01-01
This research is the first empirical attempt to calculate the various components of the hidden bias associated with the sampling strategies routinely-used in human genetics, with special reference to surname-based strategies. We reconstructed surname distributions of 26 Italian communities with different demographic features across the last six centuries (years 1447–2001). The degree of overlapping between "reference founding core" distributions and the distributions obtained from sampling the present day communities by probabilistic and selective methods was quantified under different conditions and models. When taking into account only one individual per surname (low kinship model), the average discrepancy was 59.5%, with a peak of 84% by random sampling. When multiple individuals per surname were considered (high kinship model), the discrepancy decreased by 8–30% at the cost of a larger variance. Criteria aimed at maximizing locally-spread patrilineages and long-term residency appeared to be affected by recent gene flows much more than expected. Selection of the more frequent family names following low kinship criteria proved to be a suitable approach only for historically stable communities. In any other case true random sampling, despite its high variance, did not return more biased estimates than other selective methods. Our results indicate that the sampling of individuals bearing historically documented surnames (founders' method) should be applied, especially when studying the male-specific genome, to prevent an over-stratification of ancient and recent genetic components that heavily biases inferences and statistics. PMID:26452043
Estimating Sampling Selection Bias in Human Genetics: A Phenomenological Approach.
Risso, Davide; Taglioli, Luca; De Iasio, Sergio; Gueresi, Paola; Alfani, Guido; Nelli, Sergio; Rossi, Paolo; Paoli, Giorgio; Tofanelli, Sergio
2015-01-01
This research is the first empirical attempt to calculate the various components of the hidden bias associated with the sampling strategies routinely-used in human genetics, with special reference to surname-based strategies. We reconstructed surname distributions of 26 Italian communities with different demographic features across the last six centuries (years 1447-2001). The degree of overlapping between "reference founding core" distributions and the distributions obtained from sampling the present day communities by probabilistic and selective methods was quantified under different conditions and models. When taking into account only one individual per surname (low kinship model), the average discrepancy was 59.5%, with a peak of 84% by random sampling. When multiple individuals per surname were considered (high kinship model), the discrepancy decreased by 8-30% at the cost of a larger variance. Criteria aimed at maximizing locally-spread patrilineages and long-term residency appeared to be affected by recent gene flows much more than expected. Selection of the more frequent family names following low kinship criteria proved to be a suitable approach only for historically stable communities. In any other case true random sampling, despite its high variance, did not return more biased estimates than other selective methods. Our results indicate that the sampling of individuals bearing historically documented surnames (founders' method) should be applied, especially when studying the male-specific genome, to prevent an over-stratification of ancient and recent genetic components that heavily biases inferences and statistics.
ERIC Educational Resources Information Center
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan
2016-01-01
Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Estimation and applications of size-biased distributions in forestry
Jeffrey H. Gove
2003-01-01
Size-biased distributions arise naturally in several contexts in forestry and ecology. Simple power relationships (e.g. basal area and diameter at breast height) between variables are one such area of interest arising from a modelling perspective. Another, probability proportional to size (PPS) sampling, is found in the most widely used methods for sampling standing or...
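A minimal sketch of the size-biased (PPS) setting, under assumed stand parameters: trees enter the sample with probability proportional to basal area, so the sample follows the size-biased density f*(x) = x f(x)/μ, and inverse-size weights recover population quantities.

```python
# Sketch: PPS sampling of trees by basal area, with inverse-size weights
# (a Hajek-type estimator) recovering the population mean diameter.
# Stand parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
dbh = rng.lognormal(mean=3.0, sigma=0.3, size=10_000)  # stem diameters (cm)
basal_area = np.pi * (dbh / 200.0) ** 2                # m^2 per stem

# PPS draw with replacement: bigger trees are more likely to be selected.
p = basal_area / basal_area.sum()
idx = rng.choice(len(dbh), size=500, p=p)

naive_mean_dbh = dbh[idx].mean()                       # biased upward
w = 1.0 / basal_area[idx]                              # inverse-size weights
corrected_mean_dbh = np.sum(w * dbh[idx]) / np.sum(w)  # ~ population mean
print(f"true {dbh.mean():.1f}, naive {naive_mean_dbh:.1f}, "
      f"corrected {corrected_mean_dbh:.1f}")
```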
ERIC Educational Resources Information Center
Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan
2011-01-01
Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have proposed various weighting methods (Korn & Graubard, 2003; Pfeffermann, Skinner,…
ERIC Educational Resources Information Center
Kim, Soyoung; Olejnik, Stephen
2005-01-01
The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…
Mapping of bird distributions from point count surveys
Sauer, J.R.; Pendleton, G.W.; Orsillo, Sandra; Ralph, C.J.; Sauer, J.R.; Droege, S.
1995-01-01
Maps generated from bird survey data are used for a variety of scientific purposes, but little is known about their bias and precision. We review methods for preparing maps from point count data and appropriate sampling methods for maps based on point counts. Maps based on point counts can be affected by bias associated with incomplete counts, primarily due to changes in proportion counted as a function of observer or habitat differences. Large-scale surveys also generally suffer from regional and temporal variation in sampling intensity. A simulated surface is used to demonstrate sampling principles for maps.
Darbani, Behrooz; Stewart, C Neal; Noeparvar, Shahin; Borg, Søren
2014-10-20
This report investigates, for the first time, cell number as a potential source of inter-treatment bias in gene expression studies. Cell-number bias can affect gene expression analysis when comparing samples with unequal total cellular RNA content or with different RNA extraction efficiencies. For maximal reliability of analysis, therefore, comparisons should be performed at the cellular level. This could be accomplished using an appropriate correction method that can detect and remove the inter-treatment bias in cell number. Based on inter-treatment variations of reference genes, we introduce an analytical approach to examine the suitability of correction methods by considering the inter-treatment bias as well as the inter-replicate variance, which allows use of the best correction method with minimum residual bias. Analyses of RNA sequencing and microarray data showed that the efficiencies of correction methods are influenced by the inter-treatment bias as well as the inter-replicate variance. Therefore, we recommend inspecting both of the bias sources in order to apply the most efficient correction method. As an alternative correction strategy, sequential application of different correction approaches is also advised. Copyright © 2014 Elsevier B.V. All rights reserved.
Hansen, Halvor S; Daura, Xavier; Hünenberger, Philippe H
2010-09-14
A new method, fragment-based local elevation umbrella sampling (FB-LEUS), is proposed to enhance the conformational sampling in explicit-solvent molecular dynamics (MD) simulations of solvated polymers. The method is derived from the local elevation umbrella sampling (LEUS) method [Hansen and Hünenberger, J. Comput. Chem. 2010, 31, 1-23], which combines the local elevation (LE) conformational searching and the umbrella sampling (US) conformational sampling approaches into a single scheme. In LEUS, an initial (relatively short) LE build-up (searching) phase is used to construct an optimized (grid-based) biasing potential within a subspace of conformationally relevant degrees of freedom, which is then frozen and used in a (comparatively longer) US sampling phase. This combination dramatically enhances the sampling power of MD simulations but, due to computational and memory costs, is only applicable to relevant subspaces of low dimensionalities. As an attempt to expand the scope of the LEUS approach to solvated polymers with more than a few relevant degrees of freedom, the FB-LEUS scheme involves an US sampling phase that relies on a superposition of low-dimensionality biasing potentials optimized using LEUS at the fragment level. The feasibility of this approach is tested using polyalanine (poly-Ala) and polyvaline (poly-Val) oligopeptides. Two-dimensional biasing potentials are preoptimized at the monopeptide level, and subsequently applied to all dihedral-angle pairs within oligopeptides of 4, 6, 8, or 10 residues. Two types of fragment-based biasing potentials are distinguished: (i) the basin-filling (BF) potentials act so as to "fill" free-energy basins up to a prescribed free-energy level above the global minimum; (ii) the valley-digging (VD) potentials act so as to "dig" valleys between the (four) free-energy minima of the two-dimensional maps, preserving barriers (relative to linearly interpolated free-energy changes) of a prescribed magnitude. The application of these biasing potentials may lead to an impressive enhancement of the searching power (volume of conformational space visited in a given amount of simulation time). However, this increase is largely offset by a deterioration of the statistical efficiency (representativeness of the biased ensemble in terms of the conformational distribution appropriate for the physical ensemble). As a result, it appears difficult to engineer FB-LEUS schemes representing a significant improvement over plain MD, at least for the systems considered here.
Application of Biased Metropolis Algorithms: From protons to proteins
Bazavov, Alexei; Berg, Bernd A.; Zhou, Huan-Xiang
2015-01-01
We show that sampling with a biased Metropolis scheme is essentially equivalent to using the heatbath algorithm. However, the biased Metropolis method can also be applied when an efficient heatbath algorithm does not exist. This is first illustrated with an example from high energy physics (lattice gauge theory simulations). We then illustrate the Rugged Metropolis method, which is based on a similar biased updating scheme, but aims at very different applications. The goal of such applications is to locate the most likely configurations in a rugged free energy landscape, which is most relevant for simulations of biomolecules. PMID:26612967
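A minimal sketch of the biased-Metropolis idea on a toy three-state variable (illustrative probabilities, not a lattice gauge or protein system): proposals are drawn from an approximate conditional table and accepted with the Metropolis-Hastings ratio; if the table equaled the true conditional, every move would be accepted, reproducing heatbath.

```python
# Sketch: independence Metropolis-Hastings with a biased proposal table.
# When p_tilde == p_true, the acceptance ratio is 1 (the heatbath limit).
import numpy as np

rng = np.random.default_rng(4)
E = np.array([0.0, 1.0, 3.0])          # energies of a 3-state variable
beta = 1.0
p_true = np.exp(-beta * E); p_true /= p_true.sum()

# A crude table approximating the conditional (e.g., from a cheaper model).
p_tilde = np.array([0.5, 0.3, 0.2])

x, counts = 0, np.zeros(3)
for _ in range(100_000):
    y = rng.choice(3, p=p_tilde)       # biased proposal
    accept = min(1.0, (p_true[y] * p_tilde[x]) / (p_true[x] * p_tilde[y]))
    if rng.random() < accept:
        x = y
    counts[x] += 1

print(counts / counts.sum(), p_true)   # empirical ~ target distribution
```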
Influence of item distribution pattern and abundance on efficiency of benthic core sampling
Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.
2014-01-01
Core sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm²), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m²). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
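A minimal re-creation of the simulation design described above, with assumed parameter values: scatter items uniformly in a 1 m² plot, lay down circular cores, and estimate density from the captured counts.

```python
# Sketch: virtual core sampling of randomly distributed benthic items.
# Plot size, density, and core area are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
true_density = 2000                                # items per m^2
items = rng.uniform(0, 1, size=(true_density, 2))  # (x, y) in a 1 m^2 plot

def core_estimate(n_cores, core_area_cm2):
    r = np.sqrt(core_area_cm2 / 1e4 / np.pi)       # core radius in metres
    # Keep core centers inside the plot to avoid edge effects.
    centers = rng.uniform(r, 1 - r, size=(n_cores, 2))
    d = np.linalg.norm(items[None, :, :] - centers[:, None, :], axis=2)
    counts = (d <= r).sum(axis=1)                  # items captured per core
    return counts.mean() / (np.pi * r ** 2)        # estimated items per m^2

for n in (5, 20, 80):                              # precision grows with n
    print(n, round(core_estimate(n, core_area_cm2=45)))
```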
Effects of Sample Selection Bias on the Accuracy of Population Structure and Ancestry Inference
Shringarpure, Suyash; Xing, Eric P.
2014-01-01
Population stratification is an important task in genetic analyses. It provides information about the ancestry of individuals and can be an important confounder in genome-wide association studies. Public genotyping projects have made a large number of datasets available for study. However, practical constraints dictate that of a geographical/ethnic population, only a small number of individuals are genotyped. The resulting data are a sample from the entire population. If the distribution of sample sizes is not representative of the populations being sampled, the accuracy of population stratification analyses of the data could be affected. We attempt to understand the effect of biased sampling on the accuracy of population structure analysis and individual ancestry recovery. We examined two commonly used methods for analyses of such datasets, ADMIXTURE and EIGENSOFT, and found that the accuracy of recovery of population structure is affected to a large extent by the sample used for analysis and how representative it is of the underlying populations. Using simulated data and real genotype data from cattle, we show that sample selection bias can affect the results of population structure analyses. We develop a mathematical framework for sample selection bias in models for population structure and also propose a correction for sample selection bias using auxiliary information about the sample. We demonstrate that such a correction is effective in practice using simulated and real data. PMID:24637351
Rodrigues, João Fabrício Mota; Coelho, Marco Túlio Pacheco
2016-01-01
Sampling biodiversity is an essential step for conservation, and understanding the efficiency of sampling methods allows us to estimate the quality of our biodiversity data. Sex ratio is an important population characteristic, but until now, no study has evaluated how efficiently the sampling methods commonly used in biodiversity surveys estimate the sex ratio of populations. We used a virtual ecologist approach to investigate whether active and passive capture methods are able to accurately sample a population's sex ratio and whether differences in movement pattern and detectability between males and females produce biased estimates of sex ratios when using these methods. Our simulation allowed the recognition of individuals, similar to mark-recapture studies. We found that differences in both movement patterns and detectability between males and females produce biased estimates of sex ratios. However, increasing the sampling effort or the number of sampling days improves the ability of passive or active capture methods to properly sample sex ratio. Thus, prior knowledge regarding movement patterns and detectability for species is important information to guide field studies aiming to understand sex ratio related patterns. PMID:27441554
Correction of sampling bias in a cross-sectional study of post-surgical complications.
Fluss, Ronen; Mandel, Micha; Freedman, Laurence S; Weiss, Inbal Salz; Zohar, Anat Ekka; Haklai, Ziona; Gordon, Ethel-Sherry; Simchen, Elisheva
2013-06-30
Cross-sectional designs are often used to monitor the proportion of infections and other post-surgical complications acquired in hospitals. However, conventional methods for estimating incidence proportions when applied to cross-sectional data may provide estimators that are highly biased, as cross-sectional designs tend to include a high proportion of patients with prolonged hospitalization. One common solution is to use sampling weights in the analysis, which adjust for the sampling bias inherent in a cross-sectional design. The current paper describes in detail a method to build weights for a national survey of post-surgical complications conducted in Israel. We use the weights to estimate the probability of surgical site infections following colon resection, and validate the results of the weighted analysis by comparing them with those obtained from a parallel study with a historically prospective design. Copyright © 2012 John Wiley & Sons, Ltd.
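A minimal simulation of the underlying problem (illustrative parameters, not the Israeli survey data): patients present on a survey day are included with probability roughly proportional to length of stay, so infected patients, who stay longer, are over-represented, and inverse length-of-stay weights approximately restore the true incidence.

```python
# Sketch: length-biased cross-sectional inclusion and its weight-based
# correction. Incidence and stay parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
infected = rng.binomial(1, 0.10, size=n)            # true incidence 10%
stay = rng.exponential(5 + 10 * infected)           # infections prolong stay

# Cross-sectional inclusion: probability proportional to stay length.
included = rng.random(n) < stay / stay.max()
naive = infected[included].mean()                   # biased upward (~0.25)
w = 1.0 / stay[included]                            # inverse length-of-stay
weighted = np.sum(w * infected[included]) / np.sum(w)
print(f"naive {naive:.3f}, weighted {weighted:.3f}")  # weighted ~ 0.10
```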
Comparing four methods to estimate usual intake distributions.
Souverein, O W; Dekkers, A L; Geelen, A; Haubrock, J; de Vries, J H; Ocké, M C; Harttig, U; Boeing, H; van 't Veer, P
2011-07-01
The aim of this paper was to compare methods to estimate usual intake distributions of nutrients and foods. As 'true' usual intake distributions are not known in practice, the comparison was carried out through a simulation study, as well as empirically, by application to data from the European Food Consumption Validation (EFCOVAL) Study in which two 24-h dietary recalls (24-HDRs) and food frequency data were collected. The methods being compared were the Iowa State University Method (ISU), National Cancer Institute Method (NCI), Multiple Source Method (MSM) and Statistical Program for Age-adjusted Dietary Assessment (SPADE). Simulation data were constructed with varying numbers of subjects (n), different values for the Box-Cox transformation parameter (λ(BC)) and different values for the ratio of the within- and between-person variance (r(var)). All data were analyzed with the four different methods and the estimated usual mean intake and selected percentiles were obtained. Moreover, the 2-day within-person mean was estimated as an additional 'method'. These five methods were compared in terms of the mean bias, which was calculated as the mean of the differences between the estimated value and the known true value. The application of data from the EFCOVAL Project included calculations of nutrients (that is, protein, potassium, protein density) and foods (that is, vegetables, fruit and fish). Overall, the mean bias of the ISU, NCI, MSM and SPADE Methods was small. However, for all methods, the mean bias and the variation of the bias increased with smaller sample size, higher variance ratios and with more pronounced departures from normality. Serious mean bias (especially in the 95th percentile) was seen using the NCI Method when r(var) = 9, λ(BC) = 0 and n = 1000. The ISU Method and MSM showed a somewhat higher s.d. of the bias compared with NCI and SPADE Methods, indicating a larger method uncertainty. Furthermore, whereas the ISU, NCI and SPADE Methods produced unimodal density functions by definition, MSM produced distributions with 'peaks', when sample size was small, because of the fact that the population's usual intake distribution was based on estimated individual usual intakes. The application to the EFCOVAL data showed that all estimates of the percentiles and mean were within 5% of each other for the three nutrients analyzed. For vegetables, fruit and fish, the differences were larger than that for nutrients, but overall the sample mean was estimated reasonably. The four methods that were compared seem to provide good estimates of the usual intake distribution of nutrients. Nevertheless, care needs to be taken when a nutrient has a high within-person variation or has a highly skewed distribution, and when the sample size is small. As the methods offer different features, practical reasons may exist to prefer one method over the other.
Mapping ecological systems with a random forest model: tradeoffs between errors and bias
Emilie Grossmann; Janet Ohmann; James Kagan; Heather May; Matthew Gregory
2010-01-01
New methods for predictive vegetation mapping allow improved estimation of plant community composition across large regions. Random Forest (RF) models limit the over-fitting problems of other methods, and are known for making accurate classification predictions from noisy, non-normal data, but can be biased when plot samples are unbalanced. We developed two contrasting...
USDA-ARS?s Scientific Manuscript database
The Karl Fischer Titration (KFT) reference method is specific for water in lint cotton and was designed for samples conditioned to moisture equilibrium, thus limiting its biases. There is a standard method for moisture content – weight loss – by oven drying (OD), just not for equilibrium moisture c...
Field efficiency and bias of snag inventory methods
Robert S. Kenning; Mark J. Ducey; John C. Brissette; Jeffery H. Gove
2005-01-01
Snags and cavity trees are important components of forests, but can be difficult to inventory precisely and are not always included in inventories because of limited resources. We tested the application of N-tree distance sampling as a time-saving snag sampling method and compared N-tree distance sampling to fixed-area sampling and modified horizontal line sampling in...
Sampling considerations for disease surveillance in wildlife populations
Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.
2008-01-01
Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
A new method to measure galaxy bias by combining the density and weak lensing fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pujol, Arnau; Chang, Chihway; Gaztañaga, Enrique
We present a new method to measure redshift-dependent galaxy bias by combining information from the galaxy density field and the weak lensing field. This method is based on the work of Amara et al., who use the galaxy density field to construct a bias-weighted convergence field κg. The main difference between Amara et al.'s work and our new implementation is that here we present another way to measure galaxy bias, using tomography instead of bias parametrizations. The correlation between κg and the true lensing field κ allows us to measure galaxy bias using different zero-lag correlations, such as ⟨κgκ⟩/⟨κκ⟩ or ⟨κgκg⟩/⟨κgκ⟩. Our method measures the linear bias factor on linear scales, under the assumption of no stochasticity between galaxies and matter. We use the Marenostrum Institut de Ciències de l'Espai (MICE) simulation to measure the linear galaxy bias for a flux-limited sample (i < 22.5) in tomographic redshift bins using this method. This article is the first to study the accuracy and systematic uncertainties associated with the implementation of the method and the regime in which it is consistent with the linear galaxy bias defined by projected two-point correlation functions (2PCF). We find that our method is consistent with a linear bias at the per cent level for scales larger than 30 arcmin, while non-linearities appear at smaller scales. This measurement is a good complement to other measurements of bias, since it does not depend strongly on σ8 as do the 2PCF measurements. We will apply this method to the Dark Energy Survey Science Verification data in a follow-up article.
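A schematic toy version of the zero-lag estimator (Gaussian random maps, not MICE or DES data; κ_g is simply modeled as a biased, noisy copy of κ):

```python
# Sketch: recover a linear bias factor from the ratio of zero-lag moments
# <kappa_g * kappa> / <kappa * kappa> on toy Gaussian maps.
import numpy as np

rng = np.random.default_rng(7)
kappa = rng.normal(0.0, 0.02, size=(512, 512))        # "true" convergence map
b_true = 1.4
noise = rng.normal(0.0, 0.01, size=kappa.shape)
kappa_g = b_true * kappa + noise                      # bias-weighted map

b_hat = np.mean(kappa_g * kappa) / np.mean(kappa * kappa)
print(f"recovered bias: {b_hat:.2f}")                 # ~ 1.4
```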
Enhanced Conformational Sampling Using Replica Exchange with Collective-Variable Tempering.
Gil-Ley, Alejandro; Bussi, Giovanni
2015-03-10
The computational study of conformational transitions in RNA and proteins with atomistic molecular dynamics often requires suitable enhanced sampling techniques. We here introduce a novel method where concurrent metadynamics are integrated in a Hamiltonian replica-exchange scheme. The ladder of replicas is built with different strengths of the bias potential exploiting the tunability of well-tempered metadynamics. Using this method, free-energy barriers of individual collective variables are significantly reduced compared with simple force-field scaling. The introduced methodology is flexible and allows adaptive bias potentials to be self-consistently constructed for a large number of simple collective variables, such as distances and dihedral angles. The method is tested on alanine dipeptide and applied to the difficult problem of conformational sampling in a tetranucleotide.
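For orientation, a minimal one-dimensional sketch of the well-tempered deposition rule that the replica ladder above tunes through the bias factor γ; the visited CV values here are random stand-ins for a trajectory that would come from the MD engine:

```python
# Sketch: well-tempered metadynamics hill deposition on a 1-D collective
# variable grid. Gaussian height decays as exp(-V(s)/(kT*(gamma-1))),
# so the bias self-limits; different replicas would use different gamma.
import numpy as np

def deposit(grid, V, s, w0=1.0, sigma=0.35, kT=2.5, gamma=10.0):
    """Add one well-tempered Gaussian hill centered at CV value s."""
    V_s = np.interp(s, grid, V)                  # current bias at s
    h = w0 * np.exp(-V_s / (kT * (gamma - 1.0)))
    return V + h * np.exp(-((grid - s) ** 2) / (2 * sigma ** 2))

grid = np.linspace(-np.pi, np.pi, 200)
V = np.zeros_like(grid)
# Stand-in CV values; in a real run these come from the biased dynamics.
for s in np.random.default_rng(10).uniform(-np.pi, np.pi, 500):
    V = deposit(grid, V, s)
print(round(V.max(), 2))                         # total bias deposited
```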
Geldsetzer, Pascal; Fink, Günther; Vaikath, Maria; Bärnighausen, Till
2018-02-01
(1) To evaluate the operational efficiency of various sampling methods for patient exit interviews; (2) to discuss under what circumstances each method yields an unbiased sample; and (3) to propose a new, operationally efficient, and unbiased sampling method. Literature review, mathematical derivation, and Monte Carlo simulations. Our simulations show that in patient exit interviews it is most operationally efficient if the interviewer, after completing an interview, selects the next patient exiting the clinical consultation. We demonstrate mathematically that this method yields a biased sample: patients who spend a longer time with the clinician are overrepresented. This bias can be removed by selecting the next patient who enters, rather than exits, the consultation room. We show that this sampling method is operationally more efficient than alternative methods (systematic and simple random sampling) in most primary health care settings. Under the assumption that the order in which patients enter the consultation room is unrelated to the length of time spent with the clinician and the interviewer, selecting the next patient entering the consultation room tends to be the operationally most efficient unbiased sampling method for patient exit interviews. © 2016 The Authors. Health Services Research published by Wiley Periodicals, Inc. on behalf of Health Research and Educational Trust.
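A minimal version of the Monte Carlo comparison described above, under assumed clinic parameters (a single clinician seeing patients back to back, exponential consultation lengths): selecting the next patient to exit over-samples long consultations, while selecting the next patient to enter does not.

```python
# Sketch: exit-interview sampling. "exit" picks the patient finishing next
# when the interviewer frees up (length-biased); "entry" picks the next
# patient entering the consultation room (unbiased under the stated
# assumption). Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(8)

def simulate(select, n_patients=2000, interview_time=15.0):
    d = rng.exponential(10.0, size=n_patients)     # consultation lengths (min)
    starts = np.concatenate(([0.0], np.cumsum(d)[:-1]))
    ends = starts + d
    t, picked = 0.0, []
    while True:
        if select == "exit":                       # next patient to finish
            i = int(np.searchsorted(ends, t, side="right"))
        else:                                      # next patient to enter
            i = int(np.searchsorted(starts, t, side="left"))
        if i >= n_patients:
            break
        picked.append(d[i])                        # interviewee's consult length
        t = ends[i] + interview_time               # free again after interview
    return np.mean(picked)

print("exit-selection mean consult :", round(simulate("exit"), 1))   # ~20, biased
print("entry-selection mean consult:", round(simulate("entry"), 1))  # ~10
```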
A novel method for correcting scanline-observational bias of discontinuity orientation
Huang, Lei; Tang, Huiming; Tan, Qinwen; Wang, Dingjian; Wang, Liangqing; Ez Eldin, Mutasim A. M.; Li, Changdong; Wu, Qiong
2016-01-01
Scanline observation is known to introduce an angular bias into the probability distribution of orientation in three-dimensional space. In this paper, numerical solutions expressing the functional relationship between the scanline-observational distribution (in one-dimensional space) and the inherent distribution (in three-dimensional space) are derived using probability theory and calculus under the independence hypothesis of dip direction and dip angle. Based on these solutions, a novel method for obtaining the inherent distribution (also for correcting the bias) is proposed, an approach which includes two procedures: 1) Correcting the cumulative probabilities of orientation according to the solutions, and 2) Determining the distribution of the corrected orientations using approximation methods such as the one-sample Kolmogorov-Smirnov test. The inherent distribution corrected by the proposed method can be used for discrete fracture network (DFN) modelling, which is applied to such areas as rockmass stability evaluation, rockmass permeability analysis, rockmass quality calculation and other related fields. To maximize the correction capacity of the proposed method, the observed sample size is suggested through effectiveness tests for different distribution types, dispersions and sample sizes. The performance of the proposed method and the comparison of its correction capacity with existing methods are illustrated with two case studies. PMID:26961249
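For context, the classical weighting approach to the same bias (Terzaghi's correction, not the distribution-level method proposed in the paper above): a discontinuity whose normal makes angle δ with the scanline is intersected with probability proportional to |cos δ|, so each observed discontinuity is up-weighted by 1/|cos δ|, with a cap to avoid blow-up at near-parallel orientations.

```python
# Sketch: Terzaghi weights for scanline-sampled discontinuity orientations.
# The cap on cos(delta) is a standard practical safeguard.
import numpy as np

def terzaghi_weights(normals, scanline, min_cos=0.1):
    """normals: (n, 3) unit normals of observed discontinuities;
    scanline: (3,) unit vector along the sampling line."""
    cos_d = np.abs(normals @ scanline)
    return 1.0 / np.maximum(cos_d, min_cos)   # cap avoids extreme weights

normals = np.array([[1.0, 0.0, 0.0],          # illustrative unit normals
                    [0.7071, 0.7071, 0.0],
                    [0.1, 0.0, 0.995]])
print(terzaghi_weights(normals, np.array([1.0, 0.0, 0.0])))
```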
A method of bias correction for maximal reliability with dichotomous measures.
Penev, Spiridon; Raykov, Tenko
2010-02-01
This paper is concerned with the reliability of weighted combinations of a given set of dichotomous measures. Maximal reliability for such measures has been discussed in the past, but the pertinent estimator exhibits a considerable bias and mean squared error for moderate sample sizes. We examine this bias, propose a procedure for bias correction, and develop a more accurate asymptotic confidence interval for the resulting estimator. In most empirically relevant cases, the bias correction and mean squared error correction can be performed simultaneously. We propose an approximate (asymptotic) confidence interval for the maximal reliability coefficient, discuss the implementation of this estimator, and investigate the mean squared error of the associated asymptotic approximation. We illustrate the proposed methods using a numerical example.
Is probabilistic bias analysis approximately Bayesian?
MacLehose, Richard F.; Gustafson, Paul
2011-01-01
Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. While the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well. PMID:22157311
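A minimal sketch of the iterative sampling such an analysis performs, with illustrative counts and priors, and drawn here as nondifferential for brevity (a differential analysis would draw separate sensitivity/specificity pairs for case and control subjects):

```python
# Sketch: probabilistic bias analysis for exposure misclassification.
# Draw (sensitivity, specificity) from priors, back-correct the observed
# 2x2 table, and accumulate adjusted odds ratios. Counts are illustrative.
import numpy as np

rng = np.random.default_rng(9)
a, b = 215, 1449    # exposed, unexposed cases (assumed counts)
c, d = 668, 4296    # exposed, unexposed controls (assumed counts)

ors = []
for _ in range(50_000):
    se = rng.uniform(0.75, 0.95)    # prior for sensitivity
    sp = rng.uniform(0.90, 0.99)    # prior for specificity
    # Back-calculate "true" exposed counts: observed = se*A + (1-sp)*(N-A).
    A = (a - (1 - sp) * (a + b)) / (se - (1 - sp))
    C = (c - (1 - sp) * (c + d)) / (se - (1 - sp))
    if 0 < A < a + b and 0 < C < c + d:            # keep admissible draws
        ors.append((A * (c + d - C)) / (C * (a + b - A)))

print(np.percentile(ors, [2.5, 50, 97.5]))          # simulation interval for OR
```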
Latkin, Carl A; Edwards, Catie; Davey-Rothwell, Melissa A; Tobin, Karin E
2017-10-01
Social desirability response bias may lead to inaccurate self-reports and erroneous study conclusions. The present study examined the relationship between social desirability response bias and self-reports of mental health, substance use, and social network factors among a community sample of inner-city substance users. The study was conducted in a sample of 591 opiate and cocaine users in Baltimore, Maryland from 2009 to 2013. Modified items from the Marlowe-Crowne Social Desirability Scale were included in the survey, which was conducted face-to-face and using Audio Computer-Assisted Self-Interview (ACASI) methods. There were highly statistically significant differences in levels of social desirability response bias by levels of depressive symptoms, drug use stigma, physical health status, recent opiate and cocaine use, Alcohol Use Disorders Identification Test (AUDIT) scores, and size of social networks. There were no associations between health service utilization measures and social desirability bias. In multiple logistic regression models, even after including the Center for Epidemiologic Studies Depression Scale (CES-D) as a measure of depressive symptomology, social desirability bias was associated with recent drug use and drug user stigma. Social desirability bias was not associated with enrollment in prior research studies. These findings suggest that social desirability bias is associated with key health measures and that the associations are not primarily due to depressive symptoms. Methods are needed to reduce social desirability bias. Such methods may include the wording and prefacing of questions, clearly defining the role of "study participant," and assessing and addressing motivations for socially desirable responses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas
2014-01-01
Background The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes from all empirical papers, calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion The negative correlation between effect size and sample size, and the biased distribution of p values, indicate pervasive publication bias in the entire field of psychology. PMID:25192357
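The core diagnostic is easy to reproduce. The snippet below is a hedged sketch: it fabricates a set of studies with a built-in small-study effect and shows the negative effect-size/sample-size correlation the authors report; real use would substitute the extracted values.

```python
# Sketch: correlate per-article effect sizes with sample sizes. Synthetic
# data with a small-study effect built in, for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = rng.integers(10, 500, size=1000)                    # sample sizes
es = 0.8 / np.sqrt(n) + rng.normal(0, 0.1, size=1000)   # effect sizes

r, p = stats.pearsonr(es, n)
print(f"r = {r:.2f} (p = {p:.2g})")  # clearly negative, the warning sign
```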
Mapping of Bird Distributions from Point Count Surveys
John R. Sauer; Grey W. Pendleton; Sandra Orsillo
1995-01-01
Maps generated from bird survey data are used for a variety of scientific purposes, but little is known about their bias and precision. We review methods for preparing maps from point count data and appropriate sampling methods for maps based on point counts. Maps based on point counts can be affected by bias associated with incomplete counts, primarily due to changes...
He, Hua; McDermott, Michael P.
2012-01-01
Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified. PMID:21856650
Kang, Leni; Zhang, Shaokai; Zhao, Fanghui; Qiao, Youlin
2014-03-01
To evaluate and adjust for verification bias in screening or diagnostic tests, an inverse-probability weighting method was used to adjust the sensitivity and specificity of the diagnostic tests, with an example from cervical cancer screening used to introduce the CompareTests package in R, with which the method can be implemented. Sensitivity and specificity calculated by the traditional method and by maximum likelihood estimation were compared to the results from the inverse-probability weighting method in the randomly sampled example. The true sensitivity and specificity of the HPV self-sampling test were 83.53% (95% CI: 74.23-89.93) and 85.86% (95% CI: 84.23-87.36). In the analysis of data with randomly missing verification by the gold standard, the sensitivity and specificity calculated by the traditional method were 90.48% (95% CI: 80.74-95.56) and 71.96% (95% CI: 68.71-75.00), respectively. The adjusted sensitivity and specificity under the inverse-probability weighting method were 82.25% (95% CI: 63.11-92.62) and 85.80% (95% CI: 85.09-86.47), respectively, whereas they were 80.13% (95% CI: 66.81-93.46) and 85.80% (95% CI: 84.20-87.41) under the maximum likelihood estimation method. The inverse-probability weighting method can effectively adjust the sensitivity and specificity of a diagnostic test when verification bias exists, especially under complex sampling.
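A minimal sketch of the inverse-probability weighting idea, in Python rather than the CompareTests R package: each verified subject is weighted by the inverse of its estimated probability of verification, so the verified subsample stands in for the full screened cohort. The data-generating numbers are illustrative.

```python
# IPW correction for verification bias: weight verified subjects by
# 1 / P(verified | test result). Simulated data, illustrative parameters.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 20000
disease = rng.binomial(1, 0.10, n)
test = np.where(disease == 1, rng.binomial(1, 0.85, n), rng.binomial(1, 0.14, n))
# Verification depends on the test result, so naive estimates are biased.
verified = rng.binomial(1, np.where(test == 1, 0.90, 0.25), n).astype(bool)

# Estimate P(verified | test) and weight verified subjects by its inverse.
vp = LogisticRegression().fit(test[:, None], verified).predict_proba(test[:, None])[:, 1]
w = 1.0 / vp[verified]
d, t = disease[verified], test[verified]

sens = np.sum(w * ((d == 1) & (t == 1))) / np.sum(w * (d == 1))
spec = np.sum(w * ((d == 0) & (t == 0))) / np.sum(w * (d == 0))
print(f"IPW-adjusted sensitivity {sens:.3f}, specificity {spec:.3f}")
```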
Elwan, Ahmed; Singh, Ranvir; Patterson, Maree; Roygard, Jon; Horne, Dave; Clothier, Brent; Jones, Geoffrey
2018-01-11
Better management of water quality in streams, rivers and lakes requires precise and accurate estimates of different contaminant loads. We assessed four sampling frequencies (2 days, weekly, fortnightly and monthly) and five load calculation methods (global mean (GM), rating curve (RC), ratio estimator (RE), flow-stratified (FS) and flow-weighted (FW)) to quantify loads of nitrate-nitrogen (NO₃⁻-N), soluble inorganic nitrogen (SIN), total nitrogen (TN), dissolved reactive phosphorus (DRP), total phosphorus (TP) and total suspended solids (TSS) in the Manawatu River, New Zealand. The estimated annual river loads were compared to the reference 'true' loads, calculated using daily measurements of flow and water quality from May 2010 to April 2011, to quantify bias (i.e. accuracy) and root mean square error (RMSE) (i.e. accuracy and precision). The GM method resulted in relatively higher RMSE values and a consistent negative bias (i.e. underestimation) in estimates of annual river loads across all sampling frequencies. The RC method resulted in the lowest RMSE for TN, TP and TSS at monthly sampling frequency; yet RC highly overestimated the loads for parameters that showed a dilution effect, such as NO₃⁻-N and SIN. The FW and RE methods gave similar results, and there was no essential improvement in using RE over FW. In general, FW and RE performed better than FS in terms of bias, but FS performed slightly better than FW and RE in terms of RMSE for most of the water quality parameters (DRP, TP, TN and TSS) using a monthly sampling frequency. We found no significant decrease in RMSE values for estimates of NO₃⁻-N, SIN, TN and DRP loads when the sampling frequency was increased from monthly to fortnightly. The bias and RMSE values in estimates of TP and TSS loads (estimated by FW, RE and FS), however, showed a significant decrease in the case of weekly or 2-day sampling. This suggests potential for a higher sampling frequency during flow peaks for more precise and accurate estimates of annual river loads for TP and TSS, in the study river and under other similar conditions.
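As an illustration of one of the better-performing estimators, here is a hedged sketch of a flow-weighted (FW) load calculation: the mean of the sampled instantaneous loads is scaled by the ratio of the annual mean flow to the mean flow on sampling days. All values are synthetic placeholders, not data from the Manawatu study.

```python
# Flow-weighted (FW) annual load estimate from roughly monthly samples.
import numpy as np

rng = np.random.default_rng(5)
daily_flow = rng.lognormal(2.0, 0.8, 365)          # m^3/s, full-year record
sample_days = np.arange(0, 365, 30)                # roughly monthly sampling
conc = 5.0 + 0.4 * daily_flow[sample_days] \
       + rng.normal(0, 0.5, sample_days.size)      # mg/L on sampled days

inst_load = conc * daily_flow[sample_days]         # g/s (mg/L * m^3/s)
fw_mean_load = inst_load.mean() * daily_flow.mean() / daily_flow[sample_days].mean()
annual_tonnes = fw_mean_load * 86400 * 365 / 1e6   # g/s over a year -> tonnes
print(f"FW annual load ≈ {annual_tonnes:.0f} t")
```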
Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies
Theis, Fabian J.
2017-01-01
Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on nonstratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear, especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods that mimic the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits only from the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and the different correction methods perform comparably. We discuss the consequences of inappropriate distribution assumptions and the reasons for the differing behavior of the random forest and other classifiers. In conclusion, we provide guidance for choosing correction methods when training classifiers on biased samples. For random forests, our method outperforms state-of-the-art procedures if distribution assumptions are roughly fulfilled. We provide our implementation in the R package sambia. PMID:29312464
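To make the setting concrete, the sketch below shows the simplest inverse-probability resampling idea on a simulated two-phase case-control design: training data enriched for cases are resampled with weights proportional to the inverse sampling fractions before fitting a random forest. This is a bare-bones illustration of the general principle, not the stochastic or parametric variants implemented in sambia.

```python
# Inverse-probability resampling of a case-enriched training set before
# fitting a classifier. Simulated data, illustrative sampling fractions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(6)
n_pop = 20000
y_pop = rng.binomial(1, 0.02, n_pop)               # 2% population prevalence
x_pop = rng.normal(loc=1.0 * y_pop, size=n_pop)

# Phase-two sampling: keep all cases but only 10% of controls.
keep = (y_pop == 1) | (rng.random(n_pop) < 0.10)
x_tr, y_tr = x_pop[keep], y_pop[keep]

# Resample with inverse-probability weights to mimic the population balance.
p_sel = np.where(y_tr == 1, 1.0, 0.10)             # known sampling fractions
w = (1.0 / p_sel) / np.sum(1.0 / p_sel)
idx = rng.choice(len(y_tr), size=len(y_tr), replace=True, p=w)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(x_tr[idx][:, None], y_tr[idx])
print("mean predicted risk:", clf.predict_proba(x_pop[:, None])[:, 1].mean())
```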
Adaptive-numerical-bias metadynamics.
Khanjari, Neda; Eslami, Hossein; Müller-Plathe, Florian
2017-12-05
A metadynamics scheme is presented in which the free energy surface is filled by progressively adding adaptive biasing potentials, obtained from the accumulated probability distribution of the collective variables. Instead of adding Gaussians of assigned height and width, as in the conventional metadynamics method, we add a more realistic adaptive biasing potential to the Hamiltonian of the system. The shape of the adaptive biasing potential is adjusted on the fly by sampling over the visited states. As the top of the barrier is approached, the biasing potentials become wider. This reduces the problem of trapping the system in niches, which is introduced by the addition of Gaussians of fixed height in metadynamics. Our results for the free-energy profiles of three test systems show that this method is more accurate and converges more quickly than conventional metadynamics, and is quite comparable (in accuracy and convergence rate) with the well-tempered metadynamics method. © 2017 Wiley Periodicals, Inc.
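For contrast with the adaptive scheme described above, a bare-bones conventional metadynamics loop is easy to write down: Gaussians of fixed height and width are deposited along the collective variable, and the negative of the accumulated bias approximates the free-energy profile. This sketch (1D double-well, Metropolis dynamics, illustrative parameters) shows exactly the fixed-Gaussian behaviour the adaptive method is designed to improve on.

```python
# Conventional fixed-Gaussian metadynamics on a 1D double-well (sketch).
import numpy as np

rng = np.random.default_rng(7)
U = lambda s: (s**2 - 1.0)**2                  # double-well potential (kT units)
centers, H, W = [], 0.1, 0.2                   # fixed Gaussian height and width

def bias(s):
    return sum(H * np.exp(-(s - c)**2 / (2 * W**2)) for c in centers)

s = -1.0
for step in range(20000):
    s_new = s + rng.normal(0, 0.05)            # Metropolis move along the CV
    dE = (U(s_new) + bias(s_new)) - (U(s) + bias(s))
    if dE < 0 or rng.random() < np.exp(-dE):
        s = s_new
    if step % 200 == 0:
        centers.append(s)                      # deposit a fixed Gaussian

# The negative of the accumulated bias approximates the free-energy profile.
grid = np.linspace(-1.5, 1.5, 7)
print(np.round([-bias(g) for g in grid], 2))
```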
Bias correction for selecting the minimal-error classifier from many machine learning models.
Ding, Ying; Tang, Shaowu; Liao, Serena G; Jia, Jia; Oesterreich, Steffi; Lin, Yan; Tseng, George C
2014-11-15
Supervised machine learning is commonly applied in genomic research to construct a classifier from the training data that is generalizable to predict independent testing data. When test datasets are not available, cross-validation is commonly used to estimate the error rate. Many machine learning methods are available, and it is well known that no universally best method exists in general. It has been a common practice to apply many machine learning methods and report the method that produces the smallest cross-validation error rate. Theoretically, such a procedure produces a selection bias. Consequently, many clinical studies with moderate sample sizes (e.g. n = 30-60) risk reporting a falsely small cross-validation error rate that could not be validated later in independent cohorts. In this article, we illustrated the probabilistic framework of the problem and explored the statistical and asymptotic properties. We proposed a new bias correction method based on learning curve fitting by inverse power law (IPL) and compared it with three existing methods: nested cross-validation, weighted mean correction and the Tibshirani-Tibshirani procedure. All methods were compared in simulation datasets, five moderate-size real datasets and two large breast cancer datasets. The results showed that IPL outperforms the other methods in bias correction with smaller variance, and it has the additional advantage of extrapolating error estimates for larger sample sizes, a practical feature for recommending whether more samples should be recruited to improve the classifier and accuracy. The R package 'MLbias' and all source files are publicly available at tsenglab.biostat.pitt.edu/software.htm. Contact: ctseng@pitt.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
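The learning-curve element of the IPL correction can be sketched compactly: fit err(n) = a·n^(−b) + c to cross-validation error rates measured at several training-set sizes and extrapolate. The data points below are invented for illustration and are not from the paper.

```python
# Fit an inverse power law to cross-validation error vs. training size,
# then extrapolate to a larger cohort. Illustrative data points.
import numpy as np
from scipy.optimize import curve_fit

n_train = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
cv_err = np.array([0.34, 0.29, 0.26, 0.245, 0.235])

def ipl(n, a, b, c):
    return a * n**(-b) + c                     # inverse power law

(a, b, c), _ = curve_fit(ipl, n_train, cv_err, p0=[1.0, 0.5, 0.1], maxfev=10000)
print(f"extrapolated error at n=120: {ipl(120.0, a, b, c):.3f}")
```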
Enhanced Conformational Sampling Using Replica Exchange with Collective-Variable Tempering
2015-01-01
The computational study of conformational transitions in RNA and proteins with atomistic molecular dynamics often requires suitable enhanced sampling techniques. We here introduce a novel method where concurrent metadynamics are integrated in a Hamiltonian replica-exchange scheme. The ladder of replicas is built with different strengths of the bias potential exploiting the tunability of well-tempered metadynamics. Using this method, free-energy barriers of individual collective variables are significantly reduced compared with simple force-field scaling. The introduced methodology is flexible and allows adaptive bias potentials to be self-consistently constructed for a large number of simple collective variables, such as distances and dihedral angles. The method is tested on alanine dipeptide and applied to the difficult problem of conformational sampling in a tetranucleotide. PMID:25838811
Bias of shear wave elasticity measurements in thin layer samples and a simple correction strategy.
Mo, Jianqiang; Xu, Hao; Qiang, Bo; Giambini, Hugo; Kinnick, Randall; An, Kai-Nan; Chen, Shigao; Luo, Zongping
2016-01-01
Shear wave elastography (SWE) is an emerging technique for measuring biological tissue stiffness. However, the application of SWE in thin layer tissues is limited by bias due to the influence of geometry on measured shear wave speed. In this study, we investigated the bias of Young's modulus measured by SWE in thin layer gelatin-agar phantoms, and compared the results with finite element method and Lamb wave model simulations. The results indicated that the Young's modulus measured by SWE decreased continuously with decreasing sample thickness, and this effect was more pronounced at smaller thicknesses. We proposed a new empirical formula that can conveniently correct the bias without the need for complicated mathematical modeling. In summary, we confirmed the nonlinear relation between thickness and Young's modulus measured by SWE in thin layer samples, and offered a simple and practical correction strategy which is convenient for clinicians to use.
Parametric study of statistical bias in laser Doppler velocimetry
NASA Technical Reports Server (NTRS)
Gould, Richard D.; Stevenson, Warren H.; Thompson, H. Doyle
1989-01-01
Analytical studies have often assumed that LDV velocity bias depends on turbulence intensity in conjunction with one or more characteristic time scales, such as the time between validated signals, the time between data samples, and the integral turbulence time-scale. In the present study, these parameters are varied independently in an effort to quantify the biasing effect. Neither of the post facto correction methods employed is entirely accurate. The mean velocity bias error is found to be nearly independent of data validation rate.
Selection within households in health surveys
Alves, Maria Cecilia Goi Porto; Escuder, Maria Mercedes Loureiro; Claro, Rafael Moreira; da Silva, Nilza Nunes
2014-01-01
OBJECTIVE To compare the efficiency and accuracy of sampling designs including and excluding the sampling of individuals within sampled households in health surveys. METHODS From a population survey conducted in the Baixada Santista Metropolitan Area, São Paulo, Southeastern Brazil, between 2006 and 2007, 1,000 samples were drawn for each design, and estimates for people aged 18 to 59 and 18 and over were calculated for each sample. In the first design, 40 census tracts, 12 households per tract, and one person per household were sampled. In the second, no sampling within the household was performed; 40 census tracts were sampled, with six households per tract for the 18 to 59-year-old group and five or six for the 18-and-over group. Precision and bias of proportion estimates for 11 indicators were assessed in the two final sets of 1,000 selected samples under the two designs. They were compared by means of relative measures: coefficient of variation, bias/mean ratio, bias/standard error ratio, and relative mean square error. Comparison of costs contrasted basic cost per person, household cost, and the numbers of people and households. RESULTS Bias was found to be negligible for both designs. Lower precision and higher costs were found for the design that included sampling of individuals within households. CONCLUSIONS The design excluding individual sampling achieved higher levels of efficiency and accuracy and, accordingly, should be the first choice for investigators. Sampling of household dwellers should be adopted when there are reasons related to the study subject that may lead to bias in individual responses if multiple dwellers answer the proposed questionnaire. PMID:24789641
A double-observer method for reducing bias in faecal pellet surveys of forest ungulates
Jenkins, K.J.; Manly, B.F.J.
2008-01-01
1. Faecal surveys are used widely to study variations in abundance and distribution of forest-dwelling mammals when direct enumeration is not feasible. The utility of faecal indices of abundance is limited, however, by observational bias and variation in faecal disappearance rates that obscure their relationship to population size. We developed methods to reduce variability in faecal surveys and improve reliability of faecal indices. 2. We used double-observer transect sampling to estimate observational bias of faecal surveys of Roosevelt elk Cervus elaphus roosevelti and Columbian black-tailed deer Odocoileus hemionus columbianus in Olympic National Park, Washington, USA. We also modelled differences in counts of faecal groups obtained from paired cleared and uncleared transect segments as a means to adjust standing crop faecal counts for a standard accumulation interval and to reduce bias resulting from variable decay rates. 3. Estimated detection probabilities of faecal groups ranged from <0.2 to 1.0 depending upon the observer, whether the faecal group was from elk or deer, faecal group size, distance of the faecal group from the sampling transect, ground vegetation cover, and the interaction between faecal group size and distance from the transect. 4. Models of plot-clearing effects indicated that standing crop counts of deer faecal groups required 34% reduction on flat terrain and 53% reduction on sloping terrain to represent faeces accumulated over a standard 100-day interval, whereas counts of elk faecal groups required 0% and 46% reductions on flat and sloping terrain, respectively. 5. Synthesis and applications. Double-observer transect sampling provides a cost-effective means of reducing observational bias and variation in faecal decay rates that obscure the interpretation of faecal indices of large mammal abundance. Given the variation we observed in observational bias of faecal surveys and persistence of faeces, we emphasize the need for future researchers to account for these comparatively manageable sources of bias before comparing faecal indices spatially or temporally. Double-observer sampling methods are readily adaptable to study variations in faecal indices of large mammals at the scale of the large forest reserve, natural area, or other forested regions when direct estimation of populations is problematic. © 2008 The Authors.
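The core double-observer bookkeeping reduces to a Lincoln-Petersen-style calculation, sketched below with illustrative counts; the study's actual models additionally condition detection on covariates such as species, group size, distance and cover.

```python
# Double-observer detection estimate (sketch, illustrative counts).
# x1: groups seen by observer 1 only, x2: by observer 2 only, x12: by both.
x1, x2, x12 = 18, 11, 43

p1 = x12 / (x12 + x2)      # P(observer 1 detects | observer 2 detected it)
p2 = x12 / (x12 + x1)      # and vice versa
p_any = 1 - (1 - p1) * (1 - p2)                 # P(detected by at least one)
n_hat = (x1 + x2 + x12) / p_any                 # corrected group count
print(f"p1={p1:.2f}, p2={p2:.2f}, corrected total ≈ {n_hat:.0f}")
```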
Estimation and correction of visibility bias in aerial surveys of wintering ducks
Pearse, A.T.; Gerard, P.D.; Dinsmore, S.J.; Kaminski, R.M.; Reinecke, K.J.
2008-01-01
Incomplete detection of all individuals leading to negative bias in abundance estimates is a pervasive source of error in aerial surveys of wildlife, and correcting that bias is a critical step in improving surveys. We conducted experiments using duck decoys as surrogates for live ducks to estimate bias associated with surveys of wintering ducks in Mississippi, USA. We found detection of decoy groups was related to wetland cover type (open vs. forested), group size (1-100 decoys), and interaction of these variables. Observers who detected decoy groups reported counts that averaged 78% of the decoys actually present, and this counting bias was not influenced by either covariate cited above. We integrated this sightability model into estimation procedures for our sample surveys with weight adjustments derived from probabilities of group detection (estimated by logistic regression) and count bias. To estimate variances of abundance estimates, we used bootstrap resampling of transects included in aerial surveys and data from the bias-correction experiment. When we implemented bias correction procedures on data from a field survey conducted in January 2004, we found bias-corrected estimates of abundance increased 36-42%, and associated standard errors increased 38-55%, depending on species or group estimated. We deemed our method successful for integrating correction of visibility bias in an existing sample survey design for wintering ducks in Mississippi, and we believe this procedure could be implemented in a variety of sampling problems for other locations and species.
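The resulting correction has a simple skeleton: divide each observed group count by the modeled probability that the group was detected at all and by the mean counting fraction for detected groups (78% in the experiment). The logistic coefficients and counts below are hypothetical placeholders, not the fitted values from the study.

```python
# Two-part visibility correction: detection probability x counting fraction.
import numpy as np

COUNT_BIAS = 0.78          # mean fraction of a detected group actually counted
groups = [(35, 1, 1), (120, 0, 2), (15, 0, 1)]  # (count, open wetland?, size class)

def p_detect(open_wetland, size_class):
    """Hypothetical logistic model for the probability a group is seen at all."""
    eta = -0.5 + 1.2 * open_wetland + 0.8 * size_class
    return 1.0 / (1.0 + np.exp(-eta))

estimate = sum(c / (p_detect(o, s) * COUNT_BIAS) for c, o, s in groups)
print(f"bias-corrected abundance ≈ {estimate:.0f}")
```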
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, C.; Pujol, A.; Gaztañaga, E.
We measure the redshift evolution of galaxy bias from a magnitude-limited galaxy sample by combining the galaxy density maps and weak lensing shear maps for a ~116 deg² area of the Dark Energy Survey (DES) Science Verification data. This method was first developed in Amara et al. (2012) and later re-examined in a companion paper (Pujol et al., in prep) with rigorous simulation tests and analytical treatment of tomographic measurements. In this work we apply this method to the DES SV data and measure the galaxy bias for a magnitude-limited galaxy sample. We find the galaxy bias and 1σ error bars in 4 photometric redshift bins to be 1.33±0.18 (z=0.2-0.4), 1.19±0.23 (z=0.4-0.6), 0.99±0.36 (z=0.6-0.8), and 1.66±0.56 (z=0.8-1.0). These measurements are consistent at the 1-2σ level with measurements on the same dataset using galaxy clustering and cross-correlation of galaxies with CMB lensing. In addition, our method provides the only σ8-independent constraint among the three. We forward-model the main observational effects using mock galaxy catalogs by including shape noise, photo-z errors and masking effects. We show that our bias measurement from the data is consistent with that expected from simulations. With the forthcoming full DES data set, we expect this method to provide additional constraints on the galaxy bias measurement from more traditional methods. Furthermore, in the process of our measurement, we build up a 3D mass map that allows further exploration of the dark matter distribution and its relation to galaxy evolution.
The Petersen-Lincoln estimator and its extension to estimate the size of a shared population.
Chao, Anne; Pan, H-Y; Chiang, Shu-Chuan
2008-12-01
The Petersen-Lincoln estimator has been used to estimate the size of a population in a single mark release experiment. However, the estimator is not valid when the capture sample and recapture sample are not independent. We provide an intuitive interpretation for "independence" between samples based on 2 × 2 categorical data formed by capture/non-capture in each of the two samples. From the interpretation, we review a general measure of "dependence" and quantify the correlation bias of the Petersen-Lincoln estimator when two types of dependences (local list dependence and heterogeneity of capture probability) exist. An important implication in the census undercount problem is that instead of using a post enumeration sample to assess the undercount of a census, one should conduct a prior enumeration sample to avoid correlation bias. We extend the Petersen-Lincoln method to the case of two populations. This new estimator of the size of the shared population is proposed and its variance is derived. We discuss a special case where the correlation bias of the proposed estimator due to dependence between samples vanishes. The proposed method is applied to a study of the relapse rate of illicit drug use in Taiwan. © 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
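For reference, the Petersen-Lincoln estimator itself is one line, shown here alongside the small-sample Chapman variant; the counts are illustrative. As the abstract stresses, the estimator is valid only when the capture and recapture samples are independent; otherwise it carries correlation bias.

```python
# Classical Petersen-Lincoln estimate of population size (illustrative counts).
n1, n2, m = 400, 350, 56   # marked, recapture sample size, marked recaptures

petersen = n1 * n2 / m                           # N_hat = n1 * n2 / m
chapman = (n1 + 1) * (n2 + 1) / (m + 1) - 1      # reduced small-sample bias
print(f"Petersen-Lincoln: {petersen:.0f}, Chapman: {chapman:.0f}")
```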
Evaluation of bias and logistics in a survey of adults at increased risk for oral health decrements.
Gilbert, G H; Duncan, R P; Kulley, A M; Coward, R T; Heft, M W
1997-01-01
Designing research to include sufficient respondents in groups at highest risk for oral health decrements can present unique challenges. Our purpose was to evaluate bias and logistics in this survey of adults at increased risk for oral health decrements. We used a telephone survey methodology that employed both listed numbers and random digit dialing to identify dentate persons 45 years old or older and to oversample blacks, poor persons, and residents of nonmetropolitan counties. At a second stage, a subsample of the respondents to the initial telephone screening was selected for further study, which consisted of a baseline in-person interview and a clinical examination. We assessed bias due to: (1) limiting the sample to households with telephones, (2) using predominantly listed numbers instead of random digit dialing, and (3) nonresponse at two stages of data collection. While this approach apparently created some biases in the sample, they were small in magnitude. Specifically, limiting the sample to households with telephones biased the sample overall toward more females, larger households, and fewer functionally impaired persons. Using predominantly listed numbers led to a modest bias toward selection of persons more likely to be younger, healthier, female, have had a recent dental visit, and reside in smaller households. Blacks who were selected randomly at a second stage were more likely to participate in baseline data gathering than their white counterparts. Comparisons of the data obtained in this survey with those from recent national surveys suggest that this methodology for sampling high-risk groups did not substantively bias the sample with respect to two important dental parameters, prevalence of edentulousness and dental care use, nor were conclusions about multivariate associations with dental care recency substantively affected. This method of sampling persons at high risk for oral health decrements resulted in only modest bias with respect to the population of interest.
Comparability of river suspended-sediment sampling and laboratory analysis methods
Groten, Joel T.; Johnson, Gregory D.
2018-03-06
Accurate measurements of suspended sediment, a leading water-quality impairment in many Minnesota rivers, are important for managing and protecting water resources; however, water-quality standards for suspended sediment in Minnesota are based on grab field sampling and total suspended solids (TSS) laboratory analysis methods that have underrepresented concentrations of suspended sediment in rivers compared to U.S. Geological Survey equal-width-increment or equal-discharge-increment (EWDI) field sampling and suspended sediment concentration (SSC) laboratory analysis methods. Because of this underrepresentation, the U.S. Geological Survey, in collaboration with the Minnesota Pollution Control Agency, collected concurrent grab and EWDI samples at eight sites to compare results obtained using different combinations of field sampling and laboratory analysis methods. Study results determined that grab field sampling and TSS laboratory analysis results were biased substantially low compared to EWDI sampling and SSC laboratory analysis results, respectively; differences in both the field sampling and the laboratory analysis methods contributed to this low bias, with the laboratory analysis difference slightly greater than the field sampling difference. Sand-sized particles had a strong effect on the comparability of the field sampling and laboratory analysis methods. These results indicated that grab field sampling and TSS laboratory analysis methods fail to capture most of the sand being transported by the stream, and that the difference is smaller between grab samples analyzed for TSS and the fine-sediment fraction of SSC. Even though differences are present, the strong correlations between SSC and TSS concentrations provide the opportunity to develop site-specific relations to address transport processes not captured by grab field sampling and TSS laboratory analysis methods.
Onsite Gaseous Centrifuge Enrichment Plant UF6 Cylinder Destructive Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anheier, Norman C.; Cannon, Bret D.; Qiao, Hong
2012-07-17
The IAEA safeguards approach for gaseous centrifuge enrichment plants (GCEPs) includes measurements of gross, partial, and bias defects in a statistical sampling plan. These safeguard methods consist principally of mass and enrichment nondestructive assay (NDA) verification. Destructive assay (DA) samples are collected from a limited number of cylinders for high precision offsite mass spectrometer analysis. DA is typically used to quantify bias defects in the GCEP material balance. Under current safeguards measures, the operator collects a DA sample from a sample tap following homogenization. The sample is collected in a small UF6 sample bottle, then sealed and shipped under IAEA chain of custody to an offsite analytical laboratory. Current practice is expensive and resource intensive. We propose a new and novel approach for performing onsite gaseous UF6 DA analysis that provides rapid and accurate assessment of enrichment bias defects. DA samples are collected using a custom sampling device attached to a conventional sample tap. A few micrograms of gaseous UF6 is chemically adsorbed onto a sampling coupon in a matter of minutes. The collected DA sample is then analyzed onsite using Laser Ablation Absorption Ratio Spectrometry-Destructive Assay (LAARS-DA). DA results are determined in a matter of minutes at sufficient accuracy to support reliable bias defect conclusions, while greatly reducing DA sample volume, analysis time, and cost.
Improving power and robustness for detecting genetic association with extreme-value sampling design.
Chen, Hua Yun; Li, Mingyao
2011-12-01
Extreme-value sampling design that samples subjects with extremely large or small quantitative trait values is commonly used in genetic association studies. Samples in such designs are often treated as "cases" and "controls" and analyzed using logistic regression. Such a case-control analysis ignores the potential dose-response relationship between the quantitative trait and the underlying trait locus and thus may lead to loss of power in detecting genetic association. An alternative approach to analyzing such data is to model the dose-response relationship by a linear regression model. However, parameter estimation from this model can be biased, which may lead to inflated type I errors. We propose a robust and efficient approach that takes into consideration both the biased sampling design and the potential dose-response relationship. Extensive simulations demonstrate that the proposed method is more powerful than the traditional logistic regression analysis and is more robust than the linear regression analysis. We applied our method to the analysis of a candidate gene association study on high-density lipoprotein cholesterol (HDL-C) which includes study subjects with extremely high or low HDL-C levels. Using our method, we identified several SNPs showing a stronger evidence of association with HDL-C than the traditional case-control logistic regression analysis. Our results suggest that it is important to appropriately model the quantitative traits and to adjust for the biased sampling when dose-response relationship exists in extreme-value sampling designs. © 2011 Wiley Periodicals, Inc.
Olives, Casey; Valadez, Joseph J; Pagano, Marcello
2014-03-01
To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduce the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.
Model Reduction via Principal Component Analysis and Markov Chain Monte Carlo (MCMC) Methods
NASA Astrophysics Data System (ADS)
Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.
2011-12-01
Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on the parameterization and the problem undertaken. This makes inverse estimation and uncertainty quantification very challenging, especially for problems in two- or three-dimensional spatial domains. Model reduction techniques have the potential to mitigate the curse of dimensionality by reducing the total number of unknowns while describing the complex subsurface systems adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA methods with geometric sampling (referred to as 'Method 1'), (2) PCA methods with MCMC sampling (referred to as 'Method 2'), and (3) PCA methods with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters, as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with added noise and inverted them under two different situations: (1) the noisy data and the covariance matrix for PCA analysis are consistent (referred to as the unbiased case), and (2) the noisy data and the covariance matrix are inconsistent (referred to as the biased case). In the unbiased case, comparison between the analytical solutions and the inversion results shows that all three methods provide good estimates of the true values, and Method 1 is computationally more efficient. In terms of uncertainty quantification, Method 1 performs poorly because of the relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to the inclusion of random effects. However, in the biased case, only Method 3 correctly estimates all the unknown parameters; Methods 1 and 2 provide wrong values for the biased parameters. The synthetic case study demonstrates that if the covariance matrix for PCA analysis is inconsistent with the true models, the PCA methods with geometric or MCMC sampling will provide incorrect estimates.
Morrison, Christopher; Lee, Juliet P.; Gruenewald, Paul J.; Marzell, Miesha
2015-01-01
Location-based sampling is a method to obtain samples of people within ecological contexts relevant to specific public health outcomes. Random selection increases generalizability; however, in some circumstances (such as surveying bar patrons) recruitment conditions increase the risk of sample bias. We attempted to recruit representative samples of bars and patrons in six California cities, but low response rates precluded meaningful analysis. A systematic review of 24 similar studies revealed that none addressed the key shortcomings of our study. We recommend steps to improve studies that use location-based sampling: (i) purposively sample places of interest, (ii) utilize recruitment strategies appropriate to the environment, and (iii) provide full information on response rates at all levels of sampling. PMID:26574657
Curuksu, Jeremy; Zacharias, Martin
2009-03-14
Although molecular dynamics (MD) simulations have been applied frequently to study flexible molecules, the sampling of conformational states separated by barriers is limited due to currently possible simulation time scales. Replica-exchange (Rex)MD simulations that allow for exchanges between simulations performed at different temperatures (T-RexMD) can achieve improved conformational sampling. However, in the case of T-RexMD the computational demand grows rapidly with system size. A Hamiltonian RexMD method that specifically enhances coupled dihedral angle transitions has been developed. The method employs added biasing potentials as replica parameters that destabilize available dihedral substates and was applied to study coupled dihedral transitions in nucleic acid molecules. The biasing potentials can be either fixed at the beginning of the simulation or optimized during an equilibration phase. The method was extensively tested and compared to conventional MD simulations and T-RexMD simulations on an adenine dinucleotide system and on a DNA abasic site. The biasing potential RexMD method showed improved sampling of conformational substates compared to conventional MD simulations similar to T-RexMD simulations but at a fraction of the computational demand. It is well suited to study systematically the fine structure and dynamics of large nucleic acids under realistic conditions including explicit solvent and ions and can be easily extended to other types of molecules.
Design, analysis, and interpretation of field quality-control data for water-sampling projects
Mueller, David K.; Schertz, Terry L.; Martin, Jeffrey D.; Sandstrom, Mark W.
2015-01-01
The report provides extensive information about statistical methods used to analyze quality-control data in order to estimate potential bias and variability in environmental data. These methods include construction of confidence intervals on various statistical measures, such as the mean, percentiles and percentages, and standard deviation. The methods are used to compare quality-control results with the larger set of environmental data in order to determine whether the effects of bias and variability might interfere with interpretation of these data. Examples from published reports are presented to illustrate how the methods are applied, how bias and variability are reported, and how the interpretation of environmental data can be qualified based on the quality-control analysis.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
Length-biased sampling has been well recognized in economics, industrial reliability, etiology, epidemiology, genetics and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable to length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating the nonparametric distribution function, estimating the nonparametric hazard function under an increasing failure rate constraint, and jointly estimating the baseline hazard function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
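A toy example makes the length-bias mechanics tangible: when the probability of sampling a subject is proportional to its duration, the naive mean is inflated, and a simple inverse-length (harmonic-mean) reweighting recovers the population mean. This illustrates only the bias itself, not the paper's EM machinery for censored data.

```python
# Length-biased sampling: naive mean inflated, 1/x-weighted mean recovers it.
import numpy as np

rng = np.random.default_rng(12)
pop = rng.exponential(2.0, 100000)                       # true mean is 2.0
sample = rng.choice(pop, size=5000, p=pop / pop.sum())   # length-biased draw

naive = sample.mean()                          # inflated (about 4.0 here)
corrected = 1.0 / np.mean(1.0 / sample)        # harmonic-mean correction
print(f"naive={naive:.2f}, corrected={corrected:.2f}")
```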
Yamanis, Thespina J.; Merli, M. Giovanna; Neely, William Whipple; Tian, Felicia Feng; Moody, James; Tu, Xiaowen; Gao, Ersheng
2013-01-01
Respondent-driven sampling (RDS) is a method for recruiting “hidden” populations through a network-based, chain and peer referral process. RDS recruits hidden populations more effectively than other sampling methods and promises to generate unbiased estimates of their characteristics. RDS’s faithful representation of hidden populations relies on the validity of core assumptions regarding the unobserved referral process. With empirical recruitment data from an RDS study of female sex workers (FSWs) in Shanghai, we assess the RDS assumption that participants recruit nonpreferentially from among their network alters. We also present a bootstrap method for constructing the confidence intervals around RDS estimates. This approach uniquely incorporates real-world features of the population under study (e.g., the sample’s observed branching structure). We then extend this approach to approximate the distribution of RDS estimates under various peer recruitment scenarios consistent with the data as a means to quantify the impact of recruitment bias and of rejection bias on the RDS estimates. We find that the hierarchical social organization of FSWs leads to recruitment biases by constraining RDS recruitment across social classes and introducing bias in the RDS estimates. PMID:24288418
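For context, the standard RDS-II estimator that such studies scrutinize weights each respondent by the reciprocal of her reported network size; it is unbiased only under assumptions like the nonpreferential recruitment tested above. A hedged sketch with synthetic data:

```python
# RDS-II (inverse-degree-weighted) prevalence estimate on synthetic data.
import numpy as np

rng = np.random.default_rng(13)
degree = rng.integers(1, 30, size=500)        # reported personal network sizes
trait = rng.binomial(1, 0.3, size=500)        # characteristic of interest

w = 1.0 / degree
rds_ii = np.sum(w * trait) / np.sum(w)
print(f"RDS-II prevalence estimate: {rds_ii:.3f}")
```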
Lead burdens and behavioral impairments of the lined shore crab Pachygrapsus crassipes
Hui, Clifford A.
2002-01-01
Unless a correction is made, population estimates derived from a sample of belt transects will be biased if a fraction of the individuals on the sample transects is not counted. An approach useful for correcting this bias when sampling immotile populations using transects of a fixed width is presented. The method assumes that a searcher's ability to find objects near the center of the transect is nearly perfect. The method utilizes a mathematical equation, estimated from the data, to represent the searcher's inability to find all objects at increasing distances from the center of the transect. An example of the analysis of data, formation of the equation, and application is presented using waterfowl nesting data collected in Colorado.
Brooks, M.H.; Schroder, L.J.; Malo, B.A.
1985-01-01
Four laboratories were evaluated in their analysis of identical natural and simulated precipitation water samples. Interlaboratory comparability was evaluated using analysis of variance coupled with Duncan's multiple range test, and linear-regression models describing the relations between individual laboratory analytical results for natural precipitation samples. Results of the statistical analyses indicate that certain pairs of laboratories produce different results when analyzing identical samples. Analyte bias for each laboratory was examined using analysis of variance coupled with Duncan's multiple range test on data produced by the laboratories from the analysis of identical simulated precipitation samples. Bias for a given analyte produced by a single laboratory has been indicated when the laboratory mean for that analyte is shown to be significantly different from the mean for the most-probable analyte concentrations in the simulated precipitation samples. Ion-chromatographic methods for the determination of chloride, nitrate, and sulfate have been compared with the colorimetric methods that were also in use during the study period. Comparisons were made using analysis of variance coupled with Duncan's multiple range test for means produced by the two methods. Analyte precision for each laboratory has been estimated by calculating a pooled variance for each analyte. Analyte estimated precisions have been compared using F-tests and differences in analyte precisions for laboratory pairs have been reported. (USGS)
NASA Astrophysics Data System (ADS)
Bieler, Noah S.; Hünenberger, Philippe H.
2014-11-01
In a recent article [Bieler et al., J. Chem. Theory Comput. 10, 3006-3022 (2014)], we introduced a combination of the λ-dynamics (λD) approach for calculating alchemical free-energy differences and of the local-elevation umbrella-sampling (LEUS) memory-based biasing method to enhance the sampling along the alchemical coordinate. The combined scheme, referred to as λ-LEUS, was applied to the perturbation of hydroquinone to benzene in water as a test system, and found to represent an improvement over thermodynamic integration (TI) in terms of sampling efficiency at equivalent accuracy. However, the preoptimization of the biasing potential required in the λ-LEUS method requires "filling up" all the basins in the potential of mean force. This introduces a non-productive pre-sampling time that is system-dependent, and generally exceeds the corresponding equilibration time in a TI calculation. In this letter, a remedy is proposed to this problem, termed the slow growth memory guessing (SGMG) approach. Instead of initializing the biasing potential to zero at the start of the preoptimization, an approximate potential of mean force is estimated from a short slow growth calculation, and its negative used to construct the initial memory. Considering the same test system as in the preceding article, it is shown that the application of SGMG in λ-LEUS permits a reduction of the preoptimization time by about a factor of four.
Validity of mail survey data on bagged waterfowl
Atwood, E.L.
1956-01-01
Knowledge of the pattern of occurrence and characteristics of response errors obtained during an investigation of the validity of post-season surveys of hunters was used to advantage to devise a two-step method for removing the response-bias errors from the raw survey data. The method was tested on data with known errors and found to have a high efficiency in reducing the effect of response-bias errors. The development of this method for removing the effect of the response-bias errors, and its application to post-season hunter-take survey data, increased the reliability of the data from below the point of practical management significance up to the approximate reliability limits corresponding to the sampling errors.
Determination of Protein Content by NIR Spectroscopy in Protein Powder Mix Products.
Ingle, Prashant D; Christian, Roney; Purohit, Piyush; Zarraga, Veronica; Handley, Erica; Freel, Keith; Abdo, Saleem
2016-01-01
Protein is a principal component in commonly used dietary supplements and health food products. The analysis of these products, within the consumer package form, is of critical importance for the purpose of ensuring quality and supporting label claims. A rapid test method was developed using near-infrared (NIR) spectroscopy as a complement to current protein determination by the Dumas combustion method. The NIR method was found to be a rapid, low-cost, and green (no use of chemicals and reagents) complementary technique. The protein powder samples analyzed in this study were in the range of 22-90% protein. The samples were prepared as mixtures of soy protein, whey protein, and silicon dioxide ingredients, which are common in commercially sold protein powder drink-mix products in the market. A NIR regression model was developed with 17 samples within the constituent range and was validated with 20 independent samples of known protein levels (85-88%). The results show that the NIR method is capable of predicting the protein content with a bias of ±2% and a maximum bias of 3% between NIR and the external Dumas method.
Active Search on Carcasses versus Pitfall Traps: a Comparison of Sampling Methods.
Zanetti, N I; Camina, R; Visciarelli, E C; Centeno, N D
2016-04-01
The study of insect succession in cadavers and the classification of arthropods have mostly been done by placing a carcass in a cage, protected from vertebrate scavengers, which is then visited periodically. An alternative is to use specific traps. Few studies on carrion ecology and forensic entomology involving the carcasses of large vertebrates have employed pitfall traps. The aims of this study were to compare both sampling methods (active search on a carcass and pitfall trapping) for each coleopteran family, and to establish whether there is a discrepancy (underestimation and/or overestimation) in the presence of each family by either method. A great discrepancy was found for almost all families, with some of them being more abundant in samples obtained through active search on carcasses and others in samples from traps, whereas two families did not show any bias towards a given sampling method. The fact that families may be underestimated or overestimated by the type of sampling technique highlights the importance of combining both methods, active search on carcasses and pitfall traps, in order to obtain more complete information on decomposition, carrion habitat and cadaveric families or species. Furthermore, a hypothesis is advanced on the reasons why either sampling method underestimates, and is thus biased against, certain families. Information is also provided on which sampling technique is more appropriate for detecting a particular family.
Ma, Jianzhong; Amos, Christopher I; Warwick Daw, E
2007-09-01
Although extended pedigrees are often sampled through probands with extreme levels of a quantitative trait, Markov chain Monte Carlo (MCMC) methods for segregation and linkage analysis have not been able to perform ascertainment corrections. Further, the extent to which ascertainment of pedigrees leads to biases in the estimation of segregation and linkage parameters has not been previously studied for MCMC procedures. In this paper, we studied these issues with a Bayesian MCMC approach for joint segregation and linkage analysis, as implemented in the package Loki. We first simulated pedigrees ascertained through individuals with extreme values of a quantitative trait, in the spirit of the sequential sampling theory of Cannings and Thompson [Cannings and Thompson [1977] Clin. Genet. 12:208-212]. Using our simulated data, we detected no bias in estimates of the trait locus location. However, in addition to allele frequencies, when the ascertainment threshold was higher than or close to the true value of the highest genotypic mean, bias was also found in the estimation of this parameter. When there were multiple trait loci, this bias destroyed the additivity of the effects of the trait loci and caused biases in the estimation of all genotypic means when a purely additive model was used for analyzing the data. To account for pedigree ascertainment with sequential sampling, we developed a Bayesian ascertainment approach and implemented Metropolis-Hastings updates in the MCMC samplers used in Loki. Ascertainment correction greatly reduced biases in parameter estimates. Our method is designed for multiple, but a fixed number of, trait loci. Copyright © 2007 Wiley-Liss, Inc.
Clerkin, Elise M.; Magee, Joshua C.; Wells, Tony T.; Beard, Courtney; Barnett, Nancy P.
2016-01-01
Objective Attention biases may be an important treatment target for both alcohol dependence and social anxiety. This is the first attention bias modification (ABM) trial to investigate two (vs. one) targets of attention bias within a sample with co-occurring symptoms of social anxiety and alcohol dependence. Additionally, we used trial-level bias scores (TL-BS) to capture the phenomena of attention bias in a more ecologically valid, dynamic way compared to traditional attention bias scores. Method Adult participants (N=86; 41% Female; 52% African American; 40% White) with elevated social anxiety symptoms and alcohol dependence were randomly assigned to an 8-session training condition in this 2 (Social Anxiety ABM vs. Social Anxiety Control) by 2 (Alcohol ABM vs. Alcohol Control) design. Symptoms of social anxiety, alcohol dependence, and attention bias were assessed across time. Results Multilevel models estimated the trajectories for each measure within individuals, and tested whether these trajectories differed according to the randomized training conditions. Across time, there were significant or trending decreases in all attention TL-BS parameters (but not traditional attention bias scores) and most symptom measures. However, there were not significant differences in the trajectories of change between any ABM and control conditions for any symptom measures. Conclusions These findings add to previous evidence questioning the robustness of ABM and point to the need to extend the effects of ABM to samples that are racially diverse and/or have co-occurring psychopathology. The results also illustrate the potential importance of calculating trial-level attention bias scores rather than only including traditional bias scores. PMID:27591918
Shape measurement biases from underfitting and ellipticity gradients
Bernstein, Gary M.
2010-08-21
Precision weak gravitational lensing experiments require measurements of galaxy shapes accurate to <1 part in 1000. We investigate measurement biases, noted by Voigt and Bridle (2009) and Melchior et al. (2009), that are common to shape measurement methodologies that rely upon fitting elliptical-isophote galaxy models to observed data. The first bias arises when the true galaxy shapes do not match the models being fit. We show that this "underfitting bias" is due, at root, to these methods' attempts to use information at high spatial frequencies that has been destroyed by the convolution with the point-spread function (PSF) and/or by sampling. We propose a new shape-measurement technique that is explicitly confined to observable regions of k-space. A second bias arises for galaxies whose ellipticity varies with radius. For most shape-measurement methods, such galaxies are subject to "ellipticity gradient bias". We show how to reduce such biases by factors of 20–100 within the new shape-measurement method. The resulting shear estimator has multiplicative errors <1 part in 10³ for high-S/N images, even for highly asymmetric galaxies. Without any training or recalibration, the new method obtains Q = 3000 in the GREAT08 Challenge of blind shear reconstruction on low-noise galaxies, several times better than any previous method.
Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference
Karcher, Michael D.; Palacios, Julia A.; Bedford, Trevor; Suchard, Marc A.; Minin, Vladimir N.
2016-01-01
Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task formulates an observed sequence data likelihood by exploiting a coalescent model for the sampled individuals' genealogy and then integrates over all possible genealogies via Monte Carlo or, less efficiently, conditions on one genealogy estimated from the sequence data. However, when analyzing sequences sampled serially through time, current methods implicitly assume either that sampling times are fixed deterministically by the data collection protocol or that their distribution does not depend on the size of the population. Through simulation, we first show that, when sampling times do probabilistically depend on effective population size, estimation methods may be systematically biased. To correct for this deficiency, we propose a new model that explicitly accounts for preferential sampling by modeling the sampling times as an inhomogeneous Poisson process dependent on effective population size. We demonstrate that in the presence of preferential sampling our new model not only reduces bias, but also improves estimation precision. Finally, we compare the performance of the currently used phylodynamic methods with our proposed model through clinically relevant, seasonal human influenza examples. PMID:26938243
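The preferential-sampling model described above is easy to simulate: draw sampling times from an inhomogeneous Poisson process whose intensity is proportional to effective population size, here via thinning. A minimal sketch; the trajectory Ne(t) and the proportionality constant beta are invented for illustration.

import numpy as np

rng = np.random.default_rng(1)

def Ne(t):                                # hypothetical effective population size
    return 50 + 40 * np.sin(2 * np.pi * t)

beta, T = 2.0, 5.0                        # sampling intensity per unit Ne; time horizon
lam_max = beta * 90                       # any upper bound on beta * Ne(t)

# Thinning: homogeneous candidates at rate lam_max, kept with prob beta*Ne(t)/lam_max.
n_cand = rng.poisson(lam_max * T)
cand = rng.uniform(0, T, n_cand)
times = np.sort(cand[rng.random(n_cand) < beta * Ne(cand) / lam_max])
print(times.size, "samples; sampling times cluster where Ne(t) is large")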
Comparison of estimators of standard deviation for hydrologic time series
Tasker, Gary D.; Gilroy, Edward J.
1982-01-01
Unbiasing factors as a function of serial correlation, ρ, and sample size, n, for the sample standard deviation of a lag one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ which were much less biased but had greater mean square errors than the usual estimate of σ, s = [∑(xᵢ − x̄)² / (n − 1)]^(1/2). The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
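Such unbiasing factors can be reproduced by simulation. A sketch for a single (ρ, n) pair, assuming a stationary lag one autoregressive process with unit marginal variance (values illustrative):

import numpy as np

rng = np.random.default_rng(2)

def ar1(n, rho):
    # stationary AR(1) with unit marginal variance
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for i in range(1, n):
        x[i] = rho * x[i - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()
    return x

n, rho, reps = 30, 0.6, 20000
s = np.array([ar1(n, rho).std(ddof=1) for _ in range(reps)])
print("E[s] =", s.mean())                        # < 1: s underestimates sigma
print("unbiasing factor =", 1.0 / s.mean())      # multiply s by this to correct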
Kunz, Cornelia U; Stallard, Nigel; Parsons, Nicholas; Todd, Susan; Friede, Tim
2017-03-01
Regulatory authorities require that the sample size of a confirmatory trial is calculated prior to the start of the trial. However, the sample size quite often depends on parameters that might not be known in advance of the study. Misspecification of these parameters can lead to under- or overestimation of the sample size. Both situations are unfavourable as the first one decreases the power and the latter one leads to a waste of resources. Hence, designs have been suggested that allow a re-assessment of the sample size in an ongoing trial. These methods usually focus on estimating the variance. However, for some methods the performance depends not only on the variance but also on the correlation between measurements. We develop and compare different methods for blinded estimation of the correlation coefficient that are less likely to introduce operational bias when the blinding is maintained. Their performance with respect to bias and standard error is compared to the unblinded estimator. We simulated two different settings: one assuming that all group means are the same and one assuming that different groups have different means. Simulation results show that the naïve (one-sample) estimator is only slightly biased and has a standard error comparable to that of the unblinded estimator. However, if the group means differ, other estimators have better performance depending on the sample size per group and the number of groups. © 2016 The Authors. Biometrical Journal Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
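The contrast between the blinded one-sample estimator and the unblinded estimator can be sketched in a few lines: the naïve estimator pools all observations and ignores group labels, whereas the unblinded estimator correlates within-group residuals. When group means differ, the blinded estimate drifts. All values below are invented for illustration.

import numpy as np

rng = np.random.default_rng(3)
rho, n_per, delta = 0.5, 50, 1.0          # true correlation, group size, group-mean shift
cov = [[1.0, rho], [rho, 1.0]]

ctrl = rng.multivariate_normal([0, 0], cov, n_per)
trt = rng.multivariate_normal([delta, delta], cov, n_per)   # shifted group means
pooled = np.vstack([ctrl, trt])

r_blind = np.corrcoef(pooled.T)[0, 1]     # naive one-sample estimator (blinded)
resid = np.vstack([ctrl - ctrl.mean(0), trt - trt.mean(0)])
r_unblind = np.corrcoef(resid.T)[0, 1]    # within-group (unblinded) estimator
print(r_blind, r_unblind)                 # blinded estimate inflated when means differ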
Large exchange bias effect in NiFe2O4/CoO nanocomposites
NASA Astrophysics Data System (ADS)
Mohan, Rajendra; Prasad Ghosh, Mritunjoy; Mukherjee, Samrat
2018-03-01
In this work, we report the exchange bias effect of NiFe2O4/CoO nanocomposites, synthesized via the chemical co-precipitation method. Four samples of different particle size ranging from 4 nm to 31 nm were prepared with the annealing temperature varying from 200 °C to 800 °C. X-ray diffraction analysis of all the samples confirmed the presence of the cubic spinel phase of nickel ferrite along with the CoO phase without trace of any impurity. Sizes of the particles were studied from transmission electron micrographs and were found to be in agreement with those estimated from X-ray diffraction. Field cooled (FC) hysteresis loops at 5 K revealed an exchange bias (HE) of 2.2 kOe for the sample heated at 200 °C, which decreased with the increase of particle size. Exchange bias expectedly vanished at 300 K due to high thermal energy (kBT) and low effective surface anisotropy. M-T curves revealed a blocking temperature of 135 K for the sample with smaller particle size.
Statistical approaches to account for false-positive errors in environmental DNA samples.
Lahoz-Monfort, José J; Guillera-Arroita, Gurutzeta; Tingley, Reid
2016-05-01
Environmental DNA (eDNA) sampling is prone to both false-positive and false-negative errors. We review statistical methods to account for such errors in the analysis of eDNA data and use simulations to compare the performance of different modelling approaches. Our simulations illustrate that even low false-positive rates can produce biased estimates of occupancy and detectability. We further show that removing or classifying single PCR detections in an ad hoc manner under the suspicion that such records represent false positives, as sometimes advocated in the eDNA literature, also results in biased estimation of occupancy, detectability and false-positive rates. We advocate alternative approaches to account for false-positive errors that rely on prior information, or the collection of ancillary detection data at a subset of sites using a sampling method that is not prone to false-positive errors. We illustrate the advantages of these approaches over ad hoc classifications of detections and provide practical advice and code for fitting these models in maximum likelihood and Bayesian frameworks. Given the severe bias induced by false-negative and false-positive errors, the methods presented here should be more routinely adopted in eDNA studies. © 2015 John Wiley & Sons Ltd.
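As a concrete illustration of the modelling approach advocated above, here is a minimal maximum-likelihood sketch (not the authors' code) of a site-occupancy model with false positives, in which detections at unoccupied sites arise at a false-positive rate p10. Simulated data with invented parameter values; note that without prior information or ancillary data such models can be weakly identified.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(4)
S, K = 500, 8                               # sites and PCR replicates per site
psi, p11, p10 = 0.4, 0.6, 0.05              # occupancy, detection, false-positive rates
z = rng.random(S) < psi                     # latent occupancy states
d = rng.binomial(K, np.where(z, p11, p10))  # detections per site

def negll(theta):
    ps, a, b = expit(theta)                 # map parameters to (0, 1)
    l_occ = ps * a**d * (1 - a)**(K - d)            # occupied component
    l_unocc = (1 - ps) * b**d * (1 - b)**(K - d)    # unoccupied (false positives only)
    return -np.log(l_occ + l_unocc).sum()

fit = minimize(negll, x0=[0.0, 0.5, -2.0], method="Nelder-Mead")
print("psi, p11, p10 estimates:", expit(fit.x))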
Zhu, Wensheng; Yuan, Ying; Zhang, Jingwen; Zhou, Fan; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-02-01
The aim of this paper is to systematically evaluate a biased sampling issue associated with genome-wide association analysis (GWAS) of imaging phenotypes for most imaging genetic studies, including the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, the original sampling scheme of these imaging genetic studies is primarily the retrospective case-control design, whereas most existing statistical analyses of these studies ignore this sampling scheme by directly correlating imaging phenotypes (called the secondary traits) with genotype. Although it has been well documented in genetic epidemiology that ignoring the case-control sampling scheme can produce highly biased estimates, and subsequently lead to misleading results and suspicious associations, such findings are not well documented in imaging genetics. We use extensive simulations and a large-scale imaging genetic data analysis of the ADNI data to evaluate the effects of the case-control sampling scheme on GWAS results based on some standard statistical methods, such as linear regression methods, while comparing them with several advanced statistical methods that appropriately adjust for the case-control sampling scheme. Copyright © 2016 Elsevier Inc. All rights reserved.
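The secondary-trait bias described above can be reproduced in a few lines: a sketch in which a SNP has no effect on the secondary trait, yet naive regression in a case-control sample suggests an association, because both genotype and trait influence disease status. All effect sizes are invented.

import numpy as np

rng = np.random.default_rng(5)
n = 200000
g = rng.binomial(2, 0.3, n).astype(float)   # SNP genotypes (0/1/2)
y = rng.standard_normal(n)                  # secondary trait: true SNP effect is zero
logit = -3 + 0.8 * g + 0.8 * y              # disease depends on genotype and trait
d = rng.random(n) < 1 / (1 + np.exp(-logit))

cases = np.flatnonzero(d)                   # retrospective case-control sampling
ctrls = rng.choice(np.flatnonzero(~d), cases.size, replace=False)
idx = np.concatenate([cases, ctrls])

print("naive case-control slope:", np.polyfit(g[idx], y[idx], 1)[0])  # spuriously nonzero
print("population slope        :", np.polyfit(g, y, 1)[0])            # ~0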
Adaptively biased molecular dynamics for free energy calculations
NASA Astrophysics Data System (ADS)
Babin, Volodymyr; Roland, Christopher; Sagui, Celeste
2008-04-01
We present an adaptively biased molecular dynamics (ABMD) method for the computation of the free energy surface of a reaction coordinate using nonequilibrium dynamics. The ABMD method belongs to the general category of umbrella sampling methods with an evolving biasing potential and is inspired by the metadynamics method. The ABMD method has several useful features, including a small number of control parameters and an O(t ) numerical cost with molecular dynamics time t. The ABMD method naturally allows for extensions based on multiple walkers and replica exchange, where different replicas can have different temperatures and/or collective variables. This is beneficial not only in terms of the speed and accuracy of a calculation, but also in terms of the amount of useful information that may be obtained from a given simulation. The workings of the ABMD method are illustrated via a study of the folding of the Ace-GGPGGG-Nme peptide in a gaseous and solvated environment.
Length bias correction in one-day cross-sectional assessments - The nutritionDay study.
Frantal, Sophie; Pernicka, Elisabeth; Hiesmayr, Michael; Schindler, Karin; Bauer, Peter
2016-04-01
A major problem occurring in cross-sectional studies is sampling bias. Length of hospital stay (LOS) differs strongly between patients and causes a length bias, as patients with longer LOS are more likely to be included and are therefore overrepresented in this type of study. To adjust for the length bias, higher weights are allocated to patients with shorter LOS. We determined the effect of length-bias adjustment in two independent populations. Length-bias correction is applied to the data of the nutritionDay project, a one-day multinational cross-sectional audit capturing data on disease and nutrition of patients admitted to hospital wards, with right-censoring after 30 days of follow-up. We applied the weighting method for estimating the distribution function of patient baseline variables based on the method of non-parametric maximum likelihood. Results are validated using data from all patients admitted to the General Hospital of Vienna between 2005 and 2009, where the distribution of LOS can be assumed to be known. Additionally, a simplified calculation scheme for estimating the adjusted distribution function of LOS is demonstrated on a small patient example. The crude median (lower quartile; upper quartile) LOS in the cross-sectional sample was 14 (8; 24) and decreased to 7 (4; 12) when adjusted. Hence, adjustment for length bias in cross-sectional studies is essential to get appropriate estimates. Copyright © 2015 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
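The weighting idea is simple to demonstrate: in a one-day cross-sectional snapshot, inclusion probability is roughly proportional to LOS, so weighting each sampled patient by 1/LOS recovers the admission-cohort distribution. A sketch with an invented LOS distribution:

import numpy as np

rng = np.random.default_rng(6)
los = rng.lognormal(mean=2.0, sigma=0.8, size=200000)   # hypothetical LOS population

# one-day audit: inclusion probability proportional to LOS
sample = los[rng.choice(los.size, size=2000, p=los / los.sum())]

def weighted_median(x, w):
    order = np.argsort(x)
    cw = np.cumsum(w[order]) / w.sum()
    return x[order][np.searchsorted(cw, 0.5)]

print("true median    :", np.median(los))
print("crude median   :", np.median(sample))                      # biased long
print("1/LOS-weighted :", weighted_median(sample, 1.0 / sample))  # near truth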
Daly, Caitlin H; Higgins, Victoria; Adeli, Khosrow; Grey, Vijay L; Hamid, Jemila S
2017-12-01
To statistically compare and evaluate commonly used methods of estimating reference intervals and to determine which method is best based on characteristics of the distribution of various data sets. Three approaches for estimating reference intervals, i.e. parametric, non-parametric, and robust, were compared with simulated Gaussian and non-Gaussian data. The hierarchy of the performances of each method was examined based on bias and measures of precision. The findings of the simulation study were illustrated through real data sets. In all Gaussian scenarios, the parametric approach provided the least biased and most precise estimates. In non-Gaussian scenarios, no single method provided the least biased and most precise estimates for both limits of a reference interval across all sample sizes, although the non-parametric approach performed the best for most scenarios. The hierarchy of the performances of the three methods was only impacted by sample size and skewness. Differences between reference interval estimates established by the three methods were inflated by variability. Whenever possible, laboratories should attempt to transform data to a Gaussian distribution and use the parametric approach to obtain the most optimal reference intervals. When this is not possible, laboratories should consider sample size and skewness as factors in their choice of reference interval estimation method. The consequences of false positives or false negatives may also serve as factors in this decision. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
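For reference, the two simplest estimators compared above reduce to a few lines; a sketch on simulated Gaussian data (the robust approach is omitted here, and all values are illustrative):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
x = rng.normal(5.0, 1.0, 240)             # simulated analyte values (Gaussian)

# parametric 95% reference interval: mean +/- 1.96 * SD
lo_p, hi_p = x.mean() + np.array([-1, 1]) * norm.ppf(0.975) * x.std(ddof=1)
# non-parametric: 2.5th and 97.5th sample percentiles
lo_np, hi_np = np.percentile(x, [2.5, 97.5])
print((lo_p, hi_p), (lo_np, hi_np))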
Chromý, Vratislav; Vinklárková, Bára; Šprongl, Luděk; Bittová, Miroslava
2015-01-01
We found previously that albumin-calibrated total protein in certified reference materials causes unacceptable positive bias in analysis of human sera. The simplest way to cure this defect is the use of human-based serum/plasma standards calibrated by the Kjeldahl method. Such standards, commutative with serum samples, will compensate for bias caused by lipids and bilirubin in most human sera. To find a suitable primary reference procedure for total protein in reference materials, we reviewed Kjeldahl methods adopted by laboratory medicine. We found two methods recommended for total protein in human samples: an indirect analysis based on total Kjeldahl nitrogen corrected for its nonprotein nitrogen and a direct analysis made on isolated protein precipitates. The methods found will be assessed in a subsequent article.
Jiang, Wenyu; Simon, Richard
2007-12-20
This paper first provides a critical review on some existing methods for estimating the prediction error in classifying microarray data where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ℓn. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate for the prediction error. Even with small samples, it does not suffer from the large upward bias of the leave-one-out bootstrap and the 0.632+ bootstrap, and it does not suffer from the large variability of the leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.
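A minimal sketch of the leave-one-out bootstrap at two learning-set sizes, the ingredient on which RLOOB and the ABS learning-curve fit build (a nearest-centroid classifier on null data whose true error rate is 0.5; all settings invented):

import numpy as np

rng = np.random.default_rng(8)
n, p = 40, 200                            # few specimens, many genes
X = rng.standard_normal((n, p))
y = rng.integers(0, 2, n)                 # null labels: true error rate is 0.5

def nearest_centroid(Xtr, ytr, xte):
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    return int(((xte - c1)**2).sum() < ((xte - c0)**2).sum())

def loob(frac, B=200):
    # predict specimen i only from bootstrap learning sets that exclude i
    m = int(frac * n)                     # learning-set size (the paper's ℓn)
    errs = [[] for _ in range(n)]
    for _ in range(B):
        idx = rng.integers(0, n, m)
        if np.unique(y[idx]).size < 2:    # need both classes to form centroids
            continue
        for i in np.setdiff1d(np.arange(n), idx):
            errs[i].append(nearest_centroid(X[idx], y[idx], X[i]) != y[i])
    return float(np.mean([np.mean(e) for e in errs if e]))

print(loob(0.5), loob(1.0))               # estimates at two learning-set sizes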
Field trials of line transect methods applied to estimation of desert tortoise abundance
Anderson, David R.; Burnham, Kenneth P.; Lubow, Bruce C.; Thomas, L. E. N.; Corn, Paul Stephen; Medica, Philip A.; Marlow, R.W.
2001-01-01
We examine the degree to which field observers can meet the assumptions underlying line transect sampling to monitor populations of desert tortoises (Gopherus agassizii). We present the results of 2 field trials using artificial tortoise models in 3 size classes. The trials were conducted on 2 occasions on an area south of Las Vegas, Nevada, where the density of the test population was known. In the first trials, conducted largely by experienced biologists who had been involved in tortoise surveys for many years, the density of adult tortoise models was well estimated (-3.9% bias), while the bias was higher (-20%) for subadult tortoise models. The bias for combined data was -12.0%. The bias was largely attributed to the failure to detect all tortoise models on or near the transect centerline. The second trials were conducted with a group of largely inexperienced student volunteers and used somewhat different searching methods, and the results were similar to the first trials. Estimated combined density of subadult and adult tortoise models had a negative bias (-7.3%), again attributable to failure to detect some models on or near the centerline. Experience in desert tortoise biology, either comparing the first and second trials or in the second trial with 2 experienced biologists versus 16 novices, did not have an apparent effect on the quality of the data or the accuracy of the estimates. Observer training, specific to line transect sampling, and field testing are important components of a reliable survey. Line transect sampling represents a viable method for large-scale monitoring of populations of desert tortoise; however, field protocol must be improved to assure the key assumptions are met.
Avula, Haritha
2013-01-01
A good research beginning refers to formulating a well-defined research question, developing a hypothesis and choosing an appropriate study design. The first part of the review series has discussed these issues in depth and this paper intends to throw light on other issues pertaining to the implementation of research. These include the various ethical norms and standards in human experimentation, the eligibility criteria for the participants, sampling methods and sample size calculation, various outcome measures that need to be defined and the biases that can be introduced in research. PMID:24174747
Kendall, William L.; White, Gary C.
2009-01-01
1. Assessing the probability that a given site is occupied by a species of interest is important to resource managers, as well as metapopulation or landscape ecologists. Managers require accurate estimates of the state of the system, in order to make informed decisions. Models that yield estimates of occupancy, while accounting for imperfect detection, have proven useful by removing a potentially important source of bias. To account for detection probability, multiple independent searches per site for the species are required, under the assumption that the species is available for detection during each search of an occupied site. 2. We demonstrate that when multiple samples per site are defined by searching different locations within a site, absence of the species from a subset of these spatial subunits induces estimation bias when locations are exhaustively assessed or sampled without replacement. 3. We further demonstrate that this bias can be removed by choosing sampling locations with replacement, or if the species is highly mobile over a short period of time. 4. Resampling an existing data set does not mitigate bias due to exhaustive assessment of locations or sampling without replacement. 5. Synthesis and applications. Selecting sampling locations for presence/absence surveys with replacement is practical in most cases. Such an adjustment to field methods will prevent one source of bias, and therefore produce more robust statistical inferences about species occupancy. This will in turn permit managers to make resource decisions based on better knowledge of the state of the system.
Estimating time-dependent ROC curves using data under prevalent sampling.
Li, Shanshan
2017-04-15
Prevalent sampling is frequently a convenient and economical sampling technique for the collection of time-to-event data and thus is commonly used in studies of the natural history of a disease. However, it is biased by design because it tends to recruit individuals with longer survival times. This paper considers estimation of time-dependent receiver operating characteristic curves when data are collected under prevalent sampling. To correct the sampling bias, we develop both nonparametric and semiparametric estimators using extended risk sets and the inverse probability weighting techniques. The proposed estimators are consistent and converge to Gaussian processes, while substantial bias may arise if standard estimators for right-censored data are used. To illustrate our method, we analyze data from an ovarian cancer study and estimate receiver operating characteristic curves that assess the accuracy of the composite markers in distinguishing subjects who died within 3-5 years from subjects who remained alive. Copyright © 2016 John Wiley & Sons, Ltd.
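The core of the inverse probability weighting correction can be shown with the simplest target, a mean: under prevalent (length-biased) sampling, inclusion probability is proportional to survival time, so weights proportional to 1/T restore population-level estimates. A sketch with invented survival times and no censoring:

import numpy as np

rng = np.random.default_rng(9)
T = rng.weibull(1.5, 300000) * 4.0        # hypothetical population survival times

# prevalent sampling: inclusion probability proportional to survival time
obs = T[rng.choice(T.size, 3000, p=T / T.sum())]

w = 1.0 / obs                             # inverse-probability (length-bias) weights
print("population mean:", T.mean())
print("naive mean     :", obs.mean())                 # biased upward
print("weighted mean  :", np.sum(w * obs) / w.sum())  # near the population mean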
Sampling designs matching species biology produce accurate and affordable abundance indices
Farley, Sean; Russell, Gareth J.; Butler, Matthew J.; Selinger, Jeff
2013-01-01
Wildlife biologists often use grid-based designs to sample animals and generate abundance estimates. Although sampling in grids is theoretically sound, in application, the method can be logistically difficult and expensive when sampling elusive species inhabiting extensive areas. These factors make it challenging to sample animals and meet the statistical assumption of all individuals having an equal probability of capture. Violating this assumption biases results. Does an alternative exist? Perhaps by sampling only where resources attract animals (i.e., targeted sampling), it would provide accurate abundance estimates more efficiently and affordably. However, biases from this approach would also arise if individuals have an unequal probability of capture, especially if some failed to visit the sampling area. Since most biological programs are resource limited, and acquiring abundance data drives many conservation and management applications, it becomes imperative to identify economical and informative sampling designs. Therefore, we evaluated abundance estimates generated from grid and targeted sampling designs using simulations based on geographic positioning system (GPS) data from 42 Alaskan brown bears (Ursus arctos). Migratory salmon drew brown bears from the wider landscape, concentrating them at anadromous streams. This provided a scenario for testing the targeted approach. Grid and targeted sampling varied by trap amount, location (traps placed randomly, systematically or by expert opinion), and whether traps were stationary or moved between capture sessions. We began by identifying when to sample, and whether bears had equal probability of capture. We compared abundance estimates against seven criteria: bias, precision, accuracy, effort, plus encounter rates, and probabilities of capture and recapture. One grid (49 km2 cells) and one targeted configuration provided the most accurate results. Both placed traps by expert opinion and moved traps between capture sessions, which raised capture probabilities. The grid design was least biased (−10.5%), but imprecise (CV 21.2%), and used most effort (16,100 trap-nights). The targeted configuration was more biased (−17.3%), but most precise (CV 12.3%), with least effort (7,000 trap-nights). Targeted sampling generated encounter rates four times higher, and capture and recapture probabilities 11% and 60% higher than grid sampling, in a sampling frame 88% smaller. Bears had unequal probability of capture with both sampling designs, partly because some bears never had traps available to sample them. Hence, grid and targeted sampling generated abundance indices, not estimates. Overall, targeted sampling provided the most accurate and affordable design to index abundance. Targeted sampling may offer an alternative method to index the abundance of other species inhabiting expansive and inaccessible landscapes elsewhere, provided they are attracted to resource concentrations. PMID:24392290
Zhang, Haixia; Zhao, Junkang; Gu, Caijiao; Cui, Yan; Rong, Huiying; Meng, Fanlong; Wang, Tong
2015-05-01
A study of medical expenditure and its influencing factors among students enrolled in the Urban Resident Basic Medical Insurance (URBMI) scheme in Taiyuan indicated that non-response bias and selection bias coexist in the dependent variable of the survey data. Unlike previous studies that focused on only one missing-data mechanism, this study proposes a two-stage method that deals with both mechanisms simultaneously by combining multiple imputation with a sample selection model. A total of 1 190 questionnaires were returned by the students (or their parents) selected in child care settings, schools and universities in Taiyuan by stratified cluster random sampling in 2012. In the returned questionnaires, 2.52% of the values of the dependent variable were not missing at random (NMAR) and 7.14% were missing at random (MAR). First, multiple imputation was conducted for the MAR values using the completed data; then a sample selection model was used to correct for NMAR in the multiple imputation, and a multi-factor analysis model was established. Based on 1 000 resampling iterations, the best scheme for filling the randomly missing values was the predictive mean matching (PMM) method at the observed missing proportion. With this optimal scheme, the two-stage analysis was conducted. The factors influencing annual medical expenditure among the students enrolled in URBMI in Taiyuan included population group, annual household gross income, affordability of medical insurance expenditure, chronic disease, seeking medical care in hospital, seeking medical care in a community health center or private clinic, hospitalization, hospitalization canceled for some reason, self-medication and the acceptable proportion of self-paid medical expenditure. The two-stage method combining multiple imputation with a sample selection model can deal effectively with non-response bias and selection bias in the dependent variable of survey data.
Comparison of DNA preservation methods for environmental bacterial community samples
Gray, Michael A.; Pratte, Zoe A.; Kellogg, Christina A.
2013-01-01
Field collections of environmental samples, for example corals, for molecular microbial analyses present distinct challenges. The lack of laboratory facilities in remote locations is common, and preservation of microbial community DNA for later study is critical. A particular challenge is keeping samples frozen in transit. Five nucleic acid preservation methods that do not require cold storage were compared for effectiveness over time and ease of use. Mixed microbial communities of known composition were created and preserved by DNAgard™, RNAlater®, DMSO–EDTA–salt (DESS), FTA® cards, and FTA Elute® cards. Automated ribosomal intergenic spacer analysis and clone libraries were used to detect specific changes in the faux communities over weeks and months of storage. A previously known bias in FTA® cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard™, RNAlater®, and DESS) outperformed the card-based methods. No single liquid method clearly outperformed the others, leaving method choice to be based on experimental design, field facilities, shipping constraints, and allowable cost.
Sampling free energy surfaces as slices by combining umbrella sampling and metadynamics.
Awasthi, Shalini; Kapil, Venkat; Nair, Nisanth N
2016-06-15
Metadynamics (MTD) is a very powerful technique to sample high-dimensional free energy landscapes, and due to its self-guiding property, the method has been successful in studying complex reactions and conformational changes. MTD sampling is based on filling the free energy basins by biasing potentials and thus for cases with flat, broad, and unbound free energy wells, the computational time to sample them becomes very large. To alleviate this problem, we combine the standard Umbrella Sampling (US) technique with MTD to sample orthogonal collective variables (CVs) in a simultaneous way. Within this scheme, we construct the equilibrium distribution of CVs from biased distributions obtained from independent MTD simulations with umbrella potentials. Reweighting is carried out by a procedure that combines US reweighting and Tiwary-Parrinello MTD reweighting within the Weighted Histogram Analysis Method (WHAM). The approach is ideal for a controlled sampling of a CV in a MTD simulation, making it computationally efficient in sampling flat, broad, and unbound free energy surfaces. This technique also allows for a distributed sampling of a high-dimensional free energy surface, further increasing the computational efficiency in sampling. We demonstrate the application of this technique in sampling high-dimensional surface for various chemical reactions using ab initio and QM/MM hybrid molecular dynamics simulations. Further, to carry out MTD bias reweighting for computing forward reaction barriers in ab initio or QM/MM simulations, we propose a computationally affordable approach that does not require recrossing trajectories. © 2016 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Charrier, Jessica G.; McFall, Alexander S.; Vu, Kennedy K.-T.; Baroi, James; Olea, Catalina; Hasson, Alam; Anastasio, Cort
2016-11-01
The dithiothreitol (DTT) assay is widely used to measure the oxidative potential of particulate matter. Results are typically presented in mass-normalized units (e.g., pmol DTT lost per minute per microgram PM) to allow for comparison among samples. Use of this unit assumes that the mass-normalized DTT response is constant and independent of the mass concentration of PM added to the DTT assay. However, based on previous work that identified non-linear DTT responses for copper and manganese, this basic assumption (that the mass-normalized DTT response is independent of the concentration of PM added to the assay) should not be true for samples where Cu and Mn contribute significantly to the DTT signal. To test this we measured the DTT response at multiple PM concentrations for eight ambient particulate samples collected at two locations in California. The results confirm that for samples with significant contributions from Cu and Mn, the mass-normalized DTT response can strongly depend on the concentration of PM added to the assay, varying by up to an order of magnitude for PM concentrations between 2 and 34 μg mL⁻¹. This mass dependence confounds useful interpretation of DTT assay data in samples with significant contributions from Cu and Mn, requiring additional quality control steps to check for this bias. To minimize this problem, we discuss two methods to correct the mass-normalized DTT result and we apply those methods to our samples. We find that it is possible to correct the mass-normalized DTT result, although the correction methods have some drawbacks and add uncertainty to DTT analyses. More broadly, other DTT-active species might also have non-linear concentration-responses in the assay and cause a bias. In addition, the same problem of Cu- and Mn-mediated bias in mass-normalized DTT results might affect other measures of acellular redox activity in PM and needs to be addressed.
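The arithmetic of the problem: if the raw DTT response saturates with PM concentration (as a Michaelis-Menten-like curve does), the mass-normalized value necessarily falls as more PM is added. A toy illustration with invented constants, not measured values:

# Hypothetical saturating DTT response: rate (pmol/min) vs PM concentration (ug/mL)
vmax, km = 60.0, 10.0                     # illustrative constants only
for conc in (2.0, 8.0, 34.0):
    rate = vmax * conc / (km + conc)
    print(f"{conc:5.1f} ug/mL -> {rate / conc:5.2f} pmol/min per ug")  # falls with mass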
Rosenberger, Amanda E.; Dunham, Jason B.
2005-01-01
Estimation of fish abundance in streams using the removal model or the Lincoln-Peterson mark-recapture model is a common practice in fisheries. These models produce misleading results if their assumptions are violated. We evaluated the assumptions of these two models via electrofishing of rainbow trout Oncorhynchus mykiss in central Idaho streams. For one-, two-, three-, and four-pass sampling effort in closed sites, we evaluated the influences of fish size and habitat characteristics on sampling efficiency and the accuracy of removal abundance estimates. We also examined the use of models to generate unbiased estimates of fish abundance through adjustment of total catch or biased removal estimates. Our results suggested that the assumptions of the mark-recapture model were satisfied and that abundance estimates based on this approach were unbiased. In contrast, the removal model assumptions were not met. Decreasing sampling efficiencies over removal passes resulted in underestimated population sizes and overestimates of sampling efficiency. This bias decreased, but was not eliminated, with increased sampling effort. Biased removal estimates based on different levels of effort were highly correlated with each other but were less correlated with unbiased mark-recapture estimates. Stream size decreased sampling efficiency, and stream size and instream wood increased the negative bias of removal estimates. We found that reliable estimates of population abundance could be obtained from models of sampling efficiency for different levels of effort. Validation of abundance estimates requires extra attention to routine sampling considerations but can help fisheries biologists avoid pitfalls associated with biased data and facilitate standardized comparisons among studies that employ different sampling methods.
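The mechanism of the removal-model bias is visible in the two-pass estimator itself: N = c1²/(c1 − c2) assumes equal capture probability on both passes. A worked sketch with invented catches (true N = 300, first-pass capture probability 0.4):

# equal catchability: expected catches c1 = 300*0.4 = 120, c2 = 180*0.4 = 72
c1, c2 = 120, 72
print(c1**2 / (c1 - c2))                  # 300.0 -> unbiased
print((c1 - c2) / c1)                     # 0.40  -> capture probability recovered

# efficiency declines on pass 2 (p2 = 0.25): c2 = 180*0.25 = 45
c1, c2 = 120, 45
print(c1**2 / (c1 - c2))                  # 192.0 -> abundance underestimated
print((c1 - c2) / c1)                     # 0.625 -> efficiency overestimated

# Lincoln-Peterson mark-recapture, by contrast: N = n1*n2/m2 (invented numbers)
n1, n2, m2 = 110, 90, 33
print(n1 * n2 / m2)                       # 300.0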
Ma, Wenxia; Yin, Xuejun; Zhang, Ruijuan; Liu, Furong; Yang, Danrong; Fan, Yameng; Rong, Jie; Tian, Maoyi; Yu, Yan
2017-01-01
Background: 24-h urine collection is regarded as the “gold standard” for monitoring sodium intake at the population level, but ensuring high quality urine samples is difficult to achieve. The Kawasaki, International Study of Sodium, Potassium, and Blood Pressure (INTERSALT) and Tanaka methods have been used to estimate 24-h urinary sodium excretion from spot urine samples in some countries, but few studies have been performed to compare and validate these methods in the Chinese population. Objective: To compare and validate the Kawasaki, INTERSALT and Tanaka formulas in predicting 24-h urinary sodium excretion using spot urine samples in 365 elderly patients at high risk of stroke from the rural areas of Shaanxi province. Methods: Data were collected from a sub-sample of the Salt Substitute and Stroke Study. The 365 participants provided spot and 24-h urine specimens, and the concentrations of sodium, potassium and creatinine in the spot and 24-h urine samples were analysed. Estimated 24-h sodium excretion was predicted from spot urine concentration using the Kawasaki, INTERSALT, and Tanaka formulas. Pearson correlation coefficients and agreement by the Bland-Altman method were computed for estimated and measured 24-h urinary sodium excretion. Results: The average 24-h urinary sodium excretion was 162.0 mmol/day, representing a salt intake of 9.5 g/day. All three predictive equations had low correlation with the measured 24-h sodium excretion (r = 0.38, p < 0.01; ICC = 0.38, p < 0.01 for the Kawasaki; r = 0.35, p < 0.01; ICC = 0.31, p < 0.01 for the INTERSALT; r = 0.37, p < 0.01; ICC = 0.34, p < 0.01 for the Tanaka). Significant biases between estimated and measured 24-h sodium excretion were observed (p < 0.01 for all three methods). Among the three methods, the Kawasaki method was the least biased (mean bias: 31.90, 95% CI: 23.84, 39.97). The Kawasaki and Tanaka methods overestimated, while the INTERSALT method underestimated, 24-h sodium excretion. Conclusion: The Kawasaki, INTERSALT and Tanaka methods for estimation of 24-h urinary sodium excretion from spot urine specimens were inadequate for the assessment of sodium intake at the population level in elderly patients at high risk of stroke from the rural areas of Shaanxi province, although the Kawasaki method was the least biased of the three. PMID:29019912
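The Bland-Altman agreement statistics reported above (mean bias with a confidence interval, plus limits of agreement) reduce to a few lines; a sketch on simulated data, with values invented to mimic the reported magnitudes:

import numpy as np

rng = np.random.default_rng(10)
measured = rng.normal(162, 40, 365)                 # 24-h urinary Na, mmol/day
estimated = measured + rng.normal(32, 55, 365)      # hypothetical formula output

diff = estimated - measured
bias = diff.mean()                                  # mean bias
loa = bias + np.array([-1.96, 1.96]) * diff.std(ddof=1)   # 95% limits of agreement
ci = bias + np.array([-1.96, 1.96]) * diff.std(ddof=1) / np.sqrt(diff.size)
print(bias, ci, loa)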
Association between attention bias to threat and anxiety symptoms in children and adolescents.
Abend, Rany; de Voogd, Leone; Salemink, Elske; Wiers, Reinout W; Pérez-Edgar, Koraly; Fitzgerald, Amanda; White, Lauren K; Salum, Giovanni A; He, Jie; Silverman, Wendy K; Pettit, Jeremy W; Pine, Daniel S; Bar-Haim, Yair
2018-03-01
Considerable research links threat-related attention biases to anxiety symptoms in adults, whereas extant findings on threat biases in youth are limited and mixed. Inconsistent findings may arise due to substantial methodological variability and limited sample sizes, emphasizing the need for systematic research on large samples. The aim of this report is to examine the association between threat bias and pediatric anxiety symptoms using standardized measures in a large, international, multi-site youth sample. A total of 1,291 children and adolescents from seven research sites worldwide completed a standardized attention bias assessment task (dot-probe task) and a child anxiety symptoms measure (Screen for Child Anxiety Related Emotional Disorders). Using a dimensional approach to symptomatology, we conducted regression analyses predicting overall, and disorder-specific, anxiety symptoms severity, based on threat bias scores. Threat bias correlated positively with overall anxiety symptoms severity (β = 0.078, P = .004). Furthermore, threat bias was positively associated specifically with social anxiety (β = 0.072, P = .008) and school phobia (β = 0.076, P = .006) symptoms severity, but not with panic, generalized anxiety, or separation anxiety symptoms. These associations were not moderated by age or gender. These findings indicate associations between threat bias and pediatric anxiety symptoms, and suggest that vigilance to external threats manifests more prominently in symptoms of social anxiety and school phobia, regardless of age and gender. These findings point to the role of attention bias to threat in anxiety, with implications for translational clinical research. The significance of applying standardized methods in multi-site collaborations for overcoming challenges inherent to clinical research is discussed. © 2017 Wiley Periodicals, Inc.
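For readers unfamiliar with the dot-probe task: the traditional attention bias score is the mean reaction-time difference between incongruent trials (probe replaces the neutral stimulus) and congruent trials (probe replaces the threat stimulus), with positive values indicating vigilance to threat. A sketch with invented reaction times; the trial-level pairing below is a simplification of published TL-BS procedures:

import numpy as np

rng = np.random.default_rng(11)
rt_congruent = rng.normal(520, 60, 40)    # probe at the threat location (ms)
rt_incongruent = rng.normal(535, 60, 40)  # probe at the neutral location (ms)

bias = rt_incongruent.mean() - rt_congruent.mean()   # traditional bias score
tlbs = rt_incongruent - rt_congruent      # simplified trial-level bias scores
print(bias, tlbs.mean(), tlbs.std(ddof=1), tlbs.min(), tlbs.max())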
A multi-source precipitation approach to fill gaps over a radar precipitation field
NASA Astrophysics Data System (ADS)
Tesfagiorgis, K. B.; Mahani, S. E.; Khanbilvardi, R.
2012-12-01
Satellite Precipitation Estimates (SPEs) may be the only available source of information for operational hydrologic and flash flood prediction due to spatial limitations of radar and gauge products. The present work develops an approach to seamlessly blend satellite, radar, climatological and gauge precipitation products to fill gaps over ground-based radar precipitation fields. To mix different precipitation products, the bias of any of the products relative to each other should be removed. For bias correction, the study used an ensemble-based method which aims to estimate spatially varying multiplicative biases in SPEs using a radar rainfall product. Bias factors were calculated for a randomly selected sample of rainy pixels in the study area. Spatial fields of estimated bias were generated taking into account spatial variation and random errors in the sampled values. A weighted Successive Correction Method (SCM) is proposed to merge the error-corrected satellite and radar rainfall estimates. In addition to SCM, we use a Bayesian spatial method for merging the gap-free radar with rain gauges, climatological rainfall sources and SPEs. We demonstrate the method using the SPE Hydro-Estimator (HE), radar-based Stage-II, the climatological product PRISM and a rain gauge dataset for several rain events from 2006 to 2008 over three different geographical locations of the United States. Results show that the SCM method, in combination with the Bayesian spatial model, produced a precipitation product in good agreement with independent measurements. The study implies that, using the available radar pixels surrounding the gap area together with rain gauge, PRISM and satellite products, a radar-like product is achievable over radar gap areas, which benefits the scientific community.
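Successive correction methods share a simple core: starting from a background field, each pass spreads observation-minus-background increments onto the grid with distance-based weights over a shrinking influence radius. A generic one-dimensional sketch with Cressman weights, illustrative of the SCM family rather than the authors' weighted variant (all values invented):

import numpy as np

rng = np.random.default_rng(12)
grid = np.linspace(0.0, 100.0, 101)
field = np.full(grid.size, 5.0)           # background (e.g., satellite) rain, mm
obs_x = rng.uniform(0, 100, 15)           # radar/gauge observation locations
obs_v = 5.0 + 3.0 * np.exp(-((obs_x - 50) / 15)**2)   # hypothetical observed rain

def scm_pass(field, R):
    # one correction pass with Cressman weights and influence radius R
    bg = np.interp(obs_x, grid, field)    # background interpolated to observations
    d2 = (grid[:, None] - obs_x[None, :])**2
    w = np.clip((R**2 - d2) / (R**2 + d2), 0.0, None)
    corr = (w * (obs_v - bg)).sum(1) / np.where(w.sum(1) > 0, w.sum(1), 1.0)
    return field + corr

for R in (40.0, 20.0, 10.0):              # shrinking radii refine the analysis
    field = scm_pass(field, R)
print(field[::20])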
NASA Technical Reports Server (NTRS)
Hearty, Thomas J.; Savtchenko, Andrey K.; Tian, Baijun; Fetzer, Eric; Yung, Yuk L.; Theobald, Michael; Vollmer, Bruce; Fishbein, Evan; Won, Young-In
2014-01-01
We use MERRA (Modern Era Retrospective-Analysis for Research Applications) temperature and water vapor data to estimate the sampling biases of climatologies derived from the AIRS/AMSU-A (Atmospheric Infrared Sounder/Advanced Microwave Sounding Unit-A) suite of instruments. We separate the total sampling bias into temporal and instrumental components. The temporal component is caused by the AIRS/AMSU-A orbit and swath that are not able to sample all of time and space. The instrumental component is caused by scenes that prevent successful retrievals. The temporal sampling biases are generally smaller than the instrumental sampling biases except in regions with large diurnal variations, such as the boundary layer, where the temporal sampling biases of temperature can be ±2 K and water vapor can be 10% wet. The instrumental sampling biases are the main contributor to the total sampling biases and are mainly caused by clouds. They are up to 2 K cold and greater than 30% dry over mid-latitude storm tracks and tropical deep convective cloudy regions and up to 20% wet over stratus regions. However, other factors such as surface emissivity and temperature can also influence the instrumental sampling bias over deserts where the biases can be up to 1 K cold and 10% wet. Some instrumental sampling biases can vary seasonally and/or diurnally. We also estimate the combined measurement uncertainties of temperature and water vapor from AIRS/AMSU-A and MERRA by comparing similarly sampled climatologies from both data sets. The measurement differences are often larger than the sampling biases and have longitudinal variations.
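The decomposition into temporal and instrumental components can be mimicked with a toy hourly field: subsample a reference series at the satellite overpass hours (temporal component), then drop the scenes a retrieval would fail on (instrumental component). Everything below is invented for illustration:

import numpy as np

rng = np.random.default_rng(13)
hours = np.arange(24)
# hypothetical hourly surface temperature (K) with a diurnal cycle, 365 days
truth = 288 + 5 * np.sin(2 * np.pi * (hours - 9) / 24) + rng.standard_normal((365, 24))

seen_t = truth[:, [1, 13]]                # ~01:30/13:30 sun-synchronous overpasses
thresh = np.quantile(seen_t, 0.3)         # "cloudy" scenes: coldest 30% not retrieved
seen_i = seen_t[seen_t > thresh]

print("temporal component :", seen_t.mean() - truth.mean())
print("total sampling bias:", seen_i.mean() - truth.mean())  # warm, cloud-driven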
Chang, C.; Pujol, A.; Gaztañaga, E.; ...
2016-04-15
We measure the redshift evolution of galaxy bias for a magnitude-limited galaxy sample by combining the galaxy density maps and weak lensing shear maps for a ~116 deg² area of the Dark Energy Survey (DES) Science Verification (SV) data. This method was first developed in Amara et al. and later re-examined in a companion paper with rigorous simulation tests and analytical treatment of tomographic measurements. In this work we apply this method to the DES SV data and measure the galaxy bias for an i < 22.5 galaxy sample. We find the galaxy bias and 1σ error bars in four photometric redshift bins to be 1.12 ± 0.19 (z = 0.2–0.4), 0.97 ± 0.15 (z = 0.4–0.6), 1.38 ± 0.39 (z = 0.6–0.8), and 1.45 ± 0.56 (z = 0.8–1.0). These measurements are consistent at the 2σ level with measurements on the same data set using galaxy clustering and cross-correlation of galaxies with cosmic microwave background lensing, with most of the redshift bins consistent within the 1σ error bars. In addition, our method provides the only σ8 independent constraint among the three. We forward model the main observational effects using mock galaxy catalogues by including shape noise, photo-z errors, and masking effects. We show that our bias measurement from the data is consistent with that expected from simulations. With the forthcoming full DES data set, we expect this method to provide additional constraints on the galaxy bias measurement from more traditional methods. Moreover, in the process of our measurement, we build up a 3D mass map that allows further exploration of the dark matter distribution and its relation to galaxy evolution.
Albers, D. J.; Hripcsak, George
2012-01-01
A method to estimate the time-dependent correlation via an empirical bias estimate of the time-delayed mutual information for a time-series is proposed. In particular, the bias of the time-delayed mutual information is shown to often be equivalent to the mutual information between two distributions of points from the same system separated by infinite time. Thus intuitively, estimation of the bias is reduced to estimation of the mutual information between distributions of data points separated by large time intervals. The proposed bias estimation techniques are shown to work for Lorenz equations data and glucose time series data of three patients from the Columbia University Medical Center database. PMID:22536009
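A simplified version of the idea: with a plug-in (histogram) mutual information estimator, MI computed at lags long enough for the true dependence to have decayed approximates the estimator's bias, which can then be subtracted from small-lag estimates. A sketch on an AR(1) series (all settings invented):

import numpy as np

def mi(x, y, bins=16):
    # plug-in mutual information estimate (nats) from a 2D histogram
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(14)
x = np.empty(50000)
x[0] = rng.standard_normal()
for i in range(1, x.size):                # AR(1): dependence decays with lag
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()

for tau in (1, 10, 1000):
    print(tau, mi(x[:-tau], x[tau:]))     # the lag-1000 value is nearly pure bias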
ERIC Educational Resources Information Center
Le Mens, Gael; Denrell, Jerker
2011-01-01
Recent research has argued that several well-known judgment biases may be due to biases in the available information sample rather than to biased information processing. Most of these sample-based explanations assume that decision makers are "naive": They are not aware of the biases in the available information sample and do not correct for them.…
Hall, William J.; Lee, Kent M.; Merino, Yesenia M.; Thomas, Tainayah W.; Payne, B. Keith; Eng, Eugenia; Day, Steven H.; Coyne-Beasley, Tamera
2015-01-01
Background. In the United States, people of color face disparities in access to health care, the quality of care received, and health outcomes. The attitudes and behaviors of health care providers have been identified as one of many factors that contribute to health disparities. Implicit attitudes are thoughts and feelings that often exist outside of conscious awareness, and thus are difficult to consciously acknowledge and control. These attitudes are often automatically activated and can influence human behavior without conscious volition. Objectives. We investigated the extent to which implicit racial/ethnic bias exists among health care professionals and examined the relationships between health care professionals’ implicit attitudes about racial/ethnic groups and health care outcomes. Search Methods. To identify relevant studies, we searched 10 computerized bibliographic databases and used a reference harvesting technique. Selection Criteria. We assessed eligibility using double independent screening based on a priori inclusion criteria. We included studies if they sampled existing health care providers or those in training to become health care providers, measured and reported results on implicit racial/ethnic bias, and were written in English. Data Collection and Analysis. We included a total of 15 studies for review and then subjected them to double independent data extraction. Information extracted included the citation, purpose of the study, use of theory, study design, study site and location, sampling strategy, response rate, sample size and characteristics, measurement of relevant variables, analyses performed, and results and findings. We summarized study design characteristics, and categorized and then synthesized substantive findings. Main Results. Almost all studies used cross-sectional designs, convenience sampling, US participants, and the Implicit Association Test to assess implicit bias. Low to moderate levels of implicit racial/ethnic bias were found among health care professionals in all but 1 study. These implicit bias scores are similar to those in the general population. Levels of implicit bias against Black, Hispanic/Latino/Latina, and dark-skinned people were relatively similar across these groups. Although some associations between implicit bias and health care outcomes were nonsignificant, results also showed that implicit bias was significantly related to patient–provider interactions, treatment decisions, treatment adherence, and patient health outcomes. Implicit attitudes were more often significantly related to patient–provider interactions and health outcomes than treatment processes. Conclusions. Most health care providers appear to have implicit bias in terms of positive attitudes toward Whites and negative attitudes toward people of color. Future studies need to employ more rigorous methods to examine the relationships between implicit bias and health care outcomes. Interventions targeting implicit attitudes among health care professionals are needed because implicit bias may contribute to health disparities for people of color. PMID:26469668
Errors in causal inference: an organizational schema for systematic error and random error.
Suzuki, Etsuji; Tsuda, Toshihide; Mitsuhashi, Toshiharu; Mansournia, Mohammad Ali; Yamamoto, Eiji
2016-11-01
To provide an organizational schema for systematic error and random error in estimating causal measures, aimed at clarifying the concept of errors from the perspective of causal inference. We propose to divide systematic error into structural error and analytic error. With regard to random error, our schema shows its four major sources: nondeterministic counterfactuals, sampling variability, a mechanism that generates exposure events, and measurement variability. Structural error is defined from the perspective of counterfactual reasoning and divided into nonexchangeability bias (which comprises confounding bias and selection bias) and measurement bias. Directed acyclic graphs are useful to illustrate this kind of error. Nonexchangeability bias implies a lack of "exchangeability" between the selected exposed and unexposed groups. A lack of exchangeability is not a primary concern of measurement bias, justifying its separation from confounding bias and selection bias. Many forms of analytic errors result from the small-sample properties of the estimator used and vanish asymptotically. Analytic error also results from wrong (misspecified) statistical models and inappropriate statistical methods. Our organizational schema is helpful for understanding the relationship between systematic error and random error from a previously less investigated aspect, enabling us to better understand the relationship between accuracy, validity, and precision. Copyright © 2016 Elsevier Inc. All rights reserved.
Getting DNA copy numbers without control samples
2012-01-01
Background: The selection of the reference to scale the data in a copy number analysis has paramount importance to achieve accurate estimates. Usually this reference is generated using control samples included in the study. However, these control samples are not always available and, in these cases, an artificial reference must be created. A proper generation of this signal is crucial in terms of both noise and bias. We propose NSA (Normality Search Algorithm), a scaling method that works with and without control samples. It is based on the assumption that genomic regions enriched in SNPs with identical copy numbers in both alleles are likely to be normal. These normal regions are predicted for each sample individually and used to calculate the final reference signal. NSA can be applied to any CN data regardless of the microarray technology and preprocessing method. It also finds an optimal weighting of the samples, minimizing possible batch effects. Results: Five human datasets (a subset of HapMap samples, Glioblastoma Multiforme (GBM), Ovarian, Prostate and Lung Cancer experiments) have been analyzed. It is shown that, using only tumoral samples, NSA is able to remove the bias in the copy number estimation, to reduce the noise and, therefore, to increase the ability to detect copy number aberrations (CNAs). These improvements allow NSA also to detect recurrent aberrations more accurately than other state-of-the-art methods. Conclusions: NSA provides a robust and accurate reference for scaling probe signal data to CN values without the need for control samples. It minimizes the problems of bias, noise and batch effects in the estimation of CNs. Therefore, the NSA scaling approach helps to detect recurrent CNAs better than current methods. The automatic selection of references makes it useful for bulk analysis of many GEO or ArrayExpress experiments without the need to develop a parser to find the normal samples or possible batches within the data. The method is available in the open-source R package NSA, which is an add-on to the aroma.cn framework. http://www.aroma-project.org/addons. PMID:22898240
Getting DNA copy numbers without control samples.
Ortiz-Estevez, Maria; Aramburu, Ander; Rubio, Angel
2012-08-16
The selection of the reference to scale the data in a copy number analysis has paramount importance to achieve accurate estimates. Usually this reference is generated using control samples included in the study. However, these control samples are not always available and, in these cases, an artificial reference must be created. A proper generation of this signal is crucial in terms of both noise and bias. We propose NSA (Normality Search Algorithm), a scaling method that works with and without control samples. It is based on the assumption that genomic regions enriched in SNPs with identical copy numbers in both alleles are likely to be normal. These normal regions are predicted for each sample individually and used to calculate the final reference signal. NSA can be applied to any CN data regardless of the microarray technology and preprocessing method. It also finds an optimal weighting of the samples, minimizing possible batch effects. Five human datasets (a subset of HapMap samples, Glioblastoma Multiforme (GBM), Ovarian, Prostate and Lung Cancer experiments) have been analyzed. It is shown that, using only tumoral samples, NSA is able to remove the bias in the copy number estimation, to reduce the noise and, therefore, to increase the ability to detect copy number aberrations (CNAs). These improvements allow NSA also to detect recurrent aberrations more accurately than other state-of-the-art methods. NSA provides a robust and accurate reference for scaling probe signal data to CN values without the need for control samples. It minimizes the problems of bias, noise and batch effects in the estimation of CNs. Therefore, the NSA scaling approach helps to detect recurrent CNAs better than current methods. The automatic selection of references makes it useful for bulk analysis of many GEO or ArrayExpress experiments without the need to develop a parser to find the normal samples or possible batches within the data. The method is available in the open-source R package NSA, which is an add-on to the aroma.cn framework. http://www.aroma-project.org/addons.
Normalization, bias correction, and peak calling for ChIP-seq
Diaz, Aaron; Park, Kiyoub; Lim, Daniel A.; Song, Jun S.
2012-01-01
Next-generation sequencing is rapidly transforming our ability to profile the transcriptional, genetic, and epigenetic states of a cell. In particular, sequencing DNA from the immunoprecipitation of protein-DNA complexes (ChIP-seq) and methylated DNA (MeDIP-seq) can reveal the locations of protein binding sites and epigenetic modifications. These approaches are subject to numerous biases that may significantly influence the interpretation of the resulting data. Rigorous computational methods for detecting and removing such biases are still lacking. Multi-sample normalization also remains an important open problem. This theoretical paper systematically characterizes the biases and properties of ChIP-seq data by comparing 62 separate publicly available datasets, using rigorous statistical models and signal processing techniques. Statistical methods for separating ChIP-seq signal from background noise, as well as correcting enrichment test statistics for sequence-dependent and sonication biases, are presented. Our method effectively separates reads into signal and background components prior to normalization, improving the signal-to-noise ratio. Moreover, most peak callers currently use a generic null model which suffers from low specificity at the sensitivity level requisite for detecting subtle, but true, ChIP enrichment. The proposed method of determining a cell type-specific null model, which accounts for cell type-specific biases, is shown to be capable of achieving a lower false discovery rate at a given significance threshold than current methods. PMID:22499706
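As a point of reference for the generic null model the paper criticizes, the following is a minimal sketch (our illustration, not the authors' cell type-specific method) of Poisson-null enrichment testing on binned counts against a depth-scaled control:

```python
import numpy as np
from scipy.stats import poisson

def call_peaks(chip_counts, control_counts, alpha=1e-5):
    # Scale the control to the ChIP sequencing depth (simple library-size norm).
    lam = control_counts * (chip_counts.sum() / control_counts.sum())
    lam = np.maximum(lam, lam[lam > 0].mean())   # floor to avoid zero-rate bins
    pvals = poisson.sf(chip_counts - 1, lam)     # P(X >= observed) under the null
    return np.where(pvals < alpha)[0]            # indices of enriched bins

chip = np.random.poisson(5, 1000); chip[100:105] += 60   # spike in a "peak"
ctrl = np.random.poisson(5, 1000)
print("candidate peak bins:", call_peaks(chip, ctrl))
```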
The efficacy of respondent-driven sampling for the health assessment of minority populations.
Badowski, Grazyna; Somera, Lilnabeth P; Simsiman, Brayan; Lee, Hye-Ryeon; Cassel, Kevin; Yamanaka, Alisha; Ren, JunHao
2017-10-01
Respondent-driven sampling (RDS) is a relatively new network sampling technique typically employed for hard-to-reach populations. Like snowball sampling, initial respondents or "seeds" recruit additional respondents from their network of friends. Under certain assumptions, the method promises to produce a sample independent from the biases that may have been introduced by the non-random choice of "seeds." We conducted a survey on health communication in Guam's general population using the RDS method, the first survey that has utilized this methodology in Guam. It was conducted in hopes of identifying a cost-efficient non-probability sampling strategy that could generate reasonable population estimates for both minority and general populations. RDS data were collected in Guam in 2013 (n=511) and population estimates were compared with 2012 BRFSS data (n=2031) and the 2010 census data. The estimates were calculated using the unweighted RDS sample and the weighted sample using RDS inference methods, and compared with known population characteristics. The sample size was reached in 23 days, providing evidence that the RDS method is a viable, cost-effective data collection method, which can provide reasonable population estimates. However, the results also suggest that the RDS inference methods used to reduce bias, based on self-reported estimates of network sizes, may not always work. Caution is needed when interpreting RDS study findings. For a more diverse sample, data collection should not be conducted in just one location. Fewer questions about network estimates should be asked, and more careful consideration should be given to the kind of incentives offered to participants. Copyright © 2017. Published by Elsevier Ltd.
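The RDS inference the authors caution about typically weights respondents by their self-reported network size. A minimal sketch of an RDS-II (Volz-Heckathorn) style estimator, with invented data, assuming inclusion probability roughly proportional to degree:

```python
import numpy as np

def rds_ii(outcome, degree):
    """Estimate a population proportion from an RDS sample by weighting each
    respondent inversely to his or her self-reported network (degree) size."""
    outcome, degree = np.asarray(outcome, float), np.asarray(degree, float)
    w = 1.0 / degree                  # inclusion assumed proportional to degree
    return np.sum(w * outcome) / np.sum(w)

# y: 1 if the respondent reports the health behavior; degrees are self-reported,
# which is exactly the soft spot this study identifies.
y      = [1, 0, 1, 1, 0, 0, 1, 0]
degree = [50, 5, 30, 80, 10, 4, 60, 8]
print(f"naive mean: {np.mean(y):.3f}, RDS-II estimate: {rds_ii(y, degree):.3f}")
```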
Probabilistic data integration and computational complexity
NASA Astrophysics Data System (ADS)
Hansen, T. M.; Cordua, K. S.; Mosegaard, K.
2016-12-01
Inverse problems in Earth Sciences typically refer to the problem of inferring information about properties of the Earth from observations of geophysical data (the result of nature's solution to the `forward' problem). This problem can be formulated more generally as a problem of `integration of information'. A probabilistic formulation of data integration is in principle simple: If all information available (from e.g. geology, geophysics, remote sensing, chemistry…) can be quantified probabilistically, then different algorithms exist that allow solving the data integration problem either through an analytical description of the combined probability function, or by sampling the probability function. In practice, however, probabilistic data integration may not be easy to apply successfully. This may be related to the use of sampling methods, which are known to be computationally costly. But another source of computational complexity is related to how the individual types of information are quantified. In one case a data integration problem is demonstrated where the goal is to determine the existence of buried channels in Denmark, based on multiple sources of geo-information. Because one type of information was too informative (and hence conflicting), this led to a difficult sampling problem with unrealistic uncertainty. Resolving this conflict prior to data integration leads to an easy data integration problem, with no biases. In another case it is demonstrated how imperfections in the description of the geophysical forward model (related to solving the wave equation) can lead to a difficult data integration problem, with severe bias in the results. If the modeling error is accounted for, the data integration problem becomes relatively easy, with no apparent biases. Both examples demonstrate that biased information can have a dramatic effect on the computational efficiency of solving a data integration problem and can lead to biased results and underestimation of uncertainty. However, in both examples, one can also analyze the performance of the sampling methods used to solve the data integration problem to indicate the existence of biased information. This can be used actively to avoid biases in the available information and subsequently in the final uncertainty evaluation.
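A toy illustration of the first case's mechanism, under our own simplified assumptions (the actual studies involve geophysical forward models): two probabilistic information sources are fused by multiplying densities and sampled with Metropolis; when the sources conflict, the combined distribution concentrates where neither source places mass, giving unrealistically small uncertainty.

```python
import numpy as np

def metropolis(logp, n=50_000, step=1.0, seed=1):
    rng, x = np.random.default_rng(seed), 0.0
    out = np.empty(n)
    for i in range(n):
        y = x + step * rng.normal()
        if np.log(rng.random()) < logp(y) - logp(x):
            x = y                       # accept the proposed move
        out[i] = x
    return out

def fused_logp(mu1, mu2, s=1.0):
    # Product of two unit-variance Gaussian "information sources".
    return lambda x: -0.5 * ((x - mu1) ** 2 + (x - mu2) ** 2) / s**2

for mu2, label in [(0.5, "consistent"), (8.0, "conflicting")]:
    draws = metropolis(fused_logp(0.0, mu2))[5_000:]   # drop burn-in
    print(f"{label}: mean={draws.mean():.2f}, sd={draws.std():.2f}")
# In the conflicting case the fused density is centred near 4.0, where both
# sources assign almost no probability: a confidently wrong compromise.
```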
Causal inference and the data-fusion problem
Bareinboim, Elias; Pearl, Judea
2016-01-01
We review concepts, principles, and tools that unify current approaches to causal analysis and attend to new challenges presented by big data. In particular, we address the problem of data fusion—piecing together multiple datasets collected under heterogeneous conditions (i.e., different populations, regimes, and sampling methods) to obtain valid answers to queries of interest. The availability of multiple heterogeneous datasets presents new opportunities to big data analysts, because the knowledge that can be acquired from combined data would not be possible from any individual source alone. However, the biases that emerge in heterogeneous environments require new analytical tools. Some of these biases, including confounding, sampling selection, and cross-population biases, have been addressed in isolation, largely in restricted parametric models. We here present a general, nonparametric framework for handling these biases and, ultimately, a theoretical solution to the problem of data fusion in causal inference tasks. PMID:27382148
Sampling for Soil Carbon Stock Assessment in Rocky Agricultural Soils
NASA Technical Reports Server (NTRS)
Beem-Miller, Jeffrey P.; Kong, Angela Y. Y.; Ogle, Stephen; Wolfe, David
2016-01-01
Coring methods commonly employed in soil organic C (SOC) stock assessment may not accurately capture soil rock fragment (RF) content or soil bulk density (ρb) in rocky agricultural soils, potentially biasing SOC stock estimates. Quantitative pits are considered less biased than coring methods but are invasive and often cost-prohibitive. We compared fixed-depth and mass-based estimates of SOC stocks (0.3-m depth) for hammer, hydraulic push, and rotary coring methods relative to quantitative pits at four agricultural sites ranging in RF content from less than 0.01 to 0.24 m³ m⁻³. Sampling costs were also compared. Coring methods significantly underestimated RF content at all rocky sites, but significant differences (p < 0.05) in SOC stocks between pits and corers were only found with the hammer method using the fixed-depth approach at the <0.01 m³ m⁻³ RF site (pit, 5.80 kg C m⁻²; hammer, 4.74 kg C m⁻²) and at the 0.14 m³ m⁻³ RF site (pit, 8.81 kg C m⁻²; hammer, 6.71 kg C m⁻²). The hammer corer also underestimated ρb at all sites, as did the hydraulic push corer at the 0.21 m³ m⁻³ RF site. No significant differences in mass-based SOC stock estimates were observed between pits and corers. Our results indicate that (i) calculating SOC stocks on a mass basis can overcome biases in RF and ρb estimates introduced by sampling equipment and (ii) a quantitative pit is the optimal sampling method for establishing reference soil masses, followed by rotary and then hydraulic push corers.
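The fixed-depth versus mass-based contrast reduces to two small formulas. A simplified sketch with invented inputs (real protocols track equivalent fine-soil masses more carefully):

```python
def soc_fixed_depth(c_conc, bulk_density, depth_m, rf_frac):
    """SOC stock (kg C m^-2) to a fixed depth, correcting for rock fragments.
    c_conc: kg C per kg fine soil; bulk_density: kg m^-3 (fine earth);
    rf_frac: rock fragment volume fraction (m^3 m^-3)."""
    return c_conc * bulk_density * depth_m * (1.0 - rf_frac)

def soc_mass_based(c_conc, fine_soil_mass_kg_m2):
    """SOC stock (kg C m^-2) on an equivalent-mass basis: concentration times
    a fixed reference mass of fine soil per unit area (e.g., from a pit)."""
    return c_conc * fine_soil_mass_kg_m2

# An underestimated bulk density biases the fixed-depth stock low, but cancels
# once stocks are expressed against a reference fine-soil mass.
print(soc_fixed_depth(0.02, 1300, 0.3, 0.14))   # ~6.7 kg C m^-2
print(soc_mass_based(0.02, 335))                # 6.7 kg C m^-2
```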
Quantitative imaging biomarkers: Effect of sample size and bias on confidence interval coverage.
Obuchowski, Nancy A; Bullen, Jennifer
2017-01-01
Introduction: Quantitative imaging biomarkers (QIBs) are being increasingly used in medical practice and clinical trials. An essential first step in the adoption of a quantitative imaging biomarker is the characterization of its technical performance, i.e., precision and bias, through one or more performance studies. Then, given the technical performance, a confidence interval for a new patient's true biomarker value can be constructed. Estimating bias and precision can be problematic because rarely are both estimated in the same study, precision studies are usually quite small, and bias cannot be measured when there is no reference standard. Methods: A Monte Carlo simulation study was conducted to assess factors affecting nominal coverage of confidence intervals for a new patient's quantitative imaging biomarker measurement and for change in the quantitative imaging biomarker over time. Factors considered include sample size for estimating bias and precision, effect of fixed and non-proportional bias, clustered data, and absence of a reference standard. Results: Technical performance studies of a quantitative imaging biomarker should include at least 35 test-retest subjects to estimate precision and 65 cases to estimate bias. Confidence intervals for a new patient's quantitative imaging biomarker measurement constructed under the no-bias assumption provide nominal coverage as long as the fixed bias is <12%. For confidence intervals of the true change over time, linearity must hold and the slope of the regression of the measurements vs. true values should be between 0.95 and 1.05. The regression slope can be assessed adequately as long as fixed multiples of the measurand can be generated. Even small non-proportional bias greatly reduces confidence interval coverage. Multiple lesions in the same subject can be treated as independent when estimating precision. Conclusion: Technical performance studies of quantitative imaging biomarkers require moderate sample sizes in order to provide robust estimates of bias and precision for constructing confidence intervals for new patients. Assumptions of linearity and non-proportional bias should be assessed thoroughly.
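A condensed sketch of this kind of coverage experiment, with assumed parameter values rather than the paper's simulation design: construct a no-bias confidence interval for a new patient's measurement and check its empirical coverage as fixed bias grows.

```python
import numpy as np

def coverage(fixed_bias_pct, sigma=1.0, true_value=10.0, n_rep=20_000, seed=2):
    rng = np.random.default_rng(seed)
    bias = true_value * fixed_bias_pct / 100.0
    meas = true_value + bias + sigma * rng.normal(size=n_rep)
    lo, hi = meas - 1.96 * sigma, meas + 1.96 * sigma   # CI assuming no bias
    return np.mean((lo <= true_value) & (true_value <= hi))

for b in (0, 5, 12, 25):
    print(f"fixed bias {b:>2}% -> empirical coverage {coverage(b):.3f}")
# Coverage erodes as fixed bias grows; how quickly depends on the
# bias-to-precision ratio assumed here, not on a universal threshold.
```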
Williams, M S; Ebel, E D; Cao, Y
2013-01-01
The fitting of statistical distributions to microbial sampling data is a common application in quantitative microbiology and risk assessment. An underlying assumption of most fitting techniques is that data are collected with simple random sampling, which is often not the case. This study develops a weighted maximum likelihood estimation framework that is appropriate for microbiological samples collected with unequal probabilities of selection. Two examples, based on the collection of food samples during processing, are provided to demonstrate the method and highlight the magnitude of biases in the maximum likelihood estimator when data are inappropriately treated as a simple random sample. Failure to properly weight samples to account for how data are collected can introduce substantial biases into inferences drawn from the data. The proposed methodology will reduce or eliminate an important source of bias in inferences drawn from the analysis of microbial data. This will also make comparisons between studies and the combination of results from different studies more reliable, which is important for risk assessment applications. © 2012 No claim to US Government works.
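The core idea is a pseudo-likelihood in which each observation's log-likelihood contribution is weighted by the inverse of its selection probability. A minimal sketch for a lognormal concentration model, with simulated unequal-probability selection (our construction, not the paper's food-sample data):

```python
import numpy as np
from scipy.optimize import minimize

def weighted_mle(x, sel_prob):
    """Weighted lognormal MLE: weights are inverse selection probabilities."""
    x, w = np.asarray(x, float), 1.0 / np.asarray(sel_prob, float)
    def neg_loglik(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)
        z = (np.log(x) - mu) / sigma
        return -np.sum(w * (-np.log(x * sigma) - 0.5 * z**2))
    res = minimize(neg_loglik, x0=[0.0, 0.0])
    return res.x[0], np.exp(res.x[1])    # mu, sigma of log-concentration

rng = np.random.default_rng(3)
x = rng.lognormal(mean=1.0, sigma=0.5, size=200)
p = np.where(x > np.median(x), 0.9, 0.3)   # high concentrations oversampled
keep = rng.random(200) < p
print(weighted_mle(x[keep], p[keep]))       # ~ (1.0, 0.5) once weighted
```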
Problems with sampling desert tortoises: A simulation analysis based on field data
Freilich, J.E.; Camp, R.J.; Duda, J.J.; Karl, A.E.
2005-01-01
The desert tortoise (Gopherus agassizii) was listed as a U.S. threatened species in 1990 based largely on population declines inferred from mark-recapture surveys of 2.59-km² (1-mi²) plots. Since then, several census methods have been proposed and tested, but all methods still pose logistical or statistical difficulties. We conducted computer simulations using actual tortoise location data from two 1-mi² plot surveys in southern California, USA, to identify strengths and weaknesses of current sampling strategies. We considered tortoise population estimates based on these plots as "truth" and then tested various sampling methods based on sampling smaller plots or transect lines passing through the mile squares. Data were analyzed using Schnabel's mark-recapture estimate and program CAPTURE. Experimental subsampling with replacement of the 1-mi² data using 1-km² and 0.25-km² plot boundaries produced data sets of smaller plot sizes, which we compared to estimates from the 1-mi² plots. We also tested distance sampling by saturating a 1-mi² site with computer-simulated transect lines, once again evaluating bias in density estimates. Subsampling estimates from 1-km² plots did not differ significantly from the estimates derived at 1-mi². The 0.25-km² subsamples significantly overestimated population sizes, chiefly because too few recaptures were made. Distance sampling simulations were biased 80% of the time and had high ratios of the coefficient of variation to density. Furthermore, a prospective power analysis suggested limited ability to detect population declines as high as 50%. We concluded that the poor performance and bias of both sampling procedures were driven by insufficient sample size, suggesting that all efforts must be directed to increasing the numbers found in order to produce reliable results. Our results suggest that present methods may not be capable of accurately estimating desert tortoise populations.
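For reference, a minimal sketch of the Schnabel multiple-census estimator used in these analyses, in its bias-adjusted (+1) form, with invented capture data:

```python
def schnabel(catches, recaptures):
    """catches[t]: animals caught on occasion t; recaptures[t]: how many of
    them were already marked. Marked animals accumulate across occasions."""
    marked, num, den = 0, 0.0, 0.0
    for c, r in zip(catches, recaptures):
        num += c * marked        # catch size times marked animals at large
        den += r
        marked += c - r          # newly marked animals join the pool
    return num / (den + 1.0)     # bias-adjusted Schnabel estimate

# Five occasions on a simulated plot:
print(round(schnabel([20, 25, 18, 22, 24], [0, 4, 5, 7, 9])))  # ~157
```

Too few recaptures, as on the 0.25-km² subplots, make the denominator tiny and the estimate unstable and upward-biased.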
Anderson, Samantha F; Maxwell, Scott E
2017-01-01
Psychology is undergoing a replication crisis. The discussion surrounding this crisis has centered on mistrust of previous findings. Researchers planning replication studies often use the original study sample effect size as the basis for sample size planning. However, this strategy ignores uncertainty and publication bias in estimated effect sizes, resulting in overly optimistic calculations. A psychologist who intends to obtain power of .80 in the replication study, and performs calculations accordingly, may have an actual power lower than .80. We performed simulations to reveal the magnitude of the difference between actual and intended power based on common sample size planning strategies and assessed the performance of methods that aim to correct for effect size uncertainty and/or bias. Our results imply that even if original studies reflect actual phenomena and were conducted in the absence of questionable research practices, popular approaches to designing replication studies may result in a low success rate, especially if the original study is underpowered. Methods correcting for bias and/or uncertainty generally had higher actual power, but were not a panacea for an underpowered original study. Thus, it becomes imperative that 1) original studies are adequately powered and 2) replication studies are designed with methods that are more likely to yield the intended level of power.
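A toy normal-approximation sketch of the central problem (the paper's simulations are far more extensive): planning the replication sample size from an inflated observed effect size delivers actual power well below the intended .80.

```python
import numpy as np
from scipy.stats import norm

def n_per_group(d, power=0.80, alpha=0.05):
    # Two-sample test, normal approximation.
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return int(np.ceil(2 * (z / d) ** 2))

def actual_power(n, d_true, alpha=0.05):
    return norm.sf(norm.ppf(1 - alpha / 2) - d_true * np.sqrt(n / 2))

d_true = 0.30                      # true standardized effect
for d_obs in (0.30, 0.45, 0.60):   # progressively inflated published estimates
    n = n_per_group(d_obs)
    print(f"observed d={d_obs}: n/group={n:>3}, actual power="
          f"{actual_power(n, d_true):.2f} (intended 0.80)")
```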
Northrup, Joseph M.; Hooten, Mevin B.; Anderson, Charles R.; Wittemyer, George
2013-01-01
Habitat selection is a fundamental aspect of animal ecology, the understanding of which is critical to management and conservation. Global positioning system data from animals allow fine-scale assessments of habitat selection and typically are analyzed in a use-availability framework, whereby animal locations are contrasted with random locations (the availability sample). Although most use-availability methods are in fact spatial point process models, they often are fit using logistic regression. This framework offers numerous methodological challenges, for which the literature provides little guidance. Specifically, the size and spatial extent of the availability sample influence coefficient estimates, potentially causing interpretational bias. We examined the influence of availability on statistical inference through simulations and analysis of serially correlated mule deer GPS data. Bias in estimates arose from incorrectly assessing and sampling the spatial extent of availability. Spatial autocorrelation in covariates, which is common for landscape characteristics, exacerbated the error in availability sampling, leading to increased bias. These results have strong implications for habitat selection analyses using GPS data, which are increasingly prevalent in the literature. We recommend researchers assess the sensitivity of their results to their availability sample and, where bias is likely, take care with interpretations and use cross-validation to assess robustness.
NASA Astrophysics Data System (ADS)
Zhang, Qian; Harman, Ciaran J.; Kirchner, James W.
2018-02-01
River water-quality time series often exhibit fractal scaling, which here refers to autocorrelation that decays as a power law over some range of scales. Fractal scaling presents challenges to the identification of deterministic trends because (1) fractal scaling has the potential to lead to false inference about the statistical significance of trends and (2) the abundance of irregularly spaced data in water-quality monitoring networks complicates efforts to quantify fractal scaling. Traditional methods for estimating fractal scaling - in the form of spectral slope (β) or other equivalent scaling parameters (e.g., Hurst exponent) - are generally inapplicable to irregularly sampled data. Here we consider two types of estimation approaches for irregularly sampled data and evaluate their performance using synthetic time series. These time series were generated such that (1) they exhibit a wide range of prescribed fractal scaling behaviors, ranging from white noise (β = 0) to Brown noise (β = 2) and (2) their sampling gap intervals mimic the sampling irregularity (as quantified by both the skewness and mean of gap-interval lengths) in real water-quality data. The results suggest that none of the existing methods fully account for the effects of sampling irregularity on β estimation. First, the results illustrate the danger of using interpolation for gap filling when examining autocorrelation, as the interpolation methods consistently underestimate or overestimate β under a wide range of prescribed β values and gap distributions. Second, the widely used Lomb-Scargle spectral method also consistently underestimates β. A previously published modified form, using only the lowest 5 % of the frequencies for spectral slope estimation, has very poor precision, although the overall bias is small. Third, a recent wavelet-based method, coupled with an aliasing filter, generally has the smallest bias and root-mean-squared error among all methods for a wide range of prescribed β values and gap distributions. The aliasing method, however, does not itself account for sampling irregularity, and this introduces some bias in the result. Nonetheless, the wavelet method is recommended for estimating β in irregular time series until improved methods are developed. Finally, all methods' performances depend strongly on the sampling irregularity, highlighting that the accuracy and precision of each method are data specific. Accurately quantifying the strength of fractal scaling in irregular water-quality time series remains an unresolved challenge for the hydrologic community and for other disciplines that must grapple with irregular sampling.
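A minimal sketch of one of the strategies evaluated, Lomb-Scargle spectral slope estimation on an irregularly sampled synthetic series; per the paper's results one should expect the estimate to be biased low for Brown noise (β = 2):

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(4)
y_full = np.cumsum(rng.normal(size=5000))        # Brown noise, true beta = 2
idx = np.sort(rng.choice(5000, size=400, replace=False))  # irregular sampling
t, y = idx.astype(float), y_full[idx] - y_full[idx].mean()

ang_freqs = 2 * np.pi * np.logspace(-3, -1, 60)  # angular frequencies
power = lombscargle(t, y, ang_freqs)
slope = np.polyfit(np.log10(ang_freqs), np.log10(power), 1)[0]
print(f"estimated beta ~ {-slope:.2f} (true 2; expect underestimation)")
```

Here the gaps are uniform-random; real water-quality records have far more skewed gap-length distributions, which the paper shows matters greatly.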
Convergence and Efficiency of Adaptive Importance Sampling Techniques with Partial Biasing
NASA Astrophysics Data System (ADS)
Fort, G.; Jourdain, B.; Lelièvre, T.; Stoltz, G.
2018-04-01
We propose a new Monte Carlo method to efficiently sample a multimodal distribution (known up to a normalization constant). We consider a generalization of the discrete-time Self Healing Umbrella Sampling method, which can also be seen as a generalization of well-tempered metadynamics. The dynamics is based on an adaptive importance sampling technique. The importance function relies on the weights (namely the relative probabilities) of disjoint sets which form a partition of the space. These weights are unknown but are learnt on the fly, yielding an adaptive algorithm. In the context of computational statistical physics, the logarithm of these weights is, up to an additive constant, the free energy, and the discrete-valued function defining the partition is called the collective variable. The algorithm falls into the general class of Wang-Landau-type methods, and is a generalization of the original Self Healing Umbrella Sampling method in two ways: (i) the updating strategy leads to a larger penalization strength of already visited sets in order to escape more quickly from metastable states, and (ii) the target distribution is biased using only a fraction of the free energy, in order to increase the effective sample size and reduce the variance of importance sampling estimators. We prove the convergence of the algorithm and analyze numerically its efficiency on a toy example.
Bakal, Tomas; Janata, Jiri; Sabova, Lenka; Grabic, Roman; Zlabek, Vladimir; Najmanova, Lucie
2018-06-16
A robust and widely applicable method for sampling of aquatic microbial biofilm and further sample processing is presented. The method is based on next-generation sequencing of the V4-V5 variable regions of the 16S rRNA gene and further statistical analysis of the sequencing data, which could be useful not only to investigate the taxonomic composition of biofilm bacterial consortia but also to assess aquatic ecosystem health. Five artificial materials commonly used for biofilm growth (glass, stainless steel, aluminum, polypropylene, polyethylene) were tested to determine the one giving the most robust and reproducible results. The effect of the sampler material on total microbial composition was not statistically significant; however, the non-plastic materials (glass, metal) gave more stable outputs without irregularities among sample parallels. The bias of the method is assessed with respect to the employment of a non-quantitative step (PCR amplification) to obtain quantitative results (relative abundance of identified taxa). This aspect is often overlooked in ecological and medical studies. We document that sequencing a mixture of three merged primary PCR reactions for each sample, and evaluating median values from three technical replicates per sample, makes it possible to overcome this bias and gives robust, repeatable results that distinguish well among sampling localities and seasons.
Large biases in regression-based constituent flux estimates: causes and diagnostic tools
Hirsch, Robert M.
2014-01-01
It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. This analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to them. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.
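The bias mechanism is easy to reproduce with a toy log-log rating curve. A sketch with synthetic data (not LOADEST or WRTDS themselves), including Duan's smearing factor, whose adequacy fails under exactly the lack-of-fit and heteroscedasticity conditions identified here:

```python
import numpy as np

rng = np.random.default_rng(5)
q = rng.lognormal(2.0, 0.8, 500)                      # daily discharge
conc = np.exp(0.5 + 0.4 * np.log(q) + rng.normal(0, 0.3, 500))

# Fit log(C) = b0 + b1*log(Q) on a sparse "sample" of monitored days.
idx = rng.choice(500, 40, replace=False)
b1, b0 = np.polyfit(np.log(q[idx]), np.log(conc[idx]), 1)
resid = np.log(conc[idx]) - (b0 + b1 * np.log(q[idx]))

naive = np.exp(b0 + b1 * np.log(q))                   # biased retransformation
smeared = naive * np.mean(np.exp(resid))              # Duan smearing factor
print(f"naive flux: {np.mean(naive * q):.1f}  "
      f"smeared: {np.mean(smeared * q):.1f}  "
      f"true: {np.mean(conc * q):.1f}")
```

When the log-log relationship lacks fit, changes shape by season, or has strongly heteroscedastic residuals, a single smearing factor no longer repairs the retransformation, which is the bias this study documents.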
Metadynamics in the conformational space nonlinearly dimensionally reduced by Isomap
NASA Astrophysics Data System (ADS)
Spiwok, Vojtěch; Králová, Blanka
2011-12-01
Atomic motions in molecules are not linear. This implies that nonlinear dimensionality reduction methods can outperform linear ones in the analysis of collective atomic motions. In addition, nonlinear collective motions can be used as potentially efficient guides for biased simulation techniques. Here we present a simulation with a bias potential acting in the directions of collective motions determined by a nonlinear dimensionality reduction method. Ad hoc generated conformations of trans,trans-1,2,4-trifluorocyclooctane were analyzed by the Isomap method to map these 72-dimensional coordinates to three dimensions, as described by Brown and co-workers [J. Chem. Phys. 129, 064118 (2008)]. Metadynamics employing the three-dimensional embeddings as collective variables was applied to explore all relevant conformations of the studied system and to calculate its conformational free energy surface. The method sampled all relevant conformations (boat, boat-chair, and crown) and corresponding transition structures inaccessible by an unbiased simulation. This scheme makes it possible to use essentially any parameter of the system as a collective variable in biased simulations. Moreover, the scheme we used for mapping out-of-sample conformations from the 72D to 3D space can be used as a general-purpose mapping for dimensionality reduction, beyond the context of molecular modeling.
Elwaer, Nagmeddin; Hintelmann, Holger
2007-11-01
The analytical performance of five sample introduction systems, namely a cross-flow nebulizer spray chamber, two different solvent desolvation systems, a multi-mode sample introduction system (MSIS), and a hydride generation (LI2) system, was compared for Se isotope ratio measurements by multi-collector inductively coupled plasma mass spectrometry (MC-ICP/MS). The optimal operating parameters for obtaining the highest Se signal-to-noise (S/N) ratios and isotope ratio precision for each sample introduction system were determined. The hydride generation (LI2) system was identified as the most suitable sample introduction method, yielding maximum sensitivity and precision for Se isotope ratio measurement. It provided five times higher S/N ratios for all Se isotopes compared to the MSIS, 20 times the S/N ratios of both desolvation units, and 100 times the S/N ratios produced by the conventional spray chamber sample introduction method. The internal precision achieved for the ⁷⁸Se/⁸²Se ratio at 100 ng mL⁻¹ Se with the spray chamber, the two desolvation systems, the MSIS, and the LI2 system coupled to MC-ICP/MS was 150, 125, 114, 13, and 7 ppm, respectively. Instrument mass bias factors (K) were calculated using an exponential law correction function. Among the five sample introduction systems studied, the LI2 showed the lowest mass bias, -0.0265, and the desolvation system showed the largest, -0.0321.
Some comments on Anderson and Pospahala's correction of bias in line transect sampling
Anderson, D.R.; Burnham, K.P.; Chain, B.R.
1980-01-01
ANDERSON and POSPAHALA (1970) investigated the estimation of wildlife population size using the belt or line transect sampling method and devised a correction for bias, thus leading to an estimator with interesting characteristics. This work was given a uniform mathematical framework in BURNHAM and ANDERSON (1976). In this paper we show that the ANDERSON-POSPAHALA estimator is optimal in the sense of being the (unique) best linear unbiased estimator within the class of estimators which are linear combinations of cell frequencies, provided certain assumptions are met.
Delatour, Vincent; Lalere, Beatrice; Saint-Albin, Karène; Peignaux, Maryline; Hattchouel, Jean-Marc; Dumont, Gilles; De Graeve, Jacques; Vaslin-Reimann, Sophie; Gillery, Philippe
2012-11-20
The reliability of biological tests is a major issue for patient care in terms of public health that involves high economic stakes. Reference methods, as well as regular external quality assessment schemes (EQAS), are needed to monitor the analytical performance of field methods. However, control material commutability is a major concern to assess method accuracy. To overcome material non-commutability, we investigated the possibility of using lyophilized serum samples together with a limited number of frozen serum samples to assign matrix-corrected target values, taking the example of glucose assays. Trueness of the current glucose assays was first measured against a primary reference method by using human frozen sera. Methods using hexokinase and glucose oxidase with spectroreflectometric detection proved very accurate, with bias ranging between -2.2% and +2.3%. Bias of methods using glucose oxidase with spectrophotometric detection was +4.5%. Matrix-related bias of the lyophilized materials was then determined and ranged from +2.5% to -14.4%. Matrix-corrected target values were assigned and used to assess trueness of 22 sub-peer groups. We demonstrated that matrix-corrected target values can be a valuable tool to assess field method accuracy in large scale surveys where commutable materials are not available in sufficient amount with acceptable costs. Copyright © 2012 Elsevier B.V. All rights reserved.
Chiang, Kuo-Szu; Bock, Clive H; Lee, I-Hsuan; El Jarroudi, Moussa; Delfosse, Philippe
2016-12-01
The effect of rater bias and assessment method on hypothesis testing was studied for representative experimental designs for plant disease assessment using balanced and unbalanced data sets. Data sets with the same number of replicate estimates for each of two treatments are termed "balanced" and those with unequal numbers of replicate estimates are termed "unbalanced". The three assessment methods considered were nearest percent estimates (NPEs), an amended 10% incremental scale, and the Horsfall-Barratt (H-B) scale. Estimates of severity of Septoria leaf blotch on leaves of winter wheat were used to develop distributions for a simulation model. The experimental designs are presented here in the context of simulation experiments which consider the optimal design for the number of specimens (individual units sampled) and the number of replicate estimates per specimen for a fixed total number of observations (total sample size for the treatments being compared). The criterion used to gauge each method was the power of the hypothesis test. As expected, at a given fixed number of observations, the balanced experimental designs invariably resulted in a higher power compared with the unbalanced designs at different disease severity means, mean differences, and variances. Based on these results, with unbiased estimates using NPE, the recommended number of replicate estimates taken per specimen is 2 (from a sample of specimens of at least 30), because this conserves resources. Furthermore, for biased estimates, an apparent difference in the power of the hypothesis test was observed between assessment methods and between experimental designs. Results indicated that, regardless of experimental design or rater bias, an amended 10% incremental scale has slightly less power compared with NPEs, and that the H-B scale is more likely than the others to cause a type II error. These results suggest that choice of assessment method, optimizing sample number and number of replicate estimates, and using a balanced experimental design are important criteria to consider to maximize the power of hypothesis tests for comparing treatments using disease severity estimates.
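A stripped-down version of this type of power simulation, assuming the usual doubling-style Horsfall-Barratt class limits and invented parameter values: severities are analyzed either as nearest percent estimates or after conversion to H-B midpoints, and the two-sample t-test power is compared.

```python
import numpy as np
from scipy.stats import ttest_ind

HB_EDGES = [0, 3, 6, 12, 25, 50, 75, 88, 94, 97, 100]   # assumed class limits
HB_MIDS = [(a + b) / 2 for a, b in zip(HB_EDGES[:-1], HB_EDGES[1:])]

def to_hb(x):
    """Replace each percent severity with its H-B class midpoint."""
    return np.array([HB_MIDS[np.searchsorted(HB_EDGES, v, "right") - 1]
                     for v in np.clip(x, 0, 99.9)])

def power(mu1, mu2, sd=5.0, n=30, reps=2000, hb=False, alpha=0.05):
    rng, hits = np.random.default_rng(6), 0
    for _ in range(reps):
        a, b = rng.normal(mu1, sd, n), rng.normal(mu2, sd, n)
        if hb:
            a, b = to_hb(a), to_hb(b)
        hits += ttest_ind(a, b).pvalue < alpha
    return hits / reps

print("NPE power:", power(10, 13), " H-B power:", power(10, 13, hb=True))
```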
An integrate-over-temperature approach for enhanced sampling.
Gao, Yi Qin
2008-02-14
A simple method is introduced to achieve efficient random walking in the energy space in molecular dynamics simulations which thus enhances the sampling over a large energy range. The approach is closely related to multicanonical and replica exchange simulation methods in that it allows configurations of the system to be sampled in a wide energy range by making use of Boltzmann distribution functions at multiple temperatures. A biased potential is quickly generated using this method and is then used in accelerated molecular dynamics simulations.
Spijkerman, Renske; Knibbe, Ronald; Knoops, Kim; Van De Mheen, Dike; Van Den Eijnden, Regina
2009-10-01
Rather than using the traditional, costly method of personal interviews in a general population sample, substance-use prevalence rates can be derived more conveniently from data collected among members of an online access panel. To examine the utility of this method, we compared the outcomes of an online survey with those obtained with the computer-assisted personal interviews (CAPI) method. Data were gathered from a large sample of online panellists and in a two-stage stratified sample of the Dutch population using the CAPI method. Setting: the Netherlands. Participants: the online sample comprised 57 125 Dutch online panellists (15-64 years) of Survey Sampling International LLC (SSI), and the CAPI cohort 7204 respondents (15-64 years). All participants answered identical questions about their use of alcohol, cannabis, ecstasy, cocaine and performance-enhancing drugs. The CAPI respondents were asked additionally about internet access and online panel membership. Both data sets were weighted statistically according to the distribution of demographic characteristics of the general Dutch population. Response rates were 35.5% (n = 20 282) for the online panel cohort and 62.7% (n = 4516) for the CAPI cohort. The data showed almost consistently lower substance-use prevalence rates for the CAPI respondents. Although the observed differences could be due to bias in both data sets, coverage and non-response bias were higher in the online panel survey. Despite its economic advantage, the online panel survey showed stronger non-response and coverage bias than the CAPI survey, leading to less reliable estimates of substance use in the general population. © 2009 The Authors. Journal compilation © 2009 Society for the Study of Addiction.
O’Leary-Barrett, Maeve; Pihl, Robert O.; Artiges, Eric; Banaschewski, Tobias; Bokde, Arun L. W.; Büchel, Christian; Flor, Herta; Frouin, Vincent; Garavan, Hugh; Heinz, Andreas; Ittermann, Bernd; Mann, Karl; Paillère-Martinot, Marie-Laure; Nees, Frauke; Paus, Tomas; Pausova, Zdenka; Poustka, Luise; Rietschel, Marcella; Robbins, Trevor W.; Smolka, Michael N.; Ströhle, Andreas; Schumann, Gunter; Conrod, Patricia J.
2015-01-01
Objective: To investigate the role of personality factors and attentional biases towards emotional faces in establishing concurrent and prospective risk for mental disorder diagnosis in adolescence. Method: Data were obtained as part of the IMAGEN study, conducted across 8 European sites with a community sample of 2257 adolescents. At 14 years, participants completed an emotional variant of the dot-probe task, as well as two personality measures, namely the Substance Use Risk Profile Scale and the revised NEO Personality Inventory. At 14 and 16 years, participants and their parents were interviewed to determine symptoms of mental disorders. Results: Personality traits were general and specific risk indicators for mental disorders at 14 years. Increased specificity was obtained when investigating the likelihood of mental disorders over a 2-year period, with the Substance Use Risk Profile Scale showing incremental validity over the NEO Personality Inventory. Attentional biases to emotional faces did not characterise or predict mental disorders examined in the current sample. Discussion: Personality traits can indicate concurrent and prospective risk for mental disorders in a community youth sample, and identify at-risk youth beyond the impact of baseline symptoms. This study does not support the hypothesis that attentional biases mediate the relationship between personality and psychopathology in a community sample. Task and sample characteristics that contribute to differing results among studies are discussed. PMID:26046352
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Putter, Roland; Doré, Olivier; Das, Sudeep
2014-01-10
Cross correlations between the galaxy number density in a lensing source sample and that in an overlapping spectroscopic sample can in principle be used to calibrate the lensing source redshift distribution. In this paper, we study in detail to what extent this cross-correlation method can mitigate the loss of cosmological information in upcoming weak lensing surveys (combined with a cosmic microwave background prior) due to lack of knowledge of the source distribution. We consider a scenario where photometric redshifts are available and find that, unless the photometric redshift distribution p(z_ph|z) is calibrated very accurately a priori (bias and scatter known to ∼0.002 for, e.g., EUCLID), the additional constraint on p(z_ph|z) from the cross-correlation technique to a large extent restores the cosmological information originally lost due to the uncertainty in dn/dz(z). Considering only the gain in photo-z accuracy and not the additional cosmological information, enhancements of the dark energy figure of merit of up to a factor of four (40) can be achieved for a SuMIRe-like (EUCLID-like) combination of lensing and redshift surveys, where SuMIRe stands for Subaru Measurement of Images and Redshifts. However, the success of the method is strongly sensitive to our knowledge of the galaxy bias evolution in the source sample, and we find that a percent-level bias prior is needed to optimize the gains from the cross-correlation method (i.e., to approach the cosmology constraints attainable if the bias was known exactly).
Free energy calculations: an efficient adaptive biasing potential method.
Dickson, Bradley M; Legoll, Frédéric; Lelièvre, Tony; Stoltz, Gabriel; Fleurat-Lessard, Paul
2010-05-06
We develop an efficient sampling and free energy calculation technique within the adaptive biasing potential (ABP) framework. By mollifying the density of states we obtain an approximate free energy and an adaptive bias potential that is computed directly from the population along the coordinates of the free energy. Because of the mollifier, the bias potential is "nonlocal", and its gradient admits a simple analytic expression. A single observation of the reaction coordinate can thus be used to update the approximate free energy at every point within a neighborhood of the observation. This greatly reduces the equilibration time of the adaptive bias potential. This approximation introduces two parameters: the strength of mollification and the zero of energy of the bias potential. While we observe that the approximate free energy is a very good estimate of the actual free energy for a large range of mollification strengths, we demonstrate that the errors associated with the mollification may be removed via deconvolution. The zero of energy of the bias potential, which is easy to choose, influences the speed of convergence but not the limiting accuracy. This method is simple to apply to free energy or mean force computation in multiple dimensions and does not involve second derivatives of the reaction coordinates, matrix manipulations, or on-the-fly adaptation of parameters. For the alanine dipeptide test case, the new method is found to gain as much as a factor of 10 in efficiency as compared to two basic implementations of the adaptive biasing force method, and it is shown to be as efficient as well-tempered metadynamics, with the postprocess deconvolution giving a clear advantage to the mollified density of states method.
Lapierre, Marguerite; Blin, Camille; Lambert, Amaury; Achaz, Guillaume; Rocha, Eduardo P C
2016-07-01
Recent studies have linked demographic changes and epidemiological patterns in bacterial populations using coalescent-based approaches. We identified 26 studies using skyline plots and found that 21 inferred overall population expansion. This surprising result led us to analyze the impact of natural selection, recombination (gene conversion), and sampling biases on demographic inference using skyline plots and site frequency spectra (SFS). Forward simulations based on biologically relevant parameters from Escherichia coli populations showed that theoretical arguments on the detrimental impact of recombination, and especially natural selection, on the reconstructed genealogies cannot be ignored in practice. In fact, both processes systematically lead to spurious interpretations of population expansion in skyline plots (and in SFS for selection). Weak purifying selection, and especially positive selection, had important effects on skyline plots, showing patterns akin to those of population expansions. State-of-the-art techniques to remove recombination further amplified these biases. We simulated three common sampling biases in microbiological research: uniform, clustered, and mixed sampling. Alone, or together with recombination and selection, they further mislead demographic inferences, producing almost any possible skyline shape or SFS. Interestingly, sampling sub-populations also affected skyline plots and SFS, because the coalescent rates of populations and their sub-populations had different distributions. This study suggests that extreme caution is needed to infer demographic changes solely based on reconstructed genealogies. We suggest that the development of novel sampling strategies and the joint analyses of diverse population genetic methods are strictly necessary to estimate demographic changes in populations where selection, recombination, and biased sampling are present. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
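To connect this to data, a minimal sketch of computing the folded site frequency spectrum, the summary statistic (alongside skyline plots) whose expansion-like distortions are at issue; the rare-variant excess here is generated artificially:

```python
import numpy as np

def folded_sfs(genotypes):
    """genotypes: (n_samples, n_sites) 0/1 matrix of derived-allele calls."""
    n = genotypes.shape[0]
    counts = genotypes.sum(axis=0)             # derived-allele count per site
    minor = np.minimum(counts, n - counts)     # fold the spectrum
    return np.bincount(minor[minor > 0], minlength=n // 2 + 1)[1:]

rng = np.random.default_rng(7)
# Per-site allele frequencies skewed toward rare variants, as selection or
# expansion would produce:
g = (rng.random((20, 500)) < rng.beta(0.5, 5, 500)).astype(int)
print(folded_sfs(g))   # an excess of singletons mimics apparent expansion
```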
Wan, Xiaomin; Peng, Liubao; Li, Yuanjian
2015-01-01
Background: In general, the individual patient-level data (IPD) collected in clinical trials are not available to independent researchers to conduct economic evaluations; researchers only have access to published survival curves and summary statistics. Thus, methods that use published survival curves and summary statistics to reproduce statistics for economic evaluations are essential. Four methods have been identified: two traditional methods, 1) the least squares method and 2) the graphical method; and two recently proposed methods, by 3) Hoyle and Henley and 4) Guyot et al. The four methods were first individually reviewed and subsequently assessed regarding their abilities to estimate mean survival through a simulation study. Methods: A number of different scenarios were developed that comprised combinations of various sample sizes, censoring rates and parametric survival distributions. One thousand simulated survival datasets were generated for each scenario, and all methods were applied to actual IPD. The uncertainty in the estimate of mean survival time was also captured. Results: All methods provided accurate estimates of the mean survival time when the sample size was 500 and a Weibull distribution was used. When the sample size was 100 and the Weibull distribution was used, the Guyot et al. method was almost as accurate as the Hoyle and Henley method; however, more biases were identified in the traditional methods. When a lognormal distribution was used, the Guyot et al. method generated noticeably less bias and a more accurate uncertainty compared with the Hoyle and Henley method. Conclusions: The traditional methods should not be preferred because of their remarkable overestimation. When the Weibull distribution was used for a fitted model, the Guyot et al. method was almost as accurate as the Hoyle and Henley method. However, if the lognormal distribution was used, the Guyot et al. method was less biased compared with the Hoyle and Henley method. PMID:25803659
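A minimal sketch in the spirit of the least squares approach reviewed here: fit a Weibull survival function to points digitized from a published Kaplan-Meier curve and derive mean survival. The digitized points are invented for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma

def weibull_surv(t, scale, shape):
    return np.exp(-(t / scale) ** shape)

t_pts = np.array([6, 12, 18, 24, 36, 48])       # months, read off the curve
s_pts = np.array([0.81, 0.63, 0.49, 0.38, 0.24, 0.15])

(scale, shape), _ = curve_fit(weibull_surv, t_pts, s_pts, p0=[20, 1])
mean_survival = scale * gamma(1 + 1 / shape)    # Weibull mean
print(f"scale={scale:.1f}, shape={shape:.2f}, mean={mean_survival:.1f} months")
```

This ignores the numbers at risk and censoring pattern, which is precisely the information the Hoyle-Henley and Guyot et al. methods exploit to do better.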
The association between ruminative thinking and negative interpretation bias in social anxiety.
Badra, Marcel; Schulze, Lars; Becker, Eni S; Vrijsen, Janna Nonja; Renneberg, Babette; Zetsche, Ulrike
2017-09-01
Cognitive models propose that both negative interpretations of ambiguous social situations and ruminative thoughts about social events contribute to the maintenance of social anxiety disorder. It has further been postulated that ruminative thoughts fuel biased negative interpretations; however, evidence is scarce. The present study used a multi-method approach to assess ruminative processing following a social interaction (post-event processing by self-report questionnaire and social rumination by experience sampling method) and negative interpretation bias (via two separate tasks) in a student sample (n = 51) screened for high (HSA) and low social anxiety (LSA). Results support the hypothesis that group differences in negative interpretations of ambiguous social situations in HSAs vs. LSAs are mediated by higher levels of post-event processing assessed in the questionnaire. Exploratory analyses highlight the potential role of comorbid depressive symptoms. The current findings help to advance the understanding of the association between two cognitive processes involved in social anxiety and stress the importance of ruminative post-event processing.
Meng, Yilin; Roux, Benoît
2015-08-11
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.
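For contrast with the regression strategy proposed here, a compact reference sketch of the standard self-consistent WHAM iteration it replaces (one order parameter, bias energies in units of kT):

```python
import numpy as np

def wham(hists, bias, n_samples, tol=1e-10, max_iter=50_000):
    """hists: (K, M) counts per window and bin; bias: (K, M) window bias
    energies at bin centers (kT units); n_samples: (K,) samples per window."""
    f = np.zeros(len(hists))                        # window free energies
    rho = None
    for _ in range(max_iter):
        # Unbiased density of states given the current window free energies.
        denom = (n_samples[:, None] * np.exp(f[:, None] - bias)).sum(axis=0)
        rho = hists.sum(axis=0) / denom
        # Self-consistency update for the window free energies.
        f_new = -np.log((rho[None, :] * np.exp(-bias)).sum(axis=1))
        if np.max(np.abs(f_new - f)) < tol:
            break
        f = f_new
    return -np.log(rho)                             # free energy profile (kT)
```

In one dimension this iteration is cheap; the cost growth the paper targets appears when the bins form a grid over several order parameters.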
NASA Astrophysics Data System (ADS)
Bilionis, I.; Koutsourelakis, P. S.
2012-05-01
The present paper proposes an adaptive biasing potential technique for the computation of free energy landscapes. It is motivated by statistical learning arguments and unifies the tasks of biasing the molecular dynamics to escape free energy wells and estimating the free energy function, under the same objective of minimizing the Kullback-Leibler divergence between appropriately selected densities. It offers rigorous convergence diagnostics even though history dependent, non-Markovian dynamics are employed. It makes use of a greedy optimization scheme in order to obtain sparse representations of the free energy function which can be particularly useful in multidimensional cases. It employs embarrassingly parallelizable sampling schemes that are based on adaptive Sequential Monte Carlo and can be readily coupled with legacy molecular dynamics simulators. The sequential nature of the learning and sampling scheme enables the efficient calculation of free energy functions parametrized by the temperature. The characteristics and capabilities of the proposed method are demonstrated in three numerical examples.
2015-01-01
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost. PMID:26574437
Wan, Xiaomin; Peng, Liubao; Li, Yuanjian
2015-01-01
In general, the individual patient-level data (IPD) collected in clinical trials are not available to independent researchers to conduct economic evaluations; researchers only have access to published survival curves and summary statistics. Thus, methods that use published survival curves and summary statistics to reproduce statistics for economic evaluations are essential. Four methods have been identified: two traditional methods 1) least squares method, 2) graphical method; and two recently proposed methods by 3) Hoyle and Henley, 4) Guyot et al. The four methods were first individually reviewed and subsequently assessed regarding their abilities to estimate mean survival through a simulation study. A number of different scenarios were developed that comprised combinations of various sample sizes, censoring rates and parametric survival distributions. One thousand simulated survival datasets were generated for each scenario, and all methods were applied to actual IPD. The uncertainty in the estimate of mean survival time was also captured. All methods provided accurate estimates of the mean survival time when the sample size was 500 and a Weibull distribution was used. When the sample size was 100 and the Weibull distribution was used, the Guyot et al. method was almost as accurate as the Hoyle and Henley method; however, more biases were identified in the traditional methods. When a lognormal distribution was used, the Guyot et al. method generated noticeably less bias and a more accurate uncertainty compared with the Hoyle and Henley method. The traditional methods should not be preferred because of their remarkable overestimation. When the Weibull distribution was used for a fitted model, the Guyot et al. method was almost as accurate as the Hoyle and Henley method. However, if the lognormal distribution was used, the Guyot et al. method was less biased compared with the Hoyle and Henley method.
Comparison of DNA preservation methods for environmental bacterial community samples.
Gray, Michael A; Pratte, Zoe A; Kellogg, Christina A
2013-02-01
Field collections of environmental samples, for example corals, for molecular microbial analyses present distinct challenges. The lack of laboratory facilities in remote locations is common, and preservation of microbial community DNA for later study is critical. A particular challenge is keeping samples frozen in transit. Five nucleic acid preservation methods that do not require cold storage were compared for effectiveness over time and ease of use. Mixed microbial communities of known composition were created and preserved by DNAgard™, RNAlater®, DMSO-EDTA-salt (DESS), FTA® cards, and FTA Elute® cards. Automated ribosomal intergenic spacer analysis and clone libraries were used to detect specific changes in the faux communities over weeks and months of storage. A previously known bias in FTA® cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard™, RNAlater®, and DESS) outperformed the card-based methods. No single liquid method clearly outperformed the others, leaving method choice to be based on experimental design, field facilities, shipping constraints, and allowable cost. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Darnaude, Audrey M.
2016-01-01
Background Mixture models (MM) can be used to describe mixed stocks considering three sets of parameters: the total number of contributing sources, their chemical baseline signatures and their mixing proportions. When all nursery sources have been previously identified and sampled for juvenile fish to produce baseline nursery-signatures, mixing proportions are the only unknown set of parameters to be estimated from the mixed-stock data. Otherwise, the number of sources, as well as some or all nursery-signatures, may also need to be estimated from the mixed-stock data. Our goal was to assess bias and uncertainty in these MM parameters when estimated using unconditional maximum likelihood approaches (ML-MM), under several incomplete sampling and nursery-signature separation scenarios. Methods We used a comprehensive dataset containing otolith elemental signatures of 301 juvenile Sparus aurata, sampled in three contrasting years (2008, 2010, 2011), from four distinct nursery habitats (Mediterranean lagoons). Artificial nursery-source and mixed-stock datasets were produced considering five different sampling scenarios, where 0–4 lagoons were excluded from the nursery-source dataset, and six nursery-signature separation scenarios, simulating data separated by 0.5, 1.5, 2.5, 3.5, 4.5 and 5.5 standard deviations among nursery-signature centroids. Bias (BI) and uncertainty (SE) were computed to assess reliability for each of the three sets of MM parameters. Results Both bias and uncertainty in mixing proportion estimates were low (BI ≤ 0.14, SE ≤ 0.06) when all nursery-sources were sampled, but exhibited large variability among cohorts and increased with the number of non-sampled sources, up to BI = 0.24 and SE = 0.11. Bias and variability in baseline signature estimates also increased with the number of non-sampled sources, but these estimates tended to be less biased, and more uncertain, than mixing proportion estimates across all sampling scenarios (BI < 0.13, SE < 0.29). Increasing separation among nursery signatures improved the reliability of mixing proportion estimates but led to non-linear responses in baseline signature parameters. Low uncertainty, but a consistent underestimation bias, affected the estimated number of nursery sources across all incomplete sampling scenarios. Discussion ML-MM produced reliable estimates of mixing proportions and nursery-signatures under an important range of incomplete sampling and nursery-signature separation scenarios. The method failed, however, in estimating the true number of nursery sources, reflecting a pervasive issue affecting mixture models within and beyond the ML framework. Large differences in bias and uncertainty among cohorts were linked to differences in the separation of chemical signatures among nursery habitats. Simulation approaches, such as those presented here, could be useful to evaluate the sensitivity of MM results to separation and variability in nursery-signatures for other species, habitats or cohorts. PMID:27761305
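When the baseline signatures are treated as known and only the mixing proportions are unknown, the maximum likelihood estimate reduces to an EM iteration over a single parameter set. A minimal sketch with invented two-element "signatures" (not the study's otolith data):

```python
# EM for mixing proportions only, with source signatures held fixed as known
# Gaussians. Means, covariance and proportions below are illustrative.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
means = [np.array([0.0, 0.0]), np.array([3.0, 1.0]), np.array([1.0, 4.0])]
cov = np.eye(2)                                   # shared covariance, for simplicity
true_p = np.array([0.5, 0.3, 0.2])
k = rng.choice(3, size=300, p=true_p)             # simulated mixed-stock sample
x = np.array([rng.multivariate_normal(means[i], cov) for i in k])

dens = np.column_stack([multivariate_normal.pdf(x, m, cov) for m in means])
p = np.full(3, 1.0 / 3.0)
for _ in range(200):                              # E-step then M-step
    resp = dens * p
    resp /= resp.sum(axis=1, keepdims=True)
    p = resp.mean(axis=0)
print("estimated mixing proportions:", p.round(3))
```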
Cutts, Felicity T; Izurieta, Hector S; Rhoda, Dale A
2013-01-01
Vaccination coverage is an important public health indicator that is measured using administrative reports and/or surveys. The measurement of vaccination coverage in low- and middle-income countries using surveys is susceptible to numerous challenges. These challenges include selection bias and information bias, which cannot be solved by increasing the sample size, and the precision of the coverage estimate, which is determined by the survey sample size and sampling method. Selection bias can result from an inaccurate sampling frame or inappropriate field procedures and, since populations likely to be missed in a vaccination coverage survey are also likely to be missed by vaccination teams, most often inflates coverage estimates. Importantly, the large multi-purpose household surveys that are often used to measure vaccination coverage have invested substantial effort to reduce selection bias. Information bias occurs when a child's vaccination status is misclassified due to mistakes on his or her vaccination record, in data transcription, in the way survey questions are presented, or in the guardian's recall of vaccination for children without a written record. There has been substantial reliance on the guardian's recall in recent surveys, and, worryingly, information bias may become more likely in the future as immunization schedules become more complex and variable. Finally, some surveys assess immunity directly using serological assays. Sero-surveys are important for assessing public health risk, but currently are unable to validate coverage estimates directly. To improve vaccination coverage estimates based on surveys, we recommend that recording tools and practices should be improved and that surveys should incorporate best practices for design, implementation, and analysis.
Jung, R.E.; Droege, S.; Sauer, J.R.; Landy, R.B.
2000-01-01
In response to concerns about amphibian declines, a study evaluating and validating amphibian monitoring techniques was initiated in Shenandoah and Big Bend National Parks in the spring of 1998. We evaluate precision, bias, and efficiency of several sampling methods for terrestrial and streamside salamanders in Shenandoah National Park and assess salamander abundance in relation to environmental variables, notably soil and water pH. Terrestrial salamanders, primarily redback salamanders (Plethodon cinereus), were sampled by searching under cover objects during the day in square plots (10 to 35 m²). We compared population indices (mean daily and total counts) with adjusted population estimates from capture-recapture. Analyses suggested that the proportion of salamanders detected (p) during sampling varied among plots, necessitating the use of adjusted population estimates. However, adjusted population estimates were less precise than population indices, and may not be efficient in relating salamander populations to environmental variables. In future sampling, strategic use of capture-recapture to verify consistency of p's among sites may be a reasonable compromise between the possibility of bias in estimation of population size and deficiencies due to inefficiency associated with the estimation of p. The streamside two-lined salamander (Eurycea bislineata) was surveyed using four methods: leaf litter refugia bags, 1 m² quadrats, 50 × 1 m visual encounter transects, and electric shocking. Comparison of survey methods at nine streams revealed congruent patterns of abundance among sites, suggesting that relative bias among the methods is similar, and that choice of survey method should be based on precision and logistical efficiency. Redback and two-lined salamander abundance were not significantly related to soil or water pH, respectively.
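For context, the simplest capture-recapture adjustment of a raw count is the Chapman version of the Lincoln-Petersen estimator. The counts below are invented and the study's own estimators were model-based, but the logic of adjusting an index for detection is the same.

```python
# Chapman (bias-corrected Lincoln-Petersen) estimator for two occasions.
def chapman_estimate(n1, n2, m2):
    """n1 marked on occasion 1, n2 caught on occasion 2, m2 recaptures."""
    n_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)
           / ((m2 + 1) ** 2 * (m2 + 2)))
    return n_hat, var

n_hat, var = chapman_estimate(n1=45, n2=38, m2=12)   # made-up counts
print(f"N-hat = {n_hat:.0f}, SE = {var ** 0.5:.1f}")
```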
Gupta, Manan; Joshi, Amitabh; Vidya, T N C
2017-01-01
Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates. Therefore, the effect of social organization on bias in population estimation could be removed by using POPAN with specific parameter combinations, to obtain population size estimates in a social species.
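The bias bookkeeping in a study like this is conceptually simple: compare estimates against the known true population size across simulation replicates. A toy sketch with invented numbers:

```python
# Relative bias of an estimator across simulation replicates, for one
# parameter combination (N-hat values below are invented).
import numpy as np

true_N = 250
estimates = np.array([238.0, 261.0, 244.0, 275.0, 230.0])  # N-hats, one per replicate
rel_bias = (estimates.mean() - true_N) / true_N
print(f"relative bias = {rel_bias:+.3f}")
```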
A New Source Biasing Approach in ADVANTG
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bevill, Aaron M; Mosher, Scott W
2012-01-01
The ADVANTG code has been developed at Oak Ridge National Laboratory to generate biased sources and weight window maps for MCNP using the CADIS and FW-CADIS methods. In preparation for an upcoming RSICC release, a new approach for generating a biased source has been developed. This improvement streamlines user input and improves reliability. Previous versions of ADVANTG generated the biased source from ADVANTG input, writing an entirely new general fixed-source definition (SDEF). Because volumetric sources were translated into SDEF format as a finite set of points, the user had to perform a convergence study to determine whether the number of source points used accurately represented the source region. Further, the large number of points that must be written in SDEF format made the MCNP input and output files excessively long and difficult to debug. ADVANTG now reads SDEF-format distributions and generates corresponding source biasing cards, eliminating the need for a convergence study. Many problems of interest use complicated source regions that are defined using cell rejection. In cell rejection, the source distribution in space is defined using an arbitrarily complex cell and a simple bounding region. Source positions are sampled within the bounding region but accepted only if they fall within the cell; otherwise, the position is resampled entirely. When biasing in space is applied to sources that use rejection sampling, current versions of MCNP do not account for the rejection in setting the source weight of histories, resulting in an 'unfair game'. This problem was circumvented in previous versions of ADVANTG by translating volumetric sources into a finite set of points, which does not alter the mean history weight (w̄). To use biasing parameters without otherwise modifying the original cell-rejection SDEF-format source, ADVANTG users now apply a correction factor for w̄ in post-processing. A stratified-random sampling approach in ADVANTG is under development to automatically report the correction factor with estimated uncertainty. This study demonstrates the use of ADVANTG's new source biasing method, including the application of w̄.
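The correction factor hinges on how often rejection sampling accepts a position in the cell. A plain (unstratified) Monte Carlo estimate with a binomial uncertainty looks like the sketch below; the cell geometry is a made-up stand-in, and ADVANTG's stratified-random implementation is more sophisticated.

```python
# Estimating an acceptance fraction (and hence a mean-weight correction)
# by Monte Carlo, with a hypothetical cell geometry.
import numpy as np

rng = np.random.default_rng(1)

def inside_cell(x, y, z):
    # hypothetical complex cell: unit sphere minus a cylindrical hole
    return (x**2 + y**2 + z**2 <= 1.0) & ~(x**2 + y**2 <= 0.1)

pts = rng.uniform(-1.0, 1.0, size=(100_000, 3))     # bounding box [-1, 1]^3
acc = inside_cell(pts[:, 0], pts[:, 1], pts[:, 2])  # vectorized accept test
f = acc.mean()
se = np.sqrt(f * (1.0 - f) / len(pts))              # binomial uncertainty
print(f"acceptance fraction = {f:.4f} +/- {se:.4f}")
```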
Conformational free energies of methyl-α-L-iduronic and methyl-β-D-glucuronic acids in water
NASA Astrophysics Data System (ADS)
Babin, Volodymyr; Sagui, Celeste
2010-03-01
We present a simulation protocol that allows for efficient sampling of the degrees of freedom of a solute in explicit solvent. The protocol involves using a nonequilibrium umbrella sampling method, in this case, the recently developed adaptively biased molecular dynamics method, to compute an approximate free energy for the slow modes of the solute in explicit solvent. This approximate free energy is then used to set up a Hamiltonian replica exchange scheme that samples both from biased and unbiased distributions. The final accurate free energy is recovered via the weighted histogram analysis technique applied to all the replicas, and equilibrium properties of the solute are computed from the unbiased trajectory. We illustrate the approach by applying it to the study of the puckering landscapes of the methyl glycosides of α-L-iduronic acid and its C5 epimer β-D-glucuronic acid in water. Big savings in computational resources are gained in comparison to the standard parallel tempering method.
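The reweighting that recovers unbiased averages from a biased trajectory can be caricatured in one dimension. Assuming reduced units, an unbiased potential U(x) = x²/2 and a bias V_b(x) = x²/4 (all invented for illustration), samples drawn from the biased ensemble are reweighted by exp(+βV_b):

```python
# One-replica caricature of un-biasing: samples follow exp(-beta*(U + V_b));
# weighting by exp(+beta*V_b) recovers unbiased averages (reduced units).
import numpy as np

rng = np.random.default_rng(2)
beta = 1.0
x = rng.normal(0.0, np.sqrt(1.0 / 1.5), 50_000)  # biased ensemble ~ exp(-0.75 x^2)
V_b = 0.25 * x**2                                # bias energy at each sample
w = np.exp(beta * V_b)                           # un-biasing weights
x2_unbiased = (x**2 * w).sum() / w.sum()
print(f"biased <x^2> = {(x**2).mean():.3f}, reweighted <x^2> = {x2_unbiased:.3f}")
```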
Quezada, Amado D; García-Guerra, Armando; Escobar, Leticia
2016-06-01
To assess the performance of a simple correction method for nutritional status estimates in children under five years of age when exact age is not available from the data. The proposed method was based on the assumption of symmetry of age distributions within a given month of age and validated in a large population-based survey sample of Mexican preschool children. The main distributional assumption was consistent with the data. All prevalence estimates derived from the correction method showed no statistically significant bias. In contrast, failing to correct attained age resulted in an underestimation of stunting in general and an overestimation of overweight or obesity among the youngest. The proposed method performed remarkably well in terms of bias correction of estimates and could be easily applied in situations in which either birth or interview dates are not available from the data.
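The abstract does not spell out the exact rule, but a correction consistent with the within-month symmetry assumption is to assign every child the midpoint of their completed month of age, so that errors cancel on average. A hypothetical one-liner:

```python
# Hypothetical midpoint assignment under the within-month symmetry assumption.
DAYS_PER_MONTH = 365.25 / 12.0

def corrected_age_days(completed_months):
    return completed_months * DAYS_PER_MONTH + DAYS_PER_MONTH / 2.0

print(round(corrected_age_days(23)))  # ~715 days for 23 completed months
```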
Performance of Random Effects Model Estimators under Complex Sampling Designs
ERIC Educational Resources Information Center
Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan
2011-01-01
In this article, we consider estimation of parameters of random effects models from samples collected via complex multistage designs. Incorporation of sampling weights is one way to reduce estimation bias due to unequal probabilities of selection. Several weighting methods have been proposed in the literature for estimating the parameters of…
Social Desirability Bias in Self-Reporting of Hearing Protector Use among Farm Operators
McCullagh, Marjorie C.; Rosemberg, Marie-Anne
2015-01-01
Objective: The purposes of this study were (i) to examine the relationship between reported hearing protector use and social desirability bias, and (ii) to compare results of the Marlowe-Crowne social desirability instrument when administered using two different methods (i.e. online and by telephone). Methods: A shortened version of the Marlowe-Crowne social desirability instrument, as well as a self-administered instrument measuring use of hearing protectors, was administered to 497 participants in a study of hearing protector use. The relationship between hearing protector use and social desirability bias was examined using regression analysis. The results of two methods of administration of the Marlowe-Crowne social desirability instrument were compared using t-tests and regression analysis. Results: Reliability (using Cronbach’s alpha) for the shortened seven-item scale for this sample was 0.58. There was no evidence of a relationship between reported hearing protector use and social desirability reporting bias, as measured by the shortened Marlowe-Crowne. The difference in results by method of administration (i.e. online, telephone) was very small. Conclusions: This is the first published study to measure social desirability bias in reporting of hearing protector use among farmers. Findings of this study do not support the presence of social desirability bias in farmers’ reporting of hearing protector use, lending support for the validity of self-report in hearing protector use in this population. PMID:26209595
Accelerating atomistic simulations through self-learning bond-boost hyperdynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perez, Danny; Voter, Arthur F
2008-01-01
By altering the potential energy landscape on which molecular dynamics are carried out, the hyperdynamics method of Voter enables one to significantly accelerate the simulation of the state-to-state dynamics of physical systems. While very powerful, successful application of the method entails solving the subtle problem of the parametrization of the so-called bias potential. In this study, we first clarify the constraints that must be obeyed by the bias potential and demonstrate that fast sampling of the biased landscape is key to obtaining proper kinetics. We then propose an approach by which the bond boost potential of Miron and Fichthorn can be safely parametrized based on data acquired in the course of a molecular dynamics simulation. Finally, we introduce a procedure, the Self-Learning Bond Boost method, in which the parametrization step is efficiently carried out on-the-fly for each new state that is visited during the simulation by safely ramping up the strength of the bias potential to its optimal value. The stability and accuracy of the method are demonstrated.
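The clock rescaling at the heart of hyperdynamics is compact enough to state directly: each MD step of length dt on the biased landscape advances physical time by dt·exp(βΔV(x)), where ΔV is the bias energy at the current configuration. A toy accumulation with invented bias energies:

```python
# Hyperdynamics time accumulation: physical time = sum(dt * exp(beta * dV)).
import numpy as np

rng = np.random.default_rng(3)
beta, dt = 1.0 / 0.025, 1e-3              # 1/(kT) in 1/eV at ~290 K; step in ps
dV = rng.uniform(0.0, 0.10, 10_000)       # bias energies along a fake trajectory (eV)
t_phys = dt * np.exp(beta * dV).sum()     # accelerated (physical) time
boost = t_phys / (dt * len(dV))           # average boost factor
print(f"boost factor ~ {boost:.1f}")
```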
Guo, Ying; Little, Roderick J; McConnell, Daniel S
2012-01-01
Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
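The "simple multiple imputation combining rules" referred to here are Rubin's rules: with m completed datasets, the pooled estimate is the mean of the per-imputation estimates and the total variance adds the between-imputation spread. A toy computation:

```python
# Rubin's combining rules for m = 5 imputed-data regression fits (toy numbers).
import numpy as np

est = np.array([0.42, 0.47, 0.40, 0.45, 0.44])       # coefficient per imputation
var = np.array([0.010, 0.012, 0.011, 0.010, 0.013])  # squared SE per imputation
m = len(est)
q_bar = est.mean()                                   # pooled point estimate
u_bar = var.mean()                                   # within-imputation variance
b = est.var(ddof=1)                                  # between-imputation variance
t_var = u_bar + (1 + 1 / m) * b                      # total variance
print(f"pooled beta = {q_bar:.3f}, SE = {t_var ** 0.5:.3f}")
```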
Schneier, Franklin R.; Kimeldorf, Marcia B.; Choo, Tse; Steinglass, Joanna E.; Wall, Melanie; Fyer, Abby J.; Simpson, H. Blair
2016-01-01
Background Attention bias to threat (selective attention toward threatening stimuli) has been frequently found in anxiety disorder samples, but its distribution both within and beyond this category is unclear. Attention bias has been studied extensively in social anxiety disorder (SAD) but relatively little in obsessive compulsive disorder (OCD), historically considered an anxiety disorder, or anorexia nervosa (AN), which is often characterized by interpersonal as well as body image/eating fears. Methods Medication-free adults with SAD (n=43), OCD (n=50), or AN (n=30), and healthy control volunteers (HC, n=74) were evaluated for attention bias with an established dot probe task presenting images of angry and neutral faces. Additional outcomes included attention bias variability (ABV), which summarizes fluctuation in attention between vigilance and avoidance, and has been reported to have superior reliability. We hypothesized that attention bias would be elevated in SAD and associated with SAD severity. Results Attention bias in each disorder did not differ from HC, but within the SAD group attention bias correlated significantly with severity of social avoidance. ABV was significantly lower in OCD versus HC, and it correlated positively with severity of OCD symptoms within the OCD group. Conclusions Findings do not support differences from HC in attention bias to threat faces for SAD, OCD, or AN. Within the SAD sample, the association of attention bias with severity of social avoidance is consistent with evidence that attention bias moderates development of social withdrawal. The association of ABV with OCD diagnosis and severity is novel and deserves further study. PMID:27174402
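For orientation, a dot-probe bias score is the mean response-time difference between trials where the probe replaces the neutral face and trials where it replaces the threat face, and ABV summarizes how bin-wise bias fluctuates over the session. The sketch below uses invented response times and a simplified ABV; published definitions use a moving trial window.

```python
# Simplified dot-probe scoring with invented reaction times (ms).
import numpy as np

rng = np.random.default_rng(4)
rt_incongruent = rng.normal(560, 60, 80)   # probe behind the neutral face
rt_congruent = rng.normal(545, 60, 80)     # probe behind the angry face
bias = rt_incongruent.mean() - rt_congruent.mean()

bins = 8                                   # bin-wise bias across the session
bin_bias = (rt_incongruent.reshape(bins, -1).mean(1)
            - rt_congruent.reshape(bins, -1).mean(1))
abv = bin_bias.std(ddof=1) / np.concatenate([rt_incongruent, rt_congruent]).mean()
print(f"bias = {bias:.1f} ms, ABV = {abv:.3f}")
```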
NASA Astrophysics Data System (ADS)
Desai, A. R.; Reed, D. E.; Dugan, H. A.; Loken, L. C.; Schramm, P.; Golub, M.; Huerd, H.; Baldocchi, A. K.; Roberts, R.; Taebel, Z.; Hart, J.; Hanson, P. C.; Stanley, E. H.; Cartwright, E.
2017-12-01
Freshwater ecosystems are hotspots of regional to global carbon cycling. However, significant sampling biases limit our ability to quantify and predict these fluxes. For lakes, scaled flux estimates suffer from sampling biased toward 1) low-nutrient pristine lakes, 2) infrequent temporal sampling, 3) field campaigns limited to the growing season, and 4) replicates limited to near the center of the lake. While these biases partly reflect the realities of ecological sampling, there is a need to extend observations toward the large fraction of freshwater systems worldwide that are impaired by human activities and those facing significant interannual variability owing to climatic change. Also, for seasonally ice-covered lakes, much of the annual budget of carbon fluxes is thought to be explained by variation in the shoulder seasons of spring ice melt and fall turnover. Recent advances in automated, continuous multi-year temporal sampling, coupled with rapid methods for spatial mapping of CO2 fluxes, have strong potential to rectify these sampling biases. Here, we demonstrate these advances in a eutrophic, seasonally ice-covered lake with an urban shoreline and an agricultural watershed. Multiple years of half-hourly eddy covariance flux tower observations from two locations are coupled with frequent spatial sampling of these fluxes and their drivers by speedboat, floating chamber flux measurements, automated buoy-based monitoring of lake nutrient and physical profiles, and an ensemble of physical-ecosystem models. High primary productivity in the water column leads to an average net carbon sink during the growing season in much of the lake, but annual net carbon fluxes show the lake can act as an annual source or sink of carbon depending on the timing of spring and fall turnover. Trophic interactions and internal waves drive shorter-term variation, while nutrients and biology drive seasonal variation. However, discrepancies remain among methods to quantify fluxes, requiring further investigation.
Ludtke, Amy S.; Woodworth, Mark T.; Marsh, Philip S.
2000-01-01
The U.S. Geological Survey operates a quality-assurance program based on the analyses of reference samples for two laboratories: the National Water Quality Laboratory and the Quality of Water Service Unit. Reference samples that contain selected inorganic, nutrient, and low-level constituents are prepared and submitted to the laboratory as disguised routine samples. The program goal is to estimate precision and bias for as many analytical methods offered by the participating laboratories as possible. Blind reference samples typically are submitted at a rate of 2 to 5 percent of the annual environmental-sample load for each constituent. The samples are distributed to the laboratories throughout the year. The reference samples are subject to the identical laboratory handling, processing, and analytical procedures as those applied to environmental samples and, therefore, have been used as an independent source to verify bias and precision of laboratory analytical methods and ambient water-quality measurements. The results are stored permanently in the National Water Information System and the Blind Sample Project's data base. During water year 1998, 95 analytical procedures were evaluated at the National Water Quality Laboratory and 63 analytical procedures were evaluated at the Quality of Water Service Unit. An overall evaluation of the inorganic and low-level constituent data for water year 1998 indicated 77 of 78 analytical procedures at the National Water Quality Laboratory met the criteria for precision. Silver (dissolved, inductively coupled plasma-mass spectrometry) was determined to be imprecise. Five of 78 analytical procedures showed bias throughout the range of reference samples: chromium (dissolved, inductively coupled plasma-atomic emission spectrometry), dissolved solids (dissolved, gravimetric), lithium (dissolved, inductively coupled plasma-atomic emission spectrometry), silver (dissolved, inductively coupled plasma-mass spectrometry), and zinc (dissolved, inductively coupled plasma-mass spectrometry). At the National Water Quality Laboratory during water year 1998, lack of precision was indicated for 2 of 17 nutrient procedures: ammonia as nitrogen (dissolved, colorimetric) and orthophosphate as phosphorus (dissolved, colorimetric). Bias was indicated throughout the reference sample range for ammonia as nitrogen (dissolved, colorimetric, low level) and nitrate plus nitrite as nitrogen (dissolved, colorimetric, low level). All analytical procedures tested at the Quality of Water Service Unit during water year 1998 met the criteria for precision. One of the 63 analytical procedures indicated a bias throughout the range of reference samples: aluminum (whole-water recoverable, inductively coupled plasma-atomic emission spectrometry, trace).
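The per-constituent evaluation logic amounts to comparing blind-sample results against the reference's most probable value: the mean deviation gauges bias and the scatter gauges precision. With invented numbers:

```python
# Bias and precision from blind reference-sample results against a most
# probable value (MPV); concentrations below are invented.
import numpy as np

measured = np.array([10.2, 9.8, 10.5, 10.1, 9.9])  # reported results, ug/L
mpv = 10.0                                          # reference most probable value
bias_pct = 100 * (measured.mean() - mpv) / mpv      # systematic offset
cv_pct = 100 * measured.std(ddof=1) / measured.mean()  # precision as %CV
print(f"bias = {bias_pct:+.1f}%, CV = {cv_pct:.1f}%")
```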
Stereotypical images and implicit weight bias in overweight/obese people
Hinman, Nova G.; Burmeister, Jacob M.; Hoffmann, Debra A.; Ashrafioun, Lisham; Koball, Afton M.
2013-01-01
Purpose In this brief report, an unanswered question in implicit weight bias research is addressed: Is weight bias stronger when obese and thin people are pictured engaging in stereotype consistent behaviors (e.g., obese—watching TV/eating junk food; thin—exercising/eating healthy) as opposed to the converse? Methods Implicit Associations Test (IAT) data were collected from two samples of overweight/obese adults participating in weight loss treatment. Both samples completed two IATs. In one IAT, obese and thin people were pictured engaging in stereotype consistent behaviors (e.g., obese—watching TV/eating junk food; thin—exercising/eating healthy). In the second IAT, obese and thin people were pictured engaging in stereotype inconsistent behaviors (e.g., obese—exercising/eating healthy; thin—watching TV/eating junk food). Results Implicit weight bias was evident regardless of whether participants viewed stereotype consistent or inconsistent pictures. However, implicit bias was significantly stronger for stereotype consistent compared to stereotype inconsistent images. Conclusion Implicit anti-fat attitudes may be connected to the way in which people with obesity are portrayed. PMID:24057679
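IAT results of this kind are typically summarized with a D-type score; the simplest variant divides the block mean latency difference by the pooled standard deviation. A toy computation with invented latencies (the study's exact scoring may differ):

```python
# Simplest D-type IAT score: block latency difference over pooled SD.
import numpy as np

rng = np.random.default_rng(5)
compatible = rng.normal(700, 120, 40)      # ms, stereotype-consistent pairings
incompatible = rng.normal(780, 130, 40)    # ms, reversed pairings
pooled_sd = np.concatenate([compatible, incompatible]).std(ddof=1)
d_score = (incompatible.mean() - compatible.mean()) / pooled_sd
print(f"D = {d_score:.2f}")
```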
Monitoring Species of Concern Using Noninvasive Genetic Sampling and Capture-Recapture Methods
2016-11-01
ABBREVIATIONS: AICc, Akaike's Information Criterion with small sample size correction; AZGFD, Arizona Game and Fish Department; BMGR, Barry M. Goldwater... MNKA, Minimum Number Known Alive; N, Abundance; Ne, Effective Population Size; NGS, Noninvasive Genetic Sampling; NGS-CR, Noninvasive Genetic... parameter estimates from capture-recapture models require sufficient sample sizes, capture probabilities and low capture biases. For NGS-CR, sample
Fischer, Jesse R.; Quist, Michael C.
2014-01-01
All freshwater fish sampling methods are biased toward particular species, sizes, and sexes and are further influenced by season, habitat, and fish behavior changes over time. However, little is known about gear-specific biases for many common fish species because few multiple-gear comparison studies exist that have incorporated seasonal dynamics. We sampled six lakes and impoundments representing a diversity of trophic and physical conditions in Iowa, USA, using multiple gear types (i.e., standard modified fyke net, mini-modified fyke net, sinking experimental gill net, bag seine, benthic trawl, boat-mounted electrofisher used diurnally and nocturnally) to determine the influence of sampling methodology and season on fisheries assessments. Specifically, we describe the influence of season on catch per unit effort, proportional size distribution, and the number of samples required to obtain 125 stock-length individuals for 12 species of recreational and ecological importance. Mean catch per unit effort generally peaked in the spring and fall as a result of increased sampling effectiveness in shallow areas and seasonal changes in habitat use (e.g., movement offshore during summer). Mean proportional size distribution decreased from spring to fall for white bass Morone chrysops, largemouth bass Micropterus salmoides, bluegill Lepomis macrochirus, and black crappie Pomoxis nigromaculatus, suggesting selectivity for large and presumably sexually mature individuals in the spring and summer. Overall, the mean number of samples required to sample 125 stock-length individuals was minimized in the fall with sinking experimental gill nets, a boat-mounted electrofisher used at night, and standard modified nets for 11 of the 12 species evaluated. Our results provide fisheries scientists with relative comparisons between several recommended standard sampling methods and illustrate the effects of seasonal variation on estimates of population indices that will be critical to the future development of standardized sampling methods for freshwater fish in lentic ecosystems.
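The two indices at the center of this comparison are quick to compute from a catch summary. With invented counts for one gear deployment:

```python
# CPUE and proportional size distribution (PSD) from an invented catch summary.
counts = {"substock": 40, "stock": 55, "quality": 30, "preferred": 10}
effort_net_nights = 12.0

catch = sum(counts.values())
cpue = catch / effort_net_nights
stock_and_up = counts["stock"] + counts["quality"] + counts["preferred"]
quality_and_up = counts["quality"] + counts["preferred"]
psd = 100.0 * quality_and_up / stock_and_up        # traditional PSD
print(f"CPUE = {cpue:.1f} fish/net-night, PSD = {psd:.0f}")
```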
Ciceri, E; Recchia, S; Dossi, C; Yang, L; Sturgeon, R E
2008-01-15
The development and validation of a method for the determination of mercury in sediments using a sector field inductively coupled plasma mass spectrometer (SF-ICP-MS) for detection is described. The utilization of isotope dilution (ID) calibration is shown to solve analytical problems related to matrix composition. Mass bias is corrected using an internal mass bias correction technique, validated against the traditional standard bracketing method. The overall analytical protocol is validated against NRCC PACS-2 marine sediment CRM. The estimated limit of detection is 12 ng/g. The proposed procedure was applied to the analysis of a real sediment core sampled to a depth of 160 m in Lake Como, where Hg concentrations ranged from 66 to 750 ng/g.
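The core isotope-dilution relation behind such measurements: with R defined as the ratio of the spike-enriched isotope to a reference isotope, the reference-isotope amount contributed by the sample follows from the spike amount and the measured blend ratio. All values below are invented:

```python
# One standard isotope-dilution relation (values invented for illustration).
def moles_from_sample(n_ref_spike, r_spike, r_sample, r_blend):
    """Moles of the reference isotope contributed by the sample aliquot."""
    return n_ref_spike * (r_spike - r_blend) / (r_blend - r_sample)

n_bx = moles_from_sample(n_ref_spike=1.0e-9, r_spike=50.0, r_sample=0.7, r_blend=3.2)
print(f"{n_bx:.2e} mol of the reference isotope from the sample")
```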
Wang, Chaolong; Schroeder, Kari B.; Rosenberg, Noah A.
2012-01-01
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy–Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology. PMID:22851645
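The downward bias in observed heterozygosity is easy to reproduce by simulation: with per-copy dropout rate g, heterozygotes are miscalled as homozygotes whenever exactly one copy drops. A quick check (the paper's EM machinery then inverts this kind of distortion):

```python
# Per-allele dropout deflates observed heterozygosity (toy simulation).
import numpy as np

rng = np.random.default_rng(6)
n, h_true, g = 20_000, 0.6, 0.15            # individuals, true het., dropout rate
is_het = rng.random(n) < h_true
drop = rng.random((n, 2)) < g               # dropout of each allelic copy
typed = ~drop.all(axis=1)                   # at least one copy amplified
seen_het = is_het & ~drop.any(axis=1)       # het calls require both copies
h_obs = seen_het[typed].mean()
print(f"true H = {h_true}, observed H = {h_obs:.3f}")
```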
Measurement and estimation of performance characteristics (i.e., precision, bias, performance range, interferences and sensitivity) are often neglected in the development and use of new biological sampling methods. However, knowledge of this information is critical in enabling p...
Importance sampling large deviations in nonequilibrium steady states. I.
Ray, Ushnish; Chan, Garnet Kin-Lic; Limmer, David T
2018-03-28
Large deviation functions contain information on the stability and response of systems driven into nonequilibrium steady states and in such a way are similar to free energies for systems at equilibrium. As with equilibrium free energies, evaluating large deviation functions numerically for all but the simplest systems is difficult because by construction they depend on exponentially rare events. In this first paper of a series, we evaluate different trajectory-based sampling methods capable of computing large deviation functions of time integrated observables within nonequilibrium steady states. We illustrate some convergence criteria and best practices using a number of different models, including a biased Brownian walker, a driven lattice gas, and a model of self-assembly. We show how two popular methods for sampling trajectory ensembles, transition path sampling and diffusion Monte Carlo, suffer from exponentially diverging correlations in trajectory space as a function of the bias parameter when estimating large deviation functions. Improving the efficiencies of these algorithms requires introducing guiding functions for the trajectories.
Schmidt, Joshua H; Wilson, Tammy L; Thompson, William L; Reynolds, Joel H
2017-07-01
Obtaining useful estimates of wildlife abundance or density requires thoughtful attention to potential sources of bias and precision, and it is widely understood that addressing incomplete detection is critical to appropriate inference. When the underlying assumptions of sampling approaches are violated, both increased bias and reduced precision of the population estimator may result. Bear (Ursus spp.) populations can be difficult to sample and are often monitored using mark-recapture distance sampling (MRDS) methods, although obtaining adequate sample sizes can be cost prohibitive. With the goal of improving inference, we examined the underlying methodological assumptions and estimator efficiency of three datasets collected under an MRDS protocol designed specifically for bears. We analyzed these data using MRDS, conventional distance sampling (CDS), and open-distance sampling approaches to evaluate the apparent bias-precision tradeoff relative to the assumptions inherent under each approach. We also evaluated the incorporation of informative priors on detection parameters within a Bayesian context. We found that the CDS estimator had low apparent bias and was more efficient than the more complex MRDS estimator. When combined with informative priors on the detection process, precision was increased by >50% compared to the MRDS approach with little apparent bias. In addition, open-distance sampling models revealed a serious violation of the assumption that all bears were available to be sampled. Inference is directly related to the underlying assumptions of the survey design and the analytical tools employed. We show that for aerial surveys of bears, avoidance of unnecessary model complexity, use of prior information, and the application of open population models can be used to greatly improve estimator performance and simplify field protocols. Although we focused on distance sampling-based aerial surveys for bears, the general concepts we addressed apply to a variety of wildlife survey contexts.
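A conventional-distance-sampling fit in miniature helps make the bias-precision tradeoff concrete: a half-normal detection function g(x) = exp(-x²/(2σ²)) fit by maximum likelihood to perpendicular distances (invented, in metres):

```python
# Half-normal detection function fit by ML to invented perpendicular distances.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

x = np.array([12.0, 3.5, 40.2, 22.1, 8.8, 15.0, 31.7, 5.2, 18.9, 27.4])
w = 50.0                                    # truncation half-width (m)

def negloglik(s):
    # f(x) = g(x) / integral_0^w g(u) du for the half-normal key
    g = np.exp(-x**2 / (2 * s**2))
    mu = s * np.sqrt(2 * np.pi) * (norm.cdf(w / s) - 0.5)  # effective strip half-width
    return -np.sum(np.log(g / mu))

s_hat = minimize_scalar(negloglik, bounds=(1.0, 100.0), method="bounded").x
print(f"sigma-hat = {s_hat:.1f} m")
```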
Online Reinforcement Learning Using a Probability Density Estimation.
Agostini, Alejandro; Celaya, Enric
2017-01-01
Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to dominate the approximation in the less sampled regions. The nonstationarity comes from the recursive nature of the estimations typical of temporal difference methods. This nonstationarity has a local profile, varying not only along the learning process but also along different regions of the state space. We propose to deal with these problems using an estimation of the probability density of samples represented with a gaussian mixture model. To deal with the nonstationarity problem, we use the common approach of introducing a forgetting factor in the updating formula. However, instead of using the same forgetting factor for the whole domain, we make it dependent on the local density of samples, which we use to estimate the nonstationarity of the function at any given input point. To address the biased sampling problem, the forgetting factor applied to each mixture component is modulated according to the new information provided in the updating, rather than forgetting depending only on time, thus avoiding undesired distortions of the approximation in less sampled regions.
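A scalar caricature of the density-modulated forgetting idea (the modulation rule below is invented for illustration, not the paper's update): estimates stay plastic where samples have been sparse and update slowly where density is high.

```python
# Caricature: effective learning rate (forgetting) shrinks with local density.
def update(value, target, density, base_rate=0.5):
    rate = base_rate / (1.0 + density)   # hypothetical modulation rule
    return value + rate * (target - value)

v = 0.0
for density, target in [(0.1, 1.0), (5.0, 1.0), (5.0, 1.0)]:
    v = update(v, target, density)
    print(round(v, 3))
```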
Molloy, Kevin; Shehu, Amarda
2013-01-01
Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories in great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13 Å apart from each other. Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. A detailed analysis on the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown to be effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers.
Cardone, Antonio; Pant, Harish; Hassan, Sergio A.
2013-01-01
Weak and ultra-weak protein-protein associations play a role in molecular recognition and can drive spontaneous self-assembly and aggregation. Such interactions are difficult to detect experimentally and pose a challenge for the force field and the sampling technique. A method is proposed to identify low-population protein-protein binding modes in aqueous solution. The method is designed to identify preferential first-encounter complexes from which the final complex(es) at equilibrium evolves. A continuum model is used to represent the effects of the solvent, which accounts for short- and long-range effects of water exclusion and for liquid-structure forces at protein/liquid interfaces. These effects control the behavior of proteins in close proximity and are optimized based on binding enthalpy data and simulations. An algorithm is described to construct a biasing function for self-adaptive configurational-bias Monte Carlo of a set of interacting proteins. The function allows mixing large and local changes in the spatial distribution of proteins, thereby enhancing sampling of relevant microstates. The method is applied to three binary systems. Generalization to multiprotein complexes is discussed. PMID:24044772
A Stellar Dynamical Black Hole Mass for the Reverberation Mapped AGN NGC 5273
NASA Astrophysics Data System (ADS)
Batiste, Merida; Bentz, Misty C.; Valluri, Monica; Onken, Christopher A.
2018-01-01
We present preliminary results from stellar dynamical modeling of the mass of the central super-massive black hole (MBH) in the active galaxy NGC 5273. NGC 5273 is one of the few AGN with a secure MBH measurement from reverberation-mapping that is also nearby enough to measure MBH with stellar dynamical modeling. Dynamical modeling and reverberation-mapping are the two most heavily favored methods of direct MBH determination in the literature; however, the specific limitations of each method mean that there are very few galaxies for which both can be used. To date only two such galaxies, NGC 3227 and NGC 4151, have MBH determinations from both methods. Given this small sample size, it is not yet clear that the two methods give consistent results. Moreover, given the inherent uncertainties and potential systematic biases in each method, it is likewise unclear whether one method should be preferred over the other. This study is part of an ongoing project to increase the sample of galaxies with secure MBH measurements from both methods, so that a direct comparison may be made. NGC 5273 provides a particularly valuable comparison because it is free of kinematic substructure (e.g. the presence of a bar, as is the case for NGC 4151) which can complicate and potentially bias results from stellar dynamical modeling. I will discuss our current results as well as the advantages and limitations of each method, and the potential sources of systematic bias that may affect comparison between results.
NASA Astrophysics Data System (ADS)
Schoenberg, Ronny; von Blanckenburg, Friedhelm
2005-04-01
Multicollector ICP-MS-based stable isotope procedures provide the capability to determine small variations in metal isotope composition of materials, but they are prone to substantial bias introduced by inadequate sample preparation. Such a "cryptic" bias is not necessarily identifiable from the measured isotope ratios. The analytical protocol for Fe isotope analyses of organic and inorganic materials described here identifies and avoids such pitfalls. In medium-mass resolution mode of the ThermoFinnigan Neptune MC-ICP-MS, a 1-ppm Fe solution with an uptake rate of 50-70 µL min⁻¹ yielded 3 × 10⁻¹¹ A on 56Fe for the ThermoFinnigan stable introduction system and 1.2-1.8 × 10⁻¹⁰ A for the ESI Apex-Q uptake system. Sensitivity was increased again 3-5-fold when using Finnigan X-cones instead of the standard H-cones. The combination of the ESI Apex-Q apparatus and X-cones allowed the determination of the isotope composition on as little as 50 ng of Fe. Fe isotope compositions were corrected for mass bias with both the standard-sample bracketing (SSB) method and by using the 65Cu/63Cu ratio of added synthetic copper (Cu-doping) as an internal monitor of mass discrimination. Both methods provide identical results on high-purity Fe solutions of either synthetic or natural samples. We prefer the SSB method because of its shorter analysis time and more straightforward correction of instrumental mass bias compared to Cu-doping. Strong error correlations of the data are observed in three-isotope diagrams. Thus, we suggest that the quality assessment in such diagrams should be performed with error ellipses rather than error bars. Reproducibility of δ56Fe, δ57Fe and δ58Fe values of natural samples alone is not a sufficient criterion for accuracy. A set of tests is outlined that identifies cryptic matrix effects and ensures a reproducible level of quality control. Using these criteria and the SSB correction method, we determined the external reproducibilities for δ56Fe, δ57Fe and δ58Fe at the 95% confidence interval from 318 measurements of 95 natural samples to be 0.049, 0.071 and 0.28‰, respectively.
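The standard-sample bracketing correction itself is simple arithmetic: reference the sample's measured ratio to the mean of the standards run immediately before and after. With invented ratios:

```python
# Standard-sample bracketing: delta value against bracketing standards.
r_std_before, r_sample, r_std_after = 15.6976, 15.7102, 15.6988  # invented 56Fe/54Fe
r_std = 0.5 * (r_std_before + r_std_after)
delta56 = (r_sample / r_std - 1.0) * 1000.0        # per mil
print(f"d56Fe = {delta56:+.3f} permil")
```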
Binns, Michael; de Atauri, Pedro; Vlysidis, Anestis; Cascante, Marta; Theodoropoulos, Constantinos
2015-02-18
Flux balance analysis is traditionally implemented to identify the maximum theoretical flux for some specified reaction and a single distribution of flux values for all the reactions present which achieve this maximum value. However it is well known that the uncertainty in reaction networks due to branches, cycles and experimental errors results in a large number of combinations of internal reaction fluxes which can achieve the same optimal flux value. In this work, we have modified the applied linear objective of flux balance analysis to include a poling penalty function, which pushes each new set of reaction fluxes away from previous solutions generated. Repeated poling-based flux balance analysis generates a sample of different solutions (a characteristic set), which represents all the possible functionality of the reaction network. Compared to existing sampling methods, for the purpose of generating a relatively "small" characteristic set, our new method is shown to obtain a higher coverage than competing methods under most conditions. The influence of the linear objective function on the sampling (the linear bias) constrains optimisation results to a subspace of optimal solutions all producing the same maximal fluxes. Visualisation of reaction fluxes plotted against each other in 2 dimensions with and without the linear bias indicates the existence of correlations between fluxes. This method of sampling is applied to the organism Actinobacillus succinogenes for the production of succinic acid from glycerol. A new method of sampling for the generation of different flux distributions (sets of individual fluxes satisfying constraints on the steady-state mass balances of intermediates) has been developed using a relatively simple modification of flux balance analysis to include a poling penalty function inside the resulting optimisation objective function. This new methodology can achieve a high coverage of the possible flux space and can be used with and without linear bias to show optimal versus sub-optimal solution spaces. Basic analysis of the Actinobacillus succinogenes system using sampling shows that in order to achieve the maximal succinic acid production CO₂ must be taken into the system. Solutions involving release of CO₂ all give sub-optimal succinic acid production.
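A toy rendering of the poling idea on a three-reaction network (one metabolite; not the A. succinogenes model): each solve maximizes the target flux while a penalty term repels the new solution from those already collected, so repeated solves sweep out alternative optima.

```python
# Poling sketch: linear objective plus a repulsive penalty on prior solutions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
S = np.array([[1.0, -1.0, -1.0]])      # v0 produces the metabolite; v1, v2 drain it
bounds = [(0.0, 10.0)] * 3
c = np.array([1.0, 0.0, 0.0])          # maximize the uptake flux v0
found = []

def objective(v):
    pole = sum(1.0 / (1e-3 + np.sum((v - p) ** 2)) for p in found)
    return -c @ v + 0.1 * pole         # linear objective + poling penalty

for _ in range(4):                     # repeated solves build a characteristic set
    res = minimize(objective, x0=10 * rng.random(3), bounds=bounds,
                   constraints={"type": "eq", "fun": lambda v: S @ v})
    found.append(res.x)
    print(np.round(res.x, 2))          # same optimal v0, different internal splits
```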
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Wei; Krishnan, Kannan M.
Exchange bias has been investigated for more than half a century and several insightful reviews, published around the year 2000, have already summarized many key experimental and theoretical aspects related to this phenomenon. Since then, due to developments in thin-film fabrication and sophisticated characterization methods, exchange bias continues to show substantial advances; in particular, recent studies on epitaxial systems, which are the focus of this review, allow many long-standing mysteries of exchange bias to be unambiguously resolved. The advantage of epitaxial samples lies in the well-defined interface structures, larger coherence lengths, and competing magnetic anisotropies, which are often negligible in polycrystalline samples. Beginning with a discussion of the microscopic spin properties at the ferromagnetic/antiferromagnetic interface, we correlate the details of spin lattices with phenomenological anisotropies, and finally connect the two by introducing realistic measurement approaches and models. We conclude by providing a brief perspective on the future of exchange bias and related studies in the context of the rapidly evolving interest in antiferromagnetic spintronics.
The Discovery of Single-Nucleotide Polymorphisms—and Inferences about Human Demographic History
Wakeley, John; Nielsen, Rasmus; Liu-Cordero, Shau Neen; Ardlie, Kristin
2001-01-01
A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation “SDL,” for “SNP-discovered locus,” in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected. An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled. PMID:11704929
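The flavor of the correction can be seen in a binomial toy model (not the paper's subdivided-coalescent machinery): condition on the locus having been variable in the discovery panel, which distorts the allele-frequency spectrum toward intermediate frequencies.

```python
# Probability a site is polymorphic in a small discovery sample of n
# chromosomes, given population allele frequency p (toy ascertainment model).
from scipy.stats import binom

n_disc = 4                                   # discovery panel size (chromosomes)
for p in (0.05, 0.15, 0.40):
    p_asc = 1.0 - binom.pmf(0, n_disc, p) - binom.pmf(n_disc, n_disc, p)
    print(f"p = {p:.2f}: P(ascertained) = {p_asc:.3f}")
```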
Nakamura, Masakazu; Iso, Hiroyasu; Kitamura, Akihiko; Imano, Hironori; Noda, Hiroyuki; Kiyama, Masahiko; Sato, Shinichi; Yamagishi, Kazumasa; Nishimura, Kunihiro; Nakai, Michikazu; Vesper, Hubert W; Teramoto, Tamio; Miyamoto, Yoshihiro
2016-11-01
Background The US Centers for Disease Control and Prevention ensured adequate performance of the routine triglycerides methods used in Japan by a chromotropic acid reference measurement procedure used by the Centers for Disease Control and Prevention lipid standardization programme as a reference point. We examined standardized data to clarify the performance of routine triglycerides methods. Methods The two routine triglycerides methods were the fluorometric method of Kessler and Lederer and the enzymatic method. The methods were standardized using 495 Centers for Disease Control and Prevention reference pools with 98 different concentrations ranging between 0.37 and 5.15 mmol/L in 141 survey runs. The triglycerides criteria for laboratories which perform triglycerides analyses are used: accuracy, as bias ≤5% from the Centers for Disease Control and Prevention reference value and precision, as measured by CV, ≤5%. Results The correlation of the bias of both methods to the Centers for Disease Control and Prevention reference method was: y (%bias) = 0.516 × (Centers for Disease Control and Prevention reference value) − 1.292 (n = 495, R2 = 0.018). Triglycerides bias at medical decision points of 1.13, 1.69 and 2.26 mmol/L was −0.71%, −0.42% and −0.13%, respectively. For the combined precision, the equation y (CV) = −0.398 × (triglycerides value) + 1.797 (n = 495, R2 = 0.081) was used. Precision was 1.35%, 1.12% and 0.90%, respectively. It was shown that triglycerides measurements at Osaka were stable for 36 years. Conclusions The epidemiologic laboratory in Japan met acceptable accuracy goals for 88.7% of all samples, and met acceptable precision goals for 97.8% of all samples measured through the Centers for Disease Control and Prevention lipid standardization programme and demonstrated stable results for an extended period of time.
Nakamura, Masakazu; Iso, Hiroyasu; Kitamura, Akihiko; Imano, Hironori; Noda, Hiroyuki; Kiyama, Masahiko; Sato, Shinichi; Yamagishi, Kazumasa; Nishimura, Kunihiro; Nakai, Michikazu; Vesper, Hubert W; Teramoto, Tamio; Miyamoto, Yoshihiro
2017-01-01
Background The US Centers for Disease Control and Prevention ensured adequate performance of the routine triglycerides methods used in Japan by a chromotropic acid reference measurement procedure used by the Centers for Disease Control and Prevention lipid standardization programme as a reference point. We examined standardized data to clarify the performance of routine triglycerides methods. Methods The two routine triglycerides methods were the fluorometric method of Kessler and Lederer and the enzymatic method. The methods were standardized using 495 Centers for Disease Control and Prevention reference pools with 98 different concentrations ranging between 0.37 and 5.15 mmol/L in 141 survey runs. The triglycerides criteria for laboratories which perform triglycerides analyses are used: accuracy, as bias ≤5% from the Centers for Disease Control and Prevention reference value and precision, as measured by CV, ≤5%. Results The correlation of the bias of both methods to the Centers for Disease Control and Prevention reference method was: y (%bias) = 0.516 × (Centers for Disease Control and Prevention reference value) −1.292 (n = 495, R2 = 0.018). Triglycerides bias at medical decision points of 1.13, 1.69 and 2.26 mmol/L was −0.71%, −0.42% and −0.13%, respectively. For the combined precision, the equation y (CV) = −0.398 × (triglycerides value) + 1.797 (n = 495, R2 = 0.081) was used. Precision was 1.35%, 1.12% and 0.90%, respectively. It was shown that triglycerides measurements at Osaka were stable for 36 years. Conclusions The epidemiologic laboratory in Japan met acceptable accuracy goals for 88.7% of all samples, and met acceptable precision goals for 97.8% of all samples measured through the Centers for Disease Control and Prevention lipid standardization programme and demonstrated stable results for an extended period of time. PMID:26680645
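The two regression equations above can be checked directly at the stated medical decision points; the few lines of Python below reproduce the reported bias and precision figures.

```python
def bias_pct(tg):
    """%bias as a function of the CDC reference value (study's regression)."""
    return 0.516 * tg - 1.292

def cv_pct(tg):
    """Combined precision (CV, %) as a function of triglycerides value."""
    return -0.398 * tg + 1.797

for tg in (1.13, 1.69, 2.26):   # medical decision points, mmol/L
    print(f"{tg:.2f} mmol/L: bias {bias_pct(tg):+.2f}%, CV {cv_pct(tg):.2f}%")
# -> biases -0.71%, -0.42%, -0.13% and CVs 1.35%, 1.12%, 0.90%, as reported
```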
Metadynamics in the conformational space nonlinearly dimensionally reduced by Isomap.
Spiwok, Vojtěch; Králová, Blanka
2011-12-14
Atomic motions in molecules are not linear. This implies that nonlinear dimensionality reduction methods can outperform linear ones in the analysis of collective atomic motions. In addition, nonlinear collective motions can be used as potentially efficient guides for biased simulation techniques. Here we present a simulation with a bias potential acting in the directions of collective motions determined by a nonlinear dimensionality reduction method. Ad hoc generated conformations of trans,trans-1,2,4-trifluorocyclooctane were analyzed by the Isomap method to map these 72-dimensional coordinates to three dimensions, as described by Brown and co-workers [J. Chem. Phys. 129, 064118 (2008)]. Metadynamics employing the three-dimensional embeddings as collective variables was applied to explore all relevant conformations of the studied system and to calculate its conformational free energy surface. The method sampled all relevant conformations (boat, boat-chair, and crown) and corresponding transition structures inaccessible by an unbiased simulation. This scheme allows essentially any parameter of the system to be used as a collective variable in biased simulations. Moreover, the scheme we used for mapping out-of-sample conformations from the 72D to 3D space can be used as a general-purpose mapping for dimensionality reduction, beyond the context of molecular modeling. © 2011 American Institute of Physics
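The embedding step can be reproduced with off-the-shelf tools. The sketch below uses scikit-learn's Isomap, which provides both the low-dimensional embedding and an out-of-sample transform; the neighbour count, the array shapes, and the use of scikit-learn itself (rather than the authors' implementation and their specific out-of-sample mapping) are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import Isomap

# Hypothetical input: conformations as flattened Cartesian coordinates,
# e.g. 24 atoms x 3 = 72 dimensions for the fluorinated cyclooctane.
X_train = np.random.rand(2000, 72)   # ad hoc generated conformations
X_md = np.random.rand(10, 72)        # new frames from a biased MD run

iso = Isomap(n_neighbors=12, n_components=3)   # neighbour count is a guess
cv_train = iso.fit_transform(X_train)   # 3D embeddings, usable as CVs
cv_md = iso.transform(X_md)             # map out-of-sample conformations
```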
NASA Astrophysics Data System (ADS)
Oh, Seok-Geun; Suh, Myoung-Seok
2017-07-01
The projection skills of five ensemble methods were analyzed according to simulation skills, training period, and ensemble members, using 198 sets of pseudo-simulation data (PSD) produced by random number generation assuming the simulated temperature of regional climate models. The PSD sets were classified into 18 categories according to the relative magnitude of bias, variance ratio, and correlation coefficient, where each category had 11 sets (including 1 truth set) with 50 samples. The ensemble methods used were as follows: equal weighted averaging without bias correction (EWA_NBC), EWA with bias correction (EWA_WBC), weighted ensemble averaging based on root mean square errors and correlation (WEA_RAC), WEA based on the Taylor score (WEA_Tay), and multivariate linear regression (Mul_Reg). The projection skills of the ensemble methods generally improved compared with the best member of each category. However, their projection skills were significantly affected by the simulation skills of the ensemble members. The weighted ensemble methods showed better projection skills than non-weighted methods, in particular, for the PSD categories having systematic biases and various correlation coefficients. The EWA_NBC showed considerably lower projection skills than the other methods, in particular, for the PSD categories with systematic biases. Although Mul_Reg showed relatively good skills, it showed strong sensitivity to the PSD categories, training periods, and number of members. On the other hand, the WEA_Tay and WEA_RAC showed relatively superior skills in both the accuracy and reliability for all the sensitivity experiments. This indicates that WEA_Tay and WEA_RAC are applicable even for simulation data with systematic biases, a short training period, and a small number of ensemble members.
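The contrast between non-weighted and weighted averaging is easy to make concrete. Below is a hedged sketch of a WEA_RAC-style weight computation; the paper's exact weight formula is not reproduced in the abstract, so weights proportional to training-period correlation divided by RMSE are an illustrative assumption.

```python
import numpy as np

def skill_weights(sims, obs):
    """sims: list of simulated training-period series, one per member;
    obs: observed series. Returns normalised skill-based member weights."""
    w = []
    for s in sims:
        rmse = np.sqrt(np.mean((np.asarray(s) - obs) ** 2))
        corr = np.corrcoef(s, obs)[0, 1]
        w.append(max(corr, 0.0) / (rmse + 1e-12))  # assumed weight form
    w = np.asarray(w)
    return w / w.sum()

# EWA, by contrast, is simply np.mean(sims, axis=0), optionally after
# removing each member's training-period bias (EWA_WBC vs EWA_NBC).
```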
Le Mens, Gaël; Denrell, Jerker
2011-04-01
Recent research has argued that several well-known judgment biases may be due to biases in the available information sample rather than to biased information processing. Most of these sample-based explanations assume that decision makers are "naive": They are not aware of the biases in the available information sample and do not correct for them. Here, we show that this "naivety" assumption is not necessary. Systematically biased judgments can emerge even when decision makers process available information perfectly and are also aware of how the information sample has been generated. Specifically, we develop a rational analysis of Denrell's (2005) experience sampling model, and we prove that when information search is interested rather than disinterested, even rational information sampling and processing can give rise to systematic patterns of errors in judgments. Our results illustrate that a tendency to favor alternatives for which outcome information is more accessible can be consistent with rational behavior. The model offers a rational explanation for behaviors that had previously been attributed to cognitive and motivational biases, such as the in-group bias or the tendency to prefer popular alternatives. © 2011 APA, all rights reserved
Parameter recovery, bias and standard errors in the linear ballistic accumulator model.
Visser, Ingmar; Poessé, Rens
2017-05-01
The linear ballistic accumulator (LBA) model (Brown & Heathcote, Cogn. Psychol., 57, 153) is increasingly popular in modelling response times from experimental data. An R package, glba, has been developed to fit the LBA model using maximum likelihood estimation, which is validated by means of a parameter recovery study. At sufficient sample sizes parameter recovery is good, whereas at smaller sample sizes there can be large bias in parameters. In a second simulation study, two methods for computing parameter standard errors are compared. The Hessian-based method is found to be adequate and is (much) faster than the alternative bootstrap method. The use of parameter standard errors in model selection and inference is illustrated in an example using data from an implicit learning experiment (Visser et al., Mem. Cogn., 35, 1502). It is shown that typical implicit learning effects are captured by different parameters of the LBA model. © 2017 The British Psychological Society.
Gravimetric Analysis of Particulate Matter using Air Samplers Housing Internal Filtration Capsules.
O'Connor, Sean; O'Connor, Paula Fey; Feng, H Amy; Ashley, Kevin
2014-10-01
An evaluation was carried out to investigate the suitability of polyvinyl chloride (PVC) internal capsules, housed within air sampling devices, for gravimetric analysis of airborne particles collected in workplaces. Experiments were carried out using blank PVC capsules and PVC capsules spiked with 0,1–4 mg of National Institute of Standards and Technology Standard Reference Material® (NIST SRM) 1648 (Urban Particulate Matter) and Arizona Road Dust (Air Cleaner Test Dust). The capsules were housed within plastic closed-face cassette samplers (CFCs). A method detection limit (MDL) of 0,075 mg per sample was estimated. Precision Sr at 0,5–4 mg per sample was 0,031 and the estimated bias was 0,058. Weight stability over 28 days was verified for both blanks and spiked capsules. Independent laboratory testing on blanks and field samples verified long-term weight stability as well as sampling and analysis precision and bias estimates. An overall precision estimate Ŝrt of 0,059 was obtained. An accuracy measure of ±15,5% was found for the gravimetric method using PVC internal capsules.
Gravimetric Analysis of Particulate Matter using Air Samplers Housing Internal Filtration Capsules
O'Connor, Sean; O'Connor, Paula Fey; Feng, H. Amy
2015-01-01
Summary An evaluation was carried out to investigate the suitability of polyvinyl chloride (PVC) internal capsules, housed within air sampling devices, for gravimetric analysis of airborne particles collected in workplaces. Experiments were carried out using blank PVC capsules and PVC capsules spiked with 0,1 – 4 mg of National Institute of Standards and Technology Standard Reference Material® (NIST SRM) 1648 (Urban Particulate Matter) and Arizona Road Dust (Air Cleaner Test Dust). The capsules were housed within plastic closed-face cassette samplers (CFCs). A method detection limit (MDL) of 0,075 mg per sample was estimated. Precision Sr at 0,5 - 4 mg per sample was 0,031 and the estimated bias was 0,058. Weight stability over 28 days was verified for both blanks and spiked capsules. Independent laboratory testing on blanks and field samples verified long-term weight stability as well as sampling and analysis precision and bias estimates. An overall precision estimate Ŝrt of 0,059 was obtained. An accuracy measure of ±15,5% was found for the gravimetric method using PVC internal capsules. PMID:26435581
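The reported ±15,5% accuracy measure is consistent with the conventional single-number criterion that folds bias and precision together. Whether this is the exact formula used in the study is an assumption, but |bias| + 1.645 × Ŝrt reproduces the figure.

```python
bias, s_rt = 0.058, 0.059   # reported bias and overall precision estimate

# Assumed combination rule (one-sided 95th percentile of a normal);
# it reproduces the reported accuracy measure.
accuracy = abs(bias) + 1.645 * s_rt
print(f"accuracy: +/-{100 * accuracy:.1f}%")   # -> +/-15.5%
```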
Enhanced conformational sampling of carbohydrates by Hamiltonian replica-exchange simulation.
Mishra, Sushil Kumar; Kara, Mahmut; Zacharias, Martin; Koca, Jaroslav
2014-01-01
Knowledge of the structure and conformational flexibility of carbohydrates in an aqueous solvent is important to improving our understanding of how carbohydrates function in biological systems. In this study, we extend a variant of the Hamiltonian replica-exchange molecular dynamics (MD) simulation to improve the conformational sampling of saccharides in an explicit solvent. During the simulations, a biasing potential along the glycosidic-dihedral linkage between the saccharide monomer units in an oligomer is applied at various levels along the replica runs to enable effective transitions between various conformations. One reference replica runs under the control of the original force field. The method was tested on disaccharide structures and further validated on biologically relevant blood group B, Lewis X and Lewis A trisaccharides. The biasing potential-based replica-exchange molecular dynamics (BP-REMD) method provided a significantly improved sampling of relevant conformational states compared with standard continuous MD simulations, with modest computational costs. Thus, the proposed BP-REMD approach adds a new dimension to existing carbohydrate conformational sampling approaches by enhancing conformational sampling in the presence of solvent molecules explicitly at relatively low computational cost.
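The core of any Hamiltonian replica-exchange scheme such as BP-REMD is the Metropolis swap criterion between neighbouring bias levels at a common temperature. A minimal sketch; the biasing potentials along the glycosidic dihedrals are the paper's and are not shown here.

```python
import math
import random

def try_swap(u_ii, u_jj, u_ij, u_ji, kT):
    """u_ab: potential energy of Hamiltonian (bias level) a evaluated on
    the configuration currently held by replica b. Returns True if the
    two replicas should exchange configurations."""
    delta = (u_ij + u_ji - u_ii - u_jj) / kT
    if delta <= 0.0:
        return True
    return random.random() < math.exp(-delta)
```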
Pal, Raktim; Kim, Ki-Hyun
2008-03-10
In this study, the analytical bias involved in the application of the 2,4-dinitrophenylhydrazine (2,4-DNPH)-coated cartridge sampling method was investigated for the analysis of five atmospheric carbonyl species (i.e., acetaldehyde, propionaldehyde, butyraldehyde, isovaleraldehyde, and valeraldehyde). In order to evaluate the potential bias of the sampling technique, a series of laboratory experiments was conducted to cover a wide range of volumes (1-20 L) and concentration levels (approximately 100-2000 ppb in the case of acetaldehyde). The results of these experiments were then evaluated in terms of the recovery rate (RR) for each carbonyl species. The detection properties of these carbonyls clearly distinguished light from heavy species in terms of RR and its relative standard error (R.S.E.), indicating that the analytical approach yields the most reliable results for light carbonyls, especially acetaldehyde. When these experimental results were tested further by a two-factor analysis of variance (ANOVA), the analysis based on the cartridge sampling method was found to be affected more by the concentration levels of samples than by the sampling volume.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feng, Tao; Tsui, Benjamin M. W.; Li, Xin
Purpose: The radioligand ¹¹C-KR31173 has been introduced for positron emission tomography (PET) imaging of the angiotensin II subtype 1 receptor in the kidney in vivo. To study the biokinetics of ¹¹C-KR31173 with a compartmental model, the input function is needed. Collection and analysis of arterial blood samples are the established approach to obtain the input function but they are not feasible in patients with renal diseases. The goal of this study was to develop a quantitative technique that can provide an accurate image-derived input function (ID-IF) to replace the conventional invasive arterial sampling and test the method in pigs with the goal of translation into human studies. Methods: The experimental animals were injected with [¹¹C]KR31173 and scanned up to 90 min with dynamic PET. Arterial blood samples were collected for the artery-derived input function (AD-IF) and used as a gold standard for the ID-IF. Before PET, magnetic resonance angiography of the kidneys was obtained to provide the anatomical information required for derivation of the recovery coefficients in the abdominal aorta, a requirement for partial volume correction of the ID-IF. Different image reconstruction methods, filtered back projection (FBP) and ordered subset expectation maximization (OS-EM), were investigated for the best trade-off between bias and variance of the ID-IF. The effects of kidney uptake on the quantitative accuracy of the ID-IF were also studied. Biological variables such as red blood cell binding and radioligand metabolism were also taken into consideration. A single blood sample was used for calibration in the later phase of the input function. Results: In the first 2 min after injection, the OS-EM based ID-IF was found to be biased, and the bias was found to be induced by the kidney uptake. No such bias was found with the FBP based image reconstruction method. However, the OS-EM based image reconstruction was found to reduce variance in the subsequent phase of the ID-IF. The combined use of FBP and OS-EM resulted in reduced bias and noise. After performing all the necessary corrections, the areas under the curves (AUCs) of the ID-IF were close to those of the AD-IF (average AUC ratio = 1 ± 0.08) during the early phase. When applied in a two-tissue-compartmental kinetic model, the average difference between the estimated model parameters from the ID-IF and AD-IF was 10%, which was within the error of the estimation method. Conclusions: The bias of radioligand concentration in the aorta from the OS-EM image reconstruction is significantly affected by radioligand uptake in the adjacent kidney and cannot be neglected for quantitative evaluation. With careful calibrations and corrections, the ID-IF derived from quantitative dynamic PET images can be used as the input function of the compartmental model to quantify the renal kinetics of ¹¹C-KR31173 in experimental animals, and the authors intend to evaluate this method in future human studies.
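The two image-based corrections named in the abstract, partial volume correction via an MRA-derived recovery coefficient and rescaling by a single late blood sample, can be sketched as follows. The function name and the exact form of the one-sample calibration are assumptions for illustration; the paper's additional corrections (red blood cell binding, metabolites) are omitted.

```python
import numpy as np

def image_derived_input_function(roi_tac, t, recovery_coeff, t_cal, blood_cal):
    """roi_tac: aortic ROI time-activity curve from dynamic PET at times t;
    recovery_coeff: recovery coefficient of the aorta ROI (from MRA);
    blood_cal: activity of one late blood sample drawn at time t_cal."""
    idif = np.asarray(roi_tac, dtype=float) / recovery_coeff  # partial volume
    scale = blood_cal / np.interp(t_cal, t, idif)   # one-sample calibration
    return idif * scale
```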
An experimental verification of laser-velocimeter sampling bias and its correction
NASA Technical Reports Server (NTRS)
Johnson, D. A.; Modarress, D.; Owen, F. K.
1982-01-01
The existence of 'sampling bias' in individual-realization laser velocimeter measurements is experimentally verified and shown to be independent of sample rate. The experiments were performed in a simple two-stream mixing shear flow with the standard for comparison being laser-velocimeter results obtained under continuous-wave conditions. It is also demonstrated that the errors resulting from sampling bias can be removed by a proper interpretation of the sampling statistics. In addition, data obtained in a shock-induced separated flow and in the near-wake of airfoils are presented, both bias-corrected and uncorrected, to illustrate the effects of sampling bias in the extreme.
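In the individual-realization laser-velocimeter literature, the "proper interpretation of the sampling statistics" is commonly implemented as inverse-velocity weighting (often attributed to McLaughlin and Tiederman): faster fluid sweeps proportionally more seed particles through the probe volume, so each realization is down-weighted by its speed. The sketch below assumes that estimator; the abstract does not state which exact form the authors used.

```python
import numpy as np

def corrected_mean_velocity(u):
    """Velocity-bias-corrected mean of individual-realization LV samples,
    weighting each realization by the reciprocal of its speed."""
    u = np.asarray(u, dtype=float)
    w = 1.0 / np.maximum(np.abs(u), 1e-12)   # guard near-zero samples
    return np.sum(w * u) / np.sum(w)
```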
40 CFR 53.35 - Test procedure for Class II and Class III methods for PM2.5 and PM-2.5
Code of Federal Regulations, 2010 CFR
2010-07-01
... reference method samplers shall be of single-filter design (not multi-filter, sequential sample design... and multiplicative bias (comparative slope and intercept). (1) For each test site, calculate the mean...
40 CFR 53.35 - Test procedure for Class II and Class III methods for PM2.5 and PM-2.5
Code of Federal Regulations, 2011 CFR
2011-07-01
... reference method samplers shall be of single-filter design (not multi-filter, sequential sample design... and multiplicative bias (comparative slope and intercept). (1) For each test site, calculate the mean...
40 CFR 53.35 - Test procedure for Class II and Class III methods for PM2.5 and PM−2.5.
Code of Federal Regulations, 2012 CFR
2012-07-01
... reference method samplers shall be of single-filter design (not multi-filter, sequential sample design... and multiplicative bias (comparative slope and intercept). (1) For each test site, calculate the mean...
Cutts, Felicity T.; Izurieta, Hector S.; Rhoda, Dale A.
2013-01-01
Vaccination coverage is an important public health indicator that is measured using administrative reports and/or surveys. The measurement of vaccination coverage in low- and middle-income countries using surveys is susceptible to numerous challenges. These challenges include selection bias and information bias, which cannot be solved by increasing the sample size, and the precision of the coverage estimate, which is determined by the survey sample size and sampling method. Selection bias can result from an inaccurate sampling frame or inappropriate field procedures and, since populations likely to be missed in a vaccination coverage survey are also likely to be missed by vaccination teams, most often inflates coverage estimates. Importantly, the large multi-purpose household surveys that are often used to measure vaccination coverage have invested substantial effort to reduce selection bias. Information bias occurs when a child's vaccination status is misclassified due to mistakes on his or her vaccination record, in data transcription, in the way survey questions are presented, or in the guardian's recall of vaccination for children without a written record. There has been substantial reliance on the guardian's recall in recent surveys, and, worryingly, information bias may become more likely in the future as immunization schedules become more complex and variable. Finally, some surveys assess immunity directly using serological assays. Sero-surveys are important for assessing public health risk, but currently are unable to validate coverage estimates directly. To improve vaccination coverage estimates based on surveys, we recommend that recording tools and practices should be improved and that surveys should incorporate best practices for design, implementation, and analysis. PMID:23667334
NASA Astrophysics Data System (ADS)
Zhang, Wenlei; Hirai, Yoshikazu; Tsuchiya, Toshiyuki; Tabata, Osamu
2018-06-01
Tensile strength and strength distribution in a microstructure of single crystal silicon (SCS) were improved significantly by coating the surface with a diamond-like carbon (DLC) film. To explore the influence of coating parameters and the mechanism of film fracture, SCS microstructure surfaces (120 × 4 × 5 μm³) were fully coated by plasma enhanced chemical vapor deposition (PECVD) of a DLC at five different bias voltages. After the depositions, Raman spectroscopy, X-ray photoelectron spectroscopy (XPS), thermal desorption spectrometry (TDS), surface profilometry, atomic force microscope (AFM) measurement, and nanoindentation methods were used to study the chemical and mechanical properties of the deposited DLC films. Tensile testing indicated that the average strength of coated samples was 13.2–29.6% higher than that of the SCS sample, and samples fabricated with a −400 V bias voltage were strongest. The fracture toughness of the DLC film was the dominant factor in the observed tensile strength. Deviations in strength were reduced with increasingly negative bias voltage. The effect of residual stress on the tensile properties is discussed in detail.
Conductance switching in Ag₂S devices fabricated by in situ sulfurization.
Morales-Masis, M; van der Molen, S J; Fu, W T; Hesselberth, M B; van Ruitenbeek, J M
2009-03-04
We report a simple and reproducible method to fabricate switchable Ag₂S devices. The α-Ag₂S thin films are produced by a sulfurization process after silver deposition on an Si substrate. Structure and composition of the Ag₂S are characterized using XRD and RBS. Our samples show semiconductor behaviour at low bias voltages, whereas they exhibit reproducible bipolar resistance switching at higher bias voltages. The transition between both types of behaviour is observed by hysteresis in the I–V curves, indicating decomposition of the Ag₂S, increasing the Ag⁺ ion mobility. The as-fabricated Ag₂S samples are a good candidate for future solid state memory devices, as they show reproducible memory resistive properties and they are fabricated by an accessible and reliable method.
Cid, Jaime A; von Davier, Alina A
2015-05-01
Test equating is a method of making the test scores from different test forms of the same assessment comparable. In the equating process, an important step involves continuizing the discrete score distributions. In traditional observed-score equating, this step is achieved using linear interpolation (or an unscaled uniform kernel). In the kernel equating (KE) process, this continuization process involves Gaussian kernel smoothing. It has been suggested that the choice of bandwidth in kernel smoothing controls the trade-off between variance and bias. In the literature on estimating density functions using kernels, it has also been suggested that the weight of the kernel depends on the sample size, and therefore, the resulting continuous distribution exhibits bias at the endpoints, where the samples are usually smaller. The purpose of this article is (a) to explore the potential effects of atypical scores (spikes) at the extreme ends (high and low) on the KE method in distributions with different degrees of asymmetry using the randomly equivalent groups equating design (Study I), and (b) to introduce the Epanechnikov and adaptive kernels as potential alternative approaches to reducing boundary bias in smoothing (Study II). The beta-binomial model is used to simulate observed scores reflecting a range of different skewed shapes.
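The kernels compared in Study II differ only in the smoothing function used to continuize the discrete score distribution; the Epanechnikov kernel's compact support is what limits leakage, and hence bias, at the score boundaries. A simplified sketch; kernel equating's mean- and variance-preserving rescaling of the continuized distribution is omitted.

```python
import numpy as np

def continuize(scores, probs, grid, h, kernel="gaussian"):
    """Kernel-continuize a discrete score distribution (scores, probs),
    evaluated on 'grid' with bandwidth h."""
    u = (grid[:, None] - scores[None, :]) / h
    if kernel == "gaussian":
        k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    else:   # Epanechnikov: zero outside |u| <= 1, less boundary leakage
        k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    return (k * probs[None, :]).sum(axis=1) / h
```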
Aczel, Balazs; Bago, Bence; Szollosi, Aba; Foldes, Andrei; Lukacs, Bence
2015-01-01
The aim of this study was to initiate the exploration of debiasing methods applicable in real-life settings for achieving lasting improvement in decision making competence regarding multiple decision biases. Here, we tested the potential of the analogical encoding method for decision debiasing. The advantage of this method is that it can foster the transfer from learning abstract principles to improving behavioral performance. For the purpose of the study, we devised an analogical debiasing technique for 10 biases (covariation detection, insensitivity to sample size, base rate neglect, regression to the mean, outcome bias, sunk cost fallacy, framing effect, anchoring bias, overconfidence bias, planning fallacy) and assessed the susceptibility of the participants (N = 154) to these biases before and 4 weeks after the training. We also compared the effect of the analogical training to the effect of ‘awareness training’ and a ‘no-training’ control group. Results suggested improved performance of the analogical training group only on tasks where violations of statistical principles are measured. The interpretation of these findings requires further investigation, yet it is possible that analogical training may be most effective for learning abstract concepts, such as statistical principles, which are otherwise difficult to master. The study encourages systematic research on debiasing trainings and the development of intervention assessment methods to measure the endurance of behavior change in decision debiasing. PMID:26300816
Ke, Peifeng; Liu, Jiawei; Chao, Yan; Wu, Xiaobin; Xiong, Yujuan; Lin, Li; Wan, Zemin; Wu, Xinzhong; Xu, Jianhua; Zhuang, Junhua; Huang, Xianzhang
2017-01-01
Introduction Thalassemia could interfere with some assays for haemoglobin A1c (HbA1c) measurement; therefore, it is useful to be able to screen for thalassemia while measuring HbA1c. We used the Capillarys 2 Flex Piercing (Capillarys 2FP) HbA1c programme to simultaneously measure HbA1c and screen for thalassemia. Materials and methods Samples from 498 normal controls and 175 thalassemia patients were analysed by the Capillarys 2FP HbA1c programme (Sebia, France). For method comparison, HbA1c was quantified by Premier Hb9210 (Trinity Biotech, Ireland) in 98 thalassaemia patient samples. For verification, HbA1c from eight thalassaemia patients was confirmed by the IFCC reference method. Results Among 98 thalassaemia samples, Capillarys 2FP did not provide an HbA1c result in three samples with HbH due to the overlap of HbBart’s with the HbA1c fraction; for the remaining 95 thalassaemia samples, a Bland-Altman plot showed 0.00 ± 0.35% absolute bias between the two systems, and a significant positive bias above 7% was observed only in two HbH samples. The HbA1c values obtained by Capillarys 2FP were consistent with the IFCC targets (relative bias below ± 6%) in all of the eight samples tested by both methods. For screening samples with alpha (α-) thalassaemia silent/trait or beta (β-) thalassemia trait, the optimal HbA2 cut-off values were ≤ 2.2% and > 2.8%, respectively. Conclusions Our results demonstrated the Capillarys 2FP HbA1c system could report an accurate HbA1c value in thalassemia silent/trait, and HbA2 value (≤ 2.2% for α-thalassaemia silent/trait and > 2.8% for β-thalassemia trait) and abnormal bands (HbH and/or HbBart’s for HbH disease, HbF for β-thalassemia) may provide valuable information for screening. PMID:28900367
The lack of selection bias in a snowball sampled case-control study on drug abuse.
Lopes, C S; Rodrigues, L C; Sichieri, R
1996-12-01
Friend controls in matched case-control studies can be a potential source of bias based on the assumption that friends are more likely to share exposure factors. This study evaluates the role of selection bias in a case-control study that used the snowball sampling method based on friendship for the selection of cases and controls. The cases selected for the study were drug abusers located in the community. Exposure was defined by the presence of at least one psychiatric diagnosis. Psychiatric and drug abuse/dependence diagnoses were made according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-III-R) criteria. Cases and controls were matched on sex, age and friendship. The measurement of selection bias was made through the comparison of the proportion of exposed controls selected by exposed cases (p1) with the proportion of exposed controls selected by unexposed cases (p2). If p1 = p2, then selection bias should not occur. The observed distribution of the 185 matched pairs having at least one psychiatric disorder showed a p1 value of 0.52 and a p2 value of 0.51, indicating no selection bias in this study. Our findings support the idea that the use of friend controls can produce a valid basis for a case-control study.
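The p1-versus-p2 comparison is straightforward to compute from the matched pairs; a minimal sketch with a hypothetical input format:

```python
def selection_bias_check(pairs):
    """pairs: iterable of (case_exposed, control_exposed) booleans,
    one per matched case-control pair. Returns (p1, p2): the proportion
    of exposed controls among those selected by exposed cases and by
    unexposed cases, respectively."""
    by_exp = [ctrl for case, ctrl in pairs if case]
    by_unexp = [ctrl for case, ctrl in pairs if not case]
    p1 = sum(by_exp) / len(by_exp)
    p2 = sum(by_unexp) / len(by_unexp)
    return p1, p2   # close values (0.52 vs 0.51 here) suggest no bias
```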
The Adaptive Biasing Force Method: Everything You Always Wanted To Know but Were Afraid To Ask
2014-01-01
In the host of numerical schemes devised to calculate free energy differences by way of geometric transformations, the adaptive biasing force algorithm has emerged as a promising route to map complex free-energy landscapes. It relies upon the simple concept that as a simulation progresses, a continuously updated biasing force is added to the equations of motion, such that in the long-time limit it yields a Hamiltonian devoid of an average force acting along the transition coordinate of interest. This means that sampling proceeds uniformly on a flat free-energy surface, thus providing reliable free-energy estimates. Much of the appeal of the algorithm to the practitioner is in its physically intuitive underlying ideas and the absence of any requirements for prior knowledge about free-energy landscapes. Since its inception in 2001, the adaptive biasing force scheme has been the subject of considerable attention, from in-depth mathematical analysis of convergence properties to novel developments and extensions. The method has also been successfully applied to many challenging problems in chemistry and biology. In this contribution, the method is presented in a comprehensive, self-contained fashion, discussing with a critical eye its properties, applicability, and inherent limitations, as well as introducing novel extensions. Through free-energy calculations of prototypical molecular systems, many methodological aspects are examined, from stratification strategies to overcoming the so-called hidden barriers in orthogonal space, relevant not only to the adaptive biasing force algorithm but also to other importance-sampling schemes. On the basis of the discussions in this paper, a number of good practices for improving the efficiency and reliability of the computed free-energy differences are proposed. PMID:25247823
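The algorithm's central bookkeeping fits in a few lines: bin the transition coordinate, accumulate the running average of the instantaneous force in each bin, and apply the negative of that average as the bias. A minimal one-dimensional sketch; production implementations add machinery (a sample-count ramp before full biasing, stratification, multiple dimensions) that is omitted here.

```python
import numpy as np

class ABF1D:
    """Minimal 1D adaptive biasing force accumulator."""

    def __init__(self, lo, hi, nbins):
        self.edges = np.linspace(lo, hi, nbins + 1)
        self.f_sum = np.zeros(nbins)
        self.count = np.zeros(nbins)

    def step(self, xi, f_inst):
        """Record the instantaneous force f_inst at coordinate value xi;
        return the biasing force to apply (minus the running mean)."""
        b = int(np.clip(np.searchsorted(self.edges, xi) - 1,
                        0, len(self.f_sum) - 1))
        self.f_sum[b] += f_inst
        self.count[b] += 1
        return -self.f_sum[b] / self.count[b]

    def free_energy(self):
        """A(xi) up to an additive constant: minus the integrated mean force."""
        mean_f = np.divide(self.f_sum, self.count,
                           out=np.zeros_like(self.f_sum),
                           where=self.count > 0)
        return -np.cumsum(mean_f * np.diff(self.edges))
```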
Peng, Enxi; Todorova, Nevena
2017-01-01
Although several computational modelling studies have investigated the conformational behaviour of inherently disordered protein (IDP) amylin, discrepancies in identifying its preferred solution conformations still exist between various forcefields and sampling methods used. Human islet amyloid polypeptide has long been a subject of research, both experimentally and theoretically, as the aggregation of this protein is believed to be the lead cause of type-II diabetes. In this work, we present a systematic forcefield assessment using one of the most advanced non-biased sampling techniques, Replica Exchange with Solute Tempering (REST2), by comparing the secondary structure preferences of monomeric amylin in solution. This study also aims to determine the ability of common forcefields to sample a transition of the protein from a helical membrane bound conformation into the disordered solution state of amylin. Our results demonstrated that the CHARMM22* forcefield showed the best ability to sample multiple conformational states inherent for amylin. It is revealed that REST2 yielded results qualitatively consistent with experiments and in quantitative agreement with other sampling methods, however far more computationally efficiently and without any bias. Therefore, combining an unbiased sampling technique such as REST2 with a vigorous forcefield testing could be suggested as an important step in developing an efficient and robust strategy for simulating IDPs. PMID:29023509
Peng, Enxi; Todorova, Nevena; Yarovsky, Irene
2017-01-01
Although several computational modelling studies have investigated the conformational behaviour of inherently disordered protein (IDP) amylin, discrepancies in identifying its preferred solution conformations still exist between various forcefields and sampling methods used. Human islet amyloid polypeptide has long been a subject of research, both experimentally and theoretically, as the aggregation of this protein is believed to be the lead cause of type-II diabetes. In this work, we present a systematic forcefield assessment using one of the most advanced non-biased sampling techniques, Replica Exchange with Solute Tempering (REST2), by comparing the secondary structure preferences of monomeric amylin in solution. This study also aims to determine the ability of common forcefields to sample a transition of the protein from a helical membrane bound conformation into the disordered solution state of amylin. Our results demonstrated that the CHARMM22* forcefield showed the best ability to sample multiple conformational states inherent for amylin. It is revealed that REST2 yielded results qualitatively consistent with experiments and in quantitative agreement with other sampling methods, however far more computationally efficiently and without any bias. Therefore, combining an unbiased sampling technique such as REST2 with a vigorous forcefield testing could be suggested as an important step in developing an efficient and robust strategy for simulating IDPs.
Zerze, Gül H; Miller, Cayla M; Granata, Daniele; Mittal, Jeetain
2015-06-09
Intrinsically disordered proteins (IDPs), which are expected to be largely unstructured under physiological conditions, make up a large fraction of eukaryotic proteins. Molecular dynamics simulations have been utilized to probe structural characteristics of these proteins, which are not always easily accessible to experiments. However, exploration of the conformational space by brute force molecular dynamics simulations is often limited by short time scales. Present literature provides a number of enhanced sampling methods to explore protein conformational space in molecular simulations more efficiently. In this work, we present a comparison of two enhanced sampling methods: temperature replica exchange molecular dynamics and bias exchange metadynamics. By investigating both the free energy landscape as a function of pertinent order parameters and the per-residue secondary structures of an IDP, namely, human islet amyloid polypeptide, we found that the two methods yield similar results as expected. We also highlight the practical difference between the two methods by describing the path that we followed to obtain both sets of data.
Exploring high dimensional free energy landscapes: Temperature accelerated sliced sampling
NASA Astrophysics Data System (ADS)
Awasthi, Shalini; Nair, Nisanth N.
2017-03-01
Biased sampling of collective variables is widely used to accelerate rare events in molecular simulations and to explore free energy surfaces. However, computational efficiency of these methods decreases with increasing number of collective variables, which severely limits the predictive power of the enhanced sampling approaches. Here we propose a method called Temperature Accelerated Sliced Sampling (TASS) that combines temperature accelerated molecular dynamics with umbrella sampling and metadynamics to sample the collective variable space in an efficient manner. The presented method can sample a large number of collective variables and is advantageous for controlled exploration of broad and unbound free energy basins. TASS is also shown to achieve quick free energy convergence and is practically usable with ab initio molecular dynamics techniques.
Spectral gap optimization of order parameters for sampling complex molecular systems
Tiwary, Pratyush; Berne, B. J.
2016-01-01
In modern-day simulations of many-body systems, much of the computational complexity is shifted to the identification of slowly changing molecular order parameters called collective variables (CVs) or reaction coordinates. A vast array of enhanced-sampling methods are based on the identification and biasing of these low-dimensional order parameters, whose fluctuations are important in driving rare events of interest. Here, we describe a new algorithm for finding optimal low-dimensional CVs for use in enhanced-sampling biasing methods like umbrella sampling, metadynamics, and related methods, when limited prior static and dynamic information is known about the system, and a much larger set of candidate CVs is specified. The algorithm involves estimating the best combination of these candidate CVs, as quantified by a maximum path entropy estimate of the spectral gap for dynamics viewed as a function of that CV. The algorithm is called spectral gap optimization of order parameters (SGOOP). Through multiple practical examples, we show how this postprocessing procedure can lead to optimization of CV and several orders of magnitude improvement in the convergence of the free energy calculated through metadynamics, essentially giving the ability to extract useful information even from unsuccessful metadynamics runs. PMID:26929365
Dorazio, R.M.; Rago, P.J.
1991-01-01
We simulated mark–recapture experiments to evaluate a method for estimating fishing mortality and migration rates of populations stratified at release and recovery. When fish released in two or more strata were recovered from different recapture strata in nearly the same proportions, conditional recapture probabilities were estimated outside the [0, 1] interval. The maximum likelihood estimates tended to be biased and imprecise when the patterns of recaptures produced extremely "flat" likelihood surfaces. Absence of bias was not guaranteed, however, in experiments where recapture rates could be estimated within the [0, 1] interval. Inadequate numbers of tag releases and recoveries also produced biased estimates, although the bias was easily detected by the high sampling variability of the estimates. A stratified tag–recapture experiment with sockeye salmon (Oncorhynchus nerka) was used to demonstrate procedures for analyzing data that produce biased estimates of recapture probabilities. An estimator was derived to examine the sensitivity of recapture rate estimates to assumed differences in natural and tagging mortality, tag loss, and incomplete reporting of tag recoveries.
Regression dilution bias: tools for correction methods and sample size calculation.
Berglund, Lars
2012-08-01
Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
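In the simplest design the correction reduces to dividing the observed slope by the reliability (attenuation) ratio estimated from duplicate measurements. A sketch under that classical measurement-error model; the article's tools additionally cover other reliability-study designs and the associated sample size calculations.

```python
import numpy as np

def deattenuated_slope(x_main, y_main, x_rep1, x_rep2):
    """x_main, y_main: risk factor and continuous outcome (main study);
    x_rep1, x_rep2: paired repeat measurements (reliability study)."""
    slope_obs = np.polyfit(x_main, y_main, 1)[0]
    var_error = 0.5 * np.mean((np.asarray(x_rep1) - np.asarray(x_rep2)) ** 2)
    var_obs = np.var(np.concatenate([x_rep1, x_rep2]), ddof=1)
    reliability = 1.0 - var_error / var_obs   # attenuation ratio
    return slope_obs / reliability
```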
El-Gabbas, Ahmed; Dormann, Carsten F
2018-02-01
Species distribution modeling (SDM) is an essential method in ecology and conservation. SDMs are often calibrated within one country's borders, typically along a limited environmental gradient with biased and incomplete data, making the quality of these models questionable. In this study, we evaluated how adequate national presence-only data are for calibrating regional SDMs. We trained SDMs for Egyptian bat species at two different scales: only within Egypt and at a species-specific global extent. We used two modeling algorithms: Maxent and elastic net, both under the point-process modeling framework. For each modeling algorithm, we measured the congruence of the predictions of global and regional models for Egypt, assuming that the lower the congruence, the lower the appropriateness of the Egyptian dataset to describe the species' niche. We inspected the effect of incorporating predictions from global models as an additional predictor ("prior") to regional models, and quantified the improvement in terms of AUC and the congruence between regional models run with and without priors. Moreover, we analyzed predictive performance improvements after correction for sampling bias at both scales. On average, predictions from global and regional models in Egypt only weakly concur. Collectively, the use of priors did not lead to much improvement: similar AUC and high congruence between regional models calibrated with and without priors. Correction for sampling bias led to higher model performance, whatever prior was used, making the benefit of priors less pronounced. Under biased and incomplete sampling, the use of global bat data did not improve regional model performance. Without enough bias-free regional data, we cannot objectively identify the actual improvement of regional models after incorporating information from the global niche. However, we still believe in the great potential of global model predictions to guide future surveys and improve regional sampling in data-poor regions.
Post-standardization of routine creatinine assays: are they suitable for clinical applications?
Jassam, Nuthar; Weykamp, Cas; Thomas, Annette; Secchiero, Sandra; Sciacovelli, Laura; Plebani, Mario; Thelen, Marc; Cobbaert, Christa; Perich, Carmen; Ricós, Carmen; Paula, Faria A; Barth, Julian H
2017-05-01
Introduction Reliable serum creatinine measurements are of vital importance for the correct classification of chronic kidney disease and early identification of kidney injury. The National Kidney Disease Education Programme working group and other groups have defined clinically acceptable analytical limits for creatinine methods. The aim of this study was to re-evaluate the performance of routine creatinine methods in the light of these defined limits so as to assess their suitability for clinical practice. Method In collaboration with the Dutch External Quality Assurance scheme, six frozen commutable samples, with a creatinine concentration ranging from 80 to 239 μmol/L and traceable to isotope dilution mass spectrometry, were circulated to 91 laboratories in four European countries for creatinine measurement and estimated glomerular filtration rate calculation. Two out of the six samples were spiked with glucose to give high and low final concentrations of glucose. Results Results from 89 laboratories were analysed for bias, imprecision (%CV) for each creatinine assay and total error for estimated glomerular filtration rate. The participating laboratories used analytical instruments from four manufacturers; Abbott, Beckman, Roche and Siemens. All enzymatic methods in this study complied with the National Kidney Disease Education Programme working group recommended limits of bias of 5% above a creatinine concentration of 100 μmol/L. They also did not show any evidence of interference from glucose. In addition, they also showed compliance with the clinically recommended %CV of ≤4% across the analytical range. In contrast, the Jaffe methods showed variable performance with regard to the interference of glucose and unsatisfactory bias and precision. Conclusion Jaffe-based creatinine methods still exhibit considerable analytical variability in terms of bias, imprecision and lack of specificity, and this variability brings into question their clinical utility. We believe that clinical laboratories and manufacturers should work together to phase out the use of relatively non-specific Jaffe methods and replace them with more specific methods that are enzyme based.
A perturbative solution to metadynamics ordinary differential equation
NASA Astrophysics Data System (ADS)
Tiwary, Pratyush; Dama, James F.; Parrinello, Michele
2015-12-01
Metadynamics is a popular enhanced sampling scheme wherein by periodic application of a repulsive bias, one can surmount high free energy barriers and explore complex landscapes. Recently, metadynamics was shown to be mathematically well founded, in the sense that the biasing procedure is guaranteed to converge to the true free energy surface in the long time limit irrespective of the precise choice of biasing parameters. A differential equation governing the post-transient convergence behavior of metadynamics was also derived. In this short communication, we revisit this differential equation, expressing it in a convenient and elegant Riccati-like form. A perturbative solution scheme is then developed for solving this differential equation, which is valid for any generic biasing kernel. The solution clearly demonstrates the robustness of metadynamics to choice of biasing parameters and gives further confidence in the widely used method.
A perturbative solution to metadynamics ordinary differential equation.
Tiwary, Pratyush; Dama, James F; Parrinello, Michele
2015-12-21
Metadynamics is a popular enhanced sampling scheme wherein by periodic application of a repulsive bias, one can surmount high free energy barriers and explore complex landscapes. Recently, metadynamics was shown to be mathematically well founded, in the sense that the biasing procedure is guaranteed to converge to the true free energy surface in the long time limit irrespective of the precise choice of biasing parameters. A differential equation governing the post-transient convergence behavior of metadynamics was also derived. In this short communication, we revisit this differential equation, expressing it in a convenient and elegant Riccati-like form. A perturbative solution scheme is then developed for solving this differential equation, which is valid for any generic biasing kernel. The solution clearly demonstrates the robustness of metadynamics to choice of biasing parameters and gives further confidence in the widely used method.
System and method for assaying radiation
DiPrete, David P; Whiteside, Tad; Pak, Donald J; DiPrete, Cecilia C
2013-11-12
A system for assaying radiation includes a sample holder configured to hold a liquid scintillation solution. A photomultiplier receives light from the liquid scintillation solution and generates a signal reflective of the light. A control circuit biases the photomultiplier and receives the signal from the photomultiplier reflective of the light. A light impermeable casing surrounds the sample holder, photomultiplier, and control circuit. A method for assaying radiation includes placing a sample in a liquid scintillation solution, placing the liquid scintillation solution in a sample holder, and placing the sample holder inside a light impermeable casing. The method further includes positioning a photomultiplier inside the light impermeable casing and supplying power to a control circuit inside the light impermeable casing.
NASA Astrophysics Data System (ADS)
Tesfagiorgis, Kibrewossen B.
Satellite Precipitation Estimates (SPEs) may be the only available source of information for operational hydrologic and flash flood prediction due to spatial limitations of radar and gauge products in mountainous regions. The present work develops an approach to seamlessly blend satellite, available radar, climatological and gauge precipitation products to fill gaps in the ground-based radar precipitation field. To mix different precipitation products, the error of any of the products relative to each other should be removed. For bias correction, the study uses a new ensemble-based method which aims to estimate spatially varying multiplicative biases in SPEs using a radar-gauge precipitation product. Bias factors were calculated for a randomly selected sample of rainy pixels in the study area. Spatial fields of estimated bias were generated taking into account spatial variation and random errors in the sampled values. In addition to biases, sometimes there is also spatial error between the radar and satellite precipitation estimates; one of them has to be geometrically corrected with reference to the other. A set of corresponding raining points between the SPE and radar products are selected to apply linear registration using a regularized least square technique to minimize the dislocation error in SPEs with respect to available radar products. A weighted Successive Correction Method (SCM) is used to merge the error-corrected satellite and radar precipitation estimates. In addition to SCM, we use a combination of SCM and a Bayesian spatial method for merging the rain gauges and climatological precipitation sources with radar and SPEs. We demonstrated the method using two satellite-based products, CPC Morphing (CMORPH) and Hydro-Estimator (HE); two radar-gauge products, Stage-II and Stage-IV; a climatological product, PRISM; and a rain gauge dataset, for several rain events from 2006 to 2008 over different geographical locations of the United States. Results show that (a) the method of ensembles helped reduce biases in SPEs significantly; and (b) the SCM method in combination with the Bayesian spatial model produced a precipitation product in good agreement with independent measurements. The study implies that, using the available radar pixels surrounding the gap area, rain gauge, PRISM and satellite products, a radar-like product is achievable over radar gap areas that benefits the operational meteorology and hydrology community.
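The bias-removal step can be caricatured in a few lines: compute multiplicative factors at a random sample of pixels where both the satellite and radar-gauge products see rain, then apply them to the satellite field. A deliberately crude sketch; the ensemble generation of spatially varying bias fields, the registration step, and the SCM/Bayesian merging are the work's actual machinery and are not reproduced.

```python
import numpy as np

def multiplicative_bias_correct(sat, radar, rng, n_samples=200):
    """sat, radar: 2D precipitation fields on a common grid;
    rng: a numpy random Generator."""
    rainy = np.flatnonzero((sat.ravel() > 0) & (radar.ravel() > 0))
    pick = rng.choice(rainy, size=min(n_samples, rainy.size), replace=False)
    factors = radar.ravel()[pick] / sat.ravel()[pick]  # sampled bias factors
    return sat * factors.mean()   # the paper uses a spatial field, not a mean
```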
Atwood, E.L.
1958-01-01
Response bias errors are studied by comparing questionnaire responses from waterfowl hunters using four large public hunting areas with actual hunting data from these areas during two hunting seasons. To the extent that the data permit, the sources of the error in the responses were studied and the contribution of each type to the total error was measured. Response bias errors, including both prestige and memory bias, were found to be very large as compared to non-response and sampling errors. Good fits were obtained with the seasonal kill distribution of the actual hunting data and the negative binomial distribution and a good fit was obtained with the distribution of total season hunting activity and the semi-logarithmic curve. A comparison of the actual seasonal distributions with the questionnaire response distributions revealed that the prestige and memory bias errors are both positive. The comparisons also revealed the tendency for memory bias errors to occur at digit frequencies divisible by five and for prestige bias errors to occur at frequencies which are multiples of the legal daily bag limit. A graphical adjustment of the response distributions was carried out by developing a smooth curve from those frequency classes not included in the predictable biased frequency classes referred to above. Group averages were used in constructing the curve, as suggested by Ezekiel [1950]. The efficiency of the technique described for reducing response bias errors in hunter questionnaire responses on seasonal waterfowl kill is high in large samples. The graphical method is not as efficient in removing response bias errors in hunter questionnaire responses on seasonal hunting activity where an average of 60 percent was removed.
Kitchen, Robert R; Sabine, Vicky S; Sims, Andrew H; Macaskill, E Jane; Renshaw, Lorna; Thomas, Jeremy S; van Hemert, Jano I; Dixon, J Michael; Bartlett, John M S
2010-02-24
Microarray technology is a popular means of producing whole genome transcriptional profiles; however, high cost and scarcity of mRNA have led many studies to be conducted based on the analysis of single samples. We exploit the design of the Illumina platform, specifically multiple arrays on each chip, to evaluate intra-experiment technical variation using repeated hybridisations of universal human reference RNA (UHRR) and duplicate hybridisations of primary breast tumour samples from a clinical study. A clear batch-specific bias was detected in the measured expressions of both the UHRR and clinical samples. This bias was found to persist following standard microarray normalisation techniques. However, when mean-centering or empirical Bayes batch-correction methods (ComBat) were applied to the data, inter-batch variation in the UHRR and clinical samples was greatly reduced. Correlation between replicate UHRR samples improved by two orders of magnitude following batch-correction using ComBat (ranging from 0.9833-0.9991 to 0.9997-0.9999) and increased the consistency of the gene-lists from the duplicate clinical samples, from 11.6% in quantile normalised data to 66.4% in batch-corrected data. The use of UHRR as an inter-batch calibrator provided a small additional benefit when used in conjunction with ComBat, further increasing the agreement between the two gene-lists, up to 74.1%. In the interests of practicalities and cost, these results suggest that single samples can generate reliable data, but only after careful compensation for technical bias in the experiment. We recommend that investigators appreciate the propensity for such variation in the design stages of a microarray experiment and that the use of suitable correction methods become routine during the statistical analysis of the data.
2010-01-01
Background Microarray technology is a popular means of producing whole genome transcriptional profiles; however, high cost and scarcity of mRNA have led many studies to be conducted based on the analysis of single samples. We exploit the design of the Illumina platform, specifically multiple arrays on each chip, to evaluate intra-experiment technical variation using repeated hybridisations of universal human reference RNA (UHRR) and duplicate hybridisations of primary breast tumour samples from a clinical study. Results A clear batch-specific bias was detected in the measured expressions of both the UHRR and clinical samples. This bias was found to persist following standard microarray normalisation techniques. However, when mean-centering or empirical Bayes batch-correction methods (ComBat) were applied to the data, inter-batch variation in the UHRR and clinical samples was greatly reduced. Correlation between replicate UHRR samples improved by two orders of magnitude following batch-correction using ComBat (ranging from 0.9833-0.9991 to 0.9997-0.9999) and increased the consistency of the gene-lists from the duplicate clinical samples, from 11.6% in quantile normalised data to 66.4% in batch-corrected data. The use of UHRR as an inter-batch calibrator provided a small additional benefit when used in conjunction with ComBat, further increasing the agreement between the two gene-lists, up to 74.1%. Conclusion In the interests of practicalities and cost, these results suggest that single samples can generate reliable data, but only after careful compensation for technical bias in the experiment. We recommend that investigators appreciate the propensity for such variation in the design stages of a microarray experiment and that the use of suitable correction methods become routine during the statistical analysis of the data. PMID:20181233
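Of the two corrections compared, mean-centering is simple enough to state exactly: subtract each gene's per-batch mean. A minimal sketch; ComBat's empirical Bayes shrinkage of batch effects is a separate published algorithm and is not reimplemented here.

```python
import numpy as np

def mean_center_batches(expr, batch):
    """expr: genes x samples expression matrix; batch: per-sample labels.
    Removes each gene's batch-specific mean."""
    out = np.asarray(expr, dtype=float).copy()
    batch = np.asarray(batch)
    for b in np.unique(batch):
        cols = batch == b
        out[:, cols] -= out[:, cols].mean(axis=1, keepdims=True)
    return out
```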
Zhou, Hanzhi; Elliott, Michael R; Raghunathan, Trivellore E
2016-06-01
Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.
Zhou, Hanzhi; Elliott, Michael R.; Raghunathan, Trivellore E.
2017-01-01
Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in “Delta-V,” a key crash severity measure. PMID:29226161
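The core resampling step can be sketched compactly. Below is a minimal, single-stage version of the weighted Pólya-urn draw behind the weighted finite-population Bayesian bootstrap, assuming case weights of at least 1 that sum (approximately) to the population size N; the two-stage cluster variants described in the article add structure on top of this and are not shown:

```python
import numpy as np

def weighted_fpbb_population(y, w, rng):
    """Generate one synthetic population from a weighted sample via a
    weighted Polya urn (simplified sketch; assumes w_i >= 1, sum(w) ~ N).

    y : (n,) array of sampled values
    w : (n,) array of case weights
    """
    n = len(y)
    N = int(round(w.sum()))
    counts = np.zeros(n)                       # times unit i already drawn
    draws = []
    for _ in range(N - n):
        probs = w - 1 + counts * (N - n) / n   # urn selection weights
        i = rng.choice(n, p=probs / probs.sum())
        counts[i] += 1
        draws.append(i)
    # synthetic population = original sample plus the urn draws; it can now
    # be treated as a simple random sample at the imputation stage
    return np.concatenate([y, y[np.array(draws, dtype=int)]])
```

Averaging an estimator over many such synthetic populations propagates the design information without having to model the design features explicitly at the imputation stage.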
Lahey, Joanna N.; Beasley, Ryan A.
2014-01-01
This paper briefly discusses the history, benefits, and shortcomings of traditional audit field experiments to study market discrimination. Specifically it identifies template bias and experimenter bias as major concerns in the traditional audit method, and demonstrates through an empirical example that computerization of a resume or correspondence audit can efficiently increase sample size and greatly mitigate these concerns. Finally, it presents a useful meta-tool that future researchers can use to create their own resume audits. PMID:24904189
Confidence Interval Coverage for Cohen's Effect Size Statistic
ERIC Educational Resources Information Center
Algina, James; Keselman, H. J.; Penfield, Randall D.
2006-01-01
Kelley compared three methods for setting a confidence interval (CI) around Cohen's standardized mean difference statistic: the noncentral-"t"-based, percentile (PERC) bootstrap, and biased-corrected and accelerated (BCA) bootstrap methods under three conditions of nonnormality, eight cases of sample size, and six cases of population…
Savoie, Jennifer G.; LeBlanc, Denis R.
2012-01-01
Field tests were conducted near the Impact Area at Camp Edwards on the Massachusetts Military Reservation (MMR), Cape Cod, Massachusetts, to determine the utility of no-purge groundwater sampling for monitoring concentrations of ordnance-related explosive compounds and perchlorate in the sand and gravel aquifer. The no-purge methods included (1) a diffusion sampler constructed of rigid porous polyethylene, (2) a diffusion sampler constructed of regenerated-cellulose membrane, and (3) a tubular grab sampler (bailer) constructed of polyethylene film. In samples from 36 monitoring wells, concentrations of perchlorate (ClO4-), hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX), and octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine (HMX), the major contaminants of concern in the Impact Area, in the no-purge samples were compared to concentrations of these compounds in samples collected by low-flow pumped sampling with dedicated bladder pumps. The monitoring wells are constructed of 2- and 2.5-inch-diameter polyvinyl chloride pipe and have approximately 5- to 10-foot-long slotted screens. The no-purge samplers were left in place for 13-64 days to ensure that ambient groundwater flow had flushed the well screen and that concentrations in the screen represented water in the adjacent formation. The sampling methods were compared first in six monitoring wells. Concentrations of ClO4-, RDX, and HMX in water samples collected by the three no-purge sampling methods and low-flow pumped sampling were in close agreement for all six monitoring wells. There was no evidence of a systematic bias in the concentration differences among the methods on the basis of type of sampling device, type of contaminant, or order in which the no-purge samplers were tested. A subsequent examination of vertical variations in concentrations of ClO4- in the 10-foot-long screens of six wells by using rigid porous polyethylene diffusion samplers indicated that concentrations in a given well varied by less than 15 percent and that the small variations were unlikely to affect the utility of the various sampling methods. The grab sampler was selected for additional tests in 29 of the 36 monitoring wells used during the study. Concentrations of ClO4-, RDX, HMX, and other minor explosive compounds in water samples collected by using a 1-liter grab sampler and low-flow pumped sampling were in close agreement in field tests in the 29 wells. A statistical analysis based on the sign test indicated that there was no bias in the concentration differences between the methods. There also was no evidence for a systematic bias in concentration differences between the methods related to the location of the monitoring wells laterally or vertically in the groundwater-flow system. Field tests in five wells also demonstrated that sample collection by using a 2-liter grab sampler and sequential bailing with the 1-liter grab sampler were options for obtaining sufficient sample volume for replicate and spiked quality-assurance and quality-control samples. The evidence from the field tests supports the conclusion that diffusion sampling with the rigid porous polyethylene and regenerated-cellulose membranes and grab sampling with the polyethylene-film samplers provide data on the concentrations of ordnance-related compounds in groundwater at the MMR comparable to those obtained by low-flow pumped sampling. These sampling methods are useful for monitoring these compounds at the MMR and in similar hydrogeologic environments.
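The paired sign test used above to compare method concentrations is straightforward to reproduce. A hedged sketch with scipy; the paired values are invented for illustration, not the study's measurements:

```python
import numpy as np
from scipy.stats import binomtest

def sign_test(a, b):
    """Paired sign test: are differences between two methods symmetric
    around zero? Ties (zero differences) are dropped, as is conventional."""
    diffs = np.asarray(a, float) - np.asarray(b, float)
    diffs = diffs[diffs != 0]
    return binomtest(int((diffs > 0).sum()), n=len(diffs), p=0.5)

# hypothetical paired RDX concentrations (micrograms per liter)
pumped   = [12.1, 3.4, 0.8, 22.5, 5.0, 1.1]
no_purge = [11.8, 3.6, 0.8, 23.0, 4.7, 1.2]
print(sign_test(pumped, no_purge).pvalue)   # large p => no systematic bias
```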
Mook, P; McCormick, J; Kanagarajah, S; Adak, G K; Cleary, P; Elson, R; Gobin, M; Hawker, J; Inns, T; Sinclair, C; Trienekens, S C M; Vivancos, R; McCarthy, N D
2018-03-01
Established methods of recruiting population controls for case-control studies to investigate gastrointestinal disease outbreaks can be time consuming, resulting in delays in identifying the source or vehicle of infection. After an initial evaluation of using online market research panel members as controls in a case-control study to investigate a Salmonella outbreak in 2013, this method was applied in four further studies in the UK between 2014 and 2016. We used data from all five studies and interviews with members of each outbreak control team and market research panel provider to review operational issues, evaluate risk of bias in this approach and consider methods to reduce confounding and bias. The investigators of each outbreak reported likely time and cost savings from using market research controls. There were systematic differences between case and control groups in some studies but no evidence that conclusions on the likely source or vehicle of infection were incorrect. Potential selection biases introduced by using this sampling frame and the low response rate are unclear. Methods that might reduce confounding and some bias should be balanced with concerns for overmatching. Further evaluation of this approach using comparisons with traditional methods and population-based exposure survey data is recommended.
Terza, Joseph V; Bradford, W David; Dismuke, Clara E
2008-01-01
Objective To investigate potential bias in the use of the conventional linear instrumental variables (IV) method for the estimation of causal effects in inherently nonlinear regression settings. Data Sources Smoking Supplement to the 1979 National Health Interview Survey, National Longitudinal Alcohol Epidemiologic Survey, and simulated data. Study Design Potential bias from the use of the linear IV method in nonlinear models is assessed via simulation studies and real world data analyses in two commonly encountered regression settings: (1) models with a nonnegative outcome (e.g., a count) and a continuous endogenous regressor; and (2) models with a binary outcome and a binary endogenous regressor. Principal Findings The simulation analyses show that substantial bias in the estimation of causal effects can result from applying the conventional IV method in inherently nonlinear regression settings. Moreover, the bias is not attenuated as the sample size increases. This point is further illustrated in the survey data analyses in which IV-based estimates of the relevant causal effects diverge substantially from those obtained with appropriate nonlinear estimation methods. Conclusions We offer this research as a cautionary note to those who would opt for the use of linear specifications in inherently nonlinear settings involving endogeneity. PMID:18546544
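The core of the argument is easy to reproduce by simulation. A sketch, assuming a Poisson outcome with a normally distributed unobserved confounder (the model and parameter values are illustrative, not those of the article): linear IV applied to the count outcome does not recover the structural coefficient, while a two-stage residual inclusion estimator does.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 20_000
z = rng.normal(size=n)                        # instrument
u = rng.normal(size=n)                        # unobserved confounder
x = 0.5 * z + u + rng.normal(size=n)          # endogenous regressor
y = rng.poisson(np.exp(0.2 + 0.5 * x + u))    # count outcome, true beta = 0.5

# shared first stage: OLS of x on z
stage1 = sm.OLS(x, sm.add_constant(z)).fit()
xhat = stage1.fittedvalues
resid = x - xhat

# (a) conventional linear IV: OLS of the count outcome on fitted x
lin_iv = sm.OLS(y, sm.add_constant(xhat)).fit()

# (b) two-stage residual inclusion: Poisson of y on x and the residual
# (standard errors would need bootstrapping; point estimate shown only)
tsri = sm.Poisson(y, sm.add_constant(np.column_stack([x, resid]))).fit(disp=0)

print("linear IV slope:", lin_iv.params[1])   # off target; bias persists with n
print("2SRI beta_x    :", tsri.params[1])     # close to 0.5
```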
The Impact of Assimilation of GPM Clear Sky Radiance on HWRF Hurricane Track and Intensity Forecasts
NASA Astrophysics Data System (ADS)
Yu, C. L.; Pu, Z.
2016-12-01
The impact of GPM microwave imager (GMI) clear sky radiances on hurricane forecasting is examined by ingesting GMI level 1C recalibrated brightness temperature into the NCEP Gridpoint Statistical Interpolation (GSI)-based ensemble-variational hybrid data assimilation system for the operational Hurricane Weather Research and Forecast (HWRF) system. The GMI clear sky radiances are compared with Community Radiative Transfer Model (CRTM) simulated radiances to closely study the quality of the radiance observations. The quality check indicates the presence of bias in various channels. A static bias correction scheme, in which the bias correction coefficients for GMI data are estimated by applying a regression method to a sufficiently large sample of data representative of the observational bias in the regions of concern, is used to correct the observational bias in GMI clear sky radiances. Forecast results with and without assimilation of GMI radiance are compared using hurricane cases from recent hurricane seasons (e.g., Hurricane Joaquin in 2015). Diagnoses of the data assimilation results show that the bias correction coefficients obtained from the regression method can correct the inherent biases in GMI radiance data, significantly reducing observational residuals. The removal of biases also allows more data to pass GSI quality control and hence to be assimilated into the model. Forecast results for Hurricane Joaquin demonstrate that the quality of the analysis from the data assimilation is sensitive to the bias correction, with positive impacts on the hurricane track forecast when systematic biases are removed from the radiance data. Details will be presented at the symposium.
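A static, regression-based bias correction of this kind reduces, in essence, to fitting observed-minus-simulated brightness temperatures against a set of bias predictors and subtracting the fit. A minimal numpy sketch; the predictor choice and numbers are illustrative, not the operational GSI configuration:

```python
import numpy as np

def fit_bias_coefficients(obs, sim, predictors):
    """Least-squares fit of observed-minus-simulated brightness temperature
    (O - B) against bias predictors, e.g. a constant and scan angle."""
    P = np.column_stack([np.ones(len(obs))] + list(predictors))
    coef, *_ = np.linalg.lstsq(P, np.asarray(obs) - np.asarray(sim), rcond=None)
    return coef

def correct_radiance(obs, predictors, coef):
    """Remove the predicted systematic bias from the observations."""
    P = np.column_stack([np.ones(len(obs))] + list(predictors))
    return np.asarray(obs) - P @ coef

# toy usage: one predictor (scan angle), made-up brightness temperatures (K)
obs  = np.array([210.3, 211.0, 209.8, 212.1])
sim  = np.array([209.1, 209.6, 208.9, 210.4])   # CRTM-simulated equivalents
scan = np.array([10.0, 20.0, 30.0, 40.0])
coef = fit_bias_coefficients(obs, sim, [scan])
corrected = correct_radiance(obs, [scan], coef)
```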
A comparison of two sampling designs for fish assemblage assessment in a large river
Kiraly, Ian A.; Coghlan, Stephen M.; Zydlewski, Joseph D.; Hayes, Daniel
2014-01-01
We compared the efficiency of stratified random and fixed-station sampling designs to characterize fish assemblages in anticipation of dam removal on the Penobscot River, the largest river in Maine. We used boat electrofishing methods in both sampling designs. Multiple 500-m transects were selected randomly and electrofished in each of nine strata within the stratified random sampling design. Within the fixed-station design, up to 11 transects (1,000 m) were electrofished, all of which had been sampled previously. In total, 88 km of shoreline were electrofished during summer and fall in 2010 and 2011, and 45,874 individuals of 34 fish species were captured. Species-accumulation and dissimilarity curve analyses indicated that all sampling effort, other than fall 2011 under the fixed-station design, provided repeatable estimates of total species richness and proportional abundances. Overall, our sampling designs were similar in precision and efficiency for sampling fish assemblages. The fixed-station design was negatively biased for estimating the abundance of species such as Common Shiner Luxilus cornutus and Fallfish Semotilus corporalis and was positively biased for estimating biomass for species such as White Sucker Catostomus commersonii and Atlantic Salmon Salmo salar. However, we found no significant differences between the designs for proportional catch and biomass per unit effort, except in fall 2011. The difference observed in fall 2011 was due to limitations on the number and location of fixed sites that could be sampled, rather than an inherent bias within the design. Given the results from sampling in the Penobscot River, application of the stratified random design is preferable to the fixed-station design due to less potential for bias caused by varying sampling effort (as occurred in the fall 2011 fixed-station sample) or by purposeful site selection.
NASA Astrophysics Data System (ADS)
VandeVondele, Joost; Rothlisberger, Ursula
2000-09-01
We present a method for calculating multidimensional free energy surfaces within the limited time scale of a first-principles molecular dynamics scheme. The sampling efficiency is enhanced using selected terms of a classical force field as a bias potential. This simple procedure yields a very substantial increase in sampling accuracy while retaining the high quality of the underlying ab initio potential surface and can thus be used for a parameter free calculation of free energy surfaces. The success of the method is demonstrated by the applications to two gas phase molecules, ethane and peroxynitrous acid, as test case systems. A statistical analysis of the results shows that the entire free energy landscape is well converged within a 40 ps simulation at 500 K, even for a system with barriers as high as 15 kcal/mol.
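Recovering unbiased averages from such a biased run follows the standard umbrella-sampling identity; a sketch of the reweighting, assuming the bias potential V_b(R) is built from the selected force-field terms (notation ours, consistent with the scheme described, with temperature T and Boltzmann constant k_B):

```latex
% Free energy along a collective variable s, and unbiasing of averages
% taken over the biased (force-field-assisted) ensemble <.>_b:
\begin{align}
  F(s) &= -k_{\mathrm B} T \,\ln\!\bigl\langle \delta\bigl(s - s(R)\bigr) \bigr\rangle, \\
  \langle A \rangle &=
    \frac{\bigl\langle A \, e^{+V_b(R)/k_{\mathrm B}T} \bigr\rangle_b}
         {\bigl\langle e^{+V_b(R)/k_{\mathrm B}T} \bigr\rangle_b}.
\end{align}
```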
Berger, Lawrence M; Bruch, Sarah K; Johnson, Elizabeth I; James, Sigrid; Rubin, David
2009-01-01
This study used data on 2,453 children aged 4-17 from the National Survey of Child and Adolescent Well-Being and 5 analytic methods that adjust for selection factors to estimate the impact of out-of-home placement on children's cognitive skills and behavior problems. Methods included ordinary least squares (OLS) regressions and residualized change, simple change, difference-in-difference, and fixed effects models. Models were estimated using the full sample and a matched sample generated by propensity scoring. Although results from the unmatched OLS and residualized change models suggested that out-of-home placement is associated with increased child behavior problems, estimates from models that more rigorously adjust for selection bias indicated that placement has little effect on children's cognitive skills or behavior problems.
Roon, David A.; Waits, L.P.; Kendall, K.C.
2005-01-01
Non-invasive genetic sampling (NGS) is becoming a popular tool for population estimation. However, multiple NGS studies have demonstrated that polymerase chain reaction (PCR) genotyping errors can bias demographic estimates. These errors can be detected by comprehensive data filters such as the multiple-tubes approach, but this approach is expensive and time consuming as it requires three to eight PCR replicates per locus. Thus, researchers have attempted to correct PCR errors in NGS datasets using non-comprehensive error checking methods, but these approaches have not been evaluated for reliability. We simulated NGS studies with and without PCR error and 'filtered' datasets using non-comprehensive approaches derived from published studies and calculated mark-recapture estimates using CAPTURE. In the absence of data-filtering, simulated error resulted in serious inflations in CAPTURE estimates; some estimates exceeded N by approximately 200%. When data filters were used, CAPTURE estimate reliability varied with per-locus error rate (E). At E = 0.01, CAPTURE estimates from filtered data displayed < 5% deviance from error-free estimates. When E was 0.05 or 0.09, some CAPTURE estimates from filtered data displayed biases in excess of 10%. Biases were positive at high sampling intensities; negative biases were observed at low sampling intensities. We caution researchers against using non-comprehensive data filters in NGS studies, unless they can achieve baseline per-locus error rates below 0.05 and, ideally, near 0.01. However, we suggest that data filters can be combined with careful technique and thoughtful NGS study design to yield accurate demographic information. © 2005 The Zoological Society of London.
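The inflation mechanism is easy to see in a toy simulation: each erroneous multilocus genotype risks creating a "ghost" individual, inflating the apparent number of unique animals. A sketch with made-up parameters, not the article's simulation design:

```python
import numpy as np

def apparent_individuals(n_ind=50, n_loci=8, n_captures=200, err=0.05, seed=0):
    """Count distinct observed multilocus genotypes under a per-locus
    genotyping error rate `err`; errors create spurious 'ghost' genotypes."""
    rng = np.random.default_rng(seed)
    genotypes = rng.integers(0, 10, size=(n_ind, n_loci))  # true genotypes
    observed = set()
    for _ in range(n_captures):
        g = genotypes[rng.integers(n_ind)].copy()
        bad = rng.random(n_loci) < err                     # loci that misread
        g[bad] = rng.integers(0, 10, bad.sum())
        observed.add(tuple(g))
    return len(observed)

print(apparent_individuals(err=0.0))    # distinct individuals actually caught
print(apparent_individuals(err=0.05))   # larger: ghosts inflate the count
```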
James T. Peterson; Nolan P. Banish; Russell F. Thurow
2005-01-01
Fish movement during sampling may negatively bias sample data and population estimates. We evaluated the short-term movements of stream-dwelling salmonids by recapture of marked individuals during day and night snorkeling and backpack electrofishing. Bull trout Salvelinus confluentus and rainbow trout Oncorhynchus mykiss were...
An Investigation of Sample Size Splitting on ATFIND and DIMTEST
ERIC Educational Resources Information Center
Socha, Alan; DeMars, Christine E.
2013-01-01
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Armijo-Olivo, Susan; Cummings, Greta G.; Amin, Maryam; Flores-Mir, Carlos
2017-01-01
Objectives To examine the risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions and the development of these aspects over time. Methods We included 540 randomized clinical trials from 64 selected systematic reviews. We extracted, in duplicate, details from each of the selected randomized clinical trials with respect to publication and trial characteristics, reporting and methodologic characteristics, and Cochrane risk of bias domains. We analyzed data using logistic regression and Chi-square statistics. Results Sequence generation was assessed to be inadequate (at unclear or high risk of bias) in 68% (n = 367) of the trials, while allocation concealment was inadequate in the majority of trials (n = 464; 85.9%). Blinding of participants and blinding of the outcome assessment were judged to be inadequate in 28.5% (n = 154) and 40.5% (n = 219) of the trials, respectively. A sample size calculation before the initiation of the study was not performed/reported in 79.1% (n = 427) of the trials, while the sample size was assessed as adequate in only 17.6% (n = 95) of the trials. Two thirds of the trials were not described as double blinded (n = 358; 66.3%), while the method of blinding was appropriate in 53% (n = 286) of the trials. We identified a significant decrease over time (1955–2013) in the proportion of trials assessed as having inadequately addressed methodological quality items (P < 0.05) in 30 out of the 40 quality criteria, or as being inadequate (at high or unclear risk of bias) in five domains of the Cochrane risk of bias tool: sequence generation, allocation concealment, incomplete outcome data, other sources of bias, and overall risk of bias. Conclusions The risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions have improved over time; however, further efforts that contribute to the development of more stringent methodology and detailed reporting of trials are still needed. PMID:29272315
2013-01-01
Background Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories at great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. Methods We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Results and conclusions Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13 Å apart from each other. Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. A detailed analysis of the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown to be effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers. PMID:24565158
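The growth loop of such a tree-based explorer can be caricatured in a few lines. A heavily simplified sketch, assuming a generic energy function (in units of kT) and Euclidean conformational coordinates; real implementations use fragment replacement moves, a projection layer, and a reactive temperature, none of which are shown:

```python
import numpy as np

def grow_tree(start, goal, energy, n_iters=5000, step=0.3, bias=2.0, seed=0):
    """Toy tree-based search for a low-energy path between two conformations,
    with a soft global bias toward progress (here, distance to the goal)."""
    rng = np.random.default_rng(seed)
    goal = np.asarray(goal, float)
    nodes, parents = [np.asarray(start, float)], [-1]
    for _ in range(n_iters):
        # soft bias: prefer expanding nodes closer to the goal region
        d = np.array([np.linalg.norm(x - goal) for x in nodes])
        w = np.exp(-bias * d / d.max())
        i = rng.choice(len(nodes), p=w / w.sum())
        cand = nodes[i] + rng.normal(0, step, size=goal.size)
        # Metropolis-style filter keeps the path energetically credible
        if rng.random() < np.exp(-(energy(cand) - energy(nodes[i]))):
            nodes.append(cand)
            parents.append(i)
            if np.linalg.norm(cand - goal) < step:        # goal region reached
                path = [len(nodes) - 1]
                while parents[path[-1]] != -1:
                    path.append(parents[path[-1]])
                return [nodes[j] for j in reversed(path)]
    return None

# two-well toy landscape standing in for a real protein energy function
energy = lambda x: (x[0] ** 2 - 1) ** 2 + x[1] ** 2
path = grow_tree(start=[-1.0, 0.0], goal=[1.0, 0.0], energy=energy)
```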
Overcoming the winner's curse: estimating penetrance parameters from case-control data.
Zollner, Sebastian; Pritchard, Jonathan K
2007-04-01
Genomewide association studies are now a widely used approach in the search for loci that affect complex traits. After detection of significant association, estimates of penetrance and allele-frequency parameters for the associated variant indicate the importance of that variant and facilitate the planning of replication studies. However, when these estimates are based on the original data used to detect the variant, the results are affected by an ascertainment bias known as the "winner's curse." The actual genetic effect is typically smaller than its estimate. This overestimation of the genetic effect may cause replication studies to fail because the necessary sample size is underestimated. Here, we present an approach that corrects for the ascertainment bias and generates an estimate of the frequency of a variant and its penetrance parameters. The method produces a point estimate and confidence region for the parameter estimates. We study the performance of this method using simulated data sets and show that it is possible to greatly reduce the bias in the parameter estimates, even when the original association study had low power. The uncertainty of the estimate decreases with increasing sample size, independent of the power of the original test for association. Finally, we show that application of the method to case-control data can improve the design of replication studies considerably.
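The ascertainment bias itself (though not the authors' likelihood-based correction) is reproduced by a few lines of simulation: conditioning on statistical significance inflates the expected effect estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
beta, se = 0.10, 0.05                      # true effect and standard error
bhat = rng.normal(beta, se, 100_000)       # estimates across many "studies"
sig = np.abs(bhat / se) > 3.29             # a stringent significance cut

print(bhat.mean())        # ~0.10: unconditionally unbiased
print(bhat[sig].mean())   # >> 0.10: the winner's curse in action
```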
Sampling and handling artifacts can bias filter-based measurements of particulate organic carbon (OC). Several measurement-based methods for OC artifact reduction and/or estimation are currently used in research-grade field studies. OC frequently is not artifact-corrected in larg...
Krueger, Aaron B; Carnell, Pauline; Carpenter, John F
2016-04-01
In many manufacturing and research areas, the ability to accurately monitor and characterize nanoparticles is becoming increasingly important. Nanoparticle tracking analysis is rapidly becoming a standard method for this characterization, yet several key factors in data acquisition and analysis may affect results. Nanoparticle tracking analysis is prone to user bias because of the large number of adjustable parameters, analyzes only a limited sample volume, and individual sample characteristics such as polydispersity or complex protein solutions may affect analysis results. This study systematically addressed these key issues. The integrated syringe pump was used to increase the sample volume analyzed. It was observed that measurements recorded under flow caused a reduction in total particle counts for both polystyrene and protein particles compared to those collected under static conditions. In addition, data for polydisperse samples tended to lose peak resolution at higher flow rates, masking distinct particle populations. Furthermore, in a bimodal particle population, a bias was seen toward the larger species within the sample. The impacts of filtration on an agitated intravenous immunoglobulin sample and operating parameters including "MINexps" and "blur" were investigated to optimize the method. Taken together, this study provides recommendations on instrument settings and sample preparation to properly characterize complex samples. Copyright © 2016. Published by Elsevier Inc.
Agashiwala, Rajiv M; Louis, Elan D; Hof, Patrick R; Perl, Daniel P
2008-10-21
Non-biased systematic sampling using the principles of stereology provides accurate quantitative estimates of objects within neuroanatomic structures. However, the basic principles of stereology are not optimally suited for counting objects that selectively exist within a limited but complex and convoluted portion of the sample, such as occurs when counting cerebellar Purkinje cells. In an effort to quantify Purkinje cells in association with certain neurodegenerative disorders, we developed a new method for stereologic sampling of the cerebellar cortex, involving calculating the volume of the cerebellar tissues, identifying and isolating the Purkinje cell layer and using this information to extrapolate non-biased systematic sampling data to estimate the total number of Purkinje cells in the tissues. Using this approach, we counted Purkinje cells in the right cerebella of four human male control specimens, aged 41, 67, 70 and 84 years, and estimated the total Purkinje cell number for the four entire cerebella to be 27.03, 19.74, 20.44 and 22.03 million cells, respectively. The precision of the method is seen when comparing the density of the cells within the tissue: 266,274, 173,166, 167,603 and 183,575 cells/cm3, respectively. Prior literature documents Purkinje cell counts ranging from 14.8 to 30.5 million cells. These data demonstrate the accuracy of our approach. Our novel approach, which offers an improvement over previous methodologies, is of value for quantitative work of this nature. This approach could be applied to morphometric studies of other similarly complex tissues as well.
Agashiwala, Rajiv M.; Louis, Elan D.; Hof, Patrick R.; Perl, Daniel P.
2010-01-01
Non-biased systematic sampling using the principles of stereology provides accurate quantitative estimates of objects within neuroanatomic structures. However, the basic principles of stereology are not optimally suited for counting objects that selectively exist within a limited but complex and convoluted portion of the sample, such as occurs when counting cerebellar Purkinje cells. In an effort to quantify Purkinje cells in association with certain neurodegenerative disorders, we developed a new method for stereologic sampling of the cerebellar cortex, involving calculating the volume of the cerebellar tissues, identifying and isolating the Purkinje cell layer and using this information to extrapolate non-biased systematic sampling data to estimate the total number of Purkinje cells in the tissues. Using this approach, we counted Purkinje cells in the right cerebella of four human male control specimens, aged 41, 67, 70 and 84 years, and estimated the total Purkinje cell number for the four entire cerebella to be 27.03, 19.74, 20.44 and 22.03 million cells, respectively. The precision of the method is seen when comparing the density of the cells within the tissue: 266,274, 173,166, 167,603 and 183,575 cells/cm3, respectively. Prior literature documents Purkinje cell counts ranging from 14.8 to 30.5 million cells. These data demonstrate the accuracy of our approach. Our novel approach, which offers an improvement over previous methodologies, is of value for quantitative work of this nature. This approach could be applied to morphometric studies of other similarly complex tissues as well. PMID:18725208
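As a consistency check on the reported figures, the total count and density imply a tissue volume (our arithmetic, first specimen; the other three specimens check out similarly):

```latex
V \;=\; \frac{N}{\rho}
  \;=\; \frac{27.03\times 10^{6}\ \text{cells}}
             {266{,}274\ \text{cells/cm}^{3}}
  \;\approx\; 101.5\ \text{cm}^{3}.
```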
Syfert, Mindy M; Smith, Matthew J; Coomes, David A
2013-01-01
Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software are faced with a variety of options, and it is not always obvious how selecting one option over another will affect model performance. Working with MaxEnt software and with tree fern presence data from New Zealand, we assessed whether (a) choosing to correct for geographical sampling bias and (b) using complex environmental response curves have strong effects on goodness of fit. SDMs were trained on tree fern data, obtained from an online biodiversity data portal, with two sources that differed in size and geographical sampling bias: a small, widely-distributed set of herbarium specimens and a large, spatially clustered set of ecological survey records. We attempted to correct for geographical sampling bias by incorporating sampling bias grids in the SDMs, created from all georeferenced vascular plants in the datasets, and explored model complexity issues by fitting a wide variety of environmental response curves (known as "feature types" in MaxEnt). In each case, goodness of fit was assessed by comparing predicted range maps with tree fern presences and absences using an independent national dataset to validate the SDMs. We found that correcting for geographical sampling bias led to major improvements in goodness of fit, but did not entirely resolve the problem: predictions made with clustered ecological data were inferior to those made with the herbarium dataset, even after sampling bias correction. We also found that the choice of feature type had negligible effects on predictive performance, indicating that simple feature types may be sufficient once sampling bias is accounted for. Our study emphasizes the importance of reducing geographical sampling bias, where possible, in datasets used to train SDMs, and the effectiveness and necessity of sampling bias correction within MaxEnt.
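A sampling bias grid of the kind described is, at its core, a smoothed density surface of all collection records. A minimal sketch (grid construction only; exporting in a MaxEnt-readable raster format is omitted, and the smoothing choice is illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sampling_bias_grid(lons, lats, bins, extent, smooth_sigma=1.0):
    """Smoothed 2-D histogram of *all* georeferenced records (e.g. every
    vascular plant), so background points inherit the survey-effort pattern.

    extent : (lon_min, lon_max, lat_min, lat_max)
    """
    H, _, _ = np.histogram2d(lons, lats, bins=bins,
                             range=[extent[:2], extent[2:]])
    H = gaussian_filter(H, smooth_sigma)
    return H / H.max() + 1e-6   # strictly positive weights for MaxEnt
```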
Pan, Minghao; Yang, Yongmin; Guan, Fengjiao; Hu, Haifeng; Xu, Hailong
2017-01-01
The accurate monitoring of blade vibration under operating conditions is essential in turbo-machinery testing. Blade tip timing (BTT) is a promising non-contact technique for the measurement of blade vibrations. However, BTT sampling data are inherently under-sampled and contaminated with several measurement uncertainties. Recovering the frequency spectra of blade vibrations from these under-sampled, biased signals is a bottleneck problem. A novel method of BTT signal processing for alleviating measurement uncertainties in the recovery of multi-mode blade vibration frequency spectra is proposed in this paper. The method can be divided into four phases. First, a single measurement vector model is built by exploiting the fact that blade vibration signals are sparse in their frequency spectra. Secondly, the uniqueness of the nonnegative sparse solution is studied to achieve the vibration frequency spectrum. Thirdly, typical sources of BTT measurement uncertainties are quantitatively analyzed. Finally, an improved vibration frequency spectra recovery method is proposed to obtain a guaranteed level of sparsity in the solution when measurement results are biased. Simulations and experiments are performed to demonstrate the feasibility of the proposed method. The most outstanding advantage is that this method can prevent the recovered multi-mode vibration spectra from being affected by BTT measurement uncertainties without increasing the number of probes. PMID:28758952
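The sparse-recovery idea (though not the authors' exact single-measurement-vector formulation or uncertainty treatment) can be sketched with a nonnegative least-squares solver over a dictionary of candidate frequencies evaluated at the irregular tip-arrival times:

```python
import numpy as np
from scipy.optimize import nnls

def recover_spectrum(t, x, freqs):
    """Nonnegative sparse amplitude spectrum from irregular, under-sampled
    data (sign-constrained sketch; a fuller version would use +/- atoms)."""
    A = np.hstack([np.cos(2 * np.pi * np.outer(t, freqs)),
                   np.sin(2 * np.pi * np.outer(t, freqs))])
    coef, _ = nnls(A, x)
    c, s = coef[:len(freqs)], coef[len(freqs):]
    return np.hypot(c, s)                     # amplitude per candidate freq

# toy test: one 123 Hz mode sampled at 60 irregular instants within 1 s
rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0, 1, 60))
x = 2.0 * np.sin(2 * np.pi * 123 * t) + 0.05 * rng.normal(size=60)
amp = recover_spectrum(t, x, np.arange(50.0, 200.0))
print(50.0 + amp.argmax())                    # ~123
```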
Texas Adolescent Tobacco and Marketing Surveillance System’s Design
Pérez, Adriana; Harrell, Melissa B.; Malkani, Raja I.; Jackson, Christian D.; Delk, Joanne; Allotey, Prince A.; Matthews, Krystin J.; Martinez, Pablo; Perry, Cheryl L.
2017-01-01
Objectives To provide a full methodological description of the design of the wave I and II (6-month follow-up) surveys of the Texas Adolescent Tobacco and Marketing Surveillance System (TATAMS), a longitudinal surveillance study of 6th, 8th, and 10th grade students who attended schools in Bexar, Dallas, Tarrant, Harris, or Travis counties, where the five largest cities in Texas (San Antonio, Dallas, Fort Worth, Houston, and Austin, respectively) are located. Methods TATAMS used a complex probability design, yielding representative estimates of these students in these counties during the 2014–2015 academic year. Weighted prevalences of tobacco, drug, and alcohol use in wave I are estimated, along with the percent (i) bias, (ii) relative bias, and (iii) relative bias ratio between waves I and II. Results The wave I sample included 79 schools and 3,907 students. The prevalence of current cigarette, e-cigarette and hookah use at wave I was 3.5%, 7.4%, and 2.5%, respectively. Small biases, mostly less than 3.5%, were observed for nonrespondents in wave II. Conclusions Even with adaptations to the sampling methodology, the resulting sample adequately represents the target population. Results from TATAMS will have important implications for future tobacco policy in Texas and federal regulation. PMID:29098172
Assessing Compliance-Effect Bias in the Two Stage Least Squares Estimator
ERIC Educational Resources Information Center
Reardon, Sean; Unlu, Fatih; Zhu, Pei; Bloom, Howard
2011-01-01
The proposed paper studies the bias in the two-stage least squares, or 2SLS, estimator that is caused by the compliance-effect covariance (hereafter, the compliance-effect bias). It starts by deriving the formula for the bias in an infinite sample (i.e., in the absence of finite sample bias) under different circumstances. Specifically, it…
Common component classification: what can we learn from machine learning?
Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S
2011-05-15
Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.
Proportional hazards model with varying coefficients for length-biased data.
Zhang, Feipeng; Chen, Xuerong; Zhou, Yong
2014-01-01
Length-biased data arise in many important applications including epidemiological cohort studies, cancer prevention trials and studies of labor economics. Such data are also often subject to right censoring due to loss to follow-up or the end of study. In this paper, we consider a proportional hazards model with varying coefficients for right-censored and length-biased data, which is used to study nonlinear interaction effects of covariates with an exposure variable. A local estimating equation method is proposed for the unknown coefficients and the intercept function in the model. The asymptotic properties of the proposed estimators are established by using martingale theory and kernel smoothing techniques. Our simulation studies demonstrate that the proposed estimators have excellent finite-sample performance. The Channing House data are analyzed to demonstrate the applications of the proposed method.
Reducing bias in survival under non-random temporary emigration
Peñaloza, Claudia L.; Kendall, William L.; Langtimm, Catherine Ann
2014-01-01
Despite intensive monitoring, temporary emigration from the sampling area can induce bias severe enough for managers to discard life-history parameter estimates toward the terminus of the time series (terminal bias). Under random temporary emigration, unbiased parameters can be estimated with CJS models. However, unmodeled Markovian temporary emigration causes bias in parameter estimates, and an unobservable state is required to model this type of emigration. The robust design is most flexible when modeling temporary emigration, and partial solutions to mitigate bias have been identified; nonetheless, there are conditions where terminal bias prevails. Long-lived species with high adult survival and highly variable non-random temporary emigration present terminal bias in survival estimates, despite being modeled with the robust design and suggested constraints. Because this bias is due to uncertainty about the fate of individuals that are undetected toward the end of the time series, solutions should involve using additional information on the survival status or location of these individuals at that time. Using simulation, we evaluated the performance of models that jointly analyze robust design data and an additional source of ancillary data (predictive covariate on temporary emigration, telemetry, dead recovery, or auxiliary resightings) in reducing terminal bias in survival estimates. The auxiliary resighting and predictive covariate models reduced terminal bias the most. Additional telemetry data were effective at reducing terminal bias only when individuals were tracked for a minimum of two years. High adult survival of long-lived species made the joint model with recovery data ineffective at reducing terminal bias because of small-sample bias. The naïve constraint model (last and penultimate temporary emigration parameters made equal) was the least efficient, though still able to reduce terminal bias when compared to an unconstrained model. Joint analysis of several sources of data improved parameter estimates and reduced terminal bias. Efforts to incorporate or acquire such data should be considered by researchers and wildlife managers, especially in the years leading up to status assessments of species of interest. Simulation modeling is a very cost-effective method to explore the potential impacts of using different sources of data to produce high quality demographic data to inform management.
Attentional bias in smokers: exposure to dynamic smoking cues in contemporary movies.
Lochbuehler, Kirsten; Voogd, Hubert; Scholte, Ron H J; Engels, Rutger C M E
2011-04-01
Research has shown that smokers have an attentional bias for pictorial smoking cues. The objective of the present study was to examine whether smokers also have an attentional bias for dynamic smoking cues in contemporary movies and therefore fixate more quickly, more often and for longer periods of time on dynamic smoking cues than non-smokers. By drawing upon established methods for assessing attentional biases for pictorial cues, we aimed to develop a new method for assessing attentional biases for dynamic smoking cues. We examined smokers' and non-smokers' eye movements while watching a movie clip by using eye-tracking technology. The sample consisted of 16 smoking and 17 non-smoking university students. Our results confirm the results of traditional pictorial attentional bias research. Smokers initially directed their gaze more quickly towards smoking-related cues (p = 0.01), focusing on them more often (p = 0.05) and for a longer duration (p = 0.01) compared with non-smokers. Thus, smoking cues in movies directly affect the attention of smokers. These findings indicate that the effects of dynamic smoking cues, in addition to other environmental smoking cues, need to be taken into account in smoking cessation therapies in order to increase successful smoking cessation and to prevent relapses.
Garbarino, John R.
1999-01-01
The inductively coupled plasma–mass spectrometric (ICP–MS) methods have been expanded to include the determination of dissolved arsenic, boron, lithium, selenium, strontium, thallium, and vanadium in filtered, acidified natural water. Method detection limits for these elements are now 10 to 200 times lower than by former U.S. Geological Survey (USGS) methods, thus providing lower variability at ambient concentrations. The bias and variability of the method were determined by using results from spike recoveries, standard reference materials, and validation samples. Spike recoveries at 5 to 10 times the method detection limit and 75 micrograms per liter in reagent-water, surface-water, and groundwater matrices averaged 93 percent for seven replicates, although selected elemental recoveries in a groundwater matrix with an extremely high iron sulfate concentration were negatively biased by 30 percent. Results for standard reference materials were within 1 standard deviation of the most probable value. Statistical analysis of the results from about 60 filtered, acidified natural-water samples indicated that there was no significant difference between ICP–MS and former USGS official methods of analysis.
Adjustment of pesticide concentrations for temporal changes in analytical recovery, 1992–2010
Martin, Jeffrey D.; Eberle, Michael
2011-01-01
Recovery is the proportion of a target analyte that is quantified by an analytical method and is a primary indicator of the analytical bias of a measurement. Recovery is measured by analysis of quality-control (QC) water samples that have known amounts of target analytes added ("spiked" QC samples). For pesticides, recovery is the measured amount of pesticide in the spiked QC sample expressed as a percentage of the amount spiked, ideally 100 percent. Temporal changes in recovery have the potential to adversely affect time-trend analysis of pesticide concentrations by introducing trends in apparent environmental concentrations that are caused by trends in performance of the analytical method rather than by trends in pesticide use or other environmental conditions. This report presents data and models related to the recovery of 44 pesticides and 8 pesticide degradates (hereafter referred to as "pesticides") that were selected for a national analysis of time trends in pesticide concentrations in streams. Water samples were analyzed for these pesticides from 1992 through 2010 by gas chromatography/mass spectrometry. Recovery was measured by analysis of pesticide-spiked QC water samples. Models of recovery, based on robust, locally weighted scatterplot smooths (lowess smooths) of matrix spikes, were developed separately for groundwater and stream-water samples. The models of recovery can be used to adjust concentrations of pesticides measured in groundwater or stream-water samples to 100 percent recovery to compensate for temporal changes in the performance (bias) of the analytical method.
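The adjustment itself is a one-liner once the recovery smooth exists: divide each environmental concentration by the modeled fractional recovery at its sampling date. A sketch using the lowess smoother in statsmodels (assuming a version that supports the xvals argument; the smoothing fraction and variable names are illustrative, not the report's fitted models):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def recovery_adjust(conc, sample_dates, spike_dates, spike_recovery_pct):
    """Adjust measured concentrations to 100% recovery using a lowess smooth
    of matrix-spike recoveries (percent) over time (decimal dates)."""
    modeled = lowess(spike_recovery_pct, spike_dates, frac=0.5,
                     xvals=np.asarray(sample_dates, float))
    return np.asarray(conc, float) * 100.0 / modeled
```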
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated whether sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more realistic inverse-J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
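The Jensen's-inequality effect reported here can be reproduced with a toy two-stage matrix model (vital rates and matrix structure invented for illustration, not the study's data): lambda computed from estimated rates is biased at small n even though each rate estimate is itself unbiased.

```python
import numpy as np

def lam(A):
    """Population growth rate: dominant eigenvalue of a projection matrix."""
    return max(np.real(np.linalg.eigvals(A)))

def build(s, g, f):
    """Toy 2-stage matrix from survival s, transition g, fecundity f."""
    return np.array([[s * (1 - g), f],
                     [s * g,       s]])

rng = np.random.default_rng(5)
s, g, f = 0.5, 0.3, 1.2
true_lam = lam(build(s, g, f))
for n in (10, 50, 500):                       # individuals sampled per rate
    lams = [lam(build(rng.binomial(n, s) / n, rng.binomial(n, g) / n, f))
            for _ in range(5000)]
    print(n, np.mean(lams) - true_lam)        # bias shrinks as n grows
```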
Light distribution modulated diffuse reflectance spectroscopy.
Huang, Pin-Yuan; Chien, Chun-Yu; Sheu, Chia-Rong; Chen, Yu-Wen; Tseng, Sheng-Hao
2016-06-01
Typically, a diffuse reflectance spectroscopy (DRS) system employing a continuous wave light source would need to acquire diffuse reflectances measured at multiple source-detector separations for determining the absorption and reduced scattering coefficients of turbid samples. This results in a multi-fiber probe structure and an indefinite probing depth. Here we present a novel DRS method that can utilize a few diffuse reflectances measured at one source-detector separation for recovering the optical properties of samples. The core of the innovation is a liquid crystal (LC) cell whose scattering property can be modulated by the bias voltage. By placing the LC cell between the light source and the sample, the spatial distribution of light in the sample can be varied as the scattering property of the LC cell is modulated by the bias voltage, and this induces intensity variation of the collected diffuse reflectance. From a series of Monte Carlo simulations and phantom measurements, we found that this new light distribution modulated DRS (LDM DRS) system was capable of accurately recovering the absorption and scattering coefficients of turbid samples, and that its probing depth varied by less than 3% over the full bias voltage variation range. Our results suggest that this LDM DRS platform could be developed into various low-cost, efficient, and compact systems for in-vivo superficial tissue investigation.
Light distribution modulated diffuse reflectance spectroscopy
Huang, Pin-Yuan; Chien, Chun-Yu; Sheu, Chia-Rong; Chen, Yu-Wen; Tseng, Sheng-Hao
2016-01-01
Typically, a diffuse reflectance spectroscopy (DRS) system employing a continuous wave light source would need to acquire diffuse reflectances measured at multiple source-detector separations for determining the absorption and reduced scattering coefficients of turbid samples. This results in a multi-fiber probe structure and an indefinite probing depth. Here we present a novel DRS method that can utilize a few diffuse reflectances measured at one source-detector separation for recovering the optical properties of samples. The core of the innovation is a liquid crystal (LC) cell whose scattering property can be modulated by the bias voltage. By placing the LC cell between the light source and the sample, the spatial distribution of light in the sample can be varied as the scattering property of the LC cell is modulated by the bias voltage, and this induces intensity variation of the collected diffuse reflectance. From a series of Monte Carlo simulations and phantom measurements, we found that this new light distribution modulated DRS (LDM DRS) system was capable of accurately recovering the absorption and scattering coefficients of turbid samples, and that its probing depth varied by less than 3% over the full bias voltage variation range. Our results suggest that this LDM DRS platform could be developed into various low-cost, efficient, and compact systems for in-vivo superficial tissue investigation. PMID:27375931
Hierarchical modeling of cluster size in wildlife surveys
Royle, J. Andrew
2008-01-01
Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
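The size-bias mechanism is worth a five-line illustration: when detection probability increases with cluster size, the sampled mean cluster size exceeds the population mean (numbers invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
sizes = rng.poisson(3, 100_000) + 1        # cluster sizes in the population
p_detect = 1 - (1 - 0.15) ** sizes         # detection improves with size
seen = sizes[rng.random(sizes.size) < p_detect]

print(sizes.mean())   # true mean cluster size
print(seen.mean())    # larger: the sample is size-biased
```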
Pendleton, G.W.; Ralph, C. John; Sauer, John R.; Droege, Sam
1995-01-01
Many factors affect the use of point counts for monitoring bird populations, including sampling strategies, variation in detection rates, and independence of sample points. The most commonly used sampling plans are stratified sampling, cluster sampling, and systematic sampling. Each of these might be most useful for different objectives or field situations. Variation in detection probabilities and lack of independence among sample points can bias estimates and measures of precision. All of these factors should be considered when using point count methods.
Kristensen, Gunn B B; Rustad, Pål; Berg, Jens P; Aakre, Kristin M
2016-09-01
We undertook this study to evaluate method differences for 5 components analyzed by immunoassays, to explore whether the use of method-dependent reference intervals may compensate for method differences, and to investigate commutability of external quality assessment (EQA) materials. Twenty fresh native single serum samples, a fresh native serum pool, Nordic Federation of Clinical Chemistry Reference Serum X (serum X) (serum pool), and 2 EQA materials were sent to 38 laboratories for measurement of cobalamin, folate, ferritin, free T4, and thyroid-stimulating hormone (TSH) by 5 different measurement procedures [Roche Cobas (n = 15), Roche Modular (n = 4), Abbott Architect (n = 8), Beckman Coulter Unicel (n = 2), and Siemens ADVIA Centaur (n = 9)]. The target value for each component was calculated based on the mean of method means or measured by a reference measurement procedure (free T4). Quality specifications were based on biological variation. Local reference intervals were reported from all laboratories. Method differences that exceeded acceptable bias were found for all components except folate. Free T4 differences from the uncommonly used reference measurement procedure were large. Reference intervals differed between measurement procedures but also within 1 measurement procedure. The serum X material was commutable for all components and measurement procedures, whereas the EQA materials were noncommutable in 13 of 50 occasions (5 components, 5 methods, 2 EQA materials). The bias between the measurement procedures was unacceptably large in 4/5 tested components. Traceability to reference materials as claimed by the manufacturers did not lead to acceptable harmonization. Adjustment of reference intervals in accordance with method differences and use of commutable EQA samples are not implemented commonly. © 2016 American Association for Clinical Chemistry.
Validation of continuous particle monitors for personal, indoor, and outdoor exposures.
Wallace, Lance A; Wheeler, Amanda J; Kearney, Jill; Van Ryswyk, Keith; You, Hongyu; Kulka, Ryan H; Rasmussen, Pat E; Brook, Jeff R; Xu, Xiaohong
2011-01-01
Continuous monitors can be used to supplement traditional filter-based methods of determining personal exposure to air pollutants. They have the advantages of being able to identify nearby sources and detect temporal changes on a time scale of a few minutes. The Windsor Ontario Exposure Assessment Study (WOEAS) adopted an approach of using multiple continuous monitors to measure indoor, outdoor (near-residential) and personal exposures to PM₂.₅, ultrafine particles and black carbon. About 48 adults and households were sampled for five consecutive 24-h periods in summer and winter 2005, and another 48 asthmatic children for five consecutive 24-h periods in summer and winter 2006. This article addresses the laboratory and field validation of these continuous monitors. A companion article (Wheeler et al., 2010) provides similar analyses for the 24-h integrated methods, as well as providing an overview of the objectives and study design. The four continuous monitors were the DustTrak (Model 8520, TSI, St. Paul, MN, USA) and personal DataRAM (pDR) (ThermoScientific, Waltham, MA, USA) for PM₂.₅; the P-Trak (Model 8525, TSI) for ultrafine particles; and the Aethalometer (AE-42, Magee Scientific, Berkeley, CA, USA) for black carbon (BC). All monitors were tested in multiple co-location studies involving as many as 16 monitors of a given type to determine their limits of detection as well as bias and precision. The effect of concentration and electronic drift on bias and precision were determined from both the collocated studies and the full field study. The effect of rapid changes in environmental conditions on switching an instrument from indoor to outdoor sampling was also studied. The use of multiple instruments for outdoor sampling was valuable in identifying occasional poor performance by one instrument and in better determining local contributions to the spatial variation of particulate pollution. Both the DustTrak and pDR were shown to be in reasonable agreement (R² of 90 and 70%, respectively) with the gravimetric PM₂.₅ method. Both instruments had limits of detection of about 5 μg/m³. The DustTrak and pDR had multiplicative biases of about 2.5 and 1.6, respectively, compared with the gravimetric samplers. However, their average bias-corrected precisions were <10%, indicating that a proper correction for bias would bring them into very good agreement with standard methods. Although no standard methods exist to establish the bias of the Aethalometer and P-Trak, the precision was within 20% for the Aethalometer and within 10% for the P-Trak. These findings suggest that all four instruments can supply useful information in environmental studies.
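A minimal sketch (Python) of the multiplicative bias correction implied by these results; the collocated data are simulated, and the 2.5 factor simply echoes the DustTrak figure reported above:

    import numpy as np

    def multiplicative_bias(monitor, gravimetric):
        """Slope of a zero-intercept regression of monitor on gravimetric values."""
        return np.sum(monitor * gravimetric) / np.sum(gravimetric ** 2)

    rng = np.random.default_rng(0)
    truth = rng.uniform(5.0, 50.0, 100)                 # gravimetric PM2.5, ug/m3
    monitor = 2.5 * truth + rng.normal(0.0, 2.0, 100)   # reads ~2.5x high

    b = multiplicative_bias(monitor, truth)
    corrected = monitor / b                             # bias-corrected readings
    print(f"estimated bias {b:.2f}; residual mean error "
          f"{np.mean(corrected - truth):+.2f} ug/m3")

Dividing by the estimated slope is the "proper correction for bias" the abstract alludes to; the small residual scatter corresponds to the <10% bias-corrected precision.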
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morawski, Ireneusz; Institute of Experimental Physics, University of Wrocław, pl. M. Borna 9, 50-204 Wrocław; Spiegelberg, Richard
A method which allows scanning tunneling microscopy (STM) tip biasing independent of the sample bias during frequency modulated atomic force microscopy (AFM) operation is presented. The AFM sensor is supplied by an electronic circuit combining both a frequency shift signal and a tunneling current signal by means of an inductive coupling. This solution enables control of the tip potential independent of the sample potential. Individual tip biasing is specifically important for implementing multi-tip STM/AFM applications. An extensional quartz sensor (needle sensor) with a conductive tip is applied to record topography and conductivity of the sample simultaneously. The high resonance frequency of the needle sensor (1 MHz) allows scanning of a large area of the surface being investigated in a reasonably short time. A recipe for amplitude calibration that is based only on the frequency shift signal and does not require tip-sample contact is presented. Additionally, we show spectral measurements of the mechanical vibration noise of the scanning system used in the investigations.
2016-01-01
We report a theoretical description and numerical tests of the extended-system adaptive biasing force method (eABF), together with an unbiased estimator of the free energy surface from eABF dynamics. Whereas the original ABF approach uses its running estimate of the free energy gradient as the adaptive biasing force, eABF is built on the idea that the exact free energy gradient is not necessary for efficient exploration, and that it is still possible to recover the exact free energy separately with an appropriate estimator. eABF does not directly bias the collective coordinates of interest, but rather fictitious variables that are harmonically coupled to them; therefore it does not require second-derivative estimates, making it easily applicable to a wider range of problems than ABF. Furthermore, the extended variables present a smoother, coarse-grain-like sampling problem on a mollified free energy surface, leading to faster exploration and convergence. We also introduce CZAR, a simple, unbiased free energy estimator from eABF trajectories. eABF/CZAR converges to the physical free energy surface faster than standard ABF for a wide range of parameters. PMID:27959559
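For readers who want to experiment, a rough Python sketch of the CZAR estimator as described in the literature, dA/dz = -(1/beta) d ln rho(z)/dz + k(<lambda|z> - z), applied to eABF samples; the function name and binning scheme are our own, and this is not the authors' implementation:

    import numpy as np

    def czar_gradient(z, lam, k, beta, bins=50):
        """Estimate dA/dz from eABF samples of the collective variable z and the
        extended variable lam coupled to it by the potential (k/2) * (z - lam)**2.
        CZAR:  dA/dz = -(1/beta) * d ln rho(z)/dz + k * (<lam | z> - z)."""
        counts, edges = np.histogram(z, bins=bins)
        centers = 0.5 * (edges[:-1] + edges[1:])
        idx = np.digitize(z, edges[1:-1])          # histogram bin of each sample
        lam_mean = np.array([lam[idx == i].mean() if (idx == i).any() else np.nan
                             for i in range(bins)])
        log_rho = np.log(np.where(counts > 0, counts, np.nan))  # normalization cancels
        return centers, -np.gradient(log_rho, centers) / beta + k * (lam_mean - centers)

The free energy profile itself follows by integrating the returned gradient, for example with a cumulative trapezoid rule over the bin centers.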
Ke, Peifeng; Liu, Jiawei; Chao, Yan; Wu, Xiaobin; Xiong, Yujuan; Lin, Li; Wan, Zemin; Wu, Xinzhong; Xu, Jianhua; Zhuang, Junhua; Huang, Xianzhang
2017-10-01
Thalassaemia could interfere with some assays for haemoglobin A1c (HbA1c) measurement; therefore, it is useful to be able to screen for thalassaemia while measuring HbA1c. We used the Capillarys 2 Flex Piercing (Capillarys 2FP) HbA1c programme to simultaneously measure HbA1c and screen for thalassaemia. Samples from 498 normal controls and 175 thalassaemia patients were analysed by the Capillarys 2FP HbA1c programme (Sebia, France). For method comparison, HbA1c was quantified by Premier Hb9210 (Trinity Biotech, Ireland) in 98 thalassaemia patient samples. For verification, HbA1c from eight thalassaemia patients was confirmed by the IFCC reference method. Among the 98 thalassaemia samples, Capillarys 2FP did not provide an HbA1c result in three samples with HbH due to the overlapping of Hb Bart's with the HbA1c fraction; for the remaining 95 thalassaemia samples, a Bland-Altman plot showed 0.00 ± 0.35% absolute bias between the two systems, and a significant positive bias above 7% was observed only in two HbH samples. The HbA1c values obtained by Capillarys 2FP were consistent with the IFCC targets (relative bias below ± 6%) in all of the eight samples tested by both methods. For screening samples with alpha (α-) thalassaemia silent/trait or beta (β-) thalassaemia trait, the optimal HbA2 cut-off values were ≤ 2.2% and > 2.8%, respectively. Our results demonstrated that the Capillarys 2FP HbA1c system could report an accurate HbA1c value in thalassaemia silent/trait, and the HbA2 value (≤ 2.2% for α-thalassaemia silent/trait and > 2.8% for β-thalassaemia trait) and abnormal bands (HbH and/or Hb Bart's for HbH disease, HbF for β-thalassaemia) may provide valuable information for screening.
Linear response to long wavelength fluctuations using curvature simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baldauf, Tobias; Zaldarriaga, Matias; Seljak, Uroš
2016-09-01
We study the local response to long wavelength fluctuations in cosmological N-body simulations, focusing on the matter and halo power spectra, halo abundance and non-linear transformations of the density field. The long wavelength mode is implemented using an effective curved cosmology and a mapping of time and distances. The method provides an alternative, more direct, way to measure the isotropic halo biases. Limiting ourselves to the linear case, we find generally good agreement between the biases obtained from the curvature method and the traditional power spectrum method at the level of a few percent. We also study the response of halo counts to changes in the variance of the field and find that the slope of the relation between the responses to density and variance differs from the naïve derivation assuming a universal mass function by approximately 8-20%. This has implications for measurements of the amplitude of local non-Gaussianity using scale dependent bias. We also analyze the halo power spectrum and halo-dark matter cross-spectrum response to long wavelength fluctuations and derive second order halo bias from it, as well as the super-sample variance contribution to the galaxy power spectrum covariance matrix.
Congleton, J.L.; LaVoie, W.J.
2001-01-01
Thirteen blood chemistry indices were compared for samples collected by three commonly used methods: caudal transection, heart puncture, and caudal vessel puncture. Apparent biases in blood chemistry values for samples obtained by caudal transection were consistent with dilution with tissue fluids: alanine aminotransferase (ALT), aspartate aminotransferase (AST), lactate dehydrogenase (LDH), creatine kinase (CK), triglyceride, and K+ were increased and Na+ and Cl- were decreased relative to values for samples obtained by caudal vessel puncture. Some enzyme activities (ALT, AST, LDH) and K+ concentrations were also greater in samples taken by heart puncture than in samples taken by caudal vessel puncture. Of the methods tested, caudal vessel puncture had the least effect on blood chemistry values and should be preferred for blood chemistry studies on juvenile salmonids.
Handling nonresponse in surveys: analytic corrections compared with converting nonresponders.
Jenkins, Paul; Earle-Richardson, Giulia; Burdick, Patrick; May, John
2008-02-01
A large health survey was combined with a simulation study to contrast the reduction in bias achieved by double sampling versus two weighting methods based on propensity scores. The survey used a census of one New York county and double sampling in six others. Propensity scores were modeled as a logistic function of demographic variables and were used in conjunction with a random uniform variate to simulate response in the census. These data were used to estimate the prevalence of chronic disease in a population whose parameters were defined as values from the census. Significant (p < 0.0001) predictors in the logistic function included multiple (vs. single) occupancy (odds ratio (OR) = 1.3), bank card ownership (OR = 2.1), gender (OR = 1.5), home ownership (OR = 1.3), head of household's age (OR = 1.4), and income >$18,000 (OR = 0.8). The model likelihood ratio chi-square was significant (p < 0.0001), with the area under the receiver operating characteristic curve = 0.59. Double-sampling estimates were marginally closer to population values than those from either weighting method. However, the variance was also greater (p < 0.01). The reduction in bias for point estimation from double sampling may be more than offset by the increased variance associated with this method.
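A compact sketch of the propensity-score weighting idea being compared (Python with scikit-learn assumed; all data are simulated, and the single covariate, response rates, and prevalence values are invented and far simpler than the survey's logistic model):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    n = 20_000
    x = rng.random(n) < 0.5                             # one binary covariate
    disease = rng.random(n) < np.where(x, 0.10, 0.20)   # true prevalence 0.15
    responded = rng.random(n) < np.where(x, 0.7, 0.4)   # response depends on x

    # Model response propensity from the covariate, then weight responders by 1/p.
    ps_model = LogisticRegression().fit(x[:, None], responded)
    ps = ps_model.predict_proba(x[responded][:, None])[:, 1]

    naive = disease[responded].mean()                   # biased: low-disease group over-responds
    ipw = np.average(disease[responded], weights=1.0 / ps)
    print(f"true {disease.mean():.3f}  naive {naive:.3f}  weighted {ipw:.3f}")

Weighting removes the bias only to the extent that response is explained by the modeled covariates, which is consistent with the modest area under the curve (0.59) reported above.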
NASA Astrophysics Data System (ADS)
Dillner, A. M.; Takahama, S.
2014-11-01
Organic carbon (OC) can constitute 50% or more of the mass of atmospheric particulate matter. Typically, the organic carbon concentration is measured using thermal methods such as Thermal-Optical Reflectance (TOR) from quartz fiber filters. Here, methods are presented whereby Fourier Transform Infrared (FT-IR) absorbance spectra from polytetrafluoroethylene (PTFE or Teflon) filters are used to accurately predict TOR OC. Transmittance FT-IR analysis is rapid, inexpensive, and non-destructive to the PTFE filters. To develop and test the method, FT-IR absorbance spectra are obtained from 794 samples from seven Interagency Monitoring of PROtected Visual Environments (IMPROVE) sites sampled during 2011. Partial least squares regression is used to calibrate sample FT-IR absorbance spectra to artifact-corrected TOR OC. The FT-IR spectra are divided into calibration and test sets by sampling site and date, which leads to precise and accurate OC predictions by FT-IR as indicated by a high coefficient of determination (R2 = 0.96), low bias (0.02 μg m-3; all μg m-3 values are based on the nominal IMPROVE sample volume of 32.8 m3), low error (0.08 μg m-3) and low normalized error (11%). These performance metrics can be achieved with various degrees of spectral pretreatment (e.g., including or excluding substrate contributions to the absorbances) and are comparable in precision and accuracy to collocated TOR measurements. FT-IR spectra are also divided into calibration and test sets by OC mass and by OM / OC, which reflects the organic composition of the particulate matter and is obtained from organic functional group composition; this division also leads to precise and accurate OC predictions. Low OC concentrations have higher bias and normalized error due to TOR analytical errors and artifact correction errors, not due to the range of OC mass of the samples in the calibration set. However, samples with low OC mass can be used to predict samples with high OC mass, indicating that the calibration is linear. Using samples in the calibration set that have different OM / OC or ammonium / OC distributions than the test set leads to only a modest increase in bias and normalized error in the predicted samples. We conclude that FT-IR analysis with partial least squares regression is a robust method for accurately predicting TOR OC in IMPROVE network samples, providing complementary information to the organic functional group composition and organic aerosol mass estimated previously from the same set of sample spectra (Ruthenburg et al., 2014).
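A schematic of the calibration approach (partial least squares regression of spectra on reference OC), using scikit-learn and synthetic spectra rather than real FT-IR/TOR data; the sample count echoes the study, everything else is invented:

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)
    n_samples, n_points = 794, 600          # sample count echoes the study
    oc = rng.gamma(2.0, 0.5, n_samples)     # stand-in "TOR OC", ug/m3 scale

    # Synthetic absorbance spectra: an OC-proportional band plus noise.
    band = np.exp(-0.5 * ((np.arange(n_points) - 300) / 40.0) ** 2)
    X = oc[:, None] * band + rng.normal(0, 0.02, (n_samples, n_points))

    X_tr, X_te, y_tr, y_te = train_test_split(X, oc, random_state=0)
    pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
    pred = pls.predict(X_te).ravel()
    print("R2   =", round(pls.score(X_te, y_te), 3))
    print("bias =", round(float(np.mean(pred - y_te)), 4), "ug/m3")

Holding out whole sites or dates, as the study does, is the stricter test of transferability than the random split used in this sketch.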
Hall, William J; Chapman, Mimi V; Lee, Kent M; Merino, Yesenia M; Thomas, Tainayah W; Payne, B Keith; Eng, Eugenia; Day, Steven H; Coyne-Beasley, Tamera
2015-12-01
In the United States, people of color face disparities in access to health care, the quality of care received, and health outcomes. The attitudes and behaviors of health care providers have been identified as one of many factors that contribute to health disparities. Implicit attitudes are thoughts and feelings that often exist outside of conscious awareness, and thus are difficult to consciously acknowledge and control. These attitudes are often automatically activated and can influence human behavior without conscious volition. We investigated the extent to which implicit racial/ethnic bias exists among health care professionals and examined the relationships between health care professionals' implicit attitudes about racial/ethnic groups and health care outcomes. To identify relevant studies, we searched 10 computerized bibliographic databases and used a reference harvesting technique. We assessed eligibility using double independent screening based on a priori inclusion criteria. We included studies if they sampled existing health care providers or those in training to become health care providers, measured and reported results on implicit racial/ethnic bias, and were written in English. We included a total of 15 studies for review and then subjected them to double independent data extraction. Information extracted included the citation, purpose of the study, use of theory, study design, study site and location, sampling strategy, response rate, sample size and characteristics, measurement of relevant variables, analyses performed, and results and findings. We summarized study design characteristics, and categorized and then synthesized substantive findings. Almost all studies used cross-sectional designs, convenience sampling, US participants, and the Implicit Association Test to assess implicit bias. Low to moderate levels of implicit racial/ethnic bias were found among health care professionals in all but 1 study. These implicit bias scores are similar to those in the general population. Levels of implicit bias against Black, Hispanic/Latino/Latina, and dark-skinned people were relatively similar across these groups. Although some associations between implicit bias and health care outcomes were nonsignificant, results also showed that implicit bias was significantly related to patient-provider interactions, treatment decisions, treatment adherence, and patient health outcomes. Implicit attitudes were more often significantly related to patient-provider interactions and health outcomes than treatment processes. Most health care providers appear to have implicit bias in terms of positive attitudes toward Whites and negative attitudes toward people of color. Future studies need to employ more rigorous methods to examine the relationships between implicit bias and health care outcomes. Interventions targeting implicit attitudes among health care professionals are needed because implicit bias may contribute to health disparities for people of color.
A field test for differences in condition among trapped and shot mallards
Reinecke, K.J.; Shaiffer, C.W.
1988-01-01
We tested predictions from the condition bias hypothesis (Weatherhead and Greenwood 1981) regarding the effects of sampling methods on body weights of mallards (Anas platyrhynchos) at White River National Wildlife Refuge (WRNWR), Arkansas, during 24 November-8 December 1985. Body weights of 84 mallards caught with unbaited rocket nets in a natural wetland were used as experimental controls and compared to the body weights of 70 mallards captured with baited rocket nets, 86 mallards captured with baited swim-in traps, and 130 mallards killed by hunters. We found no differences (P > 0.27) in body weight among sampling methods, but body condition (wt/wing length) of the birds killed by hunters was less (P < 0.05) than that of trapped birds; statistical power exceeded 0.75 for differences > 50 g. The condition bias hypothesis probably applies to ducks killed by hunters but not to trapping operations when substantial (> 20 at 1 time) numbers of birds are captured.
Electric Field-aided Selective Activation for Indium-Gallium-Zinc-Oxide Thin Film Transistors.
Lee, Heesoo; Chang, Ki Soo; Tak, Young Jun; Jung, Tae Soo; Park, Jeong Woo; Kim, Won-Gi; Chung, Jusung; Jeong, Chan Bae; Kim, Hyun Jae
2016-10-11
A new technique is proposed for the activation of low temperature amorphous InGaZnO thin film transistor (a-IGZO TFT) backplanes through the simultaneous application of a bias voltage and annealing at 130 °C. In this 'electrical activation', the effects of annealing under bias are selectively focused in the channel region. Electrical activation can therefore be an effective method for lowering backplane processing temperatures from 280 °C to 130 °C. Devices fabricated with this method exhibit electrical properties equivalent to those of conventionally fabricated samples. These results are analyzed electrically and thermodynamically using infrared microthermography. Various bias voltages are applied to the gate, source, and drain electrodes while samples are annealed at 130 °C for 1 hour. Without conventional high temperature annealing or electrical activation, current-voltage curves do not show transfer characteristics. However, electrically activated a-IGZO TFTs show superior electrical characteristics, comparable to the reference TFTs annealed at 280 °C for 1 hour. This effect is a result of the lower activation energy and the efficient transfer of electrical and thermal energy to the a-IGZO TFTs. With this approach, superior low-temperature a-IGZO TFTs are fabricated successfully.
Gini estimation under infinite variance
NASA Astrophysics Data System (ADS)
Fontanari, Andrea; Taleb, Nassim Nicholas; Cirillo, Pasquale
2018-07-01
We study the problems related to the estimation of the Gini index in the presence of a fat-tailed data generating process, i.e. one in the stable distribution class with finite mean but infinite variance (i.e. with tail index α ∈ (1, 2)). We show that, in such a case, the Gini coefficient cannot be reliably estimated using conventional nonparametric methods, because of a downward bias that emerges under fat tails. This has important implications for the ongoing discussion about economic inequality. We start by discussing how the nonparametric estimator of the Gini index undergoes a phase transition in the symmetry structure of its asymptotic distribution, as the data distribution shifts from the domain of attraction of a light-tailed distribution to that of a fat-tailed one, especially in the case of infinite variance. We also show how the nonparametric Gini bias increases with lower values of α. We then prove that maximum likelihood estimation outperforms nonparametric methods, requiring a much smaller sample size to reach efficiency. Finally, for fat-tailed data, we provide a simple correction mechanism to the small sample bias of the nonparametric estimator based on the distance between the mode and the mean of its asymptotic distribution.
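A small Python simulation of the downward bias discussed above, using the standard nonparametric Gini estimator on Pareto data with tail index alpha in (1, 2); the parameters are illustrative only:

    import numpy as np

    def gini(x):
        """Standard nonparametric Gini estimator (mean-absolute-difference form)."""
        x = np.sort(np.asarray(x, dtype=float))
        n = x.size
        i = np.arange(1, n + 1)
        return np.sum((2 * i - n - 1) * x) / (n * np.sum(x))

    rng = np.random.default_rng(42)
    alpha, n, reps = 1.5, 1000, 500        # tail index in (1, 2): infinite variance
    true_g = 1.0 / (2.0 * alpha - 1.0)     # exact Gini of a Pareto(alpha), here 0.5

    est = [gini(rng.pareto(alpha, n) + 1.0) for _ in range(reps)]
    print(f"true Gini {true_g:.3f}; mean nonparametric estimate {np.mean(est):.3f}")
    # The estimates sit systematically below the true value: the downward bias.

The intuition is that with infinite variance the rare, huge observations that drive inequality are usually absent from any finite sample, so the empirical mean absolute difference understates the true one.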
Vitis Phylogenomics: Hybridization Intensities from a SNP Array Outperform Genotype Calls
Miller, Allison J.; Matasci, Naim; Schwaninger, Heidi; Aradhya, Mallikarjuna K.; Prins, Bernard; Zhong, Gan-Yuan; Simon, Charles; Buckler, Edward S.; Myles, Sean
2013-01-01
Understanding relationships among species is a fundamental goal of evolutionary biology. Single nucleotide polymorphisms (SNPs) identified through next generation sequencing and related technologies enable phylogeny reconstruction by providing unprecedented numbers of characters for analysis. One approach to SNP-based phylogeny reconstruction is to identify SNPs in a subset of individuals, and then to compile SNPs on an array that can be used to genotype additional samples at hundreds or thousands of sites simultaneously. Although powerful and efficient, this method is subject to ascertainment bias because applying variation discovered in a representative subset to a larger sample favors identification of SNPs with high minor allele frequencies and introduces bias against rare alleles. Here, we demonstrate that the use of hybridization intensity data, rather than genotype calls, reduces the effects of ascertainment bias. Whereas traditional SNP calls assess known variants based on diversity housed in the discovery panel, hybridization intensity data survey variation in the broader sample pool, regardless of whether those variants are present in the initial SNP discovery process. We apply SNP genotype and hybridization intensity data derived from the Vitis9kSNP array developed for grape to show the effects of ascertainment bias and to reconstruct evolutionary relationships among Vitis species. We demonstrate that phylogenies constructed using hybridization intensities suffer less from the distorting effects of ascertainment bias, and are thus more accurate than phylogenies based on genotype calls. Moreover, we reconstruct the phylogeny of the genus Vitis using hybridization data, show that North American subgenus Vitis species are monophyletic, and resolve several previously poorly known relationships among North American species. This study builds on earlier work that applied the Vitis9kSNP array to evolutionary questions within Vitis vinifera and has general implications for addressing ascertainment bias in array-enabled phylogeny reconstruction. PMID:24236035
Crampin, A C; Mwinuka, V; Malema, S S; Glynn, J R; Fine, P E
2001-01-01
Selection bias, particularly of controls, is common in case-control studies and may materially affect the results. Methods of control selection should be tailored both for the risk factors and disease under investigation and for the population being studied. We present here a control selection method devised for a case-control study of tuberculosis in rural Africa (Karonga, northern Malawi) that selects an age/sex frequency-matched random sample of the population, with a geographical distribution in proportion to the population density. We also present an audit of the selection process, and discuss the potential of this method in other settings.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Telfeyan, Katherine Christina; Ware, Stuart Douglas; Reimus, Paul William
Diffusion cell and diffusion wafer experiments were conducted to compare methods for estimating matrix diffusion coefficients in rock core samples from Pahute Mesa at the Nevada Nuclear Security Site (NNSS). A diffusion wafer method, in which a solute diffuses out of a rock matrix that is pre-saturated with water containing the solute, is presented as a simpler alternative to the traditional through-diffusion (diffusion cell) method. Both methods yielded estimates of matrix diffusion coefficients that were within the range of values previously reported for NNSS volcanic rocks. The difference between the estimates of the two methods ranged from 14 to 30%, and there was no systematic high or low bias of one method relative to the other. From a transport modeling perspective, these differences are relatively minor when one considers that other variables (e.g., fracture apertures, fracture spacings) influence matrix diffusion to a greater degree and tend to have greater uncertainty than diffusion coefficients. For the same relative random errors in concentration measurements, the diffusion cell method yields diffusion coefficient estimates that have less uncertainty than the wafer method. However, the wafer method is easier and less costly to implement and yields estimates more quickly, thus allowing a greater number of samples to be analyzed for the same cost and time. Given the relatively good agreement between the methods, and the lack of any apparent bias between the methods, the diffusion wafer method appears to offer advantages over the diffusion cell method if better statistical representation of a given set of rock samples is desired.
NASA Astrophysics Data System (ADS)
Telfeyan, Katherine; Ware, S. Doug; Reimus, Paul W.; Birdsell, Kay H.
2018-02-01
Diffusion cell and diffusion wafer experiments were conducted to compare methods for estimating effective matrix diffusion coefficients in rock core samples from Pahute Mesa at the Nevada Nuclear Security Site (NNSS). A diffusion wafer method, in which a solute diffuses out of a rock matrix that is pre-saturated with water containing the solute, is presented as a simpler alternative to the traditional through-diffusion (diffusion cell) method. Both methods yielded estimates of effective matrix diffusion coefficients that were within the range of values previously reported for NNSS volcanic rocks. The difference between the estimates of the two methods ranged from 14 to 30%, and there was no systematic high or low bias of one method relative to the other. From a transport modeling perspective, these differences are relatively minor when one considers that other variables (e.g., fracture apertures, fracture spacings) influence matrix diffusion to a greater degree and tend to have greater uncertainty than effective matrix diffusion coefficients. For the same relative random errors in concentration measurements, the diffusion cell method yields effective matrix diffusion coefficient estimates that have less uncertainty than the wafer method. However, the wafer method is easier and less costly to implement and yields estimates more quickly, thus allowing a greater number of samples to be analyzed for the same cost and time. Given the relatively good agreement between the methods, and the lack of any apparent bias between the methods, the diffusion wafer method appears to offer advantages over the diffusion cell method if better statistical representation of a given set of rock samples is desired.
A survey method for characterizing daily life experience: the day reconstruction method.
Kahneman, Daniel; Krueger, Alan B; Schkade, David A; Schwarz, Norbert; Stone, Arthur A
2004-12-03
The Day Reconstruction Method (DRM) assesses how people spend their time and how they experience the various activities and settings of their lives, combining features of time-budget measurement and experience sampling. Participants systematically reconstruct their activities and experiences of the preceding day with procedures designed to reduce recall biases. The DRM's utility is shown by documenting close correspondences between the DRM reports of 909 employed women and established results from experience sampling. An analysis of the hedonic treadmill shows the DRM's potential for well-being research.
Breen, Kevin J.
2000-01-01
Assessments to determine whether agricultural pesticides are present in ground water are performed by the Commonwealth of Pennsylvania under the aquifer monitoring provisions of the State Pesticides and Ground Water Strategy. Pennsylvania's Department of Agriculture conducts the monitoring and collects samples; the Department of Environmental Protection (PaDEP) Laboratory analyzes the samples to measure pesticide concentration. To evaluate the quality of the measurements of pesticide concentration for a groundwater assessment, a quality-assurance design was developed and applied to a selected assessment area in Pennsylvania. This report describes the quality-assurance design, describes how and where the design was applied, describes procedures used to collect and analyze samples and to evaluate the results, and summarizes the quality-assurance results along with the assessment results. The design was applied in an agricultural area of the Delaware River Basin in Berks, Lebanon, Lehigh, and Northampton Counties to evaluate the bias and variability in laboratory results for pesticides. The design, with random spatial and temporal components, included four data-quality objectives for bias and variability. The spatial design was primary and represented an area comprising 30 sampling cells. A quality-assurance sampling frequency of 20 percent of cells was selected to ensure a sample number of five or more for analysis. Quality-control samples included blanks, spikes, and replicates of laboratory water and spikes, replicates, and 2-lab splits of groundwater. Two analytical laboratories, the PaDEP Laboratory and a U.S. Geological Survey Laboratory, were part of the design. Bias and variability were evaluated by use of data collected from October 1997 through January 1998 for alachlor, atrazine, cyanazine, metolachlor, simazine, pendimethalin, metribuzin, and chlorpyrifos. Results of analyses of field blanks indicate that collection, processing, transport, and laboratory analysis procedures did not contaminate the samples; there were no false-positive results. Pesticides were detected in water when pesticides were spiked into (added to) samples. There were no false negatives for the eight pesticides in all spiked samples. Negative bias was characteristic of analytical results for the eight pesticides, and bias was generally in excess of 10 percent from the 'true' or expected concentration (34 of 39 analyses, or 87 percent of the ground-water results) for pesticide concentrations ranging from 0.31 to 0.51 µg/L (micrograms per liter). The magnitude of the negative bias for the eight pesticides, with the exception of cyanazine, would result in reported concentrations commonly 75-80 percent of the expected concentration in the water sample. The bias for cyanazine was negative and within 10 percent of the expected concentration. A comparison of spiked pesticide-concentration recoveries in laboratory water and ground water indicated no effect of the ground-water matrix, and matrix interference was not a source of the negative bias. Results for the laboratory-water spikes submitted in triplicate showed large variability for recoveries of atrazine, cyanazine, and pendimethalin. The relative standard deviation (RSD) was used as a measure of method variability over the course of the study for laboratory waters at a concentration of 0.4 µg/L. An RSD of about 11 percent (or about ±0.05 µg/L) characterizes the method results for alachlor, chlorpyrifos, metolachlor, metribuzin, and simazine.
Atrazine and pendimethalin have RSD values of about 17 and 23 percent, respectively. Cyanazine showed the largest RSD at nearly 51 percent. The pesticides with low variability in laboratory-water spikes also had low variability in ground water. The assessment results showed that atrazine was the most commonly detected pesticide in ground water in the assessment area. Atrazine was detected in water from 22 of the 28 wells sampled, and recovery results for atrazine were some of the worst (largest negative bias). Concentrations of the eight pesticides in ground water from wells were generally less than 0.3 µg/L. Only six individual measurements of the concentrations in water from six of the wells were at or above 0.3 µg/L, five for atrazine and one for metolachlor. There were eight additional detections of metolachlor and simazine at concentrations less than 0.1 µg/L. No well water contained more than one pesticide at concentrations at or above 0.3 µg/L. Evidence exists, however, for a pattern of co-occurrence of metolachlor and simazine at low concentrations with higher concentrations of atrazine. Large variability in replicate samples and negative bias for pesticide recovery from spiked samples indicate the need to use data for pesticide recovery in the interpretation of measured pesticide concentrations in ground water. Data from samples spiked with known amounts of pesticides were a critical component of a quality-assurance design for the monitoring component of the Pesticides and Ground Water Strategy. Trigger concentrations, the concentrations that require action under the Pesticides and Ground Water Strategy, should be considered maximums for action. This consideration is needed because of the magnitude of negative bias.
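For concreteness, the RSD statistic used throughout this abstract is simply the standard deviation of replicate recoveries divided by their mean; a tiny Python sketch with invented triplicate values:

    import numpy as np

    def rsd_percent(replicates):
        """Relative standard deviation of replicate recoveries, in percent."""
        r = np.asarray(replicates, dtype=float)
        return 100.0 * r.std(ddof=1) / r.mean()

    # Invented triplicate recoveries (ug/L) of a 0.4 ug/L laboratory-water spike.
    print(round(rsd_percent([0.31, 0.35, 0.29]), 1), "% RSD")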
40 CFR 53.35 - Test procedure for Class II and Class III methods for PM2.5 and PM10-2.5.
Code of Federal Regulations, 2014 CFR
2014-07-01
... section. All reference method samplers shall be of single-filter design (not multi-filter, sequential sample design). Each candidate method shall be set up and operated in accordance with its associated... precision specified in table C-4 of this subpart. (g) Test for additive and multiplicative bias (comparative...
40 CFR 53.35 - Test procedure for Class II and Class III methods for PM2.5 and PM10-2.5.
Code of Federal Regulations, 2013 CFR
2013-07-01
... section. All reference method samplers shall be of single-filter design (not multi-filter, sequential sample design). Each candidate method shall be set up and operated in accordance with its associated... precision specified in table C-4 of this subpart. (g) Test for additive and multiplicative bias (comparative...
Pecky rot in incense-cedar: evaluation of five scaling methods.
James M. Cahill; W.Y. Pong; D.L. Weyermann
1987-01-01
A sample of 58 logs was used to evaluate five methods of making scale deductions for pecky rot in incense-cedar (Libocedrus decurrens Torr.) logs. Bias and accuracy were computed for three Scribner and two cubic scaling methods. The lumber yield of sound incense-cedar logs, as measured in a product recovery study, was used as the basis for...
Krishna P. Poudel; Temesgen. Hailemariam
2015-01-01
Performance of three groups of methods to estimate total and/or component aboveground biomass was evaluated using data collected from destructively sampled trees in different parts of Oregon. The first group of methods used an analytical approach to estimate total and component biomass from existing equations, and produced biased estimates for our dataset. The second...
Equalizer reduces SNP bias in Affymetrix microarrays.
Quigley, David
2015-07-30
Gene expression microarrays measure the levels of messenger ribonucleic acid (mRNA) in a sample using probe sequences that hybridize with transcribed regions. These probe sequences are designed using a reference genome for the relevant species. However, most model organisms and all humans have genomes that deviate from their reference. These variations, which include single nucleotide polymorphisms, insertions of additional nucleotides, and nucleotide deletions, can affect the microarray's performance. Genetic experiments comparing individuals bearing different population-associated single nucleotide polymorphisms that intersect microarray probes are therefore subject to systemic bias, as the reduction in binding efficiency due to a technical artifact is confounded with genetic differences between parental strains. This problem has been recognized for some time, and earlier methods of compensation have attempted to identify probes affected by genome variants using statistical models. These methods may require replicate microarray measurement of gene expression in the relevant tissue in inbred parental samples, which are not always available in model organisms and are never available in humans. By using sequence information for the genomes of organisms under investigation, potentially problematic probes can now be identified a priori. However, there is no published software tool that makes it easy to eliminate these probes from an annotation. I present equalizer, a software package that uses genome variant data to modify annotation files for the commonly used Affymetrix IVT and Gene/Exon platforms. These files can be used by any microarray normalization method for subsequent analysis. I demonstrate how use of equalizer on experiments mapping germline influence on gene expression in a genetic cross between two divergent mouse species and in human samples significantly reduces probe hybridization-induced bias, reducing false positive and false negative findings. The equalizer package reduces probe hybridization bias from experiments performed on the Affymetrix microarray platform, allowing accurate assessment of germline influence on gene expression.
NASA Astrophysics Data System (ADS)
Lv, Chao; Zheng, Lianqing; Yang, Wei
2012-01-01
Molecular dynamics sampling can be enhanced via the promotion of potential energy fluctuations, for instance based on a Hamiltonian modified with the addition of a potential-energy-dependent biasing term. To overcome the diffusion sampling issue, namely that enlarging event-irrelevant energy fluctuations can abolish sampling efficiency, the essential energy space random walk (EESRW) approach was proposed earlier. To more effectively accelerate the sampling of solute conformations in aqueous environments, in the current work we generalized the EESRW method to a two-dimensional EESRW (2D-EESRW) strategy. Specifically, the essential internal energy component of a focused region and the essential interaction energy component between the focused region and the environmental region are employed to define the two-dimensional essential energy space. This proposal is motivated by the general observation that different conformational events have distinctive interplays between the two essential energy components. Model studies on the alanine dipeptide and the aspartate-arginine peptide demonstrate sampling improvement over the original one-dimensional EESRW strategy; with the same biasing level, the present generalization allows more effective acceleration of the sampling of conformational transitions in aqueous solution. The 2D-EESRW generalization is readily extended to higher-dimension schemes and can be employed in more advanced enhanced-sampling schemes, such as the recent orthogonal space random walk method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martínez-García, Eric E.; González-Lópezlira, Rosa A.; Bruzual A, Gustavo
2017-01-20
Stellar masses of galaxies are frequently obtained by fitting stellar population synthesis models to galaxy photometry or spectra. The state-of-the-art method resolves spatial structures within a galaxy to assess the total stellar mass content. In comparison to unresolved studies, resolved methods yield, on average, higher fractions of stellar mass for galaxies. In this work we improve the current method in order to mitigate a bias related to the resolved spatial distribution derived for the mass. The bias consists of an apparent filamentary mass distribution and a spatial coincidence between mass structures and dust lanes near spiral arms. The improved method is based on iterative Bayesian marginalization, through a new algorithm we have named Bayesian Successive Priors (BSP). We have applied BSP to M51 and to a pilot sample of 90 spiral galaxies from the Ohio State University Bright Spiral Galaxy Survey. By quantitatively comparing both methods, we find that the average fraction of stellar mass missed by unresolved studies is only half what was previously thought. In contrast with the previous method, the output BSP mass maps bear a better resemblance to near-infrared images.
Shen, Yue; Wang, Ying; Zhou, Yuan; Hai, Chunxi; Hu, Jun; Zhang, Yi
2018-01-01
Electrostatic force spectroscopy (EFS) is a method for monitoring the electrostatic force microscopy (EFM) phase with high resolution as a function of the DC bias applied either to the probe or to the sample. Based on the dielectric constant difference of graphene oxide (GO) sheets reduced using various methods, EFS can be used to characterize the degree of reduction of uniformly reduced one-atom-thick GO sheets at the nanoscale. In this paper, using thermally or chemically reduced individual GO sheets on mica substrates as examples, we characterize their degree of reduction at the nanoscale using EFS. For the reduced graphene oxide (rGO) sheets with a given degree of reduction (sample n), the EFS curve is very close to a parabola within a restricted area. We found that the change in parabola opening direction (or the sign of the parabola opening value) indicates the onset of reduction of GO sheets. Moreover, the parabola opening value, the peak bias value (the tip bias that leads to the peak or valley EFM phase) and the EFM phase contrast at a tip bias below the peak value can all indicate the degree of reduction of rGO samples, which is positively correlated with the dielectric constant. In addition, we ranked the degree of reduction of thermally or chemically reduced GO sheets and evaluated the effects of the reducing conditions. Identifying the degree of reduction of GO sheets using EFS is important for reduction strategy optimization and mass application of GO, which is highly desired owing to its mechanical, thermal, optical and electronic applications. Furthermore, as a general and quantitative technique for evaluating small differences in the dielectric properties of nanomaterials, EFS will extend and facilitate its applications in nanoscale electronic devices in the future.
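A minimal Python sketch of extracting the quantities named above (the parabola opening value and the peak bias) from an EFS phase-versus-bias curve; the data here are synthetic, not from the paper:

    import numpy as np

    rng = np.random.default_rng(5)
    bias = np.linspace(-3.0, 3.0, 25)                  # tip DC bias, V
    phase = -0.8 * (bias - 0.4) ** 2 + 1.2 + rng.normal(0, 0.05, bias.size)

    a, b, c = np.polyfit(bias, phase, 2)    # phase ~ a*V**2 + b*V + c
    print(f"opening value a = {a:+.3f}  (its sign is the opening direction)")
    print(f"peak bias = {-b / (2 * a):+.2f} V  (bias of the peak/valley phase)")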
NASA Astrophysics Data System (ADS)
Richards, Joseph W.; Starr, Dan L.; Brink, Henrik; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; James, J. Berian; Long, James P.; Rice, John
2012-01-01
Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.
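A bare-bones Python sketch of pool-based active learning by uncertainty (margin) sampling with scikit-learn; this generic heuristic is a stand-in, not the authors' query strategy, and all data are synthetic:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                               n_classes=3, random_state=0)
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X), size=30, replace=False))  # initial training set

    for it in range(5):
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(X[labeled], y[labeled])
        proba = np.sort(clf.predict_proba(X), axis=1)
        margin = proba[:, -1] - proba[:, -2]     # small margin = ambiguous object
        margin[labeled] = np.inf                 # never re-query labeled points
        query = int(np.argmin(margin))
        labeled.append(query)                    # "manual follow-up" supplies y[query]
        print(f"iteration {it}: queried object {query} (margin {margin[query]:.3f})")

In the astronomical setting described above, each queried object corresponds to a light curve sent for manual labeling, which is exactly where the Web interface the authors built comes in.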
Greenberg, Neil; Roberts, William L; Bachmann, Lorin M; Wright, Elizabeth C; Dalton, R Neil; Zakowski, Jack J; Miller, W Greg
2012-02-01
Standardized calibration does not change a creatinine measurement procedure's susceptibility to potentially interfering substances. We obtained individual residual serum or plasma samples (n = 365) from patients with 19 different disease categories associated with potentially interfering substances and from healthy controls. Additional sera at 0.9 mg/dL (80 μmol/L) and 3.8 mg/dL (336 μmol/L) creatinine were supplemented with acetoacetate, acetone, ascorbate, and pyruvate. We measured samples by 4 enzymatic and 3 Jaffe commercially available procedures and by a liquid chromatography/isotope dilution/mass spectrometry measurement procedure against which biases were determined. The number of instances when 3 or more results in a disease category had biases greater than the limits of acceptability was 28 of 57 (49%) for Jaffe and 14 of 76 (18%) for enzymatic procedures. For the aggregate group of 59 diabetes samples with increased β-hydroxybutyrate, glucose, or glycosylated hemoglobin (Hb A1c), the enzymatic procedures had 10 biased results of 236 (4.2%) compared with 89 of 177 (50.3%) for the Jaffe procedures, and these interferences were highly procedure dependent. For supplemented sera, interferences were observed in 11 of 24 (46%) of groups for Jaffe and 8 of 32 (25%) of groups for enzymatic procedures and were different at low or high creatinine concentrations. There were differences in both magnitude and direction of bias among measurement procedures, whether enzymatic or Jaffe. The influence of interfering substances was less frequent with the enzymatic procedures, but no procedure was unaffected. The details of implementation of a method principle influenced its susceptibility to potential interfering substances.
Segmental analysis of amphetamines in hair using a sensitive UHPLC-MS/MS method.
Jakobsson, Gerd; Kronstrand, Robert
2014-06-01
A sensitive and robust ultra high performance liquid chromatography-tandem mass spectrometry (UHPLC-MS/MS) method was developed and validated for quantification of amphetamine, methamphetamine, 3,4-methylenedioxyamphetamine and 3,4-methylenedioxymethamphetamine in hair samples. Segmented hair (10 mg) was incubated in 2 M sodium hydroxide (80°C, 10 min) before liquid-liquid extraction with isooctane followed by centrifugation and evaporation of the organic phase to dryness. The residue was reconstituted in methanol:formate buffer pH 3 (20:80). The total run time was 4 min and after optimization of UHPLC-MS/MS parameters validation included selectivity, matrix effects, recovery, process efficiency, calibration model and range, lower limit of quantification, precision and bias. The calibration curve ranged from 0.02 to 12.5 ng/mg, and the recovery was between 62 and 83%. During validation the bias was less than ±7% and the imprecision was less than 5% for all analytes. In routine analysis, fortified control samples demonstrated an imprecision <13% and control samples made from authentic hair demonstrated an imprecision <26%. The method was applied to samples from a controlled study of amphetamine intake as well as forensic hair samples previously analyzed with an ultra high performance liquid chromatography time of flight mass spectrometry (UHPLC-TOF-MS) screening method. The proposed method was suitable for quantification of these drugs in forensic cases including violent crimes, autopsy cases, drug testing and re-granting of driving licences. This study also demonstrated that if hair samples are divided into several short segments, the time point for intake of a small dose of amphetamine can be estimated, which might be useful when drug facilitated crimes are investigated. Copyright © 2014 John Wiley & Sons, Ltd.
Medalie, Laura; Martin, Jeffrey D.
2017-08-14
Potential contamination bias was estimated for 8 nutrient analytes and 40 pesticides in stream water collected by the U.S. Geological Survey at 147 stream sites from across the United States, and representing a variety of hydrologic conditions and site types, for water years 2002–12. This study updates previous U.S. Geological Survey evaluations of potential contamination bias for nutrients and pesticides. Contamination is potentially introduced to water samples by exposure to airborne gases and particulates, from inadequate cleaning of sampling or analytic equipment, and from inadvertent sources during sample collection, field processing, shipment, and laboratory analysis. Potential contamination bias, based on frequency and magnitude of detections in field blanks, is used to determine whether or under what conditions environmental data might need to be qualified for the interpretation of results in the context of comparisons with background levels, drinking-water standards, aquatic-life criteria or benchmarks, or human-health benchmarks. Environmental samples for which contamination bias as determined in this report applies are those from historical U.S. Geological Survey water-quality networks or programs that were collected during the same time frame and according to the same protocols and that were analyzed in the same laboratory as field blanks described in this report.Results from field blanks for ammonia, nitrite, nitrite plus nitrate, orthophosphate, and total phosphorus were partitioned by analytical method; results from the most commonly used analytical method for total phosphorus were further partitioned by date. Depending on the analytical method, 3.8, 9.2, or 26.9 percent of environmental samples, the last of these percentages pertaining to all results from 2007 through 2012, were potentially affected by ammonia contamination. Nitrite contamination potentially affected up to 2.6 percent of environmental samples collected between 2002 and 2006 and affected about 3.3 percent of samples collected between 2007 and 2012. The percentages of environmental samples collected between 2002 and 2011 that were potentially affected by nitrite plus nitrate contamination were 7.3 for samples analyzed with the low-level method and 0.4 for samples analyzed with the standard-level method. These percentages increased to 14.8 and 2.2 for samples collected in 2012 and analyzed using replacement low- and standard-level methods, respectively. The maximum potentially affected concentrations for nitrite and for nitrite plus nitrate were much less than their respective maximum contamination levels for drinking-water standards. Although contamination from particulate nitrogen can potentially affect up to 21.2 percent and that from total Kjeldahl nitrogen can affect up to 16.5 percent of environmental samples, there are no critical or background levels for these substances.For total nitrogen, orthophosphate, and total phosphorus, contamination in a small percentage of environmental samples might be consequential for comparisons relative to impairment risks or background levels. 
At the low ends of the respective ranges of impairment risk for these nutrients, contamination in up to 5 percent of stream samples could account for at least 23 percent of measured concentrations of total nitrogen, for at least 40 or 90 percent of concentrations of orthophosphate, depending on the analytical method, and for 31 to 76 percent of concentrations of total phosphorus, depending on the time period.Twenty-six pesticides had no detections in field blanks. Atrazine with 12 and metolachlor with 11 had the highest number of detections, mostly occurring in spring or early summer. At a 99-percent level of confidence, contamination was estimated to be no greater than the detection limit in at least 98 percent of all samples for 38 of 40 pesticides. For metolachlor and atrazine, potential contamination was no greater than 0.0053 and 0.0093 micrograms per liter in 98 percent of samples. For 11 of 14 pesticides with at least one detection, the maximum potentially affected concentration of the environmental sample was less than their respective human-health or aquatic-life benchmarks. Small percentages of environmental samples had concentrations high enough that atrazine contamination potentially could account for the entire aquatic-life benchmark for acute effects on nonvascular plants, that dieldrin contamination could account for up to 100 percent of the cancer health-based screening level, or that chlorpyrifos contamination could account for 13 or 12 percent of the concentrations in the aquatic-life benchmarks for chronic effects on invertebrates or the criterion continuous concentration for chronic effects on aquatic life.
Avoiding treatment bias of REDD+ monitoring by sampling with partial replacement.
Köhl, Michael; Scott, Charles T; Lister, Andrew J; Demon, Inez; Plugge, Daniel
2015-12-01
Implementing REDD+ renders the development of a measurement, reporting and verification (MRV) system necessary to monitor carbon stock changes. MRV systems generally apply a combination of remote sensing techniques and in-situ field assessments. In-situ assessments can be based on 1) permanent plots, which are assessed on all successive occasions, 2) temporary plots, which are assessed only once, and 3) a combination of both. The current study focuses on in-situ assessments and addresses the effect of treatment bias, which is introduced by managing permanent sampling plots differently than the surrounding forests. Temporary plots are not subject to treatment bias, but are associated with large sampling errors and low cost-efficiency. Sampling with partial replacement (SPR) utilizes both permanent and temporary plots. We apply a scenario analysis with different intensities of deforestation and forest degradation to show that SPR combines cost-efficiency with the handling of treatment bias. Without treatment bias, permanent plots generally provide lower sampling errors for change estimates than SPR and temporary plots, but they do not provide reliable estimates if treatment bias occurs. SPR allows for change estimates that are comparable to those provided by permanent plots, offers the flexibility to adjust sample sizes over time, and allows comparison of data from permanent versus temporary plots for detecting treatment bias. Equivalence of biomass or carbon stock estimates between permanent and temporary plots serves as an indication of the absence of treatment bias, while differences suggest that there is evidence for treatment bias. SPR is a flexible tool for estimating emission factors from successive measurements. It does not depend entirely on sample plots installed at the first occasion but allows for the adjustment of sample sizes and placement of new plots at any occasion. This ensures that in-situ samples provide representative estimates over time. SPR offers the possibility to increase sampling intensity in areas with high degradation intensities or to establish new plots in areas where permanent plots are lost due to deforestation. SPR is also an ideal approach to mitigate concerns about treatment bias.
Characterizing sampling and quality screening biases in infrared and microwave limb sounding
NASA Astrophysics Data System (ADS)
Millán, Luis F.; Livesey, Nathaniel J.; Santee, Michelle L.; von Clarmann, Thomas
2018-03-01
This study investigates orbital sampling biases and evaluates the additional impact caused by data quality screening for the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) and the Aura Microwave Limb Sounder (MLS). MIPAS acts as a proxy for typical infrared limb emission sounders, while MLS acts as a proxy for microwave limb sounders. These biases were calculated for temperature and several trace gases by interpolating model fields to real sampling patterns and, additionally, screening those locations as directed by their corresponding quality criteria. Both instruments have dense uniform sampling patterns typical of limb emission sounders, producing almost identical sampling biases. However, there is a substantial difference between the number of locations discarded. MIPAS, as a mid-infrared instrument, is very sensitive to clouds, and measurements affected by them are thus rejected from the analysis. For example, in the tropics, the MIPAS yield is strongly affected by clouds, while MLS is mostly unaffected. The results show that upper-tropospheric sampling biases in zonally averaged data, for both instruments, can be up to 10 to 30 %, depending on the species, and up to 3 K for temperature. For MIPAS, the sampling reduction due to quality screening worsens the biases, leading to values as large as 30 to 100 % for the trace gases and expanding the 3 K bias region for temperature. This type of sampling bias is largely induced by the geophysical origins of the screening (e.g. clouds). Further, analysis of long-term time series reveals that these additional quality screening biases may affect the ability to accurately detect upper-tropospheric long-term changes using such data. In contrast, MLS data quality screening removes sufficiently few points that no additional bias is introduced, although its penetration is limited to the upper troposphere, while MIPAS may cover well into the mid-troposphere in cloud-free scenarios. We emphasize that the results of this study refer only to the representativeness of the respective data, not to their intrinsic quality.
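A toy Python illustration of the sampling-plus-screening calculation described above: a synthetic zonal temperature field is subsampled (orbital sampling) and then screened in a way that preferentially rejects tropical scenes, as clouds do for an infrared sounder; all numbers are invented:

    import numpy as np

    rng = np.random.default_rng(11)
    n = 200_000
    lat = rng.uniform(-90.0, 90.0, n)               # stand-in model field locations
    field = 220.0 + 60.0 * np.cos(np.radians(lat))  # synthetic temperature field, K

    seen = rng.random(n) < 0.02                     # orbital sampling pattern
    # Quality screening that preferentially rejects (warm) tropical scenes,
    # analogous to cloud screening for an infrared limb sounder.
    kept = seen & ~((np.abs(lat) < 20.0) & (rng.random(n) < 0.7))

    print(f"full-field mean         {field.mean():7.2f} K")
    print(f"sampled mean            {field[seen].mean():7.2f} K")  # sampling bias
    print(f"sampled + screened mean {field[kept].mean():7.2f} K")  # screening bias

Because the screening is geophysically correlated with the field (clouds occur where it is warm and moist), the screened average drifts away from the truth even though the unscreened sampling was nearly unbiased, which is the mechanism the abstract identifies.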
Stochastic sampling of quadrature grids for the evaluation of vibrational expectation values
NASA Astrophysics Data System (ADS)
López Ríos, Pablo; Monserrat, Bartomeu; Needs, Richard J.
2018-02-01
The thermal lines method for the evaluation of vibrational expectation values of electronic observables [B. Monserrat, Phys. Rev. B 93, 014302 (2016), 10.1103/PhysRevB.93.014302] was recently proposed as a physically motivated approximation offering a balance between the accuracy of direct Monte Carlo integration and the low computational cost of using local quadratic approximations. In this paper we reformulate thermal lines as a stochastic implementation of quadrature-grid integration, analyze the analytical form of its bias, and extend the method to multiple-point quadrature grids applicable to any factorizable harmonic or anharmonic nuclear wave function. The bias incurred by thermal lines is found to depend on the local form of the expectation value, and we demonstrate that the use of finer quadrature grids along selected modes can eliminate this bias, while still offering an approximately 30% lower computational cost than direct Monte Carlo integration in our tests.
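As a one-dimensional analogue of quadrature-grid versus Monte Carlo integration (not the thermal lines method itself), the sketch below evaluates an expectation value over a harmonic ground-state (Gaussian) density both ways; the observable and all parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 1.0                                    # width of the harmonic ground state
observable = lambda x: x**2 + 0.1 * x**4       # illustrative "electronic" observable

# Direct Monte Carlo integration: sample the density and average.
x_mc = rng.normal(0.0, sigma, 100_000)
mc_estimate = observable(x_mc).mean()

# Quadrature grid: Gauss-Hermite nodes/weights integrate exactly against
# exp(-t^2); substituting x = sqrt(2)*sigma*t gives the normal-density average.
t, w = np.polynomial.hermite.hermgauss(7)
quad_estimate = np.sum(w * observable(np.sqrt(2) * sigma * t)) / np.sqrt(np.pi)

# Exact value for this observable: sigma^2 + 0.1 * 3 * sigma^4 = 1.3
print(f"Monte Carlo: {mc_estimate:.4f}   7-point quadrature: {quad_estimate:.4f}")
```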
Evaluation on the use of cerium in the NBL Titrimetric Method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zebrowski, J.P.; Orlowicz, G.J.; Johnson, K.D.
An alternative to potassium dichromate as titrant in the New Brunswick Laboratory Titrimetric Method for uranium analysis was sought since chromium in the waste makes disposal difficult. Substitution of a ceric-based titrant was statistically evaluated. Analysis of the data indicated statistically equivalent precisions for the two methods, but a significant overall bias of +0.035% for the ceric titrant procedure. The cause of the bias was investigated, alterations to the procedure were made, and a second statistical study was performed. This second study revealed no statistically significant bias, nor any analyst-to-analyst variation in the ceric titration procedure. A statistically significant day-to-day variation was detected, but this was physically small (0.015%) and was only detected because of the within-day precision of the method. The standard deviation of the %RD for a single measurement was found to be 0.031%. A comparison with quality control blind dichromate titration data again indicated similar overall precision. The effects of ten elements (Co, Ti, Cu, Ni, Na, Mg, Gd, Zn, Cd, and Cr) on the ceric titration's performance were determined; in previous work at NBL these impurities did not interfere with the potassium dichromate titrant. This study indicated similar results for the ceric titrant, with the exception of Ti. All the elements (excluding Ti and Cr) caused no statistically significant bias in uranium measurements at levels of 10 mg impurity per 20-40 mg uranium. The presence of Ti was found to cause a bias of −0.05%; this is attributed to the presence of sulfate ions, resulting in precipitation of titanium sulfate and occlusion of uranium. A negative bias of 0.012% was also statistically observed in the samples containing chromium impurities.
Rattray, Gordon W.
2014-01-01
Quality-control (QC) samples were collected from 2002 through 2008 by the U.S. Geological Survey, in cooperation with the U.S. Department of Energy, to ensure data robustness by documenting the variability and bias of water-quality data collected at surface-water and groundwater sites at and near the Idaho National Laboratory. QC samples consisted of 139 replicates and 22 blanks (approximately 11 percent of the number of environmental samples collected). Measurements from replicates were used to estimate variability (from field and laboratory procedures and sample heterogeneity), as reproducibility and reliability, of water-quality measurements of radiochemical, inorganic, and organic constituents. Measurements from blanks were used to estimate the potential contamination bias of selected radiochemical and inorganic constituents in water-quality samples, with an emphasis on identifying any cross contamination of samples collected with portable sampling equipment. The reproducibility of water-quality measurements was estimated with calculations of normalized absolute difference for radiochemical constituents and relative standard deviation (RSD) for inorganic and organic constituents. The reliability of water-quality measurements was estimated with pooled RSDs for all constituents. Reproducibility was acceptable for all constituents except dissolved aluminum and total organic carbon. Pooled RSDs were equal to or less than 14 percent for all constituents except for total organic carbon, which had pooled RSDs of 70 percent for the low concentration range and 4.4 percent for the high concentration range. Source-solution and equipment blanks were measured for concentrations of tritium, strontium-90, cesium-137, sodium, chloride, sulfate, and dissolved chromium. Field blanks were measured for the concentration of iodide. No detectable concentrations were measured from the blanks except for strontium-90 in one source solution and one equipment blank collected in September and October 2004, respectively. The detectable concentrations of strontium-90 in the blanks probably were from a small source of strontium-90 contamination or large measurement variability, or both. Order statistics and the binomial probability distribution were used to estimate the magnitude and extent of any potential contamination bias of tritium, strontium-90, cesium-137, sodium, chloride, sulfate, dissolved chromium, and iodide in water-quality samples. These statistical methods indicated that, with (1) 87 percent confidence, contamination bias of cesium-137 and sodium in 60 percent of water-quality samples was less than the minimum detectable concentration or reporting level; (2) 92‒94 percent confidence, contamination bias of tritium, strontium-90, chloride, sulfate, and dissolved chromium in 70 percent of water-quality samples was less than the minimum detectable concentration or reporting level; and (3) 75 percent confidence, contamination bias of iodide in 50 percent of water-quality samples was less than the reporting level for iodide. These results support the conclusion that contamination bias of water-quality samples from sample processing, storage, shipping, and analysis was insignificant and that cross-contamination of perched groundwater samples collected with bailers during 2002–08 was insignificant.
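For concreteness, the relative standard deviation (RSD) and pooled RSD statistics used above can be computed as in the Python sketch below; the replicate concentrations are fabricated, and the pooling convention (root-mean-square of per-pair RSDs) is one common choice, not necessarily the report's exact formula.

```python
import numpy as np

# Hypothetical replicate pairs (concentrations in ug/L); values are made up.
replicates = np.array([
    [10.2, 10.5],
    [3.1, 3.0],
    [55.0, 54.1],
    [0.82, 0.90],
])

means = replicates.mean(axis=1)
sds = replicates.std(axis=1, ddof=1)

rsd = 100 * sds / means                                   # reproducibility, per pair
pooled_rsd = 100 * np.sqrt(np.mean((sds / means) ** 2))   # reliability, pooled
print("per-pair RSD (%):", rsd.round(2))
print("pooled RSD (%): %.2f" % pooled_rsd)
```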
Potential sources of analytical bias and error in selected trace element data-quality analyses
Paul, Angela P.; Garbarino, John R.; Olsen, Lisa D.; Rosen, Michael R.; Mebane, Christopher A.; Struzeski, Tedmund M.
2016-09-28
Potential sources of analytical bias and error associated with laboratory analyses for selected trace elements where concentrations were greater in filtered samples than in paired unfiltered samples were evaluated by U.S. Geological Survey (USGS) Water Quality Specialists in collaboration with the USGS National Water Quality Laboratory (NWQL) and the Branch of Quality Systems (BQS). Causes for trace-element concentrations in filtered samples to exceed those in associated unfiltered samples have been attributed to variability in analytical measurements, analytical bias, sample contamination either in the field or laboratory, and (or) sample-matrix chemistry. These issues have not only been attributed to data generated by the USGS NWQL but have been observed in data generated by other laboratories. This study continues the evaluation of potential analytical bias and error resulting from matrix chemistry and instrument variability by evaluating the performance of seven selected trace elements in paired filtered and unfiltered surface-water and groundwater samples collected from 23 sampling sites of varying chemistries from six States, matrix spike recoveries, and standard reference materials. Filtered and unfiltered samples have been routinely analyzed on separate inductively coupled plasma-mass spectrometry instruments. Unfiltered samples are treated with hydrochloric acid (HCl) during an in-bottle digestion procedure; filtered samples are not routinely treated with HCl as part of the laboratory analytical procedure. To evaluate the influence of HCl on different sample matrices, an aliquot of the filtered samples was treated with HCl. The addition of HCl did little to differentiate the analytical results between filtered samples treated with HCl from those samples left untreated; however, there was a small, but noticeable, decrease in the number of instances where a particular trace-element concentration was greater in a filtered sample than in the associated unfiltered sample for all trace elements except selenium. Accounting for the small dilution effect (2 percent) from the addition of HCl, as required for the in-bottle digestion procedure for unfiltered samples, may be one step toward decreasing the number of instances where trace-element concentrations are greater in filtered samples than in paired unfiltered samples. The laboratory analyses of arsenic, cadmium, lead, and zinc did not appear to be influenced by instrument biases. These trace elements showed similar results on both instruments used to analyze filtered and unfiltered samples. The results for aluminum and molybdenum tended to be higher on the instrument designated to analyze unfiltered samples; the results for selenium tended to be lower. The matrices used to prepare calibration standards were different for the two instruments. The instrument designated for the analysis of unfiltered samples was calibrated using standards prepared in a nitric:hydrochloric acid (HNO3:HCl) matrix. The instrument designated for the analysis of filtered samples was calibrated using standards prepared in a matrix acidified only with HNO3. Matrix chemistry may have influenced the responses of aluminum, molybdenum, and selenium on the two instruments.
The best analytical practice is to calibrate instruments using calibration standards prepared in matrices that reasonably match those of the samples being analyzed. Filtered and unfiltered samples were spiked over a range of trace-element concentrations from less than 1 to 58 times ambient concentrations. The greater the magnitude of the trace-element spike concentration relative to the ambient concentration, the greater the likelihood that spike recoveries will be within data-control guidelines (80–120 percent). Greater variability in spike recoveries occurred when trace elements were spiked at concentrations less than 10 times the ambient concentration. Spike recoveries that were considerably lower than 90 percent often were associated with spiked concentrations substantially lower than what was present in the ambient sample. Because the main purpose of spiking natural water samples with known quantities of a particular analyte is to assess possible matrix effects on analytical results, the results of this study stress the importance of spiking samples at concentrations that are reasonably close to what is expected but sufficiently high to exceed analytical variability. Generally, differences in spike recovery results between paired filtered and unfiltered samples were minimal when samples were analyzed on the same instrument. Analytical results for trace-element concentrations in ambient filtered and unfiltered samples greater than 10 and 40 μg/L, respectively, were within the data-quality objective for precision of ±25 percent. Ambient trace-element concentrations in filtered samples greater than the long-term method detection limits but less than 10 μg/L failed to meet the data-quality objective for precision for at least one trace element in about 54 percent of the samples. Similarly, trace-element concentrations in unfiltered samples greater than the long-term method detection limits but less than 40 μg/L failed to meet this data-quality objective for at least one trace-element analysis in about 58 percent of the samples. Aluminum and zinc were particularly problematic, although limited re-analyses of filtered and unfiltered samples appeared to improve otherwise failed analytical precision. The evaluation of analytical bias using standard reference materials indicates a slight low bias in results for arsenic, cadmium, selenium, and zinc; aluminum and molybdenum show signs of high bias. There was no observed bias, as determined using the standard reference materials, during the analysis of lead.
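The spike-recovery arithmetic behind the 80–120 percent guideline is simple; the sketch below (with invented numbers) shows why small spikes on high ambient concentrations produce unstable recoveries.

```python
def spike_recovery(ambient, spiked_result, spike_added):
    """Percent recovery: 100 * (measured spiked - ambient) / known spike added."""
    return 100.0 * (spiked_result - ambient) / spike_added

# A spike that dwarfs the ambient concentration is well behaved...
print(spike_recovery(ambient=2.0, spiked_result=21.4, spike_added=20.0))  # 97.0
# ...while a small spike on a high ambient concentration lets ordinary
# measurement noise (here, a 1 ug/L error) swing recovery far outside 80-120%.
print(spike_recovery(ambient=50.0, spiked_result=51.0, spike_added=2.0))  # 50.0
```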
Empirical Recommendations for Improving the Stability of the Dot-Probe Task in Clinical Research
Price, Rebecca B.; Kuckertz, Jennie M.; Siegle, Greg J.; Ladouceur, Cecile D.; Silk, Jennifer S.; Ryan, Neal D.; Dahl, Ronald E.; Amir, Nader
2014-01-01
The dot-probe task has been widely used in research to produce an index of biased attention based on reaction times (RTs). Despite its popularity, very few published studies have examined psychometric properties of the task, including test-retest reliability, and no previous study has examined reliability in clinically anxious samples or systematically explored the effects of task design and analysis decisions on reliability. In the current analysis, we utilized dot-probe data from three studies where attention bias towards threat-related faces was assessed at multiple (≥5) timepoints. Two of the studies were similar (adults with Social Anxiety Disorder, similar design features) while one was much more disparate (pediatric healthy volunteers, distinct task design). We explored the effects of analysis choices (e.g., bias score calculation formula, methods for outlier handling) on reliability and searched for convergence of findings across the three studies. We found that, when considering the three studies concurrently, the most reliable RT bias index utilized data from dot-bottom trials, comparing congruent to incongruent trials, with rescaled outliers, particularly after averaging across more than one assessment point. Although reliability of RT bias indices was moderate to low under most circumstances, within-session variability in bias (attention bias variability; ABV), a recently proposed RT index, was more reliable across sessions. Several eyetracking-based indices of attention bias (available in the pediatric healthy sample only) showed reliability that matched the optimal RT index (ABV). On the basis of these findings, we make specific recommendations to researchers using the dot probe, particularly those wishing to investigate individual differences and/or single-patient applications. PMID:25419646
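A minimal sketch of the RT bias-score computation, assuming the usual incongruent-minus-congruent definition, with outliers rescaled to mean ± 3 SD and a crude bin-based stand-in for attention bias variability (ABV); the exact formulas and all reaction times here are assumptions, not the paper's specification.

```python
import numpy as np

def bias_score(rt_incongruent, rt_congruent, sd_limit=3.0):
    """Dot-probe bias: mean RT(incongruent) - mean RT(congruent), with
    outliers rescaled (clipped) to mean +/- sd_limit * SD."""
    def rescale(x):
        lo, hi = x.mean() - sd_limit * x.std(), x.mean() + sd_limit * x.std()
        return np.clip(x, lo, hi)
    return rescale(rt_incongruent).mean() - rescale(rt_congruent).mean()

rng = np.random.default_rng(1)
rt_inc = rng.normal(520, 60, 40)   # hypothetical reaction times (ms)
rt_con = rng.normal(500, 60, 40)

# Crude ABV stand-in: variability of the bias score across trial bins.
bins = zip(np.array_split(rt_inc, 4), np.array_split(rt_con, 4))
abv = np.std([i.mean() - c.mean() for i, c in bins])

print(f"bias = {bias_score(rt_inc, rt_con):.1f} ms, ABV (sketch) = {abv:.1f} ms")
```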
Classification based upon gene expression data: bias and precision of error rates.
Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L
2007-06-01
Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
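The two safeguards recommended above, label permutation and two-level (nested) external cross-validation, can be sketched with scikit-learn as follows; the data are deliberately non-informative, so an honest error estimate should hover near chance, while tuning without an outer loop would look optimistically good.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Non-informative data: many "genes", labels independent of the predictors.
X, y = rng.normal(size=(60, 500)), rng.integers(0, 2, 60)

# Two-level external cross-validation: the inner loop tunes C, the outer
# loop estimates error on samples never touched by the tuning.
inner = GridSearchCV(LinearSVC(max_iter=5000), {"C": [0.01, 0.1, 1, 10]}, cv=3)
outer_acc = cross_val_score(inner, X, y, cv=5)
print("nested-CV accuracy: %.2f (should be near 0.50)" % outer_acc.mean())

# Permutation mean: repeat with shuffled labels; a reported error rate well
# below this baseline signals optimization/selection bias.
perm = [cross_val_score(inner, X, rng.permutation(y), cv=5).mean() for _ in range(5)]
print("permutation-mean accuracy: %.2f" % np.mean(perm))
```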
Bias correction of surface downwelling longwave and shortwave radiation for the EWEMBI dataset
NASA Astrophysics Data System (ADS)
Lange, Stefan
2018-05-01
Many meteorological forcing datasets include bias-corrected surface downwelling longwave and shortwave radiation (rlds and rsds). Methods used for such bias corrections range from multi-year monthly mean value scaling to quantile mapping at the daily timescale. An additional downscaling is necessary if the data to be corrected have a higher spatial resolution than the observational data used to determine the biases. This was the case when EartH2Observe (E2OBS; Calton et al., 2016) rlds and rsds were bias-corrected using more coarsely resolved Surface Radiation Budget (SRB; Stackhouse Jr. et al., 2011) data for the production of the meteorological forcing dataset EWEMBI (Lange, 2016). This article systematically compares various parametric quantile mapping methods designed specifically for this purpose, including those used for the production of EWEMBI rlds and rsds. The methods vary in the timescale at which they operate, in their way of accounting for physical upper radiation limits, and in their approach to bridging the spatial resolution gap between E2OBS and SRB. It is shown how temporal and spatial variability deflation related to bilinear interpolation and other deterministic downscaling approaches can be overcome by downscaling the target statistics of quantile mapping from the SRB to the E2OBS grid such that the sub-SRB-grid-scale spatial variability present in the original E2OBS data is retained. Cross validations at the daily and monthly timescales reveal that it is worthwhile to take empirical estimates of physical upper limits into account when adjusting either radiation component and that, overall, bias correction at the daily timescale is more effective than bias correction at the monthly timescale if sampling errors are taken into account.
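A bare-bones version of empirical quantile mapping with a physical upper limit, in the spirit of the corrections discussed above, might look like the sketch below; the variable names, the gamma toy distributions, and the 420 W m^-2 cap are illustrative assumptions, not the EWEMBI configuration.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_new, upper=None):
    """Empirical quantile mapping: map model values onto the observed
    distribution via matched quantiles; optionally cap at a physical
    upper limit (e.g., a clear-sky or top-of-atmosphere radiation bound)."""
    q = np.linspace(0, 1, 101)
    mq, oq = np.quantile(model_hist, q), np.quantile(obs_hist, q)
    corrected = np.interp(model_new, mq, oq)
    return corrected if upper is None else np.minimum(corrected, upper)

rng = np.random.default_rng(3)
model_hist = rng.gamma(4.0, 50.0, 2000)   # biased "rsds" (W m^-2), illustrative
obs_hist = rng.gamma(5.0, 45.0, 2000)     # observational reference, illustrative
print(quantile_map(model_hist, obs_hist, model_hist[:5], upper=420.0).round(1))
```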
Magnetic properties in polycrystalline and single crystal Ca-doped LaCoO3
NASA Astrophysics Data System (ADS)
Zeng, R.; Debnath, J. C.; Chen, D. P.; Shamba, P.; Wang, J. L.; Kennedy, S. J.; Campbell, S. J.; Silver, T.; Dou, S. X.
2011-04-01
Polycrystalline (PC) and single crystalline (SC) Ca-doped LaCoO3 (LCCO) samples with the perovskite structure were synthesized by conventional solid-state reaction and the floating-zone growth method. We present the results of a comprehensive investigation of the magnetic properties of the LCCO system. Systematic measurements have been conducted on dc magnetization, ac susceptibility, exchange bias, and the magnetocaloric effect. These findings suggest that complex structural phases, ferromagnetic (FM) and spin-glass/cluster-spin-glass (CSG) phases, and transitions between them exist in PC samples, whereas SC samples exhibit a much simpler magnetic phase. It was also of interest to discover that the CSG induced a magnetic field memory effect and an exchange-bias-like effect, and that a large inverse irreversible magnetocaloric effect exists in this system.
Empirical Validation of a Procedure to Correct Position and Stimulus Biases in Matching-to-Sample
ERIC Educational Resources Information Center
Kangas, Brian D.; Branch, Marc N.
2008-01-01
The development of position and stimulus biases often occurs during initial training on matching-to-sample tasks. Furthermore, without intervention, these biases can be maintained via intermittent reinforcement provided by matching-to-sample contingencies. The present study evaluated the effectiveness of a correction procedure designed to…
Enko, Dietmar; Mangge, Harald; Münch, Andreas; Niedrist, Tobias; Mahla, Elisabeth; Metzler, Helfried; Prüller, Florian
2017-01-01
Introduction: The aim of this study was to assess pneumatic tube system (PTS) alteration on platelet function by the light transmission aggregometry (LTA) and whole blood aggregometry (WBA) methods, and on the results of platelet count, prothrombin time (PT), activated partial thromboplastin time (APTT), and fibrinogen. Materials and methods: Venous blood was collected into six 4.5 mL VACUETTE® 9NC coagulation sodium citrate 3.8% tubes (Greiner Bio-One International GmbH, Kremsmünster, Austria) from 49 intensive care unit (ICU) patients on dual anti-platelet therapy and immediately hand carried to the central laboratory. Blood samples were divided into 2 groups: Group 1 samples (N = 49) underwent PTS (4 m/s) transport from the central laboratory to the distant laboratory and back to the central laboratory, whereas Group 2 samples (N = 49) were excluded from PTS forces. In both groups, LTA and WBA stimulated with collagen, adenosine-5’-diphosphate (ADP), arachidonic acid (AA) and thrombin-receptor-activated-peptide 6 (TRAP-6) as well as platelet count, PT, APTT, and fibrinogen were performed. Results: No statistically significant differences were observed between blood samples with (Group 1) and without (Group 2) PTS transport (P values from 0.064 to 0.968). The AA-induced LTA (bias: 68.57%) exceeded the bias acceptance limit of ≤ 25%. Conclusions: Blood sample transportation with computer-controlled PTS in our hospital had no statistically significant effects on platelet aggregation determined in patients with anti-platelet therapy. Although AA-induced LTA showed a significant bias, the diagnostic accuracy was not influenced. PMID:28392742
Correcting length-frequency distributions for imperfect detection
Breton, André R.; Hawkins, John A.; Winkelman, Dana L.
2013-01-01
Sampling gear selects for specific sizes of fish, which may bias length-frequency distributions that are commonly used to assess population size structure, recruitment patterns, growth, and survival. To properly correct for sampling biases caused by gear and other sources, length-frequency distributions need to be corrected for imperfect detection. We describe a method for adjusting length-frequency distributions when capture and recapture probabilities are a function of fish length, temporal variation, and capture history. The method is applied to a study involving the removal of Smallmouth Bass Micropterus dolomieu by boat electrofishing from a 38.6-km reach on the Yampa River, Colorado. Smallmouth Bass longer than 100 mm were marked and released alive from 2005 to 2010 on one or more electrofishing passes and removed on all other passes from the population. Using the Huggins mark–recapture model, we detected a significant effect of fish total length, previous capture history (behavior), year, pass, year×behavior, and year×pass on capture and recapture probabilities. We demonstrate how to partition the Huggins estimate of abundance into length frequencies to correct for these effects. Uncorrected length frequencies of fish removed from Little Yampa Canyon were negatively biased in every year by as much as 88% relative to mark–recapture estimates for the smallest length-class in our analysis (100–110 mm). Bias declined but remained high even for adult length-classes (≥200 mm). The pattern of bias across length-classes was variable across years. The percentage of unadjusted counts that were below the lower 95% confidence interval from our adjusted length-frequency estimates were 95, 89, 84, 78, 81, and 92% from 2005 to 2010, respectively. Length-frequency distributions are widely used in fisheries science and management. Our simple method for correcting length-frequency estimates for imperfect detection could be widely applied when mark–recapture data are available.
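The core of such a correction is Horvitz-Thompson-style division of each length-class count by its estimated capture probability; the sketch below assumes a logistic length effect standing in for the Huggins-model estimates, with all numbers hypothetical.

```python
import numpy as np

def p_capture(length_mm):
    """Hypothetical capture probability rising with fish length (logistic)."""
    return 1.0 / (1.0 + np.exp(-(length_mm - 150.0) / 30.0))

raw_counts = {105: 42, 155: 120, 205: 95, 255: 40}   # fish per length-class

# Correct each class by dividing the count by its detection probability;
# the smallest classes inflate the most, mirroring the negative bias above.
for L, n in raw_counts.items():
    print(f"{L} mm: raw {n:4d} -> corrected {n / p_capture(L):7.0f}")
```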
Exchange bias effect and glassy-like behavior of EuCrO{sub 3} and CeCrO{sub 3} nano-powders
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taheri, M., E-mail: maryam.taheri@brocku.ca; Razavi, F. S.; Kremer, R. K.
2015-09-28
The magnetic properties of nano-sized EuCrO{sub 3} and CeCrO{sub 3} powders, synthesized by a solution combustion method, were investigated using DC/AC magnetization measurements. An exchange bias effect, magnetization irreversibility and AC susceptibility dispersion in these samples provided evidence for the presence of the spin disorder magnetic phase. The exchange bias phenomenon, which is assigned to the exchange coupling between the glassy-like shell and canted antiferromagnetic core, showed the opposite sign in EuCrO{sub 3} and CeCrO{sub 3} at low temperatures, suggesting different exchange interactions at the interfaces in these compounds. We also observed a sign reversal of exchange bias in CeCrO{sub 3} at different temperatures.
Number-counts slope estimation in the presence of Poisson noise
NASA Technical Reports Server (NTRS)
Schmitt, Juergen H. M. M.; Maccacaro, Tommaso
1986-01-01
The slope determination of a power-law number-flux relationship is considered for the case of photon-limited sampling. This case is important for high-sensitivity X-ray surveys with imaging telescopes, where the error in an individual source measurement depends on integrated flux and is Poisson, rather than Gaussian, distributed. A bias-free method of slope estimation is developed that takes into account the exact error distribution, the influence of background noise, and the effects of varying limiting sensitivities. It is shown that the resulting bias corrections are quite insensitive to the bias correction procedures applied, as long as only sources with signal-to-noise ratio five or greater are considered. However, if sources with signal-to-noise ratio less than five are included, the derived bias corrections depend sensitively on the shape of the error distribution.
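To see the effect numerically, the simulation below draws fluxes from a power law, observes them with Poisson photon noise, and applies the standard maximum-likelihood slope estimator; note how restricting to roughly signal-to-noise five (about 25 counts, since SNR ≈ √counts for pure Poisson data) shrinks the noise-induced distortion. All parameters are illustrative, and this is not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha_true, s_min, n_src = 1.5, 10.0, 50_000

# True source fluxes (expected photon counts) from a power law above s_min.
flux = s_min * (1.0 - rng.random(n_src)) ** (-1.0 / alpha_true)

# Photon-limited observation: measured counts are Poisson around the flux.
measured = rng.poisson(flux).astype(float)

def mle_slope(s, smin):
    """Maximum-likelihood power-law slope for fluxes above smin."""
    s = s[s >= smin]
    return s.size / np.log(s / smin).sum()

print("slope from true fluxes:        %.3f" % mle_slope(flux, s_min))
print("slope from Poisson counts:     %.3f" % mle_slope(measured, s_min))
print("slope, counts >= 25 (SNR ~ 5): %.3f" % mle_slope(measured, 25.0))
```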
Accounting for selection bias in association studies with complex survey data.
Wirth, Kathleen E; Tchetgen Tchetgen, Eric J
2014-05-01
Obtaining representative information from hidden and hard-to-reach populations is fundamental to describe the epidemiology of many sexually transmitted diseases, including HIV. Unfortunately, simple random sampling is impractical in these settings, as no registry of names exists from which to sample the population at random. However, complex sampling designs can be used, as members of these populations tend to congregate at known locations, which can be enumerated and sampled at random. For example, female sex workers may be found at brothels and street corners, whereas injection drug users often come together at shooting galleries. Despite the logistical appeal, complex sampling schemes lead to unequal probabilities of selection, and failure to account for this differential selection can result in biased estimates of population averages and relative risks. However, standard techniques to account for selection can lead to substantial losses in efficiency. Consequently, researchers implement a variety of strategies in an effort to balance validity and efficiency. Some researchers fully or partially account for the survey design, whereas others do nothing and treat the sample as a realization of the population of interest. We use directed acyclic graphs to show how certain survey sampling designs, combined with subject-matter considerations unique to individual exposure-outcome associations, can induce selection bias. Finally, we present a novel yet simple maximum likelihood approach for analyzing complex survey data; this approach optimizes statistical efficiency at no cost to validity. We use simulated data to illustrate this method and compare it with other analytic techniques.
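The essence of the weighting argument can be shown in a few lines: when selection probabilities differ by venue and correlate with the outcome, the unweighted sample mean is biased, while inverse-probability weighting recovers the population value. The prevalences and selection probabilities below are invented, and this sketch is simpler than the maximum likelihood approach the paper proposes.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

venue = rng.integers(0, 2, n)                    # 0 = street corner, 1 = brothel
prevalence = np.where(venue == 1, 0.30, 0.10)    # outcome depends on venue
infected = rng.random(n) < prevalence

# Venue-based sampling: brothels are far easier to enumerate and oversampled.
p_select = np.where(venue == 1, 0.20, 0.02)
sampled = rng.random(n) < p_select

naive = infected[sampled].mean()
ipw = np.average(infected[sampled], weights=1.0 / p_select[sampled])
print(f"true {infected.mean():.3f}  naive {naive:.3f}  IPW {ipw:.3f}")
```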
Enhanced Conformational Sampling of N-Glycans in Solution with Replica State Exchange Metadynamics.
Galvelis, Raimondas; Re, Suyong; Sugita, Yuji
2017-05-09
Molecular dynamics (MD) simulation of a N-glycan in solution is challenging because of high-energy barriers of the glycosidic linkages, functional group rotational barriers, and numerous intra- and intermolecular hydrogen bonds. In this study, we apply different enhanced conformational sampling approaches, namely, metadynamics (MTD), replica-exchange MD (REMD), and the recently proposed replica state exchange MTD (RSE-MTD), to a N-glycan in solution and compare the conformational sampling efficiencies of the approaches. MTD helps to cross the high-energy barrier along the ω angle by utilizing a bias potential, but it cannot enhance sampling of the other degrees of freedom. REMD ensures moderate-energy barrier crossings by exchanging temperatures between replicas, while it hardly crosses the barriers along ω. In contrast, RSE-MTD succeeds in crossing the high-energy barrier along ω as well as in enhancing sampling of the other degrees of freedom. We tested two RSE-MTD schemes: in one scheme, 64 replicas were simulated with the bias potential along ω at different temperatures, while in the other, simulations of four replicas were performed with bias potentials for different CVs at 300 K. In both schemes, one unbiased replica at 300 K was included to compute conformational properties of the glycan. The conformational sampling of the former scheme is better than that of the other enhanced sampling methods, while the latter shows reasonable performance without requiring large computational resources. The latter scheme is likely to be useful when a N-glycan-attached protein is simulated.
Empirical single sample quantification of bias and variance in Q-ball imaging.
Hainline, Allison E; Nath, Vishwesh; Parvathaneni, Prasanna; Blaber, Justin A; Schilling, Kurt G; Anderson, Adam W; Kang, Hakmook; Landman, Bennett A
2018-02-06
The bias and variance of high angular resolution diffusion imaging methods have not been thoroughly explored in the literature; the simulation extrapolation (SIMEX) and bootstrap techniques can be used to estimate the bias and variance of high angular resolution diffusion imaging metrics. The SIMEX approach is well established in the statistics literature and uses simulation of increasingly noisy data to extrapolate back to a hypothetical case with no noise. The bias of calculated metrics can then be computed by subtracting the SIMEX estimate from the original pointwise measurement. The SIMEX technique has been studied in the context of diffusion imaging to accurately capture the bias in fractional anisotropy measurements in DTI. Herein, we extend the application of SIMEX and bootstrap approaches to characterize bias and variance in metrics obtained from a Q-ball imaging reconstruction of high angular resolution diffusion imaging data. The results demonstrate that SIMEX and bootstrap approaches provide consistent estimates of the bias and variance of generalized fractional anisotropy, respectively. The RMSE for the generalized fractional anisotropy estimates shows a 7% decrease in white matter and an 8% decrease in gray matter when compared with the observed generalized fractional anisotropy estimates. On average, the bootstrap technique results in SD estimates that are approximately 97% of the true variation in white matter, and 86% in gray matter. Both SIMEX and bootstrap methods are flexible, estimate population characteristics based on single scans, and may be extended for bias and variance estimation on a variety of high angular resolution diffusion imaging metrics. © 2018 International Society for Magnetic Resonance in Medicine.
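The SIMEX recipe itself is compact, as the toy Python example below shows for a deliberately simple target (the variance of a noisily measured variable): simulate additional noise at several multiples of the known noise level, fit the trend, and extrapolate back to the no-noise case. The setting is invented and far simpler than Q-ball metrics.

```python
import numpy as np

rng = np.random.default_rng(11)
x_true = rng.normal(0.0, 1.0, 500)
sigma = 0.5                                   # known measurement-noise SD
x_obs = x_true + rng.normal(0.0, sigma, 500)  # what we actually measure

# SIMEX: add extra noise with variance lambda * sigma^2, track the estimate,
# then extrapolate the fitted trend back to lambda = -1 (the noise-free case).
lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
est = [np.mean([np.var(x_obs + rng.normal(0, np.sqrt(lam) * sigma, x_obs.size),
                       ddof=1)
                for _ in range(200)])
       for lam in lambdas]
simex = np.polyval(np.polyfit(lambdas, est, 2), -1.0)

print(f"naive {np.var(x_obs, ddof=1):.3f}  SIMEX {simex:.3f}  "
      f"true {np.var(x_true, ddof=1):.3f}")
```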
Social media for intelligence: research, concepts, and results
NASA Astrophysics Data System (ADS)
Franke, Ulrik; Rosell, Magnus
2016-05-01
When sampling part of the enormous amounts of social media data, it is important to consider whether the sample is representative. Any method of studying the sampled data is also prone to bias. Sampling and bias aside, the data may be generated with malicious intent, such as deception. Deception is a complicated (broad, situational, vague) concept. It seems improbable that an automated computer system would be able to detect deception as such. Instead, we argue that the role of a system would be to aid the human analyst by detecting indicators, or clues, of (potential) deception. Indicators could take many forms and are typically neither necessary nor sufficient for there to be an actual deception. However, by using one or combining several of them, a human may reach conclusions. Indicators are not necessarily dependent and will be added to or removed from the analysis depending on the circumstances. This modularity can help in counteracting or alleviating attacks on the system by an adversary. If we become aware that an indicator is compromised, we can remove it from the analysis and/or replace it with a more sophisticated method that gives us a similar indication.
Fetterly, Kenneth A; Favazza, Christopher P
2016-08-07
Channelized Hotelling model observer (CHO) methods were developed to assess performance of an x-ray angiography system. The analytical methods included correction for known bias error due to finite sampling. Detectability indices (d′) corresponding to disk-shaped objects with diameters in the range 0.5-4 mm were calculated. Application of the CHO for variable detector target dose (DTD) in the range 6-240 nGy frame(-1) resulted in d′ estimates which were as much as 2.9× greater than expected of a quantum limited system. Over-estimation of d′ was presumed to be a result of bias error due to temporally variable non-stationary noise. Statistical theory which allows for independent contributions of 'signal' from a test object (o) and temporally variable non-stationary noise (ns) was developed. The theory demonstrates that the biased d′ is the sum of the detectability indices associated with the test object (d′_o) and non-stationary noise (d′_ns). Given the nature of the imaging system and the experimental methods, d′_o cannot be directly determined independent of d′_ns. However, methods to estimate d′_ns independent of d′_o were developed. In accordance with the theory, d′_ns was subtracted from experimental estimates of d′, providing an unbiased estimate of d′_o. Estimates of d′_o exhibited trends consistent with expectations of an angiography system that is quantum limited for high DTD and compromised by detector electronic readout noise for low DTD conditions. Results suggest that these methods provide d′ estimates which are accurate and precise for [Formula: see text]. Further, results demonstrated that the source of bias was detector electronic readout noise. In summary, this work presents theory and methods to test for the presence of bias in Hotelling model observers due to temporally variable non-stationary noise and correct this bias when the temporally variable non-stationary noise is independent and additive with respect to the test object signal.
Sequential biases in accumulating evidence
Huggins, Richard; Dogo, Samson Henry
2015-01-01
Whilst it is common in clinical trials to use the results of tests at one phase to decide whether to continue to the next phase and to subsequently design the next phase, we show that this can lead to biased results in evidence synthesis. Two new kinds of bias associated with accumulating evidence, termed ‘sequential decision bias’ and ‘sequential design bias’, are identified. Both kinds of bias are the result of making decisions on the usefulness of a new study, or its design, based on the previous studies. Sequential decision bias is determined by the correlation between the value of the current estimated effect and the probability of conducting an additional study. Sequential design bias arises from using the estimated value instead of the clinically relevant value of an effect in sample size calculations. We considered both the fixed‐effect and the random‐effects models of meta‐analysis and demonstrated analytically and by simulations that in both settings the problems due to sequential biases are apparent. According to our simulations, the sequential biases increase with increased heterogeneity. Minimisation of sequential biases arises as a new and important research area necessary for successful evidence‐based approaches to the development of science. © 2015 The Authors. Research Synthesis Methods Published by John Wiley & Sons Ltd. PMID:26626562
Härkänen, Tommi; Kaikkonen, Risto; Virtala, Esa; Koskinen, Seppo
2014-11-06
To assess the nonresponse rates in a questionnaire survey with respect to administrative register data, and to correct the bias statistically. The Finnish Regional Health and Well-being Study (ATH) in 2010 was based on a national sample and several regional samples. Missing data analysis was based on socio-demographic register data covering the whole sample. Inverse probability weighting (IPW) and doubly robust (DR) methods were estimated using the logistic regression model, which was selected using the Bayesian information criteria. The crude, weighted and true self-reported turnout in the 2008 municipal election and prevalences of entitlements to specially reimbursed medication, and the crude and weighted body mass index (BMI) means were compared. The IPW method appeared to remove a relatively large proportion of the bias compared to the crude prevalence estimates of the turnout and the entitlements to specially reimbursed medication. Several demographic factors were shown to be associated with missing data, but few interactions were found. Our results suggest that the IPW method can improve the accuracy of results of a population survey, and the model selection provides insight into the structure of missing data. However, health-related missing data mechanisms are beyond the scope of statistical methods, which mainly rely on socio-demographic information to correct the results.
How bandwidth selection algorithms impact exploratory data analysis using kernel density estimation.
Harpole, Jared K; Woods, Carol M; Rodebaugh, Thomas L; Levinson, Cheri A; Lenze, Eric J
2014-09-01
Exploratory data analysis (EDA) can reveal important features of underlying distributions, and these features often have an impact on inferences and conclusions drawn from data. Graphical analysis is central to EDA, and graphical representations of distributions often benefit from smoothing. A viable method of estimating and graphing the underlying density in EDA is kernel density estimation (KDE). This article provides an introduction to KDE and examines alternative methods for specifying the smoothing bandwidth in terms of their ability to recover the true density. We also illustrate the comparison and use of KDE methods with 2 empirical examples. Simulations were carried out in which we compared 8 bandwidth selection methods (Sheather-Jones plug-in [SJDP], normal rule of thumb, Silverman's rule of thumb, least squares cross-validation, biased cross-validation, and 3 adaptive kernel estimators) using 5 true density shapes (standard normal, positively skewed, bimodal, skewed bimodal, and standard lognormal) and 9 sample sizes (15, 25, 50, 75, 100, 250, 500, 1,000, 2,000). Results indicate that, overall, SJDP outperformed all methods. However, for smaller sample sizes (25 to 100) either biased cross-validation or Silverman's rule of thumb was recommended, and for larger sample sizes the adaptive kernel estimator with SJDP was recommended. Information is provided about implementing the recommendations in the R computing language. PsycINFO Database Record (c) 2014 APA, all rights reserved.
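With scipy one can reproduce a miniature version of such a comparison. scipy's gaussian_kde ships only the Scott and Silverman rules (the Sheather-Jones plug-in lives in other packages, e.g. statsmodels or KDEpy), so the sketch below compares those rules and two fixed bandwidth factors on a bimodal target via integrated squared error; all settings are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Bimodal sample: the shape where a single global bandwidth struggles most.
x = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(2, 1.0, 150)])

grid = np.linspace(-5, 6, 400)
true_pdf = 0.5 * stats.norm.pdf(grid, -2, 0.5) + 0.5 * stats.norm.pdf(grid, 2, 1.0)

for bw in ("scott", "silverman", 0.1, 0.5):
    kde = stats.gaussian_kde(x, bw_method=bw)   # scalar = bandwidth factor
    ise = np.sum((kde(grid) - true_pdf) ** 2) * (grid[1] - grid[0])
    print(f"bw={bw}: integrated squared error = {ise:.4f}")
```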
A general reconstruction of the recent expansion history of the universe
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vitenti, S.D.P.; Penna-Lima, M., E-mail: dias@iap.fr, E-mail: pennal@apc.in2p3.fr
Distance measurements are currently the most powerful tool to study the expansion history of the universe without specifying its matter content nor any theory of gravitation. Assuming only an isotropic, homogeneous and flat universe, in this work we introduce a model-independent method to reconstruct directly the deceleration function via a piecewise function. Including a penalty factor, we are able to vary continuously the complexity of the deceleration function from a linear case to an arbitrary (n+1)-knots spline interpolation. We carry out a Monte Carlo (MC) analysis to determine the best penalty factor, evaluating the bias-variance trade-off, given the uncertainties of the SDSS-II and SNLS supernova combined sample (JLA), compilations of baryon acoustic oscillation (BAO) and H(z) data. The bias-variance analysis is done for three fiducial models with different features in the deceleration curve. We perform the MC analysis generating mock catalogs and computing their best-fit. For each fiducial model, we test different reconstructions using, in each case, more than 10{sup 4} catalogs in a total of about 5 × 10{sup 5}. This investigation proved to be essential in determining the best reconstruction to study these data. We show that, evaluating a single fiducial model, the conclusions about the bias-variance ratio are misleading. We determine the reconstruction method in which the bias represents at most 10% of the total uncertainty. In all statistical analyses, we fit the coefficients of the deceleration function along with four nuisance parameters of the supernova astrophysical model. For the full sample, we also fit H{sub 0} and the sound horizon r{sub s}(z{sub d}) at the drag redshift. The bias-variance trade-off analysis shows that, apart from the deceleration function, all other estimators are unbiased. Finally, we apply the Ensemble Sampler Markov Chain Monte Carlo (ESMCMC) method to explore the posterior of the deceleration function up to redshift 1.3 (using only JLA) and 2.3 (JLA+BAO+H(z)). We obtain that the standard cosmological model agrees within 3σ level with the reconstructed results in the whole studied redshift intervals. Since our method is calibrated to minimize the bias, the error bars of the reconstructed functions are a good approximation for the total uncertainty.
Precluding nonlinear ISI in direct detection long-haul fiber optic systems
NASA Technical Reports Server (NTRS)
Swenson, Norman L.; Shoop, Barry L.; Cioffi, John M.
1991-01-01
Long-distance, high-rate fiber optic systems employing directly modulated 1.55-micron single-mode lasers and conventional single-mode fiber suffer severe intersymbol interference (ISI) with a large nonlinear component. A method of reducing the nonlinearity of the ISI, thereby making linear equalization more viable, is investigated. It is shown that the degree of nonlinearity is highly dependent on the choice of laser bias current, and that in some cases the ISI nonlinearity can be significantly reduced by biasing the laser substantially above threshold. Simulation results predict that an increase in signal-to-nonlinear-distortion ratio as high as 25 dB can be achieved for synchronously spaced samples at an optimal sampling phase by increasing the bias current from 1.2 times threshold to 3.5 times threshold. The high SDR indicates that a linear tapped delay line equalizer could be used to mitigate ISI. Furthermore, the shape of the pulse response suggests that partial response precoding and digital feedback equalization would be particularly effective for this channel.
Neither fixed nor random: weighted least squares meta-analysis.
Stanley, T D; Doucouliagos, Hristos
2015-06-15
This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how and explain why an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects. Copyright © 2015 John Wiley & Sons, Ltd.
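In estimating terms, the unrestricted WLS average shares its point estimate with the fixed-effect weighted mean but scales the variance by the weighted residual mean square instead of forcing it to one; a minimal sketch with invented study data follows (the variance scaling shown is one straightforward reading of the approach, not a substitute for the authors' formulas).

```python
import numpy as np

# Hypothetical study effects and standard errors.
y = np.array([0.30, 0.15, 0.45, 0.05, 0.25])
se = np.array([0.10, 0.08, 0.20, 0.05, 0.12])
w = 1.0 / se**2
k = y.size

est = np.sum(w * y) / np.sum(w)          # same point estimate as fixed effect
se_fixed = np.sqrt(1.0 / np.sum(w))      # fixed-effect standard error

# Unrestricted WLS: multiply the variance by the weighted residual mean
# square, letting the data say how much excess heterogeneity there is.
mse = np.sum(w * (y - est) ** 2) / (k - 1)
se_wls = se_fixed * np.sqrt(mse)

print(f"estimate {est:.3f}  fixed-effect SE {se_fixed:.3f}  WLS SE {se_wls:.3f}")
```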
ERIC Educational Resources Information Center
Moses, Tim; Holland, Paul
2007-01-01
The purpose of this study was to empirically evaluate the impact of loglinear presmoothing accuracy on equating bias and variability across chained and post-stratification equating methods, kernel and percentile-rank continuization methods, and sample sizes. The results of evaluating presmoothing on equating accuracy generally agreed with those of…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rhee, Minsoung; Light, Yooli K.; Meagher, Robert J.
Here, multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent of ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to the conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology.
Data harmonization of environmental variables: from simple to general solutions
NASA Astrophysics Data System (ADS)
Baume, O.
2009-04-01
European data platforms often contain measurements from different regional or national networks. As standards and protocols - e.g., type of measurement devices, sensors or measurement site classification, laboratory analysis and post-processing methods - vary between networks, discontinuities will appear when mapping the target variable at an international scale. Standardisation is generally a costly solution and does not allow classical statistical analysis of previously reported values. As an alternative, harmonization should be envisaged as an integrated step in mapping procedures across borders. In this paper, several harmonization solutions developed under the INTAMAP FP6 project are presented. The INTAMAP FP6 project is currently developing an interoperable framework for real-time automatic mapping of critical environmental variables by extending spatial statistical methods to web-based implementations. Harmonization is often considered a pre-processing step in the statistical data analysis workflow. If biases are assessed with little knowledge about the target variable - in particular when no explanatory covariate is integrated - a harmonization procedure along borders or between regionally overlapping networks may be adopted (Skøien et al., 2007). In this case, bias is estimated as the systematic difference between line or local predictions. On the other hand, when covariates can be included in spatial prediction, the harmonization step is integrated in the whole model estimation procedure and, therefore, is no longer an independent pre-processing step of the automatic mapping process (Baume et al., 2007). In this case, bias factors become integrated parameters of the geostatistical model and are estimated alongside the other model parameters. The harmonization methods developed within the INTAMAP project were first applied within the field of radiation, where the European Radiological Data Exchange Platform (EURDEP) - http://eurdep.jrc.ec.europa.eu/ - has been active for all member states for more than a decade (de Cort and de Vries, 1997). This database contains biases because of the different network processes used in data reporting (Bossew et al., 2007). In a comparison study, monthly averaged gamma dose measurements from eight European countries were harmonized using the methods described above. Baume et al. (2008) showed that both methods yield similar results and can detect and remove bias from the EURDEP database. To broaden the potential of the methods developed within the INTAMAP project, another application example taken from soil science is presented in this paper. The carbon/nitrogen (C/N) ratio of forest soils is one of the best predictors for evaluating soil functions such as those used in climate change issues. Although soil samples were analyzed according to a common European laboratory method, Carré et al. (2008) concluded that systematic errors are introduced in the measurements due to calibration issues and instability of the sample. The application of the harmonization procedures showed that bias could be adequately removed, although the procedures have difficulty distinguishing real differences from bias.
Measurements of the properties of solar wind plasma relevant to studies of its coronal sources
NASA Technical Reports Server (NTRS)
Neugebauer, M.
1982-01-01
Interplanetary measurements of the speeds, densities, abundances, and charge states of solar wind ions are diagnostic of conditions in the source region of the solar wind. The absolute values of the mass, momentum, and energy fluxes in the solar wind are not known to an accuracy of 20%. The principal limitations on the absolute accuracies of observations of solar wind protons and alpha particles arise from uncertain instrument calibrations, from the methods used to reduce the data, and from sampling biases. Sampling biases are very important in studies of alpha particles. Instrumental resolution and measurement ambiguities are additional major problems for the observation of ions heavier than helium. Progress in overcoming some of these measurement inadequacies is reviewed.
NASA Astrophysics Data System (ADS)
Yamaguchi, Atsuko; Ohashi, Takeyoshi; Kawasaki, Takahiro; Inoue, Osamu; Kawada, Hiroki
2013-04-01
A new method for calculating critical dimensions (CDs) at the top and bottom of three-dimensional (3D) pattern profiles from a critical-dimension scanning electron microscope (CD-SEM) image, called the "T-sigma method", is proposed and evaluated. Without preparing a database library in advance, T-sigma can estimate a feature of a pattern sidewall. Furthermore, it supplies the optimum edge definition (i.e., the threshold level for determining edge position from a CD-SEM signal) to detect the top and bottom of the pattern. This method consists of three steps. First, two components of line-edge roughness (LER), the noise-induced bias (i.e., LER bias) and the unbiased component (i.e., bias-free LER), are calculated with a set threshold level. Second, these components are calculated for various threshold values, and the threshold dependence of these two components, the "T-sigma graph", is obtained. Finally, the optimum threshold values for top and bottom edge detection are given by analysis of the T-sigma graph. T-sigma was applied to CD-SEM images of three kinds of resist-pattern samples. In addition, reference metrology was performed with an atomic force microscope (AFM) and a scanning transmission electron microscope (STEM). The sensitivity of CDs measured by T-sigma to the reference CDs was higher than or equal to that of CDs measured by the conventional edge definition. Regarding absolute measurement accuracy, T-sigma showed better results than the conventional definition. Furthermore, T-sigma graphs were calculated from CD-SEM images of two kinds of resist samples and compared with corresponding STEM observation results. Both bias-free LER and LER bias increased as the detected edge point moved from the bottom to the top of the pattern in the case where the pattern had a straight sidewall and a round top. On the other hand, they were almost constant in the case where the pattern had a re-entrant profile. T-sigma should thus be able to reveal a re-entrant feature. From these results, it is found that the T-sigma method can provide rough cross-sectional pattern features and achieve quick, easy, and accurate measurements of top and bottom CDs.
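The LER decomposition above rests on the standard variance relation for noise-induced bias, sigma_measured^2 = sigma_biasfree^2 + sigma_noise^2, so the bias-free component follows by subtraction in quadrature; the sketch below (with invented nanometer values) shows that step only, not the full T-sigma analysis.

```python
import numpy as np

def bias_free_ler(sigma_measured, sigma_noise):
    """Remove noise-induced LER bias by subtraction in quadrature."""
    return np.sqrt(max(sigma_measured**2 - sigma_noise**2, 0.0))

# Illustrative values (nm): measured roughness and estimated noise term.
print(f"bias-free LER: {bias_free_ler(2.1, 1.2):.2f} nm")
```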
Equilibrium Molecular Thermodynamics from Kirkwood Sampling
2015-01-01
We present two methods for barrierless equilibrium sampling of molecular systems based on the recently proposed Kirkwood method (J. Chem. Phys. 2009, 130, 134102). Kirkwood sampling employs low-order correlations among internal coordinates of a molecule for random (or non-Markovian) sampling of the high dimensional conformational space. This is a geometrical sampling method independent of the potential energy surface. The first method is a variant of biased Monte Carlo, where Kirkwood sampling is used for generating trial Monte Carlo moves. Using this method, equilibrium distributions corresponding to different temperatures and potential energy functions can be generated from a given set of low-order correlations. Since Kirkwood samples are generated independently, this method is ideally suited for massively parallel distributed computing. The second approach is a variant of reservoir replica exchange, where Kirkwood sampling is used to construct a reservoir of conformations, which exchanges conformations with the replicas performing equilibrium sampling corresponding to different thermodynamic states. Coupling with the Kirkwood reservoir enhances sampling by facilitating global jumps in the conformational space. The efficiency of both methods depends on the overlap of the Kirkwood distribution with the target equilibrium distribution. We present proof-of-concept results for a model nine-atom linear molecule and alanine dipeptide. PMID:25915525
Lincoln, Tricia A.; Horan-Ross, Debra A.; McHale, Michael R.; Lawrence, Gregory B.
2006-01-01
The laboratory for analysis of low-ionic-strength water at the U.S. Geological Survey (USGS) Water Science Center in Troy, N.Y., analyzes samples collected by USGS projects throughout the Northeast. The laboratory's quality-assurance program is based on internal and interlaboratory quality-assurance samples and quality-control procedures that were developed to ensure proper sample collection, processing, and analysis. The quality-assurance and quality-control data were stored in the laboratory's LabMaster data-management system, which provides efficient review, compilation, and plotting of data. This report presents and discusses results of quality-assurance and quality-control samples analyzed from July 1999 through June 2001. Results for the quality-control samples for 18 analytical procedures were evaluated for bias and precision. Control charts indicate that data for eight of the analytical procedures were occasionally biased for either high-concentration or low-concentration samples but were within control limits; these procedures were: acid-neutralizing capacity, total monomeric aluminum, total aluminum, calcium, chloride and nitrate (ion chromatography and colorimetric methods), and sulfate. The total aluminum and dissolved organic carbon procedures were biased throughout the analysis period for the high-concentration sample, but were within control limits. The calcium and specific conductance procedures were biased throughout the analysis period for the low-concentration sample, but were within control limits. The magnesium procedure was biased for the high-concentration and low-concentration samples, but was within control limits. Results from the filter-blank and analytical-blank analyses indicate that the procedures for 14 of 15 analytes were within control limits, although the concentrations for blanks were occasionally outside the control limits. The data-quality objective was not met for dissolved organic carbon. Sampling and analysis precision are evaluated herein in terms of the coefficient of variation obtained for triplicate samples in the procedures for 17 of the 18 analytes. At least 90 percent of the samples met data-quality objectives for all analytes except ammonium (81 percent of samples met objectives), chloride (75 percent of samples met objectives), and sodium (86 percent of samples met objectives). Results of the USGS interlaboratory Standard Reference Sample (SRS) Project indicated good data quality over the time period, with most ratings for each sample in the good to excellent range. The P-sample (low-ionic-strength constituents) analysis had one satisfactory rating for the specific conductance procedure in one study. The T-sample (trace constituents) analysis had one satisfactory rating for the aluminum procedure in one study and one unsatisfactory rating for the sodium procedure in another. The remainder of the samples had good or excellent ratings for each study. Results of Environment Canada's National Water Research Institute (NWRI) program indicated that at least 89 percent of the samples met data-quality objectives for 10 of the 14 analytes; the exceptions were ammonium, total aluminum, dissolved organic carbon, and sodium. Results indicate a positive bias for the ammonium procedure in all studies. Data-quality objectives were not met in 50 percent of samples analyzed for total aluminum, 38 percent of samples analyzed for dissolved organic carbon, and 27 percent of samples analyzed for sodium.
Results from blind reference-sample analyses indicated that data-quality objectives were met by at least 91 percent of the samples analyzed for calcium, chloride, fluoride, magnesium, pH, potassium, and sulfate. Data-quality objectives were met by 75 percent of the samples analyzed for sodium and 58 percent of the samples analyzed for specific conductance.
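The precision criterion in the report above reduces to a coefficient of variation computed on triplicate analyses. A minimal sketch in Python, with invented concentrations; the data-quality threshold would be analyte-specific:

```python
import statistics

def coefficient_of_variation(replicates):
    """Percent coefficient of variation (CV) for one set of replicate analyses."""
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # sample standard deviation (n-1 denominator)
    return 100.0 * sd / mean

# Three replicate chloride determinations (mg/L; invented values)
cv = coefficient_of_variation([1.02, 0.98, 1.05])
print(f"CV = {cv:.1f}%")  # compare against an analyte-specific data-quality objective
```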
Luo, Yong; Wu, Dapeng; Zeng, Shaojiang; Gai, Hongwei; Long, Zhicheng; Shen, Zheng; Dai, Zhongpeng; Qin, Jianhua; Lin, Bingcheng
2006-09-01
A novel sample injection method for chip CE is presented. This injection method uses hydrostatic pressure, generated by emptying the sample waste reservoir, for sample loading and electrokinetic force for dispensing. The injection is performed on a double-cross microchip. One cross, created by the sample and separation channels, is used for formation of a sample plug. The other cross, formed by the sample and controlling channels, is used for plug control. By varying the electric field in the controlling channel, the sample plug volume can be adjusted linearly. Hydrostatic pressure has the advantage of being easy to generate on a microfluidic chip, without any electrode or external pressure pump, thus allowing sample injection with a minimum number of electrodes. The potential of this injection method was demonstrated with a four-separation-channel chip CE system. In this system, parallel sample separation can be achieved with only two electrodes, which is otherwise impossible with conventional injection methods. Hydrostatic pressure maintains the sample composition during loading, allowing the injection to be free of injection bias.
Can we estimate molluscan abundance and biomass on the continental shelf?
NASA Astrophysics Data System (ADS)
Powell, Eric N.; Mann, Roger; Ashton-Alcox, Kathryn A.; Kuykendall, Kelsey M.; Chase Long, M.
2017-11-01
Few empirical studies have focused on the effect of sample density on the estimate of abundance of the dominant carbonate-producing fauna of the continental shelf. Here, we present such a study and consider the implications of suboptimal sampling design on estimates of abundance and size-frequency distribution. We focus on a principal carbonate producer of the U.S. Atlantic continental shelf, the Atlantic surfclam, Spisula solidissima. To evaluate the degree to which the results are typical, we analyze a dataset for the principal carbonate producer of Mid-Atlantic estuaries, the Eastern oyster Crassostrea virginica, obtained from Delaware Bay. These two species occupy different habitats and display different lifestyles, yet demonstrate similar challenges to survey design and similar trends with sampling density. The median of a series of simulated survey mean abundances, the central tendency obtained over a large number of surveys of the same area, always underestimated true abundance at low sample densities. More dramatic were the trends in the probability of a biased outcome. As sample density declined, the probability of a survey availability event, defined as a survey yielding indices >125% or <75% of the true population abundance, increased and that increase was disproportionately biased towards underestimates. For these cases where a single sample accessed about 0.001-0.004% of the domain, 8-15 random samples were required to reduce the probability of a survey availability event below 40%. The problem of differential bias, in which the probabilities of a biased-high and a biased-low survey index were distinctly unequal, was resolved with fewer samples than the problem of overall bias. These trends suggest that the influence of sampling density on survey design comes with a series of incremental challenges. At woefully inadequate sampling density, the probability of a biased-low survey index will substantially exceed the probability of a biased-high index. The survey time series on the average will return an estimate of the stock that underestimates true stock abundance. If sampling intensity is increased, the frequency of biased indices balances between high and low values. Incrementing sample number from this point steadily reduces the likelihood of a biased survey; however, the number of samples necessary to drive the probability of survey availability events to a preferred level of infrequency may be daunting. Moreover, certain size classes will be disproportionately susceptible to such events and the impact on size frequency will be species specific, depending on the relative dispersion of the size classes.
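The "survey availability event" defined above is straightforward to explore by simulation. A small sketch, assuming a negative binomial (patchy) distribution of individuals per sample, whose parameters are invented for illustration rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed patchy distribution of clams per sample: negative binomial with
# small dispersion k (strong aggregation) and true per-sample mean mu
mu, k = 5.0, 0.3

def availability_event_rate(n_samples, n_surveys=10_000):
    """Fraction of simulated surveys whose mean is >125% or <75% of truth."""
    counts = rng.negative_binomial(k, k / (k + mu), size=(n_surveys, n_samples))
    means = counts.mean(axis=1)
    return np.mean((means > 1.25 * mu) | (means < 0.75 * mu))

for n in (4, 8, 15, 30):
    print(f"{n:2d} samples: {availability_event_rate(n):.2f}")
```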
Feature Grouping and Selection Over an Undirected Graph.
Yang, Sen; Yuan, Lei; Lai, Ying-Cheng; Shen, Xiaotong; Wonka, Peter; Ye, Jieping
2012-01-01
High-dimensional regression/classification continues to be an important and challenging problem, especially when features are highly correlated. Feature selection, combined with additional structure information on the features, has been considered promising in promoting regression/classification performance. Graph-guided fused lasso (GFlasso) has recently been proposed to facilitate feature selection and graph structure exploitation when features exhibit certain graph structures. However, the formulation in GFlasso relies on pairwise sample correlations to perform feature grouping, which could introduce additional estimation bias. In this paper, we propose three new feature grouping and selection methods to resolve this issue. The first method employs a convex function to penalize the pairwise ℓ∞ norm of connected regression/classification coefficients, achieving simultaneous feature grouping and selection. The second method improves on the first by utilizing a non-convex function to reduce the estimation bias. The third extends the second, using a truncated ℓ1 regularization to further reduce the estimation bias. The proposed methods combine feature grouping and feature selection to enhance estimation accuracy. We employ the alternating direction method of multipliers (ADMM) and difference of convex functions (DC) programming to solve the proposed formulations. Our experimental results on synthetic data and two real datasets demonstrate the effectiveness of the proposed methods.
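A schematic form of the first (convex) objective, with generic loss L, tuning parameters λ1 and λ2, and edge set E of the feature graph (placeholder notation, not the paper's exact symbols):

```latex
\min_{\beta}\; L(\beta)
  \;+\; \lambda_{1}\,\lVert \beta \rVert_{1}
  \;+\; \lambda_{2} \sum_{(i,j)\in E} \max\!\bigl(\lvert\beta_{i}\rvert,\, \lvert\beta_{j}\rvert\bigr)
```

The ℓ1 term drives selection while the pairwise ℓ∞ term encourages connected coefficients to share a common magnitude; the second and third methods replace the max term with non-convex and truncated-ℓ1 surrogates to reduce estimation bias.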
Agreement between methods of measurement of mean aortic wall thickness by MRI.
Rosero, Eric B; Peshock, Ronald M; Khera, Amit; Clagett, G Patrick; Lo, Hao; Timaran, Carlos
2009-03-01
To assess the agreement between three methods of calculation of mean aortic wall thickness (MAWT) using magnetic resonance imaging (MRI). High-resolution MRI of the infrarenal abdominal aorta was performed on 70 subjects with a history of coronary artery disease who were part of a multi-ethnic population-based sample. MAWT was calculated as the mean distance between the adventitial and luminal aortic boundaries using three different methods: average distance at four standard positions (AWT-4P), average distance at 100 automated positions (AWT-100P), and using a mathematical computation derived from the total vessel and luminal areas (AWT-VA). Bland-Altman plots and Passing-Bablok regression analyses were used to assess agreement between methods. Bland-Altman analyses demonstrated a positive bias of 3.02 ± 7.31% between the AWT-VA and the AWT-4P methods, and of 1.76 ± 6.82% between the AWT-100P and the AWT-4P methods. Passing-Bablok regression analyses demonstrated constant bias between the AWT-4P method and the other two methods. Proportional bias was, however, not evident among the three methods. MRI methods of measurement of MAWT using a limited number of positions of the aortic wall systematically underestimate the MAWT value compared with the method that calculates MAWT from the vessel areas.
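For reference, the Bland-Altman quantities reported here (bias and 95% limits of agreement) come from the paired differences between methods. A minimal sketch with invented measurements; the study itself worked with percentage differences:

```python
import numpy as np

def bland_altman(a, b):
    """Bias and 95% limits of agreement between two measurement methods."""
    diff = np.asarray(a, float) - np.asarray(b, float)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# Paired wall-thickness measurements (mm) by two methods (invented values)
m1 = [2.1, 2.4, 1.9, 2.6, 2.3]
m2 = [2.0, 2.5, 1.8, 2.4, 2.2]
print(bland_altman(m1, m2))  # (bias, lower LoA, upper LoA)
```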
Calibrating genomic and allelic coverage bias in single-cell sequencing.
Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher
2015-04-16
Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1 × ) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing
Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher
2016-01-01
Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
Sefa, Eunice; Adimazoya, Edward Akolgo; Yartey, Emmanuel; Lenzi, Rachel; Tarpo, Cindy; Heward-Mills, Nii Lante; Lew, Katherine; Ampeh, Yvonne
2018-01-01
Introduction Generating a nationally representative sample in low and middle income countries typically requires resource-intensive household level sampling with door-to-door data collection. High mobile phone penetration rates in developing countries provide new opportunities for alternative sampling and data collection methods, but there is limited information about response rates and sample biases in coverage and nonresponse using these methods. We utilized data from an interactive voice response, random-digit dial, national mobile phone survey in Ghana to calculate standardized response rates and assess representativeness of the obtained sample. Materials and methods The survey methodology was piloted in two rounds of data collection. The final survey included 18 demographic, media exposure, and health behavior questions. Call outcomes and response rates were calculated according to the American Association of Public Opinion Research guidelines. Sample characteristics, productivity, and costs per interview were calculated. Representativeness was assessed by comparing data to the Ghana Demographic and Health Survey and the National Population and Housing Census. Results The survey was fielded during a 27-day period in February-March 2017. There were 9,469 completed interviews and 3,547 partial interviews. Response, cooperation, refusal, and contact rates were 31%, 81%, 7%, and 39% respectively. Twenty-three calls were dialed to produce an eligible contact: nonresponse was substantial due to the automated calling system and dialing of many unassigned or non-working numbers. Younger, urban, better educated, and male respondents were overrepresented in the sample. Conclusions The innovative mobile phone data collection methodology yielded a large sample in a relatively short period. Response rates were comparable to other surveys, although substantial coverage bias resulted from fewer women, rural, and older residents completing the mobile phone survey in comparison to household surveys. Random digit dialing of mobile phones offers promise for future data collection in Ghana and may be suitable for other developing countries. PMID:29351349
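The outcome rates quoted above follow standardized definitions. A simplified sketch of how such rates are computed from call dispositions; these are illustrative formulas only, since AAPOR specifies several variants (RR1-RR6, COOP1-4, etc.) and the study's exact choices are not reproduced here:

```python
def outcome_rates(complete, partial, refusal, noncontact, other):
    """Simplified survey outcome rates in the spirit of the AAPOR definitions."""
    eligible = complete + partial + refusal + noncontact + other
    response = complete / eligible
    cooperation = complete / (complete + partial + refusal)
    refusal_rate = refusal / eligible
    contact = (complete + partial + refusal) / eligible
    return response, cooperation, refusal_rate, contact

# Illustrative counts, not the study's call dispositions
print(outcome_rates(complete=9469, partial=3547, refusal=2100,
                    noncontact=14000, other=1000))
```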
Telfeyan, Katherine Christina; Ware, Stuart Doug; Reimus, Paul William; ...
2018-01-31
Here, diffusion cell and diffusion wafer experiments were conducted to compare methods for estimating effective matrix diffusion coefficients in rock core samples from Pahute Mesa at the Nevada National Security Site (NNSS). A diffusion wafer method, in which a solute diffuses out of a rock matrix that is pre-saturated with water containing the solute, is presented as a simpler alternative to the traditional through-diffusion (diffusion cell) method. Both methods yielded estimates of effective matrix diffusion coefficients that were within the range of values previously reported for NNSS volcanic rocks. The difference between the estimates of the two methods ranged from 14 to 30%, and there was no systematic high or low bias of one method relative to the other. From a transport modeling perspective, these differences are relatively minor when one considers that other variables (e.g., fracture apertures, fracture spacings) influence matrix diffusion to a greater degree and tend to have greater uncertainty than effective matrix diffusion coefficients. For the same relative random errors in concentration measurements, the diffusion cell method yields effective matrix diffusion coefficient estimates that have less uncertainty than the wafer method. However, the wafer method is easier and less costly to implement and yields estimates more quickly, thus allowing a greater number of samples to be analyzed for the same cost and time. Given the relatively good agreement between the methods, and the lack of any apparent bias between the methods, the diffusion wafer method appears to offer advantages over the diffusion cell method if better statistical representation of a given set of rock samples is desired.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Telfeyan, Katherine Christina; Ware, Stuart Doug; Reimus, Paul William
Here, diffusion cell and diffusion wafer experiments were conducted to compare methods for estimating effective matrix diffusion coefficients in rock core samples from Pahute Mesa at the Nevada National Security Site (NNSS). A diffusion wafer method, in which a solute diffuses out of a rock matrix that is pre-saturated with water containing the solute, is presented as a simpler alternative to the traditional through-diffusion (diffusion cell) method. Both methods yielded estimates of effective matrix diffusion coefficients that were within the range of values previously reported for NNSS volcanic rocks. The difference between the estimates of the two methods ranged from 14 to 30%, and there was no systematic high or low bias of one method relative to the other. From a transport modeling perspective, these differences are relatively minor when one considers that other variables (e.g., fracture apertures, fracture spacings) influence matrix diffusion to a greater degree and tend to have greater uncertainty than effective matrix diffusion coefficients. For the same relative random errors in concentration measurements, the diffusion cell method yields effective matrix diffusion coefficient estimates that have less uncertainty than the wafer method. However, the wafer method is easier and less costly to implement and yields estimates more quickly, thus allowing a greater number of samples to be analyzed for the same cost and time. Given the relatively good agreement between the methods, and the lack of any apparent bias between the methods, the diffusion wafer method appears to offer advantages over the diffusion cell method if better statistical representation of a given set of rock samples is desired.
Kvach, Yuriy; Ondračková, Markéta; Janáč, Michal; Jurajda, Pavel
2016-08-31
In this study, we assessed the impact of sampling method on the results of fish ectoparasite studies. Common roach Rutilus rutilus were sampled from the same gravel pit in the River Dyje flood plain (Czech Republic) using 3 different sampling methods, i.e. electrofishing, beach seining and gill-netting, and were examined for ectoparasites. Not only did fish caught by electrofishing have more of the most abundant parasites (Trichodina spp., Gyrodactylus spp.) than those caught by beach seining or gill-netting, they also had relatively rich parasite infracommunities, resulting in a significantly different assemblage composition, presumably as parasites were lost through handling and 'manipulation' in the net. Based on this, we recommend electrofishing as the most suitable method to sample fish for parasite community studies, as data from fish caught with gill-nets and beach seines will provide a biased picture of the ectoparasite community, underestimating ectoparasite abundance and infracommunity species richness.
Modeling bias and variation in the stochastic processes of small RNA sequencing
Etheridge, Alton; Sakhanenko, Nikita; Galas, David
2017-01-01
The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear-quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. PMID:28369495
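The linear-quadratic mean-variance relation can be checked on any count table by regressing per-feature variances on the mean and squared mean. A minimal sketch with invented values:

```python
import numpy as np

# Per-miRNA means and variances across replicate libraries (invented values)
means = np.array([10.0, 50.0, 200.0, 1000.0, 5000.0])
variances = np.array([14.0, 95.0, 620.0, 9800.0, 2.1e5])

# Fit the linear-quadratic mean-variance relation Var ~ a*mu + b*mu^2
basis = np.column_stack([means, means**2])
a, b = np.linalg.lstsq(basis, variances, rcond=None)[0]
print(a, b)  # b > 0 signals overdispersion beyond Poisson (where Var = mu)
```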
NASA Technical Reports Server (NTRS)
Hixson, M. M.; Bauer, M. E.; Davis, B. J.
1979-01-01
The effect of sampling on the accuracy (precision and bias) of crop area estimates made from classifications of LANDSAT MSS data was investigated. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Four sampling schemes involving different numbers of samples and different sizes of sampling units were evaluated. The precision of the wheat area estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling unit size.
2016-01-01
Reliably estimating wildlife abundance is fundamental to effective management. Aerial surveys are one of the only spatially robust tools for estimating large mammal populations, but statistical sampling methods are required to address detection biases that affect accuracy and precision of the estimates. Although various methods for correcting aerial survey bias are employed on large mammal species around the world, these have rarely been rigorously validated. Several populations of feral horses (Equus caballus) in the western United States have been intensively studied, resulting in identification of all unique individuals. This provided a rare opportunity to test aerial survey bias correction on populations of known abundance. We hypothesized that a hybrid method combining simultaneous double-observer and sightability bias correction techniques would accurately estimate abundance. We validated this integrated technique on populations of known size and also on a pair of surveys before and after a known number was removed. Our analysis identified several covariates across the surveys that explained and corrected biases in the estimates. All six tests on known populations produced estimates with deviations from the known value ranging from -8.5% to +13.7% and <0.7 standard errors. Precision varied widely, from 6.1% CV to 25.0% CV. In contrast, the pair of surveys conducted around a known management removal produced an estimated change in population between the surveys that was significantly larger than the known reduction. Although the deviation was only 9.1%, the precision estimate (CV = 1.6%) may have been artificially low. It was apparent that use of a helicopter in those surveys perturbed the horses, introducing detection error and heterogeneity in a manner that could not be corrected by our statistical models. Our results validate the hybrid method, highlight its potentially broad applicability, identify some limitations, and provide insight and guidance for improving survey designs. PMID:27139732
Lubow, Bruce C; Ransom, Jason I
2016-01-01
Reliably estimating wildlife abundance is fundamental to effective management. Aerial surveys are one of the only spatially robust tools for estimating large mammal populations, but statistical sampling methods are required to address detection biases that affect accuracy and precision of the estimates. Although various methods for correcting aerial survey bias are employed on large mammal species around the world, these have rarely been rigorously validated. Several populations of feral horses (Equus caballus) in the western United States have been intensively studied, resulting in identification of all unique individuals. This provided a rare opportunity to test aerial survey bias correction on populations of known abundance. We hypothesized that a hybrid method combining simultaneous double-observer and sightability bias correction techniques would accurately estimate abundance. We validated this integrated technique on populations of known size and also on a pair of surveys before and after a known number was removed. Our analysis identified several covariates across the surveys that explained and corrected biases in the estimates. All six tests on known populations produced estimates with deviations from the known value ranging from -8.5% to +13.7% and <0.7 standard errors. Precision varied widely, from 6.1% CV to 25.0% CV. In contrast, the pair of surveys conducted around a known management removal produced an estimated change in population between the surveys that was significantly larger than the known reduction. Although the deviation was only 9.1%, the precision estimate (CV = 1.6%) may have been artificially low. It was apparent that use of a helicopter in those surveys perturbed the horses, introducing detection error and heterogeneity in a manner that could not be corrected by our statistical models. Our results validate the hybrid method, highlight its potentially broad applicability, identify some limitations, and provide insight and guidance for improving survey designs.
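The core of the double-observer component can be conveyed by the classic closed-form estimator; the study's hybrid model adds sightability covariates and was fit jointly, so this sketch (with invented counts) only illustrates the idea:

```python
def double_observer_estimate(seen_1_only, seen_2_only, seen_both):
    """Detection probabilities and abundance from one double-observer count."""
    p1 = seen_both / (seen_both + seen_2_only)  # P(obs 1 detects | obs 2 detected)
    p2 = seen_both / (seen_both + seen_1_only)
    p_any = 1 - (1 - p1) * (1 - p2)             # P(detected by at least one)
    n_seen = seen_1_only + seen_2_only + seen_both
    return p1, p2, n_seen / p_any               # estimated true abundance

# Invented counts of horse groups detected during one survey
print(double_observer_estimate(seen_1_only=12, seen_2_only=9, seen_both=45))
```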
Effective dimension reduction for sparse functional data
YAO, F.; LEI, E.; WU, Y.
2015-01-01
We propose a method of effective dimension reduction for functional data, emphasizing the sparse design where one observes only a few noisy and irregular measurements for some or all of the subjects. The proposed method borrows strength across the entire sample and provides a way to characterize the effective dimension reduction space, via functional cumulative slicing. Our theoretical study reveals a bias-variance trade-off associated with the regularizing truncation and decaying structures of the predictor process and the effective dimension reduction space. A simulation study and an application illustrate the superior finite-sample performance of the method. PMID:26566293
Capiau, Sara; Wilk, Leah S; De Kesel, Pieter M M; Aalders, Maurice C G; Stove, Christophe P
2018-02-06
The hematocrit (Hct) effect is one of the most important hurdles currently preventing more widespread implementation of quantitative dried blood spot (DBS) analysis in a routine context. Indeed, the Hct may affect both the accuracy of DBS methods and the interpretation of DBS-based results. We previously developed a method to determine the Hct of a DBS based on its hemoglobin content using noncontact diffuse reflectance spectroscopy. Despite the ease with which the analysis can be performed (i.e., mere scanning of the DBS) and the good results that were obtained, the method did require a complicated algorithm to derive the total hemoglobin content from the DBS's reflectance spectrum. As the total hemoglobin was calculated as the sum of oxyhemoglobin, methemoglobin, and hemichrome, the three main hemoglobin derivatives formed in DBS upon aging, the reflectance spectrum needed to be unmixed to determine the quantity of each of these derivatives. We have now simplified the method by only using the reflectance at a single wavelength, located at a quasi-isosbestic point in the reflectance curve. At this wavelength, assuming 1-to-1 stoichiometry of the aging reaction, the reflectance is insensitive to the hemoglobin degradation and only scales with the total amount of hemoglobin and, hence, the Hct. This simplified method was successfully validated. At each quality control level, as well as at the limits of quantitation (i.e., 0.20 and 0.67), bias and intra- and interday imprecision were within 10%. Method reproducibility was excellent based on incurred sample reanalysis and surpassed the reproducibility of the original method. Furthermore, the influence of the volume spotted, the measurement location within the spot, as well as storage time and temperature were evaluated, showing no relevant impact of these parameters. Application to 233 patient samples revealed a good correlation between the Hct determined on whole blood and the predicted Hct determined on venous DBS. The bias obtained with Bland and Altman analysis was -0.015 and the limits of agreement were -0.061 and 0.031, indicating that the simplified, noncontact Hct prediction method even outperforms the original method. In addition, using caffeine as a model compound, it was demonstrated that this simplified Hct prediction method can effectively be used to implement a Hct-dependent correction factor to DBS-based results to alleviate the Hct bias.
Self-referent information processing in individuals with bipolar spectrum disorders
Molz Adams, Ashleigh; Shapero, Benjamin G.; Pendergast, Laura H.; Alloy, Lauren B.; Abramson, Lyn Y.
2014-01-01
Background Bipolar spectrum disorders (BSDs) are common and impairing, which has led to an examination of risk factors for their development and maintenance. Historically, research has examined cognitive vulnerabilities to BSDs derived largely from the unipolar depression literature. Specifically, theorists propose that dysfunctional information processing guided by negative self-schemata may be a risk factor for depression. However, few studies have examined whether BSD individuals also show self-referent processing biases. Methods This study examined self-referent information processing differences between 66 individuals with and 58 individuals without a BSD in a young adult sample (age M = 19.65, SD = 1.74; 62% female; 47% Caucasian). Repeated measures multivariate analysis of variance (MANOVA) was conducted to examine multivariate effects of BSD diagnosis on 4 self-referent processing variables (self-referent judgments, response latency, behavioral predictions, and recall) in response to depression-related and nondepression-related stimuli. Results Bipolar individuals endorsed and recalled more negative and fewer positive self-referent adjectives, as well as made more negative and fewer positive behavioral predictions. Many of these information-processing biases were partially, but not fully, mediated by depressive symptoms. Limitations Our sample was not a clinical or treatment-seeking sample, so we cannot generalize our results to clinical BSD samples. No participants had a bipolar I disorder at baseline. Conclusions This study provides further evidence that individuals with BSDs exhibit a negative self-referent information processing bias. This may mean that those with BSDs have selective attention and recall of negative information about themselves, highlighting the need for attention to cognitive biases in therapy. PMID:24074480
Data analysis strategies for reducing the influence of the bias in cross-cultural research.
Sindik, Josko
2012-03-01
In cross-cultural research, researchers have to adjust the constructs and associated measurement instruments that have been developed in one culture and then imported for use in another culture. Importing concepts from other cultures is often simply reduced to language adjustment of the content in the items of the measurement instruments that define a certain (psychological) construct. In the context of cross-cultural research, test bias can be defined as a generic term for all nuisance factors that threaten the validity of cross-cultural comparisons. Bias can be an indicator that instrument scores based on the same items measure different traits and characteristics across different cultural groups. To reduce construct, method, and item bias, the researcher can consider these strategies: (1) simply comparing average results in certain measuring instruments; (2) comparing only the reliability of certain dimensions of the measurement instruments, applied to the "target" and "source" samples of participants, i.e. from different cultures; (3) comparing the "framed" factor structure (fixed number of factors) of the measurement instruments, applied to the samples from the "target" and "source" cultures, using an explorative factor analysis strategy on separate samples; (4) comparing the complete constructs ("unframed" factor analysis, i.e. an unlimited number of factors) in relation to their best psychometric properties and interpretability (best suited to certain cultures, applying an explorative factor analysis strategy); or (5) checking the similarity of the constructs in the samples from different cultures (using a structural equation modeling approach). Each approach has its advantages and shortcomings, which are discussed.
Introducing etch kernels for efficient pattern sampling and etch bias prediction
NASA Astrophysics Data System (ADS)
Weisbuch, François; Lutich, Andrey; Schatz, Jirka
2018-01-01
Successful patterning requires good control of the photolithography and etch processes. While compact litho models, mainly based on rigorous physics, can predict very well the contours printed in photoresist, pure empirical etch models are less accurate and more unstable. Compact etch models are based on geometrical kernels to compute the litho-etch biases that measure the distance between litho and etch contours. The definition of the kernels, as well as the choice of calibration patterns, is critical to obtaining a robust etch model. This work proposes to define a set of independent and anisotropic etch kernels ("internal", "external", "curvature", "Gaussian", "z_profile") designed to represent the finest details of the resist geometry and to characterize precisely the etch bias at any point along a resist contour. By evaluating the etch kernels on various structures, it is possible to map their etch signatures in a multidimensional space and analyze them to find an optimal sampling of structures. The etch kernels evaluated on these structures were combined with experimental etch bias derived from scanning electron microscope contours to train artificial neural networks to predict etch bias. The method applied to contact and line/space layers shows an improvement in etch model prediction accuracy over the standard etch model. This work emphasizes the importance of the etch kernel definition for characterizing and predicting complex etch effects.
Alibay, Irfan; Burusco, Kepa K; Bruce, Neil J; Bryce, Richard A
2018-03-08
Determining the conformations accessible to carbohydrate ligands in aqueous solution is important for understanding their biological action. In this work, we evaluate the conformational free-energy surfaces of Lewis oligosaccharides in explicit aqueous solvent using a multidimensional variant of the swarm-enhanced sampling molecular dynamics (msesMD) method; we compare with multi-microsecond unbiased MD simulations, umbrella sampling, and accelerated MD approaches. For the sialyl Lewis A tetrasaccharide, msesMD simulations in aqueous solution predict conformer landscapes in general agreement with the other biased methods and with triplicate unbiased 10 μs trajectories; these simulations find a predominance of the closed conformer and a range of low-occupancy open forms. The msesMD simulations also suggest that closed-to-open transitions in the tetrasaccharide are facilitated by changes in ring puckering of its GlcNAc residue away from the ⁴C₁ form, in line with previous work. For sialyl Lewis X tetrasaccharide, msesMD simulations predict a minor population of an open form in solution corresponding to a rare lectin-bound pose observed crystallographically. Overall, from comparison with biased MD calculations, we find that triplicate 10 μs unbiased MD simulations may not be enough to fully sample glycan conformations in aqueous solution. However, the computational efficiency and intuitive approach of the msesMD method suggest potential for its application in glycomics as a tool for analysis of oligosaccharide conformation.
Garbarino, John R.
2000-01-01
Analysis of in-bottle digestate by using the inductively coupled plasma-mass spectrometric (ICP-MS) method has been expanded to include arsenic, boron, and vanadium. Whole-water samples are digested by using either the hydrochloric acid in-bottle digestion procedure or the nitric acid in-bottle digestion procedure. When the hydrochloric acid in-bottle digestion procedure is used, chloride must be removed from the digestate by subboiling evaporation before arsenic and vanadium can be accurately determined. Method detection limits for these elements are now 10 to 100 times lower than those of U.S. Geological Survey (USGS) methods using hydride generation-atomic absorption spectrophotometry (HG-AAS) and inductively coupled plasma-atomic emission spectrometry (ICP-AES), thus providing lower variability at ambient concentrations. The bias and variability of the methods were determined by using results from spike recoveries, standard reference materials, and validation samples. Spike recoveries in reagent-water, surface-water, ground-water, and whole-water recoverable matrices averaged 90 percent for seven replicates; spike recoveries were biased low by 25 to 35 percent for the ground-water matrix because of the abnormally high iron concentration. Results for reference material were within one standard deviation of the most probable value. There was no significant difference between the results from ICP-MS and HG-AAS or ICP-AES methods for the natural whole-water samples that were analyzed.
Inference with viral quasispecies diversity indices: clonal and NGS approaches.
Gregori, Josep; Salicrú, Miquel; Domingo, Esteban; Sanchez, Alex; Esteban, Juan I; Rodríguez-Frías, Francisco; Quer, Josep
2014-04-15
Given the inherent dynamics of a viral quasispecies, we are often interested in the comparison of diversity indices of sequential samples of a patient, or in the comparison of diversity indices of virus in groups of patients in a treated versus control design. It is then important to make sure that the diversity measures from each sample may be compared with no bias and within a consistent statistical framework. In the present report, we review some indices often used as measures for viral quasispecies complexity and provide means for statistical inference, applying procedures taken from the ecology field. In particular, we examine the Shannon entropy and the mutation frequency, and we discuss the appropriateness of different normalization methods of the Shannon entropy found in the literature. By taking amplicon ultra-deep pyrosequencing (UDPS) raw data as a surrogate of a real hepatitis C virus population, we study through in-silico sampling the statistical properties of these indices under two methods of viral quasispecies sampling, classical cloning followed by Sanger sequencing (CCSS) and next-generation sequencing (NGS) such as UDPS. We propose solutions specific to each of the two sampling methods, CCSS and NGS, to guarantee statistically conforming conclusions as free of bias as possible. Supplementary data are available at Bioinformatics online.
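Both complexity measures examined here reduce to short computations over a set of reads. A sketch, using one of several entropy normalizations the paper discusses (division by ln of the number of reads):

```python
import math
from collections import Counter

def shannon_entropy(reads):
    """Shannon entropy of haplotype frequencies, raw and normalized by ln(n)."""
    counts = Counter(reads)
    n = sum(counts.values())
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h, h / math.log(n)

def mutation_frequency(reads, master):
    """Fraction of sequenced positions differing from the master sequence."""
    total = sum(len(r) for r in reads)
    mutated = sum(a != b for r in reads for a, b in zip(r, master))
    return mutated / total

reads = ["ACGT", "ACGT", "ACGA", "ACTT"]  # toy amplicon reads
print(shannon_entropy(reads), mutation_frequency(reads, master="ACGT"))
```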
Barber, Jessica; Palmese, Laura; Reutenauer, Erin L.; Grilo, Carlos; Tek, Cenk
2011-01-01
Obesity has been associated with significant stigma and weight-related self-bias in community and clinical studies, but these issues have not been studied among individuals with schizophrenia. A consecutive series of 70 obese individuals with schizophrenia or schizoaffective disorder underwent assessment for perceptions of weight-based stigmatization, self-directed weight-bias, negative affect, medication compliance, and quality of life. Levels of weight-based stigmatization and self-bias were compared to levels reported for non-psychiatric overweight/obese samples. Weight measures were unrelated to stigma, self-bias, affect, and quality of life. Weight-based stigmatization was lower than published levels for non-psychiatric samples, whereas levels of weight-based self-bias did not differ. After controlling for negative affect, weight-based self-bias predicted an additional 11% of the variance in the quality of life measure. Individuals with schizophrenia and schizoaffective disorder reported weight-based self-bias to the same extent as non-psychiatric samples despite reporting less weight stigma. Weight-based self-bias was associated with poorer quality of life after controlling for negative affect. PMID:21716053
Starrfelt, Jostein; Liow, Lee Hsiang
2016-01-01
The fossil record is a rich source of information about biological diversity in the past. However, the fossil record is not only incomplete but also has inherent biases due to geological, physical, chemical and biological factors. Our knowledge of past life is also biased because of differences in academic and amateur interests and sampling efforts. As a result, not all individuals or species that lived in the past are equally likely to be discovered at any point in time or space. To reconstruct temporal dynamics of diversity using the fossil record, biased sampling must be explicitly taken into account. Here, we introduce an approach that uses the variation in the number of times each species is observed in the fossil record to estimate both sampling bias and true richness. We term our technique TRiPS (True Richness estimated using a Poisson Sampling model) and explore its robustness to violation of its assumptions via simulations. We then venture to estimate sampling bias and absolute species richness of dinosaurs in the geological stages of the Mesozoic. Using TRiPS, we estimate that 1936 (1543–2468) species of dinosaurs roamed the Earth during the Mesozoic. We also present improved estimates of species richness trajectories of the three major dinosaur clades: the sauropodomorphs, ornithischians and theropods, casting doubt on the Jurassic–Cretaceous extinction event and demonstrating that all dinosaur groups are subject to considerable sampling bias throughout the Mesozoic. PMID:26977060
Starrfelt, Jostein; Liow, Lee Hsiang
2016-04-05
The fossil record is a rich source of information about biological diversity in the past. However, the fossil record is not only incomplete but also has inherent biases due to geological, physical, chemical and biological factors. Our knowledge of past life is also biased because of differences in academic and amateur interests and sampling efforts. As a result, not all individuals or species that lived in the past are equally likely to be discovered at any point in time or space. To reconstruct temporal dynamics of diversity using the fossil record, biased sampling must be explicitly taken into account. Here, we introduce an approach that uses the variation in the number of times each species is observed in the fossil record to estimate both sampling bias and true richness. We term our technique TRiPS (True Richness estimated using a Poisson Sampling model) and explore its robustness to violation of its assumptions via simulations. We then venture to estimate sampling bias and absolute species richness of dinosaurs in the geological stages of the Mesozoic. Using TRiPS, we estimate that 1936 (1543-2468) species of dinosaurs roamed the Earth during the Mesozoic. We also present improved estimates of species richness trajectories of the three major dinosaur clades: the sauropodomorphs, ornithischians and theropods, casting doubt on the Jurassic-Cretaceous extinction event and demonstrating that all dinosaur groups are subject to considerable sampling bias throughout the Mesozoic.
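A stripped-down version of the Poisson sampling idea: treat per-species occurrence counts as zero-truncated Poisson, solve for the sampling rate, and inflate observed richness by the implied detection probability. TRiPS itself works per geological stage and adds confidence intervals; the counts below are invented:

```python
import numpy as np
from scipy.optimize import brentq

def poisson_sampling_richness(occurrences):
    """Estimate true richness from per-species occurrence counts.

    Counts are modeled as zero-truncated Poisson: solve the MLE for the
    sampling rate lam, then divide observed richness by the detection
    probability 1 - exp(-lam). Requires a mean count above 1.
    """
    k = np.asarray(occurrences, float)
    lam = brentq(lambda l: l / (1 - np.exp(-l)) - k.mean(), 1e-9, 100.0)
    p_detect = 1 - np.exp(-lam)
    return len(k) / p_detect

# Invented occurrence counts for eight observed species
print(poisson_sampling_richness([1, 1, 2, 1, 3, 1, 1, 2]))
```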
Smooth quantile normalization.
Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N; Quackenbush, John; Irizarry, Rafael A; Bravo, Héctor Corrada
2018-04-01
Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and are unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced by only technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
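A toy version of the qsmooth idea: each sample's sorted values are mapped to a compromise between the overall reference quantiles and its biological group's quantiles. In qsmooth the mixing weight is estimated per quantile from between- and within-group variability; the fixed weight here is purely illustrative:

```python
import numpy as np

def qsmooth_like(X, groups, w=0.5):
    """Toy smooth quantile normalization (features x samples matrix X).

    Sorted values in each sample are replaced by a weighted mix of the
    overall mean quantiles and the sample's group mean quantiles. The
    fixed weight w is illustrative; qsmooth estimates it per quantile.
    """
    order = np.argsort(X, axis=0)         # ranks of each sample's values
    Xs = np.sort(X, axis=0)               # per-sample sorted values
    q_all = Xs.mean(axis=1)               # overall reference quantiles
    out = np.empty_like(X, dtype=float)
    for g in np.unique(groups):
        cols = np.where(groups == g)[0]
        q_grp = Xs[:, cols].mean(axis=1)  # group-specific quantiles
        target = w * q_all + (1 - w) * q_grp
        for c in cols:
            out[order[:, c], c] = target
    return out

X = np.array([[5.0, 4.0, 10.0, 12.0],
              [1.0, 2.0, 3.0, 2.0],
              [7.0, 8.0, 20.0, 18.0]])
print(qsmooth_like(X, groups=np.array([0, 0, 1, 1])))
```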
Sources of method bias in social science research and recommendations on how to control it.
Podsakoff, Philip M; MacKenzie, Scott B; Podsakoff, Nathan P
2012-01-01
Despite the concern that has been expressed about potential method biases, and the pervasiveness of research settings with the potential to produce them, there is disagreement about whether they really are a problem for researchers in the behavioral sciences. Therefore, the purpose of this review is to explore the current state of knowledge about method biases. First, we explore the meaning of the terms "method" and "method bias" and then we examine whether method biases influence all measures equally. Next, we review the evidence of the effects that method biases have on individual measures and on the covariation between different constructs. Following this, we evaluate the procedural and statistical remedies that have been used to control method biases and provide recommendations for minimizing method bias.
Parameter estimation for groundwater models under uncertain irrigation data
Demissie, Yonas; Valocchi, Albert J.; Cai, Ximing; Brozovic, Nicholas; Senay, Gabriel; Gebremichael, Mekonnen
2015-01-01
The success of modeling groundwater is strongly influenced by the accuracy of the model parameters that are used to characterize the subsurface system. However, the presence of uncertainty and possibly bias in groundwater model source/sink terms may lead to biased estimates of model parameters and model predictions when the standard regression-based inverse modeling techniques are used. This study first quantifies the levels of bias in groundwater model parameters and predictions due to the presence of errors in irrigation data. Then, a new inverse modeling technique called input uncertainty weighted least-squares (IUWLS) is presented for unbiased estimation of the parameters when pumping and other source/sink data are uncertain. The approach uses the concept of the generalized least-squares method, with the weight of the objective function depending on the level of pumping uncertainty and iteratively adjusted during the parameter optimization process. We conducted both analytical and numerical experiments, using irrigation pumping data from the Republican River Basin in Nebraska, to evaluate the performance of ordinary least-squares (OLS) and IUWLS calibration methods under different levels of uncertainty of irrigation data and calibration conditions. The result from the OLS method shows the presence of statistically significant (p < 0.05) bias in estimated parameters and model predictions that persists despite calibrating the models to different calibration data and sample sizes. However, by directly accounting for the irrigation pumping uncertainties during the calibration procedures, the proposed IUWLS is able to minimize the bias effectively without adding significant computational burden to the calibration processes.
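Schematically, IUWLS can be read as a generalized least-squares fit whose weight matrix reflects pumping uncertainty (generic notation, not the paper's exact symbols):

```latex
\hat{\theta} \;=\; \arg\min_{\theta}\;
  \bigl[\, y - f(\theta, q) \,\bigr]^{\mathsf{T}}\, W(\sigma_{q})\,
  \bigl[\, y - f(\theta, q) \,\bigr]
```

where y are the head observations, f(θ, q) is the groundwater model driven by the uncertain pumping record q, and W(σ_q) down-weights the residuals most affected by pumping uncertainty, being re-evaluated as the optimization iterates.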
Electric Field-aided Selective Activation for Indium-Gallium-Zinc-Oxide Thin Film Transistors
NASA Astrophysics Data System (ADS)
Lee, Heesoo; Chang, Ki Soo; Tak, Young Jun; Jung, Tae Soo; Park, Jeong Woo; Kim, Won-Gi; Chung, Jusung; Jeong, Chan Bae; Kim, Hyun Jae
2016-10-01
A new technique is proposed for the activation of low-temperature amorphous InGaZnO thin film transistor (a-IGZO TFT) backplanes through the simultaneous application of a bias voltage and annealing at 130 °C. In this 'electrical activation', the effects of annealing under bias are selectively focused in the channel region. Therefore, electrical activation can be an effective method for lowering backplane processing temperatures from 280 °C to 130 °C. Devices fabricated with this method exhibit electrical properties equivalent to those of conventionally fabricated samples. These results are analyzed electrically and thermodynamically using infrared microthermography. Various bias voltages are applied to the gate, source, and drain electrodes while samples are annealed at 130 °C for 1 hour. Without conventional high-temperature annealing or electrical activation, current-voltage curves do not show transfer characteristics. However, electrically activated a-IGZO TFTs show superior electrical characteristics, comparable to the reference TFTs annealed at 280 °C for 1 hour. This effect is a result of the lower activation energy and the efficient transfer of electrical and thermal energy to the a-IGZO TFTs. With this approach, superior low-temperature a-IGZO TFTs are fabricated successfully.
Electric Field-aided Selective Activation for Indium-Gallium-Zinc-Oxide Thin Film Transistors
Lee, Heesoo; Chang, Ki Soo; Tak, Young Jun; Jung, Tae Soo; Park, Jeong Woo; Kim, Won-Gi; Chung, Jusung; Jeong, Chan Bae; Kim, Hyun Jae
2016-01-01
A new technique is proposed for the activation of low-temperature amorphous InGaZnO thin film transistor (a-IGZO TFT) backplanes through the simultaneous application of a bias voltage and annealing at 130 °C. In this 'electrical activation', the effects of annealing under bias are selectively focused in the channel region. Therefore, electrical activation can be an effective method for lowering backplane processing temperatures from 280 °C to 130 °C. Devices fabricated with this method exhibit electrical properties equivalent to those of conventionally fabricated samples. These results are analyzed electrically and thermodynamically using infrared microthermography. Various bias voltages are applied to the gate, source, and drain electrodes while samples are annealed at 130 °C for 1 hour. Without conventional high-temperature annealing or electrical activation, current-voltage curves do not show transfer characteristics. However, electrically activated a-IGZO TFTs show superior electrical characteristics, comparable to the reference TFTs annealed at 280 °C for 1 hour. This effect is a result of the lower activation energy and the efficient transfer of electrical and thermal energy to the a-IGZO TFTs. With this approach, superior low-temperature a-IGZO TFTs are fabricated successfully. PMID:27725695
Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules
NASA Astrophysics Data System (ADS)
Hamelberg, Donald; Mongan, John; McCammon, J. Andrew
2004-06-01
Many interesting dynamic properties of biological molecules cannot be simulated directly using molecular dynamics because of nanosecond time scale limitations. These systems are trapped in potential energy minima with high free energy barriers for large numbers of computational steps. The dynamic evolution of many molecular systems occurs through a series of rare events as the system moves from one potential energy basin to another. Therefore, we have proposed a robust bias potential function that can be used in an efficient accelerated molecular dynamics approach to simulate the transition of high energy barriers without any advance knowledge of the location of either the potential energy wells or saddle points. In this method, the potential energy landscape is altered by adding a bias potential to the true potential such that the escape rates from potential wells are enhanced, which accelerates and extends the time scale in molecular dynamics simulations. Our definition of the bias potential echoes the underlying shape of the potential energy landscape on the modified surface, thus allowing for the potential energy minima to be well defined, and hence properly sampled during the simulation. We have shown that our approach, which can be extended to biomolecules, samples the conformational space more efficiently than normal molecular dynamics simulations, and converges to the correct canonical distribution.
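The bias potential proposed here is commonly written with a threshold ("boost") energy E and an acceleration parameter α that controls how closely the modified surface echoes the shape of the underlying minima:

```latex
V^{*}(\mathbf{r}) =
\begin{cases}
  V(\mathbf{r}), & V(\mathbf{r}) \ge E, \\[6pt]
  V(\mathbf{r}) + \dfrac{\bigl(E - V(\mathbf{r})\bigr)^{2}}{\alpha + E - V(\mathbf{r})}, & V(\mathbf{r}) < E.
\end{cases}
```

Canonical averages are recovered by reweighting each sampled configuration by exp(ΔV(r)/k_BT), where ΔV(r) is the boost applied at that configuration.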
Adaptive Landscape Flattening Accelerates Sampling of Alchemical Space in Multisite λ Dynamics.
Hayes, Ryan L; Armacost, Kira A; Vilseck, Jonah Z; Brooks, Charles L
2017-04-20
Multisite λ dynamics (MSλD) is a powerful emerging method in free energy calculation that allows prediction of relative free energies for a large set of compounds from very few simulations. Calculating free energy differences between substituents that constitute large volume or flexibility jumps in chemical space is difficult for free energy methods in general, and for MSλD in particular, due to large free energy barriers in alchemical space. This study demonstrates that a simple biasing potential can flatten these barriers and introduces an algorithm that determines system specific biasing potential coefficients. Two sources of error, deep traps at the end points and solvent disruption by hard-core potentials, are identified. Both scale with the size of the perturbed substituent and are removed by sharp biasing potentials and a new soft-core implementation, respectively. MSλD with landscape flattening is demonstrated on two sets of molecules: derivatives of the heat shock protein 90 inhibitor geldanamycin and derivatives of benzoquinone. In the benzoquinone system, landscape flattening leads to 2 orders of magnitude improvement in transition rates between substituents and robust solvation free energies. Landscape flattening opens up new applications for MSλD by enabling larger chemical perturbations to be sampled with improved precision and accuracy.
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier that deals with imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of means and covariance matrices of the classes from the training data samples, and achieves promising performance. In this paper, we develop a novel and critical extension of the training algorithm for BMPM, based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivation of the biased classification model, and reformulate it into an SOCP problem that can be solved efficiently with a guarantee of global optimality. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme in comparison with traditional solutions on medical diagnosis tasks where the objectives are to focus on improving the sensitivity (the accuracy of the more important class, say "ill" samples) instead of the overall accuracy of the classification. Empirical results show that our method is more effective and robust in handling imbalanced classification problems than traditional classification approaches and than the original Fractional Programming-based BMPM (BMPMFP).
A minimalist approach to bias estimation for passive sensor measurements with targets of opportunity
NASA Astrophysics Data System (ADS)
Belfadel, Djedjiga; Osborne, Richard W.; Bar-Shalom, Yaakov
2013-09-01
In order to carry out data fusion, registration error correction is crucial in multisensor systems. This requires estimation of the sensor measurement biases. It is important to correct for these bias errors so that the multiple sensor measurements and/or tracks can be referenced as accurately as possible to a common tracking coordinate system. This paper provides a solution for bias estimation for the minimum number of passive sensors (two), when only targets of opportunity are available. The sensor measurements are assumed time-coincident (synchronous) and perfectly associated. Since these sensors provide only line of sight (LOS) measurements, the formation of a single composite Cartesian measurement obtained from fusing the LOS measurements from different sensors is needed to avoid the need for nonlinear filtering. We evaluate the Cramer-Rao Lower Bound (CRLB) on the covariance of the bias estimate, i.e., the quantification of the available information about the biases. Statistical tests on the results of simulations show that this method is statistically efficient, even for small sample sizes (as few as two sensors and six points on the trajectory of a single target of opportunity). We also show that the RMS position error is significantly improved with bias estimation compared with the target position estimation using the original biased measurements.
Sampling Biases in MODIS and SeaWiFS Ocean Chlorophyll Data
NASA Technical Reports Server (NTRS)
Gregg, Watson W.; Casey, Nancy W.
2007-01-01
Although modern ocean color sensors, such as MODIS and SeaWiFS, are often considered global missions, in reality it takes many days, even months, to sample the ocean surface enough to provide complete global coverage. The irregular temporal sampling of ocean color sensors can produce biases in monthly and annual mean chlorophyll estimates. We quantified the biases due to sampling using data assimilation to create a "truth field", which we then sub-sampled using the observational patterns of MODIS and SeaWiFS. Monthly and annual mean chlorophyll estimates from these sub-sampled, incomplete daily fields were constructed and compared to monthly and annual means from the complete daily fields of the assimilation model, at a spatial resolution of 1.25° longitude by 0.67° latitude. The results showed that global annual mean biases were positive, reaching nearly 8% (MODIS) and >5% (SeaWiFS). For perspective, the maximum interannual variability in the SeaWiFS chlorophyll record was about 3%. Annual mean sampling biases were low (<3%) in the midlatitudes (between -40° and 40°). Low interannual variability in the global annual mean sampling biases suggested that global-scale trend analyses were valid. High-latitude biases were much higher than the global annual means, up to 20% as a basin annual mean, and over 80% in some months. This was the result of the high solar zenith angle exclusion in the processing algorithms. Only data where the solar zenith angle is <75° are permitted, in contrast to the assimilation, which samples regularly over the entire area and month. High solar zenith angles do not facilitate phytoplankton photosynthesis, and consequently the low chlorophyll concentrations occurring there are missed by the data sets. Ocean color sensors selectively sample in locations and times of favorable phytoplankton growth, producing overestimates of chlorophyll. The biases derived from lack of sampling in the high latitudes varied monthly, leading to artifacts in the apparent seasonal cycle from ocean color sensors. A false secondary peak in chlorophyll occurred in May-August, which resulted from lack of sampling in the Antarctic.
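The sub-sampling experiment is easy to emulate in miniature: build a complete "truth" field, remove observations with a coverage mask that favors high-chlorophyll conditions (a stand-in for the solar zenith angle exclusion), and compare monthly means. Grid size, coverage probabilities, and the lognormal field are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Complete "truth": daily chlorophyll for one month on a small grid (day, lat, lon)
truth = rng.lognormal(mean=-1.0, sigma=0.5, size=(30, 20, 20))

# Coverage correlated with conditions: cells more likely to be observed when
# chlorophyll is high, mimicking sampling that favors phytoplankton growth
p_seen = np.clip(0.3 + 0.5 * truth / truth.max(), 0.0, 1.0)
sampled = np.where(rng.random(truth.shape) < p_seen, truth, np.nan)

true_monthly = truth.mean(axis=0)
sensor_monthly = np.nanmean(sampled, axis=0)

bias_pct = 100.0 * (sensor_monthly - true_monthly) / true_monthly
print(f"mean sampling bias: {np.nanmean(bias_pct):+.1f}%")  # positive: overestimate
```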
Mannion, Philip D; Upchurch, Paul; Carrano, Matthew T; Barrett, Paul M
2011-02-01
The accurate reconstruction of palaeobiodiversity patterns is central to a detailed understanding of the macroevolutionary history of a group of organisms. However, there is increasing evidence that diversity patterns observed directly from the fossil record are strongly influenced by fluctuations in the quality of our sampling of the rock record; thus, any patterns we see may reflect sampling biases, rather than genuine biological signals. Previous dinosaur diversity studies have suggested that fluctuations in sauropodomorph palaeobiodiversity reflect genuine biological signals, in comparison to theropods and ornithischians whose diversity seems to be largely controlled by the rock record. Most previous diversity analyses that have attempted to take into account the effects of sampling biases have used only a single method or proxy: here we use a number of techniques in order to elucidate diversity. A global database of all known sauropodomorph body fossil occurrences (2024) was constructed. A taxic diversity curve for all valid sauropodomorph genera was extracted from this database and compared statistically with several sampling proxies (rock outcrop area and dinosaur-bearing formations and collections), each of which captures a different aspect of fossil record sampling. Phylogenetic diversity estimates, residuals and sample-based rarefaction (including the first attempt to capture 'cryptic' diversity in dinosaurs) were implemented to investigate further the effects of sampling. After 'removal' of biases, sauropodomorph diversity appears to be genuinely high in the Norian, Pliensbachian-Toarcian, Bathonian-Callovian and Kimmeridgian-Tithonian (with a small peak in the Aptian), whereas low diversity levels are recorded for the Oxfordian and Berriasian-Barremian, with the Jurassic/Cretaceous boundary seemingly representing a real diversity trough. Observed diversity in the remaining Triassic-Jurassic stages appears to be largely driven by sampling effort. Late Cretaceous diversity is difficult to elucidate and it is possible that this interval remains relatively under-sampled. Despite its distortion by sampling biases, much of sauropodomorph palaeobiodiversity can be interpreted as a reflection of genuine biological signals, and fluctuations in sea level may account for some of these diversity patterns. © 2010 The Authors. Biological Reviews © 2010 Cambridge Philosophical Society.
Maloney, T.J.; Ludtke, A.S.; Krizman, T.L.
1994-01-01
The U.S. Geological Survey operates a quality-assurance program based on the analyses of reference samples for the National Water Quality Laboratory in Arvada, Colorado, and the Quality of Water Service Unit in Ocala, Florida. Reference samples containing selected inorganic, nutrient, and low ionic-strength constituents are prepared and disguised as routine samples. The program goal is to determine precision and bias for as many analytical methods offered by the participating laboratories as possible. The samples typically are submitted at a rate of approximately 5 percent of the annual environmental sample load for each constituent. The samples are distributed to the laboratories throughout the year. Analytical data for these reference samples reflect the quality of environmental sample data produced by the laboratories because the samples are processed in the same manner for all steps from sample login through data release. The results are stored permanently in the National Water Data Storage and Retrieval System. During water year 1991, 86 analytical procedures were evaluated at the National Water Quality Laboratory and 37 analytical procedures were evaluated at the Quality of Water Service Unit. An overall evaluation of the inorganic (major ion and trace metal) constituent data for water year 1991 indicated analytical imprecision in the National Water Quality Laboratory for 5 of 67 analytical procedures: aluminum (whole-water recoverable, atomic emission spectrometric, direct-current plasma); calcium (atomic emission spectrometric, direct); fluoride (ion-exchange chromatographic); iron (whole-water recoverable, atomic absorption spectrometric, direct); and sulfate (ion-exchange chromatographic). The results for 11 of 67 analytical procedures had positive or negative bias during water year 1991. Analytical imprecision was indicated in the determination of two of the five National Water Quality Laboratory nutrient constituents: orthophosphate as phosphorus and phosphorus. A negative or positive bias condition was indicated in three of five nutrient constituents. There was acceptable precision and no indication of bias for the 14 low ionic-strength analytical procedures tested in the National Water Quality Laboratory program and for the 32 inorganic and 5 nutrient analytical procedures tested in the Quality of Water Service Unit during water year 1991.
Explanation of Two Anomalous Results in Statistical Mediation Analysis.
Fritz, Matthew S; Taylor, Aaron B; Mackinnon, David P
2012-01-01
Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
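The bias-corrected (BC) bootstrap test at issue can be sketched as follows; the scenario (medium a path, b = 0, small n) mirrors the condition the abstract flags, but the data, sample size, and helper names are invented for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

def ab_hat(x, m, y):
    """OLS slopes: a from m ~ x, b from y ~ x + m; mediated effect = a*b."""
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]
    return a * b

# Null scenario: medium-size a path, b = 0, small sample.
n, B = 50, 2000
x = rng.standard_normal(n)
m = 0.39 * x + rng.standard_normal(n)     # a = 0.39 (medium), b = 0
y = rng.standard_normal(n)

est = ab_hat(x, m, y)
boot = np.empty(B)
for i in range(B):
    idx = rng.integers(0, n, n)
    boot[i] = ab_hat(x[idx], m[idx], y[idx])

# Bias-corrected interval: shift the percentile endpoints by z0.
z0 = norm.ppf((boot < est).mean())
lo, hi = norm.cdf(2 * z0 + norm.ppf([0.025, 0.975]))
ci = np.quantile(boot, [lo, hi])
print(f"ab = {est:.3f}, BC 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
# A Type I error occurs whenever this null-scenario CI excludes zero.
```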
Adaptable gene-specific dye bias correction for two-channel DNA microarrays.
Margaritis, Thanasis; Lijnzaad, Philip; van Leenen, Dik; Bouwmeester, Diane; Kemmeren, Patrick; van Hooff, Sander R; Holstege, Frank C P
2009-01-01
DNA microarray technology is a powerful tool for monitoring gene expression or for finding the location of DNA-bound proteins. DNA microarrays can suffer from gene-specific dye bias (GSDB), causing some probes to be affected more by the dye than by the sample. This results in large measurement errors, which vary considerably for different probes and also across different hybridizations. GSDB is not corrected by conventional normalization and has been difficult to address systematically because of its variance. We show that GSDB is influenced by label incorporation efficiency, explaining the variation of GSDB across different hybridizations. A correction method (Gene- And Slide-Specific Correction, GASSCO) is presented, whereby sequence-specific corrections are modulated by the overall bias of individual hybridizations. GASSCO outperforms earlier methods and works well on a variety of publicly available datasets covering a range of platforms, organisms and applications, including ChIP on chip. A sequence-based model is also presented, which predicts which probes will suffer most from GSDB, useful for microarray probe design and correction of individual hybridizations. Software implementing the method is publicly available.
Bias correction of satellite-based rainfall data
NASA Astrophysics Data System (ADS)
Bhattacharya, Biswa; Solomatine, Dimitri
2015-04-01
Limited hydro-meteorological data availability in many catchments restricts the possibility of reliable hydrological analyses, especially for near-real-time predictions. However, the variety of satellite-based and meteorological model products for rainfall provides new opportunities. Often, the accuracy of these rainfall products, when compared to rain gauge measurements, is not impressive. The systematic differences of these rainfall products from gauge observations can be partially compensated by adopting a bias (error) correction. Many such methods correct the satellite-based rainfall data by comparing their mean value to the mean value of rain gauge data. Refined approaches may first identify a suitable time scale at which different data products are better comparable and then employ a bias correction at that time scale. More elegant methods use quantile-to-quantile bias correction, which, however, assumes that the available (often limited) sample size is sufficient for comparing probabilities of different rainfall products. Analysis of rainfall data and understanding of the process of its generation reveal that the bias in different rainfall data varies in space and time. The time aspect is sometimes taken into account by considering the seasonality. In this research we have adopted a bias correction approach that takes into account the variation of rainfall in space and time. A clustering-based approach is employed in which every new data point (e.g., of the Tropical Rainfall Measuring Mission (TRMM)) is first assigned to a specific cluster of that data product and then, by identifying the corresponding cluster of gauge data, the bias correction specific to that cluster is adopted. The presented approach considers the space-time variation of rainfall and as a result the corrected data are more realistic. Keywords: bias correction, rainfall, TRMM, satellite rainfall
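A minimal empirical quantile-to-quantile correction of the kind the abstract contrasts with its clustered space-time approach might look like this (synthetic gauge and satellite series; quantile_map is an illustrative helper, not the authors' code):

```python
import numpy as np

def quantile_map(sat_train, gauge_train, new_sat):
    """Empirical quantile mapping: send each new satellite value to the gauge
    value at the same empirical quantile."""
    sat_sorted = np.sort(sat_train)
    # Quantile of each new value within the satellite training sample...
    q = np.clip(np.searchsorted(sat_sorted, new_sat) / len(sat_sorted), 0, 1)
    # ...mapped onto the gauge distribution.
    return np.quantile(np.sort(gauge_train), q)

rng = np.random.default_rng(3)
gauge = rng.gamma(2.0, 5.0, 1000)              # "true" rainfall (mm/day)
sat = 0.7 * gauge + rng.normal(0, 2, 1000)     # biased, noisy satellite product

corrected = quantile_map(sat, gauge, sat)
print(f"means -- gauge: {gauge.mean():.1f}, sat: {sat.mean():.1f}, "
      f"corrected: {corrected.mean():.1f}")
```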
K.P. Poudel; H. Temesgen
2016-01-01
Estimating aboveground biomass and its components requires sound statistical formulation and evaluation. Using data collected from 55 destructively sampled trees in different parts of Oregon, we evaluated the performance of three groups of methods to estimate total aboveground biomass and (or) its components based on the bias and root mean squared error (RMSE) that...
An Analysis of Methods Used To Reduce Nonresponse Bias in Survey Research.
ERIC Educational Resources Information Center
Johnson, Victoria A.
The effectiveness of five methods used to estimate the population parameters of a variable of interest from a random sample in the presence of non-response to mail surveys was tested in conditions that vary the return rate and the relationship of the variable of interest to the likelihood of response. Data from 125,092 adult Alabama residents in…
ERIC Educational Resources Information Center
Longford, Nicholas T.
Large scale surveys usually employ a complex sampling design and as a consequence, no standard methods for estimation of the standard errors associated with the estimates of population means are available. Resampling methods, such as jackknife or bootstrap, are often used, with reference to their properties of robustness and reduction of bias. A…
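The record above is truncated, but the resampling idea it mentions is standard; a delete-one jackknife standard error, on synthetic data, can be sketched as:

```python
import numpy as np

def jackknife_se(data, stat=np.mean):
    """Delete-one jackknife standard error of a statistic."""
    n = len(data)
    reps = np.array([stat(np.delete(data, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((reps - reps.mean())**2))

rng = np.random.default_rng(4)
sample = rng.exponential(2.0, 200)
print(f"mean = {sample.mean():.3f}, jackknife SE = {jackknife_se(sample):.3f}")
print(f"analytic SE = {sample.std(ddof=1)/np.sqrt(len(sample)):.3f}")
```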
Hierarchical spatial models of abundance and occurrence from imperfect survey data
Royle, J. Andrew; Kery, M.; Gautier, R.; Schmid, Hans
2007-01-01
Many estimation and inference problems arising from large-scale animal surveys are focused on developing an understanding of patterns in abundance or occurrence of a species based on spatially referenced count data. One fundamental challenge, then, is that it is generally not feasible to completely enumerate ('census') all individuals present in each sample unit. This observation bias may consist of several components, including spatial coverage bias (not all individuals in the population are exposed to sampling) and detection bias (exposed individuals may go undetected). Thus, observations are biased for the state variable (abundance, occupancy) that is the object of inference. Moreover, data are often sparse for most observation locations, requiring consideration of methods for spatially aggregating or otherwise combining sparse data among sample units. The development of methods that unify spatial statistical models with models accommodating non-detection is necessary to resolve important spatial inference problems based on animal survey data. In this paper, we develop a novel hierarchical spatial model for estimation of abundance and occurrence from survey data wherein detection is imperfect. Our application is focused on spatial inference problems in the Swiss Survey of Common Breeding Birds. The observation model for the survey data is specified conditional on the unknown quadrat population size, N(s). We augment the observation model with a spatial process model for N(s), describing the spatial variation in abundance of the species. The model includes explicit sources of variation in habitat structure (forest, elevation) and latent variation in the form of a correlated spatial process. This provides a model-based framework for combining the spatially referenced samples while at the same time yielding a unified treatment of estimation problems involving both abundance and occurrence. We provide a Bayesian framework for analysis and prediction based on the integrated likelihood, and we use the model to obtain estimates of abundance and occurrence maps for the European Jay (Garrulus glandarius), a widespread, elusive, forest bird. The naive national abundance estimate ignoring imperfect detection and incomplete quadrat coverage was 77 766 territories. Accounting for imperfect detection added approximately 18 000 territories, and adjusting for coverage bias added another 131 000 territories to yield a fully corrected estimate of the national total of about 227 000 territories. This is approximately three times as high as previous estimates that assume every territory is detected in each quadrat.
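The binomial-Poisson mixture at the core of such hierarchical abundance models (without the paper's spatial random effects and habitat covariates) can be sketched as a maximum-likelihood fit to simulated repeated counts; all values are synthetic:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson, binom

rng = np.random.default_rng(5)
R, T, lam_true, p_true = 150, 3, 4.0, 0.5
N = rng.poisson(lam_true, R)                     # latent abundance per site
y = rng.binomial(N[:, None], p_true, (R, T))     # repeated counts, imperfect detection

K = 50  # truncation point for the latent-abundance sum

def negloglik(theta):
    lam, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
    ns = np.arange(K + 1)
    prior = poisson.pmf(ns, lam)
    ll = 0.0
    for i in range(R):
        # Sum the binomial likelihood over possible latent N, weighted by the prior.
        like_n = prior * np.prod(binom.pmf(y[i][:, None], ns[None, :], p), axis=0)
        ll += np.log(like_n.sum())
    return -ll

fit = minimize(negloglik, x0=[np.log(2.0), 0.0], method="Nelder-Mead")
lam_hat = np.exp(fit.x[0]); p_hat = 1 / (1 + np.exp(-fit.x[1]))
naive = y.max(axis=1).mean()   # ignores detection entirely
print(f"lambda_hat = {lam_hat:.2f} (true 4.0), p_hat = {p_hat:.2f} (true 0.5)")
print(f"naive mean max-count = {naive:.2f}  <- underestimates abundance")
```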
Blinded and unblinded internal pilot study designs for clinical trials with count data.
Schneider, Simon; Schmidli, Heinz; Friede, Tim
2013-07-01
Internal pilot studies are a popular design feature to address uncertainties in the sample size calculations caused by vague information on nuisance parameters. Despite their popularity, only very recently blinded sample size reestimation procedures for trials with count data were proposed and their properties systematically investigated. Although blinded procedures are favored by regulatory authorities, practical application is somewhat limited by fears that blinded procedures are prone to bias if the treatment effect was misspecified in the planning. Here, we compare unblinded and blinded procedures with respect to bias, error rates, and sample size distribution. We find that both procedures maintain the desired power and that the unblinded procedure is slightly liberal whereas the actual significance level of the blinded procedure is close to the nominal level. Furthermore, we show that in situations where uncertainty about the assumed treatment effect exists, the blinded estimator of the control event rate is biased in contrast to the unblinded estimator, which results in differences in mean sample sizes in favor of the unblinded procedure. However, these differences are rather small compared to the deviations of the mean sample sizes from the sample size required to detect the true, but unknown effect. We demonstrate that the variation of the sample size resulting from the blinded procedure is in many practically relevant situations considerably smaller than the one of the unblinded procedures. The methods are extended to overdispersed counts using a quasi-likelihood approach and are illustrated by trials in relapsing multiple sclerosis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sampling bias in blending validation and a different approach to homogeneity assessment.
Kraemer, J; Svensson, J R; Melgaard, H
1999-02-01
Sampling of batches studied for validation is reported. A thief particularly suited for granules, rather than cohesive powders, was used in the study. It is shown, as has been demonstrated in the past, that traditional 1x to 3x thief sampling of a blend is biased, and that the bias decreases as the sample size increases. It is shown that taking 50 samples of tablets after blending and testing this subpopulation for normality is a discriminating manner of testing for homogeneity. As a criterion, it is better than sampling at mixer or drum stage would be even if an unbiased sampling device were available.
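A sketch of the homogeneity check proposed above (50 post-blend tablets tested for normality); the potency values and decision threshold are invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# 50 tablets drawn after blending, potency as percent of label claim (synthetic).
well_blended = rng.normal(100.0, 1.5, 50)
poorly_blended = np.concatenate([rng.normal(97.0, 1.5, 25),
                                 rng.normal(103.0, 1.5, 25)])  # bimodal mix

for name, tablets in [("well blended", well_blended),
                      ("poorly blended", poorly_blended)]:
    _, pval = stats.shapiro(tablets)
    verdict = "consistent with" if pval > 0.05 else "rejects"
    print(f"{name}: Shapiro-Wilk p = {pval:.3f} ({verdict} normality)")
```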
Inferred Eccentricity and Period Distributions of Kepler Eclipsing Binaries
NASA Astrophysics Data System (ADS)
Prsa, Andrej; Matijevic, G.
2014-01-01
Determining the underlying eccentricity and orbital period distributions from an observed sample of eclipsing binary stars is not a trivial task. Shen and Turner (2008) have shown that the commonly used maximum likelihood estimators are biased to larger eccentricities and do not describe the underlying distribution correctly; orbital periods suffer from a similar bias. Hogg, Myers and Bovy (2010) proposed a hierarchical probabilistic method for inferring the true eccentricity distribution of exoplanet orbits that uses the likelihood functions for individual star eccentricities. The authors show that proper inference outperforms the simple histogramming of the best-fit eccentricity values. We apply this method to the complete sample of eclipsing binary stars observed by the Kepler mission (Prsa et al. 2011) to derive the unbiased underlying eccentricity and orbital period distributions. These distributions can be used for studies of multiple star formation and dynamical evolution, and they can serve as a drop-in replacement for the prior, ad-hoc distributions used in the exoplanet field for determining false positive occurrence rates.
Cheng, Dunlei; Branscum, Adam J; Stamey, James D
2010-07-01
To quantify the impact of ignoring misclassification of a response variable and measurement error in a covariate on statistical power, and to develop software for sample size and power analysis that accounts for these flaws in epidemiologic data. A Monte Carlo simulation-based procedure is developed to illustrate the differences in design requirements and inferences between analytic methods that properly account for misclassification and measurement error and those that do not in regression models for cross-sectional and cohort data. We found that failure to account for these flaws in epidemiologic data can lead to a substantial reduction in statistical power, over 25% in some cases. The proposed method substantially reduced bias by up to a ten-fold margin compared to naive estimates obtained by ignoring misclassification and mismeasurement. We recommend as routine practice that researchers account for errors in measurement of both response and covariate data when determining sample size, performing power calculations, or analyzing data from epidemiological studies. 2010 Elsevier Inc. All rights reserved.
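A hedged sketch of such a Monte Carlo power calculation, here for a logistic model with a misclassified binary response; the sensitivity/specificity values, sample size, and helper name are illustrative, not the paper's settings:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def power(n, beta1, sens, spec, sims=500):
    """Monte Carlo power for the x-coefficient in a logistic model when the
    binary response is misclassified with given sensitivity/specificity."""
    rejections = 0
    for _ in range(sims):
        x = rng.standard_normal(n)
        p = 1 / (1 + np.exp(-(-0.5 + beta1 * x)))
        y = rng.random(n) < p
        # Misclassify the response: keep 1s with prob=sens, 0s with prob=spec.
        flip = np.where(y, rng.random(n) > sens, rng.random(n) > spec)
        y_obs = np.where(flip, ~y, y).astype(int)
        fit = sm.Logit(y_obs, sm.add_constant(x)).fit(disp=0)
        rejections += fit.pvalues[1] < 0.05
    return rejections / sims

print(f"power, perfect labels: {power(300, 0.5, 1.00, 1.00):.2f}")
print(f"power, 90% sens/spec:  {power(300, 0.5, 0.90, 0.90):.2f}")  # lower
```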
Noninvasive methods for haemoglobin screening in prospective blood donors.
Belardinelli, A; Benni, M; Tazzari, P L; Pagliaro, P
2013-08-01
The haemoglobin level of prospective blood donors is usually measured on blood obtained from the finger pulp by fingerstick with a lancet, filling a capillary tube with the sample. New methods are now available for rapid, noninvasive predonation haemoglobin screening. Prospective blood donors at our blood centre were tested, in two different trials, as follows: by the NBM 200 (OrSense) test (n = 445 donors) and by the Pronto-7 (Masimo) test (n = 463 donors). The haemoglobin values of each trial and the haemoglobin of finger pulp blood obtained by fingerstick with a lancet (HemoCue) were compared with the haemoglobin values obtained from a venous sample on a Cell Counter (Beckman Coulter). Comparison of Beckman Coulter Cell Counter and OrSense results showed a bias of 0.29 g/dl, a standard deviation of the differences (SDD) of 0.98 and 95% limits of agreement from -1.64 to 2.21, using Bland and Altman statistical methodology. Comparison of Masimo and Beckman Coulter Cell Counter results showed a bias of -0.53 g/dl, SDD of 1.04 and 95% limits of agreement from -2.57 to 1.51. Cumulative analysis of all 908 donors, as tested by the usual fingerstick test, showed a bias of 0.83 g/dl, SDD of 0.70 and 95% limits of agreement from -0.54 to 2.20 compared with the Coulter Cell Counter. Compared with the Coulter Counter, the specificity of the methods was 99.5% for fingerstick, 97% for OrSense and 83% for Masimo, and the sensitivity was 99, 98 and 93%, respectively. Analysis of finger pulp blood, either by direct fingerstick sampling and HemoCue or by noninvasive haemoglobin tests, does not replicate the results of cell counter analysis of venous samples. Compared with fingerstick, noninvasive haemoglobin tests eliminate pain and reduce stress, but have a lower level of specificity and sensitivity. © 2013 International Society of Blood Transfusion.
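The Bland and Altman quantities reported above (bias, SDD, 95% limits of agreement) are straightforward to compute; this sketch uses synthetic data whose bias and SDD are set to the OrSense values quoted in the abstract:

```python
import numpy as np

def bland_altman(method, reference):
    """Bias, SD of differences, and 95% limits of agreement (Bland & Altman)."""
    d = np.asarray(method) - np.asarray(reference)
    bias, sdd = d.mean(), d.std(ddof=1)
    return bias, sdd, (bias - 1.96 * sdd, bias + 1.96 * sdd)

rng = np.random.default_rng(8)
venous = rng.normal(14.0, 1.2, 445)                 # cell-counter Hb, g/dl
device = venous + 0.29 + rng.normal(0, 0.98, 445)   # simulated noninvasive reading

bias, sdd, (lo, hi) = bland_altman(device, venous)
print(f"bias = {bias:.2f} g/dl, SDD = {sdd:.2f}, LoA = [{lo:.2f}, {hi:.2f}]")
```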
Detection probability in aerial surveys of feral horses
Ransom, Jason I.
2011-01-01
Observation bias pervades data collected during aerial surveys of large animals, and although some sources can be mitigated with informed planning, others must be addressed using valid sampling techniques that carefully model detection probability. Nonetheless, aerial surveys are frequently employed to count large mammals without applying such methods to account for heterogeneity in visibility of animal groups on the landscape. This often leaves managers and interest groups at odds over decisions that are not adequately informed. I analyzed detection of feral horse (Equus caballus) groups by dual independent observers from 24 fixed-wing and 16 helicopter flights using mixed-effect logistic regression models to investigate potential sources of observation bias. I accounted for observer skill, population location, and aircraft type in the model structure and analyzed the effects of group size, sun effect (position related to observer), vegetation type, topography, cloud cover, percent snow cover, and observer fatigue on detection of horse groups. The most important model-averaged effects for both fixed-wing and helicopter surveys included group size (fixed-wing: odds ratio = 0.891, 95% CI = 0.850–0.935; helicopter: odds ratio = 0.640, 95% CI = 0.587–0.698) and sun effect (fixed-wing: odds ratio = 0.632, 95% CI = 0.350–1.141; helicopter: odds ratio = 0.194, 95% CI = 0.080–0.470). Observer fatigue was also an important effect in the best model for helicopter surveys, with detection probability declining after 3 hr of survey time (odds ratio = 0.278, 95% CI = 0.144–0.537). Biases arising from sun effect and observer fatigue can be mitigated by pre-flight survey design. Other sources of bias, such as those arising from group size, topography, and vegetation can only be addressed by employing valid sampling techniques such as double sampling, mark–resight (batch-marked animals), mark–recapture (uniquely marked and identifiable animals), sightability bias correction models, and line transect distance sampling; however, some of these techniques may still only partially correct for negative observation biases.
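A minimal logistic-regression sketch of detection probability versus group size and sun effect (ordinary rather than mixed-effect regression; the coefficients are invented so that the odds ratios fall below 1, as reported):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 400
group_size = rng.integers(1, 30, n)
sun_glare = rng.integers(0, 2, n)     # 1 = observer looking toward the sun

# Illustrative true model: detection odds fall with group size and sun glare.
logit_p = 2.0 - 0.05 * group_size - 1.6 * sun_glare
detected = rng.random(n) < 1 / (1 + np.exp(-logit_p))

X = sm.add_constant(np.column_stack([group_size, sun_glare]))
fit = sm.Logit(detected.astype(int), X).fit(disp=0)
print("odds ratios [group size, sun effect]:", np.exp(fit.params[1:]))
```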
Dowle, Eddy J; Pochon, Xavier; C Banks, Jonathan; Shearer, Karen; Wood, Susanna A
2016-09-01
Recent studies have advocated biomonitoring using DNA techniques. In this study, two high-throughput sequencing (HTS)-based methods were evaluated: amplicon metabarcoding of the cytochrome C oxidase subunit I (COI) mitochondrial gene and gene enrichment using MYbaits (targeting nine different genes including COI). The gene-enrichment method does not require PCR amplification and thus avoids biases associated with universal primers. Macroinvertebrate samples were collected from 12 New Zealand rivers. Macroinvertebrates were morphologically identified and enumerated, and their biomass determined. DNA was extracted from all macroinvertebrate samples and HTS undertaken using the Illumina MiSeq platform. Macroinvertebrate communities were characterized from sequence data using either six genes (three of the original nine were not used) or just the COI gene in isolation. The gene-enrichment method (all genes) detected the highest number of taxa and obtained the strongest Spearman rank correlations between the number of sequence reads, abundance and biomass in 67% of the samples. Median detection rates across rare (<1% of the total abundance or biomass), moderately abundant (1-5%) and highly abundant (>5%) taxa were highest using the gene-enrichment method (all genes). Our data indicated primer biases occurred during amplicon metabarcoding, with greater than 80% of sequence reads originating from one taxon in several samples. The accuracy and sensitivity of both HTS methods would be improved with more comprehensive reference sequence databases. The data from this study illustrate the challenges of using PCR amplification-based methods for biomonitoring and highlight the potential benefits of using approaches, such as gene enrichment, which circumvent the need for an initial PCR step. © 2015 John Wiley & Sons Ltd.
Design and methods in a survey of living conditions in the Arctic - the SLiCA study.
Eliassen, Bent-Martin; Melhus, Marita; Kruse, Jack; Poppel, Birger; Broderstad, Ann Ragnhild
2012-03-19
The main objective of this study is to describe the methods and design of the survey of living conditions in the Arctic (SLiCA), relevant participation rates and the distribution of participants, as applicable to the survey data in Alaska, Greenland and Norway. This article briefly addresses possible selection bias in the data and also the ways to tackle it in future studies. Population-based cross-sectional survey. Indigenous individuals aged 16 years and older, living in Greenland, Alaska and in traditional settlement areas in Norway, were invited to participate. Random sampling methods were applied in Alaska and Greenland, while non-probability sampling methods were applied in Norway. Data were collected in 3 periods: in Alaska, from January 2002 to February 2003; in Greenland, from December 2003 to August 2006; and in Norway, in 2003 and from June 2006 to June 2008. The principal method in SLiCA was standardised face-to-face interviews using a questionnaire. A total of 663, 1,197 and 445 individuals were interviewed in Alaska, Greenland and Norway, respectively. Very high overall participation rates of 83% were obtained in Greenland and Alaska, while a more conventional rate of 57% was achieved in Norway. A predominance of female respondents was obtained in Alaska. Overall, the Sami cohort is older than the cohorts from Greenland and Alaska. Preliminary assessments suggest that selection bias in the Sami sample is plausible but not a major threat. Few or no threats to validity are detected in the data from Alaska and Greenland. Despite different sampling and recruitment methods, and sociocultural differences, a unique database has been generated, which shall be used to explore relationships between health and other living conditions variables.
Integrating sphere based reflectance measurements for small-area semiconductor samples
NASA Astrophysics Data System (ADS)
Saylan, S.; Howells, C. T.; Dahlem, M. S.
2018-05-01
This article describes a method that enables reflectance spectroscopy of small semiconductor samples using an integrating sphere, without the use of additional optical elements. We employed an inexpensive sample holder to measure the reflectance of different samples through 2-, 3-, and 4.5-mm-diameter apertures and applied a mathematical formulation to remove the bias in the measured spectra caused by illumination of the holder. Using the proposed method, the reflectance of samples fabricated using expensive or rare materials and/or low-throughput processes can be measured. It can also be incorporated to infer the internal quantum efficiency of small-area, research-level solar cells. Moreover, small samples that reflect light at large angles and exhibit scattering may also be measured reliably, by virtue of the integrating sphere's insensitivity to directionality.
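The abstract does not give the formulation, so as a loudly-labeled assumption, the simplest version treats the measured spectrum as an area-weighted mix of sample and holder reflectance and inverts that relation; correct_reflectance and all values below are hypothetical:

```python
import numpy as np

def correct_reflectance(r_measured, r_holder, f_sample):
    """Remove the holder contribution assuming a linear mix:
    R_meas = f*R_sample + (1 - f)*R_holder  (assumed model, not the paper's)."""
    return (r_measured - (1 - f_sample) * r_holder) / f_sample

wavelengths = np.linspace(400, 1100, 8)        # nm
r_sample_true = np.linspace(0.35, 0.30, 8)     # hypothetical sample reflectance
r_holder = np.full(8, 0.05)                    # dark holder reflectance
f = 0.6                                        # fraction of beam on the sample

r_meas = f * r_sample_true + (1 - f) * r_holder  # what the sphere would report
r_corr = correct_reflectance(r_meas, r_holder, f)
print(np.allclose(r_corr, r_sample_true))        # True
```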
Parametric Methods for Dynamic 11C-Phenytoin PET Studies.
Mansor, Syahir; Yaqub, Maqsood; Boellaard, Ronald; Froklage, Femke E; de Vries, Anke; Bakker, Esther D M; Voskuyl, Rob A; Eriksson, Jonas; Schwarte, Lothar A; Verbeek, Joost; Windhorst, Albert D; Lammertsma, Adriaan A
2017-03-01
In this study, the performance of various methods for generating quantitative parametric images of dynamic ¹¹C-phenytoin PET studies was evaluated. Methods: Double-baseline 60-min dynamic ¹¹C-phenytoin PET studies, including online arterial sampling, were acquired for 6 healthy subjects. Parametric images were generated using Logan plot analysis, a basis function method, and spectral analysis. Parametric distribution volume (V_T) and influx rate (K_1) were compared with those obtained from nonlinear regression analysis of time-activity curves. In addition, global and regional test-retest (TRT) variability was determined for parametric K_1 and V_T values. Results: Biases in V_T observed with all parametric methods were less than 5%. For K_1, spectral analysis showed a negative bias of 16%. The mean TRT variabilities of V_T and K_1 were less than 10% for all methods. Shortening the scan duration to 45 min provided similar V_T and K_1 with comparable TRT performance compared with 60-min data. Conclusion: Among the various parametric methods tested, the basis function method provided parametric V_T and K_1 values with the least bias compared with nonlinear regression data and showed TRT variabilities lower than 5%, also for smaller volume-of-interest sizes (i.e., higher noise levels) and shorter scan duration. © 2017 by the Society of Nuclear Medicine and Molecular Imaging.
Wetherbee, Gregory A.; Latysh, Natalie E.; Greene, Shannon M.
2006-01-01
The U.S. Geological Survey (USGS) used five programs to provide external quality-assurance monitoring for the National Atmospheric Deposition Program/National Trends Network (NADP/NTN) and two programs to provide external quality-assurance monitoring for the NADP/Mercury Deposition Network (NADP/MDN) during 2004. An intersite-comparison program was used to estimate accuracy and precision of field-measured pH and specific-conductance. The variability and bias of NADP/NTN data attributed to field exposure, sample handling and shipping, and laboratory chemical analysis were estimated using the sample-handling evaluation (SHE), field-audit, and interlaboratory-comparison programs. Overall variability of NADP/NTN data was estimated using a collocated-sampler program. Variability and bias of NADP/MDN data attributed to field exposure, sample handling and shipping, and laboratory chemical analysis were estimated using a system-blank program and an interlaboratory-comparison program. In two intersite-comparison studies, approximately 89 percent of NADP/NTN site operators met the pH measurement accuracy goals, and 94.7 to 97.1 percent of NADP/NTN site operators met the accuracy goals for specific conductance. Field chemistry measurements were discontinued by NADP at the end of 2004. As a result, the USGS intersite-comparison program also was discontinued at the end of 2004. Variability and bias in NADP/NTN data due to sample handling and shipping were estimated from paired-sample concentration differences and specific conductance differences obtained for the SHE program. Median absolute errors (MAEs) equal to or less than 3 percent were indicated for all measured analytes except potassium and hydrogen ion. Positive bias was indicated for most of the measured analytes except for calcium, hydrogen ion and specific conductance. Negative bias for hydrogen ion and specific conductance indicated loss of hydrogen ion and decreased specific conductance from contact of the sample with the collector bucket. Field-audit results for 2004 indicate dissolved analyte loss in more than one-half of NADP/NTN wet-deposition samples for all analytes except chloride. Concentrations of contaminants also were estimated from field-audit data. On the basis of 2004 field-audit results, at least 25 percent of the 2004 NADP/NTN concentrations for sodium, potassium, and chloride were lower than the maximum sodium, potassium, and chloride contamination likely to be found in 90 percent of the samples with 90-percent confidence. Variability and bias in NADP/NTN data attributed to chemical analysis by the NADP Central Analytical Laboratory (CAL) were comparable to the variability and bias estimated for other laboratories participating in the interlaboratory-comparison program for all analytes. Variability in NADP/NTN ammonium data evident in 2002-03 was reduced substantially during 2004. Sulfate, hydrogen-ion, and specific conductance data reported by CAL during 2004 were positively biased. A significant (α = 0.05) bias was identified for CAL sodium, potassium, ammonium, and nitrate data, but the absolute values of the median differences for these analytes were less than the method detection limits. No detections were reported for CAL analyses of deionized-water samples, indicating that contamination was not a problem for CAL. Control charts show that CAL data were within statistical control during at least 90 percent of 2004.
Most 2004 CAL interlaboratory-comparison results for synthetic wet-deposition solutions were within ±10 percent of the most probable values (MPVs) for solution concentrations except for chloride, nitrate, sulfate, and specific conductance results from one sample in November and one specific conductance result in December. Overall variability of NADP/NTN wet-deposition measurements was estimated during water year 2004 by the median absolute errors for weekly wet-deposition sample concentrations and precipitation measurements for tw
Bias of apparent tracer ages in heterogeneous environments.
McCallum, James L; Cook, Peter G; Simmons, Craig T; Werner, Adrian D
2014-01-01
The interpretation of apparent ages often assumes that a water sample is composed of a single age. In heterogeneous aquifers, apparent ages estimated with environmental tracer methods do not reflect mean water ages because of the mixing of waters from many flow paths with different ages. This is due to nonlinear variations in atmospheric concentrations of the tracer with time, resulting in biases of mixed concentrations used to determine apparent ages. The bias of these methods is rarely reported and has not been systematically evaluated in heterogeneous settings. We simulate residence time distributions (RTDs) and environmental tracers CFCs, SF₆, ⁸⁵Kr, and ³⁹Ar in synthetic heterogeneous confined aquifers and compare apparent ages to mean ages. Heterogeneity was simulated as both K-field variance (σ²) and structure. We demonstrate that an increase in heterogeneity (increase in σ² or structure) results in an increase in the width of the RTD. In low heterogeneity cases, widths were generally on the order of 10 years and biases generally less than 10%. In high heterogeneity cases, widths can reach 100s of years and biases can reach up to 100%. In cases where the temporal variations of atmospheric concentration of individual tracers vary, different patterns of bias are observed for the same mean age. We show that CFC-12 and CFC-113 ages may be used to correct for the mean age if analytical errors are small. © 2013, National Ground Water Association.
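The mixing bias can be demonstrated in a few lines: concentrations mix linearly, but a nonlinear atmospheric input curve makes the single-age interpretation of the mixture differ from the mean age. The input function below is invented for the sketch, not a real CFC history:

```python
import numpy as np

# Hypothetical monotone atmospheric input curve: concentration carried by a
# parcel of age tau (arbitrary units; the nonlinearity is what matters).
def c_atm(age):
    return np.exp(-age / 25.0)

def apparent_age(c):
    return -25.0 * np.log(c)   # invert c_atm, as a single-age reading would

# Mixture of young and old water from two flow paths.
ages = np.array([5.0, 95.0])
weights = np.array([0.5, 0.5])

mean_age = weights @ ages              # 50 years
c_mix = weights @ c_atm(ages)          # concentrations mix linearly
print(f"mean age = {mean_age:.0f} yr, apparent age = {apparent_age(c_mix):.0f} yr")
# The convex input curve makes the apparent age much younger than the mean age.
```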
Ho, Robin ST; Wu, Xinyin; Yuan, Jinqiu; Liu, Siya; Lai, Xin; Wong, Samuel YS; Chung, Vincent CH
2015-01-01
Background: Meta-analysis (MA) of randomised trials is considered to be one of the best approaches for summarising high-quality evidence on the efficacy and safety of treatments. However, methodological flaws in MAs can reduce the validity of conclusions, subsequently impairing the quality of decision making. Aims: To assess the methodological quality of MAs on COPD treatments. Methods: A cross-sectional study on MAs of COPD trials. MAs published during 2000–2013 were sampled from the Cochrane Database of Systematic Reviews and Database of Abstracts of Reviews of Effect. Methodological quality was assessed using the validated AMSTAR (Assessing the Methodological Quality of Systematic Reviews) tool. Results: Seventy-nine MAs were sampled. Only 18% considered the scientific quality of primary studies when formulating conclusions and 49% used appropriate meta-analytic methods to combine findings. The problems were particularly acute among MAs on pharmacological treatments. In 48% of MAs the authors did not report conflict of interest. Fifty-eight percent reported harmful effects of treatment. Publication bias was not assessed in 65% of MAs, and only 10% had searched non-English databases. Conclusions: The methodological quality of the included MAs was disappointing. Consideration of scientific quality when formulating conclusions should be made explicit. Future MAs should improve on reporting conflict of interest and harm, assessment of publication bias, prevention of language bias and use of appropriate meta-analytic methods. PMID:25569783
Pulsed discharge ionization source for miniature ion mobility spectrometers
Xu, Jun; Ramsey, J. Michael; Whitten, William B.
2004-11-23
A method and apparatus is disclosed for flowing a sample gas and a reactant gas (38, 43) past a corona discharge electrode (26) situated at a first location in an ion drift chamber (24), applying a pulsed voltage waveform comprising a varying pulse component and a dc bias component to the corona discharge electrode (26) to cause a corona which in turn produces ions from the sample gas and the reactant gas, applying a dc bias to the ion drift chamber (24) to cause the ions to drift to a second location (25) in the ion drift chamber (24), detecting the ions at the second location (25) in the drift chamber (24), and timing the period for the ions to drift from the corona discharge electrode to the selected location in the drift chamber.
Code of Federal Regulations, 2014 CFR
2014-07-01
... ambient temperature and pressure and the sampling time. The mass concentrations of both PM10c and PM2.5 in... 25 hours), and the start times of the PM2.5 and PM10c samples are within 10 minutes and the stop times of the samples are also within 10 minutes (see section 10.4 of this appendix). 4.0Accuracy (bias...
Code of Federal Regulations, 2012 CFR
2012-07-01
... ambient temperature and pressure and the sampling time. The mass concentrations of both PM10c and PM2.5 in... 25 hours), and the start times of the PM2.5 and PM10c samples are within 10 minutes and the stop times of the samples are also within 10 minutes (see section 10.4 of this appendix). 4.0Accuracy (bias...
Code of Federal Regulations, 2013 CFR
2013-07-01
... ambient temperature and pressure and the sampling time. The mass concentrations of both PM10c and PM2.5 in... 25 hours), and the start times of the PM2.5 and PM10c samples are within 10 minutes and the stop times of the samples are also within 10 minutes (see section 10.4 of this appendix). 4.0Accuracy (bias...
Code of Federal Regulations, 2011 CFR
2011-07-01
... ambient temperature and pressure and the sampling time. The mass concentrations of both PM10c and PM2.5 in... 25 hours), and the start times of the PM2.5 and PM10c samples are within 10 minutes and the stop times of the samples are also within 10 minutes (see section 10.4 of this appendix). 4.0Accuracy (bias...
Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxies
NASA Astrophysics Data System (ADS)
Lawlor, David; Budavári, Tamás; Mahoney, Michael W.
2016-12-01
We present a novel approach to studying the diversity of galaxies. It is based on a spectral graph technique, that of locally-biased semi-supervised eigenvectors. Our method introduces new coordinates that summarize an entire spectrum, similar to but going well beyond the widely used Principal Component Analysis (PCA). Unlike PCA, however, this technique does not assume that the Euclidean distance between galaxy spectra is a good global measure of similarity. Instead, we relax that condition to only the most similar spectra, and we show that doing so yields more reliable results for many astronomical questions of interest. The global variant of our approach can identify numerous astronomical phenomena of interest at a very fine level. The locally-biased variants of our basic approach enable us to explore subtle trends around a set of chosen objects. The power of the method is demonstrated in the Sloan Digital Sky Survey Main Galaxy Sample, by illustrating that the derived spectral coordinates carry an unprecedented amount of information.
Real-time image annotation by manifold-based biased Fisher discriminant analysis
NASA Astrophysics Data System (ADS)
Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming
2008-01-01
Automatic linguistic annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-the-art annotation algorithms: 1. the Small Sample Size (3S) problem in keyword classifier/model learning; 2. most annotation algorithms cannot extend to real-time online usage due to their low computational efficiency. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Biased Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-the-art annotation methods (1. K-NN Expansion; 2. One-to-All SVM; 3. PWC-SVM) in both computational time and annotation accuracy with a large margin.
Podsakoff, Philip M; MacKenzie, Scott B; Lee, Jeong-Yeon; Podsakoff, Nathan P
2003-10-01
Interest in the problem of method biases has a long history in the behavioral sciences. Despite this, a comprehensive summary of the potential sources of method biases and how to control for them does not exist. Therefore, the purpose of this article is to examine the extent to which method biases influence behavioral research results, identify potential sources of method biases, discuss the cognitive processes through which method biases influence responses to measures, evaluate the many different procedural and statistical techniques that can be used to control method biases, and provide recommendations for how to select appropriate procedural and statistical remedies for different types of research settings.
Estimating Dungeness crab (Cancer magister) abundance: Crab pots and dive transects compared
Taggart, S. James; O'Clair, Charles E.; Shirley, Thomas C.; Mondragon, Jennifer
2004-01-01
Dungeness crabs (Cancer magister) were sampled with commercial pots and counted by scuba divers on benthic transects at eight sites near Glacier Bay, Alaska. Catch per unit of effort (CPUE) from pots was compared to the density estimates from dives to evaluate the bias and power of the two techniques. Yearly sampling was conducted in two seasons: April and September, from 1992 to 2000. Male CPUE estimates from pots were significantly lower in April than in the following September; a step-wise regression demonstrated that season accounted for more of the variation in male CPUE than did temperature. In both April and September, pot sampling was significantly biased against females. When females were categorized as ovigerous and nonovigerous, it was clear that ovigerous females accounted for the majority of the bias because pots were not biased against nonovigerous females. We compared the power of pots and dive transects in detecting trends in populations and found that pots had much higher power than dive transects. Despite their low power, the dive transects were very useful for detecting bias in our pot sampling and in identifying the optimal times of year to sample so that pot bias could be avoided.
Potential, velocity, and density fields from sparse and noisy redshift-distance samples - Method
NASA Technical Reports Server (NTRS)
Dekel, Avishai; Bertschinger, Edmund; Faber, Sandra M.
1990-01-01
A method for recovering the three-dimensional potential, velocity, and density fields from large-scale redshift-distance samples is described. Galaxies are taken as tracers of the velocity field, not of the mass. The density field and the initial conditions are calculated using an iterative procedure that applies the no-vorticity assumption at an initial time and uses the Zel'dovich approximation to relate initial and final positions of particles on a grid. The method is tested using a cosmological N-body simulation 'observed' at the positions of real galaxies in a redshift-distance sample, taking into account their distance measurement errors. Malmquist bias and other systematic and statistical errors are extensively explored using both analytical techniques and Monte Carlo simulations.
James W. Flewelling
2009-01-01
Remotely sensed data can be used to make digital maps showing individual tree crowns (ITC) for entire forests. Attributes of the ITCs may include area, shape, height, and color. The crown map is sampled in a way that provides an unbiased linkage between ITCs and identifiable trees measured on the ground. Methods of avoiding edge bias are given. In an example from a...
NASA Astrophysics Data System (ADS)
Nelson, Kaylea; Lau, Erwin T.; Nagai, Daisuke; Rudd, Douglas H.; Yu, Liang
2014-02-01
The use of galaxy clusters as cosmological probes hinges on our ability to measure their masses accurately and with high precision. Hydrostatic mass is one of the most common methods for estimating the masses of individual galaxy clusters, which suffer from biases due to departures from hydrostatic equilibrium. Using a large, mass-limited sample of massive galaxy clusters from a high-resolution hydrodynamical cosmological simulation, in this work we show that in addition to turbulent and bulk gas velocities, acceleration of gas introduces biases in the hydrostatic mass estimate of galaxy clusters. In unrelaxed clusters, the acceleration bias is comparable to the bias due to non-thermal pressure associated with merger-induced turbulent and bulk gas motions. In relaxed clusters, the mean mass bias due to acceleration is small (≲ 3%), but the scatter in the mass bias can be reduced by accounting for gas acceleration. Additionally, this acceleration bias is greater in the outskirts of higher redshift clusters where mergers are more frequent and clusters are accreting more rapidly. Since gas acceleration cannot be observed directly, it introduces an irreducible bias for hydrostatic mass estimates. This acceleration bias places limits on how well we can recover cluster masses from future X-ray and microwave observations. We discuss implications for cluster mass estimates based on X-ray, Sunyaev-Zel'dovich effect, and gravitational lensing observations and their impact on cluster cosmology.
Different hunting strategies select for different weights in red deer.
Martínez, María; Rodríguez-Vigal, Carlos; Jones, Owen R; Coulson, Tim; San Miguel, Alfonso
2005-09-22
Much insight can be derived from records of shot animals. Most researchers using such data assume that their data represents a random sample of a particular demographic class. However, hunters typically select a non-random subset of the population and hunting is, therefore, not a random process. Here, with red deer (Cervus elaphus) hunting data from a ranch in Toledo, Spain, we demonstrate that data collection methods have a significant influence upon the apparent relationship between age and weight. We argue that a failure to correct for such methodological bias may have significant consequences for the interpretation of analyses involving weight or correlated traits such as breeding success, and urge researchers to explore methods to identify and correct for such bias in their data.
Dynamic Histogram Analysis To Determine Free Energies and Rates from Biased Simulations.
Stelzl, Lukas S; Kells, Adam; Rosta, Edina; Hummer, Gerhard
2017-12-12
We present an algorithm to calculate free energies and rates from molecular simulations on biased potential energy surfaces. As input, it uses the accumulated times spent in each state or bin of a histogram and counts of transitions between them. Optimal unbiased equilibrium free energies for each of the states/bins are then obtained by maximizing the likelihood of a master equation (i.e., first-order kinetic rate model). The resulting free energies also determine the optimal rate coefficients for transitions between the states or bins on the biased potentials. Unbiased rates can be estimated, e.g., by imposing a linear free energy condition in the likelihood maximization. The resulting "dynamic histogram analysis method extended to detailed balance" (DHAMed) builds on the DHAM method. It is also closely related to the transition-based reweighting analysis method (TRAM) and the discrete TRAM (dTRAM). However, in the continuous-time formulation of DHAMed, the detailed balance constraints are more easily accounted for, resulting in compact expressions amenable to efficient numerical treatment. DHAMed produces accurate free energies in cases where the common weighted-histogram analysis method (WHAM) for umbrella sampling fails because of slow dynamics within the windows. Even in the limit of completely uncorrelated data, where WHAM is optimal in the maximum-likelihood sense, DHAMed results are nearly indistinguishable. We illustrate DHAMed with applications to ion channel conduction, RNA duplex formation, α-helix folding, and rate calculations from accelerated molecular dynamics. DHAMed can also be used to construct Markov state models from biased or replica-exchange molecular dynamics simulations. By using binless WHAM formulated as a numerical minimization problem, the bias factors for the individual states can be determined efficiently in a preprocessing step and, if needed, optimized globally afterward.
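DHAMed itself maximizes a master-equation likelihood, which is beyond a short sketch; the simpler ancestor it generalizes, unbiasing a histogram from a single simulation under a known umbrella potential, looks like this (the free energy, bias, and constants are all synthetic):

```python
import numpy as np

rng = np.random.default_rng(10)
kT = 1.0

# Synthetic double-well free energy and a harmonic umbrella bias at x = 0.
def F_true(x): return (x**2 - 1.0)**2 / 0.5
def U_bias(x): return 2.0 * x**2               # flattens the barrier

# Sample the *biased* Boltzmann distribution on a grid (stand-in for MD frames).
x = np.linspace(-1.5, 1.5, 61)
p_biased = np.exp(-(F_true(x) + U_bias(x)) / kT)
counts = rng.multinomial(200_000, p_biased / p_biased.sum())

# Unbias each bin: F_i = -kT ln h_i - U_bias(x_i), up to an additive constant.
mask = counts > 0
F_est = -kT * np.log(counts[mask]) - U_bias(x[mask])
F_est -= F_est.min()
F_ref = F_true(x[mask]) - F_true(x[mask]).min()
print(f"max abs deviation from reference (kT): {np.abs(F_est - F_ref).max():.2f}")
```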
Assessment of Sample Preparation Bias in Mass Spectrometry-Based Proteomics.
Klont, Frank; Bras, Linda; Wolters, Justina C; Ongay, Sara; Bischoff, Rainer; Halmos, Gyorgy B; Horvatovich, Péter
2018-04-17
For mass spectrometry-based proteomics, the selected sample preparation strategy is a key determinant for information that will be obtained. However, the corresponding selection is often not based on a fit-for-purpose evaluation. Here we report a comparison of in-gel (IGD), in-solution (ISD), on-filter (OFD), and on-pellet digestion (OPD) workflows on the basis of targeted (QconCAT-multiple reaction monitoring (MRM) method for mitochondrial proteins) and discovery proteomics (data-dependent acquisition, DDA) analyses using three different human head and neck tissues (i.e., nasal polyps, parotid gland, and palatine tonsils). Our study reveals differences between the sample preparation methods, for example, with respect to protein and peptide losses, quantification variability, protocol-induced methionine oxidation, and asparagine/glutamine deamidation as well as identification of cysteine-containing peptides. However, none of the methods performed best for all types of tissues, which argues against the existence of a universal sample preparation method for proteome analysis.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-13
... proposed sample size, the expected response rate, methods for assessing potential non-response bias, the... Activities: Proposed Collection; Comment Request; Generic Clearance for the Collection of Qualitative...): ``Generic Clearance for the Collection of Qualitative Feedback on Agency Service Delivery '' to OMB for...
TECHNICAL NOTE: PERFORMANCE OF A PERSONAL ELECTROSTATIC PRECIPITATOR PARTICLE SAMPLER
Filter-based methods used to measure aerosols with semi-volatile constituents are subject to biases from adsorption and volatilization that may occur during sampling (McDow et al., 1990, Turpin et al., 1994, Volckens et al., 1999; Tolocka et al. 2001). The development and eval...
Bayesian Normalization Model for Label-Free Quantitative Analysis by LC-MS
Nezami Ranjbar, Mohammad R.; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.
2016-01-01
We introduce a new method for normalization of data acquired by liquid chromatography coupled with mass spectrometry (LC-MS) in label-free differential expression analysis. Normalization of LC-MS data is desired prior to subsequent statistical analysis to adjust variabilities in ion intensities that are caused not by biological differences but by experimental bias. There are different sources of bias, including variabilities during sample collection and sample storage, poor experimental design, noise, etc. In addition, instrument variability in experiments involving a large number of LC-MS runs leads to a significant drift in intensity measurements. Although various methods have been proposed for normalization of LC-MS data, there is no universally applicable approach. In this paper, we propose a Bayesian normalization model (BNM) that utilizes scan-level information from LC-MS data. Specifically, the proposed method uses peak shapes to model the scan-level data acquired from extracted ion chromatograms (EIC), with parameters specified as a linear mixed-effects model. We extended the model into BNM with drift (BNMD) to compensate for the variability in intensity measurements due to long LC-MS runs. We evaluated the performance of our method using synthetic and experimental data. In comparison with several existing methods, the proposed BNM and BNMD yielded significant improvement. PMID:26357332
A Bayesian approach to truncated data sets: An application to Malmquist bias in Supernova Cosmology
NASA Astrophysics Data System (ADS)
March, Marisa Cristina
2018-01-01
A problem commonly encountered in statistical analysis of data is that of truncated data sets. A truncated data set is one in which a number of data points are completely missing from a sample; this is in contrast to a censored sample, in which partial information is missing from some data points. In astrophysics this problem is commonly seen in a magnitude-limited survey, where the survey is incomplete at fainter magnitudes, that is, certain faint objects are simply not observed. The effect of this `missing data' is manifested as Malmquist bias and can result in biases in parameter inference if it is not accounted for. In frequentist methodologies the Malmquist bias is often corrected for by analysing many simulations and computing the appropriate correction factors. One problem with this methodology is that the corrections are model dependent. In this poster we derive a Bayesian methodology for accounting for truncated data sets in problems of parameter inference and model selection. We first show the methodology for a simple Gaussian linear model and then go on to show the method for accounting for a truncated data set in the case of cosmological parameter inference with a magnitude-limited supernova Ia survey.
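The core of this approach can be illustrated with a minimal sketch (not the poster's actual model): for a Gaussian sample truncated at a known limit, dividing each point's density by the detection probability gives a likelihood whose maximum recovers the untruncated parameters, whereas the naive sample mean is biased.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)

# Toy "absolute magnitudes": only objects brighter (numerically smaller)
# than the survey limit m_lim are observed.
mu_true, sigma_true, m_lim = 0.0, 1.0, 0.5
x = rng.normal(mu_true, sigma_true, 5000)
observed = x[x < m_lim]                       # truncated sample

def neg_log_like(params, data, lim):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    # Renormalize each point's density by the detection probability,
    # i.e. the probability of falling below the truncation limit.
    return -np.sum(stats.norm.logpdf(data, mu, sigma)
                   - stats.norm.logcdf(lim, mu, sigma))

res = optimize.minimize(neg_log_like, x0=[0.0, 0.0], args=(observed, m_lim))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(f"naive mean of observed sample: {observed.mean():+.3f} (biased low)")
print(f"truncation-aware MLE: mu = {mu_hat:+.3f}, sigma = {sigma_hat:.3f}")
```

In a Bayesian treatment the same renormalized likelihood is multiplied by priors and sampled rather than maximized.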
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Essential slow degrees of freedom in protein-surface simulations: A metadynamics investigation.
Prakash, Arushi; Sprenger, K G; Pfaendtner, Jim
2018-03-29
Many proteins exhibit strong binding affinities to surfaces, with binding energies much greater than thermal fluctuations. When modelling these protein-surface systems with classical molecular dynamics (MD) simulations, the large forces that exist at the protein/surface interface generally confine the system to a single free energy minimum. Exploring the full conformational space of the protein, especially finding other stable structures, becomes prohibitively expensive. Coupling MD simulations with metadynamics (enhanced sampling) has fast become a common method for sampling the adsorption of such proteins. In this paper, we compare three different flavors of metadynamics, specifically well-tempered, parallel-bias, and parallel-tempering in the well-tempered ensemble, to exhaustively sample the conformational surface-binding landscape of model peptide GGKGG. We investigate the effect of mobile ions and ion charge, as well as the choice of collective variable (CV), on the binding free energy of the peptide. We make the case for explicitly biasing ions to sample the true binding free energy of biomolecules when the ion concentration is high and the binding free energies of the solute and ions are similar. We also make the case for choosing CVs that apply bias to all atoms of the solute to speed up calculations and obtain the maximum possible amount of information about the system. Copyright © 2017 Elsevier Inc. All rights reserved.
HICOSMO: cosmology with a complete sample of galaxy clusters - II. Cosmological results
NASA Astrophysics Data System (ADS)
Schellenberger, G.; Reiprich, T. H.
2017-10-01
The X-ray bright, hot gas in the potential well of a galaxy cluster enables systematic X-ray studies of samples of galaxy clusters to constrain cosmological parameters. HIFLUGCS consists of the 64 X-ray brightest galaxy clusters in the Universe, building up a local sample. Here, we utilize this sample to determine, for the first time, individual hydrostatic mass estimates for all the clusters of the sample and, by making use of the completeness of the sample, we quantify constraints on the two interesting cosmological parameters, Ωm and σ8. We apply our total hydrostatic and gas mass estimates from the X-ray analysis to a Bayesian cosmological likelihood analysis and leave several parameters free to be constrained. We find Ωm = 0.30 ± 0.01 and σ8 = 0.79 ± 0.03 (statistical uncertainties, 68 per cent credibility level) using our default analysis strategy combining both a mass function analysis and the gas mass fraction results. The main sources of biases that we correct here are (1) the influence of galaxy groups (incompleteness in parent samples and differing behaviour of the Lx-M relation), (2) the hydrostatic mass bias, (3) the extrapolation of the total mass (comparing various methods), (4) the theoretical halo mass function and (5) other physical effects (non-negligible neutrino mass). We find that galaxy groups introduce a strong bias, since their number density seems to be overpredicted by the halo mass function. On the other hand, incorporating baryonic effects does not result in a significant change in the constraints. The total (uncorrected) systematic uncertainties (∼20 per cent) clearly dominate the statistical uncertainties on cosmological parameters for our sample.
Quantification of in-contact probe-sample electrostatic forces with dynamic atomic force microscopy.
Balke, Nina; Jesse, Stephen; Carmichael, Ben; Okatan, M Baris; Kravchenko, Ivan I; Kalinin, Sergei V; Tselev, Alexander
2017-01-04
Atomic force microscopy (AFM) methods utilizing resonant mechanical vibrations of cantilevers in contact with a sample surface have shown sensitivities as high as a few picometers for detecting surface displacements. Such a high sensitivity is harnessed in several AFM imaging modes. Here, we demonstrate a cantilever-resonance-based method to quantify electrostatic forces on a probe in the probe-sample junction in the presence of a surface potential or when a bias voltage is applied to the AFM probe. We find that the electrostatic forces acting on the probe tip apex can produce signals equivalent to a few pm of surface displacement. In combination with modeling, the measurements of the force were used to access the strength of the electrical field at the probe tip apex in contact with a sample. We find evidence that the electric field strength in the junction can reach ca. 1 V nm⁻¹ at a bias voltage of a few volts and is limited by non-ideality of the tip-sample contact. This field is sufficiently strong to significantly influence material states and kinetic processes through charge injection, Maxwell stress, shifts of phase equilibria, and reduction of energy barriers for activated processes. In addition, the results provide a baseline for accounting for the effects of local electrostatic forces in electromechanical AFM measurements as well as offer additional means to probe ionic mobility and field-induced phenomena in solids.
Information Repetition in Evaluative Judgments: Easy to Monitor, Hard to Control
ERIC Educational Resources Information Center
Unkelbach, Christian; Fiedler, Klaus; Freytag, Peter
2007-01-01
The sampling approach [Fiedler, K. (2000a). "Beware of samples! A cognitive-ecological sampling approach to judgment biases." "Psychological Review, 107"(4), 659-676.] attributes judgment biases to the information given in a sample. Because people usually do not monitor the constraints of samples and do not control their judgments accordingly,…
Williams, Michael S; Ebel, Eric D
2017-03-20
The presence or absence of contaminants in food samples changes as a commodity moves along the farm-to-table continuum. Interest lies in the degree to which the prevalence (i.e., infected animals or contaminated sample units) at one location in the continuum, as measured by the proportion of test-positive samples, is correlated with the prevalence at a location later in the continuum. If the prevalence of a contaminant at one location in the continuum is strongly correlated with the prevalence of the contaminant later in the continuum, then the effect of changes in contamination on overall food safety can be better understood. Pearson's correlation coefficient is one of the simplest metrics of association between two measurements of prevalence, but it is biased when data consisting of presence/absence testing results are used to directly estimate the correlation. This study demonstrates the potential magnitude of this bias and explores the utility of three methods for unbiased estimation of the degree of correlation in prevalence. An example, based on testing broiler chicken carcasses for Salmonella at re-hang and post-chill, is used to demonstrate the methods. Published by Elsevier B.V.
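The attenuation the study quantifies can be demonstrated with a short simulation. Everything below is an illustration under invented assumptions (Gaussian-copula prevalences, 30 tests per lot), not the article's estimators:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n_lots, n_tests = 500, 30          # production lots, tests per lot (assumed)

# Correlated underlying prevalences at two points in the continuum.
z = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=n_lots)
p1 = 0.05 + 0.15 * norm.cdf(z[:, 0])     # prevalence at, e.g., re-hang
p2 = 0.02 + 0.10 * norm.cdf(z[:, 1])     # prevalence at, e.g., post-chill

# Presence/absence testing: proportion of positive samples per lot.
phat1 = rng.binomial(n_tests, p1) / n_tests
phat2 = rng.binomial(n_tests, p2) / n_tests

print("correlation of true prevalences:", np.corrcoef(p1, p2)[0, 1].round(3))
print("correlation of test proportions:", np.corrcoef(phat1, phat2)[0, 1].round(3))
# The second value is attenuated by binomial sampling noise; this is the
# bias that the unbiased estimators in the study are designed to remove.
```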
Optimal updating magnitude in adaptive flat-distribution sampling
NASA Astrophysics Data System (ADS)
Zhang, Cheng; Drake, Justin A.; Ma, Jianpeng; Pettitt, B. Montgomery
2017-11-01
We present a study on the optimization of the updating magnitude for a class of free energy methods based on flat-distribution sampling, including the Wang-Landau (WL) algorithm and metadynamics. These methods rely on adaptive construction of a bias potential that offsets the potential of mean force by histogram-based updates. The convergence of the bias potential can be improved by decreasing the updating magnitude with an optimal schedule. We show that while the asymptotically optimal schedule for the single-bin updating scheme (commonly used in the WL algorithm) is given by the known inverse-time formula, that for the Gaussian updating scheme (commonly used in metadynamics) is often more complex. We further show that the single-bin updating scheme is optimal for very long simulations, and it can be generalized to a class of bandpass updating schemes that are similarly optimal. These bandpass updating schemes target only a few long-range distribution modes and their optimal schedule is also given by the inverse-time formula. Constructed from orthogonal polynomials, the bandpass updating schemes generalize the WL and Langfeld-Lucini-Rago algorithms as an automatic parameter tuning scheme for umbrella sampling.
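A minimal sketch of single-bin updating with the inverse-time schedule on a toy one-dimensional potential (the early-stage constant update and the switch point are arbitrary choices, not the paper's prescription):

```python
import numpy as np

rng = np.random.default_rng(2)
nbins, nsteps = 50, 200_000
x_grid = np.linspace(0, 1, nbins, endpoint=False)
U = np.sin(4 * np.pi * x_grid)        # toy potential of mean force (kT = 1)

bias = np.zeros(nbins)                # adaptive bias potential V
i = rng.integers(nbins)               # current bin

for t in range(1, nsteps + 1):
    j = (i + rng.choice([-1, 1])) % nbins            # propose a neighbor bin
    dE = (U[j] + bias[j]) - (U[i] + bias[i])
    if dE <= 0 or rng.random() < np.exp(-dE):        # Metropolis on U + V
        i = j
    # Single-bin update: constant magnitude early, then the
    # asymptotically optimal inverse-time schedule f(t) ~ nbins / t.
    f = nbins / t if t > 10 * nbins else 0.1
    bias[i] += f

# At convergence V offsets the PMF up to a constant, so -V estimates U.
est = -(bias - bias.mean())
print("max abs error of PMF estimate:", np.abs(est - (U - U.mean())).max().round(2))
```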
NASA Astrophysics Data System (ADS)
Fasnacht, Marc
We develop adaptive Monte Carlo methods for the calculation of the free energy as a function of a parameter of interest. The methods presented are particularly well-suited for systems with complex energy landscapes, where standard sampling techniques have difficulties. The Adaptive Histogram Method uses a biasing potential derived from histograms recorded during the simulation to achieve uniform sampling in the parameter of interest. The Adaptive Integration Method directly calculates an estimate of the free energy from the average derivative of the Hamiltonian with respect to the parameter of interest and uses it as a biasing potential. We compare both methods to a state-of-the-art method, and demonstrate that they compare favorably for the calculation of potentials of mean force of dense Lennard-Jones fluids. We use the Adaptive Integration Method to calculate accurate potentials of mean force for different types of simple particles in a Lennard-Jones fluid. Our approach allows us to separate the contributions of the solvent to the potential of mean force from the effect of the direct interaction between the particles. With contributions of the solvent determined, we can find the potential of mean force directly for any other direct interaction without additional simulations. We also test the accuracy of the Adaptive Integration Method on a thermodynamic cycle, which allows us to perform a consistency check between potentials of mean force and chemical potentials calculated using the Adaptive Integration Method. The results demonstrate a high degree of consistency of the method.
Bakbergenuly, Ilyas; Kulinskaya, Elena; Morgenthaler, Stephan
2016-07-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis and result in abysmal coverage of the combined effect for large K. We also propose a bias correction for the arcsine transformation. Our simulations demonstrate that this bias correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. © 2016 The Authors. Biometrical Journal Published by Wiley-VCH Verlag GmbH & Co. KGaA.
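The ρ-dependence can be reproduced with a beta-binomial simulation (arbitrary p and n; the beta-binomial is one standard way to induce an intracluster correlation, with ρ = 1/(a + b + 1) for a Beta(a, b) mixing distribution):

```python
import numpy as np

rng = np.random.default_rng(3)

def arcsine_bias(p, rho, n, reps=200_000):
    """Monte Carlo bias of the arcsine transform of a beta-binomial
    proportion with mean p and intracluster correlation rho."""
    s = 1.0 / rho - 1.0                              # a + b of the Beta mixing
    p_i = rng.beta(p * s, (1 - p) * s, size=reps)    # cluster-level probabilities
    x = rng.binomial(n, p_i)
    t = np.arcsin(np.sqrt(x / n))                    # arcsine transform
    return t.mean() - np.arcsin(np.sqrt(p))

for rho in (0.01, 0.05, 0.10, 0.20):
    print(f"rho = {rho:.2f}   bias = {arcsine_bias(0.2, rho, n=50):+.4f}")
# The bias grows roughly linearly in rho for small rho, as the article finds.
```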
Attention bias and anxiety in young children exposed to family violence
Briggs-Gowan, Margaret J.; Pollak, Seth D.; Grasso, Damion; Voss, Joel; Mian, Nicholas D.; Zobel, Elvira; McCarthy, Kimberly J.; Wakschlag, Lauren S.; Pine, Daniel S.
2015-01-01
Background Attention bias towards threat is associated with anxiety in older youth and adults and has been linked with violence exposure. Attention bias may moderate the relationship between violence exposure and anxiety in young children. Capitalizing on measurement advances, the current study examines these relationships at a younger age than previously possible. Methods Young children (mean age 4.7 ± 0.8 years) from a cross-sectional sample oversampled for violence exposure (N = 218) completed the dot-probe task to assess their attention biases. Observed fear/anxiety was characterized with a novel observational paradigm, the Anxiety Diagnostic Observation Schedule. Mother-reported symptoms were assessed with the Preschool-Age Psychiatric Assessment and Trauma Symptom Checklist for Young Children. Violence exposure was characterized with dimensional scores reflecting probability of membership in two classes derived via latent class analysis from the Conflict Tactics Scales: Abuse and Harsh Parenting. Results Family violence predicted greater child anxiety and trauma symptoms. Attention bias moderated the relationship between violence and anxiety. Conclusions Attention bias towards threat may strengthen the effects of family violence on the development of anxiety, with potentially cascading effects across childhood. Such associations may be most readily detected when using observational measures of childhood anxiety. PMID:26716142
De Kesel, Pieter M M; Capiau, Sara; Stove, Veronique V; Lambert, Willy E; Stove, Christophe P
2014-10-01
Although dried blood spot (DBS) sampling is increasingly receiving interest as a potential alternative to traditional blood sampling, the impact of hematocrit (Hct) on DBS results is limiting its final breakthrough in routine bioanalysis. To predict the Hct of a given DBS, potassium (K+) proved to be a reliable marker. The aim of this study was to evaluate whether application of an algorithm, based upon predicted Hct or K+ concentrations as such, allowed correction for the Hct bias. Using validated LC-MS/MS methods, caffeine, chosen as a model compound, was determined in whole blood and corresponding DBS samples with a broad Hct range (0.18-0.47). A reference subset (n = 50) was used to generate an algorithm based on K+ concentrations in DBS. Application of the developed algorithm on an independent test set (n = 50) alleviated the assay bias, especially at lower Hct values. Before correction, differences between DBS and whole blood concentrations ranged from -29.1 to 21.1%. The mean difference, as obtained by Bland-Altman comparison, was -6.6% (95% confidence interval (CI), -9.7 to -3.4%). After application of the algorithm, differences between corrected and whole blood concentrations lay between -19.9 and 13.9%, with a mean difference of -2.1% (95% CI, -4.5 to 0.3%). The same algorithm was applied to a separate compound, paraxanthine, which was determined in 103 samples (Hct range, 0.17-0.47), yielding similar results. In conclusion, a K+-based algorithm allows correction for the Hct bias in the quantitative analysis of caffeine and its metabolite paraxanthine.
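The structure of such a correction can be sketched as follows; every coefficient here is a hypothetical placeholder (a real implementation would fit them on a reference subset like the study's n = 50 calibration set):

```python
# Hypothetical K+-based hematocrit (Hct) bias correction for a dried
# blood spot (DBS) result. None of these numbers come from the study.

K_SLOPE, K_INTERCEPT = 0.25, -0.10  # assumed K+ (mmol/L) -> Hct calibration
BIAS_SLOPE = 1.2                    # assumed relative assay bias per unit Hct
HCT_REF = 0.36                      # assumed Hct of the calibration standards

def predict_hct(potassium_mmol_per_l: float) -> float:
    """Predict the Hct of a DBS from its measured K+ concentration."""
    return K_SLOPE * potassium_mmol_per_l + K_INTERCEPT

def correct_dbs(conc_measured: float, potassium_mmol_per_l: float) -> float:
    """Rescale a DBS concentration to offset the Hct-dependent bias."""
    hct = predict_hct(potassium_mmol_per_l)
    relative_bias = BIAS_SLOPE * (hct - HCT_REF)
    return conc_measured / (1.0 + relative_bias)

# Example: a low-Hct spot whose raw caffeine value reads low is scaled up.
print(correct_dbs(conc_measured=2.1, potassium_mmol_per_l=1.2))
```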
NASA Astrophysics Data System (ADS)
Knowles, Justin; Skutnik, Steven; Glasgow, David; Kapsimalis, Roger
2016-10-01
Rapid nondestructive assay methods for trace fissile material analysis are needed in both the nuclear forensics and safeguards communities. To address these needs, research at the Oak Ridge National Laboratory High Flux Isotope Reactor Neutron Activation Analysis facility has developed a generalized nondestructive assay method to characterize materials containing fissile isotopes. This method relies on gamma-ray emissions from short-lived fission products and makes use of differences in fission product yields to identify fissile compositions of trace material samples. Although prior work has explored the use of short-lived fission product gamma-ray measurements, the proposed method is the first to provide a complete characterization of isotopic identification, mass ratios, and absolute mass determination. Single fissile isotope masses have been recovered with less than 6% bias on standards of 235U and 239Pu as small as 12 ng in less than 10 minutes. Additionally, mixtures of fissile isotope standards containing 235U and 239Pu have been characterized down to 198 ng of fissile mass with less than 7% recovery bias. The generalizability of this method is illustrated by evaluating different fissile isotopes, mixtures of fissile isotopes, and two different irradiation positions in the reactor. It is anticipated that this method will be expanded to characterize additional fissile nuclides, utilize various irradiation facilities, and account for increasingly complex sample matrices.
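When two gamma lines have sufficiently different fission-product yields for the two isotopes, the absolute mass determination reduces to a small linear system. The sensitivity coefficients below are invented for illustration; real values would be calibrated with single-isotope standards:

```python
import numpy as np

# Hypothetical sensitivities: net counts per nanogram of each fissile
# isotope in two short-lived fission-product gamma lines.
S = np.array([[120.0, 40.0],    # line 1: counts/ng for (U-235, Pu-239)
              [ 30.0, 90.0]])   # line 2: counts/ng for (U-235, Pu-239)

counts = np.array([9800.0, 6450.0])      # measured net peak areas

m_u235, m_pu239 = np.linalg.solve(S, counts)
print(f"U-235: {m_u235:.0f} ng   Pu-239: {m_pu239:.0f} ng")   # 65 ng, 50 ng
```

With more gamma lines than isotopes the same idea becomes a weighted least-squares problem, which also yields uncertainties on the recovered masses.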
Simpson, Andrew T
2003-11-01
The measurement of oil mist derived from metalworking fluids formulated with light mineral oils can be highly inaccurate when using traditional filter sampling, owing to evaporation of oil from the filter. In this work the practicability of an alternative approach, measuring total oil mist and vapor, was investigated. Combinations of inhalable particle samplers with backup sorbent vapor traps and standard vapor sampling on pumped and diffusive sorbent tubes were evaluated with gravimetric, infrared spectroscopic, and gas chromatographic analytical methods against the performance requirements of European Standard EN 482. An artificial aerosol was used to compare the methods against a reference method of a filter sampler in series with three impingers. Multi-orifice samplers were used with standard 8-mm diameter charcoal tubes at 2 L/min without any signs of channelling or significant breakthrough, as were conical inhalable samplers with XAD-2 tubes at 1 L/min. Most combinations of samplers had a bias of less than 3 percent, but solitary pumped charcoal tubes underestimated total oil by 13 percent. Diffusive sampling was affected by impaction of mist particles and condensation of oil vapor. Gravimetric analysis of filters revealed significant potential sample loss during storage, with 4 percent being lost after one day when stored at room temperature and 2 percent when refrigerated. Samples left overnight in the balance room to equilibrate lost 24 percent. Infrared spectroscopy gave more precise results for vapor than gas chromatography (p = 0.002). Gas chromatography was less susceptible than infrared spectroscopy to bias from contaminating solvent vapors, but was still vulnerable to petroleum distillates. Under the specific test conditions (one oil type and mist particle size), all combinations of methods examined complied with the requirements of European Standard EN 482. Total airborne oil can be measured accurately; however, care must be taken to avoid contamination by hydrocarbon solvent vapors during sampling.
Bootstrap Estimation of Sample Statistic Bias in Structural Equation Modeling.
ERIC Educational Resources Information Center
Thompson, Bruce; Fan, Xitao
This study empirically investigated bootstrap bias estimation in the area of structural equation modeling (SEM). Three correctly specified SEM models were used under four different sample size conditions. Monte Carlo experiments were carried out to generate the criteria against which bootstrap bias estimation should be judged. For SEM fit indices,…
Network Structure and Biased Variance Estimation in Respondent Driven Sampling
Verdery, Ashton M.; Mouw, Ted; Bauldry, Shawn; Mucha, Peter J.
2015-01-01
This paper explores bias in the estimation of sampling variance in Respondent Driven Sampling (RDS). Prior methodological work on RDS has focused on its problematic assumptions and the biases and inefficiencies of its estimators of the population mean. Nonetheless, researchers have given only slight attention to the topic of estimating sampling variance in RDS, despite the importance of variance estimation for the construction of confidence intervals and hypothesis tests. In this paper, we show that the estimators of RDS sampling variance rely on a critical assumption that the network is First Order Markov (FOM) with respect to the dependent variable of interest. We demonstrate, through intuitive examples, mathematical generalizations, and computational experiments that current RDS variance estimators will always underestimate the population sampling variance of RDS in empirical networks that do not conform to the FOM assumption. Analysis of 215 observed university and school networks from Facebook and Add Health indicates that the FOM assumption is violated in every empirical network we analyze, and that these violations lead to substantially biased RDS estimators of sampling variance. We propose and test two alternative variance estimators that show some promise for reducing biases, but which also illustrate the limits of estimating sampling variance with only partial information on the underlying population social network. PMID:26679927
Method for construction of a biased potential for hyperdynamic simulation of atomic systems
NASA Astrophysics Data System (ADS)
Duda, E. V.; Kornich, G. V.
2017-10-01
An approach to constructing a biased potential for hyperdynamic simulation of atomic systems is considered. Using this approach, the diffusion of an atom adsorbed on the surface of a two-dimensional crystal and of a vacancy in the bulk of the crystal is simulated. The influence of the variation in the potential barriers due to thermal vibrations of atoms on the results of calculations is discussed. It is shown that the bias of the potential in the hyperdynamic simulation makes it possible to obtain statistical samples of transitions of atomic systems between states similar to those given by classical molecular dynamics. However, hyperdynamics significantly accelerates computations in comparison with molecular dynamics in the case of temperature-activated transitions and the associated processes in atomic systems.
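The bookkeeping behind the acceleration can be sketched on a toy one-dimensional system (the "basin-filling" bias form below is an arbitrary stand-in for the paper's construction): each step taken on the biased surface advances physical time by the boost factor exp(ΔV/kT).

```python
import numpy as np

rng = np.random.default_rng(4)
kT, dt, nsteps = 0.1, 1e-3, 100_000

def U0(y):
    """Unbiased toy potential: basins near y = 0 and 0.5, barriers of height 1."""
    return np.sin(2 * np.pi * y) ** 2

def dV(y, threshold=0.5, alpha=0.6):
    """Bias that fills basins but vanishes near transition states --
    the key constraint on any hyperdynamics bias potential."""
    return alpha * max(threshold - U0(y), 0.0)

def U(y):
    return U0(y) + dV(y)

x, t_hyper = 0.05, 0.0
for _ in range(nsteps):
    h = 1e-4
    force = -(U(x + h) - U(x - h)) / (2 * h)                # finite-difference force
    x += force * dt + np.sqrt(2 * kT * dt) * rng.normal()   # overdamped Langevin step
    t_hyper += dt * np.exp(dV(x) / kT)                      # boosted physical time

print(f"MD time simulated: {nsteps * dt:.0f}   accumulated hypertime: {t_hyper:.1f}")
```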
NASA Astrophysics Data System (ADS)
Langford, B.; Acton, W.; Ammann, C.; Valach, A.; Nemitz, E.
2015-10-01
All eddy-covariance flux measurements are associated with random uncertainties which are a combination of sampling error due to natural variability in turbulence and sensor noise. The former is the principal error for systems where the signal-to-noise ratio of the analyser is high, as is usually the case when measuring fluxes of heat, CO2 or H2O. Where signal is limited, which is often the case for measurements of other trace gases and aerosols, instrument uncertainties dominate. Here, we are applying a consistent approach based on auto- and cross-covariance functions to quantify the total random flux error and the random error due to instrument noise separately. As with previous approaches, the random error quantification assumes that the time lag between wind and concentration measurement is known. However, if combined with commonly used automated methods that identify the individual time lag by looking for the maximum in the cross-covariance function of the two entities, analyser noise additionally leads to a systematic bias in the fluxes. Combining data sets from several analysers and using simulations, we show that the method of time-lag determination becomes increasingly important as the magnitude of the instrument error approaches that of the sampling error. The flux bias can be particularly significant for disjunct data, whereas using a prescribed time lag eliminates these effects (provided the time lag does not fluctuate unduly over time). We also demonstrate that when sampling at higher elevations, where low frequency turbulence dominates and covariance peaks are broader, both the probability and magnitude of bias are magnified. We show that the statistical significance of noisy flux data can be increased (limit of detection can be decreased) by appropriate averaging of individual fluxes, but only if systematic biases are avoided by using a prescribed time lag. Finally, we make recommendations for the analysis and reporting of data with low signal-to-noise and their associated errors.
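The tail-of-the-cross-covariance estimate of the random error can be sketched with synthetic data (the 10 Hz rate, 1 h averaging period, and noise levels are all arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 36_000                                   # 1 h of 10 Hz data (assumed)

w = rng.normal(0, 0.3, n)                    # vertical wind (toy, white)
c = 0.05 * w + rng.normal(0, 0.5, n)         # scalar: signal + instrument noise

def crosscov(a, b, lags):
    a, b = a - a.mean(), b - b.mean()
    return np.array([np.mean(a[:n - k] * b[k:]) for k in lags])

cc = crosscov(w, c, np.arange(0, 2001))      # lags 0 .. 200 s
flux = cc[0]                                 # flux at the prescribed (zero) lag

# Random error: scatter of the cross-covariance far from the true lag
# (here 160-200 s), where no physical covariance remains.
random_error = cc[1600:].std()

print(f"flux = {flux:.4f}   random error (1 sigma) ~ {random_error:.4f}")
# Searching for the cross-covariance maximum instead of prescribing the
# lag would, at this signal-to-noise ratio, bias the flux estimate high.
```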
NASA Astrophysics Data System (ADS)
Langford, B.; Acton, W.; Ammann, C.; Valach, A.; Nemitz, E.
2015-03-01
All eddy-covariance flux measurements are associated with random uncertainties which are a combination of sampling error due to natural variability in turbulence and sensor noise. The former is the principal error for systems where the signal-to-noise ratio of the analyser is high, as is usually the case when measuring fluxes of heat, CO2 or H2O. Where signal is limited, which is often the case for measurements of other trace gases and aerosols, instrument uncertainties dominate. Here we apply a consistent approach based on auto- and cross-covariance functions to quantify the total random flux error and the random error due to instrument noise separately. As with previous approaches, the random error quantification assumes that the time-lag between wind and concentration measurement is known. However, if combined with commonly used automated methods that identify the individual time-lag by looking for the maximum in the cross-covariance function of the two entities, analyser noise additionally leads to a systematic bias in the fluxes. Combining datasets from several analysers and using simulations, we show that the method of time-lag determination becomes increasingly important as the magnitude of the instrument error approaches that of the sampling error. The flux bias can be particularly significant for disjunct data, whereas using a prescribed time-lag eliminates these effects (provided the time-lag does not fluctuate unduly over time). We also demonstrate that when sampling at higher elevations, where low frequency turbulence dominates and covariance peaks are broader, both the probability and magnitude of bias are magnified. We show that the statistical significance of noisy flux data can be increased (limit of detection can be decreased) by appropriate averaging of individual fluxes, but only if systematic biases are avoided by using a prescribed time-lag. Finally, we make recommendations for the analysis and reporting of data with low signal-to-noise and their associated errors.
Lincoln, Tricia A.; Horan-Ross, Debra A.; McHale, Michael R.; Lawrence, Gregory B.
2001-01-01
A laboratory for analysis of low-ionic strength water has been developed at the U.S. Geological Survey (USGS) office in Troy, N.Y., to analyze samples collected by USGS projects in the Northeast. The laboratory's quality-assurance program is based on internal and interlaboratory quality-assurance samples and quality-control procedures developed to ensure proper sample collection, processing, and analysis. The quality-assurance/quality-control data are stored in the laboratory's SAS data-management system, which provides efficient review, compilation, and plotting of quality-assurance/quality-control data. This report presents and discusses samples analyzed from July 1993 through June 1995. Quality-control results for 18 analytical procedures were evaluated for bias and precision. Control charts show that data from seven of the analytical procedures were biased throughout the analysis period for either high-concentration or low-concentration samples but were within control limits; these procedures were: acid-neutralizing capacity, dissolved inorganic carbon, dissolved organic carbon (soil expulsions), chloride, magnesium, nitrate (colorimetric method), and pH. Three of the analytical procedures were occasionally biased but were within control limits; they were: calcium (high for high-concentration samples for May 1995), dissolved organic carbon (high for high-concentration samples from January through September 1994), and fluoride (high in samples for April and June 1994). No quality-control sample has been developed for the organic monomeric aluminum procedure. Results from the filter-blank and analytical-blank analyses indicate that all analytical procedures in which blanks were run were within control limits, although values for a few blanks were outside the control limits. Blanks were not analyzed for acid-neutralizing capacity, dissolved inorganic carbon, fluoride, nitrate (colorimetric method), or pH. Sampling and analysis precision are evaluated herein in terms of the coefficient of variation obtained for triplicate samples in 14 of the 18 procedures. Data-quality objectives were met by more than 90 percent of the samples analyzed in all procedures except total monomeric aluminum (85 percent of samples met objectives), total aluminum (70 percent of samples met objectives), and dissolved organic carbon (85 percent of samples met objectives). Triplicate samples were not analyzed for ammonium, fluoride, dissolved inorganic carbon, or nitrate (colorimetric method). Results of the USGS interlaboratory Standard Reference Sample Program indicated high data quality with a median result of 3.6 of a possible 4.0. Environment Canada's LRTAP interlaboratory study results indicated that more than 85 percent of the samples met data-quality objectives in 6 of the 12 analyses; exceptions were calcium, dissolved organic carbon, chloride, pH, potassium, and sodium. Data-quality objectives were not met for calcium samples in one LRTAP study, but 94 percent of samples analyzed were within control limits for the remaining studies. Data-quality objectives were not met by 35 percent of samples analyzed for dissolved organic carbon, but 94 percent of sample values were within 20 percent of the most probable value. Data-quality objectives were not met for 30 percent of samples analyzed for chloride, but 90 percent of sample values were within 20 percent of the most probable value.
Measurements of samples with a pH above 6.0 were biased high in 54 percent of the samples, although 85 percent of the samples met data-quality objectives for pH measurements below 6.0. Data-quality objectives for potassium and sodium were not met in one study (only 33 percent of the samples analyzed met the objectives), although 85 percent of the sample values were within control limits for the other studies. Measured sodium values were above the upper control limit in all studies. Results from blind reference-sample analyses indicated that data
Testing the Large-scale Environments of Cool-core and Non-cool-core Clusters with Clustering Bias
NASA Astrophysics Data System (ADS)
Medezinski, Elinor; Battaglia, Nicholas; Coupon, Jean; Cen, Renyue; Gaspari, Massimo; Strauss, Michael A.; Spergel, David N.
2017-02-01
There are well-observed differences between cool-core (CC) and non-cool-core (NCC) clusters, but the origin of this distinction is still largely unknown. Competing theories can be divided into internal (inside-out), in which internal physical processes transform or maintain the NCC phase, and external (outside-in), in which the cluster type is determined by its initial conditions, which in turn leads to different formation histories (i.e., assembly bias). We propose a new method that uses the relative assembly bias of CC to NCC clusters, as determined via the two-point cluster-galaxy cross-correlation function (CCF), to test whether formation history plays a role in determining their nature. We apply our method to 48 ACCEPT clusters, which have well resolved central entropies, and cross-correlate with the SDSS-III/BOSS LOWZ galaxy catalog. We find that the relative bias of NCC over CC clusters is b = 1.42 ± 0.35 (1.6σ different from unity). Our measurement is limited by the small number of clusters with core entropy information within the BOSS footprint, 14 CC and 34 NCC clusters. Future compilations of X-ray cluster samples, combined with deep all-sky redshift surveys, will be able to better constrain the relative assembly bias of CC and NCC clusters and determine the origin of the bimodality.
GC-Content Normalization for RNA-Seq Data
2011-01-01
Background Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof. Results We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq. Conclusions Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential expression. Such results are crucial for the biological interpretation of RNA-Seq experiments, where downstream analyses can be sensitive to the supplied lists of genes. PMID:22177264
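The within-lane idea can be approximated outside Bioconductor with a loess regression of log counts on GC content (a regression-based analogue with a made-up monotone GC effect; the EDASeq package implements the article's actual methods):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(6)
n_genes = 5000

gc = rng.uniform(0.3, 0.7, n_genes)            # per-gene GC fraction
true_expr = rng.lognormal(3, 1, n_genes)
gc_effect = np.exp(3 * (gc - 0.5))             # assumed sample-specific GC bias
counts = rng.poisson(true_expr * gc_effect)

log_counts = np.log1p(counts)
# Within-lane normalization: fit the log-count vs GC trend and remove it.
trend = lowess(log_counts, gc, frac=0.3, return_sorted=False)
normalized = log_counts - (trend - trend.mean())

print("corr(log counts, GC) before:", np.corrcoef(log_counts, gc)[0, 1].round(2))
print("corr(log counts, GC) after :", np.corrcoef(normalized, gc)[0, 1].round(2))
```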
Testing the Large-scale Environments of Cool-core and Non-cool-core Clusters with Clustering Bias
DOE Office of Scientific and Technical Information (OSTI.GOV)
Medezinski, Elinor; Battaglia, Nicholas; Cen, Renyue
2017-02-10
There are well-observed differences between cool-core (CC) and non-cool-core (NCC) clusters, but the origin of this distinction is still largely unknown. Competing theories can be divided into internal (inside-out), in which internal physical processes transform or maintain the NCC phase, and external (outside-in), in which the cluster type is determined by its initial conditions, which in turn leads to different formation histories (i.e., assembly bias). We propose a new method that uses the relative assembly bias of CC to NCC clusters, as determined via the two-point cluster-galaxy cross-correlation function (CCF), to test whether formation history plays a role in determining their nature. We apply our method to 48 ACCEPT clusters, which have well resolved central entropies, and cross-correlate with the SDSS-III/BOSS LOWZ galaxy catalog. We find that the relative bias of NCC over CC clusters is b = 1.42 ± 0.35 (1.6σ different from unity). Our measurement is limited by the small number of clusters with core entropy information within the BOSS footprint, 14 CC and 34 NCC clusters. Future compilations of X-ray cluster samples, combined with deep all-sky redshift surveys, will be able to better constrain the relative assembly bias of CC and NCC clusters and determine the origin of the bimodality.
NASA Astrophysics Data System (ADS)
Davis, C.; Rozo, E.; Roodman, A.; Alarcon, A.; Cawthon, R.; Gatti, M.; Lin, H.; Miquel, R.; Rykoff, E. S.; Troxel, M. A.; Vielzeuf, P.; Abbott, T. M. C.; Abdalla, F. B.; Allam, S.; Annis, J.; Bechtol, K.; Benoit-Lévy, A.; Bertin, E.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; Cunha, C. E.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Doel, P.; Drlica-Wagner, A.; Fausti Neto, A.; Flaugher, B.; Fosalba, P.; Frieman, J.; García-Bellido, J.; Gaztanaga, E.; Gerdes, D. W.; Giannantonio, T.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jeltema, T.; Krause, E.; Kuehn, K.; Kuhlmann, S.; Kuropatkin, N.; Lahav, O.; Li, T. S.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Melchior, P.; Ogando, R. L. C.; Plazas, A. A.; Romer, A. K.; Sanchez, E.; Scarpine, V.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, M.; Soares-Santos, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Wechsler, R. H.
2018-06-01
Galaxy cross-correlations with high-fidelity redshift samples hold the potential to precisely calibrate systematic photometric redshift uncertainties arising from the unavailability of complete and representative training and validation samples of galaxies. However, application of this technique in the Dark Energy Survey (DES) is hampered by the relatively low number density, small area, and modest redshift overlap between photometric and spectroscopic samples. We propose instead using photometric catalogues with reliable photometric redshifts for photo-z calibration via cross-correlations. We verify the viability of our proposal using redMaPPer clusters from the Sloan Digital Sky Survey (SDSS) to successfully recover the redshift distribution of SDSS spectroscopic galaxies. We demonstrate how to combine photo-z with cross-correlation data to calibrate photometric redshift biases while marginalizing over possible clustering bias evolution in either the calibration or unknown photometric samples. We apply our method to DES Science Verification (DES SV) data in order to constrain the photometric redshift distribution of a galaxy sample selected for weak lensing studies, constraining the mean of the tomographic redshift distributions to a statistical uncertainty of Δz ~ ±0.01. We forecast that our proposal can, in principle, control photometric redshift uncertainties in DES weak lensing experiments at a level near the intrinsic statistical noise of the experiment over the range of redshifts where redMaPPer clusters are available. Our results provide strong motivation to launch a programme to fully characterize the systematic errors from bias evolution and photo-z shapes in our calibration procedure.
Ward, Mary H.; Bell, Erin M.; Whitehead, Todd P.; Gunier, Robert B.; Friesen, Melissa C.; Nuckols, John R.
2013-01-01
Background: Residential pesticide exposure has been linked to adverse health outcomes in adults and children. High-quality exposure estimates are critical for confirming these associations. Past epidemiologic studies have used one measurement of pesticide concentrations in carpet dust to characterize an individual’s average long-term exposure. If concentrations vary over time, this approach could substantially misclassify exposure and attenuate risk estimates. Objectives: We assessed the repeatability of pesticide concentrations in carpet dust samples and the potential attenuation bias in epidemiologic studies relying on one sample. Methods: We collected repeated carpet dust samples (median = 3; range, 1–7) from 21 homes in Fresno County, California, during 2003–2005. Dust was analyzed for 13 pesticides using gas chromatography–mass spectrometry. We used mixed-effects models to estimate between- and within-home variance. For each pesticide, we computed intraclass correlation coefficients (ICCs) and the estimated attenuation of regression coefficients in a hypothetical case–control study collecting a single dust sample. Results: The median ICC was 0.73 (range, 0.37–0.95), demonstrating higher between-home than within-home variability for most pesticides. The expected magnitude of attenuation bias associated with using a single dust sample was estimated to be ≤ 30% for 7 of the 13 compounds evaluated. Conclusions: For several pesticides studied, use of one dust sample to represent an exposure period of approximately 2 years would not be expected to substantially attenuate odds ratios. Further study is needed to determine if our findings hold for longer exposure periods and for other pesticides. PMID:23462689
Davis, C.; Rozo, E.; Roodman, A.; ...
2018-03-26
Galaxy cross-correlations with high-fidelity redshift samples hold the potential to precisely calibrate systematic photometric redshift uncertainties arising from the unavailability of complete and representative training and validation samples of galaxies. However, application of this technique in the Dark Energy Survey (DES) is hampered by the relatively low number density, small area, and modest redshift overlap between photometric and spectroscopic samples. We propose instead using photometric catalogs with reliable photometric redshifts for photo-z calibration via cross-correlations. We verify the viability of our proposal using redMaPPer clusters from the Sloan Digital Sky Survey (SDSS) to successfully recover the redshift distribution of SDSS spectroscopic galaxies. We demonstrate how to combine photo-z with cross-correlation data to calibrate photometric redshift biases while marginalizing over possible clustering bias evolution in either the calibration or unknown photometric samples. We apply our method to DES Science Verification (DES SV) data in order to constrain the photometric redshift distribution of a galaxy sample selected for weak lensing studies, constraining the mean of the tomographic redshift distributions to a statistical uncertainty of Δz ~ ±0.01. We forecast that our proposal can in principle control photometric redshift uncertainties in DES weak lensing experiments at a level near the intrinsic statistical noise of the experiment over the range of redshifts where redMaPPer clusters are available. Here, our results provide strong motivation to launch a program to fully characterize the systematic errors from bias evolution and photo-z shapes in our calibration procedure.
Inadequacy of internal covariance estimation for super-sample covariance
NASA Astrophysics Data System (ADS)
Lacasa, Fabien; Kunz, Martin
2017-08-01
We give an analytical interpretation of how subsample-based internal covariance estimators lead to biased estimates of the covariance, due to underestimating the super-sample covariance (SSC). This includes the jackknife and bootstrap methods as estimators for the full survey area, and subsampling as an estimator of the covariance of subsamples. The limitations of the jackknife covariance have been previously presented in the literature because it is effectively a rescaling of the covariance of the subsample area. However, we point out that subsampling is also biased, but for a different reason: the subsamples are not independent, and the corresponding lack of power results in SSC underprediction. We develop the formalism in the case of cluster counts that allows the bias of each covariance estimator to be exactly predicted. We find significant effects for small survey areas or when a small number of subsamples is used, with auto-redshift biases ranging from 0.4% to 15% for subsampling and from 5% to 75% for jackknife covariance estimates. The cross-redshift covariance is even more affected; biases range from 8% to 25% for subsampling and from 50% to 90% for jackknife. Owing to the redshift evolution of the probe, the covariances cannot be debiased by a simple rescaling factor, and an exact debiasing has the same requirements as the full SSC prediction. These results thus disfavour the use of internal covariance estimators on the data itself or a single simulation, leaving analytical prediction and simulation suites as possible SSC predictors.
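The subsample-based underprediction can be seen in a toy counts model in which every subregion shares a single super-sample density mode (all amplitudes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
K, nbar, sigma_ss = 20, 1000.0, 0.05   # subregions, mean count, SSC amplitude
n_surveys = 20_000

# Each simulated survey shares ONE super-sample mode delta_ss across all
# K subregions, plus independent Poisson shot noise.
delta_ss = rng.normal(0, sigma_ss, n_surveys)
counts = rng.poisson(nbar * (1 + delta_ss)[:, None], size=(n_surveys, K))
totals = counts.sum(axis=1)
true_var = totals.var()                # includes the SSC term

# Internal estimate from a single survey: delete-one jackknife over subregions.
one = counts[0]
jk = K * (one.sum() - one) / (K - 1)   # leave-one-out estimates of the total
jk_var = (K - 1) / K * ((jk - jk.mean()) ** 2).sum()

print(f"true variance of total counts : {true_var:,.0f}")
print(f"jackknife estimate (1 survey) : {jk_var:,.0f}")
# The jackknife sees only between-subregion scatter, which the shared
# super-sample mode does not produce, so it badly underestimates.
```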
Implicit Motivational Processes Underlying Smoking in American and Dutch Adolescents
Larsen, Helle; Kong, Grace; Becker, Daniela; Cousijn, Janna; Boendermaker, Wouter; Cavallo, Dana; Krishnan-Sarin, Suchitra; Wiers, Reinout
2014-01-01
Introduction: Research demonstrates that cognitive biases toward drug-related stimuli are correlated with substance use. This study aimed to investigate differences in cognitive biases (i.e., approach bias, attentional bias, and memory associations) between smoking and non-smoking adolescents in the US and the Netherlands. Within the group of smokers, we examined the relative predictive value of the cognitive biases and impulsivity related constructs (including inhibition skills, working memory, and risk taking) on daily smoking and nicotine dependence. Method: A total of 125 American and Dutch adolescent smokers (n = 67) and non-smokers (n = 58) between 13 and 18 years old participated. Participants completed the smoking approach–avoidance task, the classical and emotional Stroop task, brief implicit associations task, balloon analog risk task, the self-ordering pointing task, and a questionnaire assessing level of nicotine dependence and smoking behavior. Results: The analytical sample consisted of 56 Dutch adolescents (27 smokers and 29 non-smokers) and 37 American adolescents (19 smokers and 18 non-smokers). No differences in cognitive biases between smokers and non-smokers were found. Generally, Dutch adolescents demonstrated an avoidance bias toward both smoking and neutral stimuli whereas the American adolescents did not demonstrate a bias. Within the group of smokers, regression analyses showed that stronger attentional bias and weaker inhibition skills predicted greater nicotine dependence while weak working memory predicted more daily cigarette use. Conclusion: Attentional bias, inhibition skills, and working memory might be important factors explaining smoking in adolescence. Cultural differences in approach–avoidance bias should be considered in future research. PMID:24904435
Outcome-Dependent Sampling Design and Inference for Cox's Proportional Hazards Model.
Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P; Zhou, Haibo
2016-11-01
We propose a cost-effective outcome-dependent sampling (ODS) design for failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criterion that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure are shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with real data from the Cancer Incidence and Mortality of Uranium Miners Study.
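A weighted partial likelihood of this general kind can be sketched with the lifelines package. The sampling scheme below (keep every failure plus a random 20% of censored subjects, weighting by inverse inclusion probability) is an assumed stand-in for the article's ODS design, not its exact scheme:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(8)
n = 2000

# Toy cohort: covariate z with true log-hazard ratio 0.5, random censoring.
z = rng.normal(size=n)
t = rng.exponential(scale=np.exp(-0.5 * z))
c = rng.exponential(scale=1.5, size=n)
time, event = np.minimum(t, c), (t <= c).astype(int)

# Outcome-dependent sampling: all failures plus 20% of censored subjects,
# each retained record weighted by its inverse inclusion probability.
keep = (event == 1) | (rng.random(n) < 0.2)
w = np.where(event == 1, 1.0, 1 / 0.2)

df = pd.DataFrame({"time": time, "event": event, "z": z, "w": w})[keep]

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event",
        weights_col="w", robust=True)     # robust: sandwich variance
print(cph.params_)                        # should be near 0.5
```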
Purifying Nucleic Acids from Samples of Extremely Low Biomass
NASA Technical Reports Server (NTRS)
La Duc, Myron; Osman, Shariff; Venkateswaran, Kasthuri
2008-01-01
A new method is able to circumvent the bias to which one commercial DNA extraction method falls prey with regard to the lysing of certain types of microbial cells, which results in a truncated spectrum of microbial diversity. By prefacing the protocol with glass-bead-beating agitation (mechanically lysing a much more encompassing array of cell types and spores), the resulting microbial diversity detection is greatly enhanced. In preliminary studies, a commercially available automated DNA extraction method was effective at delivering total DNA yield, but only the non-hardy members of the bacterial community were represented in clone libraries, suggesting that this method was ineffective at lysing the hardier cell types. To circumvent such a bias, yet another extraction method was devised. In this technique, samples are first subjected to a stringent bead-beating step and then are processed via standard protocols. Prior to being loaded into extraction vials, samples are placed in micro-centrifuge bead tubes containing 50 µL of commercially produced lysis solution. After inverting several times, tubes are agitated at maximum speed for two minutes. Following agitation, tubes are centrifuged at 10,000 × g for one minute. At this time, the aqueous volumes are removed from the bead tubes and loaded into extraction vials to be further processed via the extraction regime. The new method couples two independent methodologies in such a way as to yield the highest concentration of PCR-amplifiable DNA with consistent and reproducible results and with the most accurate and encompassing report of species richness.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Presa, S.; Maaskant, P. P.
We present a comprehensive study of the emission spectra and electrical characteristics of InGaN/GaN multi-quantum well light-emitting diode (LED) structures under resonant optical pumping and varying electrical bias. A 5 quantum well LED with a thin well (1.5 nm) and a relatively thick barrier (6.6 nm) shows strong bias-dependent properties in the emission spectra, poor photovoltaic carrier escape under forward bias and an increase in effective resistance when compared with a 10 quantum well LED with a thin (4 nm) barrier. These properties are due to a strong piezoelectric field in the well and associated reduced field in the thicker barrier. We compare the voltage ideality factors for the LEDs under electrical injection, light emission with current, photovoltaic mode (PV) and photoluminescence (PL) emission. The PV and PL methods provide similar values for the ideality which are lower than for the resistance-limited electrical method. Under optical pumping the presence of an n-type InGaN underlayer in a commercial LED sample is shown to act as a second photovoltaic source reducing the photovoltage and the extracted ideality factor to less than 1. The use of photovoltaic measurements together with bias-dependent spectrally resolved luminescence is a powerful method to provide valuable insights into the dynamics of GaN LEDs.
NASA Astrophysics Data System (ADS)
Gao, Jing; Burt, James E.
2017-12-01
This study investigates the usefulness of a per-pixel bias-variance error decomposition (BVD) for understanding and improving spatially-explicit data-driven models of continuous variables in environmental remote sensing (ERS). BVD is a model evaluation method that originated in machine learning and has not previously been examined for ERS applications. Demonstrated with a showcase regression tree model mapping land imperviousness (0-100%) using Landsat images, our results showed that BVD can reveal sources of estimation errors, map how these sources vary across space, reveal the effects of various model characteristics on estimation accuracy, and enable in-depth comparison of different error metrics. Specifically, BVD bias maps can help analysts identify and delineate model spatial non-stationarity; BVD variance maps can indicate potential effects of ensemble methods (e.g. bagging) and inform efficient training sample allocation: training samples should capture the full complexity of the modeled process, and more samples should be allocated to regions with more complex underlying processes rather than regions covering larger areas. Through examining the relationships between model characteristics and their effects on estimation accuracy revealed by BVD for both absolute and squared errors (i.e. the error is the absolute or the squared value of the difference between observation and estimate), we found that the two error metrics embody different diagnostic emphases, can lead to different conclusions about the same model, and may suggest different solutions for performance improvement. We emphasize BVD's strength in revealing the connection between model characteristics and estimation accuracy, as understanding this relationship empowers analysts to effectively steer performance through model adjustments.
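A bootstrap version of the per-pixel decomposition can be sketched as follows (a toy two-band regression problem standing in for the Landsat application; rows of the test matrix play the role of pixels):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(9)

# Toy "imperviousness" target in [0, 100] from two spectral bands.
X = rng.uniform(0, 1, size=(3000, 2))
y_clean = 100 * (0.6 * X[:, 0] + 0.4 * np.sin(3 * X[:, 1]))
y = np.clip(y_clean + rng.normal(0, 5, len(y_clean)), 0, 100)

X_test, y_test = X[2000:], y_clean[2000:]    # evaluate against clean signal

# Ensemble of trees fit on bootstrap resamples of the training pixels.
preds = []
for _ in range(100):
    idx = rng.integers(0, 2000, 2000)
    tree = DecisionTreeRegressor(max_depth=6).fit(X[idx], y[idx])
    preds.append(tree.predict(X_test))
preds = np.array(preds)

bias2 = (preds.mean(axis=0) - y_test) ** 2   # squared bias per "pixel"
var = preds.var(axis=0)                      # variance per "pixel"

print("mean squared bias:", bias2.mean().round(2))
print("mean variance    :", var.mean().round(2))
# Mapping bias2 and var back to pixel coordinates reproduces the BVD maps;
# high-variance regions flag where bagging or extra training samples help.
```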