Extreme Mean and Its Applications
NASA Technical Reports Server (NTRS)
Swaroop, R.; Brownlow, J. D.
1979-01-01
Extreme value statistics obtained from normally distributed data are considered. An extreme mean is defined as the mean of p-th probability truncated normal distribution. An unbiased estimate of this extreme mean and its large sample distribution are derived. The distribution of this estimate even for very large samples is found to be nonnormal. Further, as the sample size increases, the variance of the unbiased estimate converges to the Cramer-Rao lower bound. The computer program used to obtain the density and distribution functions of the standardized unbiased estimate, and the confidence intervals of the extreme mean for any data are included for ready application. An example is included to demonstrate the usefulness of extreme mean application.
48 CFR 31.201-6 - Accounting for unallowable costs.
Code of Federal Regulations, 2012 CFR
2012-10-01
... unbiased sample that is a reasonable representation of the sampling universe. (ii) Any large dollar value... universe from that sampled cost is also subject to the same penalty provisions. (4) Use of statistical...
48 CFR 31.201-6 - Accounting for unallowable costs.
Code of Federal Regulations, 2013 CFR
2013-10-01
... unbiased sample that is a reasonable representation of the sampling universe. (ii) Any large dollar value... universe from that sampled cost is also subject to the same penalty provisions. (4) Use of statistical...
48 CFR 31.201-6 - Accounting for unallowable costs.
Code of Federal Regulations, 2014 CFR
2014-10-01
... unbiased sample that is a reasonable representation of the sampling universe. (ii) Any large dollar value... universe from that sampled cost is also subject to the same penalty provisions. (4) Use of statistical...
48 CFR 31.201-6 - Accounting for unallowable costs.
Code of Federal Regulations, 2011 CFR
2011-10-01
... unbiased sample that is a reasonable representation of the sampling universe. (ii) Any large dollar value... universe from that sampled cost is also subject to the same penalty provisions. (4) Use of statistical...
Unbiased multi-fidelity estimate of failure probability of a free plane jet
NASA Astrophysics Data System (ADS)
Marques, Alexandre; Kramer, Boris; Willcox, Karen; Peherstorfer, Benjamin
2017-11-01
Estimating failure probability related to fluid flows is a challenge because it requires a large number of evaluations of expensive models. We address this challenge by leveraging multiple low fidelity models of the flow dynamics to create an optimal unbiased estimator. In particular, we investigate the effects of uncertain inlet conditions in the width of a free plane jet. We classify a condition as failure when the corresponding jet width is below a small threshold, such that failure is a rare event (failure probability is smaller than 0.001). We estimate failure probability by combining the frameworks of multi-fidelity importance sampling and optimal fusion of estimators. Multi-fidelity importance sampling uses a low fidelity model to explore the parameter space and create a biasing distribution. An unbiased estimate is then computed with a relatively small number of evaluations of the high fidelity model. In the presence of multiple low fidelity models, this framework offers multiple competing estimators. Optimal fusion combines all competing estimators into a single estimator with minimal variance. We show that this combined framework can significantly reduce the cost of estimating failure probabilities, and thus can have a large impact in fluid flow applications. This work was funded by DARPA.
Harris, Alexandre M.; DeGiorgio, Michael
2016-01-01
Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator’s variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, H∼BLUE, relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error relative to others. Further, we empirically assess the performance of H∼BLUE on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of H∼BLUE leads to improved estimates of the population differentiation statistic, FST, which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data. PMID:28040781
Critical point relascope sampling for unbiased volume estimation of downed coarse woody debris
Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey; Mark J. Ducey
2005-01-01
Critical point relascope sampling is developed and shown to be design-unbiased for the estimation of log volume when used with point relascope sampling for downed coarse woody debris. The method is closely related to critical height sampling for standing trees when trees are first sampled with a wedge prism. Three alternative protocols for determining the critical...
Correlations of the IR Luminosity and Eddington Ratio with a Hard X-ray Selected Sample of AGN
NASA Technical Reports Server (NTRS)
Mushotzy, Richard F.; Winter, Lisa M.; McIntosh, Daniel H.; Tueller, Jack
2008-01-01
We use the SWIFT Burst Alert Telescope (BAT) sample of hard x-ray selected active galactic nuclei (AGN) with a median redshift of 0.03 and the 2MASS J and K band photometry to examine the correlation of hard x-ray emission to Eddington ratio as well as the relationship of the J and K band nuclear luminosity to the hard x-ray luminosity. The BAT sample is almost unbiased by the effects of obscuration and thus offers the first large unbiased sample for the examination of correlations between different wavelength bands. We find that the near-IR nuclear J and K band luminosity is related to the BAT (14 - 195 keV) luminosity over a factor of 10(exp 3) in luminosity (L(sub IR) approx.equals L(sub BAT)(sup 1.25) and thus is unlikely to be due to dust. We also find that the Eddington ratio is proportional to the x-ray luminosity. This new result should be a strong constraint on models of the formation of the broad band continuum.
Decision rules for unbiased inventory estimates
NASA Technical Reports Server (NTRS)
Argentiero, P. D.; Koch, D.
1979-01-01
An efficient and accurate procedure for estimating inventories from remote sensing scenes is presented. In place of the conventional and expensive full dimensional Bayes decision rule, a one-dimensional feature extraction and classification technique was employed. It is shown that this efficient decision rule can be used to develop unbiased inventory estimates and that for large sample sizes typical of satellite derived remote sensing scenes, resulting accuracies are comparable or superior to more expensive alternative procedures. Mathematical details of the procedure are provided in the body of the report and in the appendix. Results of a numerical simulation of the technique using statistics obtained from an observed LANDSAT scene are included. The simulation demonstrates the effectiveness of the technique in computing accurate inventory estimates.
Xu, Yan; Liu, Biao; Ding, Fengan; Zhou, Xiaodie; Tu, Pin; Yu, Bo; He, Yan; Huang, Peilin
2017-06-01
Circulating tumor cells (CTCs), isolated as a 'liquid biopsy', may provide important diagnostic and prognostic information. Therefore, rapid, reliable and unbiased detection of CTCs are required for routine clinical analyses. It was demonstrated that negative enrichment, an epithelial marker-independent technique for isolating CTCs, exhibits a better efficiency in the detection of CTCs compared with positive enrichment techniques that only use specific anti-epithelial cell adhesion molecules. However, negative enrichment techniques incur significant cell loss during the isolation procedure, and as it is a method that uses only one type of antibody, it is inherently biased. The detection procedure and identification of cell types also relies on skilled and experienced technicians. In the present study, the detection sensitivity of using negative enrichment and a previously described unbiased detection method was compared. The results revealed that unbiased detection methods may efficiently detect >90% of cancer cells in blood samples containing CTCs. By contrast, only 40-60% of CTCs were detected by negative enrichment. Additionally, CTCs were identified in >65% of patients with stage I/II lung cancer. This simple yet efficient approach may achieve a high level of sensitivity. It demonstrates a potential for the large-scale clinical implementation of CTC-based diagnostic and prognostic strategies.
XMM Observations of 'New' Swift BAT Sources
NASA Technical Reports Server (NTRS)
Mushotzky, Richard F.
2008-01-01
Because the E> 15 keV band is unaffected by absorption this band offers the best hope of obtaining an unbiased sample of AGN. The Swift BAT survey has produced the first large sample of hard x-ray bright AGN in the local universe providing the data necessary to determine the true characteristics of the AGN population. However to use this data one needs to obtain the x-ray spectral properties of these objects.We will present the complete sample of x-ray spectra of the BAT objects and the implications of these data.
Overlap between treatment and control distributions as an effect size measure in experiments.
Hedges, Larry V; Olkin, Ingram
2016-03-01
The proportion π of treatment group observations that exceed the control group mean has been proposed as an effect size measure for experiments that randomly assign independent units into 2 groups. We give the exact distribution of a simple estimator of π based on the standardized mean difference and use it to study the small sample bias of this estimator. We also give the minimum variance unbiased estimator of π under 2 models, one in which the variance of the mean difference is known and one in which the variance is unknown. We show how to use the relation between the standardized mean difference and the overlap measure to compute confidence intervals for π and show that these results can be used to obtain unbiased estimators, large sample variances, and confidence intervals for 3 related effect size measures based on the overlap. Finally, we show how the effect size π can be used in a meta-analysis. (c) 2016 APA, all rights reserved).
Entropic uncertainty relations and locking: Tight bounds for mutually unbiased bases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ballester, Manuel A.; Wehner, Stephanie
We prove tight entropic uncertainty relations for a large number of mutually unbiased measurements. In particular, we show that a bound derived from the result by Maassen and Uffink [Phys. Rev. Lett. 60, 1103 (1988)] for two such measurements can in fact be tight for up to {radical}(d) measurements in mutually unbiased bases. We then show that using more mutually unbiased bases does not always lead to a better locking effect. We prove that the optimal bound for the accessible information using up to {radical}(d) specific mutually unbiased bases is log d/2, which is the same as can be achievedmore » by using only two bases. Our result indicates that merely using mutually unbiased bases is not sufficient to achieve a strong locking effect and we need to look for additional properties.« less
Unbiased, scalable sampling of protein loop conformations from probabilistic priors.
Zhang, Yajia; Hauser, Kris
2013-01-01
Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in adhoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion-binding demonstrate its potential as a tool for loop ensemble generation and missing structure completion.
Unbiased, scalable sampling of protein loop conformations from probabilistic priors
2013-01-01
Background Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in adhoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Results Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Conclusion Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion-binding demonstrate its potential as a tool for loop ensemble generation and missing structure completion. PMID:24565175
Mark J. Ducey; Jeffrey H. Gove; Harry T. Valentine
2008-01-01
Perpendicular distance sampling (PDS) is a fast probability-proportional-to-size method for inventory of downed wood. However, previous development of PDS had limited the method to estimating only one variable (such as volume per hectare, or surface area per hectare) at a time. Here, we develop a general design-unbiased estimator for PDS. We then show how that...
An unbiased X-ray sampling of stars within 25 parsecs of the Sun
NASA Technical Reports Server (NTRS)
Johnson, H. M.
1985-01-01
A search of all of the Einstein Observatory IPC and HRI fields for untargeted stars in the Woolley, et al., Catalogue of the nearby stars is reported. Optical data and IPC coordinates, flux density F sub x, and luminosity L sub x, or upper limits, are tabulated for 126 single or blended systems, and HRI results for a few of them. IPC luminosity functions are derived for the systems, for 193 individual stars in the systems (with L sub x shared equally among blended components), and for 63 individual M dwarfs. These stars have relatively large X-ray flux densities that are free of interstellar extinction, because they are nearby, but they are otherwise unbiased with respect to the X-ray properties that are found in a defined small space around the Sun.
Double sampling to estimate density and population trends in birds
Bart, Jonathan; Earnst, Susan L.
2002-01-01
We present a method for estimating density of nesting birds based on double sampling. The approach involves surveying a large sample of plots using a rapid method such as uncorrected point counts, variable circular plot counts, or the recently suggested double-observer method. A subsample of those plots is also surveyed using intensive methods to determine actual density. The ratio of the mean count on those plots (using the rapid method) to the mean actual density (as determined by the intensive searches) is used to adjust results from the rapid method. The approach works well when results from the rapid method are highly correlated with actual density. We illustrate the method with three years of shorebird surveys from the tundra in northern Alaska. In the rapid method, surveyors covered ~10 ha h-1 and surveyed each plot a single time. The intensive surveys involved three thorough searches, required ~3 h ha-1, and took 20% of the study effort. Surveyors using the rapid method detected an average of 79% of birds present. That detection ratio was used to convert the index obtained in the rapid method into an essentially unbiased estimate of density. Trends estimated from several years of data would also be essentially unbiased. Other advantages of double sampling are that (1) the rapid method can be changed as new methods become available, (2) domains can be compared even if detection rates differ, (3) total population size can be estimated, and (4) valuable ancillary information (e.g. nest success) can be obtained on intensive plots with little additional effort. We suggest that double sampling be used to test the assumption that rapid methods, such as variable circular plot and double-observer methods, yield density estimates that are essentially unbiased. The feasibility of implementing double sampling in a range of habitats needs to be evaluated.
Vining, Kevin C.; Lundgren, Robert F.
2008-01-01
Sixty-five sampling sites, selected by a statistical design to represent lengths of perennial streams in North Dakota, were chosen to be sampled for fish and aquatic insects (macroinvertebrates) to establish unbiased baseline data. Channel catfish and common carp were the most abundant game and large fish species in the Cultivated Plains and Rangeland Plains, respectively. Blackflies were present in more than 50 percent of stream lengths sampled in the State; mayflies and caddisflies were present in more than 80 percent. Dragonflies were present in a greater percentage of stream lengths in the Rangeland Plains than in the Cultivated Plains.
Estimation of the simple correlation coefficient.
Shieh, Gwowen
2010-11-01
This article investigates some unfamiliar properties of the Pearson product-moment correlation coefficient for the estimation of simple correlation coefficient. Although Pearson's r is biased, except for limited situations, and the minimum variance unbiased estimator has been proposed in the literature, researchers routinely employ the sample correlation coefficient in their practical applications, because of its simplicity and popularity. In order to support such practice, this study examines the mean squared errors of r and several prominent formulas. The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating recommendation of r as an effect size index for the strength of linear association between two variables. In addition, related issues of estimating the squared simple correlation coefficient are also considered.
Nature vs. Nurture: The influence of OB star environments on proto-planetary disk evolution
NASA Astrophysics Data System (ADS)
Bouwman, Jeroen
2006-09-01
We propose a combined IRAC/IRS study of a large, well-defined and unbiased X-ray selected sample of pre-main-sequence stars in three OB associations: Pismis 24 in NGC 6357, NGC 2244 in the Rosette Nebula, and IC 1795 in the W3 complex. The samples are based on recent Chandra X-ray Observatory studies which reliably identify hundreds of cluster members and were carefully chosen to avoid high infrared nebular background. A new Chandra exposure of IC 1795 is requested, and an optical followup to characterise the host stars is planned.
NASA Astrophysics Data System (ADS)
Kwon, Ki-Won; Cho, Yongsoo
This letter presents a simple joint estimation method for residual frequency offset (RFO) and sampling frequency offset (STO) in OFDM-based digital video broadcasting (DVB) systems. The proposed method selects a continual pilot (CP) subset from an unsymmetrically and non-uniformly distributed CP set to obtain an unbiased estimator. Simulation results show that the proposed method using a properly selected CP subset is unbiased and performs robustly.
2014-01-01
Background Descendants from the extinct aurochs (Bos primigenius), taurine (Bos taurus) and zebu cattle (Bos indicus) were domesticated 10,000 years ago in Southwestern and Southern Asia, respectively, and colonized the world undergoing complex events of admixture and selection. Molecular data, in particular genome-wide single nucleotide polymorphism (SNP) markers, can complement historic and archaeological records to elucidate these past events. However, SNP ascertainment in cattle has been optimized for taurine breeds, imposing limitations to the study of diversity in zebu cattle. As amplified fragment length polymorphism (AFLP) markers are discovered and genotyped as the samples are assayed, this type of marker is free of ascertainment bias. In order to obtain unbiased assessments of genetic differentiation and structure in taurine and zebu cattle, we analyzed a dataset of 135 AFLP markers in 1,593 samples from 13 zebu and 58 taurine breeds, representing nine continental areas. Results We found a geographical pattern of expected heterozygosity in European taurine breeds decreasing with the distance from the domestication centre, arguing against a large-scale introgression from European or African aurochs. Zebu cattle were found to be at least as diverse as taurine cattle. Western African zebu cattle were found to have diverged more from Indian zebu than South American zebu. Model-based clustering and ancestry informative markers analyses suggested that this is due to taurine introgression. Although a large part of South American zebu cattle also descend from taurine cows, we did not detect significant levels of taurine ancestry in these breeds, probably because of systematic backcrossing with zebu bulls. Furthermore, limited zebu introgression was found in Podolian taurine breeds in Italy. Conclusions The assessment of cattle diversity reported here contributes an unbiased global view to genetic differentiation and structure of taurine and zebu cattle populations, which is essential for an effective conservation of the bovine genetic resources. PMID:24739206
Unbiased Estimates of Variance Components with Bootstrap Procedures
ERIC Educational Resources Information Center
Brennan, Robert L.
2007-01-01
This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…
Large deviations in the presence of cooperativity and slow dynamics
NASA Astrophysics Data System (ADS)
Whitelam, Stephen
2018-06-01
We study simple models of intermittency, involving switching between two states, within the dynamical large-deviation formalism. Singularities appear in the formalism when switching is cooperative or when its basic time scale diverges. In the first case the unbiased trajectory distribution undergoes a symmetry breaking, leading to a change in shape of the large-deviation rate function for a particular dynamical observable. In the second case the symmetry of the unbiased trajectory distribution remains unbroken. Comparison of these models suggests that singularities of the dynamical large-deviation formalism can signal the dynamical equivalent of an equilibrium phase transition but do not necessarily do so.
Geldsetzer, Pascal; Fink, Günther; Vaikath, Maria; Bärnighausen, Till
2018-02-01
(1) To evaluate the operational efficiency of various sampling methods for patient exit interviews; (2) to discuss under what circumstances each method yields an unbiased sample; and (3) to propose a new, operationally efficient, and unbiased sampling method. Literature review, mathematical derivation, and Monte Carlo simulations. Our simulations show that in patient exit interviews it is most operationally efficient if the interviewer, after completing an interview, selects the next patient exiting the clinical consultation. We demonstrate mathematically that this method yields a biased sample: patients who spend a longer time with the clinician are overrepresented. This bias can be removed by selecting the next patient who enters, rather than exits, the consultation room. We show that this sampling method is operationally more efficient than alternative methods (systematic and simple random sampling) in most primary health care settings. Under the assumption that the order in which patients enter the consultation room is unrelated to the length of time spent with the clinician and the interviewer, selecting the next patient entering the consultation room tends to be the operationally most efficient unbiased sampling method for patient exit interviews. © 2016 The Authors. Health Services Research published by Wiley Periodicals, Inc. on behalf of Health Research and Educational Trust.
Wakasaki, Rumie; Eiwaz, Mahaba; McClellan, Nicholas; Matsushita, Katsuyuki; Golgotiu, Kirsti; Hutchens, Michael P
2018-06-14
A technical challenge in translational models of kidney injury is determination of the extent of cell death. Histologic sections are commonly analyzed by area morphometry or unbiased stereology, but stereology requires specialized equipment. Therefore, a challenge to rigorous quantification would be addressed by an unbiased stereology tool with reduced equipment dependence. We hypothesized that it would be feasible to build a novel software component which would facilitate unbiased stereologic quantification on scanned slides, and that unbiased stereology would demonstrate greater precision and decreased bias compared with 2D morphometry. We developed a macro for the widely used image analysis program, Image J, and performed cardiac arrest with cardiopulmonary resuscitation (CA/CPR, a model of acute cardiorenal syndrome) in mice. Fluorojade-B stained kidney sections were analyzed using three methods to quantify cell death: gold standard stereology using a controlled stage and commercially-available software, unbiased stereology using the novel ImageJ macro, and quantitative 2D morphometry also using the novel macro. There was strong agreement between both methods of unbiased stereology (bias -0.004±0.006 with 95% limits of agreement -0.015 to 0.007). 2D morphometry demonstrated poor agreement and significant bias compared to either method of unbiased stereology. Unbiased stereology is facilitated by a novel macro for ImageJ and results agree with those obtained using gold-standard methods. Automated 2D morphometry overestimated tubular epithelial cell death and correlated modestly with values obtained from unbiased stereology. These results support widespread use of unbiased stereology for analysis of histologic outcomes of injury models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pascual, J.N.
1967-06-26
Evaluation of sample bias introduced by the mechanical sieving of Small Boy fallout samples for 10 minutes revealed the following: Up to 20% of the mass and 30% of the gamma-ray activity can be lost from the large-particle (greater than 1400 microns) fraction. The pan fraction (less than 44 microns) can gain in weight by as much as 79%, and in activity by as much as 44%. The gamma-ray spectra of the fractions were not noticeably altered by the process. Examination of unbiased pan fractions (before mechanical sieving) indicated bimodality of the mass-size distribution in a sample collected 9,200 feetmore » from ground zero, but not in a sample collected at 13,300 feet.« less
Skjaerven, Lars; Grant, Barry; Muga, Arturo; Teigen, Knut; McCammon, J. Andrew; Reuter, Nathalie; Martinez, Aurora
2011-01-01
GroEL is an ATP dependent molecular chaperone that promotes the folding of a large number of substrate proteins in E. coli. Large-scale conformational transitions occurring during the reaction cycle have been characterized from extensive crystallographic studies. However, the link between the observed conformations and the mechanisms involved in the allosteric response to ATP and the nucleotide-driven reaction cycle are not completely established. Here we describe extensive (in total long) unbiased molecular dynamics (MD) simulations that probe the response of GroEL subunits to ATP binding. We observe nucleotide dependent conformational transitions, and show with multiple 100 ns long simulations that the ligand-induced shift in the conformational populations are intrinsically coded in the structure-dynamics relationship of the protein subunit. Thus, these simulations reveal a stabilization of the equatorial domain upon nucleotide binding and a concomitant “opening” of the subunit, which reaches a conformation close to that observed in the crystal structure of the subunits within the ADP-bound oligomer. Moreover, we identify changes in a set of unique intrasubunit interactions potentially important for the conformational transition. PMID:21423709
Convergence of Free Energy Profile of Coumarin in Lipid Bilayer
2012-01-01
Atomistic molecular dynamics (MD) simulations of druglike molecules embedded in lipid bilayers are of considerable interest as models for drug penetration and positioning in biological membranes. Here we analyze partitioning of coumarin in dioleoylphosphatidylcholine (DOPC) bilayer, based on both multiple, unbiased 3 μs MD simulations (total length) and free energy profiles along the bilayer normal calculated by biased MD simulations (∼7 μs in total). The convergences in time of free energy profiles calculated by both umbrella sampling and z-constraint techniques are thoroughly analyzed. Two sets of starting structures are also considered, one from unbiased MD simulation and the other from “pulling” coumarin along the bilayer normal. The structures obtained by pulling simulation contain water defects on the lipid bilayer surface, while those acquired from unbiased simulation have no membrane defects. The free energy profiles converge more rapidly when starting frames from unbiased simulations are used. In addition, z-constraint simulation leads to more rapid convergence than umbrella sampling, due to quicker relaxation of membrane defects. Furthermore, we show that the choice of RESP, PRODRG, or Mulliken charges considerably affects the resulting free energy profile of our model drug along the bilayer normal. We recommend using z-constraint biased MD simulations based on starting geometries acquired from unbiased MD simulations for efficient calculation of convergent free energy profiles of druglike molecules along bilayer normals. The calculation of free energy profile should start with an unbiased simulation, though the polar molecules might need a slow pulling afterward. Results obtained with the recommended simulation protocol agree well with available experimental data for two coumarin derivatives. PMID:22545027
Convergence of Free Energy Profile of Coumarin in Lipid Bilayer.
Paloncýová, Markéta; Berka, Karel; Otyepka, Michal
2012-04-10
Atomistic molecular dynamics (MD) simulations of druglike molecules embedded in lipid bilayers are of considerable interest as models for drug penetration and positioning in biological membranes. Here we analyze partitioning of coumarin in dioleoylphosphatidylcholine (DOPC) bilayer, based on both multiple, unbiased 3 μs MD simulations (total length) and free energy profiles along the bilayer normal calculated by biased MD simulations (∼7 μs in total). The convergences in time of free energy profiles calculated by both umbrella sampling and z-constraint techniques are thoroughly analyzed. Two sets of starting structures are also considered, one from unbiased MD simulation and the other from "pulling" coumarin along the bilayer normal. The structures obtained by pulling simulation contain water defects on the lipid bilayer surface, while those acquired from unbiased simulation have no membrane defects. The free energy profiles converge more rapidly when starting frames from unbiased simulations are used. In addition, z-constraint simulation leads to more rapid convergence than umbrella sampling, due to quicker relaxation of membrane defects. Furthermore, we show that the choice of RESP, PRODRG, or Mulliken charges considerably affects the resulting free energy profile of our model drug along the bilayer normal. We recommend using z-constraint biased MD simulations based on starting geometries acquired from unbiased MD simulations for efficient calculation of convergent free energy profiles of druglike molecules along bilayer normals. The calculation of free energy profile should start with an unbiased simulation, though the polar molecules might need a slow pulling afterward. Results obtained with the recommended simulation protocol agree well with available experimental data for two coumarin derivatives.
Software engineering the mixed model for genome-wide association studies on large samples.
Zhang, Zhiwu; Buckler, Edward S; Casstevens, Terry M; Bradbury, Peter J
2009-11-01
Mixed models improve the ability to detect phenotype-genotype associations in the presence of population stratification and multiple levels of relatedness in genome-wide association studies (GWAS), but for large data sets the resource consumption becomes impractical. At the same time, the sample size and number of markers used for GWAS is increasing dramatically, resulting in greater statistical power to detect those associations. The use of mixed models with increasingly large data sets depends on the availability of software for analyzing those models. While multiple software packages implement the mixed model method, no single package provides the best combination of fast computation, ability to handle large samples, flexible modeling and ease of use. Key elements of association analysis with mixed models are reviewed, including modeling phenotype-genotype associations using mixed models, population stratification, kinship and its estimation, variance component estimation, use of best linear unbiased predictors or residuals in place of raw phenotype, improving efficiency and software-user interaction. The available software packages are evaluated, and suggestions made for future software development.
Kuipers, Jeroen; Kalicharan, Ruby D; Wolters, Anouk H G; van Ham, Tjakko J; Giepmans, Ben N G
2016-05-25
Large-scale 2D electron microscopy (EM), or nanotomy, is the tissue-wide application of nanoscale resolution electron microscopy. Others and we previously applied large scale EM to human skin pancreatic islets, tissue culture and whole zebrafish larvae(1-7). Here we describe a universally applicable method for tissue-scale scanning EM for unbiased detection of sub-cellular and molecular features. Nanotomy was applied to investigate the healthy and a neurodegenerative zebrafish brain. Our method is based on standardized EM sample preparation protocols: Fixation with glutaraldehyde and osmium, followed by epoxy-resin embedding, ultrathin sectioning and mounting of ultrathin-sections on one-hole grids, followed by post staining with uranyl and lead. Large-scale 2D EM mosaic images are acquired using a scanning EM connected to an external large area scan generator using scanning transmission EM (STEM). Large scale EM images are typically ~ 5 - 50 G pixels in size, and best viewed using zoomable HTML files, which can be opened in any web browser, similar to online geographical HTML maps. This method can be applied to (human) tissue, cross sections of whole animals as well as tissue culture(1-5). Here, zebrafish brains were analyzed in a non-invasive neuronal ablation model. We visualize within a single dataset tissue, cellular and subcellular changes which can be quantified in various cell types including neurons and microglia, the brain's macrophages. In addition, nanotomy facilitates the correlation of EM with light microscopy (CLEM)(8) on the same tissue, as large surface areas previously imaged using fluorescent microscopy, can subsequently be subjected to large area EM, resulting in the nano-anatomy (nanotomy) of tissues. In all, nanotomy allows unbiased detection of features at EM level in a tissue-wide quantifiable manner.
Kuipers, Jeroen; Kalicharan, Ruby D.; Wolters, Anouk H. G.
2016-01-01
Large-scale 2D electron microscopy (EM), or nanotomy, is the tissue-wide application of nanoscale resolution electron microscopy. Others and we previously applied large scale EM to human skin pancreatic islets, tissue culture and whole zebrafish larvae1-7. Here we describe a universally applicable method for tissue-scale scanning EM for unbiased detection of sub-cellular and molecular features. Nanotomy was applied to investigate the healthy and a neurodegenerative zebrafish brain. Our method is based on standardized EM sample preparation protocols: Fixation with glutaraldehyde and osmium, followed by epoxy-resin embedding, ultrathin sectioning and mounting of ultrathin-sections on one-hole grids, followed by post staining with uranyl and lead. Large-scale 2D EM mosaic images are acquired using a scanning EM connected to an external large area scan generator using scanning transmission EM (STEM). Large scale EM images are typically ~ 5 - 50 G pixels in size, and best viewed using zoomable HTML files, which can be opened in any web browser, similar to online geographical HTML maps. This method can be applied to (human) tissue, cross sections of whole animals as well as tissue culture1-5. Here, zebrafish brains were analyzed in a non-invasive neuronal ablation model. We visualize within a single dataset tissue, cellular and subcellular changes which can be quantified in various cell types including neurons and microglia, the brain's macrophages. In addition, nanotomy facilitates the correlation of EM with light microscopy (CLEM)8 on the same tissue, as large surface areas previously imaged using fluorescent microscopy, can subsequently be subjected to large area EM, resulting in the nano-anatomy (nanotomy) of tissues. In all, nanotomy allows unbiased detection of features at EM level in a tissue-wide quantifiable manner. PMID:27285162
Identifying Li-rich giants from low-resolution spectroscopic survey
NASA Astrophysics Data System (ADS)
Kumar, Yerra Bharat; Reddy, Bacham Eswar; Zhao, Gang
2018-04-01
In this paper we discuss our choice of a large unbiased sample used for the survey of red giant branch stars for finding Li-rich K giants, and the method used for identifying Li-rich candidates using low-resolution spectra. The sample has 2000 giants within a mass range of 0.8 to 3.0it{M}_{⊙}. Sample stars were selected from the Hipparcos catalogue with colour (B-V) and luminosity (it{L}/it{L}_{⊙}) in such way that the sample covers RGB evolution from its base towards RGB tip passing through first dredge-up and luminosity bump. Low-resolution (R ≈ 2000, 3500, 5000) spectra were obtained for all sample stars. Using core strength ratios of lines at Li I 6707 Å and its adjacent line Ca I 6717 Å we successfully identified 15 K giants with A(Li) > 1.5 dex, which are defined as Li-rich K giants. The results demonstrate the usefulness of low-resolution spectra to measure Li abundance and identify Li-rich giants from a large sample of stars in relatively shorter time periods.
Power Generation from a Radiative Thermal Source Using a Large-Area Infrared Rectenna
NASA Astrophysics Data System (ADS)
Shank, Joshua; Kadlec, Emil A.; Jarecki, Robert L.; Starbuck, Andrew; Howell, Stephen; Peters, David W.; Davids, Paul S.
2018-05-01
Electrical power generation from a moderate-temperature thermal source by means of direct conversion of infrared radiation is important and highly desirable for energy harvesting from waste heat and micropower applications. Here, we demonstrate direct rectified power generation from an unbiased large-area nanoantenna-coupled tunnel diode rectifier called a rectenna. Using a vacuum radiometric measurement technique with irradiation from a temperature-stabilized thermal source, a generated power density of 8 nW /cm2 is observed at a source temperature of 450 °C for the unbiased rectenna across an optimized load resistance. The optimized load resistance for the peak power generation for each temperature coincides with the tunnel diode resistance at zero bias and corresponds to the impedance matching condition for a rectifying antenna. Current-voltage measurements of a thermally illuminated large-area rectenna show current zero crossing shifts into the second quadrant indicating rectification. Photon-assisted tunneling in the unbiased rectenna is modeled as the mechanism for the large short-circuit photocurrents observed where the photon energy serves as an effective bias across the tunnel junction. The measured current and voltage across the load resistor as a function of the thermal source temperature represents direct current electrical power generation.
NASA Technical Reports Server (NTRS)
Horvath, R. (Principal Investigator); Cicone, R.; Crist, E.; Kauth, R. J.; Lambeck, P.; Malila, W. A.; Richardson, W.
1979-01-01
The author has identified the following significant results. An outgrowth of research and development activities in support of LACIE was a multicrop area estimation procedure, Procedure M. This procedure was a flexible, modular system that could be operated within the LACIE framework. Its distinctive features were refined preprocessing (including spatially varying correction for atmospheric haze), definition of field like spatial features for labeling, spectral stratification, and unbiased selection of samples to label and crop area estimation without conventional maximum likelihood classification.
Vicini, P; Fields, O; Lai, E; Litwack, E D; Martin, A-M; Morgan, T M; Pacanowski, M A; Papaluca, M; Perez, O D; Ringel, M S; Robson, M; Sakul, H; Vockley, J; Zaks, T; Dolsten, M; Søgaard, M
2016-02-01
High throughput molecular and functional profiling of patients is a key driver of precision medicine. DNA and RNA characterization has been enabled at unprecedented cost and scale through rapid, disruptive progress in sequencing technology, but challenges persist in data management and interpretation. We analyze the state-of-the-art of large-scale unbiased sequencing in drug discovery and development, including technology, application, ethical, regulatory, policy and commercial considerations, and discuss issues of LUS implementation in clinical and regulatory practice. © 2015 American Society for Clinical Pharmacology and Therapeutics.
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even, unbiased linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the the random sample size can take only two values, N = n or N = 2 n . In this paper, we consider the more practically useful setting of sample sizes in a the finite set { n 1 , n 2 , …, n L }. It is shown that the sample average is then a justifiable estimator , in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
Quasi interpolation with Voronoi splines.
Mirzargar, Mahsa; Entezari, Alireza
2011-12-01
We present a quasi interpolation framework that attains the optimal approximation-order of Voronoi splines for reconstruction of volumetric data sampled on general lattices. The quasi interpolation framework of Voronoi splines provides an unbiased reconstruction method across various lattices. Therefore this framework allows us to analyze and contrast the sampling-theoretic performance of general lattices, using signal reconstruction, in an unbiased manner. Our quasi interpolation methodology is implemented as an efficient FIR filter that can be applied online or as a preprocessing step. We present visual and numerical experiments that demonstrate the improved accuracy of reconstruction across lattices, using the quasi interpolation framework. © 2011 IEEE
Flexible sampling large-scale social networks by self-adjustable random walk
NASA Astrophysics Data System (ADS)
Xu, Xiao-Ke; Zhu, Jonathan J. H.
2016-12-01
Online social networks (OSNs) have become an increasingly attractive gold mine for academic and commercial researchers. However, research on OSNs faces a number of difficult challenges. One bottleneck lies in the massive quantity and often unavailability of OSN population data. Sampling perhaps becomes the only feasible solution to the problems. How to draw samples that can represent the underlying OSNs has remained a formidable task because of a number of conceptual and methodological reasons. Especially, most of the empirically-driven studies on network sampling are confined to simulated data or sub-graph data, which are fundamentally different from real and complete-graph OSNs. In the current study, we propose a flexible sampling method, called Self-Adjustable Random Walk (SARW), and test it against with the population data of a real large-scale OSN. We evaluate the strengths of the sampling method in comparison with four prevailing methods, including uniform, breadth-first search (BFS), random walk (RW), and revised RW (i.e., MHRW) sampling. We try to mix both induced-edge and external-edge information of sampled nodes together in the same sampling process. Our results show that the SARW sampling method has been able to generate unbiased samples of OSNs with maximal precision and minimal cost. The study is helpful for the practice of OSN research by providing a highly needed sampling tools, for the methodological development of large-scale network sampling by comparative evaluations of existing sampling methods, and for the theoretical understanding of human networks by highlighting discrepancies and contradictions between existing knowledge/assumptions of large-scale real OSN data.
NASA Astrophysics Data System (ADS)
Spinoglio, L.; Alonso-Herrero, A.; Armus, L.; Baes, M.; Bernard-Salas, J.; Bianchi, S.; Bocchio, M.; Bolatto, A.; Bradford, C.; Braine, J.; Carrera, F. J.; Ciesla, L.; Clements, D. L.; Dannerbauer, H.; Doi, Y.; Efstathiou, A.; Egami, E.; Fernández-Ontiveros, J. A.; Ferrara, A.; Fischer, J.; Franceschini, A.; Gallerani, S.; Giard, M.; González-Alfonso, E.; Gruppioni, C.; Guillard, P.; Hatziminaoglou, E.; Imanishi, M.; Ishihara, D.; Isobe, N.; Kaneda, H.; Kawada, M.; Kohno, K.; Kwon, J.; Madden, S.; Malkan, M. A.; Marassi, S.; Matsuhara, H.; Matsuura, M.; Miniutti, G.; Nagamine, K.; Nagao, T.; Najarro, F.; Nakagawa, T.; Onaka, T.; Oyabu, S.; Pallottini, A.; Piro, L.; Pozzi, F.; Rodighiero, G.; Roelfsema, P.; Sakon, I.; Santini, P.; Schaerer, D.; Schneider, R.; Scott, D.; Serjeant, S.; Shibai, H.; Smith, J.-D. T.; Sobacchi, E.; Sturm, E.; Suzuki, T.; Vallini, L.; van der Tak, F.; Vignali, C.; Yamada, T.; Wada, T.; Wang, L.
2017-11-01
IR spectroscopy in the range 12-230 μm with the SPace IR telescope for Cosmology and Astrophysics (SPICA) will reveal the physical processes governing the formation and evolution of galaxies and black holes through cosmic time, bridging the gap between the James Webb Space Telescope and the upcoming Extremely Large Telescopes at shorter wavelengths and the Atacama Large Millimeter Array at longer wavelengths. The SPICA, with its 2.5-m telescope actively cooled to below 8 K, will obtain the first spectroscopic determination, in the mid-IR rest-frame, of both the star-formation rate and black hole accretion rate histories of galaxies, reaching lookback times of 12 Gyr, for large statistically significant samples. Densities, temperatures, radiation fields, and gas-phase metallicities will be measured in dust-obscured galaxies and active galactic nuclei, sampling a large range in mass and luminosity, from faint local dwarf galaxies to luminous quasars in the distant Universe. Active galactic nuclei and starburst feedback and feeding mechanisms in distant galaxies will be uncovered through detailed measurements of molecular and atomic line profiles. The SPICA's large-area deep spectrophotometric surveys will provide mid-IR spectra and continuum fluxes for unbiased samples of tens of thousands of galaxies, out to redshifts of z 6.
Species richness in soil bacterial communities: a proposed approach to overcome sample size bias.
Youssef, Noha H; Elshahed, Mostafa S
2008-09-01
Estimates of species richness based on 16S rRNA gene clone libraries are increasingly utilized to gauge the level of bacterial diversity within various ecosystems. However, previous studies have indicated that regardless of the utilized approach, species richness estimates obtained are dependent on the size of the analyzed clone libraries. We here propose an approach to overcome sample size bias in species richness estimates in complex microbial communities. Parametric (Maximum likelihood-based and rarefaction curve-based) and non-parametric approaches were used to estimate species richness in a library of 13,001 near full-length 16S rRNA clones derived from soil, as well as in multiple subsets of the original library. Species richness estimates obtained increased with the increase in library size. To obtain a sample size-unbiased estimate of species richness, we calculated the theoretical clone library sizes required to encounter the estimated species richness at various clone library sizes, used curve fitting to determine the theoretical clone library size required to encounter the "true" species richness, and subsequently determined the corresponding sample size-unbiased species richness value. Using this approach, sample size-unbiased estimates of 17,230, 15,571, and 33,912 were obtained for the ML-based, rarefaction curve-based, and ACE-1 estimators, respectively, compared to bias-uncorrected values of 15,009, 11,913, and 20,909.
Barrier screens: a method to sample blood-fed and host-seeking exophilic mosquitoes
2013-01-01
Background Determining the proportion of blood meals on humans by outdoor-feeding and resting mosquitoes is challenging. This is largely due to the difficulty of finding an adequate and unbiased sample of resting, engorged mosquitoes to enable the identification of host blood meal sources. This is particularly difficult in the south-west Pacific countries of Indonesia, the Solomon Islands and Papua New Guinea where thick vegetation constitutes the primary resting sites for the exophilic mosquitoes that are the primary malaria and filariasis vectors. Methods Barrier screens of shade-cloth netting attached to bamboo poles were constructed between villages and likely areas where mosquitoes might seek blood meals or rest. Flying mosquitoes, obstructed by the barrier screens, would temporarily stop and could then be captured by aspiration at hourly intervals throughout the night. Results In the three countries where this method was evaluated, blood-fed females of Anopheles farauti, Anopheles bancroftii, Anopheles longirostris, Anopheles sundaicus, Anopheles vagus, Anopheles kochi, Anopheles annularis, Anopheles tessellatus, Culex vishnui, Culex quinquefasciatus and Mansonia spp were collected while resting on the barrier screens. In addition, female Anopheles punctulatus and Armigeres spp as well as male An. farauti, Cx. vishnui, Cx. quinquefasciatus and Aedes species were similarly captured. Conclusions Building barrier screens as temporary resting sites in areas where mosquitoes were likely to fly was an extremely time-effective method for collecting an unbiased representative sample of engorged mosquitoes for determining the human blood index. PMID:23379959
Russell, Joseph A; Campos, Brittany; Stone, Jennifer; Blosser, Erik M; Burkett-Cadena, Nathan; Jacobs, Jonathan L
2018-04-03
The future of infectious disease surveillance and outbreak response is trending towards smaller hand-held solutions for point-of-need pathogen detection. Here, samples of Culex cedecei mosquitoes collected in Southern Florida, USA were tested for Venezuelan Equine Encephalitis Virus (VEEV), a previously-weaponized arthropod-borne RNA-virus capable of causing acute and fatal encephalitis in animal and human hosts. A single 20-mosquito pool tested positive for VEEV by quantitative reverse transcription polymerase chain reaction (RT-qPCR) on the Biomeme two3. The virus-positive sample was subjected to unbiased metatranscriptome sequencing on the Oxford Nanopore MinION and shown to contain Everglades Virus (EVEV), an alphavirus in the VEEV serocomplex. Our results demonstrate, for the first time, the use of unbiased sequence-based detection and subtyping of a high-consequence biothreat pathogen directly from an environmental sample using field-forward protocols. The development and validation of methods designed for field-based diagnostic metagenomics and pathogen discovery, such as those suitable for use in mobile "pocket laboratories", will address a growing demand for public health teams to carry out their mission where it is most urgent: at the point-of-need.
NASA Astrophysics Data System (ADS)
Springob, Chris M.; Colless, M.; Jones, D. H.; Magoulas, C.; Mould, J. R.; Campbell, L.; Lah, P.; Lucey, J.; Merson, A.; Proctor, R.
2010-01-01
The 6dF Galaxy Survey (6dFGS) is an all southern sky galaxy survey, including 125,000 redshifts and more than 10,000 peculiar velocities, making it the largest peculiar velocity sample to date. In combination with 2MASS surface brightnesses and effective radii, 6dFGS yields the near-infrared Fundamental Plane (FP) for a large and uniform sample. We have fit the FP relation for the galaxies in the peculiar velocity sample using a maximum likelihood method which allows us to precisely account for selection effects and observational errors. We investigate the effects of varying stellar populations and environments on the FP. Finally, we discuss the implications of these results both for our understanding of the origin of the FP for early-type galaxies and bulges and for deriving unbiased distances and peculiar velocities in the local universe.
AGN Clustering in the BAT Sample
NASA Astrophysics Data System (ADS)
Powell, Meredith; Cappelluti, Nico; Urry, Meg; Koss, Michael; BASS Team
2018-01-01
We characterize the environments of local growing supermassive black holes by measuring the clustering of AGN in the Swift-BAT Spectroscopic Survey (BASS). With 548 AGN in the redshift range 0.01
Rosenberger, Amanda E.; Dunham, Jason B.
2005-01-01
Estimation of fish abundance in streams using the removal model or the Lincoln - Peterson mark - recapture model is a common practice in fisheries. These models produce misleading results if their assumptions are violated. We evaluated the assumptions of these two models via electrofishing of rainbow trout Oncorhynchus mykiss in central Idaho streams. For one-, two-, three-, and four-pass sampling effort in closed sites, we evaluated the influences of fish size and habitat characteristics on sampling efficiency and the accuracy of removal abundance estimates. We also examined the use of models to generate unbiased estimates of fish abundance through adjustment of total catch or biased removal estimates. Our results suggested that the assumptions of the mark - recapture model were satisfied and that abundance estimates based on this approach were unbiased. In contrast, the removal model assumptions were not met. Decreasing sampling efficiencies over removal passes resulted in underestimated population sizes and overestimates of sampling efficiency. This bias decreased, but was not eliminated, with increased sampling effort. Biased removal estimates based on different levels of effort were highly correlated with each other but were less correlated with unbiased mark - recapture estimates. Stream size decreased sampling efficiency, and stream size and instream wood increased the negative bias of removal estimates. We found that reliable estimates of population abundance could be obtained from models of sampling efficiency for different levels of effort. Validation of abundance estimates requires extra attention to routine sampling considerations but can help fisheries biologists avoid pitfalls associated with biased data and facilitate standardized comparisons among studies that employ different sampling methods.
ClustENM: ENM-Based Sampling of Essential Conformational Space at Full Atomic Resolution
Kurkcuoglu, Zeynep; Bahar, Ivet; Doruker, Pemra
2016-01-01
Accurate sampling of conformational space and, in particular, the transitions between functional substates has been a challenge in molecular dynamic (MD) simulations of large biomolecular systems. We developed an Elastic Network Model (ENM)-based computational method, ClustENM, for sampling large conformational changes of biomolecules with various sizes and oligomerization states. ClustENM is an iterative method that combines ENM with energy minimization and clustering steps. It is an unbiased technique, which requires only an initial structure as input, and no information about the target conformation. To test the performance of ClustENM, we applied it to six biomolecular systems: adenylate kinase (AK), calmodulin, p38 MAP kinase, HIV-1 reverse transcriptase (RT), triosephosphate isomerase (TIM), and the 70S ribosomal complex. The generated ensembles of conformers determined at atomic resolution show good agreement with experimental data (979 structures resolved by X-ray and/or NMR) and encompass the subspaces covered in independent MD simulations for TIM, p38, and RT. ClustENM emerges as a computationally efficient tool for characterizing the conformational space of large systems at atomic detail, in addition to generating a representative ensemble of conformers that can be advantageously used in simulating substrate/ligand-binding events. PMID:27494296
Zhou, Xiang
2017-12-01
Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs-the restricted maximum likelihood estimation method (REML)-suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods-the renowned Haseman-Elston (HE) regression and the recent LD score regression (LDSC)-into the same unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We provide an exact estimation form of LDSC to yield unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal z -scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while capable of producing estimates that can be almost as accurate as if both quantities are computed using the full data. As a result, our method produces unbiased and statistically efficient estimates, and makes use of summary statistics, while it is computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.
Alibay, Irfan; Burusco, Kepa K; Bruce, Neil J; Bryce, Richard A
2018-03-08
Determining the conformations accessible to carbohydrate ligands in aqueous solution is important for understanding their biological action. In this work, we evaluate the conformational free-energy surfaces of Lewis oligosaccharides in explicit aqueous solvent using a multidimensional variant of the swarm-enhanced sampling molecular dynamics (msesMD) method; we compare with multi-microsecond unbiased MD simulations, umbrella sampling, and accelerated MD approaches. For the sialyl Lewis A tetrasaccharide, msesMD simulations in aqueous solution predict conformer landscapes in general agreement with the other biased methods and with triplicate unbiased 10 μs trajectories; these simulations find a predominance of closed conformer and a range of low-occupancy open forms. The msesMD simulations also suggest closed-to-open transitions in the tetrasaccharide are facilitated by changes in ring puckering of its GlcNAc residue away from the 4 C 1 form, in line with previous work. For sialyl Lewis X tetrasaccharide, msesMD simulations predict a minor population of an open form in solution corresponding to a rare lectin-bound pose observed crystallographically. Overall, from comparison with biased MD calculations, we find that triplicate 10 μs unbiased MD simulations may not be enough to fully sample glycan conformations in aqueous solution. However, the computational efficiency and intuitive approach of the msesMD method suggest potential for its application in glycomics as a tool for analysis of oligosaccharide conformation.
Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling
2006-01-01
Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling. PMID:16937083
Unbiased Sampling of Globular Lattice Proteins in Three Dimensions
NASA Astrophysics Data System (ADS)
Jacobsen, Jesper Lykke
2008-03-01
We present a Monte Carlo method that allows efficient and unbiased sampling of Hamiltonian walks on a cubic lattice. Such walks are self-avoiding and visit each lattice site exactly once. They are often used as simple models of globular proteins, upon adding suitable local interactions. Our algorithm can easily be equipped with such interactions, but we study here mainly the flexible homopolymer case where each conformation is generated with uniform probability. We argue that the algorithm is ergodic and has dynamical exponent z=0. We then use it to study polymers of size up to 643=262144 monomers. Results are presented for the effective interaction between end points, and the interaction with the boundaries of the system.
An evaluation of flow-stratified sampling for estimating suspended sediment loads
Robert B. Thomas; Jack Lewis
1995-01-01
Abstract - Flow-stratified sampling is a new method for sampling water quality constituents such as suspended sediment to estimate loads. As with selection-at-list-time (SALT) and time-stratified sampling, flow-stratified sampling is a statistical method requiring random sampling, and yielding unbiased estimates of load and variance. It can be used to estimate event...
Nearest neighbor density ratio estimation for large-scale applications in astronomy
NASA Astrophysics Data System (ADS)
Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.
2015-09-01
In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.
Detection of seizures from small samples using nonlinear dynamic system theory.
Yaylali, I; Koçak, H; Jayakar, P
1996-07-01
The electroencephalogram (EEG), like many other biological phenomena, is quite likely governed by nonlinear dynamics. Certain characteristics of the underlying dynamics have recently been quantified by computing the correlation dimensions (D2) of EEG time series data. In this paper, D2 of the unbiased autocovariance function of the scalp EEG data was used to detect electrographic seizure activity. Digital EEG data were acquired at a sampling rate of 200 Hz per channel and organized in continuous frames (duration 2.56 s, 512 data points). To increase the reliability of D2 computations with short duration data, raw EEG data were initially simplified using unbiased autocovariance analysis to highlight the periodic activity that is present during seizures. The D2 computation was then performed from the unbiased autocovariance function of each channel using the Grassberger-Procaccia method with Theiler's box-assisted correlation algorithm. Even with short duration data, this preprocessing proved to be computationally robust and displayed no significant sensitivity to implementation details such as the choices of embedding dimension and box size. The system successfully identified various types of seizures in clinical studies.
Constructing statistically unbiased cortical surface templates using feature-space covariance
NASA Astrophysics Data System (ADS)
Parvathaneni, Prasanna; Lyu, Ilwoo; Huo, Yuankai; Blaber, Justin; Hainline, Allison E.; Kang, Hakmook; Woodward, Neil D.; Landman, Bennett A.
2018-03-01
The choice of surface template plays an important role in cross-sectional subject analyses involving cortical brain surfaces because there is a tendency toward registration bias given variations in inter-individual and inter-group sulcal and gyral patterns. In order to account for the bias and spatial smoothing, we propose a feature-based unbiased average template surface. In contrast to prior approaches, we factor in the sample population covariance and assign weights based on feature information to minimize the influence of covariance in the sampled population. The mean surface is computed by applying the weights obtained from an inverse covariance matrix, which guarantees that multiple representations from similar groups (e.g., involving imaging, demographic, diagnosis information) are down-weighted to yield an unbiased mean in feature space. Results are validated by applying this approach in two different applications. For evaluation, the proposed unbiased weighted surface mean is compared with un-weighted means both qualitatively and quantitatively (mean squared error and absolute relative distance of both the means with baseline). In first application, we validated the stability of the proposed optimal mean on a scan-rescan reproducibility dataset by incrementally adding duplicate subjects. In the second application, we used clinical research data to evaluate the difference between the weighted and unweighted mean when different number of subjects were included in control versus schizophrenia groups. In both cases, the proposed method achieved greater stability that indicated reduced impacts of sampling bias. The weighted mean is built based on covariance information in feature space as opposed to spatial location, thus making this a generic approach to be applicable to any feature of interest.
Modeling Particle Exposure in US Trucking Terminals
Davis, ME; Smith, TJ; Laden, F; Hart, JE; Ryan, LM; Garshick, E
2007-01-01
Multi-tiered sampling approaches are common in environmental and occupational exposure assessment, where exposures for a given individual are often modeled based on simultaneous measurements taken at multiple indoor and outdoor sites. The monitoring data from such studies is hierarchical by design, imposing a complex covariance structure that must be accounted for in order to obtain unbiased estimates of exposure. Statistical methods such as structural equation modeling (SEM) represent a useful alternative to simple linear regression in these cases, providing simultaneous and unbiased predictions of each level of exposure based on a set of covariates specific to the exposure setting. We test the SEM approach using data from a large exposure assessment of diesel and combustion particles in the US trucking industry. The exposure assessment includes data from 36 different trucking terminals across the United States sampled between 2001 and 2005, measuring PM2.5 and its elemental carbon (EC), organic carbon (OC) components, by personal monitoring, and sampling at two indoor work locations and an outdoor “background” location. Using the SEM method, we predict: 1) personal exposures as a function of work related exposure and smoking status; 2) work related exposure as a function of terminal characteristics, indoor ventilation, job location, and background exposure conditions; and 3) background exposure conditions as a function of weather, nearby source pollution, and other regional differences across terminal sites. The primary advantage of SEMs in this setting is the ability to simultaneously predict exposures at each of the sampling locations, while accounting for the complex covariance structure among the measurements and descriptive variables. The statistically significant results and high R2 values observed from the trucking industry application supports the broader use of this approach in exposure assessment modeling. PMID:16856739
Flat-Sky Pseudo-Cls Analysis for Weak Gravitational Lensing
NASA Astrophysics Data System (ADS)
Asgari, Marika; Taylor, Andy; Joachimi, Benjamin; Kitching, Thomas D.
2018-05-01
We investigate the use of estimators of weak lensing power spectra based on a flat-sky implementation of the 'Pseudo-CI' (PCl) technique, where the masked shear field is transformed without regard for masked regions of sky. This masking mixes power, and 'E'-convergence and 'B'-modes. To study the accuracy of forward-modelling and full-sky power spectrum recovery we consider both large-area survey geometries, and small-scale masking due to stars and a checkerboard model for field-of-view gaps. The power spectrum for the large-area survey geometry is sparsely-sampled and highly oscillatory, which makes modelling problematic. Instead, we derive an overall calibration for large-area mask bias using simulated fields. The effects of small-area star masks can be accurately corrected for, while the checkerboard mask has oscillatory and spiky behaviour which leads to percent biases. Apodisation of the masked fields leads to increased biases and a loss of information. We find that we can construct an unbiased forward-model of the raw PCls, and recover the full-sky convergence power to within a few percent accuracy for both Gaussian and lognormal-distributed shear fields. Propagating this through to cosmological parameters using a Fisher-Matrix formalism, we find we can make unbiased estimates of parameters for surveys up to 1,200 deg2 with 30 galaxies per arcmin2, beyond which the percent biases become larger than the statistical accuracy. This implies a flat-sky PCl analysis is accurate for current surveys but a Euclid-like survey will require higher accuracy.
On estimation in k-tree sampling
Christoph Kleinn; Frantisek Vilcko
2007-01-01
The plot design known as k-tree sampling involves taking the k nearest trees from a selected sample point as sample trees. While this plot design is very practical and easily applied in the field for moderate values of k, unbiased estimation remains a problem. In this article, we give a brief introduction to the...
Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield
Robert B. Thomas
1986-01-01
Abstract - SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes which are estimate variance. Sampling probabilities are based on a sediment rating function which promotes greater sampling intensity during periods of high...
Host Galaxy Properties Of The Swift Bat Hard X-ray Survey Of Agn
NASA Astrophysics Data System (ADS)
Koss, Michael; Mushotzky, R.; Veilleux, S.; Winter, L.
2010-03-01
Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. In 2008, we observed 90 of these targets at Kitt Peak with the 2.1m in the SDSS ugriz bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, stellar mass, star formation, and AGN luminosity for a sample of 145 AGN Hard X-ray Selected AGN.
Computational tools for exact conditional logistic regression.
Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P
Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally unfeasible when the sample size or number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that for these moderately large data sets Monte Carlo seems a better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than can the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage. It produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.
Estimating total suspended sediment yield with probability sampling
Robert B. Thomas
1985-01-01
The ""Selection At List Time"" (SALT) scheme controls sampling of concentration for estimating total suspended sediment yield. The probability of taking a sample is proportional to its estimated contribution to total suspended sediment discharge. This procedure gives unbiased estimates of total suspended sediment yield and the variance of the...
Design unbiased estimation in line intersect sampling using segmented transects
David L.R. Affleck; Timothy G. Gregoire; Harry T. Valentine; Harry T. Valentine
2005-01-01
In many applications of line intersect sampling. transects consist of multiple, connected segments in a prescribed configuration. The relationship between the transect configuration and the selection probability of a population element is illustrated and a consistent sampling protocol, applicable to populations composed of arbitrarily shaped elements, is proposed. It...
A modified weighted function method for parameter estimation of Pearson type three distribution
NASA Astrophysics Data System (ADS)
Liang, Zhongmin; Hu, Yiming; Li, Binquan; Yu, Zhongbo
2014-04-01
In this paper, an unconventional method called Modified Weighted Function (MWF) is presented for the conventional moment estimation of a probability distribution function. The aim of MWF is to estimate the coefficient of variation (CV) and coefficient of skewness (CS) from the original higher moment computations to the first-order moment calculations. The estimators for CV and CS of Pearson type three distribution function (PE3) were derived by weighting the moments of the distribution with two weight functions, which were constructed by combining two negative exponential-type functions. The selection of these weight functions was based on two considerations: (1) to relate weight functions to sample size in order to reflect the relationship between the quantity of sample information and the role of weight function and (2) to allocate more weights to data close to medium-tail positions in a sample series ranked in an ascending order. A Monte-Carlo experiment was conducted to simulate a large number of samples upon which statistical properties of MWF were investigated. For the PE3 parent distribution, results of MWF were compared to those of the original Weighted Function (WF) and Linear Moments (L-M). The results indicate that MWF was superior to WF and slightly better than L-M, in terms of statistical unbiasness and effectiveness. In addition, the robustness of MWF, WF, and L-M were compared by designing the Monte-Carlo experiment that samples are obtained from Log-Pearson type three distribution (LPE3), three parameter Log-Normal distribution (LN3), and Generalized Extreme Value distribution (GEV), respectively, but all used as samples from the PE3 distribution. The results show that in terms of statistical unbiasness, no one method possesses the absolutely overwhelming advantage among MWF, WF, and L-M, while in terms of statistical effectiveness, the MWF is superior to WF and L-M.
Xiao, Mengli; Zhang, Yongbo; Fu, Huimin; Wang, Zhihua
2018-05-01
High-precision navigation algorithm is essential for the future Mars pinpoint landing mission. The unknown inputs caused by large uncertainties of atmospheric density and aerodynamic coefficients as well as unknown measurement biases may cause large estimation errors of conventional Kalman filters. This paper proposes a derivative-free version of nonlinear unbiased minimum variance filter for Mars entry navigation. This filter has been designed to solve this problem by estimating the state and unknown measurement biases simultaneously with derivative-free character, leading to a high-precision algorithm for the Mars entry navigation. IMU/radio beacons integrated navigation is introduced in the simulation, and the result shows that with or without radio blackout, our proposed filter could achieve an accurate state estimation, much better than the conventional unscented Kalman filter, showing the ability of high-precision Mars entry navigation algorithm. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
Piecewise SALT sampling for estimating suspended sediment yields
Robert B. Thomas
1989-01-01
A probability sampling method called SALT (Selection At List Time) has been developed for collecting and summarizing data on delivery of suspended sediment in rivers. It is based on sampling and estimating yield using a suspended-sediment rating curve for high discharges and simple random sampling for low flows. The method gives unbiased estimates of total yield and...
Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire
2009-01-01
Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...
Mohammad-Zamani, Mohammad Javad; Neshat, Mohammad; Moravvej-Farshi, Mohammad Kazem
2016-01-15
A new generation unbiased antennaless CW terahertz (THz) photomixer emitters array made of asymmetric metal-semiconductor-metal (MSM) gratings with a subwavelength pitch, operating in the optical near-field regime, is proposed. We take advantage of size effects in near-field optics and electrostatics to demonstrate the possibility of enhancing the THz power by 4 orders of magnitude, compared to a similar unbiased antennaless array of the same size that operates in the far-field regime. We show that, with the appropriate choice of grating parameters in such THz sources, the first plasmonic resonant cavity mode in the nanoslit between two adjacent MSMs can enhance the optical near-field absorption and, hence, the generation of photocarriers under the slit in the active medium. These photocarriers, on the other hand, are accelerated by the large built-in electric field sustained under the nanoslits by two dissimilar Schottky barriers to create the desired large THz power that is mainly radiated downward. The proposed structure can be tuned in a broadband frequency range of 0.1-3 THz, with output power increasing with frequency.
Minimum variance geographic sampling
NASA Technical Reports Server (NTRS)
Terrell, G. R. (Principal Investigator)
1980-01-01
Resource inventories require samples with geographical scatter, sometimes not as widely spaced as would be hoped. A simple model of correlation over distances is used to create a minimum variance unbiased estimate population means. The fitting procedure is illustrated from data used to estimate Missouri corn acreage.
Complex sample survey estimation in static state-space
Raymond L. Czaplewski
2010-01-01
Increased use of remotely sensed data is a key strategy adopted by the Forest Inventory and Analysis Program. However, multiple sensor technologies require complex sampling units and sampling designs. The Recursive Restriction Estimator (RRE) accommodates this complexity. It is a design-consistent Empirical Best Linear Unbiased Prediction for the state-vector, which...
Regression sampling: some results for resource managers and researchers
William G. O' Regan; Robert W. Boyd
1974-01-01
Regression sampling is widely used in natural resources management and research to estimate quantities of resources per unit area. This note brings together results found in the statistical literature in the application of this sampling technique. Conditional and unconditional estimators are listed and for each estimator, exact variances and unbiased estimators for the...
The New Peabody Picture Vocabulary Test-III: An Illusion of Unbiased Assessment?
Stockman, Ida J
2000-10-01
This article examines whether changes in the ethnic minority composition of the standardization sample for the latest edition of the Peabody Picture Vocabulary Test (PPVT-III, Dunn & Dunn, 1997) can be used as the sole explanation for children's better test scores when compared to an earlier edition, the Peabody Picture Vocabulary Test-Revised (PPVT-R, Dunn & Dunn, 1981). Results from a comparative analysis of these two test editions suggest that other factors may explain improved performances. Among these factors are the number of words and age levels sampled, the types of words and pictures used, and characteristics of the standardization sample other than its ethnic minority composition. This analysis also raises questions regarding the usefulness of converting scores from one edition to the other and the type of criteria that could be used to evaluate whether the PPVT-III is an unbiased test of vocabulary for children from diverse cultural and linguistic backgrounds.
Comparison of estimators of standard deviation for hydrologic time series
Tasker, Gary D.; Gilroy, Edward J.
1982-01-01
Unbiasing factors as a function of serial correlation, ρ, and sample size, n for the sample standard deviation of a lag one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ which were much less biased but had greater mean square errors than the usual estimate of σ: s = (1/(n - 1) ∑ (xi −x¯)2)½. The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
NASA Astrophysics Data System (ADS)
Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan
2015-07-01
While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy many demands on resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements in RF classifier are made: a voting-distribution ranked rule for reducing the influences of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing city, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful to studying many environmental and social problems.
Likelihood inference of non-constant diversification rates with incomplete taxon sampling.
Höhna, Sebastian
2014-01-01
Large-scale phylogenies provide a valuable source to study background diversification rates and investigate if the rates have changed over time. Unfortunately most large-scale, dated phylogenies are sparsely sampled (fewer than 5% of the described species) and taxon sampling is not uniform. Instead, taxa are frequently sampled to obtain at least one representative per subgroup (e.g. family) and thus to maximize diversity (diversified sampling). So far, such complications have been ignored, potentially biasing the conclusions that have been reached. In this study I derive the likelihood of a birth-death process with non-constant (time-dependent) diversification rates and diversified taxon sampling. Using simulations I test if the true parameters and the sampling method can be recovered when the trees are small or medium sized (fewer than 200 taxa). The results show that the diversification rates can be inferred and the estimates are unbiased for large trees but are biased for small trees (fewer than 50 taxa). Furthermore, model selection by means of Akaike's Information Criterion favors the true model if the true rates differ sufficiently from alternative models (e.g. the birth-death model is recovered if the extinction rate is large and compared to a pure-birth model). Finally, I applied six different diversification rate models--ranging from a constant-rate pure birth process to a decreasing speciation rate birth-death process but excluding any rate shift models--on three large-scale empirical phylogenies (ants, mammals and snakes with respectively 149, 164 and 41 sampled species). All three phylogenies were constructed by diversified taxon sampling, as stated by the authors. However only the snake phylogeny supported diversified taxon sampling. Moreover, a parametric bootstrap test revealed that none of the tested models provided a good fit to the observed data. The model assumptions, such as homogeneous rates across species or no rate shifts, appear to be violated.
Diallel analysis for sex-linked and maternal effects.
Zhu, J; Weir, B S
1996-01-01
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
Random sampling of elementary flux modes in large-scale metabolic networks.
Machado, Daniel; Soons, Zita; Patil, Kiran Raosaheb; Ferreira, Eugénio C; Rocha, Isabel
2012-09-15
The description of a metabolic network in terms of elementary (flux) modes (EMs) provides an important framework for metabolic pathway analysis. However, their application to large networks has been hampered by the combinatorial explosion in the number of modes. In this work, we develop a method for generating random samples of EMs without computing the whole set. Our algorithm is an adaptation of the canonical basis approach, where we add an additional filtering step which, at each iteration, selects a random subset of the new combinations of modes. In order to obtain an unbiased sample, all candidates are assigned the same probability of getting selected. This approach avoids the exponential growth of the number of modes during computation, thus generating a random sample of the complete set of EMs within reasonable time. We generated samples of different sizes for a metabolic network of Escherichia coli, and observed that they preserve several properties of the full EM set. It is also shown that EM sampling can be used for rational strain design. A well distributed sample, that is representative of the complete set of EMs, should be suitable to most EM-based methods for analysis and optimization of metabolic networks. Source code for a cross-platform implementation in Python is freely available at http://code.google.com/p/emsampler. dmachado@deb.uminho.pt Supplementary data are available at Bioinformatics online.
Host Galaxy Properties of SWIFT Hard X-ray Selected AGN
NASA Astrophysics Data System (ADS)
Koss, Michael; Mushotzky, R.; Veilleux, S.; Winter, L.
2010-01-01
Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of 258 AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. In 2008, we observed 110 of these targets at Kitt Peak with the 2.1m in the SDSS ugriz bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, star formation, and AGN luminosity.
Computer program uses Monte Carlo techniques for statistical system performance analysis
NASA Technical Reports Server (NTRS)
Wohl, D. P.
1967-01-01
Computer program with Monte Carlo sampling techniques determines the effect of a component part of a unit upon the overall system performance. It utilizes the full statistics of the disturbances and misalignments of each component to provide unbiased results through simulated random sampling.
Verifying mixing in dilution tunnels How to ensure cookstove emissions samples are unbiased
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilson, Daniel L.; Rapp, Vi H.; Caubel, Julien J.
A well-mixed diluted sample is essential for unbiased measurement of cookstove emissions. Most cookstove testing labs employ a dilution tunnel, also referred to as a “duct,” to mix clean dilution air with cookstove emissions before sampling. It is important that the emissions be well-mixed and unbiased at the sampling port so that instruments can take representative samples of the emission plume. Some groups have employed mixing baffles to ensure the gaseous and aerosol emissions from cookstoves are well-mixed before reaching the sampling location [2, 4]. The goal of these baffles is to to dilute and mix the emissions stream withmore » the room air entering the fume hood by creating a local zone of high turbulence. However, potential drawbacks of mixing baffles include increased flow resistance (larger blowers needed for the same exhaust flow), nuisance cleaning of baffles as soot collects, and, importantly, the potential for loss of PM2.5 particles on the baffles themselves, thus biasing results. A cookstove emission monitoring system with baffles will collect particles faster than the duct’s walls alone. This is mostly driven by the available surface area for deposition by processes of Brownian diffusion (through the boundary layer) and turbophoresis (i.e. impaction). The greater the surface area available for diffusive and advection-driven deposition to occur, the greater the particle loss will be at the sampling port. As a layer of larger particle “fuzz” builds on the mixing baffles, even greater PM2.5 loss could occur. The micro structure of the deposited aerosol will lead to increased rates of particle loss by interception and a tendency for smaller particles to deposit due to impaction on small features of the micro structure. If the flow stream could be well-mixed without the need for baffles, these drawbacks could be avoided and the cookstove emissions sampling system would be more robust.« less
Chiu, Charles Y
2015-01-01
Viral pathogen discovery is of critical importance to clinical microbiology, infectious diseases, and public health. Genomic approaches for pathogen discovery, including consensus polymerase chain reaction (PCR), microarrays, and unbiased next-generation sequencing (NGS), have the capacity to comprehensively identify novel microbes present in clinical samples. Although numerous challenges remain to be addressed, including the bioinformatics analysis and interpretation of large datasets, these technologies have been successful in rapidly identifying emerging outbreak threats, screening vaccines and other biological products for microbial contamination, and discovering novel viruses associated with both acute and chronic illnesses. Downstream studies such as genome assembly, epidemiologic screening, and a culture system or animal model of infection are necessary to establish an association of a candidate pathogen with disease. PMID:23725672
Kraschnewski, Jennifer L; Keyserling, Thomas C; Bangdiwala, Shrikant I; Gizlice, Ziya; Garcia, Beverly A; Johnston, Larry F; Gustafson, Alison; Petrovic, Lindsay; Glasgow, Russell E; Samuel-Hodge, Carmen D
2010-01-01
Studies of type 2 translation, the adaption of evidence-based interventions to real-world settings, should include representative study sites and staff to improve external validity. Sites for such studies are, however, often selected by convenience sampling, which limits generalizability. We used an optimized probability sampling protocol to select an unbiased, representative sample of study sites to prepare for a randomized trial of a weight loss intervention. We invited North Carolina health departments within 200 miles of the research center to participate (N = 81). Of the 43 health departments that were eligible, 30 were interested in participating. To select a representative and feasible sample of 6 health departments that met inclusion criteria, we generated all combinations of 6 from the 30 health departments that were eligible and interested. From the subset of combinations that met inclusion criteria, we selected 1 at random. Of 593,775 possible combinations of 6 counties, 15,177 (3%) met inclusion criteria. Sites in the selected subset were similar to all eligible sites in terms of health department characteristics and county demographics. Optimized probability sampling improved generalizability by ensuring an unbiased and representative sample of study sites.
Topography of acute stroke in a sample of 439 right brain damaged patients.
Sperber, Christoph; Karnath, Hans-Otto
2016-01-01
Knowledge of the typical lesion topography and volumetry is important for clinical stroke diagnosis as well as for anatomo-behavioral lesion mapping analyses. Here we used modern lesion analysis techniques to examine the naturally occurring lesion patterns caused by ischemic and by hemorrhagic infarcts in a large, representative acute stroke patient sample. Acute MR and CT imaging of 439 consecutively admitted right-hemispheric stroke patients from a well-defined catchment area suffering from ischemia (n = 367) or hemorrhage (n = 72) were normalized and mapped in reference to stereotaxic anatomical atlases. For ischemic infarcts, highest frequencies of stroke were observed in the insula, putamen, operculum and superior temporal cortex, as well as the inferior and superior occipito-frontal fascicles, superior longitudinal fascicle, uncinate fascicle, and the acoustic radiation. The maximum overlay of hemorrhages was located more posteriorly and more medially, involving posterior areas of the insula, Heschl's gyrus, and putamen. Lesion size was largest in frontal and anterior areas and lowest in subcortical and posterior areas. The large and unbiased sample of stroke patients used in the present study accumulated the different sub-patterns to identify the global topographic and volumetric pattern of right hemisphere stroke in humans.
Two-stage sequential sampling: A neighborhood-free adaptive sampling procedure
Salehi, M.; Smith, D.R.
2005-01-01
Designing an efficient sampling scheme for a rare and clustered population is a challenging area of research. Adaptive cluster sampling, which has been shown to be viable for such a population, is based on sampling a neighborhood of units around a unit that meets a specified condition. However, the edge units produced by sampling neighborhoods have proven to limit the efficiency and applicability of adaptive cluster sampling. We propose a sampling design that is adaptive in the sense that the final sample depends on observed values, but it avoids the use of neighborhoods and the sampling of edge units. Unbiased estimators of population total and its variance are derived using Murthy's estimator. The modified two-stage sampling design is easy to implement and can be applied to a wider range of populations than adaptive cluster sampling. We evaluate the proposed sampling design by simulating sampling of two real biological populations and an artificial population for which the variable of interest took the value either 0 or 1 (e.g., indicating presence and absence of a rare event). We show that the proposed sampling design is more efficient than conventional sampling in nearly all cases. The approach used to derive estimators (Murthy's estimator) opens the door for unbiased estimators to be found for similar sequential sampling designs. ?? 2005 American Statistical Association and the International Biometric Society.
Migration monitoring with automated technology
Rhonda L. Millikin
2005-01-01
Automated technology can supplement ground-based methods of migration monitoring by providing: (1) unbiased and automated sampling; (2) independent validation of current methods; (3) a larger sample area for landscape-level analysis of habitat selection for stopover, and (4) an opportunity to study flight behavior. In particular, radar-acoustic sensor fusion can...
Suitability of the line intersect method for sampling hardwood logging residues
A. Jeff Martin
1976-01-01
The line intersect method of sampling logging residues was tested in Appalachian hardwoods and was found to provide unbiased estimates of the volume of residue in cubic feet per acre. Thirty-two chains of sample line were established on each of sixteen 1-acre plots on cutover areas in a variety of conditions. Estimates from these samples were then compared to actual...
da Costa, Nuno Maçarico; Hepp, Klaus; Martin, Kevan A C
2009-05-30
Synapses can only be morphologically identified by electron microscopy and this is often a very labor-intensive and time-consuming task. When quantitative estimates are required for pathways that contribute a small proportion of synapses to the neuropil, the problems of accurate sampling are particularly severe and the total time required may become prohibitive. Here we present a sampling method devised to count the percentage of rarely occurring synapses in the neuropil using a large sample (approximately 1000 sampling sites), with the strong constraint of doing it in reasonable time. The strategy, which uses the unbiased physical disector technique, resembles that used in particle physics to detect rare events. We validated our method in the primary visual cortex of the cat, where we used biotinylated dextran amine to label thalamic afferents and measured the density of their synapses using the physical disector method. Our results show that we could obtain accurate counts of the labeled synapses, even when they represented only 0.2% of all the synapses in the neuropil.
Rogers, Paul; Stoner, Julie
2016-01-01
Regression models for correlated binary outcomes are commonly fit using a Generalized Estimating Equations (GEE) methodology. GEE uses the Liang and Zeger sandwich estimator to produce unbiased standard error estimators for regression coefficients in large sample settings even when the covariance structure is misspecified. The sandwich estimator performs optimally in balanced designs when the number of participants is large, and there are few repeated measurements. The sandwich estimator is not without drawbacks; its asymptotic properties do not hold in small sample settings. In these situations, the sandwich estimator is biased downwards, underestimating the variances. In this project, a modified form for the sandwich estimator is proposed to correct this deficiency. The performance of this new sandwich estimator is compared to the traditional Liang and Zeger estimator as well as alternative forms proposed by Morel, Pan and Mancl and DeRouen. The performance of each estimator was assessed with 95% coverage probabilities for the regression coefficient estimators using simulated data under various combinations of sample sizes and outcome prevalence values with an Independence (IND), Autoregressive (AR) and Compound Symmetry (CS) correlation structure. This research is motivated by investigations involving rare-event outcomes in aviation data. PMID:26998504
Within-subject template estimation for unbiased longitudinal image analysis.
Reuter, Martin; Schmansky, Nicholas J; Rosas, H Diana; Fischl, Bruce
2012-07-16
Longitudinal image analysis has become increasingly important in clinical studies of normal aging and neurodegenerative disorders. Furthermore, there is a growing appreciation of the potential utility of longitudinally acquired structural images and reliable image processing to evaluate disease modifying therapies. Challenges have been related to the variability that is inherent in the available cross-sectional processing tools, to the introduction of bias in longitudinal processing and to potential over-regularization. In this paper we introduce a novel longitudinal image processing framework, based on unbiased, robust, within-subject template creation, for automatic surface reconstruction and segmentation of brain MRI of arbitrarily many time points. We demonstrate that it is essential to treat all input images exactly the same as removing only interpolation asymmetries is not sufficient to remove processing bias. We successfully reduce variability and avoid over-regularization by initializing the processing in each time point with common information from the subject template. The presented results show a significant increase in precision and discrimination power while preserving the ability to detect large anatomical deviations; as such they hold great potential in clinical applications, e.g. allowing for smaller sample sizes or shorter trials to establish disease specific biomarkers or to quantify drug effects. Copyright © 2012 Elsevier Inc. All rights reserved.
Robert B. Thomas; Jack Lewis
1993-01-01
Time-stratified sampling of sediment for estimating suspended load is introduced and compared to selection at list time (SALT) sampling. Both methods provide unbiased estimates of load and variance. The magnitude of the variance of the two methods is compared using five storm populations of suspended sediment flux derived from turbidity data. Under like conditions,...
Unbiased Estimation of Refractive State of Aberrated Eyes
Martin, Jesson; Vasudevan, Balamurali; Himebaugh, Nikole; Bradley, Arthur; Thibos, Larry
2011-01-01
To identify unbiased methods for estimating the target vergence required to maximize visual acuity based on wavefront aberration measurements. Experiments were designed to minimize the impact of confounding factors that have hampered previous research. Objective wavefront refractions and subjective acuity refractions were obtained for the same monochromatic wavelength. Accommodation and pupil fluctuations were eliminated by cycloplegia. Unbiased subjective refractions that maximize visual acuity for high contrast letters were performed with a computer controlled forced choice staircase procedure, using 0.125 diopter steps of defocus. All experiments were performed for two pupil diameters (3mm and 6mm). As reported in the literature, subjective refractive error does not change appreciably when the pupil dilates. For 3 mm pupils most metrics yielded objective refractions that were about 0.1D more hyperopic than subjective acuity refractions. When pupil diameter increased to 6 mm, this bias changed in the myopic direction and the variability between metrics also increased. These inaccuracies were small compared to the precision of the measurements, which implies that most metrics provided unbiased estimates of refractive state for medium and large pupils. A variety of image quality metrics may be used to determine ocular refractive state for monochromatic (635nm) light, thereby achieving accurate results without the need for empirical correction factors. PMID:21777601
NASA Astrophysics Data System (ADS)
Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.
2017-12-01
Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications on remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
Liu, Dajiang J; Leal, Suzanne M
2012-10-05
Next-generation sequencing has led to many complex-trait rare-variant (RV) association studies. Although single-variant association analysis can be performed, it is grossly underpowered. Therefore, researchers have developed many RV association tests that aggregate multiple variant sites across a genetic region (e.g., gene), and test for the association between the trait and the aggregated genotype. After these aggregate tests detect an association, it is only possible to estimate the average genetic effect for a group of RVs. As a result of the "winner's curse," such an estimate can be biased. Although for common variants one can obtain unbiased estimates of genetic parameters by analyzing a replication sample, for RVs it is desirable to obtain unbiased genetic estimates for the study where the association is identified. This is because there can be substantial heterogeneity of RV sites and frequencies even among closely related populations. In order to obtain an unbiased estimate for aggregated RV analysis, we developed bootstrap-sample-split algorithms to reduce the bias of the winner's curse. The unbiased estimates are greatly important for understanding the population-specific contribution of RVs to the heritability of complex traits. We also demonstrate both theoretically and via simulations that for aggregate RV analysis the genetic variance for a gene or region will always be underestimated, sometimes substantially, because of the presence of noncausal variants or because of the presence of causal variants with effects of different magnitudes or directions. Therefore, even if RVs play a major role in the complex-trait etiologies, a portion of the heritability will remain missing, and the contribution of RVs to the complex-trait etiologies will be underestimated. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Conformational free energies of methyl-α-L-iduronic and methyl-β-D-glucuronic acids in water
NASA Astrophysics Data System (ADS)
Babin, Volodymyr; Sagui, Celeste
2010-03-01
We present a simulation protocol that allows for efficient sampling of the degrees of freedom of a solute in explicit solvent. The protocol involves using a nonequilibrium umbrella sampling method, in this case, the recently developed adaptively biased molecular dynamics method, to compute an approximate free energy for the slow modes of the solute in explicit solvent. This approximate free energy is then used to set up a Hamiltonian replica exchange scheme that samples both from biased and unbiased distributions. The final accurate free energy is recovered via the weighted histogram analysis technique applied to all the replicas, and equilibrium properties of the solute are computed from the unbiased trajectory. We illustrate the approach by applying it to the study of the puckering landscapes of the methyl glycosides of α-L-iduronic acid and its C5 epimer β-D-glucuronic acid in water. Big savings in computational resources are gained in comparison to the standard parallel tempering method.
Conformational free energies of methyl-alpha-L-iduronic and methyl-beta-D-glucuronic acids in water.
Babin, Volodymyr; Sagui, Celeste
2010-03-14
We present a simulation protocol that allows for efficient sampling of the degrees of freedom of a solute in explicit solvent. The protocol involves using a nonequilibrium umbrella sampling method, in this case, the recently developed adaptively biased molecular dynamics method, to compute an approximate free energy for the slow modes of the solute in explicit solvent. This approximate free energy is then used to set up a Hamiltonian replica exchange scheme that samples both from biased and unbiased distributions. The final accurate free energy is recovered via the weighted histogram analysis technique applied to all the replicas, and equilibrium properties of the solute are computed from the unbiased trajectory. We illustrate the approach by applying it to the study of the puckering landscapes of the methyl glycosides of alpha-L-iduronic acid and its C5 epimer beta-D-glucuronic acid in water. Big savings in computational resources are gained in comparison to the standard parallel tempering method.
Using object-based image analysis to guide the selection of field sample locations
USDA-ARS?s Scientific Manuscript database
One of the most challenging tasks for resource management and research is designing field sampling schemes to achieve unbiased estimates of ecosystem parameters as efficiently as possible. This study focused on the potential of fine-scale image objects from object-based image analysis (OBIA) to be u...
Poisson sampling - The adjusted and unadjusted estimator revisited
Michael S. Williams; Hans T. Schreuder; Gerardo H. Terrazas
1998-01-01
The prevailing assumption, that for Poisson sampling the adjusted estimator "Y-hat a" is always substantially more efficient than the unadjusted estimator "Y-hat u" , is shown to be incorrect. Some well known theoretical results are applicable since "Y-hat a" is a ratio-of-means estimator and "Y-hat u" a simple unbiased estimator...
Precise, unbiased estimates of population size are an essential tool for fisheries management. For a wide variety of salmonid fishes, redd counts from a sample of reaches are commonly used to monitor annual trends in abundance. Using a 9-year time series of georeferenced censuses...
Evaluation and optimization of sampling errors for the Monte Carlo Independent Column Approximation
NASA Astrophysics Data System (ADS)
Räisänen, Petri; Barker, W. Howard
2004-07-01
The Monte Carlo Independent Column Approximation (McICA) method for computing domain-average broadband radiative fluxes is unbiased with respect to the full ICA, but its flux estimates contain conditional random noise. McICA's sampling errors are evaluated here using a global climate model (GCM) dataset and a correlated-k distribution (CKD) radiation scheme. Two approaches to reduce McICA's sampling variance are discussed. The first is to simply restrict all of McICA's samples to cloudy regions. This avoids wasting precious few samples on essentially homogeneous clear skies. Clear-sky fluxes need to be computed separately for this approach, but this is usually done in GCMs for diagnostic purposes anyway. Second, accuracy can be improved by repeated sampling, and averaging those CKD terms with large cloud radiative effects. Although this naturally increases computational costs over the standard CKD model, random errors for fluxes and heating rates are reduced by typically 50% to 60%, for the present radiation code, when the total number of samples is increased by 50%. When both variance reduction techniques are applied simultaneously, globally averaged flux and heating rate random errors are reduced by a factor of #3.
Mixed model approaches for diallel analysis based on a bio-model.
Zhu, J; Weir, B S
1996-12-01
A MINQUE(1) procedure, which is minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all the prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML) and MINQUE theta which has parameter values for the prior values. MINQUE(1) is almost as efficient as MINQUE theta for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jack-knife procedure is suggested for estimation of sampling variances of estimated variance and covariance components and of predicted genetic effects. Worked examples are given for estimation of variance and covariance components and for prediction of genetic merits.
Gao, Hongying; Deng, Shibing; Obach, R Scott
2015-12-01
An unbiased scanning methodology using ultra high-performance liquid chromatography coupled with high-resolution mass spectrometry was used to bank data and plasma samples for comparing the data generated at different dates. This method was applied to bank the data generated earlier in animal samples and then to compare the exposure to metabolites in animal versus human for safety assessment. With neither authentic standards nor prior knowledge of the identities and structures of metabolites, full scans for precursor ions and all ion fragments (AIF) were employed with a generic gradient LC method to analyze plasma samples at positive and negative polarity, respectively. In a total of 22 tested drugs and metabolites, 21 analytes were detected using this unbiased scanning method except that naproxen was not detected due to low sensitivity at negative polarity and interference at positive polarity; and 4'- or 5-hydroxy diclofenac was not separated by a generic UPLC method. Statistical analysis of the peak area ratios of the analytes versus the internal standard in five repetitive analyses over approximately 1 year demonstrated that the analysis variation was significantly different from sample instability. The confidence limits for comparing the exposure using peak area ratio of metabolites in animal plasma versus human plasma measured over approximately 1 year apart were comparable to the analysis undertaken side by side on the same days. These statistical analysis results showed it was feasible to compare data generated at different dates with neither authentic standards nor prior knowledge of the analytes.
Harry V., Jr. Wiant; Michael L. Spangler; John E. Baumgras
2002-01-01
Various taper systems and the centroid method were compared to unbiased volume estimates made by importance sampling for 720 hardwood trees selected throughout the state of West Virginia. Only the centroid method consistently gave volumes estimates that did not differ significantly from those made by importance sampling, although some taper equations did well for most...
Model Reduction via Principe Component Analysis and Markov Chain Monte Carlo (MCMC) Methods
NASA Astrophysics Data System (ADS)
Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.
2011-12-01
Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on parameterization and problems undertaking. This makes inverse estimation and uncertainty quantification very challenging, especially for those problems in two- or three-dimensional spatial domains. Model reduction technique has the potential of mitigating the curse of dimensionality by reducing total numbers of unknowns while describing the complex subsurface systems adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA methods with geometric sampling (referred to as 'Method 1'), (2) PCA methods with MCMC sampling (referred to as 'Method 2'), and (3) PCA methods with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with noise added and invert them under two different situations: (1) the noised data and the covariance matrix for PCA analysis are consistent (referred to as the unbiased case), and (2) the noise data and the covariance matrix are inconsistent (referred to as biased case). In the unbiased case, comparison between the analytical solutions and the inversion results show that all three methods provide good estimates of the true values and Method 1 is computationally more efficient. In terms of uncertainty quantification, Method 1 performs poorly because of relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to inclusion of random effects. However, in the biased case, only Method 3 correctly estimates all the unknown parameters, and both Methods 1 and 2 provide wrong values for the biased parameters. The synthetic case study demonstrates that if the covariance matrix for PCA analysis is inconsistent with true models, the PCA methods with geometric or MCMC sampling will provide incorrect estimates.
Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries
McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.
2013-01-01
Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
On the degrees of freedom of reduced-rank estimators in multivariate regression
Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.
2015-01-01
Summary We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155
Host Galaxy Morphologies Of Hard X-ray Selected AGN From The Swift BAT Survey
NASA Astrophysics Data System (ADS)
Koss, Michael; Mushotzky, R.; Veilleux, S.
2009-01-01
Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of 258 AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. For these host galaxies, only a fraction, 29%, have high quality optical images, predominately from the SDSS. In addition, about 33% show peculiar morphologies and interaction. In 2008, we observed 110 of these targets at Kitt Peak with the 2.1m in the SDSS bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, star formation, and AGN luminosity.
On fixed-area plot sampling for downed coarse woody debris
Jeffrey H. Gove; Paul C. Van Deusen
2011-01-01
The use of fixed-area plots for sampling down coarse woody debris is reviewed. A set of clearly defined protocols for two previously described methods is established and a new method, which we call the 'sausage' method, is developed. All methods (protocols) are shown to be unbiased for volume estimation, but not necessarily for estimation of population...
High-Dimensional Multivariate Repeated Measures Analysis with Unequal Covariance Matrices.
Harrar, Solomon W; Kong, Xiaoli
2015-03-01
In this paper, test statistics for repeated measures design are introduced when the dimension is large. By large dimension is meant the number of repeated measures and the total sample size grow together but either one could be larger than the other. Asymptotic distribution of the statistics are derived for the equal as well as unequal covariance cases in the balanced as well as unbalanced cases. The asymptotic framework considered requires proportional growth of the sample sizes and the dimension of the repeated measures in the unequal covariance case. In the equal covariance case, one can grow at much faster rate than the other. The derivations of the asymptotic distributions mimic that of Central Limit Theorem with some important peculiarities addressed with sufficient rigor. Consistent and unbiased estimators of the asymptotic variances, which make efficient use of all the observations, are also derived. Simulation study provides favorable evidence for the accuracy of the asymptotic approximation under the null hypothesis. Power simulations have shown that the new methods have comparable power with a popular method known to work well in low-dimensional situation but the new methods have shown enormous advantage when the dimension is large. Data from Electroencephalograph (EEG) experiment is analyzed to illustrate the application of the results.
Large-scale serum protein biomarker discovery in Duchenne muscular dystrophy.
Hathout, Yetrib; Brody, Edward; Clemens, Paula R; Cripe, Linda; DeLisle, Robert Kirk; Furlong, Pat; Gordish-Dressman, Heather; Hache, Lauren; Henricson, Erik; Hoffman, Eric P; Kobayashi, Yvonne Monique; Lorts, Angela; Mah, Jean K; McDonald, Craig; Mehler, Bob; Nelson, Sally; Nikrad, Malti; Singer, Britta; Steele, Fintan; Sterling, David; Sweeney, H Lee; Williams, Steve; Gold, Larry
2015-06-09
Serum biomarkers in Duchenne muscular dystrophy (DMD) may provide deeper insights into disease pathogenesis, suggest new therapeutic approaches, serve as acute read-outs of drug effects, and be useful as surrogate outcome measures to predict later clinical benefit. In this study a large-scale biomarker discovery was performed on serum samples from patients with DMD and age-matched healthy volunteers using a modified aptamer-based proteomics technology. Levels of 1,125 proteins were quantified in serum samples from two independent DMD cohorts: cohort 1 (The Parent Project Muscular Dystrophy-Cincinnati Children's Hospital Medical Center), 42 patients with DMD and 28 age-matched normal volunteers; and cohort 2 (The Cooperative International Neuromuscular Research Group, Duchenne Natural History Study), 51 patients with DMD and 17 age-matched normal volunteers. Forty-four proteins showed significant differences that were consistent in both cohorts when comparing DMD patients and healthy volunteers at a 1% false-discovery rate, a large number of significant protein changes for such a small study. These biomarkers can be classified by known cellular processes and by age-dependent changes in protein concentration. Our findings demonstrate both the utility of this unbiased biomarker discovery approach and suggest potential new diagnostic and therapeutic avenues for ameliorating the burden of DMD and, we hope, other rare and devastating diseases.
High-Dimensional Multivariate Repeated Measures Analysis with Unequal Covariance Matrices
Harrar, Solomon W.; Kong, Xiaoli
2015-01-01
In this paper, test statistics for repeated measures design are introduced when the dimension is large. By large dimension is meant the number of repeated measures and the total sample size grow together but either one could be larger than the other. Asymptotic distribution of the statistics are derived for the equal as well as unequal covariance cases in the balanced as well as unbalanced cases. The asymptotic framework considered requires proportional growth of the sample sizes and the dimension of the repeated measures in the unequal covariance case. In the equal covariance case, one can grow at much faster rate than the other. The derivations of the asymptotic distributions mimic that of Central Limit Theorem with some important peculiarities addressed with sufficient rigor. Consistent and unbiased estimators of the asymptotic variances, which make efficient use of all the observations, are also derived. Simulation study provides favorable evidence for the accuracy of the asymptotic approximation under the null hypothesis. Power simulations have shown that the new methods have comparable power with a popular method known to work well in low-dimensional situation but the new methods have shown enormous advantage when the dimension is large. Data from Electroencephalograph (EEG) experiment is analyzed to illustrate the application of the results. PMID:26778861
Mutually unbiased product bases for multiple qudits
DOE Office of Scientific and Technical Information (OSTI.GOV)
McNulty, Daniel; Pammer, Bogdan; Weigert, Stefan
We investigate the interplay between mutual unbiasedness and product bases for multiple qudits of possibly different dimensions. A product state of such a system is shown to be mutually unbiased to a product basis only if each of its factors is mutually unbiased to all the states which occur in the corresponding factors of the product basis. This result implies both a tight limit on the number of mutually unbiased product bases which the system can support and a complete classification of mutually unbiased product bases for multiple qubits or qutrits. In addition, only maximally entangled states can be mutuallymore » unbiased to a maximal set of mutually unbiased product bases.« less
Identifying currents in the gene pool for bacterial populations using an integrative approach.
Tang, Jing; Hanage, William P; Fraser, Christophe; Corander, Jukka
2009-08-01
The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with an phylogenetic analysis. The statistical tools introduced here are freely available in BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.
Lü, Xiaoshu; Takala, Esa-Pekka; Toppila, Esko; Marjanen, Ykä; Kaila-Kangas, Leena; Lu, Tao
2017-08-01
Exposure to whole-body vibration (WBV) presents an occupational health risk and several safety standards obligate to measure WBV. The high cost of direct measurements in large epidemiological studies raises the question of the optimal sampling for estimating WBV exposures given by a large variation in exposure levels in real worksites. This paper presents a new approach to addressing this problem. A daily exposure to WBV was recorded for 9-24 days among 48 all-terrain vehicle drivers. Four data-sets based on root mean squared recordings were obtained from the measurement. The data were modelled using semi-variogram with spectrum analysis and the optimal sampling scheme was derived. The optimum sampling period was 140 min apart. The result was verified and validated in terms of its accuracy and statistical power. Recordings of two to three hours are probably needed to get a sufficiently unbiased daily WBV exposure estimate in real worksites. The developed model is general enough that is applicable to other cumulative exposures or biosignals. Practitioner Summary: Exposure to whole-body vibration (WBV) presents an occupational health risk and safety standards obligate to measure WBV. However, direct measurements can be expensive. This paper presents a new approach to addressing this problem. The developed model is general enough that is applicable to other cumulative exposures or biosignals.
Kolmogorov-Smirnov test for spatially correlated data
Olea, R.A.; Pawlowsky-Glahn, V.
2009-01-01
The Kolmogorov-Smirnov test is a convenient method for investigating whether two underlying univariate probability distributions can be regarded as undistinguishable from each other or whether an underlying probability distribution differs from a hypothesized distribution. Application of the test requires that the sample be unbiased and the outcomes be independent and identically distributed, conditions that are violated in several degrees by spatially continuous attributes, such as topographical elevation. A generalized form of the bootstrap method is used here for the purpose of modeling the distribution of the statistic D of the Kolmogorov-Smirnov test. The innovation is in the resampling, which in the traditional formulation of bootstrap is done by drawing from the empirical sample with replacement presuming independence. The generalization consists of preparing resamplings with the same spatial correlation as the empirical sample. This is accomplished by reading the value of unconditional stochastic realizations at the sampling locations, realizations that are generated by simulated annealing. The new approach was tested by two empirical samples taken from an exhaustive sample closely following a lognormal distribution. One sample was a regular, unbiased sample while the other one was a clustered, preferential sample that had to be preprocessed. Our results show that the p-value for the spatially correlated case is always larger that the p-value of the statistic in the absence of spatial correlation, which is in agreement with the fact that the information content of an uncorrelated sample is larger than the one for a spatially correlated sample of the same size. ?? Springer-Verlag 2008.
NASA Astrophysics Data System (ADS)
Trenti, Michele
2017-08-01
Hubble's WFC3 has been a game changer for the study of early galaxy formation in the first 700 Myr after the Big Bang. Reliable samples of sources to redshift z 11, which can be discovered only from space, are now constraining the evolution of the galaxy luminosity function into the epoch of reionization. Unexpectedly but excitingly, the recent spectroscopic confirmations of L>L* galaxies at z>8.5 demonstrate that objects brighter than our own Galaxy are already present 500 Myr after the Big Bang, creating a challenge to current theoretical/numerical models that struggle to explain how galaxies can grow so luminous so quickly. Yet, the existing HST observations do not cover sufficient area, nor sample a large enough diversity of environments to provide an unbiased sample of sources, especially at z 9-11 where only a handful of bright candidates are known. To double this currently insufficient sample size, to constrain effectively the bright-end of the galaxy luminosity function at z 9-10, and to provide targets for follow-up imaging and spectroscopy with JWST, we propose a large-area pure-parallel survey that will discover the Brightest of Reionizing Galaxies (BoRG[4JWST]). We will observe 580 arcmin^2 over 125 sightlines in five WFC3 bands (0.35 to 1.7 micron) using high-quality pure-parallel opportunities available in the cycle (3 orbits or longer). These public observations will identify more than 80 intrinsically bright galaxies at z 8-11, investigate the connection between halo mass, star formation and feedback in progenitors of groups and clusters, and build HST lasting legacy of large-area, near-IR imaging.
High levels of absorption in orientation-unbiased, radio-selected 3CR Active Galaxies
NASA Astrophysics Data System (ADS)
Wilkes, Belinda J.; Haas, Martin; Barthel, Peter; Leipski, Christian; Kuraszkiewicz, Joanna; Worrall, Diana; Birkinshaw, Mark; Willner, Steven P.
2014-08-01
A critical problem in understanding active galaxies (AGN) is the separation of intrinsic physical differences from observed differences that are due to orientation. Obscuration of the active nucleus is anisotropic and strongly frequency dependent leading to complex selection effects for observations in most wavebands. These can only be quantified using a sample that is sufficiently unbiased to test orientation effects. Low-frequency radio emission is one way to select a close-to orientation-unbiased sample, albeit limited to the minority of AGN with strong radio emission.Recent Chandra, Spitzer and Herschel observations combined with multi-wavelength data for a complete sample of high-redshift (1
Jean-Yves Courbois; Stephen L. Katz; Daniel J. Isaak; E. Ashley Steel; Russell F. Thurow; A. Michelle Wargo Rub; Tony Olsen; Chris E. Jordan
2008-01-01
Precise, unbiased estimates of population size are an essential tool for fisheries management. For a wide variety of salmonid fishes, redd counts from a sample of reaches are commonly used to monitor annual trends in abundance. Using a 9-year time series of georeferenced censuses of Chinook salmon (Oncorhynchus tshawytscha) redds from central Idaho,...
Cluster Masses Derived from X-ray and Sunyaev-Zeldovich Effect Measurements
NASA Technical Reports Server (NTRS)
Laroque, S.; Joy, Marshall; Bonamente, M.; Carlstrom, J.; Dawson, K.
2003-01-01
We infer the gas mass and total gravitational mass of 11 clusters using two different methods; analysis of X-ray data from the Chandra X-ray Observatory and analysis of centimeter-wave Sunyaev-Zel'dovich Effect (SZE) data from the BIMA and OVRO interferometers. This flux-limited sample of clusters from the BCS cluster catalogue was chosen so as to be well above the surface brightness limit of the ROSAT All Sky Survey; this is therefore an orientation unbiased sample. The gas mass fraction, f_g, is calculated for each cluster using both X-ray and SZE data, and the results are compared at a fiducial radius of r_500. Comparison of the X-ray and SZE results for this orientation unbiased sample allows us to constrain cluster systematics, such as clumping of the intracluster medium. We derive an upper limit on Omega_M assuming that the mass composition of clusters within r_500 reflects the universal mass composition Omega_M h_100 is greater than Omega _B / f-g. We also demonstrate how the mean f_g derived from the sample can be used to estimate the masses of clusters discovered by upcoming deep SZE surveys.
NASA Technical Reports Server (NTRS)
Chapman, G. M. (Principal Investigator); Carnes, J. G.
1981-01-01
Several techniques which use clusters generated by a new clustering algorithm, CLASSY, are proposed as alternatives to random sampling to obtain greater precision in crop proportion estimation: (1) Proportional Allocation/relative count estimator (PA/RCE) uses proportional allocation of dots to clusters on the basis of cluster size and a relative count cluster level estimate; (2) Proportional Allocation/Bayes Estimator (PA/BE) uses proportional allocation of dots to clusters and a Bayesian cluster-level estimate; and (3) Bayes Sequential Allocation/Bayesian Estimator (BSA/BE) uses sequential allocation of dots to clusters and a Bayesian cluster level estimate. Clustering in an effective method in making proportion estimates. It is estimated that, to obtain the same precision with random sampling as obtained by the proportional sampling of 50 dots with an unbiased estimator, samples of 85 or 166 would need to be taken if dot sets with AI labels (integrated procedure) or ground truth labels, respectively were input. Dot reallocation provides dot sets that are unbiased. It is recommended that these proportion estimation techniques are maintained, particularly the PA/BE because it provides the greatest precision.
YSO Jets in the Galactic Plane from UWISH2. IV. Jets and Outflows in Cygnus-X
NASA Astrophysics Data System (ADS)
Makin, S. V.; Froebrich, D.
2018-01-01
We have performed an unbiased search for outflows from young stars in Cygnus-X using 42 deg2 of data from the UKIRT Widefield Infrared Survey for H2 (UWISH2 Survey), to identify shock-excited near-IR H2 emission in the 1–0 S(1) 2.122 μm line. We uncovered 572 outflows, of which 465 are new discoveries, increasing the number of known objects by more than 430%. This large and unbiased sample allows us to statistically determine the typical properties of outflows from young stars. We found 261 bipolar outflows, and 16% of these are parsec scale. The typical bipolar outflow is 0.45 pc in length and has gaps of 0.025–0.1 pc between large knots. The median luminosity in the 1–0 S(1) line is 10‑3 {L}ȯ . The bipolar flows are typically asymmetrical, with the two lobes misaligned by 5°, one lobe 30% shorter than the other, and one lobe twice as bright as the other. Of the remaining outflows, 152 are single-sided and 159 are groups of extended, shock-excited H2 emission without identifiable driving sources. Half of all driving sources have sufficient WISE data to determine their evolutionary status as either protostars (80%) or classical T Tauri stars (20%). One-fifth of the driving sources are variable by more than 0.5 mag in the K-band continuum over several years. Several of the newly identified outflows provide excellent targets for follow-up studies. We particularly encourage the study of the outflows and young stars identified in a bright-rimmed cloud near IRAS 20294+4255, which seems to represent a textbook example of triggered star formation.
New Approach for Investigating Reaction Dynamics and Rates with Ab Initio Calculations.
Fleming, Kelly L; Tiwary, Pratyush; Pfaendtner, Jim
2016-01-21
Herein, we demonstrate a convenient approach to systematically investigate chemical reaction dynamics using the metadynamics (MetaD) family of enhanced sampling methods. Using a symmetric SN2 reaction as a model system, we applied infrequent metadynamics, a theoretical framework based on acceleration factors, to quantitatively estimate the rate of reaction from biased and unbiased simulations. A systematic study of the algorithm and its application to chemical reactions was performed by sampling over 5000 independent reaction events. Additionally, we quantitatively reweighed exhaustive free-energy calculations to obtain the reaction potential-energy surface and showed that infrequent metadynamics works to effectively determine Arrhenius-like activation energies. Exact agreement with unbiased high-temperature kinetics is also shown. The feasibility of using the approach on actual ab initio molecular dynamics calculations is then presented by using Car-Parrinello MD+MetaD to sample the same reaction using only 10-20 calculations of the rare event. Owing to the ease of use and comparatively low-cost of computation, the approach has extensive potential applications for catalysis, combustion, pyrolysis, and enzymology.
Simulation of design-unbiased point-to-particle sampling compared to alternatives on plantation rows
Thomas B. Lynch; David Hamlin; Mark J. Ducey
2016-01-01
Total quantities of tree attributes can be estimated in plantations by sampling on plantation rows using several methods. At random sample points on a row, either fixed row lengths or variable row lengths with a fixed number of sample trees can be assessed. Ratio of means or mean of ratios estimators can be developed for the fixed number of trees option but are not...
NASA Astrophysics Data System (ADS)
Peter, Emanuel K.
2017-12-01
In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path-variable L based on the un-biased momenta p and displacements dq for the definition of the bias s applied to the system and derive 3 algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method which combines the first 2 methodologies. Through the analysis of the correlations between the bias and the un-biased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and acceleration in the sampling of the phase space. We apply our method on SPC/E water, where we find a conservation of the average water structure. We then use our method to sample dialanine and the folding of TrpCage, where we find a good agreement with simulation data reported in the literature. Finally, we apply our methodologies on the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers, while we speculate that especially the conformation entropy plays a major role in the formation of the fibril as a rate limiting factor.
Peter, Emanuel K
2017-12-07
In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path-variable L based on the un-biased momenta p and displacements dq for the definition of the bias s applied to the system and derive 3 algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method which combines the first 2 methodologies. Through the analysis of the correlations between the bias and the un-biased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and acceleration in the sampling of the phase space. We apply our method on SPC/E water, where we find a conservation of the average water structure. We then use our method to sample dialanine and the folding of TrpCage, where we find a good agreement with simulation data reported in the literature. Finally, we apply our methodologies on the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers, while we speculate that especially the conformation entropy plays a major role in the formation of the fibril as a rate limiting factor.
Effects of sampling close relatives on some elementary population genetics analyses.
Wang, Jinliang
2018-01-01
Many molecular ecology analyses assume the genotyped individuals are sampled at random from a population and thus are representative of the population. Realistically, however, a sample may contain excessive close relatives (ECR) because, for example, localized juveniles are drawn from fecund species. Our knowledge is limited about how ECR affect the routinely conducted elementary genetics analyses, and how ECR are best dealt with to yield unbiased and accurate parameter estimates. This study quantifies the effects of ECR on some popular population genetics analyses of marker data, including the estimation of allele frequencies, F-statistics, expected heterozygosity (H e ), effective and observed numbers of alleles, and the tests of Hardy-Weinberg equilibrium (HWE) and linkage equilibrium (LE). It also investigates several strategies for handling ECR to mitigate their impact and to yield accurate parameter estimates. My analytical work, assisted by simulations, shows that ECR have large and global effects on all of the above marker analyses. The naïve approach of simply ignoring ECR could yield low-precision and often biased parameter estimates, and could cause too many false rejections of HWE and LE. The bold approach, which simply identifies and removes ECR, and the cautious approach, which estimates target parameters (e.g., H e ) by accounting for ECR and using naïve allele frequency estimates, eliminate the bias and the false HWE and LE rejections, but could reduce estimation precision substantially. The likelihood approach, which accounts for ECR in estimating allele frequencies and thus target parameters relying on allele frequencies, usually yields unbiased and the most accurate parameter estimates. Which of the four approaches is the most effective and efficient may depend on the particular marker analysis to be conducted. The results are discussed in the context of using marker data for understanding population properties and marker properties. © 2017 John Wiley & Sons Ltd.
Nurse Practitioners, Certified Nurse Midwives, and Physician Assistants in Physician Offices
... on Vital and Health Statistics Annual Reports Health Survey Research Methods Conference Reports from the National Medical Care ... each sample visit that takes all stages of design into account. The survey data are inflated or weighted to produce unbiased ...
Density estimation in wildlife surveys
Bart, Jonathan; Droege, Sam; Geissler, Paul E.; Peterjohn, Bruce G.; Ralph, C. John
2004-01-01
Several authors have recently discussed the problems with using index methods to estimate trends in population size. Some have expressed the view that index methods should virtually never be used. Others have responded by defending index methods and questioning whether better alternatives exist. We suggest that index methods are often a cost-effective component of valid wildlife monitoring but that double-sampling or another procedure that corrects for bias or establishes bounds on bias is essential. The common assertion that index methods require constant detection rates for trend estimation is mathematically incorrect; the requirement is no long-term trend in detection "ratios" (index result/parameter of interest), a requirement that is probably approximately met by many well-designed index surveys. We urge that more attention be given to defining bird density rigorously and in ways useful to managers. Once this is done, 4 sources of bias in density estimates may be distinguished: coverage, closure, surplus birds, and detection rates. Distance, double-observer, and removal methods do not reduce bias due to coverage, closure, or surplus birds. These methods may yield unbiased estimates of the number of birds present at the time of the survey, but only if their required assumptions are met, which we doubt occurs very often in practice. Double-sampling, in contrast, produces unbiased density estimates if the plots are randomly selected and estimates on the intensive surveys are unbiased. More work is needed, however, to determine the feasibility of double-sampling in different populations and habitats. We believe the tension that has developed over appropriate survey methods can best be resolved through increased appreciation of the mathematical aspects of indices, especially the effects of bias, and through studies in which candidate methods are evaluated against known numbers determined through intensive surveys.
Demand for health care in Denmark: results of a national sample survey using contingent valuation.
Gyldmark, M; Morrison, G C
2001-10-01
In this paper we use willingness to pay (WTP) to elicit values for private insurance covering treatment for four different health problems. By way of obtaining these values, we test the viability of the contingent valuation method (CVM) and econometric techniques, respectively, as means of eliciting and analysing values from the general public. WTP responses from a Danish national sample survey, which was designed in accordance with existing guidelines, are analysed in terms of consistency and validity checks. Large numbers of zero responses are common in WTP studies, and are found here; therefore, the Heckman selectivity model and log-transformed OLS are employed. The selectivity model is rejected, but test results indicate that the lognormal model yields efficient and unbiased estimates. The results give confidence in the WTP estimates obtained and, more generally, in CVM as a means of valuing publicly provided goods and in econometrics as a tool for analysing WTP results containing many zero responses.
A HST Search to Constrain the Binary Fraction of Stripped-Envelope Supernovae
NASA Astrophysics Data System (ADS)
Fox, Ori
2018-01-01
Stripped-envelope supernovae (e.g., SNe IIb, Ib, and Ic) refer to a subset of core-collapse explosions with progenitors that have lost some fraction of their outer envelopes in pre-SN mass loss. Mounting evidence over the past decade suggests that the mass loss in a large fraction of these systems occurs due to binary interaction. An unbiased, statistically significant sample of companion-star characteristics (including deep upper limits) can constrain the binary fraction, having direct implications on the theoretical physics of both single star and binary evolution. To date, however, only two detections have been made: SNe 1993J and 2011dh. Over the past year, we have improved this sample with an HST WFC3/NUV survey for binary companions of three additional nearby stripped-envelope SNe: 2002ap, 2001ig, and 2010br. I will present a review of previous companion searches and results from our current HST survey, which include one detection and two meaningful upper limits.
Extending large-scale forest inventories to assess urban forests.
Corona, Piermaria; Agrimi, Mariagrazia; Baffetta, Federica; Barbati, Anna; Chiriacò, Maria Vincenza; Fattorini, Lorenzo; Pompei, Enrico; Valentini, Riccardo; Mattioli, Walter
2012-03-01
Urban areas are continuously expanding today, extending their influence on an increasingly large proportion of woods and trees located in or nearby urban and urbanizing areas, the so-called urban forests. Although these forests have the potential for significantly improving the quality the urban environment and the well-being of the urban population, data to quantify the extent and characteristics of urban forests are still lacking or fragmentary on a large scale. In this regard, an expansion of the domain of multipurpose forest inventories like National Forest Inventories (NFIs) towards urban forests would be required. To this end, it would be convenient to exploit the same sampling scheme applied in NFIs to assess the basic features of urban forests. This paper considers approximately unbiased estimators of abundance and coverage of urban forests, together with estimators of the corresponding variances, which can be achieved from the first phase of most large-scale forest inventories. A simulation study is carried out in order to check the performance of the considered estimators under various situations involving the spatial distribution of the urban forests over the study area. An application is worked out on the data from the Italian NFI.
A sampling strategy to estimate the area and perimeter of irregularly shaped planar regions
Timothy G. Gregoire; Harry T. Valentine
1995-01-01
The length of a randomly oriented ray emanating from an interior point of a planar region can be used to unbiasedly estimate the region's area and perimeter. Estimators and corresponding variance estimators under various selection strategies are presented.
Overy, Catherine; Booth, George H; Blunt, N S; Shepherd, James J; Cleland, Deidre; Alavi, Ali
2014-12-28
Properties that are necessarily formulated within pure (symmetric) expectation values are difficult to calculate for projector quantum Monte Carlo approaches, but are critical in order to compute many of the important observable properties of electronic systems. Here, we investigate an approach for the sampling of unbiased reduced density matrices within the full configuration interaction quantum Monte Carlo dynamic, which requires only small computational overheads. This is achieved via an independent replica population of walkers in the dynamic, sampled alongside the original population. The resulting reduced density matrices are free from systematic error (beyond those present via constraints on the dynamic itself) and can be used to compute a variety of expectation values and properties, with rapid convergence to an exact limit. A quasi-variational energy estimate derived from these density matrices is proposed as an accurate alternative to the projected estimator for multiconfigurational wavefunctions, while its variational property could potentially lend itself to accurate extrapolation approaches in larger systems.
Donovan, Rory M.; Tapia, Jose-Juan; Sullivan, Devin P.; Faeder, James R.; Murphy, Robert F.; Dittrich, Markus; Zuckerman, Daniel M.
2016-01-01
The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation—by orders of magnitude for some observables. PMID:26845334
Efficient Stochastic Rendering of Static and Animated Volumes Using Visibility Sweeps.
von Radziewsky, Philipp; Kroes, Thomas; Eisemann, Martin; Eisemann, Elmar
2017-09-01
Stochastically solving the rendering integral (particularly visibility) is the de-facto standard for physically-based light transport but it is computationally expensive, especially when displaying heterogeneous volumetric data. In this work, we present efficient techniques to speed-up the rendering process via a novel visibility-estimation method in concert with an unbiased importance sampling (involving environmental lighting and visibility inside the volume), filtering, and update techniques for both static and animated scenes. Our major contributions include a progressive estimate of partial occlusions based on a fast sweeping-plane algorithm. These occlusions are stored in an octahedral representation, which can be conveniently transformed into a quadtree-based hierarchy suited for a joint importance sampling. Further, we propose sweep-space filtering, which suppresses the occurrence of fireflies and investigate different update schemes for animated scenes. Our technique is unbiased, requires little precomputation, is highly parallelizable, and is applicable to a various volume data sets, dynamic transfer functions, animated volumes and changing environmental lighting.
Meng, Yilin; Roux, Benoît
2015-08-11
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.
2015-01-01
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost. PMID:26574437
DOE Office of Scientific and Technical Information (OSTI.GOV)
Overy, Catherine; Blunt, N. S.; Shepherd, James J.
2014-12-28
Properties that are necessarily formulated within pure (symmetric) expectation values are difficult to calculate for projector quantum Monte Carlo approaches, but are critical in order to compute many of the important observable properties of electronic systems. Here, we investigate an approach for the sampling of unbiased reduced density matrices within the full configuration interaction quantum Monte Carlo dynamic, which requires only small computational overheads. This is achieved via an independent replica population of walkers in the dynamic, sampled alongside the original population. The resulting reduced density matrices are free from systematic error (beyond those present via constraints on the dynamicmore » itself) and can be used to compute a variety of expectation values and properties, with rapid convergence to an exact limit. A quasi-variational energy estimate derived from these density matrices is proposed as an accurate alternative to the projected estimator for multiconfigurational wavefunctions, while its variational property could potentially lend itself to accurate extrapolation approaches in larger systems.« less
Faint warm debris disks around nearby bright stars explored by AKARI and IRSF
NASA Astrophysics Data System (ADS)
Ishihara, Daisuke; Takeuchi, Nami; Kobayashi, Hiroshi; Nagayama, Takahiro; Kaneda, Hidehiro; Inutsuka, Shu-ichiro; Fujiwara, Hideaki; Onaka, Takashi
2017-05-01
Context. Debris disks are important observational clues for understanding planetary-system formation process. In particular, faint warm debris disks may be related to late planet formation near 1 au. A systematic search of faint warm debris disks is necessary to reveal terrestrial planet formation. Aims: Faint warm debris disks show excess emission that peaks at mid-IR wavelengths. Thus we explore debris disks using the AKARI mid-IR all-sky point source catalog (PSC), a product of the second generation unbiased IR all-sky survey. Methods: We investigate IR excess emission for 678 isolated main-sequence stars for which there are 18 μm detections in the AKARI mid-IR all-sky catalog by comparing their fluxes with the predicted fluxes of the photospheres based on optical to near-IR fluxes and model spectra. The near-IR fluxes are first taken from the 2MASS PSC. However, 286 stars with Ks < 4.5 in our sample have large flux errors in the 2MASS photometry due to saturation. Thus we have measured accurate J, H, and Ks band fluxes, applying neutral density (ND) filters for Simultaneous InfraRed Imager for Unbiased Survey (SIRIUS) on IRSF, the φ1.4 m near-IR telescope in South Africa, and improved the flux accuracy from 14% to 1.8% on average. Results: We identified 53 debris-disk candidates including eight new detections from our sample of 678 main-sequence stars. The detection rate of debris disks for this work is 8%, which is comparable with those in previous works by Spitzer and Herschel. Conclusions: The importance of this study is the detection of faint warm debris disks around nearby field stars. At least nine objects have a large amount of dust for their ages, which cannot be explained by the conventional steady-state collisional cascade model. The full version of Table 2 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/601/A72
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda-Jensen's Inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
Duarte, Adam; Adams, Michael J.; Peterson, James T.
2018-01-01
Monitoring animal populations is central to wildlife and fisheries management, and the use of N-mixture models toward these efforts has markedly increased in recent years. Nevertheless, relatively little work has evaluated estimator performance when basic assumptions are violated. Moreover, diagnostics to identify when bias in parameter estimates from N-mixture models is likely is largely unexplored. We simulated count data sets using 837 combinations of detection probability, number of sample units, number of survey occasions, and type and extent of heterogeneity in abundance or detectability. We fit Poisson N-mixture models to these data, quantified the bias associated with each combination, and evaluated if the parametric bootstrap goodness-of-fit (GOF) test can be used to indicate bias in parameter estimates. We also explored if assumption violations can be diagnosed prior to fitting N-mixture models. In doing so, we propose a new model diagnostic, which we term the quasi-coefficient of variation (QCV). N-mixture models performed well when assumptions were met and detection probabilities were moderate (i.e., ≥0.3), and the performance of the estimator improved with increasing survey occasions and sample units. However, the magnitude of bias in estimated mean abundance with even slight amounts of unmodeled heterogeneity was substantial. The parametric bootstrap GOF test did not perform well as a diagnostic for bias in parameter estimates when detectability and sample sizes were low. The results indicate the QCV is useful to diagnose potential bias and that potential bias associated with unidirectional trends in abundance or detectability can be diagnosed using Poisson regression. This study represents the most thorough assessment to date of assumption violations and diagnostics when fitting N-mixture models using the most commonly implemented error distribution. Unbiased estimates of population state variables are needed to properly inform management decision making. Therefore, we also discuss alternative approaches to yield unbiased estimates of population state variables using similar data types, and we stress that there is no substitute for an effective sample design that is grounded upon well-defined management objectives.
Olives, Casey; Valadez, Joseph J; Pagano, Marcello
2014-03-01
To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduces the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.
Prioritizing causal disease genes using unbiased genomic features.
Deo, Rahul C; Musso, Gabriel; Tasan, Murat; Tang, Paul; Poon, Annie; Yuan, Christiana; Felix, Janine F; Vasan, Ramachandran S; Beroukhim, Rameen; De Marco, Teresa; Kwok, Pui-Yan; MacRae, Calum A; Roth, Frederick P
2014-12-03
Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
James W. Flewelling
2009-01-01
Remotely sensed data can be used to make digital maps showing individual tree crowns (ITC) for entire forests. Attributes of the ITCs may include area, shape, height, and color. The crown map is sampled in a way that provides an unbiased linkage between ITCs and identifiable trees measured on the ground. Methods of avoiding edge bias are given. In an example from a...
Cox, Nick L J; Cami, Jan; Farhang, Amin; Smoker, Jonathan; Monreal-Ibero, Ana; Lallement, Rosine; Sarre, Peter J; Marshall, Charlotte C M; Smith, Keith T; Evans, Christopher J; Royer, Pierre; Linnartz, Harold; Cordiner, Martin A; Joblin, Christine; van Loon, Jacco Th; Foing, Bernard H; Bhatt, Neil H; Bron, Emeric; Elyajouri, Meriem; de Koter, Alex; Ehrenfreund, Pascale; Javadi, Atefeh; Kaper, Lex; Khosroshadi, Habib G; Laverick, Mike; Le Petit, Franck; Mulas, Giacomo; Roueff, Evelyne; Salama, Farid; Spaans, Marco
2017-10-01
The carriers of the diffuse interstellar bands (DIBs) are largely unidentified molecules ubiquitously present in the interstellar medium (ISM). After decades of study, two strong and possibly three weak near-infrared DIBs have recently been attributed to the [Formula: see text] fullerene based on observational and laboratory measurements. There is great promise for the identification of the over 400 other known DIBs, as this result could provide chemical hints towards other possible carriers. In an effort to systematically study the properties of the DIB carriers, we have initiated a new large-scale observational survey: the ESO Diffuse Interstellar Bands Large Exploration Survey (EDIBLES). The main objective is to build on and extend existing DIB surveys to make a major step forward in characterising the physical and chemical conditions for a statistically significant sample of interstellar lines-of-sight, with the goal to reverse-engineer key molecular properties of the DIB carriers. EDIBLES is a filler Large Programme using the Ultraviolet and Visual Echelle Spectrograph at the Very Large Telescope at Paranal, Chile. It is designed to provide an observationally unbiased view of the presence and behaviour of the DIBs towards early-spectral type stars whose lines-of-sight probe the diffuse-to-translucent ISM. Such a complete dataset will provide a deep census of the atomic and molecular content, physical conditions, chemical abundances and elemental depletion levels for each sightline. Achieving these goals requires a homogeneous set of high-quality data in terms of resolution ( R ~ 70 000 - 100 000), sensitivity (S/N up to 1000 per resolution element), and spectral coverage (305-1042 nm), as well as a large sample size (100+ sightlines). In this first paper the goals, objectives and methodology of the EDIBLES programme are described and an initial assessment of the data is provided.
NASA Astrophysics Data System (ADS)
Cox, Nick L. J.; Cami, Jan; Farhang, Amin; Smoker, Jonathan; Monreal-Ibero, Ana; Lallement, Rosine; Sarre, Peter J.; Marshall, Charlotte C. M.; Smith, Keith T.; Evans, Christopher J.; Royer, Pierre; Linnartz, Harold; Cordiner, Martin A.; Joblin, Christine; van Loon, Jacco Th.; Foing, Bernard H.; Bhatt, Neil H.; Bron, Emeric; Elyajouri, Meriem; de Koter, Alex; Ehrenfreund, Pascale; Javadi, Atefeh; Kaper, Lex; Khosroshadi, Habib G.; Laverick, Mike; Le Petit, Franck; Mulas, Giacomo; Roueff, Evelyne; Salama, Farid; Spaans, Marco
2017-10-01
The carriers of the diffuse interstellar bands (DIBs) are largely unidentified molecules ubiquitously present in the interstellar medium (ISM). After decades of study, two strong and possibly three weak near-infrared DIBs have recently been attributed to the C60^+ fullerene based on observational and laboratory measurements. There is great promise for the identification of the over 400 other known DIBs, as this result could provide chemical hints towards other possible carriers. In an effort tosystematically study the properties of the DIB carriers, we have initiated a new large-scale observational survey: the ESO Diffuse Interstellar Bands Large Exploration Survey (EDIBLES). The main objective is to build on and extend existing DIB surveys to make a major step forward in characterising the physical and chemical conditions for a statistically significant sample of interstellar lines-of-sight, with the goal to reverse-engineer key molecular properties of the DIB carriers. EDIBLES is a filler Large Programme using the Ultraviolet and Visual Echelle Spectrograph at the Very Large Telescope at Paranal, Chile. It is designed to provide an observationally unbiased view of the presence and behaviour of the DIBs towards early-spectral-type stars whose lines-of-sight probe the diffuse-to-translucent ISM. Such a complete dataset will provide a deep census of the atomic and molecular content, physical conditions, chemical abundances and elemental depletion levels for each sightline. Achieving these goals requires a homogeneous set of high-quality data in terms of resolution (R 70 000-100 000), sensitivity (S/N up to 1000 per resolution element), and spectral coverage (305-1042 nm), as well as a large sample size (100+ sightlines). In this first paper the goals, objectives and methodology of the EDIBLES programme are described and an initial assessment of the data is provided.
Cox, Nick L. J.; Cami, Jan; Farhang, Amin; Smoker, Jonathan; Monreal-Ibero, Ana; Lallement, Rosine; Sarre, Peter J.; Marshall, Charlotte C. M.; Smith, Keith T.; Evans, Christopher J.; Royer, Pierre; Linnartz, Harold; Cordiner, Martin A.; Joblin, Christine; van Loon, Jacco Th.; Foing, Bernard H.; Bhatt, Neil H.; Bron, Emeric; Elyajouri, Meriem; de Koter, Alex; Ehrenfreund, Pascale; Javadi, Atefeh; Kaper, Lex; Khosroshadi, Habib G.; Laverick, Mike; Le Petit, Franck; Mulas, Giacomo; Roueff, Evelyne; Salama, Farid; Spaans, Marco
2017-01-01
The carriers of the diffuse interstellar bands (DIBs) are largely unidentified molecules ubiquitously present in the interstellar medium (ISM). After decades of study, two strong and possibly three weak near-infrared DIBs have recently been attributed to the C60+ fullerene based on observational and laboratory measurements. There is great promise for the identification of the over 400 other known DIBs, as this result could provide chemical hints towards other possible carriers. In an effort to systematically study the properties of the DIB carriers, we have initiated a new large-scale observational survey: the ESO Diffuse Interstellar Bands Large Exploration Survey (EDIBLES). The main objective is to build on and extend existing DIB surveys to make a major step forward in characterising the physical and chemical conditions for a statistically significant sample of interstellar lines-of-sight, with the goal to reverse-engineer key molecular properties of the DIB carriers. EDIBLES is a filler Large Programme using the Ultraviolet and Visual Echelle Spectrograph at the Very Large Telescope at Paranal, Chile. It is designed to provide an observationally unbiased view of the presence and behaviour of the DIBs towards early-spectral type stars whose lines-of-sight probe the diffuse-to-translucent ISM. Such a complete dataset will provide a deep census of the atomic and molecular content, physical conditions, chemical abundances and elemental depletion levels for each sightline. Achieving these goals requires a homogeneous set of high-quality data in terms of resolution (R ~ 70 000 – 100 000), sensitivity (S/N up to 1000 per resolution element), and spectral coverage (305–1042 nm), as well as a large sample size (100+ sightlines). In this first paper the goals, objectives and methodology of the EDIBLES programme are described and an initial assessment of the data is provided. PMID:29151608
Mutually unbiased projectors and duality between lines and bases in finite quantum systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shalaby, M.; Vourdas, A., E-mail: a.vourdas@bradford.ac.uk
2013-10-15
Quantum systems with variables in the ring Z(d) are considered, and the concepts of weak mutually unbiased bases and mutually unbiased projectors are discussed. The lines through the origin in the Z(d)×Z(d) phase space, are classified into maximal lines (sets of d points), and sublines (sets of d{sub i} points where d{sub i}|d). The sublines are intersections of maximal lines. It is shown that there exists a duality between the properties of lines (resp., sublines), and the properties of weak mutually unbiased bases (resp., mutually unbiased projectors). -- Highlights: •Lines in discrete phase space. •Bases in finite quantum systems. •Dualitymore » between bases and lines. •Weak mutually unbiased bases.« less
2016-01-01
We report a theoretical description and numerical tests of the extended-system adaptive biasing force method (eABF), together with an unbiased estimator of the free energy surface from eABF dynamics. Whereas the original ABF approach uses its running estimate of the free energy gradient as the adaptive biasing force, eABF is built on the idea that the exact free energy gradient is not necessary for efficient exploration, and that it is still possible to recover the exact free energy separately with an appropriate estimator. eABF does not directly bias the collective coordinates of interest, but rather fictitious variables that are harmonically coupled to them; therefore is does not require second derivative estimates, making it easily applicable to a wider range of problems than ABF. Furthermore, the extended variables present a smoother, coarse-grain-like sampling problem on a mollified free energy surface, leading to faster exploration and convergence. We also introduce CZAR, a simple, unbiased free energy estimator from eABF trajectories. eABF/CZAR converges to the physical free energy surface faster than standard ABF for a wide range of parameters. PMID:27959559
Postmortem structural studies of the thalamus in schizophrenia
Dorph-Petersen, Karl-Anton; Lewis, David A.
2017-01-01
In this review, we seek to answer the following question: Do findings in the current literature support the idea that thalamo-cortical dysfunction in schizophrenia is due to structural abnormalities in the thalamus? We base our review on the existing literature of design-unbiased stereological studies of the postmortem thalamus from subjects with schizophrenia. Thus, all reported results are based upon the use of unbiased principles of sampling to determine volume and/or total cell numbers of thalamus or its constituent nuclei. We found 28 such papers covering 26 studies. In a series of tables we list all positive and negative findings from the total thalamus, the mediodorsal, pulvinar and anterior nuclei, as well as less frequently studied thalamic regions. Only four studies examined the entire thalamus and the results were inconsistent. We found largely consistent evidence for structural changes (reduced volume and cell numbers) in the pulvinar located in the posterior thalamus. In contrast, findings in the mediodorsal thalamic nucleus are inconsistent, with the largest and most recent studies generally failing to support earlier reports of a lower number of neurons in schizophrenia. Thus, the current findings of stereological studies of the thalamus in schizophrenia support the idea that thalamo-cortical dysfunction in schizophrenia might be attributable, at least in part, to structural alterations in the pulvinar that could impair thalamic inputs to higher order cortical association areas in the frontal and parietal lobes. However, more studies are needed before robust conclusions can be drawn. PMID:27567291
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2013-09-04
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
Kärkkäinen, Hanni P.; Sillanpää, Mikko J.
2013-01-01
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
A molecular simulation protocol to avoid sampling redundancy and discover new states.
Bacci, Marco; Vitalis, Andreas; Caflisch, Amedeo
2015-05-01
For biomacromolecules or their assemblies, experimental knowledge is often restricted to specific states. Ambiguity pervades simulations of these complex systems because there is no prior knowledge of relevant phase space domains, and sampling recurrence is difficult to achieve. In molecular dynamics methods, ruggedness of the free energy surface exacerbates this problem by slowing down the unbiased exploration of phase space. Sampling is inefficient if dwell times in metastable states are large. We suggest a heuristic algorithm to terminate and reseed trajectories run in multiple copies in parallel. It uses a recent method to order snapshots, which provides notions of "interesting" and "unique" for individual simulations. We define criteria to guide the reseeding of runs from more "interesting" points if they sample overlapping regions of phase space. Using a pedagogical example and an α-helical peptide, the approach is demonstrated to amplify the rate of exploration of phase space and to discover metastable states not found by conventional sampling schemes. Evidence is provided that accurate kinetics and pathways can be extracted from the simulations. The method, termed PIGS for Progress Index Guided Sampling, proceeds in unsupervised fashion, is scalable, and benefits synergistically from larger numbers of replicas. Results confirm that the underlying ideas are appropriate and sufficient to enhance sampling. In molecular simulations, errors caused by not exploring relevant domains in phase space are always unquantifiable and can be arbitrarily large. Our protocol adds to the toolkit available to researchers in reducing these types of errors. This article is part of a Special Issue entitled "Recent developments of molecular dynamics". Copyright © 2014 Elsevier B.V. All rights reserved.
APPLICATION OF A MULTIPURPOSE UNEQUAL-PROBABILITY STREAM SURVEY IN THE MID-ATLANTIC COASTAL PLAIN
A stratified random sample with unequal-probability selection was used to design a multipurpose survey of headwater streams in the Mid-Atlantic Coastal Plain. Objectives for data from the survey include unbiased estimates of regional stream conditions, and adequate coverage of un...
Forest inventory and stratified estimation: a cautionary note
John Coulston
2008-01-01
The Forest Inventory and Analysis (FIA) Program uses stratified estimation techniques to produce estimates of forest attributes. Stratification must be unbiased and stratification procedures should be examined to identify any potential bias. This note explains simple techniques for identifying potential bias, discriminating between sample bias and stratification bias,...
VizieR Online Data Catalog: Formamide detection with ASAI-IRAM (Lopez-Sepulcre+, 2015)
NASA Astrophysics Data System (ADS)
Lopez-Sepulcre, A.; Jaber, A. A.; Mendoza, E.; Lefloch, B.; Ceccarelli, C.; Vastel, C.; Bachiller, R.; Cernicharo, J.; Codella, C.; Kahane, C.; Kama, M.; Tafalla, M.
2017-11-01
Our source sample consists of 10 well-known pre-stellar and protostellar objects representing different masses and evolutionary states, thus providing a complete view of the various types of objects encountered along the first phases of star formation. The data presented in this work were acquired with the IRAM 30-m telescope near Pico Veleta (Spain) and consist of unbiased spectral surveys at millimetre wavelengths. These are part of the Large Programme ASAI, whose observations and data reduction procedures will be presented in detail in an article by Lefloch & Bachiller (in preparation). Briefly, we gathered the spectral data in several observing runs between 2011 and 2014 using the EMIR receivers at 3 mm (80-116 GHz), 2 mm (129-173 GHz), and 1.3 mm (200-276 GHz). (13 data files).
Božičević, Alen; Dobrzyński, Maciej; De Bie, Hans; Gafner, Frank; Garo, Eliane; Hamburger, Matthias
2017-12-05
The technological development of LC-MS instrumentation has led to significant improvements of performance and sensitivity, enabling high-throughput analysis of complex samples, such as plant extracts. Most software suites allow preprocessing of LC-MS chromatograms to obtain comprehensive information on single constituents. However, more advanced processing needs, such as the systematic and unbiased comparative metabolite profiling of large numbers of complex LC-MS chromatograms remains a challenge. Currently, users have to rely on different tools to perform such data analyses. We developed a two-step protocol comprising a comparative metabolite profiling tool integrated in ACD/MS Workbook Suite, and a web platform developed in R language designed for clustering and visualization of chromatographic data. Initially, all relevant chromatographic and spectroscopic data (retention time, molecular ions with the respective ion abundance, and sample names) are automatically extracted and assembled in an Excel spreadsheet. The file is then loaded into an online web application that includes various statistical algorithms and provides the user with tools to compare and visualize the results in intuitive 2D heatmaps. We applied this workflow to LC-ESIMS profiles obtained from 69 honey samples. Within few hours of calculation with a standard PC, honey samples were preprocessed and organized in clusters based on their metabolite profile similarities, thereby highlighting the common metabolite patterns and distributions among samples. Implementation in the ACD/Laboratories software package enables ulterior integration of other analytical data, and in silico prediction tools for modern drug discovery.
Sampling bias in blending validation and a different approach to homogeneity assessment.
Kraemer, J; Svensson, J R; Melgaard, H
1999-02-01
Sampling of batches studied for validation is reported. A thief particularly suited for granules, rather than cohesive powders, was used in the study. It is shown, as has been demonstrated in the past, that traditional 1x to 3x thief sampling of a blend is biased, and that the bias decreases as the sample size increases. It is shown that taking 50 samples of tablets after blending and testing this subpopulation for normality is a discriminating manner of testing for homogeneity. As a criterion, it is better than sampling at mixer or drum stage would be even if an unbiased sampling device were available.
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Travaglini, Davide; Fattorini, Lorenzo; Barbati, Anna; Bottalico, Francesca; Corona, Piermaria; Ferretti, Marco; Chirici, Gherardo
2013-04-01
A correct characterization of the status and trend of forest condition is essential to support reporting processes at national and international level. An international forest condition monitoring has been implemented in Europe since 1987 under the auspices of the International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests). The monitoring is based on harmonized methodologies, with individual countries being responsible for its implementation. Due to inconsistencies and problems in sampling design, however, the ICP Forests network is not able to produce reliable quantitative estimates of forest condition at European and sometimes at country level. This paper proposes (1) a set of requirements for status and change assessment and (2) a harmonized sampling strategy able to provide unbiased and consistent estimators of forest condition parameters and of their changes at both country and European level. Under the assumption that a common definition of forest holds among European countries, monitoring objectives, parameters of concern and accuracy indexes are stated. On the basis of fixed-area plot sampling performed independently in each country, an unbiased and consistent estimator of forest defoliation indexes is obtained at both country and European level, together with conservative estimators of their sampling variance and power in the detection of changes. The strategy adopts a probabilistic sampling scheme based on fixed-area plots selected by means of systematic or stratified schemes. Operative guidelines for its application are provided.
Brandon M. Collins; Richard G. Everett; Scott L. Stephens
2011-01-01
We re-sampled areas included in an unbiased 1911 timber inventory conducted by the U.S. Forest Service over a 4000 ha study area. Over half of the re-sampled area burned in relatively recent management- and lightning-ignited fires. This allowed for comparisons of both areas that have experienced recent fire and areas with no recent fire, to the same areas historically...
A four-alternative forced choice (4AFC) software for observer performance evaluation in radiology
NASA Astrophysics Data System (ADS)
Zhang, Guozhi; Cockmartin, Lesley; Bosmans, Hilde
2016-03-01
Four-alternative forced choice (4AFC) test is a psychophysical method that can be adopted for observer performance evaluation in radiological studies. While the concept of this method is well established, difficulties to handle large image data, perform unbiased sampling, and keep track of the choice made by the observer have restricted its application in practice. In this work, we propose an easy-to-use software that can help perform 4AFC tests with DICOM images. The software suits for any experimental design that follows the 4AFC approach. It has a powerful image viewing system that favorably simulates the clinical reading environment. The graphical interface allows the observer to adjust various viewing parameters and perform the selection with very simple operations. The sampling process involved in 4AFC as well as the speed and accuracy of the choice made by the observer is precisely monitored in the background and can be easily exported for test analysis. The software has also a defensive mechanism for data management and operation control that minimizes the possibility of mistakes from user during the test. This software can largely facilitate the use of 4AFC approach in radiological observer studies and is expected to have widespread applicability.
A Test-Length Correction to the Estimation of Extreme Proficiency Levels
ERIC Educational Resources Information Center
Magis, David; Beland, Sebastien; Raiche, Gilles
2011-01-01
In this study, the estimation of extremely large or extremely small proficiency levels, given the item parameters of a logistic item response model, is investigated. On one hand, the estimation of proficiency levels by maximum likelihood (ML), despite being asymptotically unbiased, may yield infinite estimates. On the other hand, with an…
Multi-Armed RCTs: A Design-Based Framework. NCEE 2017-4027
ERIC Educational Resources Information Center
Schochet, Peter Z.
2017-01-01
Design-based methods have recently been developed as a way to analyze data from impact evaluations of interventions, programs, and policies (Imbens and Rubin, 2015; Schochet, 2015, 2016). The estimators are derived using the building blocks of experimental designs with minimal assumptions, and are unbiased and normally distributed in large samples…
Design Difficulties in Stand Density Studies
Frank A. Bennett
1969-01-01
Designing unbiased stand density studies is difficult. An acceptable sample requires stratification of the plots of age, site, and density. When basal area, percent stocking, or Reineke's stand density index is used as the density measure, this stratification forces a high negative correlation between site and number of trees per acre. Mortality in trees per acre...
Problems and Limitations in Studies on Screening for Language Delay
ERIC Educational Resources Information Center
Eriksson, Marten; Westerlund, Monica; Miniscalco, Carmela
2010-01-01
This study discusses six common methodological limitations in screening for language delay (LD) as illustrated in 11 recent studies. The limitations are (1) whether the studies define a target population, (2) whether the recruitment procedure is unbiased, (3) attrition, (4) verification bias, (5) small sample size and (6) inconsistencies in choice…
The Importance of Contamination Knowledge in Curation - Insights into Mars Sample Return
NASA Technical Reports Server (NTRS)
Harrington, A. D.; Calaway, M. J.; Regberg, A. B.; Mitchell, J. L.; Fries, M. D.; Zeigler, R. A.; McCubbin, F. M.
2018-01-01
The Astromaterials Acquisition and Curation Office at NASA Johnson Space Center (JSC), in Houston, TX (henceforth Curation Office) manages the curation of extraterrestrial samples returned by NASA missions and shared collections from international partners, preserving their integrity for future scientific study while providing the samples to the international community in a fair and unbiased way. The Curation Office also curates flight and non-flight reference materials and other materials from spacecraft assembly (e.g., lubricants, paints and gases) of sample return missions that would have the potential to cross-contaminate a present or future NASA astromaterials collection.
Filaments from the galaxy distribution and from the velocity field in the local universe
NASA Astrophysics Data System (ADS)
Libeskind, Noam I.; Tempel, Elmo; Hoffman, Yehuda; Tully, R. Brent; Courtois, Hélène
2015-10-01
The cosmic web that characterizes the large-scale structure of the Universe can be quantified by a variety of methods. For example, large redshift surveys can be used in combination with point process algorithms to extract long curvilinear filaments in the galaxy distribution. Alternatively, given a full 3D reconstruction of the velocity field, kinematic techniques can be used to decompose the web into voids, sheets, filaments and knots. In this Letter, we look at how two such algorithms - the Bisous model and the velocity shear web - compare with each other in the local Universe (within 100 Mpc), finding good agreement. This is both remarkable and comforting, given that the two methods are radically different in ideology and applied to completely independent and different data sets. Unsurprisingly, the methods are in better agreement when applied to unbiased and complete data sets, like cosmological simulations, than when applied to observational samples. We conclude that more observational data is needed to improve on these methods, but that both methods are most likely properly tracing the underlying distribution of matter in the Universe.
Towards the automatic detection and analysis of sunspot rotation
NASA Astrophysics Data System (ADS)
Brown, Daniel S.; Walker, Andrew P.
2016-10-01
Torsional rotation of sunspots have been noted by many authors over the past century. Sunspots have been observed to rotate up to the order of 200 degrees over 8-10 days, and these have often been linked with eruptive behaviour such as solar flares and coronal mass ejections. However, most studies in the literature are case studies or small-number studies which suffer from selection bias. In order to better understand sunspot rotation and its impact on the corona, unbiased large-sample statistical studies are required (including both rotating and non-rotating sunspots). While this can be done manually, a better approach is to automate the detection and analysis of rotating sunspots using robust methods with well characterised uncertainties. The SDO/HMI instrument provide long-duration, high-resolution and high-cadence continuum observations suitable for extracting a large number of examples of rotating sunspots. This presentation will outline the analysis of SDI/HMI data to determine the rotation (and non-rotation) profiles of sunspots for the complete duration of their transit across the solar disk, along with how this can be extended to automatically identify sunspots and initiate their analysis.
Yu, Sheng; Liao, Katherine P; Shaw, Stanley Y; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Cai, Tianxi
2015-09-01
Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. The authors applied our method to develop algorithms to identify patients with rheumatoid arthritis and coronary artery disease cases among those with rheumatoid arthritis from a large multi-institutional EHR. The area under the receiver operating characteristic curves (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features. Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Estimating population trends with a linear model
Bart, Jonathan; Collins, Brian D.; Morrison, R.I.G.
2003-01-01
We describe a simple and robust method for estimating trends in population size. The method may be used with Breeding Bird Survey data, aerial surveys, point counts, or any other program of repeated surveys at permanent locations. Surveys need not be made at each location during each survey period. The method differs from most existing methods in being design based, rather than model based. The only assumptions are that the nominal sampling plan is followed and that sample size is large enough for use of the t-distribution. Simulations based on two bird data sets from natural populations showed that the point estimate produced by the linear model was essentially unbiased even when counts varied substantially and 25% of the complete data set was missing. The estimating-equation approach, often used to analyze Breeding Bird Survey data, performed similarly on one data set but had substantial bias on the second data set, in which counts were highly variable. The advantages of the linear model are its simplicity, flexibility, and that it is self-weighting. A user-friendly computer program to carry out the calculations is available from the senior author.
MODELING LEFT-TRUNCATED AND RIGHT-CENSORED SURVIVAL DATA WITH LONGITUDINAL COVARIATES
Su, Yu-Ru; Wang, Jane-Ling
2018-01-01
There is a surge in medical follow-up studies that include longitudinal covariates in the modeling of survival data. So far, the focus has been largely on right censored survival data. We consider survival data that are subject to both left truncation and right censoring. Left truncation is well known to produce biased sample. The sampling bias issue has been resolved in the literature for the case which involves baseline or time-varying covariates that are observable. The problem remains open however for the important case where longitudinal covariates are present in survival models. A joint likelihood approach has been shown in the literature to provide an effective way to overcome those difficulties for right censored data, but this approach faces substantial additional challenges in the presence of left truncation. Here we thus propose an alternative likelihood to overcome these difficulties and show that the regression coefficient in the survival component can be estimated unbiasedly and efficiently. Issues about the bias for the longitudinal component are discussed. The new approach is illustrated numerically through simulations and data from a multi-center AIDS cohort study. PMID:29479122
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thomas, Robert E.; Overy, Catherine; Opalka, Daniel
Unbiased stochastic sampling of the one- and two-body reduced density matrices is achieved in full configuration interaction quantum Monte Carlo with the introduction of a second, “replica” ensemble of walkers, whose population evolves in imaginary time independently from the first and which entails only modest additional computational overheads. The matrices obtained from this approach are shown to be representative of full configuration-interaction quality and hence provide a realistic opportunity to achieve high-quality results for a range of properties whose operators do not necessarily commute with the Hamiltonian. A density-matrix formulated quasi-variational energy estimator having been already proposed and investigated, themore » present work extends the scope of the theory to take in studies of analytic nuclear forces, molecular dipole moments, and polarisabilities, with extensive comparison to exact results where possible. These new results confirm the suitability of the sampling technique and, where sufficiently large basis sets are available, achieve close agreement with experimental values, expanding the scope of the method to new areas of investigation.« less
NIH Peer Review: Scored Review Criteria and Overall Impact
ERIC Educational Resources Information Center
Lindner, Mark D.; Vancea, Adrian; Chen, Mei-Ching; Chacko, George
2016-01-01
The National Institutes of Health (NIH) is the largest source of funding for biomedical research in the world. Funding decisions are made largely based on the outcome of a peer review process that is intended to provide a fair, equitable, timely, and unbiased review of the quality, scientific merit, and potential impact of the research. There have…
Kistner, Emily O; Muller, Keith E
2004-09-01
Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact results allow calculating the exact distribution function and other properties of intraclass correlation and Cronbach's alpha, for Gaussian data with any covariance pattern, not just compound symmetry. Probabilities are computed in terms of the distribution function of a weighted sum of independent chi-square random variables. New F approximations for the distribution functions of intraclass correlation and Cronbach's alpha are much simpler and faster to compute than the exact forms. Assuming the covariance matrix is known, the approximations typically provide sufficient accuracy, even with as few as ten observations. Either the exact or approximate distributions may be used to create confidence intervals around an estimate of reliability. Monte Carlo simulations led to a number of conclusions. Correctly assuming that the covariance matrix is compound symmetric leads to accurate confidence intervals, as was expected from previously known results. However, assuming and estimating a general covariance matrix produces somewhat optimistically narrow confidence intervals with 10 observations. Increasing sample size to 100 gives essentially unbiased coverage. Incorrectly assuming compound symmetry leads to pessimistically large confidence intervals, with pessimism increasing with sample size. In contrast, incorrectly assuming general covariance introduces only a modest optimistic bias in small samples. Hence the new methods seem preferable for creating confidence intervals, except when compound symmetry definitely holds.
NASA Astrophysics Data System (ADS)
Nüske, Feliks; Wu, Hao; Prinz, Jan-Hendrik; Wehmeyer, Christoph; Clementi, Cecilia; Noé, Frank
2017-03-01
Many state-of-the-art methods for the thermodynamic and kinetic characterization of large and complex biomolecular systems by simulation rely on ensemble approaches, where data from large numbers of relatively short trajectories are integrated. In this context, Markov state models (MSMs) are extremely popular because they can be used to compute stationary quantities and long-time kinetics from ensembles of short simulations, provided that these short simulations are in "local equilibrium" within the MSM states. However, over the last 15 years since the inception of MSMs, it has been controversially discussed and not yet been answered how deviations from local equilibrium can be detected, whether these deviations induce a practical bias in MSM estimation, and how to correct for them. In this paper, we address these issues: We systematically analyze the estimation of MSMs from short non-equilibrium simulations, and we provide an expression for the error between unbiased transition probabilities and the expected estimate from many short simulations. We show that the unbiased MSM estimate can be obtained even from relatively short non-equilibrium simulations in the limit of long lag times and good discretization. Further, we exploit observable operator model (OOM) theory to derive an unbiased estimator for the MSM transition matrix that corrects for the effect of starting out of equilibrium, even when short lag times are used. Finally, we show how the OOM framework can be used to estimate the exact eigenvalues or relaxation time scales of the system without estimating an MSM transition matrix, which allows us to practically assess the discretization quality of the MSM. Applications to model systems and molecular dynamics simulation data of alanine dipeptide are included for illustration. The improved MSM estimator is implemented in PyEMMA of version 2.3.
Z-rich solar particle event characteristics 1972-1976
NASA Technical Reports Server (NTRS)
Zwickl, R. D.; Roelof, E. C.; Gold, R. E.; Krimigis, S. M.; Armstrong, T. P.
1978-01-01
It is found in the reported investigation that Z-rich solar particle events usually have large and prolonged anisotropies in addition to an extremely variable charge composition that varies not only from event to event but also throughout the event. These observations suggest that one can no longer regard the event-averaged composition of solar particle events at low energies as providing an unbiased global sample of the solar atmospheric composition. The variability from event to event and among classes of events is just too great. However, the tendency for the Z-rich events to be associated with both the low-speed solar wind at or just before the onset of solar wind streams and with active regions located in the western hemisphere, indicates that charge composition studies of solar particle events can yield a better knowledge of the flare acceleration process as well as the inhomogeneous nature of magnetic field structure and particle composition in the solar atmosphere.
Rendall, Michael S.; Ghosh-Dastidar, Bonnie; Weden, Margaret M.; Baker, Elizabeth H.; Nazarov, Zafar
2013-01-01
Within-survey multiple imputation (MI) methods are adapted to pooled-survey regression estimation where one survey has more regressors, but typically fewer observations, than the other. This adaptation is achieved through: (1) larger numbers of imputations to compensate for the higher fraction of missing values; (2) model-fit statistics to check the assumption that the two surveys sample from a common universe; and (3) specificying the analysis model completely from variables present in the survey with the larger set of regressors, thereby excluding variables never jointly observed. In contrast to the typical within-survey MI context, cross-survey missingness is monotonic and easily satisfies the Missing At Random (MAR) assumption needed for unbiased MI. Large efficiency gains and substantial reduction in omitted variable bias are demonstrated in an application to sociodemographic differences in the risk of child obesity estimated from two nationally-representative cohort surveys. PMID:24223447
VizieR Online Data Catalog: MYStIX candidate protostars (Romine+, 2016)
NASA Astrophysics Data System (ADS)
Romine, G.; Feigelson, E. D.; Getman, K. V.; Kuhn, M. A.; Povich, M. S.
2017-04-01
The present study seeks protostars from the Massive Young Star-forming complex in Infrared and X-ray (MYStIX) survey catalogs. We combine objects with protostellar infrared SEDs and 4.5um excesses with X-ray sources exhibiting ultrahard spectra denoting very heavy obscuration. These criteria filter away nearly all of the older Class II-III stars and contaminant populations, but give very incomplete samples. The result is a list of 1109 protostellar candidates in 14 star-forming regions. See sections 1 and 2 for further explanations. The reliability of the catalog is strengthened because a large majority (86%) are found to be associated with dense cores seen in Herschel 500um maps that trace cold dust emission. However, the candidate list requires more detailed study for confirmation and cannot be viewed as an unbiased view of star formation in the clouds. (3 data files).
Whole-Brain Microscopy Meets In Vivo Neuroimaging: Techniques, Benefits, and Limitations.
Aswendt, Markus; Schwarz, Martin; Abdelmoula, Walid M; Dijkstra, Jouke; Dedeurwaerdere, Stefanie
2017-02-01
Magnetic resonance imaging, positron emission tomography, and optical imaging have emerged as key tools to understand brain function and neurological disorders in preclinical mouse models. They offer the unique advantage of monitoring individual structural and functional changes over time. What remained unsolved until recently was to generate whole-brain microscopy data which can be correlated to the 3D in vivo neuroimaging data. Conventional histological sections are inappropriate especially for neuronal tracing or the unbiased screening for molecular targets through the whole brain. As part of the European Society for Molecular Imaging (ESMI) meeting 2016 in Utrecht, the Netherlands, we addressed this issue in the Molecular Neuroimaging study group meeting. Presentations covered new brain clearing methods, light sheet microscopes for large samples, and automatic registration of microscopy to in vivo imaging data. In this article, we summarize the discussion; give an overview of the novel techniques; and discuss the practical needs, benefits, and limitations.
From metadynamics to dynamics.
Tiwary, Pratyush; Parrinello, Michele
2013-12-06
Metadynamics is a commonly used and successful enhanced sampling method. By the introduction of a history dependent bias which depends on a restricted number of collective variables it can explore complex free energy surfaces characterized by several metastable states separated by large free energy barriers. Here we extend its scope by introducing a simple yet powerful method for calculating the rates of transition between different metastable states. The method does not rely on a previous knowledge of the transition states or reaction coordinates, as long as collective variables are known that can distinguish between the various stable minima in free energy space. We demonstrate that our method recovers the correct escape rates out of these stable states and also preserves the correct sequence of state-to-state transitions, with minimal extra computational effort needed over ordinary metadynamics. We apply the formalism to three different problems and in each case find excellent agreement with the results of long unbiased molecular dynamics runs.
NASA Astrophysics Data System (ADS)
Tiwary, Pratyush; Parrinello, Michele
2013-12-01
Metadynamics is a commonly used and successful enhanced sampling method. By the introduction of a history dependent bias which depends on a restricted number of collective variables it can explore complex free energy surfaces characterized by several metastable states separated by large free energy barriers. Here we extend its scope by introducing a simple yet powerful method for calculating the rates of transition between different metastable states. The method does not rely on a previous knowledge of the transition states or reaction coordinates, as long as collective variables are known that can distinguish between the various stable minima in free energy space. We demonstrate that our method recovers the correct escape rates out of these stable states and also preserves the correct sequence of state-to-state transitions, with minimal extra computational effort needed over ordinary metadynamics. We apply the formalism to three different problems and in each case find excellent agreement with the results of long unbiased molecular dynamics runs.
Seifert, Erin L; Fiehn, Oliver; Bezaire, Véronic; Bickel, David R; Wohlgemuth, Gert; Adams, Sean H; Harper, Mary-Ellen
2010-03-24
Incomplete or limited long-chain fatty acid (LCFA) combustion in skeletal muscle has been associated with insulin resistance. Signals that are responsive to shifts in LCFA beta-oxidation rate or degree of intramitochondrial catabolism are hypothesized to regulate second messenger systems downstream of the insulin receptor. Recent evidence supports a causal link between mitochondrial LCFA combustion in skeletal muscle and insulin resistance. We have used unbiased metabolite profiling of mouse muscle mitochondria with the aim of identifying candidate metabolites within or effluxed from mitochondria and that are shifted with LCFA combustion rate. Large-scale unbiased metabolomics analysis was performed using GC/TOF-MS on buffer and mitochondrial matrix fractions obtained prior to and after 20 min of palmitate catabolism (n = 7 mice/condition). Three palmitate concentrations (2, 9 and 19 microM; corresponding to low, intermediate and high oxidation rates) and 9 microM palmitate plus tricarboxylic acid (TCA) cycle and electron transport chain inhibitors were each tested and compared to zero palmitate control incubations. Paired comparisons of the 0 and 20 min samples were made by Student's t-test. False discovery rate were estimated and Type I error rates assigned. Major metabolite groups were organic acids, amines and amino acids, free fatty acids and sugar phosphates. Palmitate oxidation was associated with unique profiles of metabolites, a subset of which correlated to palmitate oxidation rate. In particular, palmitate oxidation rate was associated with distinct changes in the levels of TCA cycle intermediates within and effluxed from mitochondria. This proof-of-principle study establishes that large-scale metabolomics methods can be applied to organelle-level models to discover metabolite patterns reflective of LCFA combustion, which may lead to identification of molecules linking muscle fat metabolism and insulin signaling. Our results suggest that future studies should focus on the fate of effluxed TCA cycle intermediates and on mechanisms ensuring their replenishment during LCFA metabolism in skeletal muscle.
Seifert, Erin L.; Fiehn, Oliver; Bezaire, Véronic; Bickel, David R.; Wohlgemuth, Gert; Adams, Sean H.; Harper, Mary-Ellen
2010-01-01
Background/Aim Incomplete or limited long-chain fatty acid (LCFA) combustion in skeletal muscle has been associated with insulin resistance. Signals that are responsive to shifts in LCFA β-oxidation rate or degree of intramitochondrial catabolism are hypothesized to regulate second messenger systems downstream of the insulin receptor. Recent evidence supports a causal link between mitochondrial LCFA combustion in skeletal muscle and insulin resistance. We have used unbiased metabolite profiling of mouse muscle mitochondria with the aim of identifying candidate metabolites within or effluxed from mitochondria and that are shifted with LCFA combustion rate. Methodology/Principal Findings Large-scale unbiased metabolomics analysis was performed using GC/TOF-MS on buffer and mitochondrial matrix fractions obtained prior to and after 20 min of palmitate catabolism (n = 7 mice/condition). Three palmitate concentrations (2, 9 and 19 µM; corresponding to low, intermediate and high oxidation rates) and 9 µM palmitate plus tricarboxylic acid (TCA) cycle and electron transport chain inhibitors were each tested and compared to zero palmitate control incubations. Paired comparisons of the 0 and 20 min samples were made by Student's t-test. False discovery rate were estimated and Type I error rates assigned. Major metabolite groups were organic acids, amines and amino acids, free fatty acids and sugar phosphates. Palmitate oxidation was associated with unique profiles of metabolites, a subset of which correlated to palmitate oxidation rate. In particular, palmitate oxidation rate was associated with distinct changes in the levels of TCA cycle intermediates within and effluxed from mitochondria. Conclusions/Significance This proof-of-principle study establishes that large-scale metabolomics methods can be applied to organelle-level models to discover metabolite patterns reflective of LCFA combustion, which may lead to identification of molecules linking muscle fat metabolism and insulin signaling. Our results suggest that future studies should focus on the fate of effluxed TCA cycle intermediates and on mechanisms ensuring their replenishment during LCFA metabolism in skeletal muscle. PMID:20352092
Unbiased feature selection in learning random forests for high-dimensional data.
Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi
2015-01-01
Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.
Gupta, Manan; Joshi, Amitabh; Vidya, T N C
2017-01-01
Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates. Therefore, the effect of social organization on bias in population estimation could be removed by using POPAN with specific parameter combinations, to obtain population size estimates in a social species.
Joshi, Amitabh; Vidya, T. N. C.
2017-01-01
Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates. Therefore, the effect of social organization on bias in population estimation could be removed by using POPAN with specific parameter combinations, to obtain population size estimates in a social species. PMID:28306735
Aigner, Annette; Grittner, Ulrike; Becher, Heiko
2018-01-01
Low response rates in epidemiologic research potentially lead to the recruitment of a non-representative sample of controls in case-control studies. Problems in the unbiased estimation of odds ratios arise when characteristics causing the probability of participation are associated with exposure and outcome. This is a specific setting of selection bias and a realistic hazard in many case-control studies. This paper formally describes the problem and shows its potential extent, reviews existing approaches for bias adjustment applicable under certain conditions, compares and applies them. We focus on two scenarios: a characteristic C causing differential participation of controls is linked to the outcome through its association with risk factor E (scenario I), and C is additionally a genuine risk factor itself (scenario II). We further assume external data sources are available which provide an unbiased estimate of C in the underlying population. Given these scenarios, we (i) review available approaches and their performance in the setting of bias due to differential participation; (ii) describe two existing approaches to correct for the bias in both scenarios in more detail; (iii) present the magnitude of the resulting bias by simulation if the selection of a non-representative sample is ignored; and (iv) demonstrate the approaches' application via data from a case-control study on stroke. The bias of the effect measure for variable E in scenario I and C in scenario II can be large and should therefore be adjusted for in any analysis. It is positively associated with the difference in response rates between groups of the characteristic causing differential participation, and inversely associated with the total response rate in the controls. Adjustment in a standard logistic regression framework is possible in both scenarios if the population distribution of the characteristic causing differential participation is known or can be approximated well.
Farmer, Jocelyn R; Ong, Mei-Sing; Barmettler, Sara; Yonker, Lael M; Fuleihan, Ramsay; Sullivan, Kathleen E; Cunningham-Rundles, Charlotte; Walter, Jolan E
2017-01-01
Common variable immunodeficiency (CVID) is increasingly recognized for its association with autoimmune and inflammatory complications. Despite recent advances in immunophenotypic and genetic discovery, clinical care of CVID remains limited by our inability to accurately model risk for non-infectious disease development. Herein, we demonstrate the utility of unbiased network clustering as a novel method to analyze inter-relationships between non-infectious disease outcomes in CVID using databases at the United States Immunodeficiency Network (USIDNET), the centralized immunodeficiency registry of the United States, and Partners, a tertiary care network in Boston, MA, USA, with a shared electronic medical record amenable to natural language processing. Immunophenotypes were comparable in terms of native antibody deficiencies, low titer response to pneumococcus, and B cell maturation arrest. However, recorded non-infectious disease outcomes were more substantial in the Partners cohort across the spectrum of lymphoproliferation, cytopenias, autoimmunity, atopy, and malignancy. Using unbiased network clustering to analyze 34 non-infectious disease outcomes in the Partners cohort, we further identified unique patterns of lymphoproliferative (two clusters), autoimmune (two clusters), and atopic (one cluster) disease that were defined as CVID non-infectious endotypes according to discrete and non-overlapping immunophenotypes. Markers were both previously described {high serum IgE in the atopic cluster [odds ratio (OR) 6.5] and low class-switched memory B cells in the total lymphoproliferative cluster (OR 9.2)} and novel [low serum C3 in the total lymphoproliferative cluster (OR 5.1)]. Mortality risk in the Partners cohort was significantly associated with individual non-infectious disease outcomes as well as lymphoproliferative cluster 2, specifically (OR 5.9). In contrast, unbiased network clustering failed to associate known comorbidities in the adult USIDNET cohort. Together, these data suggest that unbiased network clustering can be used in CVID to redefine non-infectious disease inter-relationships; however, applicability may be limited to datasets well annotated through mechanisms such as natural language processing. The lymphoproliferative, autoimmune, and atopic Partners CVID endotypes herein described can be used moving forward to streamline genetic and biomarker discovery and to facilitate early screening and intervention in CVID patients at highest risk for autoimmune and inflammatory progression.
Gormley, Andrew M.; Forsyth, David M.; Wright, Elaine F.; Lyall, John; Elliott, Mike; Martini, Mark; Kappers, Benno; Perry, Mike; McKay, Meredith
2015-01-01
There is interest in large-scale and unbiased monitoring of biodiversity status and trend, but there are few published examples of such monitoring being implemented. The New Zealand Department of Conservation is implementing a monitoring program that involves sampling selected biota at the vertices of an 8-km grid superimposed over the 8.6 million hectares of public conservation land that it manages. The introduced brushtail possum (Trichosurus Vulpecula) is a major threat to some biota and is one taxon that they wish to monitor and report on. A pilot study revealed that the traditional method of monitoring possums using leg-hold traps set for two nights, termed the Trap Catch Index, was a constraint on the cost and logistical feasibility of the monitoring program. A phased implementation of the monitoring program was therefore conducted to collect data for evaluating the trade-off between possum occupancy–abundance estimates and the costs of sampling for one night rather than two nights. Reducing trapping effort from two nights to one night along four trap-lines reduced the estimated costs of monitoring by 5.8% due to savings in labour, food and allowances; it had a negligible effect on estimated national possum occupancy but resulted in slightly higher and less precise estimates of relative possum abundance. Monitoring possums for one night rather than two nights would provide an annual saving of NZ$72,400, with 271 fewer field days required for sampling. Possums occupied 60% (95% credible interval; 53–68) of sampling locations on New Zealand’s public conservation land, with a mean relative abundance (Trap Catch Index) of 2.7% (2.0–3.5). Possum occupancy and abundance were higher in forest than in non-forest habitats. Our case study illustrates the need to evaluate relationships between sampling design, cost, and occupancy–abundance estimates when designing and implementing large-scale occupancy–abundance monitoring programs. PMID:26029890
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
Ethnic Group Bias in Intelligence Test Items.
ERIC Educational Resources Information Center
Scheuneman, Janice
In previous studies of ethnic group bias in intelligence test items, the question of bias has been confounded with ability differences between the ethnic group samples compared. The present study is based on a conditional probability model in which an unbiased item is defined as one where the probability of a correct response to an item is the…
Michael D. Bell; James O. Sickman; Andrzej Bytnerowicz; Pamela E. Padgett; Edith B. Allen
2014-01-01
The sources and oxidation pathways of atmospheric nitric acid (HNO3) can be evaluated using the isotopic signatures of oxygen (O) and nitrogen (N). This study evaluated the ability of Nylasorb nylon filters to passively collect unbiased isotopologues of atmospheric HNO3 under controlled and field conditions. Filters...
NASA Technical Reports Server (NTRS)
Joseph, Robert D.; Hora, Joseph; Stockton, Alan; Hu, Esther; Sanders, David
1997-01-01
This report concerns one of the major observational studies in the ISO Central Programme, the ISO Normal Galaxy Survey. This is a survey of an unbiased sample of spiral and lenticular galaxies selected from the Revised Shapley-Ames Catalog. It is therefore optically-selected, with a brightness limit of blue magnitude = 12, and otherwise randomly chosen. The original sample included 150 galaxies, but this was reduced to 74 when the allocated observing time was expended because the ISO overheads encountered in flight were much larger than predicted.
Validation of a Monte Carlo Simulation of Binary Time Series.
1981-09-18
the probability distribution corresponding to the population from which the n sample vectors are generated. Simple unbiased estimators were chosen for...Cowcept A s*us Agew Bethesd, Marylnd H. L. Wauom Am D. RoQuE SymMS Reserch Brach , p" Ssms Delsbian September 18, 1981 DTIC EL E C T E SEP 24 =I98ST...is generated from the sample of such vectors produced by several independent replications of the Monte Carlo simulation. Then the validity of the
Enhanced Conformational Sampling of N-Glycans in Solution with Replica State Exchange Metadynamics.
Galvelis, Raimondas; Re, Suyong; Sugita, Yuji
2017-05-09
Molecular dynamics (MD) simulation of a N-glycan in solution is challenging because of high-energy barriers of the glycosidic linkages, functional group rotational barriers, and numerous intra- and intermolecular hydrogen bonds. In this study, we apply different enhanced conformational sampling approaches, namely, metadynamics (MTD), the replica-exchange MD (REMD), and the recently proposed replica state exchange MTD (RSE-MTD), to a N-glycan in solution and compare the conformational sampling efficiencies of the approaches. MTD helps to cross the high-energy barrier along the ω angle by utilizing a bias potential, but it cannot enhance sampling of the other degrees of freedom. REMD ensures moderate-energy barrier crossings by exchanging temperatures between replicas, while it hardly crosses the barriers along ω. In contrast, RSE-MTD succeeds to cross the high-energy barrier along ω as well as to enhance sampling of the other degrees of freedom. We tested two RSE-MTD schemes: in one scheme, 64 replicas were simulated with the bias potential along ω at different temperatures, while simulations of four replicas were performed with the bias potentials for different CVs at 300 K. In both schemes, one unbiased replica at 300 K was included to compute conformational properties of the glycan. The conformational sampling of the former is better than the other enhanced sampling methods, while the latter shows reasonable performance without spending large computational resources. The latter scheme is likely to be useful when a N-glycan-attached protein is simulated.
Sampling scales define occupancy and underlying occupancy-abundance relationships in animals.
Steenweg, Robin; Hebblewhite, Mark; Whittington, Jesse; Lukacs, Paul; McKelvey, Kevin
2018-01-01
Occupancy-abundance (OA) relationships are a foundational ecological phenomenon and field of study, and occupancy models are increasingly used to track population trends and understand ecological interactions. However, these two fields of ecological inquiry remain largely isolated, despite growing appreciation of the importance of integration. For example, using occupancy models to infer trends in abundance is predicated on positive OA relationships. Many occupancy studies collect data that violate geographical closure assumptions due to the choice of sampling scales and application to mobile organisms, which may change how occupancy and abundance are related. Little research, however, has explored how different occupancy sampling designs affect OA relationships. We develop a conceptual framework for understanding how sampling scales affect the definition of occupancy for mobile organisms, which drives OA relationships. We explore how spatial and temporal sampling scales, and the choice of sampling unit (areal vs. point sampling), affect OA relationships. We develop predictions using simulations, and test them using empirical occupancy data from remote cameras on 11 medium-large mammals. Surprisingly, our simulations demonstrate that when using point sampling, OA relationships are unaffected by spatial sampling grain (i.e., cell size). In contrast, when using areal sampling (e.g., species atlas data), OA relationships are affected by spatial grain. Furthermore, OA relationships are also affected by temporal sampling scales, where the curvature of the OA relationship increases with temporal sampling duration. Our empirical results support these predictions, showing that at any given abundance, the spatial grain of point sampling does not affect occupancy estimates, but longer surveys do increase occupancy estimates. For rare species (low occupancy), estimates of occupancy will quickly increase with longer surveys, even while abundance remains constant. Our results also clearly demonstrate that occupancy for mobile species without geographical closure is not true occupancy. The independence of occupancy estimates from spatial sampling grain depends on the sampling unit. Point-sampling surveys can, however, provide unbiased estimates of occupancy for multiple species simultaneously, irrespective of home-range size. The use of occupancy for trend monitoring needs to explicitly articulate how the chosen sampling scales define occupancy and affect the occupancy-abundance relationship. © 2017 by the Ecological Society of America.
ERIC Educational Resources Information Center
Chetty, Raj; Friedman, John N.; Rockoff, Jonah E.
2011-01-01
Are teachers' impacts on students' test scores ("value-added") a good measure of their quality? This question has sparked debate largely because of disagreement about (1) whether value-added (VA) provides unbiased estimates of teachers' impacts on student achievement and (2) whether high-VA teachers improve students' long-term outcomes.…
A statistical test of unbiased evolution of body size in birds.
Bokma, Folmer
2002-12-01
Of the approximately 9500 bird species, the vast majority is small-bodied. That is a general feature of evolutionary lineages, also observed for instance in mammals and plants. The avian interspecific body size distribution is right-skewed even on a logarithmic scale. That has previously been interpreted as evidence that body size evolution has been biased. However, a procedure to test for unbiased evolution from the shape of body size distributions was lacking. In the present paper unbiased body size evolution is defined precisely, and a statistical test is developed based on Monte Carlo simulation of unbiased evolution. Application of the test to birds suggests that it is highly unlikely that avian body size evolution has been unbiased as defined. Several possible explanations for this result are discussed. A plausible explanation is that the general model of unbiased evolution assumes that population size and generation time do not affect the evolutionary variability of body size; that is, that micro- and macroevolution are decoupled, which theory suggests is not likely to be the case.
Unbiased contaminant removal for 3D galaxy power spectrum measurements
NASA Astrophysics Data System (ADS)
Kalus, B.; Percival, W. J.; Bacon, D. J.; Samushia, L.
2016-11-01
We assess and develop techniques to remove contaminants when calculating the 3D galaxy power spectrum. We separate the process into three separate stages: (I) removing the contaminant signal, (II) estimating the uncontaminated cosmological power spectrum and (III) debiasing the resulting estimates. For (I), we show that removing the best-fitting contaminant (mode subtraction) and setting the contaminated components of the covariance to be infinite (mode deprojection) are mathematically equivalent. For (II), performing a quadratic maximum likelihood (QML) estimate after mode deprojection gives an optimal unbiased solution, although it requires the manipulation of large N_mode^2 matrices (Nmode being the total number of modes), which is unfeasible for recent 3D galaxy surveys. Measuring a binned average of the modes for (II) as proposed by Feldman, Kaiser & Peacock (FKP) is faster and simpler, but is sub-optimal and gives rise to a biased solution. We present a method to debias the resulting FKP measurements that does not require any large matrix calculations. We argue that the sub-optimality of the FKP estimator compared with the QML estimator, caused by contaminants, is less severe than that commonly ignored due to the survey window.
Conformational free energy modeling of druglike molecules by metadynamics in the WHIM space.
Spiwok, Vojtěch; Hlat-Glembová, Katarína; Tvaroška, Igor; Králová, Blanka
2012-03-26
Protein-ligand affinities can be significantly influenced not only by the interaction itself but also by conformational equilibrium of both binding partners, free ligand and free protein. Identification of important conformational families of a ligand and prediction of their thermodynamics is important for efficient ligand design. Here we report conformational free energy modeling of nine small-molecule drugs in explicitly modeled water by metadynamics with a bias potential applied in the space of weighted holistic invariant molecular (WHIM) descriptors. Application of metadynamics enhances conformational sampling compared to unbiased molecular dynamics simulation and allows to predict relative free energies of key conformations. Selected free energy minima and one example of transition state were tested by a series of unbiased molecular dynamics simulation. Comparison of free energy surfaces of free and target-bound Imatinib provides an estimate of free energy penalty of conformational change induced by its binding to the target. © 2012 American Chemical Society
Mutually unbiased bases and semi-definite programming
NASA Astrophysics Data System (ADS)
Brierley, Stephen; Weigert, Stefan
2010-11-01
A complex Hilbert space of dimension six supports at least three but not more than seven mutually unbiased bases. Two computer-aided analytical methods to tighten these bounds are reviewed, based on a discretization of parameter space and on Gröbner bases. A third algorithmic approach is presented: the non-existence of more than three mutually unbiased bases in composite dimensions can be decided by a global optimization method known as semidefinite programming. The method is used to confirm that the spectral matrix cannot be part of a complete set of seven mutually unbiased bases in dimension six.
Estimating mortality rates of adult fish from entrainment through the propellers of river towboats
Gutreuter, S.; Dettmers, J.M.; Wahl, David H.
2003-01-01
We developed a method to estimate mortality rates of adult fish caused by entrainment through the propellers of commercial towboats operating in river channels. The method combines trawling while following towboats (to recover a fraction of the kills) and application of a hydrodynamic model of diffusion (to estimate the fraction of the total kills collected in the trawls). The sampling problem is unusual and required quantifying relatively rare events. We first examined key statistical properties of the entrainment mortality rate estimators using Monte Carlo simulation, which demonstrated that a design-based estimator and a new ad hoc estimator are both unbiased and converge to the true value as the sample size becomes large. Next, we estimated the entrainment mortality rates of adult fishes in Pool 26 of the Mississippi River and the Alton Pool of the Illinois River, where we observed kills that we attributed to entrainment. Our estimates of entrainment mortality rates were 2.52 fish/km of towboat travel (80% confidence interval, 1.00-6.09 fish/km) for gizzard shad Dorosoma cepedianum, 0.13 fish/km (0.00-0.41) for skipjack herring Alosa chrysochloris, and 0.53 fish/km (0.00-1.33) for both shovelnose sturgeon Scaphirhynchus platorynchus and smallmouth buffalo Ictiobus bubalus. Our approach applies more broadly to commercial vessels operating in confined channels, including other large rivers and intracoastal waterways.
Large biases in regression-based constituent flux estimates: causes and diagnostic tools
Hirsch, Robert M.
2014-01-01
It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. This analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to them. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.
NASA Technical Reports Server (NTRS)
Jolliff, Bradley L.; Rockow, Kaylynn M.; Korotev, Randy L.; Haskin, Larry A.
1996-01-01
Through analysis by instrumental neutron activation (INAA) of 789 individual lithic fragments from the 2 mm-4 mm grain-size fractions of five Apollo 17 soil samples (72443, 72503, 73243, 76283, and 76503) and petrographic examination of a subset, we have determined the diversity and proportions of rock types recorded within soils from the highland massifs. The distribution of rock types at the site, as recorded by lithic fragments in the soils, is an alternative to the distribution inferred from the limited number of large rock samples. The compositions and proportions of 2 mm-4 mm fragments provide a bridge between compositions of less than 1 mm fines and types and proportions of rocks observed in large collected breccias and their clasts. The 2 mm-4 mm fraction of soil from South Massif, represented by an unbiased set of lithic fragments from station-2 samples 72443 and 72503, consists of 71% noritic impact-melt breccia, 7% Incompatible-Trace-Element-(ITE)-poor highland rock types (mainly granulitic breccias), 19% agglutinates and regolith breccias, 1% high-Ti mare basalt, and 2% others (very-low-Ti (VLT) basalt, monzogabbro breccia, and metal). In contrast, the 2 mm - 4 mm fraction of a soil from the North Massif, represented by an unbiased set of lithic fragments from station-6 sample 76503, has a greater proportion of ITE-poor highland rock types and mare-basalt fragments: it consists of 29% ITE-poor highland rock types (mainly granulitic breccias and troctolitic anorthosite), 25% impact-melt breccia, 13% high-Ti mare basalt, 31 % agglutinates and regolith breccias, 1% orange glass and related breccia, and 1% others. Based on a comparison of mass- weighted mean compositions of the lithic fragments with compositions of soil fines from all Apollo 17 highland stations, differences between the station-2 and station-6 samples are representative of differences between available samples from the two massifs. From the distribution of different rock types and their compositions, we conclude the following: (1) North-Massif and South-Massif soil samples differ significantly in types and proportions of ITE-poor highland components and ITE-rich impact-melt-breccia components. These differences reflect crudely layered massifs and known local geology. The greater percentage of impact-melt breccia in the South- Massif light-mantle soil stems from derivation of the light mantle from the top of the massif, which apparently is richer in noritic impact-melt breccia than are lower parts of the massifs. (2) At station 2, the 2 mm-4 mm grain-size fraction is enriched in impact-melt breccias compared to the less than 1 mm fraction, suggesting that the <1 mm fraction within the light mantle has a greater proportion of lithologies such as granulitic breccias which are more prevalent lower in the massifs and which we infer to be older (pre-basin) highland components. (3) Soil from station 6, North Massif, contains magnesian troctolitic anorthosite, which is a component that is rare in station-2 South-Massif,contains magnesian troctolitic in impact-melt breccia interpreted by most investigators to be ejecta from the Serenitatis basin.
Thomas B. Lynch; Rodney E. Will; Rider Reynolds
2013-01-01
Preliminary results are given for development of an eastern redcedar (Juniperus virginiana) cubic-volume equation based on measurements of redcedar sample tree stem volume using dendrometry with Monte Carlo integration. Monte Carlo integration techniques can be used to provide unbiased estimates of stem cubic-foot volume based on upper stem diameter...
F. Mauro; Vicente Monleon; H. Temesgen
2015-01-01
Small area estimation (SAE) techniques have been successfully applied in forest inventories to provide reliable estimates for domains where the sample size is small (i.e. small areas). Previous studies have explored the use of either Area Level or Unit Level Empirical Best Linear Unbiased Predictors (EBLUPs) in a univariate framework, modeling each variable of interest...
El-Kassaby, Yousry A; Funda, Tomas; Lai, Ben S K
2010-01-01
The impact of female reproductive success on the mating system, gene flow, and genetic diversity of the filial generation was studied using a random sample of 801 bulk seed from a 49-clone Pseudotsuga menziesii seed orchard. We used microsatellite DNA fingerprinting and pedigree reconstruction to assign each seed's maternal and paternal parents and directly estimated clonal reproductive success, selfing rate, and the proportion of seed sired by outside pollen sources. Unlike most family array mating system and gene flow studies conducted on natural and experimental populations, which used an equal number of seeds per maternal genotype and thus generating unbiased inferences only on male reproductive success, the random sample we used was a representative of the entire seed crop; therefore, provided a unique opportunity to draw unbiased inferences on both female and male reproductive success variation. Selfing rate and the number of seed sired by outside pollen sources were found to be a function of female fertility variation. This variation also substantially and negatively affected female effective population size. Additionally, the results provided convincing evidence that the use of clone size as a proxy to fertility is questionable and requires further consideration.
Toward a Principled Sampling Theory for Quasi-Orders
Ünlü, Ali; Schrepp, Martin
2016-01-01
Quasi-orders, that is, reflexive and transitive binary relations, have numerous applications. In educational theories, the dependencies of mastery among the problems of a test can be modeled by quasi-orders. Methods such as item tree or Boolean analysis that mine for quasi-orders in empirical data are sensitive to the underlying quasi-order structure. These data mining techniques have to be compared based on extensive simulation studies, with unbiased samples of randomly generated quasi-orders at their basis. In this paper, we develop techniques that can provide the required quasi-order samples. We introduce a discrete doubly inductive procedure for incrementally constructing the set of all quasi-orders on a finite item set. A randomization of this deterministic procedure allows us to generate representative samples of random quasi-orders. With an outer level inductive algorithm, we consider the uniform random extensions of the trace quasi-orders to higher dimension. This is combined with an inner level inductive algorithm to correct the extensions that violate the transitivity property. The inner level correction step entails sampling biases. We propose three algorithms for bias correction and investigate them in simulation. It is evident that, on even up to 50 items, the new algorithms create close to representative quasi-order samples within acceptable computing time. Hence, the principled approach is a significant improvement to existing methods that are used to draw quasi-orders uniformly at random but cannot cope with reasonably large item sets. PMID:27965601
Toward a Principled Sampling Theory for Quasi-Orders.
Ünlü, Ali; Schrepp, Martin
2016-01-01
Quasi-orders, that is, reflexive and transitive binary relations, have numerous applications. In educational theories, the dependencies of mastery among the problems of a test can be modeled by quasi-orders. Methods such as item tree or Boolean analysis that mine for quasi-orders in empirical data are sensitive to the underlying quasi-order structure. These data mining techniques have to be compared based on extensive simulation studies, with unbiased samples of randomly generated quasi-orders at their basis. In this paper, we develop techniques that can provide the required quasi-order samples. We introduce a discrete doubly inductive procedure for incrementally constructing the set of all quasi-orders on a finite item set. A randomization of this deterministic procedure allows us to generate representative samples of random quasi-orders. With an outer level inductive algorithm, we consider the uniform random extensions of the trace quasi-orders to higher dimension. This is combined with an inner level inductive algorithm to correct the extensions that violate the transitivity property. The inner level correction step entails sampling biases. We propose three algorithms for bias correction and investigate them in simulation. It is evident that, on even up to 50 items, the new algorithms create close to representative quasi-order samples within acceptable computing time. Hence, the principled approach is a significant improvement to existing methods that are used to draw quasi-orders uniformly at random but cannot cope with reasonably large item sets.
Integrated fiducial sample mount and software for correlated microscopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timothy R McJunkin; Jill R. Scott; Tammy L. Trowbridge
2014-02-01
A novel design sample mount with integrated fiducials and software for assisting operators in easily and efficiently locating points of interest established in previous analytical sessions is described. The sample holder and software were evaluated with experiments to demonstrate the utility and ease of finding the same points of interest in two different microscopy instruments. Also, numerical analysis of expected errors in determining the same position with errors unbiased by a human operator was performed. Based on the results, issues related to acquiring reproducibility and best practices for using the sample mount and software were identified. Overall, the sample mountmore » methodology allows data to be efficiently and easily collected on different instruments for the same sample location.« less
Xu, Stanley; Clarke, Christina L; Newcomer, Sophia R; Daley, Matthew F; Glanz, Jason M
2018-05-16
Vaccine safety studies are often electronic health record (EHR)-based observational studies. These studies often face significant methodological challenges, including confounding and misclassification of adverse event. Vaccine safety researchers use self-controlled case series (SCCS) study design to handle confounding effect and employ medical chart review to ascertain cases that are identified using EHR data. However, for common adverse events, limited resources often make it impossible to adjudicate all adverse events observed in electronic data. In this paper, we considered four approaches for analyzing SCCS data with confirmation rates estimated from an internal validation sample: (1) observed cases, (2) confirmed cases only, (3) known confirmation rate, and (4) multiple imputation (MI). We conducted a simulation study to evaluate these four approaches using type I error rates, percent bias, and empirical power. Our simulation results suggest that when misclassification of adverse events is present, approaches such as observed cases, confirmed case only, and known confirmation rate may inflate the type I error, yield biased point estimates, and affect statistical power. The multiple imputation approach considers the uncertainty of estimated confirmation rates from an internal validation sample, yields a proper type I error rate, largely unbiased point estimate, proper variance estimate, and statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
SHARDS: Survey for High-z Absorption Red & Dead Sources
NASA Astrophysics Data System (ADS)
Pérez-González, P. G.; Cava, A.
2013-05-01
SHARDS, an ESO/GTC Large Program, is an ultra-deep (26.5 mag) spectro-photometric survey with GTC/OSIRIS designed to select and study massive passively evolving galaxies at z=1.0-2.3 in the GOODS-N field using a set of 24 medium-band filters (FWHM~17 nm) covering the 500-950 nm spectral range. Our observing strategy has been planned to detect, for z>1 sources, the prominent Mg absorption feature (at rest-frame ~280 nm), a distinctive, necessary, and sufficient feature of evolved stellar populations (older than 0.5 Gyr). These observations are being used to: (1) derive for the first time an unbiased sample of high-z quiescent galaxies, which extends to fainter magnitudes the samples selected with color techniques and spectroscopic surveys; (2) derive accurate ages and stellar masses based on robust measurements of spectral features such as the Mg_UV or D(4000) indices; (3) measure their redshift with an accuracy Δz/(1+z)<0.02; and (4) study emission-line galaxies (starbursts and AGN) up to very high redshifts. The well-sampled optical SEDs provided by SHARDS for all sources in the GOODS-N field are a valuable complement for current and future surveys carried out with other telescopes (e.g., Spitzer, HST, and Herschel).
Galaxy groups in the low-redshift Universe
NASA Astrophysics Data System (ADS)
Lim, S. H.; Mo, H. J.; Lu, Yi; Wang, Huiyuan; Yang, Xiaohu
2017-09-01
We apply a halo-based group finder to four large redshift surveys, the 2MRS (Two Micron All-Sky Redshift Survey), 6dFGS (Six-degree Field Galaxy Survey), SDSS (Sloan Digital Sky Survey) and 2dFGRS (Two-degree Field Galaxy Redshift Survey), to construct group catalogues in the low-redshift Universe. The group finder is based on that of Yang et al. but with an improved halo mass assignment so that it can be applied uniformly to various redshift surveys of galaxies. Halo masses are assigned to groups according to proxies based on the stellar mass/luminosity of member galaxies. The performances of the group finder in grouping galaxies according to common haloes and in halo mass assignments are tested using realistic mock samples constructed from hydrodynamical simulations and empirical models of galaxy occupation in dark matter haloes. Our group finder finds ∼94 per cent of the correct true member galaxies for 90-95 per cent of the groups in the mock samples; the halo masses assigned by the group finder are un-biased with respect to the true halo masses, and have a typical uncertainty of ∼0.2 dex. The properties of group catalogues constructed from the observational samples are described and compared with other similar catalogues in the literature.
Link, W.A.; Armitage, Peter; Colton, Theodore
1998-01-01
Unbiasedness is probably the best known criterion for evaluating the performance of estimators. This note describes unbiasedness, demonstrating various failings of the criterion. It is shown that unbiased estimators might not exist, or might not be unique; an example of a unique but clearly unacceptable unbiased estimator is given. It is shown that unbiased estimators are not translation invariant. Various alternative criteria are described, and are illustrated through examples.
FAST TRACK COMMUNICATION: Affine constellations without mutually unbiased counterparts
NASA Astrophysics Data System (ADS)
Weigert, Stefan; Durt, Thomas
2010-10-01
It has been conjectured that a complete set of mutually unbiased bases in a space of dimension d exists if and only if there is an affine plane of order d. We introduce affine constellations and compare their existence properties with those of mutually unbiased constellations. The observed discrepancies make a deeper relation between the two existence problems unlikely.
Sex differences and gender-invariance of mother-reported childhood problem behavior.
van der Sluis, Sophie; Polderman, Tinca J C; Neale, Michael C; Verhulst, Frank C; Posthuma, Danielle; Dieleman, Gwen C
2017-09-01
Prevalence and severity of childhood behavioral problems differ between boys and girls, and in psychiatry, testing for gender differences is common practice. Population-based studies show that many psychopathology scales are (partially) Measurement Invariance (MI) with respect to gender, i.e. are unbiased. It is, however, unclear whether these studies generalize towards clinical samples. In a psychiatric outpatient sample, we tested whether the Child Behavior Checklist 6-18 (CBCL) is unbiased with respect to gender. We compared mean scores across gender of all syndrome scales of the CBCL in 3271 patients (63.3% boys) aged 6-18. Second, we tested for MI on both the syndrome scale and the item-level using a stepwise modeling procedure. Six of the eight CBCL syndrome scales included one or more gender-biased items (12.6% of all items), resulting in slight over- or under-estimation of the absolute gender difference in mean scores. Two scales, Somatic Complaints and Rule-breaking Behavior, contained no biased items. The CBCL is a valid instrument to measure gender differences in problem behavior in children and adolescents from a clinical sample; while various gender-biased items were identified, the resulting bias was generally clinically irrelevant, and sufficient items per subscale remained after exclusion of biased items. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Naccache, Samia N.; Federman, Scot; Veeraraghavan, Narayanan; Zaharia, Matei; Lee, Deanna; Samayoa, Erik; Bouquet, Jerome; Greninger, Alexander L.; Luk, Ka-Cheung; Enge, Barryett; Wadford, Debra A.; Messenger, Sharon L.; Genrich, Gillian L.; Pellegrino, Kristen; Grard, Gilda; Leroy, Eric; Schneider, Bradley S.; Fair, Joseph N.; Martínez, Miguel A.; Isa, Pavel; Crump, John A.; DeRisi, Joseph L.; Sittler, Taylor; Hackett, John; Miller, Steve; Chiu, Charles Y.
2014-01-01
Unbiased next-generation sequencing (NGS) approaches enable comprehensive pathogen detection in the clinical microbiology laboratory and have numerous applications for public health surveillance, outbreak investigation, and the diagnosis of infectious diseases. However, practical deployment of the technology is hindered by the bioinformatics challenge of analyzing results accurately and in a clinically relevant timeframe. Here we describe SURPI (“sequence-based ultrarapid pathogen identification”), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences. Deployable on both cloud-based and standalone servers, SURPI leverages two state-of-the-art aligners for accelerated analyses, SNAP and RAPSearch, which are as accurate as existing bioinformatics tools but orders of magnitude faster in performance. In fast mode, SURPI detects viruses and bacteria by scanning data sets of 7–500 million reads in 11 min to 5 h, while in comprehensive mode, all known microorganisms are identified, followed by de novo assembly and protein homology searches for divergent viruses in 50 min to 16 h. SURPI has also directly contributed to real-time microbial diagnosis in acutely ill patients, underscoring its potential key role in the development of unbiased NGS-based clinical assays in infectious diseases that demand rapid turnaround times. PMID:24899342
Polynomial complexity despite the fermionic sign
NASA Astrophysics Data System (ADS)
Rossi, R.; Prokof'ev, N.; Svistunov, B.; Van Houcke, K.; Werner, F.
2017-04-01
It is commonly believed that in unbiased quantum Monte Carlo approaches to fermionic many-body problems, the infamous sign problem generically implies prohibitively large computational times for obtaining thermodynamic-limit quantities. We point out that for convergent Feynman diagrammatic series evaluated with a recently introduced Monte Carlo algorithm (see Rossi R., arXiv:1612.05184), the computational time increases only polynomially with the inverse error on thermodynamic-limit quantities.
Progress on Ultra-Dense Quantum Communication Using Integrated Photonic Architecture
2012-05-09
REPORT Progress on Ultra-Dense Quantum Communication Using Integrated Photonic Architecture 14. ABSTRACT 16. SECURITY CLASSIFICATION OF: The goal of...including the development of a large-alphabet quantum key distribution protocol that uses measurements in mutually unbiased bases. 1. REPORT DATE (DD-MM... quantum information, integrated optics, photonic integrated chip Dirk Englund, Karl Berggren, Jeffrey Shapiro, Chee Wei Wong, Franco Wong, and Gregory
Magic Angle Spinning NMR Metabolomics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhi Hu, Jian
Nuclear Magnetic Resonance (NMR) spectroscopy is a non-destructive, quantitative, reproducible, untargeted and unbiased method that requires no or minimal sample preparation, and is one of the leading analytical tools for metabonomics research [1-3]. The easy quantification and the no need of prior knowledge about compounds present in a sample associated with NMR are advantageous over other techniques [1,4]. 1H NMR is especially attractive because protons are present in virtually all metabolites and its NMR sensitivity is high, enabling the simultaneous identification and monitoring of a wide range of low molecular weight metabolites.
Some comments on Anderson and Pospahala's correction of bias in line transect sampling
Anderson, D.R.; Burnham, K.P.; Chain, B.R.
1980-01-01
ANDERSON and POSPAHALA (1970) investigated the estimation of wildlife population size using the belt or line transect sampling method and devised a correction for bias, thus leading to an estimator with interesting characteristics. This work was given a uniform mathematical framework in BURNHAM and ANDERSON (1976). In this paper we show that the ANDERSON-POSPAHALA estimator is optimal in the sense of being the (unique) best linear unbiased estimator within the class of estimators which are linear combinations of cell frequencies, provided certain assumptions are met.
Demographics of Starbursts in Nearby Seyfert Galaxies
NASA Astrophysics Data System (ADS)
Schinnerer, E.; Colbert, E.; Armus, L.; Scoville, N. Z.; Heckman, T.
2002-12-01
We investigate the frequency of circumnuclear starbursts in Seyfert galaxies using medium-resolution H and K band spectroscopy. An unbiased sample of ~20 nearby Seyfert galaxies was observed at the KeckII telescope with an average seeing of ~0.7''. Preliminary analysis shows strong stellar absorption lines for most galaxies in our sample. Comparison of stellar equivalent widths in the H and K band will allow us to determine the average age of the dominating stellar population. Evidence for an age trend with Seyfert type would provide a strong hint toward a starburst/AGN connection.
Multiple sclerosis and birth order.
James, W H
1984-01-01
Studies on the birth order of patients with multiple sclerosis have yielded contradictory conclusions. Most of the sets of data, however, have been tested by biased tests. Data that have been submitted to unbiased tests seem to suggest that cases are more likely to occur in early birth ranks. This should be tested on further samples and some comments are offered on how this should be done. PMID:6707558
Kinetics of Huperzine A Dissociation from Acetylcholinesterase via Multiple Unbinding Pathways.
Rydzewski, J; Jakubowski, R; Nowak, W; Grubmüller, H
2018-06-12
The dissociation of huperzine A (hupA) from Torpedo californica acetylcholinesterase ( TcAChE) was investigated by 4 μs unbiased and biased all-atom molecular dynamics (MD) simulations in explicit solvent. We performed our study using memetic sampling (MS) for the determination of reaction pathways (RPs), metadynamics to calculate free energy, and maximum-likelihood estimation (MLE) to recover kinetic rates from unbiased MD simulations. Our simulations suggest that the dissociation of hupA occurs mainly via two RPs: a front door along the axis of the active-site gorge (pwf) and through a new transient side door (pws), i.e., formed by the Ω-loop (residues 67-94 of TcAChE). An analysis of the inhibitor unbinding along the RPs suggests that pws is opened transiently after hupA and the Ω-loop reach a low free-energy transition state characterized by the orientation of the pyridone group of the inhibitor directed toward the Ω-loop plane. Unlike pws, pwf does not require large structural changes in TcAChE to be accessible. The estimated free energies and rates agree well with available experimental data. The dissociation rates along the unbinding pathways are similar, suggesting that the dissociation of hupA along pws is likely to be relevant. This indicates that perturbations to hupA- TcAChE interactions could potentially induce pathway hopping. In summary, our results characterize the slow-onset inhibition of TcAChE by hupA, which may provide the structural and energetic bases for the rational design of the next-generation slow-onset inhibitors with optimized pharmacokinetic properties for the treatment of Alzheimer's disease.
Highly sensitive and unbiased approach for elucidating antibody repertoires
Lin, Sherry G.; Ba, Zhaoqing; Du, Zhou; Zhang, Yu; Hu, Jiazhi; Alt, Frederick W.
2016-01-01
Developing B lymphocytes undergo V(D)J recombination to assemble germ-line V, D, and J gene segments into exons that encode the antigen-binding variable region of Ig heavy (H) and light (L) chains. IgH and IgL chains associate to form the B-cell receptor (BCR), which, upon antigen binding, activates B cells to secrete BCR as an antibody. Each of the huge number of clonally independent B cells expresses a unique set of IgH and IgL variable regions. The ability of V(D)J recombination to generate vast primary B-cell repertoires results from a combinatorial assortment of large numbers of different V, D, and J segments, coupled with diversification of the junctions between them to generate the complementary determining region 3 (CDR3) for antigen contact. Approaches to evaluate in depth the content of primary antibody repertoires and, ultimately, to study how they are further molded by secondary mutation and affinity maturation processes are of great importance to the B-cell development, vaccine, and antibody fields. We now describe an unbiased, sensitive, and readily accessible assay, referred to as high-throughput genome-wide translocation sequencing-adapted repertoire sequencing (HTGTS-Rep-seq), to quantify antibody repertoires. HTGTS-Rep-seq quantitatively identifies the vast majority of IgH and IgL V(D)J exons, including their unique CDR3 sequences, from progenitor and mature mouse B lineage cells via the use of specific J primers. HTGTS-Rep-seq also accurately quantifies DJH intermediates and V(D)J exons in either productive or nonproductive configurations. HTGTS-Rep-seq should be useful for studies of human samples, including clonal B-cell expansions, and also for following antibody affinity maturation processes. PMID:27354528
Mishra, Mark V; Bennett, Michele; Vincent, Armon; Lee, Olivia T; Lallas, Costas D; Trabulsi, Edouard J; Gomella, Leonard G; Dicker, Adam P; Showalter, Timothy N
2013-01-01
Qualitative research aimed at identifying patient acceptance of active surveillance (AS) has been identified as a public health research priority. The primary objective of this study was to determine if analysis of a large-sample of anonymous internet conversations (ICs) could be utilized to identify unmet public needs regarding AS. English-language ICs regarding prostate cancer (PC) treatment with AS from 2002-12 were identified using a novel internet search methodology. Web spiders were developed to mine, aggregate, and analyze content from the world-wide-web for ICs centered on AS. Collection of ICs was not restricted to any specific geographic region of origin. NLP was used to evaluate content and perform a sentiment analysis. Conversations were scored as positive, negative, or neutral. A sentiment index (SI) was subsequently calculated according to the following formula to compare temporal trends in public sentiment towards AS: [(# Positive IC/#Total IC)-(#Negative IC/#Total IC) x 100]. A total of 464 ICs were identified. Sentiment increased from -13 to +2 over the study period. The increase sentiment has been driven by increased patient emphasis on quality-of-life factors and endorsement of AS by national medical organizations. Unmet needs identified in these ICs include: a gap between quantitative data regarding long-term outcomes with AS vs. conventional treatments, desire for treatment information from an unbiased specialist, and absence of public role models managed with AS. This study demonstrates the potential utility of online patient communications to provide insight into patient preferences and decision-making. Based on our findings, we recommend that multidisciplinary clinics consider including an unbiased specialist to present treatment options and that future decision tools for AS include quantitative data regarding outcomes after AS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perley, D. A.; Perley, R. A.; Hjorth, J.
2015-03-10
Luminous infrared galaxies and submillimeter galaxies contribute significantly to stellar mass assembly and provide an important test of the connection between the gamma-ray burst (GRB) rate and that of overall cosmic star formation. We present sensitive 3 GHz radio observations using the Karl G. Jansky Very Large Array of 32 uniformly selected GRB host galaxies spanning a redshift range from 0 < z < 2.5, providing the first fully dust- and sample-unbiased measurement of the fraction of GRBs originating from the universe's most bolometrically luminous galaxies. Four galaxies are detected, with inferred radio star formation rates (SFRs) ranging between 50 and 300 Mmore » {sub ☉} yr{sup –1}. Three of the four detections correspond to events consistent with being optically obscured 'dark' bursts. Our overall detection fraction implies that between 9% and 23% of GRBs between 0.5 < z < 2.5 occur in galaxies with S {sub 3GHz} > 10 μJy, corresponding to SFR > 50 M {sub ☉} yr{sup –1} at z ∼ 1 or >250 M {sub ☉} yr{sup –1} at z ∼ 2. Similar galaxies contribute approximately 10%-30% of all cosmic star formation, so our results are consistent with a GRB rate that is not strongly biased with respect to the total SFR of a galaxy. However, all four radio-detected hosts have stellar masses significantly lower than IR/submillimeter-selected field galaxies of similar luminosities. We suggest that the GRB rate may be suppressed in metal-rich environments but independently enhanced in intense starbursts, producing a strong efficiency dependence on mass but little net dependence on bulk galaxy SFR.« less
Comet composition and density analyzer
NASA Technical Reports Server (NTRS)
Clark, B. C.
1982-01-01
Distinctions between cometary material and other extraterrestrial materials (meteorite suites and stratospherically-captured cosmic dust) are addressed. The technique of X-ray fluorescence (XRF) for analysis of elemental composition is involved. Concomitant with these investigations, the problem of collecting representative samples of comet dust (for rendezvous missions) was solved, and several related techniques such as mineralogic analysis (X-ray diffraction), direct analysis of the nucleus without docking (electron macroprobe), dust flux rate measurement, and test sample preparation were evaluated. An explicit experiment concept based upon X-ray fluorescence analysis of biased and unbiased sample collections was scoped and proposed for a future rendezvous mission with a short-period comet.
Optimal reconstruction of the states in qutrit systems
NASA Astrophysics Data System (ADS)
Yan, Fei; Yang, Ming; Cao, Zhuo-Liang
2010-10-01
Based on mutually unbiased measurements, an optimal tomographic scheme for the multiqutrit states is presented explicitly. Because the reconstruction process of states based on mutually unbiased states is free of information waste, we refer to our scheme as the optimal scheme. By optimal we mean that the number of the required conditional operations reaches the minimum in this tomographic scheme for the states of qutrit systems. Special attention will be paid to how those different mutually unbiased measurements are realized; that is, how to decompose each transformation that connects each mutually unbiased basis with the standard computational basis. It is found that all those transformations can be decomposed into several basic implementable single- and two-qutrit unitary operations. For the three-qutrit system, there exist five different mutually unbiased-bases structures with different entanglement properties, so we introduce the concept of physical complexity to minimize the number of nonlocal operations needed over the five different structures. This scheme is helpful for experimental scientists to realize the most economical reconstruction of quantum states in qutrit systems.
Host Galaxy Properties of the Swift BAT Ultra Hard X-Ray Selected AGN
NASA Technical Reports Server (NTRS)
Koss, Michael; Mushotzky, Richard; Veilleux, Sylvain; Winter, Lisa M.; Baumgartner, Wayne; Tueller, Jack; Gehrels, Neil; Valencic, Lynne
2011-01-01
We have assembled the largest sample of ultra hard X-ray selected (14-195 keV) AGN with host galaxy optical data to date, with 185 nearby (z<0.05), moderate luminosity AGN from the Swift Burst Alert Telescope (BAT) sample. The BAT AGN host galaxies have intermediate optical colors (u -- r and g -- r) that are bluer than a comparison sample of inactive galaxies and optically selected AGN from the Sloan Digital Sky Survey (SDSS) which are chosen to have the same stellar mass. Based on morphological classifications from the RC3 and the Galaxy Zoo, the bluer colors of BAT AGN are mainly due to a higher fraction of mergers and massive spirals than in the comparison samples. BAT AGN in massive galaxies (log Stellar Mass >10.5) have a 5 to 10 times higher rate of spiral morphologies than in SDSS AGN or inactive galaxies. We also see enhanced far-IR emission in BAT AGN suggestive of higher levels of star formation compared to the comparison samples. BAT AGN are preferentially found in the most massive host galaxies with high concentration indexes indicative of large bulge-to-disk ratios and large supermassive black holes. The narrow-line (NL) BAT AGN have similar intrinsic luminosities as the SDSS NL Seyferts based on measurements of [O III] Lambda 5007. There is also a correlation between the stellar mass and X-ray emission. The BAT AGN in mergers have bluer colors and greater ultra hard X-ray emission compared to the BAT sample as whole. In agreement with the Unified Model of AGN, and the relatively unbiased nature of the BAT sources, the host galaxy colors and morphologies are independent of measures of obscuration such as X-ray column density or Seyfert type. The high fraction of massive spiral galaxies and galaxy mergers in BAT AGN suggest that host galaxy morphology is related to the activation and fueling of local AGN.
HOST GALAXY PROPERTIES OF THE SWIFT BAT ULTRA HARD X-RAY SELECTED ACTIVE GALACTIC NUCLEUS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koss, Michael; Mushotzky, Richard; Veilleux, Sylvain
We have assembled the largest sample of ultra hard X-ray selected (14-195 keV) active galactic nucleus (AGN) with host galaxy optical data to date, with 185 nearby (z < 0.05), moderate luminosity AGNs from the Swift BAT sample. The BAT AGN host galaxies have intermediate optical colors (u - r and g - r) that are bluer than a comparison sample of inactive galaxies and optically selected AGNs from the Sloan Digital Sky Survey (SDSS) which are chosen to have the same stellar mass. Based on morphological classifications from the RC3 and the Galaxy Zoo, the bluer colors of BATmore » AGNs are mainly due to a higher fraction of mergers and massive spirals than in the comparison samples. BAT AGNs in massive galaxies (log M{sub *} >10.5) have a 5-10 times higher rate of spiral morphologies than in SDSS AGNs or inactive galaxies. We also see enhanced far-infrared emission in BAT AGN suggestive of higher levels of star formation compared to the comparison samples. BAT AGNs are preferentially found in the most massive host galaxies with high concentration indexes indicative of large bulge-to-disk ratios and large supermassive black holes. The narrow-line (NL) BAT AGNs have similar intrinsic luminosities as the SDSS NL Seyferts based on measurements of [O III] {lambda}5007. There is also a correlation between the stellar mass and X-ray emission. The BAT AGNs in mergers have bluer colors and greater ultra hard X-ray emission compared to the BAT sample as a whole. In agreement with the unified model of AGNs, and the relatively unbiased nature of the BAT sources, the host galaxy colors and morphologies are independent of measures of obscuration such as X-ray column density or Seyfert type. The high fraction of massive spiral galaxies and galaxy mergers in BAT AGNs suggest that host galaxy morphology is related to the activation and fueling of local AGN.« less
Fithian, William; Elith, Jane; Hastie, Trevor; Keith, David A
2015-04-01
Presence-only records may provide data on the distributions of rare species, but commonly suffer from large, unknown biases due to their typically haphazard collection schemes. Presence-absence or count data collected in systematic, planned surveys are more reliable but typically less abundant.We proposed a probabilistic model to allow for joint analysis of presence-only and survey data to exploit their complementary strengths. Our method pools presence-only and presence-absence data for many species and maximizes a joint likelihood, simultaneously estimating and adjusting for the sampling bias affecting the presence-only data. By assuming that the sampling bias is the same for all species, we can borrow strength across species to efficiently estimate the bias and improve our inference from presence-only data.We evaluate our model's performance on data for 36 eucalypt species in south-eastern Australia. We find that presence-only records exhibit a strong sampling bias towards the coast and towards Sydney, the largest city. Our data-pooling technique substantially improves the out-of-sample predictive performance of our model when the amount of available presence-absence data for a given species is scarceIf we have only presence-only data and no presence-absence data for a given species, but both types of data for several other species that suffer from the same spatial sampling bias, then our method can obtain an unbiased estimate of the first species' geographic range.
Fithian, William; Elith, Jane; Hastie, Trevor; Keith, David A.
2016-01-01
Summary Presence-only records may provide data on the distributions of rare species, but commonly suffer from large, unknown biases due to their typically haphazard collection schemes. Presence–absence or count data collected in systematic, planned surveys are more reliable but typically less abundant.We proposed a probabilistic model to allow for joint analysis of presence-only and survey data to exploit their complementary strengths. Our method pools presence-only and presence–absence data for many species and maximizes a joint likelihood, simultaneously estimating and adjusting for the sampling bias affecting the presence-only data. By assuming that the sampling bias is the same for all species, we can borrow strength across species to efficiently estimate the bias and improve our inference from presence-only data.We evaluate our model’s performance on data for 36 eucalypt species in south-eastern Australia. We find that presence-only records exhibit a strong sampling bias towards the coast and towards Sydney, the largest city. Our data-pooling technique substantially improves the out-of-sample predictive performance of our model when the amount of available presence–absence data for a given species is scarceIf we have only presence-only data and no presence–absence data for a given species, but both types of data for several other species that suffer from the same spatial sampling bias, then our method can obtain an unbiased estimate of the first species’ geographic range. PMID:27840673
NASA Astrophysics Data System (ADS)
Yamagishi, M.; Nishimura, A.; Fujita, S.; Takekoshi, T.; Matsuo, M.; Minamidani, T.; Taniguchi, K.; Tokuda, K.; Shimajiri, Y.
2018-03-01
We present an unbiased large-scale (9 deg2) CN (N = 1–0) and C18O (J = 1–0) survey of Cygnus-X conducted with the Nobeyama 45 m Cygnus-X CO survey. CN and C18O are detected in various objects toward the Cygnus-X North and South (e.g., DR17, DR18, DR21, DR22, DR23, and W75N). We find that CN/C18O integrated intensity ratios are systematically different from region to region, and are especially enhanced in DR17 and DR18, which are irradiated by the nearby OB stars. This result suggests that CN/C18O ratios are enhanced via photodissociation reactions. We investigate the relation between the CN/C18O ratio and strength of the UV radiation field. As a result, we find that CN/C18O ratios correlate with the far-UV intensities, G 0. We also find that CN/C18O ratios decrease inside molecular clouds, where the interstellar UV radiation is reduced due to the interstellar dust extinction. We conclude that the CN/C18O ratio is controlled by the UV radiation, and is a good probe of photon-dominated regions.
Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A
2014-10-01
Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.
A new approach to importance sampling for the simulation of false alarms. [in radar systems
NASA Technical Reports Server (NTRS)
Lu, D.; Yao, K.
1987-01-01
In this paper a modified importance sampling technique for improving the convergence of Importance Sampling is given. By using this approach to estimate low false alarm rates in radar simulations, the number of Monte Carlo runs can be reduced significantly. For one-dimensional exponential, Weibull, and Rayleigh distributions, a uniformly minimum variance unbiased estimator is obtained. For Gaussian distribution the estimator in this approach is uniformly better than that of previously known Importance Sampling approach. For a cell averaging system, by combining this technique and group sampling, the reduction of Monte Carlo runs for a reference cell of 20 and false alarm rate of lE-6 is on the order of 170 as compared to the previously known Importance Sampling approach.
Waldispühl, Jérôme; Ponty, Yann
2011-11-01
The analysis of the relationship between sequences and structures (i.e., how mutations affect structures and reciprocally how structures influence mutations) is essential to decipher the principles driving molecular evolution, to infer the origins of genetic diseases, and to develop bioengineering applications such as the design of artificial molecules. Because their structures can be predicted from the sequence data only, RNA molecules provide a good framework to study this sequence-structure relationship. We recently introduced a suite of algorithms called RNAmutants which allows a complete exploration of RNA sequence-structure maps in polynomial time and space. Formally, RNAmutants takes an input sequence (or seed) to compute the Boltzmann-weighted ensembles of mutants with exactly k mutations, and sample mutations from these ensembles. However, this approach suffers from major limitations. Indeed, since the Boltzmann probabilities of the mutations depend of the free energy of the structures, RNAmutants has difficulties to sample mutant sequences with low G+C-contents. In this article, we introduce an unbiased adaptive sampling algorithm that enables RNAmutants to sample regions of the mutational landscape poorly covered by classical algorithms. We applied these methods to sample mutations with low G+C-contents. These adaptive sampling techniques can be easily adapted to explore other regions of the sequence and structural landscapes which are difficult to sample. Importantly, these algorithms come at a minimal computational cost. We demonstrate the insights offered by these techniques on studies of complete RNA sequence structures maps of sizes up to 40 nucleotides. Our results indicate that the G+C-content has a strong influence on the size and shape of the evolutionary accessible sequence and structural spaces. In particular, we show that low G+C-contents favor the apparition of internal loops and thus possibly the synthesis of tertiary structure motifs. On the other hand, high G+C-contents significantly reduce the size of the evolutionary accessible mutational landscapes.
Szabolcsi, Zoltán; Farkas, Zsuzsa; Borbély, Andrea; Bárány, Gusztáv; Varga, Dániel; Heinrich, Attila; Völgyi, Antónia; Pamjav, Horolma
2015-11-01
When the DNA profile from a crime-scene matches that of a suspect, the weight of DNA evidence depends on the unbiased estimation of the match probability of the profiles. For this reason, it is required to establish and expand the databases that reflect the actual allele frequencies in the population applied. 21,473 complete DNA profiles from Databank samples were used to establish the allele frequency database to represent the population of Hungarian suspects. We used fifteen STR loci (PowerPlex ESI16) including five, new ESS loci. The aim was to calculate the statistical, forensic efficiency parameters for the Databank samples and compare the newly detected data to the earlier report. The population substructure caused by relatedness may influence the frequency of profiles estimated. As our Databank profiles were considered non-random samples, possible relationships between the suspects can be assumed. Therefore, population inbreeding effect was estimated using the FIS calculation. The overall inbreeding parameter was found to be 0.0106. Furthermore, we tested the impact of the two allele frequency datasets on 101 randomly chosen STR profiles, including full and partial profiles. The 95% confidence interval estimates for the profile frequencies (pM) resulted in a tighter range when we used the new dataset compared to the previously published ones. We found that the FIS had less effect on frequency values in the 21,473 samples than the application of minimum allele frequency. No genetic substructure was detected by STRUCTURE analysis. Due to the low level of inbreeding effect and the high number of samples, the new dataset provides unbiased and precise estimates of LR for statistical interpretation of forensic casework and allows us to use lower allele frequencies. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Nicodemus, Kristin K; Malley, James D; Strobl, Carolin; Ziegler, Andreas
2010-02-27
Random forests (RF) have been increasingly used in applications such as genome-wide association and microarray studies where predictor correlation is frequently observed. Recent works on permutation-based variable importance measures (VIMs) used in RF have come to apparently contradictory conclusions. We present an extended simulation study to synthesize results. In the case when both predictor correlation was present and predictors were associated with the outcome (HA), the unconditional RF VIM attributed a higher share of importance to correlated predictors, while under the null hypothesis that no predictors are associated with the outcome (H0) the unconditional RF VIM was unbiased. Conditional VIMs showed a decrease in VIM values for correlated predictors versus the unconditional VIMs under HA and was unbiased under H0. Scaled VIMs were clearly biased under HA and H0. Unconditional unscaled VIMs are a computationally tractable choice for large datasets and are unbiased under the null hypothesis. Whether the observed increased VIMs for correlated predictors may be considered a "bias" - because they do not directly reflect the coefficients in the generating model - or if it is a beneficial attribute of these VIMs is dependent on the application. For example, in genetic association studies, where correlation between markers may help to localize the functionally relevant variant, the increased importance of correlated predictors may be an advantage. On the other hand, we show examples where this increased importance may result in spurious signals.
The joint fit of the BHMF and ERDF for the BAT AGN Sample
NASA Astrophysics Data System (ADS)
Weigel, Anna K.; Koss, Michael; Ricci, Claudio; Trakhtenbrot, Benny; Oh, Kyuseok; Schawinski, Kevin; Lamperti, Isabella
2018-01-01
A natural product of an AGN survey is the AGN luminosity function. This statistical measure describes the distribution of directly measurable AGN luminosities. Intrinsically, the shape of the luminosity function depends on the distribution of black hole masses and Eddington ratios. To constrain these fundamental AGN properties, the luminosity function thus has to be disentangled into the black hole mass and Eddington ratio distribution function. The BASS survey is unique as it allows such a joint fit for a large number of local AGN, is unbiased in terms of obscuration in the X-rays and provides black hole masses for type-1 and type-2 AGN. The black hole mass function at z ~ 0 represents an essential baseline for simulations and black hole growth models. The normalization of the Eddington ratio distribution function directly constrains the AGN fraction. Together, the BASS AGN luminosity, black hole mass and Eddington ratio distribution functions thus provide a complete picture of the local black hole population.
Wilmot, Michael P; Kostal, Jack W; Stillwell, David; Kosinski, Michal
2017-07-01
For the past 40 years, the conventional univariate model of self-monitoring has reigned as the dominant interpretative paradigm in the literature. However, recent findings associated with an alternative bivariate model challenge the conventional paradigm. In this study, item response theory is used to develop measures of the bivariate model of acquisitive and protective self-monitoring using original Self-Monitoring Scale (SMS) items, and data from two large, nonstudent samples ( Ns = 13,563 and 709). Results indicate that the new acquisitive (six-item) and protective (seven-item) self-monitoring scales are reliable, unbiased in terms of gender and age, and demonstrate theoretically consistent relations to measures of personality traits and cognitive ability. Additionally, by virtue of using original SMS items, previously collected responses can be reanalyzed in accordance with the alternative bivariate model. Recommendations for the reanalysis of archival SMS data, as well as directions for future research, are provided.
Mechanisms of passive ion permeation through lipid bilayers
Tepper, Harald L.; Voth, Gregory A.
2008-01-01
Multi-State Empirical Valence Bond and classical Molecular Dynamics simulations were used to explore mechanisms for passive ion leakage through a dimyristoyl phosphatidylcholine (DMPC) lipid bilayer. In accordance with a previous study on proton leakage, it was found that the permeation mechanism must be a highly concerted one, in which ion, solvent and membrane coordinates are coupled. The presence of the ion itself significantly alters the response of those coordinates, suggesting that simulations of transmembrane water structures without explicit inclusion of the ionic solute are insufficient for elucidating transition mechanisms. The properties of H+, Na+, OH-, and bare water molecules in the membrane interior were compared, both by biased sampling techniques and by constructing complete and unbiased transition paths. It was found that the anomalous difference in leakage rates between protons and other cations can be largely explained by charge delocalization effects, rather than the usual kinetic picture (Grotthuss hopping of the proton). Permeability differences between anions and cations through PC bilayers are correlated with suppression of favorable membrane breathing modes by cations. PMID:17048962
Potential-based dynamical reweighting for Markov state models of protein dynamics.
Weber, Jeffrey K; Pande, Vijay S
2015-06-09
As simulators attempt to replicate the dynamics of large cellular components in silico, problems related to sampling slow, glassy degrees of freedom in molecular systems will be amplified manyfold. It is tempting to augment simulation techniques with external biases to overcome such barriers with ease; biased simulations, however, offer little utility unless equilibrium properties of interest (both kinetic and thermodynamic) can be recovered from the data generated. In this Article, we present a general scheme that harnesses the power of Markov state models (MSMs) to extract equilibrium kinetic properties from molecular dynamics trajectories collected on biased potential energy surfaces. We first validate our reweighting protocol on a simple two-well potential, and we proceed to test our method on potential-biased simulations of the Trp-cage miniprotein. In both cases, we find that equilibrium populations, time scales, and dynamical processes are reliably reproduced as compared to gold standard, unbiased data sets. We go on to discuss the limitations of our dynamical reweighting approach, and we suggest auspicious target systems for further application.
Aspects of mutually unbiased bases in odd-prime-power dimensions
NASA Astrophysics Data System (ADS)
Chaturvedi, S.
2002-04-01
We rephrase the Wootters-Fields construction [W. K. Wootters and B. C. Fields, Ann. Phys. 191, 363 (1989)] of a full set of mutually unbiased bases in a complex vector space of dimensions N=pr, where p is an odd prime, in terms of the character vectors of the cyclic group G of order p. This form may be useful in explicitly writing down mutually unbiased bases for N=pr.
[Application of ordinary Kriging method in entomologic ecology].
Zhang, Runjie; Zhou, Qiang; Chen, Cuixian; Wang, Shousong
2003-01-01
Geostatistics is a statistic method based on regional variables and using the tool of variogram to analyze the spatial structure and the patterns of organism. In simulating the variogram within a great range, though optimal simulation cannot be obtained, the simulation method of a dialogue between human and computer can be used to optimize the parameters of the spherical models. In this paper, the method mentioned above and the weighted polynomial regression were utilized to simulate the one-step spherical model, the two-step spherical model and linear function model, and the available nearby samples were used to draw on the ordinary Kriging procedure, which provided a best linear unbiased estimate of the constraint of the unbiased estimation. The sum of square deviation between the estimating and measuring values of varying theory models were figured out, and the relative graphs were shown. It was showed that the simulation based on the two-step spherical model was the best simulation, and the one-step spherical model was better than the linear function model.
AWE-WQ: fast-forwarding molecular dynamics using the accelerated weighted ensemble.
Abdul-Wahid, Badi'; Feng, Haoyun; Rajan, Dinesh; Costaouec, Ronan; Darve, Eric; Thain, Douglas; Izaguirre, Jesús A
2014-10-27
A limitation of traditional molecular dynamics (MD) is that reaction rates are difficult to compute. This is due to the rarity of observing transitions between metastable states since high energy barriers trap the system in these states. Recently the weighted ensemble (WE) family of methods have emerged which can flexibly and efficiently sample conformational space without being trapped and allow calculation of unbiased rates. However, while WE can sample correctly and efficiently, a scalable implementation applicable to interesting biomolecular systems is not available. We provide here a GPLv2 implementation called AWE-WQ of a WE algorithm using the master/worker distributed computing WorkQueue (WQ) framework. AWE-WQ is scalable to thousands of nodes and supports dynamic allocation of computer resources, heterogeneous resource usage (such as central processing units (CPU) and graphical processing units (GPUs) concurrently), seamless heterogeneous cluster usage (i.e., campus grids and cloud providers), and support for arbitrary MD codes such as GROMACS, while ensuring that all statistics are unbiased. We applied AWE-WQ to a 34 residue protein which simulated 1.5 ms over 8 months with peak aggregate performance of 1000 ns/h. Comparison was done with a 200 μs simulation collected on a GPU over a similar timespan. The folding and unfolded rates were of comparable accuracy.
AWE-WQ: Fast-Forwarding Molecular Dynamics Using the Accelerated Weighted Ensemble
2015-01-01
A limitation of traditional molecular dynamics (MD) is that reaction rates are difficult to compute. This is due to the rarity of observing transitions between metastable states since high energy barriers trap the system in these states. Recently the weighted ensemble (WE) family of methods have emerged which can flexibly and efficiently sample conformational space without being trapped and allow calculation of unbiased rates. However, while WE can sample correctly and efficiently, a scalable implementation applicable to interesting biomolecular systems is not available. We provide here a GPLv2 implementation called AWE-WQ of a WE algorithm using the master/worker distributed computing WorkQueue (WQ) framework. AWE-WQ is scalable to thousands of nodes and supports dynamic allocation of computer resources, heterogeneous resource usage (such as central processing units (CPU) and graphical processing units (GPUs) concurrently), seamless heterogeneous cluster usage (i.e., campus grids and cloud providers), and support for arbitrary MD codes such as GROMACS, while ensuring that all statistics are unbiased. We applied AWE-WQ to a 34 residue protein which simulated 1.5 ms over 8 months with peak aggregate performance of 1000 ns/h. Comparison was done with a 200 μs simulation collected on a GPU over a similar timespan. The folding and unfolded rates were of comparable accuracy. PMID:25207854
Systematic versus random sampling in stereological studies.
West, Mark J
2012-12-01
The sampling that takes place at all levels of an experimental design must be random if the estimate is to be unbiased in a statistical sense. There are two fundamental ways by which one can make a random sample of the sections and positions to be probed on the sections. Using a card-sampling analogy, one can pick any card at all out of a deck of cards. This is referred to as independent random sampling because the sampling of any one card is made without reference to the position of the other cards. The other approach to obtaining a random sample would be to pick a card within a set number of cards and others at equal intervals within the deck. Systematic sampling along one axis of many biological structures is more efficient than random sampling, because most biological structures are not randomly organized. This article discusses the merits of systematic versus random sampling in stereological studies.
León, Ileana R.; Schwämmle, Veit; Jensen, Ole N.; Sprenger, Richard R.
2013-01-01
The majority of mass spectrometry-based protein quantification studies uses peptide-centric analytical methods and thus strongly relies on efficient and unbiased protein digestion protocols for sample preparation. We present a novel objective approach to assess protein digestion efficiency using a combination of qualitative and quantitative liquid chromatography-tandem MS methods and statistical data analysis. In contrast to previous studies we employed both standard qualitative as well as data-independent quantitative workflows to systematically assess trypsin digestion efficiency and bias using mitochondrial protein fractions. We evaluated nine trypsin-based digestion protocols, based on standard in-solution or on spin filter-aided digestion, including new optimized protocols. We investigated various reagents for protein solubilization and denaturation (dodecyl sulfate, deoxycholate, urea), several trypsin digestion conditions (buffer, RapiGest, deoxycholate, urea), and two methods for removal of detergents before analysis of peptides (acid precipitation or phase separation with ethyl acetate). Our data-independent quantitative liquid chromatography-tandem MS workflow quantified over 3700 distinct peptides with 96% completeness between all protocols and replicates, with an average 40% protein sequence coverage and an average of 11 peptides identified per protein. Systematic quantitative and statistical analysis of physicochemical parameters demonstrated that deoxycholate-assisted in-solution digestion combined with phase transfer allows for efficient, unbiased generation and recovery of peptides from all protein classes, including membrane proteins. This deoxycholate-assisted protocol was also optimal for spin filter-aided digestions as compared with existing methods. PMID:23792921
NASA Astrophysics Data System (ADS)
Bianchi, Luciana
2018-01-01
Rest-frame UV, uniquely sensitive to luminous, short-lived hot massive stars, trace and age-date star formation across galaxies, and is very sensitive to dust, whose properties and presence are closely connected to star formation.With wide f-o-v and deep sensitivity in two broad filters,FUV and NUV,GALEX delivered the first comprehensive UV view of large nearby galaxies, and of the universe to z~2 (e.g.,Bianchi 2014 ApSS 354,103), detected star formation at the lowest rates, in environments where it was not seen before and not expected (e.g. Bianchi 2011 ApSS 335,51; Thilker+2009 Nature 457,990;2007 ApJS 173,538), triggering a new era of investigations with HST and large ground-based telescopes. New instrument technology and modeling capabilities make it now possible and compelling to solve standing issues. The scant UV filters available (esp. FUV) and the wide gap in resolution and f-o-v between GALEX and HST leaves old and new questions open. A chief limitation is degeneracies between physical parameters of stellar populations (age/SFR) and hot stars, and dust (e.g. Bianchi+ 2014 JASR 53,928).We show sample model simulations for filter optimization to provide critical measurements for the science objectives. We also demonstrate how adequate FUV+NUV filters, and resolution, allow us to move from speculative interpretation of UV data to unbiased physical characterization of young stellar populations and dust, using new data from UVIT, which, though smaller than CETUS, has better resolution and filter coverage than GALEX.Also, our understanding of galaxy chemical enrichment is limited by critical gaps in stellar evolution; GALEX surveys enabled the first unbiased census of the Milky Way hot-WD population (Bianchi+2011 MNRAS, 411,2770), which we complement with SDSS, Pan-STARRS, and Gaia data to fill such gaps (Bianchi et al.2018, ApSS). Such objects in CETUS fields (deeper exposures, more filters, and the first UV MOS) will be much better characterized, enabling "Galactic archeology" investigations not possible otherwise.
Naccache, Samia N; Federman, Scot; Veeraraghavan, Narayanan; Zaharia, Matei; Lee, Deanna; Samayoa, Erik; Bouquet, Jerome; Greninger, Alexander L; Luk, Ka-Cheung; Enge, Barryett; Wadford, Debra A; Messenger, Sharon L; Genrich, Gillian L; Pellegrino, Kristen; Grard, Gilda; Leroy, Eric; Schneider, Bradley S; Fair, Joseph N; Martínez, Miguel A; Isa, Pavel; Crump, John A; DeRisi, Joseph L; Sittler, Taylor; Hackett, John; Miller, Steve; Chiu, Charles Y
2014-07-01
Unbiased next-generation sequencing (NGS) approaches enable comprehensive pathogen detection in the clinical microbiology laboratory and have numerous applications for public health surveillance, outbreak investigation, and the diagnosis of infectious diseases. However, practical deployment of the technology is hindered by the bioinformatics challenge of analyzing results accurately and in a clinically relevant timeframe. Here we describe SURPI ("sequence-based ultrarapid pathogen identification"), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences. Deployable on both cloud-based and standalone servers, SURPI leverages two state-of-the-art aligners for accelerated analyses, SNAP and RAPSearch, which are as accurate as existing bioinformatics tools but orders of magnitude faster in performance. In fast mode, SURPI detects viruses and bacteria by scanning data sets of 7-500 million reads in 11 min to 5 h, while in comprehensive mode, all known microorganisms are identified, followed by de novo assembly and protein homology searches for divergent viruses in 50 min to 16 h. SURPI has also directly contributed to real-time microbial diagnosis in acutely ill patients, underscoring its potential key role in the development of unbiased NGS-based clinical assays in infectious diseases that demand rapid turnaround times. © 2014 Naccache et al.; Published by Cold Spring Harbor Laboratory Press.
Comparative modeling and benchmarking data sets for human histone deacetylases and sirtuin families.
Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon
2015-02-23
Histone deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases, and other types of diseases. Virtual screening (VS) has become fairly effective approaches for drug discovery of novel and highly selective histone deacetylase inhibitors (HDACIs). To facilitate the process, we constructed maximal unbiased benchmarking data sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs cover all four classes including Class III (Sirtuins family) and 14 HDAC isoforms, composed of 631 inhibitors and 24609 unbiased decoys. Its ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matching with ligands and maximal unbiased in terms of "artificial enrichment" and "analogue bias". We also conducted comparative studies with DUD-E and DEKOIS 2.0 sets against HDAC2 and HDAC8 targets and demonstrate that our MUBD-HDACs are unique in that they can be applied unbiasedly to both LBVS and SBVS approaches. In addition, we defined a novel metric, i.e. NLBScore, to detect the "2D bias" and "LBVS favorable" effect within the benchmarking sets. In summary, MUBD-HDACs are the only comprehensive and maximal-unbiased benchmark data sets for HDACs (including Sirtuins) that are available so far. MUBD-HDACs are freely available at http://www.xswlab.org/ .
Designing a national soil erosion monitoring network for England and Wales
NASA Astrophysics Data System (ADS)
Lark, Murray; Rawlins, Barry; Anderson, Karen; Evans, Martin; Farrow, Luke; Glendell, Miriam; James, Mike; Rickson, Jane; Quine, Timothy; Quinton, John; Brazier, Richard
2014-05-01
Although soil erosion is recognised as a significant threat to sustainable land use and may be a priority for action in any forthcoming EU Soil Framework Directive, those responsible for setting national policy with respect to erosion are constrained by a lack of robust, representative, data at large spatial scales. This reflects the process-orientated nature of much soil erosion research. Recognising this limitation, The UK Department for Environment, Food and Rural Affairs (Defra) established a project to pilot a cost-effective framework for monitoring of soil erosion in England and Wales (E&W). The pilot will compare different soil erosion monitoring methods at a site scale and provide statistical information for the final design of the full national monitoring network that will: provide unbiased estimates of the spatial mean of soil erosion rate across E&W (tonnes ha-1 yr-1) for each of three land-use classes - arable and horticultural grassland upland and semi-natural habitats quantify the uncertainty of these estimates with confidence intervals. Probability (design-based) sampling provides most efficient unbiased estimates of spatial means. In this study, a 16 hectare area (a square of 400 x 400 m) positioned at the centre of a 1-km grid cell, selected at random from mapped land use across E&W, provided the sampling support for measurement of erosion rates, with at least 94% of the support area corresponding to the target land use classes. Very small or zero erosion rates likely to be encountered at many sites reduce the sampling efficiency and make it difficult to compare different methods of soil erosion monitoring. Therefore, to increase the proportion of samples with larger erosion rates without biasing our estimates, we increased the inclusion probability density in areas where the erosion rate is likely to be large by using stratified random sampling. First, each sampling domain (land use class in E&W) was divided into strata; e.g. two sub-domains within which, respectively, small or no erosion rates, and moderate or larger erosion rates are expected. Each stratum was then sampled independently and at random. The sample density need not be equal in the two strata, but is known and is accounted for in the estimation of the mean and its standard error. To divide the domains into strata we used information on slope angle, previous interpretation of erosion susceptibility of the soil associations that correspond to the soil map of E&W at 1:250 000 (Soil Survey of England and Wales, 1983), and visual interpretation of evidence of erosion from aerial photography. While each domain could be stratified on the basis of the first two criteria, air photo interpretation across the whole country was not feasible. For this reason we used a two-phase random sampling for stratification (TPRS) design (de Gruijter et al., 2006). First, we formed an initial random sample of 1-km grid cells from the target domain. Second, each cell was then allocated to a stratum on the basis of the three criteria. A subset of the selected cells from each stratum were then selected for field survey at random, with a specified sampling density for each stratum so as to increase the proportion of cells where moderate or larger erosion rates were expected. Once measurements of erosion have been made, an estimate of the spatial mean of the erosion rate over the target domain, its standard error and associated uncertainty can be calculated by an expression which accounts for the estimated proportions of the two strata within the initial random sample. de Gruijter, J.J., Brus, D.J., Biekens, M.F.P. & Knotters, M. 2006. Sampling for Natural Resource Monitoring. Springer, Berlin. Soil Survey of England and Wales. 1983 National Soil Map NATMAP Vector 1:250,000. National Soil Research Institute, Cranfield University.
How to benchmark methods for structure-based virtual screening of large compound libraries.
Christofferson, Andrew J; Huang, Niu
2012-01-01
Structure-based virtual screening is a useful computational technique for ligand discovery. To systematically evaluate different docking approaches, it is important to have a consistent benchmarking protocol that is both relevant and unbiased. Here, we describe the designing of a benchmarking data set for docking screen assessment, a standard docking screening process, and the analysis and presentation of the enrichment of annotated ligands among a background decoy database.
Hierarchical thinking in network biology: the unbiased modularization of biochemical networks.
Papin, Jason A; Reed, Jennifer L; Palsson, Bernhard O
2004-12-01
As reconstructed biochemical reaction networks continue to grow in size and scope, there is a growing need to describe the functional modules within them. Such modules facilitate the study of biological processes by deconstructing complex biological networks into conceptually simple entities. The definition of network modules is often based on intuitive reasoning. As an alternative, methods are being developed for defining biochemical network modules in an unbiased fashion. These unbiased network modules are mathematically derived from the structure of the whole network under consideration.
Genes and abdominal aortic aneurysm.
Hinterseher, Irene; Tromp, Gerard; Kuivaniemi, Helena
2011-04-01
Abdominal aortic aneurysm (AAA) is a multifactorial disease with a strong genetic component. Since the first candidate gene studies were published 20 years ago, approximately 100 genetic association studies using single nucleotide polymorphisms (SNPs) in biologically relevant genes have been reported on AAA. These studies investigated SNPs in genes of the extracellular matrix, the cardiovascular system, the immune system, and signaling pathways. Very few studies were large enough to draw firm conclusions and very few results could be replicated in another sample set. The more recent unbiased approaches are family-based DNA linkage studies and genome-wide genetic association studies, which have the potential of identifying the genetic basis for AAA, only when appropriately powered and well-characterized large AAA cohorts are used. SNPs associated with AAA have already been identified in these large multicenter studies. One significant association was of a variant in a gene called contactin-3, which is located on chromosome 3p12.3. However, two follow-up studies could not replicate this association. Two other SNPs, which are located on chromosome 9p21 and 9q33, were replicated in other samples. The two genes with the strongest supporting evidence of contribution to the genetic risk for AAA are the CDKN2BAS gene, also known as ANRIL, which encodes an antisense ribonucleic acid that regulates expression of the cyclin-dependent kinase inhibitors CDKN2A and CDKN2B, and DAB2IP, which encodes an inhibitor of cell growth and survival. Functional studies are now needed to establish the mechanisms by which these genes contribute toward AAA pathogenesis. Copyright © 2011 Annals of Vascular Surgery Inc. Published by Elsevier Inc. All rights reserved.
Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I
2018-01-01
Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.
Rare event simulation in radiation transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kollman, Craig
1993-10-01
This dissertation studies methods for estimating extremely small probabilities by Monte Carlo simulation. Problems in radiation transport typically involve estimating very rare events or the expected value of a random variable which is with overwhelming probability equal to zero. These problems often have high dimensional state spaces and irregular geometries so that analytic solutions are not possible. Monte Carlo simulation must be used to estimate the radiation dosage being transported to a particular location. If the area is well shielded the probability of any one particular particle getting through is very small. Because of the large number of particles involved,more » even a tiny fraction penetrating the shield may represent an unacceptable level of radiation. It therefore becomes critical to be able to accurately estimate this extremely small probability. Importance sampling is a well known technique for improving the efficiency of rare event calculations. Here, a new set of probabilities is used in the simulation runs. The results are multiple by the likelihood ratio between the true and simulated probabilities so as to keep the estimator unbiased. The variance of the resulting estimator is very sensitive to which new set of transition probabilities are chosen. It is shown that a zero variance estimator does exist, but that its computation requires exact knowledge of the solution. A simple random walk with an associated killing model for the scatter of neutrons is introduced. Large deviation results for optimal importance sampling in random walks are extended to the case where killing is present. An adaptive ``learning`` algorithm for implementing importance sampling is given for more general Markov chain models of neutron scatter. For finite state spaces this algorithm is shown to give with probability one, a sequence of estimates converging exponentially fast to the true solution.« less
SHARDS: a spectro-photometric analysis of distant red and dead massive galaxies
NASA Astrophysics Data System (ADS)
Pérez-González, P. G.; Cava, A.; The Shards Team
2013-05-01
SHARDS, an ESO/GTC Large Program, is an ultra-deep (26.5 mag) spectro-photometric survey carried out with GTC/OSIRIS and designed to select and study massive passively evolving galaxies at z= 1.0--2.5 in the GOODS-N field. The survey uses a set of 24 medium band filters (FWHM ˜15 nm) covering the 500--950 nm spectral range. Our observing strategy has been planned to detect, for z>1 sources, the prominent Mg absorption feature (at rest-frame ˜280 nm), a distinctive, necessary, and sufficient feature of evolved stellar populations (older than 0.5 Gyr). These observations are being used to: (1) construct for the first time an unbiased sample of high-z quiescent galaxies, which extends to fainter magnitudes the samples selected with color techniques and spectroscopic surveys; (2) derive accurate ages and stellar masses based on robust measurements of spectral features such as the Mg(UV) or D(4000) indices; (3) measure their redshift with an accuracy Δ z/(1+z)<0.02; and (4) study emission-line galaxies (starbursts and AGN) up to very high redshifts. The well-sampled optical SEDs provided by SHARDS for all sources in the GOODS-N field are a valuable complement for current and future surveys carried out with other telescopes (e.g., Spitzer, HST, and Herschel).
NASA Astrophysics Data System (ADS)
Powell, M. C.; Cappelluti, N.; Urry, C. M.; Koss, M.; Finoguenov, A.; Ricci, C.; Trakhtenbrot, B.; Allevato, V.; Ajello, M.; Oh, K.; Schawinski, K.; Secrest, N.
2018-05-01
We characterize the environments of local accreting supermassive black holes by measuring the clustering of AGNs in the Swift/BAT Spectroscopic Survey (BASS). With 548 AGN in the redshift range 0.01 < z < 0.1 over the full sky from the DR1 catalog, BASS provides the largest, least biased sample of local AGNs to date due to its hard X-ray selection (14–195 keV) and rich multiwavelength/ancillary data. By measuring the projected cross-correlation function between the AGN and 2MASS galaxies, and interpreting it via halo occupation distribution and subhalo-based models, we constrain the occupation statistics of the full sample, as well as in bins of absorbing column density and black hole mass. We find that AGNs tend to reside in galaxy group environments, in agreement with previous studies of AGNs throughout a large range of luminosity and redshift, and that on average they occupy their dark matter halos similar to inactive galaxies of comparable stellar mass. We also find evidence that obscured AGNs tend to reside in denser environments than unobscured AGNs, even when samples were matched in luminosity, redshift, stellar mass, and Eddington ratio. We show that this can be explained either by significantly different halo occupation distributions or statistically different host halo assembly histories. Lastly, we see that massive black holes are slightly more likely to reside in central galaxies than black holes of smaller mass.
The Mass Distribution of Companions to Low-mass White Dwarfs
NASA Astrophysics Data System (ADS)
Andrews, Jeff J.; Price-Whelan, Adrian M.; Agüeros, Marcel A.
2014-12-01
Measuring the masses of companions to single-line spectroscopic binary stars is (in general) not possible because of the unknown orbital plane inclination. Even when the mass of the visible star can be measured, only a lower limit can be placed on the mass of the unseen companion. However, since these inclination angles should be isotropically distributed, for a large enough, unbiased sample, the companion mass distribution can be deconvolved from the distribution of observables. In this work, we construct a hierarchical probabilistic model to infer properties of unseen companion stars given observations of the orbital period and projected radial velocity of the primary star. We apply this model to three mock samples of low-mass white dwarfs (LMWDs; M <~ 0.45 M ⊙) and a sample of post-common-envelope binaries. We use a mixture of two Gaussians to model the WD and neutron star (NS) companion mass distributions. Our model successfully recovers the initial parameters of these test data sets. We then apply our model to 55 WDs in the extremely low-mass (ELM) WD Survey. Our maximum a posteriori model for the WD companion population has a mean mass μWD = 0.74 M ⊙, with a standard deviation σWD = 0.24 M ⊙. Our model constrains the NS companion fraction f NS to be <16% at 68% confidence. We make samples from the posterior distribution publicly available so that future observational efforts may compute the NS probability for newly discovered LMWDs.
Statistical Measures of Large-Scale Structure
NASA Astrophysics Data System (ADS)
Vogeley, Michael; Geller, Margaret; Huchra, John; Park, Changbom; Gott, J. Richard
1993-12-01
\\inv Mpc} To quantify clustering in the large-scale distribution of galaxies and to test theories for the formation of structure in the universe, we apply statistical measures to the CfA Redshift Survey. This survey is complete to m_{B(0)}=15.5 over two contiguous regions which cover one-quarter of the sky and include ~ 11,000 galaxies. The salient features of these data are voids with diameter 30-50\\hmpc and coherent dense structures with a scale ~ 100\\hmpc. Comparison with N-body simulations rules out the ``standard" CDM model (Omega =1, b=1.5, sigma_8 =1) at the 99% confidence level because this model has insufficient power on scales lambda >30\\hmpc. An unbiased open universe CDM model (Omega h =0.2) and a biased CDM model with non-zero cosmological constant (Omega h =0.24, lambda_0 =0.6) match the observed power spectrum. The amplitude of the power spectrum depends on the luminosity of galaxies in the sample; bright (L>L(*) ) galaxies are more strongly clustered than faint galaxies. The paucity of bright galaxies in low-density regions may explain this dependence. To measure the topology of large-scale structure, we compute the genus of isodensity surfaces of the smoothed density field. On scales in the ``non-linear" regime, <= 10\\hmpc, the high- and low-density regions are multiply-connected over a broad range of density threshold, as in a filamentary net. On smoothing scales >10\\hmpc, the topology is consistent with statistics of a Gaussian random field. Simulations of CDM models fail to produce the observed coherence of structure on non-linear scales (>95% confidence level). The underdensity probability (the frequency of regions with density contrast delta rho //lineρ=-0.8) depends strongly on the luminosity of galaxies; underdense regions are significantly more common (>2sigma ) in bright (L>L(*) ) galaxy samples than in samples which include fainter galaxies.
NASA Astrophysics Data System (ADS)
Jolliff, Bradley L.; Rockow, Kaylynn M.; Korotev, Randy L.; Haskin, Larry A.
1996-01-01
Through analysis by instrumental neutron activation (INAA) of 789 individual lithic fragments from the 2 mm-4 mm grain-size fractions of five Apollo 17 soil samples (72443, 72503, 73243, 76283, and 76503) and petrographic examination of a subset, we have determined the diversity and proportions of rock types recorded within soils from the highland massifs. The distribution of rock types at the site, as recorded by lithic fragments in the soils, is an alternative to the distribution inferred from the limited number of large rock samples. The compositions and proportions of 2 mm-4 mm fragments provide a bridge between compositions of <1 mm fines, and types and proportions of rocks observed in large collected breccias and their clasts. The 2 mm-4 mm fraction of soil from South Massif, represented by an unbiased set of lithic fragments from station-2 samples 72443 and 72503, consists of 71% noritic impact-melt breccia, 7% incompatible-trace-element-(ITE)-poor highland rock types (mainly granulitic breccias), 19% agglutinates and regolith breccias, 1% high-Ti mare basalt, and 2% others (very-low-Ti (VLT) basalt, monzogabbro breccia, and metal). In contrast, the 2 mm-4 mm fraction of a soil from the North Massif, represented by an unbiased set of lithic fragments from station-6 sample 76503, has a greater proportion of ITE-poor highland rock types and mare-basalt fragments: it consists of 29% ITE-poor highland rock types (mainly granulitic breccias and troctolitic anorthosite), 25% impact-melt breccia, 13% high-Ti mare basalt, 31% agglutinates and regolith breccias, 1% orange glass and related breccia, and 1% others. Based on a comparison of mass-weighted mean compositions of the lithic fragments with compositions of soil fines from all Apollo 17 highland stations, differences between the station-2 and station-6 samples are representative of differences between available samples from the two massifs. From the distribution of different rock types and their compositions, we conclude the following: (1) North-Massif and South-Massif soil samples differ significantly in types and proportions of ITE-poor highland components and ITE-rich impact-melt-breccia components. These differences reflect crudely layered massifs and known local geology. The greater percentage of impact-melt breccia in the South-Massif light-mantle soil stems from derivation of the light mantle from the top of the massif, which apparently is richer in noritic impact-melt breccia than are lower parts of the massifs. (2) At station 2, the 2 mm-4 mm grain-size fraction is enriched in impact-melt breccias compared to the <1 mm fraction, suggesting that the <1 mm fraction within the light mantle has a greater proportion of lithologies such as granulitic breccias which are more prevalent lower in the massifs and which we infer to be older (pre-basin) highland components. (3) Soil from station 6, North Massif, contains magnesian troctolitic anorthosite, which is a component that is rare in station-2 South-Massif soils. (4) Compositional differences between poikilitic impact-melt breccias from the two massifs suggest broad-scale heterogeneity in impact-melt breccia interpreted by most investigators to be ejecta from the Serenitatis basin. We have found rock types not previously recognized or uncommon at the Apollo 17 site. These include (1) ITE-rich impact-melt breccias that are compositionally distinct from previously recognized "aphanitic" and "poikilitic" groups at Apollo 17; (2) regolith breccias that are free of mare components and poor in impact melt of the types associated with the main melt-breccia groups, and that, if those groups derive from the Serenitatis impact, may represent the pre-Serenitatis surface; (3) several VLT basalts, including an unusual very-high-K basaltic breccia; (4) orange-glass regolith breccias; (5) aphanitic-matrix melt breccias at station 6; (6) fragments of alkali-rich composition, including alkali anorthosite, and monzogabbro; (7) one fragment of 72275-type KREEP basalt from station 3; (8) seven lithic fragments of ferroan-anorthositic-suite rocks; and (9) a fragment of metal, possibly from an L chondrite. Some of these lithologies have been found only as lithic fragments in the soils and not among the large rock samples. In contrast, we have not found among the 2 mm-4 mm lithic fragments individual samples of certain lithologies that have been recognized as clasts in breccias (e.g., dunite and spinel troctolite). The diversity of lithologic information contained in the lithic fragments of these soils nearly equals that found among large rock samples, and most information bearing on petrographic relationships is maintained, even in such small samples. Given a small number of large samples for "petrologic ground truth," small lithic fragments contained in soil "scoop" samples can provide the basis for interpreting the diversity of rock types and their proportions in remotely sensed geologic units. They should be considered essential targets for future automated sample-analysis and sample-return missions.
2014-06-01
Specifically, we combined the CRISPR genome editing system with a novel approach allowing efficient single cell cloning of Drosophila cells with the aim of...and culture these to produce cultures completely lacking wildtype sequence at the target locus. No robust methods existed to clone single Drosophila ...targeting all kinases and phosphatases (563 genes) in the Drosophila genome . 65 samples that displayed synthetic lethality (15 genes) or synthetic
Equilibrium Free Energies from Nonequilibrium Metadynamics
NASA Astrophysics Data System (ADS)
Bussi, Giovanni; Laio, Alessandro; Parrinello, Michele
2006-03-01
In this Letter we propose a new formalism to map history-dependent metadynamics in a Markovian process. We apply this formalism to model Langevin dynamics and determine the equilibrium distribution of a collection of simulations. We demonstrate that the reconstructed free energy is an unbiased estimate of the underlying free energy and analytically derive an expression for the error. The present results can be applied to other history-dependent stochastic processes, such as Wang-Landau sampling.
Comparative Modeling and Benchmarking Data Sets for Human Histone Deacetylases and Sirtuin Families
Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon
2015-01-01
Histone Deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases and other types of diseases. Virtual screening (VS) has become fairly effective approaches for drug discovery of novel and highly selective Histone Deacetylases Inhibitors (HDACIs). To facilitate the process, we constructed the Maximal Unbiased Benchmarking Data Sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs covers all 4 Classes including Class III (Sirtuins family) and 14 HDACs isoforms, composed of 631 inhibitors and 24,609 unbiased decoys. Its ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matching with ligands and maximal unbiased in terms of “artificial enrichment” and “analogue bias”. We also conducted comparative studies with DUD-E and DEKOIS 2.0 sets against HDAC2 and HDAC8 targets, and demonstrate that our MUBD-HDACs is unique in that it can be applied unbiasedly to both LBVS and SBVS approaches. In addition, we defined a novel metric, i.e. NLBScore, to detect the “2D bias” and “LBVS favorable” effect within the benchmarking sets. In summary, MUBD-HDACs is the only comprehensive and maximal-unbiased benchmark data sets for HDACs (including Sirtuins) that is available so far. MUBD-HDACs is freely available at http://www.xswlab.org/. PMID:25633490
Systematic random sampling of the comet assay.
McArt, Darragh G; Wasson, Gillian R; McKerr, George; Saetzler, Kurt; Reed, Matt; Howard, C Vyvyan
2009-07-01
The comet assay is a technique used to quantify DNA damage and repair at a cellular level. In the assay, cells are embedded in agarose and the cellular content is stripped away leaving only the DNA trapped in an agarose cavity which can then be electrophoresed. The damaged DNA can enter the agarose and migrate while the undamaged DNA cannot and is retained. DNA damage is measured as the proportion of the migratory 'tail' DNA compared to the total DNA in the cell. The fundamental basis of these arbitrary values is obtained in the comet acquisition phase using fluorescence microscopy with a stoichiometric stain in tandem with image analysis software. Current methods deployed in such an acquisition are expected to be both objectively and randomly obtained. In this paper we examine the 'randomness' of the acquisition phase and suggest an alternative method that offers both objective and unbiased comet selection. In order to achieve this, we have adopted a survey sampling approach widely used in stereology, which offers a method of systematic random sampling (SRS). This is desirable as it offers an impartial and reproducible method of comet analysis that can be used both manually or automated. By making use of an unbiased sampling frame and using microscope verniers, we are able to increase the precision of estimates of DNA damage. Results obtained from a multiple-user pooled variation experiment showed that the SRS technique attained a lower variability than that of the traditional approach. The analysis of a single user with repetition experiment showed greater individual variances while not being detrimental to overall averages. This would suggest that the SRS method offers a better reflection of DNA damage for a given slide and also offers better user reproducibility.
Estimating unbiased economies of scale of HIV prevention projects: a case study of Avahan.
Lépine, Aurélia; Vassall, Anna; Chandrashekar, Sudha; Blanc, Elodie; Le Nestour, Alexis
2015-04-01
Governments and donors are investing considerable resources on HIV prevention in order to scale up these services rapidly. Given the current economic climate, providers of HIV prevention services increasingly need to demonstrate that these investments offer good 'value for money'. One of the primary routes to achieve efficiency is to take advantage of economies of scale (a reduction in the average cost of a health service as provision scales-up), yet empirical evidence on economies of scale is scarce. Methodologically, the estimation of economies of scale is hampered by several statistical issues preventing causal inference and thus making the estimation of economies of scale complex. In order to estimate unbiased economies of scale when scaling up HIV prevention services, we apply our analysis to one of the few HIV prevention programmes globally delivered at a large scale: the Indian Avahan initiative. We costed the project by collecting data from the 138 Avahan NGOs and the supporting partners in the first four years of its scale-up, between 2004 and 2007. We develop a parsimonious empirical model and apply a system Generalized Method of Moments (GMM) and fixed-effects Instrumental Variable (IV) estimators to estimate unbiased economies of scale. At the programme level, we find that, after controlling for the endogeneity of scale, the scale-up of Avahan has generated high economies of scale. Our findings suggest that average cost reductions per person reached are achievable when scaling-up HIV prevention in low and middle income countries. Copyright © 2015 Elsevier Ltd. All rights reserved.
Occupancy Modeling for Improved Accuracy and Understanding of Pathogen Prevalence and Dynamics
Colvin, Michael E.; Peterson, James T.; Kent, Michael L.; Schreck, Carl B.
2015-01-01
Most pathogen detection tests are imperfect, with a sensitivity < 100%, thereby resulting in the potential for a false negative, where a pathogen is present but not detected. False negatives in a sample inflate the number of non-detections, negatively biasing estimates of pathogen prevalence. Histological examination of tissues as a diagnostic test can be advantageous as multiple pathogens can be examined and providing important information on associated pathological changes to the host. However, it is usually less sensitive than molecular or microbiological tests for specific pathogens. Our study objectives were to 1) develop a hierarchical occupancy model to examine pathogen prevalence in spring Chinook salmon Oncorhynchus tshawytscha and their distribution among host tissues 2) use the model to estimate pathogen-specific test sensitivities and infection rates, and 3) illustrate the effect of using replicate within host sampling on sample sizes required to detect a pathogen. We examined histological sections of replicate tissue samples from spring Chinook salmon O. tshawytscha collected after spawning for common pathogens seen in this population: Apophallus/echinostome metacercariae, Parvicapsula minibicornis, Nanophyetus salmincola/ metacercariae, and Renibacterium salmoninarum. A hierarchical occupancy model was developed to estimate pathogen and tissue-specific test sensitivities and unbiased estimation of host- and organ-level infection rates. Model estimated sensitivities and host- and organ-level infections rates varied among pathogens and model estimated infection rate was higher than prevalence unadjusted for test sensitivity, confirming that prevalence unadjusted for test sensitivity was negatively biased. The modeling approach provided an analytical approach for using hierarchically structured pathogen detection data from lower sensitivity diagnostic tests, such as histology, to obtain unbiased pathogen prevalence estimates with associated uncertainties. Accounting for test sensitivity using within host replicate samples also required fewer individual fish to be sampled. This approach is useful for evaluating pathogen or microbe community dynamics when test sensitivity is <100%. PMID:25738709
Occupancy modeling for improved accuracy and understanding of pathogen prevalence and dynamics
Colvin, Michael E.; Peterson, James T.; Kent, Michael L.; Schreck, Carl B.
2015-01-01
Most pathogen detection tests are imperfect, with a sensitivity < 100%, thereby resulting in the potential for a false negative, where a pathogen is present but not detected. False negatives in a sample inflate the number of non-detections, negatively biasing estimates of pathogen prevalence. Histological examination of tissues as a diagnostic test can be advantageous as multiple pathogens can be examined and providing important information on associated pathological changes to the host. However, it is usually less sensitive than molecular or microbiological tests for specific pathogens. Our study objectives were to 1) develop a hierarchical occupancy model to examine pathogen prevalence in spring Chinook salmonOncorhynchus tshawytscha and their distribution among host tissues 2) use the model to estimate pathogen-specific test sensitivities and infection rates, and 3) illustrate the effect of using replicate within host sampling on sample sizes required to detect a pathogen. We examined histological sections of replicate tissue samples from spring Chinook salmon O. tshawytscha collected after spawning for common pathogens seen in this population:Apophallus/echinostome metacercariae, Parvicapsula minibicornis, Nanophyetus salmincola/metacercariae, and Renibacterium salmoninarum. A hierarchical occupancy model was developed to estimate pathogen and tissue-specific test sensitivities and unbiased estimation of host- and organ-level infection rates. Model estimated sensitivities and host- and organ-level infections rates varied among pathogens and model estimated infection rate was higher than prevalence unadjusted for test sensitivity, confirming that prevalence unadjusted for test sensitivity was negatively biased. The modeling approach provided an analytical approach for using hierarchically structured pathogen detection data from lower sensitivity diagnostic tests, such as histology, to obtain unbiased pathogen prevalence estimates with associated uncertainties. Accounting for test sensitivity using within host replicate samples also required fewer individual fish to be sampled. This approach is useful for evaluating pathogen or microbe community dynamics when test sensitivity is <100%.
Ways to improve your correlation functions
NASA Technical Reports Server (NTRS)
Hamilton, A. J. S.
1993-01-01
This paper describes a number of ways to improve on the standard method for measuring the two-point correlation function of large scale structure in the Universe. Issues addressed are: (1) the problem of the mean density, and how to solve it; (2) how to estimate the uncertainty in a measured correlation function; (3) minimum variance pair weighting; (4) unbiased estimation of the selection function when magnitudes are discrete; and (5) analytic computation of angular integrals in background pair counts.
Pei, Fen; Jin, Hongwei; Zhou, Xin; Xia, Jie; Sun, Lidan; Liu, Zhenming; Zhang, Liangren
2015-11-01
Toll-like receptor 8 agonists, which activate adaptive immune responses by inducing robust production of T-helper 1-polarizing cytokines, are promising candidates for vaccine adjuvants. As the binding site of toll-like receptor 8 is large and highly flexible, virtual screening by individual method has inevitable limitations; thus, a comprehensive comparison of different methods may provide insights into seeking effective strategy for the discovery of novel toll-like receptor 8 agonists. In this study, the performance of knowledge-based pharmacophore, shape-based 3D screening, and combined strategies was assessed against a maximum unbiased benchmarking data set containing 13 actives and 1302 decoys specialized for toll-like receptor 8 agonists. Prior structure-activity relationship knowledge was involved in knowledge-based pharmacophore generation, and a set of antagonists was innovatively used to verify the selectivity of the selected knowledge-based pharmacophore. The benchmarking data set was generated from our recently developed 'mubd-decoymaker' protocol. The enrichment assessment demonstrated a considerable performance through our selected three-layer virtual screening strategy: knowledge-based pharmacophore (Phar1) screening, shape-based 3D similarity search (Q4_combo), and then a Gold docking screening. This virtual screening strategy could be further employed to perform large-scale database screening and to discover novel toll-like receptor 8 agonists. © 2015 John Wiley & Sons A/S.
PyRETIS: A well-done, medium-sized python library for rare events.
Lervik, Anders; Riccardi, Enrico; van Erp, Titus S
2017-10-30
Transition path sampling techniques are becoming common approaches in the study of rare events at the molecular scale. More efficient methods, such as transition interface sampling (TIS) and replica exchange transition interface sampling (RETIS), allow the investigation of rare events, for example, chemical reactions and structural/morphological transitions, in a reasonable computational time. Here, we present PyRETIS, a Python library for performing TIS and RETIS simulations. PyRETIS directs molecular dynamics (MD) simulations in order to sample rare events with unbiased dynamics. PyRETIS is designed to be easily interfaced with any molecular simulation package and in the present release, it has been interfaced with GROMACS and CP2K, for classical and ab initio MD simulations, respectively. © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
Ehrenberg, A J; Nguy, A K; Theofilas, P; Dunlop, S; Suemoto, C K; Di Lorenzo Alho, A T; Leite, R P; Diehl Rodriguez, R; Mejia, M B; Rüb, U; Farfel, J M; de Lucena Ferretti-Rebustini, R E; Nascimento, C F; Nitrini, R; Pasquallucci, C A; Jacob-Filho, W; Miller, B; Seeley, W W; Heinsen, H; Grinberg, L T
2017-08-01
Hyperphosphorylated tau neuronal cytoplasmic inclusions (ht-NCI) are the best protein correlate of clinical decline in Alzheimer's disease (AD). Qualitative evidence identifies ht-NCI accumulating in the isodendritic core before the entorhinal cortex. Here, we used unbiased stereology to quantify ht-NCI burden in the locus coeruleus (LC) and dorsal raphe nucleus (DRN), aiming to characterize the impact of AD pathology in these nuclei with a focus on early stages. We utilized unbiased stereology in a sample of 48 well-characterized subjects enriched for controls and early AD stages. ht-NCI counts were estimated in 60-μm-thick sections immunostained for p-tau throughout LC and DRN. Data were integrated with unbiased estimates of LC and DRN neuronal population for a subset of cases. In Braak stage 0, 7.9% and 2.6% of neurons in LC and DRN, respectively, harbour ht-NCIs. Although the number of ht-NCI+ neurons significantly increased by about 1.9× between Braak stages 0 to I in LC (P = 0.02), we failed to detect any significant difference between Braak stage I and II. Also, the number of ht-NCI+ neurons remained stable in DRN between all stages 0 and II. Finally, the differential susceptibility to tau inclusions among nuclear subdivisions was more notable in LC than in DRN. LC and DRN neurons exhibited ht-NCI during AD precortical stages. The ht-NCI increases along AD progression on both nuclei, but quantitative changes in LC precede DRN changes. © 2017 British Neuropathological Society.
A Stereological Method for the Quantitative Evaluation of Cartilage Repair Tissue
Nyengaard, Jens Randel; Lind, Martin; Spector, Myron
2015-01-01
Objective To implement stereological principles to develop an easy applicable algorithm for unbiased and quantitative evaluation of cartilage repair. Design Design-unbiased sampling was performed by systematically sectioning the defect perpendicular to the joint surface in parallel planes providing 7 to 10 hematoxylin–eosin stained histological sections. Counting windows were systematically selected and converted into image files (40-50 per defect). The quantification was performed by two-step point counting: (1) calculation of defect volume and (2) quantitative analysis of tissue composition. Step 2 was performed by assigning each point to one of the following categories based on validated and easy distinguishable morphological characteristics: (1) hyaline cartilage (rounded cells in lacunae in hyaline matrix), (2) fibrocartilage (rounded cells in lacunae in fibrous matrix), (3) fibrous tissue (elongated cells in fibrous tissue), (4) bone, (5) scaffold material, and (6) others. The ability to discriminate between the tissue types was determined using conventional or polarized light microscopy, and the interobserver variability was evaluated. Results We describe the application of the stereological method. In the example, we assessed the defect repair tissue volume to be 4.4 mm3 (CE = 0.01). The tissue fractions were subsequently evaluated. Polarized light illumination of the slides improved discrimination between hyaline cartilage and fibrocartilage and increased the interobserver agreement compared with conventional transmitted light. Conclusion We have applied a design-unbiased method for quantitative evaluation of cartilage repair, and we propose this algorithm as a natural supplement to existing descriptive semiquantitative scoring systems. We also propose that polarized light is effective for discrimination between hyaline cartilage and fibrocartilage. PMID:26069715
Griaud, François; Denefeld, Blandine; Lang, Manuel; Hensinger, Héloïse; Haberl, Peter; Berg, Matthias
2017-07-01
Characterization of charge-based variants by mass spectrometry (MS) is required for the analytical development of a new biologic entity and its marketing approval by health authorities. However, standard peak-based data analysis approaches are time-consuming and biased toward the detection, identification, and quantification of main variants only. The aim of this study was to characterize in-depth acidic and basic species of a stressed IgG1 monoclonal antibody using comprehensive and unbiased MS data evaluation tools. Fractions collected from cation ion exchange (CEX) chromatography were analyzed as intact, after reduction of disulfide bridges, and after proteolytic cleavage using Lys-C. Data of both intact and reduced samples were evaluated consistently using a time-resolved deconvolution algorithm. Peptide mapping data were processed simultaneously, quantified and compared in a systematic manner for all MS signals and fractions. Differences observed between the fractions were then further characterized and assigned. Time-resolved deconvolution enhanced pattern visualization and data interpretation of main and minor modifications in 3-dimensional maps across CEX fractions. Relative quantification of all MS signals across CEX fractions before peptide assignment enabled the detection of fraction-specific chemical modifications at abundances below 1%. Acidic fractions were shown to be heterogeneous, containing antibody fragments, glycated as well as deamidated forms of the heavy and light chains. In contrast, the basic fractions contained mainly modifications of the C-terminus and pyroglutamate formation at the N-terminus of the heavy chain. Systematic data evaluation was performed to investigate multiple data sets and comprehensively extract main and minor differences between each CEX fraction in an unbiased manner.
Gao, Wen; Yang, Hua; Qi, Lian-Wen; Liu, E-Hu; Ren, Mei-Ting; Yan, Yu-Ting; Chen, Jun; Li, Ping
2012-07-06
Plant-based medicines become increasingly popular over the world. Authentication of herbal raw materials is important to ensure their safety and efficacy. Some herbs belonging to closely related species but differing in medicinal properties are difficult to be identified because of similar morphological and microscopic characteristics. Chromatographic fingerprinting is an alternative method to distinguish them. Existing approaches do not allow a comprehensive analysis for herbal authentication. We have now developed a strategy consisting of (1) full metabolic profiling of herbal medicines by rapid resolution liquid chromatography (RRLC) combined with quadrupole time-of-flight mass spectrometry (QTOF MS), (2) global analysis of non-targeted compounds by molecular feature extraction algorithm, (3) multivariate statistical analysis for classification and prediction, and (4) marker compounds characterization. This approach has provided a fast and unbiased comparative multivariate analysis of the metabolite composition of 33-batch samples covering seven Lonicera species. Individual metabolic profiles are performed at the level of molecular fragments without prior structural assignment. In the entire set, the obtained classifier for seven Lonicera species flower buds showed good prediction performance and a total of 82 statistically different components were rapidly obtained by the strategy. The elemental compositions of discriminative metabolites were characterized by the accurate mass measurement of the pseudomolecular ions and their chemical types were assigned by the MS/MS spectra. The high-resolution, comprehensive and unbiased strategy for metabolite data analysis presented here is powerful and opens the new direction of authentication in herbal analysis. Copyright © 2012 Elsevier B.V. All rights reserved.
A Stereological Method for the Quantitative Evaluation of Cartilage Repair Tissue.
Foldager, Casper Bindzus; Nyengaard, Jens Randel; Lind, Martin; Spector, Myron
2015-04-01
To implement stereological principles to develop an easy applicable algorithm for unbiased and quantitative evaluation of cartilage repair. Design-unbiased sampling was performed by systematically sectioning the defect perpendicular to the joint surface in parallel planes providing 7 to 10 hematoxylin-eosin stained histological sections. Counting windows were systematically selected and converted into image files (40-50 per defect). The quantification was performed by two-step point counting: (1) calculation of defect volume and (2) quantitative analysis of tissue composition. Step 2 was performed by assigning each point to one of the following categories based on validated and easy distinguishable morphological characteristics: (1) hyaline cartilage (rounded cells in lacunae in hyaline matrix), (2) fibrocartilage (rounded cells in lacunae in fibrous matrix), (3) fibrous tissue (elongated cells in fibrous tissue), (4) bone, (5) scaffold material, and (6) others. The ability to discriminate between the tissue types was determined using conventional or polarized light microscopy, and the interobserver variability was evaluated. We describe the application of the stereological method. In the example, we assessed the defect repair tissue volume to be 4.4 mm(3) (CE = 0.01). The tissue fractions were subsequently evaluated. Polarized light illumination of the slides improved discrimination between hyaline cartilage and fibrocartilage and increased the interobserver agreement compared with conventional transmitted light. We have applied a design-unbiased method for quantitative evaluation of cartilage repair, and we propose this algorithm as a natural supplement to existing descriptive semiquantitative scoring systems. We also propose that polarized light is effective for discrimination between hyaline cartilage and fibrocartilage.
Generating equilateral random polygons in confinement
NASA Astrophysics Data System (ADS)
Diao, Y.; Ernst, C.; Montemayor, A.; Ziegler, U.
2011-10-01
One challenging problem in biology is to understand the mechanism of DNA packing in a confined volume such as a cell. It is known that confined circular DNA is often knotted and hence the topology of the extracted (and relaxed) circular DNA can be used as a probe of the DNA packing mechanism. However, in order to properly estimate the topological properties of the confined circular DNA structures using mathematical models, it is necessary to generate large ensembles of simulated closed chains (i.e. polygons) of equal edge lengths that are confined in a volume such as a sphere of certain fixed radius. Finding efficient algorithms that properly sample the space of such confined equilateral random polygons is a difficult problem. In this paper, we propose a method that generates confined equilateral random polygons based on their probability distribution. This method requires the creation of a large database initially. However, once the database has been created, a confined equilateral random polygon of length n can be generated in linear time in terms of n. The errors introduced by the method can be controlled and reduced by the refinement of the database. Furthermore, our numerical simulations indicate that these errors are unbiased and tend to cancel each other in a long polygon.
Yu, Teresa; Korgaonkar, Mayuresh S; Grieve, Stuart M
2017-04-01
This study examined patterns of cerebellar volumetric gray matter (GM) loss across the adult lifespan in a large cross-sectional sample. Four hundred and seventy-nine healthy participants (age range: 7-86 years) were drawn from the Brain Resource International Database who provided T1-weighted MRI scans. The spatially unbiased infratentorial template (SUIT) toolbox in SPM8 was used for normalisation of the cerebellum structures. Global volumetric and voxel-based morphometry analyses were performed to evaluate age-associated trends and gender-specific age-patterns. Global cerebellar GM shows a cross-sectional reduction with advancing age of 2.5 % per decade-approximately half the rate seen in the whole brain. The male cerebellum is larger with a lower percentage of GM, however, after controlling for total brain volume, no gender difference was detected. Analysis of age-related changes in GM volume revealed large bilateral clusters involving the vermis and cerebellar crus where regional loss occurred at nearly twice the average cerebellar rate. No gender-specific patterns were detected. These data confirm that regionally specific GM loss occurs in the cerebellum with age, and form a solid base for further investigation to find functional correlates for this global and focal loss.
Spectroscopic observation of SN2017gkk by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Onori, F.; Benetti, S.; Cappellaro, E.; Losada, Illa R.; Gafton, E.; NUTS Collaboration
2017-09-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of supernova SN2017gkk (=MASTER OT J091344.71762842.5) in host galaxy NGC 2748.
Spectroscopic observation of ASASSN-17he by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Kostrzewa-Rutkowska, Z.; Benetti, S.; Dong, S.; Stritzinger, M.; Stanek, K.; Brimacombe, J.; Sagues, A.; Galindo, P.; Losada, I. Rivero
2017-10-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17he. The candidate was discovered by by the All-Sky Automated Survey for Supernovae.
Lawson, Tyler D; Jones, Martin L; Komar, Oliver; Welch, Allison M
2011-07-01
Amphibian populations around the world have been declining at an alarming rate due to factors such as habitat destruction, pollution, and infectious diseases. Between May and July 2008, we investigated a fungal pathogen in the critically endangered Morelet's treefrog (Agalychnis moreletii) at sites in El Salvador. Larvae were screened with a hand lens for indications of infection with Batrachochytrium dendrobatidis (Bd), a fungus that can cause lethal chytridiomycosis in amphibians. Subsets of inspected tadpoles were preserved for analysis by polymerase chain reaction to determine the effectiveness of hand lens screening for presence of Bd and to estimate infection prevalence at various sites. Because individuals with signs of infection were preferentially included, we used a novel method to generate unbiased estimates of infection prevalence from these biased samples. External mouthpart deformities, identified with a hand lens, successfully predicted Bd infection across a large spatial scale. Two of 13 sites sampled had high (≥ 89%) estimated prevalence, whereas little or no Bd was detected at the remaining sites. Although it appears that A. moreletii populations in this region are not suffering rapid declines due to Bd, further monitoring is required to determine the extent to which these populations are stably coexisting with the pathogen.
Occupancy as a surrogate for abundance estimation
MacKenzie, D.I.; Nichols, J.D.
2004-01-01
In many monitoring programmes it may be prohibitively expensive to estimate the actual abundance of a bird species in a defined area, particularly at large spatial scales, or where birds occur at very low densities. Often it may be appropriate to consider the proportion of area occupied by the species as an alternative state variable. However, as with abundance estimation, issues of detectability must be taken into account in order to make accurate inferences: the non?detection of the species does not imply the species is genuinely absent. Here we review some recent modelling developments that permit unbiased estimation of the proportion of area occupied, colonization and local extinction probabilities. These methods allow for unequal sampling effort and enable covariate information on sampling locations to be incorporated. We also describe how these models could be extended to incorporate information from marked individuals, which would enable finer questions of population dynamics (such as turnover rate of nest sites by specific breeding pairs) to be addressed. We believe these models may be applicable to a wide range of bird species and may be useful for investigating various questions of ecological interest. For example, with respect to habitat quality, we might predict that a species is more likely to have higher local extinction probabilities, or higher turnover rates of specific breeding pairs, in poor quality habitats.
Gedik, Nilgün; Krüger, Marcus; Thielmann, Matthias; Kottenberg, Eva; Skyschally, Andreas; Frey, Ulrich H; Cario, Elke; Peters, Jürgen; Jakob, Heinz; Heusch, Gerd; Kleinbongard, Petra
2017-08-09
Remote ischemic preconditioning (RIPC) by repeated brief cycles of limb ischemia/reperfusion reduces myocardial ischemia/reperfusion injury. In left ventricular (LV) biopsies from patients undergoing coronary artery bypass grafting (CABG), only the activation of signal transducer and activator of transcription 5 was associated with RIPC's cardioprotection. We have now used an unbiased, non-hypothesis-driven proteomics and phosphoproteomics approach to analyze LV biopsies from patients undergoing CABG and from pigs undergoing coronary occlusion/reperfusion without (sham) and with RIPC. False discovery rate-based statistics identified a higher prostaglandin reductase 2 expression at early reperfusion with RIPC than with sham in patients. In pigs, the phosphorylation of 116 proteins was different between baseline and early reperfusion with RIPC and/or with sham. The identified proteins were not identical for patients and pigs, but in-silico pathway analysis of proteins with ≥2-fold higher expression/phosphorylation at early reperfusion with RIPC in comparison to sham revealed a relation to mitochondria and cytoskeleton in both species. Apart from limitations of the proteomics analysis per se, the small cohorts, the sampling/sample processing and the number of uncharacterized/unverifiable porcine proteins may have contributed to this largely unsatisfactory result.
Andersen, Ole Juul; Grouleff, Julie; Needham, Perri; Walker, Ross C; Jensen, Frank
2015-11-19
Current enhanced sampling molecular dynamics methods for studying large conformational changes in proteins suffer from certain limitations. These include, among others, the need for user defined collective variables, the prerequisite of both start and end point structures of the conformational change, and the need for a priori knowledge of the amount by which to boost specific parts of the potential. In this paper, a framework is proposed for a molecular dynamics method for studying ligand-induced conformational changes, in which the nonbonded interactions between the ligand and the protein are used to calculate a biasing force. The method requires only a single input structure, and does not entail the use of collective variables. We provide a proof-of-concept for accelerating conformational changes in three simple test molecules, as well as promising results for two proteins known to undergo domain closure upon ligand binding. For the ribose-binding protein, backbone root-mean-square deviations as low as 0.75 Å compared to the crystal structure of the closed conformation are obtained within 50 ns simulations, whereas no domain closures are observed in unbiased simulations. A skewed closed structure is obtained for the glutamine-binding protein at high bias values, indicating that specific protein-ligand interactions might suppress important protein-protein interactions.
Shrinkage covariance matrix approach based on robust trimmed mean in gene sets detection
NASA Astrophysics Data System (ADS)
Karjanto, Suryaefiza; Ramli, Norazan Mohamed; Ghani, Nor Azura Md; Aripin, Rasimah; Yusop, Noorezatty Mohd
2015-02-01
Microarray involves of placing an orderly arrangement of thousands of gene sequences in a grid on a suitable surface. The technology has made a novelty discovery since its development and obtained an increasing attention among researchers. The widespread of microarray technology is largely due to its ability to perform simultaneous analysis of thousands of genes in a massively parallel manner in one experiment. Hence, it provides valuable knowledge on gene interaction and function. The microarray data set typically consists of tens of thousands of genes (variables) from just dozens of samples due to various constraints. Therefore, the sample covariance matrix in Hotelling's T2 statistic is not positive definite and become singular, thus it cannot be inverted. In this research, the Hotelling's T2 statistic is combined with a shrinkage approach as an alternative estimation to estimate the covariance matrix to detect significant gene sets. The use of shrinkage covariance matrix overcomes the singularity problem by converting an unbiased to an improved biased estimator of covariance matrix. Robust trimmed mean is integrated into the shrinkage matrix to reduce the influence of outliers and consequently increases its efficiency. The performance of the proposed method is measured using several simulation designs. The results are expected to outperform existing techniques in many tested conditions.
An unbiased Hessian representation for Monte Carlo PDFs.
Carrazza, Stefano; Forte, Stefano; Kassabov, Zahari; Latorre, José Ignacio; Rojo, Juan
We develop a methodology for the construction of a Hessian representation of Monte Carlo sets of parton distributions, based on the use of a subset of the Monte Carlo PDF replicas as an unbiased linear basis, and of a genetic algorithm for the determination of the optimal basis. We validate the methodology by first showing that it faithfully reproduces a native Monte Carlo PDF set (NNPDF3.0), and then, that if applied to Hessian PDF set (MMHT14) which was transformed into a Monte Carlo set, it gives back the starting PDFs with minimal information loss. We then show that, when applied to a large Monte Carlo PDF set obtained as combination of several underlying sets, the methodology leads to a Hessian representation in terms of a rather smaller set of parameters (MC-H PDFs), thereby providing an alternative implementation of the recently suggested Meta-PDF idea and a Hessian version of the recently suggested PDF compression algorithm (CMC-PDFs). The mc2hessian conversion code is made publicly available together with (through LHAPDF6) a Hessian representations of the NNPDF3.0 set, and the MC-H PDF set.
Etzioni, Ruth; Gulati, Roman
2013-04-01
In our article about limitations of basing screening policy on screening trials, we offered several examples of ways in which modeling, using data from large screening trials and population trends, provided insights that differed somewhat from those based only on empirical trial results. In this editorial, we take a step back and consider the general question of whether randomized screening trials provide the strongest evidence for clinical guidelines concerning population screening programs. We argue that randomized trials provide a process that is designed to protect against certain biases but that this process does not guarantee that inferences based on empirical results from screening trials will be unbiased. Appropriate quantitative methods are key to obtaining unbiased inferences from screening trials. We highlight several studies in the statistical literature demonstrating that conventional survival analyses of screening trials can be misleading and list a number of key questions concerning screening harms and benefits that cannot be answered without modeling. Although we acknowledge the centrality of screening trials in the policy process, we maintain that modeling constitutes a powerful tool for screening trial interpretation and screening policy development.
A unique chromatin complex occupies young α-satellite arrays of human centromeres
Henikoff, Jorja G.; Thakur, Jitendra; Kasinathan, Sivakanthan; Henikoff, Steven
2015-01-01
The intractability of homogeneous α-satellite arrays has impeded understanding of human centromeres. Artificial centromeres are produced from higher-order repeats (HORs) present at centromere edges, although the exact sequences and chromatin conformations of centromere cores remain unknown. We use high-resolution chromatin immunoprecipitation (ChIP) of centromere components followed by clustering of sequence data as an unbiased approach to identify functional centromere sequences. We find that specific dimeric α-satellite units shared by multiple individuals dominate functional human centromeres. We identify two recently homogenized α-satellite dimers that are occupied by precisely positioned CENP-A (cenH3) nucleosomes with two ~100–base pair (bp) DNA wraps in tandem separated by a CENP-B/CENP-C–containing linker, whereas pericentromeric HORs show diffuse positioning. Precise positioning is largely maintained, whereas abundance decreases exponentially with divergence, which suggests that young α-satellite dimers with paired ~100-bp particles mediate evolution of functional human centromeres. Our unbiased strategy for identifying functional centromeric sequences should be generally applicable to tandem repeat arrays that dominate the centromeres of most eukaryotes. PMID:25927077
2015-09-01
assessed the specificity of mutation in Drosophila S2R+ cells. We generated a quantitative mutation reporter vector in which an sgRNA target sequence ...phosphatases (563 genes) in the Drosophila genome (Figure 4). 65 samples that displayed synthetic lethality (15 genes) or synthetic increases in viability...targeting all kinases and phosphatases (563 genes) in the Drosophila genome . . Identified three hits (mRNA-Cap, Pitslre and CycT) that scored as
NASA Astrophysics Data System (ADS)
Tugores, M. Pilar; Iglesias, Magdalena; Oñate, Dolores; Miquel, Joan
2016-02-01
In the Mediterranean Sea, the European anchovy (Engraulis encrasicolus) displays a key role in ecological and economical terms. Ensuring stock sustainability requires the provision of crucial information, such as species spatial distribution or unbiased abundance and precision estimates, so that management strategies can be defined (e.g. fishing quotas, temporal closure areas or marine protected areas MPA). Furthermore, the estimation of the precision of global abundance at different sampling intensities can be used for survey design optimisation. Geostatistics provide a priori unbiased estimations of the spatial structure, global abundance and precision for autocorrelated data. However, their application to non-Gaussian data introduces difficulties in the analysis in conjunction with low robustness or unbiasedness. The present study applied intrinsic geostatistics in two dimensions in order to (i) analyse the spatial distribution of anchovy in Spanish Western Mediterranean waters during the species' recruitment season, (ii) produce distribution maps, (iii) estimate global abundance and its precision, (iv) analyse the effect of changing the sampling intensity on the precision of global abundance estimates and, (v) evaluate the effects of several methodological options on the robustness of all the analysed parameters. The results suggested that while the spatial structure was usually non-robust to the tested methodological options when working with the original dataset, it became more robust for the transformed datasets (especially for the log-backtransformed dataset). The global abundance was always highly robust and the global precision was highly or moderately robust to most of the methodological options, except for data transformation.
CHRR: coordinate hit-and-run with rounding for uniform sampling of constraint-based models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haraldsdóttir, Hulda S.; Cousins, Ben; Thiele, Ines
In constraint-based metabolic modelling, physical and biochemical constraints define a polyhedral convex set of feasible flux vectors. Uniform sampling of this set provides an unbiased characterization of the metabolic capabilities of a biochemical network. However, reliable uniform sampling of genome-scale biochemical networks is challenging due to their high dimensionality and inherent anisotropy. Here, we present an implementation of a new sampling algorithm, coordinate hit-and-run with rounding (CHRR). This algorithm is based on the provably efficient hit-and-run random walk and crucially uses a preprocessing step to round the anisotropic flux set. CHRR provably converges to a uniform stationary sampling distribution. Wemore » apply it to metabolic networks of increasing dimensionality. We show that it converges several times faster than a popular artificial centering hit-and-run algorithm, enabling reliable and tractable sampling of genome-scale biochemical networks.« less
CHRR: coordinate hit-and-run with rounding for uniform sampling of constraint-based models
Haraldsdóttir, Hulda S.; Cousins, Ben; Thiele, Ines; ...
2017-01-31
In constraint-based metabolic modelling, physical and biochemical constraints define a polyhedral convex set of feasible flux vectors. Uniform sampling of this set provides an unbiased characterization of the metabolic capabilities of a biochemical network. However, reliable uniform sampling of genome-scale biochemical networks is challenging due to their high dimensionality and inherent anisotropy. Here, we present an implementation of a new sampling algorithm, coordinate hit-and-run with rounding (CHRR). This algorithm is based on the provably efficient hit-and-run random walk and crucially uses a preprocessing step to round the anisotropic flux set. CHRR provably converges to a uniform stationary sampling distribution. Wemore » apply it to metabolic networks of increasing dimensionality. We show that it converges several times faster than a popular artificial centering hit-and-run algorithm, enabling reliable and tractable sampling of genome-scale biochemical networks.« less
Spectroscopic classification of Gaia18adv by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Gall, C.; Benetti, S.; Wyrzykowski, L.; Stritzinger, M.; Holmbo, S.; Dong, S.; Siltala, Lauri; NUTS Collaboration
2018-01-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of Gaia18adv (SN2018hh) near the host galaxy SDSS J121341.37+282640.0.
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Interpolating scattered data points is a problem of wide ranging interest. A number of approaches for interpolation have been proposed both from theoretical domains such as computational geometry and in applications' fields such as geostatistics. Our motivation arises from geological and mining applications. In many instances data can be costly to compute and are available only at nonuniformly scattered positions. Because of the high cost of collecting measurements, high accuracy is required in the interpolants. One of the most popular interpolation methods in this field is called ordinary kriging. It is popular because it is a best linear unbiased estimator. The price for its statistical optimality is that the estimator is computationally very expensive. This is because the value of each interpolant is given by the solution of a large dense linear system. In practice, kriging problems have been solved approximately by restricting the domain to a small local neighborhood of points that lie near the query point. Determining the proper size for this neighborhood is a solved by ad hoc methods, and it has been shown that this approach leads to undesirable discontinuities in the interpolant. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. This process achieves its efficiency by replacing the large dense kriging system with a much sparser linear system. This technique has been applied to a restriction of our problem, called simple kriging, which is not unbiased for general data sets. In this paper we generalize these results by showing how to apply covariance tapering to the more general problem of ordinary kriging. Through experimentation we demonstrate the space and time efficiency and accuracy of approximating ordinary kriging through the use of covariance tapering combined with iterative methods for solving large sparse systems. We demonstrate our approach on large data sizes arising both from synthetic sources and from real applications.
Galaxy redshift surveys with sparse sampling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chiang, Chi-Ting; Wullstein, Philipp; Komatsu, Eiichiro
2013-12-01
Survey observations of the three-dimensional locations of galaxies are a powerful approach to measure the distribution of matter in the universe, which can be used to learn about the nature of dark energy, physics of inflation, neutrino masses, etc. A competitive survey, however, requires a large volume (e.g., V{sub survey} ∼ 10Gpc{sup 3}) to be covered, and thus tends to be expensive. A ''sparse sampling'' method offers a more affordable solution to this problem: within a survey footprint covering a given survey volume, V{sub survey}, we observe only a fraction of the volume. The distribution of observed regions should bemore » chosen such that their separation is smaller than the length scale corresponding to the wavenumber of interest. Then one can recover the power spectrum of galaxies with precision expected for a survey covering a volume of V{sub survey} (rather than the volume of the sum of observed regions) with the number density of galaxies given by the total number of observed galaxies divided by V{sub survey} (rather than the number density of galaxies within an observed region). We find that regularly-spaced sampling yields an unbiased power spectrum with no window function effect, and deviations from regularly-spaced sampling, which are unavoidable in realistic surveys, introduce calculable window function effects and increase the uncertainties of the recovered power spectrum. On the other hand, we show that the two-point correlation function (pair counting) is not affected by sparse sampling. While we discuss the sparse sampling method within the context of the forthcoming Hobby-Eberly Telescope Dark Energy Experiment, the method is general and can be applied to other galaxy surveys.« less
Uncertainty importance analysis using parametric moment ratio functions.
Wei, Pengfei; Lu, Zhenzhou; Song, Jingwen
2014-02-01
This article presents a new importance analysis framework, called parametric moment ratio function, for measuring the reduction of model output uncertainty when the distribution parameters of inputs are changed, and the emphasis is put on the mean and variance ratio functions with respect to the variances of model inputs. The proposed concepts efficiently guide the analyst to achieve a targeted reduction on the model output mean and variance by operating on the variances of model inputs. The unbiased and progressive unbiased Monte Carlo estimators are also derived for the parametric mean and variance ratio functions, respectively. Only a set of samples is needed for implementing the proposed importance analysis by the proposed estimators, thus the computational cost is free of input dimensionality. An analytical test example with highly nonlinear behavior is introduced for illustrating the engineering significance of the proposed importance analysis technique and verifying the efficiency and convergence of the derived Monte Carlo estimators. Finally, the moment ratio function is applied to a planar 10-bar structure for achieving a targeted 50% reduction of the model output variance. © 2013 Society for Risk Analysis.
Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use
Arthur, Steve M.; Schwartz, Charles C.
1999-01-01
We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly-selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km2 (x = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km2 (x = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km2 (x = 224) for radiotracking data and 16-130 km2 (x = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area <1%/additional location) and precise (CV < 50%). Although the radiotracking data appeared unbiased, except for the relationship between area and sample size, these data failed to indicate some areas that likely were important to bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples. Investigators that use home range estimates in statistical tests should consider the effects of variability of those estimates. Use of GPS-equipped collars can facilitate obtaining larger samples of unbiased data and improve accuracy and precision of home range estimates.
Pierce, Brandon L; Ahsan, Habibul; Vanderweele, Tyler J
2011-06-01
Mendelian Randomization (MR) studies assess the causality of an exposure-disease association using genetic determinants [i.e. instrumental variables (IVs)] of the exposure. Power and IV strength requirements for MR studies using multiple genetic variants have not been explored. We simulated cohort data sets consisting of a normally distributed disease trait, a normally distributed exposure, which affects this trait and a biallelic genetic variant that affects the exposure. We estimated power to detect an effect of exposure on disease for varying allele frequencies, effect sizes and samples sizes (using two-stage least squares regression on 10,000 data sets-Stage 1 is a regression of exposure on the variant. Stage 2 is a regression of disease on the fitted exposure). Similar analyses were conducted using multiple genetic variants (5, 10, 20) as independent or combined IVs. We assessed IV strength using the first-stage F statistic. Simulations of realistic scenarios indicate that MR studies will require large (n > 1000), often very large (n > 10,000), sample sizes. In many cases, so-called 'weak IV' problems arise when using multiple variants as independent IVs (even with as few as five), resulting in biased effect estimates. Combining genetic factors into fewer IVs results in modest power decreases, but alleviates weak IV problems. Ideal methods for combining genetic factors depend upon knowledge of the genetic architecture underlying the exposure. The feasibility of well-powered, unbiased MR studies will depend upon the amount of variance in the exposure that can be explained by known genetic factors and the 'strength' of the IV set derived from these genetic factors.
Intrinsic scatter of caustic masses and hydrostatic bias: An observational study
NASA Astrophysics Data System (ADS)
Andreon, S.; Trinchieri, G.; Moretti, A.; Wang, J.
2017-10-01
All estimates of cluster mass have some intrinsic scatter and perhaps some bias with true mass even in the absence of measurement errors for example caused by cluster triaxiality and large scale structure. Knowledge of the bias and scatter values is fundamental for both cluster cosmology and astrophysics. In this paper we show that the intrinsic scatter of a mass proxy can be constrained by measurements of the gas fraction because masses with higher values of intrinsic scatter with true mass produce more scattered gas fractions. Moreover, the relative bias of two mass estimates can be constrained by comparing the mean gas fraction at the same (nominal) cluster mass. Our observational study addresses the scatter between caustic (I.e., dynamically estimated) and true masses, and the relative bias of caustic and hydrostatic masses. For these purposes, we used the X-ray Unbiased Cluster Sample, a cluster sample selected independently from the intracluster medium content with reliable masses: 34 galaxy clusters in the nearby (0.050 < z < 0.135) Universe, mostly with 14 < log M500/M⊙ ≲ 14.5, and with caustic masses. We found a 35% scatter between caustic and true masses. Furthermore, we found that the relative bias between caustic and hydrostatic masses is small, 0.06 ± 0.05 dex, improving upon past measurements. The small scatter found confirms our previous measurements of a highly variable amount of feedback from cluster to cluster, which is the cause of the observed large variety of core-excised X-ray luminosities and gas masses.
3D two-fluid simulations of turbulence in LAPD
NASA Astrophysics Data System (ADS)
Fisher, Dustin M.
The Large Plasma Device (LAPD) is modeled using a modified version of the 3D Global Braginskii Solver code (GBS) for a nominal Helium plasma. The unbiased low-flow regime is explored in simulations where there is an intrinsic E x B rotation of the plasma. In the simulations this rotation is caused primarily by sheath effects with the Reynolds stress and J x B torque due to a cross-field Pederson conductivity having little effect. Explicit biasing simulations are also explored for the first time where the intrinsic rotation of the plasma is modified through boundary conditions that mimic the biasable limiter used in LAPD. Comparisons to experimental measurements in the unbiased case show strong qualitative agreement with the data, particularly the radial dependence of the density fluctuations, cross-correlation lengths, radial flux dependence outside of the cathode edge, and camera imagery. Kelvin Helmholtz (KH) turbulence at relatively large scales is the dominant driver of cross-field transport in these simulations with smaller-scale drift waves and sheath modes playing a secondary role. Plasma holes and blobs arising from KH vortices are consistent with the scale sizes and overall appearance of those in LAPD camera images. The addition of ion-neutral collisions in the unbiased simulations at previously theorized values reduces the radial particle flux due to a modest stabilizing contribution of the collisions on the KH-modes driving the turbulent transport. In the biased runs the ion-neutral collisions have a much smaller effect due to the modification of the potential from sheath terms. In biasing the plasma to increase the intrinsic rotation, simulations show the emergence of a nonlinearly saturated coherent mode of order m = 6. In addition, the plasma inside of the cathode edge becomes quiescent due to the strong influence of the wall bias in setting up the equilibrium plasma potential. Biasing in the direction opposite to the intrinsic flow reduces the effective shear and leads to a stronger presence of drift modes that are seen to saturate when the KH drive has been suppressed. Both biasing cases show a moderate density confinement similarly seen in the experiment.
NASA Astrophysics Data System (ADS)
Cannizzaro, G.; Kuncarayakti, H.; Fraser, M.; Hamanowicz, A.; Jonker, P.; Kankare, E.; Kostrzewa-Rutkowska, Z.; Onori, F.; Wevers, T.; Wyrzykowski, L.; Galbany, L.
2018-03-01
The NOT Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of supernovae SN 2018aei and SN 2018aej, discovered by PanSTARSS Survey for Transients (ATel #11408).
Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen
Here, we propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator–coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.
Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits
Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen; ...
2018-03-12
Here, we propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator–coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.
Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits
NASA Astrophysics Data System (ADS)
Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen; Gauthier, Daniel J.
2018-03-01
We propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator-coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.
Examinations of tRNA Range of Motion Using Simulations of Cryo-EM Microscopy and X-Ray Data.
Caulfield, Thomas R; Devkota, Batsal; Rollins, Geoffrey C
2011-01-01
We examined tRNA flexibility using a combination of steered and unbiased molecular dynamics simulations. Using Maxwell's demon algorithm, molecular dynamics was used to steer X-ray structure data toward that from an alternative state obtained from cryogenic-electron microscopy density maps. Thus, we were able to fit X-ray structures of tRNA onto cryogenic-electron microscopy density maps for hybrid states of tRNA. Additionally, we employed both Maxwell's demon molecular dynamics simulations and unbiased simulation methods to identify possible ribosome-tRNA contact areas where the ribosome may discriminate tRNAs during translation. Herein, we collected >500 ns of simulation data to assess the global range of motion for tRNAs. Biased simulations can be used to steer between known conformational stop points, while unbiased simulations allow for a general testing of conformational space previously unexplored. The unbiased molecular dynamics data describes the global conformational changes of tRNA on a sub-microsecond time scale for comparison with steered data. Additionally, the unbiased molecular dynamics data was used to identify putative contacts between tRNA and the ribosome during the accommodation step of translation. We found that the primary contact regions were H71 and H92 of the 50S subunit and ribosomal proteins L14 and L16.
Identifying genetic alterations that prime a cancer cell to respond to a particular therapeutic agent can facilitate the development of precision cancer medicines. Cancer cell-line (CCL) profiling of small-molecule sensitivity has emerged as an unbiased method to assess the relationships between genetic or cellular features of CCLs and small-molecule response. Here, we developed annotated cluster multidimensional enrichment analysis to explore the associations between groups of small molecules and groups of CCLs in a new, quantitative sensitivity dataset.
Controlled and Uncontrolled Disorder in Optical Lattice Emulators
2014-12-16
energy dispersion computed for the Hubbard interaction in a flat spin-‐orbit band. The...Right) Emergent single-‐ particle energy dispersion versus wavevector. The inset depicts the emergent...on approach to the Néel transition by means of large-‐scale unbiased Diagrammatic Determinant
Mapping the gas-to-dust ratio in the edge-on spiral galaxy IC2531
NASA Astrophysics Data System (ADS)
Baes, Maarten; Gentile, Gianfranco; Allaert, Flor; Kuno, Nario; Verstappen, Joris
2012-04-01
The gas-to-dust ratio is an important diagnostic of the chemical evolution of galaxies, but unfortunately, there are only a few unbiased studies of the gas-to-dust ratio within galaxies and among different galaxies. We want to take advantage of the revolutionary capabilities of the Herschel Space Observatory and the special geometry of edge-on spiral galaxies to derive accurate gas and dust mass profiles in the edge-on spiral galaxy IC2531, the only southern galaxy from a sample of large edge-on spirals observed with Herschel. We already have a wealth of ancillary data and detailed radiative transfer modelling at our disposal for this galaxy, and now request CO observations to map the molecular gas distribution. With our combined dataset, we will investigate the radial behaviour of the gas-to-dust ratio, compare it with the properties of the stellar population and the dark matter distribution, and test the possibility to use the far-infrared emission from dust to determine the total ISM mass in galaxies.
Young, John; Iwanowicz, Luke; Sperry, Adam; Blazer, Vicki
2014-09-01
Endocrine-disrupting compounds (EDCs) are becoming of increasing concern in waterways of the USA and worldwide. What remains poorly understood, however, is how prevalent these emerging contaminants are in the environment and what methods are best able to determine landscape sources of EDCs. We describe the development of a spatially structured sampling design and a reconnaissance survey of estrogenic activity along gradients of land use within sub-watersheds. We present this example as a useful approach for state and federal agencies with an interest in identifying locations potentially impacted by EDCs that warrant more intensive, focused research. Our study confirms the importance of agricultural activities on levels of a measured estrogenic equivalent (E2Eq) and also highlights the importance of other potential sources of E2Eq in areas where intensive agriculture is not the dominant land use. Through application of readily available geographic information system (GIS) data, coupled with spatial statistical analysis, we demonstrate the correlation of specific land use types to levels of estrogenic activity across a large area in a consistent and unbiased manner.
Chen, Minyong; Shi, Xiaofeng; Duke, Rebecca M.; Ruse, Cristian I.; Dai, Nan; Taron, Christopher H.; Samuelson, James C.
2017-01-01
A method for selective and comprehensive enrichment of N-linked glycopeptides was developed to facilitate detection of micro-heterogeneity of N-glycosylation. The method takes advantage of the inherent properties of Fbs1, which functions within the ubiquitin-mediated degradation system to recognize the common core pentasaccharide motif (Man3GlcNAc2) of N-linked glycoproteins. We show that Fbs1 is able to bind diverse types of N-linked glycomolecules; however, wild-type Fbs1 preferentially binds high-mannose-containing glycans. We identified Fbs1 variants through mutagenesis and plasmid display selection, which possess higher affinity and improved recovery of complex N-glycomolecules. In particular, we demonstrate that the Fbs1 GYR variant may be employed for substantially unbiased enrichment of N-linked glycopeptides from human serum. Most importantly, this highly efficient N-glycopeptide enrichment method enables the simultaneous determination of N-glycan composition and N-glycosites with a deeper coverage (compared to lectin enrichment) and improves large-scale N-glycoproteomics studies due to greatly reduced sample complexity. PMID:28534482
Mechanisms of passive ion permeation through lipid bilayers: insights from simulations.
Tepper, Harald L; Voth, Gregory A
2006-10-26
Multistate empirical valence bond and classical molecular dynamics simulations were used to explore mechanisms for passive ion leakage through a dimyristoyl phosphatidylcholine lipid bilayer. In accordance with a previous study on proton leakage (Biophys. J. 2005, 88, 3095), it was found that the permeation mechanism must be a highly concerted one, in which ion, solvent, and membrane coordinates are coupled. The presence of the ion itself significantly alters the response of those coordinates, suggesting that simulations of transmembrane water structures without explicit inclusion of the ionic solute are insufficient for elucidating transition mechanisms. The properties of H(+), Na(+), OH(-), and bare water molecules in the membrane interior were compared, both by biased sampling techniques and by constructing complete and unbiased transition paths. It was found that the anomalous difference in leakage rates between protons and other cations can be largely explained by charge delocalization effects rather than the usual kinetic picture (Grotthuss hopping of the proton). Permeability differences between anions and cations through phosphatidylcholine bilayers are correlated with suppression of favorable membrane breathing modes by cations.
Folding of polyglutamine chains
NASA Astrophysics Data System (ADS)
Chopra, Manan; Reddy, Allam S.; Abbott, N. L.; de Pablo, J. J.
2008-10-01
Long polyglutamine chains have been associated with a number of neurodegenerative diseases. These include Huntington's disease, where expanded polyglutamine (PolyQ) sequences longer than 36 residues are correlated with the onset of symptoms. In this paper we study the folding pathway of a 54-residue PolyQ chain into a β-helical structure. Transition path sampling Monte Carlo simulations are used to generate unbiased reactive pathways between unfolded configurations and the folded β-helical structure of the polyglutamine chain. The folding process is examined in both explicit water and an implicit solvent. Both models reveal that the formation of a few critical contacts is necessary and sufficient for the molecule to fold. Once the primary contacts are formed, the fate of the protein is sealed and it is largely committed to fold. We find that, consistent with emerging hypotheses about PolyQ aggregation, a stable β-helical structure could serve as the nucleus for subsequent polymerization of amyloid fibrils. Our results indicate that PolyQ sequences shorter than 36 residues cannot form that nucleus, and it is also shown that specific mutations inferred from an analysis of the simulated folding pathway exacerbate its stability.
Statistical and Machine Learning forecasting methods: Concerns and ways forward
Makridakis, Spyros; Assimakopoulos, Vassilios
2018-01-01
Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions. PMID:29584784
Young, John A.; Iwanowicz, Luke R.; Sperry, Adam J.; Blazer, Vicki
2014-01-01
Endocrine-disrupting compounds (EDCs) are becoming of increasing concern in waterways of the USA and worldwide. What remains poorly understood, however, is how prevalent these emerging contaminants are in the environment and what methods are best able to determine landscape sources of EDCs. We describe the development of a spatially structured sampling design and a reconnaissance survey of estrogenic activity along gradients of land use within sub-watersheds.We present this example as a useful approach for state and federal agencies with an interest in identifying locations potentially impacted by EDCs that warrant more intensive, focused research. Our study confirms the importance of agricultural activities on levels of a measured estrogenic equivalent (E2Eq) and also highlights the importance of other potential sources of E2Eq in areas where intensive agriculture is not the dominant land use. Through application of readily available geographic information system (GIS) data, coupled with spatial statistical analysis, we demonstrate the correlation of specific land use types to levels of estrogenic activity across a large area in a consistent and unbiased manner.
Black Holes and the Centers of Galaxies
NASA Astrophysics Data System (ADS)
Richstone, Douglas
1997-07-01
We propose to continue our survey of centers of nearby galaxies. The major goal for Cycle 7 is to survey an unbiased set of galaxies with a potentially wide range of black hole masses. The results will constrain the prevalence and formation of massive black holes and their relationship to AGN's. Over the last several years, we have used HST to characterize the scaling laws for galaxy centers, to identify an apparent dichotomy in galaxy types based on their central light profiles, and to identify new black hole candidates and confirm ground-based results on known candidates. In the STIS epoch, we wish to capitalize on the presence of a genuine slit spectrograph to study the central stellar dynamics of a large set of systematically selected elliptical and S0 galaxies. The sample for this cycle has been carefully chosen to optimize our leverage on the character of a proposed correlation of black hole mass with galaxy mass. In addition, high-S/N observations of line profiles should permit us to distinguish between BHs and anisotropic stellar orbits, a critical degeneracy that has long plagued this subject.
Unbiased Large Spectroscopic Surveys of Galaxies Selected by SPICA Using Dust Bands
NASA Astrophysics Data System (ADS)
Kaneda, H.; Ishihara, D.; Oyabu, S.; Yamagishi, M.; Wada, T.; Armus, L.; Baes, M.; Charmandaris, V.; Czerny, B.; Efstathiou, A.; Fernández-Ontiveros, J. A.; Ferrara, A.; González-Alfonso, E.; Griffin, M.; Gruppioni, C.; Hatziminaoglou, E.; Imanishi, M.; Kohno, K.; Kwon, J.; Nakagawa, T.; Onaka, T.; Pozzi, F.; Scott, D.; Smith, J.-D. T.; Spinoglio, L.; Suzuki, T.; van der Tak, F.; Vaccari, M.; Vignali, C.; Wang, L.
2017-11-01
The mid-infrared range contains many spectral features associated with large molecules and dust grains such as polycyclic aromatic hydrocarbons and silicates. These are usually very strong compared to fine-structure gas lines, and thus valuable in studying the spectral properties of faint distant galaxies. In this paper, we evaluate the capability of low-resolution mid-infrared spectroscopic surveys of galaxies that could be performed by SPICA. The surveys are designed to address the question how star formation and black hole accretion activities evolved over cosmic time through spectral diagnostics of the physical conditions of the interstellar/circumnuclear media in galaxies. On the basis of results obtained with Herschel far-infrared photometric surveys of distant galaxies and Spitzer and AKARI near- to mid-infrared spectroscopic observations of nearby galaxies, we estimate the numbers of the galaxies at redshift z > 0.5, which are expected to be detected in the polycyclic aromatic hydrocarbon features or dust continuum by a wide (10 deg2) or deep (1 deg2) blind survey, both for a given observation time of 600 h. As by-products of the wide blind survey, we also expect to detect debris disks, through the mid-infrared excess above the photospheric emission of nearby main-sequence stars, and we estimate their number. We demonstrate that the SPICA mid-infrared surveys will efficiently provide us with unprecedentedly large spectral samples, which can be studied further in the far-infrared with SPICA.
Biased Brownian dynamics for rate constant calculation.
Zou, G; Skeel, R D; Subramaniam, S
2000-08-01
An enhanced sampling method-biased Brownian dynamics-is developed for the calculation of diffusion-limited biomolecular association reaction rates with high energy or entropy barriers. Biased Brownian dynamics introduces a biasing force in addition to the electrostatic force between the reactants, and it associates a probability weight with each trajectory. A simulation loses weight when movement is along the biasing force and gains weight when movement is against the biasing force. The sampling of trajectories is then biased, but the sampling is unbiased when the trajectory outcomes are multiplied by their weights. With a suitable choice of the biasing force, more reacted trajectories are sampled. As a consequence, the variance of the estimate is reduced. In our test case, biased Brownian dynamics gives a sevenfold improvement in central processing unit (CPU) time with the choice of a simple centripetal biasing force.
Jaffe, Jacob D; Keshishian, Hasmik; Chang, Betty; Addona, Theresa A; Gillette, Michael A; Carr, Steven A
2008-10-01
Verification of candidate biomarker proteins in blood is typically done using multiple reaction monitoring (MRM) of peptides by LC-MS/MS on triple quadrupole MS systems. MRM assay development for each protein requires significant time and cost, much of which is likely to be of little value if the candidate biomarker is below the detection limit in blood or a false positive in the original discovery data. Here we present a new technology, accurate inclusion mass screening (AIMS), designed to provide a bridge from unbiased discovery to MS-based targeted assay development. Masses on the software inclusion list are monitored in each scan on the Orbitrap MS system, and MS/MS spectra for sequence confirmation are acquired only when a peptide from the list is detected with both the correct accurate mass and charge state. The AIMS experiment confirms that a given peptide (and thus the protein from which it is derived) is present in the plasma. Throughput of the method is sufficient to qualify up to a hundred proteins/week. The sensitivity of AIMS is similar to MRM on a triple quadrupole MS system using optimized sample preparation methods (low tens of ng/ml in plasma), and MS/MS data from the AIMS experiments on the Orbitrap can be directly used to configure MRM assays. The method was shown to be at least 4-fold more efficient at detecting peptides of interest than undirected LC-MS/MS experiments using the same instrumentation, and relative quantitation information can be obtained by AIMS in case versus control experiments. Detection by AIMS ensures that a quantitative MRM-based assay can be configured for that protein. The method has the potential to qualify large number of biomarker candidates based on their detection in plasma prior to committing to the time- and resource-intensive steps of establishing a quantitative assay.
Rapid Evolution of Ovarian-Biased Genes in the Yellow Fever Mosquito (Aedes aegypti).
Whittle, Carrie A; Extavour, Cassandra G
2017-08-01
Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female-female competition or male-mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system ( e.g. , sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male-male sperm competition. Copyright © 2017 by the Genetics Society of America.
A Unique Four-Hub Protein Cluster Associates to Glioblastoma Progression
Simeone, Pasquale; Trerotola, Marco; Urbanella, Andrea; Lattanzio, Rossano; Ciavardelli, Domenico; Di Giuseppe, Fabrizio; Eleuterio, Enrica; Sulpizio, Marilisa; Eusebi, Vincenzo; Pession, Annalisa; Piantelli, Mauro; Alberti, Saverio
2014-01-01
Gliomas are the most frequent brain tumors. Among them, glioblastomas are malignant and largely resistant to available treatments. Histopathology is the gold standard for classification and grading of brain tumors. However, brain tumor heterogeneity is remarkable and histopathology procedures for glioma classification remain unsatisfactory for predicting disease course as well as response to treatment. Proteins that tightly associate with cancer differentiation and progression, can bear important prognostic information. Here, we describe the identification of protein clusters differentially expressed in high-grade versus low-grade gliomas. Tissue samples from 25 high-grade tumors, 10 low-grade tumors and 5 normal brain cortices were analyzed by 2D-PAGE and proteomic profiling by mass spectrometry. This led to identify 48 differentially expressed protein markers between tumors and normal samples. Protein clustering by multivariate analyses (PCA and PLS-DA) provided discrimination between pathological samples to an unprecedented extent, and revealed a unique network of deranged proteins. We discovered a novel glioblastoma control module centered on four major network hubs: Huntingtin, HNF4α, c-Myc and 14-3-3ζ. Immunohistochemistry, western blotting and unbiased proteome-wide meta-analysis revealed altered expression of this glioblastoma control module in human glioma samples as compared with normal controls. Moreover, the four-hub network was found to cross-talk with both p53 and EGFR pathways. In summary, the findings of this study indicate the existence of a unifying signaling module controlling glioblastoma pathogenesis and malignant progression, and suggest novel targets for development of diagnostic and therapeutic procedures. PMID:25050814
Spectroscopic observation of SN 2017jzp and SN 2018bf by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Kuncarayakti, H.; Mattila, S.; Kotak, R.; Harmanen, J.; Reynolds, T.; Wyrzykowski, L.; Stritzinger, M.; Onori, F.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.
2018-01-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of SNe 2017jzp and 2018bf in host galaxies KUG 1326+679 and SDSS J225746.53+253833.5, respectively.
NASA Astrophysics Data System (ADS)
Harmanen, J.; Mattila, S.; Kuncarayakti, H.; Reynolds, T.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.; Dong, S.; Pastorello, A.; Pursimo, T.; NUTS Collaboration
2017-10-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17nb in MCG+06-17-007 and CSS170922:172546+342249 in an unknown host galaxy.
Losing the rose tinted glasses: neural substrates of unbiased belief updating in depression
Garrett, Neil; Sharot, Tali; Faulkner, Paul; Korn, Christoph W.; Roiser, Jonathan P.; Dolan, Raymond J.
2014-01-01
Recent evidence suggests that a state of good mental health is associated with biased processing of information that supports a positively skewed view of the future. Depression, on the other hand, is associated with unbiased processing of such information. Here, we use brain imaging in conjunction with a belief update task administered to clinically depressed patients and healthy controls to characterize brain activity that supports unbiased belief updating in clinically depressed individuals. Our results reveal that unbiased belief updating in depression is mediated by strong neural coding of estimation errors in response to both good news (in left inferior frontal gyrus and bilateral superior frontal gyrus) and bad news (in right inferior parietal lobule and right inferior frontal gyrus) regarding the future. In contrast, intact mental health was linked to a relatively attenuated neural coding of bad news about the future. These findings identify a neural substrate mediating the breakdown of biased updating in major depression disorder, which may be essential for mental health. PMID:25221492
Allowable SEM noise for unbiased LER measurement
NASA Astrophysics Data System (ADS)
Papavieros, George; Constantoudis, Vassilios; Gogolides, Evangelos
2018-03-01
Recently, a novel method for the calculation of unbiased Line Edge Roughness based on Power Spectral Density analysis has been proposed. In this paper first an alternative method is discussed and investigated, utilizing the Height-Height Correlation Function (HHCF) of edges. The HHCF-based method enables the unbiased determination of the whole triplet of LER parameters including besides rms the correlation length and roughness exponent. The key of both methods is the sensitivity of PSD and HHCF on noise at high frequencies and short distance respectively. Secondly, we elaborate a testbed of synthesized SEM images with controlled LER and noise to justify the effectiveness of the proposed unbiased methods. Our main objective is to find out the boundaries of the method in respect to noise levels and roughness characteristics, for which the method remains reliable, i.e the maximum amount of noise allowed, for which the output results cope with the controllable known inputs. At the same time, we will also set the extremes of roughness parameters for which the methods hold their accuracy.
An unbiased study of debris discs around A-type stars with Herschel
NASA Astrophysics Data System (ADS)
Thureau, N. D.; Greaves, J. S.; Matthews, B. C.; Kennedy, G.; Phillips, N.; Booth, M.; Duchêne, G.; Horner, J.; Rodriguez, D. R.; Sibthorpe, B.; Wyatt, M. C.
2014-12-01
The Herschel DEBRIS (Disc Emission via a Bias-free Reconnaissance in the Infrared/Submillimetre) survey brings us a unique perspective on the study of debris discs around main-sequence A-type stars. Bias-free by design, the survey offers a remarkable data set with which to investigate the cold disc properties. The statistical analysis of the 100 and 160 μm data for 86 main-sequence A stars yields a lower than previously found debris disc rate. Considering better than 3σ excess sources, we find a detection rate ≥24 ± 5 per cent at 100 μm which is similar to the debris disc rate around main-sequence F/G/K-spectral type stars. While the 100 and 160 μm excesses slowly decline with time, debris discs with large excesses are found around some of the oldest A stars in our sample, evidence that the debris phenomenon can survive throughout the length of the main sequence (˜1 Gyr). Debris discs are predominantly detected around the youngest and hottest stars in our sample. Stellar properties such as metallicity are found to have no effect on the debris disc incidence. Debris discs are found around A stars in single systems and multiple systems at similar rates. While tight and wide binaries (<1 and >100 au, respectively) host debris discs with a similar frequency and global properties, no intermediate separation debris systems were detected in our sample.
ATLASGAL -- A molecular view of an unbiased sample of massive star forming clumps
NASA Astrophysics Data System (ADS)
Figura, Charles; Urquhart, James; Wyrowski, Friedrich; Giannetti, Andrea; Kim, Wonju
2018-01-01
Massive stars play an important role in many areas of astrophysics, from regulating star formation to driving the evolution of their host galaxy. Study of these stars is made difficult by their short evolutionary timescales, small populations and greater distances, and further complicated because they reach the main sequence while still shrouded in their natal clumps. As a result, many aspects of their formation are still poorly understood.We have assembled a large and statistically representative collection of massive star-forming environments that span all evolutionary stages of development by correlating mid-infrared and dust continnum surveys. We have conducted follow-up single-pointing observations toward a sample of approximately 600 of these clumps with the Mopra telescope using an 8 GHz bandwidth that spans some 27 molecular and mm-radio recombination line transitions. These lines trace a wide range of interstellar conditions with varying thermal, chemical, and kinematic properties. Many of these lines exhibit hyperfine structure allowing more detailed measurements of the clump environment (e.g. rotation temperatures and column densities).From these twenty-seven lines, we have identified thirteen line intensity ratios that strongly trace the evolutionary state of these clumps. We have investigated individual molecular and mm-radio recombination lines, contrasting these with radio and sub-mm continuum observations. We present a summary of the results of the statistical analysis of the sample, and compare them with previous similar studies to test their utility as chemical clocks of the evolutionary processes.
Tissues from population-based cancer registries: a novel approach to increasing research potential.
Goodman, Marc T; Hernandez, Brenda Y; Hewitt, Stephen; Lynch, Charles F; Coté, Timothy R; Frierson, Henry F; Moskaluk, Christopher A; Killeen, Jeffrey L; Cozen, Wendy; Key, Charles R; Clegg, Limin; Reichman, Marsha; Hankey, Benjamin F; Edwards, Brenda
2005-07-01
Population-based cancer registries, such as those included in the Surveillance, Epidemiology, and End-Results (SEER) Program, offer tremendous research potential beyond traditional surveillance activities. We describe the expansion of SEER registries to gather formalin-fixed, paraffin-embedded tissue from cancer patients on a population basis. Population-based tissue banks have the advantage of providing an unbiased sampling frame for evaluating the public health impact of genes or protein targets that may be used for therapeutic or diagnostic purposes in defined communities. Such repositories provide a unique resource for testing new molecular classification schemes for cancer, validating new biologic markers of malignancy, prognosis and progression, assessing therapeutic targets, and measuring allele frequencies of cancer-associated genetic polymorphisms or germline mutations in representative samples. The assembly of tissue microarrays will allow for the use of rapid, large-scale protein-expression profiling of tumor samples while limiting depletion of this valuable resource. Access to biologic specimens through SEER registries will provide researchers with demographic, clinical, and risk factor information on cancer patients with assured data quality and completeness. Clinical outcome data, such as disease-free survival, can be correlated with previously validated prognostic markers. Furthermore, the anonymity of the study subject can be protected through rigorous standards of confidentiality. SEER-based tissue resources represent a step forward in true, population-based tissue repositories of tumors from US patients and may serve as a foundation for molecular epidemiology studies of cancer in this country.
Griaud, François; Denefeld, Blandine; Lang, Manuel; Hensinger, Héloïse; Haberl, Peter; Berg, Matthias
2017-01-01
ABSTRACT Characterization of charge-based variants by mass spectrometry (MS) is required for the analytical development of a new biologic entity and its marketing approval by health authorities. However, standard peak-based data analysis approaches are time-consuming and biased toward the detection, identification, and quantification of main variants only. The aim of this study was to characterize in-depth acidic and basic species of a stressed IgG1 monoclonal antibody using comprehensive and unbiased MS data evaluation tools. Fractions collected from cation ion exchange (CEX) chromatography were analyzed as intact, after reduction of disulfide bridges, and after proteolytic cleavage using Lys-C. Data of both intact and reduced samples were evaluated consistently using a time-resolved deconvolution algorithm. Peptide mapping data were processed simultaneously, quantified and compared in a systematic manner for all MS signals and fractions. Differences observed between the fractions were then further characterized and assigned. Time-resolved deconvolution enhanced pattern visualization and data interpretation of main and minor modifications in 3-dimensional maps across CEX fractions. Relative quantification of all MS signals across CEX fractions before peptide assignment enabled the detection of fraction-specific chemical modifications at abundances below 1%. Acidic fractions were shown to be heterogeneous, containing antibody fragments, glycated as well as deamidated forms of the heavy and light chains. In contrast, the basic fractions contained mainly modifications of the C-terminus and pyroglutamate formation at the N-terminus of the heavy chain. Systematic data evaluation was performed to investigate multiple data sets and comprehensively extract main and minor differences between each CEX fraction in an unbiased manner. PMID:28379786
Testing for X-Ray–SZ Differences and Redshift Evolution in the X-Ray Morphology of Galaxy Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nurgaliev, D.; McDonald, M.; Benson, B. A.
We present a quantitative study of the X-ray morphology of galaxy clusters, as a function of their detection method and redshift. We analyze two separate samples of galaxy clusters: a sample of 36 clusters atmore » $$0.35\\lt z\\lt 0.9$$ selected in the X-ray with the ROSAT PSPC 400 deg(2) survey, and a sample of 90 clusters at $$0.25\\lt z\\lt 1.2$$ selected via the Sunyaev–Zel’dovich (SZ) effect with the South Pole Telescope. Clusters from both samples have similar-quality Chandra observations, which allow us to quantify their X-ray morphologies via two distinct methods: centroid shifts (w) and photon asymmetry ($${A}_{\\mathrm{phot}}$$). The latter technique provides nearly unbiased morphology estimates for clusters spanning a broad range of redshift and data quality. We further compare the X-ray morphologies of X-ray- and SZ-selected clusters with those of simulated clusters. We do not find a statistically significant difference in the measured X-ray morphology of X-ray and SZ-selected clusters over the redshift range probed by these samples, suggesting that the two are probing similar populations of clusters. We find that the X-ray morphologies of simulated clusters are statistically indistinguishable from those of X-ray- or SZ-selected clusters, implying that the most important physics for dictating the large-scale gas morphology (outside of the core) is well-approximated in these simulations. Finally, we find no statistically significant redshift evolution in the X-ray morphology (both for observed and simulated clusters), over the range of $$z\\sim 0.3$$ to $$z\\sim 1$$, seemingly in contradiction with the redshift-dependent halo merger rate predicted by simulations.« less
Testing for X-Ray–SZ Differences and Redshift Evolution in the X-Ray Morphology of Galaxy Clusters
Nurgaliev, D.; McDonald, M.; Benson, B. A.; ...
2017-05-16
We present a quantitative study of the X-ray morphology of galaxy clusters, as a function of their detection method and redshift. We analyze two separate samples of galaxy clusters: a sample of 36 clusters atmore » $$0.35\\lt z\\lt 0.9$$ selected in the X-ray with the ROSAT PSPC 400 deg(2) survey, and a sample of 90 clusters at $$0.25\\lt z\\lt 1.2$$ selected via the Sunyaev–Zel’dovich (SZ) effect with the South Pole Telescope. Clusters from both samples have similar-quality Chandra observations, which allow us to quantify their X-ray morphologies via two distinct methods: centroid shifts (w) and photon asymmetry ($${A}_{\\mathrm{phot}}$$). The latter technique provides nearly unbiased morphology estimates for clusters spanning a broad range of redshift and data quality. We further compare the X-ray morphologies of X-ray- and SZ-selected clusters with those of simulated clusters. We do not find a statistically significant difference in the measured X-ray morphology of X-ray and SZ-selected clusters over the redshift range probed by these samples, suggesting that the two are probing similar populations of clusters. We find that the X-ray morphologies of simulated clusters are statistically indistinguishable from those of X-ray- or SZ-selected clusters, implying that the most important physics for dictating the large-scale gas morphology (outside of the core) is well-approximated in these simulations. Finally, we find no statistically significant redshift evolution in the X-ray morphology (both for observed and simulated clusters), over the range of $$z\\sim 0.3$$ to $$z\\sim 1$$, seemingly in contradiction with the redshift-dependent halo merger rate predicted by simulations.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Combescure, Monique
2009-03-15
In our previous paper [Combescure, M., 'Circulant matrices, Gauss sums and the mutually unbiased bases. I. The prime number case', Cubo A Mathematical Journal (unpublished)] we have shown that the theory of circulant matrices allows to recover the result that there exists p+1 mutually unbiased bases in dimension p, p being an arbitrary prime number. Two orthonormal bases B, B{sup '} of C{sup d} are said mutually unbiased if for all b(set-membership sign)B, for all b{sup '}(set-membership sign)B{sup '} one has that |b{center_dot}b{sup '}|=1/{radical}(d) (b{center_dot}b{sup '} Hermitian scalar product in C{sup d}). In this paper we show that the theorymore » of block-circulant matrices with circulant blocks allows to show very simply the known result that if d=p{sup n} (p a prime number and n any integer) there exists d+1 mutually unbiased bases in C{sup d}. Our result relies heavily on an idea of Klimov et al. [''Geometrical approach to the discrete Wigner function,'' J. Phys. A 39, 14471 (2006)]. As a subproduct we recover properties of quadratic Weil sums for p{>=}3, which generalizes the fact that in the prime case the quadratic Gauss sum properties follow from our results.« less
Examinations of tRNA Range of Motion Using Simulations of Cryo-EM Microscopy and X-Ray Data
Caulfield, Thomas R.; Devkota, Batsal; Rollins, Geoffrey C.
2011-01-01
We examined tRNA flexibility using a combination of steered and unbiased molecular dynamics simulations. Using Maxwell's demon algorithm, molecular dynamics was used to steer X-ray structure data toward that from an alternative state obtained from cryogenic-electron microscopy density maps. Thus, we were able to fit X-ray structures of tRNA onto cryogenic-electron microscopy density maps for hybrid states of tRNA. Additionally, we employed both Maxwell's demon molecular dynamics simulations and unbiased simulation methods to identify possible ribosome-tRNA contact areas where the ribosome may discriminate tRNAs during translation. Herein, we collected >500 ns of simulation data to assess the global range of motion for tRNAs. Biased simulations can be used to steer between known conformational stop points, while unbiased simulations allow for a general testing of conformational space previously unexplored. The unbiased molecular dynamics data describes the global conformational changes of tRNA on a sub-microsecond time scale for comparison with steered data. Additionally, the unbiased molecular dynamics data was used to identify putative contacts between tRNA and the ribosome during the accommodation step of translation. We found that the primary contact regions were H71 and H92 of the 50S subunit and ribosomal proteins L14 and L16. PMID:21716650
McGregor, Tracy L.; Van Driest, Sara L.; Brothers, Kyle B.; Bowton, Erica A.; Muglia, Louis J.; Roden, Dan M.
2013-01-01
The Vanderbilt DNA repository, BioVU, links DNA from leftover clinical blood samples to de-identified electronic medical records. After initiating adult sample collection, pediatric extension required consideration of ethical concerns specific to pediatrics and implementation of specialized DNA extraction methods. In the first year of pediatric sample collection, over 11,000 samples were included from individuals younger than 18 years. We compared the pediatric BioVU cohort to the overall Vanderbilt University Medical Center pediatric population and found similar demographic characteristics; however, the BioVU cohort has higher rates of select diseases, medication exposures, and laboratory testing, demonstrating enriched representation of severe or chronic disease. This unbalanced sample accumulation may accelerate research of some cohorts, but also may limit study of relatively benign conditions and the accrual of unaffected and unbiased control samples. BioVU represents a feasible model for pediatric DNA biobanking but involves both ethical and practical considerations specific to the pediatric population. PMID:23281421
The large-scale environment from cosmological simulations - I. The baryonic cosmic web
NASA Astrophysics Data System (ADS)
Cui, Weiguang; Knebe, Alexander; Yepes, Gustavo; Yang, Xiaohu; Borgani, Stefano; Kang, Xi; Power, Chris; Staveley-Smith, Lister
2018-01-01
Using a series of cosmological simulations that includes one dark-matter-only (DM-only) run, one gas cooling-star formation-supernova feedback (CSF) run and one that additionally includes feedback from active galactic nuclei (AGNs), we classify the large-scale structures with both a velocity-shear-tensor code (VWEB) and a tidal-tensor code (PWEB). We find that the baryonic processes have almost no impact on large-scale structures - at least not when classified using aforementioned techniques. More importantly, our results confirm that the gas component alone can be used to infer the filamentary structure of the universe practically un-biased, which could be applied to cosmology constraints. In addition, the gas filaments are classified with its velocity (VWEB) and density (PWEB) fields, which can theoretically connect to the radio observations, such as H I surveys. This will help us to bias-freely link the radio observations with dark matter distributions at large scale.
Probing the Properties of AGN Clustering in the Local Universe with Swift-BAT
NASA Astrophysics Data System (ADS)
Powell, M.; Cappelluti, N.; Urry, M.; Koss, M.; Allevato, V.; Ajello, M.
2017-10-01
I present the benchmark measurement of AGN clustering in the local universe with the all-sky Swift-BAT survey. The hard X-ray selection (14-195 keV) allows for the detection of some of the most obscured AGN, providing the largest, most unbiased sample of local AGN to date. We derive for the first time the halo occupation distribution (HOD) of the sample in various bins of black hole mass, accretion rate, and obscuration. In doing so, we characterize the cosmic environment of growing supermassive black holes with unprecedented precision, and determine which black hole parameters depend on environment. We then compare our results to the current evolutionary models of AGN.
Digital sorting of complex tissues for cell type-specific gene expression profiles.
Zhong, Yi; Wan, Ying-Wooi; Pang, Kaifang; Chow, Lionel M L; Liu, Zhandong
2013-03-07
Cellular heterogeneity is present in almost all gene expression profiles. However, transcriptome analysis of tissue specimens often ignores the cellular heterogeneity present in these samples. Standard deconvolution algorithms require prior knowledge of the cell type frequencies within a tissue or their in vitro expression profiles. Furthermore, these algorithms tend to report biased estimations. Here, we describe a Digital Sorting Algorithm (DSA) for extracting cell-type specific gene expression profiles from mixed tissue samples that is unbiased and does not require prior knowledge of cell type frequencies. The results suggest that DSA is a specific and sensitivity algorithm in gene expression profile deconvolution and will be useful in studying individual cell types of complex tissues.
Absolute magnitude calibration using trigonometric parallax - Incomplete, spectroscopic samples
NASA Technical Reports Server (NTRS)
Ratnatunga, Kavan U.; Casertano, Stefano
1991-01-01
A new numerical algorithm is used to calibrate the absolute magnitude of spectroscopically selected stars from their observed trigonometric parallax. This procedure, based on maximum-likelihood estimation, can retrieve unbiased estimates of the intrinsic absolute magnitude and its dispersion even from incomplete samples suffering from selection biases in apparent magnitude and color. It can also make full use of low accuracy and negative parallaxes and incorporate censorship on reported parallax values. Accurate error estimates are derived for each of the fitted parameters. The algorithm allows an a posteriori check of whether the fitted model gives a good representation of the observations. The procedure is described in general and applied to both real and simulated data.
Definition and Measurement of Selection Bias: From Constant Ratio to Constant Difference
ERIC Educational Resources Information Center
Cahan, Sorel; Gamliel, Eyal
2006-01-01
Despite its intuitive appeal and popularity, Thorndike's constant ratio (CR) model for unbiased selection is inherently inconsistent in "n"-free selection. Satisfaction of the condition for unbiased selection, when formulated in terms of success/acceptance probabilities, usually precludes satisfaction by the converse probabilities of…
Spectroscopic observation of Gaia17dht and Gaia17diu by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Fraser, M.; Dyrbye, S.; Cappella, E.
2017-12-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of Gaia17dht/SN2017izz and Gaia17diu/SN2017jdb (in host galaxies SDSS J145121.24+283521.6 and LEDA 2753585 respectively).
Statistics as Unbiased Estimators: Exploring the Teaching of Standard Deviation
ERIC Educational Resources Information Center
Wasserman, Nicholas H.; Casey, Stephanie; Champion, Joe; Huey, Maryann
2017-01-01
This manuscript presents findings from a study about the knowledge for and planned teaching of standard deviation. We investigate how understanding variance as an unbiased (inferential) estimator--not just a descriptive statistic for the variation (spread) in data--is related to teachers' instruction regarding standard deviation, particularly…
Unbiased symmetric metrics provide a useful measure to quickly compare two datasets, with similar interpretations for both under and overestimations. Two examples include the normalized mean bias factor and normalized mean absolute error factor. However, the original formulations...
Virus Identification in Unknown Tropical Febrile Illness Cases Using Deep Sequencing
Balmaseda, Angel; Harris, Eva; DeRisi, Joseph L.
2012-01-01
Dengue virus is an emerging infectious agent that infects an estimated 50–100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness. PMID:22347512
Unbiased and robust quantification of synchronization between spikes and local field potential.
Li, Zhaohui; Cui, Dong; Li, Xiaoli
2016-08-30
In neuroscience, relating the spiking activity of individual neurons to the local field potential (LFP) of neural ensembles is an increasingly useful approach for studying rhythmic neuronal synchronization. Many methods have been proposed to measure the strength of the association between spikes and rhythms in the LFP recordings, and most existing measures are dependent upon the total number of spikes. In the present work, we introduce a robust approach for quantifying spike-LFP synchronization which performs reliably for limited samples of data. The measure is termed as spike-triggered correlation matrix synchronization (SCMS), which takes LFP segments centered on each spike as multi-channel signals and calculates the index of spike-LFP synchronization by constructing a correlation matrix. The simulation based on artificial data shows that the SCMS output almost does not change with the sample size. This property is of crucial importance when making comparisons between different experimental conditions. When applied to actual neuronal data recorded from the monkey primary visual cortex, it is found that the spike-LFP synchronization strength shows orientation selectivity to drifting gratings. In comparison to another unbiased method, pairwise phase consistency (PPC), the proposed SCMS behaves better for noisy spike trains by means of numerical simulations. This study demonstrates the basic idea and calculating process of the SCMS method. Considering its unbiasedness and robustness, the measure is of great advantage to characterize the synchronization between spike trains and rhythms present in LFP. Copyright © 2016 Elsevier B.V. All rights reserved.
Nishino, Jo; Kochi, Yuta; Shigemizu, Daichi; Kato, Mamoru; Ikari, Katsunori; Ochi, Hidenori; Noma, Hisashi; Matsui, Kota; Morizono, Takashi; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Matsui, Shigeyuki
2018-01-01
Genome-wide association studies (GWAS) suggest that the genetic architecture of complex diseases consists of unexpectedly numerous variants with small effect sizes. However, the polygenic architectures of many diseases have not been well characterized due to lack of simple and fast methods for unbiased estimation of the underlying proportion of disease-associated variants and their effect-size distribution. Applying empirical Bayes estimation of semi-parametric hierarchical mixture models to GWAS summary statistics, we confirmed that schizophrenia was extremely polygenic [~40% of independent genome-wide SNPs are risk variants, most within odds ratio (OR = 1.03)], whereas rheumatoid arthritis was less polygenic (~4 to 8% risk variants, significant portion reaching OR = 1.05 to 1.1). For rheumatoid arthritis, stratified estimations revealed that expression quantitative loci in blood explained large genetic variance, and low- and high-frequency derived alleles were prone to be risk and protective, respectively, suggesting a predominance of deleterious-risk and advantageous-protective mutations. Despite genetic correlation, effect-size distributions for schizophrenia and bipolar disorder differed across allele frequency. These analyses distinguished disease polygenic architectures and provided clues for etiological differences in complex diseases. PMID:29740473
Trasande, Leonardo; Vandenberg, Laura N; Bourguignon, Jean-Pierre; Myers, John Peterson; Slama, Remy; Saal, Frederick vom; Zoeller, Robert Thomas
2017-01-01
Evidence increasingly confirms that synthetic chemicals disrupt the endocrine system and contribute to disease and disability across the lifespan. Despite a United Nations Environment Programme/WHO report affirmed by over 100 countries at the Fourth International Conference on Chemicals Management, ‘manufactured doubt’ continues to be cast as a cloud over rigorous, peer-reviewed and independently funded scientific data. This study describes the sources of doubt and their social costs, and suggested courses of action by policymakers to prevent disease and disability. The problem is largely based on the available data, which are all too limited. Rigorous testing programmes should not simply focus on oestrogen, androgen and thyroid. Tests should have proper statistical power. ‘Good laboratory practice’ (GLP) hardly represents a proper or even gold standard for laboratory studies of endocrine disruption. Studies should be evaluated with regard to the contamination of negative controls, responsiveness to positive controls and dissection techniques. Flaws in many GLP studies have been identified, yet regulatory agencies rely on these flawed studies. Peer-reviewed and unbiased research, rather than ‘sound science’, should be used to evaluate endocrine-disrupting chemicals. PMID:27417427
Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C
2014-06-01
Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection.
Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C
2014-01-01
Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection. PMID:24518889
Fandom Biases Retrospective Judgments Not Perception.
Huff, Markus; Papenmeier, Frank; Maurer, Annika E; Meitz, Tino G K; Garsoffky, Bärbel; Schwan, Stephan
2017-02-24
Attitudes and motivations have been shown to affect the processing of visual input, indicating that observers may see a given situation each literally in a different way. Yet, in real-life, processing information in an unbiased manner is considered to be of high adaptive value. Attitudinal and motivational effects were found for attention, characterization, categorization, and memory. On the other hand, for dynamic real-life events, visual processing has been found to be highly synchronous among viewers. Thus, while in a seminal study fandom as a particularly strong case of attitudes did bias judgments of a sports event, it left the question open whether attitudes do bias prior processing stages. Here, we investigated influences of fandom during the live TV broadcasting of the 2013 UEFA-Champions-League Final regarding attention, event segmentation, immediate and delayed cued recall, as well as affect, memory confidence, and retrospective judgments. Even though we replicated biased retrospective judgments, we found that eye-movements, event segmentation, and cued recall were largely similar across both groups of fans. Our findings demonstrate that, while highly involving sports events are interpreted in a fan dependent way, at initial stages they are processed in an unbiased manner.
Fandom Biases Retrospective Judgments Not Perception
Huff, Markus; Papenmeier, Frank; Maurer, Annika E.; Meitz, Tino G. K.; Garsoffky, Bärbel; Schwan, Stephan
2017-01-01
Attitudes and motivations have been shown to affect the processing of visual input, indicating that observers may see a given situation each literally in a different way. Yet, in real-life, processing information in an unbiased manner is considered to be of high adaptive value. Attitudinal and motivational effects were found for attention, characterization, categorization, and memory. On the other hand, for dynamic real-life events, visual processing has been found to be highly synchronous among viewers. Thus, while in a seminal study fandom as a particularly strong case of attitudes did bias judgments of a sports event, it left the question open whether attitudes do bias prior processing stages. Here, we investigated influences of fandom during the live TV broadcasting of the 2013 UEFA-Champions-League Final regarding attention, event segmentation, immediate and delayed cued recall, as well as affect, memory confidence, and retrospective judgments. Even though we replicated biased retrospective judgments, we found that eye-movements, event segmentation, and cued recall were largely similar across both groups of fans. Our findings demonstrate that, while highly involving sports events are interpreted in a fan dependent way, at initial stages they are processed in an unbiased manner. PMID:28233877
Mutually unbiased bases in six dimensions: The four most distant bases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Raynal, Philippe; Lue Xin; Englert, Berthold-Georg
2011-06-15
We consider the average distance between four bases in six dimensions. The distance between two orthonormal bases vanishes when the bases are the same, and the distance reaches its maximal value of unity when the bases are unbiased. We perform a numerical search for the maximum average distance and find it to be strictly smaller than unity. This is strong evidence that no four mutually unbiased bases exist in six dimensions. We also provide a two-parameter family of three bases which, together with the canonical basis, reach the numerically found maximum of the average distance, and we conduct a detailedmore » study of the structure of the extremal set of bases.« less
Virological Sampling of Inaccessible Wildlife with Drones.
Geoghegan, Jemma L; Pirotta, Vanessa; Harvey, Erin; Smith, Alastair; Buchmann, Jan P; Ostrowski, Martin; Eden, John-Sebastian; Harcourt, Robert; Holmes, Edward C
2018-06-02
There is growing interest in characterizing the viromes of diverse mammalian species, particularly in the context of disease emergence. However, little is known about virome diversity in aquatic mammals, in part due to difficulties in sampling. We characterized the virome of the exhaled breath (or blow) of the Eastern Australian humpback whale ( Megaptera novaeangliae ). To achieve an unbiased survey of virome diversity, a meta-transcriptomic analysis was performed on 19 pooled whale blow samples collected via a purpose-built Unmanned Aerial Vehicle (UAV, or drone) approximately 3 km off the coast of Sydney, Australia during the 2017 winter annual northward migration from Antarctica to northern Australia. To our knowledge, this is the first time that UAVs have been used to sample viruses. Despite the relatively small number of animals surveyed in this initial study, we identified six novel virus species from five viral families. This work demonstrates the potential of UAVs in studies of virus disease, diversity, and evolution.
Clear: Composition of Likelihoods for Evolve and Resequence Experiments.
Iranmehr, Arya; Akbari, Ali; Schlötterer, Christian; Bafna, Vineet
2017-06-01
The advent of next generation sequencing technologies has made whole-genome and whole-population sampling possible, even for eukaryotes with large genomes. With this development, experimental evolution studies can be designed to observe molecular evolution "in action" via evolve-and-resequence (E&R) experiments. Among other applications, E&R studies can be used to locate the genes and variants responsible for genetic adaptation. Most existing literature on time-series data analysis often assumes large population size, accurate allele frequency estimates, or wide time spans. These assumptions do not hold in many E&R studies. In this article, we propose a method-composition of likelihoods for evolve-and-resequence experiments (Clear)-to identify signatures of selection in small population E&R experiments. Clear takes whole-genome sequences of pools of individuals as input, and properly addresses heterogeneous ascertainment bias resulting from uneven coverage. Clear also provides unbiased estimates of model parameters, including population size, selection strength, and dominance, while being computationally efficient. Extensive simulations show that Clear achieves higher power in detecting and localizing selection over a wide range of parameters, and is robust to variation of coverage. We applied the Clear statistic to multiple E&R experiments, including data from a study of adaptation of Drosophila melanogaster to alternating temperatures and a study of outcrossing yeast populations, and identified multiple regions under selection with genome-wide significance. Copyright © 2017 by the Genetics Society of America.
Data-driven confounder selection via Markov and Bayesian networks.
Häggström, Jenny
2018-06-01
To unbiasedly estimate a causal effect on an outcome unconfoundedness is often assumed. If there is sufficient knowledge on the underlying causal structure then existing confounder selection criteria can be used to select subsets of the observed pretreatment covariates, X, sufficient for unconfoundedness, if such subsets exist. Here, estimation of these target subsets is considered when the underlying causal structure is unknown. The proposed method is to model the causal structure by a probabilistic graphical model, for example, a Markov or Bayesian network, estimate this graph from observed data and select the target subsets given the estimated graph. The approach is evaluated by simulation both in a high-dimensional setting where unconfoundedness holds given X and in a setting where unconfoundedness only holds given subsets of X. Several common target subsets are investigated and the selected subsets are compared with respect to accuracy in estimating the average causal effect. The proposed method is implemented with existing software that can easily handle high-dimensional data, in terms of large samples and large number of covariates. The results from the simulation study show that, if unconfoundedness holds given X, this approach is very successful in selecting the target subsets, outperforming alternative approaches based on random forests and LASSO, and that the subset estimating the target subset containing all causes of outcome yields smallest MSE in the average causal effect estimation. © 2017, The International Biometric Society.
Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design
ERIC Educational Resources Information Center
Smith, William C.
2014-01-01
The ability of regression discontinuity (RD) designs to provide an unbiased treatment effect while overcoming the ethical concerns plagued by Random Control Trials (RCTs) make it a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education…
NASA Technical Reports Server (NTRS)
Cooper, Ken; Gordon, Gail (Technical Monitor)
2001-01-01
This article offers an unfiltered look at a large cross section of the different rapid prototyping technologies available today; from a guy with one of the biggest RP toy boxes in the world as the manager of the Rapid Prototyping Laboratory at NASA's Marshall Space Flight Center (MSFC) in Huntsville, AL, USA. NASA's current operation capacity is nine RP machines, representing eight actual technologies. The article presents a realistic, unbiased look at the technologies and offers advice on what to do and where to go for the best solution to your rapid prototyping needs.
Transportation Big Data: Unbiased Analysis and Tools to Inform Sustainable Transportation Decisions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Today, transportation operation and energy systems data are generated at an unprecedented scale. The U.S. Department of Energy's National Renewable Energy Laboratory (NREL) is the go-to source for expertise in providing data and analysis to inform industry and government transportation decision making. The lab's teams of data experts and engineers are mining and analyzing large sets of complex data -- or 'big data' -- to develop solutions that support the research, development, and deployment of market-ready technologies that reduce fuel consumption and greenhouse gas emissions.
Zunder, Eli R.; Finck, Rachel; Behbehani, Gregory K.; Amir, El-ad D.; Krishnaswamy, Smita; Gonzalez, Veronica D.; Lorang, Cynthia G.; Bjornson, Zach; Spitzer, Matthew H.; Bodenmiller, Bernd; Fantl, Wendy J.; Pe’er, Dana; Nolan, Garry P.
2015-01-01
SUMMARY Mass-tag cell barcoding (MCB) labels individual cell samples with unique combinatorial barcodes, after which they are pooled for processing and measurement as a single multiplexed sample. The MCB method eliminates variability between samples in antibody staining and instrument sensitivity, reduces antibody consumption, and shortens instrument measurement time. Here, we present an optimized MCB protocol with several improvements over previously described methods. The use of palladium-based labeling reagents expands the number of measurement channels available for mass cytometry and reduces interference with lanthanide-based antibody measurement. An error-detecting combinatorial barcoding scheme allows cell doublets to be identified and removed from the analysis. A debarcoding algorithm that is single cell-based rather than population-based improves the accuracy and efficiency of sample deconvolution. This debarcoding algorithm has been packaged into software that allows rapid and unbiased sample deconvolution. The MCB procedure takes 3–4 h, not including sample acquisition time of ~1 h per million cells. PMID:25612231
Fast and accurate Monte Carlo sampling of first-passage times from Wiener diffusion models.
Drugowitsch, Jan
2016-02-11
We present a new, fast approach for drawing boundary crossing samples from Wiener diffusion models. Diffusion models are widely applied to model choices and reaction times in two-choice decisions. Samples from these models can be used to simulate the choices and reaction times they predict. These samples, in turn, can be utilized to adjust the models' parameters to match observed behavior from humans and other animals. Usually, such samples are drawn by simulating a stochastic differential equation in discrete time steps, which is slow and leads to biases in the reaction time estimates. Our method, instead, facilitates known expressions for first-passage time densities, which results in unbiased, exact samples and a hundred to thousand-fold speed increase in typical situations. In its most basic form it is restricted to diffusion models with symmetric boundaries and non-leaky accumulation, but our approach can be extended to also handle asymmetric boundaries or to approximate leaky accumulation.
NASA Astrophysics Data System (ADS)
Wolf, C.; Johnson, A. S.; Bilicki, M.; Blake, C.; Amon, A.; Erben, T.; Glazebrook, K.; Heymans, C.; Hildebrandt, H.; Joudaki, S.; Klaes, D.; Kuijken, K.; Lidman, C.; Marin, F.; Parkinson, D.; Poole, G.
2017-04-01
We present a new training set for estimating empirical photometric redshifts of galaxies, which was created as part of the 2-degree Field Lensing Survey project. This training set is located in a ˜700 deg2 area of the Kilo-Degree-Survey South field and is randomly selected and nearly complete at r < 19.5. We investigate the photometric redshift performance obtained with ugriz photometry from VST-ATLAS and W1/W2 from WISE, based on several empirical and template methods. The best redshift errors are obtained with kernel-density estimation (KDE), as are the lowest biases, which are consistent with zero within statistical noise. The 68th percentiles of the redshift scatter for magnitude-limited samples at r < (15.5, 17.5, 19.5) are (0.014, 0.017, 0.028). In this magnitude range, there are no known ambiguities in the colour-redshift map, consistent with a small rate of redshift outliers. In the fainter regime, the KDE method produces p(z) estimates per galaxy that represent unbiased and accurate redshift frequency expectations. The p(z) sum over any subsample is consistent with the true redshift frequency plus Poisson noise. Further improvements in redshift precision at r < 20 would mostly be expected from filter sets with narrower passbands to increase the sensitivity of colours to small changes in redshift.
Selection biases in empirical p(z) methods for weak lensing
Gruen, D.; Brimioulle, F.
2017-02-23
To measure the mass of foreground objects with weak gravitational lensing, one needs to estimate the redshift distribution of lensed background sources. This is commonly done in an empirical fashion, i.e. with a reference sample of galaxies of known spectroscopic redshift, matched to the source population. In this paper, we develop a simple decision tree framework that, under the ideal conditions of a large, purely magnitude-limited reference sample, allows an unbiased recovery of the source redshift probability density function p(z), as a function of magnitude and colour. We use this framework to quantify biases in empirically estimated p(z) caused bymore » selection effects present in realistic reference and weak lensing source catalogues, namely (1) complex selection of reference objects by the targeting strategy and success rate of existing spectroscopic surveys and (2) selection of background sources by the success of object detection and shape measurement at low signal to noise. For intermediate-to-high redshift clusters, and for depths and filter combinations appropriate for ongoing lensing surveys, we find that (1) spectroscopic selection can cause biases above the 10 per cent level, which can be reduced to ≈5 per cent by optimal lensing weighting, while (2) selection effects in the shape catalogue bias mass estimates at or below the 2 per cent level. Finally, this illustrates the importance of completeness of the reference catalogues for empirical redshift estimation.« less
Besch-Williford, Cynthia; Pesavento, Patricia; Hamilton, Shari; Bauer, Beth; Kapusinszky, Beatrix; Phan, Tung; Delwart, Eric; Livingston, Robert; Cushing, Susan; Watanabe, Rie; Levin, Stephen; Berger, Diana; Myles, Matthew
2017-07-01
We report the identification, pathogenesis, and transmission of a novel polyomavirus in severe combined immunodeficient F344 rats with null Prkdc and interleukin 2 receptor gamma genes. Infected rats experienced weight loss, decreased fecundity, and mortality. Large basophilic intranuclear inclusions were observed in epithelium of the respiratory tract, salivary and lacrimal glands, uterus, and prostate gland. Unbiased viral metagenomic sequencing of lesioned tissues identified a novel polyomavirus, provisionally named Rattus norvegicus polyomavirus 2 (RatPyV2), which clustered with Washington University (WU) polyomavirus in the Wuki clade of the Betapolyomavirus genus. In situ hybridization analyses and quantitative polymerase chain reaction (PCR) results demonstrated viral nucleic acids in epithelium of respiratory, glandular, and reproductive tissues. Polyomaviral disease was reproduced in Foxn1 rnu nude rats cohoused with infected rats or experimentally inoculated with virus. After development of RatPyV2-specific diagnostic assays, a survey of immune-competent rats from North American research institutions revealed detection of RatPyV2 in 7 of 1,000 fecal samples by PCR and anti-RatPyV2 antibodies in 480 of 1,500 serum samples. These findings suggest widespread infection in laboratory rat populations, which may have profound implications for established models of respiratory injury. Additionally, RatPyV2 infection studies may provide an important system to investigate the pathogenesis of WU polyomavirus diseases of man.
Stark, Lucy; Giersch, Tina; Wünschiers, Röbbe
2014-10-01
Understanding the microbial population in anaerobic digestion is an essential task to increase efficient substrate use and process stability. The metabolic state, represented e.g. by the transcriptome, of a fermenting system can help to find markers for monitoring industrial biogas production to prevent failures or to model the whole process. Advances in next-generation sequencing make transcriptomes accessible for large-scale analyses. In order to analyze the metatranscriptome of a mixed-species sample, isolation of high-quality RNA is the first step. However, different extraction methods may yield different efficiencies in different species. Especially in mixed-species environmental samples, unbiased isolation of transcripts is important for meaningful conclusions. We applied five different RNA-extraction protocols to nine taxonomic diverse bacterial species. Chosen methods are based on various lysis and extraction principles. We found that the extraction efficiency of different methods depends strongly on the target organism. RNA isolation of gram-positive bacteria was characterized by low yield whilst from gram-negative species higher concentrations can be obtained. Transferring our results to mixed-species investigations, such as metatranscriptomics with biofilms or biogas plants, leads to the conclusion that particular microorganisms might be over- or underrepresented depending on the method applied. Special care must be taken when using such metatranscriptomics data for, e.g. process modeling. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Lathrop, J. W.
1985-01-01
If thin film cells are to be considered a viable option for terrestrial power generation their reliability attributes will need to be explored and confidence in their stability obtained through accelerated testing. Development of a thin film accelerated test program will be more difficult than was the case for crystalline cells because of the monolithic construction nature of the cells. Specially constructed test samples will need to be fabricated, requiring committment to the concept of accelerated testing by the manufacturers. A new test schedule appropriate to thin film cells will need to be developed which will be different from that used in connection with crystalline cells. Preliminary work has been started to seek thin film schedule variations to two of the simplest tests: unbiased temperature and unbiased temperature humidity. Still to be examined are tests which involve the passage of current during temperature and/or humidity stress, either by biasing in the forward (or reverse) directions or by the application of light during stress. Investigation of these current (voltage) accelerated tests will involve development of methods of reliably contacting the thin conductive films during stress.
Estimating unbiased magnitudes for the announced DPRK nuclear tests, 2006-2016
NASA Astrophysics Data System (ADS)
Peacock, Sheila; Bowers, David
2017-04-01
The seismic disturbances generated from the five (2006-2016) announced nuclear test explosions by the Democratic People's Republic of Korea (DPRK) are of moderate magnitude (body-wave magnitude mb 4-5) by global earthquake standards. An upward bias of network mean mb of low- to moderate-magnitude events is long established, and is caused by the censoring of readings from stations where the signal was below noise level at the time of the predicted arrival. This sampling bias can be overcome by maximum-likelihood methods using station thresholds at detecting (and non-detecting) stations. Bias in the mean mb can also be introduced by differences in the network of stations recording each explosion - this bias can reduced by using station corrections. We apply a maximum-likelihood (JML) inversion that jointly estimates station corrections and unbiased network mb for the five DPRK explosions recorded by the CTBTO International Monitoring Network (IMS) of seismic stations. The thresholds can either be directly measured from the noise preceding the observed signal, or determined by statistical analysis of bulletin amplitudes. The network mb of the first and smallest explosion is reduced significantly relative to the mean mb (to < 4.0 mb) by removal of the censoring bias.
NASA Astrophysics Data System (ADS)
Coatman, Liam; Hewett, Paul C.; Banerji, Manda; Richards, Gordon T.; Hennawi, Joseph F.; Prochaska, Jason X.
2017-01-01
Accurate black-hole (BH) mass estimates for high-redshift (z>2) quasars are essential for better understanding the relationship between super-massive BH accretion and star formation. Progress is currently limited by the large systematic errors in virial BH-masses derived from the CIV broad emission line, which is often significantly blueshifted relative to systemic, most likely due to outflowing gas in the quasar broad-line region. We have assembled Balmer-line based BH masses for a large sample of 230 high-luminosity (1045.5-1048 ergs-1), redshift 1.5
Numerical studies of various Néel-VBS transitions in SU(N) anti-ferromagnets
NASA Astrophysics Data System (ADS)
Kaul, Ribhu K.; Block, Matthew S.
2015-09-01
In this manuscript we review recent developments in the numerical simulations of bipartite SU(N) spin models by quantum Monte Carlo (QMC) methods. We provide an account of a large family of newly discovered sign-problem free spin models which can be simulated in their ground states on large lattices, containing O(105) spins, using the stochastic series expansion method with efficient loop algorithms. One of the most important applications so far of these Hamiltonians are to unbiased studies of quantum criticality between Neel and valence bond phases in two dimensions - a summary of this body of work is provided. The article concludes with an overview of the current status of and outlook for future studies of the “designer” Hamiltonians.
Application of Response Surface Methods To Determine Conditions for Optimal Genomic Prediction
Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.
2017-01-01
An epistatic genetic architecture can have a significant impact on prediction accuracies of genomic prediction (GP) methods. Machine learning methods predict traits comprised of epistatic genetic architectures more accurately than statistical methods based on additive mixed linear models. The differences between these types of GP methods suggest a diagnostic for revealing genetic architectures underlying traits of interest. In addition to genetic architecture, the performance of GP methods may be influenced by the sample size of the training population, the number of QTL, and the proportion of phenotypic variability due to genotypic variability (heritability). Possible values for these factors and the number of combinations of the factor levels that influence the performance of GP methods can be large. Thus, efficient methods for identifying combinations of factor levels that produce most accurate GPs is needed. Herein, we employ response surface methods (RSMs) to find the experimental conditions that produce the most accurate GPs. We illustrate RSM with an example of simulated doubled haploid populations and identify the combination of factors that maximize the difference between prediction accuracies of best linear unbiased prediction (BLUP) and support vector machine (SVM) GP methods. The greatest impact on the response is due to the genetic architecture of the population, heritability of the trait, and the sample size. When epistasis is responsible for all of the genotypic variance and heritability is equal to one and the sample size of the training population is large, the advantage of using the SVM method vs. the BLUP method is greatest. However, except for values close to the maximum, most of the response surface shows little difference between the methods. We also determined that the conditions resulting in the greatest prediction accuracy for BLUP occurred when genetic architecture consists solely of additive effects, and heritability is equal to one. PMID:28720710
Perturbation analysis of queueing systems with a time-varying arrival rate
NASA Technical Reports Server (NTRS)
Cassandras, Christos G.; Pan, Jie
1991-01-01
The authors consider an M/G/1 queuing with a time-varying arrival rate. The objective is to obtain infinitesimal perturbation analysis (IPA) gradient estimates for various performance measures of interest with respect to certain system parameters. In particular, the authors consider the mean system time over n arrivals and an arrival rate alternating between two values. By choosing a convenient sample path representation of this system, they derive an unbiased IPA gradient estimator which, however, is not consistent, and investigate the nature of this problem.
Age Distribution of Lunar Impact-Melt Rocks in Apollo Drive-Tube 68001/2
NASA Technical Reports Server (NTRS)
Curran, N. M.; Bower, D. M.; Frasl, B.; Cohen, B. A.
2018-01-01
Apollo 16 double-drive tube 68001 /68002 provides impact and volcanic materials along a depth of approximately 60 cm in five compositional distinct units. 68001 /2 offers the potential to study distinct populations of impact melts with depth to understand how 'gardening' affects these samples. We will use unbiased major-element chemistry, mineralogy, and age to understand the impact history of Apollo 16 landing site. The study demonstrates the techniques that landed missions require to identify lithologies of interest (e.g., impact melts).
Che, James; Yu, Victor; Dhar, Manjima; Renier, Corinne; Matsumoto, Melissa; Heirich, Kyra; Garon, Edward B; Goldman, Jonathan; Rao, Jianyu; Sledge, George W; Pegram, Mark D; Sheth, Shruti; Jeffrey, Stefanie S; Kulkarni, Rajan P; Sollier, Elodie; Di Carlo, Dino
2016-03-15
Circulating tumor cells (CTCs) are emerging as rare but clinically significant non-invasive cellular biomarkers for cancer patient prognosis, treatment selection, and treatment monitoring. Current CTC isolation approaches, such as immunoaffinity, filtration, or size-based techniques, are often limited by throughput, purity, large output volumes, or inability to obtain viable cells for downstream analysis. For all technologies, traditional immunofluorescent staining alone has been employed to distinguish and confirm the presence of isolated CTCs among contaminating blood cells, although cells isolated by size may express vastly different phenotypes. Consequently, CTC definitions have been non-trivial, researcher-dependent, and evolving. Here we describe a complete set of objective criteria, leveraging well-established cytomorphological features of malignancy, by which we identify large CTCs. We apply the criteria to CTCs enriched from stage IV lung and breast cancer patient blood samples using the High Throughput Vortex Chip (Vortex HT), an improved microfluidic technology for the label-free, size-based enrichment and concentration of rare cells. We achieve improved capture efficiency (up to 83%), high speed of processing (8 mL/min of 10x diluted blood, or 800 μL/min of whole blood), and high purity (avg. background of 28.8±23.6 white blood cells per mL of whole blood). We show markedly improved performance of CTC capture (84% positive test rate) in comparison to previous Vortex designs and the current FDA-approved gold standard CellSearch assay. The results demonstrate the ability to quickly collect viable and pure populations of abnormal large circulating cells unbiased by molecular characteristics, which helps uncover further heterogeneity in these cells.
Che, James; Yu, Victor; Dhar, Manjima; Renier, Corinne; Matsumoto, Melissa; Heirich, Kyra; Garon, Edward B.; Goldman, Jonathan; Rao, Jianyu; Sledge, George W.; Pegram, Mark D.; Sheth, Shruti; Jeffrey, Stefanie S.; Kulkarni, Rajan P.; Sollier, Elodie; Di Carlo, Dino
2016-01-01
Circulating tumor cells (CTCs) are emerging as rare but clinically significant non-invasive cellular biomarkers for cancer patient prognosis, treatment selection, and treatment monitoring. Current CTC isolation approaches, such as immunoaffinity, filtration, or size-based techniques, are often limited by throughput, purity, large output volumes, or inability to obtain viable cells for downstream analysis. For all technologies, traditional immunofluorescent staining alone has been employed to distinguish and confirm the presence of isolated CTCs among contaminating blood cells, although cells isolated by size may express vastly different phenotypes. Consequently, CTC definitions have been non-trivial, researcher-dependent, and evolving. Here we describe a complete set of objective criteria, leveraging well-established cytomorphological features of malignancy, by which we identify large CTCs. We apply the criteria to CTCs enriched from stage IV lung and breast cancer patient blood samples using the High Throughput Vortex Chip (Vortex HT), an improved microfluidic technology for the label-free, size-based enrichment and concentration of rare cells. We achieve improved capture efficiency (up to 83%), high speed of processing (8 mL/min of 10x diluted blood, or 800 μL/min of whole blood), and high purity (avg. background of 28.8±23.6 white blood cells per mL of whole blood). We show markedly improved performance of CTC capture (84% positive test rate) in comparison to previous Vortex designs and the current FDA-approved gold standard CellSearch assay. The results demonstrate the ability to quickly collect viable and pure populations of abnormal large circulating cells unbiased by molecular characteristics, which helps uncover further heterogeneity in these cells. PMID:26863573
Five instruments for measuring tree height: an evaluation
Michael S. Williams; William A. Bechtold; V.J. LaBau
1994-01-01
Five instruments were tested for reliability in measuring tree heights under realistic conditions. Four linear models were used to determine if tree height can be measured unbiasedly over all tree sizes and if any of the instruments were more efficient in estimating tree height. The laser height finder was the only instrument to produce unbiased estimates of the true...
NASA Astrophysics Data System (ADS)
Dong, Subo; Bose, Subhash; Stritzinger, M.; Holmbo, S.; Fraser, M.; Fedorets, G.
2017-10-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ATLAS17lcs (SN 2017guv) and ASASSN-17mq (AT 2017gvo) in host galaxies 2MASX J19132225-1648031 and CGCG 225-050, respectively.
NASA Astrophysics Data System (ADS)
Pastorello, Andrea; Benetti, Stefano; Cappellaro, Enrico; Terreran, Giacomo; Tomasella, Lina; Fedorets, Grigori; NUTS Collaboration
2017-07-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17io in the galaxy CGCG 316-010, along with the re classification of ATLAS17hpt (SN 2017faf), which was previously classified as a SLSN-I (ATel #10549).
NASA Technical Reports Server (NTRS)
Tilton, J. C.; Swain, P. H. (Principal Investigator); Vardeman, S. B.
1981-01-01
A key input to a statistical classification algorithm, which exploits the tendency of certain ground cover classes to occur more frequently in some spatial context than in others, is a statistical characterization of the context: the context distribution. An unbiased estimator of the context distribution is discussed which, besides having the advantage of statistical unbiasedness, has the additional advantage over other estimation techniques of being amenable to an adaptive implementation in which the context distribution estimate varies according to local contextual information. Results from applying the unbiased estimator to the contextual classification of three real LANDSAT data sets are presented and contrasted with results from non-contextual classifications and from contextual classifications utilizing other context distribution estimation techniques.
Act on Numbers: Numerical Magnitude Influences Selection and Kinematics of Finger Movement
Rugani, Rosa; Betti, Sonia; Ceccarini, Francesco; Sartori, Luisa
2017-01-01
In the past decade hand kinematics has been reliably adopted for investigating cognitive processes and disentangling debated topics. One of the most controversial issues in numerical cognition literature regards the origin – cultural vs. genetically driven – of the mental number line (MNL), oriented from left (small numbers) to right (large numbers). To date, the majority of studies have investigated this effect by means of response times, whereas studies considering more culturally unbiased measures such as kinematic parameters are rare. Here, we present a new paradigm that combines a “free response” task with the kinematic analysis of movement. Participants were seated in front of two little soccer goals placed on a table, one on the left and one on the right side. They were presented with left- or right-directed arrows and they were instructed to kick a small ball with their right index toward the goal indicated by the arrow. In a few test trials participants were presented also with a small (2) or a large (8) number, and they were allowed to choose the kicking direction. Participants performed more left responses with the small number and more right responses with the large number. The whole kicking movement was segmented in two temporal phases in order to make a hand kinematics’ fine-grained analysis. The Kick Preparation and Kick Finalization phases were selected on the basis of peak trajectory deviation from the virtual midline between the two goals. Results show an effect of both small and large numbers on action execution timing. Participants were faster to finalize the action when responding to small numbers toward the left and to large number toward the right. Here, we provide the first experimental demonstration which highlights how numerical processing affects action execution in a new and not-overlearned context. The employment of this innovative and unbiased paradigm will permit to disentangle the role of nature and culture in shaping the direction of MNL and the role of finger in the acquisition of numerical skills. Last but not least, similar paradigms will allow to determine how cognition can influence action execution. PMID:28912743
Thompson, Craig M.; Royle, J. Andrew; Garner, James D.
2012-01-01
Wildlife management often hinges upon an accurate assessment of population density. Although undeniably useful, many of the traditional approaches to density estimation such as visual counts, livetrapping, or mark–recapture suffer from a suite of methodological and analytical weaknesses. Rare, secretive, or highly mobile species exacerbate these problems through the reality of small sample sizes and movement on and off study sites. In response to these difficulties, there is growing interest in the use of non-invasive survey techniques, which provide the opportunity to collect larger samples with minimal increases in effort, as well as the application of analytical frameworks that are not reliant on large sample size arguments. One promising survey technique, the use of scat detecting dogs, offers a greatly enhanced probability of detection while at the same time generating new difficulties with respect to non-standard survey routes, variable search intensity, and the lack of a fixed survey point for characterizing non-detection. In order to account for these issues, we modified an existing spatially explicit, capture–recapture model for camera trap data to account for variable search intensity and the lack of fixed, georeferenced trap locations. We applied this modified model to a fisher (Martes pennanti) dataset from the Sierra National Forest, California, and compared the results (12.3 fishers/100 km2) to more traditional density estimates. We then evaluated model performance using simulations at 3 levels of population density. Simulation results indicated that estimates based on the posterior mode were relatively unbiased. We believe that this approach provides a flexible analytical framework for reconciling the inconsistencies between detector dog survey data and density estimation procedures.
NASA Astrophysics Data System (ADS)
Evans, Nancy R.; Bond, H. E.; Schaefer, G.; Mason, B. D.; Karovska, M.; Tingle, E.
2013-01-01
Cepheids (5 Msun stars) provide an excellent sample for determining the binary properties of fairly massive stars. International Ultraviolet Explorer (IUE) observations of Cepheids brighter than 8th magnitude resulted in a list of ALL companions more massive than 2.0 Msun uniformly sensitive to all separations. Hubble Space Telescope Wide Field Camera 3 (WFC3) has resolved three of these binaries (Eta Aql, S Nor, and V659 Cen). Combining these separations with orbital data in the literature, we derive an unbiased distribution of binary separations for a sample of 18 Cepheids, and also a distribution of mass ratios. The distribution of orbital periods shows that the 5 Msun binaries prefer shorter periods than 1 Msun stars, reflecting differences in star formation processes.
Confidence crisis of results in biomechanics research.
Knudson, Duane
2017-11-01
Many biomechanics studies have small sample sizes and incorrect statistical analyses, so reporting of inaccurate inferences and inflated magnitude of effects are common in the field. This review examines these issues in biomechanics research and summarises potential solutions from research in other fields to increase the confidence in the experimental effects reported in biomechanics. Authors, reviewers and editors of biomechanics research reports are encouraged to improve sample sizes and the resulting statistical power, improve reporting transparency, improve the rigour of statistical analyses used, and increase the acceptance of replication studies to improve the validity of inferences from data in biomechanics research. The application of sports biomechanics research results would also improve if a larger percentage of unbiased effects and their uncertainty were reported in the literature.
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statistician emphasized on the fitting of finite mixture model by using maximum likelihood estimation as it provides asymptotic properties. In addition, it shows consistency properties as the sample sizes increases to infinity. This illustrated that maximum likelihood estimation is an unbiased estimator. Moreover, the estimate parameters obtained from the application of maximum likelihood estimation have smallest variance as compared to others statistical method as the sample sizes increases. Thus, maximum likelihood estimation is adopted in this paper to fit the two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, Philippines and Indonesia. Results described that there is a negative effect among rubber price and exchange rate for all selected countries.
tICA-Metadynamics: Accelerating Metadynamics by Using Kinetically Selected Collective Variables.
M Sultan, Mohammad; Pande, Vijay S
2017-06-13
Metadynamics is a powerful enhanced molecular dynamics sampling method that accelerates simulations by adding history-dependent multidimensional Gaussians along selective collective variables (CVs). In practice, choosing a small number of slow CVs remains challenging due to the inherent high dimensionality of biophysical systems. Here we show that time-structure based independent component analysis (tICA), a recent advance in Markov state model literature, can be used to identify a set of variationally optimal slow coordinates for use as CVs for Metadynamics. We show that linear and nonlinear tICA-Metadynamics can complement existing MD studies by explicitly sampling the system's slowest modes and can even drive transitions along the slowest modes even when no such transitions are observed in unbiased simulations.
Maximal Unbiased Benchmarking Data Sets for Human Chemokine Receptors and Comparative Analysis.
Xia, Jie; Reid, Terry-Elinor; Wu, Song; Zhang, Liangren; Wang, Xiang Simon
2018-05-29
Chemokine receptors (CRs) have long been druggable targets for the treatment of inflammatory diseases and HIV-1 infection. As a powerful technique, virtual screening (VS) has been widely applied to identifying small molecule leads for modern drug targets including CRs. For rational selection of a wide variety of VS approaches, ligand enrichment assessment based on a benchmarking data set has become an indispensable practice. However, the lack of versatile benchmarking sets for the whole CRs family that are able to unbiasedly evaluate every single approach including both structure- and ligand-based VS somewhat hinders modern drug discovery efforts. To address this issue, we constructed Maximal Unbiased Benchmarking Data sets for human Chemokine Receptors (MUBD-hCRs) using our recently developed tools of MUBD-DecoyMaker. The MUBD-hCRs encompasses 13 subtypes out of 20 chemokine receptors, composed of 404 ligands and 15756 decoys so far and is readily expandable in the future. It had been thoroughly validated that MUBD-hCRs ligands are chemically diverse while its decoys are maximal unbiased in terms of "artificial enrichment", "analogue bias". In addition, we studied the performance of MUBD-hCRs, in particular CXCR4 and CCR5 data sets, in ligand enrichment assessments of both structure- and ligand-based VS approaches in comparison with other benchmarking data sets available in the public domain and demonstrated that MUBD-hCRs is very capable of designating the optimal VS approach. MUBD-hCRs is a unique and maximal unbiased benchmarking set that covers major CRs subtypes so far.
Retransformation bias in a stem profile model
Raymond L. Czaplewski; David Bruce
1990-01-01
An unbiased profile model, fit to diameter divided by diameter at breast height, overestimated volume of 5.3-m log sections by 0.5 to 3.5%. Another unbiased profile model, fit to squared diameter divided by squared diameter at breast height, underestimated bole diameters by 0.2 to 2.1%. These biases are caused by retransformation of the predicted dependent variable;...
Unbiased nonorthogonal bases for tomographic reconstruction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sainz, Isabel; Klimov, Andrei B.; Roa, Luis
2010-05-15
We have developed a general method for constructing a set of nonorthogonal bases with equal separations between all different basis states in prime dimensions. The results are that the corresponding biorthogonal counterparts are pairwise unbiased with the components of the original bases. Using these bases, we derive an explicit expression for the optimal tomography in nonorthogonal bases. A special two-dimensional case is analyzed separately.
The dependability of medical students' performance ratings as documented on in-training evaluations.
van Barneveld, Christina
2005-03-01
To demonstrate an approach to obtain an unbiased estimate of the dependability of students' performance ratings during training, when the data-collection design includes nesting of student in rater, unbalanced nest sizes, and dependent observations. In 2003, two variance components analyses of in-training evaluation (ITE) report data were conducted using urGENOVA software. In the first analysis, the dependability for the nested and unbalanced data-collection design was calculated. In the second analysis, an approach using multiple generalizability studies was used to obtain an unbiased estimate of the student variance component, resulting in an unbiased estimate of dependability. Results suggested that there is bias in estimates of the dependability of students' performance on ITEs that are attributable to the data-collection design. When the bias was corrected, the results indicated that the dependability of ratings of student performance was almost zero. The combination of the multiple generalizability studies method and the use of specialized software provides an unbiased estimate of the dependability of ratings of student performance on ITE scores for data-collection designs that include nesting of student in rater, unbalanced nest sizes, and dependent observations.
X-ray Properties of an Unbiased Hard X-ray Detected Sample of AGN
NASA Technical Reports Server (NTRS)
Winter, Lisa M.; Mushotzky, Richard F.; Tueller, Jack; Markwardt, Craig
2007-01-01
The SWIFT gamma ray observatory's Burst Alert Telescope (BAT) has detected a sample of active galactic nuclei (AGN) based solely on their hard X-ray flux (14-195keV). In this paper, we present for the first time XMM-Newton X-ray spectra for 22 BAT AGXs with no previously analyzed X-ray spectra. If our sources are a representative sample of the BAT AGN, as we claim, our results present for the first time global X-ray properties of an unbiased towards absorption (n(sub H) < 3 x 10(exp 25)/sq cm), local (< z >= 0.03), AGN sample. We find 9/22 low absorption (n(sub H) < 10(exp 23)/sq cm), simple power law model sources, where 4 of these sources have a statistically significant soft component. Among these sources, we find the presence of a warm absorber statistically significant for only one Seyfert 1 source, contrasting with the ASCA results of Reynolds (1997) and George et al. (1998), who find signatures of warm absorption in half or more of their Seyfert 1 samples at similar redshifts. Additionally, the remaining sources (13122) have more complex spectra, well-fit by an absorbed power law at E > 2.0 keV. Five of the complex sources (NGC 612, ESO 362-G018, MRK 417, ESO 506-G027, and NGC 6860) are classified as Compton-thick candidates. Further, we find four more sources (SWIFT J0641.3+3257, SWIFT J0911.2+4533, SWIFT J1200.8+0650, and NGC 4992) with properties consistent with the hidden/buried AGN reported by Ueda et al. (2007). Finally, we include a comparison of the XMM EPIC spectra with available SWIFT X-ray Telescope (XRT) observations. From these comparisons, we find 6/16 sources with varying column densities, 6/16 sources with varying power law indices, and 13/16 sources with varying fluxes, over periods of hours to months. Flux and power law index are correlated for objects where both parameters vary.
Herculano-Houzel, Suzana; von Bartheld, Christopher S; Miller, Daniel J; Kaas, Jon H
2015-04-01
The number of cells comprising biological structures represents fundamental information in basic anatomy, development, aging, drug tests, pathology and genetic manipulations. Obtaining unbiased estimates of cell numbers, however, was until recently possible only through stereological techniques, which require specific training, equipment, histological processing and appropriate sampling strategies applied to structures with a homogeneous distribution of cell bodies. An alternative, the isotropic fractionator (IF), became available in 2005 as a fast and inexpensive method that requires little training, no specific software and only a few materials before it can be used to quantify total numbers of neuronal and non-neuronal cells in a whole organ such as the brain or any dissectible regions thereof. This method entails transforming a highly anisotropic tissue into a homogeneous suspension of free-floating nuclei that can then be counted under the microscope or by flow cytometry and identified morphologically and immunocytochemically as neuronal or non-neuronal. We compare the advantages and disadvantages of each method and provide researchers with guidelines for choosing the best method for their particular needs. IF is as accurate as unbiased stereology and faster than stereological techniques, as it requires no elaborate histological processing or sampling paradigms, providing reliable estimates in a few days rather than many weeks. Tissue shrinkage is also not an issue, since the estimates provided are independent of tissue volume. The main disadvantage of IF, however, is that it necessarily destroys the tissue analyzed and thus provides no spatial information on the cellular composition of biological regions of interest.
Horvitz-Thompson survey sample methods for estimating large-scale animal abundance
Samuel, M.D.; Garton, E.O.
1994-01-01
Large-scale surveys to estimate animal abundance can be useful for monitoring population status and trends, for measuring responses to management or environmental alterations, and for testing ecological hypotheses about abundance. However, large-scale surveys may be expensive and logistically complex. To ensure resources are not wasted on unattainable targets, the goals and uses of each survey should be specified carefully and alternative methods for addressing these objectives always should be considered. During survey design, the impoflance of each survey error component (spatial design, propofiion of detected animals, precision in detection) should be considered carefully to produce a complete statistically based survey. Failure to address these three survey components may produce population estimates that are inaccurate (biased low), have unrealistic precision (too precise) and do not satisfactorily meet the survey objectives. Optimum survey design requires trade-offs in these sources of error relative to the costs of sampling plots and detecting animals on plots, considerations that are specific to the spatial logistics and survey methods. The Horvitz-Thompson estimators provide a comprehensive framework for considering all three survey components during the design and analysis of large-scale wildlife surveys. Problems of spatial and temporal (especially survey to survey) heterogeneity in detection probabilities have received little consideration, but failure to account for heterogeneity produces biased population estimates. The goal of producing unbiased population estimates is in conflict with the increased variation from heterogeneous detection in the population estimate. One solution to this conflict is to use an MSE-based approach to achieve a balance between bias reduction and increased variation. Further research is needed to develop methods that address spatial heterogeneity in detection, evaluate the effects of temporal heterogeneity on survey objectives and optimize decisions related to survey bias and variance. Finally, managers and researchers involved in the survey design process must realize that obtaining the best survey results requires an interactive and recursive process of survey design, execution, analysis and redesign. Survey refinements will be possible as further knowledge is gained on the actual abundance and distribution of the population and on the most efficient techniques for detection animals.
An XRPD and EPR spectroscopy study of microcrystalline calcite bioprecipitated by Bacillus subtilis
NASA Astrophysics Data System (ADS)
Perito, B.; Romanelli, M.; Buccianti, A.; Passaponti, M.; Montegrossi, G.; Di Benedetto, F.
2018-05-01
We report in this study the first XRPD and EPR spectroscopy characterisation of a biogenic calcite, obtained from the activity of the bacterium Bacillus subtilis. Microcrystalline calcite powders obtained from bacterial culture in a suitable precipitation liquid medium were analysed without further manipulation. Both techniques reveal unusual parameters, closely related to the biological source of the mineral, i.e., to the bioprecipitation process and in particular to the organic matrix observed inside calcite. In detail, XRPD analysis revealed that bacterial calcite has slightly higher c/a lattice parameters ratio than abiotic calcite. This correlation was already noticed in microcrystalline calcite samples grown by bio-mineralisation processes, but it had never been previously verified for bacterial biocalcites. EPR spectroscopy evidenced an anomalously large value of W 6, a parameter that can be linked to occupation by different chemical species in the next nearest neighbouring sites. This parameter allows to clearly distinguish bacterial and abiotic calcite. This latter achievement was obtained after having reduced the parameters space into an unbiased Euclidean one, through an isometric log-ratio transformation. We conclude that this approach enables the coupled use of XRPD and EPR for identifying the traces of bacterial activity in fossil carbonate deposits.
Chromospherically Active Stars in the RAVE Survey
NASA Astrophysics Data System (ADS)
Žerjal, M.; Zwitter, T.; Matijevič, G.; Strassmeier, K. G.
2014-01-01
We present a qualitative characterization of activity levels of a large database of ~44,000 candidate RAVE stars (unbiased, magnitude limited medium resolution survey) that show chromospheric emission in the Ca II infrared triplet and this vastly enlarges previously known samples. Our main motivation to study these stars is the anti-correlation of chromospheric activity and stellar ages that could be calibrated using stellar clusters with known ages. Locally linear embedding used for a morphological classification of spectra revealed 53,347 cases with a suggested emission component in the calcium lines. We analyzed a subsample of ~44,000 stars with S/N>20 using a spectral subtraction technique where observed reference spectra of inactive stars were used as templates instead of synthetic ones. Both the equivalent width of the excess emission for each calcium line and their sum is derived for all candidate active stars with no respect to the origin of their emission flux. ~17,800 spectra show a detectable chromospheric flux with at least 2 σ confidence level. The overall distribution of activity levels shows a bimodal shape, with the first peak coinciding with inactive stars and the second with the pre-main-sequence cases.
Metabolic Differentiation of Early Lyme Disease from Southern Tick-Associated Rash Illness (STARI)
Molins, C. R.; Ashton, L. V.; Wormser, G. P.; Andre, B. G.; Hess, A. M.; Delorey, M. J.; Pilgard, M. A.; Johnson, B. J.; Webb, K.; Islam, M. N.; Pegalajar-Jurado, A; Molla, I.; Jewett, M. W.; Belisle, J. T.
2017-01-01
Lyme disease, the most commonly reported vector-borne disease in the United States, results from infection with Borrelia burgdorferi. Early clinical diagnosis of this disease is largely based on the presence of an erythematous skin lesion for individuals in high-risk regions. This, however, can be confused with other illnesses including southern tick-associated rash illness (STARI), an illness that lacks a defined etiological agent or laboratory diagnostic test, and is co-prevalent with Lyme disease in portions of the Eastern United States. By applying an unbiased metabolomics approach with sera retrospectively obtained from well-characterized patients we defined biochemical and diagnostic differences between early Lyme disease and STARI. Specifically, a metabolic biosignature consisting of 261 molecular features (MFs) revealed that altered N-acyl ethanolamine and primary fatty acid amide metabolism discriminated early Lyme disease from STARI. More importantly, development of classification models with the 261 MF biosignature and testing against validation samples differentiated early Lyme disease from STARI with an accuracy of 85 to 98%. These findings revealed metabolic dissimilarity between early Lyme disease and STARI, and provide a powerful and new approach to objectively distinguish early Lyme disease from an illness with nearly identical symptoms. PMID:28814545
Cosmic web type dependence of halo clustering
NASA Astrophysics Data System (ADS)
Fisher, J. D.; Faltenbacher, A.
2018-01-01
We use the Millennium Simulation to show that halo clustering varies significantly with cosmic web type. Haloes are classified as node, filament, sheet and void haloes based on the eigenvalue decomposition of the velocity shear tensor. The velocity field is sampled by the peculiar velocities of a fixed number of neighbouring haloes, and spatial derivatives are computed using a kernel borrowed from smoothed particle hydrodynamics. The classification scheme is used to examine the clustering of haloes as a function of web type for haloes with masses larger than 1011 h- 1 M⊙. We find that node haloes show positive bias, filament haloes show negligible bias and void and sheet haloes are antibiased independent of halo mass. Our findings suggest that the mass dependence of halo clustering is rooted in the composition of web types as a function of halo mass. The substantial fraction of node-type haloes for halo masses ≳ 2 × 1013 h- 1 M⊙ leads to positive bias. Filament-type haloes prevail at intermediate masses, 1012-1013 h- 1 M⊙, resulting in unbiased clustering. The large contribution of sheet-type haloes at low halo masses ≲ 1012 h- 1 M⊙ generates antibiasing.
Monsters and babies from the first/IRAS survey
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Bruegel, W J M
Radio continuum emission at cm wavelengths is relatively little affected by extinction. When combined with far-infrared (FIR) surveys this provides for a convenient and unbiased method to select (radio-loud) AGN and starbursts deeply embedded in gas and dust-rich galaxies. Such radio-selected FIR samples are useful for detailed investigations of the complex relationships between (radio) galaxy and starburst activity, and to determine whether ULIRGs are powered by hidden quasars (monsters) or young stars (babies). We present the results of a large program to obtain identifications and spectra of radio-sleected, optically faint IRAS/FSC objects using the FIRST/VLA 20 cm survey (Becker, Whitemore » and Helfand 1995). These objects are all radio-'quiet' in the sense that their radio power/FIR luminosities follow the well-known radio/FIR relationship for star forming galaxies. We compare these results to a previous study by our group of a sample of radio-'loud' IRAS/FSC ULIRGs selected from the Texas 365 MHz survey (Douglas et al. 1996). Many of these objects also show evidence for dominant, A-type stellar populations, as well as high ionization lines usually associated with AGN. These radio-loud ULIRGs have properties intermediate between those of starbursts and quasars, suggesting a possibile evolutionary connection. Deep Keck spectroscopic observations of three ULIRGs from these samples are presented, including high signal-to-noise spectropolarimetry. The polarimetry observations failed to show evidence of a hidden quasar in polarized (scattered) light in the two systems in which the stellar light was dominated by A-type stars. Although observations of a larger sample would be needed to allow a general conclusion, our current data suggest that a large fraction of ULIRGs may be powered by luminous starbursts, not by hidden, luminous AGN (quasars). While we used radio-selected FIR sources to search for evidence of a causal AGN/starburst connection, we conclude our presentation with a dramatic example of an AGN/starburst object from an entirely unrelated quasar survey selected at the opposite, blue end of the spectrum.« less
The frequency of stellar X-ray flares from a large-scale XMM-Newton sample
NASA Astrophysics Data System (ADS)
Pye, John P.; Rosen, Simon
2015-08-01
We present a uniform, large-scale survey of X-ray flare emission, with emphasis on the corrections needed to arrive at estimates of flare occurrence rates. The XMM-Newton Serendipitous Source Catalogue has been used as the basis for a survey of X-ray flares from late-type (i.e. spectral type F-M) stars in the Hipparcos Tycho catalogue. The XMM catalogue and its associated data products provide an excellent basis for a comprehensive and sensitive survey of stellar flares - both from targeted active stars and from those observed serendipitously in the half-degree diameter field-of-view of each observation. Our sample contains ~130 flares with well-observed profiles; they range in duration from ~103 to ~104s, have peak X-ray fluxes from ~10-13 to ~10-11 erg cm-2 s-1, peak X-ray luminosities from ~1029 to ~1032 erg s-1 and X-ray energy output from ~1032 to ~1035 erg. Most of the serendipitously-observed stars have little previously reported information. We present flare frequency distributions from both target and serendipitous observations. The latter provide an unbiased (with respect to stellar activity) study of flare energetics. The serendipitous sample demonstrates the need for care when calculating flaring rates, especially when normalising the number of flares to a total exposure time, where it is important to consider both the stars seen to flare and those measured as non-variable, since in our survey, the latter outnumber the former by more than a factor ten. The serendipitous variable and non-variable stars appear very similar in terms of the distributions of general properties such as quiescent X-ray luminosity; from the available data, it is unclear whether the distinction by flaring is due to an additional, intrinsic property such as intra-system interactions in a close binary system, or is simply the result of limited observations of a random flaring process, with no real difference between the two samples. We discuss future observations and analyses aimed at resolving this issue.
Apparatus bias and place conditioning with ethanol in mice.
Cunningham, Christopher L; Ferree, Nikole K; Howard, MacKenzie A
2003-12-01
Although the distinction between "biased" and "unbiased" is generally recognized as an important methodological issue in place conditioning, previous studies have not adequately addressed the distinction between a biased/unbiased apparatus and a biased/unbiased stimulus assignment procedure. Moreover, a review of the recent literature indicates that many reports (70% of 76 papers published in 2001) fail to provide adequate information about apparatus bias. This issue is important because the mechanisms underlying a drug's effect in the place-conditioning procedure may differ depending on whether the apparatus is biased or unbiased. The present studies were designed to assess the impact of apparatus bias and stimulus assignment procedure on ethanol-induced place conditioning in mice (DBA/2 J). A secondary goal was to compare various dependent variables commonly used to index conditioned place preference. Apparatus bias was manipulated by varying the combination of tactile (floor) cues available during preference tests. Experiment 1 used an unbiased apparatus in which the stimulus alternatives were equally preferred during a pre-test as indicated by the group average. Experiment 2 used a biased apparatus in which one of the stimuli was strongly preferred by most mice (mean % time on cue = 67%) during the pre-test. In both studies, the stimulus paired with drug (CS+) was assigned randomly (i.e., an "unbiased" stimulus assignment procedure). Experimental mice received four pairings of CS+ with ethanol (2 g/kg, i.p.) and four pairings of the alternative stimulus (CS-) with saline; control mice received saline on both types of trial. Each experiment concluded with a 60-min choice test. With the unbiased apparatus (experiment 1), significant place conditioning was obtained regardless of whether drug was paired with the subject's initially preferred or non-preferred stimulus. However, with the biased apparatus (experiment 2), place conditioning was apparent only when ethanol was paired with the initially non-preferred cue, and not when it was paired with the initially preferred cue. These conclusions held regardless of which dependent variable was used to index place conditioning, but only if the counterbalancing factor was included in statistical analyses. These studies indicate that apparatus bias plays a major role in determining whether biased assignment of an ethanol-paired stimulus affects ability to demonstrate conditioned place preference. Ethanol's ability to produce conditioned place preference in an unbiased apparatus, regardless of the direction of the initial cue bias, supports previous studies that interpret such findings as evidence of a primary rewarding drug effect. Moreover, these studies suggest that the asymmetrical outcome observed in the biased apparatus is most likely due to a measurement problem (e.g., ceiling effect) rather than to an interaction between the drug's effect and an unconditioned motivational response (e.g., "anxiety") to the initially non-preferred stimulus. More generally, these findings illustrate the importance of providing clear information on apparatus bias in all place-conditioning studies.
A sampling plan for riparian birds of the Lower Colorado River-Final Report
Bart, Jonathan; Dunn, Leah; Leist, Amy
2010-01-01
A sampling plan was designed for the Bureau of Reclamation for selected riparian birds occurring along the Colorado River from Lake Mead to the southerly International Boundary with Mexico. The goals of the sampling plan were to estimate long-term trends in abundance and investigate habitat relationships especially in new habitat being created by the Bureau of Reclamation. The initial objective was to design a plan for the Gila Woodpecker (Melanerpes uropygialis), Arizona Bell's Vireo (Vireo bellii arizonae), Sonoran Yellow Warbler (Dendroica petechia sonorana), Summer Tanager (Piranga rubra), Gilded Flicker (Colaptes chrysoides), and Vermilion Flycatcher (Pyrocephalus rubinus); however, too little data were obtained for the last two species. Recommendations were therefore based on results for the first four species. The study area was partitioned into plots of 7 to 23 hectares. Plot borders were drawn to place the best habitat for the focal species in the smallest number of plots so that survey efforts could be concentrated on these habitats. Double sampling was used in the survey. In this design, a large sample of plots is surveyed a single time, yielding estimates of unknown accuracy, and a subsample is surveyed intensively to obtain accurate estimates. The subsample is used to estimate detection ratios, which are then applied to the results from the extensive survey to obtain unbiased estimates of density and population size. These estimates are then used to estimate long-term trends in abundance. Four sampling plans for selecting plots were evaluated based on a simulation using data from the Breeding Bird Survey. The design with the highest power involved selecting new plots every year. Power with 80 plots surveyed per year was more than 80 percent for three of the four species. Results from the surveys were used to provide recommendations to the Bureau of Reclamation for their surveys of new habitat being created in the study area.
The effects of sample size on population genomic analyses--implications for the tests of neutrality.
Subramanian, Sankar
2016-02-20
One of the fundamental measures of molecular genetic variation is the Watterson's estimator (θ), which is based on the number of segregating sites. The estimation of θ is unbiased only under neutrality and constant population growth. It is well known that the estimation of θ is biased when these assumptions are violated. However, the effects of sample size in modulating the bias was not well appreciated. We examined this issue in detail based on large-scale exome data and robust simulations. Our investigation revealed that sample size appreciably influences θ estimation and this effect was much higher for constrained genomic regions than that of neutral regions. For instance, θ estimated for synonymous sites using 512 human exomes was 1.9 times higher than that obtained using 16 exomes. However, this difference was 2.5 times for the nonsynonymous sites of the same data. We observed a positive correlation between the rate of increase in θ estimates (with respect to the sample size) and the magnitude of selection pressure. For example, θ estimated for the nonsynonymous sites of highly constrained genes (dN/dS < 0.1) using 512 exomes was 3.6 times higher than that estimated using 16 exomes. In contrast this difference was only 2 times for the less constrained genes (dN/dS > 0.9). The results of this study reveal the extent of underestimation owing to small sample sizes and thus emphasize the importance of sample size in estimating a number of population genomic parameters. Our results have serious implications for neutrality tests such as Tajima D, Fu-Li D and those based on the McDonald and Kreitman test: Neutrality Index and the fraction of adaptive substitutions. For instance, use of 16 exomes produced 2.4 times higher proportion of adaptive substitutions compared to that obtained using 512 exomes (24% vs 10 %).
Logistic regression trees for initial selection of interesting loci in case-control studies
Nickolov, Radoslav Z; Milanov, Valentin B
2007-01-01
Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paterek, Tomasz; Dakic, Borivoje; Brukner, Caslav
In this Reply to the preceding Comment by Hall and Rao [Phys. Rev. A 83, 036101 (2011)], we motivate terminology of our original paper and point out that further research is needed in order to (dis)prove the claimed link between every orthogonal Latin square of order being a power of a prime and a mutually unbiased basis.
Allele frequencies for 12 autosomal short tandem repeat loci in two Bolivian populations.
Cifuentes, L; Jorquera, H; Acuña, M; Ordóñez, J; Sierra, A L
2008-03-18
Two hundred and sixty unrelated subjects who asked for paternity testing at two Bolivian Laboratories in La Paz and Santa Cruz were studied. The loci D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D7S820, TH01, TPOX, and CSF1PO were typed from blood samples, amplifying DNA by polymerase chain reactions and electrophoresis. Allele frequencies were estimated by simple counting and the unbiased heterozygosity was calculated. Hardy-Weinberg equilibrium was studied and gene frequencies were compared between the two samples. All loci conformed to the Hardy-Weinberg law and allele frequencies were similar in samples from the two cities. The Bolivian gene frequencies estimated were significantly different from those described for Chile and the United States Hispanic-Americans for most of the loci.
McClure, Foster D; Lee, Jung K
2012-01-01
The validation process for an analytical method usually employs an interlaboratory study conducted as a balanced completely randomized model involving a specified number of randomly chosen laboratories, each analyzing a specified number of randomly allocated replicates. For such studies, formulas to obtain approximate unbiased estimates of the variance and uncertainty of the sample laboratory-to-laboratory (lab-to-lab) STD (S(L)) have been developed primarily to account for the uncertainty of S(L) when there is a need to develop an uncertainty budget that includes the uncertainty of S(L). For the sake of completeness on this topic, formulas to estimate the variance and uncertainty of the sample lab-to-lab variance (S(L)2) were also developed. In some cases, it was necessary to derive the formulas based on an approximate distribution for S(L)2.
Nonlinear vs. linear biasing in Trp-cage folding simulations
NASA Astrophysics Data System (ADS)
Spiwok, Vojtěch; Oborský, Pavel; Pazúriková, Jana; Křenek, Aleš; Králová, Blanka
2015-03-01
Biased simulations have great potential for the study of slow processes, including protein folding. Atomic motions in molecules are nonlinear, which suggests that simulations with enhanced sampling of collective motions traced by nonlinear dimensionality reduction methods may perform better than linear ones. In this study, we compare an unbiased folding simulation of the Trp-cage miniprotein with metadynamics simulations using both linear (principle component analysis) and nonlinear (Isomap) low dimensional embeddings as collective variables. Folding of the mini-protein was successfully simulated in 200 ns simulation with linear biasing and non-linear motion biasing. The folded state was correctly predicted as the free energy minimum in both simulations. We found that the advantage of linear motion biasing is that it can sample a larger conformational space, whereas the advantage of nonlinear motion biasing lies in slightly better resolution of the resulting free energy surface. In terms of sampling efficiency, both methods are comparable.
Nonlinear vs. linear biasing in Trp-cage folding simulations.
Spiwok, Vojtěch; Oborský, Pavel; Pazúriková, Jana; Křenek, Aleš; Králová, Blanka
2015-03-21
Biased simulations have great potential for the study of slow processes, including protein folding. Atomic motions in molecules are nonlinear, which suggests that simulations with enhanced sampling of collective motions traced by nonlinear dimensionality reduction methods may perform better than linear ones. In this study, we compare an unbiased folding simulation of the Trp-cage miniprotein with metadynamics simulations using both linear (principle component analysis) and nonlinear (Isomap) low dimensional embeddings as collective variables. Folding of the mini-protein was successfully simulated in 200 ns simulation with linear biasing and non-linear motion biasing. The folded state was correctly predicted as the free energy minimum in both simulations. We found that the advantage of linear motion biasing is that it can sample a larger conformational space, whereas the advantage of nonlinear motion biasing lies in slightly better resolution of the resulting free energy surface. In terms of sampling efficiency, both methods are comparable.
Trasande, Leonardo; Vandenberg, Laura N; Bourguignon, Jean-Pierre; Myers, John Peterson; Slama, Remy; Vom Saal, Frederick; Zoeller, Robert Thomas
2016-11-01
Evidence increasingly confirms that synthetic chemicals disrupt the endocrine system and contribute to disease and disability across the lifespan. Despite a United Nations Environment Programme/WHO report affirmed by over 100 countries at the Fourth International Conference on Chemicals Management, 'manufactured doubt' continues to be cast as a cloud over rigorous, peer-reviewed and independently funded scientific data. This study describes the sources of doubt and their social costs, and suggested courses of action by policymakers to prevent disease and disability. The problem is largely based on the available data, which are all too limited. Rigorous testing programmes should not simply focus on oestrogen, androgen and thyroid. Tests should have proper statistical power. 'Good laboratory practice' (GLP) hardly represents a proper or even gold standard for laboratory studies of endocrine disruption. Studies should be evaluated with regard to the contamination of negative controls, responsiveness to positive controls and dissection techniques. Flaws in many GLP studies have been identified, yet regulatory agencies rely on these flawed studies. Peer-reviewed and unbiased research, rather than 'sound science', should be used to evaluate endocrine-disrupting chemicals. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Inferring animal densities from tracking data using Markov chains.
Whitehead, Hal; Jonsen, Ian D
2013-01-01
The distributions and relative densities of species are keys to ecology. Large amounts of tracking data are being collected on a wide variety of animal species using several methods, especially electronic tags that record location. These tracking data are effectively used for many purposes, but generally provide biased measures of distribution, because the starts of the tracks are not randomly distributed among the locations used by the animals. We introduce a simple Markov-chain method that produces unbiased measures of relative density from tracking data. The density estimates can be over a geographical grid, and/or relative to environmental measures. The method assumes that the tracked animals are a random subset of the population in respect to how they move through the habitat cells, and that the movements of the animals among the habitat cells form a time-homogenous Markov chain. We illustrate the method using simulated data as well as real data on the movements of sperm whales. The simulations illustrate the bias introduced when the initial tracking locations are not randomly distributed, as well as the lack of bias when the Markov method is used. We believe that this method will be important in giving unbiased estimates of density from the growing corpus of animal tracking data.
Arnason, T; Albertsdóttir, E; Fikse, W F; Eriksson, S; Sigurdsson, A
2012-02-01
The consequences of assuming a zero environmental covariance between a binary trait 'test-status' and a continuous trait on the estimates of genetic parameters by restricted maximum likelihood and Gibbs sampling and on response from genetic selection when the true environmental covariance deviates from zero were studied. Data were simulated for two traits (one that culling was based on and a continuous trait) using the following true parameters, on the underlying scale: h² = 0.4; r(A) = 0.5; r(E) = 0.5, 0.0 or -0.5. The selection on the continuous trait was applied to five subsequent generations where 25 sires and 500 dams produced 1500 offspring per generation. Mass selection was applied in the analysis of the effect on estimation of genetic parameters. Estimated breeding values were used in the study of the effect of genetic selection on response and accuracy. The culling frequency was either 0.5 or 0.8 within each generation. Each of 10 replicates included 7500 records on 'test-status' and 9600 animals in the pedigree file. Results from bivariate analysis showed unbiased estimates of variance components and genetic parameters when true r(E) = 0.0. For r(E) = 0.5, variance components (13-19% bias) and especially (50-80%) were underestimated for the continuous trait, while heritability estimates were unbiased. For r(E) = -0.5, heritability estimates of test-status were unbiased, while genetic variance and heritability of the continuous trait together with were overestimated (25-50%). The bias was larger for the higher culling frequency. Culling always reduced genetic progress from selection, but the genetic progress was found to be robust to the use of wrong parameter values of the true environmental correlation between test-status and the continuous trait. Use of a bivariate linear-linear model reduced bias in genetic evaluations, when data were subject to culling. © 2011 Blackwell Verlag GmbH.
Generation and evaluation of an ultra-high-field atlas with applications in DBS planning
NASA Astrophysics Data System (ADS)
Wang, Brian T.; Poirier, Stefan; Guo, Ting; Parrent, Andrew G.; Peters, Terry M.; Khan, Ali R.
2016-03-01
Purpose Deep brain stimulation (DBS) is a common treatment for Parkinson's disease (PD) and involves the use of brain atlases or intrinsic landmarks to estimate the location of target deep brain structures, such as the subthalamic nucleus (STN) and the globus pallidus pars interna (GPi). However, these structures can be difficult to localize with conventional clinical magnetic resonance imaging (MRI), and thus targeting can be prone to error. Ultra-high-field imaging at 7T has the ability to clearly resolve these structures and thus atlases built with these data have the potential to improve targeting accuracy. Methods T1 and T2-weighted images of 12 healthy control subjects were acquired using a 7T MR scanner. These images were then used with groupwise registration to generate an unbiased average template with T1w and T2w contrast. Deep brain structures were manually labelled in each subject by two raters and rater reliability was assessed. We compared the use of this unbiased atlas with two other methods of atlas-based segmentation (single-template and multi-template) for subthalamic nucleus (STN) segmentation on 7T MRI data. We also applied this atlas to clinical DBS data acquired at 1.5T to evaluate its efficacy for DBS target localization as compared to using a standard atlas. Results The unbiased templates provide superb detail of subcortical structures. Through one-way ANOVA tests, the unbiased template is significantly (p <0.05) more accurate than a single-template in atlas-based segmentation and DBS target localization tasks. Conclusion The generated unbiased averaged templates provide better visualization of deep brain nuclei and an increase in accuracy over single-template and lower field strength atlases.
Extending unbiased stereology of brain ultrastructure to three-dimensional volumes
NASA Technical Reports Server (NTRS)
Fiala, J. C.; Harris, K. M.; Koslow, S. H. (Principal Investigator)
2001-01-01
OBJECTIVE: Analysis of brain ultrastructure is needed to reveal how neurons communicate with one another via synapses and how disease processes alter this communication. In the past, such analyses have usually been based on single or paired sections obtained by electron microscopy. Reconstruction from multiple serial sections provides a much needed, richer representation of the three-dimensional organization of the brain. This paper introduces a new reconstruction system and new methods for analyzing in three dimensions the location and ultrastructure of neuronal components, such as synapses, which are distributed non-randomly throughout the brain. DESIGN AND MEASUREMENTS: Volumes are reconstructed by defining transformations that align the entire area of adjacent sections. Whole-field alignment requires rotation, translation, skew, scaling, and second-order nonlinear deformations. Such transformations are implemented by a linear combination of bivariate polynomials. Computer software for generating transformations based on user input is described. Stereological techniques for assessing structural distributions in reconstructed volumes are the unbiased bricking, disector, unbiased ratio, and per-length counting techniques. A new general method, the fractional counter, is also described. This unbiased technique relies on the counting of fractions of objects contained in a test volume. A volume of brain tissue from stratum radiatum of hippocampal area CA1 is reconstructed and analyzed for synaptic density to demonstrate and compare the techniques. RESULTS AND CONCLUSIONS: Reconstruction makes practicable volume-oriented analysis of ultrastructure using such techniques as the unbiased bricking and fractional counter methods. These analysis methods are less sensitive to the section-to-section variations in counts and section thickness, factors that contribute to the inaccuracy of other stereological methods. In addition, volume reconstruction facilitates visualization and modeling of structures and analysis of three-dimensional relationships such as synaptic connectivity.
2015-01-01
Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the “artificial enrichment” and “analogue bias” of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD. PMID:24749745
Xia, Jie; Jin, Hongwei; Liu, Zhenming; Zhang, Liangren; Wang, Xiang Simon
2014-05-27
Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the "artificial enrichment" and "analogue bias" of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD.
Systematic effects of foreground removal in 21-cm surveys of reionization
NASA Astrophysics Data System (ADS)
Petrovic, Nada; Oh, S. Peng
2011-05-01
21-cm observations have the potential to revolutionize our understanding of the high-redshift Universe. Whilst extremely bright radio continuum foregrounds exist at these frequencies, their spectral smoothness can be exploited to allow efficient foreground subtraction. It is well known that - regardless of other instrumental effects - this removes power on scales comparable to the survey bandwidth. We investigate associated systematic biases. We show that removing line-of-sight fluctuations on large scales aliases into suppression of the 3D power spectrum across a broad range of scales. This bias can be dealt with by correctly marginalizing over small wavenumbers in the 1D power spectrum; however, the unbiased estimator will have unavoidably larger variance. We also show that Gaussian realizations of the power spectrum permit accurate and extremely rapid Monte Carlo simulations for error analysis; repeated realizations of the fully non-Gaussian field are unnecessary. We perform Monte Carlo maximum likelihood simulations of foreground removal which yield unbiased, minimum variance estimates of the power spectrum in agreement with Fisher matrix estimates. Foreground removal also distorts the 21-cm probability distribution function (PDF), reducing the contrast between neutral and ionized regions, with potentially serious consequences for efforts to extract information from the PDF. We show that it is the subtraction of large-scale modes which is responsible for this distortion, and that it is less severe in the earlier stages of reionization. It can be reduced by using larger bandwidths. In the late stages of reionization, identification of the largest ionized regions (which consist of foreground emission only) provides calibration points which potentially allow recovery of large-scale modes. Finally, we also show that (i) the broad frequency response of synchrotron and free-free emission will smear out any features in the electron momentum distribution and ensure spectrally smooth foregrounds and (ii) extragalactic radio recombination lines should be negligible foregrounds.
CHRR: coordinate hit-and-run with rounding for uniform sampling of constraint-based models.
Haraldsdóttir, Hulda S; Cousins, Ben; Thiele, Ines; Fleming, Ronan M T; Vempala, Santosh
2017-06-01
In constraint-based metabolic modelling, physical and biochemical constraints define a polyhedral convex set of feasible flux vectors. Uniform sampling of this set provides an unbiased characterization of the metabolic capabilities of a biochemical network. However, reliable uniform sampling of genome-scale biochemical networks is challenging due to their high dimensionality and inherent anisotropy. Here, we present an implementation of a new sampling algorithm, coordinate hit-and-run with rounding (CHRR). This algorithm is based on the provably efficient hit-and-run random walk and crucially uses a preprocessing step to round the anisotropic flux set. CHRR provably converges to a uniform stationary sampling distribution. We apply it to metabolic networks of increasing dimensionality. We show that it converges several times faster than a popular artificial centering hit-and-run algorithm, enabling reliable and tractable sampling of genome-scale biochemical networks. https://github.com/opencobra/cobratoolbox . ronan.mt.fleming@gmail.com or vempala@cc.gatech.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Estimation of river and stream temperature trends under haphazard sampling
Gray, Brian R.; Lyubchich, Vyacheslav; Gel, Yulia R.; Rogala, James T.; Robertson, Dale M.; Wei, Xiaoqiao
2015-01-01
Long-term temporal trends in water temperature in rivers and streams are typically estimated under the assumption of evenly-spaced space-time measurements. However, sampling times and dates associated with historical water temperature datasets and some sampling designs may be haphazard. As a result, trends in temperature may be confounded with trends in time or space of sampling which, in turn, may yield biased trend estimators and thus unreliable conclusions. We address this concern using multilevel (hierarchical) linear models, where time effects are allowed to vary randomly by day and date effects by year. We evaluate the proposed approach by Monte Carlo simulations with imbalance, sparse data and confounding by trend in time and date of sampling. Simulation results indicate unbiased trend estimators while results from a case study of temperature data from the Illinois River, USA conform to river thermal assumptions. We also propose a new nonparametric bootstrap inference on multilevel models that allows for a relatively flexible and distribution-free quantification of uncertainties. The proposed multilevel modeling approach may be elaborated to accommodate nonlinearities within days and years when sampling times or dates typically span temperature extremes.
Current and efficiency of Brownian particles under oscillating forces in entropic barriers
NASA Astrophysics Data System (ADS)
Nutku, Ferhat; Aydιner, Ekrem
2015-04-01
In this study, considering the temporarily unbiased force and different forms of oscillating forces, we investigate the current and efficiency of Brownian particles in an entropic tube structure and present the numerically obtained results. We show that different force forms give rise to different current and efficiency profiles in different optimized parameter intervals. We find that an unbiased oscillating force and an unbiased temporal force lead to the current and efficiency, which are dependent on these parameters. We also observe that the current and efficiency caused by temporal and different oscillating forces have maximum and minimum values in different parameter intervals. We conclude that the current or efficiency can be controlled dynamically by adjusting the parameters of entropic barriers and applied force. Project supported by the Funds from Istanbul University (Grant No. 45662).
Galili, Tal; Meilijson, Isaac
2016-01-02
The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.].
Quantum key distribution for composite dimensional finite systems
NASA Astrophysics Data System (ADS)
Shalaby, Mohamed; Kamal, Yasser
2017-06-01
The application of quantum mechanics contributes to the field of cryptography with very important advantage as it offers a mechanism for detecting the eavesdropper. The pioneering work of quantum key distribution uses mutually unbiased bases (MUBs) to prepare and measure qubits (or qudits). Weak mutually unbiased bases (WMUBs) have weaker properties than MUBs properties, however, unlike MUBs, a complete set of WMUBs can be constructed for systems with composite dimensions. In this paper, we study the use of weak mutually unbiased bases (WMUBs) in quantum key distribution for composite dimensional finite systems. We prove that the security analysis of using a complete set of WMUBs to prepare and measure the quantum states in the generalized BB84 protocol, gives better results than using the maximum number of MUBs that can be constructed, when they are analyzed against the intercept and resend attack.
Estimating Seven Coefficients of Pairwise Relatedness Using Population-Genomic Data
Ackerman, Matthew S.; Johri, Parul; Spitze, Ken; Xu, Sen; Doak, Thomas G.; Young, Kimberly; Lynch, Michael
2017-01-01
Population structure can be described by genotypic-correlation coefficients between groups of individuals, the most basic of which are the pairwise relatedness coefficients between any two individuals. There are nine pairwise relatedness coefficients in the most general model, and we show that these can be reduced to seven coefficients for biallelic loci. Although all nine coefficients can be estimated from pedigrees, six coefficients have been beyond empirical reach. We provide a numerical optimization procedure that estimates all seven reduced coefficients from population-genomic data. Simulations show that the procedure is nearly unbiased, even at 3× coverage, and errors in five of the seven coefficients are statistically uncorrelated. The remaining two coefficients have a negative correlation of errors, but their sum provides an unbiased assessment of the overall correlation of heterozygosity between two individuals. Application of these new methods to four populations of the freshwater crustacean Daphnia pulex reveal the occurrence of half siblings in our samples, as well as a number of identical individuals that are likely obligately asexual clone mates. Statistically significant negative estimates of these pairwise relatedness coefficients, including inbreeding coefficients that were typically negative, underscore the difficulties that arise when interpreting genotypic correlations as estimations of the probability that alleles are identical by descent. PMID:28341647
NASA Astrophysics Data System (ADS)
Aumann, Hartmut H.; Fishbein, Evan; Gohlke, Jan
2007-09-01
The application of infrared hyper-spectral sounder data to climate research requires the global analysis of multi-decadal time series of various atmosphere, surface or cloud related parameters. The data used in this analysis has to meet stringent global and scene independent absolute accuracy and stability requirements, it also has to be spatially and radiometrically unbiased, manageable in size and self-contained. Self-contained means that the data set contains not only a globally unbiased sample of the state of the Earth Climate system as seen in the infrared, it has to contain enough data to contrast clear with average (cloudy) data and to allow an independent assessment of the radiometric and spectral accuracy and stability of the data. We illustrate this with data from the Atmospheric Infrared Sounder (AIRS) and Infrared Atmospheric Sounder Interferometer (IASI) data. AIRS and IASI were designed with fairly similar functional requirements. AIRS was launched on the EOS Aqua spacecraft in May 2002 into a 705 km polar sun-synchronous orbit with accurately maintained 1:30 PM ascending node. Essentially un-interrupted data are available since September 2002. Since October 2006 IASI is in a 9:30 AM polar orbit at 815 km altitude on the MetOp2 satellite, with data available since May 2007.
NASA Astrophysics Data System (ADS)
Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho
2017-03-01
So far, most of the data published by Statistics Indonesia (BPS) as data providers for national statistics are still limited to the district level. Less sufficient sample size for smaller area levels to make the measurement of poverty indicators with direct estimation produced high standard error. Therefore, the analysis based on it is unreliable. To solve this problem, the estimation method which can provide a better accuracy by combining survey data and other auxiliary data is required. One method often used for the estimation is the Small Area Estimation (SAE). There are many methods used in SAE, one of them is Empirical Best Linear Unbiased Prediction (EBLUP). EBLUP method of maximum likelihood (ML) procedures does not consider the loss of degrees of freedom due to estimating β with β ^. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposed EBLUP with REML procedure for estimating poverty indicators by modeling the average of household expenditures per capita and implemented bootstrap procedure to calculate MSE (Mean Square Error) to compare the accuracy EBLUP method with the direct estimation method. Results show that EBLUP method reduced MSE in small area estimation.
Free energies from dynamic weighted histogram analysis using unbiased Markov state model.
Rosta, Edina; Hummer, Gerhard
2015-01-13
The weighted histogram analysis method (WHAM) is widely used to obtain accurate free energies from biased molecular simulations. However, WHAM free energies can exhibit significant errors if some of the biasing windows are not fully equilibrated. To account for the lack of full equilibration, we develop the dynamic histogram analysis method (DHAM). DHAM uses a global Markov state model to obtain the free energy along the reaction coordinate. A maximum likelihood estimate of the Markov transition matrix is constructed by joint unbiasing of the transition counts from multiple umbrella-sampling simulations along discretized reaction coordinates. The free energy profile is the stationary distribution of the resulting Markov matrix. For this matrix, we derive an explicit approximation that does not require the usual iterative solution of WHAM. We apply DHAM to model systems, a chemical reaction in water treated using quantum-mechanics/molecular-mechanics (QM/MM) simulations, and the Na(+) ion passage through the membrane-embedded ion channel GLIC. We find that DHAM gives accurate free energies even in cases where WHAM fails. In addition, DHAM provides kinetic information, which we here use to assess the extent of convergence in each of the simulation windows. DHAM may also prove useful in the construction of Markov state models from biased simulations in phase-space regions with otherwise low population.
Enhanced Sampling in the Well-Tempered Ensemble
NASA Astrophysics Data System (ADS)
Bonomi, M.; Parrinello, M.
2010-05-01
We introduce the well-tempered ensemble (WTE) which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann averages are computed on the fly by a recently developed reweighting method [M. Bonomi , J. Comput. Chem. 30, 1615 (2009)JCCHDD0192-865110.1002/jcc.21305]. We apply WTE and its parallel tempering variant to the 2d Ising model and to a Gō model of HIV protease, demonstrating in these two representative cases that convergence is accelerated by orders of magnitude.
Enhanced sampling in the well-tempered ensemble.
Bonomi, M; Parrinello, M
2010-05-14
We introduce the well-tempered ensemble (WTE) which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann averages are computed on the fly by a recently developed reweighting method [M. Bonomi, J. Comput. Chem. 30, 1615 (2009)]. We apply WTE and its parallel tempering variant to the 2d Ising model and to a Gō model of HIV protease, demonstrating in these two representative cases that convergence is accelerated by orders of magnitude.
Teacher Efficacy of Secondary Special Education Science Teachers
NASA Astrophysics Data System (ADS)
Bonton, Celeste
Students with disabilities are a specific group of the student population that are guaranteed rights that allow them to receive a free and unbiased education in an environment with their non-disabled peers. The importance of this study relates to providing students with disabilities with the opportunity to receive instruction from the most efficient and prepared educators. The purpose of this study is to determine how specific factors influence special education belief systems. In particular, educators who provide science instruction in whole group or small group classrooms in a large metropolitan area in Georgia possess specific beliefs about their ability to provide meaningful instruction. Data was collected through a correlational study completed by educators through an online survey website. The SEBEST quantitative survey instrument was used on a medium sample size (approximately 120 teachers) in a large metropolitan school district. The selected statistical analysis was the Shapiro-Wilk and Mann-Whitney in order to determine if any correlation exists among preservice training and perceived self-efficacy of secondary special education teachers in the content area of science. The results of this study showed that special education teachers in the content area of science have a higher perceived self-efficacy if they have completed an alternative certification program. Other variables tested did not show any statistical significance. Further research can be centered on the analysis of actual teacher efficacy, year end teacher efficacy measurements, teacher stipends, increased recruitment, and special education teachers of multiple content areas.
Astrochemical evolution along star formation: Overview of the IRAM Large Program ASAI
NASA Astrophysics Data System (ADS)
Lefloch, Bertrand; Bachiller, R.; Ceccarelli, C.; Cernicharo, J.; Codella, C.; Fuente, A.; Kahane, C.; López-Sepulcre, A.; Tafalla, M.; Vastel, C.; Caux, E.; González-García, M.; Bianchi, E.; Gómez-Ruiz, A.; Holdship, J.; Mendoza, E.; Ospina-Zamudio, J.; Podio, L.; Quénard, D.; Roueff, E.; Sakai, N.; Viti, S.; Yamamoto, S.; Yoshida, K.; Favre, C.; Monfredini, T.; Quitián-Lara, H. M.; Marcelino, N.; Roberty, H. Boechat; Cabrit, S.
2018-04-01
Evidence is mounting that the small bodies of our Solar System, such as comets and asteroids, have at least partially inherited their chemical composition from the first phases of the Solar System formation. It then appears that the molecular complexity of these small bodies is most likely related to the earliest stages of star formation. It is therefore important to characterize and to understand how the chemical evolution changes with solar-type protostellar evolution. We present here the Large Program "Astrochemical Surveys At IRAM" (ASAI). Its goal is to carry out unbiased millimeter line surveys between 80 and 272 GHz of a sample of ten template sources, which fully cover the first stages of the formation process of solar-type stars, from prestellar cores to the late protostellar phase. In this article, we present an overview of the surveys and results obtained from the analysis of the 3 mm band observations. The number of detected main isotopic species barely varies with the evolutionary stage and is found to be very similar to that of massive star-forming regions. The molecular content in O- and C- bearing species allows us to define two chemical classes of envelopes, whose composition is dominated by either a) a rich content in O-rich complex organic molecules, associated with hot corino sources, or b) a rich content in hydrocarbons, typical of Warm Carbon Chain Chemistry sources. Overall, a high chemical richness is found to be present already in the initial phases of solar-type star formation.
Characterization and photometric performance of the Hyper Suprime-Cam Software Pipeline
NASA Astrophysics Data System (ADS)
Huang, Song; Leauthaud, Alexie; Murata, Ryoma; Bosch, James; Price, Paul; Lupton, Robert; Mandelbaum, Rachel; Lackner, Claire; Bickerton, Steven; Miyazaki, Satoshi; Coupon, Jean; Tanaka, Masayuki
2018-01-01
The Subaru Strategic Program (SSP) is an ambitious multi-band survey using the Hyper Suprime-Cam (HSC) on the Subaru telescope. The Wide layer of the SSP is both wide and deep, reaching a detection limit of i ˜ 26.0 mag. At these depths, it is challenging to achieve accurate, unbiased, and consistent photometry across all five bands. The HSC data are reduced using a pipeline that builds on the prototype pipeline for the Large Synoptic Survey Telescope. We have developed a Python-based, flexible framework to inject synthetic galaxies into real HSC images, called SynPipe. Here we explain the design and implementation of SynPipe and generate a sample of synthetic galaxies to examine the photometric performance of the HSC pipeline. For stars, we achieve 1% photometric precision at i ˜ 19.0 mag and 6% precision at i ˜ 25.0 in the i band (corresponding to statistical scatters of ˜0.01 and ˜0.06 mag respectively). For synthetic galaxies with single-Sérsic profiles, forced CModel photometry achieves 13% photometric precision at i ˜ 20.0 mag and 18% precision at i ˜ 25.0 in the i band (corresponding to statistical scatters of ˜0.15 and ˜0.22 mag respectively). We show that both forced point spread function and CModel photometry yield unbiased color estimates that are robust to seeing conditions. We identify several caveats that apply to the version of HSC pipeline used for the first public HSC data release (DR1) that need to be taking into consideration. First, the degree to which an object is blended with other objects impacts the overall photometric performance. This is especially true for point sources. Highly blended objects tend to have larger photometric uncertainties, systematically underestimated fluxes, and slightly biased colors. Secondly, >20% of stars at 22.5 < i < 25.0 mag can be misclassified as extended objects. Thirdly, the current CModel algorithm tends to strongly underestimate the half-light radius and ellipticity of galaxy with i > 21.5 mag.
Li, Yanjie; Lu, Yue; Lin, Kevin; Hauser, Lauren A.; Lynch, David R.
2017-01-01
ABSTRACT Friedreich's ataxia (FRDA) is an autosomal recessive neurodegenerative disease usually caused by large homozygous expansions of GAA repeat sequences in intron 1 of the frataxin (FXN) gene. FRDA patients homozygous for GAA expansions have low FXN mRNA and protein levels when compared with heterozygous carriers or healthy controls. Frataxin is a mitochondrial protein involved in iron–sulfur cluster synthesis, and many FRDA phenotypes result from deficiencies in cellular metabolism due to lowered expression of FXN. Presently, there is no effective treatment for FRDA, and biomarkers to measure therapeutic trial outcomes and/or to gauge disease progression are lacking. Peripheral tissues, including blood cells, buccal cells and skin fibroblasts, can readily be isolated from FRDA patients and used to define molecular hallmarks of disease pathogenesis. For instance, FXN mRNA and protein levels as well as FXN GAA-repeat tract lengths are routinely determined using all of these cell types. However, because these tissues are not directly involved in disease pathogenesis, their relevance as models of the molecular aspects of the disease is yet to be decided. Herein, we conducted unbiased RNA sequencing to profile the transcriptomes of fibroblast cell lines derived from 18 FRDA patients and 17 unaffected control individuals. Bioinformatic analyses revealed significantly upregulated expression of genes encoding plasma membrane solute carrier proteins in FRDA fibroblasts. Conversely, the expression of genes encoding accessory factors and enzymes involved in cytoplasmic and mitochondrial protein synthesis was consistently decreased in FRDA fibroblasts. Finally, comparison of genes differentially expressed in FRDA fibroblasts to three previously published gene expression signatures defined for FRDA blood cells showed substantial overlap between the independent datasets, including correspondingly deficient expression of antioxidant defense genes. Together, these results indicate that gene expression profiling of cells derived from peripheral tissues can, in fact, consistently reveal novel molecular pathways of the disease. When performed on statistically meaningful sample group sizes, unbiased global profiling analyses utilizing peripheral tissues are critical for the discovery and validation of FRDA disease biomarkers. PMID:29125828
Building unbiased estimators from non-gaussian likelihoods with application to shear estimation
Madhavacheril, Mathew S.; McDonald, Patrick; Sehgal, Neelima; ...
2015-01-15
We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the workmore » of Bernstein and Armstrong (2014). Our first order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong’s estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g| = 0.2.« less
Building unbiased estimators from non-Gaussian likelihoods with application to shear estimation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Madhavacheril, Mathew S.; Sehgal, Neelima; McDonald, Patrick
2015-01-01
We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the workmore » of Bernstein and Armstrong (2014). Our first order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong's estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g|=0.2.« less
Predicting Individual Fuel Economy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Zhenhong; Greene, David L
2011-01-01
To make informed decisions about travel and vehicle purchase, consumers need unbiased and accurate information of the fuel economy they will actually obtain. In the past, the EPA fuel economy estimates based on its 1984 rules have been widely criticized for overestimating on-road fuel economy. In 2008, EPA adopted a new estimation rule. This study compares the usefulness of the EPA's 1984 and 2008 estimates based on their prediction bias and accuracy and attempts to improve the prediction of on-road fuel economies based on consumer and vehicle attributes. We examine the usefulness of the EPA fuel economy estimates using amore » large sample of self-reported on-road fuel economy data and develop an Individualized Model for more accurately predicting an individual driver's on-road fuel economy based on easily determined vehicle and driver attributes. Accuracy rather than bias appears to have limited the usefulness of the EPA 1984 estimates in predicting on-road MPG. The EPA 2008 estimates appear to be equally inaccurate and substantially more biased relative to the self-reported data. Furthermore, the 2008 estimates exhibit an underestimation bias that increases with increasing fuel economy, suggesting that the new numbers will tend to underestimate the real-world benefits of fuel economy and emissions standards. By including several simple driver and vehicle attributes, the Individualized Model reduces the unexplained variance by over 55% and the standard error by 33% based on an independent test sample. The additional explanatory variables can be easily provided by the individuals.« less
Kebschull, Moritz; Fittler, Melanie Julia; Demmer, Ryan T; Papapanou, Panos N
2017-01-01
Today, -omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ, or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier "candidate" gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized -omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease.A major issue when inferring biological information from high-throughput -omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences.In this chapter, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of -omics data generated using microarrays or next-generation sequencing technology using open-source tools. Starting with quality control measures and necessary preprocessing steps for data originating from different -omics technologies, we next outline a differential expression analysis pipeline that can be used for data from both microarray and sequencing experiments, and offers the possibility to account for random or fixed effects. Finally, we present an overview of the possibilities for a functional analysis of the obtained data.
Mogensen, Kris M; Andrew, Benjamin Y; Corona, Jasmine C; Robinson, Malcolm K
2016-07-01
The Society of Critical Care Medicine (SCCM) and American Society for Parenteral and Enteral Nutrition (ASPEN) recommend that obese, critically ill patients receive 11-14 kcal/kg/d using actual body weight (ABW) or 22-25 kcal/kg/d using ideal body weight (IBW), because feeding these patients 50%-70% maintenance needs while administering high protein may improve outcomes. It is unknown whether these equations achieve this target when validated against indirect calorimetry, perform equally across all degrees of obesity, or compare well with other equations. Measured resting energy expenditure (MREE) was determined in obese (body mass index [BMI] ≥30 kg/m(2)), critically ill patients. Resting energy expenditure was predicted (PREE) using several equations: 12.5 kcal/kg ABW (ASPEN-Actual BW), 23.5 kcal/kg IBW (ASPEN-Ideal BW), Harris-Benedict (adjusted-weight and 1.5 stress-factor), and Ireton-Jones for obesity. Correlation of PREE to 65% MREE, predictive accuracy, precision, bias, and large error incidence were calculated. All equations were significantly correlated with 65% MREE but had poor predictive accuracy, had excessive large error incidence, were imprecise, and were biased in the entire cohort (N = 31). In the obesity cohort (n = 20, BMI 30-50 kg/m(2)), ASPEN-Actual BW had acceptable predictive accuracy and large error incidence, was unbiased, and was nearly precise. In super obesity (n = 11, BMI >50 kg/m(2)), ASPEN-Ideal BW had acceptable predictive accuracy and large error incidence and was precise and unbiased. SCCM/ASPEN-recommended body weight equations are reasonable predictors of 65% MREE depending on the equation and degree of obesity. Assuming that feeding 65% MREE is appropriate, this study suggests that patients with a BMI 30-50 kg/m(2) should receive 11-14 kcal/kg/d using ABW and those with a BMI >50 kg/m(2) should receive 22-25 kcal/kg/d using IBW. © 2015 American Society for Parenteral and Enteral Nutrition.
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets.
Savitski, Mikhail M; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus
2015-09-01
Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target-decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target-decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The "picked" protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The "picked" target-decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used "classic" protein FDR approach that causes overprediction of false-positive protein identification in large data sets. The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets
Savitski, Mikhail M.; Wilhelm, Mathias; Hahne, Hannes; Kuster, Bernhard; Bantscheff, Marcus
2015-01-01
Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target–decoy strategy that particularly show when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic, as well as a novel target–decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprised of ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The “picked” protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the highest score. We investigated the performance of this approach in combination with q-value based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The “picked” target–decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein yielding a stable number of true positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used “classic” protein FDR approach that causes overprediction of false-positive protein identification in large data sets. The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications and is readily implemented in proteomics analysis software. PMID:25987413
Beyond MOS and fibers: Optical Fourier-transform Imaging Unit for Cananea Observatory (OFIUCO)
NASA Astrophysics Data System (ADS)
Nieto-Suárez, M. A.; Rosales-Ortega, F. F.; Castillo, E.; García, P.; Escobedo, G.; Sánchez, S. F.; González, J.; Iglesias-Páramo, J.; Mollá, M.; Chávez, M.; Bertone, E.; et al.
2017-11-01
Many physical processes in astronomy are still hampered by the lack of spatial and spectral resolution, and also restricted to the field-of-view (FoV) of current 2D spectroscopy instruments available worldwide. It is due to that, many of the ongoing or proposed studies are based on large-scale imaging and/or spectroscopic surveys. Under this philosophy, large aperture telescopes are dedicated to the study of intrinsically faint and/or distance objects, covering small FoVs, with high spatial resolution, while smaller telescopes are devoted to wide-field explorations. However, future astronomical surveys, should be addressed by acquiring un-biases, spatially resolved, high-quality spectroscopic information for a wide FoV. Therefore, and in order to improve the current instrumental offer in the Observatorio Astrofísico Guillermo Haro (OAGH) in Cananea, Mexico (INAOE); and to explore a possible instrument for the future Telescopio San Pedro Mártir (6.5m), we are currently integrating at INAOE an instrument prototype that will provide us with un-biased wide-field (few arcmin) spectroscopic information, and with the flexibility of operating at different spectral resolutions (R 1-20000), with a spatial resolution limited by seeing, and therefore, to be used in a wide range of astronomical problems. This instrument called OFIUCO: Optical Fourier-transform Imaging Unit for Cananea Observatory, will make use of the Fourier Transform Spectroscopic technique, which has been proved to be feasible in the optical wavelength range (350-1000 nm) with designs such as SITELLE (CFHT). We describe here the basic technical description of a Fourier transform spectrograph with important modifications from previous astronomical versions, as well as the technical advantages and weakness, and the science cases in which this instrument can be implemented.
Voros, Szilard; Maurovich-Horvat, Pal; Marvasty, Idean B; Bansal, Aruna T; Barnes, Michael R; Vazquez, Gustavo; Murray, Sarah S; Voros, Viktor; Merkely, Bela; Brown, Bradley O; Warnick, G Russell
2014-01-01
Complex biological networks of atherosclerosis are largely unknown. The main objective of the Genetic Loci and the Burden of Atherosclerotic Lesions study is to assemble comprehensive biological networks of atherosclerosis using advanced cardiovascular imaging for phenotyping, a panomic approach to identify underlying genomic, proteomic, metabolomic, and lipidomic underpinnings, analyzed by systems biology-driven bioinformatics. By design, this is a hypothesis-free unbiased discovery study collecting a large number of biologically related factors to examine biological associations between genomic, proteomic, metabolomic, lipidomic, and phenotypic factors of atherosclerosis. The Genetic Loci and the Burden of Atherosclerotic Lesions study (NCT01738828) is a prospective, multicenter, international observational study of atherosclerotic coronary artery disease. Approximately 7500 patients are enrolled and undergo non-contrast-enhanced coronary calcium scanning by CT for the detection and quantification of coronary artery calcium, as well as coronary artery CT angiography for the detection and quantification of plaque, stenosis, and overall coronary artery disease burden. In addition, patients undergo whole genome sequencing, DNA methylation, whole blood-based transcriptome sequencing, unbiased proteomics based on mass spectrometry, as well as metabolomics and lipidomics on a mass spectrometry platform. The study is analyzed in 3 subsequent phases, and each phase consists of a discovery cohort and an independent validation cohort. For the primary analysis, the primary phenotype will be the presence of any atherosclerotic plaque, as detected by cardiac CT. Additional phenotypic analyses will include per patient maximal luminal stenosis defined as 50% and 70% diameter stenosis. Single-omic and multi-omic associations will be examined for each phenotype; putative biomarkers will be assessed for association, calibration, discrimination, and reclassification. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Docherty, A R; Moscati, A; Peterson, R; Edwards, A C; Adkins, D E; Bacanu, S A; Bigdeli, T B; Webb, B T; Flint, J; Kendler, K S
2016-10-25
Biometrical genetic studies suggest that the personality dimensions, including neuroticism, are moderately heritable (~0.4 to 0.6). Quantitative analyses that aggregate the effects of many common variants have recently further informed genetic research on European samples. However, there has been limited research to date on non-European populations. This study examined the personality dimensions in a large sample of Han Chinese descent (N=10 064) from the China, Oxford, and VCU Experimental Research on Genetic Epidemiology study, aimed at identifying genetic risk factors for recurrent major depression among a rigorously ascertained cohort. Heritability of neuroticism as measured by the Eysenck Personality Questionnaire (EPQ) was estimated to be low but statistically significant at 10% (s.e.=0.03, P=0.0001). In addition to EPQ, neuroticism based on a three-factor model, data for the Big Five (BF) personality dimensions (neuroticism, openness, conscientiousness, extraversion and agreeableness) measured by the Big Five Inventory were available for controls (n=5596). Heritability estimates of the BF were not statistically significant despite high power (>0.85) to detect heritabilities of 0.10. Polygenic risk scores constructed by best linear unbiased prediction weights applied to split-half samples failed to significantly predict any of the personality traits, but polygenic risk for neuroticism, calculated with LDpred and based on predictive variants previously identified from European populations (N=171 911), significantly predicted major depressive disorder case-control status (P=0.0004) after false discovery rate correction. The scores also significantly predicted EPQ neuroticism (P=6.3 × 10 -6 ). Factor analytic results of the measures indicated that any differences in heritabilities across samples may be due to genetic variation or variation in haplotype structure between samples, rather than measurement non-invariance. Findings demonstrate that neuroticism can be significantly predicted across ancestry, and highlight the importance of studying polygenic contributions to personality in non-European populations.
Using machine learning to disentangle homonyms in large text corpora.
Roll, Uri; Correia, Ricardo A; Berger-Tal, Oded
2018-06-01
Systematic reviews are an increasingly popular decision-making tool that provides an unbiased summary of evidence to support conservation action. These reviews bridge the gap between researchers and managers by presenting a comprehensive overview of all studies relating to a particular topic and identify specifically where and under which conditions an effect is present. However, several technical challenges can severely hinder the feasibility and applicability of systematic reviews, for example, homonyms (terms that share spelling but differ in meaning). Homonyms add noise to search results and cannot be easily identified or removed. We developed a semiautomated approach that can aid in the classification of homonyms among narratives. We used a combination of automated content analysis and artificial neural networks to quickly and accurately sift through large corpora of academic texts and classify them to distinct topics. As an example, we explored the use of the word reintroduction in academic texts. Reintroduction is used within the conservation context to indicate the release of organisms to their former native habitat; however, a Web of Science search for this word returned thousands of publications in which the term has other meanings and contexts. Using our method, we automatically classified a sample of 3000 of these publications with over 99% accuracy, relative to a manual classification. Our approach can be used easily with other homonyms and can greatly facilitate systematic reviews or similar work in which homonyms hinder the harnessing of large text corpora. Beyond homonyms we see great promise in combining automated content analysis and machine-learning methods to handle and screen big data for relevant information in conservation science. © 2017 Society for Conservation Biology.
Stevens, Antoine; Nocita, Marco; Tóth, Gergely; Montanarella, Luca; van Wesemael, Bas
2013-01-01
Soil organic carbon is a key soil property related to soil fertility, aggregate stability and the exchange of CO2 with the atmosphere. Existing soil maps and inventories can rarely be used to monitor the state and evolution in soil organic carbon content due to their poor spatial resolution, lack of consistency and high updating costs. Visible and Near Infrared diffuse reflectance spectroscopy is an alternative method to provide cheap and high-density soil data. However, there are still some uncertainties on its capacity to produce reliable predictions for areas characterized by large soil diversity. Using a large-scale EU soil survey of about 20,000 samples and covering 23 countries, we assessed the performance of reflectance spectroscopy for the prediction of soil organic carbon content. The best calibrations achieved a root mean square error ranging from 4 to 15 g C kg(-1) for mineral soils and a root mean square error of 50 g C kg(-1) for organic soil materials. Model errors are shown to be related to the levels of soil organic carbon and variations in other soil properties such as sand and clay content. Although errors are ∼5 times larger than the reproducibility error of the laboratory method, reflectance spectroscopy provides unbiased predictions of the soil organic carbon content. Such estimates could be used for assessing the mean soil organic carbon content of large geographical entities or countries. This study is a first step towards providing uniform continental-scale spectroscopic estimations of soil organic carbon, meeting an increasing demand for information on the state of the soil that can be used in biogeochemical models and the monitoring of soil degradation.
Stevens, Antoine; Nocita, Marco; Tóth, Gergely; Montanarella, Luca; van Wesemael, Bas
2013-01-01
Soil organic carbon is a key soil property related to soil fertility, aggregate stability and the exchange of CO2 with the atmosphere. Existing soil maps and inventories can rarely be used to monitor the state and evolution in soil organic carbon content due to their poor spatial resolution, lack of consistency and high updating costs. Visible and Near Infrared diffuse reflectance spectroscopy is an alternative method to provide cheap and high-density soil data. However, there are still some uncertainties on its capacity to produce reliable predictions for areas characterized by large soil diversity. Using a large-scale EU soil survey of about 20,000 samples and covering 23 countries, we assessed the performance of reflectance spectroscopy for the prediction of soil organic carbon content. The best calibrations achieved a root mean square error ranging from 4 to 15 g C kg−1 for mineral soils and a root mean square error of 50 g C kg−1 for organic soil materials. Model errors are shown to be related to the levels of soil organic carbon and variations in other soil properties such as sand and clay content. Although errors are ∼5 times larger than the reproducibility error of the laboratory method, reflectance spectroscopy provides unbiased predictions of the soil organic carbon content. Such estimates could be used for assessing the mean soil organic carbon content of large geographical entities or countries. This study is a first step towards providing uniform continental-scale spectroscopic estimations of soil organic carbon, meeting an increasing demand for information on the state of the soil that can be used in biogeochemical models and the monitoring of soil degradation. PMID:23840459
2016-10-01
identify PCSC- specific homing peptides ; and 2) To perform unbiased drug library screening to identify novel PCSC-targeting chemicals. In the past...display library (PDL) screening in PSA-/lo PCa cells to identify PCSC- specific homing peptides ; and 2) To perform unbiased drug library screening to...Goals of the Project (SOW): Aim 1: To perform phage display library (PDL) screening in PSA-/lo PCa cells to identify PCSC- specific homing peptides
ERIC Educational Resources Information Center
Raudenbush, Stephen
2013-01-01
This brief considers the problem of using value-added scores to compare teachers who work in different schools. The author focuses on whether such comparisons can be regarded as fair, or, in statistical language, "unbiased." An unbiased measure does not systematically favor teachers because of the backgrounds of the students they are…
Long Term Follow up of the Delayed Effects of Acute Radiation Exposure in Primates
2017-10-01
66 of 94 We will then use shRNAs and/or CRISPR constructs targeting the gene of interest to knock down its expression in stem cells prior to...DLBCLs Mutational profiling identifies 150 driver genes Gene expression identifies sub- groups including cell of origin Unbiased CRISPR screen...Exome sequencing in 1,001 DLBCL patients comprehensively identifies 150 driver genes d Unbiased CRISPR screen in DLBCL cell lines identifies essential
Four photon parametric amplification. [in unbiased Josephson junction
NASA Technical Reports Server (NTRS)
Parrish, P. T.; Feldman, M. J.; Ohta, H.; Chiao, R. Y.
1974-01-01
An analysis is presented describing four-photon parametric amplification in an unbiased Josephson junction. Central to the theory is the model of the Josephson effect as a nonlinear inductance. Linear, small signal analysis is applied to the two-fluid model of the Josephson junction. The gain, gain-bandwidth product, high frequency limit, and effective noise temperature are calculated for a cavity reflection amplifier. The analysis is extended to multiple (series-connected) junctions and subharmonic pumping.
Test of mutually unbiased bases for six-dimensional photonic quantum systems
D'Ambrosio, Vincenzo; Cardano, Filippo; Karimi, Ebrahim; Nagali, Eleonora; Santamato, Enrico; Marrucci, Lorenzo; Sciarrino, Fabio
2013-01-01
In quantum information, complementarity of quantum mechanical observables plays a key role. The eigenstates of two complementary observables form a pair of mutually unbiased bases (MUBs). More generally, a set of MUBs consists of bases that are all pairwise unbiased. Except for specific dimensions of the Hilbert space, the maximal sets of MUBs are unknown in general. Even for a dimension as low as six, the identification of a maximal set of MUBs remains an open problem, although there is strong numerical evidence that no more than three simultaneous MUBs do exist. Here, by exploiting a newly developed holographic technique, we implement and test different sets of three MUBs for a single photon six-dimensional quantum state (a “qusix”), encoded exploiting polarization and orbital angular momentum of photons. A close agreement is observed between theory and experiments. Our results can find applications in state tomography, quantitative wave-particle duality, quantum key distribution. PMID:24067548
Test of mutually unbiased bases for six-dimensional photonic quantum systems.
D'Ambrosio, Vincenzo; Cardano, Filippo; Karimi, Ebrahim; Nagali, Eleonora; Santamato, Enrico; Marrucci, Lorenzo; Sciarrino, Fabio
2013-09-25
In quantum information, complementarity of quantum mechanical observables plays a key role. The eigenstates of two complementary observables form a pair of mutually unbiased bases (MUBs). More generally, a set of MUBs consists of bases that are all pairwise unbiased. Except for specific dimensions of the Hilbert space, the maximal sets of MUBs are unknown in general. Even for a dimension as low as six, the identification of a maximal set of MUBs remains an open problem, although there is strong numerical evidence that no more than three simultaneous MUBs do exist. Here, by exploiting a newly developed holographic technique, we implement and test different sets of three MUBs for a single photon six-dimensional quantum state (a "qusix"), encoded exploiting polarization and orbital angular momentum of photons. A close agreement is observed between theory and experiments. Our results can find applications in state tomography, quantitative wave-particle duality, quantum key distribution.
Graph-state formalism for mutually unbiased bases
NASA Astrophysics Data System (ADS)
Spengler, Christoph; Kraus, Barbara
2013-11-01
A pair of orthonormal bases is called mutually unbiased if all mutual overlaps between any element of one basis and an arbitrary element of the other basis coincide. In case the dimension, d, of the considered Hilbert space is a power of a prime number, complete sets of d+1 mutually unbiased bases (MUBs) exist. Here we present a method based on the graph-state formalism to construct such sets of MUBs. We show that for n p-level systems, with p being prime, one particular graph suffices to easily construct a set of pn+1 MUBs. In fact, we show that a single n-dimensional vector, which is associated with this graph, can be used to generate a complete set of MUBs and demonstrate that this vector can be easily determined. Finally, we discuss some advantages of our formalism regarding the analysis of entanglement structures in MUBs, as well as experimental realizations.
Mutually unbiased coarse-grained measurements of two or more phase-space variables
NASA Astrophysics Data System (ADS)
Paul, E. C.; Walborn, S. P.; Tasca, D. S.; Rudnicki, Łukasz
2018-05-01
Mutual unbiasedness of the eigenstates of phase-space operators—such as position and momentum, or their standard coarse-grained versions—exists only in the limiting case of infinite squeezing. In Phys. Rev. Lett. 120, 040403 (2018), 10.1103/PhysRevLett.120.040403, it was shown that mutual unbiasedness can be recovered for periodic coarse graining of these two operators. Here we investigate mutual unbiasedness of coarse-grained measurements for more than two phase-space variables. We show that mutual unbiasedness can be recovered between periodic coarse graining of any two nonparallel phase-space operators. We illustrate these results through optics experiments, using the fractional Fourier transform to prepare and measure mutually unbiased phase-space variables. The differences between two and three mutually unbiased measurements is discussed. Our results contribute to bridging the gap between continuous and discrete quantum mechanics, and they could be useful in quantum-information protocols.
Bouillon-Pichault, Marion; Jullien, Vincent; Bazzoli, Caroline; Pons, Gérard; Tod, Michel
2011-02-01
The aim of this work was to determine whether optimizing the study design in terms of ages and sampling times for a drug eliminated solely via cytochrome P450 3A4 (CYP3A4) would allow us to accurately estimate the pharmacokinetic parameters throughout the entire childhood timespan, while taking into account age- and weight-related changes. A linear monocompartmental model with first-order absorption was used successively with three different residual error models and previously published pharmacokinetic parameters ("true values"). The optimal ages were established by D-optimization using the CYP3A4 maturation function to create "optimized demographic databases." The post-dose times for each previously selected age were determined by D-optimization using the pharmacokinetic model to create "optimized sparse sampling databases." We simulated concentrations by applying the population pharmacokinetic model to the optimized sparse sampling databases to create optimized concentration databases. The latter were modeled to estimate population pharmacokinetic parameters. We then compared true and estimated parameter values. The established optimal design comprised four age ranges: 0.008 years old (i.e., around 3 days), 0.192 years old (i.e., around 2 months), 1.325 years old, and adults, with the same number of subjects per group and three or four samples per subject, in accordance with the error model. The population pharmacokinetic parameters that we estimated with this design were precise and unbiased (root mean square error [RMSE] and mean prediction error [MPE] less than 11% for clearance and distribution volume and less than 18% for k(a)), whereas the maturation parameters were unbiased but less precise (MPE < 6% and RMSE < 37%). Based on our results, taking growth and maturation into account a priori in a pediatric pharmacokinetic study is theoretically feasible. However, it requires that very early ages be included in studies, which may present an obstacle to the use of this approach. First-pass effects, alternative elimination routes, and combined elimination pathways should also be investigated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berlind, Andreas A.; Frieman, Joshua A.; Weinberg, David H.
2006-01-01
We identify galaxy groups and clusters in volume-limited samples of the SDSS redshift survey, using a redshift-space friends-of-friends algorithm. We optimize the friends-of-friends linking lengths to recover galaxy systems that occupy the same dark matter halos, using a set of mock catalogs created by populating halos of N-body simulations with galaxies. Extensive tests with these mock catalogs show that no combination of perpendicular and line-of-sight linking lengths is able to yield groups and clusters that simultaneously recover the true halo multiplicity function, projected size distribution, and velocity dispersion. We adopt a linking length combination that yields, for galaxy groups withmore » ten or more members: a group multiplicity function that is unbiased with respect to the true halo multiplicity function; an unbiased median relation between the multiplicities of groups and their associated halos; a spurious group fraction of less than {approx}1%; a halo completeness of more than {approx}97%; the correct projected size distribution as a function of multiplicity; and a velocity dispersion distribution that is {approx}20% too low at all multiplicities. These results hold over a range of mock catalogs that use different input recipes of populating halos with galaxies. We apply our group-finding algorithm to the SDSS data and obtain three group and cluster catalogs for three volume-limited samples that cover 3495.1 square degrees on the sky. We correct for incompleteness caused by fiber collisions and survey edges, and obtain measurements of the group multiplicity function, with errors calculated from realistic mock catalogs. These multiplicity function measurements provide a key constraint on the relation between galaxy populations and dark matter halos.« less
The Fecal Microbiota Profile and Bronchiolitis in Infants
Linnemann, Rachel W.; Mansbach, Jonathan M.; Ajami, Nadim J.; Espinola, Janice A.; Petrosino, Joseph F.; Piedra, Pedro A.; Stevenson, Michelle D.; Sullivan, Ashley F.; Thompson, Amy D.; Camargo, Carlos A.
2016-01-01
BACKGROUND: Little is known about the association of gut microbiota, a potentially modifiable factor, with bronchiolitis in infants. We aimed to determine the association of fecal microbiota with bronchiolitis in infants. METHODS: We conducted a case–control study. As a part of multicenter prospective study, we collected stool samples from 40 infants hospitalized with bronchiolitis. We concurrently enrolled 115 age-matched healthy controls. By applying 16S rRNA gene sequencing and an unbiased clustering approach to these 155 fecal samples, we identified microbiota profiles and determined the association of microbiota profiles with likelihood of bronchiolitis. RESULTS: Overall, the median age was 3 months, 55% were male, and 54% were non-Hispanic white. Unbiased clustering of fecal microbiota identified 4 distinct profiles: Escherichia-dominant profile (30%), Bifidobacterium-dominant profile (21%), Enterobacter/Veillonella-dominant profile (22%), and Bacteroides-dominant profile (28%). The proportion of bronchiolitis was lowest in infants with the Enterobacter/Veillonella-dominant profile (15%) and highest in the Bacteroides-dominant profile (44%), corresponding to an odds ratio of 4.59 (95% confidence interval, 1.58–15.5; P = .008). In the multivariable model, the significant association between the Bacteroides-dominant profile and a greater likelihood of bronchiolitis persisted (odds ratio for comparison with the Enterobacter/Veillonella-dominant profile, 4.24; 95% confidence interval, 1.56–12.0; P = .005). In contrast, the likelihood of bronchiolitis in infants with the Escherichia-dominant or Bifidobacterium-dominant profile was not significantly different compared with those with the Enterobacter/Veillonella-dominant profile. CONCLUSIONS: In this case–control study, we identified 4 distinct fecal microbiota profiles in infants. The Bacteroides-dominant profile was associated with a higher likelihood of bronchiolitis. PMID:27354456
Nasal Airway Microbiota Profile and Severe Bronchiolitis in Infants: A Case-control Study.
Hasegawa, Kohei; Linnemann, Rachel W; Mansbach, Jonathan M; Ajami, Nadim J; Espinola, Janice A; Petrosino, Joseph F; Piedra, Pedro A; Stevenson, Michelle D; Sullivan, Ashley F; Thompson, Amy D; Camargo, Carlos A
2017-11-01
Little is known about the relationship of airway microbiota with bronchiolitis in infants. We aimed to identify nasal airway microbiota profiles and to determine their association with the likelihood of bronchiolitis in infants. A case-control study was conducted. As a part of a multicenter prospective study, we collected nasal airway samples from 40 infants hospitalized with bronchiolitis. We concurrently enrolled 110 age-matched healthy controls. By applying 16S ribosomal RNA gene sequencing and an unbiased clustering approach to these 150 nasal samples, we identified microbiota profiles and determined the association of microbiota profiles with likelihood of bronchiolitis. Overall, the median age was 3 months and 56% were male. Unbiased clustering of airway microbiota identified 4 distinct profiles: Moraxella-dominant profile (37%), Corynebacterium/Dolosigranulum-dominant profile (27%), Staphylococcus-dominant profile (15%) and mixed profile (20%). Proportion of bronchiolitis was lowest in infants with Moraxella-dominant profile (14%) and highest in those with Staphylococcus-dominant profile (57%), corresponding to an odds ratio of 7.80 (95% confidence interval, 2.64-24.9; P < 0.001). In the multivariable model, the association between Staphylococcus-dominant profile and greater likelihood of bronchiolitis persisted (odds ratio for comparison with Moraxella-dominant profile, 5.16; 95% confidence interval, 1.26-22.9; P = 0.03). By contrast, Corynebacterium/Dolosigranulum-dominant profile group had low proportion of infants with bronchiolitis (17%); the likelihood of bronchiolitis in this group did not significantly differ from those with Moraxella-dominant profile in both unadjusted and adjusted analyses. In this case-control study, we identified 4 distinct nasal airway microbiota profiles in infants. Moraxella-dominant and Corynebacterium/Dolosigranulum-dominant profiles were associated with low likelihood of bronchiolitis, while Staphylococcus-dominant profile was associated with high likelihood of bronchiolitis.
Busquets, Joanna; Del Galdo, Francesco; Kissin, Eugene Y.
2010-01-01
Objectives. To obtain an objective, unbiased assessment of skin fibrosis in patients with SSc for use in clinical trials of SSc disease-modifying therapeutics. Methods. Skin biopsies from the dorsal forearm of six patients with diffuse SSc and six healthy controls, and skin biopsies from the forearm of one patient with diffuse SSc before and following 1 year treatment with mycophenolate mofetil were analysed by confocal laser scanning microscopy (CLSM) with specific antibodies against collagen types I and III or fibronectin. The integrated density of fluorescence (IDF) was calculated employing National Institutes of Health-ImageJ software in at least four different fields per biopsy spanning the full dermal thickness. Results. The intensities of collagen types I and III and fibronectin IDF were 174, 147 and 139% higher in SSc skin than in normal skin, respectively. All differences were statistically significant. The sum of the IDF values obtained for the three proteins yielded a comprehensive fibrosis score. The average fibrosis score for the six SSc samples was 28.3 × 106 compared with 18.6 × 106 for the six normal skin samples (P < 0.0001). Comparison of skin biopsies obtained from the same SSc patient before treatment and after 12 months of treatment with mycophenolate mofetil showed a reduction of 39% in total fibrosis score after treatment. Conclusions. CLSM followed by quantitative image analysis provides an objective and unbiased assessment of skin fibrosis in SSc and could be a useful end-point for clinical trials with disease-modifying agents to monitor the response or progression of the disease. PMID:20202926
Yurt, Kıymet Kübra; Kivrak, Elfide Gizem; Altun, Gamze; Mohamed, Hamza; Ali, Fathelrahman; Gasmalla, Hosam Eldeen; Kaplan, Suleyman
2018-02-26
A quantitative description of a three-dimensional (3D) object based on two-dimensional images can be made using stereological methods These methods involve unbiased approaches and provide reliable results with quantitative data. The quantitative morphology of the nervous system has been thoroughly researched in this context. In particular, various novel methods such as design-based stereological approaches have been applied in neuoromorphological studies. The main foundations of these methods are systematic random sampling and a 3D approach to structures such as tissues and organs. One key point in these methods is that selected samples should represent the entire structure. Quantification of neurons, i.e. particles, is important for revealing degrees of neurodegeneration and regeneration in an organ or system. One of the most crucial morphometric parameters in biological studies is thus the "number". The disector counting method introduced by Sterio in 1984 is an efficient and reliable solution for particle number estimation. In order to obtain precise results by means of stereological analysis, counting items should be seen clearly in the tissue. If an item in the tissue cannot be seen, these cannot be analyzed even using unbiased stereological techniques. Staining and sectioning processes therefore play a critical role in stereological analysis. The purpose of this review is to evaluate current neuroscientific studies using optical and physical disector counting methods and to discuss their definitions and methodological characteristics. Although the efficiency of the optical disector method in light microscopic studies has been revealed in recent years, the physical disector method is more easily performed in electron microscopic studies. Also, we offered to readers summaries of some common basic staining and sectioning methods, which can be used for stereological techniques in this review. Copyright © 2018 Elsevier B.V. All rights reserved.
Kim, Eun Sook; Wang, Yan
2017-01-01
Population heterogeneity in growth trajectories can be detected with growth mixture modeling (GMM). It is common that researchers compute composite scores of repeated measures and use them as multiple indicators of growth factors (baseline performance and growth) assuming measurement invariance between latent classes. Considering that the assumption of measurement invariance does not always hold, we investigate the impact of measurement noninvariance on class enumeration and parameter recovery in GMM through a Monte Carlo simulation study (Study 1). In Study 2, we examine the class enumeration and parameter recovery of the second-order growth mixture modeling (SOGMM) that incorporates measurement models at the first order level. Thus, SOGMM estimates growth trajectory parameters with reliable sources of variance, that is, common factor variance of repeated measures and allows heterogeneity in measurement parameters between latent classes. The class enumeration rates are examined with information criteria such as AIC, BIC, sample-size adjusted BIC, and hierarchical BIC under various simulation conditions. The results of Study 1 showed that the parameter estimates of baseline performance and growth factor means were biased to the degree of measurement noninvariance even when the correct number of latent classes was extracted. In Study 2, the class enumeration accuracy of SOGMM depended on information criteria, class separation, and sample size. The estimates of baseline performance and growth factor mean differences between classes were generally unbiased but the size of measurement noninvariance was underestimated. Overall, SOGMM is advantageous in that it yields unbiased estimates of growth trajectory parameters and more accurate class enumeration compared to GMM by incorporating measurement models. PMID:28928691
NASA Astrophysics Data System (ADS)
Aller, M. F.; Aller, H. D.; Hughes, P. A.
2001-12-01
Using centimeter-band total flux and linear polarization observations of the Pearson-Readhead sample sources systematically obtained with the UMRAO 26-m radio telescope during the past 16 years, we identify the range of variability properties and their temporal changes as functions of both optical and radio morphological classification. We find that our earlier statistical analysis, based on a time window of 6.4 years, did not delineate the full amplitude range of the total flux variability; further, several galaxies exhibit longterm, systematic changes or rather infrequent outbursts requiring long term observations for detection. Using radio classification as a delineator, we confirm, and find additional evidence, that significant changes in flux density can occur in steep spectrum and lobe-dominated objects as well as in compact, flat-spectrum objects. We find that statistically the time-averaged total flux density spectra steepen when longer time windows are included, which we attribute to a selection effect in the source sample. We have identified preferred orientations of the electric vector of the polarized emission (EVPA) in an unbiased manner in several sources, including several QSOs which have exhibited large variations in total flux while maintaining stable EVPAs, and compared these with orientations of the flow direction indicated by VLB morphology. We have looked for systematic, monotonic changes in EVPA which might be expected in the emission from a precessing jet, but none were identified. A Scargle periodogram analysis found no strong evidence for periodicity in any of the sample sources. We thank the NSF for grants AST-8815678, AST-9120224, AST-9421979, and AST-9900723 which provided partial support for this research. The operation of the 26-meter telescope is supported by the University of Michigan Department of Astronomy.
VizieR Online Data Catalog: High quality Spitzer/MIPS obs. of F4-K2 stars (Sierchio+, 2014)
NASA Astrophysics Data System (ADS)
Sierchio, J. M.; Rieke, G. H.; Su, K. Y. L.; Gaspar, A.
2016-11-01
We used specific criteria to draw samples of stars from the entire Spitzer Debris Disk Database (see section 2.1.1). V magnitudes were taken from Hipparcos and transformed to Johnson V. All stars were also required to have observations on the Two Micron All Sky Survey (2MASS) Ks system. Additional measurements were obtained at SAAO on the 0.75m telescope using the MarkII Infrared Photometer (transformed as described by Koen et al. 2007MNRAS.380.1433K), and at the Steward Observatory 61 in telescope using a NICMOS2-based camera with a 2MASS filter set and a neutral density filter to avoid saturation. These measurements will be described in a forthcoming paper (K. Y. L. Su et al., in preparation). The original programs in which our sample stars were measured are identified in Table 1. A large majority (93%) come from seven Spitzer programs: (1) the MIPS Guaranteed Time Observer (GTO) Sun-like star observations (Trilling+ 2008ApJ...674.1086T); (2) Formation and Evolution of Planetary Systems (FEPS; Meyer+ 2006, J/PASP/118/1690); (3) Completing the Census of Debris Disks (Koerner+ 2010ApJ...710L..26K); (4) potential Space Interferometry Mission/Terrestrial Planet Finder (SIM/TPF) targets (Beichman+ 2006ApJ...652.1674B); (5) an unbiased sample of F-stars (Trilling+ 2008ApJ...674.1086T); and (6) two coordinated programs selecting stars on the basis of indicators of youth (Low+ 2005ApJ...631.1170L; Plavchan+ 2009ApJ...698.1068P). See section 2.1.2. (1 data file).
Zero-bias microwave detectors based on array of nanorectifiers coupled with a dipole antenna
NASA Astrophysics Data System (ADS)
Kasjoo, Shahrir R.; Singh, Arun K.; Mat Isa, Siti S.; Ramli, Muhammad M.; Mohamad Isa, Muammar; Ahmad, Norhawati; Mohd Nor, Nurul I.; Khalid, Nazuhusna; Song, Ai Min
2016-04-01
We report on zero-bias microwave detection using a large array of unipolar nanodevices, known as the self-switching diodes (SSDs). The large array was realized in a single lithography step without the need of interconnection layers, hence allowing for a simple and low-cost fabrication process. The SSD array was coupled with a narrowband dipole antenna with a resonant frequency of 890 MHz, to form a simple rectenna (rectifying antenna). The extrinsic voltage responsivity and noise-equivalent-power (NEP) of the rectenna were ∼70 V/W and ∼0.18 nW/Hz1/2, respectively, measured in the far-field region at unbiased condition. Nevertheless, the estimated intrinsic voltage responsivity can achieve up to ∼5 kV/W with NEP of ∼2.6 pW/Hz1/2.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Sha; Jones, R. R.
Electrons ejected from atoms and subsequently driven to high energies in strong laser fields enable techniques from attosecond pulse generation to imaging with rescattered electrons. Analogous processes govern strong-field electron emission from nanostructures, where long wavelength radiation and large local field enhancements hold the promise for producing electrons with substantially higher energies, allowing for higher resolution time-resolved imaging. Here we report on the use of single-cycle terahertz pulses to drive electron emission from unbiased nano-tips. Energies exceeding 5 keV are observed, substantially greater than previously attained at higher drive frequencies. Despite large differences in the magnitude of the respective localmore » fields, we find that the maximum electron energies are only weakly dependent on the tip radius, for 10 nm« less
Early-branching Gut Fungi Possess A Large, And Comprehensive Array Of Biomass-Degrading Enzymes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Solomon, Kevin V.; Haitjema, Charles; Henske, John K.
The fungal kingdom is the source of almost all industrial enzymes in use for lignocellulose bioprocessing. Its more primitive members, however, remain relatively unexploited. We developed a systems-level approach that integrates RNA-Seq, proteomics, phenotype and biochemical studies of relatively unexplored early-branching free-living fungi. Anaerobic gut fungi isolated from herbivores produce a large array of biomass-degrading enzymes that synergistically degrade crude, unpretreated plant biomass, and are competitive with optimized commercial preparations from Aspergillus and Trichoderma. Compared to these model platforms, gut fungal enzymes are unbiased in substrate preference due to a wealth of xylan-degrading enzymes. These enzymes are universally catabolite repressed,more » and are further regulated by a rich landscape of noncoding regulatory RNAs. Furthermore, we identified several promising sequence divergent enzyme candidates for lignocellulosic bioprocessing.« less
Li, Sha; Jones, R. R.
2016-11-10
Electrons ejected from atoms and subsequently driven to high energies in strong laser fields enable techniques from attosecond pulse generation to imaging with rescattered electrons. Analogous processes govern strong-field electron emission from nanostructures, where long wavelength radiation and large local field enhancements hold the promise for producing electrons with substantially higher energies, allowing for higher resolution time-resolved imaging. Here we report on the use of single-cycle terahertz pulses to drive electron emission from unbiased nano-tips. Energies exceeding 5 keV are observed, substantially greater than previously attained at higher drive frequencies. Despite large differences in the magnitude of the respective localmore » fields, we find that the maximum electron energies are only weakly dependent on the tip radius, for 10 nm« less
Polarimetry of optically selected BL Lacertae candidates from the SDSS
NASA Astrophysics Data System (ADS)
Heidt, J.; Nilsson, K.
2011-05-01
We present and discuss polarimetric observations of 182 targets drawn from an optically selected sample of 240 probable BL Lac candidates out of the SDSS compiled by Collinge et al. (2005, AJ, 129, 2542). In contrast to most other BL Lac candidate samples extracted from the SDSS, its radio- and/or X-ray properties have not been taken into account for its derivation. Thus, because its selection is based on optical properties alone, it may be less prone to selection effects inherent in other samples derived at different frequencies, so it offers a unique opportunity to extract the first unbiased BL Lac luminosity function that is suitably large in size. We found 124 out of 182 targets (68%) to be polarized, 95 of the polarized targets (77%) to be highly polarized (>4%). The low-frequency peaked BL Lac candidates in the sample are on average only slightly more polarized than the high-frequency peaked ones. Compared to earlier studies, we found a high duty cycle in high polarization (˜ 66+2-14% to be >4% polarized) in high-frequency peaked BL Lac candidates. This may come from our polarization analysis, which minimizes the contamination by host galaxy light. No evidence of radio-quiet BL Lac objects in the sample was found. Our observations show that the probable sample of BL Lac candidates of Collinge et al. (2005) indeed contains a large number of bona fide BL Lac objects. High S/N spectroscopy and deep X-ray observations are required to construct the first luminosity function of optically selected BL Lac objects and to test more stringently for any radio-quiet BL Lac objects in the sample. Based on observations collected with the NTT on La Silla (Chile) operated by the European Southern Observatory in the course of the observing proposal 082.B-0133.Based on observations collected at the Centro Astronómico Hispano Alemán (CAHA), operated jointly by the Max-Planck-Institut für Astronomie and the Instituto de Astrofisica de Andalucia (CSIC).Based on observations made with the Nordic Optical Telescope, operated on the island of La Palma jointly by Denmark, Finland, Iceland, Norway, and Sweden, in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofisica de Canarias.Table 1 is only available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/529/A162
Statistical and sampling issues when using multiple particle tracking
NASA Astrophysics Data System (ADS)
Savin, Thierry; Doyle, Patrick S.
2007-08-01
Video microscopy can be used to simultaneously track several microparticles embedded in a complex material. The trajectories are used to extract a sample of displacements at random locations in the material. From this sample, averaged quantities characterizing the dynamics of the probes are calculated to evaluate structural and/or mechanical properties of the assessed material. However, the sampling of measured displacements in heterogeneous systems is singular because the volume of observation with video microscopy is finite. By carefully characterizing the sampling design in the experimental output of the multiple particle tracking technique, we derive estimators for the mean and variance of the probes’ dynamics that are independent of the peculiar statistical characteristics. We expose stringent tests of these estimators using simulated and experimental complex systems with a known heterogeneous structure. Up to a certain fundamental limitation, which we characterize through a material degree of sampling by the embedded probe tracking, these estimators can be applied to quantify the heterogeneity of a material, providing an original and intelligible kind of information on complex fluid properties. More generally, we show that the precise assessment of the statistics in the multiple particle tracking output sample of observations is essential in order to provide accurate unbiased measurements.
Quantum Inference on Bayesian Networks
NASA Astrophysics Data System (ADS)
Yoder, Theodore; Low, Guang Hao; Chuang, Isaac
2014-03-01
Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speedup sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time (nmP(e) - 1 / 2) , depending critically on P(e) , the probability the evidence might occur in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking (n2m P(e) -1/2) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
Cross-Layer Resource Allocation for Wireless Visual Sensor Networks and Mobile Ad Hoc Networks
2014-10-01
MMD), minimizes the maximum dis- tortion among all nodes of the network, promoting a rather unbiased treatment of the nodes. We employed the Particle...achieve the ideal tradeoff between the transmitted video quality and energy consumption. Each sensor node has a bit rate that can be used for both...Distortion (MMD), minimizes the maximum distortion among all nodes of the network, promoting a rather unbiased treatment of the nodes. For both criteria
Unbiased estimators for spatial distribution functions of classical fluids
NASA Astrophysics Data System (ADS)
Adib, Artur B.; Jarzynski, Christopher
2005-01-01
We use a statistical-mechanical identity closely related to the familiar virial theorem, to derive unbiased estimators for spatial distribution functions of classical fluids. In particular, we obtain estimators for both the fluid density ρ(r) in the vicinity of a fixed solute and the pair correlation g(r) of a homogeneous classical fluid. We illustrate the utility of our estimators with numerical examples, which reveal advantages over traditional histogram-based methods of computing such distributions.
Physical and dynamical studies of meteors. Meteor-fragmentation and stream-distribution studies
NASA Technical Reports Server (NTRS)
Sekanina, Z.; Southworth, R. B.
1975-01-01
Population parameters of 275 streams including 20 additional streams in the synoptic-year sample were found by a computer technique. Some 16 percent of the sample is in these streams. Four meteor streams that have close orbital resemblance to Adonis cannot be positively identified as meteors ejected by Adonis within the last 12000 years. Ceplecha's discrete levels of meteor height are not evident in radar meteors. The spread of meteoroid fragments along their common trajectory was computed for most of the observed radar meteors. There is an unexpected relationship between spread and velocity that perhaps conceals relationships between fragmentation and orbits; a theoretical treatment will be necessary to resolve these relationships. Revised unbiased statistics of synoptic-year orbits are presented, together with parallel statistics for the 1961 to 1965 radar meteor orbits.
Seeing Through the Clouds: AGN Geometry with the Swift BAT Sample
NASA Astrophysics Data System (ADS)
Glikman, Eilat; Urry, M.; Schawinski, K.; Koss, M. J.; Winter, L. M.; Elitzur, M.; Wilkin, W. H.
2011-01-01
We investigate the intrinsic structure of the clouds surrounding AGN which give rise to their X-ray and optical emission properties. Using a complete sample of Swift BAT AGN selected in hard X-rays (14-195 keV), which is unbiased with respect to obscuration and extinction, we compute the reddening in the broad line region along the line of sight to the nucleus of each source using Balmer decrement from the ratio of the broad components of H-alpha/H-beta. We compare reddening from dust in the broad line clouds to the hydrogen column density (NH) obtained from their X-ray spectra. The distribution of the gas-to-dust ratios over many lines of sight allow us to test models of AGN structure and probe the immediate environment of the accreting supermassive black holes.
Conformational Entropy as Collective Variable for Proteins.
Palazzesi, Ferruccio; Valsson, Omar; Parrinello, Michele
2017-10-05
Many enhanced sampling methods rely on the identification of appropriate collective variables. For proteins, even small ones, finding appropriate descriptors has proven challenging. Here we suggest that the NMR S 2 order parameter can be used to this effect. We trace the validity of this statement to the suggested relation between S 2 and conformational entropy. Using the S 2 order parameter and a surrogate for the protein enthalpy in conjunction with metadynamics or variationally enhanced sampling, we are able to reversibly fold and unfold a small protein and draw its free energy at a fraction of the time that is needed in unbiased simulations. We also use S 2 in combination with the free energy flooding method to compute the unfolding rate of this peptide. We repeat this calculation at different temperatures to obtain the unfolding activation energy.
miCLIP-MaPseq, a Substrate Identification Approach for Radical SAM RNA Methylating Enzymes.
Stojković, Vanja; Chu, Tongyue; Therizols, Gabriel; Weinberg, David E; Fujimori, Danica Galonić
2018-06-13
Although present across bacteria, the large family of radical SAM RNA methylating enzymes is largely uncharacterized. Escherichia coli RlmN, the founding member of the family, methylates an adenosine in 23S rRNA and several tRNAs to yield 2-methyladenosine (m 2 A). However, varied RNA substrate specificity among RlmN enzymes, combined with the ability of certain family members to generate 8-methyladenosine (m 8 A), makes functional predictions across this family challenging. Here, we present a method for unbiased substrate identification that exploits highly efficient, mechanism-based cross-linking between the enzyme and its RNA substrates. Additionally, by determining that the thermostable group II intron reverse transcriptase introduces mismatches at the site of the cross-link, we have identified the precise positions of RNA modification using mismatch profiling. These results illustrate the capability of our method to define enzyme-substrate pairs and determine modification sites of the largely uncharacterized radical SAM RNA methylating enzyme family.
Sampling considerations for disease surveillance in wildlife populations
Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.
2008-01-01
Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
NASA Astrophysics Data System (ADS)
Kohno, Mikito; Torii, Kazufumi; Tachihara, Kengo; Umemoto, Tomofumi; Minamidani, Tetsuhiro; Nishimura, Atsushi; Fujita, Shinji; Matsuo, Mitsuhiro; Yamagishi, Mitsuyoshi; Tsuda, Yuya; Kuriki, Mika; Kuno, Nario; Ohama, Akio; Hattori, Yusuke; Sano, Hidetoshi; Yamamoto, Hiroaki; Fukui, Yasuo
2018-05-01
We observed molecular clouds in the W 33 high-mass star-forming region associated with compact and extended H II regions using the NANTEN2 telescope as well as the Nobeyama 45 m telescope in the J = 1-0 transitions of 12CO, 13CO, and C18O as part of the FOREST Unbiased Galactic plane Imaging survey with the Nobeyama 45 m telescope (FUGIN) legacy survey. We detected three velocity components at 35 km s-1, 45 km s-1, and 58 km s-1. The 35 km s-1 and 58 km s-1 clouds are likely to be physically associated with W 33 because of the enhanced 12CO J = 3-2 to J = 1-0 intensity ratio as R_3-2/1-0 > 1.0 due to the ultraviolet irradiation by OB stars, and morphological correspondence between the distributions of molecular gas and the infrared and radio continuum emissions excited by high-mass stars. The two clouds show complementary distributions around W 33. The velocity separation is too large to be gravitationally bound, and yet not explained by expanding motion by stellar feedback. Therefore, we discuss whether a cloud-cloud collision scenario likely explains the high-mass star formation in W 33.
Perandini, Simone; Soardi, G A; Larici, A R; Del Ciello, A; Rizzardi, G; Solazzo, A; Mancino, L; Zeraj, F; Bernhart, M; Signorini, M; Motton, M; Montemezzi, S
2017-05-01
To achieve multicentre external validation of the Herder and Bayesian Inference Malignancy Calculator (BIMC) models. Two hundred and fifty-nine solitary pulmonary nodules (SPNs) collected from four major hospitals which underwent 18-FDG-PET characterization were included in this multicentre retrospective study. The Herder model was tested on all available lesions (group A). A subgroup of 180 SPNs (group B) was used to provide unbiased comparison between the Herder and BIMC models. Receiver operating characteristic (ROC) area under the curve (AUC) analysis was performed to assess diagnostic accuracy. Decision analysis was performed by adopting the risk threshold stated in British Thoracic Society (BTS) guidelines. Unbiased comparison performed In Group B showed a ROC AUC for the Herder model of 0.807 (95 % CI 0.742-0.862) and for the BIMC model of 0.822 (95 % CI 0.758-0.875). Both the Herder and the BIMC models were proven to accurately predict the risk of malignancy when tested on a large multicentre external case series. The BIMC model seems advantageous on the basis of a more favourable decision analysis. • The Herder model showed a ROC AUC of 0.807 on 180 SPNs. • The BIMC model showed a ROC AUC of 0.822 on 180 SPNs. • Decision analysis is more favourable to the BIMC model.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deymier, Martin J., E-mail: mdeymie@emory.edu; Claiborne, Daniel T., E-mail: dclaibo@emory.edu; Ende, Zachary, E-mail: zende@emory.edu
The high genetic diversity of HIV-1 impedes high throughput, large-scale sequencing and full-length genome cloning by common restriction enzyme based methods. Applying novel methods that employ a high-fidelity polymerase for amplification and an unbiased fusion-based cloning strategy, we have generated several HIV-1 full-length genome infectious molecular clones from an epidemiologically linked transmission pair. These clones represent the transmitted/founder virus and phylogenetically diverse non-transmitted variants from the chronically infected individual's diverse quasispecies near the time of transmission. We demonstrate that, using this approach, PCR-induced mutations in full-length clones derived from their cognate single genome amplicons are rare. Furthermore, all eight non-transmittedmore » genomes tested produced functional virus with a range of infectivities, belying the previous assumption that a majority of circulating viruses in chronic HIV-1 infection are defective. Thus, these methods provide important tools to update protocols in molecular biology that can be universally applied to the study of human viral pathogens. - Highlights: • Our novel methodology demonstrates accurate amplification and cloning of full-length HIV-1 genomes. • A majority of plasma derived HIV variants from a chronically infected individual are infectious. • The transmitted/founder was more infectious than the majority of the variants from the chronically infected donor.« less
Byrgazov, Konstantin; Lucini, Chantal Blanche; Berkowitsch, Bettina; Koenig, Margit; Haas, Oskar A; Hoermann, Gregor; Valent, Peter; Lion, Thomas
2016-11-22
Point mutations in the ABL1 kinase domain are an important mechanism of resistance to tyrosine kinase inhibitors (TKI) in BCR-ABL1-positive and, as recently shown, BCR-ABL1-like leukemias. The cell line Ba/F3 lentivirally transduced with mutant BCR-ABL1 constructs is widely used for in vitro sensitivity testing and response prediction to tyrosine kinase inhibitors. The transposon-based Sleeping Beauty system presented offers several advantages over lentiviral transduction including the absence of biosafety issues, faster generation of transgenic cell lines, and greater efficacy in introducing large gene constructs. Nevertheless, both methods can mediate multiple insertions in the genome. Here we show that multiple BCR-ABL1 insertions result in elevated IC50 levels for individual TKIs, thus overestimating the actual resistance of mutant subclones. We have therefore established flow-sorting-based fractionation of BCR-ABL1-transformed Ba/F3 cells facilitating efficient enrichment of cells carrying single-site insertions, as demonstrated by FISH-analysis. Fractions of unselected Ba/F3 cells not only showed a greater number of BCR-ABL1 hybridization signals, but also revealed higher IC50 values for the TKIs tested. The data presented highlight the need to carefully select transfected cells by flow-sorting, and to control the insertion numbers by FISH and real-time PCR to permit unbiased in vitro testing of drug resistance.
NASA Astrophysics Data System (ADS)
Kohno, Mikito; Torii, Kazufumi; Tachihara, Kengo; Umemoto, Tomofumi; Minamidani, Tetsuhiro; Nishimura, Atsushi; Fujita, Shinji; Matsuo, Mitsuhiro; Yamagishi, Mitsuyoshi; Tsuda, Yuya; Kuriki, Mika; Kuno, Nario; Ohama, Akio; Hattori, Yusuke; Sano, Hidetoshi; Yamamoto, Hiroaki; Fukui, Yasuo
2018-01-01
We observed molecular clouds in the W 33 high-mass star-forming region associated with compact and extended H II regions using the NANTEN2 telescope as well as the Nobeyama 45 m telescope in the J = 1-0 transitions of 12CO, 13CO, and C18O as part of the FOREST Unbiased Galactic plane Imaging survey with the Nobeyama 45 m telescope (FUGIN) legacy survey. We detected three velocity components at 35 km s-1, 45 km s-1, and 58 km s-1. The 35 km s-1 and 58 km s-1 clouds are likely to be physically associated with W 33 because of the enhanced 12CO J = 3-2 to J = 1-0 intensity ratio as R3-2/1-0 > 1.0 due to the ultraviolet irradiation by OB stars, and morphological correspondence between the distributions of molecular gas and the infrared and radio continuum emissions excited by high-mass stars. The two clouds show complementary distributions around W 33. The velocity separation is too large to be gravitationally bound, and yet not explained by expanding motion by stellar feedback. Therefore, we discuss whether a cloud-cloud collision scenario likely explains the high-mass star formation in W 33.
NASA Astrophysics Data System (ADS)
Kohno, Mikito; Torii, Kazufumi; Tachihara, Kengo; Umemoto, Tomofumi; Minamidani, Tetsuhiro; Nishimura, Atsushi; Fujita, Shinji; Matsuo, Mitsuhiro; Yamagishi, Mitsuyoshi; Tsuda, Yuya; Kuriki, Mika; Kuno, Nario; Ohama, Akio; Hattori, Yusuke; Sano, Hidetoshi; Yamamoto, Hiroaki; Fukui, Yasuo
2018-05-01
We observed molecular clouds in the W 33 high-mass star-forming region associated with compact and extended H II regions using the NANTEN2 telescope as well as the Nobeyama 45 m telescope in the J = 1-0 transitions of 12CO, 13CO, and C18O as part of the FOREST Unbiased Galactic plane Imaging survey with the Nobeyama 45 m telescope (FUGIN) legacy survey. We detected three velocity components at 35 km s-1, 45 km s-1, and 58 km s-1. The 35 km s-1 and 58 km s-1 clouds are likely to be physically associated with W 33 because of the enhanced 12CO J = 3-2 to J = 1-0 intensity ratio as R_3-2/1-0} > 1.0 due to the ultraviolet irradiation by OB stars, and morphological correspondence between the distributions of molecular gas and the infrared and radio continuum emissions excited by high-mass stars. The two clouds show complementary distributions around W 33. The velocity separation is too large to be gravitationally bound, and yet not explained by expanding motion by stellar feedback. Therefore, we discuss whether a cloud-cloud collision scenario likely explains the high-mass star formation in W 33.
A metagenomic survey of microbes in honey bee colony collapse disorder.
Cox-Foster, Diana L; Conlan, Sean; Holmes, Edward C; Palacios, Gustavo; Evans, Jay D; Moran, Nancy A; Quan, Phenix-Lan; Briese, Thomas; Hornig, Mady; Geiser, David M; Martinson, Vince; vanEngelsdorp, Dennis; Kalkstein, Abby L; Drysdale, Andrew; Hui, Jeffrey; Zhai, Junhui; Cui, Liwang; Hutchison, Stephen K; Simons, Jan Fredrik; Egholm, Michael; Pettis, Jeffery S; Lipkin, W Ian
2007-10-12
In colony collapse disorder (CCD), honey bee colonies inexplicably lose their workers. CCD has resulted in a loss of 50 to 90% of colonies in beekeeping operations across the United States. The observation that irradiated combs from affected colonies can be repopulated with naive bees suggests that infection may contribute to CCD. We used an unbiased metagenomic approach to survey microflora in CCD hives, normal hives, and imported royal jelly. Candidate pathogens were screened for significance of association with CCD by the examination of samples collected from several sites over a period of 3 years. One organism, Israeli acute paralysis virus of bees, was strongly correlated with CCD.
Rapid System to Quantitatively Characterize the Airborne Microbial Community
NASA Technical Reports Server (NTRS)
Macnaughton, Sarah J.
1998-01-01
Bioaerosols have been linked to a wide range of different allergies and respiratory illnesses. Currently, microorganism culture is the most commonly used method for exposure assessment. Such culture techniques, however, generally fail to detect between 90-99% of the actual viable biomass. Consequently, an unbiased technique for detecting airborne microorganisms is essential. In this Phase II proposal, a portable air sampling device his been developed for the collection of airborne microbial biomass from indoor (and outdoor) environments. Methods were evaluated for extracting and identifying lipids that provide information on indoor air microbial biomass, and automation of these procedures was investigated. Also, techniques to automate the extraction of DNA were explored.
Population annealing with weighted averages: A Monte Carlo method for rough free-energy landscapes
NASA Astrophysics Data System (ADS)
Machta, J.
2010-08-01
The population annealing algorithm introduced by Hukushima and Iba is described. Population annealing combines simulated annealing and Boltzmann weighted differential reproduction within a population of replicas to sample equilibrium states. Population annealing gives direct access to the free energy. It is shown that unbiased measurements of observables can be obtained by weighted averages over many runs with weight factors related to the free-energy estimate from the run. Population annealing is well suited to parallelization and may be a useful alternative to parallel tempering for systems with rough free-energy landscapes such as spin glasses. The method is demonstrated for spin glasses.
Rare Event Simulation in Radiation Transport
NASA Astrophysics Data System (ADS)
Kollman, Craig
This dissertation studies methods for estimating extremely small probabilities by Monte Carlo simulation. Problems in radiation transport typically involve estimating very rare events or the expected value of a random variable which is with overwhelming probability equal to zero. These problems often have high dimensional state spaces and irregular geometries so that analytic solutions are not possible. Monte Carlo simulation must be used to estimate the radiation dosage being transported to a particular location. If the area is well shielded the probability of any one particular particle getting through is very small. Because of the large number of particles involved, even a tiny fraction penetrating the shield may represent an unacceptable level of radiation. It therefore becomes critical to be able to accurately estimate this extremely small probability. Importance sampling is a well known technique for improving the efficiency of rare event calculations. Here, a new set of probabilities is used in the simulation runs. The results are multiplied by the likelihood ratio between the true and simulated probabilities so as to keep our estimator unbiased. The variance of the resulting estimator is very sensitive to which new set of transition probabilities are chosen. It is shown that a zero variance estimator does exist, but that its computation requires exact knowledge of the solution. A simple random walk with an associated killing model for the scatter of neutrons is introduced. Large deviation results for optimal importance sampling in random walks are extended to the case where killing is present. An adaptive "learning" algorithm for implementing importance sampling is given for more general Markov chain models of neutron scatter. For finite state spaces this algorithm is shown to give, with probability one, a sequence of estimates converging exponentially fast to the true solution. In the final chapter, an attempt to generalize this algorithm to a continuous state space is made. This involves partitioning the space into a finite number of cells. There is a tradeoff between additional computation per iteration and variance reduction per iteration that arises in determining the optimal grid size. All versions of this algorithm can be thought of as a compromise between deterministic and Monte Carlo methods, capturing advantages of both techniques.
Using knowledge from human research to improve understanding of contest theory and contest dynamics
Denson, Thomas F.
2017-01-01
Our understanding of animal contests and the factors that affect contest dynamics and decisions stems from a long and prosperous collaboration between empiricists and theoreticians. Over the last two decades, however, theoretical predictions regarding the factors that affect individual decisions before, during and after a contest are becoming increasingly difficult to test empirically. Extremely large sample sizes are necessary to experimentally test the nuanced theoretical assumptions surrounding how information is used by animals during a contest, how context changes the information used, and how individuals change behaviour as a result of both the information available and the context in which the information is acquired. In this review, we discuss how the investigation of contests in humans through the collaboration of biologists and psychologists may advance contest theory and dynamics in general. We argue that a long and productive history exploring human behaviour and psychology combined with technological advancements provide a unique opportunity to manipulate human perception during contests and collect unbiased data, allowing more targeted examinations of particular aspects of contest theory (e.g. winner/loser effects, information use as a function of age). We hope that our perspective provides the impetus for many future collaborations between biologists and psychologists. PMID:29237857
Isonymy structure of four Venezuelan states.
Rodríguez-Larralde, A; Barrai, I; Alfonzo, J C
1993-01-01
The isonymy structure of four Venezuelan States-Falcón, Mérida, Nueva Esparta and Yaracuy-was studied using the surnames of the Venezuelan register of electors updated in 1984. The surname distributions of 155 counties were obtained and, for each county, estimates of consanguinity due to random isonymy and Fisher's alpha were calculated. It was shown that for large sample sizes the inverse of Fisher's alpha is identical to the unbiased estimate of within-population random isonymy. A three-dimensional isometric surface plot was obtained for each State, based on the counties' random isonymy estimates. The highest estimates of random consanguinity were found in the States of Nueva Esparta and Mérida, while the lowest were found in Yaracuy. Other microdifferentiation indicators from the same data gave similar results, and an interpretation was attempted, based on the particular economic and geographic characteristics of each State. Four different genetic distances between all possible pairs of counties were calculated within States; geographic distance shows the highest correlations with random isonymy and Euclidean distance, with the exception of the State of Nueva Esparta, where there is no correlation between geographic distance and random isonymy. It was possible to group counties in clusters, from dendrograms based on Euclidean distance. Isonymy clustering was also consistent with socioeconomic and geographic characteristics of the counties.
The Fecal Viral Flora of California Sea Lions▿†
Li, Linlin; Shan, Tongling; Wang, Chunlin; Côté, Colette; Kolman, John; Onions, David; Gulland, Frances M. D.; Delwart, Eric
2011-01-01
California sea lions are one of the major marine mammal species along the Pacific coast of North America. Sea lions are susceptible to a wide variety of viruses, some of which can be transmitted to or from terrestrial mammals. Using an unbiased viral metagenomic approach, we surveyed the fecal virome in California sea lions of different ages and health statuses. Averages of 1.6 and 2.5 distinct mammalian viral species were shed by pups and juvenile sea lions, respectively. Previously undescribed mammalian viruses from four RNA virus families (Astroviridae, Picornaviridae, Caliciviridae, and Reoviridae) and one DNA virus family (Parvoviridae) were characterized. The first complete or partial genomes of sapeloviruses, sapoviruses, noroviruses, and bocavirus in marine mammals are reported. Astroviruses and bocaviruses showed the highest prevalence and abundance in California sea lion feces. The diversity of bacteriophages was higher in unweaned sea lion pups than in juveniles and animals in rehabilitation, where the phage community consisted largely of phages related to the family Microviridae. This study increases our understanding of the viral diversity in marine mammals, highlights the high rate of enteric viral infections in these highly social carnivores, and may be used as a baseline viral survey for comparison with samples from California sea lions during unexplained disease outbreaks. PMID:21795334
Peng, Hui; Saunders, David M V; Sun, Jianxian; Jones, Paul D; Wong, Chris K C; Liu, Hongling; Giesy, John P
2016-12-06
Characterization of toxicological profiles by use of traditional targeted strategies might underestimate the risk of environmental mixtures. Unbiased identification of prioritized compounds provides a promising strategy for meeting regulatory needs. In this study, untargeted screening of brominated compounds in house dust was conducted using a data-independent precursor isolation and characteristic fragment (DIPIC-Frag) approach, which used data-independent acquisition (DIA) and a chemometric strategy to detect peaks and align precursor ions. A total of 1008 brominated compound peaks were identified in 23 house dust samples. Precursor ions and formulas were identified for 738 (73%) of the brominated compounds. A correlation matrix was used to cluster brominated compounds; three large groups were found for the 140 high-abundance brominated compounds, and only 24 (17%) of these compounds were previously known flame retardants. The predominant class of unknown brominated compounds was predicted to consist of nitrogen-containing compounds. Following further validation by authentic standards, these compounds (56%) were determined to be novel brominated azo dyes. The mutagenicity of one major component was investigated, and mutagenicity was observed at environmentally relevant concentrations. Results of this study demonstrated the existence of numerous unknown brominated compounds in house dust, with mutagenic azo dyes unexpectedly being identified as the predominant compounds.
Broad phylogenomic sampling and the sister lineage of land plants.
Timme, Ruth E; Bachvaroff, Tsvetan R; Delwiche, Charles F
2012-01-01
The tremendous diversity of land plants all descended from a single charophyte green alga that colonized the land somewhere between 430 and 470 million years ago. Six orders of charophyte green algae, in addition to embryophytes, comprise the Streptophyta s.l. Previous studies have focused on reconstructing the phylogeny of organisms tied to this key colonization event, but wildly conflicting results have sparked a contentious debate over which lineage gave rise to land plants. The dominant view has been that 'stoneworts,' or Charales, are the sister lineage, but an alternative hypothesis supports the Zygnematales (often referred to as "pond scum") as the sister lineage. In this paper, we provide a well-supported, 160-nuclear-gene phylogenomic analysis supporting the Zygnematales as the closest living relative to land plants. Our study makes two key contributions to the field: 1) the use of an unbiased method to collect a large set of orthologs from deeply diverging species and 2) the use of these data in determining the sister lineage to land plants. We anticipate this updated phylogeny not only will hugely impact lesson plans in introductory biology courses, but also will provide a solid phylogenetic tree for future green-lineage research, whether it be related to plants or green algae.
Perneczky, R; Drzezga, A; Diehl-Schmid, J; Schmid, G; Wohlschläger, A; Kars, S; Grimmer, T; Wagenpfeil, S; Monsch, A; Kurz, A
2006-09-01
Functional imaging studies report that higher education is associated with more severe pathology in patients with Alzheimer's disease, controlling for disease severity. Therefore, schooling seems to provide brain reserve against neurodegeneration. To provide further evidence for brain reserve in a large sample, using a sensitive technique for the indirect assessment of brain abnormality (18F-fluoro-deoxy-glucose-positron emission tomography (FDG-PET)), a comprehensive measure of global cognitive impairment to control for disease severity (total score of the Consortium to Establish a Registry for Alzheimer's Disease Neuropsychological Battery) and an approach unbiased by predefined regions of interest for the statistical analysis (statistical parametric mapping (SPM)). 93 patients with mild Alzheimer's disease and 16 healthy controls underwent 18F-FDG-PET imaging of the brain. A linear regression analysis with education as independent and glucose utilisation as dependent variables, adjusted for global cognitive status and demographic variables, was conducted in SPM2. The regression analysis showed a marked inverse association between years of schooling and glucose metabolism in the posterior temporo-occipital association cortex and the precuneus in the left hemisphere. In line with previous reports, the findings suggest that education is associated with brain reserve and that people with higher education can cope with brain damage for a longer time.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carlberg, R. G.; Grillmair, C. J., E-mail: carlberg@astro.utoronto.ca, E-mail: carl@ipac.caltech.edu
Measurements of velocity and density perturbations along stellar streams in the Milky Way provide a time-integrated measure of dark matter substructure at larger galactic radius than the complementary instantaneous inner-halo strong lensing detection of dark matter sub-halos in distant galaxies. An interesting case to consider is the proposed Phoenix–Hermus star stream, which is long, thin, and on a nearly circular orbit, making it a particular good target to study for velocity variations along its length. In the presence of dark matter sub-halos, the stream velocities are significantly perturbed in a manner that is readily understood with the impulse approximation. Amore » set of simulations shows that only sub-halos above a few 10{sup 7} M {sub ⊙} lead to reasonably long-lived observationally detectable velocity variations of amplitude of order 1 km s{sup −1}, with an average of about one visible hit per (two-armed) stream over a 3 Gyr interval. An implication is that globular clusters themselves will not have a visible impact on the stream. Radial velocities have the benefit of being completely insensitive to distance errors. Distance errors scatter individual star velocities perpendicular and tangential to the mean orbit, but their mean values remain unbiased. Calculations like these help build the quantitative case to acquire large, fairly deep, precision velocity samples of stream stars.« less
Analysis of the Free-Energy Surface of Proteins from Reversible Folding Simulations
Allen, Lucy R.; Krivov, Sergei V.; Paci, Emanuele
2009-01-01
Computer generated trajectories can, in principle, reveal the folding pathways of a protein at atomic resolution and possibly suggest general and simple rules for predicting the folded structure of a given sequence. While such reversible folding trajectories can only be determined ab initio using all-atom transferable force-fields for a few small proteins, they can be determined for a large number of proteins using coarse-grained and structure-based force-fields, in which a known folded structure is by construction the absolute energy and free-energy minimum. Here we use a model of the fast folding helical λ-repressor protein to generate trajectories in which native and non-native states are in equilibrium and transitions are accurately sampled. Yet, representation of the free-energy surface, which underlies the thermodynamic and dynamic properties of the protein model, from such a trajectory remains a challenge. Projections over one or a small number of arbitrarily chosen progress variables often hide the most important features of such surfaces. The results unequivocally show that an unprojected representation of the free-energy surface provides important and unbiased information and allows a simple and meaningful description of many-dimensional, heterogeneous trajectories, providing new insight into the possible mechanisms of fast-folding proteins. PMID:19593364
Simulations of the Far-infrared Sky
NASA Astrophysics Data System (ADS)
Andreani, P.; Lutz, D.; Poglitsch, A.; Genzel, R.
2001-07-01
One of the main tasks of FIRST is to carry out shallow and deep surveys in the far-IR / submm spectral domain with unprecedented sensitivity. Selecting unbiased samples out of deep surveys will be crucial to determine the history of evolving dusty objects, and therefore of star-formation. However, the usual procedures to extract information from a survey, i.e. selection of sources, computing the number counts, the luminosity and the correlation functions, and so on, cannot lead to a fully satisfactory and rigorous determination of the source characteristics. This is expecially true in the far-IR where source identification and redshift determination are difficult. To check the reliability of results the simulation of a large number of mock surveys is mandatory. This provides information on the observational biases and instrumental effects introduced by the observing procedures and allows one to understand how the different parameters affect the source observation and detection. The project we are undertaking consists of (1) simulating the far-IR/submm surveys as PACS (and SPIRE) will observe, (2) extracting from these complete mock catalogues, (3) for the foreseen photometric bands selecting high-z candidates in colour-colour diagrams, and (4) testing different observing strategies to assess observational biases and understand how the different parameters affect source observation and detection.
Analysis of the free-energy surface of proteins from reversible folding simulations.
Allen, Lucy R; Krivov, Sergei V; Paci, Emanuele
2009-07-01
Computer generated trajectories can, in principle, reveal the folding pathways of a protein at atomic resolution and possibly suggest general and simple rules for predicting the folded structure of a given sequence. While such reversible folding trajectories can only be determined ab initio using all-atom transferable force-fields for a few small proteins, they can be determined for a large number of proteins using coarse-grained and structure-based force-fields, in which a known folded structure is by construction the absolute energy and free-energy minimum. Here we use a model of the fast folding helical lambda-repressor protein to generate trajectories in which native and non-native states are in equilibrium and transitions are accurately sampled. Yet, representation of the free-energy surface, which underlies the thermodynamic and dynamic properties of the protein model, from such a trajectory remains a challenge. Projections over one or a small number of arbitrarily chosen progress variables often hide the most important features of such surfaces. The results unequivocally show that an unprojected representation of the free-energy surface provides important and unbiased information and allows a simple and meaningful description of many-dimensional, heterogeneous trajectories, providing new insight into the possible mechanisms of fast-folding proteins.
Comparison of Proteins in Whole Blood and Dried Blood Spot Samples by LC/MS/MS
NASA Astrophysics Data System (ADS)
Chambers, Andrew G.; Percy, Andrew J.; Hardie, Darryl B.; Borchers, Christoph H.
2013-09-01
Dried blood spot (DBS) sampling methods are desirable for population-wide biomarker screening programs because of their ease of collection, transportation, and storage. Immunoassays are traditionally used to quantify endogenous proteins in these samples but require a separate assay for each protein. Recently, targeted mass spectrometry (MS) has been proposed for generating highly-multiplexed assays for biomarker proteins in DBS samples. In this work, we report the first comparison of proteins in whole blood and DBS samples using an untargeted MS approach. The average number of proteins identified in undepleted whole blood and DBS samples by liquid chromatography (LC)/MS/MS was 223 and 253, respectively. Protein identification repeatability was between 77 %-92 % within replicates and the majority of these repeated proteins (70 %) were observed in both sample formats. Proteins exclusively identified in the liquid or dried fluid spot format were unbiased based on their molecular weight, isoelectric point, aliphatic index, and grand average hydrophobicity. In addition, we extended this comparison to include proteins in matching plasma and serum samples with their dried fluid spot equivalents, dried plasma spot (DPS), and dried serum spot (DSS). This work begins to define the accessibility of endogenous proteins in dried fluid spot samples for analysis by MS and is useful in evaluating the scope of this new approach.
Dinstein, Ilan; Haar, Shlomi; Atsmon, Shir; Schtaerman, Hen
2017-01-01
Large controversy exists regarding the potential existence and clinical significance of larger brain volumes in toddlers who later develop autism. Assessing this relationship is important for determining the clinical utility of early head circumference (HC) measures and for assessing the validity of the early overgrowth hypothesis of autism, which suggests that early accelerated brain development may be a hallmark of the disorder. We performed a retrospective comparison of HC, height, and weight measurements between 66 toddlers who were later diagnosed with autism and 66 matched controls. These toddlers represent an unbiased regional sample from a single health service provider in the southern district of Israel. On average, participating toddlers had >8 measurements between birth and the age of two, which enabled us to characterize individual HC, height, and weight development with high precision and fit a negative exponential growth model to the data of each toddler with exceptional accuracy. The analyses revealed that HC sizes and growth rates were not significantly larger in toddlers with autism even when stratifying the autism group based on verbal capabilities at the time of diagnosis. In addition, there were no significant correlations between ADOS scores at the time of diagnosis and HC at any time-point during the first 2 years of life. These negative results add to accumulating evidence, which suggest that brain volume is not necessarily larger in toddlers who develop autism. We believe that conflicting results reported in other studies are due to small sample sizes, use of misleading population norms, changes in the clinical definition of autism over time, and/or inclusion of individuals with syndromic autism. While abnormally large brains may be evident in some individuals with autism and more clearly visible in MRI scans, converging evidence from this and other studies suggests that enlarged HC is not a common etiology of the entire autism population. Early HC measures, therefore, offer very limited clinical utility for assessment of autism risk in the general population.
Mauro, Francisco; Monleon, Vicente J; Temesgen, Hailemariam; Ford, Kevin R
2017-01-01
Forest inventories require estimates and measures of uncertainty for subpopulations such as management units. These units often times hold a small sample size, so they should be regarded as small areas. When auxiliary information is available, different small area estimation methods have been proposed to obtain reliable estimates for small areas. Unit level empirical best linear unbiased predictors (EBLUP) based on plot or grid unit level models have been studied more thoroughly than area level EBLUPs, where the modelling occurs at the management unit scale. Area level EBLUPs do not require a precise plot positioning and allow the use of variable radius plots, thus reducing fieldwork costs. However, their performance has not been examined thoroughly. We compared unit level and area level EBLUPs, using LiDAR auxiliary information collected for inventorying 98,104 ha coastal coniferous forest. Unit level models were consistently more accurate than area level EBLUPs, and area level EBLUPs were consistently more accurate than field estimates except for large management units that held a large sample. For stand density, volume, basal area, quadratic mean diameter, mean height and Lorey's height, root mean squared errors (rmses) of estimates obtained using area level EBLUPs were, on average, 1.43, 2.83, 2.09, 1.40, 1.32 and 1.64 times larger than those based on unit level estimates, respectively. Similarly, direct field estimates had rmses that were, on average, 1.37, 1.45, 1.17, 1.17, 1.26, and 1.38 times larger than rmses of area level EBLUPs. Therefore, area level models can lead to substantial gains in accuracy compared to direct estimates, and unit level models lead to very important gains in accuracy compared to area level models, potentially justifying the additional costs of obtaining accurate field plot coordinates.
Monleon, Vicente J.; Temesgen, Hailemariam; Ford, Kevin R.
2017-01-01
Forest inventories require estimates and measures of uncertainty for subpopulations such as management units. These units often times hold a small sample size, so they should be regarded as small areas. When auxiliary information is available, different small area estimation methods have been proposed to obtain reliable estimates for small areas. Unit level empirical best linear unbiased predictors (EBLUP) based on plot or grid unit level models have been studied more thoroughly than area level EBLUPs, where the modelling occurs at the management unit scale. Area level EBLUPs do not require a precise plot positioning and allow the use of variable radius plots, thus reducing fieldwork costs. However, their performance has not been examined thoroughly. We compared unit level and area level EBLUPs, using LiDAR auxiliary information collected for inventorying 98,104 ha coastal coniferous forest. Unit level models were consistently more accurate than area level EBLUPs, and area level EBLUPs were consistently more accurate than field estimates except for large management units that held a large sample. For stand density, volume, basal area, quadratic mean diameter, mean height and Lorey’s height, root mean squared errors (rmses) of estimates obtained using area level EBLUPs were, on average, 1.43, 2.83, 2.09, 1.40, 1.32 and 1.64 times larger than those based on unit level estimates, respectively. Similarly, direct field estimates had rmses that were, on average, 1.37, 1.45, 1.17, 1.17, 1.26, and 1.38 times larger than rmses of area level EBLUPs. Therefore, area level models can lead to substantial gains in accuracy compared to direct estimates, and unit level models lead to very important gains in accuracy compared to area level models, potentially justifying the additional costs of obtaining accurate field plot coordinates. PMID:29216290
Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data
2011-01-01
Background With the advent of high-throughput targeted metabolic profiling techniques, the question of how to interpret and analyze the resulting vast amount of data becomes more and more important. In this work we address the reconstruction of metabolic reactions from cross-sectional metabolomics data, that is without the requirement for time-resolved measurements or specific system perturbations. Previous studies in this area mainly focused on Pearson correlation coefficients, which however are generally incapable of distinguishing between direct and indirect metabolic interactions. Results In our new approach we propose the application of a Gaussian graphical model (GGM), an undirected probabilistic graphical model estimating the conditional dependence between variables. GGMs are based on partial correlation coefficients, that is pairwise Pearson correlation coefficients conditioned against the correlation with all other metabolites. We first demonstrate the general validity of the method and its advantages over regular correlation networks with computer-simulated reaction systems. Then we estimate a GGM on data from a large human population cohort, covering 1020 fasting blood serum samples with 151 quantified metabolites. The GGM is much sparser than the correlation network, shows a modular structure with respect to metabolite classes, and is stable to the choice of samples in the data set. On the example of human fatty acid metabolism, we demonstrate for the first time that high partial correlation coefficients generally correspond to known metabolic reactions. This feature is evaluated both manually by investigating specific pairs of high-scoring metabolites, and then systematically on a literature-curated model of fatty acid synthesis and degradation. Our method detects many known reactions along with possibly novel pathway interactions, representing candidates for further experimental examination. Conclusions In summary, we demonstrate strong signatures of intracellular pathways in blood serum data, and provide a valuable tool for the unbiased reconstruction of metabolic reactions from large-scale metabolomics data sets. PMID:21281499