Sample records for unbiased variable selection

  1. Unbiased split variable selection for random survival forests using maximally selected rank statistics.

    PubMed

    Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas

    2017-04-15

    The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used. Copyright © 2017 John Wiley & Sons, Ltd.
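
    The idea in this record lends itself to a small illustration. The sketch below compares candidate split variables on the p-value scale, using a standardized Wilcoxon-type rank statistic maximized over split points with a crude Bonferroni bound; the actual method uses log-rank statistics for survival outcomes and several finer p-value approximations, so treat this as a toy under those simplifying assumptions, not the ranger implementation.

    ```python
    # Toy sketch: split-variable selection on the p-value scale via a
    # maximally selected rank statistic. Illustrative assumptions: a
    # Wilcoxon-type statistic (not the log-rank statistic used for
    # survival data) and a Bonferroni bound as the p-value approximation.
    import numpy as np
    from scipy.stats import norm, rankdata

    def max_rank_stat_pvalue(x, y):
        """Maximally selected standardized rank statistic for candidate x,
        with a Bonferroni-type p-value over all evaluated split points."""
        n = len(y)
        r = rankdata(y)                      # rank-transform the response
        splits = np.unique(x)[:-1]           # candidate cut points
        z_max = 0.0
        for c in splits:
            left = x <= c
            n1 = left.sum()
            s = r[left].sum()                # rank sum in the left node
            mu = n1 * (n + 1) / 2.0          # null mean of the rank sum
            sigma = np.sqrt(n1 * (n - n1) * (n + 1) / 12.0)
            z_max = max(z_max, abs(s - mu) / sigma)
        return min(1.0, len(splits) * 2 * norm.sf(z_max))

    rng = np.random.default_rng(1)
    y = rng.normal(size=200)                            # uninformative response
    x_binary = rng.integers(0, 2, 200).astype(float)    # 1 split point
    x_many = rng.normal(size=200)                       # ~199 split points
    # Comparing p-values (not raw maxima) keeps the many-split variable
    # from being preferred systematically when both are uninformative.
    print(max_rank_stat_pvalue(x_binary, y), max_rank_stat_pvalue(x_many, y))
    ```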

  2. Empirically Driven Variable Selection for the Estimation of Causal Effects with Observational Data

    ERIC Educational Resources Information Center

    Keller, Bryan; Chen, Jianshen

    2016-01-01

    Observational studies are common in educational research, where subjects self-select or are otherwise non-randomly assigned to different interventions (e.g., educational programs, grade retention, special education). Unbiased estimation of a causal effect with observational data depends crucially on the assumption of ignorability, which specifies…

  3. Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield

    Treesearch

    Robert B. Thomas

    1986-01-01

    SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes, which cannot provide unbiased estimates of variance. Sampling probabilities are based on a sediment rating function which promotes greater sampling intensity during periods of high...
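
    The unbiasedness claim rests on a standard variable-probability (PPS) estimator. Below is a minimal numerical sketch of that mechanism, with a made-up power-law rating function standing in for the sediment rating curve; it shows the Hansen-Hurwitz form of the estimator, not the SALT field protocol itself.

    ```python
    # Hedged sketch: sample periods with probability proportional to a
    # rating-function prediction, then estimate the total sediment yield
    # with the unbiased Hansen-Hurwitz form mean(y_i / p_i). The flow and
    # rating models below are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    flow = rng.lognormal(2.0, 0.6, size=1000)              # flow per period
    true_yield = 0.01 * flow**1.8 * rng.lognormal(0, 0.2, 1000)
    rating = 0.01 * flow**1.8           # predicted yield (rating function)

    p = rating / rating.sum()           # selection probabilities
    m = 50                              # number of sampled periods
    idx = rng.choice(1000, size=m, replace=True, p=p)
    estimate = np.mean(true_yield[idx] / p[idx])   # unbiased for the total

    print(f"true total yield : {true_yield.sum():.1f}")
    print(f"SALT-style est.  : {estimate:.1f}")
    ```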

  4. Bayesian whole-genome prediction and genome-wide association analysis with missing genotypes using variable selection

    USDA-ARS's Scientific Manuscript database

    Single-step Genomic Best Linear Unbiased Predictor (ssGBLUP) has become increasingly popular for whole-genome prediction (WGP) modeling as it utilizes any available pedigree and phenotypes on both genotyped and non-genotyped individuals. The WGP accuracy of ssGBLUP has been demonstrated to be greate...

  5. Definition and Measurement of Selection Bias: From Constant Ratio to Constant Difference

    ERIC Educational Resources Information Center

    Cahan, Sorel; Gamliel, Eyal

    2006-01-01

    Despite its intuitive appeal and popularity, Thorndike's constant ratio (CR) model for unbiased selection is inherently inconsistent in "n"-free selection. Satisfaction of the condition for unbiased selection, when formulated in terms of success/acceptance probabilities, usually precludes satisfaction by the converse probabilities of…

  6. tICA-Metadynamics: Accelerating Metadynamics by Using Kinetically Selected Collective Variables.

    PubMed

    M Sultan, Mohammad; Pande, Vijay S

    2017-06-13

    Metadynamics is a powerful enhanced molecular dynamics sampling method that accelerates simulations by adding history-dependent multidimensional Gaussians along selective collective variables (CVs). In practice, choosing a small number of slow CVs remains challenging due to the inherent high dimensionality of biophysical systems. Here we show that time-structure based independent component analysis (tICA), a recent advance in Markov state model literature, can be used to identify a set of variationally optimal slow coordinates for use as CVs for Metadynamics. We show that linear and nonlinear tICA-Metadynamics can complement existing MD studies by explicitly sampling the system's slowest modes and can drive transitions along those modes even when no such transitions are observed in unbiased simulations.

  7. Mutually unbiased projectors and duality between lines and bases in finite quantum systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shalaby, M.; Vourdas, A.

    2013-10-15

    Quantum systems with variables in the ring Z(d) are considered, and the concepts of weak mutually unbiased bases and mutually unbiased projectors are discussed. The lines through the origin in the Z(d)×Z(d) phase space are classified into maximal lines (sets of d points) and sublines (sets of d_i points where d_i divides d). The sublines are intersections of maximal lines. It is shown that there exists a duality between the properties of lines (resp., sublines) and the properties of weak mutually unbiased bases (resp., mutually unbiased projectors). Highlights: lines in discrete phase space; bases in finite quantum systems; duality between bases and lines; weak mutually unbiased bases.
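
    As an aside on the terminology, the standard (non-weak) notion of mutual unbiasedness is easy to check numerically: two orthonormal bases are mutually unbiased when every overlap has squared modulus 1/d. The snippet below verifies this for the textbook example of the position basis and its discrete Fourier transform; it is not the paper's weak mutually unbiased basis construction over Z(d) with composite d.

    ```python
    # Check |<e_i|f_j>|^2 = 1/d for the standard basis vs. the Fourier basis.
    import numpy as np

    d = 5
    F = np.exp(2j * np.pi * np.outer(np.arange(d), np.arange(d)) / d)
    F /= np.sqrt(d)                    # columns: Fourier basis vectors
    # overlap of standard basis vector e_i with Fourier vector f_j is F[i, j]
    assert np.allclose(np.abs(F) ** 2, 1 / d)
    ```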

  8. Epidemiologic Evaluation of Measurement Data in the Presence of Detection Limits

    PubMed Central

    Lubin, Jay H.; Colt, Joanne S.; Camann, David; Davis, Scott; Cerhan, James R.; Severson, Richard K.; Bernstein, Leslie; Hartge, Patricia

    2004-01-01

    Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of “fill-in” values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5–10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case–control study of non-Hodgkin lymphoma. PMID:15579415
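
    Two of the strategies compared in this record are easy to contrast numerically. The sketch below simulates a lower detection limit (DL), shows the attenuation from substituting DL/2, and fits a left-censored (Tobit-style) maximum likelihood model; the data-generating model, DL, and sample size are illustrative assumptions.

    ```python
    # Hedged sketch: DL/2 substitution vs. Tobit-style censored ML when the
    # dependent variable has a lower detection limit.
    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)
    n = 2000
    x = rng.normal(size=n)
    y = 1.0 + 0.8 * x + rng.normal(size=n)     # true slope 0.8
    DL = 0.5
    cens = y < DL                              # values below detection

    def neg_loglik(theta):
        b0, b1, log_s = theta
        s = np.exp(log_s)
        mu = b0 + b1 * x
        ll_obs = norm.logpdf(y[~cens], mu[~cens], s)       # detected values
        ll_cen = norm.logcdf((DL - mu[cens]) / s)          # P(Y < DL | x)
        return -(ll_obs.sum() + ll_cen.sum())

    tobit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], method="BFGS")
    y_half = np.where(cens, DL / 2, y)                     # naive fill-in
    print(f"censored fraction: {cens.mean():.2f}")
    print(f"DL/2 slope : {np.polyfit(x, y_half, 1)[0]:.3f}")   # attenuated
    print(f"Tobit slope: {tobit.x[1]:.3f}")                    # near 0.8
    ```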

  9. Mutually unbiased coarse-grained measurements of two or more phase-space variables

    NASA Astrophysics Data System (ADS)

    Paul, E. C.; Walborn, S. P.; Tasca, D. S.; Rudnicki, Łukasz

    2018-05-01

    Mutual unbiasedness of the eigenstates of phase-space operators—such as position and momentum, or their standard coarse-grained versions—exists only in the limiting case of infinite squeezing. In Phys. Rev. Lett. 120, 040403 (2018), 10.1103/PhysRevLett.120.040403, it was shown that mutual unbiasedness can be recovered for periodic coarse graining of these two operators. Here we investigate mutual unbiasedness of coarse-grained measurements for more than two phase-space variables. We show that mutual unbiasedness can be recovered between periodic coarse graining of any two nonparallel phase-space operators. We illustrate these results through optics experiments, using the fractional Fourier transform to prepare and measure mutually unbiased phase-space variables. The differences between two and three mutually unbiased measurements are discussed. Our results contribute to bridging the gap between continuous and discrete quantum mechanics, and they could be useful in quantum-information protocols.

  10. Hubby and Lewontin on Protein Variation in Natural Populations: When Molecular Genetics Came to the Rescue of Population Genetics.

    PubMed

    Charlesworth, Brian; Charlesworth, Deborah; Coyne, Jerry A; Langley, Charles H

    2016-08-01

    The 1966 GENETICS papers by John Hubby and Richard Lewontin were a landmark in the study of genome-wide levels of variability. They used the technique of gel electrophoresis of enzymes and proteins to study variation in natural populations of Drosophila pseudoobscura, at a set of loci that had been chosen purely for technical convenience, without prior knowledge of their levels of variability. Together with the independent study of human populations by Harry Harris, this seminal study provided the first relatively unbiased picture of the extent of genetic variability in protein sequences within populations, revealing that many genes had surprisingly high levels of diversity. These papers stimulated a large research program that found similarly high electrophoretic variability in many different species and led to statistical tools for interpreting the data in terms of population genetics processes such as genetic drift, balancing and purifying selection, and the effects of selection on linked variants. The current use of whole-genome sequences in studies of variation is the direct descendant of this pioneering work. Copyright © 2016 by the Genetics Society of America.

  11. A statistical test of unbiased evolution of body size in birds.

    PubMed

    Bokma, Folmer

    2002-12-01

    Of the approximately 9500 bird species, the vast majority is small-bodied. That is a general feature of evolutionary lineages, also observed for instance in mammals and plants. The avian interspecific body size distribution is right-skewed even on a logarithmic scale. That has previously been interpreted as evidence that body size evolution has been biased. However, a procedure to test for unbiased evolution from the shape of body size distributions was lacking. In the present paper unbiased body size evolution is defined precisely, and a statistical test is developed based on Monte Carlo simulation of unbiased evolution. Application of the test to birds suggests that it is highly unlikely that avian body size evolution has been unbiased as defined. Several possible explanations for this result are discussed. A plausible explanation is that the general model of unbiased evolution assumes that population size and generation time do not affect the evolutionary variability of body size; that is, that micro- and macroevolution are decoupled, which theory suggests is not likely to be the case.
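
    The testing logic described here can be sketched in a few lines: simulate driftless (unbiased) log-scale evolution along a random branching history many times, then ask how extreme the observed skewness is under that null. The birth process, parameter values, and the "observed" skewness below are placeholders, not the paper's calibrated model of avian body sizes.

    ```python
    # Hedged Monte Carlo sketch of a test for unbiased body-size evolution.
    import numpy as np
    from scipy.stats import skew

    rng = np.random.default_rng(7)

    def simulate_clade(n_species=200, sigma=0.05):
        sizes = np.zeros(1)                      # ancestral log body size
        while sizes.size < n_species:
            parent = rng.integers(sizes.size)    # random lineage speciates
            sizes = np.append(sizes, sizes[parent])
            sizes += rng.normal(0, sigma, sizes.size)  # unbiased steps
        return sizes

    observed_skew = 0.9                      # placeholder "observed" value
    sims = np.array([skew(simulate_clade()) for _ in range(100)])
    p_value = np.mean(sims >= observed_skew)  # one-sided Monte Carlo p
    print(f"null skew mean {sims.mean():.2f}; p-value {p_value:.2f}")
    ```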

  12. A Simple Joint Estimation Method of Residual Frequency Offset and Sampling Frequency Offset for DVB Systems

    NASA Astrophysics Data System (ADS)

    Kwon, Ki-Won; Cho, Yongsoo

    This letter presents a simple joint estimation method for residual frequency offset (RFO) and sampling frequency offset (SFO) in OFDM-based digital video broadcasting (DVB) systems. The proposed method selects a continual pilot (CP) subset from an unsymmetrically and non-uniformly distributed CP set to obtain an unbiased estimator. Simulation results show that the proposed method using a properly selected CP subset is unbiased and performs robustly.

  13. Retransformation bias in a stem profile model

    Treesearch

    Raymond L. Czaplewski; David Bruce

    1990-01-01

    An unbiased profile model, fit to diameter divided by diameter at breast height, overestimated volume of 5.3-m log sections by 0.5 to 3.5%. Another unbiased profile model, fit to squared diameter divided by squared diameter at breast height, underestimated bole diameters by 0.2 to 2.1%. These biases are caused by retransformation of the predicted dependent variable;...

  14. Simultaneous unbiased estimates of multiple downed wood attributes in perpendicular distance sampling

    Treesearch

    Mark J. Ducey; Jeffrey H. Gove; Harry T. Valentine

    2008-01-01

    Perpendicular distance sampling (PDS) is a fast probability-proportional-to-size method for inventory of downed wood. However, previous development of PDS had limited the method to estimating only one variable (such as volume per hectare, or surface area per hectare) at a time. Here, we develop a general design-unbiased estimator for PDS. We then show how that...

  15. Genome-wide association studies on HIV susceptibility, pathogenesis and pharmacogenomics

    PubMed Central

    2012-01-01

    Susceptibility to HIV-1 and the clinical course after infection show a substantial heterogeneity between individuals. Part of this variability can be attributed to host genetic variation. Initial candidate gene studies have revealed interesting host factors that influence HIV infection, replication and pathogenesis. Recently, genome-wide association studies (GWAS) were utilized for unbiased searches at a genome-wide level to discover novel genetic factors and pathways involved in HIV-1 infection. This review gives an overview of findings from the GWAS performed on HIV infection, within different cohorts, with variable patient and phenotype selection. Furthermore, novel techniques and strategies in research that might contribute to the complete understanding of virus-host interactions and its role on the pathogenesis of HIV infection are discussed. PMID:22920050

  16. Reply to "Comment on 'Mutually unbiased bases, orthogonal Latin squares, and hidden-variable models'"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paterek, Tomasz; Dakic, Borivoje; Brukner, Caslav

    In this Reply to the preceding Comment by Hall and Rao [Phys. Rev. A 83, 036101 (2011)], we motivate the terminology of our original paper and point out that further research is needed in order to (dis)prove the claimed link between every orthogonal Latin square of order being a power of a prime and a mutually unbiased basis.

  17. Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources.

    PubMed

    Yu, Sheng; Liao, Katherine P; Shaw, Stanley Y; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Cai, Tianxi

    2015-09-01

    Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. The authors applied this method to develop algorithms to identify patients with rheumatoid arthritis (RA), and coronary artery disease (CAD) cases among those with RA, from a large multi-institutional EHR. The areas under the receiver operating characteristic curve (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared with AUCs of 0.938 and 0.929 for models trained with expert-curated features. Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
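
    The final modeling step described here (a penalized logistic regression over automatically extracted features) is straightforward to sketch. Everything below is hypothetical: Poisson counts stand in for NLP concept and code frequencies, and the upstream pipeline (knowledge-source harvesting, NLP, feature screening) is omitted.

    ```python
    # Hedged sketch: lasso-penalized logistic regression selecting a sparse
    # set of phenotype-predictive features from count data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(12)
    n, p = 1000, 50
    X = rng.poisson(1.0, (n, p)).astype(float)   # concept/code counts
    beta = np.zeros(p)
    beta[:5] = [1.2, 0.8, -0.6, 0.5, 0.4]        # 5 truly informative features
    y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta - 2.0))))

    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(X, y)
    selected = np.flatnonzero(clf.coef_[0])      # features kept by the lasso
    print(f"{selected.size} features selected: {selected}")
    ```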

  18. Multivariate normal maximum likelihood with both ordinal and continuous variables, and data missing at random.

    PubMed

    Pritikin, Joshua N; Brick, Timothy R; Neale, Michael C

    2018-04-01

    A novel method for the maximum likelihood estimation of structural equation models (SEM) with both ordinal and continuous indicators is introduced using a flexible multivariate probit model for the ordinal indicators. A full information approach ensures unbiased estimates for data missing at random. Exceeding the capability of prior methods, up to 13 ordinal variables can be included before integration time increases beyond 1 s per row. The method relies on the axiom of conditional probability to split apart the distribution of continuous and ordinal variables. Due to the symmetry of the axiom, two similar methods are available. A simulation study provides evidence that the two similar approaches offer equal accuracy. A further simulation is used to develop a heuristic to automatically select the most computationally efficient approach. Joint ordinal continuous SEM is implemented in OpenMx, free and open-source software.

  19. A comparison of the weights-of-evidence method and probabilistic neural networks

    USGS Publications Warehouse

    Singer, Donald A.; Kouda, Ryoichi

    1999-01-01

    The need to integrate large quantities of digital geoscience information to classify locations as mineral deposits or nondeposits has been met by the weights-of-evidence method in many situations. Widespread selection of this method may be more the result of its ease of use and interpretation rather than comparisons with alternative methods. A comparison of the weights-of-evidence method to probabilistic neural networks is performed here with data from Chisel Lake-Anderson Lake, Manitoba, Canada. Each method is designed to estimate the probability of belonging to learned classes where the estimated probabilities are used to classify the unknowns. Using these data, significantly lower classification error rates were observed for the neural network, not only when test and training data were the same (0.02 versus 23%), but also when validation data, not used in any training, were used to test the efficiency of classification (0.7 versus 17%). Despite these data containing too few deposits, these tests of this set of data demonstrate the neural network's ability to make unbiased probability estimates and lower error rates when measured by number of polygons or by the area of land misclassified. For both methods, independent validation tests are required to ensure that estimates are representative of real-world results. Results from the weights-of-evidence method demonstrate a strong bias where most errors are barren areas misclassified as deposits. The weights-of-evidence method is based on Bayes rule, which requires independent variables in order to make unbiased estimates. The chi-square test for independence indicates no significant correlations among the variables in the Chisel Lake–Anderson Lake data. However, the expected number of deposits test clearly demonstrates that these data violate the independence assumption. Other, independent simulations with three variables show that using variables with correlations of 1.0 can double the expected number of deposits as can correlations of −1.0. Studies done in the 1970s on methods that use Bayes rule show that moderate correlations among attributes seriously affect estimates and even small correlations lead to increases in misclassifications. Adverse effects have been observed with small to moderate correlations when only six to eight variables were used. Consistent evidence of upward biased probability estimates from multivariate methods founded on Bayes rule must be of considerable concern to institutions and governmental agencies where unbiased estimates are required. In addition to increasing the misclassification rate, biased probability estimates make classification into deposit and nondeposit classes an arbitrary subjective decision. The probabilistic neural network has no problem dealing with correlated variables—its performance depends strongly on having a thoroughly representative training set. Probabilistic neural networks or logistic regression should receive serious consideration where unbiased estimates are required. The weights-of-evidence method would serve to estimate thresholds between anomalies and background and for exploratory data analysis.
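
    For readers unfamiliar with the probabilistic neural network used in this comparison, it is essentially a Parzen-window (kernel density) classifier, which is why correlated evidence layers pose it no structural problem. The sketch below is a generic minimal version with made-up two-dimensional "evidence" data; the kernel width and class geometry are assumptions.

    ```python
    # Minimal probabilistic neural network (Parzen-window classifier):
    # estimate class-conditional densities with Gaussian kernels and assign
    # each location to the class with the highest estimated posterior.
    import numpy as np

    def pnn_predict(X_train, y_train, X_new, sigma=0.3):
        classes = np.unique(y_train)
        scores = []
        for c in classes:
            Xc = X_train[y_train == c]
            d2 = ((X_new[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
            dens = np.exp(-d2 / (2 * sigma**2)).mean(axis=1)  # kernel density
            scores.append(np.mean(y_train == c) * dens)       # prior * density
        return classes[np.argmax(np.array(scores), axis=0)]

    rng = np.random.default_rng(5)
    X0 = rng.normal(0.0, 0.5, (100, 2))     # "barren" locations
    X1 = rng.normal(1.2, 0.5, (40, 2))      # "deposit" locations
    X = np.vstack([X0, X1])
    y = np.array([0] * 100 + [1] * 40)
    print(pnn_predict(X, y, np.array([[0.1, 0.0], [1.1, 1.2]])))
    ```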

  20. NMR analysis of seven selections of vermentino grape berry: metabolites composition and development.

    PubMed

    Mulas, Gilberto; Galaffu, Maria Grazia; Pretti, Luca; Nieddu, Gianni; Mercenaro, Luca; Tonelli, Roberto; Anedda, Roberto

    2011-02-09

    The goal of this work was to study via NMR the unaltered metabolic profile of Sardinian Vermentino grape berry. Seven selections of Vermentino were harvested from the same vineyard. Berries were stored and extracted following an unbiased extraction protocol. Extracts were analyzed to investigate variability in metabolite concentration as a function of the clone, the position of berries in the bunch or growing area within the vineyard. Quantitative NMR and statistical analysis (PCA, correlation analysis, ANOVA) of the experimental data point out that, among the investigated sources of variation, the position of the berries within the bunch mainly influences the metabolic profile of berries, while the metabolic profile does not seem to be significantly influenced by growing area and clone. Significant variability of the amino acids such as arginine, proline, and organic acids (malic and citric) characterizes the rapid rearrangements of the metabolic profile in response to environmental stimuli. Finally, an application is described on the analysis of metabolite variation throughout the physiological development of berries.

  1. Framework for making better predictions by directly estimating variables' predictivity.

    PubMed

    Lo, Adeline; Chernoff, Herman; Zheng, Tian; Lo, Shaw-Hwa

    2016-12-13

    We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity that enables assessing variable sets for, preferably high, predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, due to its inflated bias for moderate sample size and its sensitivity to noisy useless variables. We demonstrate that the I-score of the partition retention (PR) method of variable selection (VS) yields a relatively unbiased estimate of a parameter that is not sensitive to noisy variables and is a lower bound to the parameter of interest. Thus, the PR method using the I-score provides an effective approach to selecting highly predictive variables. We offer simulations and an application of the I-score on real data to demonstrate the statistic's predictive performance on sample data. We conjecture that using the partition retention and I-score can aid in finding variable sets with promising prediction rates; however, further research in the avenue of sample-based measures of predictivity is much desired.
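
    A sketch of the statistic this framework is built around may help. One commonly cited form of the partition-retention I-score partitions the sample into the cells defined by a set of discrete variables and aggregates squared deviations of cell means from the grand mean, weighted by squared cell sizes. Normalization conventions vary across papers, so the version below is an illustration under that assumed form, not the authors' reference implementation.

    ```python
    # Hedged sketch of an I-score: high for an informative variable set
    # (even with interaction-only, XOR-style signal), low for noise.
    import numpy as np
    from itertools import product

    def i_score(X_discrete, y):
        n = len(y)
        ybar, s2 = y.mean(), y.var()
        levels = [np.unique(col) for col in X_discrete.T]
        score = 0.0
        for cell in product(*levels):            # one cell per level combo
            mask = np.all(X_discrete == np.array(cell), axis=1)
            nj = mask.sum()
            if nj:
                score += nj**2 * (y[mask].mean() - ybar) ** 2
        return score / (n * s2)

    rng = np.random.default_rng(2)
    n = 500
    X = rng.integers(0, 2, (n, 3))                          # binary candidates
    y = 1.0 * (X[:, 0] ^ X[:, 1]) + rng.normal(0, 0.5, n)   # XOR signal
    print(i_score(X[:, :2], y))    # informative pair: large
    print(i_score(X[:, 2:], y))    # pure-noise variable: near zero
    ```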

  2. Estimation of the simple correlation coefficient.

    PubMed

    Shieh, Gwowen

    2010-11-01

    This article investigates some unfamiliar properties of the Pearson product-moment correlation coefficient for the estimation of the simple correlation coefficient. Although Pearson's r is biased, except for limited situations, and the minimum variance unbiased estimator has been proposed in the literature, researchers routinely employ the sample correlation coefficient in their practical applications, because of its simplicity and popularity. In order to support such practice, this study examines the mean squared errors of r and several prominent formulas. The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating recommendation of r as an effect size index for the strength of linear association between two variables. In addition, related issues of estimating the squared simple correlation coefficient are also considered.
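
    As a numerical aside, the minimum variance unbiased estimator alluded to here is usually attributed to Olkin and Pratt; one commonly quoted closed form is G(r) = r * 2F1(1/2, 1/2; (n-2)/2; 1 - r^2). The comparison below of r against G(r) on simulated bivariate normal samples is an illustration under that assumed form, not the article's own computation.

    ```python
    # Hedged sketch: bias of Pearson's r vs. the Olkin-Pratt form (assumed).
    import numpy as np
    from scipy.special import hyp2f1

    def olkin_pratt(r, n):
        return r * hyp2f1(0.5, 0.5, (n - 2) / 2.0, 1 - r**2)

    rng = np.random.default_rng(13)
    rho, n, reps = 0.5, 15, 20_000
    cov = [[1.0, rho], [rho, 1.0]]
    rs = np.array([
        np.corrcoef(rng.multivariate_normal([0, 0], cov, n).T)[0, 1]
        for _ in range(reps)
    ])
    print(f"mean r           : {rs.mean():.4f}  (true rho = {rho})")
    print(f"mean Olkin-Pratt : {olkin_pratt(rs, n).mean():.4f}")
    ```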

  3. Epidemiologic research using probabilistic outcome definitions.

    PubMed

    Cai, Bing; Hennessy, Sean; Lo Re, Vincent; Small, Dylan S

    2015-01-01

    Epidemiologic studies using electronic healthcare data often define the presence or absence of binary clinical outcomes by using algorithms with imperfect specificity, sensitivity, and positive predictive value. This results in misclassification and bias in study results. We describe and evaluate a new method called probabilistic outcome definition (POD) that uses logistic regression to estimate the probability of a clinical outcome using multiple potential algorithms and then uses multiple imputation to make valid inferences about the risk ratio or other epidemiologic parameters of interest. We conducted a simulation to evaluate the performance of the POD method with two variables that can predict the true outcome and compared the POD method with the conventional method. The simulation results showed that when the true risk ratio is equal to 1.0 (null), the conventional method based on a binary outcome provides unbiased estimates. However, when the risk ratio is not equal to 1.0, the traditional method, either using one predictive variable or both predictive variables to define the outcome, is biased when the positive predictive value is <100%, and the bias is very severe when the sensitivity or positive predictive value is poor (less than 0.75 in our simulation). In contrast, the POD method provides unbiased estimates of the risk ratio both when this measure of effect is equal to 1.0 and not equal to 1.0. Even when the sensitivity and positive predictive value are low, the POD method continues to provide unbiased estimates of the risk ratio. The POD method provides an improved way to define outcomes in database research. This method has a major advantage over the conventional method in that it provides unbiased estimates of risk ratios and is easy to use. Copyright © 2014 John Wiley & Sons, Ltd.

  4. Regression calibration for models with two predictor variables measured with error and their interaction, using instrumental variables and longitudinal data.

    PubMed

    Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan

    2014-02-10

    Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome) in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.
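
    The core regression-calibration move is simple to demonstrate in a cross-sectional toy, even though this record's contribution is the longitudinal, interaction-term machinery. Below, an unbiased surrogate is regressed on an instrument and the fitted values replace the unobserved exposure; all variables and parameters are made up, and the mixed-model and interaction corrections from the paper are omitted.

    ```python
    # Hedged sketch of regression calibration with an instrumental variable.
    import numpy as np

    rng = np.random.default_rng(11)
    n = 5000
    x_true = rng.normal(size=n)                  # unobserved exposure
    z = x_true + rng.normal(scale=0.5, size=n)   # instrumental variable
    w = x_true + rng.normal(scale=0.8, size=n)   # unbiased surrogate of x
    y = 2.0 + 1.5 * x_true + rng.normal(size=n)  # outcome, true slope 1.5

    naive = np.polyfit(w, y, 1)[0]               # attenuated by error in w

    b1, b0 = np.polyfit(z, w, 1)                 # calibration model E[x|z]
    x_hat = b0 + b1 * z
    calibrated = np.polyfit(x_hat, y, 1)[0]      # slope on calibrated exposure
    print(f"naive {naive:.2f}  calibrated {calibrated:.2f}  (true 1.5)")
    ```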

  5. Phenotyping: Using Machine Learning for Improved Pairwise Genotype Classification Based on Root Traits

    PubMed Central

    Zhao, Jiangsan; Bodner, Gernot; Rewald, Boris

    2016-01-01

    Phenotyping local crop cultivars is becoming more and more important, as they are an important genetic source for breeding – especially in regard to inherent root system architectures. Machine learning algorithms are promising tools to assist in the analysis of complex data sets; novel approaches are needed to apply them to root phenotyping data of mature plants. A greenhouse experiment was conducted in large, sand-filled columns to differentiate 16 European Pisum sativum cultivars based on 36 manually derived root traits. Through combining random forest and support vector machine models, machine learning algorithms were successfully used for unbiased identification of most distinguishing root traits and subsequent pairwise cultivar differentiation. Up to 86% of pea cultivar pairs could be distinguished based on the top five important root traits (Timp5) – Timp5 differed widely between cultivar pairs. Selecting top important root traits (Timp) provided a significantly improved classification compared to using all available traits or randomly selected trait sets. The most frequent Timp of mature pea cultivars was total surface area of lateral roots originating from tap root segments at 0–5 cm depth. The high classification rate implies that culturing did not lead to a major loss of variability in root system architecture in the studied pea cultivars. Our results illustrate the potential of machine learning approaches for unbiased (root) trait selection and cultivar classification based on rather small, complex phenotypic data sets derived from pot experiments. Powerful statistical approaches are essential to make use of the increasing amount of (root) phenotyping information, integrating the complex trait sets describing crop cultivars. PMID:27999587

  6. Framework for making better predictions by directly estimating variables’ predictivity

    PubMed Central

    Chernoff, Herman; Lo, Shaw-Hwa

    2016-01-01

    We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity that enables assessing variable sets for, preferably high, predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, due to its inflated bias for moderate sample size and its sensitivity to noisy useless variables. We demonstrate that the I-score of the partition retention (PR) method of variable selection (VS) yields a relatively unbiased estimate of a parameter that is not sensitive to noisy variables and is a lower bound to the parameter of interest. Thus, the PR method using the I-score provides an effective approach to selecting highly predictive variables. We offer simulations and an application of the I-score on real data to demonstrate the statistic’s predictive performance on sample data. We conjecture that using the partition retention and I-score can aid in finding variable sets with promising prediction rates; however, further research in the avenue of sample-based measures of predictivity is much desired. PMID:27911830

  7. Confounding, causality, and confusion: the role of intermediate variables in interpreting observational studies in obstetrics.

    PubMed

    Ananth, Cande V; Schisterman, Enrique F

    2017-08-01

    Prospective and retrospective cohorts and case-control studies are some of the most important study designs in epidemiology because, under certain assumptions, they can mimic a randomized trial when done well. These assumptions include, but are not limited to, properly accounting for 2 important sources of bias: confounding and selection bias. While not adjusting the causal association for an intermediate variable will yield an unbiased estimate of the exposure-outcome's total causal effect, obstetricians will often want to adjust for an intermediate variable to assess if the intermediate is the underlying driver of the association. Such a practice must be weighed in light of the underlying research question, and whether such an adjustment is necessary should be carefully considered. Gestational age is, by far, the most commonly encountered variable in obstetrics that is often mislabeled as a confounder when, in fact, it may be an intermediate. If, indeed, gestational age is an intermediate but if mistakenly labeled as a confounding variable and consequently adjusted in an analysis, the conclusions can be unexpected. The implications of this overadjustment of an intermediate as though it were a confounder can render an otherwise persuasive study downright meaningless. This commentary provides an exposition of confounding bias, collider stratification, and selection biases, with applications in obstetrics and perinatal epidemiology. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.

    PubMed

    Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio

    Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  9. Application of multivariate statistics to vestibular testing: discriminating between Meniere's disease and migraine associated dizziness

    NASA Technical Reports Server (NTRS)

    Dimitri, P. S.; Wall, C. 3rd; Oas, J. G.; Rauch, S. D.

    2001-01-01

    Meniere's disease (MD) and migraine associated dizziness (MAD) are two disorders that can have similar symptomatologies, but differ vastly in treatment. Vestibular testing is sometimes used to help differentiate between these disorders, but the inefficiency of a human interpreter analyzing a multitude of variables independently decreases its utility. Our hypothesis was that we could objectively discriminate between patients with MD and those with MAD using select variables from the vestibular test battery. Sinusoidal harmonic acceleration test variables were reduced to three vestibulo-ocular reflex physiologic parameters: gain, time constant, and asymmetry. A combination of these parameters plus a measurement of reduced vestibular response from caloric testing allowed us to achieve a joint classification rate of 91% using an independent quadratic classification algorithm. Data from posturography were not useful for this type of differentiation. Overall, our classification function can be used as an unbiased assistant to discriminate between MD and MAD and gave us insight into the pathophysiologic differences between the two disorders.

  10. A Stereological Method for the Quantitative Evaluation of Cartilage Repair Tissue

    PubMed Central

    Foldager, Casper Bindzus; Nyengaard, Jens Randel; Lind, Martin; Spector, Myron

    2015-01-01

    Objective: To implement stereological principles to develop an easily applicable algorithm for unbiased and quantitative evaluation of cartilage repair. Design: Design-unbiased sampling was performed by systematically sectioning the defect perpendicular to the joint surface in parallel planes providing 7 to 10 hematoxylin–eosin stained histological sections. Counting windows were systematically selected and converted into image files (40-50 per defect). The quantification was performed by two-step point counting: (1) calculation of defect volume and (2) quantitative analysis of tissue composition. Step 2 was performed by assigning each point to one of the following categories based on validated and easily distinguishable morphological characteristics: (1) hyaline cartilage (rounded cells in lacunae in hyaline matrix), (2) fibrocartilage (rounded cells in lacunae in fibrous matrix), (3) fibrous tissue (elongated cells in fibrous tissue), (4) bone, (5) scaffold material, and (6) others. The ability to discriminate between the tissue types was determined using conventional or polarized light microscopy, and the interobserver variability was evaluated. Results: We describe the application of the stereological method. In the example, we assessed the defect repair tissue volume to be 4.4 mm³ (CE = 0.01). The tissue fractions were subsequently evaluated. Polarized light illumination of the slides improved discrimination between hyaline cartilage and fibrocartilage and increased the interobserver agreement compared with conventional transmitted light. Conclusion: We have applied a design-unbiased method for quantitative evaluation of cartilage repair, and we propose this algorithm as a natural supplement to existing descriptive semiquantitative scoring systems. We also propose that polarized light is effective for discrimination between hyaline cartilage and fibrocartilage. PMID:26069715
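
    The two-step point counting described above reduces to simple arithmetic, sketched below with made-up counts and grid geometry (chosen so that step 1 reproduces the 4.4 mm³ volume reported in the record); the actual section spacing, grid, and counts in the paper may differ.

    ```python
    # Hedged sketch of two-step stereological point counting.
    section_spacing_mm = 0.4          # assumed distance between sections
    area_per_point_mm2 = 0.05         # assumed grid area per point

    # step 1 (Cavalieri): points hitting the defect, per section
    points_on_defect = [30, 41, 38, 35, 28, 25, 23]
    volume_mm3 = sum(points_on_defect) * area_per_point_mm2 * section_spacing_mm

    # step 2: apportion the volume by classified-point fractions (made up)
    category_counts = {
        "hyaline cartilage": 88, "fibrocartilage": 64, "fibrous tissue": 40,
        "bone": 18, "scaffold material": 6, "other": 4,
    }
    total = sum(category_counts.values())
    for tissue, count in category_counts.items():
        frac = count / total
        print(f"{tissue:18s} {frac:6.1%}  ({frac * volume_mm3:.2f} mm³)")
    print(f"total repair tissue volume: {volume_mm3:.1f} mm³")
    ```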

  11. A Stereological Method for the Quantitative Evaluation of Cartilage Repair Tissue.

    PubMed

    Foldager, Casper Bindzus; Nyengaard, Jens Randel; Lind, Martin; Spector, Myron

    2015-04-01

    To implement stereological principles to develop an easily applicable algorithm for unbiased and quantitative evaluation of cartilage repair. Design-unbiased sampling was performed by systematically sectioning the defect perpendicular to the joint surface in parallel planes providing 7 to 10 hematoxylin-eosin stained histological sections. Counting windows were systematically selected and converted into image files (40-50 per defect). The quantification was performed by two-step point counting: (1) calculation of defect volume and (2) quantitative analysis of tissue composition. Step 2 was performed by assigning each point to one of the following categories based on validated and easily distinguishable morphological characteristics: (1) hyaline cartilage (rounded cells in lacunae in hyaline matrix), (2) fibrocartilage (rounded cells in lacunae in fibrous matrix), (3) fibrous tissue (elongated cells in fibrous tissue), (4) bone, (5) scaffold material, and (6) others. The ability to discriminate between the tissue types was determined using conventional or polarized light microscopy, and the interobserver variability was evaluated. We describe the application of the stereological method. In the example, we assessed the defect repair tissue volume to be 4.4 mm³ (CE = 0.01). The tissue fractions were subsequently evaluated. Polarized light illumination of the slides improved discrimination between hyaline cartilage and fibrocartilage and increased the interobserver agreement compared with conventional transmitted light. We have applied a design-unbiased method for quantitative evaluation of cartilage repair, and we propose this algorithm as a natural supplement to existing descriptive semiquantitative scoring systems. We also propose that polarized light is effective for discrimination between hyaline cartilage and fibrocartilage.

  12. Smoothed Biasing Forces Yield Unbiased Free Energies with the Extended-System Adaptive Biasing Force Method

    PubMed Central

    2016-01-01

    We report a theoretical description and numerical tests of the extended-system adaptive biasing force method (eABF), together with an unbiased estimator of the free energy surface from eABF dynamics. Whereas the original ABF approach uses its running estimate of the free energy gradient as the adaptive biasing force, eABF is built on the idea that the exact free energy gradient is not necessary for efficient exploration, and that it is still possible to recover the exact free energy separately with an appropriate estimator. eABF does not directly bias the collective coordinates of interest, but rather fictitious variables that are harmonically coupled to them; therefore it does not require second derivative estimates, making it easily applicable to a wider range of problems than ABF. Furthermore, the extended variables present a smoother, coarse-grain-like sampling problem on a mollified free energy surface, leading to faster exploration and convergence. We also introduce CZAR, a simple, unbiased free energy estimator from eABF trajectories. eABF/CZAR converges to the physical free energy surface faster than standard ABF for a wide range of parameters. PMID:27959559
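
    The CZAR estimator mentioned here has a compact form: the free-energy gradient is recovered from the biased run as F'(z) = -kT d/dz ln rho(z) + k(<lambda>_z - z), combining the observed distribution of the collective variable z with the conditional mean of the coupled extended variable lambda. The sketch below applies that formula to synthetic samples with simple histogram binning; the data are a stand-in, not real eABF output.

    ```python
    # Hedged sketch of a CZAR-style free-energy gradient estimate from
    # samples of the collective variable z and the extended variable lambda.
    import numpy as np

    def czar_gradient(z, lam, k_spring, kT, bins=50):
        edges = np.histogram_bin_edges(z, bins=bins)
        centers = 0.5 * (edges[:-1] + edges[1:])
        counts, _ = np.histogram(z, bins=edges)
        which = np.digitize(z, edges[1:-1])          # bin index per sample
        lam_mean = np.array([lam[which == b].mean() if (which == b).any()
                             else np.nan for b in range(bins)])
        log_rho = np.log(np.where(counts > 0, counts, np.nan))
        dlog_rho = np.gradient(log_rho, centers)     # d/dz ln rho(z)
        return centers, -kT * dlog_rho + k_spring * (lam_mean - centers)

    # synthetic stand-in for an eABF trajectory (not real MD output)
    rng = np.random.default_rng(4)
    z = rng.normal(0.0, 1.0, 100_000)
    lam = z + rng.normal(0.0, 0.3, z.size)           # coupled extended variable
    centers, grad = czar_gradient(z, lam, k_spring=10.0, kT=2.5)
    print(grad[:5])
    ```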

  13. Comparative modeling and benchmarking data sets for human histone deacetylases and sirtuin families.

    PubMed

    Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

    2015-02-23

    Histone deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases, and other types of diseases. Virtual screening (VS) has become a fairly effective approach for drug discovery of novel and highly selective histone deacetylase inhibitors (HDACIs). To facilitate the process, we constructed maximal unbiased benchmarking data sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs cover all four classes including Class III (Sirtuins family) and 14 HDAC isoforms, composed of 631 inhibitors and 24609 unbiased decoys. Its ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matching with ligands and maximal unbiased in terms of "artificial enrichment" and "analogue bias". We also conducted comparative studies with DUD-E and DEKOIS 2.0 sets against HDAC2 and HDAC8 targets and demonstrate that our MUBD-HDACs are unique in that they can be applied unbiasedly to both LBVS and SBVS approaches. In addition, we defined a novel metric, i.e. NLBScore, to detect the "2D bias" and "LBVS favorable" effect within the benchmarking sets. In summary, MUBD-HDACs are the only comprehensive and maximal-unbiased benchmark data sets for HDACs (including Sirtuins) that are available so far. MUBD-HDACs are freely available at http://www.xswlab.org/.

  14. Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables

    PubMed Central

    Rosenblum, Michael; van der Laan, Mark J.

    2010-01-01

    Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
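
    The special case highlighted at the end of this abstract can be checked directly by simulation: in a randomized trial with a binary outcome, the treatment coefficient from a main-terms Poisson working model tracks the marginal log rate ratio even though the working model is wrong. Variable names and the data-generating process below are illustrative.

    ```python
    # Hedged sketch: misspecified main-terms Poisson working model vs. the
    # direct marginal arm contrast in a simulated randomized trial.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 20_000
    treat = rng.integers(0, 2, n)                   # randomized assignment
    age = rng.normal(size=n)                        # baseline covariate
    p = 1 / (1 + np.exp(-(-1.0 + 0.7 * treat + 0.9 * age)))  # logistic truth
    y = rng.binomial(1, p)

    X = sm.add_constant(np.column_stack([treat, age]))
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

    log_rr_direct = np.log(y[treat == 1].mean() / y[treat == 0].mean())
    print(f"Poisson working-model coefficient: {fit.params[1]:.3f}")
    print(f"direct marginal log rate ratio   : {log_rr_direct:.3f}")
    ```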

  15. Sampling for Patient Exit Interviews: Assessment of Methods Using Mathematical Derivation and Computer Simulations.

    PubMed

    Geldsetzer, Pascal; Fink, Günther; Vaikath, Maria; Bärnighausen, Till

    2018-02-01

    (1) To evaluate the operational efficiency of various sampling methods for patient exit interviews; (2) to discuss under what circumstances each method yields an unbiased sample; and (3) to propose a new, operationally efficient, and unbiased sampling method. Literature review, mathematical derivation, and Monte Carlo simulations. Our simulations show that in patient exit interviews it is most operationally efficient if the interviewer, after completing an interview, selects the next patient exiting the clinical consultation. We demonstrate mathematically that this method yields a biased sample: patients who spend a longer time with the clinician are overrepresented. This bias can be removed by selecting the next patient who enters, rather than exits, the consultation room. We show that this sampling method is operationally more efficient than alternative methods (systematic and simple random sampling) in most primary health care settings. Under the assumption that the order in which patients enter the consultation room is unrelated to the length of time spent with the clinician and the interviewer, selecting the next patient entering the consultation room tends to be the operationally most efficient unbiased sampling method for patient exit interviews. © 2016 The Authors. Health Services Research published by Wiley Periodicals, Inc. on behalf of Health Research and Educational Trust.
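
    The length-bias argument here is the classic inspection paradox and is easy to reproduce. The simulation below assumes one continuously busy clinician with back-to-back consultations and a fixed interview length; "next to exit" samples the patient occupying the room when the interviewer frees up (duration-biased), while "next to enter" samples durations independent of the selection times.

    ```python
    # Hedged sketch: next-to-exit vs. next-to-enter patient sampling.
    import numpy as np

    rng = np.random.default_rng(9)
    n = 200_000
    consult = rng.exponential(10.0, n)       # consultation lengths (minutes)
    end = np.cumsum(consult)                 # back-to-back, one clinician
    start = end - consult
    interview_len = 8.0

    def mean_sampled(next_exiting):
        t, sampled = 0.0, []
        while True:
            if next_exiting:                 # patient in the room exits next
                i = np.searchsorted(end, t, side="right")
            else:                            # next patient to enter after t
                i = np.searchsorted(start, t, side="left")
            if i >= n:
                return np.mean(sampled)
            sampled.append(consult[i])
            t = end[i] + interview_len       # interview runs after their exit

    print(f"true mean consult  : {consult.mean():.2f}")
    print(f"next-to-exit mean  : {mean_sampled(True):.2f}")   # biased upward
    print(f"next-to-enter mean : {mean_sampled(False):.2f}")  # ~unbiased
    ```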

  16. Comparative Modeling and Benchmarking Data Sets for Human Histone Deacetylases and Sirtuin Families

    PubMed Central

    Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

    2015-01-01

    Histone Deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases and other types of diseases. Virtual screening (VS) has become a fairly effective approach for drug discovery of novel and highly selective Histone Deacetylases Inhibitors (HDACIs). To facilitate the process, we constructed the Maximal Unbiased Benchmarking Data Sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs covers all 4 Classes including Class III (Sirtuins family) and 14 HDACs isoforms, composed of 631 inhibitors and 24,609 unbiased decoys. Its ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matching with ligands and maximal unbiased in terms of “artificial enrichment” and “analogue bias”. We also conducted comparative studies with DUD-E and DEKOIS 2.0 sets against HDAC2 and HDAC8 targets, and demonstrate that our MUBD-HDACs is unique in that it can be applied unbiasedly to both LBVS and SBVS approaches. In addition, we defined a novel metric, i.e. NLBScore, to detect the “2D bias” and “LBVS favorable” effect within the benchmarking sets. In summary, MUBD-HDACs is the only comprehensive and maximal-unbiased benchmark data sets for HDACs (including Sirtuins) that is available so far. MUBD-HDACs is freely available at http://www.xswlab.org/. PMID:25633490

  17. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat.

    PubMed

    Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C

    2014-06-01

    Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection.
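
    The baseline method in this record, RR-BLUP, amounts to ridge regression on a marker matrix. The sketch below solves the standard normal equations and hints at the W-BLUP idea by shrinking a few "known functional" markers less; the penalty values, marker coding, and simulated data are illustrative assumptions, not the paper's exact formulation.

    ```python
    # Hedged sketch: RR-BLUP marker effects u = (Z'Z + diag(lam))^-1 Z'y,
    # with a W-BLUP-flavored per-marker penalty (illustrative only).
    import numpy as np

    rng = np.random.default_rng(6)
    n_lines, n_markers = 135, 1000
    Z = rng.integers(0, 3, (n_lines, n_markers)).astype(float) - 1.0  # -1/0/1
    true_u = np.zeros(n_markers)
    qtl = rng.choice(n_markers, 20, replace=False)
    true_u[qtl] = rng.normal(0, 0.5, 20)
    y = Z @ true_u + rng.normal(0, 1.0, n_lines)

    lam = np.full(n_markers, 50.0)   # common ridge penalty
    lam[qtl[:2]] = 1.0               # shrink two "known functional" markers less
    u_hat = np.linalg.solve(Z.T @ Z + np.diag(lam), Z.T @ y)

    candidates = rng.integers(0, 3, (10, n_markers)).astype(float) - 1.0
    gebv = candidates @ u_hat        # genomic estimated breeding values
    print(np.round(gebv, 2))
    ```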

  18. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat

    PubMed Central

    Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C

    2014-01-01

    Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection. PMID:24518889

  19. Apparatus bias and place conditioning with ethanol in mice.

    PubMed

    Cunningham, Christopher L; Ferree, Nikole K; Howard, MacKenzie A

    2003-12-01

    Although the distinction between "biased" and "unbiased" is generally recognized as an important methodological issue in place conditioning, previous studies have not adequately addressed the distinction between a biased/unbiased apparatus and a biased/unbiased stimulus assignment procedure. Moreover, a review of the recent literature indicates that many reports (70% of 76 papers published in 2001) fail to provide adequate information about apparatus bias. This issue is important because the mechanisms underlying a drug's effect in the place-conditioning procedure may differ depending on whether the apparatus is biased or unbiased. The present studies were designed to assess the impact of apparatus bias and stimulus assignment procedure on ethanol-induced place conditioning in mice (DBA/2J). A secondary goal was to compare various dependent variables commonly used to index conditioned place preference. Apparatus bias was manipulated by varying the combination of tactile (floor) cues available during preference tests. Experiment 1 used an unbiased apparatus in which the stimulus alternatives were equally preferred during a pre-test as indicated by the group average. Experiment 2 used a biased apparatus in which one of the stimuli was strongly preferred by most mice (mean % time on cue = 67%) during the pre-test. In both studies, the stimulus paired with drug (CS+) was assigned randomly (i.e., an "unbiased" stimulus assignment procedure). Experimental mice received four pairings of CS+ with ethanol (2 g/kg, i.p.) and four pairings of the alternative stimulus (CS-) with saline; control mice received saline on both types of trial. Each experiment concluded with a 60-min choice test. With the unbiased apparatus (experiment 1), significant place conditioning was obtained regardless of whether drug was paired with the subject's initially preferred or non-preferred stimulus. However, with the biased apparatus (experiment 2), place conditioning was apparent only when ethanol was paired with the initially non-preferred cue, and not when it was paired with the initially preferred cue. These conclusions held regardless of which dependent variable was used to index place conditioning, but only if the counterbalancing factor was included in statistical analyses. These studies indicate that apparatus bias plays a major role in determining whether biased assignment of an ethanol-paired stimulus affects the ability to demonstrate conditioned place preference. Ethanol's ability to produce conditioned place preference in an unbiased apparatus, regardless of the direction of the initial cue bias, supports previous studies that interpret such findings as evidence of a primary rewarding drug effect. Moreover, these studies suggest that the asymmetrical outcome observed in the biased apparatus is most likely due to a measurement problem (e.g., ceiling effect) rather than to an interaction between the drug's effect and an unconditioned motivational response (e.g., "anxiety") to the initially non-preferred stimulus. More generally, these findings illustrate the importance of providing clear information on apparatus bias in all place-conditioning studies.

  20. Motor activity as an unbiased variable to assess anaphylaxis in allergic rats

    PubMed Central

    Abril-Gil, Mar; Garcia-Just, Alba; Cambras, Trinitat; Pérez-Cano, Francisco J; Castellote, Cristina; Franch, Àngels

    2015-01-01

    The release of mediators by mast cells triggers allergic symptoms involving various physiological systems and, in the most severe cases, the development of anaphylactic shock compromising mainly the nervous and cardiovascular systems. We aimed to establish variables to objectively study the anaphylactic response (AR) after an oral challenge in an allergy model. Brown Norway rats were immunized by intraperitoneal injection of ovalbumin with alum and toxin from Bordetella pertussis. Specific immunoglobulin (Ig) E antibodies were developed in immunized animals. Forty days after immunization, the rats were orally challenged with the allergen, and motor activity, body temperature and serum mast cell protease concentration were determined. The anaphylaxis induced a reduction in body temperature and a decrease in the number of animal movements, which was inversely correlated with serum mast cell protease release. In summary, motor activity is a reliable tool for assessing AR and also an unbiased method for screening new anti-allergic drugs. PMID:25716015

  1. Unbiased estimation in seamless phase II/III trials with unequal treatment effect variances and hypothesis-driven selection rules.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2016-09-30

    Seamless phase II/III clinical trials offer an efficient way to select an experimental treatment and perform confirmatory analysis within a single trial. However, combining the data from both stages in the final analysis can induce bias into the estimates of treatment effects. Methods for bias adjustment developed thus far have made restrictive assumptions about the design and selection rules followed. In order to address these shortcomings, we apply recent methodological advances to derive the uniformly minimum variance conditionally unbiased estimator for two-stage seamless phase II/III trials. Our framework allows for the precision of the treatment arm estimates to take arbitrary values, can be utilised for all treatments that are taken forward to phase III and is applicable when the decision to select or drop treatment arms is driven by a multiplicity-adjusted hypothesis testing procedure. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  2. Using small area estimation and Lidar-derived variables for multivariate prediction of forest attributes

    Treesearch

    F. Mauro; Vicente Monleon; H. Temesgen

    2015-01-01

    Small area estimation (SAE) techniques have been successfully applied in forest inventories to provide reliable estimates for domains where the sample size is small (i.e. small areas). Previous studies have explored the use of either Area Level or Unit Level Empirical Best Linear Unbiased Predictors (EBLUPs) in a univariate framework, modeling each variable of interest...

  3. Combating Unmeasured Confounding in Cross-Sectional Studies: Evaluating Instrumental-Variable and Heckman Selection Models

    PubMed Central

    DeMaris, Alfred

    2014-01-01

    Unmeasured confounding is the principal threat to unbiased estimation of treatment “effects” (i.e., regression parameters for binary regressors) in nonexperimental research. It refers to unmeasured characteristics of individuals that lead them both to be in a particular “treatment” category and to register higher or lower values than others on a response variable. In this article, I introduce readers to 2 econometric techniques designed to control the problem, with a particular emphasis on the Heckman selection model (HSM). Both techniques can be used with only cross-sectional data. Using a Monte Carlo experiment, I compare the performance of instrumental-variable regression (IVR) and HSM to that of ordinary least squares (OLS) under conditions with treatment and unmeasured confounding both present and absent. I find HSM generally to outperform IVR with respect to mean-square-error of treatment estimates, as well as power for detecting either a treatment effect or unobserved confounding. However, both HSM and IVR require a large sample to be fully effective. The use of HSM and IVR in tandem with OLS to untangle unobserved confounding bias in cross-sectional data is further demonstrated with an empirical application. Using data from the 2006–2010 General Social Survey (National Opinion Research Center, 2014), I examine the association between being married and subjective well-being. PMID:25110904
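
    A minimal simulation in the spirit of the article's Monte Carlo comparison shows the OLS bias under unmeasured confounding and its removal by two-stage least squares (the IVR estimator); the Heckman selection model is not sketched here, and all variable names and parameter values are illustrative.

        import numpy as np

        rng = np.random.default_rng(1)
        n = 100_000
        u = rng.normal(size=n)                 # unmeasured confounder
        z = rng.normal(size=n)                 # instrument: affects t, not y directly
        t = (0.8 * z + u + rng.normal(size=n) > 0).astype(float)   # binary "treatment"
        y = 1.0 * t + 2.0 * u + rng.normal(size=n)                 # true effect = 1.0

        X = np.column_stack([np.ones(n), t])
        ols = np.linalg.lstsq(X, y, rcond=None)[0][1]

        # 2SLS: first regress t on z, then regress y on the fitted values of t
        Zm = np.column_stack([np.ones(n), z])
        t_hat = Zm @ np.linalg.lstsq(Zm, t, rcond=None)[0]
        Xh = np.column_stack([np.ones(n), t_hat])
        iv = np.linalg.lstsq(Xh, y, rcond=None)[0][1]

        print(f"OLS: {ols:.2f} (biased upward), IV: {iv:.2f} (near the true 1.0)")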

  4. Rasch scaling paranormal belief and experience: structure and semantics of Thalbourne's Australian Sheep-Goat Scale.

    PubMed

    Lange, Rense; Thalbourne, Michael A

    2002-12-01

    Research on the relation between demographic variables and paranormal belief remains controversial given the possible semantic distortions introduced by item and test level biases. We illustrate how Rasch scaling can be used to detect such biases and to quantify their effects, using the Australian Sheep-Goat Scale as a substantive example. Based on data from 1,822 respondents, this test was Rasch scalable, reliable, and unbiased at the test level. Consistent with other research in which unbiased measures of paranormal belief were used, extremely weak age and sex effects were found (partial eta2 = .005 and .012, respectively).

  5. Common Variable Immunodeficiency Non-Infectious Disease Endotypes Redefined Using Unbiased Network Clustering in Large Electronic Datasets.

    PubMed

    Farmer, Jocelyn R; Ong, Mei-Sing; Barmettler, Sara; Yonker, Lael M; Fuleihan, Ramsay; Sullivan, Kathleen E; Cunningham-Rundles, Charlotte; Walter, Jolan E

    2017-01-01

    Common variable immunodeficiency (CVID) is increasingly recognized for its association with autoimmune and inflammatory complications. Despite recent advances in immunophenotypic and genetic discovery, clinical care of CVID remains limited by our inability to accurately model risk for non-infectious disease development. Herein, we demonstrate the utility of unbiased network clustering as a novel method to analyze inter-relationships between non-infectious disease outcomes in CVID using databases at the United States Immunodeficiency Network (USIDNET), the centralized immunodeficiency registry of the United States, and Partners, a tertiary care network in Boston, MA, USA, with a shared electronic medical record amenable to natural language processing. Immunophenotypes were comparable in terms of native antibody deficiencies, low titer response to pneumococcus, and B cell maturation arrest. However, recorded non-infectious disease outcomes were more substantial in the Partners cohort across the spectrum of lymphoproliferation, cytopenias, autoimmunity, atopy, and malignancy. Using unbiased network clustering to analyze 34 non-infectious disease outcomes in the Partners cohort, we further identified unique patterns of lymphoproliferative (two clusters), autoimmune (two clusters), and atopic (one cluster) disease that were defined as CVID non-infectious endotypes according to discrete and non-overlapping immunophenotypes. Markers were both previously described {high serum IgE in the atopic cluster [odds ratio (OR) 6.5] and low class-switched memory B cells in the total lymphoproliferative cluster (OR 9.2)} and novel [low serum C3 in the total lymphoproliferative cluster (OR 5.1)]. Mortality risk in the Partners cohort was significantly associated with individual non-infectious disease outcomes as well as lymphoproliferative cluster 2, specifically (OR 5.9). In contrast, unbiased network clustering failed to associate known comorbidities in the adult USIDNET cohort. Together, these data suggest that unbiased network clustering can be used in CVID to redefine non-infectious disease inter-relationships; however, applicability may be limited to datasets well annotated through mechanisms such as natural language processing. The lymphoproliferative, autoimmune, and atopic Partners CVID endotypes herein described can be used moving forward to streamline genetic and biomarker discovery and to facilitate early screening and intervention in CVID patients at highest risk for autoimmune and inflammatory progression.

  6. Unbiased Estimation of Refractive State of Aberrated Eyes

    PubMed Central

    Martin, Jesson; Vasudevan, Balamurali; Himebaugh, Nikole; Bradley, Arthur; Thibos, Larry

    2011-01-01

    To identify unbiased methods for estimating the target vergence required to maximize visual acuity based on wavefront aberration measurements. Experiments were designed to minimize the impact of confounding factors that have hampered previous research. Objective wavefront refractions and subjective acuity refractions were obtained for the same monochromatic wavelength. Accommodation and pupil fluctuations were eliminated by cycloplegia. Unbiased subjective refractions that maximize visual acuity for high contrast letters were performed with a computer-controlled forced-choice staircase procedure, using 0.125-diopter steps of defocus. All experiments were performed for two pupil diameters (3 mm and 6 mm). As reported in the literature, subjective refractive error does not change appreciably when the pupil dilates. For 3 mm pupils most metrics yielded objective refractions that were about 0.1 D more hyperopic than subjective acuity refractions. When pupil diameter increased to 6 mm, this bias changed in the myopic direction and the variability between metrics also increased. These inaccuracies were small compared to the precision of the measurements, which implies that most metrics provided unbiased estimates of refractive state for medium and large pupils. A variety of image quality metrics may be used to determine ocular refractive state for monochromatic (635 nm) light, thereby achieving accurate results without the need for empirical correction factors. PMID:21777601

  7. Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use

    USGS Publications Warehouse

    Arthur, Steve M.; Schwartz, Charles C.

    1999-01-01

    We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km2 (mean = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km2 (mean = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km2 (mean = 224) for radiotracking data and 16-130 km2 (mean = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area <1%/additional location) and precise (CV < 50%). Although the radiotracking data appeared unbiased, except for the relationship between area and sample size, these data failed to indicate some areas that likely were important to bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples. Investigators that use home range estimates in statistical tests should consider the effects of variability of those estimates. Use of GPS-equipped collars can facilitate obtaining larger samples of unbiased data and improve accuracy and precision of home range estimates.
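
    The asymptotic growth of MCP estimates with sample size is easy to reproduce: compute the convex-hull area of increasingly large random subsets of locations. A minimal sketch with simulated GPS fixes (all values hypothetical):

        import numpy as np
        from scipy.spatial import ConvexHull

        rng = np.random.default_rng(2)
        locs = rng.normal(0.0, 5.0, size=(400, 2))   # simulated fixes (km east/north)

        # MCP area rises asymptotically as more locations are added
        for n in (15, 60, 120, 400):
            sub = locs[rng.choice(400, size=n, replace=False)]
            # for 2-D points, ConvexHull.volume is the enclosed area
            print(n, round(ConvexHull(sub).volume, 1))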

  8. On Measuring and Reducing Selection Bias with a Quasi-Doubly Randomized Preference Trial

    ERIC Educational Resources Information Center

    Joyce, Ted; Remler, Dahlia K.; Jaeger, David A.; Altindag, Onur; O'Connell, Stephen D.; Crockett, Sean

    2017-01-01

    Randomized experiments provide unbiased estimates of treatment effects, but are costly and time consuming. We demonstrate how a randomized experiment can be leveraged to measure selection bias by conducting a subsequent observational study that is identical in every way except that subjects choose their treatment--a quasi-doubly randomized…

  9. Human systems immunology: hypothesis-based modeling and unbiased data-driven approaches.

    PubMed

    Arazi, Arnon; Pendergraft, William F; Ribeiro, Ruy M; Perelson, Alan S; Hacohen, Nir

    2013-10-31

    Systems immunology is an emerging paradigm that aims at a more systematic and quantitative understanding of the immune system. Two major approaches have been utilized to date in this field: unbiased data-driven modeling to comprehensively identify molecular and cellular components of a system and their interactions; and hypothesis-based quantitative modeling to understand the operating principles of a system by extracting a minimal set of variables and rules underlying them. In this review, we describe applications of the two approaches to the study of viral infections and autoimmune diseases in humans, and discuss possible ways by which these two approaches can synergize when applied to human immunology. Copyright © 2012 Elsevier Ltd. All rights reserved.

  10. Unbiased feature selection in learning random forests for high-dimensional data.

    PubMed

    Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi

    2015-01-01

    Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.
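
    The first stage of such an approach - removing uninformative features by univariate p-value assessment before growing the forest - can be sketched as below; this is a simplified stand-in for the published xRF procedure (its statistical measures, feature partitioning, and weighted sampling steps are omitted).

        import numpy as np
        from scipy.stats import f_oneway
        from sklearn.ensemble import RandomForestClassifier

        def screen_then_forest(X, y, alpha=1e-4, **rf_kwargs):
            """Keep features whose one-way ANOVA p-value (feature vs class)
            is below alpha, then fit a random forest on the retained subset."""
            keep = [j for j in range(X.shape[1])
                    if f_oneway(*[X[y == c, j] for c in np.unique(y)]).pvalue < alpha]
            rf = RandomForestClassifier(**rf_kwargs).fit(X[:, keep], y)
            return rf, np.array(keep)

        rng = np.random.default_rng(3)
        X = rng.normal(size=(300, 1000))            # high-dimensional noise
        X[:, 0] += np.repeat([0.0, 1.5], 150)       # one informative feature
        y = np.repeat([0, 1], 150)
        rf, kept = screen_then_forest(X, y, n_estimators=200, random_state=0)
        print(len(kept), 0 in kept)   # few features survive; feature 0 is among them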

  11. Motor activity as an unbiased variable to assess anaphylaxis in allergic rats.

    PubMed

    Abril-Gil, Mar; Garcia-Just, Alba; Cambras, Trinitat; Pérez-Cano, Francisco J; Castellote, Cristina; Franch, Àngels; Castell, Margarida

    2015-10-01

    The release of mediators by mast cells triggers allergic symptoms involving various physiological systems and, in the most severe cases, the development of anaphylactic shock compromising mainly the nervous and cardiovascular systems. We aimed to establish variables to objectively study the anaphylactic response (AR) after an oral challenge in an allergy model. Brown Norway rats were immunized by intraperitoneal injection of ovalbumin with alum and toxin from Bordetella pertussis. Specific immunoglobulin (Ig) E antibodies were developed in immunized animals. Forty days after immunization, the rats were orally challenged with the allergen, and motor activity, body temperature and serum mast cell protease concentration were determined. The anaphylaxis induced a reduction in body temperature and a decrease in the number of animal movements, which was inversely correlated with serum mast cell protease release. In summary, motor activity is a reliable tool for assessing AR and also an unbiased method for screening new anti-allergic drugs. © 2015 by the Society for Experimental Biology and Medicine.

  12. Estimation of genetic parameters and response to selection for a continuous trait subject to culling before testing.

    PubMed

    Arnason, T; Albertsdóttir, E; Fikse, W F; Eriksson, S; Sigurdsson, A

    2012-02-01

    The consequences of assuming a zero environmental covariance between a binary trait 'test-status' and a continuous trait on the estimates of genetic parameters by restricted maximum likelihood and Gibbs sampling, and on response from genetic selection when the true environmental covariance deviates from zero, were studied. Data were simulated for two traits (one that culling was based on and a continuous trait) using the following true parameters on the underlying scale: h² = 0.4; r(A) = 0.5; r(E) = 0.5, 0.0 or -0.5. The selection on the continuous trait was applied to five subsequent generations where 25 sires and 500 dams produced 1500 offspring per generation. Mass selection was applied in the analysis of the effect on estimation of genetic parameters. Estimated breeding values were used in the study of the effect of genetic selection on response and accuracy. The culling frequency was either 0.5 or 0.8 within each generation. Each of 10 replicates included 7500 records on 'test-status' and 9600 animals in the pedigree file. Results from bivariate analysis showed unbiased estimates of variance components and genetic parameters when the true r(E) = 0.0. For r(E) = 0.5, variance components (13-19% bias) and especially the genetic correlation (50-80%) were underestimated for the continuous trait, while heritability estimates were unbiased. For r(E) = -0.5, heritability estimates of test-status were unbiased, while the genetic variance and heritability of the continuous trait, together with the genetic correlation, were overestimated (25-50%). The bias was larger for the higher culling frequency. Culling always reduced genetic progress from selection, but the genetic progress was found to be robust to the use of wrong parameter values for the true environmental correlation between test-status and the continuous trait. Use of a bivariate linear-linear model reduced bias in genetic evaluations when data were subject to culling. © 2011 Blackwell Verlag GmbH.

  13. Optical Variability and Classification of High Redshift (3.5 < z < 5.5) Quasars on SDSS Stripe 82

    NASA Astrophysics Data System (ADS)

    AlSayyad, Yusra; McGreer, Ian D.; Fan, Xiaohui; Connolly, Andrew J.; Ivezic, Zeljko; Becker, Andrew C.

    2015-01-01

    Recent studies have shown promise in combining optical colors with variability to efficiently select and estimate the redshifts of low- to mid-redshift quasars in upcoming ground-based time-domain surveys. We extend these studies to fainter and less abundant high-redshift quasars using light curves from 235 sq. deg. and 10 years of Stripe 82 imaging reprocessed with the prototype LSST data management stack. Sources are detected on the i-band co-adds (5σ: i ~ 24) but measured on the single-epoch (ugriz) images, generating complete and unbiased lightcurves for sources fainter than the single-epoch detection threshold. Using these forced photometry lightcurves, we explore optical variability characteristics of high redshift quasars and validate classification methods with particular attention to the low signal limit. In this low SNR limit, we quantify the degradation of the uncertainties and biases on variability parameters using simulated light curves. Completeness/efficiency and redshift accuracy are verified with new spectroscopic observations on the MMT and APO 3.5m. These preliminary results are part of a survey to measure the z~4 luminosity function for quasars (i < 23) on Stripe 82 and to validate purely photometric classification techniques for high redshift quasars in LSST.

  14. Optimized probability sampling of study sites to improve generalizability in a multisite intervention trial.

    PubMed

    Kraschnewski, Jennifer L; Keyserling, Thomas C; Bangdiwala, Shrikant I; Gizlice, Ziya; Garcia, Beverly A; Johnston, Larry F; Gustafson, Alison; Petrovic, Lindsay; Glasgow, Russell E; Samuel-Hodge, Carmen D

    2010-01-01

    Studies of type 2 translation, the adaption of evidence-based interventions to real-world settings, should include representative study sites and staff to improve external validity. Sites for such studies are, however, often selected by convenience sampling, which limits generalizability. We used an optimized probability sampling protocol to select an unbiased, representative sample of study sites to prepare for a randomized trial of a weight loss intervention. We invited North Carolina health departments within 200 miles of the research center to participate (N = 81). Of the 43 health departments that were eligible, 30 were interested in participating. To select a representative and feasible sample of 6 health departments that met inclusion criteria, we generated all combinations of 6 from the 30 health departments that were eligible and interested. From the subset of combinations that met inclusion criteria, we selected 1 at random. Of 593,775 possible combinations of 6 counties, 15,177 (3%) met inclusion criteria. Sites in the selected subset were similar to all eligible sites in terms of health department characteristics and county demographics. Optimized probability sampling improved generalizability by ensuring an unbiased and representative sample of study sites.
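
    The protocol reduces to enumerating all C(30, 6) = 593,775 candidate subsets, filtering them by the inclusion criteria, and drawing one eligible subset at random. A sketch with invented site attributes and an illustrative inclusion rule (the study's actual criteria are not reproduced here):

        import numpy as np
        from itertools import combinations

        rng = np.random.default_rng(9)
        # Hypothetical attributes per site: (population served in 1000s, rural flag)
        sites = [(int(rng.integers(20, 300)), int(rng.integers(0, 2)))
                 for _ in range(30)]

        def meets_criteria(combo):
            """Illustrative rule: at least two rural sites and a combined
            population between 400k and 1000k."""
            pop = sum(sites[i][0] for i in combo)
            rural = sum(sites[i][1] for i in combo)
            return rural >= 2 and 400 <= pop <= 1000

        eligible = [c for c in combinations(range(30), 6) if meets_criteria(c)]
        chosen = eligible[rng.integers(len(eligible))]   # one eligible subset at random
        print(len(eligible), chosen)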

  15. Phenotype-Driven Therapeutics in Severe Asthma.

    PubMed

    Opina, Maria Theresa D; Moore, Wendy C

    2017-02-01

    Inhaled corticosteroids are the mainstay of asthma treatment using a step-up approach with incremental dosing and additional controller medications in order to achieve symptom control and prevent exacerbations. While most patients respond well to this treatment approach, some patients remain refractory despite high doses of inhaled corticosteroids and a long-acting β-agonist. The problem lies in the heterogeneity of severe asthma, which is further supported by the emergence of severe asthma phenotypes. This heterogeneity contributes to the variability in treatment response. Randomized controlled trials involving add-on therapies in poorly controlled asthma have challenged the idea of a "one size fits all" approach targeting specific phenotypes in their subject selection. This review discusses severe asthma phenotypes from unbiased clustering approaches and the most recent scientific evidence on novel treatments to provide a guide in personalizing severe asthma treatment.

  16. The behaviour of random forest permutation-based variable importance measures under predictor correlation.

    PubMed

    Nicodemus, Kristin K; Malley, James D; Strobl, Carolin; Ziegler, Andreas

    2010-02-27

    Random forests (RF) have been increasingly used in applications such as genome-wide association and microarray studies where predictor correlation is frequently observed. Recent works on permutation-based variable importance measures (VIMs) used in RF have come to apparently contradictory conclusions. We present an extended simulation study to synthesize results. In the case when both predictor correlation was present and predictors were associated with the outcome (HA), the unconditional RF VIM attributed a higher share of importance to correlated predictors, while under the null hypothesis that no predictors are associated with the outcome (H0) the unconditional RF VIM was unbiased. Conditional VIMs showed a decrease in VIM values for correlated predictors versus the unconditional VIMs under HA and was unbiased under H0. Scaled VIMs were clearly biased under HA and H0. Unconditional unscaled VIMs are a computationally tractable choice for large datasets and are unbiased under the null hypothesis. Whether the observed increased VIMs for correlated predictors may be considered a "bias" - because they do not directly reflect the coefficients in the generating model - or if it is a beneficial attribute of these VIMs is dependent on the application. For example, in genetic association studies, where correlation between markers may help to localize the functionally relevant variant, the increased importance of correlated predictors may be an advantage. On the other hand, we show examples where this increased importance may result in spurious signals.
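
    The unconditional permutation VIM is simple to sketch: permute one predictor at a time and record the average drop in model accuracy. With a predictor that is correlated with a causal variable but has no effect of its own, the inflation described above can be reproduced; the conditional VIM is not implemented here, and all settings are illustrative.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def permutation_vim(model, X, y, n_rep=10, seed=0):
            """Unconditional permutation importance: mean drop in R^2
            when a single predictor is permuted."""
            rng = np.random.default_rng(seed)
            base = model.score(X, y)
            vim = np.zeros(X.shape[1])
            for j in range(X.shape[1]):
                for _ in range(n_rep):
                    Xp = X.copy()
                    Xp[:, j] = rng.permutation(Xp[:, j])
                    vim[j] += (base - model.score(Xp, y)) / n_rep
            return vim

        rng = np.random.default_rng(4)
        n = 500
        x1 = rng.normal(size=n)
        x2 = 0.9 * x1 + np.sqrt(1 - 0.81) * rng.normal(size=n)  # correlated, non-causal
        x3 = rng.normal(size=n)                                 # independent, causal
        X = np.column_stack([x1, x2, x3])
        y = x1 + x3 + rng.normal(size=n)
        rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)
        print(permutation_vim(rf, X, y).round(2))   # x2 inherits importance from x1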

  17. Self Assessment and Student-Centred Learning

    ERIC Educational Resources Information Center

    McDonald, Betty

    2012-01-01

    This paper seeks to show how self assessment facilitates student-centred learning (SCL) and fills a gap in the literature. Two groups of students were selected from a single class in a tertiary educational institution. The control group of 25 was selected randomly by the tossing of an unbiased coin (heads = control group). They were trained in the…

  18. Machine learning shows association between genetic variability in PPARG and cerebral connectivity in preterm infants

    PubMed Central

    Krishnan, Michelle L.; Wang, Zi; Aljabar, Paul; Ball, Gareth; Mirza, Ghazala; Saxena, Alka; Counsell, Serena J.; Hajnal, Joseph V.; Montana, Giovanni

    2017-01-01

    Preterm infants show abnormal structural and functional brain development, and have a high risk of long-term neurocognitive problems. The molecular and cellular mechanisms involved are poorly understood, but novel methods now make it possible to address them by examining the relationship between common genetic variability and brain endophenotype. We addressed the hypothesis that variability in the Peroxisome Proliferator Activated Receptor (PPAR) pathway would be related to brain development. We employed machine learning in an unsupervised, unbiased, combined analysis of whole-brain diffusion tractography together with genomewide, single-nucleotide polymorphism (SNP)-based genotypes from a cohort of 272 preterm infants, using Sparse Reduced Rank Regression (sRRR) and correcting for ethnicity and age at birth and imaging. Empirical selection frequencies for SNPs associated with cerebral connectivity ranged from 0.663 to zero, with multiple highly selected SNPs mapping to genes for PPARG (six SNPs), ITGA6 (four SNPs), and FXR1 (two SNPs). SNPs in PPARG were significantly overrepresented (ranked 7–11 and 67 of 556,000 SNPs; P < 2.2 × 10−7), and were mostly in introns or regulatory regions with predicted effects including protein coding and nonsense-mediated decay. Edge-centric graph-theoretic analysis showed that highly selected white-matter tracts were consistent across the group and important for information transfer (P < 2.2 × 10−17); they most often connected to the insula (P < 6 × 10−17). These results suggest that the inhibited brain development seen in humans exposed to the stress of a premature extrauterine environment is modulated by genetic factors, and that PPARG signaling has a previously unrecognized role in cerebral development. PMID:29229843

  19. Maximal Unbiased Benchmarking Data Sets for Human Chemokine Receptors and Comparative Analysis.

    PubMed

    Xia, Jie; Reid, Terry-Elinor; Wu, Song; Zhang, Liangren; Wang, Xiang Simon

    2018-05-29

    Chemokine receptors (CRs) have long been druggable targets for the treatment of inflammatory diseases and HIV-1 infection. As a powerful technique, virtual screening (VS) has been widely applied to identifying small molecule leads for modern drug targets including CRs. For rational selection of a wide variety of VS approaches, ligand enrichment assessment based on a benchmarking data set has become an indispensable practice. However, the lack of versatile benchmarking sets for the whole CRs family that are able to unbiasedly evaluate every single approach including both structure- and ligand-based VS somewhat hinders modern drug discovery efforts. To address this issue, we constructed Maximal Unbiased Benchmarking Data sets for human Chemokine Receptors (MUBD-hCRs) using our recently developed tools of MUBD-DecoyMaker. The MUBD-hCRs encompasses 13 subtypes out of 20 chemokine receptors, composed of 404 ligands and 15756 decoys so far and is readily expandable in the future. It had been thoroughly validated that MUBD-hCRs ligands are chemically diverse while its decoys are maximal unbiased in terms of "artificial enrichment", "analogue bias". In addition, we studied the performance of MUBD-hCRs, in particular CXCR4 and CCR5 data sets, in ligand enrichment assessments of both structure- and ligand-based VS approaches in comparison with other benchmarking data sets available in the public domain and demonstrated that MUBD-hCRs is very capable of designating the optimal VS approach. MUBD-hCRs is a unique and maximal unbiased benchmarking set that covers major CRs subtypes so far.

  20. A comparison of selection at list time and time-stratified sampling for estimating suspended sediment loads

    Treesearch

    Robert B. Thomas; Jack Lewis

    1993-01-01

    Time-stratified sampling of sediment for estimating suspended load is introduced and compared to selection at list time (SALT) sampling. Both methods provide unbiased estimates of load and variance. The magnitude of the variance of the two methods is compared using five storm populations of suspended sediment flux derived from turbidity data. Under like conditions,...

  1. Reconstructing the equilibrium Boltzmann distribution from well-tempered metadynamics.

    PubMed

    Bonomi, M; Barducci, A; Parrinello, M

    2009-08-01

    Metadynamics is a widely used and successful method for reconstructing the free-energy surface of complex systems as a function of a small number of suitably chosen collective variables. This is achieved by biasing the dynamics of the system. The bias acting on the collective variables distorts the probability distribution of the other variables. Here we present a simple reweighting algorithm for recovering the unbiased probability distribution of any variable from a well-tempered metadynamics simulation. We show the efficiency of the reweighting procedure by reconstructing the distribution of the four backbone dihedral angles of alanine dipeptide from two and even one dimensional metadynamics simulation. 2009 Wiley Periodicals, Inc.
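
    In its simplest form, the reweighting assigns each stored frame a weight proportional to exp((V(s,t) - c(t))/kT), where V is the instantaneous bias acting on the collective variables and c(t) is its time-dependent offset; histogramming any other variable with these weights recovers its unbiased distribution. The sketch below assumes per-frame arrays of the bias and offset are available (e.g., written out by the simulation code) and compresses the paper's time-dependent algorithm into a single static weight.

        import numpy as np

        def unbiased_histogram(obs, bias, c_t, kT, bins=50):
            """Reweight frames from a well-tempered metadynamics run.

            obs  : per-frame values of any variable of interest
            bias : V(s_i, t_i), bias energy at each stored frame
            c_t  : time-dependent offset c(t_i) at each frame
            """
            logw = (bias - c_t) / kT
            w = np.exp(logw - logw.max())     # stabilise before normalising
            w /= w.sum()
            return np.histogram(obs, bins=bins, weights=w, density=True)

        # illustrative call with synthetic frames
        rng = np.random.default_rng(5)
        hist, edges = unbiased_histogram(rng.normal(size=1000),
                                         rng.uniform(0, 5, 1000),
                                         np.linspace(0, 4, 1000), kT=2.5)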

  2. News Sources on Rhodesia: A Comparative Analysis.

    ERIC Educational Resources Information Center

    McCoy, Jennifer; Cholawsky, Elizabeth

    1982-01-01

    Concludes that the "London Times" and the Foreign Broadcast Information Service of the United States government provide both comprehensive and unbiased coverage of events in Rhodesia, while the "New York Times" is less complete and the "Christian Science Monitor" is selective. (FL)

  3. Meaner king uses biased bases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reimpell, Michael; Werner, Reinhard F.

    2007-06-15

    The mean king problem is a quantum mechanical retrodiction problem, in which Alice has to name the outcome of an ideal measurement made in one of several different orthonormal bases. Alice is allowed to prepare the state of the system and to do a final measurement, possibly including an entangled copy. However, Alice gains knowledge about which basis was measured only after she no longer has access to the quantum system or its copy. We give a necessary and sufficient condition on the bases, for Alice to have a strategy to solve this problem, without assuming that the bases are mutually unbiased. The condition requires the existence of an overall joint probability distribution for random variables, whose marginal pair distributions are fixed as the transition probability matrices of the given bases. In particular, in the qubit case the problem is decided by Bell's original three variable inequality. In the standard setting of mutually unbiased bases, when they do exist, Alice can always succeed. However, for randomly chosen bases her success probability rapidly goes to zero with increasing dimension.

  4. Meaner king uses biased bases

    NASA Astrophysics Data System (ADS)

    Reimpell, Michael; Werner, Reinhard F.

    2007-06-01

    The mean king problem is a quantum mechanical retrodiction problem, in which Alice has to name the outcome of an ideal measurement made in one of several different orthonormal bases. Alice is allowed to prepare the state of the system and to do a final measurement, possibly including an entangled copy. However, Alice gains knowledge about which basis was measured only after she no longer has access to the quantum system or its copy. We give a necessary and sufficient condition on the bases, for Alice to have a strategy to solve this problem, without assuming that the bases are mutually unbiased. The condition requires the existence of an overall joint probability distribution for random variables, whose marginal pair distributions are fixed as the transition probability matrices of the given bases. In particular, in the qubit case the problem is decided by Bell’s original three variable inequality. In the standard setting of mutually unbiased bases, when they do exist, Alice can always succeed. However, for randomly chosen bases her success probability rapidly goes to zero with increasing dimension.

  5. Estimating the Probability of Rare Events Occurring Using a Local Model Averaging.

    PubMed

    Chen, Jin-Hua; Chen, Chun-Shu; Huang, Meng-Fan; Lin, Hung-Chih

    2016-10-01

    In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed. © 2016 Society for Risk Analysis.
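
    A simplified cousin of the procedure - fit logistic models on all predictor subsets and combine their predicted probabilities with Akaike weights - is sketched below. The paper's method is local (built on data perturbation) and selects among averaged estimates with an approximately unbiased Kullback-Leibler loss estimator; this global-averaging sketch only conveys the flavour.

        import numpy as np
        import statsmodels.api as sm
        from itertools import combinations

        def averaged_rare_event_probs(X, y):
            """Average predicted probabilities over all non-empty predictor
            subsets, weighting each logistic model by its Akaike weight."""
            k = X.shape[1]
            fits, aics = [], []
            for r in range(1, k + 1):
                for idx in combinations(range(k), r):
                    res = sm.Logit(y, sm.add_constant(X[:, list(idx)])).fit(disp=0)
                    fits.append((list(idx), res)); aics.append(res.aic)
            aics = np.array(aics)
            w = np.exp(-0.5 * (aics - aics.min())); w /= w.sum()
            return sum(wi * res.predict(sm.add_constant(X[:, idx]))
                       for wi, (idx, res) in zip(w, fits))

        rng = np.random.default_rng(5)
        X = rng.normal(size=(2000, 3))
        eta = -4.5 + 1.2 * X[:, 0]                        # rare outcome (~2% events)
        y = rng.binomial(1, 1 / (1 + np.exp(-eta)))
        p = averaged_rare_event_probs(X, y)
        print(round(y.mean(), 3), round(p.mean(), 3))     # averaged probs track the rate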

  6. Indices estimated using REML/BLUP and introduction of a super-trait for the selection of progenies in popcorn.

    PubMed

    Vittorazzi, C; Amaral Junior, A T; Guimarães, A G; Viana, A P; Silva, F H L; Pena, G F; Daher, R F; Gerhardt, I F S; Oliveira, G H F; Pereira, M G

    2017-09-27

    Selection indices commonly utilize economic weights, which can make the resulting genetic gains arbitrary. In popcorn, this is even more evident due to the negative correlation between the main characteristics of economic importance - grain yield and popping expansion. As an alternative to classical biometric selection indices, the optimal procedure restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) allows the simultaneous estimation of genetic parameters and the prediction of genotypic values. Based on the mixed model methodology, the objective of this study was to investigate the comparative efficiency of eight selection indices estimated by REML/BLUP for the effective selection of superior popcorn families in the eighth intrapopulation recurrent selection cycle. We also investigated the efficiency of including the variable "expanded popcorn volume per hectare" in the selection of superior progenies. In total, 200 full-sib families were evaluated in two different areas in the North and Northwest regions of the State of Rio de Janeiro, Brazil. The REML/BLUP procedure resulted in higher estimated gains than those obtained with classical biometric selection index methodologies and should be incorporated into the selection of progenies. The following indices resulted in higher gains in the characteristics of greatest economic importance: the classical selection index/values attributed by trial, via REML/BLUP, and the greatest genotypic values/expanded popcorn volume per hectare, via REML. The expanded popcorn volume per hectare characteristic enabled satisfactory gains in grain yield and popping expansion; this characteristic should be considered a super-trait in popcorn breeding programs.

  7. A sampling strategy to estimate the area and perimeter of irregularly shaped planar regions

    Treesearch

    Timothy G. Gregoire; Harry T. Valentine

    1995-01-01

    The length of a randomly oriented ray emanating from an interior point of a planar region can be used to unbiasedly estimate the region's area and perimeter. Estimators and corresponding variance estimators under various selection strategies are presented.
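
    For the area, the estimator is direct: with the ray direction θ uniform on [0, 2π), π r(θ)² is unbiased for the area, because A = ½ ∮ r(θ)² dθ for a region star-shaped about the interior point. A Monte Carlo check on an ellipse sampled from its centre (the perimeter estimator is omitted here):

        import numpy as np

        def ray_length_ellipse(theta, a, b):
            """Distance from the ellipse centre to the boundary along theta."""
            return a * b / np.hypot(b * np.cos(theta), a * np.sin(theta))

        rng = np.random.default_rng(6)
        a, b = 3.0, 1.0                                  # true area = pi*a*b
        theta = rng.uniform(0.0, 2.0 * np.pi, size=10_000)
        r = ray_length_ellipse(theta, a, b)
        est = np.pi * r**2                               # one unbiased estimate per ray
        print(round(est.mean(), 2), round(np.pi * a * b, 2))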

  8. Bias due to differential participation in case-control studies and review of available approaches for adjustment.

    PubMed

    Aigner, Annette; Grittner, Ulrike; Becher, Heiko

    2018-01-01

    Low response rates in epidemiologic research potentially lead to the recruitment of a non-representative sample of controls in case-control studies. Problems in the unbiased estimation of odds ratios arise when characteristics causing the probability of participation are associated with exposure and outcome. This is a specific setting of selection bias and a realistic hazard in many case-control studies. This paper formally describes the problem and shows its potential extent, reviews existing approaches for bias adjustment applicable under certain conditions, compares and applies them. We focus on two scenarios: a characteristic C causing differential participation of controls is linked to the outcome through its association with risk factor E (scenario I), and C is additionally a genuine risk factor itself (scenario II). We further assume external data sources are available which provide an unbiased estimate of C in the underlying population. Given these scenarios, we (i) review available approaches and their performance in the setting of bias due to differential participation; (ii) describe two existing approaches to correct for the bias in both scenarios in more detail; (iii) present the magnitude of the resulting bias by simulation if the selection of a non-representative sample is ignored; and (iv) demonstrate the approaches' application via data from a case-control study on stroke. The bias of the effect measure for variable E in scenario I and C in scenario II can be large and should therefore be adjusted for in any analysis. It is positively associated with the difference in response rates between groups of the characteristic causing differential participation, and inversely associated with the total response rate in the controls. Adjustment in a standard logistic regression framework is possible in both scenarios if the population distribution of the characteristic causing differential participation is known or can be approximated well.

  9. Breeding of Acrocomia aculeata using genetic diversity parameters and correlations to select accessions based on vegetative, phenological, and reproductive characteristics.

    PubMed

    Coser, S M; Motoike, S Y; Corrêa, T R; Pires, T P; Resende, M D V

    2016-10-17

    Macaw palm (Acrocomia aculeata) is a promising species for use in biofuel production, and establishing breeding programs is important for the development of commercial plantations. The aim of the present study was to analyze genetic diversity, verify correlations between traits, estimate genetic parameters, and select different accessions of A. aculeata in the Macaw Palm Germplasm Bank located at Universidade Federal de Viçosa, to develop a breeding program for this species. Accessions were selected based on precocity (PREC), total spathe (TS), diameter at breast height (DBH), height of the first spathe (HFS), and canopy area (CA). The traits were evaluated in 52 accessions during the 2012/2013 season and analyzed by restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) procedures. Genetic diversity analysis resulted in the formation of four groups by Tocher's clustering method. The correlation analysis showed it was possible to achieve indirect and early selection for the traits PREC and DBH. Estimated genetic parameters reinforced the genetic variability verified by cluster analysis. Narrow-sense heritability was classified as moderate (PREC, TS, and CA) to high (HFS and DBH), indicating strong genetic control of the traits and success in obtaining genetic gains by selection. Accuracy values were classified as moderate (PREC and CA) to high (TS, HFS, and DBH), reinforcing the success of the selection process. Selection of accessions for PREC, TS, and HFS by the rank-average method permits selection gains of over 100%, emphasizing the successful use of the accessions in breeding programs and in obtaining superior genotypes for commercial plantations.

  10. Modeling longitudinal data, I: principles of multivariate analysis.

    PubMed

    Ravani, Pietro; Barrett, Brendan; Parfrey, Patrick

    2009-01-01

    Statistical models are used to study the relationship between exposure and disease while accounting for the potential impact of other factors on outcomes. This adjustment is useful to obtain unbiased estimates of true effects or to predict future outcomes. Statistical models include a systematic component and an error component. The systematic component explains the variability of the response variable as a function of the predictors and is summarized in the effect estimates (model coefficients). The error component of the model represents the variability in the data unexplained by the model and is used to build measures of precision around the point estimates (confidence intervals).

  11. Simulation of relationship between river discharge and sediment yield in the semi-arid river watersheds

    NASA Astrophysics Data System (ADS)

    Khaleghi, Mohammad Reza; Varvani, Javad

    2018-02-01

    The complex and variable nature of river sediment yield causes many problems in estimating long-term sediment yield and sediment input into reservoirs. Sediment rating curves (SRCs) are generally used to estimate the suspended sediment load of rivers and drainage watersheds. Since the regression equations of SRCs are obtained by logarithmic retransformation and include only a single independent variable, they tend to over- or underestimate the true sediment load of rivers. To evaluate bias correction factors in the Kalshor and Kashafroud watersheds, seven hydrometric stations of this region with suitable upstream watersheds and spatial distribution were selected. Investigation of the accuracy index (ratio of estimated to observed sediment yield) and the precision index of different bias correction factors - FAO, quasi-maximum likelihood estimator (QMLE), smearing, and minimum-variance unbiased estimator (MVUE) - with the LSD test showed that the FAO coefficient increases the estimation error at all of the stations. Application of MVUE in linear and mean-load rating curves did not have statistically meaningful effects. QMLE and smearing factors increased the estimation error in the mean-load rating curve, but had no effect on linear rating curve estimation.
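
    The correction factors compared above multiply the naive back-transformed prediction exp(b0 + b1 log Q): QMLE applies the lognormal-theory factor exp(s²/2), while smearing uses the mean of the exponentiated residuals (Duan's estimator). A self-contained sketch on synthetic data (the FAO and MVUE factors are omitted):

        import numpy as np

        def rating_curve_predictions(q, c):
            """Fit log(c) = b0 + b1*log(q); return naive, QMLE-corrected and
            smearing-corrected back-transformed predictions."""
            x, ylog = np.log(q), np.log(c)
            b1, b0 = np.polyfit(x, ylog, 1)
            resid = ylog - (b0 + b1 * x)
            naive = np.exp(b0 + b1 * x)
            qmle = naive * np.exp(resid.var(ddof=2) / 2.0)
            smear = naive * np.exp(resid).mean()
            return naive, qmle, smear

        rng = np.random.default_rng(7)
        q = np.exp(rng.normal(2.0, 1.0, 500))                         # discharge
        c = np.exp(0.5 + 1.3 * np.log(q) + rng.normal(0, 0.6, 500))   # concentration
        for name, pred in zip(("naive", "QMLE", "smearing"),
                              rating_curve_predictions(q, c)):
            # accuracy index: estimated / observed total sediment
            print(name, round(pred.sum() / c.sum(), 3))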

  12. Mutually unbiased product bases for multiple qudits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McNulty, Daniel; Pammer, Bogdan; Weigert, Stefan

    We investigate the interplay between mutual unbiasedness and product bases for multiple qudits of possibly different dimensions. A product state of such a system is shown to be mutually unbiased to a product basis only if each of its factors is mutually unbiased to all the states which occur in the corresponding factors of the product basis. This result implies both a tight limit on the number of mutually unbiased product bases which the system can support and a complete classification of mutually unbiased product bases for multiple qubits or qutrits. In addition, only maximally entangled states can be mutually unbiased to a maximal set of mutually unbiased product bases.

  13. Efforts Toward the Development of Unbiased Selection and Assessment Instruments.

    ERIC Educational Resources Information Center

    Rudner, Lawrence M.

    Investigations into item bias provide an empirical basis for the identification and elimination of test items which appear to measure different traits across populations or cultural groups. The psychometric rationales for six approaches to the identification of biased test items are reviewed: (1) Transformed item difficulties: within-group…

  14. Estimating total suspended sediment yield with probability sampling

    Treesearch

    Robert B. Thomas

    1985-01-01

    The ""Selection At List Time"" (SALT) scheme controls sampling of concentration for estimating total suspended sediment yield. The probability of taking a sample is proportional to its estimated contribution to total suspended sediment discharge. This procedure gives unbiased estimates of total suspended sediment yield and the variance of the...

  15. APPLICATION OF A MULTIPURPOSE UNEQUAL-PROBABILITY STREAM SURVEY IN THE MID-ATLANTIC COASTAL PLAIN

    EPA Science Inventory

    A stratified random sample with unequal-probability selection was used to design a multipurpose survey of headwater streams in the Mid-Atlantic Coastal Plain. Objectives for data from the survey include unbiased estimates of regional stream conditions, and adequate coverage of un...

  16. Design unbiased estimation in line intersect sampling using segmented transects

    Treesearch

    David L.R. Affleck; Timothy G. Gregoire; Harry T. Valentine; Harry T. Valentine

    2005-01-01

    In many applications of line intersect sampling, transects consist of multiple, connected segments in a prescribed configuration. The relationship between the transect configuration and the selection probability of a population element is illustrated and a consistent sampling protocol, applicable to populations composed of arbitrarily shaped elements, is proposed. It...

  17. Migration monitoring with automated technology

    Treesearch

    Rhonda L. Millikin

    2005-01-01

    Automated technology can supplement ground-based methods of migration monitoring by providing: (1) unbiased and automated sampling; (2) independent validation of current methods; (3) a larger sample area for landscape-level analysis of habitat selection for stopover, and (4) an opportunity to study flight behavior. In particular, radar-acoustic sensor fusion can...

  18. Evolution of learning strategies in temporally and spatially variable environments: A review of theory

    PubMed Central

    Aoki, Kenichi; Feldman, Marcus W.

    2013-01-01

    The theoretical literature from 1985 to the present on the evolution of learning strategies in variable environments is reviewed, with the focus on deterministic dynamical models that are amenable to local stability analysis, and on deterministic models yielding evolutionarily stable strategies. Individual learning, unbiased and biased social learning, mixed learning, and learning schedules are considered. A rapidly changing environment or frequent migration in a spatially heterogeneous environment favors individual learning over unbiased social learning. However, results are not so straightforward in the context of learning schedules or when biases in social learning are introduced. The three major methods of modeling temporal environmental change – coevolutionary, two-timescale, and information decay – are compared and shown to sometimes yield contradictory results. The so-called Rogers’ paradox is inherent in the two-timescale method as originally applied to the evolution of pure strategies, but is often eliminated when the other methods are used. Moreover, Rogers’ paradox is not observed for the mixed learning strategies and learning schedules that we review. We believe that further theoretical work is necessary on learning schedules and biased social learning, based on models that are logically consistent and empirically pertinent. PMID:24211681

  19. Evolution of learning strategies in temporally and spatially variable environments: a review of theory.

    PubMed

    Aoki, Kenichi; Feldman, Marcus W

    2014-02-01

    The theoretical literature from 1985 to the present on the evolution of learning strategies in variable environments is reviewed, with the focus on deterministic dynamical models that are amenable to local stability analysis, and on deterministic models yielding evolutionarily stable strategies. Individual learning, unbiased and biased social learning, mixed learning, and learning schedules are considered. A rapidly changing environment or frequent migration in a spatially heterogeneous environment favors individual learning over unbiased social learning. However, results are not so straightforward in the context of learning schedules or when biases in social learning are introduced. The three major methods of modeling temporal environmental change--coevolutionary, two-timescale, and information decay--are compared and shown to sometimes yield contradictory results. The so-called Rogers' paradox is inherent in the two-timescale method as originally applied to the evolution of pure strategies, but is often eliminated when the other methods are used. Moreover, Rogers' paradox is not observed for the mixed learning strategies and learning schedules that we review. We believe that further theoretical work is necessary on learning schedules and biased social learning, based on models that are logically consistent and empirically pertinent. Copyright © 2013 Elsevier Inc. All rights reserved.

  20. Minimizing Statistical Bias with Queries.

    DTIC Science & Technology

    1995-09-14

    method for optimally selecting these points would offer enormous savings in time and money. An active learning system will typically attempt to select data...research in active learning assumes that the second term of Equation 2 is approximately zero, that is, that the learner is unbiased. If this is the case...outperforms the variance-minimizing algorithm and random exploration. and effective strategy for active learning. I have given empirical evidence that, with

  1. Parametric and Nonparametric Statistical Methods for Genomic Selection of Traits with Additive and Epistatic Genetic Architectures

    PubMed Central

    Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.

    2014-01-01

    Parametric and nonparametric methods have been developed for purposes of predicting phenotypes. These methods are based on retrospective analyses of empirical data consisting of genotypic and phenotypic scores. Recent reports have indicated that parametric methods are unable to predict phenotypes of traits with known epistatic genetic architectures. Herein, we review parametric methods including least squares regression, ridge regression, Bayesian ridge regression, least absolute shrinkage and selection operator (LASSO), Bayesian LASSO, best linear unbiased prediction (BLUP), Bayes A, Bayes B, Bayes C, and Bayes Cπ. We also review nonparametric methods including Nadaraya-Watson estimator, reproducing kernel Hilbert space, support vector machine regression, and neural networks. We assess the relative merits of these 14 methods in terms of accuracy and mean squared error (MSE) using simulated genetic architectures consisting of completely additive or two-way epistatic interactions in an F2 population derived from crosses of inbred lines. Each simulated genetic architecture explained either 30% or 70% of the phenotypic variability. The greatest impact on estimates of accuracy and MSE was due to genetic architecture. Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis. Parametric methods were slightly better than nonparametric methods for additive genetic architectures. Distinctions among parametric methods for additive genetic architectures were incremental. Heritability, i.e., proportion of phenotypic variability, had the second greatest impact on estimates of accuracy and MSE. PMID:24727289
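
    The headline contrast - parametric methods failing under a purely epistatic architecture while kernel methods can capture it - reproduces in miniature with linear ridge regression versus a degree-2 polynomial kernel standing in for RKHS; marker counts, effect sizes, and hyperparameters below are illustrative only.

        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.kernel_ridge import KernelRidge
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(10)
        X = rng.integers(0, 3, size=(600, 30)).astype(float)   # genotypes 0/1/2
        # Purely epistatic trait: five two-locus interactions, no main effects
        g = sum((X[:, 2*k] - 1) * (X[:, 2*k + 1] - 1) for k in range(5))
        y = g + rng.normal(0.0, np.sqrt(g.var() * 0.3 / 0.7), 600)   # ~70% heritable

        Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
        for name, model in [("linear ridge", Ridge(alpha=10.0)),
                            ("poly-2 kernel", KernelRidge(kernel="poly", degree=2))]:
            model.fit(Xtr, ytr)
            acc = np.corrcoef(model.predict(Xte), yte)[0, 1]
            print(name, round(acc, 2))   # kernel picks up epistasis; linear does not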

  2. Using Maximum Entropy to Find Patterns in Genomes

    NASA Astrophysics Data System (ADS)

    Liu, Sophia; Hockenberry, Adam; Lancichinetti, Andrea; Jewett, Michael; Amaral, Luis

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. To accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. This approach can also easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes. National Institute of General Medical Science, Northwestern University Presidential Fellowship, National Science Foundation, David and Lucile Packard Foundation, Camille Dreyfus Teacher Scholar Award.
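
    Under the maximum-entropy principle, fixing the amino acid sequence and constraining expected GC content gives, within each synonymous codon family, probabilities proportional to exp(λ·GC(codon)), with λ tuned to hit the target. A minimal sketch of that exponential tilting (codon table truncated to four amino acids for brevity; this is not the authors' tool):

        import numpy as np

        CODONS = {"A": ["GCT", "GCC", "GCA", "GCG"], "K": ["AAA", "AAG"],
                  "G": ["GGT", "GGC", "GGA", "GGG"], "F": ["TTT", "TTC"]}

        def gc(codon):
            return sum(base in "GC" for base in codon)

        def mean_gc(protein, lam):
            """Expected GC fraction when each family is tilted by exp(lam*GC)."""
            total = 0.0
            for aa in protein:
                g = np.array([gc(c) for c in CODONS[aa]], dtype=float)
                w = np.exp(lam * g); w /= w.sum()
                total += w @ g
            return total / (3 * len(protein))

        def fit_lambda(protein, target, lo=-10.0, hi=10.0):
            """Bisection: mean_gc is monotone increasing in lam."""
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                lo, hi = (mid, hi) if mean_gc(protein, mid) < target else (lo, mid)
            return 0.5 * (lo + hi)

        def sample_sequence(protein, lam, rng):
            out = []
            for aa in protein:
                cods = CODONS[aa]
                w = np.exp(lam * np.array([gc(c) for c in cods], dtype=float))
                out.append(cods[rng.choice(len(cods), p=w / w.sum())])
            return "".join(out)

        protein = "AKGF" * 25
        lam = fit_lambda(protein, target=0.55)
        seq = sample_sequence(protein, lam, np.random.default_rng(1))
        print(round(sum(b in "GC" for b in seq) / len(seq), 3))   # ≈ 0.55 on average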

  3. Host Galaxy Properties of SWIFT Hard X-ray Selected AGN

    NASA Astrophysics Data System (ADS)

    Koss, Michael; Mushotzky, R.; Veilleux, S.; Winter, L.

    2010-01-01

    Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of 258 AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. In 2008, we observed 110 of these targets at Kitt Peak with the 2.1m in the SDSS ugriz bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, star formation, and AGN luminosity.

  4. GTARG - The TOPEX/Poseidon ground track maintenance maneuver targeting program

    NASA Technical Reports Server (NTRS)

    Shapiro, Bruce E.; Bhat, Ramachandra S.

    1993-01-01

    GTARG is a computer program used to design orbit maintenance maneuvers for the TOPEX/Poseidon satellite. These maneuvers ensure that the ground track is kept within +/-1 km of an exact 9.9-day repeat pattern. Maneuver parameters are determined using either of two targeting strategies: longitude targeting, which maximizes the time between maneuvers, and time targeting, in which maneuvers are targeted to occur at specific intervals. The GTARG algorithm propagates nonsingular mean elements, taking into account anticipated error sigmas in orbit determination, Delta v execution, drag prediction, and Delta v quantization. A satellite-unique drag model is used which incorporates an approximate mean orbital Jacchia-Roberts atmosphere and a variable mean area model. Maneuver Delta v magnitudes are targeted to precisely maintain either the unbiased ground track itself, or a comfortable (3 sigma) error envelope about the unbiased ground track.

  5. On estimation in k-tree sampling

    Treesearch

    Christoph Kleinn; Frantisek Vilcko

    2007-01-01

    The plot design known as k-tree sampling involves taking the k nearest trees from a selected sample point as sample trees. While this plot design is very practical and easily applied in the field for moderate values of k, unbiased estimation remains a problem. In this article, we give a brief introduction to the...

  6. Using object-based image analysis to guide the selection of field sample locations

    USDA-ARS?s Scientific Manuscript database

    One of the most challenging tasks for resource management and research is designing field sampling schemes to achieve unbiased estimates of ecosystem parameters as efficiently as possible. This study focused on the potential of fine-scale image objects from object-based image analysis (OBIA) to be u...

  7. CASE-CONTROL STUDY OF AIR QUALITY AND BIRTH DEFECTS: COMPARISON OF GEOCODED AND NON-GEOCODED POPULATIONS

    EPA Science Inventory

    Unbiased geocoding of maternal residence is critical to the success of an ongoing case-control study of exposure to five criteria air pollutants and the risk of selected birth defects in seven Texas counties between 1997 and 2000. The geocoded residence at delivery will be used ...

  8. RELIANCE ON GEOCODED MATERNAL RESIDENCE: IMPACT ON A POPULATION-BASED CASE-CONTROL STUDY OF AIR QUALITY AND BIRTH DEFECTS

    EPA Science Inventory

    Introduction: Unbiased geocoding of maternal residence is critical to the success of an ongoing population-based case-control study of exposure to five criteria air pollutants and the risk of selected birth defects in seven Texas counties between 1997 and 2000. The geocoded res...

  9. Correlations of the IR Luminosity and Eddington Ratio with a Hard X-ray Selected Sample of AGN

    NASA Technical Reports Server (NTRS)

    Mushotzky, Richard F.; Winter, Lisa M.; McIntosh, Daniel H.; Tueller, Jack

    2008-01-01

    We use the SWIFT Burst Alert Telescope (BAT) sample of hard X-ray selected active galactic nuclei (AGN) with a median redshift of 0.03 and the 2MASS J and K band photometry to examine the correlation of hard X-ray emission to Eddington ratio as well as the relationship of the J and K band nuclear luminosity to the hard X-ray luminosity. The BAT sample is almost unbiased by the effects of obscuration and thus offers the first large unbiased sample for the examination of correlations between different wavelength bands. We find that the near-IR nuclear J and K band luminosity is related to the BAT (14-195 keV) luminosity over a factor of 10^3 in luminosity (L_IR ≈ L_BAT^1.25) and thus is unlikely to be due to dust. We also find that the Eddington ratio is proportional to the X-ray luminosity. This new result should be a strong constraint on models of the formation of the broad band continuum.

  10. A Bayesian method for assessing multiscale species-habitat relationships

    USGS Publications Warehouse

    Stuber, Erica F.; Gruber, Lutz F.; Fontaine, Joseph J.

    2017-01-01

    Context: Scientists face several theoretical and methodological challenges in appropriately describing fundamental wildlife-habitat relationships in models. The spatial scales of habitat relationships are often unknown, and are expected to follow a multi-scale hierarchy. Typical frequentist or information theoretic approaches often suffer under collinearity in multi-scale studies, fail to converge when models are complex or represent an intractable computational burden when candidate model sets are large. Objectives: Our objective was to implement an automated, Bayesian method for inference on the spatial scales of habitat variables that best predict animal abundance. Methods: We introduce Bayesian latent indicator scale selection (BLISS), a Bayesian method to select spatial scales of predictors using latent scale indicator variables that are estimated with reversible-jump Markov chain Monte Carlo sampling. BLISS does not suffer from collinearity, and substantially reduces computation time of studies. We present a simulation study to validate our method and apply our method to a case-study of land cover predictors for ring-necked pheasant (Phasianus colchicus) abundance in Nebraska, USA. Results: Our method returns accurate descriptions of the explanatory power of multiple spatial scales, and unbiased and precise parameter estimates under commonly encountered data limitations including spatial scale autocorrelation, effect size, and sample size. BLISS outperforms commonly used model selection methods including stepwise and AIC, and reduces runtime by 90%. Conclusions: Given the pervasiveness of scale-dependency in ecology, and the implications of mismatches between the scales of analyses and ecological processes, identifying the spatial scales over which species are integrating habitat information is an important step in understanding species-habitat relationships. BLISS is a widely applicable method for identifying important spatial scales, propagating scale uncertainty, and testing hypotheses of scaling relationships.
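
    BLISS proper uses reversible-jump MCMC over latent scale indicators; the deliberately simplified, fixed-dimension analogue below (Metropolis updates of a single scale indicator alongside the regression coefficients, with flat priors and known error variance) only conveys the latent-indicator idea. All data and names are simulated and hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical data: one habitat covariate measured at 5 candidate spatial scales.
    n, n_scales, true_scale = 200, 5, 2
    X = rng.normal(size=(n, n_scales))
    y = 1.0 + 2.0 * X[:, true_scale] + rng.normal(scale=1.0, size=n)

    def log_lik(b0, b1, s):
        resid = y - b0 - b1 * X[:, s]
        return -0.5 * np.sum(resid ** 2)   # Gaussian likelihood, sigma fixed at 1

    b0, b1, s = 0.0, 0.0, 0
    trace_s = []
    for it in range(5000):
        # Metropolis update of the coefficients (flat priors).
        prop0, prop1 = b0 + rng.normal(0, 0.1), b1 + rng.normal(0, 0.1)
        if np.log(rng.uniform()) < log_lik(prop0, prop1, s) - log_lik(b0, b1, s):
            b0, b1 = prop0, prop1
        # Metropolis update of the latent scale indicator.
        s_prop = rng.integers(n_scales)
        if np.log(rng.uniform()) < log_lik(b0, b1, s_prop) - log_lik(b0, b1, s):
            s = s_prop
        trace_s.append(s)

    post = np.bincount(trace_s[1000:], minlength=n_scales) / len(trace_s[1000:])
    print("posterior scale probabilities:", post)   # mass concentrates on scale 2
    ```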

  11. Correcting Four Similar Correlational Measures for Attenuation Due to Errors of Measurement in the Dependent Variable: Eta, Epsilon, Omega, and Intraclass r.

    ERIC Educational Resources Information Center

    Stanley, Julian C.; Livingston, Samuel A.

    Besides the ubiquitous Pearson product-moment r, there are a number of other measures of relationship that are attenuated by errors of measurement and for which the relationship between true measures can be estimated. Among these are the correlation ratio (eta squared), Kelley's unbiased correlation ratio (epsilon squared), Hays' omega squared,…
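
    For reference, the three ANOVA-based measures named above have standard one-way forms: eta² = SSB/SST, epsilon² = (SSB − (k−1)·MSW)/SST, and omega² = (SSB − (k−1)·MSW)/(SST + MSW). A minimal sketch (the function name and toy data are illustrative, not from the record):

    ```python
    import numpy as np

    def anova_effect_sizes(groups):
        """Eta-squared, epsilon-squared, and omega-squared for a one-way design.

        `groups` is a list of 1-D arrays, one per treatment level.
        """
        all_y = np.concatenate(groups)
        grand = all_y.mean()
        k, N = len(groups), all_y.size
        ss_between = sum(g.size * (g.mean() - grand) ** 2 for g in groups)
        ss_total = ((all_y - grand) ** 2).sum()
        ms_within = (ss_total - ss_between) / (N - k)
        eta2 = ss_between / ss_total
        epsilon2 = (ss_between - (k - 1) * ms_within) / ss_total
        omega2 = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)
        return eta2, epsilon2, omega2

    g = [np.array([3.1, 2.9, 3.5]), np.array([4.0, 4.2, 3.8]),
         np.array([5.1, 4.9, 5.3])]
    print(anova_effect_sizes(g))
    ```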

  12. Serotonin, behavior, and natural selection in New World monkeys.

    PubMed

    Reales, Guillermo; Paixão-Côrtes, Vanessa R; Cybis, Gabriela B; Gonçalves, Gislene L; Pissinatti, Alcides; Salzano, Francisco M; Bortolini, Maria Cátira

    2018-06-26

    Traits that undergo massive natural selection pressure, with multiple events of positive selection, are hard to find. Social behaviour, in social animals, is crucial for survival, and genetic networks involved in behaviour, such as those of serotonin (5-HT) and other neurotransmitters, must be the target of natural selection. Here, we used molecular analyses to search for signals of positive selection in the 5-HT system and found such signals in the M3-M4 intracellular domain of the 5-HT3A serotonin receptor subunit (HTR3A) in primates. We detected four amino acid sites with signs of putatively positive selection (398, 403, 432 and 416); the first three showed indications of being selected in New World monkeys (NWM, Platyrrhini), specifically in the Callitrichinae branch. Additionally, we searched for associations of these amino acid variants with social behavioural traits (i.e. sex-biased dispersal, dominance and social monogamy) using classical and Bayesian methods, and found statistically significant associations for unbiased sex dispersal (398L and 416S), unbiased sex dominance (416S) and social monogamy (416S), as well as significant positive correlation between female dispersal and 403G. Furthermore, we found putatively functional protein motifs determined by three selected sites, of which we highlight a ligand motif to GSK3 in the 416S variant, appearing only in Platyrrhini. 5-HT, 5-HT3A receptor and GSK3 are part of a network that participates in neurodevelopment and regulates behaviour, among other functions. We suggest that these genetic variations, together with those found in other neurotransmitter systems, must contribute to adaptive behaviours and consequently to fitness in NWMs. © 2018 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2018 European Society For Evolutionary Biology.

  13. Influences on choice of surgery as a career: a study of consecutive cohorts in a medical school.

    PubMed

    Sobral, Dejano T

    2006-06-01

    To examine the differential impact of person-based and programme-related features on graduates' dichotomous choice between surgical or non-surgical field specialties for first-year residency. A 10-year cohort study was conducted, following 578 students (55.4% male) who graduated from a university medical school during 1994-2003. Data were collected as follows: at the beginning of medical studies, on career preference and learning frame; during medical studies, on academic achievement, cross-year peer tutoring and selective clinical traineeship, and at graduation, on the first-year residency selected. Contingency and logistic regression analyses were performed, with graduates grouped by the dichotomous choice of surgery or not. Overall, 23% of graduates selected a first-year residency in surgery. Seven time-steady features related to this choice: male sex, high self-confidence, option of surgery at admission, active learning style, preference for surgery after Year 1, peer tutoring on clinical surgery, and selective training in clinical surgery. Logistic regression analysis, including all features, predicted 87.1% of the graduates' choices. Male sex, updated preference, peer tutoring and selective training were the most significant predictors in the pathway to choice. The relative roles of person-based and programme-related factors in the choice process are discussed. The findings suggest that for most students the choice of surgery derives from a temporal summation of influences that encompass entry and post-entry factors blended in variable patterns. It is likely that sex-unbiased peer tutoring and selective training supported the students' search process for personal compatibility with specialty-related domains of content and process.

  14. Implementation of a Comprehensive Curriculum in Personal Finance for Medical Fellows

    PubMed Central

    Bar-Or, Yuval D; Fessler, Henry E; Desai, Dipan A

    2018-01-01

    Introduction: Many residents and fellows complete graduate medical education having received minimal unbiased financial planning guidance. This places them at risk of making ill-informed financial decisions, which may lead to significant harm to them and their families. Therefore, we sought to provide fellows with comprehensive unbiased financial education and empower them to make timely, constructive financial decisions. Methods: A self-selected cohort of cardiovascular disease, pulmonary and critical care, and infectious disease fellows (n = 18) at a single institution attended a live, eight-hour interactive course on personal finance. The course consisted of four two-hour sessions delivered over four weeks, facilitated by an unbiased business school faculty member with expertise in personal finance. Prior to the course, all participants completed a demographic survey. After course completion, participants were offered an exit survey evaluating the course, which also asked respondents for any tangible financial decisions made as a result of the course learning.  Results: Participants included 12 women and six men, with a mean age of 33 and varying amounts of debt and financial assets. Twelve respondents completed the exit survey, and all “Strongly Agreed” that courses on financial literacy are important for trainees. In addition, 11 reported that the course helped them make important financial decisions, providing 21 examples. Conclusions: Fellows derive a significant benefit from objective financial literacy education. Graduate medical education programs should offer comprehensive financial literacy education to all graduating trainees, and that education should be provided by an unbiased expert who has no incentive to sell financial products and services. PMID:29515942

  15. Implementation of a Comprehensive Curriculum in Personal Finance for Medical Fellows.

    PubMed

    Bar-Or, Yuval D; Fessler, Henry E; Desai, Dipan A; Zakaria, Sammy

    2018-01-01

    Many residents and fellows complete graduate medical education having received minimal unbiased financial planning guidance. This places them at risk of making ill-informed financial decisions, which may lead to significant harm to them and their families. Therefore, we sought to provide fellows with comprehensive unbiased financial education and empower them to make timely, constructive financial decisions. A self-selected cohort of cardiovascular disease, pulmonary and critical care, and infectious disease fellows (n = 18) at a single institution attended a live, eight-hour interactive course on personal finance. The course consisted of four two-hour sessions delivered over four weeks, facilitated by an unbiased business school faculty member with expertise in personal finance. Prior to the course, all participants completed a demographic survey. After course completion, participants were offered an exit survey evaluating the course, which also asked respondents for any tangible financial decisions made as a result of the course learning. Participants included 12 women and six men, with a mean age of 33 and varying amounts of debt and financial assets. Twelve respondents completed the exit survey, and all "Strongly Agreed" that courses on financial literacy are important for trainees. In addition, 11 reported that the course helped them make important financial decisions, providing 21 examples. Fellows derive a significant benefit from objective financial literacy education. Graduate medical education programs should offer comprehensive financial literacy education to all graduating trainees, and that education should be provided by an unbiased expert who has no incentive to sell financial products and services.

  16. Evaluation of modal pushover-based scaling of one component of ground motion: Tall buildings

    USGS Publications Warehouse

    Kalkan, Erol; Chopra, Anil K.

    2012-01-01

    Nonlinear response history analysis (RHA) is now increasingly used for performance-based seismic design of tall buildings. Required for nonlinear RHAs is a set of ground motions selected and scaled appropriately so that analysis results would be accurate (unbiased) and efficient (having relatively small dispersion). This paper evaluates the accuracy and efficiency of the recently developed modal pushover-based scaling (MPS) method to scale ground motions for tall buildings. The procedure presented explicitly considers structural strength and is based on the standard intensity measure (IM) of spectral acceleration in a form convenient for evaluating existing structures or proposed designs for new structures. Based on results presented for two actual buildings (19 and 52 stories, respectively), it is demonstrated that the MPS procedure provided a highly accurate estimate of the engineering demand parameters (EDPs), accompanied by significantly reduced record-to-record variability of the responses. In addition, the MPS procedure is shown to be superior to the scaling procedure specified in the ASCE/SEI 7-05 document.

  17. High levels of absorption in orientation-unbiased, radio-selected 3CR Active Galaxies

    NASA Astrophysics Data System (ADS)

    Wilkes, Belinda J.; Haas, Martin; Barthel, Peter; Leipski, Christian; Kuraszkiewicz, Joanna; Worrall, Diana; Birkinshaw, Mark; Willner, Steven P.

    2014-08-01

    A critical problem in understanding active galaxies (AGN) is the separation of intrinsic physical differences from observed differences that are due to orientation. Obscuration of the active nucleus is anisotropic and strongly frequency dependent, leading to complex selection effects for observations in most wavebands. These can only be quantified using a sample that is sufficiently unbiased to test orientation effects. Low-frequency radio emission is one way to select a close-to orientation-unbiased sample, albeit limited to the minority of AGN with strong radio emission. Recent Chandra, Spitzer and Herschel observations combined with multi-wavelength data for a complete sample of high-redshift (1 < z < 1.4) 3CR sources yield observed ratios of X-ray unobscured : obscured : Compton-thick (log N_H > 24.2) = 2.5 : 1.4 : 1 in these high-luminosity (log L(0.3-8 keV) ~ 44-46) sources. These ratios are consistent with current expectations based on modeling the Cosmic X-ray Background. A strong correlation with radio orientation constrains the geometry of the obscuring disk/torus to have a ~60 degree opening angle and ~12 degree Compton-thick cross-section. The deduced ~50% obscured fraction of the population contrasts with typical estimates of ~20% obscured in optically- and X-ray-selected high-luminosity samples. Once the primary nuclear emission is obscured, AGN X-ray spectra are frequently dominated by unobscured non-nuclear or scattered nuclear emission, which cannot be distinguished from direct nuclear emission with a lower obscuration level unless high-quality data are available. As a result, both the level of obscuration and the estimated intrinsic luminosities of highly-obscured AGN are likely to be significantly (x10-1000) underestimated for 25-50% of the population. This may explain the lower obscured fractions reported for optical and X-ray samples, which have no independent measure of the AGN luminosity. Correcting AGN samples for these underestimated luminosities would result in flatter derived luminosity functions and potentially change their evolution.

  18. Beyond total treatment effects in randomised controlled trials: Baseline measurement of intermediate outcomes needed to reduce confounding in mediation investigations.

    PubMed

    Landau, Sabine; Emsley, Richard; Dunn, Graham

    2018-06-01

    Random allocation avoids confounding bias when estimating the average treatment effect. For continuous outcomes measured at post-treatment as well as prior to randomisation (baseline), analyses based on (A) post-treatment outcome alone, (B) change scores over the treatment phase or (C) conditioning on baseline values (analysis of covariance) provide unbiased estimators of the average treatment effect. The decision to include baseline values of the clinical outcome in the analysis is based on precision arguments, with analysis of covariance known to be most precise. Investigators increasingly carry out explanatory analyses to decompose total treatment effects into components that are mediated by an intermediate continuous outcome and a non-mediated part. Traditional mediation analysis might be performed based on (A) post-treatment values of the intermediate and clinical outcomes alone, (B) respective change scores or (C) conditioning on baseline measures of both intermediate and clinical outcomes. Using causal diagrams and Monte Carlo simulation, we investigated the performance of the three competing mediation approaches. We considered a data generating model that included three possible confounding processes involving baseline variables: The first two processes modelled baseline measures of the clinical variable or the intermediate variable as common causes of post-treatment measures of these two variables. The third process allowed the two baseline variables themselves to be correlated due to past common causes. We compared the analysis models implied by the competing mediation approaches with this data generating model to hypothesise likely biases in estimators, and tested these in a simulation study. We applied the methods to a randomised trial of pragmatic rehabilitation in patients with chronic fatigue syndrome, which examined the role of limiting activities as a mediator. Estimates of causal mediation effects derived by approach (A) will be biased if one of the three processes involving baseline measures of intermediate or clinical outcomes is operating. Necessary assumptions for the change score approach (B) to provide unbiased estimates under either process include the independence of baseline measures and change scores of the intermediate variable. Finally, estimates provided by the analysis of covariance approach (C) were found to be unbiased under all the three processes considered here. When applied to the example, there was evidence of mediation under all methods but the estimate of the indirect effect depended on the approach used with the proportion mediated varying from 57% to 86%. Trialists planning mediation analyses should measure baseline values of putative mediators as well as of continuous clinical outcomes. An analysis of covariance approach is recommended to avoid potential biases due to confounding processes involving baseline measures of intermediate or clinical outcomes, and not simply for increased precision.
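
    A small simulation illustrates the paper's central point under an assumed linear data-generating model in which the baseline mediator and baseline outcome act as common causes: the post-treatment-only approach (A) yields biased path estimates, while the ANCOVA approach (C) recovers the generating coefficients. All variable names and coefficients here are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 5000

    # Baseline mediator m0 is a common cause of the post-treatment mediator m1
    # and the clinical outcome y1; baselines m0 and y0 are correlated.
    m0 = rng.normal(size=n)
    y0 = 0.5 * m0 + rng.normal(size=n)
    t = rng.integers(0, 2, size=n)                     # randomised treatment
    m1 = 0.5 * m0 + 1.0 * t + rng.normal(size=n)
    y1 = 0.7 * m0 + 0.3 * y0 + 0.8 * m1 + 0.5 * t + rng.normal(size=n)
    # True direct effect of t is 0.5; true mediated effect is 0.8 * 1.0 = 0.8.

    def fit(design):
        return sm.OLS(y1, sm.add_constant(design)).fit().params

    # (A) post-treatment values only: omitting m0 biases the m1 -> y1 path.
    print("approach A:", fit(np.column_stack([t, m1])))
    # (C) ANCOVA: conditioning on both baselines removes the confounding.
    print("approach C:", fit(np.column_stack([t, m1, m0, y0])))
    ```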

  19. Two Extremely Red Galaxies

    NASA Technical Reports Server (NTRS)

    Joseph, Robert D.; Hora, Joseph; Stockton, Alan; Hu, Esther; Sanders, David

    1997-01-01

    This report concerns one of the major observational studies in the ISO Central Programme, the ISO Normal Galaxy Survey. This is a survey of an unbiased sample of spiral and lenticular galaxies selected from the Revised Shapley-Ames Catalog. It is therefore optically-selected, with a brightness limit of blue magnitude = 12, and otherwise randomly chosen. The original sample included 150 galaxies, but this was reduced to 74 when the allocated observing time was expended because the ISO overheads encountered in flight were much larger than predicted.

  20. Systematic random sampling of the comet assay.

    PubMed

    McArt, Darragh G; Wasson, Gillian R; McKerr, George; Saetzler, Kurt; Reed, Matt; Howard, C Vyvyan

    2009-07-01

    The comet assay is a technique used to quantify DNA damage and repair at a cellular level. In the assay, cells are embedded in agarose and the cellular content is stripped away leaving only the DNA trapped in an agarose cavity which can then be electrophoresed. The damaged DNA can enter the agarose and migrate while the undamaged DNA cannot and is retained. DNA damage is measured as the proportion of the migratory 'tail' DNA compared to the total DNA in the cell. The fundamental basis of these arbitrary values is obtained in the comet acquisition phase using fluorescence microscopy with a stoichiometric stain in tandem with image analysis software. Current methods deployed in such an acquisition are expected to be both objectively and randomly obtained. In this paper we examine the 'randomness' of the acquisition phase and suggest an alternative method that offers both objective and unbiased comet selection. In order to achieve this, we have adopted a survey sampling approach widely used in stereology, which offers a method of systematic random sampling (SRS). This is desirable as it offers an impartial and reproducible method of comet analysis that can be used both manually or automated. By making use of an unbiased sampling frame and using microscope verniers, we are able to increase the precision of estimates of DNA damage. Results obtained from a multiple-user pooled variation experiment showed that the SRS technique attained a lower variability than that of the traditional approach. The analysis of a single user with repetition experiment showed greater individual variances while not being detrimental to overall averages. This would suggest that the SRS method offers a better reflection of DNA damage for a given slide and also offers better user reproducibility.
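
    A minimal sketch of the systematic random sampling step, assuming the comets on a slide can be enumerated in a fixed order (e.g., along vernier positions); the function name and counts are illustrative. With a random start and a fixed interval, every item has inclusion probability n_sample/n_items, so means and totals are estimated without selection bias.

    ```python
    import numpy as np

    def systematic_random_sample(n_items, n_sample, rng=None):
        """Systematic random sampling: one random start, then a fixed interval."""
        rng = rng or np.random.default_rng()
        interval = n_items / n_sample
        start = rng.uniform(0, interval)
        # Indices of the items to score, in enumeration order.
        return (start + interval * np.arange(n_sample)).astype(int)

    # E.g., select 50 comets for scoring out of 600 detected on a slide.
    print(systematic_random_sample(600, 50))
    ```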

  1. Host Galaxy Properties Of The Swift Bat Hard X-ray Survey Of Agn

    NASA Astrophysics Data System (ADS)

    Koss, Michael; Mushotzky, R.; Veilleux, S.; Winter, L.

    2010-03-01

    Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. In 2008, we observed 90 of these targets at Kitt Peak with the 2.1m in the SDSS ugriz bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, stellar mass, star formation, and AGN luminosity for a sample of 145 hard X-ray selected AGN.

  2. Piecewise SALT sampling for estimating suspended sediment yields

    Treesearch

    Robert B. Thomas

    1989-01-01

    A probability sampling method called SALT (Selection At List Time) has been developed for collecting and summarizing data on delivery of suspended sediment in rivers. It is based on sampling and estimating yield using a suspended-sediment rating curve for high discharges and simple random sampling for low flows. The method gives unbiased estimates of total yield and...
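
    SALT's real-time, list-based bookkeeping is not reproduced here, but the probability-proportional-to-size principle behind it can be sketched with a Hansen-Hurwitz estimator, which is unbiased for the total under PPS sampling with replacement. The rating function and all numbers below are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Hypothetical discharge record (one value per sampling period) and a
    # rating curve giving predicted sediment yield from discharge.
    discharge = rng.lognormal(mean=2.0, sigma=0.8, size=500)
    predicted = 0.05 * discharge ** 1.8            # assumed rating function

    # Selection probabilities proportional to predicted yield, as in SALT.
    p = predicted / predicted.sum()
    sample = rng.choice(500, size=30, replace=True, p=p)

    # True yields would be measured only on sampled periods; simulated here.
    true_yield = predicted * np.exp(rng.normal(0, 0.3, size=500))

    # Hansen-Hurwitz estimator: (1/n) * sum(y_i / p_i) is unbiased for the total.
    estimate = np.mean(true_yield[sample] / p[sample])
    print("estimated total:", estimate, " actual total:", true_yield.sum())
    ```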

  3. An evaluation of flow-stratified sampling for estimating suspended sediment loads

    Treesearch

    Robert B. Thomas; Jack Lewis

    1995-01-01

    Abstract - Flow-stratified sampling is a new method for sampling water quality constituents such as suspended sediment to estimate loads. As with selection-at-list-time (SALT) and time-stratified sampling, flow-stratified sampling is a statistical method requiring random sampling, and yielding unbiased estimates of load and variance. It can be used to estimate event...

  4. Adaptive enhanced sampling with a path-variable for the simulation of protein folding and aggregation

    NASA Astrophysics Data System (ADS)

    Peter, Emanuel K.

    2017-12-01

    In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path-variable L based on the un-biased momenta p and displacements dq for the definition of the bias s applied to the system and derive 3 algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method which combines the first 2 methodologies. Through the analysis of the correlations between the bias and the un-biased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and acceleration in the sampling of the phase space. We apply our method on SPC/E water, where we find a conservation of the average water structure. We then use our method to sample dialanine and the folding of TrpCage, where we find a good agreement with simulation data reported in the literature. Finally, we apply our methodologies on the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers, while we speculate that especially the conformation entropy plays a major role in the formation of the fibril as a rate limiting factor.

  5. Within-subject template estimation for unbiased longitudinal image analysis.

    PubMed

    Reuter, Martin; Schmansky, Nicholas J; Rosas, H Diana; Fischl, Bruce

    2012-07-16

    Longitudinal image analysis has become increasingly important in clinical studies of normal aging and neurodegenerative disorders. Furthermore, there is a growing appreciation of the potential utility of longitudinally acquired structural images and reliable image processing to evaluate disease modifying therapies. Challenges have been related to the variability that is inherent in the available cross-sectional processing tools, to the introduction of bias in longitudinal processing and to potential over-regularization. In this paper we introduce a novel longitudinal image processing framework, based on unbiased, robust, within-subject template creation, for automatic surface reconstruction and segmentation of brain MRI of arbitrarily many time points. We demonstrate that it is essential to treat all input images exactly the same as removing only interpolation asymmetries is not sufficient to remove processing bias. We successfully reduce variability and avoid over-regularization by initializing the processing in each time point with common information from the subject template. The presented results show a significant increase in precision and discrimination power while preserving the ability to detect large anatomical deviations; as such they hold great potential in clinical applications, e.g. allowing for smaller sample sizes or shorter trials to establish disease specific biomarkers or to quantify drug effects. Copyright © 2012 Elsevier Inc. All rights reserved.

  6. Missing data and multiple imputation in clinical epidemiological research.

    PubMed

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data.
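
    As an illustration of the MAR workflow described above, here is a sketch using scikit-learn's IterativeImputer to draw several imputations and pool the point estimates (full Rubin's rules would also combine within- and between-imputation variances). The cohort variables and missingness mechanism are simulated, not from the paper.

    ```python
    import numpy as np
    import pandas as pd
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    rng = np.random.default_rng(7)
    n = 1000

    # Hypothetical cohort: blood pressure is missing at random (MAR), with
    # missingness depending on observed age, not on the unobserved value.
    age = rng.normal(60, 10, n)
    bp = 100 + 0.5 * age + rng.normal(0, 5, n)
    bp_obs = np.where(rng.uniform(size=n) < (age - 40) / 50, np.nan, bp)
    df = pd.DataFrame({"age": age, "bp": bp_obs})

    # Multiple imputation: draw several completed datasets and pool estimates.
    estimates = []
    for seed in range(5):
        imp = IterativeImputer(sample_posterior=True, random_state=seed)
        completed = imp.fit_transform(df)
        estimates.append(completed[:, 1].mean())
    print("pooled mean BP:", np.mean(estimates), " true mean:", bp.mean())
    ```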

  7. Missing data and multiple imputation in clinical epidemiological research

    PubMed Central

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data. PMID:28352203

  8. Adaptive enhanced sampling with a path-variable for the simulation of protein folding and aggregation.

    PubMed

    Peter, Emanuel K

    2017-12-07

    In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path-variable L based on the un-biased momenta p and displacements dq for the definition of the bias s applied to the system and derive 3 algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method which combines the first 2 methodologies. Through the analysis of the correlations between the bias and the un-biased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and acceleration in the sampling of the phase space. We apply our method on SPC/E water, where we find a conservation of the average water structure. We then use our method to sample dialanine and the folding of TrpCage, where we find a good agreement with simulation data reported in the literature. Finally, we apply our methodologies on the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers, while we speculate that especially the conformation entropy plays a major role in the formation of the fibril as a rate limiting factor.

  9. Logistic regression trees for initial selection of interesting loci in case-control studies

    PubMed Central

    Nickolov, Radoslav Z; Milanov, Valentin B

    2007-01-01

    Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of the multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557

  10. Treatment effects model for assessing disease management: measuring outcomes and strengthening program management.

    PubMed

    Wendel, Jeanne; Dumitras, Diana

    2005-06-01

    This paper describes an analytical methodology for obtaining statistically unbiased outcomes estimates for programs in which participation decisions may be correlated with variables that impact outcomes. This methodology is particularly useful for intraorganizational program evaluations conducted for business purposes. In this situation, data is likely to be available for a population of managed care members who are eligible to participate in a disease management (DM) program, with some electing to participate while others eschew the opportunity. The most pragmatic analytical strategy for in-house evaluation of such programs is likely to be the pre-intervention/post-intervention design in which the control group consists of people who were invited to participate in the DM program, but declined the invitation. Regression estimates of program impacts may be statistically biased if factors that impact participation decisions are correlated with outcomes measures. This paper describes an econometric procedure, the Treatment Effects model, developed to produce statistically unbiased estimates of program impacts in this type of situation. Two equations are estimated to (a) estimate the impacts of patient characteristics on decisions to participate in the program, and then (b) use this information to produce a statistically unbiased estimate of the impact of program participation on outcomes. This methodology is well-established in economics and econometrics, but has not been widely applied in the DM outcomes measurement literature; hence, this paper focuses on one illustrative application.
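
    The two-equation procedure described above is in the spirit of the Heckman-type treatment effects model: a probit for participation, then an outcome regression including a selection-correction (generalized residual) term. The sketch below is a minimal two-step control-function version under an assumed joint-normal data-generating process; all variable names and coefficients are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    rng = np.random.default_rng(3)
    n = 5000

    # Hypothetical DM-program data: sicker members are more likely to enroll,
    # and severity also drives cost, so a naive regression is biased.
    severity = rng.normal(size=n)
    instrument = rng.normal(size=n)          # shifts enrollment, not cost
    u = rng.normal(size=n)                   # unobserved participation taste
    enroll = (0.8 * severity + 0.7 * instrument + u > 0).astype(float)
    cost = 10 + 2.0 * severity - 1.5 * enroll + 0.6 * u + rng.normal(size=n)
    # True program effect on cost is -1.5.

    # Step 1: probit for the participation decision.
    Z = sm.add_constant(np.column_stack([severity, instrument]))
    probit = sm.Probit(enroll, Z).fit(disp=0)
    xb = Z @ probit.params
    # Generalized residual (inverse-Mills-type hazard) for each member.
    lam = np.where(enroll == 1, norm.pdf(xb) / norm.cdf(xb),
                   -norm.pdf(xb) / (1 - norm.cdf(xb)))

    # Step 2: outcome regression including the selection-correction term.
    X = sm.add_constant(np.column_stack([severity, enroll, lam]))
    print(sm.OLS(cost, X).fit().params)      # enroll coefficient near -1.5
    ```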

  11. Rapid Evolution of Ovarian-Biased Genes in the Yellow Fever Mosquito (Aedes aegypti).

    PubMed

    Whittle, Carrie A; Extavour, Cassandra G

    2017-08-01

    Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti. The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female-female competition or male-mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system (e.g., sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male-male sperm competition. Copyright © 2017 by the Genetics Society of America.

  12. Validation of sea ice models using an uncertainty-based distance metric for multiple model variables: NEW METRIC FOR SEA ICE MODEL VALIDATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Urrego-Blanco, Jorge R.; Hunke, Elizabeth C.; Urban, Nathan M.

    Here, we implement a variance-based distance metric (D_n) to objectively assess the skill of sea ice models when multiple output variables or uncertainties in both model predictions and observations need to be considered. The metric compares observation and model data pairs on common spatial and temporal grids, improving upon highly aggregated metrics (e.g., total sea ice extent or volume) by capturing the spatial character of model skill. The D_n metric is a gamma-distributed statistic that is more general than the χ² statistic commonly used to assess model fit, which requires the assumption that the model is unbiased and can only incorporate observational error in the analysis. The D_n statistic does not assume that the model is unbiased, and allows the incorporation of multiple observational data sets for the same variable and simultaneously for different variables, along with different types of variances that can characterize uncertainties in both observations and the model. This approach represents a step toward establishing a systematic framework for probabilistic validation of sea ice models. The methodology is also useful for model tuning by using the D_n metric as a cost function and incorporating model parametric uncertainty as part of a scheme to optimize model functionality. We apply this approach to evaluate different configurations of the standalone Los Alamos sea ice model (CICE), encompassing the parametric uncertainty in the model, and to find new sets of model configurations that produce better agreement than previous configurations between model and observational estimates of sea ice concentration and thickness.

  13. Validation of sea ice models using an uncertainty-based distance metric for multiple model variables: NEW METRIC FOR SEA ICE MODEL VALIDATION

    DOE PAGES

    Urrego-Blanco, Jorge R.; Hunke, Elizabeth C.; Urban, Nathan M.; ...

    2017-04-01

    Here, we implement a variance-based distance metric (D_n) to objectively assess the skill of sea ice models when multiple output variables or uncertainties in both model predictions and observations need to be considered. The metric compares observation and model data pairs on common spatial and temporal grids, improving upon highly aggregated metrics (e.g., total sea ice extent or volume) by capturing the spatial character of model skill. The D_n metric is a gamma-distributed statistic that is more general than the χ² statistic commonly used to assess model fit, which requires the assumption that the model is unbiased and can only incorporate observational error in the analysis. The D_n statistic does not assume that the model is unbiased, and allows the incorporation of multiple observational data sets for the same variable and simultaneously for different variables, along with different types of variances that can characterize uncertainties in both observations and the model. This approach represents a step toward establishing a systematic framework for probabilistic validation of sea ice models. The methodology is also useful for model tuning by using the D_n metric as a cost function and incorporating model parametric uncertainty as part of a scheme to optimize model functionality. We apply this approach to evaluate different configurations of the standalone Los Alamos sea ice model (CICE), encompassing the parametric uncertainty in the model, and to find new sets of model configurations that produce better agreement than previous configurations between model and observational estimates of sea ice concentration and thickness.

  14. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection

    PubMed Central

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-01-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices. PMID:25177107
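
    A minimal sketch of SVC ranking with scikit-learn: each feature is scored by the cross-validated performance of a classifier trained on that feature alone, and features are ranked by that score. The choice of dataset and of logistic regression as the single-variable classifier is illustrative; the paper's experiments span many classifiers.

    ```python
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)
    clf = LogisticRegression(max_iter=1000)

    # Single Variable Classifier score: CV accuracy using one feature at a time.
    scores = [cross_val_score(clf, X[:, [j]], y, cv=5).mean()
              for j in range(X.shape[1])]
    ranking = np.argsort(scores)[::-1]
    print("top 5 features by SVC score:", ranking[:5])
    ```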

  15. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection.

    PubMed

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-11-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices.

  16. Double sampling to estimate density and population trends in birds

    USGS Publications Warehouse

    Bart, Jonathan; Earnst, Susan L.

    2002-01-01

    We present a method for estimating density of nesting birds based on double sampling. The approach involves surveying a large sample of plots using a rapid method such as uncorrected point counts, variable circular plot counts, or the recently suggested double-observer method. A subsample of those plots is also surveyed using intensive methods to determine actual density. The ratio of the mean count on those plots (using the rapid method) to the mean actual density (as determined by the intensive searches) is used to adjust results from the rapid method. The approach works well when results from the rapid method are highly correlated with actual density. We illustrate the method with three years of shorebird surveys from the tundra in northern Alaska. In the rapid method, surveyors covered ~10 ha h-1 and surveyed each plot a single time. The intensive surveys involved three thorough searches, required ~3 h ha-1, and took 20% of the study effort. Surveyors using the rapid method detected an average of 79% of birds present. That detection ratio was used to convert the index obtained in the rapid method into an essentially unbiased estimate of density. Trends estimated from several years of data would also be essentially unbiased. Other advantages of double sampling are that (1) the rapid method can be changed as new methods become available, (2) domains can be compared even if detection rates differ, (3) total population size can be estimated, and (4) valuable ancillary information (e.g. nest success) can be obtained on intensive plots with little additional effort. We suggest that double sampling be used to test the assumption that rapid methods, such as variable circular plot and double-observer methods, yield density estimates that are essentially unbiased. The feasibility of implementing double sampling in a range of habitats needs to be evaluated.
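
    A minimal numerical sketch of the ratio adjustment described above, with simulated plot counts (all numbers hypothetical): the detection ratio estimated on the intensively searched subsample converts the rapid-method counts into a density estimate.

    ```python
    import numpy as np

    rng = np.random.default_rng(11)

    # Rapid counts on all plots; intensive searches on a random subsample
    # give the actual densities used to calibrate the rapid method.
    rapid_all = rng.poisson(8.0, size=200).astype(float)         # all 200 plots
    sub = rng.choice(200, size=40, replace=False)                # subsample
    actual_sub = rapid_all[sub] / 0.79 + rng.normal(0, 0.5, 40)  # ~79% detection

    # Ratio estimator: mean rapid count vs. mean actual density on the subsample.
    detection_ratio = rapid_all[sub].mean() / actual_sub.mean()
    density_estimate = rapid_all.mean() / detection_ratio
    print(f"detection ratio {detection_ratio:.2f}, density {density_estimate:.2f}")
    ```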

  17. Automated systematic random sampling and Cavalieri stereology of histologic sections demonstrating acute tubular necrosis after cardiac arrest and cardiopulmonary resuscitation in the mouse.

    PubMed

    Wakasaki, Rumie; Eiwaz, Mahaba; McClellan, Nicholas; Matsushita, Katsuyuki; Golgotiu, Kirsti; Hutchens, Michael P

    2018-06-14

    A technical challenge in translational models of kidney injury is determination of the extent of cell death. Histologic sections are commonly analyzed by area morphometry or unbiased stereology, but stereology requires specialized equipment. Therefore, a challenge to rigorous quantification would be addressed by an unbiased stereology tool with reduced equipment dependence. We hypothesized that it would be feasible to build a novel software component which would facilitate unbiased stereologic quantification on scanned slides, and that unbiased stereology would demonstrate greater precision and decreased bias compared with 2D morphometry. We developed a macro for the widely used image analysis program, Image J, and performed cardiac arrest with cardiopulmonary resuscitation (CA/CPR, a model of acute cardiorenal syndrome) in mice. Fluorojade-B stained kidney sections were analyzed using three methods to quantify cell death: gold standard stereology using a controlled stage and commercially-available software, unbiased stereology using the novel ImageJ macro, and quantitative 2D morphometry also using the novel macro. There was strong agreement between both methods of unbiased stereology (bias -0.004±0.006 with 95% limits of agreement -0.015 to 0.007). 2D morphometry demonstrated poor agreement and significant bias compared to either method of unbiased stereology. Unbiased stereology is facilitated by a novel macro for ImageJ and results agree with those obtained using gold-standard methods. Automated 2D morphometry overestimated tubular epithelial cell death and correlated modestly with values obtained from unbiased stereology. These results support widespread use of unbiased stereology for analysis of histologic outcomes of injury models.

  18. Entropic uncertainty relations and locking: Tight bounds for mutually unbiased bases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ballester, Manuel A.; Wehner, Stephanie

    We prove tight entropic uncertainty relations for a large number of mutually unbiased measurements. In particular, we show that a bound derived from the result by Maassen and Uffink [Phys. Rev. Lett. 60, 1103 (1988)] for two such measurements can in fact be tight for up to √d measurements in mutually unbiased bases. We then show that using more mutually unbiased bases does not always lead to a better locking effect. We prove that the optimal bound for the accessible information using up to √d specific mutually unbiased bases is log d/2, which is the same as can be achieved by using only two bases. Our result indicates that merely using mutually unbiased bases is not sufficient to achieve a strong locking effect, and we need to look for additional properties.

  19. Efficiency of using first-generation information during second-generation selection: results of computer simulation.

    Treesearch

    T.Z. Ye; K.J.S. Jayawickrama; G.R. Johnson

    2004-01-01

    BLUP (Best linear unbiased prediction) method has been widely used in forest tree improvement programs. Since one of the properties of BLUP is that related individuals contribute to the predictions of each other, it seems logical that integrating data from all generations and from all populations would improve both the precision and accuracy in predicting genetic...
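
    For context, BLUP predictions in such programs come from Henderson's mixed model equations for y = Xb + Zu + e with var(u) = A·σ_u² and var(e) = I·σ_e²; the inverse relationship matrix A⁻¹ is what lets related individuals contribute to each other's predictions. A toy sketch (the matrices and variance ratio are illustrative):

    ```python
    import numpy as np

    # Henderson's mixed model equations:
    #   [X'X   X'Z              ] [b]   [X'y]
    #   [Z'X   Z'Z + A^-1 * lam ] [u] = [Z'y],   lam = sigma_e^2 / sigma_u^2
    def blup(y, X, Z, A, lam):
        lhs = np.block([[X.T @ X, X.T @ Z],
                        [Z.T @ X, Z.T @ Z + np.linalg.inv(A) * lam]])
        rhs = np.concatenate([X.T @ y, Z.T @ y])
        sol = np.linalg.solve(lhs, rhs)
        return sol[:X.shape[1]], sol[X.shape[1]:]   # BLUE of b, BLUP of u

    # Toy example: 4 trees, one overall mean, relationship matrix A.
    y = np.array([10.0, 12.0, 11.0, 14.0])
    X = np.ones((4, 1))
    Z = np.eye(4)
    A = np.array([[1.0, 0.5, 0.0, 0.0], [0.5, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.25], [0.0, 0.0, 0.25, 1.0]])
    b, u = blup(y, X, Z, A, lam=2.0)
    print("mean:", b, " predicted breeding values:", u)
    ```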

  20. Systematic sampling of discrete and continuous populations: sample selection and the choice of estimator

    Treesearch

    Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire

    2009-01-01

    Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...

  1. Understanding the physical attractiveness literature: Qualitative reviews versus meta-analysis.

    PubMed

    Feingold, Alan

    2017-01-01

    The target article is a qualitative review of selected findings in the physical attractiveness literature. This commentary explains why the meta-analytic approach, frequently used by other attractiveness reviewers, is preferable for drawing unbiased conclusions about the effects of attractiveness. The article's main contribution is affording a foundation for subsequent meta-analysis of the studies discussed in a subjective fashion.

  2. On the calculation of puckering free energy surfaces

    NASA Astrophysics Data System (ADS)

    Sega, M.; Autieri, E.; Pederiva, F.

    2009-06-01

    Cremer-Pople puckering coordinates appear to be the natural candidate variables to explore the conformational space of cyclic compounds and in literature different parametrizations have been used to this end. However, while every parametrization is equivalent in identifying conformations, it is not obvious that they can also act as proper collective variables for the exploration of the puckered conformations free energy surface. It is shown that only the polar parametrization is fit to produce an unbiased estimate of the free energy landscape. As an example, the case of a six-membered ring, glucuronic acid, is presented, showing the artifacts that are generated when a wrong parametrization is used.

  3. On the calculation of puckering free energy surfaces.

    PubMed

    Sega, M; Autieri, E; Pederiva, F

    2009-06-14

    Cremer-Pople puckering coordinates appear to be the natural candidate variables to explore the conformational space of cyclic compounds and in literature different parametrizations have been used to this end. However, while every parametrization is equivalent in identifying conformations, it is not obvious that they can also act as proper collective variables for the exploration of the puckered conformations free energy surface. It is shown that only the polar parametrization is fit to produce an unbiased estimate of the free energy landscape. As an example, the case of a six-membered ring, glucuronic acid, is presented, showing the artifacts that are generated when a wrong parametrization is used.

  4. [Application of ordinary Kriging method in entomologic ecology].

    PubMed

    Zhang, Runjie; Zhou, Qiang; Chen, Cuixian; Wang, Shousong

    2003-01-01

    Geostatistics is a statistical method based on regionalized variables that uses the variogram as a tool to analyze the spatial structure and patterns of organisms. When fitting the variogram over a large range, an optimal fit cannot always be obtained automatically, but an interactive human-computer procedure can be used to optimize the parameters of the spherical models. In this paper, this method and weighted polynomial regression were used to fit the one-step spherical model, the two-step spherical model, and the linear function model, and the available nearby samples were used in the ordinary Kriging procedure, which provides the best linear unbiased estimate under the unbiasedness constraint. The sums of squared deviations between estimated and measured values were computed for the different theoretical models, and the corresponding graphs are shown. The results indicated that the two-step spherical model gave the best fit, and the one-step spherical model was better than the linear function model.
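
    A compact sketch of the ordinary Kriging step described above: the weights solve a linear system built from the fitted variogram, with a Lagrange multiplier enforcing the unbiasedness constraint that the weights sum to one. The spherical-model parameters, coordinates, and counts below are hypothetical.

    ```python
    import numpy as np

    def spherical(h, nugget, sill, range_a):
        """Spherical variogram model; gamma(0) = 0 by definition."""
        h = np.asarray(h, dtype=float)
        g = nugget + (sill - nugget) * (1.5 * h / range_a - 0.5 * (h / range_a) ** 3)
        return np.where(h >= range_a, sill, np.where(h == 0, 0.0, g))

    def ordinary_kriging(coords, values, target, vario):
        """BLUE at `target` under the constraint sum(weights) = 1."""
        n = len(values)
        d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        K = np.ones((n + 1, n + 1))
        K[:n, :n] = vario(d)                 # pairwise variogram values
        K[n, n] = 0.0                        # Lagrange-multiplier row/column
        rhs = np.ones(n + 1)
        rhs[:n] = vario(np.linalg.norm(coords - target, axis=1))
        sol = np.linalg.solve(K, rhs)
        w, mu = sol[:n], sol[n]
        return w @ values, w @ rhs[:n] + mu  # estimate, kriging variance

    coords = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
    counts = np.array([5.0, 7.0, 6.0, 9.0])       # e.g., insect counts per quadrat
    vario = lambda h: spherical(h, nugget=0.5, sill=3.0, range_a=2.0)
    print(ordinary_kriging(coords, counts, np.array([0.5, 0.5]), vario))
    ```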

  5. Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling

    NASA Astrophysics Data System (ADS)

    Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

    2017-04-01

    Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger a model's evidence, the more support the model receives among a collection of hypotheses, as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones selected by information criteria that incorporate only the likelihood maximum. Since the evidence is not particularly easy to estimate in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
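
    GMIS proper fits a Gaussian mixture to DREAM posterior samples and applies bridge sampling; the importance-sampling core can be sketched on a toy conjugate model where the evidence is known in closed form. A single Gaussian proposal stands in for the mixture, and all numbers are illustrative.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    # Toy model: y_i ~ N(theta, 1) with prior theta ~ N(0, 1), so the evidence
    # p(y) has a closed form against which the estimator can be checked.
    y = rng.normal(0.7, 1.0, size=20)

    def log_post_unnorm(theta):
        return (stats.norm.logpdf(theta, 0, 1)
                + stats.norm.logpdf(y[:, None], theta, 1).sum(axis=0))

    # Gaussian proposal fitted to the (here, known) posterior, over-dispersed
    # slightly so its tails cover the posterior.
    post_mean = y.sum() / (len(y) + 1)
    post_sd = np.sqrt(1 / (len(y) + 1))
    q = stats.norm(post_mean, 1.5 * post_sd)
    theta = q.rvs(size=100_000, random_state=1)

    # Importance-sampling evidence estimate, computed stably in log space.
    log_w = log_post_unnorm(theta) - q.logpdf(theta)
    evidence = np.exp(log_w - log_w.max()).mean() * np.exp(log_w.max())
    print("IS evidence:", evidence)

    # Closed form: y is jointly normal with mean 0 and covariance I + 11'.
    cov = np.eye(len(y)) + np.ones((len(y), len(y)))
    print("exact evidence:", stats.multivariate_normal(np.zeros(len(y)), cov).pdf(y))
    ```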

  6. Natural selection and genetic variation for reproductive reaction norms in a wild bird population.

    PubMed

    Brommer, Jon E; Merilä, Juha; Sheldon, Ben C; Gustafsson, Lars

    2005-06-01

    Many morphological and life-history traits show phenotypic plasticity that can be described by reaction norms, but few studies have attempted individual-level analyses of reaction norms in the wild. We analyzed variation in individual reaction norms between laying date and three climatic variables (local temperature, local rainfall, and North Atlantic Oscillation) of 1126 female collared flycatchers (Ficedula albicollis) with a restricted maximum likelihood linear mixed model approach using random-effect best linear unbiased predictor estimates for the elevation (i.e., expected laying date in the average environment) and slope (i.e., adjustment in laying date as a function of environment) of females' reaction norms. Variation in laying date was best explained by local temperature, and individual females differed in both the elevation and the slope of their laying date-temperature reaction norms. As revealed by animal model analyses, there was weak evidence for additive genetic variance of elevation (h2 +/- SE = 0.09 +/- 0.09), whereas there was no evidence for heritability of slope (h2 +/- SE = 0.00 +/- 0.01). Selection analysis, using a female's lifetime production of fledglings or recruits as an estimate of her fitness, revealed significant selection for a lower phenotypic value and breeding value for elevation (i.e., earlier laying date at the average temperature). There was selection for steeper phenotypic values of slope (i.e., greater plasticity in the adjustment of laying date to temperature), but no significant selection on the breeding values of slope. Although these results suggest that phenotypic laying date is influenced by additive genetic factors, as well as by an interaction with the environment, selection on plasticity would not produce an evolutionary response.
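
    A random-regression ("reaction norm") model of this kind, in which each female receives BLUPs for both elevation and slope, can be sketched with statsmodels. The data below are entirely synthetic, and the variable names (laydate, temp, female) are hypothetical stand-ins for the study's variables.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(7)

    # Hypothetical data: laying date vs. local temperature, 5 records per female.
    female = np.repeat(np.arange(100), 5)
    temp = rng.normal(0.0, 1.0, female.size)
    elev = rng.normal(0.0, 2.0, 100)[female]    # per-female elevation deviation
    slope = rng.normal(-3.0, 0.5, 100)[female]  # per-female plasticity (slope)
    laydate = 120 + elev + slope * temp + rng.normal(0.0, 3.0, female.size)
    df = pd.DataFrame({"laydate": laydate, "temp": temp, "female": female})

    # Random-regression model: random intercept (elevation) and random slope.
    model = sm.MixedLM.from_formula("laydate ~ temp", groups="female",
                                    re_formula="~temp", data=df)
    fit = model.fit()
    blups = fit.random_effects  # dict: female -> (elevation, slope) deviations
    print(fit.summary())
    ```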

  7. On the degrees of freedom of reduced-rank estimators in multivariate regression

    PubMed Central

    Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.

    2015-01-01

    We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155
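
    The reduced-rank estimator itself, together with the naive degrees-of-freedom count that the paper contrasts with its unbiased estimator, can be sketched as follows. This is not the paper's exact df formula; it only shows the classical rank-r estimator obtained by truncating the SVD of the fitted values, plus the naive parameter count r(p + q - r) for a rank-r coefficient matrix.

    ```python
    import numpy as np

    def reduced_rank_fit(X, Y, r):
        """Classical rank-r estimator: project the least-squares solution onto
        the top-r right singular directions of the fitted values X @ B_ls."""
        B_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
        _, _, Vt = np.linalg.svd(X @ B_ls, full_matrices=False)
        P = Vt[:r].T @ Vt[:r]  # projector in response space
        return B_ls @ P

    rng = np.random.default_rng(0)
    n, p, q, r = 100, 8, 5, 2
    X = rng.normal(size=(n, p))
    B = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))  # true rank-r signal
    Y = X @ B + rng.normal(size=(n, q))

    B_hat = reduced_rank_fit(X, Y, r)
    df_naive = r * (p + q - r)  # naive count: parameters of a rank-r p x q matrix
    print(df_naive)
    ```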

  8. Host Galaxy Morphologies Of Hard X-ray Selected AGN From The Swift BAT Survey

    NASA Astrophysics Data System (ADS)

    Koss, Michael; Mushotzky, R.; Veilleux, S.

    2009-01-01

    Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN visible only at hard X-ray and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of 258 AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. Of these host galaxies, only a fraction, 29%, have high-quality optical images, predominantly from the SDSS. In addition, about 33% show peculiar morphologies and interactions. In 2008, we observed 110 of these targets at Kitt Peak with the 2.1 m telescope in the SDSS bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, star formation, and AGN luminosity.

  9. The Swift GRB Host Galaxy Legacy Survey

    NASA Astrophysics Data System (ADS)

    Perley, Daniel A.

    2015-01-01

    I introduce the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population across its entire redshift range. Using unbiased selection criteria we have designated a subset of 130 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, and Gemini to obtain complementary optical/NIR photometry to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass functions and their evolution with redshift between z=0 and z=5, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to probe cosmic star-formation.

  10. Conformational free energy modeling of druglike molecules by metadynamics in the WHIM space.

    PubMed

    Spiwok, Vojtěch; Hlat-Glembová, Katarína; Tvaroška, Igor; Králová, Blanka

    2012-03-26

    Protein-ligand affinities can be significantly influenced not only by the interaction itself but also by the conformational equilibrium of both binding partners, the free ligand and the free protein. Identification of the important conformational families of a ligand and prediction of their thermodynamics are important for efficient ligand design. Here we report conformational free energy modeling of nine small-molecule drugs in explicitly modeled water by metadynamics, with a bias potential applied in the space of weighted holistic invariant molecular (WHIM) descriptors. Application of metadynamics enhances conformational sampling compared to unbiased molecular dynamics simulation and allows prediction of the relative free energies of key conformations. Selected free energy minima and one example of a transition state were tested by a series of unbiased molecular dynamics simulations. Comparison of the free energy surfaces of free and target-bound Imatinib provides an estimate of the free energy penalty of the conformational change induced by binding to the target. © 2012 American Chemical Society

  11. The long-term effects of military conscription on mortality: estimates from the Vietnam-era draft lottery.

    PubMed

    Conley, Dalton; Heerwig, Jennifer

    2012-08-01

    Research on the effects of Vietnam military service suggests that Vietnam veterans experienced significantly higher mortality than the civilian population at large. These results, however, may be biased by nonrandom selection into the military if unobserved background differences between veterans and nonveterans affect mortality directly. To generate unbiased estimates of the effect of exposure to conscription on mortality, the present study compares the observed proportion of draft-eligible male decedents born 1950-1952 to (1) the expected proportion of draft-eligible male decedents given Vietnam draft-eligibility cutoffs; and (2) the observed proportion of draft-eligible birth dates among female decedents. The results demonstrate no effect of draft exposure on mortality, including for cause-specific death rates. When we examine population subgroups (including splits by race, educational attainment, nativity, and marital status), we find weak evidence for an interaction between education and draft eligibility. This interaction works in the opposite direction of the putative education-enhancing, mortality-reducing effects of conscription that have, in the past, led to concern about a potential exclusion-restriction violation in instrumental variable (IV) regression models. We suggest that previous research, which has shown that Vietnam-era veterans experienced significantly higher mortality than nonveterans, might be biased by nonrandom selection into the military and should be further investigated.

  12. Machine Learning methods for Quantitative Radiomic Biomarkers.

    PubMed

    Parmar, Chintan; Grossmann, Patrick; Bussink, Johan; Lambin, Philippe; Aerts, Hugo J W L

    2015-08-17

    Radiomics extracts and mines large numbers of medical imaging features that quantify tumor phenotypic characteristics. Highly accurate and reliable machine-learning approaches can drive the success of radiomic applications in clinical care. In this radiomic study, fourteen feature selection methods and twelve classification methods were examined in terms of their performance and stability for predicting overall survival. A total of 440 radiomic features were extracted from pre-treatment computed tomography (CT) images of 464 lung cancer patients. To ensure unbiased evaluation of the different machine-learning methods, publicly available implementations along with reported parameter configurations were used. Furthermore, we used two independent radiomic cohorts for training (n = 310 patients) and validation (n = 154 patients). We identified that the Wilcoxon-test-based feature selection method WLCX (stability = 0.84 ± 0.05, AUC = 0.65 ± 0.02) and the random forest classification method RF (RSD = 3.52%, AUC = 0.66 ± 0.03) had the highest prognostic performance with high stability against data perturbation. Our variability analysis indicated that the choice of classification method is the most dominant source of performance variation (34.21% of total variance). Identification of optimal machine-learning methods for radiomic applications is a crucial step towards stable and clinically relevant radiomic biomarkers, providing a non-invasive way of quantifying and monitoring tumor phenotypic characteristics in clinical practice.
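
    The winning WLCX + RF combination can be approximated with standard tools. The sketch below, on synthetic data standing in for the 440 radiomic features, ranks features by Wilcoxon rank-sum p-values on the training cohort only and then fits a random forest; it mirrors the shape of the pipeline, not the study's exact configuration.

    ```python
    import numpy as np
    from scipy.stats import ranksums
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(42)

    # Synthetic stand-in: 200 patients x 440 radiomic features, binary outcome.
    X = rng.normal(size=(200, 440))
    y = rng.integers(0, 2, size=200)  # e.g. 2-year overall survival
    X[y == 1, :10] += 0.8             # make the first 10 features informative

    train, test = np.arange(140), np.arange(140, 200)

    # WLCX: Wilcoxon rank-sum p-value per feature, on the training cohort only.
    pvals = np.array([ranksums(X[train][y[train] == 0, j],
                               X[train][y[train] == 1, j]).pvalue
                      for j in range(X.shape[1])])
    top = np.argsort(pvals)[:30]  # keep the 30 most discriminative features

    # RF: random forest classifier on the selected features.
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    clf.fit(X[train][:, top], y[train])
    print(roc_auc_score(y[test], clf.predict_proba(X[test][:, top])[:, 1]))
    ```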

  13. Enrichment assessment of multiple virtual screening strategies for Toll-like receptor 8 agonists based on a maximal unbiased benchmarking data set.

    PubMed

    Pei, Fen; Jin, Hongwei; Zhou, Xin; Xia, Jie; Sun, Lidan; Liu, Zhenming; Zhang, Liangren

    2015-11-01

    Toll-like receptor 8 agonists, which activate adaptive immune responses by inducing robust production of T-helper 1-polarizing cytokines, are promising candidates for vaccine adjuvants. As the binding site of toll-like receptor 8 is large and highly flexible, virtual screening by any individual method has inevitable limitations; thus, a comprehensive comparison of different methods may provide insights into an effective strategy for the discovery of novel toll-like receptor 8 agonists. In this study, the performance of knowledge-based pharmacophore, shape-based 3D screening, and combined strategies was assessed against a maximal unbiased benchmarking data set containing 13 actives and 1302 decoys specialized for toll-like receptor 8 agonists. Prior structure-activity relationship knowledge was incorporated in knowledge-based pharmacophore generation, and a set of antagonists was innovatively used to verify the selectivity of the selected knowledge-based pharmacophore. The benchmarking data set was generated from our recently developed 'mubd-decoymaker' protocol. The enrichment assessment demonstrated considerable performance for our selected three-layer virtual screening strategy: knowledge-based pharmacophore (Phar1) screening, shape-based 3D similarity search (Q4_combo), and then Gold docking screening. This virtual screening strategy could be further employed to perform large-scale database screening and to discover novel toll-like receptor 8 agonists. © 2015 John Wiley & Sons A/S.

  14. Fast State-Space Methods for Inferring Dendritic Synaptic Connectivity

    DTIC Science & Technology

    2013-08-08

    Across repeated simulations with the same parameters, the LARS/LARS+ estimates of synaptic strength are downward biased and have low variance, whereas the OLS estimates are unbiased but have high variance.

  15. Assessing mediation using marginal structural models in the presence of confounding and moderation

    PubMed Central

    Coffman, Donna L.; Zhong, Wei

    2012-01-01

    This paper presents marginal structural models (MSMs) with inverse propensity weighting (IPW) for assessing mediation. Generally, individuals are not randomly assigned to levels of the mediator. Therefore, confounders of the mediator and outcome may exist that limit causal inferences, a goal of mediation analysis. Either regression adjustment or IPW can be used to take confounding into account, but IPW has several advantages. Regression adjustment of even one confounder of the mediator and outcome that has been influenced by treatment results in biased estimates of the direct effect (i.e., the effect of treatment on the outcome that does not go through the mediator). One advantage of IPW is that it can properly adjust for this type of confounding, assuming there are no unmeasured confounders. Further, we illustrate that IPW estimation provides unbiased estimates of all effects when there is a baseline moderator variable that interacts with the treatment, when there is a baseline moderator variable that interacts with the mediator, and when the treatment interacts with the mediator. IPW estimation also provides unbiased estimates of all effects in the presence of non-randomized treatments. In addition, for testing mediation we propose a test of the null hypothesis of no mediation. Finally, we illustrate this approach with an empirical data set in which the mediator is continuous, as is often the case in psychological research. PMID:22905648
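
    A minimal numerical sketch of IPW for a mediator is given below, using synthetic data: treatment T is randomized, C confounds the mediator-outcome relation, and stabilized weights are formed from two logistic regressions. The weighted regression then recovers the direct effect of T. All names and effect sizes are invented for illustration; this is not the paper's procedure.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    n = 5000

    # Synthetic data: randomized treatment T, confounder C of mediator/outcome,
    # binary mediator M, continuous outcome Y (true direct effect of T = 1.0).
    T = rng.integers(0, 2, n)
    C = rng.normal(size=n)
    M = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * T + 1.0 * C))))
    Y = 1.0 * T + 2.0 * M + 1.5 * C + rng.normal(size=n)

    # Stabilized inverse-propensity weights for the mediator:
    # numerator P(M | T), denominator P(M | T, C).
    num = LogisticRegression().fit(T[:, None], M).predict_proba(T[:, None])
    den = LogisticRegression().fit(np.c_[T, C], M).predict_proba(np.c_[T, C])
    w = num[np.arange(n), M] / den[np.arange(n), M]

    # Weighted regression of Y on T and M estimates the MSM parameters;
    # the coefficient on T is the direct effect (approx. 1.0 here).
    Xd = np.c_[np.ones(n), T, M]
    sw = np.sqrt(w)
    beta = np.linalg.lstsq(Xd * sw[:, None], Y * sw, rcond=None)[0]
    print(beta)
    ```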

  16. Assessing mediation using marginal structural models in the presence of confounding and moderation.

    PubMed

    Coffman, Donna L; Zhong, Wei

    2012-12-01

    This article presents marginal structural models with inverse propensity weighting (IPW) for assessing mediation. Generally, individuals are not randomly assigned to levels of the mediator. Therefore, confounders of the mediator and outcome may exist that limit causal inferences, a goal of mediation analysis. Either regression adjustment or IPW can be used to take confounding into account, but IPW has several advantages. Regression adjustment of even one confounder of the mediator and outcome that has been influenced by treatment results in biased estimates of the direct effect (i.e., the effect of treatment on the outcome that does not go through the mediator). One advantage of IPW is that it can properly adjust for this type of confounding, assuming there are no unmeasured confounders. Further, we illustrate that IPW estimation provides unbiased estimates of all effects when there is a baseline moderator variable that interacts with the treatment, when there is a baseline moderator variable that interacts with the mediator, and when the treatment interacts with the mediator. IPW estimation also provides unbiased estimates of all effects in the presence of nonrandomized treatments. In addition, for testing mediation we propose a test of the null hypothesis of no mediation. Finally, we illustrate this approach with an empirical data set in which the mediator is continuous, as is often the case in psychological research. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  17. Comparison of estimates of hardwood bole volume using importance sampling, the centroid method, and some taper equations

    Treesearch

    Harry V., Jr. Wiant; Michael L. Spangler; John E. Baumgras

    2002-01-01

    Various taper systems and the centroid method were compared to unbiased volume estimates made by importance sampling for 720 hardwood trees selected throughout the state of West Virginia. Only the centroid method consistently gave volume estimates that did not differ significantly from those made by importance sampling, although some taper equations did well for most...

  18. Evaluation of Scat Deposition Transects versus Radio Telemetry for Developing a Species Distribution Model for a Rare Desert Carnivore, the Kit Fox.

    PubMed

    Dempsey, Steven J; Gese, Eric M; Kluever, Bryan M; Lonsinger, Robert C; Waits, Lisette P

    2015-01-01

    Development and evaluation of noninvasive methods for monitoring species distribution and abundance is a growing area of ecological research. While noninvasive methods have the advantage of reducing the risks associated with capture, comparisons to methods using more traditional invasive sampling are lacking. Historically, kit foxes (Vulpes macrotis) occupied the desert and semi-arid regions of southwestern North America. Once the most abundant carnivore in the Great Basin Desert of Utah, the species is now considered rare. In recent decades, attempts have been made to model the environmental variables influencing kit fox distribution. Using noninvasive scat deposition surveys to determine kit fox presence, we modeled resource selection functions to predict kit fox distribution using three popular techniques (Maxent, fixed-effects, and mixed-effects generalized linear models) and compared these with similar models developed from invasive sampling (telemetry locations from radio-collared foxes). Resource selection functions were developed using a combination of landscape variables including elevation, slope, aspect, vegetation height, and soil type. All models were tested against subsequent scat collections as a method of model validation. We demonstrate the importance of comparing multiple model types for development of resource selection functions used to predict a species distribution, and of evaluating the importance of environmental variables on species distribution. All models we examined showed a large effect of elevation on kit fox presence, followed by slope and vegetation height. However, the invasive sampling method (i.e., radio-telemetry) appeared to be better at determining resource selection, and therefore may be more robust in predicting kit fox distribution. In contrast, the distribution maps created from the noninvasive sampling (i.e., scat transects) were significantly different from those of the invasive method; thus scat transects may be appropriate when used in an occupancy framework to predict species distribution. We concluded that while scat deposition transects may be useful for monitoring kit fox abundance and possibly occupancy, they do not appear to be appropriate for determining resource selection. On our study area, scat transects were biased toward roadways, while data collected using radio-telemetry were dictated by movements of the kit foxes themselves. We recommend that future studies applying noninvasive scat sampling consider a more robust random sampling design across the landscape (e.g., random transects or more complete road coverage), which would provide a more accurate and unbiased depiction of resource selection useful for predicting kit fox distribution.

  19. Sodium Binding Sites and Permeation Mechanism in the NaChBac Channel: A Molecular Dynamics Study.

    PubMed

    Guardiani, Carlo; Rodger, P Mark; Fedorenko, Olena A; Roberts, Stephen K; Khovanov, Igor A

    2017-03-14

    NaChBac was the first discovered bacterial sodium voltage-dependent channel, yet computational studies are still limited due to the lack of a crystal structure. In this work, a pore-only construct built using the NavMs template was investigated using unbiased molecular dynamics and metadynamics. The potential of mean force (PMF) from the unbiased run features four minima, three of which correspond to sites IN, CEN, and HFS discovered in NavAb. During the run, the selectivity filter (SF) is spontaneously occupied by two ions, and frequent access of a third one is often observed. In the innermost sites IN and CEN, Na + is fully hydrated by six water molecules and occupies an on-axis position. In site HFS sodium interacts with a glutamate and a serine from the same subunit and is forced to adopt an off-axis placement. Metadynamics simulations biasing one and two ions show an energy barrier in the SF that prevents single-ion permeation. An analysis of the permeation mechanism was performed both computing minimum energy paths in the axial-axial PMF and through a combination of Markov state modeling and transition path theory. Both approaches reveal a knock-on mechanism involving at least two but possibly three ions. The currents predicted from the unbiased simulation using linear response theory are in excellent agreement with single-channel patch-clamp recordings.

  20. [Imputing missing data in public health: general concepts and application to dichotomous variables].

    PubMed

    Hernández, Gilma; Moriña, David; Navarro, Albert

    The presence of missing data in collected variables is common in health surveys, but the subsequent imputation of those data at the time of analysis is not. Working with imputed data may have certain benefits regarding the precision of the estimators and the unbiased identification of associations between variables. The imputation process is probably still little understood by many non-statisticians, who view it as highly complex and with an uncertain goal. To clarify these questions, this note aims to provide a straightforward, non-exhaustive overview of the imputation process so that public health researchers can ascertain its strengths. All of this is discussed in the context of dichotomous variables, which are commonplace in public health. To illustrate these concepts, an example is introduced in which missing data are handled by means of simple and multiple imputation. Copyright © 2017 SESPAS. Published by Elsevier España, S.L.U. All rights reserved.
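
    The multiple-imputation workflow for a dichotomous variable can be sketched in a few lines. The example below is deliberately simplified ("improper" MI, in that the imputation-model parameters are not themselves redrawn between imputations), with a synthetic smoking variable that is missing at random given age, and pooling via Rubin's rules.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, m = 1000, 20  # sample size, number of imputations

    # Synthetic survey: smoking status (dichotomous) depends on age and is
    # missing at random for about 30% of respondents.
    age = rng.uniform(20, 70, n)
    smoker = rng.binomial(1, 1 / (1 + np.exp(-(2.0 - 0.05 * age))))
    miss = rng.random(n) < 0.3
    obs = ~miss

    estimates, variances = [], []
    for _ in range(m):
        # Imputation model on observed cases; imputed values are *sampled*
        # (not rounded) so that imputation uncertainty is propagated.
        imp = LogisticRegression().fit(age[obs, None], smoker[obs])
        p = imp.predict_proba(age[miss, None])[:, 1]
        filled = smoker.copy()
        filled[miss] = rng.binomial(1, p)
        prev = filled.mean()  # analysis of interest: smoking prevalence
        estimates.append(prev)
        variances.append(prev * (1 - prev) / n)

    # Rubin's rules: pooled estimate and total (within + between) variance.
    qbar = np.mean(estimates)
    T_var = np.mean(variances) + (1 + 1 / m) * np.var(estimates, ddof=1)
    print(qbar, np.sqrt(T_var))
    ```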

  1. Metabolomic prediction of yield in hybrid rice.

    PubMed

    Xu, Shizhong; Xu, Yang; Gong, Liang; Zhang, Qifa

    2016-10-01

    Rice (Oryza sativa) provides a staple food source for more than 50% of the world's population. An increase in yield can significantly contribute to global food security. Hybrid breeding can potentially help to meet this goal because hybrid rice often shows a considerable increase in yield when compared with pure-bred cultivars. We recently developed a marker-guided prediction method for hybrid yield and showed a substantial increase in yield through genomic hybrid breeding. We now have transcriptomic and metabolomic data as potential resources for prediction. Using six prediction methods, including least absolute shrinkage and selection operator (LASSO), best linear unbiased prediction (BLUP), stochastic search variable selection, partial least squares, and support vector machines using the radial basis function and polynomial kernel function, we found that the predictability of hybrid yield can be further increased using these omic data. LASSO and BLUP are the most efficient methods for yield prediction. For high heritability traits, genomic data remain the most efficient predictors. When metabolomic data are used, the predictability of hybrid yield is almost doubled compared with genomic prediction. Of the 21 945 potential hybrids derived from 210 recombinant inbred lines, selection of the top 10 hybrids predicted from metabolites would lead to a ~30% increase in yield. We hypothesize that each metabolite represents a biologically built-in genetic network for yield; thus, using metabolites for prediction is equivalent to using information integrated from these hidden genetic networks for yield prediction. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
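
    As a schematic of the prediction comparison, the sketch below runs cross-validated LASSO and a ridge regression (which, on centered marker data, is the penalized-regression analogue of BLUP under standard assumptions) on a synthetic predictor matrix standing in for markers or metabolites. Predictability is measured as the correlation between observed and cross-validation-predicted values; none of the dimensions or effect sizes come from the paper.

    ```python
    import numpy as np
    from sklearn.linear_model import LassoCV, Ridge
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(1)

    # Synthetic stand-in: 210 lines x 1000 predictors (markers or metabolites).
    X = rng.normal(size=(210, 1000))
    beta = np.zeros(1000)
    beta[:40] = rng.normal(0.0, 0.5, 40)      # 40 truly informative predictors
    y = X @ beta + rng.normal(0.0, 1.0, 210)  # "yield"

    # LASSO (inner CV picks the penalty) vs. ridge as a BLUP-like shrinker.
    lasso_pred = cross_val_predict(LassoCV(cv=5), X, y, cv=10)
    ridge_pred = cross_val_predict(Ridge(alpha=1000.0), X, y, cv=10)

    print(np.corrcoef(y, lasso_pred)[0, 1], np.corrcoef(y, ridge_pred)[0, 1])
    ```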

  2. A Scaled Framework for CRISPR Editing of Human Pluripotent Stem Cells to Study Psychiatric Disease.

    PubMed

    Hazelbaker, Dane Z; Beccard, Amanda; Bara, Anne M; Dabkowski, Nicole; Messana, Angelica; Mazzucato, Patrizia; Lam, Daisy; Manning, Danielle; Eggan, Kevin; Barrett, Lindy E

    2017-10-10

    Scaling of CRISPR-Cas9 technology in human pluripotent stem cells (hPSCs) represents an important step for modeling complex disease and developing drug screens in human cells. However, variables affecting the scaling efficiency of gene editing in hPSCs remain poorly understood. Here, we report a standardized CRISPR-Cas9 approach, with robust benchmarking at each step, to successfully target and genotype a set of psychiatric disease-implicated genes in hPSCs and provide a resource of edited hPSC lines for six of these genes. We found that transcriptional state and nucleosome positioning around targeted loci was not correlated with editing efficiency. However, editing frequencies varied between different hPSC lines and correlated with genomic stability, underscoring the need for careful cell line selection and unbiased assessments of genomic integrity. Together, our step-by-step quantification and in-depth analyses provide an experimental roadmap for scaling Cas9-mediated editing in hPSCs to study psychiatric disease, with broader applicability for other polygenic diseases. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  3. EVLA observations of radio-loud quasars selected to study radio orientation

    NASA Astrophysics Data System (ADS)

    Maithil, Jaya; Brotherton, Michael S.; Runnoe, Jessie; Wardle, John F. C.; DiPompeo, Michael; De Breuck, Carlos; Wills, Beverley J.

    2018-06-01

    We present preliminary work to develop an unbiased sample of radio-loud quasars to test orientation indicators. We have obtained radio data for 147 radio-loud quasars using the EVLA at 10 GHz with the A-array. With these high-resolution data we have measured the uncontaminated core flux density to determine orientation indicators based on radio core dominance. The radio cores of quasars have a flat spectrum over a broad range of frequencies, so we expect the core flux densities at the FIRST and observed frequencies to be the same in the absence of variability. Jackson & Brown (2012) pointed out that survey measurements of core flux density, such as those from FIRST, often lack the spatial resolution to distinguish cores from extended emission. Our measurements show that at FIRST spatial resolution, core flux measurements are indeed systematically high. Our results establish that orientation studies need high-resolution radio data rather than survey data, and that the optical emission is a better normalization than the extended radio emission for a core dominance parameter to track orientation.

  4. The effect of memory and context changes on color matches to real objects.

    PubMed

    Allred, Sarah R; Olkkonen, Maria

    2015-07-01

    Real-world color identification tasks often require matching the color of objects between contexts and after a temporal delay, thus placing demands on both perceptual and memory processes. Although the mechanisms of matching colors between different contexts have been widely studied under the rubric of color constancy, little research has investigated the role of long-term memory in such tasks or how memory interacts with color constancy. To investigate this relationship, observers made color matches to real study objects that spanned color space, and we independently manipulated the illumination impinging on the objects, the surfaces in which objects were embedded, and the delay between seeing the study object and selecting its color match. Adding a 10-minute delay increased both the bias and variability of color matches compared to a baseline condition. These memory errors were well accounted for by modeling memory as a noisy but unbiased version of perception constrained by the matching methods. Surprisingly, we did not observe significant increases in errors when illumination and surround changes were added to the 10-minute delay, although the context changes alone did elicit significant errors.

  5. SU-F-T-130: [18F]-FDG Uptake Dose Response in Lung Correlates Linearly with Proton Therapy Dose

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, D; Titt, U; Mirkovic, D

    2016-06-15

    Purpose: Analysis of clinical outcomes in lung cancer patients treated with protons, using 18F-FDG uptake in lung as a measure of dose response. Methods: A test-case lung cancer patient was selected in an unbiased way. The test patient's treatment planning and post-treatment positron emission tomography (PET) data were collected from the picture archiving and communication system at the UT M.D. Anderson Cancer Center. The average computed tomography scan was registered with the post-treatment PET/CT through both rigid and deformable registrations for a selected region of interest (ROI) via the VelocityAI imaging informatics software. For the voxels in the ROI, a system that extracts the Standard Uptake Value (SUV) from PET was developed, and the corresponding relative biological effectiveness (RBE)-weighted (both variable and constant) dose was computed using Monte Carlo (MC) methods. The treatment planning system (TPS) dose was also obtained. Using histogram analysis, the voxel-averaged normalized SUV vs. three different doses was obtained and linear regression fits were performed. Results: The registration process produced some regions with significant artifacts near the diaphragm and heart, which yielded poor r-squared values when the linear regression was performed on normalized SUV vs. dose. Excluding these values, the TPS fit yielded a mean r-squared value of 0.79 (range 0.61-0.95), the constant-RBE fit yielded 0.79 (range 0.52-0.94), and the variable-RBE fit yielded 0.80 (range 0.52-0.94). Conclusion: A system that extracts SUV from PET to correlate normalized SUV with various dose calculations was developed. A linear relation between normalized SUV and all three different doses was found.

  6. Simulation of design-unbiased point-to-particle sampling compared to alternatives on plantation rows

    Treesearch

    Thomas B. Lynch; David Hamlin; Mark J. Ducey

    2016-01-01

    Total quantities of tree attributes can be estimated in plantations by sampling on plantation rows using several methods. At random sample points on a row, either fixed row lengths or variable row lengths with a fixed number of sample trees can be assessed. Ratio of means or mean of ratios estimators can be developed for the fixed number of trees option but are not...

  7. A Bayesian Approach to Identifying Structural Nonlinearity using Free-Decay Response: Application to Damage Detection in Composites

    DTIC Science & Technology

    2010-03-03

    ... obtainable, while for the free-decay problem we simply have to include the initial conditions as random variables to be predicted. ... Among the important and useful properties of MLEs is that, under regularity conditions, they are asymptotically unbiased and possess the minimum possible variance. ... the likelihood becomes p_L(z | θ, σ²_G, M_i) (i.e., the likelihood is conditional on the specified model); however, in this work we will only consider a single model and drop the ...

  8. Genomic Methods for Clinical and Translational Pain Research

    PubMed Central

    Wang, Dan; Kim, Hyungsuk; Wang, Xiao-Min; Dionne, Raymond

    2012-01-01

    Pain is a complex sensory experience for which the molecular mechanisms are yet to be fully elucidated. Individual differences in pain sensitivity are mediated by a complex network of multiple gene polymorphisms, physiological and psychological processes, and environmental factors. Here, we present methods for applying unbiased molecular-genetic approaches, namely genome-wide association studies (GWAS) and global gene expression analysis, to help better understand the molecular basis of pain sensitivity in humans and variable responses to analgesic drugs. PMID:22351080

  9. Evidence of Adaptive Evolution and Relaxed Constraints in Sex-Biased Genes of South American and West Indies Fruit Flies (Diptera: Tephritidae)

    PubMed Central

    Campanini, Emeline B; Torres, Felipe R; Rezende, Víctor B; Nakamura, Aline M; de Oliveira, Janaína L; Lima, André L A; Chahad-Ehlers, Samira; Sobrinho, Iderval S; de Brito, Reinaldo A

    2018-01-01

    Several studies have demonstrated that genes differentially expressed between sexes (sex-biased genes) tend to evolve faster than unbiased genes, particularly in males. The reason for this accelerated evolution is not clear, but several explanations have involved adaptive and nonadaptive mechanisms. Furthermore, differences in the sex-biased expression patterns of closely related species also remain little explored outside of Drosophila. To address the evolutionary processes involved with sex-biased expression in species with incipient differentiation, we analyzed male and female transcriptomes of Anastrepha fraterculus and Anastrepha obliqua, a pair of species that have diverged recently, likely in the presence of gene flow. Using these data, we inferred differentiation indexes and evolutionary rates and tested for signals of selection in thousands of genes expressed in head and reproductive transcriptomes from both species. Our results indicate that sex-biased and reproductive-biased genes evolve faster than unbiased genes in both species, which is due to both adaptive pressure and relaxed constraints. Furthermore, among male-biased genes evolving under positive selection, we identified some related to sexual functions such as courtship behavior and fertility. These findings suggest that sex-biased genes may have played important roles in the establishment of reproductive isolation between these species, due to a combination of selection and drift, and unveil a plethora of genetic markers useful for more studies in these species and their differentiation. PMID:29346618

  10. Control system estimation and design for aerospace vehicles

    NASA Technical Reports Server (NTRS)

    Stefani, R. T.; Williams, T. L.; Yakowitz, S. J.

    1972-01-01

    The selection of an estimator which is unbiased when applied to structural parameter estimation is discussed. The mathematical relationships for structural parameter estimation are defined. It is shown that a conventional weighted least squares (CWLS) estimate is biased when applied to structural parameter estimation. Two approaches to bias removal are suggested: (1) change the CWLS estimator or (2) change the objective function. The advantages of each approach are analyzed.

  11. Resonant tunneling in graphene pseudomagnetic quantum dots.

    PubMed

    Qi, Zenan; Bahamon, D A; Pereira, Vitor M; Park, Harold S; Campbell, D K; Neto, A H Castro

    2013-06-12

    Realistic relaxed configurations of triaxially strained graphene quantum dots are obtained from unbiased atomistic mechanical simulations. The local electronic structure and quantum transport characteristics of y-junctions based on such dots are studied, revealing that the quasi-uniform pseudomagnetic field induced by strain restricts transport to Landau level- and edge state-assisted resonant tunneling. Valley degeneracy is broken in the presence of an external field, allowing the selective filtering of the valley and chirality of the states assisting in the resonant tunneling. Asymmetric strain conditions can be explored to select the exit channel of the y-junction.

  12. Point estimation following two-stage adaptive threshold enrichment clinical trials.

    PubMed

    Kimani, Peter K; Todd, Susan; Renfro, Lindsay A; Stallard, Nigel

    2018-05-31

    Recently, several study designs incorporating treatment effect assessment in biomarker-based subpopulations have been proposed. Most statistical methodologies for such designs focus on the control of type I error rate and power. In this paper, we have developed point estimators for clinical trials that use the two-stage adaptive enrichment threshold design. The design consists of two stages, where in stage 1, patients are recruited in the full population. Stage 1 outcome data are then used to perform an interim analysis to decide whether the trial continues to stage 2 with the full population or with a subpopulation. The subpopulation is defined based on one of the candidate threshold values of a numerical predictive biomarker. To estimate the treatment effect in the selected subpopulation, we have derived unbiased estimators, shrinkage estimators, and estimators that estimate bias and subtract it from the naive estimate. We have recommended one of the unbiased estimators. However, since none of the estimators dominated in all simulation scenarios based on both bias and mean squared error, an alternative strategy would be to use a hybrid estimator where the estimator used depends on the subpopulation selected. This would require a simulation study of plausible scenarios before the trial. © 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  13. Mutually unbiased bases and semi-definite programming

    NASA Astrophysics Data System (ADS)

    Brierley, Stephen; Weigert, Stefan

    2010-11-01

    A complex Hilbert space of dimension six supports at least three but not more than seven mutually unbiased bases. Two computer-aided analytical methods to tighten these bounds are reviewed, based on a discretization of parameter space and on Gröbner bases. A third algorithmic approach is presented: the non-existence of more than three mutually unbiased bases in composite dimensions can be decided by a global optimization method known as semidefinite programming. The method is used to confirm that the spectral matrix cannot be part of a complete set of seven mutually unbiased bases in dimension six.

  14. The Swift GRB Host Galaxy Legacy Survey

    NASA Astrophysics Data System (ADS)

    Perley, Daniel

    2015-08-01

    I will describe the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population and its redshift evolution from z=0 to z=7. Using unbiased selection criteria we have designated a subset of 119 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, Gemini, VLT, and Magellan to obtain complementary optical/NIR photometry and spectroscopy to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass distributions and their evolution with redshift, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to serve as tools for measuring and studying cosmic star-formation in the distant universe.

  15. Estimating unbiased economies of scale of HIV prevention projects: a case study of Avahan.

    PubMed

    Lépine, Aurélia; Vassall, Anna; Chandrashekar, Sudha; Blanc, Elodie; Le Nestour, Alexis

    2015-04-01

    Governments and donors are investing considerable resources on HIV prevention in order to scale up these services rapidly. Given the current economic climate, providers of HIV prevention services increasingly need to demonstrate that these investments offer good 'value for money'. One of the primary routes to efficiency is to take advantage of economies of scale (a reduction in the average cost of a health service as provision scales up), yet empirical evidence on economies of scale is scarce. Methodologically, the estimation of economies of scale is hampered by several statistical issues that prevent causal inference and thus make the estimation complex. In order to estimate unbiased economies of scale when scaling up HIV prevention services, we apply our analysis to one of the few HIV prevention programmes globally delivered at a large scale: the Indian Avahan initiative. We costed the project by collecting data from the 138 Avahan NGOs and the supporting partners over the first four years of its scale-up, between 2004 and 2007. We develop a parsimonious empirical model and apply system Generalized Method of Moments (GMM) and fixed-effects Instrumental Variable (IV) estimators to estimate unbiased economies of scale. At the programme level, we find that, after controlling for the endogeneity of scale, the scale-up of Avahan has generated high economies of scale. Our findings suggest that reductions in average cost per person reached are achievable when scaling up HIV prevention in low- and middle-income countries. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Unbiased, scalable sampling of protein loop conformations from probabilistic priors.

    PubMed

    Zhang, Yajia; Hauser, Kris

    2013-01-01

    Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in an ad hoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion binding demonstrates its potential as a tool for loop ensemble generation and missing structure completion.

  17. Unbiased, scalable sampling of protein loop conformations from probabilistic priors

    PubMed Central

    2013-01-01

    Background Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in an ad hoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Results Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Conclusion Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion binding demonstrates its potential as a tool for loop ensemble generation and missing structure completion. PMID:24565175

  18. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.

    PubMed

    Liu, Sophia S; Hockenberry, Adam J; Lancichinetti, Andrea; Jewett, Michael C; Amaral, Luís A N

    2016-11-01

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein-coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we provide as a Python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even dinucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
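
    The maximum-entropy construction can be illustrated compactly: within each synonymous codon family the distribution is uniform except for an exponential tilt in GC content, and a single tilting parameter is tuned (here by bisection) until the expected GC matches the target. The sketch below is not the NullSeq implementation; it uses an illustrative four-residue subset of the genetic code.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative subset of the genetic code (synonymous codons per residue).
    CODONS = {
        "A": ["GCT", "GCC", "GCA", "GCG"],
        "G": ["GGT", "GGC", "GGA", "GGG"],
        "K": ["AAA", "AAG"],
        "F": ["TTT", "TTC"],
    }
    gc = lambda c: sum(b in "GC" for b in c) / 3

    def codon_weights(codons, lam):
        """Max-entropy distribution over synonyms: uniform, exponentially
        tilted in GC content by the multiplier lam."""
        w = np.exp([lam * gc(c) for c in codons])
        return w / w.sum()

    def sample_sequence(protein, target_gc):
        # Bisect on lam until the expected GC of the sequence hits the target.
        lo, hi = -20.0, 20.0
        for _ in range(50):
            lam = (lo + hi) / 2
            egc = np.mean([codon_weights(CODONS[a], lam)
                           @ np.array([gc(c) for c in CODONS[a]])
                           for a in protein])
            lo, hi = (lam, hi) if egc < target_gc else (lo, lam)
        return "".join(rng.choice(CODONS[a], p=codon_weights(CODONS[a], lam))
                       for a in protein)

    print(sample_sequence("AGKFFGKA", target_gc=0.5))
    ```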

  19. Grant opportunities for academic research and training

    USGS Publications Warehouse

    ,

    2016-08-30

    As an unbiased, multidisciplinary science organization, the U.S. Geological Survey (USGS) is dedicated to the timely, relevant, and impartial study of the health of our ecosystems and environment, our natural resources, the impacts of climate and land-use change, and the natural hazards that affect our lives. Grant opportunities for researchers and faculty to participate in USGS science through the engagement of students are available in the selected programs described in this publication.

  20. Unbiasedness

    USGS Publications Warehouse

    Link, W.A.; Armitage, Peter; Colton, Theodore

    1998-01-01

    Unbiasedness is probably the best known criterion for evaluating the performance of estimators. This note describes unbiasedness, demonstrating various failings of the criterion. It is shown that unbiased estimators might not exist, or might not be unique; an example of a unique but clearly unacceptable unbiased estimator is given. It is shown that unbiased estimators are not translation invariant. Various alternative criteria are described, and are illustrated through examples.
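
    The note's own examples are not reproduced in the record above, but a classic illustration of a unique yet clearly unacceptable unbiased estimator is easy to simulate: for X ~ Poisson(lambda), the only unbiased estimator of exp(-2*lambda) is (-1)^X, which only ever outputs +1 or -1 and can be negative while estimating a strictly positive quantity.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    lam = 1.5

    # For X ~ Poisson(lam), the unique unbiased estimator of exp(-2*lam)
    # is (-1)**X: correct on average, yet absurd as a point estimate.
    x = rng.poisson(lam, size=1_000_000)
    estimate = (-1.0) ** x
    print(estimate.mean(), np.exp(-2 * lam))  # both approx 0.0498
    ```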

  1. FAST TRACK COMMUNICATION: Affine constellations without mutually unbiased counterparts

    NASA Astrophysics Data System (ADS)

    Weigert, Stefan; Durt, Thomas

    2010-10-01

    It has been conjectured that a complete set of mutually unbiased bases in a space of dimension d exists if and only if there is an affine plane of order d. We introduce affine constellations and compare their existence properties with those of mutually unbiased constellations. The observed discrepancies make a deeper relation between the two existence problems unlikely.

  2. Evaluating disease management programme effectiveness: an introduction to instrumental variables.

    PubMed

    Linden, Ariel; Adams, John L

    2006-04-01

    This paper introduces the concept of instrumental variables (IVs) as a means of providing an unbiased estimate of treatment effects in evaluating disease management (DM) programme effectiveness. Model development is described using zip codes as the IV. Three diabetes DM outcomes were evaluated: annual diabetes costs, emergency department (ED) visits and hospital days. Both ordinary least squares (OLS) and IV estimates showed a significant treatment effect for diabetes costs (P = 0.011) but neither model produced a significant treatment effect for ED visits. However, the IV estimate showed a significant treatment effect for hospital days (P = 0.006) whereas the OLS model did not. These results illustrate the utility of IV estimation when the OLS model is sensitive to the confounding effect of hidden bias.
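
    A two-stage least squares (2SLS) sketch of the IV idea is shown below, with synthetic data: a hidden severity variable confounds enrollment and cost, a binary instrument (standing in for the zip-code-based IV) shifts enrollment without directly affecting cost, and the IV estimate recovers the true treatment effect where naive OLS does not. All numbers are invented; this is not the paper's model.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n = 20_000

    # Hidden severity confounds enrollment and cost (true effect: -400).
    severity = rng.normal(size=n)
    z = rng.integers(0, 2, n)  # instrument, e.g. programme service area
    enroll = (0.9 * z + 0.8 * severity + rng.normal(size=n) > 0.8).astype(float)
    cost = 5000 + 900 * severity - 400 * enroll + rng.normal(0, 500, n)

    # Naive OLS of cost on enrollment is confounded by severity.
    X = np.c_[np.ones(n), enroll]
    ols = np.linalg.lstsq(X, cost, rcond=None)[0]

    # 2SLS: stage 1 predicts enrollment from the instrument; stage 2
    # regresses cost on the predicted enrollment.
    Z = np.c_[np.ones(n), z]
    enroll_hat = Z @ np.linalg.lstsq(Z, enroll, rcond=None)[0]
    iv = np.linalg.lstsq(np.c_[np.ones(n), enroll_hat], cost, rcond=None)[0]

    print(ols[1], iv[1])  # OLS biased upward; IV close to -400
    ```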

  3. Uncertainty relation for the discrete Fourier transform.

    PubMed

    Massar, Serge; Spindel, Philippe

    2008-05-16

    We derive an uncertainty relation for two unitary operators which obey a commutation relation of the form UV = e^(iφ) VU. Its most important application is to constrain how much a quantum state can be localized simultaneously in two mutually unbiased bases related by a discrete Fourier transform. It provides an uncertainty relation which smoothly interpolates between the well-known cases of the Pauli operators in two dimensions and the continuous variables position and momentum. This work also provides an uncertainty relation for modular variables and could find applications in signal processing. In the finite-dimensional case the minimum uncertainty states, discrete analogues of coherent and squeezed states, are minimum energy solutions of Harper's equation, a discrete version of the harmonic oscillator equation.

  4. Choosing the Allometric Exponent in Covariate Model Building.

    PubMed

    Sinha, Jaydeep; Al-Sallami, Hesham S; Duffull, Stephen B

    2018-04-27

    Allometric scaling is often used to describe the covariate model linking total body weight (WT) to clearance (CL); however, there is no consensus on how to select its value. The aims of this study were to assess the influence of between-subject variability (BSV) and study design on (1) the power to correctly select the exponent from a priori choices, and (2) the power to obtain unbiased exponent estimates. The influence of WT distribution range (randomly sampled from the Third National Health and Nutrition Examination Survey, 1988-1994 [NHANES III] database), sample size (N = 10, 20, 50, 100, 200, 500, 1000 subjects), and BSV on CL (low 20%, normal 40%, high 60%) was assessed using stochastic simulation and estimation. The a priori exponent values used for the simulations were 0.67, 0.75, and 1. For drugs with normal to high BSV, it is almost impossible to correctly select the exponent from an a priori set of exponents, i.e. 1 vs. 0.75, 1 vs. 0.67, or 0.75 vs. 0.67, in regular studies involving < 200 adult participants. On the other hand, such regular study designs are sufficient to appropriately estimate the exponent. However, regular studies with < 100 patients risk potential bias in estimating the exponent. Study designs with limited sample size and a narrow range of WT (e.g. < 100 adult participants) risk either selection of a false value or a biased estimate of the allometric exponent; however, such bias is only relevant when extrapolating the value of CL outside the studied population, e.g. an analysis of a study of adults that is used to extrapolate to children.
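
    The simulation-estimation setup can be caricatured with a log-log regression version of the problem (a simplification of the nonlinear mixed-effects models typically used in pharmacometrics): simulate CL = CL0 * (WT/70)^b with log-normal BSV, estimate b, and watch the estimate's spread shrink with sample size. Here 40% BSV is approximated as a log-scale SD of 0.4; all other numbers are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def exponent_estimates(n, bsv_sd, true_b=0.75, n_rep=500):
        """Simulate CL = CL0 * (WT/70)**b with log-normal BSV and return the
        mean and SD of the log-log regression estimate of b."""
        est = []
        for _ in range(n_rep):
            wt = rng.uniform(40, 120, n)  # adult weight range
            cl = 10 * (wt / 70) ** true_b * np.exp(rng.normal(0, bsv_sd, n))
            est.append(np.polyfit(np.log(wt / 70), np.log(cl), 1)[0])
        return np.mean(est), np.std(est)

    for n in (20, 100, 500):
        print(n, exponent_estimates(n, bsv_sd=0.4))  # 40% BSV as log-SD 0.4
    ```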

  5. Predictive performance of genomic selection methods for carcass traits in Hanwoo beef cattle: impacts of the genetic architecture.

    PubMed

    Mehrban, Hossein; Lee, Deuk Hwan; Moradi, Mohammad Hossein; IlCho, Chung; Naserkheil, Masoumeh; Ibáñez-Escriche, Noelia

    2017-01-04

    Hanwoo beef is known for its marbled fat, tenderness, juiciness and characteristic flavor, as well as for its low cholesterol and high omega 3 fatty acid contents. As yet, there has been no comprehensive investigation to estimate genomic selection accuracy for carcass traits in Hanwoo cattle using dense markers. This study aimed at evaluating the accuracy of alternative statistical methods that differed in assumptions about the underlying genetic model for various carcass traits: backfat thickness (BT), carcass weight (CW), eye muscle area (EMA), and marbling score (MS). Accuracies of direct genomic breeding values (DGV) for carcass traits were estimated by applying fivefold cross-validation to a dataset including 1183 animals and approximately 34,000 single nucleotide polymorphisms (SNPs). Accuracies of BayesC, Bayesian LASSO (BayesL) and genomic best linear unbiased prediction (GBLUP) methods were similar for BT, EMA and MS. However, for CW, DGV accuracy was 7% higher with BayesC than with BayesL and GBLUP. The increased accuracy of BayesC, compared to GBLUP and BayesL, was maintained for CW, regardless of the training sample size, but not for BT, EMA, and MS. Genome-wide association studies detected consistent large effects for SNPs on chromosomes 6 and 14 for CW. The predictive performance of the models depended on the trait analyzed. For CW, the results showed a clear superiority of BayesC compared to GBLUP and BayesL. These findings indicate the importance of using a proper variable selection method for genomic selection of traits and also suggest that the genetic architecture that underlies CW differs from that of the other carcass traits analyzed. Thus, our study provides significant new insights into the carcass traits of Hanwoo cattle.
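
    A bare-bones GBLUP sketch on synthetic genotypes is given below: a VanRaden genomic relationship matrix is built from centered 0/1/2 genotype codes, and direct genomic values are obtained by solving the ridge-type system (G + lambda*I) alpha = y - mean(y), with the variance ratio lambda fixed rather than estimated. None of this reflects the Hanwoo data or the paper's Bayesian methods.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 1000, 5000  # animals, SNPs

    # Synthetic 0/1/2 genotypes and a polygenic carcass-weight-like phenotype.
    freq = rng.uniform(0.05, 0.5, m)
    M = rng.binomial(2, freq, size=(n, m)).astype(float)
    u = M @ rng.normal(0.0, 0.05, m)     # true genetic values
    y = u + rng.normal(0.0, u.std(), n)  # heritability around 0.5

    # VanRaden genomic relationship matrix from centered genotypes.
    Z = M - 2 * freq
    G = Z @ Z.T / (2 * (freq * (1 - freq)).sum())

    # GBLUP: solve (G + lambda*I) alpha = y - mean(y); DGV = G @ alpha.
    lam = 1.0  # assumed variance ratio sigma_e^2 / sigma_u^2, not estimated
    alpha = np.linalg.solve(G + lam * np.eye(n), y - y.mean())
    dgv = G @ alpha
    print(np.corrcoef(dgv, u)[0, 1])  # accuracy against true genetic values
    ```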

  6. Unbiased water and methanol maser surveys of NGC 1333

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lyo, A-Ran; Kim, Jongsoo; Byun, Do-Young

    2014-11-01

    We present the results of unbiased 22 GHz H₂O water and 44 GHz class I CH₃OH methanol maser surveys in the central 7' × 10' area of NGC 1333 and two additional mapping observations of a 22 GHz water maser in a ∼3' × 3' area of the IRAS4A region. In the 22 GHz water maser survey of NGC 1333, with a sensitivity of σ ∼ 0.3 Jy, we confirmed the detection of masers toward H₂O(B) in the region of HH 7-11 and IRAS4B. We also detected new water masers located ∼20'' away in the western direction of IRAS4B, or ∼25'' away in the southern direction of IRAS4A. We could not, however, find young stellar objects or molecular outflows associated with them. They showed two different velocity components of ∼0 and ∼16 km s⁻¹, which are blue- and redshifted relative to the adopted systemic velocity of ∼7 km s⁻¹ for NGC 1333. They also showed time variabilities in both intensity and velocity over multi-epoch observations, and an anti-correlation between the intensities of the blue- and redshifted velocity components. We suggest that the unidentified power source of these masers might be found in the earliest evolutionary stage of star formation, before the onset of molecular outflows. Finding this kind of water maser is only possible through an unbiased blind survey. In the 44 GHz methanol maser survey, with a sensitivity of σ ∼ 0.5 Jy, we confirmed masers toward IRAS4A2 and the eastern shock region of IRAS2A. Both sources are also detected in 95 and 132 GHz methanol maser lines. In addition, we obtained new detections of methanol masers at 95 and 132 GHz toward IRAS4B. In terms of isotropic luminosity, we detected methanol maser sources brighter than ∼5 × 10²⁵ erg s⁻¹ in our unbiased survey.

  7. Multiple Imputation For Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys

    PubMed Central

    Rendall, Michael S.; Ghosh-Dastidar, Bonnie; Weden, Margaret M.; Baker, Elizabeth H.; Nazarov, Zafar

    2013-01-01

    Within-survey multiple imputation (MI) methods are adapted to pooled-survey regression estimation where one survey has more regressors, but typically fewer observations, than the other. This adaptation is achieved through: (1) larger numbers of imputations to compensate for the higher fraction of missing values; (2) model-fit statistics to check the assumption that the two surveys sample from a common universe; and (3) specifying the analysis model completely from variables present in the survey with the larger set of regressors, thereby excluding variables never jointly observed. In contrast to the typical within-survey MI context, cross-survey missingness is monotonic and easily satisfies the Missing At Random (MAR) assumption needed for unbiased MI. Large efficiency gains and substantial reduction in omitted variable bias are demonstrated in an application to sociodemographic differences in the risk of child obesity estimated from two nationally representative cohort surveys. PMID:24223447

  8. Evidence of Adaptive Evolution and Relaxed Constraints in Sex-Biased Genes of South American and West Indies Fruit Flies (Diptera: Tephritidae).

    PubMed

    Congrains, Carlos; Campanini, Emeline B; Torres, Felipe R; Rezende, Víctor B; Nakamura, Aline M; de Oliveira, Janaína L; Lima, André L A; Chahad-Ehlers, Samira; Sobrinho, Iderval S; de Brito, Reinaldo A

    2018-01-01

    Several studies have demonstrated that genes differentially expressed between sexes (sex-biased genes) tend to evolve faster than unbiased genes, particularly in males. The reason for this accelerated evolution is not clear, but several explanations have invoked adaptive and nonadaptive mechanisms. Furthermore, differences in sex-biased expression patterns between closely related species remain little explored outside of Drosophila. To address the evolutionary processes involved with sex-biased expression in species with incipient differentiation, we analyzed male and female transcriptomes of Anastrepha fraterculus and Anastrepha obliqua, a pair of species that diverged recently, likely in the presence of gene flow. Using these data, we inferred differentiation indexes and evolutionary rates and tested for signals of selection in thousands of genes expressed in head and reproductive transcriptomes from both species. Our results indicate that sex-biased and reproductive-biased genes evolve faster than unbiased genes in both species, owing to both adaptive pressure and relaxed constraints. Furthermore, among male-biased genes evolving under positive selection, we identified some related to sexual functions such as courtship behavior and fertility. These findings suggest that sex-biased genes may have played important roles in the establishment of reproductive isolation between these species, due to a combination of selection and drift, and they unveil a wealth of genetic markers useful for further studies of these species and their differentiation. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. Optimal reconstruction of the states in qutrit systems

    NASA Astrophysics Data System (ADS)

    Yan, Fei; Yang, Ming; Cao, Zhuo-Liang

    2010-10-01

    Based on mutually unbiased measurements, an optimal tomographic scheme for the multiqutrit states is presented explicitly. Because the reconstruction process of states based on mutually unbiased states is free of information waste, we refer to our scheme as the optimal scheme. By optimal we mean that the number of the required conditional operations reaches the minimum in this tomographic scheme for the states of qutrit systems. Special attention will be paid to how those different mutually unbiased measurements are realized; that is, how to decompose each transformation that connects each mutually unbiased basis with the standard computational basis. It is found that all those transformations can be decomposed into several basic implementable single- and two-qutrit unitary operations. For the three-qutrit system, there exist five different mutually unbiased-bases structures with different entanglement properties, so we introduce the concept of physical complexity to minimize the number of nonlocal operations needed over the five different structures. This scheme is helpful for experimental scientists to realize the most economical reconstruction of quantum states in qutrit systems.
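
    The prime-dimension case is easy to check numerically; a short sketch (assuming the standard Wootters-Fields-type quadratic-phase construction for d = 3, not the paper's specific gate decompositions) verifying that the four qutrit bases are pairwise unbiased:

        import numpy as np

        d = 3
        w = np.exp(2j * np.pi / d)
        k = np.arange(d)

        # computational basis plus three quadratic-phase bases for the prime d = 3
        bases = [np.eye(d, dtype=complex)]
        for b in range(d):
            B = np.array([[w ** ((b * kk * kk + j * kk) % d) for kk in k]
                          for j in k], dtype=complex).T / np.sqrt(d)
            bases.append(B)          # columns of B are the basis vectors

        # pairwise unbiasedness: |<e_i|f_j>|^2 = 1/d across different bases
        for a in range(len(bases)):
            for c in range(a + 1, len(bases)):
                overlaps = np.abs(bases[a].conj().T @ bases[c]) ** 2
                assert np.allclose(overlaps, 1.0 / d)
        print("4 mutually unbiased bases verified for d = 3")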

  10. Evaluation of selection index: application to the choice of an indirect multitrait selection index for soybean breeding.

    PubMed

    Bouchez, A; Goffinet, B

    1990-02-01

    Selection indices can be used to predict one trait from information available on several traits in order to improve the prediction accuracy. Plant or animal breeders are interested in selecting only the best individuals, and need to compare the efficiency of different trait combinations in order to choose the index ensuring the best prediction quality for individual values. As the usual tools for index evaluation do not remain unbiased in all cases, we propose a robust way of evaluation by means of an estimator of the mean-square error of prediction (EMSEP). This estimator remains valid even when parameters are not known, as usually assumed, but are estimated. EMSEP is applied to the choice of an indirect multitrait selection index at the F5 generation of a classical breeding scheme for soybeans. Best predictions for precocity are obtained by means of indices using only part of the available information.

  11. Ways to improve your correlation functions

    NASA Technical Reports Server (NTRS)

    Hamilton, A. J. S.

    1993-01-01

    This paper describes a number of ways to improve on the standard method for measuring the two-point correlation function of large scale structure in the Universe. Issues addressed are: (1) the problem of the mean density, and how to solve it; (2) how to estimate the uncertainty in a measured correlation function; (3) minimum variance pair weighting; (4) unbiased estimation of the selection function when magnitudes are discrete; and (5) analytic computation of angular integrals in background pair counts.
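
    On point (1), the density-insensitive estimator associated with this paper is xi = DD*RR/DR^2 - 1; a hedged sketch with raw pair counts (the 1/n normalisation factors cancel only up to O(1/n) terms, and the separation binning is the user's choice):

        import numpy as np
        from scipy.spatial import cKDTree

        def pair_counts(a, b, edges):
            """Pair counts between point sets a and b in separation bins."""
            ta, tb = cKDTree(a), cKDTree(b)
            cum = ta.count_neighbors(tb, edges).astype(float)
            return np.diff(cum)      # self-pairs at r = 0 drop out of the diff

        def xi_hamilton(data, rand, edges):
            """xi(r) = DD*RR/DR^2 - 1, robust to errors in the mean density."""
            dd = pair_counts(data, data, edges)
            rr = pair_counts(rand, rand, edges)
            dr = pair_counts(data, rand, edges)
            return dd * rr / dr**2 - 1.0

        # usage: xi = xi_hamilton(galaxy_xyz, random_xyz, np.linspace(1, 50, 20))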

  12. Racial Bias and Predictive Validity in Testing for Selection.

    DTIC Science & Technology

    1983-07-01

    ...the inequality r_R(P,C) ≠ 0 (2) must define test bias. This definition of test bias conforms to the requirements of the Civil Rights Act of 1964 as... of Educational Measurement, 1976, 13, 43-52. Einhorn, H. J., & Bass, A. R. Methodological considerations relevant to discrimination in employment... "unbiased" selection model: A question of utilities. Journal of Applied Psychology, 1975, 60, 345-351. Guion, R. M. Employment tests and discriminatory...

  13. Sparse Recovery via Differential Inclusions

    DTIC Science & Technology

    2014-07-01

    ...2242. [Wai09] Martin J. Wainwright, Sharp thresholds for high-dimensional and noisy sparsity recovery using l1-constrained quadratic programming... solution, (1.11) β_t = 0 if t < 1/y, and β_t = y(1 − e^{−κ(t−1/y)}) otherwise, which converges to the unbiased Bregman ISS estimator exponentially fast. Let us... we are not given the support set S, so the following two properties are used to evaluate the performance of an estimator β̂: 1. Model selection...

  14. Identification of lipid-phosphatidylserine (PS) as the target of unbiasedly selected cancer specific peptide-peptoid hybrid PPS1.

    PubMed

    Desai, Tanvi J; Toombs, Jason E; Minna, John D; Brekken, Rolf A; Udugamasooriya, Damith Gomika

    2016-05-24

    Phosphatidylserine (PS) is an anionic phospholipid maintained on the inner-leaflet of the cell membrane and is externalized in malignant cells. We previously launched a careful unbiased selection targeting biomolecules (e.g. protein, lipid or carbohydrate) distinct to cancer cells by exploiting HCC4017 lung cancer and HBEC30KT normal epithelial cells derived from the same patient, identifying HCC4017 specific peptide-peptoid hybrid PPS1. In this current study, we identified PS as the target of PPS1. We validated direct PPS1 binding to PS using ELISA-like assays, lipid dot blot and liposome based binding assays. In addition, PPS1 recognized other negatively charged and cancer specific lipids such as phosphatidic acid, phosphatidylinositol and phosphatidylglycerol. PPS1 did not bind to neutral lipids such as phosphatidylethanolamine found in cancer and phosphatidylcholine and sphingomyelin found in normal cells. Further we found that the dimeric version of PPS1 (PPS1D1) displayed strong cytotoxicity towards lung cancer cell lines that externalize PS, but not normal cells. PPS1D1 showed potent single agent anti-tumor activity and enhanced the efficacy of docetaxel in mice bearing H460 lung cancer xenografts. Since PS and anionic phospholipid externalization is common across many cancer types, PPS1 may be an alternative to overcome limitations of protein targeted agents.

  15. Estimating Latent Variable Interactions With Non-Normal Observed Data: A Comparison of Four Approaches

    PubMed Central

    Cham, Heining; West, Stephen G.; Ma, Yue; Aiken, Leona S.

    2012-01-01

    A Monte Carlo simulation was conducted to investigate the robustness of four latent variable interaction modeling approaches (Constrained Product Indicator [CPI], Generalized Appended Product Indicator [GAPI], Unconstrained Product Indicator [UPI], and Latent Moderated Structural Equations [LMS]) under high degrees of non-normality of the observed exogenous variables. Results showed that the CPI and LMS approaches yielded biased estimates of the interaction effect when the exogenous variables were highly non-normal. When the violation of non-normality was not severe (normal; symmetric with excess kurtosis < 1), the LMS approach yielded the most efficient estimates of the latent interaction effect with the highest statistical power. In highly non-normal conditions, the GAPI and UPI approaches with ML estimation yielded unbiased latent interaction effect estimates, with acceptable actual Type-I error rates for both the Wald and likelihood ratio tests of interaction effect at N ≥ 500. An empirical example illustrated the use of the four approaches in testing a latent variable interaction between academic self-efficacy and positive family role models in the prediction of academic performance. PMID:23457417

  16. Long-Term Variability of AGN at Hard X-Rays

    NASA Technical Reports Server (NTRS)

    Soldi, S.; Beckmann, V.; Baumgartner W. H.; Ponti, G.; Shrader, C. R.; Lubinski, P.; Krimm, H. A.; Mattana, F.; Tueller, J.

    2013-01-01

    Variability at all observed wavelengths is a distinctive property of active galactic nuclei (AGN). Hard X-rays provide us with a view of the innermost regions of AGN, mostly unbiased by absorption along the line of sight. Characterizing the intrinsic hard X-ray variability of a large AGN sample and comparing it to the results obtained at lower X-ray energies can significantly contribute to our understanding of the mechanisms underlying the high-energy radiation. Swift/BAT provides us with the unique opportunity to follow, on time scales of days to years and with a regular sampling, the 14-195 keV emission of the largest AGN sample available to date for this kind of investigation. As a continuation of an early work on the first 9 months of BAT data, we study the amplitude of the variations, and their dependence on sub-class and on energy, for a sample of 110 radio-quiet and radio-loud AGN selected from the BAT 58-month survey. About 80% of the AGN in the sample are found to exhibit significant variability on months-to-years time scales, radio-loud sources being the most variable. The amplitude of the variations and their energy dependence are incompatible with variability being driven at hard X-rays by changes of the absorption column density. In general, the variations in the 14-24 and 35-100 keV bands are well correlated, suggesting a common origin of the variability across the BAT energy band. However, radio-quiet AGN display on average 10% larger variations at 14-24 keV than at 35-100 keV, and a softer-when-brighter behavior for most of the Seyfert galaxies with detectable spectral variability on month time scales. In addition, sources with harder spectra are found to be more variable than softer ones. These properties are generally consistent with a power law continuum that varies in flux and shape, pivoting at energies around 50 keV, on which a constant reflection component is superposed. When the same time scales are considered, the timing properties of AGN at hard X-rays are comparable to those at lower energies, with at least some of the differences possibly ascribable to components contributing differently in the two energy domains (e.g., reflection, absorption).

  17. Directed acyclic graphs (DAGs): an aid to assess confounding in dental research.

    PubMed

    Merchant, Anwar T; Pitiphat, Waranuch

    2002-12-01

    Confounding, a special type of bias, occurs when an extraneous factor is associated with the exposure and independently affects the outcome. In order to get an unbiased estimate of the exposure-outcome relationship, we need to identify potential confounders, collect information on them, design appropriate studies, and adjust for confounding in data analysis. However, it is not always clear which variables to collect information on and adjust for in the analyses. Inappropriate adjustment for confounding can even introduce bias where none existed. Directed acyclic graphs (DAGs) provide a method to select potential confounders and minimize bias in the design and analysis of epidemiological studies. DAGs have been used extensively in expert systems and robotics. Robins (1987) introduced the application of DAGs in epidemiology to overcome shortcomings of traditional methods to control for confounding, especially as they related to unmeasured confounding. DAGs provide a quick and visual way to assess confounding without making parametric assumptions. We introduce DAGs, starting with definitions and rules for basic manipulation, stressing applications more than theory. We then demonstrate their application in the control of confounding through examples of observational and cross-sectional epidemiological studies.
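
    A tiny simulation shows the kind of bias a DAG helps to diagnose (all numbers hypothetical): Z is a common cause of exposure X and outcome Y, so the backdoor path X <- Z -> Y biases the crude estimate, and adjusting for Z recovers the true effect of 1.0.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 100_000

        # DAG: Z -> X, Z -> Y, X -> Y; true causal effect of X on Y is 1.0
        z = rng.normal(size=n)
        x = 0.8 * z + rng.normal(size=n)
        y = 1.0 * x + 1.5 * z + rng.normal(size=n)

        def ols(design, outcome):
            return np.linalg.lstsq(design, outcome, rcond=None)[0]

        crude = ols(np.column_stack([np.ones(n), x]), y)[1]        # ~1.7, biased
        adjusted = ols(np.column_stack([np.ones(n), x, z]), y)[1]  # ~1.0
        print(f"unadjusted: {crude:.2f}, adjusted for Z: {adjusted:.2f}")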

  18. Aspects of mutually unbiased bases in odd-prime-power dimensions

    NASA Astrophysics Data System (ADS)

    Chaturvedi, S.

    2002-04-01

    We rephrase the Wootters-Fields construction [W. K. Wootters and B. C. Fields, Ann. Phys. 191, 363 (1989)] of a full set of mutually unbiased bases in a complex vector space of dimension N = p^r, where p is an odd prime, in terms of the character vectors of the cyclic group G of order p. This form may be useful in explicitly writing down mutually unbiased bases for N = p^r.
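
    In the prime case r = 1 the construction takes a compact form (a standard presentation, paraphrased rather than quoted from the paper): together with the computational basis, the p bases

        \[
          |\psi_j^{(b)}\rangle = \frac{1}{\sqrt{p}} \sum_{k=0}^{p-1} \omega^{\,b k^2 + j k} \, |k\rangle,
          \qquad \omega = e^{2\pi i / p}, \quad b, j = 0, 1, \dots, p-1,
        \]

    form a full set of p + 1 mutually unbiased bases; for N = p^r the quadratic exponent is replaced by finite-field trace forms, which is where the character vectors of the cyclic group of order p enter.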

  19. A Comparison of Methods for a Priori Bias Correction in Soil Moisture Data Assimilation

    NASA Technical Reports Server (NTRS)

    Kumar, Sujay V.; Reichle, Rolf H.; Harrison, Kenneth W.; Peters-Lidard, Christa D.; Yatheendradas, Soni; Santanello, Joseph A.

    2011-01-01

    Data assimilation is being increasingly used to merge remotely sensed land surface variables such as soil moisture, snow and skin temperature with estimates from land models. Its success, however, depends on unbiased model predictions and unbiased observations. Here, a suite of continental-scale, synthetic soil moisture assimilation experiments is used to compare two approaches that address typical biases in soil moisture prior to data assimilation: (i) parameter estimation to calibrate the land model to the climatology of the soil moisture observations, and (ii) scaling of the observations to the model's soil moisture climatology. To enable this research, an optimization infrastructure was added to the NASA Land Information System (LIS) that includes gradient-based optimization methods and global, heuristic search algorithms. The land model calibration eliminates the bias but does not necessarily result in more realistic model parameters. Nevertheless, the experiments confirm that model calibration yields assimilation estimates of surface and root zone soil moisture that are as skillful as those obtained through scaling of the observations to the model's climatology. Analysis of innovation diagnostics underlines the importance of addressing bias in soil moisture assimilation and confirms that both approaches adequately address the issue.
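
    Scaling observations to the model's climatology is typically done by CDF matching; a minimal sketch (hypothetical helper; the abstract does not spell out the exact operator used):

        import numpy as np

        def cdf_match(obs, obs_clim, model_clim):
            """Map observations to the model climatology by matching quantiles."""
            ranks = np.searchsorted(np.sort(obs_clim), obs) / len(obs_clim)
            return np.quantile(model_clim, np.clip(ranks, 0.0, 1.0))

        # usage: rescaled = cdf_match(todays_obs, obs_history, model_history)

    The alternative the experiments compare is to leave the observations alone and instead calibrate the land model's parameters until its climatology matches theirs.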

  20. Sensory substitution in bilateral vestibular a-reflexic patients

    PubMed Central

    Alberts, Bart B G T; Selen, Luc P J; Verhagen, Wim I M; Medendorp, W Pieter

    2015-01-01

    Patients with bilateral vestibular loss have balance problems in darkness, but maintain spatial orientation rather effectively in the light. It has been suggested that these patients compensate for vestibular cues by relying on extravestibular signals, including visual and somatosensory cues, and integrating them with internal beliefs. How this integration comes about is unknown, but recent literature suggests the healthy brain remaps the various signals into a task-dependent reference frame, thereby weighting them according to their reliability. In this paper, we examined this account in six patients with bilateral vestibular a-reflexia, and compared them to six age-matched healthy controls. Subjects had to report the orientation of their body relative to a reference orientation or the orientation of a flashed luminous line relative to the gravitational vertical, by means of a two-alternative-forced-choice response. We tested both groups psychometrically in upright position (0°) and 90° sideways roll tilt. Perception of body tilt was unbiased in both patients and controls. Response variability, which was larger for 90° tilt, did not differ between groups, indicating that body somatosensory cues have tilt-dependent uncertainty. Perception of the visual vertical was unbiased when upright, but showed systematic undercompensation at 90° tilt. Variability, which was larger for 90° tilt than upright, did not differ between patients and controls. Our results suggest that extravestibular signals substitute for vestibular input in patients' perception of spatial orientation. This is in line with the current status of rehabilitation programs for acute vestibular patients, which target recognizing body somatosensory signals as a reliable replacement for lost vestibular input. PMID:25975644

  1. On approximately symmetric informationally complete positive operator-valued measures and related systems of quantum states

    NASA Astrophysics Data System (ADS)

    Klappenecker, Andreas; Rötteler, Martin; Shparlinski, Igor E.; Winterhof, Arne

    2005-08-01

    We address the problem of constructing positive operator-valued measures (POVMs) in finite dimension n consisting of n^2 operators of rank one which have an inner product close to uniform. This is motivated by the related question of constructing symmetric informationally complete POVMs (SIC-POVMs) for which the inner products are perfectly uniform. However, SIC-POVMs are notoriously hard to construct and, despite some success of constructing them numerically, there is no analytic construction known. We present two constructions of approximate versions of SIC-POVMs, where a small deviation from uniformity of the inner products is allowed. The first construction is based on selecting vectors from a maximal collection of mutually unbiased bases and works whenever the dimension of the system is a prime power. The second construction is based on perturbing the matrix elements of a subset of mutually unbiased bases. Moreover, we construct vector systems in C^n which are almost orthogonal and which might turn out to be useful for quantum computation. Our constructions are based on results of analytic number theory.

  2. Student and recent graduate employment opportunities

    USGS Publications Warehouse

    ,

    2016-08-30

    As an unbiased, multidisciplinary science organization, the U.S. Geological Survey (USGS) is dedicated to the timely, relevant, and impartial study of the health of our ecosystems and environment, our natural resources, the impacts of climate and land-use change, and the natural hazards that affect our lives. Opportunities for undergraduate and graduate students, as well as recent graduates, to participate in USGS science are available in the selected programs described in this publication. Please note: U.S. citizenship is required for all government positions.

  3. Simulators and virtual reality in surgical education.

    PubMed

    Chou, Betty; Handa, Victoria L

    2006-06-01

    This article explores the pros and cons of virtual reality simulators, their abilities to train and assess surgical skills, and their potential future applications. Computer-based virtual reality simulators and more conventional box trainers are compared and contrasted. The virtual reality simulator provides objective assessment of surgical skills and immediate feedback to further enhance training. With this ability to provide standardized, unbiased assessment of surgical skills, the virtual reality trainer has the potential to be a tool for selecting, instructing, certifying, and recertifying gynecologists.

  4. Internships, employment opportunities, and research grants

    USGS Publications Warehouse

    2008-01-01

    As an unbiased, multidisciplinary science organization that focuses on biology, geography, geology, geospatial information, and water, the U.S. Geological Survey (USGS) is dedicated to the timely, relevant, and impartial study of the landscape, our natural resources, and the natural hazards that threaten us. Opportunities for undergraduate and graduate students and faculty to participate in USGS science are available through the selected programs described below. Please note: U.S. citizenship is required for all positions, although some noncitizens may be eligible in rare circumstances.

  5. Environmental Monitoring and Assessment Program Western Pilot Project - Information about selected fish and macroinvertebrates sampled from North Dakota perennial streams, 2000-2003

    USGS Publications Warehouse

    Vining, Kevin C.; Lundgren, Robert F.

    2008-01-01

    Sixty-five sampling sites, selected by a statistical design to represent lengths of perennial streams in North Dakota, were chosen to be sampled for fish and aquatic insects (macroinvertebrates) to establish unbiased baseline data. Channel catfish and common carp were the most abundant game and large fish species in the Cultivated Plains and Rangeland Plains, respectively. Blackflies were present in more than 50 percent of stream lengths sampled in the State; mayflies and caddisflies were present in more than 80 percent. Dragonflies were present in a greater percentage of stream lengths in the Rangeland Plains than in the Cultivated Plains.

  6. Absolute magnitude calibration using trigonometric parallax - Incomplete, spectroscopic samples

    NASA Technical Reports Server (NTRS)

    Ratnatunga, Kavan U.; Casertano, Stefano

    1991-01-01

    A new numerical algorithm is used to calibrate the absolute magnitude of spectroscopically selected stars from their observed trigonometric parallax. This procedure, based on maximum-likelihood estimation, can retrieve unbiased estimates of the intrinsic absolute magnitude and its dispersion even from incomplete samples suffering from selection biases in apparent magnitude and color. It can also make full use of low accuracy and negative parallaxes and incorporate censorship on reported parallax values. Accurate error estimates are derived for each of the fitted parameters. The algorithm allows an a posteriori check of whether the fitted model gives a good representation of the observations. The procedure is described in general and applied to both real and simulated data.

  7. A Highly Efficient Design Strategy for Regression with Outcome Pooling

    PubMed Central

    Mitchell, Emily M.; Lyles, Robert H.; Manatunga, Amita K.; Perkins, Neil J.; Schisterman, Enrique F.

    2014-01-01

    The potential for research involving biospecimens can be hindered by the prohibitive cost of performing laboratory assays on individual samples. To mitigate this cost, strategies such as randomly selecting a portion of specimens for analysis or randomly pooling specimens prior to performing laboratory assays may be employed. These techniques, while effective in reducing cost, are often accompanied by a considerable loss of statistical efficiency. We propose a novel pooling strategy based on the k-means clustering algorithm to reduce laboratory costs while maintaining a high level of statistical efficiency when predictor variables are measured on all subjects, but the outcome of interest is assessed in pools. We perform simulations motivated by the BioCycle study to compare this k-means pooling strategy with current pooling and selection techniques under simple and multiple linear regression models. While all of the methods considered produce unbiased estimates and confidence intervals with appropriate coverage, pooling under k-means clustering provides the most precise estimates, closely approximating results from the full data and losing minimal precision as the total number of pools decreases. The benefits of k-means clustering evident in the simulation study are then applied to an analysis of the BioCycle dataset. In conclusion, when the number of lab tests is limited by budget, pooling specimens based on k-means clustering prior to performing lab assays can be an effective way to save money with minimal information loss in a regression setting. PMID:25220822
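
    A sketch of the design in miniature (toy data; pool outcomes are taken as pool means, as with physically pooled assays): cluster subjects on their known predictors, assay one pooled outcome per cluster, and regress pool-mean outcomes on pool-mean predictors.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(0)
        n, n_pools = 600, 60
        X = rng.normal(size=(n, 2))                  # predictors known for everyone
        y = X @ np.array([1.0, -0.5]) + rng.normal(0.0, 1.0, n)

        labels = KMeans(n_clusters=n_pools, n_init=10, random_state=0).fit_predict(X)
        Xp = np.vstack([X[labels == g].mean(axis=0) for g in range(n_pools)])
        yp = np.array([y[labels == g].mean() for g in range(n_pools)])

        print(LinearRegression().fit(Xp, yp).coef_)  # close to (1.0, -0.5)

    Unweighted OLS on pool means stays unbiased under the linear model; weighting by pool size would squeeze out a little more efficiency.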

  8. Selective gene dosage by CRISPR-Cas9 genome editing in hexaploid Camelina sativa.

    PubMed

    Morineau, Céline; Bellec, Yannick; Tellier, Frédérique; Gissot, Lionel; Kelemen, Zsolt; Nogué, Fabien; Faure, Jean-Denis

    2017-06-01

    In many plant species, gene dosage is an important cause of phenotype variation. Engineering gene dosage, particularly in polyploid genomes, would provide an efficient tool for plant breeding. The hexaploid oilseed crop Camelina sativa, which has three closely related expressed subgenomes, is an ideal species for investigation of the possibility of creating a large collection of combinatorial mutants. Selective, targeted mutagenesis of the three delta-12-desaturase (FAD2) genes was achieved by CRISPR-Cas9 gene editing, leading to reduced levels of polyunsaturated fatty acids and increased accumulation of oleic acid in the oil. Analysis of mutations over four generations demonstrated the presence of a large variety of heritable mutations in the three isologous CsFAD2 genes. The different combinations of single, double and triple mutants in the T3 generation were isolated, and the complete loss-of-function mutants revealed the importance of delta-12-desaturation for Camelina development. Combinatorial association of different alleles for the three FAD2 loci provided a large diversity of Camelina lines with various lipid profiles, ranging from 10% to 62% oleic acid accumulation in the oil. The different allelic combinations allowed an unbiased analysis of gene dosage and function in this hexaploid species, but also provided a unique source of genetic variability for plant breeding. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  9. A highly efficient design strategy for regression with outcome pooling.

    PubMed

    Mitchell, Emily M; Lyles, Robert H; Manatunga, Amita K; Perkins, Neil J; Schisterman, Enrique F

    2014-12-10

    The potential for research involving biospecimens can be hindered by the prohibitive cost of performing laboratory assays on individual samples. To mitigate this cost, strategies such as randomly selecting a portion of specimens for analysis or randomly pooling specimens prior to performing laboratory assays may be employed. These techniques, while effective in reducing cost, are often accompanied by a considerable loss of statistical efficiency. We propose a novel pooling strategy based on the k-means clustering algorithm to reduce laboratory costs while maintaining a high level of statistical efficiency when predictor variables are measured on all subjects, but the outcome of interest is assessed in pools. We perform simulations motivated by the BioCycle study to compare this k-means pooling strategy with current pooling and selection techniques under simple and multiple linear regression models. While all of the methods considered produce unbiased estimates and confidence intervals with appropriate coverage, pooling under k-means clustering provides the most precise estimates, closely approximating results from the full data and losing minimal precision as the total number of pools decreases. The benefits of k-means clustering evident in the simulation study are then applied to an analysis of the BioCycle dataset. In conclusion, when the number of lab tests is limited by budget, pooling specimens based on k-means clustering prior to performing lab assays can be an effective way to save money with minimal information loss in a regression setting. Copyright © 2014 John Wiley & Sons, Ltd.

  10. Opportunistically collected data reveal habitat selection by migrating Whooping Cranes in the U.S. Northern Plains

    USGS Publications Warehouse

    Niemuth, Neil D.; Ryba, Adam J.; Pearse, Aaron T.; Kvas, Susan M.; Brandt, David; Wangler, Brian; Austin, Jane; Carlisle, Martha J.

    2018-01-01

    The Whooping Crane (Grus americana) is a federally endangered species in the United States and Canada that relies on wetland, grassland, and cropland habitat during its long migration between wintering grounds in coastal Texas, USA, and breeding sites in Alberta and Northwest Territories, Canada. We combined opportunistic Whooping Crane sightings with landscape data to identify correlates of Whooping Crane occurrence along the migration corridor in North Dakota and South Dakota, USA. Whooping Cranes selected landscapes characterized by diverse wetland communities and upland foraging opportunities. Model performance substantially improved when variables related to detection were included, emphasizing the importance of accounting for biases associated with detection and reporting of birds in opportunistic datasets. We created a predictive map showing relative probability of occurrence across the study region by applying our model to GIS data layers; validation using independent, unbiased locations from birds equipped with platform transmitting terminals indicated that our final model adequately predicted habitat use by migrant Whooping Cranes. The probability map demonstrated that existing conservation efforts have protected much top-tier Whooping Crane habitat, especially in the portions of North Dakota and South Dakota that lie east of the Missouri River. Our results can support species recovery by informing prioritization for acquisition and restoration of landscapes that provide safe roosting and foraging habitats. Our results can also guide the siting of structures such as wind towers and electrical transmission and distribution lines, which pose a strike and mortality risk to migrating Whooping Cranes.

  11. Hierarchical thinking in network biology: the unbiased modularization of biochemical networks.

    PubMed

    Papin, Jason A; Reed, Jennifer L; Palsson, Bernhard O

    2004-12-01

    As reconstructed biochemical reaction networks continue to grow in size and scope, there is a growing need to describe the functional modules within them. Such modules facilitate the study of biological processes by deconstructing complex biological networks into conceptually simple entities. The definition of network modules is often based on intuitive reasoning. As an alternative, methods are being developed for defining biochemical network modules in an unbiased fashion. These unbiased network modules are mathematically derived from the structure of the whole network under consideration.

  12. Anthropic selection and the habitability of planets orbiting M and K dwarfs

    NASA Astrophysics Data System (ADS)

    Waltham, Dave

    2011-10-01

    The Earth may have untypical characteristics which were necessary preconditions for the emergence of life and, ultimately, intelligent observers. This paper presents a rigorous procedure for quantifying such "anthropic selection" effects by comparing Earth's properties to those of exoplanets. The hypothesis that there is anthropic selection for stellar mass (i.e. planets orbiting stars with masses within a particular range are more favourable for the emergence of observers) is then tested. The results rule out the expected strong selection for low mass stars which would result, all else being equal, if the typical timescale for the emergence of intelligent observers is very long. This indicates that the habitable zone of small stars may be less hospitable for intelligent life than the habitable zone of solar-mass stars. Additional planetary properties can also be analyzed, using the approach introduced here, once relatively complete and unbiased statistics are made available by current and planned exoplanet characterization projects.

  13. Maximum likelihood estimation for life distributions with competing failure modes

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1979-01-01

    We studied systems that are placed on test at time zero, function for a period, and die at some random time. Failure may be due to one of several causes or modes. The parameters of the life distribution may depend upon the levels of various stress variables the item is subject to. Maximum likelihood estimation methods are discussed. Specific methods are reported for the smallest extreme-value distributions of life. Monte-Carlo results indicate the methods to be promising. Under appropriate conditions, the location parameters are nearly unbiased, the scale parameter is slightly biased, and the asymptotic covariances are rapidly approached.
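
    For a single failure mode without censoring, the smallest-extreme-value fit is a few lines with scipy (an illustration only; the report's methods additionally handle competing modes and stress covariates):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        # smallest-extreme-value (Gumbel minimum) lifetimes: location 5.0, scale 0.5
        t = stats.gumbel_l.rvs(loc=5.0, scale=0.5, size=500, random_state=rng)

        loc_hat, scale_hat = stats.gumbel_l.fit(t)
        print(loc_hat, scale_hat)   # location near-unbiased, scale slightly biased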

  14. Risk assessment tools in criminal justice and forensic psychiatry: The need for better data.

    PubMed

    Douglas, T; Pugh, J; Singh, I; Savulescu, J; Fazel, S

    2017-05-01

    Violence risk assessment tools are increasingly used within criminal justice and forensic psychiatry, however there is little relevant, reliable and unbiased data regarding their predictive accuracy. We argue that such data are needed to (i) prevent excessive reliance on risk assessment scores, (ii) allow matching of different risk assessment tools to different contexts of application, (iii) protect against problematic forms of discrimination and stigmatisation, and (iv) ensure that contentious demographic variables are not prematurely removed from risk assessment tools. Copyright © 2016 The Author(s). Published by Elsevier Masson SAS. All rights reserved.

  15. An Unbiased Estimator of Gene Diversity with Improved Variance for Samples Containing Related and Inbred Individuals of any Ploidy

    PubMed Central

    Harris, Alexandre M.; DeGiorgio, Michael

    2016-01-01

    Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator's variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, H̃_BLUE, relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error relative to others. Further, we empirically assess the performance of H̃_BLUE on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of H̃_BLUE leads to improved estimates of the population differentiation statistic, F_ST, which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data. PMID:28040781
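
    For contrast with the new estimator, the original unbiased estimator the paper improves on depends only on sample size; a sketch for one locus (assuming unrelated, non-inbred individuals, which is precisely where it stops being adequate):

        import numpy as np

        def unbiased_h(allele_copies):
            """H = n/(n-1) * (1 - sum p_hat^2) from the n sampled allele copies."""
            _, counts = np.unique(allele_copies, return_counts=True)
            n = counts.sum()
            p = counts / n
            return n / (n - 1) * (1.0 - np.sum(p ** 2))

        print(unbiased_h(np.array([0, 0, 1, 1, 1, 2, 2, 0])))

    H̃_BLUE keeps a similar form but replaces the sample proportions with best linear unbiased estimates of allele frequencies computed from the kinship matrix.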

  16. Sampling in ecology and evolution - bridging the gap between theory and practice

    USGS Publications Warehouse

    Albert, C.H.; Yoccoz, N.G.; Edwards, T.C.; Graham, C.H.; Zimmermann, N.E.; Thuiller, W.

    2010-01-01

    Sampling is a key issue for answering most ecological and evolutionary questions. The importance of developing a rigorous sampling design tailored to specific questions has already been discussed in the ecological and sampling literature and has provided useful tools and recommendations to sample and analyse ecological data. However, sampling issues are often difficult to overcome in ecological studies due to apparent inconsistencies between theory and practice, often leading to the implementation of simplified sampling designs that suffer from unknown biases. Moreover, we believe that classical sampling principles which are based on estimation of means and variances are insufficient to fully address many ecological questions that rely on estimating relationships between a response and a set of predictor variables over time and space. Our objective is thus to highlight the importance of selecting an appropriate sampling space and an appropriate sampling design. We also emphasize the importance of using prior knowledge of the study system to estimate models or complex parameters and thus better understand ecological patterns and processes generating these patterns. Using a semi-virtual simulation study as an illustration we reveal how the selection of the space (e.g. geographic, climatic), in which the sampling is designed, influences the patterns that can be ultimately detected. We also demonstrate the inefficiency of common sampling designs to reveal response curves between ecological variables and climatic gradients. Further, we show that response-surface methodology, which has rarely been used in ecology, is much more efficient than more traditional methods. Finally, we discuss the use of prior knowledge, simulation studies and model-based designs in defining appropriate sampling designs. We conclude with a call for the development of methods to unbiasedly estimate nonlinear, ecologically relevant parameters, in order to make inferences while fulfilling requirements of both sampling theory and field work logistics. © 2010 The Authors.

  17. ON THE RADIO AND OPTICAL LUMINOSITY EVOLUTION OF QUASARS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singal, J.; Petrosian, V.; Lawrence, A.

    2011-12-20

    We calculate simultaneously the radio and optical luminosity evolutions of quasars, and the distribution in radio loudness R defined as the ratio of radio and optical luminosities, using a flux-limited data set containing 636 quasars with radio and optical fluxes from White et al. We first note that when dealing with multi-variate data it is imperative to first determine the true correlations among the variables, not those introduced by the observational selection effects, before obtaining the individual distributions of the variables. We use the methods developed by Efron and Petrosian which are designed to obtain unbiased correlations, distributions, and evolutionmore » with redshift from a data set truncated due to observational biases. It is found that the population of quasars exhibits strong positive correlation between the radio and optical luminosities. With this correlation, whether intrinsic or observationally induced accounted for, we find that there is a strong luminosity evolution with redshift in both wavebands, with significantly higher radio than optical evolution. We conclude that the luminosity evolution obtained by arbitrarily separating the sources into radio-loud (R > 10) and radio-quiet (R < 10) populations introduces significant biases that skew the result considerably. We also construct the local radio and optical luminosity functions and the density evolution. Finally, we consider the distribution of the radio-loudness parameter R obtained from careful treatment of the selection effects and luminosity evolutions with that obtained from the raw data without such considerations. We find a significant difference between the two distributions and no clear sign of bi-modality in the true distribution for the range of R values considered. Our results indicate therefore, somewhat surprisingly, that there is no critical switch in the efficiency of the production of disk outflows/jets between very radio-quiet and very radio-loud quasars, but rather a smooth transition. Also, this efficiency seems higher for the high-redshift and more luminous sources in the sample considered.« less

  18. The dynamics of sex ratio evolution: from the gene perspective to multilevel selection.

    PubMed

    Argasinski, Krzysztof

    2013-01-01

    The new dynamical game theoretic model of sex ratio evolution emphasizes the role of males as passive carriers of sex ratio genes. This shows inconsistency between population genetic models of sex ratio evolution and classical strategic models. In this work a novel technique of change of coordinates will be applied to the new model. This will reveal new aspects of the modelled phenomenon which cannot be shown or proven in the original formulation. The underlying goal is to describe the dynamics of selection of particular genes in the entire population, instead of in the same sex subpopulation, as in the previous paper and earlier population genetics approaches. This allows for analytical derivation of the unbiased strategic model from the model with rigorous non-simplified genetics. In effect, an alternative system of replicator equations is derived. It contains two subsystems: the first describes changes in gene frequencies (this is an alternative unbiased formalization of the Fisher-Dusing argument), whereas the second describes changes in the sex ratios in subpopulations of carriers of genes for each strategy. An intriguing analytical result of this work is that the fitness of a gene depends on the current sex ratio in the subpopulation of its carriers, not on the encoded individual strategy. Thus, the argument of the gene fitness function is not constant but is determined by the trajectory of the sex ratio among carriers of that gene. This aspect of the modelled phenomenon cannot be revealed by the static analysis. Dynamics of the sex ratio among gene carriers is driven by a dynamic "tug of war" between female carriers expressing the encoded strategic trait value and random partners of male carriers expressing the average population strategy (a primary sex ratio). This mechanism can be called "double-level selection". Therefore, gene interest perspective leads to multi-level selection.

  19. Conformational Entropy as Collective Variable for Proteins.

    PubMed

    Palazzesi, Ferruccio; Valsson, Omar; Parrinello, Michele

    2017-10-05

    Many enhanced sampling methods rely on the identification of appropriate collective variables. For proteins, even small ones, finding appropriate descriptors has proven challenging. Here we suggest that the NMR S² order parameter can be used to this effect. We trace the validity of this statement to the suggested relation between S² and conformational entropy. Using the S² order parameter and a surrogate for the protein enthalpy in conjunction with metadynamics or variationally enhanced sampling, we are able to reversibly fold and unfold a small protein and draw its free energy at a fraction of the time that is needed in unbiased simulations. We also use S² in combination with the free energy flooding method to compute the unfolding rate of this peptide. We repeat this calculation at different temperatures to obtain the unfolding activation energy.

  20. Density estimation in wildlife surveys

    USGS Publications Warehouse

    Bart, Jonathan; Droege, Sam; Geissler, Paul E.; Peterjohn, Bruce G.; Ralph, C. John

    2004-01-01

    Several authors have recently discussed the problems with using index methods to estimate trends in population size. Some have expressed the view that index methods should virtually never be used. Others have responded by defending index methods and questioning whether better alternatives exist. We suggest that index methods are often a cost-effective component of valid wildlife monitoring but that double-sampling or another procedure that corrects for bias or establishes bounds on bias is essential. The common assertion that index methods require constant detection rates for trend estimation is mathematically incorrect; the requirement is no long-term trend in detection "ratios" (index result/parameter of interest), a requirement that is probably approximately met by many well-designed index surveys. We urge that more attention be given to defining bird density rigorously and in ways useful to managers. Once this is done, 4 sources of bias in density estimates may be distinguished: coverage, closure, surplus birds, and detection rates. Distance, double-observer, and removal methods do not reduce bias due to coverage, closure, or surplus birds. These methods may yield unbiased estimates of the number of birds present at the time of the survey, but only if their required assumptions are met, which we doubt occurs very often in practice. Double-sampling, in contrast, produces unbiased density estimates if the plots are randomly selected and estimates on the intensive surveys are unbiased. More work is needed, however, to determine the feasibility of double-sampling in different populations and habitats. We believe the tension that has developed over appropriate survey methods can best be resolved through increased appreciation of the mathematical aspects of indices, especially the effects of bias, and through studies in which candidate methods are evaluated against known numbers determined through intensive surveys.
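
    The double-sampling correction the authors advocate is a ratio adjustment; a sketch (hypothetical helper, assuming randomly selected plots and unbiased intensive counts):

        import numpy as np

        def double_sampling_density(index_all, sub_index, sub_true):
            """Scale a cheap index survey by the detection ratio measured on a
            random subsample of plots that also got intensive (complete) counts."""
            detection_ratio = np.sum(sub_index) / np.sum(sub_true)
            return np.mean(index_all) / detection_ratio

    The trend argument in the abstract is the same idea over time: indices track trends so long as this detection ratio has no long-term trend of its own.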

  1. Accuracy of genomic selection in European maize elite breeding populations.

    PubMed

    Zhao, Yusheng; Gowda, Manje; Liu, Wenxin; Würschum, Tobias; Maurer, Hans P; Longin, Friedrich H; Ranc, Nicolas; Reif, Jochen C

    2012-03-01

    Genomic selection is a promising breeding strategy for rapid improvement of complex traits. The objective of our study was to investigate the prediction accuracy of genomic breeding values through cross validation. The study was based on experimental data of six segregating populations from a half-diallel mating design with 788 testcross progenies from an elite maize breeding program. The plants were intensively phenotyped in multi-location field trials and fingerprinted with 960 SNP markers. We used random regression best linear unbiased prediction in combination with fivefold cross validation. The prediction accuracy across populations was higher for grain moisture (0.90) than for grain yield (0.58). The accuracy of genomic selection realized for grain yield corresponds to the precision of phenotyping at unreplicated field trials in 3-4 locations. As up to three generations per year are feasible in maize, selection gain per unit time is high and, consequently, genomic selection holds great promise for maize breeding programs.
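
    Random regression BLUP is equivalent to the GBLUP sketch given earlier in this list but is parameterised by marker effects; a minimal sketch of the cross-validation loop (hypothetical toy interface; lam stands in for the ratio of residual to marker-effect variance):

        import numpy as np

        def rrblup(M, y, lam=1.0):
            """Shrunken marker effects: (Z'Z + lam I)^-1 Z'(y - ybar)."""
            Z = M - M.mean(axis=0)
            A = Z.T @ Z + lam * np.eye(Z.shape[1])
            return np.linalg.solve(A, Z.T @ (y - y.mean()))

        def cv_accuracy(M, y, k=5, lam=1.0, seed=0):
            """Fivefold cross-validated accuracy of RR-BLUP genomic predictions."""
            folds = np.random.default_rng(seed).permutation(len(y)) % k
            acc = []
            for f in range(k):
                te, tr = folds == f, folds != f
                beta = rrblup(M[tr], y[tr], lam)
                pred = (M[te] - M[tr].mean(axis=0)) @ beta
                acc.append(np.corrcoef(pred, y[te])[0, 1])
            return float(np.mean(acc))

        # usage: cv_accuracy(snp_matrix_0_1_2, testcross_phenotypes)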

  2. Pollen-based continental climate reconstructions at 6 and 21 ka: a global synthesis

    USGS Publications Warehouse

    Bartlein, P.J.; Harrison, S.P.; Brewer, Sandra; Connor, S.; Davis, B.A.S.; Gajewski, K.; Guiot, J.; Harrison-Prentice, T. I.; Henderson, A.; Peyron, O.; Prentice, I.C.; Scholze, M.; Seppa, H.; Shuman, B.; Sugita, S.; Thompson, R.S.; Viau, A.E.; Williams, J.; Wu, H.

    2010-01-01

    Subfossil pollen and plant macrofossil data derived from 14C-dated sediment profiles can provide quantitative information on glacial and interglacial climates. The data allow climate variables related to growing-season warmth, winter cold, and plant-available moisture to be reconstructed. Continental-scale reconstructions have been made for the mid-Holocene (MH, around 6 ka) and Last Glacial Maximum (LGM, around 21 ka), allowing comparison with palaeoclimate simulations currently being carried out as part of the fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change. The synthesis of the available MH and LGM climate reconstructions and their uncertainties, obtained using modern-analogue, regression and model-inversion techniques, is presented for four temperature variables and two moisture variables. Reconstructions of the same variables based on surface-pollen assemblages are shown to be accurate and unbiased. Reconstructed LGM and MH climate anomaly patterns are coherent, consistent between variables, and robust with respect to the choice of technique. They support a conceptual model of the controls of Late Quaternary climate change whereby the first-order effects of orbital variations and greenhouse forcing on the seasonal cycle of temperature are predictably modified by responses of the atmospheric circulation and surface energy balance.

  3. Spectroscopic observation of SN2017gkk by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Onori, F.; Benetti, S.; Cappellaro, E.; Losada, Illa R.; Gafton, E.; NUTS Collaboration

    2017-09-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of supernova SN2017gkk (=MASTER OT J091344.71+762842.5) in host galaxy NGC 2748.

  4. Spectroscopic observation of ASASSN-17he by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Kostrzewa-Rutkowska, Z.; Benetti, S.; Dong, S.; Stritzinger, M.; Stanek, K.; Brimacombe, J.; Sagues, A.; Galindo, P.; Losada, I. Rivero

    2017-10-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17he. The candidate was discovered by the All-Sky Automated Survey for Supernovae.

  5. Assessing performance of closed-loop insulin delivery systems by continuous glucose monitoring: drawbacks and way forward.

    PubMed

    Hovorka, Roman; Nodale, Marianna; Haidar, Ahmad; Wilinska, Malgorzata E

    2013-01-01

    We investigated whether continuous glucose monitoring (CGM) levels can accurately assess glycemic control while directing closed-loop insulin delivery. Data were analyzed retrospectively from 33 subjects with type 1 diabetes who underwent closed-loop and conventional pump therapy on two separate nights. Glycemic control was evaluated by reference plasma glucose and contrasted against three methods based on Navigator (Abbott Diabetes Care, Alameda, CA) CGM levels. Glucose mean and variability were estimated by unmodified CGM levels with acceptable clinical accuracy. Time when glucose was in target range was overestimated by CGM during closed-loop nights (CGM vs. plasma glucose median [interquartile range], 86% [65-97%] vs. 75% [59-91%]; P=0.04) but not during conventional pump therapy (57% [32-72%] vs. 51% [29-68%]; P=0.82) providing comparable treatment effect (mean [SD], 28% [29%] vs. 23% [21%]; P=0.11). Using the CGM measurement error of 15% derived from plasma glucose-CGM pairs (n=4,254), stochastic interpretation of CGM gave unbiased estimate of time in target during both closed-loop (79% [62-86%] vs. 75% [59-91%]; P=0.24) and conventional pump therapy (54% [33-66%] vs. 51% [29-68%]; P=0.44). Treatment effect (23% [24%] vs. 23% [21%]; P=0.96) and time below target were accurately estimated by stochastic CGM. Recalibrating CGM using reference plasma glucose values taken at the start and end of overnight closed-loop was not superior to stochastic CGM. CGM is acceptable to estimate glucose mean and variability, but without adjustment it may overestimate benefit of closed-loop. Stochastic CGM provided unbiased estimate of time when glucose is in target and below target and may be acceptable for assessment of closed-loop in the outpatient setting.
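
    The stochastic reading of CGM amounts to integrating the target-range probability under each reading's error distribution; a sketch (assumed error model: Gaussian with a 15% coefficient of variation; the target limits below are illustrative, not necessarily the study's):

        import numpy as np
        from scipy import stats

        def stochastic_time_in_target(cgm, lo=3.9, hi=8.0, cv=0.15):
            """Expected fraction of readings whose true glucose lies in target."""
            cgm = np.asarray(cgm, dtype=float)
            sd = cv * cgm                      # proportional measurement error
            p_in = stats.norm.cdf(hi, cgm, sd) - stats.norm.cdf(lo, cgm, sd)
            return p_in.mean()

        print(stochastic_time_in_target([4.2, 6.5, 9.8, 12.0, 7.1]))

    Readings just inside the limits then contribute less than one full sample, which can remove the overestimation seen when raw CGM values are simply thresholded.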

  6. Highly sensitive and unbiased approach for elucidating antibody repertoires

    PubMed Central

    Lin, Sherry G.; Ba, Zhaoqing; Du, Zhou; Zhang, Yu; Hu, Jiazhi; Alt, Frederick W.

    2016-01-01

    Developing B lymphocytes undergo V(D)J recombination to assemble germ-line V, D, and J gene segments into exons that encode the antigen-binding variable region of Ig heavy (H) and light (L) chains. IgH and IgL chains associate to form the B-cell receptor (BCR), which, upon antigen binding, activates B cells to secrete BCR as an antibody. Each of the huge number of clonally independent B cells expresses a unique set of IgH and IgL variable regions. The ability of V(D)J recombination to generate vast primary B-cell repertoires results from a combinatorial assortment of large numbers of different V, D, and J segments, coupled with diversification of the junctions between them to generate the complementary determining region 3 (CDR3) for antigen contact. Approaches to evaluate in depth the content of primary antibody repertoires and, ultimately, to study how they are further molded by secondary mutation and affinity maturation processes are of great importance to the B-cell development, vaccine, and antibody fields. We now describe an unbiased, sensitive, and readily accessible assay, referred to as high-throughput genome-wide translocation sequencing-adapted repertoire sequencing (HTGTS-Rep-seq), to quantify antibody repertoires. HTGTS-Rep-seq quantitatively identifies the vast majority of IgH and IgL V(D)J exons, including their unique CDR3 sequences, from progenitor and mature mouse B lineage cells via the use of specific J primers. HTGTS-Rep-seq also accurately quantifies DJH intermediates and V(D)J exons in either productive or nonproductive configurations. HTGTS-Rep-seq should be useful for studies of human samples, including clonal B-cell expansions, and also for following antibody affinity maturation processes. PMID:27354528

  7. Convergence of Free Energy Profile of Coumarin in Lipid Bilayer

    PubMed Central

    2012-01-01

    Atomistic molecular dynamics (MD) simulations of druglike molecules embedded in lipid bilayers are of considerable interest as models for drug penetration and positioning in biological membranes. Here we analyze partitioning of coumarin in dioleoylphosphatidylcholine (DOPC) bilayer, based on both multiple, unbiased 3 μs MD simulations (total length) and free energy profiles along the bilayer normal calculated by biased MD simulations (∼7 μs in total). The convergences in time of free energy profiles calculated by both umbrella sampling and z-constraint techniques are thoroughly analyzed. Two sets of starting structures are also considered, one from unbiased MD simulation and the other from “pulling” coumarin along the bilayer normal. The structures obtained by pulling simulation contain water defects on the lipid bilayer surface, while those acquired from unbiased simulation have no membrane defects. The free energy profiles converge more rapidly when starting frames from unbiased simulations are used. In addition, z-constraint simulation leads to more rapid convergence than umbrella sampling, due to quicker relaxation of membrane defects. Furthermore, we show that the choice of RESP, PRODRG, or Mulliken charges considerably affects the resulting free energy profile of our model drug along the bilayer normal. We recommend using z-constraint biased MD simulations based on starting geometries acquired from unbiased MD simulations for efficient calculation of convergent free energy profiles of druglike molecules along bilayer normals. The calculation of free energy profile should start with an unbiased simulation, though the polar molecules might need a slow pulling afterward. Results obtained with the recommended simulation protocol agree well with available experimental data for two coumarin derivatives. PMID:22545027
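
    For reference, the two biased routes to the same profile are standard: umbrella sampling reweights histograms collected under window potentials, while the z-constraint estimate integrates the time-averaged constraint force along the bilayer normal,

        \[
          \Delta G(z) = - \int_{z_0}^{z} \bigl\langle F_c(z') \bigr\rangle \, dz' .
        \]

    Slowly relaxing membrane defects bias ⟨F_c⟩ (and the window histograms) until they heal, which is why starting frames taken from unbiased MD converge faster in both schemes.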

  8. Convergence of Free Energy Profile of Coumarin in Lipid Bilayer.

    PubMed

    Paloncýová, Markéta; Berka, Karel; Otyepka, Michal

    2012-04-10

    Atomistic molecular dynamics (MD) simulations of druglike molecules embedded in lipid bilayers are of considerable interest as models for drug penetration and positioning in biological membranes. Here we analyze partitioning of coumarin in dioleoylphosphatidylcholine (DOPC) bilayer, based on both multiple, unbiased 3 μs MD simulations (total length) and free energy profiles along the bilayer normal calculated by biased MD simulations (∼7 μs in total). The convergences in time of free energy profiles calculated by both umbrella sampling and z-constraint techniques are thoroughly analyzed. Two sets of starting structures are also considered, one from unbiased MD simulation and the other from "pulling" coumarin along the bilayer normal. The structures obtained by pulling simulation contain water defects on the lipid bilayer surface, while those acquired from unbiased simulation have no membrane defects. The free energy profiles converge more rapidly when starting frames from unbiased simulations are used. In addition, z-constraint simulation leads to more rapid convergence than umbrella sampling, due to quicker relaxation of membrane defects. Furthermore, we show that the choice of RESP, PRODRG, or Mulliken charges considerably affects the resulting free energy profile of our model drug along the bilayer normal. We recommend using z-constraint biased MD simulations based on starting geometries acquired from unbiased MD simulations for efficient calculation of convergent free energy profiles of druglike molecules along bilayer normals. The calculation of free energy profile should start with an unbiased simulation, though the polar molecules might need a slow pulling afterward. Results obtained with the recommended simulation protocol agree well with available experimental data for two coumarin derivatives.

  9. Providing regular care for grandchildren in Thailand: An analysis of the impact on grandparents' health.

    PubMed

    Komonpaisarn, Touchanun; Loichinger, Elke

    2018-05-17

    One of the many roles of grandparents is the role as caretaker for their grandchildren. Studies looking into the situation of older adults providing care for their grandchildren have found that care responsibilities can have beneficial effects but can also pose challenges to those providing it, depending on individual and societal circumstances. The objective of our study is to shed light on the health effects of providing care for grandchildren younger than 10 years of age on grandparents. Whether this experience has positive or negative effects on the caretaker's health depends on a range of factors that we explore here in the context of Thailand. The study is based on the quantitative analysis of the 2011 round of the National Survey of Older Persons in Thailand. In order to control for endogeneity between health status and the provision of care, we apply several instrumental variable (IV) approaches in addition to regular regressions. In terms of health status, we make use of four health-related variables: self-reported health status, functional limitations, happiness level and information about negative feelings. The observed positive impact of grandparenting on three health outcomes that we find with non-endogeneity-controlled OLS analyses is likely due to reverse causality or self-selection into becoming a grandparent who provides care. The unbiased results imply that regularly taking care of young grandchildren does not provide any physical health benefits; to the contrary, it seems to have a negative impact on self-rated health, functional limitations and psychological well-being, supporting the role strain theory. Copyright © 2018 Elsevier Ltd. All rights reserved.
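
    The instrumental-variable logic used here can be made concrete with a hand-rolled two-stage least squares on simulated data. All variable names and coefficients below are hypothetical, not those of the survey; the point is only that OLS inherits bias from an unobserved confounder while the IV estimate does not, assuming a valid instrument.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 5000
        z = rng.normal(size=n)   # instrument (assumed exogenous)
        u = rng.normal(size=n)   # unobserved confounder, e.g., frailty
        care = 0.8 * z + 0.5 * u + rng.normal(size=n)         # endogenous regressor
        health = -0.3 * care + 0.7 * u + rng.normal(size=n)   # outcome

        X = np.column_stack([np.ones(n), care])
        Z = np.column_stack([np.ones(n), z])

        # Stage 1: project the endogenous regressor onto the instrument.
        X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
        # Stage 2: regress the outcome on the fitted values.
        beta_iv = np.linalg.lstsq(X_hat, health, rcond=None)[0]
        beta_ols = np.linalg.lstsq(X, health, rcond=None)[0]
        print("OLS slope:", beta_ols[1], "IV slope:", beta_iv[1])  # IV ~ -0.3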

  10. Spectroscopic classification of Gaia18adv by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Gall, C.; Benetti, S.; Wyrzykowski, L.; Stritzinger, M.; Holmbo, S.; Dong, S.; Siltala, Lauri; NUTS Collaboration

    2018-01-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of Gaia18adv (SN2018hh) near the host galaxy SDSS J121341.37+282640.0.

  11. Large Area Crop Inventory Experiment (LACIE). Development of procedure M for multicrop inventory, with tests of a spring-wheat configuration

    NASA Technical Reports Server (NTRS)

    Horvath, R. (Principal Investigator); Cicone, R.; Crist, E.; Kauth, R. J.; Lambeck, P.; Malila, W. A.; Richardson, W.

    1979-01-01

    The author has identified the following significant results. An outgrowth of research and development activities in support of LACIE was a multicrop area estimation procedure, Procedure M. This procedure was a flexible, modular system that could be operated within the LACIE framework. Its distinctive features were refined preprocessing (including spatially varying correction for atmospheric haze), definition of field-like spatial features for labeling, spectral stratification, unbiased selection of samples for labeling, and crop area estimation without conventional maximum-likelihood classification.

  12. Spectroscopic classification of supernovae SN 2018aei and SN 2018aej by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Cannizzaro, G.; Kuncarayakti, H.; Fraser, M.; Hamanowicz, A.; Jonker, P.; Kankare, E.; Kostrzewa-Rutkowska, Z.; Onori, F.; Wevers, T.; Wyrzykowski, L.; Galbany, L.

    2018-03-01

    The NOT Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of supernovae SN 2018aei and SN 2018aej, discovered by the Pan-STARRS Survey for Transients (ATel #11408).

  13. Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen

    Here, we propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator–coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.

  14. Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits

    DOE PAGES

    Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen; ...

    2018-03-12

    Here, we propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator–coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.

  15. Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits

    NASA Astrophysics Data System (ADS)

    Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen; Gauthier, Daniel J.

    2018-03-01

    We propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator-coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.
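
    The defining property of mutually unbiased bases — every vector of one basis has overlap of modulus 1/√d with every vector of the other — is easy to check numerically. A minimal sketch for d = 4 using the computational and discrete-Fourier bases, which are mutually unbiased in any dimension; this is a generic illustration, not the authors' time-bin construction.

        import numpy as np

        d = 4
        comp = np.eye(d, dtype=complex)  # computational basis as columns
        j, k = np.meshgrid(np.arange(d), np.arange(d))
        fourier = np.exp(2j * np.pi * j * k / d) / np.sqrt(d)  # DFT basis as columns

        # Every cross-basis overlap must have modulus 1/sqrt(d).
        overlaps = np.abs(comp.conj().T @ fourier)
        assert np.allclose(overlaps, 1.0 / np.sqrt(d))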

  16. Examinations of tRNA Range of Motion Using Simulations of Cryo-EM Microscopy and X-Ray Data.

    PubMed

    Caulfield, Thomas R; Devkota, Batsal; Rollins, Geoffrey C

    2011-01-01

    We examined tRNA flexibility using a combination of steered and unbiased molecular dynamics simulations. Using Maxwell's demon algorithm, molecular dynamics was used to steer X-ray structure data toward that from an alternative state obtained from cryogenic-electron microscopy density maps. Thus, we were able to fit X-ray structures of tRNA onto cryogenic-electron microscopy density maps for hybrid states of tRNA. Additionally, we employed both Maxwell's demon molecular dynamics simulations and unbiased simulation methods to identify possible ribosome-tRNA contact areas where the ribosome may discriminate tRNAs during translation. Herein, we collected >500 ns of simulation data to assess the global range of motion for tRNAs. Biased simulations can be used to steer between known conformational stop points, while unbiased simulations allow for a general testing of conformational space previously unexplored. The unbiased molecular dynamics data describes the global conformational changes of tRNA on a sub-microsecond time scale for comparison with steered data. Additionally, the unbiased molecular dynamics data was used to identify putative contacts between tRNA and the ribosome during the accommodation step of translation. We found that the primary contact regions were H71 and H92 of the 50S subunit and ribosomal proteins L14 and L16.

  17. Stochastic loss and gain of symmetric divisions in the C. elegans epidermis perturbs robustness of stem cell number

    PubMed Central

    Katsanos, Dimitris; Koneru, Sneha L.; Mestek Boukhibar, Lamia; Gritti, Nicola; Ghose, Ritobrata; Appleford, Peter J.; Doitsidou, Maria; Woollard, Alison; van Zon, Jeroen S.; Poole, Richard J.

    2017-01-01

    Biological systems are subject to inherent stochasticity. Nevertheless, development is remarkably robust, ensuring the consistency of key phenotypic traits such as correct cell numbers in a certain tissue. It is currently unclear which genes modulate phenotypic variability, what their relationship is to core components of developmental gene networks, and what the developmental basis of variable phenotypes is. Here, we start addressing these questions using the robust number of Caenorhabditis elegans epidermal stem cells, known as seam cells, as a readout. We employ genetics, cell lineage tracing, and single molecule imaging to show that mutations in lin-22, a Hes-related basic helix-loop-helix (bHLH) transcription factor, increase seam cell number variability. We show that the increase in phenotypic variability is due to stochastic conversion of normally symmetric cell divisions to asymmetric and vice versa during development, which affect the terminal seam cell number in opposing directions. We demonstrate that LIN-22 acts within the epidermal gene network to antagonise the Wnt signalling pathway. However, lin-22 mutants exhibit cell-to-cell variability in Wnt pathway activation, which correlates with and may drive phenotypic variability. Our study demonstrates the feasibility of studying phenotypic trait variance in tractable model organisms using unbiased mutagenesis screens. PMID:29108019

  18. CUDA Optimization Strategies for Compute- and Memory-Bound Neuroimaging Algorithms

    PubMed Central

    Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W.

    2011-01-01

    As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, register pressure is reduced through variable reuse via shared memory, and data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. PMID:21159404

  19. CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms.

    PubMed

    Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W

    2012-06-01

    As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, register pressure is reduced through variable reuse via shared memory, and data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  20. Role of the p55-gamma subunit of PI3K in ALK-induced cell migration: RNAi-based selection of cell migration regulators.

    PubMed

    Seo, Minchul; Kim, Jong-Heon; Suk, Kyoungho

    2017-05-04

    Recently, unbiased functional genetic selection identified novel cell migration-regulating genes. This RNAi-based functional selection was performed using 63,996 pooled lentiviral shRNAs targeting 21,332 mouse genes. After five rounds of selection using cells with accelerated or impaired migration, shRNAs were retrieved and identified by half-hairpin barcode sequencing using cells with the selected phenotypes. This selection process led to the identification of 29 novel cell migration regulators. One of these candidates, anaplastic lymphoma kinase (ALK), was further investigated. Subsequent studies revealed that ALK promoted cell migration through the PI3K-AKT pathway via the p55γ regulatory subunit of PI3K, rather than the more commonly used p85 subunit. Western blot and immunohistochemistry studies using mouse brain tissues revealed similar temporal expression patterns of ALK, phospho-p55γ, and phospho-AKT during different stages of development. These data support an important role for the p55γ subunit of PI3K in ALK-induced cell migration during brain development.

  1. Applying operational research and data mining to performance based medical personnel motivation system.

    PubMed

    Niaksu, Olegas; Zaptorius, Jonas

    2014-01-01

    This paper presents a methodology suitable for the creation of a performance-related remuneration system in the healthcare sector that would meet requirements for efficiency and sustainable quality of healthcare services. A methodology for performance indicator selection, ranking, and a posteriori evaluation is proposed and discussed. The Priority Distribution Method is applied for unbiased performance criteria weighting, and data mining methods are proposed to monitor and evaluate the results of the motivation system. We developed a method for healthcare-specific criteria selection consisting of 8 steps, and we proposed and demonstrated the application of the Priority Distribution Method for weighting the selected criteria. Moreover, a set of data mining methods for evaluation of the motivational system outcomes was proposed. The described methodology for calculating performance-related payment still needs practical validation. We plan to develop semi-automated tools for monitoring institutional and personal performance indicators. The final step would be validation of the methodology in a healthcare facility.

  2. Z-rich solar particle event characteristics 1972-1976

    NASA Technical Reports Server (NTRS)

    Zwickl, R. D.; Roelof, E. C.; Gold, R. E.; Krimigis, S. M.; Armstrong, T. P.

    1978-01-01

    It is found in the reported investigation that Z-rich solar particle events usually have large and prolonged anisotropies in addition to an extremely variable charge composition that varies not only from event to event but also throughout the event. These observations suggest that one can no longer regard the event-averaged composition of solar particle events at low energies as providing an unbiased global sample of the solar atmospheric composition. The variability from event to event and among classes of events is just too great. However, the tendency for the Z-rich events to be associated with both the low-speed solar wind at or just before the onset of solar wind streams and with active regions located in the western hemisphere, indicates that charge composition studies of solar particle events can yield a better knowledge of the flare acceleration process as well as the inhomogeneous nature of magnetic field structure and particle composition in the solar atmosphere.

  3. Multifractal cross-correlation effects in two-variable time series of complex network vertex observables

    NASA Astrophysics Data System (ADS)

    Oświęcimka, Paweł; Livi, Lorenzo; Drożdż, Stanisław

    2016-10-01

    We investigate the scaling of the cross-correlations calculated for two-variable time series containing vertex properties in the context of complex networks. Time series of such observables are obtained by means of stationary, unbiased random walks. We consider three vertex properties that provide, respectively, short-, medium-, and long-range information regarding the topological role of vertices in a given network. In order to reveal the relation between these quantities, we applied the multifractal cross-correlation analysis technique, which provides information about the nonlinear effects in coupling of time series. We show that the considered network models are characterized by unique multifractal properties of the cross-correlation. In particular, it is possible to distinguish between Erdős-Rényi, Barabási-Albert, and Watts-Strogatz networks on the basis of fractal cross-correlation. Moreover, the analysis of protein contact networks reveals characteristics shared with both scale-free and small-world models.
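
    The time-series generation step described here is straightforward to reproduce in outline: run a stationary, unbiased random walk (uniform choice among neighbors) and record a vertex observable at each step. A sketch using networkx, with vertex degree standing in for the short-range observable; the graph size and walk length are arbitrary choices, not those of the study.

        import random
        import networkx as nx

        random.seed(42)
        G = nx.barabasi_albert_graph(n=1000, m=4)

        # Unbiased walk: each step picks uniformly among the current
        # vertex's neighbors; the recorded degrees form the time series.
        node = random.choice(list(G.nodes))
        series = []
        for _ in range(100_000):
            node = random.choice(list(G.neighbors(node)))
            series.append(G.degree(node))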

  4. METAGUI 3: A graphical user interface for choosing the collective variables in molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Giorgino, Toni; Laio, Alessandro; Rodriguez, Alex

    2017-08-01

    Molecular dynamics (MD) simulations allow the exploration of the phase space of biopolymers through the integration of equations of motion of their constituent atoms. The analysis of MD trajectories often relies on the choice of collective variables (CVs) along which the dynamics of the system is projected. We developed a graphical user interface (GUI) for facilitating the interactive choice of the appropriate CVs. The GUI allows: defining interactively new CVs; partitioning the configurations into microstates characterized by similar values of the CVs; calculating the free energies of the microstates for both unbiased and biased (metadynamics) simulations; clustering the microstates in kinetic basins; visualizing the free energy landscape as a function of a subset of the CVs used for the analysis. A simple mouse click allows one to quickly inspect structures corresponding to specific points in the landscape.
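
    For the unbiased case, the free energies of the microstates follow directly from their populations, F_i = -k_B T ln(N_i / N). A minimal sketch of that step, assuming uniform binning of a single CV; the trajectory below is a placeholder, and METAGUI itself partitions on all chosen CVs rather than one.

        import numpy as np

        kBT = 2.494  # kJ/mol at 300 K
        cv = np.random.default_rng(0).normal(size=100_000)  # placeholder CV samples

        # Bin the CV into microstates and convert populations into free
        # energies, zeroed at the most populated state.
        counts, edges = np.histogram(cv, bins=50)
        occupied = counts > 0
        F = np.full(counts.shape, np.inf)
        F[occupied] = -kBT * np.log(counts[occupied] / counts.sum())
        F -= F[occupied].min()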

  5. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.

    PubMed

    Buettner, Florian; Natarajan, Kedar N; Casale, F Paolo; Proserpio, Valentina; Scialdone, Antonio; Theis, Fabian J; Teichmann, Sarah A; Marioni, John C; Stegle, Oliver

    2015-02-01

    Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.

  6. From metadynamics to dynamics.

    PubMed

    Tiwary, Pratyush; Parrinello, Michele

    2013-12-06

    Metadynamics is a commonly used and successful enhanced sampling method. By introducing a history-dependent bias that depends on a restricted number of collective variables, it can explore complex free energy surfaces characterized by several metastable states separated by large free energy barriers. Here we extend its scope by introducing a simple yet powerful method for calculating the rates of transition between different metastable states. The method does not rely on previous knowledge of the transition states or reaction coordinates, as long as collective variables are known that can distinguish between the various stable minima in free energy space. We demonstrate that our method recovers the correct escape rates out of these stable states and also preserves the correct sequence of state-to-state transitions, with minimal extra computational effort needed over ordinary metadynamics. We apply the formalism to three different problems and in each case find excellent agreement with the results of long unbiased molecular dynamics runs.

  7. From Metadynamics to Dynamics

    NASA Astrophysics Data System (ADS)

    Tiwary, Pratyush; Parrinello, Michele

    2013-12-01

    Metadynamics is a commonly used and successful enhanced sampling method. By introducing a history-dependent bias that depends on a restricted number of collective variables, it can explore complex free energy surfaces characterized by several metastable states separated by large free energy barriers. Here we extend its scope by introducing a simple yet powerful method for calculating the rates of transition between different metastable states. The method does not rely on previous knowledge of the transition states or reaction coordinates, as long as collective variables are known that can distinguish between the various stable minima in free energy space. We demonstrate that our method recovers the correct escape rates out of these stable states and also preserves the correct sequence of state-to-state transitions, with minimal extra computational effort needed over ordinary metadynamics. We apply the formalism to three different problems and in each case find excellent agreement with the results of long unbiased molecular dynamics runs.
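
    The rate recovery rests on rescaling simulation time by the mean exponential of the instantaneous bias, t_phys = t_MD · ⟨exp(V(s,t)/k_BT)⟩, with the bias deposited away from the transition region. A sketch of that bookkeeping, assuming the per-frame bias values have already been extracted from the metadynamics output; the array here is a placeholder.

        import numpy as np

        kBT = 2.494  # kJ/mol at 300 K
        dt = 0.002   # ps between saved frames

        # Placeholder for V(s(t), t) along the run; in practice this is
        # read from the metadynamics output of the MD engine.
        bias = np.random.default_rng(0).uniform(0.0, 30.0, size=500_000)

        # Each biased frame advances physical time by dt * exp(V / kBT);
        # the mean exponential is the acceleration factor.
        alpha = np.mean(np.exp(bias / kBT))
        t_physical = dt * bias.size * alpha
        print(f"acceleration ~ {alpha:.3g}; physical time ~ {t_physical:.3g} ps")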

  8. Spectroscopic observation of SN 2017jzp and SN 2018bf by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Kuncarayakti, H.; Mattila, S.; Kotak, R.; Harmanen, J.; Reynolds, T.; Wyrzykowski, L.; Stritzinger, M.; Onori, F.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.

    2018-01-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of SNe 2017jzp and 2018bf in host galaxies KUG 1326+679 and SDSS J225746.53+253833.5, respectively.

  9. Spectroscopic observation of ASASSN-17nb and CSS170922:172546+342249 by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Harmanen, J.; Mattila, S.; Kuncarayakti, H.; Reynolds, T.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.; Dong, S.; Pastorello, A.; Pursimo, T.; NUTS Collaboration

    2017-10-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17nb in MCG+06-17-007 and CSS170922:172546+342249 in an unknown host galaxy.

  10. Losing the rose tinted glasses: neural substrates of unbiased belief updating in depression

    PubMed Central

    Garrett, Neil; Sharot, Tali; Faulkner, Paul; Korn, Christoph W.; Roiser, Jonathan P.; Dolan, Raymond J.

    2014-01-01

    Recent evidence suggests that a state of good mental health is associated with biased processing of information that supports a positively skewed view of the future. Depression, on the other hand, is associated with unbiased processing of such information. Here, we use brain imaging in conjunction with a belief update task administered to clinically depressed patients and healthy controls to characterize brain activity that supports unbiased belief updating in clinically depressed individuals. Our results reveal that unbiased belief updating in depression is mediated by strong neural coding of estimation errors in response to both good news (in left inferior frontal gyrus and bilateral superior frontal gyrus) and bad news (in right inferior parietal lobule and right inferior frontal gyrus) regarding the future. In contrast, intact mental health was linked to a relatively attenuated neural coding of bad news about the future. These findings identify a neural substrate mediating the breakdown of biased updating in major depressive disorder, which may be essential for mental health. PMID:25221492

  11. Allowable SEM noise for unbiased LER measurement

    NASA Astrophysics Data System (ADS)

    Papavieros, George; Constantoudis, Vassilios; Gogolides, Evangelos

    2018-03-01

    Recently, a novel method for the calculation of unbiased line edge roughness (LER) based on power spectral density (PSD) analysis has been proposed. In this paper, an alternative method utilizing the height-height correlation function (HHCF) of edges is first discussed and investigated. The HHCF-based method enables the unbiased determination of the whole triplet of LER parameters, including, besides the rms, the correlation length and the roughness exponent. The key to both methods is the sensitivity of the PSD and HHCF to noise at high frequencies and short distances, respectively. Second, we develop a testbed of synthesized SEM images with controlled LER and noise to justify the effectiveness of the proposed unbiased methods. Our main objective is to determine the boundaries, in terms of noise levels and roughness characteristics, within which the methods remain reliable, i.e., the maximum amount of noise for which the output results agree with the controlled, known inputs. We also establish the extremes of the roughness parameters for which the methods retain their accuracy.
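
    The reason HHCF-based (and PSD-based) unbiasing works is that white SEM noise enters the height-height correlation function as a roughly constant offset, H_meas(r) ≈ H_true(r) + 2σ_noise² for r > 0, so the noise contribution can be estimated at short distances and subtracted. A rough numeric sketch of this idea on a synthetic edge; the edge model and the small-r plateau estimate below are simplifying assumptions, not the paper's procedure.

        import numpy as np

        rng = np.random.default_rng(7)
        n = 2048
        true_edge = np.cumsum(rng.normal(size=n))   # correlated toy roughness
        true_edge -= true_edge.mean()
        noisy_edge = true_edge + rng.normal(scale=0.8, size=n)  # add SEM-like noise

        def hhcf(h, max_lag):
            # H(r) = <(h(x + r) - h(x))^2>, averaged over positions x.
            return np.array([np.mean((h[r:] - h[:-r]) ** 2)
                             for r in range(1, max_lag)])

        H = hhcf(noisy_edge, 200)
        # White noise shifts H(r) upward by 2 * sigma_noise^2; read the
        # offset off the shortest lag and subtract it from the variance.
        noise_offset = H[0]  # crude small-r plateau estimate
        sigma2_unbiased = noisy_edge.var() - noise_offset / 2.0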

  12. Filopodia Conduct Target Selection in Cortical Neurons Using Differences in Signal Kinetics of a Single Kinase.

    PubMed

    Mao, Yu-Ting; Zhu, Julia X; Hanamura, Kenji; Iurilli, Giuliano; Datta, Sandeep Robert; Dalva, Matthew B

    2018-05-16

    Dendritic filopodia select synaptic partner axons by interviewing the cell surface of potential targets, but how filopodia decipher the complex pattern of adhesive and repulsive molecular cues to find appropriate contacts is unknown. Here, we demonstrate in cortical neurons that a single cue is sufficient for dendritic filopodia to reject or select specific axonal contacts for elaboration as synaptic sites. Super-resolution and live-cell imaging reveals that EphB2 is located in the tips of filopodia and at nascent synaptic sites. Surprisingly, a genetically encoded indicator of EphB kinase activity, unbiased classification, and a photoactivatable EphB2 reveal that simple differences in the kinetics of EphB kinase signaling at the tips of filopodia mediate the choice between retraction and synaptogenesis. This may enable individual filopodia to choose targets based on differences in the activation rate of a single tyrosine kinase, greatly simplifying the process of partner selection and suggesting a general principle. Copyright © 2018 Elsevier Inc. All rights reserved.

  13. Block-circulant matrices with circulant blocks, Weil sums, and mutually unbiased bases. II. The prime power case

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Combescure, Monique

    2009-03-15

    In our previous paper [Combescure, M., 'Circulant matrices, Gauss sums and mutually unbiased bases. I. The prime number case', Cubo A Mathematical Journal (unpublished)] we have shown that the theory of circulant matrices allows one to recover the result that there exist p+1 mutually unbiased bases in dimension p, p being an arbitrary prime number. Two orthonormal bases B, B′ of C^d are said to be mutually unbiased if for all b ∈ B and all b′ ∈ B′ one has |b·b′| = 1/√d (b·b′ being the Hermitian scalar product in C^d). In this paper we show that the theory of block-circulant matrices with circulant blocks allows one to show very simply the known result that if d = p^n (p a prime number and n any integer) there exist d+1 mutually unbiased bases in C^d. Our result relies heavily on an idea of Klimov et al. ['Geometrical approach to the discrete Wigner function,' J. Phys. A 39, 14471 (2006)]. As a by-product we recover properties of quadratic Weil sums for p ≥ 3, which generalizes the fact that in the prime case the quadratic Gauss sum properties follow from our results.

  14. Examinations of tRNA Range of Motion Using Simulations of Cryo-EM Microscopy and X-Ray Data

    PubMed Central

    Caulfield, Thomas R.; Devkota, Batsal; Rollins, Geoffrey C.

    2011-01-01

    We examined tRNA flexibility using a combination of steered and unbiased molecular dynamics simulations. Using Maxwell's demon algorithm, molecular dynamics was used to steer X-ray structure data toward that from an alternative state obtained from cryogenic-electron microscopy density maps. Thus, we were able to fit X-ray structures of tRNA onto cryogenic-electron microscopy density maps for hybrid states of tRNA. Additionally, we employed both Maxwell's demon molecular dynamics simulations and unbiased simulation methods to identify possible ribosome-tRNA contact areas where the ribosome may discriminate tRNAs during translation. Herein, we collected >500 ns of simulation data to assess the global range of motion for tRNAs. Biased simulations can be used to steer between known conformational stop points, while unbiased simulations allow for a general testing of conformational space previously unexplored. The unbiased molecular dynamics data describes the global conformational changes of tRNA on a sub-microsecond time scale for comparison with steered data. Additionally, the unbiased molecular dynamics data was used to identify putative contacts between tRNA and the ribosome during the accommodation step of translation. We found that the primary contact regions were H71 and H92 of the 50S subunit and ribosomal proteins L14 and L16. PMID:21716650

  15. A comparison of five methods to predict genomic breeding values of dairy bulls from genome-wide SNP markers

    PubMed Central

    2009-01-01

    Background Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Methods Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. Results For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods, which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree-based predictions gave 1.05-1.34 times higher accuracies compared to predictions based on pedigree alone. Some methods have largely different computational requirements, with PLSR and RR-BLUP requiring the least computing time. Conclusions The four methods which use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended. PMID:20043835
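
    Of the compared methods, RR-BLUP is the simplest to write down: every SNP effect receives the same ridge shrinkage, with λ = σ_e²/σ_SNP². A self-contained sketch on simulated genotypes; the sizes, variance components, and 0/1/2 allele coding are illustrative assumptions, not the study's data.

        import numpy as np

        rng = np.random.default_rng(3)
        n_animals, n_snp = 1000, 2000
        X = rng.binomial(2, 0.3, size=(n_animals, n_snp)).astype(float)  # 0/1/2 genotypes
        y = X @ rng.normal(0.0, 0.05, n_snp) + rng.normal(size=n_animals)  # phenotypes
        X -= X.mean(axis=0)
        y -= y.mean()

        # RR-BLUP: solve (X'X + lambda I) b = X'y with a common shrinkage
        # lambda = sigma_e^2 / sigma_snp^2 for every marker.
        lam = 1.0 / 0.05**2
        b = np.linalg.solve(X.T @ X + lam * np.eye(n_snp), X.T @ y)
        mbv = X @ b  # molecular breeding values of the training animals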

  16. Criteria for the use of regression analysis for remote sensing of sediment and pollutants

    NASA Technical Reports Server (NTRS)

    Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R.

    1982-01-01

    An examination of the limitations, requirements, and precision of the linear multiple-regression technique for quantification of marine environmental parameters is conducted. Both environmental and optical physics conditions have been defined for which an exact solution to the signal response equations is of the same form as the multiple regression equation. Various statistical parameters are examined to define criteria for selection of an unbiased fit when upwelled radiance values contain error and are correlated with each other. Field experimental data are examined to define data-smoothing requirements needed to satisfy the criteria of Daniel and Wood (1971). Recommendations are made concerning improved selection of ground-truth locations to maximize variance and to minimize physical errors associated with the remote sensing experiment.

  17. Unbiased Estimates of Variance Components with Bootstrap Procedures

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2007-01-01

    This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…

  18. Spectroscopic observation of Gaia17dht and Gaia17diu by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Fraser, M.; Dyrbye, S.; Cappella, E.

    2017-12-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of Gaia17dht/SN2017izz and Gaia17diu/SN2017jdb (in host galaxies SDSS J145121.24+283521.6 and LEDA 2753585 respectively).

  19. Statistics as Unbiased Estimators: Exploring the Teaching of Standard Deviation

    ERIC Educational Resources Information Center

    Wasserman, Nicholas H.; Casey, Stephanie; Champion, Joe; Huey, Maryann

    2017-01-01

    This manuscript presents findings from a study about the knowledge for and planned teaching of standard deviation. We investigate how understanding variance as an unbiased (inferential) estimator--not just a descriptive statistic for the variation (spread) in data--is related to teachers' instruction regarding standard deviation, particularly…
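
    The inferential point at issue is the Bessel correction: dividing the sum of squared deviations by n - 1 rather than n makes the sample variance an unbiased estimator of the population variance. A short simulation makes the distinction concrete:

        import numpy as np

        rng = np.random.default_rng(0)
        samples = rng.normal(0.0, 2.0, size=(200_000, 10))  # true variance = 4
        print(samples.var(axis=1, ddof=0).mean())  # ~3.6, biased low by (n-1)/n
        print(samples.var(axis=1, ddof=1).mean())  # ~4.0, unbiased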

  20. Generalized approach for using unbiased symmetric metrics with negative values: normalized mean bias factor and normalized mean absolute error factor

    EPA Science Inventory

    Unbiased symmetric metrics provide a useful measure to quickly compare two datasets, with similar interpretations for both under and overestimations. Two examples include the normalized mean bias factor and normalized mean absolute error factor. However, the original formulations...
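
    Under one common formulation of these metrics (stated here as an assumption, since the record is truncated), the normalized mean bias factor compares means symmetrically depending on which of model or observation is larger:

        import numpy as np

        def nmbf(model, obs):
            # Normalized mean bias factor: symmetric interpretation for
            # under- and overestimation (one common positive-data form).
            m, o = np.mean(model), np.mean(obs)
            return m / o - 1.0 if m >= o else 1.0 - o / m

        def nmaef(model, obs):
            # Normalized mean absolute error factor, same convention.
            m, o = np.mean(model), np.mean(obs)
            denom = np.sum(obs) if m >= o else np.sum(model)
            return np.sum(np.abs(np.asarray(model) - np.asarray(obs))) / denom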

  1. Pollen-based continental climate reconstructions at 6 and 21 ka: A global synthesis

    USGS Publications Warehouse

    Bartlein, P.J.; Harrison, S.P.; Brewer, Sandra; Connor, S.; Davis, B.A.S.; Gajewski, K.; Guiot, J.; Harrison-Prentice, T. I.; Henderson, A.; Peyron, O.; Prentice, I.C.; Scholze, M.; Seppa, H.; Shuman, B.; Sugita, S.; Thompson, R.S.; Viau, A.E.; Williams, J.; Wu, H.

    2011-01-01

    Subfossil pollen and plant macrofossil data derived from ¹⁴C-dated sediment profiles can provide quantitative information on glacial and interglacial climates. The data allow climate variables related to growing-season warmth, winter cold, and plant-available moisture to be reconstructed. Continental-scale reconstructions have been made for the mid-Holocene (MH, around 6 ka) and Last Glacial Maximum (LGM, around 21 ka), allowing comparison with palaeoclimate simulations currently being carried out as part of the fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change. The synthesis of the available MH and LGM climate reconstructions and their uncertainties, obtained using modern-analogue, regression and model-inversion techniques, is presented for four temperature variables and two moisture variables. Reconstructions of the same variables based on surface-pollen assemblages are shown to be accurate and unbiased. Reconstructed LGM and MH climate anomaly patterns are coherent, consistent between variables, and robust with respect to the choice of technique. They support a conceptual model of the controls of Late Quaternary climate change whereby the first-order effects of orbital variations and greenhouse forcing on the seasonal cycle of temperature are predictably modified by responses of the atmospheric circulation and surface energy balance. © 2010 The Author(s).

  2. Responder analysis without dichotomization.

    PubMed

    Zhang, Zhiwei; Chu, Jianxiong; Rahardja, Dewi; Zhang, Hui; Tang, Li

    2016-01-01

    In clinical trials, it is common practice to categorize subjects as responders and non-responders on the basis of one or more clinical measurements under pre-specified rules. Such a responder analysis is often criticized for the loss of information in dichotomizing one or more continuous or ordinal variables. It is worth noting that a responder analysis can be performed without dichotomization, because the proportion of responders for each treatment can be derived from a model for the original clinical variables (used to define a responder) and estimated by substituting maximum likelihood estimators of model parameters. This model-based approach can be considerably more efficient and more effective for dealing with missing data than the usual approach based on dichotomization. For parameter estimation, the model-based approach generally requires correct specification of the model for the original variables. However, under the sharp null hypothesis, the model-based approach remains unbiased for estimating the treatment difference even if the model is misspecified. We elaborate on these points and illustrate them with a series of simulation studies mimicking a study of Parkinson's disease, which involves longitudinal continuous data in the definition of a responder.
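
    The model-based responder proportion the authors describe can be sketched for a single arm under a normal model: estimate the model parameters by maximum likelihood and plug them into P(Y > c) instead of counting dichotomized responders. The threshold and data below are hypothetical:

        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(5)
        c = 2.0                              # responder cutoff on the clinical scale
        y = rng.normal(1.2, 1.5, size=200)   # hypothetical continuous endpoint

        p_dichotomized = np.mean(y > c)      # usual responder analysis

        # Model-based: plug the normal MLEs into P(Y > c).
        mu_hat, sigma_hat = y.mean(), y.std()  # ddof=0 is the MLE
        p_model = 1.0 - norm.cdf((c - mu_hat) / sigma_hat)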

  3. Simultaneous selection for cowpea (Vigna unguiculata L.) genotypes with adaptability and yield stability using mixed models.

    PubMed

    Torres, F E; Teodoro, P E; Rodrigues, E V; Santos, A; Corrêa, A M; Ceccon, G

    2016-04-29

    The aim of this study was to select erect cowpea (Vigna unguiculata L.) genotypes simultaneously for high adaptability, stability, and grain yield in Mato Grosso do Sul, Brazil, using mixed models. We conducted six trials of different cowpea genotypes in 2005 and 2006 in Aquidauana, Chapadão do Sul, Dourados, and Primavera do Leste. The experimental design was randomized complete blocks with four replications and 20 genotypes. Genetic parameters were estimated by restricted maximum likelihood/best linear unbiased prediction, and selection was based on the harmonic mean of the relative performance of genetic values method using three strategies: selection based on the predicted breeding values, considering the mean performance of the genotypes across all environments (no interaction effect); selection based on the performance in each environment (with an interaction effect); and simultaneous selection for grain yield, stability, and adaptability. The MNC99542F-5 and MNC99-537F-4 genotypes could be grown in various environments, as they exhibited high grain yield, adaptability, and stability. The average heritability of the genotypes was moderate to high and the selective accuracy was 82%, indicating an excellent potential for selection.

  4. Advanced Energy Retrofit Guide: Practical Ways to Improve Energy Performance, K-12 Schools (Book)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    The U.S. Department of Energy developed the K-12 Advanced Energy Retrofit Guide to provide specific methodologies, information, and guidance to help energy managers and other stakeholders plan and execute energy efficiency improvements. We emphasize actionable information, practical methodologies, diverse case studies, and unbiased evaluation of the most promising retrofit measure for each building type. K-12 schools were selected as one of the highest priority building sectors, because schools affect the lives of most Americans. They also represent approximately 8% of the energy use and 10% of the floor area in commercial buildings.

  5. Beyond the bulk: disclosing the life of single microbial cells

    PubMed Central

    Rosenthal, Katrin; Oehling, Verena

    2017-01-01

    Abstract Microbial single cell analysis has led to discoveries that are beyond what can be resolved with population-based studies. It provides a pristine view of the mechanisms that organize cellular physiology, unbiased by population heterogeneity or uncontrollable environmental impacts. A holistic description of cellular functions at the single cell level requires analytical concepts beyond the miniaturization of existing technologies, defined but uncontrolled by the biological system itself. This review provides an overview of the latest advances in single cell technologies and demonstrates their potential. Opportunities and limitations of single cell microbiology are discussed using selected application-related examples. PMID:29029257

  6. Nature vs. Nurture: The influence of OB star environments on proto-planetary disk evolution

    NASA Astrophysics Data System (ADS)

    Bouwman, Jeroen

    2006-09-01

    We propose a combined IRAC/IRS study of a large, well-defined and unbiased X-ray selected sample of pre-main-sequence stars in three OB associations: Pismis 24 in NGC 6357, NGC 2244 in the Rosette Nebula, and IC 1795 in the W3 complex. The samples are based on recent Chandra X-ray Observatory studies which reliably identify hundreds of cluster members and were carefully chosen to avoid high infrared nebular background. A new Chandra exposure of IC 1795 is requested, and an optical followup to characterise the host stars is planned.

  7. Population genetics of polymorphism and divergence for diploid selection models with arbitrary dominance.

    PubMed

    Williamson, Scott; Fledel-Alon, Adi; Bustamante, Carlos D

    2004-09-01

    We develop a Poisson random-field model of polymorphism and divergence that allows arbitrary dominance relations in a diploid context. This model provides a maximum-likelihood framework for estimating both selection and dominance parameters of new mutations using information on the frequency spectrum of sequence polymorphisms. This is the first DNA sequence-based estimator of the dominance parameter. Our model also leads to a likelihood-ratio test for distinguishing nongenic from genic selection; simulations indicate that this test is quite powerful when a large number of segregating sites are available. We also use simulations to explore the bias in selection parameter estimates caused by unacknowledged dominance relations. When inference is based on the frequency spectrum of polymorphisms, genic selection estimates of the selection parameter can be very strongly biased even for minor deviations from the genic selection model. Surprisingly, however, when inference is based on polymorphism and divergence (McDonald-Kreitman) data, genic selection estimates of the selection parameter are nearly unbiased, even for completely dominant or recessive mutations. Further, we find that weak overdominant selection can increase, rather than decrease, the substitution rate relative to levels of polymorphism. This nonintuitive result has major implications for the interpretation of several popular tests of neutrality.

  8. Mixed models for selection of Jatropha progenies with high adaptability and yield stability in Brazilian regions.

    PubMed

    Teodoro, P E; Bhering, L L; Costa, R D; Rocha, R B; Laviola, B G

    2016-08-19

    The aim of this study was to estimate genetic parameters via mixed models and simultaneously to select Jatropha progenies grown in three regions of Brazil that combine high adaptability and stability. From a previous phenotypic selection, three progeny tests were installed in 2008 in the municipalities of Planaltina-DF (Midwest), Nova Porteirinha-MG (Southeast), and Pelotas-RS (South). We evaluated 18 half-sib families in a randomized block design with three replications. Genetic parameters were estimated using restricted maximum likelihood/best linear unbiased prediction. Selection was based on the harmonic mean of the relative performance of genetic values method in three strategies considering: 1) performance across all environments (without interaction effect); 2) performance in each environment (with interaction effect); and 3) simultaneous selection for grain yield, stability, and adaptability. The accuracy obtained (91%) reveals excellent experimental quality and consequently safety and credibility in the selection of superior progenies for grain yield. The gain with the selection of the best five progenies was more than 20%, regardless of the selection strategy. Thus, based on the three selection strategies used in this study, the progenies 4, 11, and 3 (selected in all environments and the mean environment and by adaptability and phenotypic stability methods) are the most suitable for growing in the three regions evaluated.
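
    The harmonic mean of the relative performance of genotypic values (HMRPGV) criterion used in this and the preceding cowpea study can be sketched in a few lines: scale each environment by its mean, then take the per-genotype harmonic mean across environments, which rewards high and stable performance. The matrix below is a placeholder for BLUP-predicted genotypic values, not the trial data.

        import numpy as np

        rng = np.random.default_rng(2)
        n_gen, n_env = 18, 3
        gv = rng.normal(100.0, 10.0, size=(n_gen, n_env))  # predicted genotypic values

        # Relative performance per environment, then harmonic mean across
        # environments; unstable genotypes are penalized by the harmonic mean.
        rpgv = gv / gv.mean(axis=0)
        hmrpgv = n_env / (1.0 / rpgv).sum(axis=1)
        selected = np.argsort(hmrpgv)[::-1][:5]  # top five progenies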

  9. Mutually unbiased bases in six dimensions: The four most distant bases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Raynal, Philippe; Lue Xin; Englert, Berthold-Georg

    2011-06-15

    We consider the average distance between four bases in six dimensions. The distance between two orthonormal bases vanishes when the bases are the same, and the distance reaches its maximal value of unity when the bases are unbiased. We perform a numerical search for the maximum average distance and find it to be strictly smaller than unity. This is strong evidence that no four mutually unbiased bases exist in six dimensions. We also provide a two-parameter family of three bases which, together with the canonical basis, reach the numerically found maximum of the average distance, and we conduct a detailed study of the structure of the extremal set of bases.

  10. Extreme Mean and Its Applications

    NASA Technical Reports Server (NTRS)

    Swaroop, R.; Brownlow, J. D.

    1979-01-01

    Extreme value statistics obtained from normally distributed data are considered. An extreme mean is defined as the mean of a p-th probability truncated normal distribution. An unbiased estimate of this extreme mean and its large-sample distribution are derived. The distribution of this estimate, even for very large samples, is found to be nonnormal. Further, as the sample size increases, the variance of the unbiased estimate converges to the Cramer-Rao lower bound. The computer program used to obtain the density and distribution functions of the standardized unbiased estimate, and the confidence intervals of the extreme mean for any data, is included for ready application. An example is included to demonstrate the usefulness of extreme mean application.

  11. Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design

    ERIC Educational Resources Information Center

    Smith, William C.

    2014-01-01

    The ability of regression discontinuity (RD) designs to provide an unbiased treatment effect while overcoming the ethical concerns that plague randomized controlled trials (RCTs) makes them a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education…

  12. Comparison of normalization methods for the analysis of metagenomic gene abundance data.

    PubMed

    Pereira, Mariana Buongermino; Wallroth, Mikael; Jonsson, Viktor; Kristiansson, Erik

    2018-04-20

    In shotgun metagenomics, microbial communities are studied through direct sequencing of DNA without any prior cultivation. By comparing gene abundances estimated from the generated sequencing reads, functional differences between the communities can be identified. However, gene abundance data is affected by high levels of systematic variability, which can greatly reduce the statistical power and introduce false positives. Normalization, which is the process where systematic variability is identified and removed, is therefore a vital part of the data analysis. A wide range of normalization methods for high-dimensional count data has been proposed, but their performance on the analysis of shotgun metagenomic data has not been evaluated. Here, we present a systematic evaluation of nine normalization methods for gene abundance data. The methods were evaluated through resampling of three comprehensive datasets, creating a realistic setting that preserved the unique characteristics of metagenomic data. Performance was measured in terms of the methods' ability to identify differentially abundant genes (DAGs), correctly calculate unbiased p-values, and control the false discovery rate (FDR). Our results showed that the choice of normalization method has a large impact on the end results. When the DAGs were asymmetrically present between the experimental conditions, many normalization methods had a reduced true positive rate (TPR) and a high false positive rate (FPR). The methods trimmed mean of M-values (TMM) and relative log expression (RLE) had the overall highest performance and are therefore recommended for the analysis of gene abundance data. For larger sample sizes, CSS also showed satisfactory performance. This study emphasizes the importance of selecting a suitable normalization method in the analysis of data from shotgun metagenomics. Our results also demonstrate that improper methods may result in unacceptably high levels of false positives, which in turn may lead to incorrect or obfuscated biological interpretation.
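
    Of the recommended methods, RLE (the median-of-ratios scaling popularized by DESeq) is compact enough to sketch: each sample's size factor is the median ratio of its counts to a per-gene geometric-mean reference. The count matrix below is simulated, and this is an illustrative reimplementation, not the evaluated packages' code.

        import numpy as np

        def rle_size_factors(counts):
            # counts: genes x samples matrix of raw abundances.
            keep = np.all(counts > 0, axis=1)       # genes seen in every sample
            logs = np.log(counts[keep])
            ref = logs.mean(axis=1, keepdims=True)  # per-gene geometric mean (log scale)
            return np.exp(np.median(logs - ref, axis=0))

        rng = np.random.default_rng(1)
        counts = rng.poisson(50, size=(2000, 6)) * np.array([1, 1, 2, 2, 4, 4])
        normalized = counts / rle_size_factors(counts)  # depth-corrected abundances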

  13. Variation in cassava germplasm for tolerance to post-harvest physiological deterioration.

    PubMed

    Venturini, M T; Santos, L R; Vildoso, C I A; Santos, V S; Oliveira, E J

    2016-05-06

    Tolerant varieties can effectively control post-harvest physiological deterioration (PPD) of cassava, although knowledge on the genetic variability and inheritance of this trait is needed. The objective of this study was to estimate genetic parameters and identify sources of tolerance to PPD and their stability in cassava accessions. Roots from 418 cassava accessions, grown in four independent experiments, were evaluated for PPD tolerance 0, 2, 5, and 10 days post-harvest. Data were transformed into area under the PPD-progress curve (AUP-PPD) to quantify tolerance. Genetic parameters, stability (Si), adaptability (Ai), and the joint analysis of stability and adaptability (Zi) were obtained via residual maximum likelihood (REML) and best linear unbiased prediction (BLUP) methods. Variance in the genotype (G) x environment (E) interaction and genotypic variance were important for PPD tolerance. Individual broad-sense heritability (h²g = 0.38 ± 0.04) and average heritability in accessions (h²mg = 0.52) showed high genetic control of PPD tolerance. Genotypic correlation of AUP-PPD in different experiments was of medium magnitude (r̂gA = 0.42), indicating significant G x E interaction. The predicted genotypic values free of G x E interaction (û + ĝi) showed high variation. Of the 30 accessions with high Zi, 19 were common to the û + ĝi, Si, and Ai parameters. The genetic gain with selection of these 19 cassava accessions was -55.94, -466.86, -397.72, and -444.03% for û + ĝi, Si, Ai, and Zi, respectively, compared with the overall mean for each parameter. These results demonstrate the variability and potential of cassava germplasm to introduce PPD tolerance into commercial varieties.

  14. The Longterm Centimeter-band Total Flux and Linear Polarization Properties of the Pearson-Readhead Survey Sources

    NASA Astrophysics Data System (ADS)

    Aller, M. F.; Aller, H. D.; Hughes, P. A.

    2001-12-01

    Using centimeter-band total flux and linear polarization observations of the Pearson-Readhead sample sources systematically obtained with the UMRAO 26-m radio telescope during the past 16 years, we identify the range of variability properties and their temporal changes as functions of both optical and radio morphological classification. We find that our earlier statistical analysis, based on a time window of 6.4 years, did not delineate the full amplitude range of the total flux variability; further, several galaxies exhibit longterm, systematic changes or rather infrequent outbursts requiring long term observations for detection. Using radio classification as a delineator, we confirm, and find additional evidence, that significant changes in flux density can occur in steep spectrum and lobe-dominated objects as well as in compact, flat-spectrum objects. We find that statistically the time-averaged total flux density spectra steepen when longer time windows are included, which we attribute to a selection effect in the source sample. We have identified preferred orientations of the electric vector of the polarized emission (EVPA) in an unbiased manner in several sources, including several QSOs which have exhibited large variations in total flux while maintaining stable EVPAs, and compared these with orientations of the flow direction indicated by VLB morphology. We have looked for systematic, monotonic changes in EVPA which might be expected in the emission from a precessing jet, but none were identified. A Scargle periodogram analysis found no strong evidence for periodicity in any of the sample sources. We thank the NSF for grants AST-8815678, AST-9120224, AST-9421979, and AST-9900723 which provided partial support for this research. The operation of the 26-meter telescope is supported by the University of Michigan Department of Astronomy.

  15. Variability of pesticide detections and concentrations in field replicate water samples collected for the National Water-Quality Assessment Program, 1992-97

    USGS Publications Warehouse

    Martin, Jeffrey D.

    2002-01-01

    Correlation analysis indicates that for most pesticides and concentrations, pooled estimates of relative standard deviation rather than pooled estimates of standard deviation should be used to estimate variability, because pooled estimates of relative standard deviation are less affected by heteroscedasticity. The median pooled relative standard deviation was calculated for all pesticides to summarize the typical variability of pesticide data collected for the NAWQA Program. The median pooled relative standard deviation was 15 percent at concentrations less than 0.01 micrograms per liter (µg/L), 13 percent at concentrations near 0.01 µg/L, 12 percent at concentrations near 0.1 µg/L, 7.9 percent at concentrations near 1 µg/L, and 2.7 percent at concentrations greater than 5 µg/L. Pooled estimates of standard deviation or relative standard deviation presented in this report are larger than estimates based on averages, medians, smooths, or regression of the individual measurements of standard deviation or relative standard deviation from field replicates. Pooled estimates, however, are the preferred method for characterizing variability because they provide unbiased estimates of the variability of the population. Assessments of variability based on standard deviation (rather than variance) underestimate the true variability of the population. Because pooled estimates of variability are larger than estimates based on other approaches, users of estimates of variability must be cognizant of the approach used to obtain the estimate and must use caution in the comparison of estimates based on different approaches.
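
    Pooling itself is simple: within-set relative standard deviations are combined through their variances, weighted by degrees of freedom, before the square root is taken. A sketch for replicate pairs; the concentrations are made up, not NAWQA data.

        import numpy as np

        # Hypothetical field-replicate pairs of concentrations (ug/L).
        pairs = np.array([[0.012, 0.010],
                          [0.105, 0.093],
                          [0.980, 1.050],
                          [0.051, 0.047]])

        rsd = pairs.std(axis=1, ddof=1) / pairs.mean(axis=1)
        # Each pair contributes one degree of freedom, so pooling reduces
        # to the root mean square of the per-pair RSDs.
        pooled_rsd = np.sqrt(np.mean(rsd**2))
        print(f"pooled RSD = {100 * pooled_rsd:.1f}%")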

  16. MRMC analysis of agreement studies

    NASA Astrophysics Data System (ADS)

    Gallas, Brandon D.; Anam, Amrita; Chen, Weijie; Wunderlich, Adam; Zhang, Zhiwei

    2016-03-01

    The purpose of this work is to present and evaluate methods based on U-statistics to compare intra- or inter-reader agreement across different imaging modalities. We apply these methods to multi-reader multi-case (MRMC) studies. We measure reader-averaged agreement and estimate its variance accounting for the variability from readers and cases (an MRMC analysis). In our application, pathologists (readers) evaluate patient tissue mounted on glass slides (cases) in two ways. They evaluate the slides on a microscope (reference modality) and they evaluate digital scans of the slides on a computer display (new modality). In the current work, we consider concordance as the agreement measure, but many of the concepts outlined here apply to other agreement measures. Concordance is the probability that two readers rank two cases in the same order. Concordance can be estimated with a U-statistic and thus it has some nice properties: it is unbiased, asymptotically normal, and its variance is given by an explicit formula. Another property of a U-statistic is that it is symmetric in its inputs; it doesn't matter which reader is listed first or which case is listed first, the result is the same. Using this property and a few tricks while building the U-statistic kernel for concordance, we get a mathematically tractable problem and efficient software. Simulations show that our variance and covariance estimates are unbiased.
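
    The concordance kernel described above is simple to compute directly; the sketch below estimates it for two hypothetical readers by averaging over all case pairs (the tie handling here is our choice for illustration, not taken from the paper).

        import itertools
        import numpy as np

        def concordance(scores_a, scores_b):
            """U-statistic estimate of concordance between two readers.

            Averages, over all unordered case pairs, the indicator that both
            readers rank the pair in the same order (ties count as discordant).
            """
            a = np.asarray(scores_a, float)
            b = np.asarray(scores_b, float)
            agree = [(a[i] - a[j]) * (b[i] - b[j]) > 0
                     for i, j in itertools.combinations(range(a.size), 2)]
            return np.mean(agree)

        # Hypothetical ratings of six cases by two readers: 13/15 pairs agree.
        print(concordance([1, 2, 3, 4, 5, 6], [2, 1, 3, 4, 6, 5]))  # ~0.867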

  17. Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries

    USGS Publications Warehouse

    McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.

    2013-01-01

    Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
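
    A toy version of the kind of Monte Carlo comparison described above: simulate daily angler counts, then compare the bias and mean square error of expanded effort estimates under simple random sampling (SRS) and systematic sampling (SYS). All counts and sample sizes are hypothetical.

        import numpy as np

        rng = np.random.default_rng(1)

        # Hypothetical "true" angler counts for 48 half-hour periods in one day.
        effort = rng.poisson(lam=20, size=48).astype(float)
        n = 8  # periods sampled per day

        def srs_estimate():
            idx = rng.choice(effort.size, size=n, replace=False)
            return effort[idx].mean() * effort.size  # expand the mean to the day

        def sys_estimate():
            step = effort.size // n
            start = rng.integers(step)               # random start, fixed interval
            return effort[start::step].mean() * effort.size

        reps = 10_000
        srs = np.array([srs_estimate() for _ in range(reps)])
        sys_ = np.array([sys_estimate() for _ in range(reps)])
        truth = effort.sum()
        for name, est in [("SRS", srs), ("SYS", sys_)]:
            mse = np.mean((est - truth) ** 2)
            print(f"{name}: bias={est.mean() - truth:+.2f}  MSE={mse:.1f}")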

  18. Two-stage sequential sampling: A neighborhood-free adaptive sampling procedure

    USGS Publications Warehouse

    Salehi, M.; Smith, D.R.

    2005-01-01

    Designing an efficient sampling scheme for a rare and clustered population is a challenging area of research. Adaptive cluster sampling, which has been shown to be viable for such a population, is based on sampling a neighborhood of units around a unit that meets a specified condition. However, the edge units produced by sampling neighborhoods have proven to limit the efficiency and applicability of adaptive cluster sampling. We propose a sampling design that is adaptive in the sense that the final sample depends on observed values, but it avoids the use of neighborhoods and the sampling of edge units. Unbiased estimators of population total and its variance are derived using Murthy's estimator. The modified two-stage sampling design is easy to implement and can be applied to a wider range of populations than adaptive cluster sampling. We evaluate the proposed sampling design by simulating sampling of two real biological populations and an artificial population for which the variable of interest took the value either 0 or 1 (e.g., indicating presence and absence of a rare event). We show that the proposed sampling design is more efficient than conventional sampling in nearly all cases. The approach used to derive estimators (Murthy's estimator) opens the door for unbiased estimators to be found for similar sequential sampling designs. © 2005 American Statistical Association and the International Biometric Society.

  19. Quantum random bit generation using energy fluctuations in stimulated Raman scattering.

    PubMed

    Bustard, Philip J; England, Duncan G; Nunn, Josh; Moffatt, Doug; Spanner, Michael; Lausten, Rune; Sussman, Benjamin J

    2013-12-02

    Random number sequences are a critical resource in modern information processing systems, with applications in cryptography, numerical simulation, and data sampling. We introduce a quantum random number generator based on the measurement of pulse energy quantum fluctuations in Stokes light generated by spontaneously-initiated stimulated Raman scattering. Bright Stokes pulse energy fluctuations up to five times the mean energy are measured with fast photodiodes and converted to unbiased random binary strings. Since the pulse energy is a continuous variable, multiple bits can be extracted from a single measurement. Our approach can be generalized to a wide range of Raman active materials; here we demonstrate a prototype using the optical phonon line in bulk diamond.
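
    The conversion of continuous energy measurements into unbiased bits can be illustrated with generic post-processing; the sketch below digitizes hypothetical pulse energies and applies a von Neumann extractor, which is one standard debiasing step and not necessarily the authors' exact procedure (it strictly assumes independent, identically biased raw bits).

        import numpy as np

        rng = np.random.default_rng(2)

        # Hypothetical detector samples standing in for Stokes pulse energies.
        energies = rng.gamma(shape=2.0, scale=1.0, size=100_000)

        # Extract raw bits from an 8-bit digitization of each energy sample.
        codes = np.interp(energies, (energies.min(), energies.max()),
                          (0, 255)).astype(np.uint8)
        raw_bits = np.unpackbits(codes)

        # Von Neumann extractor: map bit pairs 01 -> 0, 10 -> 1, discard 00/11.
        pairs = raw_bits[: raw_bits.size // 2 * 2].reshape(-1, 2)
        keep = pairs[:, 0] != pairs[:, 1]
        unbiased = pairs[keep, 0]

        print(f"raw bit mean     : {raw_bits.mean():.4f}")
        print(f"debiased bit mean: {unbiased.mean():.4f} ({unbiased.size} bits kept)")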

  20. Fortune Favours the Bold: An Agent-Based Model Reveals Adaptive Advantages of Overconfidence in War

    PubMed Central

    Johnson, Dominic D. P.; Weidmann, Nils B.; Cederman, Lars-Erik

    2011-01-01

    Overconfidence has long been considered a cause of war. Like other decision-making biases, overconfidence seems detrimental because it increases the frequency and costs of fighting. However, evolutionary biologists have proposed that overconfidence may also confer adaptive advantages: increasing ambition, resolve, persistence, bluffing opponents, and winning net payoffs from risky opportunities despite occasional failures. We report the results of an agent-based model of inter-state conflict, which allows us to evaluate the performance of different strategies in competition with each other. Counter-intuitively, we find that overconfident states predominate in the population at the expense of unbiased or underconfident states. Overconfident states win because: (1) they are more likely to accumulate resources from frequent attempts at conquest; (2) they are more likely to gang up on weak states, forcing victims to split their defences; and (3) when the decision threshold for attacking requires an overwhelming asymmetry of power, unbiased and underconfident states shirk many conflicts they are actually likely to win. These “adaptive advantages” of overconfidence may, via selection effects, learning, or evolved psychology, have spread and become entrenched among modern states, organizations and decision-makers. This would help to explain the frequent association of overconfidence and war, even if it no longer brings benefits today. PMID:21731627
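
    A drastically simplified sketch of the overconfidence mechanism (not the authors' model): states perceive their capability inflated or deflated by a bias factor and attack when perceived superiority crosses a threshold; resources then flow to the truly stronger side. All parameters are invented for illustration.

        import numpy as np

        rng = np.random.default_rng(3)

        # Each state has a true capability c and a confidence bias k, and
        # perceives its own strength as k * c.
        n, rounds, threshold = 100, 2000, 1.2
        capability = rng.uniform(1, 10, n)
        bias = rng.choice([0.8, 1.0, 1.3], size=n)  # under-, un-, over-confident

        for _ in range(rounds):
            i, j = rng.choice(n, 2, replace=False)
            # i attacks when its perceived capability exceeds j's by the threshold.
            if bias[i] * capability[i] > threshold * capability[j]:
                winner, loser = (i, j) if capability[i] > capability[j] else (j, i)
                transfer = 0.1 * capability[loser]
                capability[winner] += transfer
                capability[loser] -= transfer

        for b in (0.8, 1.0, 1.3):
            print(f"bias {b}: mean capability {capability[bias == b].mean():.2f}")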

  1. Network topological analysis reveals the functional cohesiveness for the newly discovered links by Yeast 2 Hybrid approach

    NASA Astrophysics Data System (ADS)

    Ghiassian, Susan; Pevzner, Sam; Rolland, Thomas; Tassan, Murat; Barabasi, Albert Laszlo; Vidal, Mark; CCNR, Northeastern University Collaboration; Dana Farber Cancer Institute Collaboration

    2014-03-01

    Protein-protein interaction (PPI) maps and interactomes are the blueprint of Network Medicine and systems biology and are being experimentally mapped by different groups. Despite the wide usage of the Literature Curated Interactome (LCI), these sources are biased toward certain parameters, such as highly studied proteins. The yeast two-hybrid (Y2H) method is a high-throughput experimental setup that screens proteins in an unbiased fashion. Current knowledge of protein interactions is far from complete; the first Y2H data release (2005) is estimated to cover only 5% of all potential protein interactions, and this coverage has since increased to about 20% of what is known as the reference human interactome (HI). In this work we compare the topological properties of the Y2H protein-protein interaction network with those of LCI and show that, although the two agree on some properties, interaction selection in LCI is clearly biased. Most importantly, we assess the properties of the PPI network as it evolves with increasing coverage. We show that newly discovered interactions tend to connect proteins that were closer than average in the previous PPI release, reinforcing the modular structure of the PPI network. Furthermore, we show that some effects seen in the PPI network (as opposed to LCI) can be explained by its incompleteness.

  2. Revisiting AFLP fingerprinting for an unbiased assessment of genetic structure and differentiation of taurine and zebu cattle

    PubMed Central

    2014-01-01

    Background Descendants from the extinct aurochs (Bos primigenius), taurine (Bos taurus) and zebu cattle (Bos indicus) were domesticated 10,000 years ago in Southwestern and Southern Asia, respectively, and colonized the world undergoing complex events of admixture and selection. Molecular data, in particular genome-wide single nucleotide polymorphism (SNP) markers, can complement historic and archaeological records to elucidate these past events. However, SNP ascertainment in cattle has been optimized for taurine breeds, imposing limitations to the study of diversity in zebu cattle. As amplified fragment length polymorphism (AFLP) markers are discovered and genotyped as the samples are assayed, this type of marker is free of ascertainment bias. In order to obtain unbiased assessments of genetic differentiation and structure in taurine and zebu cattle, we analyzed a dataset of 135 AFLP markers in 1,593 samples from 13 zebu and 58 taurine breeds, representing nine continental areas. Results We found a geographical pattern of expected heterozygosity in European taurine breeds decreasing with the distance from the domestication centre, arguing against a large-scale introgression from European or African aurochs. Zebu cattle were found to be at least as diverse as taurine cattle. Western African zebu cattle were found to have diverged more from Indian zebu than South American zebu. Model-based clustering and ancestry informative markers analyses suggested that this is due to taurine introgression. Although a large part of South American zebu cattle also descend from taurine cows, we did not detect significant levels of taurine ancestry in these breeds, probably because of systematic backcrossing with zebu bulls. Furthermore, limited zebu introgression was found in Podolian taurine breeds in Italy. Conclusions The assessment of cattle diversity reported here contributes an unbiased global view to genetic differentiation and structure of taurine and zebu cattle populations, which is essential for an effective conservation of the bovine genetic resources. PMID:24739206

  3. Five instruments for measuring tree height: an evaluation

    Treesearch

    Michael S. Williams; William A. Bechtold; V.J. LaBau

    1994-01-01

    Five instruments were tested for reliability in measuring tree heights under realistic conditions. Four linear models were used to determine if tree height can be measured unbiasedly over all tree sizes and if any of the instruments were more efficient in estimating tree height. The laser height finder was the only instrument to produce unbiased estimates of the true...

  4. Spectroscopic observations of ATLAS17lcs (SN 2017guv) and ASASSN-17mq (AT 2017gvo) by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Dong, Subo; Bose, Subhash; Stritzinger, M.; Holmbo, S.; Fraser, M.; Fedorets, G.

    2017-10-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ATLAS17lcs (SN 2017guv) and ASASSN-17mq (AT 2017gvo) in host galaxies 2MASX J19132225-1648031 and CGCG 225-050, respectively.

  5. Spectroscopic observations of ASASSN-17io and ATLAS17hpt (SN 2017faf) by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Pastorello, Andrea; Benetti, Stefano; Cappellaro, Enrico; Terreran, Giacomo; Tomasella, Lina; Fedorets, Grigori; NUTS Collaboration

    2017-07-01

    The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17io in the galaxy CGCG 316-010, along with the reclassification of ATLAS17hpt (SN 2017faf), which was previously classified as a SLSN-I (ATel #10549).

  6. Contextual classification of multispectral image data: An unbiased estimator for the context distribution

    NASA Technical Reports Server (NTRS)

    Tilton, J. C.; Swain, P. H. (Principal Investigator); Vardeman, S. B.

    1981-01-01

    A key input to a statistical classification algorithm, which exploits the tendency of certain ground cover classes to occur more frequently in some spatial context than in others, is a statistical characterization of the context: the context distribution. An unbiased estimator of the context distribution is discussed which, besides having the advantage of statistical unbiasedness, has the additional advantage over other estimation techniques of being amenable to an adaptive implementation in which the context distribution estimate varies according to local contextual information. Results from applying the unbiased estimator to the contextual classification of three real LANDSAT data sets are presented and contrasted with results from non-contextual classifications and from contextual classifications utilizing other context distribution estimation techniques.

  7. Nearby stars of the Galactic disc and halo - IV

    NASA Astrophysics Data System (ADS)

    Fuhrmann, Klaus

    2008-02-01

    The Milky Way Galaxy has an age of about 13 billion years. Solar-type stars evolve all the long way to the realm of degenerate objects on essentially this time-scale. This, as well as the particular advantage that the Sun offers through reliable differential spectroscopic analyses, renders these stars the ideal tracers for the fossil record of our parent spiral. Astrophysics is a science known to be notoriously plagued by selection effects. The present work - with a major focus in this fourth contribution on model atmosphere analyses of spectroscopic binaries and multiple star systems - aims at a volume-complete sample of about 300 nearby F-, G-, and K-type stars that particularly avoids any kinematical or chemical pre-selection from the outset. It thereby provides an unbiased record of the local stellar populations - the ancient thick disc and the much younger thin disc. On this basis, the detailed individual scrutiny of the long-lived stars of both populations unveils the thick disc as a single-burst component with a local normalization of no less than 20 per cent. This enormous fraction, combined with its much larger scaleheight, implies a mass for the thick disc that is comparable to that of the thin disc. On account of its completely different mass-to-light ratio, the thick disc thereby becomes the dark side of the Milky Way, an ideal major source for baryonic dark matter. This massive, ancient population consequently challenges any gradual build-up scenario for our parent spiral. Even more, on the supposition that the Galaxy is not unusual, the thick disc - as it emerges from this unbiased spectroscopic work - particularly challenges the hierarchical cold-dark-matter-dominated formation picture for spiral galaxies in general.

  8. CONNECTING GRBs AND ULIRGs: A SENSITIVE, UNBIASED SURVEY FOR RADIO EMISSION FROM GAMMA-RAY BURST HOST GALAXIES AT 0 < z < 2.5

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perley, D. A.; Perley, R. A.; Hjorth, J.

    2015-03-10

    Luminous infrared galaxies and submillimeter galaxies contribute significantly to stellar mass assembly and provide an important test of the connection between the gamma-ray burst (GRB) rate and that of overall cosmic star formation. We present sensitive 3 GHz radio observations, using the Karl G. Jansky Very Large Array, of 32 uniformly selected GRB host galaxies spanning the redshift range 0 < z < 2.5, providing the first fully dust- and sample-unbiased measurement of the fraction of GRBs originating from the universe's most bolometrically luminous galaxies. Four galaxies are detected, with inferred radio star formation rates (SFRs) ranging between 50 and 300 M⊙ yr⁻¹. Three of the four detections correspond to events consistent with being optically obscured 'dark' bursts. Our overall detection fraction implies that between 9% and 23% of GRBs between 0.5 < z < 2.5 occur in galaxies with S_3GHz > 10 μJy, corresponding to SFR > 50 M⊙ yr⁻¹ at z ∼ 1 or > 250 M⊙ yr⁻¹ at z ∼ 2. Similar galaxies contribute approximately 10%-30% of all cosmic star formation, so our results are consistent with a GRB rate that is not strongly biased with respect to the total SFR of a galaxy. However, all four radio-detected hosts have stellar masses significantly lower than IR/submillimeter-selected field galaxies of similar luminosities. We suggest that the GRB rate may be suppressed in metal-rich environments but independently enhanced in intense starbursts, producing a strong efficiency dependence on mass but little net dependence on bulk galaxy SFR.

  9. NREL Evaluates Performance of Fast-Charge Electric Buses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-09-16

    This real-world performance evaluation is designed to enhance understanding of the overall usage and effectiveness of electric buses in transit operation and to provide unbiased technical information to other agencies interested in adding such vehicles to their fleets. Initial results indicate that the electric buses under study offer significant fuel and emissions savings. The final results will help Foothill Transit optimize the energy-saving potential of its transit fleet. NREL's performance evaluations help vehicle manufacturers fine-tune their designs and help fleet managers select fuel-efficient, low-emission vehicles that meet their bottom line and operational goals.

  10. Correction of Selection Bias in Survey Data: Is the Statistical Cure Worse Than the Bias?

    PubMed

    Hanley, James A

    2017-04-01

    In previous articles in the American Journal of Epidemiology (Am J Epidemiol. 2013;177(5):431-442) and American Journal of Public Health (Am J Public Health. 2013;103(10):1895-1901), Masters et al. reported age-specific hazard ratios for the contrasts in mortality rates between obesity categories. They corrected the observed hazard ratios for selection bias caused by what they postulated was the nonrepresentativeness of the participants in the National Health Interview Survey that increased with age, obesity, and ill health. However, it is possible that their regression approach to remove the alleged bias has not produced, and in general cannot produce, sensible hazard ratio estimates. First, we must consider how many nonparticipants there might have been in each category of obesity and of age at entry and how much higher the mortality rates would have to be in nonparticipants than in participants in these same categories. What plausible set of numerical values would convert the ("biased") decreasing-with-age hazard ratios seen in the data into the ("unbiased") increasing-with-age ratios that they computed? Can these values be encapsulated in (and can sensible values be recovered from) one additional internal variable in a regression model? Second, one must examine the age pattern of the hazard ratios that have been adjusted for selection. Without the correction, the hazard ratios are attenuated with increasing age. With it, the hazard ratios at older ages are considerably higher, but those at younger ages are well below one. Third, one must test whether the regression approach suggested by Masters et al. would correct the nonrepresentativeness that increased with age and ill health that I introduced into real and hypothetical data sets. I found that the approach did not recover the hazard ratio patterns present in the unselected data sets: the corrections overshot the target at older ages and undershot it at lower ages.

  11. Selective Proteasomal Degradation of the B′β Subunit of Protein Phosphatase 2A by the E3 Ubiquitin Ligase Adaptor Kelch-like 15*

    PubMed Central

    Oberg, Elizabeth A.; Nifoussi, Shanna K.; Gingras, Anne-Claude; Strack, Stefan

    2012-01-01

    Protein phosphatase 2A (PP2A), a ubiquitous and pleiotropic regulator of intracellular signaling, is composed of a core dimer (AC) bound to a variable (B) regulatory subunit. PP2A is an enzyme family of dozens of heterotrimers with different subcellular locations and cellular substrates dictated by the B subunit. B′β is a brain-specific PP2A regulatory subunit that mediates dephosphorylation of Ca2+/calmodulin-dependent protein kinase II and tyrosine hydroxylase. Unbiased proteomic screens for B′β interactors identified Cullin3 (Cul3), a scaffolding component of E3 ubiquitin ligase complexes, and the previously uncharacterized Kelch-like 15 (KLHL15). KLHL15 is one of ∼40 Kelch-like proteins, many of which have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Here, we report that KLHL15-Cul3 specifically targets B′β to promote turnover of the PP2A subunit by ubiquitylation and proteasomal degradation. Comparison of KLHL15 and B′β tissue expression profiles suggests that the E3 ligase adaptor contributes to selective expression of the PP2A/B′β holoenzyme in the brain. We mapped KLHL15 residues critical for homodimerization as well as interaction with Cul3 and B′β. Explaining PP2A subunit selectivity, the divergent N terminus of B′β was found necessary and sufficient for KLHL15-mediated degradation, with Tyr-52 having an obligatory role. Although KLHL15 can interact with the PP2A/B′β heterotrimer, it only degrades B′β, thus promoting exchange with other regulatory subunits. E3 ligase adaptor-mediated control of PP2A holoenzyme composition thereby adds another layer of regulation to cellular dephosphorylation events. PMID:23135275

  12. Estimating tree bole volume using artificial neural network models for four species in Turkey.

    PubMed

    Ozçelik, Ramazan; Diamantopoulou, Maria J; Brooks, John R; Wiant, Harry V

    2010-01-01

    Tree bole volumes of 89 Scots pine (Pinus sylvestris L.), 96 Brutian pine (Pinus brutia Ten.), 107 Cilicica fir (Abies cilicica Carr.) and 67 Cedar of Lebanon (Cedrus libani A. Rich.) trees were estimated using Artificial Neural Network (ANN) models. Neural networks offer a number of advantages, including the ability to implicitly detect complex nonlinear relationships between input and output variables, which is very helpful in tree volume modeling. Two different neural network architectures were used, producing back-propagation (BPANN) and cascade-correlation (CCANN) artificial neural network models. In addition, tree bole volume estimates were compared to other established tree bole volume estimation techniques, including the centroid method, taper equations, and existing standard volume tables. An overview of the features of ANNs and traditional methods is presented, and the advantages and limitations of each are discussed. For validation purposes, actual volumes were determined by aggregating the volumes of measured short sections (average 1 meter) of the tree bole using Smalian's formula. The results reported in this research suggest that the selected cascade-correlation artificial neural network (CCANN) models are reliable for estimating the tree bole volume of the four examined tree species, since they gave unbiased results and were superior to almost all methods in terms of error (%), expressed as the mean of the percentage errors. © 2009 Elsevier Ltd. All rights reserved.
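
    For readers who want a concrete starting point, the sketch below fits a small feed-forward (back-propagation-style) network to hypothetical diameter-height-volume data; scikit-learn has no cascade-correlation implementation, so only the BPANN flavor is illustrated, and the data come from a toy allometric formula rather than the study.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(4)

        # Hypothetical training data: diameter at breast height [cm], total
        # height [m], and bole volume [m^3] from a toy taper relationship.
        dbh = rng.uniform(10, 60, 300)
        height = rng.uniform(8, 30, 300)
        volume = 6e-5 * dbh**1.9 * height * np.exp(rng.normal(0, 0.05, 300))

        X = np.column_stack([dbh, height])
        model = make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0),
        )
        model.fit(X, volume)

        pred = model.predict(X)
        mpe = 100 * np.mean((pred - volume) / volume)  # mean of percentage errors
        print(f"mean percentage error: {mpe:+.2f} %")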

  13. Risk appreciation for living kidney donors: another new subspecialty?

    PubMed

    Steiner, Robert W

    2004-05-01

    Quantitative estimates of the risk of end stage renal disease (ESRD) for living donors would seem essential to defensible donor selection practices, as the 'safe/unsafe' model for donor selection is not viable. All kidney donors take risk, and four fundamental, qualitative criteria should instead be used to decide when donor rejection is justified. These criteria are lack of donor education about transplantation, donor irrationality, lack of free and voluntary donation, and/or that donor acceptance would unavoidably threaten the public trust or the integrity of the center's selection procedures. Such a data-based selection policy, with explicit documentation of unbiased and comprehensive donor education, will help neutralize the center's self interest in a more defensible way than by rejecting 'complicated' kidney donors out of hand, and in a more practical way than by the creation of center-independent donor counselors or waiting for donor registries to come to fruition. Living kidney donors with isolated medical abnormalities comprise a sizable subset of at risk donors for whom center acceptance practices vary markedly. This population provides a paradigm opportunity for quantitative risk estimation and counseling.

  14. How memory mechanisms are a key component in the guidance of our eye movements: evidence from the global effect.

    PubMed

    Silvis, J D; Van der Stigchel, S

    2014-04-01

    Investigating eye movements has been a promising approach to uncover the role of visual working memory in early attentional processes. Prior research has already demonstrated that eye movements in search tasks are more easily drawn toward stimuli that show similarities to working memory content, as compared with neutral stimuli. Previous saccade tasks, however, have always required a selection process, thereby automatically recruiting working memory. The present study was an attempt to confirm the role of working memory in oculomotor selection in an unbiased saccade task that rendered memory mechanisms irrelevant. Participants executed a saccade in a display with two elements, without any instruction to aim for one particular element. The results show that when two objects appear simultaneously, a working memory match attracts the first saccade more profoundly than do mismatch objects, an effect that was present throughout the saccade latency distribution. These findings demonstrate that memory plays a fundamental biasing role in the earliest competitive processes in the selection of visual objects, even when working memory is not recruited during selection.

  15. Critical point relascope sampling for unbiased volume estimation of downed coarse woody debris

    Treesearch

    Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey; Mark J. Ducey

    2005-01-01

    Critical point relascope sampling is developed and shown to be design-unbiased for the estimation of log volume when used with point relascope sampling for downed coarse woody debris. The method is closely related to critical height sampling for standing trees when trees are first sampled with a wedge prism. Three alternative protocols for determining the critical...

  16. Unbiased nonorthogonal bases for tomographic reconstruction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sainz, Isabel; Klimov, Andrei B.; Roa, Luis

    2010-05-15

    We have developed a general method for constructing a set of nonorthogonal bases with equal separations between all different basis states in prime dimensions. We show that the corresponding biorthogonal counterparts are pairwise unbiased with respect to the components of the original bases. Using these bases, we derive an explicit expression for optimal tomography in nonorthogonal bases. A special two-dimensional case is analyzed separately.

  17. The dependability of medical students' performance ratings as documented on in-training evaluations.

    PubMed

    van Barneveld, Christina

    2005-03-01

    To demonstrate an approach to obtain an unbiased estimate of the dependability of students' performance ratings during training, when the data-collection design includes nesting of student in rater, unbalanced nest sizes, and dependent observations. In 2003, two variance components analyses of in-training evaluation (ITE) report data were conducted using urGENOVA software. In the first analysis, the dependability for the nested and unbalanced data-collection design was calculated. In the second analysis, an approach using multiple generalizability studies was used to obtain an unbiased estimate of the student variance component, resulting in an unbiased estimate of dependability. Results suggested that there is bias in estimates of the dependability of students' performance on ITEs that are attributable to the data-collection design. When the bias was corrected, the results indicated that the dependability of ratings of student performance was almost zero. The combination of the multiple generalizability studies method and the use of specialized software provides an unbiased estimate of the dependability of ratings of student performance on ITE scores for data-collection designs that include nesting of student in rater, unbalanced nest sizes, and dependent observations.

  18. Variationally Optimized Free-Energy Flooding for Rate Calculation.

    PubMed

    McCarty, James; Valsson, Omar; Tiwary, Pratyush; Parrinello, Michele

    2015-08-14

    We propose a new method to obtain kinetic properties of infrequent events from molecular dynamics simulation. The procedure employs a recently introduced variational approach [Valsson and Parrinello, Phys. Rev. Lett. 113, 090601 (2014)] to construct a bias potential as a function of several collective variables that is designed to flood the associated free energy surface up to a predefined level. The resulting bias potential effectively accelerates transitions between metastable free energy minima while ensuring bias-free transition states, thus allowing accurate kinetic rates to be obtained. We test the method on a few illustrative systems for which we obtain an order of magnitude improvement in efficiency relative to previous approaches and several orders of magnitude relative to unbiased molecular dynamics. We expect an even larger improvement in more complex systems. This and the ability of the variational approach to deal efficiently with a large number of collective variables will greatly enhance the scope of these calculations. This work is a vindication of the potential that the variational principle has if applied in innovative ways.

  19. Simultaneous Estimation of Model State Variables and Observation and Forecast Biases Using a Two-Stage Hybrid Kalman Filter

    NASA Technical Reports Server (NTRS)

    Pauwels, V. R. N.; DeLannoy, G. J. M.; Hendricks Franssen, H.-J.; Vereecken, H.

    2013-01-01

    In this paper, we present a two-stage hybrid Kalman filter to estimate both observation and forecast bias in hydrologic models, in addition to state variables. The biases are estimated using the discrete Kalman filter, and the state variables using the ensemble Kalman filter. A key issue in this multi-component assimilation scheme is the exact partitioning of the difference between observation and forecasts into state, forecast bias and observation bias updates. Here, the error covariances of the forecast bias and the unbiased states are calculated as constant fractions of the biased state error covariance, and the observation bias error covariance is a function of the observation prediction error covariance. In a series of synthetic experiments, focusing on the assimilation of discharge into a rainfall-runoff model, it is shown that both static and dynamic observation and forecast biases can be successfully estimated. The results indicate a strong improvement in the estimation of the state variables and resulting discharge as opposed to the use of a bias-unaware ensemble Kalman filter. Furthermore, minimal code modification in existing data assimilation software is needed to implement the method. The results suggest that a better performance of data assimilation methods should be possible if both forecast and observation biases are taken into account.
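
    A heavily simplified illustration of bias-aware filtering (an augmented-state Kalman filter, not the paper's two-stage hybrid scheme): a scalar AR(1) state is observed through a constant unknown bias, and the filter recovers both. The AR(1) dynamic matters, since with it the state and the constant bias are jointly observable; all numbers are invented.

        import numpy as np

        rng = np.random.default_rng(5)

        # Truth: AR(1) state x observed as y = x + b + noise, b an unknown constant.
        T, a, q, r, b_true = 400, 0.9, 0.05, 0.5, 2.0
        x_true = np.zeros(T)
        for t in range(1, T):
            x_true[t] = a * x_true[t - 1] + rng.normal(0, np.sqrt(q))
        y = x_true + b_true + rng.normal(0, np.sqrt(r), T)

        # Augmented state z = [x, b]; the bias follows a constant dynamic.
        F = np.array([[a, 0.0], [0.0, 1.0]])
        H = np.array([[1.0, 1.0]])
        Q = np.diag([q, 0.0])
        z = np.zeros(2)
        P = np.eye(2) * 10.0

        for t in range(T):
            z = F @ z                         # predict
            P = F @ P @ F.T + Q
            S = (H @ P @ H.T).item() + r      # innovation variance
            K = (P @ H.T) / S                 # Kalman gain, shape (2, 1)
            z = z + K[:, 0] * (y[t] - (H @ z).item())
            P = (np.eye(2) - K @ H) @ P

        print(f"estimated bias: {z[1]:+.2f} (true {b_true:+.2f})")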

  20. Independent contrasts and PGLS regression estimators are equivalent.

    PubMed

    Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

    2012-05-01

    We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
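
    The equivalence has a simple computational face: GLS with covariance V equals OLS on data whitened by the Cholesky factor of V, and independent contrasts are one such whitening. The sketch below checks this numerically with an arbitrary Brownian-motion-style covariance; a real comparative analysis would derive V from the phylogeny.

        import numpy as np

        rng = np.random.default_rng(6)

        n = 20
        t = np.linspace(1.0, 5.0, n)
        V = np.minimum.outer(t, t)      # Brownian-style covariance: min(t_i, t_j)
        L = np.linalg.cholesky(V)

        x = rng.normal(size=n)
        beta_true = 0.7
        y = beta_true * x + L @ rng.normal(size=n)   # errors with covariance V

        X = np.column_stack([np.ones(n), x])
        Vinv = np.linalg.inv(V)
        beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

        # Whitening: OLS on L^{-1}-transformed data reproduces the GLS slope.
        Xw, yw = np.linalg.solve(L, X), np.linalg.solve(L, y)
        beta_w, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

        print(f"GLS slope   : {beta_gls[1]:+.4f}")
        print(f"whitened OLS: {beta_w[1]:+.4f}")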

  1. The Taguchi methodology as a statistical tool for biotechnological applications: a critical appraisal.

    PubMed

    Rao, Ravella Sreenivas; Kumar, C Ganesh; Prakasham, R Shetty; Hobbs, Phil J

    2008-04-01

    Success in experiments and/or technology mainly depends on a properly designed process or product. The traditional method of process optimization involves the study of one variable at a time, which requires a number of combinations of experiments that are time, cost and labor intensive. The Taguchi method of design of experiments is a simple statistical tool involving a system of tabulated designs (arrays) that allows a maximum number of main effects to be estimated in an unbiased (orthogonal) fashion with a minimum number of experimental runs. It has been applied to predict the significant contribution of the design variable(s) and the optimum combination of each variable by conducting experiments on a real-time basis. The modeling that is performed essentially relates signal-to-noise ratio to the control variables in a 'main effect only' approach. This approach enables both multiple response and dynamic problems to be studied by handling noise factors. Taguchi principles and concepts have made extensive contributions to industry by bringing focused awareness to robustness, noise and quality. This methodology has been widely applied in many industrial sectors; however, its application in biological sciences has been limited. In the present review, the application and comparison of the Taguchi methodology has been emphasized with specific case studies in the field of biotechnology, particularly in diverse areas like fermentation, food processing, molecular biology, wastewater treatment and bioremediation.

  2. Ligand-directed profiling of organelles with internalizing phage libraries

    PubMed Central

    Dobroff, Andrey S.; Rangel, Roberto; Guzman-Roja, Liliana; Salmeron, Carolina C.; Gelovani, Juri G.; Sidman, Richard L.; Bologa, Cristian G.; Oprea, Tudor I.; Brinker, C. Jeffrey; Pasqualini, Renata; Arap, Wadih

    2015-01-01

    Phage display is a resourceful tool to discover and characterize, in an unbiased manner, functional protein-protein interactions, to create vaccines, and to engineer peptides, antibodies, and other proteins as targeted diagnostic and/or therapeutic agents. Recently, our group has developed a new class of internalizing phage (iPhage) for ligand-directed targeting of organelles and/or to identify molecular pathways within live cells. This unique technology is suitable for applications ranging from fundamental cell biology to drug development. Here we describe the method for generating and screening the iPhage display system, and explain how to select and validate candidate internalizing homing peptides. PMID:25640897

  3. Simultaneous grouping pursuit and feature selection over an undirected graph*

    PubMed Central

    Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei

    2013-01-01

    Summary: In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients can be grouped, whose absolute values are the same or close. This is motivated from gene network analysis, where genes tend to work in groups according to their biological functionalities. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into analysis through an undirected graph. PMID:24098061

  4. Genomic selection in sugar beet breeding populations.

    PubMed

    Würschum, Tobias; Reif, Jochen C; Kraft, Thomas; Janssen, Geert; Zhao, Yusheng

    2013-09-18

    Genomic selection exploits dense genome-wide marker data to predict breeding values. In this study we used a large sugar beet population of 924 lines representing different germplasm types present in breeding populations: unselected segregating families and diverse lines from more advanced stages of selection. All lines have been intensively phenotyped in multi-location field trials for six agronomically important traits and genotyped with 677 SNP markers. We used ridge regression best linear unbiased prediction in combination with fivefold cross-validation and obtained high prediction accuracies for all except one trait. In addition, we investigated whether a calibration developed based on a training population composed of diverse lines is suited to predict the phenotypic performance within families. Our results show that the prediction accuracy is lower than that obtained within the diverse set of lines, but comparable to that obtained by cross-validation within the respective families. The results presented in this study suggest that a training population derived from intensively phenotyped and genotyped diverse lines from a breeding program does hold potential to build up robust calibration models for genomic selection. Taken together, our results indicate that genomic selection is a valuable tool and can thus complement the genomics toolbox in sugar beet breeding.
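
    As a rough sketch of the workflow (ridge regression standing in for RR-BLUP, which it matches up to how the shrinkage parameter is set), the code below simulates a marker matrix and additive phenotypes, then scores fivefold cross-validated predictions. The dimensions echo the study's 677 SNPs, but the data and effect sizes are simulated.

        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(7)

        # Hypothetical marker matrix (lines x SNPs, coded 0/1/2) and phenotypes
        # from a sparse additive genetic model.
        n_lines, n_snps = 300, 677
        X = rng.integers(0, 3, size=(n_lines, n_snps)).astype(float)
        effects = np.zeros(n_snps)
        effects[rng.choice(n_snps, 25, replace=False)] = rng.normal(0, 0.5, 25)
        y = X @ effects + rng.normal(0, 1.0, n_lines)

        # Ridge regression with fivefold cross-validation as the accuracy check.
        model = Ridge(alpha=100.0)
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(f"mean cross-validated R^2: {scores.mean():.2f}")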

  5. Treatment Effect Estimation Using Nonlinear Two-Stage Instrumental Variable Estimators: Another Cautionary Note.

    PubMed

    Chapman, Cole G; Brooks, John M

    2016-12-01

    To examine the settings of simulation evidence supporting use of nonlinear two-stage residual inclusion (2SRI) instrumental variable (IV) methods for estimating average treatment effects (ATE) using observational data and investigate potential bias of 2SRI across alternative scenarios of essential heterogeneity and uniqueness of marginal patients. Potential bias of linear and nonlinear IV methods for ATE and local average treatment effects (LATE) is assessed using simulation models with a binary outcome and binary endogenous treatment across settings varying by the relationship between treatment effectiveness and treatment choice. Results show that nonlinear 2SRI models produce estimates of ATE and LATE that are substantially biased when the relationships between treatment and outcome for marginal patients are unique from relationships for the full population. Bias of linear IV estimates for LATE was low across all scenarios. Researchers are increasingly opting for nonlinear 2SRI to estimate treatment effects in models with binary and otherwise inherently nonlinear dependent variables, believing that it produces generally unbiased and consistent estimates. This research shows that positive properties of nonlinear 2SRI rely on assumptions about the relationships between treatment effect heterogeneity and choice. © Health Research and Educational Trust.

  6. Quantum Inference on Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Yoder, Theodore; Low, Guang Hao; Chuang, Isaac

    2014-03-01

    Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speed up sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time O(nm P(e)^-1), depending critically on P(e), the probability that the evidence might occur in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking O(n 2^m P(e)^-1/2) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
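
    The classical baseline being accelerated is plain rejection sampling, whose cost per accepted sample scales as 1/P(e); the quantum algorithm improves that dependence to 1/sqrt(P(e)). The sketch below runs the classical version on a two-node network with made-up conditional probability tables.

        import numpy as np

        rng = np.random.default_rng(8)

        # Two-node Bayesian network A -> B with hypothetical CPTs.
        p_a = 0.3                        # P(A=1)
        p_b_given_a = {0: 0.05, 1: 0.6}  # P(B=1 | A)

        def sample_joint():
            a = rng.random() < p_a
            b = rng.random() < p_b_given_a[int(a)]
            return int(a), int(b)

        # Approximate P(A=1 | B=1) by rejecting samples where the evidence fails.
        accepted, tries, target = [], 0, 2000
        while len(accepted) < target:
            a, b = sample_joint()
            tries += 1
            if b == 1:                   # evidence: B = 1
                accepted.append(a)

        p_e = target / tries
        print(f"P(evidence) ~ {p_e:.3f}; tries per accepted draw ~ {tries/target:.1f}")
        print(f"P(A=1 | B=1) ~ {np.mean(accepted):.3f}")  # exact value ~0.837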

  7. An annual quasidifference approach to water price elasticity

    NASA Astrophysics Data System (ADS)

    Bell, David R.; Griffin, Ronald C.

    2008-08-01

    The preferred price specification for retail water demand estimation has not been fully settled by prior literature. Empirical consistency of price indices is necessary to enable testing of competing specifications. Available methods of unbiasing the price index are summarized here. Using original rate information from several hundred Texas utilities, new indices of marginal and average price change are constructed. Marginal water price change is shown to explain consumption variation better than average water price change, based on standard information criteria. Annual change in quantity consumed per month is estimated with differences in climate variables and the new quasidifference marginal price index. As expected, the annual price elasticity of demand is found to vary with daily high and low temperatures and the frequency of precipitation.

  8. Testing for biases in selection on avian reproductive traits and partitioning direct and indirect selection using quantitative genetic models.

    PubMed

    Reed, Thomas E; Gienapp, Phillip; Visser, Marcel E

    2016-10-01

    Key life history traits such as breeding time and clutch size are frequently both heritable and under directional selection, yet many studies fail to document microevolutionary responses. One general explanation is that selection estimates are biased by the omission of correlated traits that have causal effects on fitness, but few valid tests of this exist. Here, we show, using a quantitative genetic framework and six decades of life-history data on two free-living populations of great tits Parus major, that selection estimates for egg-laying date and clutch size are relatively unbiased. Predicted responses to selection based on the Robertson-Price Identity were similar to those based on the multivariate breeder's equation (MVBE), indicating that unmeasured covarying traits were not missing from the analysis. Changing patterns of phenotypic selection on these traits (for laying date, linked to climate change) therefore reflect changing selection on breeding values, and genetic constraints appear not to limit their independent evolution. Quantitative genetic analysis of correlational data from pedigreed populations can be a valuable complement to experimental approaches to help identify whether apparent associations between traits and fitness are biased by missing traits, and to parse the roles of direct versus indirect selection across a range of environments. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.

  9. Generation and evaluation of an ultra-high-field atlas with applications in DBS planning

    NASA Astrophysics Data System (ADS)

    Wang, Brian T.; Poirier, Stefan; Guo, Ting; Parrent, Andrew G.; Peters, Terry M.; Khan, Ali R.

    2016-03-01

    Purpose: Deep brain stimulation (DBS) is a common treatment for Parkinson's disease (PD) and involves the use of brain atlases or intrinsic landmarks to estimate the location of target deep brain structures, such as the subthalamic nucleus (STN) and the globus pallidus pars interna (GPi). However, these structures can be difficult to localize with conventional clinical magnetic resonance imaging (MRI), and thus targeting can be prone to error. Ultra-high-field imaging at 7T has the ability to clearly resolve these structures and thus atlases built with these data have the potential to improve targeting accuracy. Methods: T1- and T2-weighted images of 12 healthy control subjects were acquired using a 7T MR scanner. These images were then used with groupwise registration to generate an unbiased average template with T1w and T2w contrast. Deep brain structures were manually labelled in each subject by two raters and rater reliability was assessed. We compared the use of this unbiased atlas with two other methods of atlas-based segmentation (single-template and multi-template) for subthalamic nucleus (STN) segmentation on 7T MRI data. We also applied this atlas to clinical DBS data acquired at 1.5T to evaluate its efficacy for DBS target localization as compared to using a standard atlas. Results: The unbiased templates provide superb detail of subcortical structures. Through one-way ANOVA tests, the unbiased template is significantly (p < 0.05) more accurate than a single template in atlas-based segmentation and DBS target localization tasks. Conclusion: The generated unbiased averaged templates provide better visualization of deep brain nuclei and an increase in accuracy over single-template and lower-field-strength atlases.

  10. Extending unbiased stereology of brain ultrastructure to three-dimensional volumes

    NASA Technical Reports Server (NTRS)

    Fiala, J. C.; Harris, K. M.; Koslow, S. H. (Principal Investigator)

    2001-01-01

    OBJECTIVE: Analysis of brain ultrastructure is needed to reveal how neurons communicate with one another via synapses and how disease processes alter this communication. In the past, such analyses have usually been based on single or paired sections obtained by electron microscopy. Reconstruction from multiple serial sections provides a much needed, richer representation of the three-dimensional organization of the brain. This paper introduces a new reconstruction system and new methods for analyzing in three dimensions the location and ultrastructure of neuronal components, such as synapses, which are distributed non-randomly throughout the brain. DESIGN AND MEASUREMENTS: Volumes are reconstructed by defining transformations that align the entire area of adjacent sections. Whole-field alignment requires rotation, translation, skew, scaling, and second-order nonlinear deformations. Such transformations are implemented by a linear combination of bivariate polynomials. Computer software for generating transformations based on user input is described. Stereological techniques for assessing structural distributions in reconstructed volumes are the unbiased bricking, disector, unbiased ratio, and per-length counting techniques. A new general method, the fractional counter, is also described. This unbiased technique relies on the counting of fractions of objects contained in a test volume. A volume of brain tissue from stratum radiatum of hippocampal area CA1 is reconstructed and analyzed for synaptic density to demonstrate and compare the techniques. RESULTS AND CONCLUSIONS: Reconstruction makes practicable volume-oriented analysis of ultrastructure using such techniques as the unbiased bricking and fractional counter methods. These analysis methods are less sensitive to the section-to-section variations in counts and section thickness, factors that contribute to the inaccuracy of other stereological methods. In addition, volume reconstruction facilitates visualization and modeling of structures and analysis of three-dimensional relationships such as synaptic connectivity.

  11. An Unbiased Method To Build Benchmarking Sets for Ligand-Based Virtual Screening and its Application To GPCRs

    PubMed Central

    2015-01-01

    Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the “artificial enrichment” and “analogue bias” of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD. PMID:24749745

  12. An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs.

    PubMed

    Xia, Jie; Jin, Hongwei; Liu, Zhenming; Zhang, Liangren; Wang, Xiang Simon

    2014-05-27

    Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the "artificial enrichment" and "analogue bias" of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD.

  13. Naturalization fosters the long-term political integration of immigrants

    PubMed Central

    Hainmueller, Jens; Hangartner, Dominik; Pietrantuono, Giuseppe

    2015-01-01

    Does naturalization cause better political integration of immigrants into the host society? Despite heated debates about citizenship policy, there exists almost no evidence that isolates the independent effect of naturalization from the nonrandom selection into naturalization. We provide new evidence from a natural experiment in Switzerland, where some municipalities used referendums as the mechanism to decide naturalization requests. Balance checks suggest that for close naturalization referendums, which are decided by just a few votes, the naturalization decision is as good as random, so that narrowly rejected and narrowly approved immigrant applicants are similar on all confounding characteristics. This allows us to remove selection effects and obtain unbiased estimates of the long-term impacts of citizenship. Our study shows that for the immigrants who faced close referendums, naturalization considerably improved their political integration, including increases in formal political participation, political knowledge, and political efficacy. PMID:26417099

  14. Considerations for conducting epidemiologic case-control studies of cancer in developing countries.

    PubMed

    Brinton, L A; Herrero, R; Brenes, M; Montalván, P; de la Guardia, M E; Avila, A; Domínguez, I L; Basurto, E; Reeves, W C

    1991-01-01

    The challenges involved in conducting epidemiologic studies of cancer in developing countries can be and often are unique. This article reports on our experience in performing a case-control study of invasive cervical cancer in four Latin American countries (Colombia, Costa Rica, Mexico, and Panama), the summary medical results of which have been published in a previous issue of this journal (1). The study involved a number of principal activities--mainly selecting, conducting interviews with, and obtaining appropriate biologic specimens from 759 cervical cancer patients, 1,467 matched female controls, and 689 male sex partners of monogamous female subjects. This presentation provides an overview of the planning and methods used to select the subjects, conduct the survey work, and obtain complete and effectively unbiased data. It also points out some of the important advantages and disadvantages of working in developing areas similar to those serving as locales for this study.

  15. Current and efficiency of Brownian particles under oscillating forces in entropic barriers

    NASA Astrophysics Data System (ADS)

    Nutku, Ferhat; Aydıner, Ekrem

    2015-04-01

    In this study, considering a temporally unbiased force and different forms of oscillating forces, we investigate the current and efficiency of Brownian particles in an entropic tube structure and present the numerically obtained results. We show that different force forms give rise to different current and efficiency profiles in different optimized parameter intervals. We find that an unbiased oscillating force and an unbiased temporal force lead to currents and efficiencies that depend on these parameters. We also observe that the current and efficiency caused by temporal and different oscillating forces have maximum and minimum values in different parameter intervals. We conclude that the current or efficiency can be controlled dynamically by adjusting the parameters of the entropic barriers and the applied force. Project supported by the Funds from Istanbul University (Grant No. 45662).
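
    The published results concern a 2D entropic channel; as an illustration only, the sketch below runs overdamped Langevin dynamics in a 1D asymmetric (ratchet) periodic potential under a zero-mean square-wave drive, the simplest setting in which an unbiased oscillating force yields a net current. All parameters are hypothetical stand-ins, not values from the paper.

```python
import numpy as np

def mean_current(F0=1.5, omega=2.0, D=0.3, dt=1e-3, steps=200_000, seed=1):
    # Overdamped Langevin dynamics: x' = f(x) + F_drive(t) + sqrt(2*D)*xi(t).
    # The entropic channel is replaced by an asymmetric (ratchet) potential;
    # the square-wave drive averages to zero over a period ("unbiased").
    rng = np.random.default_rng(seed)
    x = 0.0
    kicks = rng.standard_normal(steps) * np.sqrt(2 * D * dt)
    for n in range(steps):
        f = np.cos(2 * np.pi * x) + 0.5 * np.cos(4 * np.pi * x)  # ratchet force
        f += F0 * np.sign(np.sin(omega * n * dt))                # unbiased drive
        x += f * dt + kicks[n]
    return x / (steps * dt)  # time-averaged velocity, i.e. the particle current

print(mean_current())  # typically nonzero: the asymmetry rectifies the drive
```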

  16. An Example of an Improvable Rao-Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator.

    PubMed

    Galili, Tal; Meilijson, Isaac

    2016-01-02

    The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated.
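
    The paper's point rests on an incomplete minimal sufficient statistic; as a simpler companion, the sketch below runs the textbook complete case U(0, theta), where conditioning the crude unbiased estimator 2*Xbar on the sample maximum gives the UMVUE (n+1)/n * X_(n). This illustrates only the Rao-Blackwell mechanics, not the paper's counterexample.

```python
import numpy as np

rng = np.random.default_rng(42)
theta, n, reps = 3.0, 10, 100_000
x = rng.uniform(0, theta, size=(reps, n))

crude = 2 * x.mean(axis=1)            # unbiased: E[2*Xbar] = theta
rb = (n + 1) / n * x.max(axis=1)      # E[crude | X_(n)]: the Rao-Blackwellization

for name, est in [("crude", crude), ("Rao-Blackwellized", rb)]:
    print(f"{name:>18}: mean={est.mean():.4f}  var={est.var():.5f}")
# Both means are ~3.0; the variance drops from ~theta^2/(3n) to ~theta^2/(n(n+2)).
```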

  17. Quantum key distribution for composite dimensional finite systems

    NASA Astrophysics Data System (ADS)

    Shalaby, Mohamed; Kamal, Yasser

    2017-06-01

    The application of quantum mechanics contributes a very important advantage to the field of cryptography, as it offers a mechanism for detecting the eavesdropper. The pioneering work on quantum key distribution uses mutually unbiased bases (MUBs) to prepare and measure qubits (or qudits). Weak mutually unbiased bases (WMUBs) satisfy weaker conditions than MUBs; however, unlike MUBs, a complete set of WMUBs can be constructed for systems with composite dimensions. In this paper, we study the use of WMUBs in quantum key distribution for composite dimensional finite systems. We prove that the security analysis of using a complete set of WMUBs to prepare and measure the quantum states in the generalized BB84 protocol gives better results than using the maximum number of MUBs that can be constructed, when analyzed against the intercept-and-resend attack.
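
    For operational intuition, two bases are mutually unbiased when every cross overlap satisfies |<e_i|f_j>|^2 = 1/d. The sketch below verifies this for the computational and Fourier bases in a prime dimension; the WMUB construction for composite dimensions studied in the paper is more involved and is not reproduced here.

```python
import numpy as np

d = 5                                   # prime dimension
comp = np.eye(d)                        # computational basis (columns)
four = np.exp(2j * np.pi * np.outer(range(d), range(d)) / d) / np.sqrt(d)

overlaps = np.abs(comp.conj().T @ four) ** 2
print(np.allclose(overlaps, 1 / d))     # True: every overlap equals 1/d
```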

  18. An automated benchmarking platform for MHC class II binding prediction methods.

    PubMed

    Andreatta, Massimo; Trolle, Thomas; Yan, Zhen; Greenbaum, Jason A; Peters, Bjoern; Nielsen, Morten

    2018-05-01

    Computational methods for the prediction of peptide-MHC binding have become an integral and essential component of candidate selection in experimental T cell epitope discovery studies. The sheer number of published prediction methods, and often discordant reports on their performance, poses a considerable quandary for the experimentalist who needs to choose the best tool for their research. With the goal of providing an unbiased, transparent evaluation of the state of the art in the field, we created an automated platform to benchmark peptide-MHC class II binding prediction tools. The platform evaluates the absolute and relative predictive performance of all participating tools on data newly entered into the Immune Epitope Database (IEDB) before they are made public, thereby providing a frequent, unbiased assessment of available prediction tools. The benchmark runs on a weekly basis, is fully automated, and displays up-to-date results on a publicly accessible website. The initial benchmark described here included six commonly used prediction servers, but other tools are encouraged to join with a simple sign-up procedure. Performance evaluation on 59 data sets composed of over 10 000 binding affinity measurements suggested that NetMHCIIpan is currently the most accurate tool, followed by NN-align and the IEDB consensus method. Weekly reports on the participating methods can be found online at: http://tools.iedb.org/auto_bench/mhcii/weekly/.

  19. Are Key Principles for improved health technology assessment supported and used by health technology assessment organizations?

    PubMed

    Neumann, Peter J; Drummond, Michael F; Jönsson, Bengt; Luce, Bryan R; Schwartz, J Sanford; Siebert, Uwe; Sullivan, Sean D

    2010-01-01

    Previously, our group (the International Working Group for HTA Advancement) proposed a set of fifteen Key Principles that could be applied to health technology assessment (HTA) programs in different jurisdictions and across a range of organizations and perspectives. In this commentary, we investigate the extent to which these principles are supported and used by fourteen selected HTA organizations worldwide. We find that some principles are broadly supported: examples include being explicit about HTA goals and scope; considering a wide range of evidence and outcomes; and being unbiased and transparent. Other principles receive less widespread support: examples are addressing issues of generalizability and transferability; being transparent about the link between HTA findings and decision-making processes; considering a full societal perspective; and monitoring the implementation of HTA findings. The analysis also suggests a lack of consensus in the field about some principles, for example, considering a societal perspective. Our study highlights differences in the uptake of key principles for HTA and indicates considerable room for improvement for HTA organizations to adopt principles identified to reflect good HTA practices. Most HTA organizations espouse certain general concepts of good practice, for example, that assessments should be unbiased and transparent. However, principles that require more intensive follow-up, for example, monitoring the implementation of HTA findings, have received little support and execution.

  20. Transposon-mediated generation of BCR-ABL1-expressing transgenic cell lines for unbiased sensitivity testing of tyrosine kinase inhibitors.

    PubMed

    Byrgazov, Konstantin; Lucini, Chantal Blanche; Berkowitsch, Bettina; Koenig, Margit; Haas, Oskar A; Hoermann, Gregor; Valent, Peter; Lion, Thomas

    2016-11-22

    Point mutations in the ABL1 kinase domain are an important mechanism of resistance to tyrosine kinase inhibitors (TKI) in BCR-ABL1-positive and, as recently shown, BCR-ABL1-like leukemias. The cell line Ba/F3 lentivirally transduced with mutant BCR-ABL1 constructs is widely used for in vitro sensitivity testing and response prediction to tyrosine kinase inhibitors. The transposon-based Sleeping Beauty system presented offers several advantages over lentiviral transduction including the absence of biosafety issues, faster generation of transgenic cell lines, and greater efficacy in introducing large gene constructs. Nevertheless, both methods can mediate multiple insertions in the genome. Here we show that multiple BCR-ABL1 insertions result in elevated IC50 levels for individual TKIs, thus overestimating the actual resistance of mutant subclones. We have therefore established flow-sorting-based fractionation of BCR-ABL1-transformed Ba/F3 cells facilitating efficient enrichment of cells carrying single-site insertions, as demonstrated by FISH-analysis. Fractions of unselected Ba/F3 cells not only showed a greater number of BCR-ABL1 hybridization signals, but also revealed higher IC50 values for the TKIs tested. The data presented highlight the need to carefully select transfected cells by flow-sorting, and to control the insertion numbers by FISH and real-time PCR to permit unbiased in vitro testing of drug resistance.
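
    The IC50 values being compared are typically obtained by fitting a four-parameter logistic (Hill) curve to dose-response data. A generic SciPy sketch with hypothetical viability numbers, not data from the study:

```python
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, top, bottom, ic50, h):
    # Four-parameter logistic dose-response curve.
    return bottom + (top - bottom) / (1 + (conc / ic50) ** h)

# Hypothetical viability (% of untreated control) at TKI concentrations (nM).
conc = np.array([1, 3, 10, 30, 100, 300, 1000, 3000], dtype=float)
viab = np.array([99, 97, 90, 72, 45, 22, 9, 4], dtype=float)

popt, _ = curve_fit(hill, conc, viab, p0=[100, 0, 100, 1])
print(f"estimated IC50 = {popt[2]:.0f} nM")
```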

  1. Academic consortium for the evaluation of computer-aided diagnosis (CADx) in mammography

    NASA Astrophysics Data System (ADS)

    Mun, Seong K.; Freedman, Matthew T.; Wu, Chris Y.; Lo, Shih-Chung B.; Floyd, Carey E., Jr.; Lo, Joseph Y.; Chan, Heang-Ping; Helvie, Mark A.; Petrick, Nicholas; Sahiner, Berkman; Wei, Datong; Chakraborty, Dev P.; Clarke, Laurence P.; Kallergi, Maria; Clark, Bob; Kim, Yongmin

    1995-04-01

    Computer aided diagnosis (CADx) is a promising technology for the detection of breast cancer in screening mammography. A number of different approaches have been developed for CADx research that have achieved significant levels of performance. Research teams now recognize the need for a careful and detailed evaluation study of approaches to accelerate the development of CADx, to make CADx more clinically relevant, and to optimize CADx algorithms based on unbiased evaluations. The results of such a comparative study may provide each of the participating teams with new insights into the optimization of their individual CADx algorithms. This consortium of experienced CADx researchers is working as a group to compare the results of the algorithms and to optimize the performance of CADx algorithms by learning from each other. Each institution will contribute an equal number of cases, collected under a standard protocol for case selection, truth determination, and data acquisition, to establish a common and unbiased database for the evaluation study. An evaluation procedure for the comparison studies is being developed to analyze the results of individual algorithms for each of the test cases in the common database. Optimization of individual CADx algorithms can then be made based on the comparison studies. The consortium effort is expected to accelerate the eventual clinical implementation of CADx algorithms at participating institutions.

  2. Correlates of tuberculosis risk: predictive biomarkers for progression to active tuberculosis

    PubMed Central

    Petruccioli, Elisa; Scriba, Thomas J.; Petrone, Linda; Hatherill, Mark; Cirillo, Daniela M.; Joosten, Simone A.; Ottenhoff, Tom H.; Denkinger, Claudia M.; Goletti, Delia

    2016-01-01

    New approaches to control the spread of tuberculosis (TB) are needed, including tools to predict development of active TB from latent TB infection (LTBI). Recent studies have described potential correlates of risk, in order to inform the development of prognostic tests for TB disease progression. These efforts have included unbiased approaches employing “omics” technologies, as well as more directed, hypothesis-driven approaches assessing a small set or even individual selected markers as candidate correlates of TB risk. Unbiased high-throughput screening of blood RNAseq profiles identified signatures of active TB risk in individuals with LTBI, ≥1 year before diagnosis. A recent infant vaccination study identified enhanced expression of T-cell activation markers as a correlate of risk prior to developing TB; conversely, high levels of Ag85A antibodies and high frequencies of interferon (IFN)-γ specific T-cells were associated with reduced risk of disease. Others have described CD27−IFN-γ+CD4+ T-cells as possibly predictive markers of TB disease. T-cell responses to TB latency antigens, including heparin-binding haemagglutinin and DosR-regulon-encoded antigens have also been correlated with protection. Further studies are needed to determine whether correlates of risk can be used to prevent active TB through targeted prophylactic treatment, or to allow targeted enrolment into efficacy trials of new TB vaccines and therapeutic drugs. PMID:27836953

  3. Circulating tumor cell detection: A direct comparison between negative and unbiased enrichment in lung cancer.

    PubMed

    Xu, Yan; Liu, Biao; Ding, Fengan; Zhou, Xiaodie; Tu, Pin; Yu, Bo; He, Yan; Huang, Peilin

    2017-06-01

    Circulating tumor cells (CTCs), isolated as a 'liquid biopsy', may provide important diagnostic and prognostic information. Therefore, rapid, reliable and unbiased detection of CTCs is required for routine clinical analyses. It has been demonstrated that negative enrichment, an epithelial marker-independent technique for isolating CTCs, exhibits better efficiency in the detection of CTCs compared with positive enrichment techniques that rely only on specific anti-epithelial cell adhesion molecules. However, negative enrichment techniques incur significant cell loss during the isolation procedure, and as a method that uses only one type of antibody, they are inherently biased. The detection procedure and identification of cell types also rely on skilled and experienced technicians. In the present study, the detection sensitivity of negative enrichment was compared with that of a previously described unbiased detection method. The results revealed that the unbiased detection method can efficiently detect >90% of cancer cells in blood samples containing CTCs. By contrast, only 40-60% of CTCs were detected by negative enrichment. Additionally, CTCs were identified in >65% of patients with stage I/II lung cancer. This simple yet efficient approach can achieve a high level of sensitivity. It demonstrates the potential for large-scale clinical implementation of CTC-based diagnostic and prognostic strategies.

  4. Building unbiased estimators from non-gaussian likelihoods with application to shear estimation

    DOE PAGES

    Madhavacheril, Mathew S.; McDonald, Patrick; Sehgal, Neelima; ...

    2015-01-15

    We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the work of Bernstein and Armstrong (2014). Our first order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong's estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g| = 0.2.
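
    At first order the construction amounts to one Newton-like step away from the fiducial point, with the Fisher information replacing the observed curvature: theta_hat = theta0 + F(theta0)^-1 * score(theta0). A toy sketch for a Gaussian likelihood in the mean, where the log-likelihood is exactly quadratic and the first-order estimator is therefore already unbiased; this shows the mechanics only, not the shear application:

```python
import numpy as np

rng = np.random.default_rng(7)
theta_true, theta0, sigma, n, reps = 0.30, 0.0, 1.0, 50, 20_000

est = np.empty(reps)
for r in range(reps):
    x = rng.normal(theta_true, sigma, n)
    score = np.sum(x - theta0) / sigma**2    # d(log L)/d(theta) at the fiducial
    fisher = n / sigma**2                    # E[-d^2(log L)/d(theta)^2]
    est[r] = theta0 + score / fisher

print(est.mean())  # ~0.30: unbiased even though theta0 != theta_true
```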

  5. Building unbiased estimators from non-Gaussian likelihoods with application to shear estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Madhavacheril, Mathew S.; Sehgal, Neelima; McDonald, Patrick

    2015-01-01

    We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the work of Bernstein and Armstrong (2014). Our first order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong's estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g| = 0.2.

  6. Developing Novel Therapeutics Targeting Undifferentiated and Castration-Resistant Prostate Cancer Stem Cells

    DTIC Science & Technology

    2016-10-01

    Goals of the Project (SOW): Aim 1: To perform phage display library (PDL) screening in PSA-/lo PCa cells to identify PCSC-specific homing peptides; and Aim 2: To perform unbiased drug library screening to identify novel PCSC-targeting chemicals.

  7. What Do We Know about Using Value-Added to Compare Teachers Who Work in Different Schools? What We Know Series: Value-Added Methods and Applications. Knowledge Brief 10

    ERIC Educational Resources Information Center

    Raudenbush, Stephen

    2013-01-01

    This brief considers the problem of using value-added scores to compare teachers who work in different schools. The author focuses on whether such comparisons can be regarded as fair, or, in statistical language, "unbiased." An unbiased measure does not systematically favor teachers because of the backgrounds of the students they are…

  8. Long Term Follow up of the Delayed Effects of Acute Radiation Exposure in Primates

    DTIC Science & Technology

    2017-10-01

    We will then use shRNAs and/or CRISPR constructs targeting the gene of interest to knock down its expression in stem cells prior to... Exome sequencing in 1,001 DLBCL patients comprehensively identifies 150 driver genes; gene expression identifies subgroups, including cell of origin; and an unbiased CRISPR screen in DLBCL cell lines identifies essential...

  9. Four photon parametric amplification [in unbiased Josephson junction]

    NASA Technical Reports Server (NTRS)

    Parrish, P. T.; Feldman, M. J.; Ohta, H.; Chiao, R. Y.

    1974-01-01

    An analysis is presented describing four-photon parametric amplification in an unbiased Josephson junction. Central to the theory is the model of the Josephson effect as a nonlinear inductance. Linear, small signal analysis is applied to the two-fluid model of the Josephson junction. The gain, gain-bandwidth product, high frequency limit, and effective noise temperature are calculated for a cavity reflection amplifier. The analysis is extended to multiple (series-connected) junctions and subharmonic pumping.

  10. Test of mutually unbiased bases for six-dimensional photonic quantum systems

    PubMed Central

    D'Ambrosio, Vincenzo; Cardano, Filippo; Karimi, Ebrahim; Nagali, Eleonora; Santamato, Enrico; Marrucci, Lorenzo; Sciarrino, Fabio

    2013-01-01

    In quantum information, complementarity of quantum mechanical observables plays a key role. The eigenstates of two complementary observables form a pair of mutually unbiased bases (MUBs). More generally, a set of MUBs consists of bases that are all pairwise unbiased. Except for specific dimensions of the Hilbert space, the maximal sets of MUBs are unknown in general. Even for a dimension as low as six, the identification of a maximal set of MUBs remains an open problem, although there is strong numerical evidence that no more than three simultaneous MUBs do exist. Here, by exploiting a newly developed holographic technique, we implement and test different sets of three MUBs for a single photon six-dimensional quantum state (a “qusix”), encoded exploiting polarization and orbital angular momentum of photons. A close agreement is observed between theory and experiments. Our results can find applications in state tomography, quantitative wave-particle duality, quantum key distribution. PMID:24067548

  11. Test of mutually unbiased bases for six-dimensional photonic quantum systems.

    PubMed

    D'Ambrosio, Vincenzo; Cardano, Filippo; Karimi, Ebrahim; Nagali, Eleonora; Santamato, Enrico; Marrucci, Lorenzo; Sciarrino, Fabio

    2013-09-25

    In quantum information, complementarity of quantum mechanical observables plays a key role. The eigenstates of two complementary observables form a pair of mutually unbiased bases (MUBs). More generally, a set of MUBs consists of bases that are all pairwise unbiased. Except for specific dimensions of the Hilbert space, the maximal sets of MUBs are unknown in general. Even for a dimension as low as six, the identification of a maximal set of MUBs remains an open problem, although there is strong numerical evidence that no more than three simultaneous MUBs do exist. Here, by exploiting a newly developed holographic technique, we implement and test different sets of three MUBs for a single photon six-dimensional quantum state (a "qusix"), encoded exploiting polarization and orbital angular momentum of photons. A close agreement is observed between theory and experiments. Our results can find applications in state tomography, quantitative wave-particle duality, quantum key distribution.

  12. Graph-state formalism for mutually unbiased bases

    NASA Astrophysics Data System (ADS)

    Spengler, Christoph; Kraus, Barbara

    2013-11-01

    A pair of orthonormal bases is called mutually unbiased if all mutual overlaps between any element of one basis and an arbitrary element of the other basis coincide. In case the dimension, d, of the considered Hilbert space is a power of a prime number, complete sets of d+1 mutually unbiased bases (MUBs) exist. Here we present a method based on the graph-state formalism to construct such sets of MUBs. We show that for n p-level systems, with p being prime, one particular graph suffices to easily construct a set of p^n+1 MUBs. In fact, we show that a single n-dimensional vector, which is associated with this graph, can be used to generate a complete set of MUBs and demonstrate that this vector can be easily determined. Finally, we discuss some advantages of our formalism regarding the analysis of entanglement structures in MUBs, as well as experimental realizations.

  13. Diallel analysis for sex-linked and maternal effects.

    PubMed

    Zhu, J; Weir, B S

    1996-01-01

    Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
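
    The jackknife recipe mentioned for estimating sampling variances is generic and compact; a minimal sketch (the diallel-model estimators themselves are not reproduced here):

```python
import numpy as np

def jackknife_variance(data, estimator):
    # Leave-one-out jackknife estimate of the sampling variance of `estimator`.
    data = np.asarray(data)
    n = len(data)
    loo = np.array([estimator(np.delete(data, i)) for i in range(n)])
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(10, 2, 40)
se = np.sqrt(jackknife_variance(x, np.mean))
print(se, x.std(ddof=1) / np.sqrt(len(x)))  # the two standard errors agree
# A t statistic for significance testing is then estimate / se.
```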

  14. Long-term variability of T Tauri stars using WASP

    NASA Astrophysics Data System (ADS)

    Rigon, Laura; Scholz, Alexander; Anderson, David; West, Richard

    2017-03-01

    We present a reference study of the long-term optical variability of young stars using data from the WASP project. Our primary sample is a group of well-studied classical T Tauri stars (CTTSs), mostly in Taurus-Auriga. WASP light curves cover time-scales of up to 7 yr and typically contain 10 000-30 000 data points. We quantify the variability as a function of time-scale using the time-dependent standard deviation 'pooled sigma'. We find that the overwhelming majority of CTTSs have a low-level variability with σ < 0.3 mag dominated by time-scales of a few weeks, consistent with rotational modulation. Thus, for most young stars, monitoring over a month is sufficient to constrain the total amount of variability over time-scales of up to a decade. The fraction of stars with a strong optical variability (σ > 0.3 mag) is 21 per cent in our sample and 21 per cent in an unbiased control sample. An even smaller fraction (13 per cent in our sample, 6 per cent in the control) show evidence for an increase in variability amplitude as a function of time-scale from weeks to months or years. The presence of long-term variability correlates with the spectral slope at 3-5 μm, which is an indicator of inner disc geometry, and with the U-B band slope, which is an accretion diagnostics. This shows that the long-term variations in CTTSs are predominantly driven by processes in the inner disc and in the accretion zone. Four of the stars with long-term variations show periods of 20-60 d, significantly longer than the rotation periods and stable over months to years. One possible explanation is cyclic changes in the interaction between the disc and the stellar magnetic field.
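
    A simplified reading of the pooled-sigma statistic: split the light curve into consecutive windows of the chosen time-scale and pool the per-window variances. Synthetic data stand in for WASP photometry here.

```python
import numpy as np

def pooled_sigma(t, mag, timescale):
    # Pool the per-window variances over all windows of the given length.
    t, mag = np.asarray(t), np.asarray(mag)
    edges = np.arange(t.min(), t.max() + timescale, timescale)
    variances = [mag[(t >= lo) & (t < hi)].var(ddof=1)
                 for lo, hi in zip(edges[:-1], edges[1:])
                 if ((t >= lo) & (t < hi)).sum() > 1]
    return np.sqrt(np.mean(variances))

rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 2000, 20_000))   # ~5.5 yr of epochs, in days
mag = 0.1 * np.sin(2 * np.pi * t / 8) + rng.normal(0, 0.05, t.size)
print(pooled_sigma(t, mag, 30))             # ~0.087: rotation-dominated signal
```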

  15. Subtraction of cap-trapped full-length cDNA libraries to select rare transcripts.

    PubMed

    Hirozane-Kishikawa, Tomoko; Shiraki, Toshiyuki; Waki, Kazunori; Nakamura, Mari; Arakawa, Takahiro; Kawai, Jun; Fagiolini, Michela; Hensch, Takao K; Hayashizaki, Yoshihide; Carninci, Piero

    2003-09-01

    The normalization and subtraction of highly expressed cDNAs from relatively large tissues before cloning dramatically enhanced the gene discovery by sequencing for the mouse full-length cDNA encyclopedia, but these methods have not been suitable for limited RNA materials. To normalize and subtract full-length cDNA libraries derived from limited quantities of total RNA, here we report a method to subtract plasmid libraries excised from size-unbiased amplified lambda phage cDNA libraries that avoids heavily biasing steps such as PCR and plasmid library amplification. The proportion of full-length cDNAs and the gene discovery rate are high, and library diversity can be validated by in silico randomization.

  16. Advanced Energy Retrofit Guide: Practical Ways to Improve Energy Performance; Grocery Stores (Revised) (Book)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hendron, B.

    2013-07-01

    The U.S. Department of Energy developed the Advanced Energy Retrofit Guides (AERGs) to provide specific methodologies, information, and guidance to help energy managers and other stakeholders successfully plan and execute energy efficiency improvements. Detailed technical discussion is fairly limited in these guides. Instead, we emphasize actionable information, practical methodologies, diverse case studies, and unbiased evaluations of the most promising retrofit measures for each building type. A series of AERGs is under development, addressing key segments of the commercial building stock. Grocery stores were selected as one of the highest priority sectors, because they represent one of the most energy-intensive market segments.

  17. Probing the Properties of AGN Clustering in the Local Universe with Swift-BAT

    NASA Astrophysics Data System (ADS)

    Powell, M.; Cappelluti, N.; Urry, M.; Koss, M.; Allevato, V.; Ajello, M.

    2017-10-01

    I present the benchmark measurement of AGN clustering in the local universe with the all-sky Swift-BAT survey. The hard X-ray selection (14-195 keV) allows for the detection of some of the most obscured AGN, providing the largest, most unbiased sample of local AGN to date. We derive for the first time the halo occupation distribution (HOD) of the sample in various bins of black hole mass, accretion rate, and obscuration. In doing so, we characterize the cosmic environment of growing supermassive black holes with unprecedented precision, and determine which black hole parameters depend on environment. We then compare our results to the current evolutionary models of AGN.

  18. Random bits, true and unbiased, from atmospheric turbulence

    PubMed Central

    Marangon, Davide G.; Vallone, Giuseppe; Villoresi, Paolo

    2014-01-01

    Random numbers represent a fundamental ingredient for secure communications and numerical simulation, as well as for games and Information Science in general. Physical processes with intrinsic unpredictability may be exploited to generate genuine random numbers. The optical propagation of a laser beam through strong atmospheric turbulence is exploited here for this purpose, by observing the beam after a 143 km free-space path. In addition, we developed an algorithm to extract the randomness of the beam images at the receiver without post-processing. The numbers passed very selective randomness tests for qualification as genuine random numbers. The extracting algorithm can be easily generalized to random images generated by different physical processes. PMID:24976499
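
    The authors' image-based extractor is not reproduced here; as a generic reference point for removing bias from a physical bit source, the classic von Neumann procedure outputs exactly unbiased bits from any i.i.d. source, at the cost of discarding pairs:

```python
import random

def von_neumann(bits):
    # Map the pair 01 -> 0 and 10 -> 1; discard 00 and 11.
    # For i.i.d. input bits, P(01) = P(10), so the output is exactly unbiased.
    return [a for a, b in zip(bits[::2], bits[1::2]) if a != b]

raw = [1 if random.random() < 0.7 else 0 for _ in range(100_000)]  # biased source
out = von_neumann(raw)
print(sum(out) / len(out))  # ~0.5
```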

  19. Bias correction by use of errors-in-variables regression models in studies with K-X-ray fluorescence bone lead measurements.

    PubMed

    Lamadrid-Figueroa, Héctor; Téllez-Rojo, Martha M; Angeles, Gustavo; Hernández-Ávila, Mauricio; Hu, Howard

    2011-01-01

    In-vivo measurement of bone lead by means of K-X-ray fluorescence (KXRF) is the preferred biological marker of chronic exposure to lead. Unfortunately, considerable measurement error associated with KXRF estimations can introduce bias in estimates of the effect of bone lead when this variable is included as the exposure in a regression model. Estimates of uncertainty reported by the KXRF instrument reflect the variance of the measurement error and, although they can be used to correct the measurement error bias, they are seldom used in epidemiological statistical analyses. Errors-in-variables (EIV) regression allows for correction of the bias caused by measurement error in predictor variables, based on knowledge of the reliability of such variables. The authors propose a way to obtain reliability coefficients for bone lead measurements from the uncertainty data reported by the KXRF instrument and compare, by the use of Monte Carlo simulations, results obtained using EIV regression models vs. those obtained by standard procedures. Results of the simulations show that Ordinary Least Squares (OLS) regression models provide severely biased estimates of effect, and that EIV provides nearly unbiased estimates. Although EIV effect estimates are more imprecise, their mean squared error is much smaller than that of the OLS estimates. In conclusion, EIV is a better alternative than OLS for estimating the effect of bone lead when measured by KXRF. Copyright © 2010 Elsevier Inc. All rights reserved.
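
    The core of a reliability-based correction is the classical attenuation relation: with reliability lambda = var(true)/var(measured), the naive slope estimates lambda*beta, so dividing by lambda disattenuates it. A simulation sketch with hypothetical numbers (the paper's EIV models are more general):

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta = 5_000, 0.5
true_lead = rng.normal(20, 5, n)             # latent true bone lead
measured = true_lead + rng.normal(0, 4, n)   # KXRF reading with measurement error
outcome = beta * true_lead + rng.normal(0, 1, n)

C = np.cov(measured, outcome)
b_ols = C[0, 1] / C[0, 0]                    # attenuated toward zero (~0.30)
reliability = 5**2 / (5**2 + 4**2)           # var(true) / var(measured)
print(b_ols, b_ols / reliability)            # corrected estimate is ~0.50
```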

  20. The AzTEC/SMA Interferometric Imaging Survey of Submillimeter-selected High-redshift Galaxies

    NASA Astrophysics Data System (ADS)

    Younger, Joshua D.; Fazio, Giovanni G.; Huang, Jia-Sheng; Yun, Min S.; Wilson, Grant W.; Ashby, Matthew L. N.; Gurwell, Mark A.; Peck, Alison B.; Petitpas, Glen R.; Wilner, David J.; Hughes, David H.; Aretxaga, Itziar; Kim, Sungeun; Scott, Kimberly S.; Austermann, Jason; Perera, Thushara; Lowenthal, James D.

    2009-10-01

    We present results from a continuing interferometric survey of high-redshift submillimeter galaxies (SMGs) with the Submillimeter Array, including high-resolution (beam size ~2 arcsec) imaging of eight additional AzTEC 1.1 mm selected sources in the COSMOS field, for which we obtain six reliable (peak signal-to-noise ratio (S/N) >5 or peak S/N >4 with multiwavelength counterparts within the beam) and two moderate significance (peak S/N >4) detections. When combined with previous detections, this yields an unbiased sample of millimeter-selected SMGs with complete interferometric follow up. With this sample in hand, we (1) empirically confirm the radio-submillimeter association, (2) examine the submillimeter morphology—including the nature of SMGs with multiple radio counterparts and constraints on the physical scale of the far infrared—of the sample, and (3) find additional evidence for a population of extremely luminous, radio-dim SMGs that peaks at higher redshift than previous, radio-selected samples. In particular, the presence of such a population of high-redshift sources has important consequences for models of galaxy formation—which struggle to account for such objects even under liberal assumptions—and dust production models given the limited time since the big bang.

  1. Accounting for selection and correlation in the analysis of two-stage genome-wide association studies.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2016-10-01

    The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nat. Gen. 39: (7), 830-832.), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account. © The Author 2016. Published by Oxford University Press.
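
    The selection bias at stake is easy to reproduce: choosing the most promising variant in stage 1 inflates its naive estimate, while an independent stage-2 replication of the same variant does not. The sketch illustrates the problem only; the UMVCUE itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(11)
m, true_effect, se, reps = 1000, 0.10, 0.05, 5_000
stage1_sel, stage2_sel = np.empty(reps), np.empty(reps)

for r in range(reps):
    truth = np.zeros(m)
    truth[0] = true_effect                    # one real signal among m variants
    z1 = truth + rng.normal(0, se, m)         # stage 1 estimates
    best = np.argmax(z1)                      # select the most promising variant
    stage1_sel[r] = z1[best]
    stage2_sel[r] = truth[best] + rng.normal(0, se)   # independent replication

print(stage1_sel.mean(), stage2_sel.mean())   # stage 1 inflated; stage 2 not
```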

  2. An engineered high affinity Fbs1 carbohydrate binding protein for selective capture of N-glycans and N-glycopeptides

    PubMed Central

    Chen, Minyong; Shi, Xiaofeng; Duke, Rebecca M.; Ruse, Cristian I.; Dai, Nan; Taron, Christopher H.; Samuelson, James C.

    2017-01-01

    A method for selective and comprehensive enrichment of N-linked glycopeptides was developed to facilitate detection of micro-heterogeneity of N-glycosylation. The method takes advantage of the inherent properties of Fbs1, which functions within the ubiquitin-mediated degradation system to recognize the common core pentasaccharide motif (Man3GlcNAc2) of N-linked glycoproteins. We show that Fbs1 is able to bind diverse types of N-linked glycomolecules; however, wild-type Fbs1 preferentially binds high-mannose-containing glycans. We identified Fbs1 variants through mutagenesis and plasmid display selection, which possess higher affinity and improved recovery of complex N-glycomolecules. In particular, we demonstrate that the Fbs1 GYR variant may be employed for substantially unbiased enrichment of N-linked glycopeptides from human serum. Most importantly, this highly efficient N-glycopeptide enrichment method enables the simultaneous determination of N-glycan composition and N-glycosites with a deeper coverage (compared to lectin enrichment) and improves large-scale N-glycoproteomics studies due to greatly reduced sample complexity. PMID:28534482

  3. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    NASA Astrophysics Data System (ADS)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, is able to model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN for developing a GP model utilizing whole-genome Single Nucleotide Polymorphisms (SNPs) as data for training and testing. The case study was a set of traits in maize. The maize dataset was acquired from CIMMYT's (International Maize and Wheat Improvement Center) Global Maize program. Based on Pearson correlation, the DBN outperformed the other methods (reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased prediction (BLUP)) in the case of allegedly non-additive traits, achieving a correlation of 0.579 within the -1 to 1 range.
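
    Deep belief networks are rarely available off the shelf; purely as a stand-in illustration of nonlinear genomic prediction from a SNP matrix, the sketch below cross-validates a small feed-forward network (scikit-learn assumed) on synthetic genotypes containing an epistatic term:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n, p = 300, 1000
X = rng.integers(0, 3, size=(n, p)).astype(float)   # SNPs coded 0/1/2
# Toy phenotype: an additive part plus one pairwise (epistatic) interaction.
y = X[:, 0] - 0.5 * X[:, 1] + 0.8 * X[:, 2] * X[:, 3] + rng.normal(0, 1, n)

model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```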

  4. Model selection for marginal regression analysis of longitudinal data with missing observations and covariate measurement error.

    PubMed

    Shen, Chung-Wei; Chen, Yi-Hau

    2015-10-01

    Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. On the contrary, the naive procedures without taking care of such complexity in data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations. © The Author 2015. Published by Oxford University Press. All rights reserved.

  5. Puffed-up but shaky selves: State self-esteem level and variability in narcissists.

    PubMed

    Geukes, Katharina; Nestler, Steffen; Hutteman, Roos; Dufner, Michael; Küfner, Albrecht C P; Egloff, Boris; Denissen, Jaap J A; Back, Mitja D

    2017-05-01

    Different theoretical conceptualizations characterize grandiose narcissists by high, yet fragile self-esteem. Empirical evidence, however, has been inconsistent, particularly regarding the relationship between narcissism and self-esteem fragility (i.e., self-esteem variability). Here, we aim at unraveling this inconsistency by disentangling the effects of two theoretically distinct facets of narcissism (i.e., admiration and rivalry) on the two aspects of state self-esteem (i.e., level and variability). We report on data from a laboratory-based and two field-based studies (total N = 596) in realistic social contexts, capturing momentary, daily, and weekly fluctuations of state self-esteem. To estimate unbiased effects of narcissism on the level and variability of self-esteem within one model, we applied mixed-effects location scale models. Results of the three studies and their meta-analytical integration indicated that narcissism is positively linked to self-esteem level and variability. When distinguishing between admiration and rivalry, however, an important dissociation was identified: Admiration was related to high (and rather stable) levels of state self-esteem, whereas rivalry was related to (rather low and) fragile self-esteem. Analyses on underlying processes suggest that effects of rivalry on self-esteem variability are based on stronger decreases in self-esteem from one assessment to the next, particularly after a perceived lack of social inclusion. The revealed differentiated effects of admiration and rivalry explain why the analysis of narcissism as a unitary concept has led to the inconsistent past findings and provide deeper insights into the intrapersonal dynamics of grandiose narcissism governing state self-esteem. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  6. Cross-Layer Resource Allocation for Wireless Visual Sensor Networks and Mobile Ad Hoc Networks

    DTIC Science & Technology

    2014-10-01

    Minimize Maximum Distortion (MMD) minimizes the maximum distortion among all nodes of the network, promoting a rather unbiased treatment of the nodes. We employed the Particle... to achieve the ideal tradeoff between the transmitted video quality and energy consumption. Each sensor node has a bit rate that can be used for both... For both criteria...

  7. Unbiased estimators for spatial distribution functions of classical fluids

    NASA Astrophysics Data System (ADS)

    Adib, Artur B.; Jarzynski, Christopher

    2005-01-01

    We use a statistical-mechanical identity closely related to the familiar virial theorem, to derive unbiased estimators for spatial distribution functions of classical fluids. In particular, we obtain estimators for both the fluid density ρ(r) in the vicinity of a fixed solute and the pair correlation g(r) of a homogeneous classical fluid. We illustrate the utility of our estimators with numerical examples, which reveal advantages over traditional histogram-based methods of computing such distributions.
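
    For contrast with the unbiased estimators proposed, the traditional histogram estimator of g(r) that they improve upon can be sketched as follows (homogeneous fluid in a cubic periodic box, minimum-image convention; a generic implementation, not the paper's code):

```python
import numpy as np

def pair_correlation(positions, box, dr=0.05):
    # Histogram estimator of g(r) for a homogeneous fluid in a periodic box.
    n = len(positions)
    bins = np.arange(0, box / 2 + dr, dr)
    counts = np.zeros(len(bins) - 1)
    for i in range(n - 1):
        d = positions[i + 1:] - positions[i]
        d -= box * np.round(d / box)               # minimum-image convention
        counts += np.histogram(np.sqrt((d**2).sum(axis=1)), bins)[0]
    shell = 4 / 3 * np.pi * (bins[1:]**3 - bins[:-1]**3)
    return bins[:-1] + dr / 2, 2 * counts / (n * shell * (n / box**3))

pos = np.random.default_rng(8).uniform(0, 10.0, (500, 3))   # ideal gas
r, g = pair_correlation(pos, box=10.0)
print(g[20:25].round(2))   # ~1 everywhere, as expected for an ideal gas
```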

  8. SMAP Level 4 Surface and Root Zone Soil Moisture

    NASA Technical Reports Server (NTRS)

    Reichle, R.; De Lannoy, G.; Liu, Q.; Ardizzone, J.; Kimball, J.; Koster, R.

    2017-01-01

    The SMAP Level 4 soil moisture (L4_SM) product provides global estimates of surface and root zone soil moisture, along with other land surface variables and their error estimates. These estimates are obtained through assimilation of SMAP brightness temperature observations into the Goddard Earth Observing System (GEOS-5) land surface model. The L4_SM product is provided at 9 km spatial and 3-hourly temporal resolution and with about 2.5 day latency. The soil moisture and temperature estimates in the L4_SM product are validated against in situ observations. The L4_SM product meets the required target uncertainty of 0.04 m^3/m^3, measured in terms of unbiased root-mean-square error, for both surface and root zone soil moisture.
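
    Unbiased root-mean-square error (ubRMSE) is the RMSE left after removing the mean bias, ubRMSE = sqrt(RMSE^2 - bias^2). A small sketch with made-up soil moisture values:

```python
import numpy as np

def ubrmse(model, obs):
    # Unbiased RMSE: measures random error only, with the mean bias removed.
    model, obs = np.asarray(model), np.asarray(obs)
    bias = (model - obs).mean()
    rmse = np.sqrt(((model - obs) ** 2).mean())
    return np.sqrt(rmse**2 - bias**2)

model = np.array([0.30, 0.28, 0.35, 0.31])   # volumetric soil moisture, m^3/m^3
obs = np.array([0.26, 0.25, 0.30, 0.27])
print(ubrmse(model, obs))                    # ~0.007, despite a 0.04 bias
```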

  9. Enhanced Sampling in the Well-Tempered Ensemble

    NASA Astrophysics Data System (ADS)

    Bonomi, M.; Parrinello, M.

    2010-05-01

    We introduce the well-tempered ensemble (WTE), which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann averages are computed on the fly by a recently developed reweighting method [M. Bonomi, J. Comput. Chem. 30, 1615 (2009), doi:10.1002/jcc.21305]. We apply WTE and its parallel tempering variant to the 2d Ising model and to a Gō model of HIV protease, demonstrating in these two representative cases that convergence is accelerated by orders of magnitude.

  10. Enhanced sampling in the well-tempered ensemble.

    PubMed

    Bonomi, M; Parrinello, M

    2010-05-14

    We introduce the well-tempered ensemble (WTE) which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann averages are computed on the fly by a recently developed reweighting method [M. Bonomi, J. Comput. Chem. 30, 1615 (2009)]. We apply WTE and its parallel tempering variant to the 2d Ising model and to a Gō model of HIV protease, demonstrating in these two representative cases that convergence is accelerated by orders of magnitude.

  11. Teacher Efficacy of Secondary Special Education Science Teachers

    NASA Astrophysics Data System (ADS)

    Bonton, Celeste

    Students with disabilities are a specific group of the student population who are guaranteed rights that allow them to receive a free and unbiased education in an environment with their non-disabled peers. The importance of this study relates to providing students with disabilities with the opportunity to receive instruction from the most efficient and prepared educators. The purpose of this study is to determine how specific factors influence special education belief systems. In particular, educators who provide science instruction in whole-group or small-group classrooms in a large metropolitan area in Georgia possess specific beliefs about their ability to provide meaningful instruction. Data were collected through a correlational study completed by educators via an online survey website. The SEBEST quantitative survey instrument was used on a medium sample size (approximately 120 teachers) in a large metropolitan school district. The selected statistical analyses were the Shapiro-Wilk and Mann-Whitney tests, used to determine whether any relationship exists between preservice training and the perceived self-efficacy of secondary special education teachers in the content area of science. The results of this study showed that special education teachers in the content area of science have a higher perceived self-efficacy if they have completed an alternative certification program. The other variables tested did not show any statistical significance. Further research could center on the analysis of actual teacher efficacy, year-end teacher efficacy measurements, teacher stipends, increased recruitment, and special education teachers of multiple content areas.

  12. The edge-preservation multi-classifier relearning framework for the classification of high-resolution remotely sensed imagery

    NASA Astrophysics Data System (ADS)

    Han, Xiaopeng; Huang, Xin; Li, Jiayi; Li, Yansheng; Yang, Michael Ying; Gong, Jianya

    2018-04-01

    In recent years, the availability of high-resolution imagery has enabled more detailed observation of the Earth. However, it is imperative to simultaneously achieve accurate interpretation and preserve the spatial details in the classification of such high-resolution data. To this end, we propose the edge-preservation multi-classifier relearning framework (EMRF). This multi-classifier framework is made up of support vector machine (SVM), random forest (RF), and sparse multinomial logistic regression via variable splitting and augmented Lagrangian (LORSAL) classifiers, considering their complementary characteristics. To better characterize complex scenes in remote sensing images, relearning based on landscape metrics is proposed, which iteratively quantifies both the landscape composition and spatial configuration through the use of the initial classification results. In addition, a novel tri-training strategy is proposed to solve the over-smoothing effect of relearning by means of automatic selection of training samples with low classification certainty, which always distribute in or near the edge areas. Finally, EMRF flexibly combines the strengths of relearning and tri-training via the classification certainties calculated from the probabilistic outputs of the respective classifiers. It should be noted that, in order to achieve an unbiased evaluation, we assessed the classification accuracy of the proposed framework using both edge and non-edge test samples. The experimental results obtained with four multispectral high-resolution images confirm the efficacy of the proposed framework, in terms of both edge and non-edge accuracy.

  13. Incomplete offspring sex bias in Australian populations of the butterfly Eurema hecabe

    PubMed Central

    Kemp, D J; Thomson, F E; Edwards, W; Iturbe-Ormaetxe, I

    2017-01-01

    Theory predicts unified sex ratios for most organisms, yet biases may be engendered by selfish genetic elements such as endosymbionts that kill or feminize individuals with male genotypes. Although rare, feminization is established for Wolbachia-infected Eurema butterflies. This paradigm is presently confined to islands in the southern Japanese archipelago, where feminized phenotypes produce viable all-daughter broods. Here, we characterize sex bias for E. hecabe in continental Australia. Starting with 186 wild-caught females, we reared >6000 F1–F3 progeny in pedigree designs that incorporated selective antibiotic treatments. F1 generations expressed a consistent bias across 2 years and populations that was driven by an ~5% incidence of broods comprising ⩾80% daughters. Females from biased lineages continued to overproduce daughters over two generations of outcrossing to wild males. Treatment with antibiotics of differential strength influenced sex ratio only in biased lineages by inducing an equivalent incomplete degree of son overproduction. Brood sex ratios were nevertheless highly variable within lineages and across generations. Intriguingly, the cytogenetic signature of female karyotype was uniformly absent, even among phenotypic females in unbiased lineages. Molecular evidence supported the existence of a single Wolbachia strain at high prevalence, yet this was not clearly linked to brood sex bias. In sum, we establish an inherited, experimentally reversible tendency for incomplete offspring bias. Key features of our findings clearly depart from the Japanese feminization paradigm and highlight the potential for more subtle degrees of sex distortion in arthropods. PMID:27731327

  14. A paleointensity technique for multidomain igneous rocks

    NASA Astrophysics Data System (ADS)

    Wang, Huapei; Kent, Dennis V.

    2013-10-01

    We developed a paleointensity technique to account for concave-up Arai diagrams due to multidomain (MD) contributions to determine unbiased paleointensities for 24 trial samples from site GA-X in Pleistocene lavas from Floreana Island, Galapagos Archipelago. The main magnetization carrier is fine-grained low-titanium magnetite of variable grain size. We used a comprehensive back-zero-forth (BZF) heating technique by adding an additional zero-field heating between the Thellier two opposite in-field heating steps in order to estimate paleointensities in various standard protocols and provide internal self-consistency checks. After the first BZF experiment, we gave each sample a total thermal remanent magnetization (tTRM) by cooling from the Curie point in the presence of a low (15 µT) laboratory-applied field. Then we repeated the BZF protocol, with the laboratory-applied tTRM as a synthetic natural remanent magnetization (NRM), using the same laboratory-applied field and temperature steps to obtain the synthetic Arai signatures, which should only represent the domain-state dependent properties of the samples. We corrected the original Arai diagrams from the first BZF experiment by using the Arai signatures from the repeated BZF experiment, which neutralizes the typical MD concave-up effect. Eleven samples meet the Arai diagram post-selection criteria and provide qualified paleointensity estimates with a mean value for site GA-X of 4.23 ± 1.29 µT, consistent with an excursional geomagnetic field direction reported for this site.

  15. Genomic selection in sugar beet breeding populations

    PubMed Central

    2013-01-01

    Background: Genomic selection exploits dense genome-wide marker data to predict breeding values. In this study we used a large sugar beet population of 924 lines representing different germplasm types present in breeding populations: unselected segregating families and diverse lines from more advanced stages of selection. All lines have been intensively phenotyped in multi-location field trials for six agronomically important traits and genotyped with 677 SNP markers. Results: We used ridge regression best linear unbiased prediction in combination with fivefold cross-validation and obtained high prediction accuracies for all except one trait. In addition, we investigated whether a calibration developed based on a training population composed of diverse lines is suited to predict the phenotypic performance within families. Our results show that the prediction accuracy is lower than that obtained within the diverse set of lines, but comparable to that obtained by cross-validation within the respective families. Conclusions: The results presented in this study suggest that a training population derived from intensively phenotyped and genotyped diverse lines from a breeding program does hold potential to build up robust calibration models for genomic selection. Taken together, our results indicate that genomic selection is a valuable tool and can thus complement the genomics toolbox in sugar beet breeding. PMID:24047500
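
    Ridge regression BLUP treats all marker effects as equally shrunk random effects and is equivalent to ridge regression with the penalty set by the residual-to-marker variance ratio. A generic sketch of the fivefold cross-validation workflow on synthetic genotypes (677 markers as in the study; scikit-learn assumed, penalty value illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(9)
n, p = 400, 677                                    # lines x SNP markers
X = rng.integers(0, 3, size=(n, p)).astype(float)
b = rng.normal(0, 0.1, p)                          # small effects on all markers
y = X @ b + rng.normal(0, 1.0, n)

# alpha = residual variance / marker-effect variance gives the RR-BLUP solution.
pred = cross_val_predict(Ridge(alpha=1.0 / 0.1**2), X, y, cv=5)
print(np.corrcoef(pred, y)[0, 1])                  # prediction accuracy
```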

  16. Metadynamics convergence law in a multidimensional system

    NASA Astrophysics Data System (ADS)

    Crespo, Yanier; Marinelli, Fabrizio; Pietrucci, Fabio; Laio, Alessandro

    2010-05-01

    Metadynamics is a powerful sampling technique that uses a nonequilibrium history-dependent process to reconstruct the free-energy surface as a function of the relevant collective variables s. In Bussi et al. [Phys. Rev. Lett. 96, 090601 (2006)] it is proved that, in a Langevin process, metadynamics provides an unbiased estimate of the free energy F(s). We here study the convergence properties of this approach in a multidimensional system, with a Hamiltonian depending on several variables. Specifically, we show that in a Monte Carlo metadynamics simulation of an Ising model the time average of the history-dependent potential converges to F(s) with the same law as an umbrella sampling performed in optimal conditions (i.e., with a bias exactly equal to the negative of the free energy). Remarkably, after a short transient, the error becomes approximately independent of the filling speed, showing that even in out-of-equilibrium conditions metadynamics allows recovering an accurate estimate of F(s). These results were obtained by introducing a functional form of the history-dependent potential that avoids the onset of systematic errors near the boundaries of the free-energy landscape.

  17. Metadynamics convergence law in a multidimensional system.

    PubMed

    Crespo, Yanier; Marinelli, Fabrizio; Pietrucci, Fabio; Laio, Alessandro

    2010-05-01

    Metadynamics is a powerful sampling technique that uses a nonequilibrium history-dependent process to reconstruct the free-energy surface as a function of the relevant collective variables s. In Bussi et al. [Phys. Rev. Lett. 96, 090601 (2006)] it is proved that, in a Langevin process, metadynamics provides an unbiased estimate of the free energy F(s). We here study the convergence properties of this approach in a multidimensional system, with a Hamiltonian depending on several variables. Specifically, we show that in a Monte Carlo metadynamics simulation of an Ising model the time average of the history-dependent potential converges to F(s) with the same law as an umbrella sampling performed in optimal conditions (i.e., with a bias exactly equal to the negative of the free energy). Remarkably, after a short transient, the error becomes approximately independent of the filling speed, showing that even in out-of-equilibrium conditions metadynamics allows recovering an accurate estimate of F(s). These results were obtained by introducing a functional form of the history-dependent potential that avoids the onset of systematic errors near the boundaries of the free-energy landscape.
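
    A minimal 1D illustration of the ingredients discussed: untempered metadynamics on a double well, where the accumulated bias approaches -F(s) up to a constant. The paper's result concerns the time average of this history-dependent potential, which is not computed here; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(-2.0, 2.0, 401)                 # collective variable s
ds = grid[1] - grid[0]
bias = np.zeros_like(grid)
s, dt, kT, w, sig = -1.0, 1e-3, 1.0, 0.02, 0.1     # F(s) = (s^2 - 1)^2

for step in range(200_000):
    f = -4 * s * (s**2 - 1)                               # -dF/ds
    f += np.interp(s, grid[:-1], -np.diff(bias) / ds)     # -dV_bias/ds
    s += f * dt + np.sqrt(2 * kT * dt) * rng.standard_normal()
    s = min(max(s, -2.0), 2.0)
    if step % 200 == 0:                                   # deposit a Gaussian hill
        bias += w * np.exp(-((grid - s) ** 2) / (2 * sig**2))

F_est = -(bias - bias.max())       # free-energy estimate, up to a constant
print(F_est[200] - F_est.min())    # barrier at s = 0: ~1, the true height
```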

  18. Canonical Measure of Correlation (CMC) and Canonical Measure of Distance (CMD) between sets of data. Part 3. Variable selection in classification.

    PubMed

    Ballabio, Davide; Consonni, Viviana; Mauri, Andrea; Todeschini, Roberto

    2010-01-11

    In multivariate regression and classification problems, variable selection is an important procedure used to select an optimal subset of variables with the aim of producing more parsimonious and potentially more predictive models. Variable selection is often necessary when dealing with methodologies that produce thousands of variables, such as Quantitative Structure-Activity Relationships (QSARs) and highly dimensional analytical procedures. In this paper a novel method for variable selection for classification purposes is introduced. This method exploits the recently proposed Canonical Measure of Correlation between two sets of variables (CMC index). The CMC index is in this case calculated for two specific sets of variables, the former comprising the independent variables and the latter the unfolded class matrix. The CMC values, calculated by considering one variable at a time, can be sorted, yielding a ranking of the variables on the basis of their class discrimination capabilities. Alternatively, the CMC index can be calculated for all possible combinations of variables and the variable subset with the maximal CMC can be selected, but this procedure is computationally more demanding and the classification performance of the selected subset is not always the best. The effectiveness of the CMC index in selecting variables with discriminative ability was compared with that of other well-known strategies for variable selection, such as Wilks' Lambda, the VIP index based on Partial Least Squares-Discriminant Analysis, and the selection provided by classification trees. A variable forward selection based on the CMC index was finally used in conjunction with Linear Discriminant Analysis. This approach was tested on several chemical data sets. The results obtained were encouraging.
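
    The following sketch illustrates the univariate screening idea in simplified form: each variable is scored against the unfolded (one-hot) class matrix and the variables are ranked. The R^2-style score used here is a stand-in for the CMC index, whose exact definition is given in the paper.

      import numpy as np

      def class_correlation_scores(X, y):
          """Score = squared correlation between one column of X and the
          one-hot class matrix (equals the one-way ANOVA R^2)."""
          classes = np.unique(y)
          G = (y[:, None] == classes[None, :]).astype(float)   # unfolded class matrix
          scores = []
          for j in range(X.shape[1]):
              x = X[:, j] - X[:, j].mean()
              # projecting x onto the class-indicator space gives the class means
              fitted = G @ (G.T @ x / G.sum(axis=0))
              scores.append((fitted @ fitted) / (x @ x))       # explained-variance ratio
          return np.array(scores)

      rng = np.random.default_rng(0)
      y = rng.integers(0, 3, 150)
      X = rng.normal(size=(150, 20))
      X[:, 0] += y                     # make variable 0 discriminative
      ranking = np.argsort(class_correlation_scores(X, y))[::-1]
      print(ranking[:5])               # variable 0 should rank first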

  19. Age, environment, object recognition and morphological diversity of GFAP-immunolabeled astrocytes.

    PubMed

    Diniz, Daniel Guerreiro; de Oliveira, Marcus Augusto; de Lima, Camila Mendes; Fôro, César Augusto Raiol; Sosthenes, Marcia Consentino Kronka; Bento-Torres, João; da Costa Vasconcelos, Pedro Fernando; Anthony, Daniel Clive; Diniz, Cristovam Wanderley Picanço

    2016-10-10

    Few studies have explored the glial response to a standard environment and how the response may be associated with age-related cognitive decline in learning and memory. Here we investigated aging and environmental influences on hippocampal-dependent tasks and on the morphology of an unbiased selected population of astrocytes from the molecular layer of the dentate gyrus, the main target of the perforant pathway. Six- and twenty-month-old female albino Swiss mice were housed from weaning in a standard or enriched environment, the latter including running wheels for exercise, and were tested for object recognition and contextual memories. Young adult and aged subjects, independent of environment, were able to distinguish familiar from novel objects. All experimental groups, except aged mice from the standard environment, distinguished stationary from displaced objects. Young adult but not aged mice, independent of environment, were able to distinguish older from recent objects. Only young mice from an enriched environment were able to distinguish novel from familiar contexts. Unbiased selected astrocytes from the molecular layer of the dentate gyrus were reconstructed in three dimensions and classified using hierarchical cluster analysis of bimodal or multimodal morphological features. We found two morphological phenotypes of astrocytes and designated as type I the astrocytes that exhibited significantly higher values of morphological complexity compared with type II. Complexity = [Sum of the terminal orders + Number of terminals] × [Total branch length/Number of primary branches]. On average, type I morphological complexity seems to be much more sensitive to age and environmental influences than that of type II. Indeed, aging and environmental impoverishment interact and reduce the morphological complexity of type I astrocytes to the point that they can no longer be distinguished from type II. We suggest that these two types of astrocytes may have different physiological roles and that the detrimental effects of aging on memory in mice from a standard environment may be associated with a reduction in astrocyte morphological diversity.
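
    The complexity index quoted above translates directly into code; the following small function is a literal transcription with illustrative argument names and example values.

      # Direct transcription of the morphological complexity index quoted above;
      # the argument names and the example numbers are illustrative.
      def complexity(terminal_orders, n_terminals, total_branch_length, n_primary):
          return (sum(terminal_orders) + n_terminals) * (total_branch_length / n_primary)

      # e.g. an astrocyte with terminal branch orders [2, 3, 3, 4], 4 terminals,
      # 850 um of total branch length and 5 primary branches:
      print(complexity([2, 3, 3, 4], 4, 850.0, 5))  # -> 2720.0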

  20. Variable Selection in the Presence of Missing Data: Imputation-based Methods.

    PubMed

    Zhao, Yize; Long, Qi

    2017-01-01

    Variable selection plays an essential role in regression analysis as it identifies important variables that are associated with outcomes and is known to improve the predictive accuracy of resulting models. Variable selection methods have been widely investigated for fully observed data. However, in the presence of missing data, methods for variable selection need to be carefully designed to account for missing data mechanisms and the statistical techniques used for handling missing data. Since imputation is arguably the most popular method for handling missing data due to its ease of use, statistical methods for variable selection that are combined with imputation are of particular interest. These methods, valid under the assumptions of missing at random (MAR) and missing completely at random (MCAR), largely fall into three general strategies. The first strategy applies existing variable selection methods to each imputed dataset and then combines the variable selection results across all imputed datasets. The second strategy applies existing variable selection methods to stacked imputed datasets. The third strategy combines resampling techniques such as the bootstrap with imputation. Despite recent advances, this area remains under-developed and offers fertile ground for further research.
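
    The first strategy lends itself to a compact sketch: impute several times, run a selector on each completed dataset, and combine by selection frequency. The imputer, the lasso selector, and the 50% vote threshold below are illustrative assumptions, not prescriptions from the paper.

      import numpy as np
      from sklearn.experimental import enable_iterative_imputer  # noqa: F401
      from sklearn.impute import IterativeImputer
      from sklearn.linear_model import LassoCV

      rng = np.random.default_rng(0)
      X = rng.normal(size=(200, 10))
      y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)
      X[rng.random(X.shape) < 0.1] = np.nan         # 10% of values missing (MCAR)

      M, counts = 20, np.zeros(X.shape[1])
      for m in range(M):
          X_imp = IterativeImputer(sample_posterior=True,
                                   random_state=m).fit_transform(X)
          coef = LassoCV(cv=5).fit(X_imp, y).coef_  # lasso selection on this imputation
          counts += coef != 0

      selected = np.where(counts / M >= 0.5)[0]     # majority-vote combination
      print(selected)                               # expect variables 0 and 1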

  1. One-shot estimate of MRMC variance: AUC.

    PubMed

    Gallas, Brandon D

    2006-03-01

    One popular study design for estimating the area under the receiver operating characteristic curve (AUC) is the one in which a set of readers reads a set of cases: a fully crossed design in which every reader reads every case. The variability of the subsequent reader-averaged AUC has two sources: the multiple readers and the multiple cases (MRMC). In this article, we present a nonparametric estimate for the variance of the reader-averaged AUC that is unbiased and does not use resampling tools. The one-shot estimate is based on the MRMC variance derived by the mechanistic approach of Barrett et al. (2005), as well as the nonparametric variance of a single-reader AUC derived in the literature on U statistics. We investigate the bias and variance properties of the one-shot estimate through a set of Monte Carlo simulations with simulated model observers and images. The different simulation configurations vary numbers of readers and cases, amounts of image noise and internal noise, as well as how the readers are constructed. We compare the one-shot estimate to a method that uses the jackknife resampling technique with an analysis of variance model at its foundation (Dorfman et al. 1992). The name one-shot highlights that resampling is not used. The one-shot and jackknife estimators behave similarly, with the one-shot being marginally more efficient when the number of cases is small. We have derived a one-shot estimate of the MRMC variance of AUC that is based on a probabilistic foundation with limited assumptions, is unbiased, and compares favorably to an established estimate.
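
    For orientation, the following sketch computes the reader-averaged AUC via the two-sample U-statistic (Mann-Whitney) kernel that underlies such variance derivations; the one-shot variance estimator itself involves additional moment terms not reproduced here, and the data are simulated.

      import numpy as np

      def auc_u_statistic(scores_diseased, scores_healthy):
          # kernel s(x, y) = 1 if x > y, 0.5 if tied, 0 otherwise
          diff = scores_diseased[:, None] - scores_healthy[None, :]
          kernel = (diff > 0) + 0.5 * (diff == 0)
          return kernel.mean()

      rng = np.random.default_rng(0)
      healthy = rng.normal(0.0, 1.0, size=(5, 60))    # 5 readers x 60 healthy cases
      diseased = rng.normal(1.0, 1.0, size=(5, 40))   # same readers, 40 diseased cases
      readers_auc = [auc_u_statistic(diseased[r], healthy[r]) for r in range(5)]
      print("reader-averaged AUC:", np.mean(readers_auc))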

  2. Is video gaming, or video game addiction, associated with depression, academic achievement, heavy episodic drinking, or conduct problems?

    PubMed

    Brunborg, Geir Scott; Mentzoni, Rune Aune; Frøyland, Lars Roar

    2014-03-01

    While the relationships between video game use and negative consequences are debated, the relationships between video game addiction and negative consequences are fairly well established. However, previous studies suffer from methodological weaknesses that may have caused biased results. There is a need for further investigation that benefits from the use of methods that avoid omitted variable bias. Two-wave panel data were used from two surveys of 1,928 Norwegian adolescents aged 13 to 17 years. The surveys included measures of video game use, video game addiction, depression, heavy episodic drinking, academic achievement, and conduct problems. The data were analyzed using first-differencing, a regression method that is unbiased by time-invariant individual factors. Video game addiction was related to depression, lower academic achievement, and conduct problems, but time spent on video games was not related to any of the studied negative outcomes. The findings were in line with a growing number of studies that have failed to find relationships between time spent on video games and negative outcomes. The current study is also consistent with previous studies in that video game addiction was related to other negative outcomes, but it made the added contribution that the relationships are unbiased by time-invariant individual effects. However, future research should aim at establishing the temporal order of the supposed causal effects. Spending time playing video games does not involve negative consequences, but adolescents who experience problems related to video games are likely to also experience problems in other facets of life.
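
    First-differencing is easy to demonstrate: differencing two waves cancels any time-invariant individual effect before the regression is run. The sketch below uses simulated data with hypothetical variable names, not the survey data.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      n = 500
      individual_effect = rng.normal(size=n)           # unobserved, time-invariant
      addiction = rng.normal(size=(2, n)) + individual_effect
      depression = 0.4 * addiction + individual_effect + rng.normal(size=(2, n))

      d_y = depression[1] - depression[0]              # differencing cancels the
      d_x = addiction[1] - addiction[0]                # individual effect
      fit = sm.OLS(d_y, sm.add_constant(d_x)).fit()
      print(fit.params)                                # slope near the true 0.4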

  4. Negative Differential Resistance (NDR) frequency conversion with gain

    NASA Technical Reports Server (NTRS)

    Hwu, R. J.; Alm, R. W.; Lee, S. C.

    1992-01-01

    The dependence of the I-V characteristic of negative differential resistance (NDR) devices on the power level and frequency of the rf input signal has been theoretically analyzed with a modified large- and small-signal nonlinear circuit analysis program. The NDR devices used in this work include both the tunnel diode (without antisymmetry in the I-V characteristic) and resonant-tunneling devices (with antisymmetry in the I-V characteristic). Absolute negative conductance can be found in a zero-biased resonant tunneling device when the applied pump power is within a small range. This study verifies the work of Sollner et al. Variable negative conductances at the fundamental and harmonic frequencies can also be obtained from both unbiased and biased tunnel diodes. The magnitude of the negative conductances can be adjusted by varying the pump amplitude, a very useful circuit property. However, the voltage range over which the negative conductance occurs moves towards the more positive side of the voltage axis with increasing frequency. Furthermore, the range of pumping amplitude over which negative conductance is obtained varies with the parasitics (resistance and capacitance) of the device. The theoretical observation of the dependence of the I-V characteristic of NDR devices on the power and frequency of the applied pump signal is supported by experimental results. In addition, novel functions of an NDR device, such as a self-oscillating frequency multiplier and a mixer with gain, have been experimentally demonstrated. An unbiased oscillator has also been successfully realized with an NDR device with an antisymmetrical I-V characteristic. Finally, the applications of these device functions are discussed.

  5. Spatial distribution, sampling precision and survey design optimisation with non-normal variables: The case of anchovy (Engraulis encrasicolus) recruitment in Spanish Mediterranean waters

    NASA Astrophysics Data System (ADS)

    Tugores, M. Pilar; Iglesias, Magdalena; Oñate, Dolores; Miquel, Joan

    2016-02-01

    In the Mediterranean Sea, the European anchovy (Engraulis encrasicolus) plays a key role in ecological and economic terms. Ensuring stock sustainability requires the provision of crucial information, such as the species' spatial distribution and unbiased abundance and precision estimates, so that management strategies can be defined (e.g. fishing quotas, temporal closure areas or marine protected areas, MPAs). Furthermore, the estimation of the precision of global abundance at different sampling intensities can be used for survey design optimisation. Geostatistics provides a priori unbiased estimates of the spatial structure, global abundance and precision for autocorrelated data. However, its application to non-Gaussian data introduces difficulties into the analysis and can compromise robustness and unbiasedness. The present study applied intrinsic geostatistics in two dimensions in order to (i) analyse the spatial distribution of anchovy in Spanish Western Mediterranean waters during the species' recruitment season, (ii) produce distribution maps, (iii) estimate global abundance and its precision, (iv) analyse the effect of changing the sampling intensity on the precision of global abundance estimates and (v) evaluate the effects of several methodological options on the robustness of all the analysed parameters. The results suggested that while the spatial structure was usually non-robust to the tested methodological options when working with the original dataset, it became more robust for the transformed datasets (especially the log-backtransformed dataset). The global abundance was always highly robust, and the global precision was highly or moderately robust to most of the methodological options, except for data transformation.
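
    The first step of such an intrinsic geostatistical analysis, the empirical semivariogram, can be sketched in a few lines; coordinates, values, and lag bins below are simulated stand-ins.

      import numpy as np

      def semivariogram(coords, values, bin_edges):
          d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
          sq = 0.5 * (values[:, None] - values[None, :]) ** 2
          iu = np.triu_indices(len(values), k=1)           # each pair counted once
          gamma = []
          for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
              mask = (d[iu] >= lo) & (d[iu] < hi)
              gamma.append(sq[iu][mask].mean() if mask.any() else np.nan)
          return np.array(gamma)

      rng = np.random.default_rng(0)
      coords = rng.uniform(0, 100, size=(300, 2))          # station positions (km)
      values = np.sin(coords[:, 0] / 20) + rng.normal(0, 0.2, 300)
      print(semivariogram(coords, values, np.arange(0, 60, 10)))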

  6. Bedload and Total Load Sediment Transport Equations for Rough Open-Channel Flow

    NASA Astrophysics Data System (ADS)

    Abrahams, A. D.; Gao, P.

    2001-12-01

    The total sediment load transported by an open-channel flow may be divided into bedload and suspended load. Bedload transport occurs by saltation at low shear stress and by sheetflow at high shear stress. Dimensional analysis is used to identify the dimensionless variables that control the transport rate of noncohesive sediments over a plane bed, and regression analysis is employed to isolate the significant variables and determine the values of the coefficients. In the general bedload transport equation (i.e. for saltation and sheetflow) the dimensionless bedload transport rate is a function of the dimensionless shear stress, the friction factor, and an efficiency coefficient. For sheetflow the last term approaches 1, so that the bedload transport rate becomes a function of just the dimensionless shear stress and the friction factor. The dimensional analysis indicates that the dimensionless total load transport rate is a function of the dimensionless bedload transport rate and the dimensionless settling velocity of the sediment. Predicted values of the transport rates are graphed against the computed values of these variables for 505 flume experiments reported in the literature. These graphs indicate that the equations developed in this study give good unbiased predictions of both the bedload transport rate and total load transport rate over a wide range of conditions.

  7. Volcanic influence on centennial to millennial Holocene Greenland temperature change.

    PubMed

    Kobashi, Takuro; Menviel, Laurie; Jeltsch-Thömmes, Aurich; Vinther, Bo M; Box, Jason E; Muscheler, Raimund; Nakaegawa, Toshiyuki; Pfister, Patrik L; Döring, Michael; Leuenberger, Markus; Wanner, Heinz; Ohmura, Atsumu

    2017-05-03

    Solar variability has been hypothesized to be a major driver of North Atlantic millennial-scale climate variations through the Holocene, along with orbitally induced insolation change. However, another important climate driver, volcanic forcing, has generally been underestimated prior to the past 2,500 years, partly owing to the lack of proper proxy temperature records. Here, we reconstruct seasonally unbiased and physically constrained Greenland Summit temperatures over the Holocene using argon and nitrogen isotopes within trapped air in a Greenland ice core (GISP2). We show that a series of volcanic eruptions through the Holocene played an important role in driving centennial- to millennial-scale temperature changes in Greenland. The reconstructed Greenland temperature exhibits significant millennial correlations with K+ and Na+ ions in the GISP2 ice core (proxies for atmospheric circulation patterns) and with δ18O of Oman and Chinese Dongge cave stalagmites (proxies for monsoon activity), indicating that the reconstructed temperature contains hemispheric signals. Climate model simulations forced with the volcanic forcing further suggest that a series of large volcanic eruptions induced hemisphere-wide centennial- to millennial-scale variability through ocean/sea-ice feedbacks. Therefore, we conclude that volcanic activity played a critical role in driving centennial- to millennial-scale Holocene temperature variability in Greenland and likely beyond.

  8. The role of environmental variables in structuring landscape-scale species distributions in seafloor habitats.

    PubMed

    Kraan, Casper; Aarts, Geert; Van der Meer, Jaap; Piersma, Theunis

    2010-06-01

    Ongoing statistical sophistication allows a shift from describing species' spatial distributions toward statistically disentangling the possible roles of environmental variables in shaping species distributions. Based on a landscape-scale benthic survey in the Dutch Wadden Sea, we show the merits of spatially explicit generalized estimating equations (GEE). The intertidal macrozoobenthic species, Macoma balthica, Cerastoderma edule, Marenzelleria viridis, Scoloplos armiger, Corophium volutator, and Urothoe poseidonis served as test cases, with median grain-size and inundation time as typical environmental explanatory variables. GEEs outperformed spatially naive generalized linear models (GLMs), and removed much residual spatial structure, indicating the importance of median grain-size and inundation time in shaping landscape-scale species distributions in the intertidal. GEE regression coefficients were smaller than those attained with GLM, and GEE standard errors were larger. The best fitting GEE for each species was used to predict species' density in relation to median grain-size and inundation time. Although no drastic changes were noted compared to previous work that described habitat suitability for benthic fauna in the Wadden Sea, our predictions provided more detailed and unbiased estimates of the determinants of species-environment relationships. We conclude that spatial GEEs offer the necessary methodological advances to further steps toward linking pattern to process.
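
    A GEE of the kind described can be fitted with statsmodels; the sketch below uses simulated data with hypothetical column and group names, a Poisson mean model and an exchangeable working correlation as illustrative choices.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(0)
      df = pd.DataFrame({
          "block": np.repeat(np.arange(30), 10),           # spatial blocks
          "grain_size": rng.normal(150, 30, 300),          # median grain-size (um)
          "inundation": rng.uniform(0, 1, 300),            # fraction of time flooded
      })
      lam = np.exp(0.01 * (df.grain_size - 150) + 1.2 * df.inundation)
      df["density"] = rng.poisson(lam)                     # counts per core

      fit = smf.gee("density ~ grain_size + inundation", groups="block", data=df,
                    family=sm.families.Poisson(),
                    cov_struct=sm.cov_struct.Exchangeable()).fit()
      print(fit.summary())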

  9. A data-driven approach for modeling post-fire debris-flow volumes and their uncertainty

    USGS Publications Warehouse

    Friedel, Michael J.

    2011-01-01

    This study demonstrates the novel application of genetic programming to evolve nonlinear post-fire debris-flow volume equations from variables associated with a data-driven conceptual model of the western United States. The search space is constrained using a multi-component objective function that simultaneously minimizes root-mean-squared and unit errors in the evolution of the fittest equations. An optimization technique is then used to estimate the limits of the nonlinear prediction uncertainty associated with the debris-flow equations. In contrast to a published three-variable multiple linear regression equation, linking basin area with slopes greater than or equal to 30 percent, burn severity characterized as area burned moderate plus high, and total storm rainfall, the data-driven approach discovers many nonlinear and several dimensionally consistent equations that are unbiased and have less prediction uncertainty. Of the nonlinear equations, the best performance (lowest prediction uncertainty) is achieved when using three variables: average basin slope, total burned area, and total storm rainfall. Further reduction in uncertainty is possible for the nonlinear equations when dimensional consistency is not a priority and by subsequently applying a gradient solver to the fittest solutions. The data-driven modeling approach can be applied to nonlinear multivariate problems in all fields of study.

  10. Assessment of flood susceptible areas using spatially explicit, probabilistic multi-criteria decision analysis

    NASA Astrophysics Data System (ADS)

    Tang, Zhongqian; Zhang, Hua; Yi, Shanzhen; Xiao, Yangfan

    2018-03-01

    GIS-based multi-criteria decision analysis (MCDA) is increasingly used to support flood risk assessment. However, conventional GIS-MCDA methods fail to adequately represent spatial variability and are accompanied by considerable uncertainty. It is, thus, important to incorporate spatial variability and uncertainty into GIS-based decision analysis procedures. This research develops a spatially explicit, probabilistic GIS-MCDA approach for the delineation of potentially flood-susceptible areas. The approach integrates the probabilistic and the local ordered weighted averaging (OWA) methods via Monte Carlo simulation, to take into account the uncertainty related to criteria weights, the spatial heterogeneity of preferences, and the risk attitude of the analyst. The approach is applied in a pilot study of Gucheng County, central China, heavily affected by the hazardous 2012 flood. A GIS database of six geomorphological and hydrometeorological factors for the evaluation of susceptibility was created. Moreover, uncertainty and sensitivity analyses were performed to investigate the robustness of the model. The results indicate that the ensemble method improves the robustness of the model outcomes with respect to variation in criteria weights and identifies which criteria weights are most responsible for the variability of model outcomes. Therefore, the proposed approach is an improvement over the conventional deterministic method and can provide a more rational, objective and unbiased tool for flood susceptibility evaluation.
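
    The core Monte Carlo-OWA idea can be sketched as follows: draw criteria weights from a distribution expressing their uncertainty, aggregate each raster cell with an ordered weighted average, and summarize the ensemble. This is a simplified rendering of the method (neutral order weights, Dirichlet weight uncertainty) with simulated values, not the authors' implementation.

      import numpy as np

      rng = np.random.default_rng(0)
      n_cells, n_criteria, n_draws = 1000, 6, 500
      criteria = rng.uniform(0, 1, size=(n_cells, n_criteria))   # standardized layers
      base_weights = np.array([0.25, 0.20, 0.20, 0.15, 0.10, 0.10])

      scores = np.empty((n_draws, n_cells))
      order_weights = np.full(n_criteria, 1 / n_criteria)        # neutral risk attitude
      for k in range(n_draws):
          w = rng.dirichlet(base_weights * 50)                   # uncertain criteria weights
          weighted = np.sort(criteria * w * n_criteria, axis=1)[:, ::-1]  # reorder descending
          scores[k] = weighted @ order_weights                   # OWA aggregation per cell

      print("mean susceptibility:", scores.mean(axis=0)[:5])
      print("std (uncertainty):  ", scores.std(axis=0)[:5])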

  11. Targeted estimation of nuisance parameters to obtain valid statistical inference.

    PubMed

    van der Laan, Mark J

    2014-01-01

    In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. As a particular special case, we also demonstrate the required targeting of the propensity score for the inverse probability of treatment weighted estimator using super-learning to fit the propensity score.
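
    Not the authors' C-TMLE, but a useful reference point: the following sketch estimates a treatment-specific mean with a doubly robust AIPW estimator in which both nuisance parameters (propensity score and outcome regression) are fit by machine learning. All data and tuning choices are simulated assumptions.

      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

      rng = np.random.default_rng(0)
      n = 2000
      W = rng.normal(size=(n, 3))                      # baseline covariates
      p = 1 / (1 + np.exp(-W[:, 0]))                   # true propensity score
      A = rng.binomial(1, p)                           # binary treatment
      Y = W[:, 0] + 2 * A + rng.normal(size=n)         # outcome; E[Y(1)] = 2

      g = GradientBoostingClassifier().fit(W, A).predict_proba(W)[:, 1]  # propensity fit
      Q = GradientBoostingRegressor().fit(W[A == 1], Y[A == 1])          # outcome fit on treated
      Q1 = Q.predict(W)                                 # predicted outcome under A = 1

      psi = np.mean(Q1 + A / np.clip(g, 0.01, 1) * (Y - Q1))  # AIPW estimate of E[Y(1)]
      print(psi)                                        # should be near the true 2.0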

  12. Risk-Stratified Imputation in Survival Analysis

    PubMed Central

    Kennedy, Richard E.; Adragni, Kofi P.; Tiwari, Hemant K.; Voeks, Jenifer H.; Brott, Thomas G.; Howard, George

    2013-01-01

    Background: Censoring that is dependent on covariates associated with survival can arise in randomized trials due to changes in recruitment and eligibility criteria to minimize withdrawals, potentially leading to biased treatment effect estimates. Imputation approaches have been proposed to address censoring in survival analysis; and while these approaches may provide unbiased estimates of treatment effects, imputation of a large number of outcomes may over- or underestimate the associated variance based on the imputation pool selected. Purpose: We propose an improved method, risk-stratified imputation, as an alternative to address withdrawal related to the risk of events in the context of time-to-event analyses. Methods: Our algorithm performs imputation from a pool of replacement subjects with similar values of both treatment and covariate(s) of interest, that is, from a risk-stratified sample. This stratification prior to imputation addresses the requirement of time-to-event analysis that censored observations be representative of all other observations in the risk group with similar exposure variables. We compared our risk-stratified imputation to case deletion and bootstrap imputation in a simulated dataset in which the covariate of interest (study withdrawal) was related to treatment. A motivating example from a recent clinical trial is also presented to demonstrate the utility of our method. Results: In our simulations, risk-stratified imputation gives estimates of treatment effect comparable to bootstrap and auxiliary variable imputation while avoiding the inaccuracies of the latter two in estimating the associated variance. Similar results were obtained in analysis of clinical trial data. Limitations: Risk-stratified imputation has little advantage over other imputation methods when covariates of interest are not related to treatment, although its performance is superior when covariates are related to treatment. Risk-stratified imputation is intended for categorical covariates, and may be sensitive to the width of the matching window if continuous covariates are used. Conclusions: The use of risk-stratified imputation should facilitate the analysis of many clinical trials, in which one group has a higher withdrawal rate that is related to treatment. PMID:23818434
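
    A minimal sketch of the stratified resampling idea, assuming a categorical risk covariate: censored subjects receive event times drawn from uncensored subjects of the same treatment-by-stratum cell whose event times exceed the censoring time. Details such as matching windows are omitted, and all data are simulated.

      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(0)
      n = 400
      df = pd.DataFrame({
          "treatment": rng.integers(0, 2, n),
          "risk_group": rng.integers(0, 3, n),           # categorical covariate
          "time": rng.exponential(10, n).round(2),
          "event": rng.binomial(1, 0.7, n),              # 0 = withdrawn/censored
      })

      imputed = df.copy()
      for _, idx in df.groupby(["treatment", "risk_group"]).groups.items():
          rows = df.loc[idx]
          pool = rows[rows.event == 1]                   # replacement pool: same stratum
          for i in rows[rows.event == 0].index:
              # impute from pool members whose event time exceeds the censoring time
              later = pool[pool.time > df.at[i, "time"]]
              if len(later):
                  draw = later.sample(1, random_state=int(rng.integers(10**9)))
                  imputed.at[i, "time"] = draw.time.iloc[0]
                  imputed.at[i, "event"] = 1

      print(imputed.event.mean())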

  13. Unbiased and robust quantification of synchronization between spikes and local field potential.

    PubMed

    Li, Zhaohui; Cui, Dong; Li, Xiaoli

    2016-08-30

    In neuroscience, relating the spiking activity of individual neurons to the local field potential (LFP) of neural ensembles is an increasingly useful approach for studying rhythmic neuronal synchronization. Many methods have been proposed to measure the strength of the association between spikes and rhythms in LFP recordings, and most existing measures depend on the total number of spikes. In the present work, we introduce a robust approach for quantifying spike-LFP synchronization which performs reliably for limited samples of data. The measure is termed spike-triggered correlation matrix synchronization (SCMS); it takes LFP segments centered on each spike as multi-channel signals and calculates the index of spike-LFP synchronization by constructing a correlation matrix. Simulations based on artificial data show that the SCMS output is almost unchanged by sample size. This property is of crucial importance when making comparisons between different experimental conditions. When applied to actual neuronal data recorded from the monkey primary visual cortex, it is found that the spike-LFP synchronization strength shows orientation selectivity to drifting gratings. In comparison to another unbiased method, pairwise phase consistency (PPC), the proposed SCMS behaves better for noisy spike trains in numerical simulations. This study demonstrates the basic idea and calculation process of the SCMS method. Given its unbiasedness and robustness, the measure is of great advantage for characterizing the synchronization between spike trains and rhythms present in the LFP. Copyright © 2016 Elsevier B.V. All rights reserved.
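
    One plausible reading of the construction, sketched below: LFP snippets centered on each spike form the rows of a matrix, their correlation matrix is computed, and the largest eigenvalue is normalized into a 0-1 locking index. The exact SCMS normalization in the paper may differ; the data here are simulated.

      import numpy as np

      def scms_like_index(lfp, spike_idx, half_win):
          segs = np.array([lfp[i - half_win:i + half_win]
                           for i in spike_idx
                           if half_win <= i < len(lfp) - half_win])
          C = np.corrcoef(segs)                    # spike-by-spike correlation matrix
          lam_max = np.linalg.eigvalsh(C)[-1]
          return (lam_max - 1) / (len(segs) - 1)   # 0 = no locking, 1 = perfect locking

      rng = np.random.default_rng(0)
      t = np.arange(100_000)
      lfp = np.sin(2 * np.pi * t / 250) + 0.5 * rng.normal(size=t.size)
      locked = np.where(np.sin(2 * np.pi * t / 250) > 0.99)[0][::3]  # phase-locked spikes
      random_spk = rng.integers(500, 99_500, size=len(locked))       # unlocked control
      print(scms_like_index(lfp, locked, 125), scms_like_index(lfp, random_spk, 125))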

  14. Transposon-mediated generation of BCR-ABL1-expressing transgenic cell lines for unbiased sensitivity testing of tyrosine kinase inhibitors

    PubMed Central

    Berkowitsch, Bettina; Koenig, Margit; Haas, Oskar A.; Hoermann, Gregor; Valent, Peter; Lion, Thomas

    2016-01-01

    Point mutations in the ABL1 kinase domain are an important mechanism of resistance to tyrosine kinase inhibitors (TKI) in BCR-ABL1-positive and, as recently shown, BCR-ABL1-like leukemias. The Ba/F3 cell line lentivirally transduced with mutant BCR-ABL1 constructs is widely used for in vitro sensitivity testing and response prediction to tyrosine kinase inhibitors. The transposon-based Sleeping Beauty system presented here offers several advantages over lentiviral transduction, including the absence of biosafety issues, faster generation of transgenic cell lines, and greater efficacy in introducing large gene constructs. Nevertheless, both methods can mediate multiple insertions in the genome. Here we show that multiple BCR-ABL1 insertions result in elevated IC50 levels for individual TKIs, thus overestimating the actual resistance of mutant subclones. We have therefore established flow-sorting-based fractionation of BCR-ABL1-transformed Ba/F3 cells, facilitating efficient enrichment of cells carrying single-site insertions, as demonstrated by FISH analysis. Fractions of unselected Ba/F3 cells not only showed a greater number of BCR-ABL1 hybridization signals, but also revealed higher IC50 values for the TKIs tested. The data presented highlight the need to carefully select transfected cells by flow-sorting, and to control insertion numbers by FISH and real-time PCR, to permit unbiased in vitro testing of drug resistance. PMID:27801667

  15. Short-term and long-term memory deficits in handedness learning in mice with absent corpus callosum and reduced hippocampal commissure.

    PubMed

    Ribeiro, Andre S; Eales, Brenda A; Biddle, Fred G

    2013-05-15

    The corpus callosum (CC) and hippocampal commissure (HC) are major interhemispheric connections whose role in brain function and behavior is fascinating and contentious. Paw preference of laboratory mice is a genetically regulated, adaptive behavior, continuously shaped by training and learning. We studied the variation of paw preference with training in mice of the 9XCA/WahBid ('9XCA') recombinant inbred strain, selected for complete absence of the CC and a severely reduced HC. We measured sequences of paw choices in 9XCA mice in two training sessions in unbiased test chambers, separated by one week. We compared them with sequences of paw choices in model non-learner mice that have random unbiased paw choices and with those of C57BL/6JBid ('C57BL/6J') mice that have normal interhemispheric connections and learn a paw preference. Positive autocorrelation between successive paw choices during each session and change in paw-preference bias between sessions indicate that 9XCA mice have weak, but not null, learning skills. We tested the effect of the forebrain commissural defect on paw-preference learning with the independent BTBR T+ tf/J ('BTBR') mouse strain, which has a genetically identical, non-complementing commissural trait. BTBR has weak short-term and long-term memory skills, identical to 9XCA. The results provide strong evidence that the CC and HC contribute to memory function and the formation of paw-preference biases. Copyright © 2013 Elsevier B.V. All rights reserved.

  16. Towards a sampling strategy for the assessment of forest condition at European level: combining country estimates.

    PubMed

    Travaglini, Davide; Fattorini, Lorenzo; Barbati, Anna; Bottalico, Francesca; Corona, Piermaria; Ferretti, Marco; Chirici, Gherardo

    2013-04-01

    A correct characterization of the status and trend of forest condition is essential to support reporting processes at national and international level. An international forest condition monitoring has been implemented in Europe since 1987 under the auspices of the International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests). The monitoring is based on harmonized methodologies, with individual countries being responsible for its implementation. Due to inconsistencies and problems in sampling design, however, the ICP Forests network is not able to produce reliable quantitative estimates of forest condition at European and sometimes at country level. This paper proposes (1) a set of requirements for status and change assessment and (2) a harmonized sampling strategy able to provide unbiased and consistent estimators of forest condition parameters and of their changes at both country and European level. Under the assumption that a common definition of forest holds among European countries, monitoring objectives, parameters of concern and accuracy indexes are stated. On the basis of fixed-area plot sampling performed independently in each country, an unbiased and consistent estimator of forest defoliation indexes is obtained at both country and European level, together with conservative estimators of their sampling variance and power in the detection of changes. The strategy adopts a probabilistic sampling scheme based on fixed-area plots selected by means of systematic or stratified schemes. Operative guidelines for its application are provided.
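
    The two-stage logic of combining country estimates can be sketched with a toy example: country means of plot-level defoliation and their variances are combined into a European index with forest-area weights, assuming independent country surveys. All numbers are illustrative.

      import numpy as np

      rng = np.random.default_rng(0)
      countries = {"A": (900, 1.2e6), "B": (400, 0.5e6), "C": (1500, 2.8e6)}  # (plots, forest ha)

      means, variances, areas = [], [], []
      for name, (n_plots, area) in countries.items():
          defol = rng.beta(2, 6, n_plots) * 100           # % defoliation per plot
          means.append(defol.mean())
          variances.append(defol.var(ddof=1) / n_plots)   # variance of the country mean
          areas.append(area)

      w = np.array(areas) / sum(areas)                    # weights proportional to forest area
      eu_mean = w @ np.array(means)
      eu_var = w**2 @ np.array(variances)                 # assumes independent country surveys
      print(f"European defoliation index: {eu_mean:.1f}% (SE {np.sqrt(eu_var):.2f})")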

  17. Policy decision-making under scientific uncertainty: radiological risk assessment and the role of expert advisory groups.

    PubMed

    Mossman, Kenneth L

    2009-08-01

    Standard-setting agencies such as the U.S. Nuclear Regulatory Commission and the U.S. Environmental Protection Agency depend on advice from external expert advisory groups on matters of public policy and standard-setting. Authoritative bodies including the National Research Council and the National Council on Radiation Protection and Measurements provide analyses and recommendations that underpin technically and scientifically sound decision-making. In radiological protection the nature of the scientific evidence is such that risk assessment at radiation doses typically encountered in environmental and occupational settings is highly uncertain, and several policy alternatives are scientifically defensible. The link between science and policy is problematic. The fundamental issue is the failure to properly consider risk assessment, risk communication, and risk management and then consolidate them in a process that leads to sound policy. Authoritative bodies should serve as unbiased brokers of policy choices by providing balanced and objective scientific analyses. As long as the policy-decision environment is characterized by high scientific uncertainty and a lack of values consensus, advisory groups should present unbiased evaluations of all scientifically plausible alternatives and recommend selection criteria that decision makers can use in the policy-setting process. To do otherwise (e.g., by serving as single-position advocates) weakens decision-making by eliminating options and narrowing discussions of scientific perspectives. Understanding uncertainties and the limitations of available scientific information, and conveying such information to policy makers, remain key challenges for the technical and policy communities.

  18. Selection biases in empirical p(z) methods for weak lensing

    DOE PAGES

    Gruen, D.; Brimioulle, F.

    2017-02-23

    To measure the mass of foreground objects with weak gravitational lensing, one needs to estimate the redshift distribution of lensed background sources. This is commonly done in an empirical fashion, i.e. with a reference sample of galaxies of known spectroscopic redshift, matched to the source population. In this paper, we develop a simple decision tree framework that, under the ideal conditions of a large, purely magnitude-limited reference sample, allows an unbiased recovery of the source redshift probability density function p(z), as a function of magnitude and colour. We use this framework to quantify biases in empirically estimated p(z) caused by selection effects present in realistic reference and weak lensing source catalogues, namely (1) complex selection of reference objects by the targeting strategy and success rate of existing spectroscopic surveys and (2) selection of background sources by the success of object detection and shape measurement at low signal to noise. For intermediate-to-high redshift clusters, and for depths and filter combinations appropriate for ongoing lensing surveys, we find that (1) spectroscopic selection can cause biases above the 10 per cent level, which can be reduced to ≈5 per cent by optimal lensing weighting, while (2) selection effects in the shape catalogue bias mass estimates at or below the 2 per cent level. Finally, this illustrates the importance of completeness of the reference catalogues for empirical redshift estimation.
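
    A stripped-down version of the empirical estimator, assuming an ideal magnitude-limited reference sample: bin the reference galaxies in magnitude-colour cells and read off the redshift histogram of the cell matching each source. The paper's decision tree replaces this fixed grid; everything below is simulated.

      import numpy as np

      rng = np.random.default_rng(0)
      n_ref = 50_000
      z = rng.gamma(2.0, 0.4, n_ref)                        # spectroscopic redshifts
      mag = 20 + 2 * z + rng.normal(0, 0.5, n_ref)          # toy magnitude-redshift relation
      col = 1.5 * z / (1 + z) + rng.normal(0, 0.1, n_ref)   # toy colour-redshift relation

      mag_bins = np.linspace(19, 26, 15)
      col_bins = np.linspace(-0.5, 2.0, 11)
      cell = (np.digitize(mag, mag_bins), np.digitize(col, col_bins))

      def p_of_z(source_mag, source_col, z_grid=np.linspace(0, 4, 41)):
          m = (cell[0] == np.digitize(source_mag, mag_bins)) & \
              (cell[1] == np.digitize(source_col, col_bins))
          hist, _ = np.histogram(z[m], bins=z_grid, density=True)
          return hist                                       # estimated p(z) for that cell

      print(p_of_z(23.0, 0.9)[:10])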

  19. Data-driven confounder selection via Markov and Bayesian networks.

    PubMed

    Häggström, Jenny

    2018-06-01

    To estimate a causal effect on an outcome without bias, unconfoundedness is often assumed. If there is sufficient knowledge of the underlying causal structure, then existing confounder selection criteria can be used to select subsets of the observed pretreatment covariates, X, sufficient for unconfoundedness, if such subsets exist. Here, estimation of these target subsets is considered when the underlying causal structure is unknown. The proposed method is to model the causal structure by a probabilistic graphical model, for example, a Markov or Bayesian network, estimate this graph from observed data, and select the target subsets given the estimated graph. The approach is evaluated by simulation both in a high-dimensional setting where unconfoundedness holds given X and in a setting where unconfoundedness only holds given subsets of X. Several common target subsets are investigated, and the selected subsets are compared with respect to accuracy in estimating the average causal effect. The proposed method is implemented with existing software that can easily handle high-dimensional data, in terms of large samples and a large number of covariates. The results from the simulation study show that, if unconfoundedness holds given X, this approach is very successful in selecting the target subsets, outperforming alternative approaches based on random forests and LASSO, and that the subset estimating the target subset containing all causes of the outcome yields the smallest MSE in the average causal effect estimation. © 2017, The International Biometric Society.
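
    As a simplified stand-in for the network-based selection (the paper learns Markov or Bayesian networks), the sketch below fits a Gaussian graphical model with the graphical lasso and keeps covariates adjacent to both treatment and outcome in the estimated graph; the threshold and data are illustrative.

      import numpy as np
      from sklearn.covariance import GraphicalLassoCV

      rng = np.random.default_rng(0)
      n, p = 1000, 10
      X = rng.normal(size=(n, p))
      t = X[:, 0] + X[:, 1] + rng.normal(size=n)           # treatment depends on X0, X1
      y = 2 * t + X[:, 1] + X[:, 2] + rng.normal(size=n)   # outcome depends on t, X1, X2

      data = np.column_stack([X, t, y])
      prec = GraphicalLassoCV().fit(data).precision_       # sparse precision matrix
      adj = np.abs(prec) > 1e-3                            # estimated adjacency
      linked_t = np.where(adj[p, :p])[0]                   # covariates adjacent to t
      linked_y = np.where(adj[p + 1, :p])[0]               # covariates adjacent to y
      print("candidate confounders:", np.intersect1d(linked_t, linked_y))  # expect X1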

  20. Fitting N-mixture models to count data with unmodeled heterogeneity: Bias, diagnostics, and alternative approaches

    USGS Publications Warehouse

    Duarte, Adam; Adams, Michael J.; Peterson, James T.

    2018-01-01

    Monitoring animal populations is central to wildlife and fisheries management, and the use of N-mixture models toward these efforts has markedly increased in recent years. Nevertheless, relatively little work has evaluated estimator performance when basic assumptions are violated. Moreover, diagnostics to identify when bias in parameter estimates from N-mixture models is likely remain largely unexplored. We simulated count data sets using 837 combinations of detection probability, number of sample units, number of survey occasions, and type and extent of heterogeneity in abundance or detectability. We fit Poisson N-mixture models to these data, quantified the bias associated with each combination, and evaluated whether the parametric bootstrap goodness-of-fit (GOF) test can be used to indicate bias in parameter estimates. We also explored whether assumption violations can be diagnosed prior to fitting N-mixture models. In doing so, we propose a new model diagnostic, which we term the quasi-coefficient of variation (QCV). N-mixture models performed well when assumptions were met and detection probabilities were moderate (i.e., ≥0.3), and the performance of the estimator improved with increasing survey occasions and sample units. However, the magnitude of bias in estimated mean abundance with even slight amounts of unmodeled heterogeneity was substantial. The parametric bootstrap GOF test did not perform well as a diagnostic for bias in parameter estimates when detectability and sample sizes were low. The results indicate that the QCV is useful to diagnose potential bias and that potential bias associated with unidirectional trends in abundance or detectability can be diagnosed using Poisson regression. This study represents the most thorough assessment to date of assumption violations and diagnostics when fitting N-mixture models using the most commonly implemented error distribution. Unbiased estimates of population state variables are needed to properly inform management decision making. Therefore, we also discuss alternative approaches to yield unbiased estimates of population state variables using similar data types, and we stress that there is no substitute for an effective sample design that is grounded upon well-defined management objectives.
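
    For concreteness, the Poisson N-mixture likelihood can be written down and maximized directly; the sketch below marginalizes the latent abundances up to a bound K on simulated data. The value of K and the optimizer are arbitrary choices.

      import numpy as np
      from scipy.optimize import minimize
      from scipy.stats import binom, poisson

      rng = np.random.default_rng(0)
      lam_true, p_true, sites, visits, K = 5.0, 0.4, 150, 4, 60
      N = rng.poisson(lam_true, sites)                       # latent abundances
      y = rng.binomial(N[:, None], p_true, size=(sites, visits))  # repeated counts

      def negloglik(theta):
          lam, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
          Ns = np.arange(K + 1)
          prior = poisson.pmf(Ns, lam)                       # P(N = k)
          # P(y_i | N = k) for each site and each candidate abundance k
          lik = np.array([binom.pmf(y, k, p).prod(axis=1) for k in Ns])
          return -np.log(prior @ lik + 1e-300).sum()

      fit = minimize(negloglik, x0=[np.log(3), 0.0], method="Nelder-Mead")
      print(np.exp(fit.x[0]), 1 / (1 + np.exp(-fit.x[1])))   # ~ (5.0, 0.4)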

  1. Valence-Dependent Belief Updating: Computational Validation

    PubMed Central

    Kuzmanovic, Bojana; Rigoux, Lionel

    2017-01-01

    People tend to update beliefs about their future outcomes in a valence-dependent way: they are likely to incorporate good news and to neglect bad news. However, belief formation is a complex process which depends not only on motivational factors such as the desire for favorable conclusions, but also on multiple cognitive variables such as prior beliefs, knowledge about personal vulnerabilities and resources, and the size of the probabilities and estimation errors. Thus, we applied computational modeling in order to test for valence-induced biases in updating while formally controlling for relevant cognitive factors. We compared biased and unbiased Bayesian models of belief updating, and specified alternative models based on reinforcement learning. The experiment consisted of 80 trials with 80 different adverse future life events. In each trial, participants estimated the base rate of one of these events and estimated their own risk of experiencing the event before and after being confronted with the actual base rate. Belief updates corresponded to the difference between the two self-risk estimates. Valence-dependent updating was assessed by comparing trials with good news (better-than-expected base rates) with trials with bad news (worse-than-expected base rates). After receiving bad relative to good news, participants' updates were smaller and deviated more strongly from rational Bayesian predictions, indicating a valence-induced bias. Model comparison revealed that the biased (i.e., optimistic) Bayesian model of belief updating better accounted for data than the unbiased (i.e., rational) Bayesian model, confirming that the valence of the new information influenced the amount of updating. Moreover, alternative computational modeling based on reinforcement learning demonstrated higher learning rates for good than for bad news, as well as a moderating role of personal knowledge. Finally, in this specific experimental context, the approach based on reinforcement learning was superior to the Bayesian approach. The computational validation of valence-dependent belief updating represents a novel support for a genuine optimism bias in human belief formation. Moreover, the precise control of relevant cognitive variables justifies the conclusion that the motivation to adopt the most favorable self-referential conclusions biases human judgments. PMID:28706499
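
    The reinforcement-learning account reduces to an asymmetric delta rule; the sketch below shows one belief update with a larger learning rate for good news. The numerical learning rates are illustrative, not fitted values from the study.

      # Asymmetric delta-rule update of a self-risk estimate (values in %);
      # lr_good > lr_bad encodes the optimism bias described above.
      def update_belief(self_risk, base_rate, lr_good=0.6, lr_bad=0.3):
          error = base_rate - self_risk          # estimation error on this trial
          good_news = base_rate < self_risk      # event less likely than feared
          lr = lr_good if good_news else lr_bad  # valence-dependent learning rate
          return self_risk + lr * error

      print(update_belief(40.0, 20.0))  # good news: 40 -> 28 (update of 12)
      print(update_belief(40.0, 60.0))  # bad news:  40 -> 46 (update of  6)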

  3. Absorption and folding of melittin onto lipid bilayer membranes via unbiased atomic detail microsecond molecular dynamics simulation.

    PubMed

    Chen, Charles H; Wiedman, Gregory; Khan, Ayesha; Ulmschneider, Martin B

    2014-09-01

    Unbiased molecular simulation is a powerful tool to study the atomic details driving functional structural changes or folding pathways of highly fluid systems, which present great challenges experimentally. Here we apply unbiased long-timescale molecular dynamics simulation to study the ab initio folding and partitioning of melittin, a template amphiphilic membrane-active peptide. The simulations reveal that the peptide binds strongly to the lipid bilayer in an unstructured configuration. Interfacial folding results in a localized bilayer deformation. Akin to purely hydrophobic transmembrane segments, the surface-bound native helical conformer is highly resistant to thermal denaturation. Circular dichroism spectroscopy experiments confirm the strong binding and thermostability of the peptide. The study highlights the utility of molecular dynamics simulations for studying transient mechanisms in fluid lipid bilayer systems. This article is part of a Special Issue entitled: Interfacially Active Peptides and Proteins. Guest Editors: William C. Wimley and Kalina Hristova. Copyright © 2014. Published by Elsevier B.V.

  4. Mixed model approaches for diallel analysis based on a bio-model.

    PubMed

    Zhu, J; Weir, B S

    1996-12-01

    A MINQUE(1) procedure, i.e., the minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML) and MINQUE(θ), which uses the parameter values themselves as prior values. MINQUE(1) is almost as efficient as MINQUE(θ) for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jack-knife procedure is suggested for estimating the sampling variances of estimated variance and covariance components and of predicted genetic effects. Worked examples are given for the estimation of variance and covariance components and for the prediction of genetic merits.
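
    MINQUE with all prior values set to 1 amounts to solving one linear system in the variance components. The sketch below does this for a simple one-way mixed model rather than the full diallel bio-model, which has the same structure with more covariance components; all data are simulated.

      import numpy as np

      rng = np.random.default_rng(0)
      groups, per = 30, 8
      n = groups * per
      X = np.column_stack([np.ones(n), rng.normal(size=n)])   # fixed-effect design
      Z = np.kron(np.eye(groups), np.ones((per, 1)))          # random group effects
      y = X @ [1.0, 0.5] + Z @ rng.normal(0, np.sqrt(2.0), groups) + rng.normal(0, 1, n)

      V = [Z @ Z.T, np.eye(n)]            # covariance structures for (sigma_u^2, sigma_e^2)
      V0 = sum(V)                         # prior values all equal to 1
      V0i = np.linalg.inv(V0)
      Q = V0i - V0i @ X @ np.linalg.solve(X.T @ V0i @ X, X.T @ V0i)

      S = np.array([[np.trace(Q @ Vi @ Q @ Vj) for Vj in V] for Vi in V])
      q = np.array([y @ Q @ Vi @ Q @ y for Vi in V])
      theta = np.linalg.solve(S, q)       # quadratic unbiased component estimates
      print(theta)                        # ~ [2.0, 1.0]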

  5. Uncertainty relation based on unbiased parameter estimations

    NASA Astrophysics Data System (ADS)

    Sun, Liang-Liang; Song, Yong-Shun; Qiao, Cong-Feng; Yu, Sixia; Chen, Zeng-Bing

    2017-02-01

    Heisenberg's uncertainty relation has been extensively studied in the spirit of its well-known original form, in which the inaccuracy measures used exhibit some controversial properties and do not conform to quantum metrology, where the measurement precision is well defined in terms of estimation theory. In this paper, we treat the joint measurement of incompatible observables as a parameter estimation problem, i.e., estimating the parameters characterizing the statistics of the incompatible observables. Our crucial observation is that, in a sequential measurement scenario, the bias induced by the first unbiased measurement in the subsequent measurement can be eradicated by the information acquired, allowing one to extract unbiased information on the second measurement of an incompatible observable. In terms of Fisher information we propose a kind of information comparison measure and explore various types of trade-offs between the information gains and measurement precisions, which interpret the uncertainty relation as a surplus-variance trade-off over individual perfect measurements instead of a constraint on extracting complete information about incompatible observables.

  6. Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data.

    PubMed

    Rohrer, Sebastian G; Baumann, Knut

    2009-02-01

    Refined nearest neighbor analysis was recently introduced for the analysis of virtual screening benchmark data sets. It constitutes a technique from the field of spatial statistics and provides a mathematical framework for the nonparametric analysis of mapped point patterns. Here, refined nearest neighbor analysis is used to design benchmark data sets for virtual screening based on PubChem bioactivity data. A workflow is devised that purges data sets of compounds active against pharmaceutically relevant targets from unselective hits. Topological optimization using experimental design strategies monitored by refined nearest neighbor analysis functions is applied to generate corresponding data sets of actives and decoys that are unbiased with regard to analogue bias and artificial enrichment. These data sets provide a tool for Maximum Unbiased Validation (MUV) of virtual screening methods. The data sets and a software package implementing the MUV design workflow are freely available at http://www.pharmchem.tu-bs.de/lehre/baumann/MUV.html.

  7. Autocorrelation analysis for the unbiased determination of power-law exponents in single-quantum-dot blinking.

    PubMed

    Houel, Julien; Doan, Quang T; Cajgfinger, Thomas; Ledoux, Gilles; Amans, David; Aubret, Antoine; Dominjon, Agnès; Ferriol, Sylvain; Barbier, Rémi; Nasilowski, Michel; Lhuillier, Emmanuel; Dubertret, Benoît; Dujardin, Christophe; Kulzer, Florian

    2015-01-27

    We present an unbiased and robust analysis method for power-law blinking statistics in the photoluminescence of single nanoemitters, allowing us to extract both the bright- and dark-state power-law exponents from the emitters' intensity autocorrelation functions. As opposed to the widely used threshold method, our technique therefore does not require discriminating the emission levels of bright and dark states in the experimental intensity timetraces. We rely on the simultaneous recording of 450 emission timetraces of single CdSe/CdS core/shell quantum dots at a frame rate of 250 Hz with single photon sensitivity. Under these conditions, our approach can determine ON and OFF power-law exponents with a precision of 3% from a comparison to numerical simulations, even for shot-noise-dominated emission signals with an average intensity below 1 photon per frame and per quantum dot. These capabilities pave the way for the unbiased, threshold-free determination of blinking power-law exponents at the microsecond time scale.
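
    The raw ingredient of the analysis, the normalized intensity autocorrelation of a binned timetrace, is straightforward to compute; the power-law fitting that extracts the ON/OFF exponents is not reproduced in this sketch, and the blinking trace is simulated.

      import numpy as np

      def autocorrelation(intensity, max_lag):
          x = intensity - intensity.mean()
          acf = np.array([np.dot(x[:len(x) - k], x[k:]) / (len(x) - k)
                          for k in range(max_lag)])
          return acf / acf[0]                      # normalize to 1 at zero lag

      rng = np.random.default_rng(0)
      # toy blinking trace: two-state telegraph signal plus shot noise
      state = np.cumsum(rng.random(100_000) < 0.001) % 2
      trace = rng.poisson(0.2 + 0.8 * state)       # counts per 4 ms frame (toy values)
      print(autocorrelation(trace, 10)[:5])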

  8. Efficient Variable Selection Method for Exposure Variables on Binary Data

    NASA Astrophysics Data System (ADS)

    Ohno, Manabu; Tarumi, Tomoyuki

    In this paper, we propose a new method for selecting "robust" exposure variables. We define "robust" as the property that the same variable is selected from both the original data and perturbed data. There are few studies of effective methods for this selection problem. The problem of selecting exposure variables is almost the same as that of extracting correlation rules, apart from the robustness requirement. Brin et al. [Brin 97] suggested that correlation rules can be extracted efficiently on binary data using the chi-squared statistic of a contingency table, which has a monotone property. However, the chi-squared value itself is not monotone: as the dimension increases, even a completely independent variable set is easily judged to be dependent, so the method is not usable for selecting robust exposure variables. To select robust independent variables, we assume an anti-monotone property of independence and use the apriori algorithm. The apriori algorithm is one of the algorithms that find association rules in market-basket data; it exploits the anti-monotone property of the support measure defined for association rules. Independence does not strictly satisfy the anti-monotone property on the AIC of the independence probability model, but the tendency to do so is strong; therefore, variables selected under the anti-monotone assumption on the AIC are robust. Our method judges whether a given variable is an exposure variable for the independent variables by comparison of AIC values. Our numerical experiments show that the method can select robust exposure variables efficiently and precisely.

  9. Correction of Selection Bias in Survey Data: Is the Statistical Cure Worse Than the Bias?

    PubMed

    Hanley, James A

    2017-03-15

    In previous articles in the American Journal of Epidemiology (Am J Epidemiol. 2013;177(5):431-442) and American Journal of Public Health (Am J Public Health. 2013;103(10):1895-1901), Masters et al. reported age-specific hazard ratios for the contrasts in mortality rates between obesity categories. They corrected the observed hazard ratios for selection bias caused by what they postulated was the nonrepresentativeness of the participants in the National Health Interview Survey that increased with age, obesity, and ill health. However, it is possible that their regression approach to remove the alleged bias has not produced, and in general cannot produce, sensible hazard ratio estimates. First, one must consider how many nonparticipants there might have been in each category of obesity and of age at entry and how much higher the mortality rates would have to be in nonparticipants than in participants in these same categories. What plausible set of numerical values would convert the ("biased") decreasing-with-age hazard ratios seen in the data into the ("unbiased") increasing-with-age ratios that they computed? Can these values be encapsulated in (and can sensible values be recovered from) 1 additional internal variable in a regression model? Second, one must examine the age pattern of the hazard ratios that have been adjusted for selection. Without the correction, the hazard ratios are attenuated with increasing age. With it, the hazard ratios at older ages are considerably higher, but those at younger ages are well below 1. Third, one must test whether the regression approach suggested by Masters et al. would correct the nonrepresentativeness that increased with age and ill health that I introduced into real and hypothetical data sets. I found that the approach did not recover the hazard ratio patterns present in the unselected data sets: The corrections overshot the target at older ages and undershot it at lower ages. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.

  10. Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle.

    PubMed

    van Binsbergen, Rianne; Calus, Mario P L; Bink, Marco C A M; van Eeuwijk, Fred A; Schrooten, Chris; Veerkamp, Roel F

    2015-09-17

    In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data. Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training. Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed. Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.
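
    For reference, a generic GBLUP computation of the kind compared above can be sketched as follows; this is a textbook mixed-model solve on simulated genotypes, not the authors' pipeline, and the heritability value is arbitrary.

      import numpy as np

      rng = np.random.default_rng(3)
      n, m = 200, 1000                       # animals, SNPs
      freqs = rng.uniform(0.05, 0.95, m)
      M = rng.binomial(2, freqs, size=(n, m)).astype(float)   # 0/1/2 genotypes

      # VanRaden-style genomic relationship matrix
      Z = M - 2 * freqs
      G = Z @ Z.T / (2 * np.sum(freqs * (1 - freqs)))
      G += np.eye(n) * 1e-6                  # keep G invertible

      y = rng.normal(size=n)                 # placeholder phenotypes
      h2 = 0.5                               # assumed heritability
      lam = (1 - h2) / h2                    # residual-to-genetic variance ratio

      # mixed model equations for y = 1*mu + g + e, with g ~ N(0, G * sigma_g^2)
      X = np.ones((n, 1))
      lhs = np.block([[X.T @ X, X.T],
                      [X, np.eye(n) + lam * np.linalg.inv(G)]])
      rhs = np.concatenate([X.T @ y, y])
      sol = np.linalg.solve(lhs, rhs)
      mu, gebv = sol[0], sol[1:]             # fixed mean and genomic breeding values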

  11. Demand for health care in Denmark: results of a national sample survey using contingent valuation.

    PubMed

    Gyldmark, M; Morrison, G C

    2001-10-01

    In this paper we use willingness to pay (WTP) to elicit values for private insurance covering treatment for four different health problems. By way of obtaining these values, we test the viability of the contingent valuation method (CVM) and econometric techniques, respectively, as means of eliciting and analysing values from the general public. WTP responses from a Danish national sample survey, which was designed in accordance with existing guidelines, are analysed in terms of consistency and validity checks. Large numbers of zero responses are common in WTP studies, and are found here; therefore, the Heckman selectivity model and log-transformed OLS are employed. The selectivity model is rejected, but test results indicate that the lognormal model yields efficient and unbiased estimates. The results give confidence in the WTP estimates obtained and, more generally, in CVM as a means of valuing publicly provided goods and in econometrics as a tool for analysing WTP results containing many zero responses.
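
    A minimal Python sketch of the two estimators compared above, on simulated data: a hand-rolled probit first stage feeding an inverse-Mills-corrected OLS (the Heckman two-step), with the lognormal OLS as the special case that omits the correction term. Variable names are hypothetical, not the survey's actual covariates.

      import numpy as np
      from scipy.optimize import minimize
      from scipy.stats import norm

      rng = np.random.default_rng(4)
      n = 1000
      income = rng.normal(size=n)
      u, v = rng.multivariate_normal([0, 0], [[1, .5], [.5, 1]], n).T
      positive = (0.3 + 0.8 * income + u) > 0           # whether WTP > 0
      log_wtp = 1.0 + 0.5 * income + v                  # latent log-WTP

      Z = np.column_stack([np.ones(n), income])

      def neg_probit_ll(beta):
          p = norm.cdf(Z @ beta).clip(1e-9, 1 - 1e-9)
          return -(positive * np.log(p) + (~positive) * np.log(1 - p)).sum()

      beta = minimize(neg_probit_ll, np.zeros(2)).x     # step 1: probit participation
      mills = norm.pdf(Z @ beta) / norm.cdf(Z @ beta)   # inverse Mills ratio

      # step 2: OLS on the selected (WTP > 0) sample, augmented with the ratio
      sel = positive
      Xs = np.column_stack([Z[sel], mills[sel]])
      coef, *_ = np.linalg.lstsq(Xs, log_wtp[sel], rcond=None)
      # coef[:2] are selection-corrected slopes; an insignificant Mills
      # coefficient is evidence the simpler lognormal OLS suffices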

  12. Deficiencies in the reporting of VD and t1/2 in the FDA approved chemotherapy drug inserts

    PubMed Central

    D’Souza, Malcolm J.; Alabed, Ghada J.

    2011-01-01

    Since its release in 2006, the US Food and Drug Administration (FDA) final improved format for prescription drug labeling has revamped the comprehensiveness of drug inserts, including chemotherapy drugs. The chemotherapy drug “packets”, retrieved via the FDA website and other accredited drug information reporting agencies such as the Physician Drug Reference (PDR), are practically the only available unbiased summary of information. One objective is to impartially evaluate the reporting of useful pharmacokinetic parameters, in particular, Volume of Distribution (VD) and elimination half-life (t1/2), in randomly selected FDA approved chemotherapy drug inserts. The web-accessible portable document format (PDF) files for 30 randomly selected chemotherapy drugs are subjected to detailed search and the two parameters of interest are tabulated. The knowledge of the two parameters is essential in directing patient care as well as for clinical research and since the completeness of the core FDA recommendations has been found deficient, a detailed explanation of the impact of such deficiencies is provided. PMID:21643531

  13. Pulmonary infection by Yersinia pestis rapidly establishes a permissive environment for microbial proliferation.

    PubMed

    Price, Paul A; Jin, Jianping; Goldman, William E

    2012-02-21

    Disease progression of primary pneumonic plague is biphasic, consisting of a preinflammatory and a proinflammatory phase. During the long preinflammatory phase, bacteria replicate to high levels, seemingly uninhibited by normal pulmonary defenses. In a coinfection model of pneumonic plague, it appears that Yersinia pestis quickly creates a localized, dominant anti-inflammatory state that allows for the survival and rapid growth of both itself and normally avirulent organisms. Yersinia pseudotuberculosis, the relatively recent progenitor of Y. pestis, shows no similar trans-complementation effect, which is unprecedented among other respiratory pathogens. We demonstrate that the effectors secreted by the Ysc type III secretion system are necessary but not sufficient to mediate this apparent immunosuppression. Even an unbiased negative selection screen using a vast pool of Y. pestis mutants revealed no selection against any known virulence genes, demonstrating the transformation of the lung from a highly restrictive to a generally permissive environment during the preinflammatory phase of pneumonic plague.

  14. Twelve recommendations for integrating existing systematic reviews into new reviews: EPC guidance.

    PubMed

    Robinson, Karen A; Chou, Roger; Berkman, Nancy D; Newberry, Sydne J; Fu, Rongwei; Hartling, Lisa; Dryden, Donna; Butler, Mary; Foisy, Michelle; Anderson, Johanna; Motu'apuaka, Makalapua; Relevo, Rose; Guise, Jeanne-Marie; Chang, Stephanie

    2016-02-01

    As time and cost constraints in the conduct of systematic reviews increase, the need to consider the use of existing systematic reviews also increases. We developed guidance on the integration of systematic reviews into new reviews. A workgroup of methodologists from Evidence-based Practice Centers developed consensus-based recommendations. Discussions were informed by a literature scan and by interviews with organizations that conduct systematic reviews. Twelve recommendations were developed addressing selecting reviews, assessing risk of bias, qualitative and quantitative synthesis, and summarizing and assessing body of evidence. We provide preliminary guidance for an efficient and unbiased approach to integrating existing systematic reviews with primary studies in a new review. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Finite mixture model: A maximum likelihood estimation approach on time series data

    NASA Astrophysics Data System (ADS)

    Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad

    2014-09-01

    Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its desirable asymptotic properties: the estimator is consistent, and hence asymptotically unbiased, as the sample size increases to infinity. Moreover, as the sample size grows, the parameter estimates obtained by maximum likelihood have the smallest variance compared with other statistical methods. Maximum likelihood estimation is therefore adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
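
    A compact EM implementation of the maximum likelihood fit for a two-component normal mixture illustrates the estimation step; the data vector is a simulated placeholder for the price and exchange-rate series analysed in the paper.

      import numpy as np

      rng = np.random.default_rng(5)
      x = np.concatenate([rng.normal(-1, 0.5, 300), rng.normal(2, 1.0, 700)])

      w, mu, sd = np.array([.5, .5]), np.array([-2., 1.]), np.array([1., 1.])
      for _ in range(200):
          # E-step: posterior responsibility of each component for each point
          dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
          r = dens / dens.sum(axis=1, keepdims=True)
          # M-step: weighted maximum likelihood updates
          nk = r.sum(axis=0)
          w = nk / len(x)
          mu = (r * x[:, None]).sum(axis=0) / nk
          sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)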

  16. Generating high precision ionospheric ground-truth measurements

    NASA Technical Reports Server (NTRS)

    Komjathy, Attila (Inventor); Sparks, Lawrence (Inventor); Mannucci, Anthony J. (Inventor)

    2007-01-01

    A method, apparatus and article of manufacture provide ionospheric ground-truth measurements for use in a wide-area augmentation system (WAAS). Ionospheric pseudorange/code and carrier phase data as primary observables is received by a WAAS receiver. A polynomial fit is performed on the phase data that is examined to identify any cycle slips in the phase data. The phase data is then leveled. Satellite and receiver biases are obtained and applied to the leveled phase data to obtain unbiased phase-leveled ionospheric measurements that are used in a WAAS system. In addition, one of several measurements may be selected and data is output that provides information on the quality of the measurements that are used to determine corrective messages as part of the WAAS system.
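
    The leveling procedure can be caricatured in a few lines of Python: fit a polynomial to the phase observable, flag cycle slips as jumps in the residuals, and shift each continuous phase arc onto the code observable. The thresholds and the simulated arc are illustrative, not the patented algorithm's actual values.

      import numpy as np

      rng = np.random.default_rng(6)
      t = np.arange(600, dtype=float)                    # epochs
      iono = 5 + 0.01 * t + 1e-5 * t ** 2                # smooth ionospheric delay
      code = iono + rng.normal(0, 0.5, t.size)           # noisy pseudorange observable
      phase = iono + 2.3 + rng.normal(0, 0.02, t.size)   # precise but biased phase
      phase[400:] += 1.7                                  # injected cycle slip

      resid = phase - np.polyval(np.polyfit(t, phase, 3), t)
      slips = np.where(np.abs(np.diff(resid)) > 0.5)[0] + 1   # jump detection

      # level each continuous arc: remove its mean phase-minus-code offset
      edges = [0, *slips, t.size]
      leveled = phase.copy()
      for a, b in zip(edges[:-1], edges[1:]):
          leveled[a:b] -= (phase[a:b] - code[a:b]).mean()
      # 'leveled' is an unbiased, phase-precise ionospheric measurement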

  17. Use of allele scores as instrumental variables for Mendelian randomization

    PubMed Central

    Burgess, Stephen; Thompson, Simon G

    2013-01-01

    Background An allele score is a single variable summarizing multiple genetic variants associated with a risk factor. It is calculated as the total number of risk factor-increasing alleles for an individual (unweighted score), or the sum of weights for each allele corresponding to estimated genetic effect sizes (weighted score). An allele score can be used in a Mendelian randomization analysis to estimate the causal effect of the risk factor on an outcome. Methods Data were simulated to investigate the use of allele scores in Mendelian randomization where conventional instrumental variable techniques using multiple genetic variants demonstrate ‘weak instrument’ bias. The robustness of estimates using the allele score to misspecification (for example non-linearity, effect modification) and to violations of the instrumental variable assumptions was assessed. Results Causal estimates using a correctly specified allele score were unbiased with appropriate coverage levels. The estimates were generally robust to misspecification of the allele score, but not to instrumental variable violations, even if the majority of variants in the allele score were valid instruments. Using a weighted rather than an unweighted allele score increased power, but the increase was small when genetic variants had similar effect sizes. Naive use of the data under analysis to choose which variants to include in an allele score, or for deriving weights, resulted in substantial biases. Conclusions Allele scores enable valid causal estimates with large numbers of genetic variants. The stringency of criteria for genetic variants in Mendelian randomization should be maintained for all variants in an allele score. PMID:24062299
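
    The construction of both scores and their use as a single instrument can be sketched as follows; genotypes are simulated, and the weights are treated as external, echoing the warning above against deriving weights from the data under analysis.

      import numpy as np

      rng = np.random.default_rng(7)
      n, k = 5000, 20
      maf = rng.uniform(0.1, 0.5, k)
      G = rng.binomial(2, maf, size=(n, k)).astype(float)   # variant dosages
      effects = rng.uniform(0.02, 0.1, k)                   # external weights

      x = G @ effects + rng.normal(size=n)        # risk factor
      y = 0.3 * x + rng.normal(size=n)            # outcome; true causal effect 0.3

      score_unw = G.sum(axis=1)                   # unweighted allele score
      score_w = G @ effects                       # weighted score (external weights)

      def wald_ratio(z, x, y):
          # cov(z, y) / cov(z, x): the IV estimate with a single instrument
          return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

      print(wald_ratio(score_unw, x, y), wald_ratio(score_w, x, y))  # both near 0.3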

  18. Sexual selection and allometry: a critical reappraisal of the evidence and ideas.

    PubMed

    Bonduriansky, Russell

    2007-04-01

    One of the most pervasive ideas in the sexual selection literature is the belief that sexually selected traits almost universally exhibit positive static allometries (i.e., within a sample of conspecific adults, larger individuals have disproportionally larger traits). In this review, I show that this idea is contradicted by empirical evidence and theory. Although positive allometry is a typical attribute of some sexual traits in certain groups, the preponderance of positively allometric sexual traits in the empirical literature apparently results from a sampling bias reflecting a fascination with unusually exaggerated (bizarre) traits. I review empirical examples from a broad range of taxa illustrating the diversity of allometric patterns exhibited by signal, weapon, clasping and genital traits, as well as nonsexual traits. This evidence suggests that positive allometry may be the exception rather than the rule in sexual traits, that directional sexual selection does not necessarily lead to the evolution of positive allometry and, conversely, that positive allometry is not necessarily a consequence of sexual selection, and that many sexual traits exhibit sex differences in allometric intercept rather than slope. Such diversity in the allometries of secondary sexual traits is to be expected, given that optimal allometry should reflect resource allocation trade-offs, and patterns of sexual and viability selection on both trait size and body size. An unbiased empirical assessment of the relation between sexual selection and allometry is an essential step towards an understanding of this diversity.

  19. Optoelectrical Properties of a Heterojunction with Amorphous InGaZnO Film on n-Silicon Substrate

    NASA Astrophysics Data System (ADS)

    Jiang, D. L.; Ma, X. Z.; Li, L.; Xu, Z. K.

    2017-10-01

    An a-IGZO/n-Si heterojunction device has been fabricated at room temperature by depositing amorphous InGaZnO (a-IGZO) film on an n-type silicon substrate by plasma-assisted pulsed laser deposition, and its optoelectrical properties have been studied in detail. The heterojunction showed a distinct rectifying characteristic with a rectification ratio of 1.93 × 10³ at ±2 V bias and a reverse leakage current density of 1.6 × 10⁻⁶ A cm⁻² at -2 V bias. More interestingly, the heterojunction not only showed the characteristic of unbiased photoresponse, but could also detect either ultraviolet or ultraviolet-visible light by simply changing the polarity of the bias applied to the heterojunction. The variable photoresponse phenomenon and the charge transport mechanisms in the heterojunction are explained based on the energy band diagram of the heterojunction.

  20. Experiences from the testing of a theory for modelling groundwater flow in heterogeneous media

    USGS Publications Warehouse

    Christensen, S.; Cooley, R.L.

    2002-01-01

    Usually, small-scale model error is present in groundwater modelling because the model only represents average system characteristics having the same form as the drift and small-scale variability is neglected. These errors cause the true errors of a regression model to be correlated. Theory and an example show that the errors also contribute to bias in the estimates of model parameters. This bias originates from model nonlinearity. In spite of this bias, predictions of hydraulic head are nearly unbiased if the model intrinsic nonlinearity is small. Individual confidence and prediction intervals are accurate if the t-statistic is multiplied by a correction factor. The correction factor can be computed from the true error second moment matrix, which can be determined when the stochastic properties of the system characteristics are known.

  2. The Use of Propensity Scores and Observational Data to Estimate Randomized Controlled Trial Generalizability Bias

    PubMed Central

    Pressler, Taylor R.; Kaizar, Eloise E.

    2014-01-01

    While randomized controlled trials (RCT) are considered the “gold standard” for clinical studies, the use of exclusion criteria may impact the external validity of the results. It is unknown whether estimators of effect size are biased by excluding a portion of the target population from enrollment. We propose to use observational data to estimate the bias due to enrollment restrictions, which we term generalizability bias. In this paper we introduce a class of estimators for the generalizability bias and use simulation to study its properties in the presence of non-constant treatment effects. We find the surprising result that our estimators can be unbiased for the true generalizability bias even when not all potentially confounding variables are measured. In addition, our proposed doubly robust estimator performs well even for mis-specified models. PMID:23553373
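
    As background, the standard doubly robust construction (augmented inverse probability weighting) on which such estimators build looks as follows in Python; this is a textbook sketch for an average treatment effect, not the authors' specific generalizability-bias estimator.

      import numpy as np
      from sklearn.linear_model import LinearRegression, LogisticRegression

      rng = np.random.default_rng(8)
      n = 4000
      x = rng.normal(size=(n, 2))
      ps_true = 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))
      a = rng.binomial(1, ps_true)                       # treatment indicator
      y = 1.0 * a + x[:, 0] + rng.normal(size=n)         # true effect = 1.0

      ps = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]   # propensity model
      m1 = LinearRegression().fit(x[a == 1], y[a == 1]).predict(x)  # outcome models
      m0 = LinearRegression().fit(x[a == 0], y[a == 0]).predict(x)

      aipw = (m1 - m0
              + a * (y - m1) / ps
              - (1 - a) * (y - m0) / (1 - ps)).mean()
      # consistent if either the propensity model or the outcome model is correct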

  3. Input variable selection for data-driven models of Coriolis flowmeters for two-phase flow measurement

    NASA Astrophysics Data System (ADS)

    Wang, Lijuan; Yan, Yong; Wang, Xue; Wang, Tao

    2017-03-01

    Input variable selection is an essential step in the development of data-driven models for environmental, biological and industrial applications. By eliminating irrelevant or redundant variables, input variable selection identifies a suitable subset of variables as the input of a model; it also simplifies the model structure and improves computational efficiency. This paper describes the procedures of input variable selection for data-driven models for the measurement of liquid mass flowrate and gas volume fraction under two-phase flow conditions using Coriolis flowmeters. Three advanced input variable selection methods, partial mutual information (PMI), genetic algorithm-artificial neural network (GA-ANN) and tree-based iterative input selection (IIS), are applied in this study. Typical data-driven models incorporating support vector machine (SVM) are established individually based on the input candidates resulting from the selection methods. The validity of the selection outcomes is assessed through an output performance comparison of the SVM based data-driven models and through sensitivity analysis. The validation and analysis results suggest that the input variables selected by the PMI algorithm provide more effective information for the models to measure liquid mass flowrate, while the IIS algorithm provides fewer but more effective variables for the models to predict gas volume fraction.
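
    A simplified stand-in for the selection-then-model pipeline, using plain mutual information ranking in place of full PMI (which additionally discounts information already carried by selected inputs); the features and data are hypothetical.

      import numpy as np
      from sklearn.feature_selection import mutual_info_regression
      from sklearn.svm import SVR

      rng = np.random.default_rng(9)
      n = 800
      X = rng.normal(size=(n, 8))            # e.g. candidate flowmeter outputs
      y = np.sin(X[:, 0]) + 0.5 * X[:, 2] + 0.1 * rng.normal(size=n)

      mi = mutual_info_regression(X, y, random_state=0)
      keep = np.argsort(mi)[::-1][:3]        # retain the 3 most informative inputs
      model = SVR().fit(X[:, keep], y)       # data-driven model on the reduced set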

  4. Using SMAP Data to Investigate the Role of Soil Moisture Variability on Realtime Flood Forecasting

    NASA Astrophysics Data System (ADS)

    Krajewski, W. F.; Jadidoleslam, N.; Mantilla, R.

    2017-12-01

    The Iowa Flood Center has developed a regional high-resolution flood-forecasting model for the state of Iowa that decomposes the landscape into hillslopes of about 0.1 km². For the model to benefit, through data assimilation, from SMAP observations of soil moisture (SM) at scales of approximately 100 km², we are testing a framework to connect SMAP-scale observations to the small-scale SM variability calculated by our rainfall-runoff models. As a step in this direction, we performed data analyses of 15-min point SM observations using a network of about 30 TDR instruments spread throughout the state. We developed a stochastic point-scale SM model that captures 1) SM increases due to rainfall inputs, and 2) SM decay during dry periods. We use a power law model to describe soil moisture decay during dry periods, and a single parameter logistic curve to describe precipitation feedback on soil moisture. We find that the parameters of the models behave as time-independent random variables with stationary distributions. Using data-based simulation, we explore differences in the dynamical range of variability of hillslope and SMAP-scale domains. The simulations allow us to predict the runoff field and streamflow hydrographs for the state of Iowa during the three largest flooding periods (2008, 2014, and 2016). We also use the results to determine the reduction in forecast uncertainty from assimilation of unbiased SMAP-scale soil moisture observations.
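
    The two model ingredients, power-law dry-down and logistic wetting, can be sketched as a toy simulator; every parameter value below is illustrative rather than fitted to the TDR network.

      import numpy as np

      def simulate_sm(rain, sm0=0.25, alpha=0.6, k=12.0, porosity=0.5):
          # rain: array of 15-min totals (mm); returns a soil moisture series
          sm, out, dry_steps = sm0, [], 1
          for r in rain:
              if r > 0:
                  # logistic precipitation feedback: wetter soils gain less
                  sm += alpha * r / 100.0 / (1 + np.exp(k * (sm / porosity - 0.5)))
                  dry_steps = 1
              else:
                  # power-law decay: sm is proportional to t^-b over a dry spell
                  sm = sm * (dry_steps / (dry_steps + 1)) ** 0.08
                  dry_steps += 1
              sm = min(max(sm, 0.02), porosity)
              out.append(sm)
          return np.array(out)

      rng = np.random.default_rng(10)
      steps = 96 * 30    # one month of 15-min steps
      rain = np.where(rng.random(steps) < 0.02, rng.exponential(2.0, steps), 0.0)
      series = simulate_sm(rain)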

  5. Instrumental Variable Methods for Continuous Outcomes That Accommodate Nonignorable Missing Baseline Values.

    PubMed

    Ertefaie, Ashkan; Flory, James H; Hennessy, Sean; Small, Dylan S

    2017-06-15

    Instrumental variable (IV) methods provide unbiased treatment effect estimation in the presence of unmeasured confounders under certain assumptions. To provide valid estimates of treatment effect, treatment effect confounders that are associated with the IV (IV-confounders) must be included in the analysis, and not including observations with missing values may lead to bias. Missing covariate data are particularly problematic when the probability that a value is missing is related to the value itself, which is known as nonignorable missingness. In such cases, imputation-based methods are biased. Using health-care provider preference as an IV method, we propose a 2-step procedure with which to estimate a valid treatment effect in the presence of baseline variables with nonignorable missing values. First, the provider preference IV value is estimated by performing a complete-case analysis using a random-effects model that includes IV-confounders. Second, the treatment effect is estimated using a 2-stage least squares IV approach that excludes IV-confounders with missing values. Simulation results are presented, and the method is applied to an analysis comparing the effects of sulfonylureas versus metformin on body mass index, where the variables baseline body mass index and glycosylated hemoglobin have missing values. Our result supports the association of sulfonylureas with weight gain. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
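
    The second stage can be reduced to a bare-bones two-stage least squares computation, shown here on simulated data with the provider-preference IV treated as a given covariate; the complete-case random-effects estimation of the preference itself is omitted for brevity.

      import numpy as np

      rng = np.random.default_rng(11)
      n = 5000
      iv = rng.normal(size=n)                     # estimated provider preference
      u = rng.normal(size=n)                      # unmeasured confounder
      treat = ((0.8 * iv + u + rng.normal(size=n)) > 0).astype(float)
      bmi = -0.5 * treat + u + rng.normal(size=n) # true treatment effect -0.5

      # stage 1: regress treatment on the instrument
      Z = np.column_stack([np.ones(n), iv])
      that = Z @ np.linalg.lstsq(Z, treat, rcond=None)[0]

      # stage 2: regress the outcome on the fitted treatment
      X2 = np.column_stack([np.ones(n), that])
      beta = np.linalg.lstsq(X2, bmi, rcond=None)[0]
      print(beta[1])   # close to -0.5 despite confounding by u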

  6. Possibilities and limitations of the kinetic plot method in supercritical fluid chromatography.

    PubMed

    De Pauw, Ruben; Desmet, Gert; Broeckhoven, Ken

    2013-08-30

    Although supercritical fluid chromatography (SFC) is becoming a technique of increasing importance in the field of analytical chromatography, methods to compare the performance of SFC columns and separations in an unbiased way are not fully developed. The present study uses mathematical models to investigate the possibilities and limitations of the kinetic plot method in SFC, as this easily allows investigation of a wide range of operating pressures, retention and mobile phase conditions. The variable column length (L) kinetic plot method was further investigated in this work. Since the pressure history is identical for each measurement, this method gives the true kinetic performance limit in SFC. The deviations of the traditional way of measuring the performance as a function of flow rate (fixed back pressure and column length) and of the isopycnic method with respect to this variable column length method were investigated under a wide range of operational conditions. It is found that, using the variable L method, extrapolations towards other pressure drops are not valid in SFC (deviation of ∼15% for extrapolation from 50 to 200 bar pressure drop). The isopycnic method provides the best prediction, but its use is limited when operating closer to critical-point conditions. When an organic modifier is used, the predictions are improved for both methods with respect to the variable L method (e.g. deviations decrease from 20% to 2% when 20 mol% of methanol is added). Copyright © 2013 Elsevier B.V. All rights reserved.

  7. Correcting for bias in the selection and validation of informative diagnostic tests.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2015-04-15

    When developing a new diagnostic test for a disease, there are often multiple candidate classifiers to choose from, and it is unclear if any will offer an improvement in performance compared with current technology. A two-stage design can be used to select a promising classifier (if one exists) in stage one for definitive validation in stage two. However, estimating the true properties of the chosen classifier is complicated by the first stage selection rules. In particular, the usual maximum likelihood estimator (MLE) that combines data from both stages will be biased high. Consequently, confidence intervals and p-values flowing from the MLE will also be incorrect. Building on the results of Pepe et al. (SIM 28:762-779), we derive the most efficient conditionally unbiased estimator and exact confidence intervals for a classifier's sensitivity in a two-stage design with arbitrary selection rules; the condition being that the trial proceeds to the validation stage. We apply our estimation strategy to data from a recent family history screening tool validation study by Walter et al. (BJGP 63:393-400) and are able to identify and successfully adjust for bias in the tool's estimated sensitivity to detect those at higher risk of breast cancer. © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
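
    A quick simulation reproduces the phenomenon being corrected: when stage two happens only if stage one looks promising, the MLE that pools both stages is biased high. The numbers below are arbitrary.

      import numpy as np

      rng = np.random.default_rng(12)
      true_sens, n1, n2, go_threshold = 0.7, 50, 200, 0.72
      naive = []
      for _ in range(20000):
          s1 = rng.binomial(n1, true_sens)
          if s1 / n1 >= go_threshold:                 # selected for validation
              s2 = rng.binomial(n2, true_sens)
              naive.append((s1 + s2) / (n1 + n2))     # MLE pooling both stages
      print(np.mean(naive))   # noticeably above 0.7 on average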

  8. THE AzTEC/SMA INTERFEROMETRIC IMAGING SURVEY OF SUBMILLIMETER-SELECTED HIGH-REDSHIFT GALAXIES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Younger, Joshua D.; Fazio, Giovanni G.; Huang Jiasheng

    We present results from a continuing interferometric survey of high-redshift submillimeter galaxies (SMGs) with the Submillimeter Array, including high-resolution (beam size ≈2 arcsec) imaging of eight additional AzTEC 1.1 mm selected sources in the COSMOS field, for which we obtain six reliable (peak signal-to-noise ratio (S/N) >5, or peak S/N >4 with multiwavelength counterparts within the beam) and two moderate significance (peak S/N >4) detections. When combined with previous detections, this yields an unbiased sample of millimeter-selected SMGs with complete interferometric follow-up. With this sample in hand, we (1) empirically confirm the radio-submillimeter association, (2) examine the submillimeter morphology of the sample, including the nature of SMGs with multiple radio counterparts and constraints on the physical scale of the far-infrared emission, and (3) find additional evidence for a population of extremely luminous, radio-dim SMGs that peaks at higher redshift than previous, radio-selected samples. In particular, the presence of such a population of high-redshift sources has important consequences for models of galaxy formation, which struggle to account for such objects even under liberal assumptions, and for dust production models, given the limited time since the big bang.

  9. Studying the genetic basis of speciation in high gene flow marine invertebrates

    PubMed Central

    2016-01-01

    A growing number of genes responsible for reproductive incompatibilities between species (barrier loci) exhibit the signals of positive selection. However, the possibility that genes experiencing positive selection diverge early in speciation and commonly cause reproductive incompatibilities has not been systematically investigated on a genome-wide scale. Here, I outline a research program for studying the genetic basis of speciation in broadcast spawning marine invertebrates that uses a priori genome-wide information on a large, unbiased sample of genes tested for positive selection. A targeted sequence capture approach is proposed that scores single-nucleotide polymorphisms (SNPs) in widely separated species populations at an early stage of allopatric divergence. The targeted capture of both coding and non-coding sequences enables SNPs to be characterized at known locations across the genome and at genes with known selective or neutral histories. The neutral coding and non-coding SNPs provide robust background distributions for identifying FST-outliers within genes that can, in principle, identify specific mutations experiencing diversifying selection. If natural hybridization occurs between species, the neutral coding and non-coding SNPs can provide a neutral admixture model for genomic clines analyses aimed at finding genes exhibiting strong blocks to introgression. Strongylocentrotid sea urchins are used as a model system to outline the approach but it can be used for any group that has a complete reference genome available. PMID:29491951

  10. Identification of solid state fermentation degree with FT-NIR spectroscopy: Comparison of wavelength variable selection methods of CARS and SCARS.

    PubMed

    Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai

    2015-01-01

    The use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of solid state fermentation degree by the FT-NIR spectroscopy technique was investigated in this study. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was applied to calibrate identification models using the wavelength variables selected by CARS and SCARS for identification of solid state fermentation degree. Experimental results showed that the numbers of wavelength variables selected by CARS and SCARS were 58 and 47, respectively, out of the 1557 original wavelength variables. Compared with the results of full-spectrum PLS-DA, both wavelength variable selection methods could enhance the performance of the identification models. Meanwhile, compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. The overall results sufficiently demonstrate that a PLS-DA model constructed using wavelength variables selected by a proper wavelength variable selection method can identify the solid state fermentation degree more accurately. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Identification of solid state fermentation degree with FT-NIR spectroscopy: Comparison of wavelength variable selection methods of CARS and SCARS

    NASA Astrophysics Data System (ADS)

    Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai

    2015-10-01

    The use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of solid state fermentation degree by the FT-NIR spectroscopy technique was investigated in this study. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was applied to calibrate identification models using the wavelength variables selected by CARS and SCARS for identification of solid state fermentation degree. Experimental results showed that the numbers of wavelength variables selected by CARS and SCARS were 58 and 47, respectively, out of the 1557 original wavelength variables. Compared with the results of full-spectrum PLS-DA, both wavelength variable selection methods could enhance the performance of the identification models. Meanwhile, compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. The overall results sufficiently demonstrate that a PLS-DA model constructed using wavelength variables selected by a proper wavelength variable selection method can identify the solid state fermentation degree more accurately.

  12. DNA-encoded libraries - an efficient small molecule discovery technology for the biomedical sciences.

    PubMed

    Kunig, Verena; Potowski, Marco; Gohla, Anne; Brunschweiger, Andreas

    2018-06-27

    DNA-encoded compound libraries are a highly attractive technology for the discovery of small molecule protein ligands. These compound collections consist of small molecules covalently connected to individual DNA sequences carrying readable information about the compound structure. DNA-tagging allows for efficient synthesis, handling and interrogation of vast numbers of chemically synthesized, drug-like compounds. They are screened on proteins by an efficient, generic assay based on Darwinian principles of selection. To date, selection of DNA-encoded libraries allowed for the identification of numerous bioactive compounds. Some of these compounds uncovered hitherto unknown allosteric binding sites on target proteins; several compounds proved their value as chemical biology probes unraveling complex biology; and the first examples of clinical candidates that trace their ancestry to a DNA-encoded library were reported. Thus, DNA-encoded libraries proved their value for the biomedical sciences as a generic technology for the identification of bioactive drug-like molecules numerous times. However, large scale experiments showed that even the selection of billions of compounds failed to deliver bioactive compounds for the majority of proteins in an unbiased panel of target proteins. This raises the question of compound library design.

  13. Usefulness of the HMRPGV method for simultaneous selection of upland cotton genotypes with greater fiber length and high yield stability.

    PubMed

    Farias, F J C; Carvalho, L P; Silva Filho, J L; Teodoro, P E

    2016-08-19

    The harmonic mean of the relative performance of genotypic predicted value (HMRPGV) method has been used to measure the genotypic stability and adaptability of various crops. However, its use in cotton is still restricted. This study aimed to use mixed models to select cotton genotypes that simultaneously result in longer fiber length, higher fiber yield, and phenotypic stability in both of these traits. Eight trials with 16 cotton genotypes were conducted in the 2008/2009 harvest in Mato Grosso State. The experimental design was randomized complete blocks with four replicates of each of the 16 genotypes. In each trial, we evaluated fiber yield and fiber length. The genetic parameters were estimated using the restricted maximum likelihood/best linear unbiased predictor method. Joint selection considering, simultaneously, fiber length, fiber yield, stability, and adaptability is possible with the HMRPGV method. Our results suggested that genotypes CNPA MT 04 2080 and BRS CEDRO may be grown in environments similar to those tested here and may be predicted to result in greater fiber length, fiber yield, adaptability, and phenotypic stability. These genotypes may constitute a promising population base in breeding programs aimed at increasing these trait values.
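
    The harmonic-mean construction at the heart of HMRPGV is simple to state in code; the matrix of genotypic predicted values below is simulated, not the trial data.

      import numpy as np

      rng = np.random.default_rng(13)
      gpv = rng.uniform(2.5, 4.5, size=(16, 8))     # 16 genotypes x 8 trials

      # relative performance: each trial scaled by its own mean
      rpgv = gpv / gpv.mean(axis=0)

      # harmonic mean across trials rewards both high level and stability,
      # since low values in any trial drag the harmonic mean down sharply
      hmrpgv = rpgv.shape[1] / (1.0 / rpgv).sum(axis=1)
      ranking = np.argsort(hmrpgv)[::-1]            # best genotypes first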

  14. Phenology of Scramble Polygyny in a Wild Population of Chrysomelid Beetles: The Opportunity for and the Strength of Sexual Selection

    PubMed Central

    Baena, Martha Lucía; Macías-Ordóñez, Rogelio

    2012-01-01

    Recent debate has highlighted the importance of estimating both the strength of sexual selection on phenotypic traits, and the opportunity for sexual selection. We describe seasonal fluctuations in the mating dynamics of Leptinotarsa undecimlineata (Coleoptera: Chrysomelidae). We compared several estimates of the opportunity for, and the strength of, sexual selection and male precopulatory competition over the reproductive season. First, using a null model, we suggest that the ratio between observed values of the opportunity for sexual selection and their expected value under random mating results in unbiased estimates of the actual nonrandom mating behavior of the population. Second, we found that estimates for the whole reproductive season often misrepresent the actual value at any given time period. Third, mating differentials on male size and mobility, the frequency of male fighting and three estimates of the opportunity for sexual selection provide contrasting but complementary information. More intense sexual selection associated with male mobility, but not with male size, was observed in periods with high opportunity for sexual selection and high frequency of male fights. Fourth, based on parameters of spatial and temporal aggregation of female receptivity, we describe the mating system of L. undecimlineata as a scramble mating polygyny in which the opportunity for sexual selection varies widely throughout the season, but the strength of sexual selection on male size remains fairly weak, while male mobility inversely covaries with mating success. We suggest that different estimates of the opportunity for, and intensity of, sexual selection should be applied in order to discriminate how different behavioral and demographic factors shape the reproductive dynamics of populations. PMID:22761675

  15. On the use and misuse of scalar scores of confounders in design and analysis of observational studies.

    PubMed

    Pfeiffer, R M; Riedl, R

    2015-08-15

    We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  16. YSO Jets in the Galactic Plane from UWISH2. IV. Jets and Outflows in Cygnus-X

    NASA Astrophysics Data System (ADS)

    Makin, S. V.; Froebrich, D.

    2018-01-01

    We have performed an unbiased search for outflows from young stars in Cygnus-X using 42 deg² of data from the UKIRT Widefield Infrared Survey for H2 (UWISH2 Survey), to identify shock-excited near-IR H2 emission in the 1–0 S(1) 2.122 μm line. We uncovered 572 outflows, of which 465 are new discoveries, increasing the number of known objects by more than 430%. This large and unbiased sample allows us to statistically determine the typical properties of outflows from young stars. We found 261 bipolar outflows, and 16% of these are parsec scale. The typical bipolar outflow is 0.45 pc in length and has gaps of 0.025–0.1 pc between large knots. The median luminosity in the 1–0 S(1) line is 10⁻³ L⊙. The bipolar flows are typically asymmetrical, with the two lobes misaligned by 5°, one lobe 30% shorter than the other, and one lobe twice as bright as the other. Of the remaining outflows, 152 are single-sided and 159 are groups of extended, shock-excited H2 emission without identifiable driving sources. Half of all driving sources have sufficient WISE data to determine their evolutionary status as either protostars (80%) or classical T Tauri stars (20%). One-fifth of the driving sources are variable by more than 0.5 mag in the K-band continuum over several years. Several of the newly identified outflows provide excellent targets for follow-up studies. We particularly encourage the study of the outflows and young stars identified in a bright-rimmed cloud near IRAS 20294+4255, which seems to represent a textbook example of triggered star formation.

  17. Effects of sample size on estimates of population growth rates calculated with matrix models.

    PubMed

    Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M

    2008-08-28

    Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated whether sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
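
    The Jensen's-inequality effect is easy to reproduce: draw vital rates from binomial samples of increasing size, build the projection matrix, and track the bias of the dominant eigenvalue. The 2x2 life cycle below is a toy, not the study's plant matrix.

      import numpy as np

      rng = np.random.default_rng(14)
      true_s = np.array([0.5, 0.8])        # stage survival rates
      f = 1.2                              # fecundity of stage 2

      def lam(s1, s2):
          A = np.array([[0.0, f],
                        [s1, s2]])
          return np.max(np.real(np.linalg.eigvals(A)))

      true_lambda = lam(*true_s)
      for n in (10, 50, 250, 1000):
          est = [lam(rng.binomial(n, true_s[0]) / n,
                     rng.binomial(n, true_s[1]) / n) for _ in range(2000)]
          print(n, np.mean(est) - true_lambda)   # bias shrinks as n grows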

  18. A sibling method for identifying vQTLs

    PubMed Central

    Domingue, Ben; Dawes, Christopher; Boardman, Jason; Siegal, Mark

    2018-01-01

    The propensity of a trait to vary within a population may have evolutionary, ecological, or clinical significance. In the present study we deploy sibling models to offer a novel and unbiased way to ascertain loci associated with the extent to which phenotypes vary (variance-controlling quantitative trait loci, or vQTLs). Previous methods for vQTL-mapping either exclude genetically related individuals or treat genetic relatedness among individuals as a complicating factor addressed by adjusting estimates for non-independence in phenotypes. The present method uses genetic relatedness as a tool to obtain unbiased estimates of variance effects rather than as a nuisance. The family-based approach, which utilizes random variation between siblings in minor allele counts at a locus, also allows controls for parental genotype, mean effects, and non-linear (dominance) effects that may spuriously appear to generate variation. Simulations show that the approach performs equally well as two existing methods (squared Z-score and DGLM) in controlling type I error rates when there is no unobserved confounding, and performs significantly better than these methods in the presence of small degrees of confounding. Using height and BMI as empirical applications, we investigate SNPs that alter within-family variation in height and BMI, as well as pathways that appear to be enriched. One significant SNP for BMI variability, in the MAST4 gene, replicated. Pathway analysis revealed one gene set, encoding members of several signaling pathways related to gap junction function, which appears significantly enriched for associations with within-family height variation in both datasets (while not enriched in analysis of mean levels). We recommend approximating laboratory random assignment of genotype using family data and more careful attention to the possible conflation of mean and variance effects. PMID:29617452

  19. Modeling Particle Exposure in US Trucking Terminals

    PubMed Central

    Davis, ME; Smith, TJ; Laden, F; Hart, JE; Ryan, LM; Garshick, E

    2007-01-01

    Multi-tiered sampling approaches are common in environmental and occupational exposure assessment, where exposures for a given individual are often modeled based on simultaneous measurements taken at multiple indoor and outdoor sites. The monitoring data from such studies are hierarchical by design, imposing a complex covariance structure that must be accounted for in order to obtain unbiased estimates of exposure. Statistical methods such as structural equation modeling (SEM) represent a useful alternative to simple linear regression in these cases, providing simultaneous and unbiased predictions of each level of exposure based on a set of covariates specific to the exposure setting. We test the SEM approach using data from a large exposure assessment of diesel and combustion particles in the US trucking industry. The exposure assessment includes data from 36 different trucking terminals across the United States sampled between 2001 and 2005, measuring PM2.5 and its elemental carbon (EC) and organic carbon (OC) components by personal monitoring and by sampling at two indoor work locations and an outdoor “background” location. Using the SEM method, we predict: 1) personal exposures as a function of work related exposure and smoking status; 2) work related exposure as a function of terminal characteristics, indoor ventilation, job location, and background exposure conditions; and 3) background exposure conditions as a function of weather, nearby source pollution, and other regional differences across terminal sites. The primary advantage of SEMs in this setting is the ability to simultaneously predict exposures at each of the sampling locations, while accounting for the complex covariance structure among the measurements and descriptive variables. The statistically significant results and high R2 values observed from the trucking industry application support the broader use of this approach in exposure assessment modeling. PMID:16856739

  20. Efficacy of calf:cow ratios for estimating calf production of arctic caribou

    USGS Publications Warehouse

    Cameron, R.D.; Griffith, B.; Parrett, L.S.; White, R.G.

    2013-01-01

    Caribou (Rangifer tarandus granti) calf:cow ratios (CCR) computed from composition counts obtained on arctic calving grounds are biased estimators of net calf production (NCP, the product of parturition rate and early calf survival) for sexually-mature females. Sexually-immature 2-year-old females, which are indistinguishable from sexually-mature females without calves, are included in the denominator, thereby biasing the calculated ratio low. This underestimate increases with the proportion of 2-year-old females in the population. We estimated the magnitude of this error with deterministic simulations under three scenarios of calf and yearling annual survival (respectively: low, 60 and 70%; medium, 70 and 80%; high, 80 and 90%) for five levels of unbiased NCP: 20, 40, 60, 80, and 100%. We assumed a survival rate of 90% for both 2-year-old and mature females. For each NCP, we computed numbers of 2-year-old females surviving annually and increased the denominator of CCR accordingly. We then calculated a series of hypothetical “observed” CCRs, which stabilized during the last 6 years of the simulations, and documented the degree to which each 6-year mean CCR differed from the corresponding NCP. For the three calf and yearling survival scenarios, proportional underestimates of NCP by CCR ranged 0.046–0.156, 0.058–0.187, and 0.071–0.216, respectively. Unfortunately, because parturition and survival rates are typically variable (i.e., age distribution is unstable), the magnitude of the error is not predictable without substantial supporting information. We recommend maintaining a sufficient sample of known-age radiocollared females in each herd and implementing a regular relocation schedule during the calving period to obtain unbiased estimates of both parturition rate and NCP.
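
    The bias mechanism reduces to one line of arithmetic, illustrated here with made-up composition-count totals.

      ncp = 0.60                          # true net calf production of mature females
      mature, two_yr = 1000, 150          # illustrative composition-count totals

      calves = ncp * mature
      ccr = calves / (mature + two_yr)    # what the composition count yields
      print(ccr, 1 - ccr / ncp)           # 0.52: a ~13% proportional underestimate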

  1. MODIS observations of cyanobacterial risks in a eutrophic lake: Implications for long-term safety evaluation in drinking-water source.

    PubMed

    Duan, Hongtao; Tao, Min; Loiselle, Steven Arthur; Zhao, Wei; Cao, Zhigang; Ma, Ronghua; Tang, Xiaoxian

    2017-10-01

    The occurrence and related risks from cyanobacterial blooms have increased world-wide over the past 40 years. Information on the abundance and distribution of cyanobacteria is fundamental to support risk assessment and management activities. In the present study, an approach based on Empirical Orthogonal Function (EOF) analysis was used to estimate the concentrations of chlorophyll a (Chla) and the cyanobacterial biomarker pigment phycocyanin (PC) using data from the MODerate resolution Imaging Spectroradiometer (MODIS) in Lake Chaohu (China's fifth largest freshwater lake). The approach was developed and tested using fourteen years (2000-2014) of MODIS images, which showed significant spatial and temporal variability of the PC:Chla ratio, an indicator of cyanobacterial dominance. The results had unbiased RMS uncertainties of <60% for Chla ranging between 10 and 300 μg/L, and unbiased RMS uncertainties of <65% for PC between 10 and 500 μg/L. Further analysis showed the importance of nutrient and climate conditions for this dominance. Low TN:TP ratios (<29:1) and elevated temperatures were found to influence the seasonal shift of phytoplankton community. The resultant MODIS Chla and PC products were then used for cyanobacterial risk mapping with a decision tree classification model. The resulting Water Quality Decision Matrix (WQDM) was designed to assist authorities in the identification of possible intake areas, as well as specific months when higher frequency monitoring and more intense water treatment would be required if the location of the present intake area remained the same. Remote sensing cyanobacterial risk mapping provides a new tool for reservoir and lake management programs. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Normalization of High Dimensional Genomics Data Where the Distribution of the Altered Variables Is Skewed

    PubMed Central

    Landfors, Mattias; Philip, Philge; Rydén, Patrik; Stenberg, Per

    2011-01-01

    Genome-wide analysis of gene expression or protein binding patterns using different array or sequencing based technologies is now routinely performed to compare different populations, such as treatment and reference groups. It is often necessary to normalize the data obtained to remove technical variation introduced in the course of conducting experimental work, but standard normalization techniques are not capable of eliminating technical bias in cases where the distribution of the truly altered variables is skewed, i.e. when a large fraction of the variables are either positively or negatively affected by the treatment. However, several experiments are likely to generate such skewed distributions, including ChIP-chip experiments for the study of chromatin, gene expression experiments for the study of apoptosis, and SNP-studies of copy number variation in normal and tumour tissues. A preliminary study using spike-in array data established that the capacity of an experiment to identify altered variables and generate unbiased estimates of the fold change decreases as the fraction of altered variables and the skewness increases. We propose the following work-flow for analyzing high-dimensional experiments with regions of altered variables: (1) Pre-process raw data using one of the standard normalization techniques. (2) Investigate if the distribution of the altered variables is skewed. (3) If the distribution is not believed to be skewed, no additional normalization is needed. Otherwise, re-normalize the data using a novel HMM-assisted normalization procedure. (4) Perform downstream analysis. Here, ChIP-chip data and simulated data were used to evaluate the performance of the work-flow. It was found that skewed distributions can be detected by using the novel DSE-test (Detection of Skewed Experiments). Furthermore, applying the HMM-assisted normalization to experiments where the distribution of the truly altered variables is skewed results in considerably higher sensitivity and lower bias than can be attained using standard and invariant normalization methods. PMID:22132175

  3. Quantifying Variability of Avian Colours: Are Signalling Traits More Variable?

    PubMed Central

    Delhey, Kaspar; Peters, Anne

    2008-01-01

    Background Increased variability in sexually selected ornaments, a key assumption of evolutionary theory, is thought to be maintained through condition-dependence. Condition-dependent handicap models of sexual selection predict that (a) sexually selected traits show amplified variability compared to equivalent non-sexually selected traits, and since males are usually the sexually selected sex, that (b) males are more variable than females, and (c) sexually dimorphic traits more variable than monomorphic ones. So far these predictions have only been tested for metric traits. Surprisingly, they have not been examined for bright coloration, one of the most prominent sexual traits. This omission stems from computational difficulties: different types of colours are quantified on different scales precluding the use of coefficients of variation. Methodology/Principal Findings Based on physiological models of avian colour vision we develop an index to quantify the degree of discriminable colour variation as it can be perceived by conspecifics. A comparison of variability in ornamental and non-ornamental colours in six bird species confirmed (a) that those coloured patches that are sexually selected or act as indicators of quality show increased chromatic variability. However, we found no support for (b) that males generally show higher levels of variability than females, or (c) that sexual dichromatism per se is associated with increased variability. Conclusions/Significance We show that it is currently possible to realistically estimate variability of animal colours as perceived by them, something difficult to achieve with other traits. Increased variability of known sexually-selected/quality-indicating colours in the studied species, provides support to the predictions borne from sexual selection theory but the lack of increased overall variability in males or dimorphic colours in general indicates that sexual differences might not always be shaped by similar selective forces. PMID:18301766

  4. Detection of seizures from small samples using nonlinear dynamic system theory.

    PubMed

    Yaylali, I; Koçak, H; Jayakar, P

    1996-07-01

    The electroencephalogram (EEG), like many other biological phenomena, is quite likely governed by nonlinear dynamics. Certain characteristics of the underlying dynamics have recently been quantified by computing the correlation dimensions (D2) of EEG time series data. In this paper, the D2 of the unbiased autocovariance function of the scalp EEG was used to detect electrographic seizure activity. Digital EEG data were acquired at a sampling rate of 200 Hz per channel and organized into continuous frames (duration 2.56 s, 512 data points). To increase the reliability of D2 computations with short-duration data, the raw EEG data were first simplified using unbiased autocovariance analysis to highlight the periodic activity that is present during seizures. The D2 computation was then performed on the unbiased autocovariance function of each channel using the Grassberger-Procaccia method with Theiler's box-assisted correlation algorithm. Even with short-duration data, this preprocessing proved to be computationally robust and displayed no significant sensitivity to implementation details such as the choices of embedding dimension and box size. The system successfully identified various types of seizures in clinical studies.
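
    A compact sketch of the two ingredients named above: an unbiased autocovariance, followed by a Grassberger-Procaccia correlation sum on a delay embedding of it. The embedding parameters and radii are illustrative choices, not the authors'.

    ```python
    import numpy as np

    def unbiased_autocov(x):
        """Unbiased autocovariance: divide the lag-k sum by (n - k), not n."""
        x = x - x.mean()
        n = len(x)
        c = np.array([np.dot(x[:n - k], x[k:]) / (n - k) for k in range(n // 2)])
        return c / c[0]                            # normalise so c(0) = 1

    def correlation_sum(y, dim, lag, r):
        """Grassberger-Procaccia C(r) on a delay embedding of y."""
        m = len(y) - (dim - 1) * lag
        emb = np.array([y[i:i + dim * lag:lag] for i in range(m)])
        d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
        return (d[np.triu_indices(m, k=1)] < r).mean()

    rng = np.random.default_rng(2)
    frame = np.sin(0.3 * np.arange(512)) + 0.1 * rng.normal(size=512)
    acov = unbiased_autocov(frame)                 # near-periodic, seizure-like
    r1, r2 = 0.25, 0.5
    c1 = correlation_sum(acov, dim=5, lag=2, r=r1)
    c2 = correlation_sum(acov, dim=5, lag=2, r=r2)
    print(np.log(c2 / c1) / np.log(r2 / r1))       # local slope: crude D2 estimate
    ```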

  5. Biased and unbiased perceptual decision-making on vocal emotions.

    PubMed

    Dricu, Mihai; Ceravolo, Leonardo; Grandjean, Didier; Frühholz, Sascha

    2017-11-24

    Perceptual decision-making on emotions involves gathering sensory information about the affective state of another person and forming a decision on the likelihood of a particular state. These perceptual decisions can be of varying complexity as determined by different contexts. We used functional magnetic resonance imaging and a region of interest approach to investigate the brain activation and functional connectivity behind two forms of perceptual decision-making. More complex unbiased decisions on affective voices recruited an extended bilateral network consisting of the posterior inferior frontal cortex, the orbitofrontal cortex, the amygdala, and voice-sensitive areas in the auditory cortex. Less complex biased decisions on affective voices distinctly recruited the right mid inferior frontal cortex, pointing to a functional distinction in this region following decisional requirements. Furthermore, task-induced neural connectivity revealed stronger connections between these frontal, auditory, and limbic regions during unbiased relative to biased decision-making on affective voices. Together, the data show that different types of perceptual decision-making on auditory emotions have distinct patterns of activation and functional coupling that follow the decisional strategies and cognitive mechanisms involved during these perceptual decisions.

  6. Minimum mean squared error (MSE) adjustment and the optimal Tykhonov-Phillips regularization parameter via reproducing best invariant quadratic uniformly unbiased estimates (repro-BIQUUE)

    NASA Astrophysics Data System (ADS)

    Schaffrin, Burkhard

    2008-02-01

    In a linear Gauss-Markov model, the parameter estimates from BLUUE (Best Linear Uniformly Unbiased Estimate) are not robust against possible outliers in the observations. Moreover, by giving up the unbiasedness constraint, the mean squared error (MSE) risk may be further reduced, in particular when the problem is ill-posed. In this paper, the α-weighted S-homBLE (Best homogeneously Linear Estimate) is derived via formulas originally used for variance component estimation on the basis of the repro-BIQUUE (reproducing Best Invariant Quadratic Uniformly Unbiased Estimate) principle in a model with stochastic prior information. In the present model, however, such prior information is not included, which allows the comparison of the stochastic approach (α-weighted S-homBLE) with the well-established algebraic approach of Tykhonov-Phillips regularization, also known as R-HAPS (Hybrid APproximation Solution), whenever the inverse of the “substitute matrix” S exists and is chosen as the R matrix that defines the relative impact of the regularizing term on the final result.
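
    For orientation, the Tykhonov-Phillips (R-HAPS) estimate in the Gauss-Markov model y = A xi + e, e ~ (0, sigma^2 P^{-1}), takes the standard regularized least-squares form below; per the abstract, choosing R = S^{-1} is what links it to the alpha-weighted S-homBLE. This is the textbook form, not a formula quoted from the paper.

    ```latex
    \hat{\xi}_{\alpha} = \left( A^{\top} P A + \alpha R \right)^{-1} A^{\top} P \, y
    ```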

  7. Comprehensive Mechanistic Analysis of Hits from High-Throughput and Docking Screens against β-Lactamase

    PubMed Central

    Babaoglu, Kerim; Simeonov, Anton; Irwin, John J.; Nelson, Michael E.; Feng, Brian; Thomas, Craig J.; Cancian, Laura; Costi, M. Paola; Maltby, David A.; Jadhav, Ajit; Inglese, James; Austin, Christopher P.; Shoichet, Brian K.

    2009-01-01

    High-throughput screening (HTS) is widely used in drug discovery. Especially for screens of unbiased libraries, false positives can dominate “hit lists”; their origins are much debated. Here we determine the mechanism of every active hit from a screen of 70,563 unbiased molecules against β-lactamase using quantitative HTS (qHTS). Of the 1274 initial inhibitors, 95% were detergent-sensitive and were classified as aggregators. Among the 70 remaining were 25 potent, covalent-acting β-lactams. Mass spectra, counter-screens, and crystallography identified 12 as promiscuous covalent inhibitors. The remaining 33 were either aggregators or irreproducible. No specific reversible inhibitors were found. We turned to molecular docking to prioritize molecules from the same library for testing at higher concentrations. Of 16 tested, 2 were modest inhibitors. Subsequent X-ray structures corresponded to the docking prediction. Analog synthesis improved affinity to 8 µM. These results suggest that it may be the physical behavior of organic molecules, not their reactivity, that accounts for most screening artifacts. Structure-based methods may prioritize weak-but-novel chemotypes in unbiased library screens. PMID:18333608

  8. Construction of mutually unbiased bases with cyclic symmetry for qubit systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seyfarth, Ulrich; Ranade, Kedar S.

    2011-10-15

    For the complete estimation of arbitrary unknown quantum states by measurements, the use of mutually unbiased bases has been well established in theory and experiment for the past 20 years. However, most constructions of these bases make heavy use of abstract algebra and the mathematical theory of finite rings and fields, and no simple and generally accessible construction is available. This is particularly true in the case of a system composed of several qubits, which is arguably the most important case in quantum information science and quantum computation. In this paper, we close this gap by providing a simple and straightforward method for the construction of mutually unbiased bases in the case of a qubit register. We show that our construction is also accessible to experiments, since only Hadamard and controlled-phase gates are needed, which are available in most practical realizations of a quantum computer. Moreover, our scheme possesses the optimal scaling possible, i.e., the number of gates scales only linearly in the number of qubits.
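
    The paper's construction covers whole qubit registers; as a minimal illustration of the defining property only, the sketch below verifies that the eigenbases of the Pauli Z, X and Y operators form three mutually unbiased bases for a single qubit.

    ```python
    # For d = 2: bases are mutually unbiased iff |<a|b>|^2 = 1/d across bases.
    import numpy as np

    z_basis = np.eye(2, dtype=complex)                                 # columns
    x_basis = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard
    y_basis = np.array([[1, 1], [1j, -1j]], dtype=complex) / np.sqrt(2)

    bases = [z_basis, x_basis, y_basis]
    for i in range(3):
        for j in range(i + 1, 3):
            overlaps = np.abs(bases[i].conj().T @ bases[j]) ** 2
            assert np.allclose(overlaps, 0.5)       # 1/d with d = 2
    print("3 mutually unbiased bases verified for one qubit")
    ```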

  9. Wetland plant species improve performance when inoculated with arbuscular mycorrhizal fungi: a meta-analysis of experimental pot studies.

    PubMed

    Ramírez-Viga, Thai Khan; Aguilar, Ramiro; Castillo-Argüero, Silvia; Chiappa-Carrara, Xavier; Guadarrama, Patricia; Ramos-Zapata, José

    2018-06-04

    The presence of arbuscular mycorrhizal fungi (AMF) in wetlands is widespread. Wetlands are transition ecosystems between aquatic and terrestrial systems, where shallow water stands or moves over the land surface. The presence of AMF in wetlands suggests that they are ecologically significant; however, their function is not yet clearly understood. With the aim of determining the overall magnitude and direction of the AMF effect on associated wetland plants in pot assays, we conducted a meta-analysis of data extracted from 48 published studies. The AMF effect on their wetland hosts was estimated through different plant attributes reported in the studies, including nutrient acquisition, photosynthetic activity, biomass production, and saline stress reduction. As the common metric, we calculated the standardized unbiased mean difference (Hedges' d) of wetland plant performance attributes in AMF-inoculated plants versus non-AMF-inoculated plants. We also examined a series of moderator variables regarding symbiont identity and experimental procedures that could influence the magnitude and direction of an AMF effect. Response patterns indicate that wetland plants significantly benefit from their association with AMF, even under flooded conditions. The beneficial AMF effect differed in magnitude depending on the plant attribute selected to estimate it in the published studies. The nature of these benefits depends on the identity of the host plant, phosphorus addition, and water availability in the soil where both symbionts develop. Our meta-analysis synthesizes the relationship of AMF with wetland plants in pot assays and suggests that AMF may be of comparable importance to wetland plants as to terrestrial plants.
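
    A short sketch of the common metric named above, Hedges' d (the small-sample-corrected standardized mean difference), computed on hypothetical biomass data.

    ```python
    import numpy as np

    def hedges_d(x, y):
        """Standardized mean difference with Hedges' small-sample correction."""
        nx, ny = len(x), len(y)
        df = nx + ny - 2
        pooled_sd = np.sqrt(((nx - 1) * np.var(x, ddof=1) +
                             (ny - 1) * np.var(y, ddof=1)) / df)
        j = 1 - 3 / (4 * df - 1)                   # bias-correction factor
        return j * (np.mean(x) - np.mean(y)) / pooled_sd

    rng = np.random.default_rng(3)
    amf = rng.normal(12.0, 2.0, size=20)       # AMF-inoculated (hypothetical)
    control = rng.normal(10.0, 2.0, size=20)   # non-inoculated
    print(round(hedges_d(amf, control), 2))
    ```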

  10. Analyses of flood-flow frequency for selected gaging stations in South Dakota

    USGS Publications Warehouse

    Benson, R.D.; Hoffman, E.B.; Wipf, V.J.

    1985-01-01

    Analyses of flood-flow frequency were made for 111 continuous-record gaging stations in South Dakota with 10 or more years of record. The analyses were developed using the log-Pearson Type III procedure recommended by the U.S. Water Resources Council. The procedure characterizes flood occurrence at a single site as a sequence of annual peak flows. The magnitudes of the annual peak flows are assumed to be independent random variables following a log-Pearson Type III probability distribution, which defines the probability that any single annual peak flow will exceed a specified discharge. By considering only annual peak flows, the flood-frequency analysis becomes the estimation of the log-Pearson annual-probability curve using the record of annual peak flows at the site. The recorded data are divided into two classes: systematic and historic. The systematic record includes all annual peak flows determined in the process of conducting a systematic gaging program at a site. In this program, the annual peak flow is determined for each and every year of the program. The systematic record is intended to constitute an unbiased and representative sample of the population of all possible annual peak flows at the site. In contrast to the systematic record, the historic record consists of annual peak flows that would not have been determined except for evidence indicating their unusual magnitude. Flood information acquired from historical sources almost invariably refers to floods of noteworthy, and hence extraordinary, size. Although historic records form a biased and unrepresentative sample, they can be used to supplement the systematic record. (Author's abstract)
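
    A minimal method-of-moments sketch of a log-Pearson Type III fit. The federal guidelines add refinements (e.g. weighting of historic peaks and regional skew) that are omitted here, and the peak values are invented.

    ```python
    import numpy as np
    from scipy import stats

    peaks = np.array([1200., 3400., 560., 2100., 880., 4500., 1500., 760.,
                      1900., 2600., 990., 3100.])   # annual peak flows, ft^3/s
    logq = np.log10(peaks)
    mean, sd = logq.mean(), logq.std(ddof=1)
    skew = stats.skew(logq, bias=False)             # station skew of the logs

    for t in (10, 50, 100):                         # T-year flood quantiles
        q = stats.pearson3.ppf(1 - 1 / t, skew, loc=mean, scale=sd)
        print(f"{t:>3}-yr flood ~ {10 ** q:,.0f} ft^3/s")
    ```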

  11. Using known populations of pronghorn to evaluate sampling plans and estimators

    USGS Publications Warehouse

    Kraft, K.M.; Johnson, D.H.; Samuelson, J.M.; Allen, S.H.

    1995-01-01

    Although sampling plans and estimators of abundance have good theoretical properties, their performance in real situations is rarely assessed because true population sizes are unknown. We evaluated widely used sampling plans and estimators of population size on 3 known clustered distributions of pronghorn (Antilocapra americana). Our criteria were accuracy of the estimate, coverage of 95% confidence intervals, and cost. Sampling plans were combinations of sampling intensities (16, 33, and 50%), sample selection (simple random sampling without replacement, systematic sampling, and probability proportional to size sampling with replacement), and stratification. We paired sampling plans with suitable estimators (simple, ratio, and probability proportional to size). We used area of the sampling unit as the auxiliary variable for the ratio and probability proportional to size estimators. All estimators were nearly unbiased, but precision was generally low (overall mean coefficient of variation [CV] = 29). Coverage of 95% confidence intervals was only 89% because of the highly skewed distribution of the pronghorn counts and small sample sizes, especially with stratification. Stratification combined with accurate estimates of optimal stratum sample sizes increased precision, reducing the mean CV from 33 without stratification to 25 with stratification; costs increased 23%. Precise results (mean CV = 13) but poor confidence interval coverage (83%) were obtained with simple and ratio estimators when the allocation scheme included all sampling units in the stratum containing most pronghorn. Although areas of the sampling units varied, ratio estimators and probability proportional to size sampling did not increase precision, possibly because of the clumped distribution of pronghorn. Managers should be cautious in using sampling plans and estimators to estimate abundance of aggregated populations.
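
    A sketch of the ratio estimator used in these comparisons, with sampling-unit area as the auxiliary variable; the counts, areas, and total area are hypothetical.

    ```python
    import numpy as np

    counts = np.array([12, 0, 45, 3, 22, 7])        # pronghorn on sampled units
    areas = np.array([10., 8., 14., 9., 12., 11.])  # km^2 of those units
    total_area = 240.0                              # km^2 over all units

    ratio = counts.sum() / areas.sum()              # animals per km^2
    print(round(ratio * total_area))                # estimated population size
    ```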

  12. Intrinsic scatter of caustic masses and hydrostatic bias: An observational study

    NASA Astrophysics Data System (ADS)

    Andreon, S.; Trinchieri, G.; Moretti, A.; Wang, J.

    2017-10-01

    All estimates of cluster mass have some intrinsic scatter and perhaps some bias relative to true mass, even in the absence of measurement errors, caused for example by cluster triaxiality and large-scale structure. Knowledge of the bias and scatter values is fundamental for both cluster cosmology and astrophysics. In this paper we show that the intrinsic scatter of a mass proxy can be constrained by measurements of the gas fraction, because masses with higher values of intrinsic scatter with respect to true mass produce more scattered gas fractions. Moreover, the relative bias of two mass estimates can be constrained by comparing the mean gas fraction at the same (nominal) cluster mass. Our observational study addresses the scatter between caustic (i.e., dynamically estimated) and true masses, and the relative bias of caustic and hydrostatic masses. For these purposes, we used the X-ray Unbiased Cluster Sample, a cluster sample selected independently of the intracluster medium content and with reliable masses: 34 galaxy clusters in the nearby (0.050 < z < 0.135) Universe, mostly with 14 < log M500/M⊙ ≲ 14.5, and with caustic masses. We found a 35% scatter between caustic and true masses. Furthermore, we found that the relative bias between caustic and hydrostatic masses is small, 0.06 ± 0.05 dex, improving upon past measurements. The small scatter found confirms our previous measurements of a highly variable amount of feedback from cluster to cluster, which is the cause of the observed large variety of core-excised X-ray luminosities and gas masses.

  13. Data-driven modeling of surface temperature anomaly and solar activity trends

    USGS Publications Warehouse

    Friedel, Michael J.

    2012-01-01

    A novel two-step modeling scheme is used to reconstruct and analyze surface temperature and solar activity data at global, hemispheric, and regional scales. First, the self-organizing map (SOM) technique is used to extend annual modern climate data from the century to the millennial scale. The SOM component planes are used to identify and quantify the strength of nonlinear relations among modern surface temperature anomalies (<150 years), tropical and extratropical teleconnections, and Palmer Drought Severity Indices (0–2000 years). Cross-validation of global sea and land surface temperature anomalies verifies that the SOM is an unbiased estimator with less uncertainty than the magnitude of the anomalies. Second, quantile modeling of the SOM reconstructions reveals trends and periods in surface temperature anomaly and solar activity whose timing agrees with published studies. Temporal features in surface temperature anomalies, such as the Medieval Warm Period, Little Ice Age, and Modern Warming Period, appear at all spatial scales, with magnitudes that increase when moving from ocean to land, from global to regional scales, and from southern to northern regions. Some caveats that apply when interpreting these data are the high-frequency filtering of climate signals based on quantile model selection and increased uncertainty when paleoclimatic data are limited. Even so, all models find the rate and magnitude of Modern Warming Period anomalies to be greater than those during the Medieval Warm Period. Lastly, quantile trends among reconstructed equatorial Pacific temperature profiles support the recent assertion of two primary El Niño Southern Oscillation types. These results demonstrate the efficacy of this alternative modeling approach for reconstructing and interpreting scale-dependent climate variables.

  14. Performance comparison of two efficient genomic selection methods (gsbay & MixP) applied in aquacultural organisms

    NASA Astrophysics Data System (ADS)

    Su, Hailin; Li, Hengde; Wang, Shi; Wang, Yangfan; Bao, Zhenmin

    2017-02-01

    Genomic selection is increasingly popular in animal and plant breeding industries all around the world, as it can be applied early in life without impacting selection candidates. The objective of this study was to bring the advantages of genomic selection to scallop breeding. Two different genomic selection tools, MixP and gsbay, were applied to the genomic evaluation of simulated data and Zhikong scallop (Chlamys farreri) field data, and were compared with the genomic best linear unbiased prediction (GBLUP) method, which has been widely applied. Our results showed that both MixP and gsbay could accurately estimate single-nucleotide polymorphism (SNP) marker effects, and thereby could be applied to the analysis of genomic estimated breeding values (GEBV). In simulated data from different scenarios, the accuracy of the GEBV obtained ranged from 0.20 to 0.78 with MixP, from 0.21 to 0.67 with gsbay, and from 0.21 to 0.61 with GBLUP. Estimates made by MixP and gsbay are expected to be more reliable than those made by GBLUP. Predictions made by gsbay were more robust, while with MixP the computation is much faster, especially when dealing with large-scale data. These results suggest that the algorithms implemented in MixP and gsbay are both feasible for carrying out genomic selection in scallop breeding, and that more genotype data will be necessary to produce genomic estimated breeding values with higher accuracy for the industry.
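
    For comparison, a minimal GBLUP sketch using VanRaden's genomic relationship matrix, with one phenotypic record per genotyped animal and an assumed-known variance ratio.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n, m = 200, 1000
    geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)  # SNPs coded 0/1/2
    p = geno.mean(axis=0) / 2
    w = geno - 2 * p                                        # centred genotypes
    g_mat = w @ w.T / (2 * np.sum(p * (1 - p)))             # VanRaden's G

    true_effects = rng.normal(0, 0.05, size=m)
    y = geno @ true_effects + rng.normal(0, 1.0, size=n)    # phenotypes

    lam = 1.0                             # sigma_e^2 / sigma_g^2, assumed known
    gebv = g_mat @ np.linalg.solve(g_mat + lam * np.eye(n), y - y.mean())
    print(round(np.corrcoef(gebv, geno @ true_effects)[0, 1], 2))  # accuracy
    ```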

  15. A genome-wide scan for signatures of selection in Azeri and Khuzestani buffalo breeds.

    PubMed

    Mokhber, Mahdi; Moradi-Shahrbabak, Mohammad; Sadeghi, Mostafa; Moradi-Shahrbabak, Hossein; Stella, Alessandra; Nicolzzi, Ezequiel; Rahmaninia, Javad; Williams, John L

    2018-06-11

    Identification of genomic regions that have been targets of selection may shed light on the genetic history of livestock populations and help to identify variation controlling commercially important phenotypes. The Azeri and Khuzestani buffalo are the most common indigenous Iranian breeds; they have been subjected to divergent selection and are well adapted to completely different regions. Examining the genetic structure of these populations may identify genomic regions associated with adaptation to the different environments and production goals. A set of 385 water buffalo samples from the Azeri (N = 262) and Khuzestani (N = 123) breeds were genotyped using the Axiom® Buffalo Genotyping 90 K Array. The unbiased fixation index (FST) method was used to detect signatures of selection. In total, 13 regions with outlier FST values (top 0.1%) were identified. Annotation of these regions using the UMD3.1 Bos taurus Genome Assembly was performed to find putative candidate genes and QTLs within the selected regions. Putative candidate genes identified include FBXO9, NDFIP1, ACTR3, ARHGAP26, SERPINF2, BOLA-DRB3, BOLA-DQB, CLN8, and MYOM2. Candidate genes in regions potentially under selection were associated with physiological pathways including milk production, cytoskeleton organization, growth, metabolic function, apoptosis, and domestication-related changes including immune and nervous system development. The QTLs identified are involved in economically important traits in buffalo related to milk composition, udder structure, somatic cell count, meat quality, and carcass and body weight.
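
    A sketch of this kind of outlier scan. It uses Hudson's unbiased per-SNP FST estimator in the form given by Bhatia et al. (2013), which may differ from the paper's exact estimator, and flags the top 0.1% of SNPs; the allele frequencies are simulated.

    ```python
    import numpy as np

    def hudson_fst(p1, p2, n1, n2):
        """Unbiased per-SNP FST (Hudson estimator, Bhatia et al. 2013)."""
        num = ((p1 - p2) ** 2
               - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1))
        den = p1 * (1 - p2) + p2 * (1 - p1)
        return num / den

    rng = np.random.default_rng(5)
    p_azeri = rng.uniform(0.05, 0.95, size=90000)
    p_khuz = np.clip(p_azeri + rng.normal(0, 0.05, size=90000), 0.01, 0.99)
    fst = hudson_fst(p_azeri, p_khuz, n1=262, n2=123)
    threshold = np.quantile(fst, 0.999)               # top 0.1% cut-off
    print((fst > threshold).sum(), "outlier SNPs")
    ```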

  16. Genome-based prediction of test cross performance in two subsequent breeding cycles.

    PubMed

    Hofheinz, Nina; Borchardt, Dietrich; Weissleder, Knuth; Frisch, Matthias

    2012-12-01

    Genome-based prediction of genetic values is expected to overcome shortcomings that limit the application of QTL mapping and marker-assisted selection in plant breeding. Our goal was to study the genome-based prediction of test cross performance with genetic effects that were estimated using genotypes from the preceding breeding cycle. In particular, our objectives were to employ a ridge regression approach that approximates best linear unbiased prediction of genetic effects, to compare cross validation with validation using genetic material of the subsequent breeding cycle, and to investigate the prospects of genome-based prediction in sugar beet breeding. We focused on the traits sugar content and standard molasses loss (ML) and used a set of 310 sugar beet lines to estimate genetic effects at 384 SNP markers. In cross validation, correlations >0.8 between observed and predicted test cross performance were observed for both traits. However, in validation with 56 lines from the next breeding cycle, a correlation of 0.8 could only be observed for sugar content; for standard ML the correlation dropped to 0.4. We found that ridge regression based on preliminary estimates of the heritability provided a very good approximation of best linear unbiased prediction and was not accompanied by a loss in prediction accuracy. We conclude that prediction accuracy assessed with cross validation within one cycle of a breeding program cannot be used as an indicator of the accuracy of predicting lines of the next cycle. Prediction of lines of the next cycle seems promising for traits with high heritabilities.
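
    A minimal ridge (RR-BLUP-style) sketch in the spirit of the approach described, with the shrinkage parameter set from a preliminary heritability estimate. The specific choice lambda = m(1 - h^2)/h^2 is a common convention assumed here, not taken from the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    n, m, h2 = 310, 384, 0.6                  # lines, markers, heritability
    x = rng.binomial(2, 0.4, size=(n, m)).astype(float)
    x -= x.mean(axis=0)                       # centre marker codes
    beta = rng.normal(0, 1, size=m)
    g = x @ beta                              # true genetic values
    y = g + rng.normal(0, np.sqrt((1 - h2) / h2 * g.var()), size=n)

    lam = m * (1 - h2) / h2                   # ridge parameter from h^2
    beta_hat = np.linalg.solve(x.T @ x + lam * np.eye(m), x.T @ (y - y.mean()))
    print(round(np.corrcoef(x @ beta_hat, g)[0, 1], 2))  # within-cycle accuracy
    ```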

  17. Meta-analysis of magnitudes, differences and variation in evolutionary parameters.

    PubMed

    Morrissey, M B

    2016-10-01

    Meta-analysis is increasingly used to synthesize major patterns in the large literatures within ecology and evolution. Meta-analytic methods that do not account for the process of observing data, which we may refer to as 'informal meta-analyses', may have undesirable properties. In some cases, informal meta-analyses may produce results that are unbiased, but do not necessarily make the best possible use of available data. In other cases, unbiased statistical noise in individual reports in the literature can potentially be converted into severe systematic biases in informal meta-analyses. I first present a general description of how failure to account for noise in individual inferences should be expected to lead to biases in some kinds of meta-analysis. In particular, informal meta-analyses of quantities that reflect the dispersion of parameters in nature, for example, the mean absolute value of a quantity, are likely to be generally highly misleading. I then re-analyse three previously published informal meta-analyses, where key inferences were of aspects of the dispersion of values in nature, for example, the mean absolute value of selection gradients. Major biological conclusions in each original informal meta-analysis closely match those that could arise as artefacts due to statistical noise. I present alternative mixed-model-based analyses that are specifically tailored to each situation, but where all analyses may be implemented with widely available open-source software. In each example meta-re-analysis, major conclusions change substantially. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
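
    The core argument is easy to reproduce by simulation: even when every individual estimate is unbiased, the mean absolute value across studies is inflated by sampling noise, so an informal meta-analysis of dispersion misleads. All numbers below are invented.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    true_gradients = rng.normal(0.0, 0.1, size=5000)   # true selection gradients
    estimates = true_gradients + rng.normal(0.0, 0.2, size=5000)  # unbiased noise

    print(round(np.mean(np.abs(true_gradients)), 3))   # ~0.08, real dispersion
    print(round(np.mean(np.abs(estimates)), 3))        # ~0.18, inflated by noise
    ```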

  18. The case test-negative design for studies of the effectiveness of influenza vaccine in inpatient settings.

    PubMed

    Foppa, Ivo M; Ferdinands, Jill M; Chaves, Sandra S; Haber, Michael J; Reynolds, Sue B; Flannery, Brendan; Fry, Alicia M

    2016-12-01

    The test-negative design (TND) to evaluate influenza vaccine effectiveness is based on patients seeking care for acute respiratory infection, with those who test positive for influenza as cases and the test-negatives serving as controls. This design has not been validated for the inpatient setting, where selection bias might differ from the outpatient setting. We derived mathematical expressions for vaccine effectiveness (VE) against laboratory-confirmed influenza hospitalizations and used numerical simulations to verify theoretical results, exploring expected biases under various scenarios. We explored meaningful interpretations of VE estimates from inpatient TND studies. VE estimates from inpatient TND studies capture the vaccine-mediated protection of the source population against laboratory-confirmed influenza hospitalizations. If vaccination does not modify disease severity, these estimates are equivalent to VE against influenza virus infection. If individuals with chronic cardiopulmonary disease are enrolled because of non-infectious exacerbations, biased (too high) VE estimates will result. If chronic cardiopulmonary disease status is adjusted for accurately, the VE estimates will be unbiased. If chronic cardiopulmonary illness cannot be adequately characterized, excluding these individuals may provide unbiased VE estimates. The inpatient TND offers logistic advantages and can provide valid estimates of influenza VE. If highly vaccinated patients with respiratory exacerbation of chronic cardiopulmonary conditions are eligible for study inclusion, biased VE estimates will result unless this group is well characterized and the analysis can adequately adjust for it. Otherwise, such groups of subjects should be excluded from the analysis. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
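
    A sketch of the basic test-negative estimate: vaccine effectiveness as one minus the odds ratio of vaccination among test-positives versus test-negatives. The counts are hypothetical and no covariate adjustment is shown.

    ```python
    vacc_pos, unvacc_pos = 40, 110     # influenza-positive inpatients
    vacc_neg, unvacc_neg = 90, 100     # influenza-negative inpatients

    odds_ratio = (vacc_pos / unvacc_pos) / (vacc_neg / unvacc_neg)
    print(f"VE = {1 - odds_ratio:.0%}")   # ~60%
    ```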

  19. Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study

    PubMed Central

    Shah, Anoop D.; Bartlett, Jonathan W.; Carpenter, James; Nicholas, Owen; Hemingway, Harry

    2014-01-01

    Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The “true” imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001–2010) with complete data on all covariates. Variables were artificially made “missing at random,” and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data. PMID:24589914
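
    An analogous workflow in scikit-learn rather than the authors' MICE implementation: chained-equations imputation with a random forest as the conditional model, on data with a deliberately nonlinear dependence.

    ```python
    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(8)
    x1 = rng.normal(size=500)
    x2 = x1 ** 2 + rng.normal(0, 0.3, size=500)   # nonlinear dependence on x1
    data = np.column_stack([x1, x2])
    data[rng.random(500) < 0.3, 1] = np.nan       # make x2 missing at random

    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=50, random_state=0),
        max_iter=5, random_state=0)
    completed = imputer.fit_transform(data)
    print(np.round(completed[:3], 2))
    ```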

  20. Spectroscopic analysis of Cepheid variables with 2D radiation-hydrodynamic simulations

    NASA Astrophysics Data System (ADS)

    Vasilyev, Valeriy

    2018-06-01

    The analysis of the chemical enrichment history of dwarf galaxies allows one to derive constraints on their formation and evolution. In this context, Cepheids play a very important role, as these periodically variable stars provide a means to obtain accurate distances. In addition, the chemical composition of Cepheids can provide a strong constraint on the chemical evolution of the system. Standard spectroscopic analysis of Cepheids is based on one-dimensional (1D) hydrostatic model atmospheres, with convection parametrised using mixing-length theory. However, this quasi-static approach has not been validated theoretically. In my talk, I will discuss the validity of the quasi-static approximation in the spectroscopy of short-period Cepheids. I will show results obtained using a 2D time-dependent envelope model of a pulsating star computed with the radiation-hydrodynamics code CO5BOLD. I will then describe the impact of the new models on the spectroscopic diagnostics of effective temperature, surface gravity, microturbulent velocity, and metallicity. One of the interesting findings of my work is that 1D model atmospheres provide unbiased estimates of the stellar parameters and abundances of Cepheid variables for certain phases of their pulsation. Convective inhomogeneities, however, also introduce biases. I will then discuss how these results can be used in a wider parameter space of pulsating stars and present an outlook for future studies.

  2. Resolving the Conflict Between Associative Overdominance and Background Selection

    PubMed Central

    Zhao, Lei; Charlesworth, Brian

    2016-01-01

    In small populations, genetic linkage between a polymorphic neutral locus and loci subject to selection, either against partially recessive mutations or in favor of heterozygotes, may result in an apparent selective advantage to heterozygotes at the neutral locus (associative overdominance) and a retardation of the rate of loss of variability by genetic drift at this locus. In large populations, selection against deleterious mutations has previously been shown to reduce variability at linked neutral loci (background selection). We describe analytical, numerical, and simulation studies that shed light on the conditions under which retardation vs. acceleration of loss of variability occurs at a neutral locus linked to a locus under selection. We consider a finite, randomly mating population initiated from an infinite population in equilibrium at a locus under selection. With mutation and selection, retardation occurs only when S, the product of twice the effective population size and the selection coefficient, is of order 1. With S >> 1, background selection always causes an acceleration of loss of variability. Apparent heterozygote advantage at the neutral locus is, however, always observed when mutations are partially recessive, even if there is an accelerated rate of loss of variability. With heterozygote advantage at the selected locus, loss of variability is nearly always retarded. The results shed light on experiments on the loss of variability at marker loci in laboratory populations and on the results of computer simulations of the effects of multiple selected loci on neutral variability. PMID:27182952

  3. Variable Cultural Acquisition Costs Constrain Cumulative Cultural Evolution

    PubMed Central

    Mesoudi, Alex

    2011-01-01

    One of the hallmarks of the human species is our capacity for cumulative culture, in which beneficial knowledge and technology are accumulated over successive generations. Yet previous analyses of cumulative cultural change have failed to consider the possibility that as cultural complexity accumulates, it becomes increasingly costly for each new generation to acquire that culture from the previous generation. In principle this may result in an upper limit on the cultural complexity that can be accumulated, at which point accumulated knowledge is so costly and time-consuming to acquire that further innovation is not possible. In this paper I first review existing empirical analyses of the history of science and technology that support the possibility that cultural acquisition costs may constrain cumulative cultural evolution. I then present macroscopic and individual-based models of cumulative cultural evolution that explore the consequences of this assumption of variable cultural acquisition costs, showing that making acquisition costs vary with cultural complexity causes the latter to reach an upper limit above which no further innovation can occur (a toy version of this argument is sketched below). These models further explore the consequences of different cultural transmission rules (directly biased, indirectly biased and unbiased transmission), population size, and cultural innovations that themselves reduce innovation or acquisition costs. PMID:21479170
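
    A toy version of the macroscopic argument; the functional forms are mine, not the paper's. If the acquisition cost grows with accumulated complexity, complexity stalls at a finite ceiling.

    ```python
    # Complexity gains a fixed innovation increment per generation but pays
    # an acquisition cost proportional to what must be learned.
    z, innovation, cost_rate = 0.0, 1.0, 0.05
    for generation in range(200):
        z = max(z + innovation - cost_rate * z, 0.0)
    print(round(z, 1))   # approaches innovation / cost_rate = 20, the ceiling
    ```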

  4. The Combined Effects of Measurement Error and Omitting Confounders in the Single-Mediator Model

    PubMed Central

    Fritz, Matthew S.; Kenny, David A.; MacKinnon, David P.

    2016-01-01

    Mediation analysis requires a number of strong assumptions be met in order to make valid causal inferences. Failing to account for violations of these assumptions, such as not modeling measurement error or omitting a common cause of the effects in the model, can bias the parameter estimates of the mediated effect. When the independent variable is perfectly reliable, for example when participants are randomly assigned to levels of treatment, measurement error in the mediator tends to underestimate the mediated effect, while the omission of a confounding variable of the mediator to outcome relation tends to overestimate the mediated effect. Violations of these two assumptions often co-occur, however, in which case the mediated effect could be overestimated, underestimated, or even, in very rare circumstances, unbiased. In order to explore the combined effect of measurement error and omitted confounders in the same model, the impact of each violation on the single-mediator model is first examined individually. Then the combined effect of having measurement error and omitted confounders in the same model is discussed. Throughout, an empirical example is provided to illustrate the effect of violating these assumptions on the mediated effect. PMID:27739903
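
    The measurement-error half of the story is easy to simulate: with a randomized treatment, unreliability in the mediator attenuates the a*b mediated effect. All values are invented.

    ```python
    import numpy as np

    rng = np.random.default_rng(9)
    n = 100_000
    x = rng.binomial(1, 0.5, size=n).astype(float)   # randomized treatment
    m_true = 0.5 * x + rng.normal(size=n)            # true mediator (a = 0.5)
    y = 0.4 * m_true + rng.normal(size=n)            # outcome (b = 0.4)
    m_obs = m_true + rng.normal(0, 1.0, size=n)      # unreliable measurement

    def ab_estimate(m):
        a = np.polyfit(x, m, 1)[0]                   # X -> M path
        design = np.column_stack([np.ones(n), x, m])
        b = np.linalg.lstsq(design, y, rcond=None)[0][2]  # M -> Y, X held fixed
        return a * b

    print(round(ab_estimate(m_true), 3))   # ~0.20, the true mediated effect
    print(round(ab_estimate(m_obs), 3))    # noticeably smaller: attenuation
    ```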

  5. Kriging analysis of mean annual precipitation, Powder River Basin, Montana and Wyoming

    USGS Publications Warehouse

    Karlinger, M.R.; Skrivan, James A.

    1981-01-01

    Kriging is a statistical estimation technique for regionalized variables which exhibit an autocorrelation structure. Such structure can be described by a semi-variogram of the observed data. The kriging estimate at any point is a weighted average of the data, where the weights are determined using the semi-variogram and an assumed drift, or lack of drift, in the data. Block, or areal, estimates can also be calculated. The kriging algorithm, based on unbiased and minimum-variance estimates, involves a linear system of equations to calculate the weights. Kriging variances can then be used to give confidence intervals of the resulting estimates. Mean annual precipitation in the Powder River basin, Montana and Wyoming, is an important variable when considering restoration of coal-strip-mining lands of the region. Two kriging analyses involving data at 60 stations were made--one assuming no drift in precipitation, and one a partial quadratic drift simulating orographic effects. Contour maps of estimates of mean annual precipitation were similar for both analyses, as were the corresponding contours of kriging variances. Block estimates of mean annual precipitation were made for two subbasins. Runoff estimates were 1-2 percent of the kriged block estimates. (USGS)
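
    A toy ordinary-kriging sketch in the spirit of the analysis: an assumed exponential semivariogram, with unbiasedness enforced through a Lagrange multiplier. A real analysis would first fit the semivariogram to the 60 stations; all numbers here are invented.

    ```python
    import numpy as np

    def gamma(h, sill=1.0, rng_km=50.0):           # exponential semivariogram
        return sill * (1 - np.exp(-h / rng_km))

    stations = np.array([[0., 0.], [40., 10.], [10., 35.], [60., 50.]])
    precip = np.array([320., 410., 385., 460.])    # mm/yr at the stations
    target = np.array([25., 25.])                  # unsampled location

    n = len(stations)
    d = np.linalg.norm(stations[:, None] - stations[None, :], axis=-1)
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = gamma(d)
    a[n, n] = 0.0                                  # Lagrange-multiplier block
    b = np.append(gamma(np.linalg.norm(stations - target, axis=1)), 1.0)
    w = np.linalg.solve(a, b)                      # weights + multiplier

    print(round(w[:n] @ precip, 1))                # kriged estimate, mm/yr
    print(round(w @ b, 3))                         # kriging variance
    ```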

  6. Multilevel structural equation models for assessing moderation within and across levels of analysis.

    PubMed

    Preacher, Kristopher J; Zhang, Zhen; Zyphur, Michael J

    2016-06-01

    Social scientists are increasingly interested in multilevel hypotheses, data, and statistical models as well as moderation or interactions among predictors. The result is a focus on hypotheses and tests of multilevel moderation within and across levels of analysis. Unfortunately, existing approaches to multilevel moderation have a variety of shortcomings, including conflated effects across levels of analysis and bias due to using observed cluster averages instead of latent variables (i.e., "random intercepts") to represent higher-level constructs. To overcome these problems and elucidate the nature of multilevel moderation effects, we introduce a multilevel structural equation modeling (MSEM) logic that clarifies the nature of the problems with existing practices and remedies them with latent variable interactions. This remedy uses random coefficients and/or latent moderated structural equations (LMS) for unbiased tests of multilevel moderation. We describe our approach and provide an example using the publicly available High School and Beyond data, with Mplus syntax in the appendix. Our MSEM method eliminates problems of conflated multilevel effects and reduces bias in parameter estimates while offering a coherent framework for conceptualizing and testing multilevel moderation effects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  7. The geostatistical approach for structural and stratigraphic framework analysis of offshore NW Bonaparte Basin, Australia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wahid, Ali, E-mail: ali.wahid@live.com; Salim, Ahmed Mohamed Ahmed, E-mail: mohamed.salim@petronas.com.my; Yusoff, Wan Ismail Wan, E-mail: wanismail-wanyusoff@petronas.com.my

    2016-02-01

    Geostatistics, or the statistical approach, is based on the study of temporal and spatial trends, and depends upon spatial relationships to model known information of variables at unsampled locations. The statistical technique known as kriging was used for petrophysical and facies analyses; it assumes a spatial relationship to model the geological continuity between the known data and the unknown, producing a single best guess of the unknown. Kriging is also known as an optimal interpolation technique, which facilitates the generation of the best linear unbiased estimate of each horizon. The idea is to construct a numerical model of the lithofacies and rock properties that honors the available data, and to integrate it further with interpreted seismic sections, a tectonostratigraphy chart with a short-term sea-level curve, and the regional tectonics of the study area, in order to determine the structural and stratigraphic growth history of the NW Bonaparte Basin. Using the kriging technique, models were built that help to estimate different parameters, such as horizons, facies, and porosities, in the study area. Variograms were used to identify the spatial relationships among the data, which help to establish the depositional history of the North West (NW) Bonaparte Basin.

  8. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
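
    A sketch of the recommended remedy: subsample (rarefy) every library to a common depth before computing a richness estimator such as the bias-corrected Chao1. Library compositions below are simulated.

    ```python
    import numpy as np

    def chao1(counts):
        """Bias-corrected Chao1 richness estimate from OTU counts."""
        s_obs = (counts > 0).sum()
        f1, f2 = (counts == 1).sum(), (counts == 2).sum()
        return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

    def rarefy(counts, depth, rng):
        """Subsample reads without replacement down to a common depth."""
        pool = np.repeat(np.arange(len(counts)), counts)
        sub = rng.choice(pool, size=depth, replace=False)
        return np.bincount(sub, minlength=len(counts))

    rng = np.random.default_rng(10)
    community = rng.dirichlet(np.ones(500))        # 500 OTUs, shared community
    lib_small = rng.multinomial(2000, community)
    lib_large = rng.multinomial(20000, community)  # 10x deeper library
    print(round(chao1(lib_small), 1),
          round(chao1(rarefy(lib_large, 2000, rng)), 1))   # now comparable
    ```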

  9. Metadynamics in the conformational space nonlinearly dimensionally reduced by Isomap

    NASA Astrophysics Data System (ADS)

    Spiwok, Vojtěch; Králová, Blanka

    2011-12-01

    Atomic motions in molecules are not linear. This implies that nonlinear dimensionality reduction methods can outperform linear ones in the analysis of collective atomic motions. In addition, nonlinear collective motions can be used as potentially efficient guides for biased simulation techniques. Here we present a simulation with a bias potential acting in the directions of collective motions determined by a nonlinear dimensionality reduction method. Ad hoc generated conformations of trans,trans-1,2,4-trifluorocyclooctane were analyzed by the Isomap method to map these 72-dimensional coordinates to three dimensions, as described by Brown and co-workers [J. Chem. Phys. 129, 064118 (2008)]. Metadynamics employing the three-dimensional embeddings as collective variables was applied to explore all relevant conformations of the studied system and to calculate its conformational free energy surface. The method sampled all relevant conformations (boat, boat-chair, and crown) and the corresponding transition structures, which are inaccessible to an unbiased simulation. This scheme allows essentially any parameter of the system to be used as a collective variable in biased simulations. Moreover, the scheme we used for mapping out-of-sample conformations from the 72D to 3D space can be used as a general-purpose mapping for dimensionality reduction, beyond the context of molecular modeling.
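
    A sketch of the dimensionality-reduction step only, using scikit-learn's Isomap in place of the authors' implementation: stand-in 72-dimensional coordinates are mapped to three components that could then serve as collective variables.

    ```python
    import numpy as np
    from sklearn.manifold import Isomap

    rng = np.random.default_rng(11)
    conformations = rng.normal(size=(500, 72))  # stand-in for 72-D coordinates
    cv = Isomap(n_neighbors=12, n_components=3).fit_transform(conformations)
    print(cv.shape)                             # (500, 3) collective variables
    ```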

  10. Efficiency optimization in a correlation ratchet with asymmetric unbiased fluctuations

    NASA Astrophysics Data System (ADS)

    Ai, Bao-Quan; Wang, Xian-Ju; Liu, Guo-Tao; Wen, De-Hua; Xie, Hui-Zhang; Chen, Wei; Liu, Liang-Gang

    2003-12-01

    The efficiency of a Brownian particle moving in a periodic potential in the presence of asymmetric unbiased fluctuations is investigated. We find that even in the quasistatic limit there is a regime where the efficiency can be a peaked function of temperature, which proves that thermal fluctuations facilitate the efficiency of energy transformation, contradicting earlier findings [H. Kamegawa et al., Phys. Rev. Lett. 80, 5251 (1998)]. It is also found that the mutual interplay between temporal asymmetry and spatial asymmetry may induce optimized efficiency at finite temperatures. The ratchet is not at its most efficient when it delivers maximum current.

  11. AD620SQ/883B Total Ionizing Dose Radiation Lot Acceptance Report for RESTORE-LEO

    NASA Technical Reports Server (NTRS)

    Burton, Noah; Campola, Michael

    2017-01-01

    A Radiation Lot Acceptance Test was performed on the AD620SQ/883B, Lot 1708D, in accordance with MIL-STD-883, Method 1019, Condition D. Using a Co-60 source, 4 biased parts and 4 unbiased parts were irradiated at 10 mrad/s (0.036 krad/hr), in steps of approximately 1 krad from 3 to 10 krad and of 5 krad from 10 to 25 krad. The lot was then annealed unbiased at 25 degrees Celsius for 2 days and subsequently annealed biased at 25 degrees Celsius for another 7 days.

  12. Quasi interpolation with Voronoi splines.

    PubMed

    Mirzargar, Mahsa; Entezari, Alireza

    2011-12-01

    We present a quasi interpolation framework that attains the optimal approximation-order of Voronoi splines for reconstruction of volumetric data sampled on general lattices. The quasi interpolation framework of Voronoi splines provides an unbiased reconstruction method across various lattices. Therefore this framework allows us to analyze and contrast the sampling-theoretic performance of general lattices, using signal reconstruction, in an unbiased manner. Our quasi interpolation methodology is implemented as an efficient FIR filter that can be applied online or as a preprocessing step. We present visual and numerical experiments that demonstrate the improved accuracy of reconstruction across lattices, using the quasi interpolation framework. © 2011 IEEE

  13. Precision medicine in the age of big data: The present and future role of large-scale unbiased sequencing in drug discovery and development.

    PubMed

    Vicini, P; Fields, O; Lai, E; Litwack, E D; Martin, A-M; Morgan, T M; Pacanowski, M A; Papaluca, M; Perez, O D; Ringel, M S; Robson, M; Sakul, H; Vockley, J; Zaks, T; Dolsten, M; Søgaard, M

    2016-02-01

    High-throughput molecular and functional profiling of patients is a key driver of precision medicine. DNA and RNA characterization has been enabled at unprecedented cost and scale through rapid, disruptive progress in sequencing technology, but challenges persist in data management and interpretation. We analyze the state of the art of large-scale unbiased sequencing (LUS) in drug discovery and development, including technology, application, ethical, regulatory, policy and commercial considerations, and discuss issues of LUS implementation in clinical and regulatory practice. © 2015 American Society for Clinical Pharmacology and Therapeutics.

  14. Searching for new young stars in the Northern hemisphere: the Pisces moving group

    NASA Astrophysics Data System (ADS)

    Binks, A. S.; Jeffries, R. D.; Ward, J. L.

    2018-01-01

    Using the kinematically unbiased technique described in Binks, Jeffries & Maxted (2015), we present optical spectra for a further 122 rapidly rotating (rotation periods <6 d), X-ray active FGK stars, selected from the SuperWASP survey. We identify 17 new examples of young, probably single stars with ages of <200 Myr and provide additional evidence for a new Northern hemisphere kinematic association: the Pisces moving group (MG). The group consists of 14 lithium-rich G- and K-type stars that have a dispersion of only ∼3 km s⁻¹ in each Galactic space velocity coordinate. The group members are approximately coeval in the colour-magnitude diagram, with an age of 30-50 Myr, and have similar, though not identical, kinematics to the Octans-Near MG.

  15. Internships, employment opportunities, and research grants

    USGS Publications Warehouse

    ,

    2015-01-01

    As an unbiased, multidisciplinary science organization, the U.S. Geological Survey (USGS) is dedicated to the timely, relevant, and impartial study of the health of our ecosystems and environment, our natural resources, the impacts of climate and land-use change, and the natural hazards that threaten us. Opportunities for undergraduate and graduate students and faculty to participate in USGS science are available in the selected programs described below. Please note: U.S. citizenship is required for all government positions. This publication has been superseded by USGS General Information Product 165, Grant Opportunities for Academic Research and Training, and USGS General Information Product 166, Student and Recent Graduate Employment Opportunities. It was preceded by USGS General Information Product 80, Internships, Employment Opportunities, and Research Grants, published in 2008.

  16. Spectral ratio method for measuring emissivity

    USGS Publications Warehouse

    Watson, K.

    1992-01-01

    The spectral ratio method is based on the concept that although spectral radiances are very sensitive to small changes in temperature, their ratios are not. Only an approximate estimate of temperature is required; thus, for example, we can determine the emissivity ratio to an accuracy of 1% with a temperature estimate that is only accurate to 12.5 K. Selecting the maximum value of the channel brightness temperatures provides an unbiased temperature estimate. Laboratory and field spectral data are easily converted into spectral ratio plots. The ratio method is limited by the system signal-to-noise ratio and spectral bandwidth. The images can appear quite noisy, because ratios enhance high frequencies, and may require spatial filtering. Atmospheric effects tend to rescale the ratios and require using an atmospheric model or a calibration site. © 1992.
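
    A sketch of the idea with the Planck function; the channel wavelengths and emissivities are invented. The maximum channel brightness temperature serves as the temperature estimate, and the resulting emissivity ratio is far less sensitive to the residual temperature error than the individual emissivities are.

    ```python
    import numpy as np

    H, C, K = 6.626e-34, 2.998e8, 1.381e-23

    def planck(lam, t):                # spectral radiance, W m^-2 sr^-1 m^-1
        return 2 * H * C ** 2 / lam ** 5 / (np.exp(H * C / (lam * K * t)) - 1)

    def brightness_t(lam, rad):        # invert Planck for one channel
        return H * C / (lam * K) / np.log(1 + 2 * H * C ** 2 / (lam ** 5 * rad))

    lam1, lam2 = 10.5e-6, 11.5e-6      # two thermal-IR channels (assumed)
    eps1, eps2, t_true = 0.96, 0.99, 300.0
    l1, l2 = eps1 * planck(lam1, t_true), eps2 * planck(lam2, t_true)

    t_hat = max(brightness_t(lam1, l1), brightness_t(lam2, l2))
    ratio = (l1 / planck(lam1, t_hat)) / (l2 / planck(lam2, t_hat))
    print(round(t_hat, 1), round(ratio, 3))   # ratio ~ eps1/eps2 = 0.970
    ```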

  17. A rat model system to study complex disease risks, fitness, aging, and longevity.

    PubMed

    Koch, Lauren Gerard; Britton, Steven L; Wisløff, Ulrik

    2012-02-01

    The association between low exercise capacity and all-cause morbidity and mortality is statistically strong yet mechanistically unresolved. By connecting clinical observation with a theoretical base, we developed a working hypothesis that variation in capacity for oxygen metabolism is the central mechanistic determinant between disease and health (aerobic hypothesis). As an unbiased test, we show that two-way artificial selective breeding of rats for low and high intrinsic endurance exercise capacity also produces rats that differ for numerous disease risks, including the metabolic syndrome, cardiovascular complications, premature aging, and reduced longevity. This contrasting animal model system may prove to be translationally superior relative to more widely used simplistic models for understanding geriatric biology and medicine. Copyright © 2012 Elsevier Inc. All rights reserved.

  18. Nitrogen loads from selected rivers in the Long Island Sound Basin, 2005–13, Connecticut and Massachusetts

    USGS Publications Warehouse

    Mullaney, John R.

    2016-03-29

    Total nitrogen loads at 14 water-quality monitoring stations were calculated by using discrete measurements of total nitrogen and continuous streamflow data for the period 2005–13 (water years 2006–13). Total nitrogen loads were calculated by using the LOADEST computer program.Overall, for water years 2006–13, streamflow in Connecticut was generally above normal. Total nitrogen yields ranged from 1,160 to 23,330 pounds per square mile per year. Total nitrogen loads from the French River at North Grosvenordale and the Still River at Brookfield Center, Connecticut, declined noticeably during the study period. An analysis of the bias in estimated loads indicated unbiased results at all but one station, indicating generally good fit for the LOADEST models.
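
    A sketch of a LOADEST-style rating-curve regression; the exact model form below (log flow, seasonal terms, and a time trend) is an assumption loosely following the program's model family, with invented data.

    ```python
    import numpy as np

    rng = np.random.default_rng(15)
    n = 300
    t = np.linspace(0, 8, n)                           # decimal years
    lnq = rng.normal(3.0, 0.8, size=n)                 # centred log streamflow
    lnl = (1.0 + 0.9 * lnq + 0.3 * np.sin(2 * np.pi * t) - 0.05 * t
           + rng.normal(0, 0.3, size=n))               # observed log loads

    x = np.column_stack([np.ones(n), lnq,
                         np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), t])
    beta, *_ = np.linalg.lstsq(x, lnl, rcond=None)
    resid_var = np.var(lnl - x @ beta, ddof=x.shape[1])
    loads = np.exp(x @ beta + resid_var / 2)           # bias-corrected retransform
    print(np.round(beta, 2), round(loads.mean(), 1))
    ```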

  19. A thermally driven differential mutation approach for the structural optimization of large atomic systems

    NASA Astrophysics Data System (ADS)

    Biswas, Katja

    2017-09-01

    A computational method is presented that is capable of obtaining low-lying energy structures of topological amorphous systems. The method merges a differential-mutation genetic algorithm with simulated annealing by incorporating a thermal selection criterion, which makes it possible to reliably obtain low-lying minima with just a small population size and is suitable for multimodal structural optimization. The method is tested on the structural optimization of amorphous graphene from unbiased atomic starting configurations. With a population size of just six systems, energetically very low structures are obtained. While each of the structures represents a distinctly different arrangement of the atoms, their properties, such as energy, distribution of rings, radial distribution function, coordination number, and distribution of bond angles, are very similar.
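
    A sketch of the combination described; the details are mine, not the paper's. Differential-evolution mutation is paired with a Metropolis-style thermal acceptance whose temperature is annealed, using a population of six on a toy energy function.

    ```python
    import numpy as np

    def energy(x):                     # stand-in multimodal energy landscape
        return float(np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)) + 10 * len(x))

    rng = np.random.default_rng(13)
    pop = rng.uniform(-3, 3, size=(6, 8))     # small population, as in the paper
    temp, cooling, f_scale = 5.0, 0.995, 0.8

    for _ in range(4000):
        for i in range(len(pop)):
            a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
            trial = a + f_scale * (b - c)              # differential mutation
            d_e = energy(trial) - energy(pop[i])
            if d_e < 0 or rng.random() < np.exp(-d_e / temp):  # thermal selection
                pop[i] = trial
        temp *= cooling                                # anneal the temperature
    print(round(min(energy(x) for x in pop), 2))
    ```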

  20. Tailored Codes for Small Quantum Memories

    NASA Astrophysics Data System (ADS)

    Robertson, Alan; Granade, Christopher; Bartlett, Stephen D.; Flammia, Steven T.

    2017-12-01

    We demonstrate that small quantum memories, realized via quantum error correction in multiqubit devices, can benefit substantially from choosing a quantum code that is tailored to the relevant error model of the system. For a biased noise model, with independent bit and phase flips occurring at different rates, we show that a single code greatly outperforms the well-studied Steane code across the full range of parameters of the noise model, including for unbiased noise. In fact, this tailored code performs almost optimally when compared with 10 000 randomly selected stabilizer codes of comparable experimental complexity. Tailored codes can even outperform the Steane code with realistic experimental noise, and without any increase in the experimental complexity, as we demonstrate by comparison with the observed error model in a recent seven-qubit trapped-ion experiment.

  1. Well-tempered metadynamics as a tool for characterizing multi-component, crystalline molecular machines.

    PubMed

    Ilott, Andrew J; Palucha, Sebastian; Hodgkinson, Paul; Wilson, Mark R

    2013-10-10

    The well-tempered, smoothly converging form of the metadynamics algorithm has been implemented in classical molecular dynamics simulations and used to obtain an estimate of the free energy surface explored by the molecular rotations in the plastic crystal, octafluoronaphthalene. The biased simulations explore the full energy surface extremely efficiently, more than 4 orders of magnitude faster than unbiased molecular dynamics runs. The metadynamics collective variables used have also been expanded to include the simultaneous orientations of three neighboring octafluoronaphthalene molecules. Analysis of the resultant three-dimensional free energy surface, which is sampled to a very high degree despite its significant complexity, demonstrates that there are strong correlations between the molecular orientations. Although this correlated motion is of limited applicability in terms of exploiting dynamical motion in octafluoronaphthalene, the approach used is extremely well suited to the investigation of the function of crystalline molecular machines.
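
    A one-dimensional sketch of the well-tempered bias update (hill heights damped by the bias already deposited, following Barducci et al. 2008); all parameters and the stream of collective-variable values are invented.

    ```python
    import numpy as np

    kdT = 25.0                               # k_B * DeltaT, sets the tempering
    w0, sigma = 1.2, 0.35                    # initial hill height and width
    grid = np.linspace(-np.pi, np.pi, 361)   # one collective variable (CV)
    bias = np.zeros_like(grid)

    rng = np.random.default_rng(12)
    for _ in range(500):
        s = rng.uniform(-1.0, 1.0)           # stand-in for the sampled CV value
        height = w0 * np.exp(-np.interp(s, grid, bias) / kdT)  # tempered height
        bias += height * np.exp(-0.5 * ((grid - s) / sigma) ** 2)
    print(round(bias.max(), 2))              # growth slows as bias accumulates
    ```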

  2. Implications of genome-wide association studies in cancer therapeutics.

    PubMed

    Patel, Jai N; McLeod, Howard L; Innocenti, Federico

    2013-09-01

    Genome-wide association studies (GWAS) provide an agnostic approach to identifying potential genetic variants associated with disease susceptibility, prognosis of survival, and/or prediction of drug response. Although these techniques are costly and interpretation of study results is challenging, they do allow for a more unbiased interrogation of the entire genome, resulting in the discovery of novel genes and understanding of novel biological associations. This review will focus on the implications of GWAS in cancer therapy, in particular germ-line mutations, including findings from major GWAS which have identified predictive genetic loci for clinical outcome and/or toxicity. Lessons and challenges in cancer GWAS are also discussed, including the need for functional analysis and replication, as well as future perspectives for biological and clinical utility. Given the large heterogeneity in response to cancer therapeutics, novel methods of identifying the mechanisms and biology of variable drug response, and ultimately treatment individualization, will be indispensable. © 2013 The British Pharmacological Society.

  3. Counter unmanned aerial system testing and evaluation methodology

    NASA Astrophysics Data System (ADS)

    Kouhestani, C.; Woo, B.; Birch, G.

    2017-05-01

    Unmanned aerial systems (UAS) are increasing in flight times, ease of use, and payload sizes. Detection, classification, tracking, and neutralization of UAS is a necessary capability for infrastructure and facility protection. We discuss test and evaluation methodology developed at Sandia National Laboratories to establish a consistent, defendable, and unbiased means for evaluating counter unmanned aerial system (CUAS) technologies. The test approach described identifies test strategies, performance metrics, UAS types tested, key variables, and the necessary data analysis to accurately quantify the capabilities of CUAS technologies. The tests conducted, as defined by this approach, will allow for the determination of quantifiable limitations, strengths, and weaknesses in terms of detection, tracking, classification, and neutralization. Communicating the results of this testing in such a manner informs decisions by government sponsors and stakeholders that can be used to guide future investments and inform procurement, deployment, and advancement of such systems into their specific venues.

  4. Dynamics of lineage commitment revealed by single-cell transcriptomics of differentiating embryonic stem cells.

    PubMed

    Semrau, Stefan; Goldmann, Johanna E; Soumillon, Magali; Mikkelsen, Tarjei S; Jaenisch, Rudolf; van Oudenaarden, Alexander

    2017-10-23

    Gene expression heterogeneity in the pluripotent state of mouse embryonic stem cells (mESCs) has been increasingly well characterized. In contrast, exit from pluripotency and lineage commitment have not been studied systematically at the single-cell level. Here we measure the gene expression dynamics of retinoic acid-driven mESC differentiation from pluripotency to lineage commitment, using an unbiased single-cell transcriptomics approach. We find that the exit from pluripotency marks the start of a lineage transition as well as a transient phase of increased susceptibility to lineage-specifying signals. Our study reveals several transcriptional signatures of this phase, including a sharp increase of gene expression variability and sequential expression of two classes of transcriptional regulators. In summary, we provide a comprehensive analysis of the exit from pluripotency and lineage commitment at the single-cell level, a potential stepping stone to improved lineage manipulation through timing of differentiation cues.

  5. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial parameters and correct classification probabilities when the classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.
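
    To make the heterogeneity concrete, the following minimal simulation sketch generates data of the kind the elaborated model addresses: each sampling unit draws its own correct-classification probability from a logit-normal distribution. All parameter values are illustrative.

```python
import numpy as np

# Minimal simulation sketch of the heterogeneity addressed above: each
# sampling unit draws its own probability of correct classification from a
# logit-normal distribution. All parameter values are illustrative.
rng = np.random.default_rng(1)

n_units, n_per_unit = 200, 50
true_pi = 0.3                                   # true proportion of category A
theta = 1 / (1 + np.exp(-rng.normal(1.5, 1.0, n_units)))  # P(correct) per unit

observed_A = np.empty(n_units)
for i in range(n_units):
    truth = rng.random(n_per_unit) < true_pi    # true category memberships
    correct = rng.random(n_per_unit) < theta[i]
    observed_A[i] = np.mean(np.where(correct, truth, ~truth))

# Fitting a constant-classification-probability model to data like these
# yields biased estimates of true_pi (the paper's point); the elaborated
# model instead places a logit-normal prior on theta and integrates over it
# with Markov chain Monte Carlo.
print(observed_A.mean(), true_pi)
```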

  6. Nonlinear vs. linear biasing in Trp-cage folding simulations

    NASA Astrophysics Data System (ADS)

    Spiwok, Vojtěch; Oborský, Pavel; Pazúriková, Jana; Křenek, Aleš; Králová, Blanka

    2015-03-01

    Biased simulations have great potential for the study of slow processes, including protein folding. Atomic motions in molecules are nonlinear, which suggests that simulations with enhanced sampling of collective motions traced by nonlinear dimensionality reduction methods may perform better than linear ones. In this study, we compare an unbiased folding simulation of the Trp-cage miniprotein with metadynamics simulations using both linear (principal component analysis) and nonlinear (Isomap) low-dimensional embeddings as collective variables. Folding of the miniprotein was successfully simulated in 200 ns simulations with both linear and nonlinear motion biasing. The folded state was correctly predicted as the free energy minimum in both simulations. We found that the advantage of linear motion biasing is that it can sample a larger conformational space, whereas the advantage of nonlinear motion biasing lies in slightly better resolution of the resulting free energy surface. In terms of sampling efficiency, both methods are comparable.

  7. Nonlinear vs. linear biasing in Trp-cage folding simulations.

    PubMed

    Spiwok, Vojtěch; Oborský, Pavel; Pazúriková, Jana; Křenek, Aleš; Králová, Blanka

    2015-03-21

    Biased simulations have great potential for the study of slow processes, including protein folding. Atomic motions in molecules are nonlinear, which suggests that simulations with enhanced sampling of collective motions traced by nonlinear dimensionality reduction methods may perform better than linear ones. In this study, we compare an unbiased folding simulation of the Trp-cage miniprotein with metadynamics simulations using both linear (principal component analysis) and nonlinear (Isomap) low-dimensional embeddings as collective variables. Folding of the miniprotein was successfully simulated in 200 ns simulations with both linear and nonlinear motion biasing. The folded state was correctly predicted as the free energy minimum in both simulations. We found that the advantage of linear motion biasing is that it can sample a larger conformational space, whereas the advantage of nonlinear motion biasing lies in slightly better resolution of the resulting free energy surface. In terms of sampling efficiency, both methods are comparable.
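
    For readers who want to reproduce the two kinds of embedding being compared, here is a minimal sketch; the random matrix stands in for aligned trajectory coordinates, and in the study itself the resulting coordinates serve as collective variables inside the metadynamics engine.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# Minimal sketch of the two embeddings compared above, applied to a matrix of
# conformations (n_frames x n_coordinates). The random matrix stands in for
# aligned Cartesian coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 60))

cv_linear = PCA(n_components=2).fit_transform(X)    # linear embedding (PCA)
cv_nonlinear = Isomap(n_neighbors=12,               # nonlinear embedding:
                      n_components=2).fit_transform(X)  # geodesic distances
                                                        # on a neighbor graph

print(cv_linear.shape, cv_nonlinear.shape)  # (500, 2) (500, 2)
```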

  8. Long and short term variability of seven blazars in six near-infrared/optical bands

    NASA Astrophysics Data System (ADS)

    Sandrinelli, A.; Covino, S.; Treves, A.

    2014-02-01

    Context. We present the light curves of six BL Lac objects, PKS 0537-441, PKS 0735+17, OJ 287, PKS 2005-489, PKS 2155-304, and W Comae, and of the flat spectrum radio quasar PKS 1510-089, as a part of a photometric monitoring program in the near-infrared/optical bands started in 2004. All sources are Fermi blazars. Aims: Our purpose is to investigate flux and spectral variability on short and long time scales. Systematic monitoring, independent of the activity of the source, guarantees large sample size statistics, and allows an unbiased view of different activity states on weekly or daily time scales for the whole timeframe and on nightly time scales for some epochs. Methods: Data were obtained with the REM telescope located at the ESO premises of La Silla (Chile). Light curves were gathered in the optical/near-infrared VRIJHK bands from April 2005 to June 2012. Results: Variability ≳3 mag is observed in PKS 0537-441, PKS 1510-089 and PKS 2155-304, the largest ranges spanned in the near-infrared. The color intensity plots show rather different morphologies. The spectral energy distributions in general are well fitted by a power law, with some deviations that are more apparent in low states. Some variability episodes during a night interval are well documented for PKS 0537-441 and PKS 2155-304. For the latter source the variability time scale implies a large relativistic beaming factor. Full Table 3 is only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/562/A79

  9. Estimators for longitudinal latent exposure models: examining measurement model assumptions.

    PubMed

    Sánchez, Brisa N; Kim, Sehee; Sammel, Mary D

    2017-06-15

    Latent variable (LV) models are increasingly being used in environmental epidemiology as a way to summarize multiple environmental exposures and thus minimize statistical concerns that arise in multiple regression. LV models may be especially useful when multivariate exposures are collected repeatedly over time. LV models can accommodate a variety of assumptions but, at the same time, present the user with many choices for model specification particularly in the case of exposure data collected repeatedly over time. For instance, the user could assume conditional independence of observed exposure biomarkers given the latent exposure and, in the case of longitudinal latent exposure variables, time invariance of the measurement model. Choosing which assumptions to relax is not always straightforward. We were motivated by a study of prenatal lead exposure and mental development, where assumptions of the measurement model for the time-changing longitudinal exposure have appreciable impact on (maximum-likelihood) inferences about the health effects of lead exposure. Although we were not particularly interested in characterizing the change of the LV itself, imposing a longitudinal LV structure on the repeated multivariate exposure measures could result in high efficiency gains for the exposure-disease association. We examine the biases of maximum likelihood estimators when assumptions about the measurement model for the longitudinal latent exposure variable are violated. We adapt existing instrumental variable estimators to the case of longitudinal exposures and propose them as an alternative to estimate the health effects of a time-changing latent predictor. We show that instrumental variable estimators remain unbiased for a wide range of data generating models and have advantages in terms of mean squared error. Copyright © 2017 John Wiley & Sons, Ltd.

  10. Multilocus patterns of polymorphism and selection across the X chromosome of Caenorhabditis remanei.

    PubMed

    Cutter, Asher D

    2008-03-01

    Natural selection and neutral processes such as demography, mutation, and gene conversion all contribute to patterns of polymorphism within genomes. Identifying the relative importance of these varied components in evolution provides the principal challenge for population genetics. To address this issue in the nematode Caenorhabditis remanei, I sampled nucleotide polymorphism at 40 loci across the X chromosome. The site-frequency spectrum for these loci provides no evidence for population size change, and one locus presents a candidate for linkage to a target of balancing selection. Selection for codon usage bias leads to the non-neutrality of synonymous sites, and despite its weak magnitude of effect (N_e s ≈ 0.1), is responsible for profound patterns of diversity and divergence in the C. remanei genome. Although gene conversion is evident for many loci, biased gene conversion is not identified as a significant evolutionary process in this sample. No consistent association is observed between synonymous-site diversity and linkage-disequilibrium-based estimators of the population recombination parameter, despite theoretical predictions about background selection or widespread genetic hitchhiking, but genetic map-based estimates of recombination are needed to rigorously test for a diversity-recombination relationship. Coalescent simulations also illustrate how a spurious correlation between diversity and linkage-disequilibrium-based estimators of recombination can occur, due in part to the presence of unbiased gene conversion. These results illustrate the influence that subtle natural selection can exert on polymorphism and divergence, in the form of codon usage bias, and demonstrate the potential of C. remanei for detecting natural selection from genomic scans of polymorphism.

  11. Estimating Unbiased Land Cover Change Areas In The Colombian Amazon Using Landsat Time Series And Statistical Inference Methods

    NASA Astrophysics Data System (ADS)

    Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.

    2017-12-01

    Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of the UNFCCC, where economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators were constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pasture, as well as an increase in secondary forest as pastures are abandoned and the forest is allowed to regenerate. Estimating the areas of other land transitions proved challenging because of their very small mapped areas compared with stable classes such as forest, which covers almost 90% of the study area. Implications for remote sensing data processing, sample allocation, and uncertainty reduction are also discussed.
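
    The unbiased stratified estimator at the core of such an analysis is compact enough to sketch; the stratum weights, sample sizes, and reference-class counts below are illustrative stand-ins, not the study's values.

```python
import numpy as np

# Minimal sketch of a stratified (design-based) area estimator: reference
# labels collected by stratified random sampling are combined with map
# stratum weights to give unbiased area estimates with confidence intervals.
W = np.array([0.88, 0.07, 0.05])   # stratum weights (mapped area proportions)
n_h = np.array([100, 100, 100])    # sample size per stratum
n_hj = np.array([2, 85, 9])        # reference "forest loss" counts per stratum

p_hj = n_hj / n_h                  # per-stratum sample proportions
p_j = np.sum(W * p_hj)             # unbiased estimate of the class proportion
se = np.sqrt(np.sum(W**2 * p_hj * (1 - p_hj) / (n_h - 1)))

total_area_ha = 5_000_000          # illustrative region size
print(f"area = {p_j * total_area_ha:,.0f} ha "
      f"+/- {1.96 * se * total_area_ha:,.0f} ha (95% CI)")
```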

  12. Constructing statistically unbiased cortical surface templates using feature-space covariance

    NASA Astrophysics Data System (ADS)

    Parvathaneni, Prasanna; Lyu, Ilwoo; Huo, Yuankai; Blaber, Justin; Hainline, Allison E.; Kang, Hakmook; Woodward, Neil D.; Landman, Bennett A.

    2018-03-01

    The choice of surface template plays an important role in cross-sectional subject analyses involving cortical brain surfaces, because there is a tendency toward registration bias given variations in inter-individual and inter-group sulcal and gyral patterns. In order to account for this bias and for spatial smoothing, we propose a feature-based unbiased average template surface. In contrast to prior approaches, we factor in the sample population covariance and assign weights based on feature information to minimize the influence of covariance in the sampled population. The mean surface is computed by applying weights obtained from an inverse covariance matrix, which guarantees that multiple representations from similar groups (e.g., sharing imaging, demographic, or diagnosis information) are down-weighted to yield an unbiased mean in feature space. Results are validated by applying this approach in two different applications. For evaluation, the proposed unbiased weighted surface mean is compared with unweighted means, both qualitatively and quantitatively (mean squared error and absolute relative distance of both means from a baseline). In the first application, we validated the stability of the proposed optimal mean on a scan-rescan reproducibility dataset by incrementally adding duplicate subjects. In the second application, we used clinical research data to evaluate the difference between the weighted and unweighted means when different numbers of subjects were included in control versus schizophrenia groups. In both cases, the proposed method achieved greater stability, indicating reduced impact of sampling bias. The weighted mean is built on covariance information in feature space, as opposed to spatial location, making this a generic approach applicable to any feature of interest.
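
    A minimal sketch of an inverse-covariance-weighted mean of this kind follows; we assume the classical minimum-variance weighting w = C^{-1}1 / (1^T C^{-1} 1), so the paper's exact construction may differ in detail.

```python
import numpy as np

# Subjects whose feature vectors are highly correlated (near-duplicates) get
# down-weighted, so repeated representations of similar groups do not bias
# the template. All data below are illustrative.
rng = np.random.default_rng(2)
F = rng.normal(size=(10, 2000))             # per-subject feature vectors
F[1] = F[0] + 0.01 * rng.normal(size=2000)  # a near-duplicate subject

C = np.cov(F)                               # subject-by-subject covariance
C += 1e-6 * np.eye(len(C))                  # regularize for invertibility
ones = np.ones(len(C))
w = np.linalg.solve(C, ones)
w /= w.sum()                                # weights sum to 1 -> unbiased mean

weighted_mean = w @ F                       # template in feature space
print(np.round(w, 3))                       # the near-duplicates share weight
```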

  13. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations.

    PubMed

    Liu, Dajiang J; Leal, Suzanne M

    2012-10-05

    Next-generation sequencing has led to many complex-trait rare-variant (RV) association studies. Although single-variant association analysis can be performed, it is grossly underpowered. Therefore, researchers have developed many RV association tests that aggregate multiple variant sites across a genetic region (e.g., a gene) and test for the association between the trait and the aggregated genotype. After these aggregate tests detect an association, it is only possible to estimate the average genetic effect for a group of RVs. As a result of the "winner's curse," such an estimate can be biased. Although for common variants one can obtain unbiased estimates of genetic parameters by analyzing a replication sample, for RVs it is desirable to obtain unbiased genetic estimates for the study where the association is identified, because there can be substantial heterogeneity of RV sites and frequencies even among closely related populations. In order to obtain an unbiased estimate for aggregated RV analysis, we developed bootstrap-sample-split algorithms to reduce the bias of the winner's curse. Unbiased estimates are critically important for understanding the population-specific contribution of RVs to the heritability of complex traits. We also demonstrate, both theoretically and via simulations, that for aggregate RV analysis the genetic variance for a gene or region will always be underestimated, sometimes substantially, because of the presence of noncausal variants or of causal variants with effects of different magnitudes or directions. Therefore, even if RVs play a major role in complex-trait etiologies, a portion of the heritability will remain missing, and the contribution of RVs to complex-trait etiologies will be underestimated. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  14. Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information

    NASA Technical Reports Server (NTRS)

    Howell, L. W., Jr.

    2003-01-01

    A simple power law model consisting of a single spectral index, σ1, is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10^13 eV, with a transition at the knee energy, E_k, to a steeper spectral index σ2 > σ1 above E_k. The maximum likelihood (ML) procedure was developed for estimating the single parameter σ1 of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The statistical properties of the ML estimator were investigated and shown to have three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotic normality, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine whether a given estimation procedure provides an unbiased estimate of the spectral information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort of calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRBs for both the simple and broken power law energy spectra are derived herein, and the conditions under which they are attained in practice are investigated.
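
    For the simple power law case, the ML estimator and its Cramer-Rao bound take closed forms; the following sketch, with illustrative values and ignoring detector response, shows both.

```python
import numpy as np

# ML estimate of a single spectral index sigma for a flux f(E) ~ E^(-sigma),
# E >= E_min, plus the Cramer-Rao bound on its variance. (The paper
# additionally treats the broken power law and detector response functions.)
rng = np.random.default_rng(3)
sigma_true, E_min, n = 2.7, 1.0, 10_000

# Inverse-CDF sampling from the power law.
E = E_min * (1 - rng.random(n)) ** (-1.0 / (sigma_true - 1.0))

sigma_hat = 1.0 + n / np.sum(np.log(E / E_min))  # ML estimate
crb = (sigma_true - 1.0) ** 2 / n                # CRB on var(sigma_hat)
print(sigma_hat, np.sqrt(crb))
```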

  15. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Much of the literature applies Principal Component Analysis (PCA) as a preliminary visualization method, a variable construction method, or both. The focus of PCA can be on the samples (R-mode PCA) or on the variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensional data before the application of Linear Discriminant Analysis (LDA) to solve classification problems. The output from PCA is composed of two new matrices, known as the loadings and scores matrices. Each matrix can then be used to produce a plot: the loadings plot aids identification of important variables, whereas the scores plot presents the spatial distribution of samples on new axes, also known as Principal Components (PCs). Fundamentally, the scores matrix is always the input for building a classification model. A recent paper used Q-mode PCA, but the focus of the analysis was not on the variables but instead on the samples. As a result, the authors exchanged the use of the loadings and scores plots: clustering of samples was studied using the loadings plot, whereas the scores plot was used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on the performance of the external error obtained from LDA models according to the number of PCs. In addition, bootstrapping was conducted to evaluate the external error of each of the LDA models. Results show that LDA models produced with PCs from R-mode PCA give logical performance, and the matching external errors are also unbiased, whereas the models produced with Q-mode PCA show the opposite. We therefore conclude that PCs produced by Q-mode PCA are not statistically stable and should not be applied to problems of classifying samples, only variables. We hope this paper provides some insight into these disputable issues.
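
    The distinction between the two modes reduces to which axis of the data matrix is handed to PCA, as the following minimal sketch illustrates; the random data and labels are illustrative stand-ins.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# R-mode PCA keeps samples as rows (scores describe samples), while Q-mode
# PCA operates on the transposed matrix (scores describe variables). Only
# R-mode scores are a sensible input for a sample classifier such as LDA.
rng = np.random.default_rng(4)
X = rng.normal(size=(120, 300))          # 120 samples x 300 variables
y = rng.integers(0, 2, size=120)         # two classes (illustrative labels)

scores_R = PCA(n_components=10).fit_transform(X)    # R-mode: 120 x 10
scores_Q = PCA(n_components=10).fit_transform(X.T)  # Q-mode: 300 x 10

lda = LinearDiscriminantAnalysis().fit(scores_R, y)
print(lda.score(scores_R, y))            # sample classification uses R-mode
# scores_Q has one row per *variable*, so it cannot even be paired with y.
```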

  16. Variable Selection through Correlation Sifting

    NASA Astrophysics Data System (ADS)

    Huang, Jim C.; Jojic, Nebojsa

    Many applications of computational biology require a variable selection procedure to sift through a large number of input variables and select some smaller number that influence a target variable of interest. For example, in virology, only some small number of viral protein fragments influence the nature of the immune response during viral infection. Due to the large number of variables to be considered, a brute-force search for the subset of variables is in general intractable. To approximate this, methods based on ℓ1-regularized linear regression have been proposed and have been found to be particularly successful. It is well understood however that such methods fail to choose the correct subset of variables if these are highly correlated with other "decoy" variables. We present a method for sifting through sets of highly correlated variables which leads to higher accuracy in selecting the correct variables. The main innovation is a filtering step that reduces correlations among variables to be selected, making the ℓ1-regularization effective for datasets on which many methods for variable selection fail. The filtering step changes both the values of the predictor variables and output values by projections onto components obtained through a computationally-inexpensive principal components analysis. In this paper we demonstrate the usefulness of our method on synthetic datasets and on novel applications in virology. These include HIV viral load analysis based on patients' HIV sequences and immune types, as well as the analysis of seasonal variation in influenza death rates based on the regions of the influenza genome that undergo diversifying selection in the previous season.
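
    A minimal sketch of this filtering idea follows: the top principal components, which carry most of the shared correlation, are projected out of both the predictors and the response before an ℓ1-regularized fit. The number of removed components and all other values are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, p = 200, 500
X = rng.normal(size=(n, p)) + rng.normal(size=(n, 1))  # shared "decoy" factor
beta = np.zeros(p)
beta[:5] = 2.0                                          # five true predictors
y = X @ beta + rng.normal(size=n)

P = PCA(n_components=3).fit(X).components_          # top components (3 x p)
T = X @ P.T                                         # component scores
X_f = X - T @ P                                     # deflate the predictors
y_f = y - T @ np.linalg.lstsq(T, y, rcond=None)[0]  # deflate the response

coef = Lasso(alpha=0.1).fit(X_f, y_f).coef_
print(np.flatnonzero(np.abs(coef) > 0.1)[:10])      # candidate variables
```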

  17. A review of covariate selection for non-experimental comparative effectiveness research.

    PubMed

    Sauer, Brian C; Brookhart, M Alan; Roy, Jason; VanderWeele, Tyler

    2013-11-01

    This paper addresses strategies for selecting variables for adjustment in non-experimental comparative effectiveness research and uses causal graphs to illustrate the causal network that relates treatment to outcome. Variables in the causal network take on multiple structural forms. Adjustment for a common cause pathway between treatment and outcome can remove confounding, whereas adjustment for other structural types may increase bias. For this reason, variable selection would ideally be based on an understanding of the causal network; however, the true causal network is rarely known. Therefore, we describe more practical variable selection approaches based on background knowledge when the causal structure is only partially known. These approaches include adjustment for all observed pretreatment variables thought to have some connection to the outcome, all known risk factors for the outcome, and all direct causes of the treatment or the outcome. Empirical approaches, such as forward and backward selection and automatic high-dimensional proxy adjustment, are also discussed. As there is a continuum between knowing and not knowing the causal, structural relations of variables, we recommend addressing variable selection in a practical way that involves a combination of background knowledge and empirical selection and that uses high-dimensional approaches. This empirical approach can be used to select from a set of a priori variables based on the researcher's knowledge to be included in the final analysis or to identify additional variables for consideration. This more limited use of empirically derived variables may reduce confounding while simultaneously reducing the risk of including variables that may increase bias. Copyright © 2013 John Wiley & Sons, Ltd.

  18. A Review of Covariate Selection for Nonexperimental Comparative Effectiveness Research

    PubMed Central

    Sauer, Brian C.; Brookhart, Alan; Roy, Jason; Vanderweele, Tyler

    2014-01-01

    This paper addresses strategies for selecting variables for adjustment in non-experimental comparative effectiveness research (CER), and uses causal graphs to illustrate the causal network that relates treatment to outcome. Variables in the causal network take on multiple structural forms. Adjustment for a common cause pathway between treatment and outcome can remove confounding, while adjustment for other structural types may increase bias. For this reason, variable selection would ideally be based on an understanding of the causal network; however, the true causal network is rarely known. Therefore, we describe more practical variable selection approaches based on background knowledge when the causal structure is only partially known. These approaches include adjustment for all observed pretreatment variables thought to have some connection to the outcome, all known risk factors for the outcome, and all direct causes of the treatment or the outcome. Empirical approaches, such as forward and backward selection and automatic high-dimensional proxy adjustment, are also discussed. As there is a continuum between knowing and not knowing the causal, structural relations of variables, we recommend addressing variable selection in a practical way that involves a combination of background knowledge and empirical selection and that uses high-dimensional approaches. This empirical approach can be used to select from a set of a priori variables based on the researcher's knowledge to be included in the final analysis or to identify additional variables for consideration. This more limited use of empirically derived variables may reduce confounding while simultaneously reducing the risk of including variables that may increase bias. PMID:24006330

  19. Act on Numbers: Numerical Magnitude Influences Selection and Kinematics of Finger Movement

    PubMed Central

    Rugani, Rosa; Betti, Sonia; Ceccarini, Francesco; Sartori, Luisa

    2017-01-01

    In the past decade, hand kinematics has been reliably adopted for investigating cognitive processes and disentangling debated topics. One of the most controversial issues in the numerical cognition literature regards the origin – cultural vs. genetically driven – of the mental number line (MNL), oriented from left (small numbers) to right (large numbers). To date, the majority of studies have investigated this effect by means of response times, whereas studies considering more culturally unbiased measures such as kinematic parameters are rare. Here, we present a new paradigm that combines a “free response” task with the kinematic analysis of movement. Participants were seated in front of two little soccer goals placed on a table, one on the left and one on the right side. They were presented with left- or right-directed arrows and were instructed to kick a small ball with their right index finger toward the goal indicated by the arrow. In a few test trials, participants were also presented with a small (2) or a large (8) number and were allowed to choose the kicking direction. Participants performed more left responses with the small number and more right responses with the large number. The whole kicking movement was segmented into two temporal phases in order to permit a fine-grained analysis of hand kinematics. The Kick Preparation and Kick Finalization phases were selected on the basis of peak trajectory deviation from the virtual midline between the two goals. Results show an effect of both small and large numbers on the timing of action execution. Participants were faster to finalize the action when responding to small numbers toward the left and to large numbers toward the right. Here, we provide the first experimental demonstration of how numerical processing affects action execution in a new and not-overlearned context. This innovative and unbiased paradigm will make it possible to disentangle the roles of nature and culture in shaping the direction of the MNL, and the role of fingers in the acquisition of numerical skills. Last but not least, similar paradigms will allow researchers to determine how cognition can influence action execution. PMID:28912743

  20. Optimal Asteroid Mass Determination from Planetary Range Observations: A Study of a Simplified Test Model

    NASA Technical Reports Server (NTRS)

    Kuchynka, P.; Laskar, J.; Fienga, A.

    2011-01-01

    Mars ranging observations are available over the past 10 years with an accuracy of a few meters. Such precise measurements of the Earth-Mars distance provide valuable constraints on the masses of the asteroids perturbing both planets. Today, more than 30 asteroid masses have been estimated from planetary ranging data (see [1] and [2]). Obtaining unbiased mass estimates is nevertheless difficult. Various systematic errors can be introduced by imperfect reduction of spacecraft tracking observations to planetary ranging data. The large number of asteroids and the limited a priori knowledge of their masses are also obstacles to parameter selection. Fitting the mass of a negligible perturber in a model, or conversely omitting a significant perturber, will induce substantial bias in the determined asteroid masses. In this communication, we investigate a simplified version of the mass determination problem. Instead of planetary ranging observations from spacecraft or radar data, we consider synthetic ranging observations generated with the INPOP [2] ephemeris for a test model containing 25000 asteroids. We then suggest a method for optimal parameter selection and estimation in this simplified framework.

  1. Age-related changes in glial cells of dopamine midbrain subregions in rhesus monkeys.

    PubMed

    Kanaan, Nicholas M; Kordower, Jeffrey H; Collier, Timothy J

    2010-06-01

    Aging remains the strongest risk factor for developing Parkinson's disease (PD), and there is selective vulnerability in midbrain dopamine (DA) neuron degeneration in PD. By tracking normal aging-related changes with an emphasis on regional specificity, factors involved in selective vulnerability and resistance to degeneration can be studied. Towards this end, we sought to determine whether age-related changes in microglia and astrocytes in rhesus monkeys are region-specific, suggestive of involvement in regional differences in vulnerability to degeneration that may be relevant to PD pathogenesis. Gliosis in midbrain DA subregions was measured by estimating glia number using unbiased stereology, assessing fluorescence intensity for proteins upregulated during activation, and rating morphology. With normal aging, microglia exhibited increased staining intensity and a shift to more activated morphologies preferentially in the vulnerable substantia nigra-ventral tier (vtSN). Astrocytes did not exhibit age-related changes consistent with an involvement in regional vulnerability in any measure. Our results suggest advancing age is associated with chronic mild inflammation in the vtSN, which may render these DA neurons more vulnerable to degeneration. Copyright 2008 Elsevier Inc. All rights reserved.

  2. Young children seek out biased information about social groups.

    PubMed

    Over, Harriet; Eggleston, Adam; Bell, Jenny; Dunham, Yarrow

    2018-05-01

    Understanding the origins of prejudice necessitates exploring the ways in which children participate in the construction of biased representations of social groups. We investigate whether young children actively seek out information that supports and extends their initial intergroup biases. In Studies 1 and 2, we show that children choose to hear a story that contains positive information about their own group and negative information about another group rather than a story that contains negative information about their own group and positive information about the other group. In a third study, we show that children choose to present biased information to others, thus demonstrating that the effects of information selection can start to propagate through social networks. In Studies 4 and 5, we further investigate the nature of children's selective information seeking and show that children prefer ingroup-favouring information to other types of biased information and even to balanced, unbiased information. Together, this work shows that children are not merely passive recipients of social information; they play an active role in the creation and transmission of intergroup attitudes. © 2017 John Wiley & Sons Ltd.

  3. RNF166 Determines Recruitment of Adaptor Proteins during Antibacterial Autophagy.

    PubMed

    Heath, Robert J; Goel, Gautam; Baxt, Leigh A; Rush, Jason S; Mohanan, Vishnu; Paulus, Geraldine L C; Jani, Vijay; Lassen, Kara G; Xavier, Ramnik J

    2016-11-22

    Xenophagy is a form of selective autophagy that involves the targeting and elimination of intracellular pathogens through several recognition, recruitment, and ubiquitination events. E3 ubiquitin ligases control substrate selectivity in the ubiquitination cascade; however, systematic approaches to map the role of E3 ligases in antibacterial autophagy have been lacking. We screened more than 600 putative human E3 ligases, identifying E3 ligases that are required for adaptor protein recruitment and LC3-bacteria colocalization, critical steps in antibacterial autophagy. An unbiased informatics approach pinpointed RNF166 as a key gene that interacts with the autophagy network and controls the recruitment of ubiquitin as well as the autophagy adaptors p62 and NDP52 to bacteria. Mechanistic studies demonstrated that RNF166 catalyzes K29- and K33-linked polyubiquitination of p62 at residues K91 and K189. Thus, our study expands the catalog of E3 ligases that mediate antibacterial autophagy and identifies a critical role for RNF166 in this process. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Cosmic Star Formation - Seen from the Milky Way with AtLAST Short Contributed Talk

    NASA Astrophysics Data System (ADS)

    Kauffmann, Jens

    2018-01-01

    Herschel and Spitzer provided the first truly unbiased overviews of star formation environments in the Milky Way. Today, high-powered instruments like ALMA additionally resolve the immediate birth environments of individual stars in a few selected regions throughout the Galaxy. This progress in the Milky Way is important, because the same facilities also allow us to explore how galaxies evolved over time. Was star formation more efficient in the dense molecular clouds found in starburst galaxies? Why do galaxies often follow star formation relations like those of Kennicutt & Schmidt and Gao & Solomon? A cloud-scale understanding of star formation processes, which can only be developed in the Milky Way, is necessary to make progress. Unfortunately, ALMA can resolve the detailed substructure only in selected Galactic molecular clouds, given that mapping with ALMA is very slow. Here I show how surveys of dust continuum and line emission provided by a large and fast single-dish telescope can overcome these critical limitations, e.g., by breaking degeneracies in current theoretical models. My discussion draws on white papers previously developed for similar telescopes.

  5. Independent technical review and analysis of hydraulic modeling and hydrology under low-flow conditions of the Des Plaines River near Riverside, Illinois

    USGS Publications Warehouse

    Over, Thomas M.; Straub, Timothy D.; Hortness, Jon E.; Murphy, Elizabeth A.

    2012-01-01

    The U.S. Geological Survey (USGS) has operated a streamgage and published daily flows for the Des Plaines River at Riverside since Oct. 1, 1943. A HEC-RAS model has been developed to estimate the effect of the removal of Hofmann Dam near the gage on low-flow elevations in the reach approximately 3 miles upstream from the dam. The Village of Riverside, the Illinois Department of Natural Resources-Office of Water Resources (IDNR-OWR), and the U. S. Army Corps of Engineers-Chicago District (USACE-Chicago) are interested in verifying the performance of the HEC-RAS model for specific low-flow conditions, and obtaining an estimate of selected daily flow quantiles and other low-flow statistics for a selected period of record that best represents current hydrologic conditions. Because the USGS publishes streamflow records for the Des Plaines River system and provides unbiased analyses of flows and stream hydraulic characteristics, the USGS served as an Independent Technical Reviewer (ITR) for this study.

  6. Clear: Composition of Likelihoods for Evolve and Resequence Experiments.

    PubMed

    Iranmehr, Arya; Akbari, Ali; Schlötterer, Christian; Bafna, Vineet

    2017-06-01

    The advent of next-generation sequencing technologies has made whole-genome and whole-population sampling possible, even for eukaryotes with large genomes. With this development, experimental evolution studies can be designed to observe molecular evolution "in action" via evolve-and-resequence (E&R) experiments. Among other applications, E&R studies can be used to locate the genes and variants responsible for genetic adaptation. Most of the existing literature on time-series data analysis assumes large population sizes, accurate allele frequency estimates, or wide time spans. These assumptions do not hold in many E&R studies. In this article, we propose a method, composition of likelihoods for evolve-and-resequence experiments (Clear), to identify signatures of selection in small-population E&R experiments. Clear takes whole-genome sequences of pools of individuals as input and properly addresses heterogeneous ascertainment bias resulting from uneven coverage. Clear also provides unbiased estimates of model parameters, including population size, selection strength, and dominance, while being computationally efficient. Extensive simulations show that Clear achieves higher power in detecting and localizing selection over a wide range of parameters and is robust to variation in coverage. We applied the Clear statistic to multiple E&R experiments, including data from a study of adaptation of Drosophila melanogaster to alternating temperatures and a study of outcrossing yeast populations, and identified multiple regions under selection with genome-wide significance. Copyright © 2017 by the Genetics Society of America.

  7. Application of a multipurpose unequal probability stream survey in the Mid-Atlantic Coastal Plain

    USGS Publications Warehouse

    Ator, S.W.; Olsen, A.R.; Pitchford, A.M.; Denver, J.M.

    2003-01-01

    A stratified, spatially balanced sample with unequal probability selection was used to design a multipurpose survey of headwater streams in the Mid-Atlantic Coastal Plain. Objectives for the survey include unbiased estimates of regional stream conditions, and adequate coverage of unusual but significant environmental settings to support empirical modeling of the factors affecting those conditions. The design and field application of the survey are discussed in light of these multiple objectives. A probability (random) sample of 175 first-order nontidal streams was selected for synoptic sampling of water chemistry and benthic and riparian ecology during late winter and spring 2000. Twenty-five streams were selected within each of seven hydrogeologic subregions (strata) that were delineated on the basis of physiography and surficial geology. In each subregion, unequal inclusion probabilities were used to provide an approximately even distribution of streams along a gradient of forested to developed (agricultural or urban) land in the contributing watershed. Alternate streams were also selected. Alternates were included in groups of five in each subregion when field reconnaissance demonstrated that primary streams were inaccessible or otherwise unusable. Despite the rejection and replacement of a considerable number of primary streams during reconnaissance (up to 40 percent in one subregion), the desired land use distribution was maintained within each hydrogeologic subregion without sacrificing the probabilistic design.
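
    The reason unequal selection probabilities still permit unbiased regional estimates is that each sampled stream can be weighted by the inverse of its inclusion probability, as in the Horvitz-Thompson estimator sketched below; all values are illustrative.

```python
import numpy as np

# Horvitz-Thompson estimation under unequal inclusion probabilities: the
# design oversamples streams along a land-use gradient, yet the weighted
# total remains unbiased. The design variable and values are illustrative.
rng = np.random.default_rng(6)
N = 5000                                 # streams in one subregion
develop = rng.random(N)                  # developed-land fraction per stream
nitrate = 1.0 + 4.0 * develop + rng.normal(0, 0.5, N)  # condition of interest

# Inclusion probabilities tilted toward developed watersheds, scaled so the
# expected sample size is 25 (matching the survey's per-subregion target).
pi = 0.2 + 0.8 * develop
pi *= 25 / pi.sum()

sampled = rng.random(N) < pi             # Poisson sampling
ht_total = np.sum(nitrate[sampled] / pi[sampled])  # unbiased for the total
print(ht_total / N, nitrate.mean())      # HT mean vs. the true mean
```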

  8. Spectroscopic classification of supernova SN 2018Z by NUTS (NOT Un-biased Transient Survey)

    NASA Astrophysics Data System (ADS)

    Kuncarayakti, H.; Mattila, S.; Kotak, R.; Harmanen, J.; Reynolds, T.; Pastorello, A.; Benetti, S.; Stritzinger, M.; Onori, F.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.

    2018-01-01

    The NOT Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of supernova SN 2018Z in the host galaxy SDSS J231809.76+212553.5. The observations were performed with the 2.56 m Nordic Optical Telescope equipped with ALFOSC (range 350-950 nm; resolution 1.6 nm) on 2018-01-09.9 UT.
    Survey Name: PS18ao | IAU Name: SN 2018Z | Discovery (UT): 2018-01-01.2 | Discovery mag: 19.96 | Observation (UT): 2018-01-09.9 | Redshift: 0.102 | Type: Ia | Phase: post-maximum? | Notes: (1)
    (1) The redshift was derived from the SN and host absorption features.

  9. On the mathematical foundations of mutually unbiased bases

    NASA Astrophysics Data System (ADS)

    Thas, Koen

    2018-02-01

    In order to describe a setting to handle Zauner's conjecture on mutually unbiased bases (MUBs) (stating that in C^d, a set of MUBs of the theoretical maximal size d + 1 exists only if d is a prime power), we pose some fundamental questions which naturally arise. Some of these questions have important consequences for the construction theory of (new) sets of maximal MUBs. Partial answers will be provided in particular cases; more specifically, we will analyze MUBs with associated operator groups that have nilpotence class 2, and consider MUBs of height 1. We will also confirm Zauner's conjecture for MUBs with associated finite nilpotent operator groups.
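
    For orientation, the defining property of the objects above: two orthonormal bases {|e_i⟩} and {|f_j⟩} of C^d are mutually unbiased precisely when every vector of one basis has the same overlap with every vector of the other,

        |⟨e_i|f_j⟩|² = 1/d  for all i, j,

    so that a measurement in one basis reveals nothing about a state prepared in the other; Zauner's conjecture concerns how many bases can be pairwise unbiased in this sense.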

  10. Analysis of conditional genetic effects and variance components in developmental genetics.

    PubMed

    Zhu, J

    1995-12-01

    A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at the previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by the minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given to compare unconditional and conditional genetic variances and additive effects.

  11. Analysis of Conditional Genetic Effects and Variance Components in Developmental Genetics

    PubMed Central

    Zhu, J.

    1995-01-01

    A genetic model with additive-dominance effects and genotype X environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at the previous time (t - 1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by the minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given to compare unconditional and conditional genetic variances and additive effects. PMID:8601500

  12. Large deviations in the presence of cooperativity and slow dynamics

    NASA Astrophysics Data System (ADS)

    Whitelam, Stephen

    2018-06-01

    We study simple models of intermittency, involving switching between two states, within the dynamical large-deviation formalism. Singularities appear in the formalism when switching is cooperative or when its basic time scale diverges. In the first case the unbiased trajectory distribution undergoes a symmetry breaking, leading to a change in shape of the large-deviation rate function for a particular dynamical observable. In the second case the symmetry of the unbiased trajectory distribution remains unbroken. Comparison of these models suggests that singularities of the dynamical large-deviation formalism can signal the dynamical equivalent of an equilibrium phase transition but do not necessarily do so.

  13. Unbiased classification of spatial strategies in the Barnes maze.

    PubMed

    Illouz, Tomer; Madar, Ravit; Clague, Charlotte; Griffioen, Kathleen J; Louzoun, Yoram; Okun, Eitan

    2016-11-01

    Spatial learning is one of the most widely studied cognitive domains in neuroscience. The Morris water maze and the Barnes maze are the most commonly used techniques to assess spatial learning and memory in rodents. Despite the fact that these tasks are well-validated paradigms for testing spatial learning abilities, manual categorization of performance into behavioral strategies is subject to individual interpretation, and thus to bias. We have previously described an unbiased machine-learning algorithm to classify spatial strategies in the Morris water maze. Here, we offer a support vector machine-based, automated, Barnes-maze unbiased strategy (BUNS) classification algorithm, as well as a cognitive score scale that can be used for memory acquisition, reversal training and probe trials. The BUNS algorithm can greatly benefit Barnes maze users as it provides a standardized method of strategy classification and cognitive scoring scale, which cannot be derived from typical Barnes maze data analysis. Availability and implementation: Freely available on the web at http://okunlab.wix.com/okunlab as a MATLAB application. Contact: eitan.okun@biu.ac.il. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Detection of sea otters in boat-based surveys of Prince William Sound, Alaska

    USGS Publications Warehouse

    Udevitz, Mark S.; Bodkin, James L.; Costa, Daniel P.

    1995-01-01

    Boat-based surveys have been commonly used to monitor sea otter populations, but there has been little quantitative work to evaluate detection biases that may affect these surveys. We used ground-based observers to investigate sea otter detection probabilities in a boat-based survey of Prince William Sound, Alaska. We estimated that 30% of the otters present on surveyed transects were not detected by boat crews. Approximately half (53%) of the undetected otters were missed because the otters left the transects, apparently in response to the approaching boat. Unbiased estimates of detection probabilities will be required for obtaining unbiased population estimates from boat-based surveys of sea otters. Therefore, boat-based surveys should include methods to estimate sea otter detection probabilities under the conditions specific to each survey. Unbiased estimation of detection probabilities with ground-based observers requires either that the ground crews detect all of the otters in observed subunits, or that there are no errors in determining which crews saw each detected otter. Ground-based observer methods may be appropriate in areas where nearly all of the sea otter habitat is potentially visible from ground-based vantage points.

  15. Unbiased Strain-Typing of Arbovirus Directly from Mosquitoes Using Nanopore Sequencing: A Field-forward Biosurveillance Protocol.

    PubMed

    Russell, Joseph A; Campos, Brittany; Stone, Jennifer; Blosser, Erik M; Burkett-Cadena, Nathan; Jacobs, Jonathan L

    2018-04-03

    The future of infectious disease surveillance and outbreak response is trending towards smaller hand-held solutions for point-of-need pathogen detection. Here, samples of Culex cedecei mosquitoes collected in Southern Florida, USA, were tested for Venezuelan Equine Encephalitis Virus (VEEV), a previously-weaponized arthropod-borne RNA-virus capable of causing acute and fatal encephalitis in animal and human hosts. A single 20-mosquito pool tested positive for VEEV by quantitative reverse transcription polymerase chain reaction (RT-qPCR) on the Biomeme two3. The virus-positive sample was subjected to unbiased metatranscriptome sequencing on the Oxford Nanopore MinION and shown to contain Everglades Virus (EVEV), an alphavirus in the VEEV serocomplex. Our results demonstrate, for the first time, the use of unbiased sequence-based detection and subtyping of a high-consequence biothreat pathogen directly from an environmental sample using field-forward protocols. The development and validation of methods designed for field-based diagnostic metagenomics and pathogen discovery, such as those suitable for use in mobile "pocket laboratories", will address a growing demand for public health teams to carry out their mission where it is most urgent: at the point-of-need.

  16. Power Generation from a Radiative Thermal Source Using a Large-Area Infrared Rectenna

    NASA Astrophysics Data System (ADS)

    Shank, Joshua; Kadlec, Emil A.; Jarecki, Robert L.; Starbuck, Andrew; Howell, Stephen; Peters, David W.; Davids, Paul S.

    2018-05-01

    Electrical power generation from a moderate-temperature thermal source by means of direct conversion of infrared radiation is important and highly desirable for energy harvesting from waste heat and micropower applications. Here, we demonstrate direct rectified power generation from an unbiased large-area nanoantenna-coupled tunnel diode rectifier called a rectenna. Using a vacuum radiometric measurement technique with irradiation from a temperature-stabilized thermal source, a generated power density of 8 nW/cm² is observed at a source temperature of 450 °C for the unbiased rectenna across an optimized load resistance. The optimized load resistance for peak power generation at each temperature coincides with the tunnel diode resistance at zero bias and corresponds to the impedance-matching condition for a rectifying antenna. Current-voltage measurements of a thermally illuminated large-area rectenna show current zero-crossing shifts into the second quadrant, indicating rectification. Photon-assisted tunneling in the unbiased rectenna is modeled as the mechanism for the large short-circuit photocurrents observed, where the photon energy serves as an effective bias across the tunnel junction. The measured current and voltage across the load resistor as a function of the thermal source temperature represent direct current electrical power generation.

  17. Prioritizing causal disease genes using unbiased genomic features.

    PubMed

    Deo, Rahul C; Musso, Gabriel; Tasan, Murat; Tang, Paul; Poon, Annie; Yuan, Christiana; Felix, Janine F; Vasan, Ramachandran S; Beroukhim, Rameen; De Marco, Teresa; Kwok, Pui-Yan; MacRae, Calum A; Roth, Frederick P

    2014-12-03

    Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.

  18. An Automatic Method for Generating an Unbiased Intensity Normalizing Factor in Positron Emission Tomography Image Analysis After Stroke.

    PubMed

    Nie, Binbin; Liang, Shengxiang; Jiang, Xiaofeng; Duan, Shaofeng; Huang, Qi; Zhang, Tianhao; Li, Panlong; Liu, Hua; Shan, Baoci

    2018-06-07

    Positron emission tomography (PET) imaging of functional metabolism has been widely used to investigate functional recovery and to evaluate therapeutic efficacy after stroke. The voxel intensity of a PET image is the most important indicator of cellular activity, but it is affected by other factors such as the basal metabolic ratio of each subject. In order to locate dysfunctional regions accurately, intensity normalization by a scale factor is a prerequisite of the data analysis, and the global mean value is most widely used for this purpose. However, it is unsuitable for stroke studies. Alternatively, a specific scale factor calculated from a reference region comprising neither hyper- nor hypo-metabolic voxels is also used, but there is no such recognized reference region for stroke studies. Therefore, we propose a fully data-driven automatic method for unbiased scale factor generation. This factor was generated iteratively until the residual deviation between two successive scale factors fell below 5%. Moreover, both simulated and real stroke data were used for evaluation, and the results suggest that our proposed unbiased scale factor has better sensitivity and accuracy for stroke studies.
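
    A minimal sketch of an iterative, data-driven scale factor in this spirit follows; the specific exclusion rule (a z-score cutoff) and all numeric values are assumed variants for illustration, not the paper's exact procedure.

```python
import numpy as np

# Start from the global mean, drop voxels that deviate strongly from the
# current reference set (candidate hyper-/hypo-metabolic voxels), and
# recompute until successive factors change by less than 5%.
def iterative_scale_factor(img, z_cut=2.0, tol=0.05, max_iter=50):
    vox = img[img > 0].astype(float)   # brain voxels (illustrative mask)
    factor = vox.mean()                # iteration 0: the global mean
    for _ in range(max_iter):
        keep = np.abs(vox - factor) < z_cut * vox.std()
        new_factor = vox[keep].mean()
        if abs(new_factor - factor) / factor < tol:
            return new_factor
        factor = new_factor
    return factor

img = np.random.default_rng(7).gamma(4.0, 1.0, size=(64, 64, 32))
print(iterative_scale_factor(img))
```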

  19. Nanoslit cavity plasmonic modes and built-in fields enhance the CW THz radiation in an unbiased antennaless photomixers array.

    PubMed

    Mohammad-Zamani, Mohammad Javad; Neshat, Mohammad; Moravvej-Farshi, Mohammad Kazem

    2016-01-15

    A new generation of unbiased, antennaless CW terahertz (THz) photomixer emitter arrays made of asymmetric metal-semiconductor-metal (MSM) gratings with a subwavelength pitch, operating in the optical near-field regime, is proposed. We take advantage of size effects in near-field optics and electrostatics to demonstrate the possibility of enhancing the THz power by 4 orders of magnitude compared to a similar unbiased antennaless array of the same size operating in the far-field regime. We show that, with an appropriate choice of grating parameters in such THz sources, the first plasmonic resonant cavity mode in the nanoslit between two adjacent MSMs can enhance the optical near-field absorption and, hence, the generation of photocarriers under the slit in the active medium. These photocarriers are then accelerated by the large built-in electric field sustained under the nanoslits by two dissimilar Schottky barriers, creating the desired large THz power, which is mainly radiated downward. The proposed structure can be tuned over a broad frequency range of 0.1-3 THz, with output power increasing with frequency.

  20. Probability Theory Plus Noise: Descriptive Estimation and Inferential Judgment.

    PubMed

    Costello, Fintan; Watts, Paul

    2018-01-01

    We describe a computational model of two central aspects of people's probabilistic reasoning: descriptive probability estimation and inferential probability judgment. This model assumes that people's reasoning follows standard frequentist probability theory but is subject to random noise. This random noise has a regressive effect in descriptive probability estimation, moving probability estimates away from normative probabilities and toward the center of the probability scale. This random noise has an anti-regressive effect in inferential judgment, however. These regressive and anti-regressive effects explain various reliable and systematic biases seen in people's descriptive probability estimation and inferential probability judgment. This model predicts that these contrary effects will tend to cancel out in tasks that involve both descriptive estimation and inferential judgment, leading to unbiased responses in those tasks. We test this model by applying it to one such task, described by Gallistel et al. Participants' median responses in this task were unbiased, agreeing with normative probability theory over the full range of responses. Our model captures the pattern of unbiased responses in this task, while simultaneously explaining systematic biases away from normatively correct probabilities seen in other tasks. Copyright © 2018 Cognitive Science Society, Inc.
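
    The regressive effect described here follows from a simple noise model: if each remembered outcome is flipped with probability d, the expected estimate is (1 - 2d)p + d, which pulls estimates toward 0.5. A small simulation sketch (d and the sample sizes are illustrative choices, not the paper's fitted values):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def noisy_estimate(p, d, n=1000):
        """Frequentist estimation with read-out noise: each remembered outcome
        is flipped with probability d, so E[estimate] = (1 - 2d) * p + d."""
        outcomes = rng.random(n) < p      # true event occurrences
        flipped = rng.random(n) < d       # noisy recall flips
        return np.mean(outcomes ^ flipped)

    for p in (0.1, 0.5, 0.9):
        est = np.mean([noisy_estimate(p, d=0.2) for _ in range(200)])
        print(f"true p = {p:.1f}  mean estimate = {est:.3f}  "
              f"prediction = {(1 - 2 * 0.2) * p + 0.2:.3f}")
    ```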

  1. Comparison of estimators of standard deviation for hydrologic time series

    USGS Publications Warehouse

    Tasker, Gary D.; Gilroy, Edward J.

    1982-01-01

    Unbiasing factors as a function of serial correlation, ρ, and sample size, n, for the sample standard deviation of a lag-one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag-one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ that were much less biased, but had greater mean square errors, than the usual estimate s = [(1/(n − 1)) Σ (x_i − x̄)²]^(1/2). The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
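
    The simulation idea is straightforward to reproduce: generate lag-one autoregressive series, compute s for each, and take E[s]/σ as the bias, whose reciprocal is the unbiasing factor. A sketch with hypothetical n and ρ (not the paper's tabulated cases):

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    def ar1_series(n, rho, sigma=1.0):
        """Lag-one autoregressive series with marginal standard deviation sigma."""
        x = np.empty(n)
        x[0] = rng.normal(0, sigma)
        innov_sd = sigma * np.sqrt(1 - rho**2)
        for t in range(1, n):
            x[t] = rho * x[t - 1] + rng.normal(0, innov_sd)
        return x

    n, rho, reps = 20, 0.6, 10_000
    s = np.array([ar1_series(n, rho).std(ddof=1) for _ in range(reps)])
    print(f"E[s]/sigma ~ {s.mean():.3f}; unbiasing factor ~ {1 / s.mean():.3f}")
    ```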

  2. Unbiased multi-fidelity estimate of failure probability of a free plane jet

    NASA Astrophysics Data System (ADS)

    Marques, Alexandre; Kramer, Boris; Willcox, Karen; Peherstorfer, Benjamin

    2017-11-01

    Estimating failure probabilities related to fluid flows is a challenge because it requires a large number of evaluations of expensive models. We address this challenge by leveraging multiple low-fidelity models of the flow dynamics to create an optimal unbiased estimator. In particular, we investigate the effects of uncertain inlet conditions on the width of a free plane jet. We classify a condition as a failure when the corresponding jet width is below a small threshold, such that failure is a rare event (the failure probability is smaller than 0.001). We estimate the failure probability by combining the frameworks of multi-fidelity importance sampling and optimal fusion of estimators. Multi-fidelity importance sampling uses a low-fidelity model to explore the parameter space and create a biasing distribution. An unbiased estimate is then computed with a relatively small number of evaluations of the high-fidelity model. In the presence of multiple low-fidelity models, this framework offers multiple competing estimators. Optimal fusion combines all competing estimators into a single estimator with minimal variance. We show that this combined framework can significantly reduce the cost of estimating failure probabilities and thus can have a large impact in fluid flow applications. This work was funded by DARPA.
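
    A toy sketch of the multi-fidelity importance sampling step described above: a cheap surrogate locates the failure region, a biasing density is fitted there, and a small number of "high-fidelity" evaluations are reweighted by the nominal-to-biasing density ratio to keep the estimate unbiased. Both models and all constants are invented stand-ins for the jet problem:

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(3)

    # Toy stand-ins: "jet width" as a function of one uncertain inlet parameter.
    g_hi = lambda x: 1.0 + 0.5 * x + 0.02 * x**3   # "expensive" model
    g_lo = lambda x: 1.0 + 0.5 * x                 # cheap surrogate
    thresh = -0.6                                  # failure: width below threshold

    # Step 1: explore cheaply to locate the failure region.
    x_lo = rng.normal(0, 1, 100_000)
    fail_lo = x_lo[g_lo(x_lo) < thresh]

    # Step 2: fit a Gaussian biasing density to the surrogate's failures,
    # widened a little to guard against surrogate bias.
    mu_b, sd_b = fail_lo.mean(), fail_lo.std() + 0.5

    # Step 3: a few high-fidelity runs under the biasing density; reweighting
    # by the nominal/biasing density ratio keeps the estimator unbiased.
    x_hi = rng.normal(mu_b, sd_b, 500)
    w = norm.pdf(x_hi, 0, 1) / norm.pdf(x_hi, mu_b, sd_b)
    p_fail = np.mean(w * (g_hi(x_hi) < thresh))
    print(f"estimated failure probability ~ {p_fail:.2e}")
    ```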

  3. Spatial Variability in Black Carbon Mixing State Observed During The Multi-City NASA DISCOVER-AQ Field Campaign

    NASA Astrophysics Data System (ADS)

    Moore, R.; Ziemba, L. D.; Beyersdorf, A. J.; Chen, G.; Corr, C.; Hudgins, C.; Martin, R.; Shook, M.; Thornhill, K. L., II; Winstead, E.; Anderson, B. E.

    2014-12-01

    Light-absorbing carbonaceous aerosols are known to be an important climatic driver, with a global radiative forcing of about half (IPCC, 2013) to two-thirds (Bond et al., 2013) that of the dominant greenhouse gas, carbon dioxide. While the mass absorption coefficient of pure black carbon (BC) is fairly well known, observational evidence suggests that BC rapidly mixes with other aerosol chemical components within hours of emission (Moffet and Prather, 2009; Moteki et al., 2007). These other components may include predominantly scattering organic, sulfate, and nitrate species, as well as light-absorbing, so-called "brown carbon" (BrC). It has been suggested that the presence of these BC-mixed components may induce mixing-state-dependent lensing effects that could potentially double the BC direct radiative forcing (Jacobson, 2001). The key to better understanding how BC-rich aerosols are distributed in the atmosphere is to examine an unbiased set of measurements with broad spatial and temporal coverage; however, many past airborne field campaigns have specifically targeted source plumes or other scientifically relevant emissions sources. The recent NASA DISCOVER-AQ campaign is unique in that approximately the same flight pattern was performed over a month-long period in each of four different U.S. metropolitan areas, ensuring an unbiased, or at least less biased, data set with both wide horizontal and vertical (surface to 5 km altitude) coverage. We present a statistical analysis of BC-rich particle mixing state measured during DISCOVER-AQ by a DMT Single Particle Soot Photometer (SP2). The SP2 measures the BC mass distribution via laser incandescence, and the non-BC coating thickness is inferred from the light-scattering signal of particles greater than 200 nm in diameter (Gao et al., 2007; Moteki and Kondo, 2008). The SP2-derived size distributions are compared to optical scattering size distributions measured by a UHSAS in order to determine 1) the externally mixed fraction of particles containing BC across the optically active region of the size distribution (200-1000 nm) and 2) the internally mixed volume fraction of BC relative to the total particle volume, assuming spherical particles. Vertical profiles of these variables are discussed in the context of remotely sensing aerosol mixing state.
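
    The two mixing-state metrics at the end of the abstract reduce to simple per-particle arithmetic once core and total diameters are in hand. A toy sketch under one plausible reading of those definitions (all distributions and the "externally mixed" bookkeeping are illustrative assumptions, not SP2 processing code):

    ```python
    import numpy as np

    rng = np.random.default_rng(11)
    n = 5000

    # Hypothetical per-particle quantities: BC core diameter (from laser
    # incandescence) and total coated diameter (from light scattering).
    d_core = rng.lognormal(np.log(180), 0.3, n)    # nm
    coating = rng.exponential(40, n)               # nm coating thickness
    d_total = d_core + 2 * coating
    has_bc = rng.random(n) < 0.3                   # assume 30% contain BC

    in_range = (d_total > 200) & (d_total < 1000)  # optically active sizes

    # One reading of metric (1): fraction of optically active particles that
    # carry no BC at all (externally mixed with respect to BC).
    ext_fraction = 1.0 - has_bc[in_range].mean()

    # Metric (2): BC volume fraction per BC-containing particle, assuming
    # spherical particles as in the abstract.
    vol_fraction = (d_core[has_bc] / d_total[has_bc]) ** 3

    print(f"non-BC fraction (200-1000 nm): {ext_fraction:.2f}")
    print(f"median BC volume fraction: {np.median(vol_fraction):.2f}")
    ```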

  4. Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant.

    PubMed

    Vanderhaeghe, F; Smolders, A J P; Roelofs, J G M; Hoffmann, M

    2012-03-01

    Selecting an appropriate variable subset in linear multivariate methods is an important methodological issue for ecologists. Interest often exists in obtaining general predictive capacity or in finding causal inferences from predictor variables. Because of a lack of solid knowledge about a studied phenomenon, scientists explore predictor variables in order to find the most meaningful (i.e., discriminating) ones. As an example, we modelled the response of the amphibious softwater plant Eleocharis multicaulis using canonical discriminant function analysis. We asked how variables can best be selected, comparing several methods: univariate Pearson chi-square screening, principal components analysis (PCA), and step-wise analysis, as well as combinations of some of these methods. We expected PCA to perform best. The selected methods were evaluated through the fit and stability of the resulting discriminant functions and through correlations between these functions and the predictor variables. The chi-square subset, at P < 0.05, followed by a step-wise sub-selection, gave the best results. Contrary to expectations, PCA performed poorly, as did step-wise analysis. The different chi-square subset methods all yielded ecologically meaningful variables, while probable noise variables were also selected by PCA and step-wise analysis. We advise against the simple use of PCA or step-wise discriminant analysis to obtain an ecologically meaningful variable subset; the former because it does not take into account the response variable, the latter because noise variables are likely to be selected. We suggest that univariate screening techniques are a worthwhile alternative for variable selection in ecology. © 2011 German Botanical Society and The Royal Botanical Society of the Netherlands.
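
    The winning recipe (univariate chi-square screening followed by step-wise sub-selection into a discriminant model) can be imitated with standard tooling. A sketch using scikit-learn stand-ins on synthetic data; the binning, the k values, and the use of LDA are assumptions, not the authors' exact procedure:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.preprocessing import KBinsDiscretizer
    from sklearn.feature_selection import SelectKBest, chi2, SequentialFeatureSelector
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Synthetic presence/absence data standing in for the species records.
    X, y = make_classification(n_samples=300, n_features=30, n_informative=5,
                               random_state=0)

    # Stage 1: univariate chi-square screening on discretized predictors.
    X_binned = KBinsDiscretizer(n_bins=4, encode="ordinal").fit_transform(X)
    screen = SelectKBest(chi2, k=10).fit(X_binned, y)
    X_screened = X[:, screen.get_support()]

    # Stage 2: step-wise sub-selection feeding a discriminant model.
    lda = LinearDiscriminantAnalysis()
    stepwise = SequentialFeatureSelector(lda, n_features_to_select=4,
                                         direction="forward").fit(X_screened, y)
    kept = np.flatnonzero(screen.get_support())[stepwise.get_support()]
    print("kept predictors:", kept)
    ```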

  5. Genomic selection for fruit quality traits in apple (Malus×domestica Borkh.).

    PubMed

    Kumar, Satish; Chagné, David; Bink, Marco C A M; Volz, Richard K; Whitworth, Claire; Carlisle, Charmaine

    2012-01-01

    The genome sequence of apple (Malus×domestica Borkh.) was published more than a year ago, which helped develop an 8K SNP chip to assist in implementing genomic selection (GS). In apple breeding programmes, GS can be used to obtain genomic breeding values (GEBV) for choosing next-generation parents or selections for further testing as potential commercial cultivars at a very early stage. Thus GS has the potential to accelerate breeding efficiency significantly because of a decreased generation interval or an increased selection intensity. We evaluated the accuracy of GS in a population of 1120 seedlings generated from a factorial mating design of four female and two male parents. All seedlings were genotyped using an Illumina Infinium chip comprising 8,000 single nucleotide polymorphisms (SNPs), and were phenotyped for various fruit quality traits. Random-regression best linear unbiased prediction (RR-BLUP) and the Bayesian LASSO method were used to obtain GEBV, and compared using a cross-validation approach for their accuracy in predicting unobserved BLUP-BV. Accuracies were very similar for both methods, varying from 0.70 to 0.90 for the various fruit quality traits. The selection response per unit time using GS compared with traditional BLUP-based selection was very high (>100%), especially for low-heritability traits. Genome-wide average estimated linkage disequilibrium (LD) between adjacent SNPs was 0.32, with a relatively slow decay of LD in the long range (r(2) = 0.33 and 0.19 at 100 kb and 1,000 kb, respectively), contributing to the higher accuracy of GS. The distribution of estimated SNP effects revealed the involvement of large-effect genes with likely pleiotropic effects. These results demonstrate that genomic selection is a credible alternative to conventional selection for fruit quality traits.
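
    RR-BLUP is numerically equivalent to ridge regression on the marker matrix, so the cross-validated accuracy assessment can be sketched in a few lines. The marker counts, effect sizes, and ridge penalty below are illustrative, not the study's values:

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(4)
    n, m = 500, 2000                              # seedlings, SNP markers
    X = rng.integers(0, 3, (n, m)).astype(float)  # 0/1/2 genotype codes
    beta = np.zeros(m)
    beta[rng.choice(m, 50, replace=False)] = rng.normal(0, 0.3, 50)
    y = X @ beta + rng.normal(0, 2.0, n)          # phenotype = signal + noise

    # RR-BLUP is equivalent to ridge regression on the marker matrix with a
    # penalty tied to the marker-effect variance; alpha here is a stand-in.
    gebv = cross_val_predict(Ridge(alpha=100.0), X, y, cv=10)
    print(f"cross-validated accuracy r = {np.corrcoef(gebv, y)[0, 1]:.2f}")
    ```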

  6. Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis

    ERIC Educational Resources Information Center

    Brusco, Michael J.; Singh, Renu; Steinley, Douglas

    2009-01-01

    The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…

  7. Model selection bias and Freedman's paradox

    USGS Publications Warehouse

    Lukacs, P.M.; Burnham, K.P.; Anderson, D.R.

    2010-01-01

    In situations where limited knowledge of a system exists and the ratio of data points to variables is small, variable selection methods can often be misleading. Freedman (Am Stat 37:152-155, 1983) demonstrated how common it is to select completely unrelated variables as highly "significant" when the number of data points is similar in magnitude to the number of variables. A new type of model averaging estimator based on model selection with Akaike's AIC is used with linear regression to investigate the problems of likely inclusion of spurious effects and of model selection bias, the bias introduced by using the data to select a single seemingly "best" model from an (often large) set of models employing many predictor variables. The new model averaging estimator helps reduce these problems and provides confidence interval coverage at the nominal level, while traditional stepwise selection has poor inferential properties. © The Institute of Statistical Mathematics, Tokyo 2009.
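
    Freedman's paradox is easy to demonstrate, and Akaike weights are the ingredient behind the model-averaging estimator discussed here. A sketch on pure-noise data (the candidate set and sizes are illustrative):

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n, k = 100, 50                     # data points comparable to predictor count
    X = rng.normal(size=(n, k))
    y = rng.normal(size=n)             # response is pure noise: no real effects

    # Freedman's paradox: fit the full model and count "significant" predictors.
    fit = sm.OLS(y, sm.add_constant(X)).fit()
    print("spuriously significant at 0.05:", np.sum(fit.pvalues[1:] < 0.05))

    # Akaike weights over a small candidate set (here: single-variable models),
    # the ingredient behind the model-averaging estimator discussed above.
    aic = np.array([sm.OLS(y, sm.add_constant(X[:, [j]])).fit().aic
                    for j in range(k)])
    delta = aic - aic.min()
    weights = np.exp(-0.5 * delta) / np.exp(-0.5 * delta).sum()
    print("largest Akaike weight:", weights.max().round(3))
    ```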

  8. Variables selection methods in near-infrared spectroscopy.

    PubMed

    Xiaobo, Zou; Jiewen, Zhao; Povey, Malcolm J W; Holmes, Mel; Hanpin, Mao

    2010-05-14

    Near-infrared (NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields, such as the petrochemical, pharmaceutical, environmental, clinical, agricultural, food, and biomedical sectors, during the past 15 years. A NIR spectrum of a sample is typically measured by modern scanning instruments at hundreds of equally spaced wavelengths. The large number of spectral variables in most data sets encountered in NIR spectral chemometrics often renders the prediction of a dependent variable unreliable. Recently, considerable effort has been directed towards developing and evaluating different procedures that objectively identify variables which contribute useful information and/or eliminate variables containing mostly noise. This review focuses on variable selection methods in NIR spectroscopy. Selection methods include classical approaches, such as the manual (knowledge-based) approach and "univariate" and "sequential" selection methods; sophisticated methods, such as the successive projections algorithm (SPA) and uninformative variable elimination (UVE); elaborate search-based strategies, such as simulated annealing (SA), artificial neural networks (ANN), and genetic algorithms (GAs); and interval-based algorithms, such as interval partial least squares (iPLS), windows PLS, and iterative PLS. Wavelength selection with B-splines, Kalman filtering, Fisher's weights, and Bayesian approaches are also mentioned. Finally, the websites of some variable selection software and toolboxes for non-commercial use are given. Copyright 2010 Elsevier B.V. All rights reserved.
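
    Of the interval-based methods listed, iPLS is the simplest to sketch: score contiguous wavelength windows by cross-validated PLS performance and keep the best one. The window width, component count, and synthetic "spectra" below are assumptions:

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(6)
    n, p = 120, 400                                # samples, spectral channels
    X = rng.normal(size=(n, p)).cumsum(axis=1)     # smooth, spectrum-like curves
    y = X[:, 160:180].mean(axis=1) + rng.normal(0, 0.5, n)  # informative band

    # iPLS idea: score each contiguous wavelength window by cross-validated
    # PLS performance and keep the best interval.
    width = 40
    scores = {start: cross_val_score(PLSRegression(n_components=3),
                                     X[:, start:start + width], y, cv=5).mean()
              for start in range(0, p - width + 1, width)}
    best = max(scores, key=scores.get)
    print(f"best interval: channels {best}-{best + width}")
    ```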

  9. An examination of effect estimation in factorial and standardly-tailored designs

    PubMed Central

    Allore, Heather G; Murphy, Terrence E

    2012-01-01

    Background Many clinical trials are designed to test an intervention arm against a control arm wherein all subjects are equally eligible for all interventional components. Factorial designs have extended this to test multiple intervention components and their interactions. A newer design, referred to as a ‘standardly-tailored’ design, is a multicomponent interventional trial that applies individual interventional components to modify risk factors identified a priori and tests whether health outcomes differ between treatment arms. Standardly-tailored designs do not require that all subjects be eligible for every interventional component. Although standardly-tailored designs yield an estimate for the net effect of the multicomponent intervention, it has not yet been shown whether they permit separate, unbiased estimation of individual component effects. The ability to estimate the most potent interventional components has direct bearing on conducting second-stage translational research. Purpose We present statistical issues related to the estimation of individual component effects in trials of geriatric conditions using factorial and standardly-tailored designs. The medical community is interested in second-stage translational research involving the transfer of results from a randomized clinical trial to a community setting. Before such research is undertaken, main effects and synergistic and/or antagonistic interactions between them should be identified. Knowledge of the relative strength and direction of the effects of the individual components and their interactions facilitates the successful transfer of clinically significant findings and may potentially reduce the number of interventional components needed. Therefore, the current inability of the standardly-tailored design to provide unbiased estimates of individual interventional components is a serious limitation in their applicability to second-stage translational research. Methods We discuss estimation of individual component effects from the family of factorial designs and this limitation for standardly-tailored designs. We use the phrase ‘factorial designs’ to describe full-factorial designs and their derivatives, including the fractional factorial, partial factorial, incomplete factorial, and modified reciprocal designs. We suggest two potential directions for designing multicomponent interventions to facilitate unbiased estimates of individual interventional components. Results Full factorial designs and their variants are the most common multicomponent trial design described in the literature and differ meaningfully from standardly-tailored designs. Factorial and standardly-tailored designs result in similar estimates of net effect with different levels of precision. Unbiased estimation of individual component effects from a standardly-tailored design will require new methodology. Limitations Although clinically relevant in geriatrics, previous applications of standardly-tailored designs have not provided unbiased estimates of the effects of individual interventional components. Discussion Future directions to estimate individual component effects from standardly-tailored designs include applying D-optimal designs and creating independent linear combinations of risk factors analogous to factor analysis. Conclusion Methods are needed to extract unbiased estimates of the effects of individual interventional components from standardly-tailored designs. PMID:18375650
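
    The contrast the authors draw rests on a property of balanced factorial designs: with effect (±1) coding, the main-effect and interaction columns are orthogonal, so each component effect is estimated independently and without bias. A minimal sketch of that estimation (the coefficients and cell sizes are invented):

    ```python
    import itertools
    import numpy as np

    rng = np.random.default_rng(7)

    # Balanced 2x2 factorial with effect (+/-1) coding: every subject is
    # eligible for both components A and B, 50 subjects per cell.
    levels = np.array(list(itertools.product([-1, 1], repeat=2)))
    design = np.repeat(levels, 50, axis=0)
    A, B = design[:, 0].astype(float), design[:, 1].astype(float)
    y = 1.0 + 0.8 * A + 0.3 * B + 0.4 * A * B + rng.normal(0, 1, len(A))

    # With effect coding and balance, the columns below are orthogonal, so the
    # main effects and the interaction are estimated independently (unbiased).
    Xmat = np.column_stack([np.ones_like(A), A, B, A * B])
    coef, *_ = np.linalg.lstsq(Xmat, y, rcond=None)
    print("intercept, A, B, A:B ->", coef.round(2))
    ```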

  10. Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.).

    PubMed

    Rincent, R; Laloë, D; Nicolas, S; Altmann, T; Brunel, D; Revilla, P; Rodríguez, V M; Moreno-Gonzalez, J; Melchinger, A; Bauer, E; Schoen, C-C; Meyer, N; Giauffret, C; Bauland, C; Jamin, P; Laborde, J; Monod, H; Flament, P; Charcosset, A; Moreau, L

    2012-10-01

    Genomic selection refers to the use of genotypic information for predicting breeding values of selection candidates. A prediction formula is calibrated with the genotypes and phenotypes of reference individuals constituting the calibration set. The size and composition of this set are essential parameters affecting the prediction reliabilities. The objective of this study was to maximize reliabilities by optimizing the calibration set. Different criteria, based on the diversity or on the prediction error variance (PEV) derived from the realized additive relationship matrix best linear unbiased prediction (RA-BLUP) model, were used to select the reference individuals. For the latter, we considered the mean of the PEV of the contrasts between each selection candidate and the mean of the population (PEVmean) and the mean of the expected reliabilities of the same contrasts (CDmean). These criteria were tested with phenotypic data collected on two diversity panels of maize (Zea mays L.) genotyped with a 50k SNP array. In the two panels, samples chosen based on CDmean gave higher reliabilities than random samples for various calibration set sizes. CDmean also appeared superior to PEVmean, which can be explained by the fact that it takes into account the reduction of variance due to the relatedness between individuals. Selected samples were close to optimal for a wide range of trait heritabilities, which suggests that the strategy presented here can efficiently sample subsets in panels of inbred lines. A script to optimize reference samples based on CDmean is available on request.
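
    A sketch of the PEVmean idea under simplifying assumptions (no fixed effects, known variance ratio): the prediction error variance comes from the inverse mixed-model coefficient matrix, and the calibration set is grown greedily to minimize its mean. The greedy loop is a simple stand-in for the exchange algorithms such studies typically use:

    ```python
    import numpy as np

    rng = np.random.default_rng(13)
    n, m = 100, 500                              # panel size, markers (toy)
    M = rng.integers(0, 3, (n, m)).astype(float)
    M -= M.mean(axis=0)
    A = M @ M.T / m                              # realized relationship matrix
    Ainv = np.linalg.inv(A + 1e-6 * np.eye(n))   # small ridge for invertibility
    lam = 1.0                                    # sigma_e^2 / sigma_u^2, assumed

    def pev_mean(calib):
        """Mean prediction error variance of all breeding values when only
        the individuals in `calib` are phenotyped (no fixed effects here)."""
        d = np.zeros(n)
        d[calib] = 1.0                           # Z'Z is diagonal with one
        C = np.diag(d) + lam * Ainv              # record per reference line
        return np.diag(np.linalg.inv(C)).mean()

    # Greedy construction: always add the candidate that most reduces PEVmean.
    calib, candidates = [], list(range(n))
    for _ in range(20):
        best = min(candidates, key=lambda i: pev_mean(calib + [i]))
        calib.append(best)
        candidates.remove(best)
    print("first 10 reference individuals:", calib[:10])
    ```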

  11. Genome wide selection in Citrus breeding.

    PubMed

    Gois, I B; Borém, A; Cristofani-Yaly, M; de Resende, M D V; Azevedo, C F; Bastianel, M; Novelli, V M; Machado, M A

    2016-10-17

    Genome wide selection (GWS) is essential for the genetic improvement of perennial species such as Citrus because of its ability to increase gain per unit time and to enable the efficient selection of characteristics with low heritability. This study assessed GWS efficiency in a population of Citrus and compared it with selection based on phenotypic data. A total of 180 individual trees from a cross between Pera sweet orange (Citrus sinensis Osbeck) and Murcott tangor (Citrus sinensis Osbeck x Citrus reticulata Blanco) were evaluated for 10 characteristics related to fruit quality. The hybrids were genotyped using 5287 DArTseq™ (diversity arrays technology) molecular markers, and the marker effects on phenotypes were predicted using the random regression best linear unbiased predictor (rr-BLUP) method. The predictive ability, prediction bias, and accuracy of GWS were estimated to verify its effectiveness for phenotype prediction. The proportion of genetic variance explained by the markers was also computed. The heritability of the traits, as determined by markers, was 16-28%. The predictive ability of these markers ranged from 0.53 to 0.64, and the regression coefficients between predicted and observed phenotypes were close to unity. Over 35% of the genetic variance was accounted for by the markers. Accuracy estimates with GWS were lower than those obtained by phenotypic analysis; however, GWS was superior in terms of genetic gain per unit time. Thus, GWS may be useful for Citrus breeding, as it can predict phenotypes early and accurately and reduce the length of the selection cycle. This study demonstrates the feasibility of genomic selection in Citrus.
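
    The reported predictive ability and prediction bias can be reproduced structurally with a cross-validation loop: correlate predicted with observed phenotypes, and regress observed on predicted (a slope near one indicates little bias). The marker coding, counts, and penalty below are illustrative, not the study's data:

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(12)
    n, m = 180, 5287                                 # hybrids, DArTseq markers
    X = (rng.random((n, m)) < 0.3).astype(float)     # toy 0/1 marker calls
    effects = np.zeros(m)
    effects[rng.choice(m, 100, replace=False)] = rng.normal(0, 0.1, 100)
    y = X @ effects + rng.normal(0, 1.0, n)          # fruit-quality phenotype

    pred = cross_val_predict(Ridge(alpha=500.0), X, y, cv=10)

    # Predictive ability = corr(predicted, observed); prediction bias = slope
    # of observed on predicted, close to one when predictions are unbiased.
    ability = np.corrcoef(pred, y)[0, 1]
    slope = np.polyfit(pred, y, 1)[0]
    print(f"predictive ability = {ability:.2f}, regression slope = {slope:.2f}")
    ```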

  12. Genomic Selection in Multi-environment Crop Trials.

    PubMed

    Oakey, Helena; Cullis, Brian; Thompson, Robin; Comadran, Jordi; Halpin, Claire; Waugh, Robbie

    2016-05-03

    Genomic selection in crop breeding introduces modeling challenges not found in animal studies. These include the need to accommodate replicate plants for each line, consider spatial variation in field trials, address line-by-environment interactions, and capture nonadditive effects. Here, we propose a flexible single-stage genomic selection approach that resolves these issues. Our linear mixed model incorporates spatial variation through environment-specific terms, as well as randomization-based design terms. It considers marker and marker-by-environment interactions, using ridge-regression best linear unbiased prediction to extend genomic selection to multiple environments. Since the approach uses the raw data from line replicates, the line genetic variation is partitioned into marker and nonmarker residual genetic variation (i.e., additive and nonadditive effects). This results in a more precise estimate of marker genetic effects. Using barley height data from trials of up to 477 cultivars in two different years, we demonstrate that our new genomic selection model improves predictions compared to current models. Analyzing single trials revealed improvements in predictive ability of up to 5.7%. For the multiple-environment trial (MET) model, combining both years' trials improved predictive ability by up to 11.4% compared to a single-environment analysis. Benefits were significant even when fewer markers were used. Compared to a single-year standard model run with 3490 markers, our partitioned MET model achieved the same predictive ability using between 500 and 1000 markers, depending on the trial. Our approach can be used to increase accuracy and confidence in the selection of the best lines for breeding and/or to reduce costs by using fewer markers. Copyright © 2016 Oakey et al.

  13. Regional Regression Equations to Estimate Flow-Duration Statistics at Ungaged Stream Sites in Connecticut

    USGS Publications Warehouse

    Ahearn, Elizabeth A.

    2010-01-01

    Multiple linear regression equations for determining flow-duration statistics were developed to estimate select flow exceedances ranging from 25- to 99-percent for six 'bioperiods' in Connecticut: Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October). Regression equations also were developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics used as explanatory variables in the equations are drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December, January, and February), and mean basin elevation. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent, with medians of 19.2 and 55.4 percent for predicting the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (greater than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In contrast, the Rearing and Growth (July-October) bioperiod had the largest standard errors, ranging from 30.9 to 156 percent. The adjusted coefficient of determination of the equations ranged from 77.5 to 99.4 percent, with medians of 98.5 and 90.6 percent for the 25- and 99-percent exceedances, respectively. Descriptive information on the streamgages used in the regression, measured basin and climatic characteristics, and estimated flow-duration statistics are provided in this report. Flow-duration statistics and the 32 regression equations for estimating flow-duration statistics in Connecticut are available through the U.S. Geological Survey web application StreamStats (http://water.usgs.gov/osw/streamstats/index.html). The regression equations developed in this report can be used to produce unbiased estimates of select flow exceedances statewide.
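
    The estimation machinery here is weighted least squares with weights assigned by record length. A minimal sketch with invented stand-in data (one explanatory variable rather than the report's six):

    ```python
    import numpy as np

    rng = np.random.default_rng(8)

    # Hypothetical stand-ins for the regression data set: one flow statistic
    # regressed on log drainage area, with weights from record length.
    n = 39
    log_area = rng.uniform(0.5, 3.0, n)        # log10 drainage area
    record_yrs = rng.integers(10, 70, n)       # streamgage record lengths
    log_q = 0.2 + 0.9 * log_area + rng.normal(0, 0.15, n)

    # Weighted least squares: weight each gage by its record length, so
    # long-record stations dominate the fit (as in the report's approach).
    X = np.column_stack([np.ones(n), log_area])
    W = np.diag(record_yrs.astype(float))
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ log_q)
    print("intercept, slope:", beta.round(3))
    ```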

  14. A GPU-Based Implementation of the Firefly Algorithm for Variable Selection in Multivariate Calibration Problems

    PubMed Central

    de Paula, Lauro C. M.; Soares, Anderson S.; de Lima, Telma W.; Delbem, Alexandre C. B.; Coelho, Clarimar J.; Filho, Arlindo R. G.

    2014-01-01

    Several variable selection algorithms in multivariate calibration can be accelerated using Graphics Processing Units (GPU). Among these algorithms, the Firefly Algorithm (FA) is a recently proposed metaheuristic that may be used for variable selection. This paper presents a GPU-based FA (FA-MLR) with a multiobjective formulation for variable selection in multivariate calibration problems and compares it with some traditional sequential algorithms in the literature. The advantage of the proposed implementation is demonstrated in an example involving a relatively large number of variables. The results showed that the FA-MLR, in comparison with the traditional algorithms, is a more suitable choice and a relevant contribution to the variable selection problem. Additionally, the results also demonstrated that the FA-MLR run on a GPU can be five times faster than its sequential implementation. PMID:25493625

  15. A GPU-Based Implementation of the Firefly Algorithm for Variable Selection in Multivariate Calibration Problems.

    PubMed

    de Paula, Lauro C M; Soares, Anderson S; de Lima, Telma W; Delbem, Alexandre C B; Coelho, Clarimar J; Filho, Arlindo R G

    2014-01-01

    Several variable selection algorithms in multivariate calibration can be accelerated using Graphics Processing Units (GPU). Among these algorithms, the Firefly Algorithm (FA) is a recently proposed metaheuristic that may be used for variable selection. This paper presents a GPU-based FA (FA-MLR) with a multiobjective formulation for variable selection in multivariate calibration problems and compares it with some traditional sequential algorithms in the literature. The advantage of the proposed implementation is demonstrated in an example involving a relatively large number of variables. The results showed that the FA-MLR, in comparison with the traditional algorithms, is a more suitable choice and a relevant contribution to the variable selection problem. Additionally, the results also demonstrated that the FA-MLR run on a GPU can be five times faster than its sequential implementation.
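
    A much-simplified serial sketch of a binary firefly search for variable selection, with cross-validated multiple linear regression as the fitness function. The attraction and randomization rules below are schematic, and the paper's contribution, running the fitness evaluations on a GPU, is not reproduced here:

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(9)
    n, p = 150, 60
    X = rng.normal(size=(n, p))
    y = X[:, :5] @ rng.normal(1, 0.2, 5) + rng.normal(0, 0.5, n)

    def fitness(mask):
        """Cross-validated MLR score for a candidate variable subset."""
        if mask.sum() == 0:
            return -np.inf
        return cross_val_score(LinearRegression(), X[:, mask], y, cv=5).mean()

    # Binary firefly loop: dimmer fireflies move toward brighter ones by
    # copying a random subset of their bits ("attraction"), plus random bit
    # flips ("randomization"); the previous best solution is kept (elitism).
    pop = rng.random((15, p)) < 0.2
    for _ in range(30):
        fit = np.array([fitness(m) for m in pop])
        best = pop[fit.argmax()].copy()
        for i in range(len(pop)):
            brighter = fit > fit[i]
            if brighter.any():
                j = rng.choice(np.flatnonzero(brighter))
                copy = rng.random(p) < 0.5          # attraction step
                pop[i, copy] = pop[j, copy]
            flip = rng.random(p) < 0.02             # randomization step
            pop[i, flip] = ~pop[i, flip]
        pop[fit.argmin()] = best                    # elitism
    print("selected variables:", np.flatnonzero(best))
    ```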

  16. Identification of Small Molecule Activators of Cryptochrome

    PubMed Central

    Hirota, Tsuyoshi; Lee, Jae Wook; St. John, Peter C.; Sawa, Mariko; Iwaisako, Keiko; Noguchi, Takako; Pongsawakul, Pagkapol Y.; Sonntag, Tim; Welsh, David K.; Brenner, David A.; Doyle, Francis J.; Schultz, Peter G.; Kay, Steve A.

    2013-01-01

    Impairment of the circadian clock has been associated with numerous disorders, including metabolic disease. Although small molecules that modulate clock function might offer therapeutic approaches to such diseases, only a few compounds have been identified that selectively target core clock proteins. From an unbiased cell-based circadian screen, we identified KL001, a small molecule that specifically interacts with cryptochrome (CRY). KL001 prevented ubiquitin-dependent degradation of CRY, resulting in lengthening of the circadian period. In combination with mathematical modeling, KL001 revealed that CRY1 and CRY2 share a similar functional role in period regulation. Furthermore, KL001-mediated CRY stabilization inhibited glucagon-induced gluconeogenesis in primary hepatocytes. KL001 thus provides a tool to study the regulation of CRY-dependent physiology and to aid the development of clock-based therapeutics for diabetes. PMID:22798407

  17. The 6dFGS Fundamental Plane

    NASA Astrophysics Data System (ADS)

    Springob, Chris M.; Colless, M.; Jones, D. H.; Magoulas, C.; Mould, J. R.; Campbell, L.; Lah, P.; Lucey, J.; Merson, A.; Proctor, R.

    2010-01-01

    The 6dF Galaxy Survey (6dFGS) is an all-southern-sky galaxy survey including 125,000 redshifts and more than 10,000 peculiar velocities, making it the largest peculiar velocity sample to date. In combination with 2MASS surface brightnesses and effective radii, 6dFGS yields the near-infrared Fundamental Plane (FP) for a large and uniform sample. We have fit the FP relation for the galaxies in the peculiar velocity sample using a maximum likelihood method which allows us to account precisely for selection effects and observational errors. We investigate the effects of varying stellar populations and environments on the FP. Finally, we discuss the implications of these results both for our understanding of the origin of the FP for early-type galaxies and bulges and for deriving unbiased distances and peculiar velocities in the local universe.

  18. AGN Clustering in the BAT Sample

    NASA Astrophysics Data System (ADS)

    Powell, Meredith; Cappelluti, Nico; Urry, Meg; Koss, Michael; BASS Team

    2018-01-01

    We characterize the environments of local growing supermassive black holes by measuring the clustering of AGN in the Swift-BAT Spectroscopic Survey (BASS). With 548 AGN in the redshift range 0.01

  19. Adaptation of Decoy Fusion Strategy for Existing Multi-Stage Search Workflows

    NASA Astrophysics Data System (ADS)

    Ivanov, Mark V.; Levitsky, Lev I.; Gorshkov, Mikhail V.

    2016-09-01

    A number of proteomic database search engines implement multi-stage strategies aimed at increasing the sensitivity of proteome analysis. These approaches often employ a subset of the original database for the secondary stage of analysis. However, if the target-decoy approach (TDA) is used for false discovery rate (FDR) estimation, the multi-stage strategies may violate the underlying assumption of TDA that false matches are distributed uniformly across the target and decoy databases. This violation occurs if the numbers of target and decoy proteins selected for the second search are not equal. Here, we propose a method of decoy database generation based on the previously reported decoy fusion strategy. This method allows unbiased TDA-based FDR estimation in multi-stage searches and can be easily integrated into existing workflows utilizing popular search engines and post-search algorithms.
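
    The TDA assumption at issue is easy to state in code: the number of decoy matches above a score threshold estimates the number of false target matches above it, which is only valid when the target and decoy databases stay size-matched at every search stage. A toy sketch with invented score distributions:

    ```python
    import numpy as np

    rng = np.random.default_rng(10)

    # Toy scores: target PSMs are a mix of true (high-scoring) and false
    # matches; decoy PSMs follow the false-match distribution.
    targets = np.concatenate([rng.normal(3, 1, 400), rng.normal(0, 1, 600)])
    decoys = rng.normal(0, 1, 1000)

    def tda_fdr(threshold):
        """Classic TDA estimate: false matches split evenly between target
        and decoy databases, so the decoy count above the threshold estimates
        the false-target count. Multi-stage searches break this unless the
        second stage keeps target and decoy counts equal (the decoy-fusion fix)."""
        t = np.sum(targets >= threshold)
        d = np.sum(decoys >= threshold)
        return d / max(t, 1)

    for thr in (1.0, 2.0, 3.0):
        print(f"threshold {thr:.1f}: estimated FDR = {tda_fdr(thr):.3f}")
    ```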

  20. Seeing Through the Clouds: AGN Geometry with the Swift BAT Sample

    NASA Astrophysics Data System (ADS)

    Glikman, Eilat; Urry, M.; Schawinski, K.; Koss, M. J.; Winter, L. M.; Elitzur, M.; Wilkin, W. H.

    2011-01-01

    We investigate the intrinsic structure of the clouds surrounding AGN, which give rise to their X-ray and optical emission properties. Using a complete sample of Swift BAT AGN selected in hard X-rays (14-195 keV), which is unbiased with respect to obscuration and extinction, we compute the reddening in the broad-line region along the line of sight to the nucleus of each source using the Balmer decrement, the ratio of the broad components of H-alpha to H-beta. We compare reddening from dust in the broad-line clouds to the hydrogen column density (NH) obtained from their X-ray spectra. The distribution of the gas-to-dust ratios over many lines of sight allows us to test models of AGN structure and probe the immediate environment of the accreting supermassive black holes.
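
    The Balmer-decrement step maps an observed broad-line flux ratio to a reddening through an extinction law. A one-line sketch; the intrinsic ratio of 3.1 for AGN broad lines is an assumed standard value, as is the extinction-curve difference k(Hβ) − k(Hα) ≈ 1.17:

    ```python
    import numpy as np

    # Balmer-decrement reddening: compare the observed broad-line
    # H-alpha/H-beta flux ratio to an assumed intrinsic value (3.1) through
    # a standard extinction law; dk = k(H-beta) - k(H-alpha) ~ 1.17 is a
    # common Cardelli-type value and is an assumption here.
    def ebv_from_balmer(f_ha, f_hb, intrinsic=3.1, dk=1.17):
        return (2.5 / dk) * np.log10((f_ha / f_hb) / intrinsic)

    print(f"E(B-V) = {ebv_from_balmer(620.0, 100.0):.2f} mag")  # toy fluxes
    ```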
