On a stronger-than-best property for best prediction
NASA Astrophysics Data System (ADS)
Teunissen, P. J. G.
2008-03-01
The minimum mean squared error (MMSE) criterion is a popular criterion for devising best predictors. In the case of linear predictors, it has the advantage that no further distributional assumptions need to be made, other than about the first- and second-order moments. In the spatial and Earth sciences, it is the best linear unbiased predictor (BLUP) that is used most often. Despite the fact that in this case only the first- and second-order moments need to be known, one often still makes statements about the complete distribution, in particular when statistical testing is involved. For such cases, one can do better than the BLUP, as shown in Teunissen (J Geod, doi: 10.1007/s00190-007-0140-6, 2006), and thus devise predictors that have a smaller MMSE than the BLUP. Hence, these predictors are to be preferred over the BLUP if one really values the MMSE criterion. In the present contribution, we show, however, that the BLUP has an optimality property other than the MMSE property, provided that the distribution is Gaussian. It will be shown that, in the Gaussian case, the prediction error of the BLUP has the highest possible probability of all linear unbiased predictors of being bounded in the weighted squared norm sense. This is a stronger property than the often advertised MMSE property of the BLUP.
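The construction discussed in the abstract can be sketched numerically. All moments below are hypothetical; the point is only that the BLUP of an unobserved quantity uses nothing beyond means and covariances:

```python
import numpy as np

# Hypothetical moments for a toy prediction problem: the BLUP needs only
# E[y], E[y0], Cov(y) and Cov(y0, y), with no full distributional assumptions.
mu_y = np.array([1.0, 2.0])        # E[y]
mu_0 = 1.5                         # E[y0]
Sigma_yy = np.array([[2.0, 0.5],
                     [0.5, 1.0]])  # Cov(y)
Sigma_0y = np.array([0.8, 0.3])    # Cov(y0, y)

def blup(y):
    # y0_hat = mu_0 + Sigma_0y Sigma_yy^{-1} (y - mu_y): linear in y,
    # unbiased, and minimum-MSE among linear unbiased predictors.
    return float(mu_0 + Sigma_0y @ np.linalg.solve(Sigma_yy, y - mu_y))

print(blup(np.array([1.0, 2.0])))  # at the prior mean the prediction is mu_0: 1.5
```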
Vittorazzi, C; Amaral Junior, A T; Guimarães, A G; Viana, A P; Silva, F H L; Pena, G F; Daher, R F; Gerhardt, I F S; Oliveira, G H F; Pereira, M G
2017-09-27
Selection indices commonly utilize economic weights, which can make the resulting genetic gains arbitrary. In popcorn, this is even more evident due to the negative correlation between the main characteristics of economic importance - grain yield and popping expansion. As an alternative to classical biometric selection indices, the optimal procedure of restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) allows the simultaneous estimation of genetic parameters and the prediction of genotypic values. Based on the mixed model methodology, the objective of this study was to investigate the comparative efficiency of eight selection indices estimated by REML/BLUP for the effective selection of superior popcorn families in the eighth intrapopulation recurrent selection cycle. We also investigated the efficiency of the inclusion of the variable "expanded popcorn volume per hectare" in the most advantageous selection of superior progenies. In total, 200 full-sib families were evaluated in two different areas in the North and Northwest regions of the State of Rio de Janeiro, Brazil. The REML/BLUP procedure resulted in higher estimated gains than those obtained with classical biometric selection index methodologies and should be incorporated into the selection of progenies. The following indices resulted in higher gains in the characteristics of greatest economic importance: the classical selection index/values attributed by trial, via REML/BLUP, and the greatest genotypic values/expanded popcorn volume per hectare, via REML. The expanded popcorn volume per hectare characteristic enabled satisfactory gains in grain yield and popping expansion; this characteristic should be considered a super-trait in popcorn breeding programs.
Piepho, H P
1994-11-01
Multilocation trials are often used to analyse the adaptability of genotypes in different environments and to find, for each environment, the genotype that is best adapted, i.e. highest yielding in that environment. For this purpose, it is of interest to obtain a reliable estimate of the mean yield of a cultivar in a given environment. This article compares two different statistical estimation procedures for this task: the Additive Main Effects and Multiplicative Interaction (AMMI) analysis and Best Linear Unbiased Prediction (BLUP). A modification of a cross-validation procedure commonly used with AMMI is suggested for trials that are laid out as a randomized complete block design. The use of these procedures is exemplified using five faba bean datasets from German registration trials. BLUP was found to outperform AMMI in four of the five faba bean datasets.
T.Z. Ye; K.J.S. Jayawickrama; G.R. Johnson
2004-01-01
The BLUP (best linear unbiased prediction) method has been widely used in forest tree improvement programs. Since one of the properties of BLUP is that related individuals contribute to the predictions of each other, it seems logical that integrating data from all generations and from all populations would improve both the precision and accuracy of predicting genetic...
Camarinha-Silva, Amelia; Maushammer, Maria; Wellmann, Robin; Vital, Marius; Preuss, Siegfried; Bennewitz, Jörn
2017-07-01
The aim of the present study was to analyze the interplay between gastrointestinal tract (GIT) microbiota, host genetics, and complex traits in pigs using extended quantitative-genetic methods. The study design consisted of 207 pigs that were housed and slaughtered under standardized conditions, and phenotyped for daily gain, feed intake, and feed conversion rate. The pigs were genotyped with a standard 60 K SNP chip. The GIT microbiota composition was analyzed by 16S rRNA gene amplicon sequencing technology. Eight of the 49 investigated bacterial genera showed a significant narrow-sense host heritability, ranging from 0.32 to 0.57. Microbial mixed linear models were applied to estimate the microbiota variance for each complex trait. The fraction of phenotypic variance explained by the microbial variance was 0.28, 0.21, and 0.16 for daily gain, feed conversion, and feed intake, respectively. The SNP data and the microbiota composition were used to predict the complex traits using genomic best linear unbiased prediction (G-BLUP) and microbial best linear unbiased prediction (M-BLUP) methods, respectively. The prediction accuracies of G-BLUP were 0.35, 0.23, and 0.20 for daily gain, feed conversion, and feed intake, respectively. The corresponding prediction accuracies of M-BLUP were 0.41, 0.33, and 0.33. Thus, in addition to SNP data, microbiota abundances are an informative source for complex trait prediction. Since the pig is a well-suited animal for modeling the human digestive tract, M-BLUP, in addition to G-BLUP, might be beneficial for predicting human predispositions to some diseases, and, consequently, for preventative and personalized medicine. Copyright © 2017 by the Genetics Society of America.
Correa, Katharina; Bangera, Rama; Figueroa, René; Lhorente, Jean P; Yáñez, José M
2017-01-31
Sea lice infestations caused by Caligus rogercresseyi are a main concern to the salmon farming industry due to associated economic losses. Resistance to this parasite was shown to have low to moderate genetic variation and its genetic architecture was suggested to be polygenic. The aim of this study was to compare accuracies of breeding value predictions obtained with pedigree-based best linear unbiased prediction (P-BLUP) methodology against different genomic prediction approaches: genomic BLUP (G-BLUP), Bayesian Lasso, and Bayes C. To achieve this, 2404 individuals from 118 families were measured for C. rogercresseyi count after a challenge and genotyped using 37 K single nucleotide polymorphisms. Accuracies were assessed using fivefold cross-validation and SNP densities of 0.5, 1, 5, 10, 25 and 37 K. Accuracy of genomic predictions increased with increasing SNP density and was higher than pedigree-based BLUP predictions by up to 22%. Both Bayesian and G-BLUP methods can predict breeding values with higher accuracies than pedigree-based BLUP; however, G-BLUP may be the preferred method because of reduced computation time and ease of implementation. A relatively low marker density (i.e. 10 K) is sufficient for maximal increase in accuracy when using G-BLUP or Bayesian methods for genomic prediction of C. rogercresseyi resistance in Atlantic salmon.
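The fivefold cross-validation scheme for G-BLUP described above can be sketched as follows. The genotypes, marker effects and heritability are simulated placeholders (not the salmon data), so the printed accuracy illustrates the procedure only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-in for the study design: n individuals, m SNPs (0/1/2),
# a polygenic trait; all sizes and effects are invented for illustration.
n, m = 200, 500
M = rng.integers(0, 3, size=(n, m)).astype(float)
beta = rng.normal(0.0, 0.05, size=m)            # "true" marker effects
y = M @ beta + rng.normal(0.0, 1.0, size=n)     # phenotypes

# VanRaden-style genomic relationship matrix from centred genotypes.
p = M.mean(axis=0) / 2
Z = M - 2 * p
G = Z @ Z.T / (2 * (p * (1 - p)).sum())

def gblup_cv(y, G, k=5, h2=0.5):
    """Mean k-fold cross-validated accuracy (corr. of G-BLUP vs phenotype)."""
    lam = (1 - h2) / h2
    idx = rng.permutation(len(y))
    accs = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        # BLUP of test-set genetic values from training phenotypes:
        # g_hat = G[test,train] (G[train,train] + lam*I)^-1 (y_train - mean)
        V = G[np.ix_(train, train)] + lam * np.eye(len(train))
        g_hat = G[np.ix_(test, train)] @ np.linalg.solve(V, y[train] - y[train].mean())
        accs.append(np.corrcoef(g_hat, y[test])[0, 1])
    return float(np.mean(accs))

acc = gblup_cv(y, G)
print(round(acc, 3))  # positive here because markers carry real signal
```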
MultiBLUP: improved SNP-based prediction for complex traits.
Speed, Doug; Balding, David J
2014-09-01
BLUP (best linear unbiased prediction) is widely used to predict complex traits in plant and animal breeding, and increasingly in human genetics. The BLUP mathematical model, which consists of a single random effect term, was adequate when kinships were measured from pedigrees. However, when genome-wide SNPs are used to measure kinships, the BLUP model implicitly assumes that all SNPs have the same effect-size distribution, which is a severe and unnecessary limitation. We propose MultiBLUP, which extends the BLUP model to include multiple random effects, allowing greatly improved prediction when the random effects correspond to classes of SNPs with distinct effect-size variances. The SNP classes can be specified in advance, for example, based on SNP functional annotations, and we also provide an adaptive procedure for determining a suitable partition of SNPs. We apply MultiBLUP to genome-wide association data from the Wellcome Trust Case Control Consortium (seven diseases), and from much larger studies of celiac disease and inflammatory bowel disease, finding that it consistently provides better prediction than alternative methods. Moreover, MultiBLUP is computationally very efficient; for the largest data set, which includes 12,678 individuals and 1.5 M SNPs, the total analysis can be run on a single desktop PC in less than a day and can be parallelized to run even faster. Tools to perform MultiBLUP are freely available in our software LDAK. © 2014 Speed and Balding; Published by Cold Spring Harbor Laboratory Press.
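The multiple-random-effect idea behind MultiBLUP can be sketched in a minimal way. The kinships and variance components below are invented for illustration: two SNP classes contribute through separate kernels with distinct variances, and the total genetic values are predicted by BLUP:

```python
import numpy as np

# Invented kinships for four individuals: K1 for one SNP class, K2 for
# another, each with its own variance component (the multi-kernel idea).
K1 = np.eye(4)
K2 = np.full((4, 4), 0.5) + 0.5 * np.eye(4)
s2_1, s2_2, s2_e = 1.0, 2.0, 1.0            # hypothetical variances

y = np.array([1.0, -0.5, 0.3, 2.0])
G_tot = s2_1 * K1 + s2_2 * K2               # total genetic covariance
V = G_tot + s2_e * np.eye(4)                # phenotypic covariance
# BLUP of total genetic values: g_hat = G_tot V^{-1} (y - mean)
g_hat = G_tot @ np.linalg.solve(V, y - y.mean())

print(np.round(g_hat, 3))  # shrunken towards zero relative to y - mean
```

With a single kernel, all SNPs share one effect-size variance; letting s2_1 and s2_2 differ is what allows the model to shrink SNP classes differently.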
Bernardo, R
1996-11-01
Best linear unbiased prediction (BLUP) has been found to be useful in maize (Zea mays L.) breeding. The advantage of including both testcross additive and dominance effects (Intralocus Model) in BLUP, rather than only testcross additive effects (Additive Model), has not been clearly demonstrated. The objective of this study was to compare the usefulness of Intralocus and Additive Models for BLUP of maize single-cross performance. Multilocation data from 1990 to 1995 were obtained from the hybrid testing program of Limagrain Genetics. Grain yield, moisture, stalk lodging, and root lodging of untested single crosses were predicted from (1) the performance of tested single crosses and (2) known genetic relationships among the parental inbreds. Correlations between predicted and observed performance were obtained with a delete-one cross-validation procedure. For the Intralocus Model, the correlations ranged from 0.50 to 0.66 for yield, 0.88 to 0.94 for moisture, 0.47 to 0.69 for stalk lodging, and 0.31 to 0.45 for root lodging. The BLUP procedure was consistently more effective with the Intralocus Model than with the Additive Model. When the Additive Model was used instead of the Intralocus Model, the reductions in the correlation were largest for root lodging (0.06-0.35), smallest for moisture (0.00-0.02), and intermediate for yield (0.02-0.06) and stalk lodging (0.02-0.08). The ratio of dominance variance (V_D) to total genetic variance (V_G) was highest for root lodging (0.47) and lowest for moisture (0.10). The Additive Model may be used if prior information indicates that V_D for a given trait has little contribution to V_G. Otherwise, the continued use of the Intralocus Model for BLUP of single-cross performance is recommended.
Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C
2014-06-01
Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection.
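The RR-BLUP estimator, and the W-BLUP idea of shrinking known functional markers less, can be sketched as a ridge system with a per-marker penalty. The design matrix Z, phenotypes y, and penalties are hypothetical:

```python
import numpy as np

# Hypothetical design: 4 individuals, 3 markers with 0/1/2 allele coding.
Z = np.array([[0., 1., 2.],
              [2., 1., 0.],
              [1., 2., 1.],
              [0., 0., 2.]])
y = np.array([0.5, -0.2, 0.9, 1.1])

def rr_blup(Z, y, lam):
    # beta_hat = (Z'Z + diag(lam))^{-1} Z'y; lam is a scalar ridge penalty,
    # or a per-marker vector (W-BLUP-style: smaller penalty = less shrinkage).
    L = np.diag(np.broadcast_to(np.asarray(lam, dtype=float), (Z.shape[1],)))
    return np.linalg.solve(Z.T @ Z + L, Z.T @ y)

uniform = rr_blup(Z, y, 1.0)
weighted = rr_blup(Z, y, [0.1, 1.0, 1.0])  # marker 1 treated as "functional"
print(np.round(uniform, 3))
print(np.round(weighted, 3))
```

Giving a known functional marker a smaller penalty lets its estimated effect stay closer to least squares, which is one simple reading of the weighting idea described above.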
Accuracy of Genomic Prediction for Foliar Terpene Traits in Eucalyptus polybractea.
Kainer, David; Stone, Eric A; Padovan, Amanda; Foley, William J; Külheim, Carsten
2018-06-11
Unlike agricultural crops, most forest species have not had millennia of improvement through phenotypic selection, but can contribute energy and material resources and possibly help alleviate climate change. Yield gains similar to those achieved in agricultural crops over millennia could be made in forestry species with the use of genomic methods in a much shorter time frame. Here we compare various methods of genomic prediction for eight traits related to foliar terpene yield in Eucalyptus polybractea, a tree grown predominantly for the production of Eucalyptus oil. The genomic markers used in this study are derived from shallow whole genome sequencing of a population of 480 trees. We compare the traditional pedigree-based additive best linear unbiased predictors (ABLUP), genomic BLUP (GBLUP), BayesB genomic prediction model, and a form of GBLUP based on weighting markers according to their influence on traits (BLUP|GA). Predictive ability is assessed under varying marker densities of 10,000, 100,000 and 500,000 SNPs. Our results show that BayesB and BLUP|GA perform best across the eight traits. Predictive ability was higher for individual terpene traits, such as foliar α-pinene and 1,8-cineole concentration (0.59 and 0.73, respectively), than aggregate traits such as total foliar oil concentration (0.38). This is likely a function of the trait architecture and markers used. BLUP|GA was the best model for the two biomass-related traits, height and 1-year change in height (0.25 and 0.19, respectively). Predictive ability increased with marker density for most traits, but with diminishing returns. The results of this study are a solid foundation for yield improvement of essential oil producing eucalypts. New markets such as biopolymers and terpene-derived biofuels could benefit from rapid yield increases in undomesticated oil-producing species. Copyright © 2018, G3: Genes, Genomes, Genetics.
Clark, Samuel A; Hickey, John M; Daetwyler, Hans D; van der Werf, Julius H J
2012-02-09
The theory of genomic selection is based on the prediction of the effects of genetic markers in linkage disequilibrium with quantitative trait loci. However, genomic selection also relies on relationships between individuals to accurately predict genetic value. This study aimed to examine the importance of information on relatives versus that of unrelated or more distantly related individuals on the estimation of genomic breeding values. Simulated and real data were used to examine the effects of various degrees of relationship on the accuracy of genomic selection. Genomic Best Linear Unbiased Prediction (gBLUP) was compared to two pedigree-based BLUP methods, one with a shallow one-generation pedigree and the other with a deep ten-generation pedigree. The accuracy of estimated breeding values for different groups of selection candidates that had varying degrees of relationships to a reference data set of 1750 animals was investigated. The gBLUP method predicted breeding values more accurately than BLUP. The most accurate breeding values were estimated using gBLUP for closely related animals. Similarly, the pedigree-based BLUP methods were also accurate for closely related animals; however, when the pedigree-based BLUP methods were used to predict unrelated animals, the accuracy was close to zero. In contrast, gBLUP breeding values, for animals that had no pedigree relationship with animals in the reference data set, retained substantial accuracy. An animal's relationship to the reference data set is an important factor for the accuracy of genomic predictions. Animals that share a close relationship to the reference data set had the highest accuracy from genomic predictions. However, a baseline accuracy, driven by the size of the reference data set and the effective population size of the overall population, enables gBLUP to estimate a breeding value for unrelated animals within a population (breed), using information previously ignored by pedigree-based BLUP methods.
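The genomic relationship matrix that lets gBLUP exploit realised genome sharing can be sketched as follows. This is a VanRaden-style construction on simulated genotypes (a pedigree matrix A would instead be built recursively from parent records):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated 0/1/2 genotypes for 6 animals at 400 SNPs (purely illustrative).
M = rng.integers(0, 3, size=(6, 400)).astype(float)
p = M.mean(axis=0) / 2                    # observed allele frequencies
Z = M - 2 * p                             # centre each marker column
G = Z @ Z.T / (2 * (p * (1 - p)).sum())   # genomic relationship matrix

# Off-diagonal entries measure realised genome sharing, which is how gBLUP
# can relate animals that a pedigree records as completely unrelated.
print(np.round(G, 2))
```

With observed allele frequencies used for centring, each row of G sums to zero, so relationships are expressed relative to the current sample rather than an absolute base population.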
Genomewide predictions from maize single-cross data.
Massman, Jon M; Gordillo, Andres; Lorenzana, Robenzon E; Bernardo, Rex
2013-01-01
Maize (Zea mays L.) breeders evaluate many single-cross hybrids each year in multiple environments. Our objective was to determine the usefulness of genomewide predictions, based on marker effects from maize single-cross data, for identifying the best untested single crosses and the best inbreds within a biparental cross. We considered 479 experimental maize single crosses between 59 Iowa Stiff Stalk Synthetic (BSSS) inbreds and 44 non-BSSS inbreds. The single crosses were evaluated in multilocation experiments from 2001 to 2009 and the BSSS and non-BSSS inbreds had genotypic data for 669 single nucleotide polymorphism (SNP) markers. Single-cross performance was predicted by a previous best linear unbiased prediction (BLUP) approach that utilized marker-based relatedness and information on relatives, and from genomewide marker effects calculated by ridge-regression BLUP (RR-BLUP). With BLUP, the mean prediction accuracy (r_MG) of single-cross performance was 0.87 for grain yield, 0.90 for grain moisture, 0.69 for stalk lodging, and 0.84 for root lodging. The BLUP and RR-BLUP models did not lead to r_MG values that differed significantly. We then used the RR-BLUP model, developed from single-cross data, to predict the performance of testcrosses within 14 biparental populations. The r_MG values within each testcross population were generally low and were often negative. These results were obtained despite the above-average level of linkage disequilibrium, i.e., r^2 between adjacent markers of 0.35 in the BSSS inbreds and 0.26 in the non-BSSS inbreds. Overall, our results suggested that genomewide marker effects estimated from maize single crosses are not advantageous (compared with BLUP) for predicting single-cross performance and have erratic usefulness for predicting testcross performance within a biparental cross.
Manzanilla-Pech, C I V; Veerkamp, R F; de Haas, Y; Calus, M P L; Ten Napel, J
2017-11-01
Given the interest in including dry matter intake (DMI) in the breeding goal, accurate estimated breeding values (EBV) for DMI are needed, preferably for separate lactations. Due to the limited amount of records available on DMI, 2 main approaches have been suggested to compute those EBV: (1) the inclusion of predictor traits, such as fat- and protein-corrected milk (FPCM) and live weight (LW), and (2) the addition of genomic information of animals using what is called genomic prediction. Recently, several methodologies to estimate EBV utilizing genomic information have become available. In this study, a new method known as single-step ridge-regression BLUP (SSRR-BLUP) is suggested. The SSRR-BLUP method does not have an imposed limit on the number of genotyped animals, as the commonly used methods do. The objective of this study was to estimate genetic parameters using a relatively large data set with DMI records, as well as to compare the accuracies of the EBV for DMI. These accuracies were obtained using 4 different methods: BLUP (using pedigree for all animals with phenotypes), genomic BLUP (GBLUP; only for genotyped animals), single-step GBLUP (SS-GBLUP), and SSRR-BLUP (for genotyped and nongenotyped animals). Records from different lactations, with or without predictor traits (FPCM and LW), were used in the model. Accuracies of EBV for DMI (defined as the correlation between the EBV and pre-adjusted DMI phenotypes, divided by the average accuracy of those phenotypes) ranged between 0.21 and 0.38 across methods and scenarios. Accuracies of EBV for DMI using BLUP were the lowest obtained across methods, whereas accuracies were similar for SS-GBLUP and SSRR-BLUP, and lower for GBLUP. Hence, SSRR-BLUP could be used when the number of genotyped animals is large, avoiding the construction of the inverse genomic relationship matrix.
Adding information on DMI from different lactations in the reference population gave higher accuracies than including only lactation 1. Finally, no benefit was obtained by adding information on predictor traits to the reference population when DMI was already included. However, in the absence of DMI records, having records on FPCM and LW from different lactations is a good way to obtain EBV with relatively good accuracy. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
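The standard single-step blending of pedigree and genomic relationships behind SS-GBLUP can be sketched via the well-known H-inverse (note this is the variant that does require the inverse of G, which SSRR-BLUP is designed to avoid). All matrices below are small hypothetical examples:

```python
import numpy as np

# Hypothetical pedigree: two unrelated parents (0, 1) and two full sibs
# (2, 3); animals 2 and 3 are genotyped. G values are invented.
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])     # numerator relationship matrix
geno = [2, 3]                            # indices of genotyped animals
G = np.array([[1.05, 0.55],
              [0.55, 0.98]])             # genomic relationships for them

# Single-step blending via the standard inverse:
#   H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]]
H_inv = np.linalg.inv(A)
A22 = A[np.ix_(geno, geno)]
H_inv[np.ix_(geno, geno)] += np.linalg.inv(G) - np.linalg.inv(A22)

print(np.round(H_inv, 2))
```

H_inv then replaces A_inv in the usual mixed model equations, so genotyped and nongenotyped animals are evaluated jointly.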
Olivoto, T; Nardino, M; Carvalho, I R; Follmann, D N; Ferrari, M; Szareski, V J; de Pelegrin, A J; de Souza, V Q
2017-03-22
Methodologies using restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) in combination with sequential path analysis in maize are still limited in the literature. Therefore, the aims of this study were: i) to use REML/BLUP-based procedures in order to estimate variance components, genetic parameters, and genotypic values of simple maize hybrids, and ii) to fit stepwise regressions considering genotypic values to form a path diagram with multi-order predictors and minimum multicollinearity that explains the relationships of cause and effect among grain yield-related traits. Fifteen commercial simple maize hybrids were evaluated in multi-environment trials in a randomized complete block design with four replications. The environmental variance (78.80%) and genotype-by-environment variance (20.83%) accounted for more than 99% of the phenotypic variance of grain yield, which makes direct selection for this trait difficult for breeders. The sequential path analysis model allowed the selection of traits with high explanatory power and minimum multicollinearity, resulting in models with elevated fit (R^2 > 0.9 and ε < 0.3). The number of kernels per ear (NKE) and thousand-kernel weight (TKW) are the traits with the largest direct effects on grain yield (r = 0.66 and 0.73, respectively). The high accuracy of selection (0.86 and 0.89) associated with the high heritability of the average (0.732 and 0.794) for NKE and TKW, respectively, indicated good reliability and prospects of success in the indirect selection of hybrids with high-yield potential through these traits. The negative direct effect of NKE on TKW (r = -0.856), however, must be considered. The joint use of mixed models and sequential path analysis is effective in the evaluation of maize-breeding trials.
USDA-ARS's Scientific Manuscript database
Transformations to multiple trait mixed model equations (MME) which are intended to improve computational efficiency in best linear unbiased prediction (BLUP) and restricted maximum likelihood (REML) are described. It is shown that traits that are expected or estimated to have zero residual variance...
Simultaneous fitting of genomic-BLUP and Bayes-C components in a genomic prediction model.
Iheshiulor, Oscar O M; Woolliams, John A; Svendsen, Morten; Solberg, Trygve; Meuwissen, Theo H E
2017-08-24
The rapid adoption of genomic selection is due to two key factors: availability of both high-throughput dense genotyping and statistical methods to estimate and predict breeding values. The development of such methods is still ongoing and, so far, there is no consensus on the best approach. Currently, the linear and non-linear methods for genomic prediction (GP) are treated as distinct approaches. The aim of this study was to evaluate the implementation of an iterative method (called GBC) that incorporates aspects of both linear [genomic best linear unbiased prediction (G-BLUP)] and non-linear (Bayes-C) methods for GP. The iterative nature of GBC makes it less computationally demanding, similar to other non-Markov chain Monte Carlo (non-MCMC) approaches. However, as a Bayesian method, GBC differs from both MCMC- and non-MCMC-based methods by combining some aspects of G-BLUP and Bayes-C methods for GP. Its relative performance was compared to that of G-BLUP and Bayes-C. We used an imputed 50 K single-nucleotide polymorphism (SNP) dataset based on the Illumina Bovine50K BeadChip, which included 48,249 SNPs and 3244 records. Daughter yield deviations for somatic cell count, fat yield, milk yield, and protein yield were used as response variables. GBC was frequently (marginally) superior to G-BLUP and Bayes-C in terms of prediction accuracy and was significantly better than G-BLUP only for fat yield. On average across the four traits, GBC yielded a 0.009 and 0.006 increase in prediction accuracy over G-BLUP and Bayes-C, respectively. Computationally, GBC was much faster than Bayes-C and similar to G-BLUP. Our results show that incorporating some aspects of G-BLUP and Bayes-C in a single model can improve accuracy of GP over the commonly used method, G-BLUP. Generally, GBC did not statistically perform better than G-BLUP and Bayes-C, probably due to the close relationships between reference and validation individuals.
Nevertheless, it is a flexible tool in the sense that it simultaneously incorporates some aspects of linear and non-linear models for GP, thereby exploiting family relationships while also accounting for linkage disequilibrium between SNPs and genes with large effects. The application of GBC in GP merits further exploration.
GenoMatrix: A Software Package for Pedigree-Based and Genomic Prediction Analyses on Complex Traits.
Nazarian, Alireza; Gezan, Salvador Alejandro
2016-07-01
Genomic and pedigree-based best linear unbiased prediction methodologies (G-BLUP and P-BLUP) have proven themselves efficient for partitioning the phenotypic variance of complex traits into its components, estimating the individuals' genetic merits, and predicting unobserved (or yet-to-be observed) phenotypes in many species and fields of study. The GenoMatrix software, presented here, is a user-friendly package to facilitate the process of using genome-wide marker data and parentage information for G-BLUP and P-BLUP analyses on complex traits. It provides users with a collection of applications which help them on a set of tasks from performing quality control on data to constructing and manipulating the genomic and pedigree-based relationship matrices and obtaining their inverses. Such matrices will then be used in downstream analyses by other statistical packages. The package also enables users to obtain predicted values for unobserved individuals based on the genetic values of observed related individuals. GenoMatrix is available to the research community as a Windows 64-bit executable and can be downloaded free of charge at: http://compbio.ufl.edu/software/genomatrix/. © The American Genetic Association. 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
2009-01-01
Background: Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle.
Methods: Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls.
Results: For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods, which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree-based predictions gave 1.05-1.34 times higher accuracies compared to predictions based on pedigree alone. Some methods have largely different computational requirements, with PLSR and RR-BLUP requiring the least computing time.
Conclusions: The four methods which use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended. PMID:20043835
Unraveling additive from nonadditive effects using genomic relationship matrices.
Muñoz, Patricio R; Resende, Marcio F R; Gezan, Salvador A; Resende, Marcos Deon Vilela; de Los Campos, Gustavo; Kirst, Matias; Huber, Dudley; Peter, Gary F
2014-12-01
The application of quantitative genetics in plant and animal breeding has largely focused on additive models, which may also capture dominance and epistatic effects. Partitioning genetic variance into its additive and nonadditive components using pedigree-based models (pedigree-based best linear unbiased prediction, P-BLUP) is difficult with most commonly available family structures. However, the availability of dense panels of molecular markers makes possible the use of additive- and dominance-realized genomic relationships for the estimation of variance components and the prediction of genetic values (G-BLUP). We evaluated height data from a multifamily population of the tree species Pinus taeda with a systematic series of models accounting for additive, dominance, and first-order epistatic interactions (additive by additive, dominance by dominance, and additive by dominance), using either pedigree- or marker-based information. We show that, compared with the pedigree, use of realized genomic relationships in marker-based models yields a substantially more precise separation of additive and nonadditive components of genetic variance. We conclude that the marker-based relationship matrices in a model including additive and nonadditive effects performed better, improving breeding value prediction. Moreover, our results suggest that, for tree height in this population, the additive and nonadditive components of genetic variance are similar in magnitude. This novel result improves our current understanding of the genetic control and architecture of a quantitative trait and should be considered when developing breeding strategies. Copyright © 2014 by the Genetics Society of America.
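The marker-based additive and dominance relationship matrices used for this kind of partitioning can be sketched with the common orthogonal genotype codings (the genotype matrix is hypothetical, and the coding/scaling shown is one standard parameterisation that may differ in detail from the authors' models):

```python
import numpy as np

# Hypothetical 0/1/2 genotypes for 4 individuals at 4 markers.
M = np.array([[0., 1., 2., 1.],
              [2., 1., 0., 0.],
              [1., 2., 1., 2.],
              [0., 0., 2., 1.]])
p = M.mean(axis=0) / 2   # allele frequencies per marker
q = 1 - p

# Additive covariates: centred allele counts.
Z = M - 2 * p
# Dominance covariates (orthogonal coding):
# -2p^2 for 0 copies, 2pq for 1 copy, -2q^2 for 2 copies.
W = np.select([M == 0, M == 1, M == 2], [-2 * p**2, 2 * p * q, -2 * q**2])

A = Z @ Z.T / (2 * (p * q).sum())       # additive relationship matrix
D = W @ W.T / ((2 * p * q)**2).sum()    # dominance relationship matrix
print(np.round(A, 2))
print(np.round(D, 2))
```

Fitting two random effects, one with covariance proportional to A and one to D, is what allows additive and dominance variance to be separated more cleanly than with pedigree expectations alone.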
Metabolomic prediction of yield in hybrid rice.
Xu, Shizhong; Xu, Yang; Gong, Liang; Zhang, Qifa
2016-10-01
Rice (Oryza sativa) provides a staple food source for more than 50% of the world's population. An increase in yield can significantly contribute to global food security. Hybrid breeding can potentially help to meet this goal because hybrid rice often shows a considerable increase in yield when compared with pure-bred cultivars. We recently developed a marker-guided prediction method for hybrid yield and showed a substantial increase in yield through genomic hybrid breeding. We now have transcriptomic and metabolomic data as potential resources for prediction. Using six prediction methods, including least absolute shrinkage and selection operator (LASSO), best linear unbiased prediction (BLUP), stochastic search variable selection, partial least squares, and support vector machines using the radial basis function and polynomial kernel function, we found that the predictability of hybrid yield can be further increased using these omic data. LASSO and BLUP are the most efficient methods for yield prediction. For high heritability traits, genomic data remain the most efficient predictors. When metabolomic data are used, the predictability of hybrid yield is almost doubled compared with genomic prediction. Of the 21 945 potential hybrids derived from 210 recombinant inbred lines, selection of the top 10 hybrids predicted from metabolites would lead to a ~30% increase in yield. We hypothesize that each metabolite represents a biologically built-in genetic network for yield; thus, using metabolites for prediction is equivalent to using information integrated from these hidden genetic networks for yield prediction. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Allele frequency changes due to hitch-hiking in genomic selection programs
2014-01-01
Background Genomic selection makes it possible to reduce pedigree-based inbreeding over best linear unbiased prediction (BLUP) by increasing emphasis on own rather than family information. However, pedigree inbreeding might not accurately reflect loss of genetic variation and the true level of inbreeding due to changes in allele frequencies and hitch-hiking. This study aimed at understanding the impact of using long-term genomic selection on changes in allele frequencies, genetic variation and level of inbreeding. Methods Selection was performed in simulated scenarios with a population of 400 animals for 25 consecutive generations. Six genetic models were considered with different heritabilities and numbers of QTL (quantitative trait loci) affecting the trait. Four selection criteria were used, including selection on own phenotype and on estimated breeding values (EBV) derived using phenotype-BLUP, genomic BLUP and Bayesian Lasso. Changes in allele frequencies at QTL, markers and linked neutral loci were investigated for the different selection criteria and different scenarios, along with the loss of favourable alleles and the rate of inbreeding measured by pedigree and runs of homozygosity. Results For each selection criterion, hitch-hiking in the vicinity of the QTL appeared more extensive when accuracy of selection was higher and the number of QTL was lower. When inbreeding was measured by pedigree information, selection on genomic BLUP EBV resulted in lower levels of inbreeding than selection on phenotype BLUP EBV, but this did not always apply when inbreeding was measured by runs of homozygosity. Compared to genomic BLUP, selection on EBV from Bayesian Lasso led to less genetic drift, reduced loss of favourable alleles and more effectively controlled the rate of both pedigree and genomic inbreeding in all simulated scenarios. 
In addition, selection on EBV from Bayesian Lasso showed a higher selection differential for Mendelian sampling terms than selection on genomic BLUP EBV. Conclusions Neutral variation can be shaped to a great extent by the hitch-hiking effects associated with selection, rather than just by genetic drift. When implementing long-term genomic selection, strategies for genomic control of inbreeding are essential, due to a considerable hitch-hiking effect, regardless of the method used for prediction of EBV. PMID:24495634
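Inbreeding measured by runs of homozygosity, as used above, can be sketched as the fraction of the genome covered by sufficiently long homozygous segments (often called F_ROH). The 1-Mb minimum segment length and the toy marker map below are illustrative assumptions, not the simulation's parameters:

```python
import numpy as np

def froh(genotypes, positions, min_len=1_000_000):
    """Fraction of the genome lying in runs of homozygosity (ROH).
    genotypes: per-SNP codes 0/1/2 (1 = heterozygous); positions: base pairs.
    A run is a stretch of consecutive homozygous calls spanning >= min_len."""
    hom = (genotypes != 1).astype(int)
    total = positions[-1] - positions[0]
    roh, start = 0, None
    for i, h in enumerate(hom):
        if h and start is None:
            start = i                       # run opens
        elif not h and start is not None:
            seg = positions[i - 1] - positions[start]
            if seg >= min_len:
                roh += seg                  # run closes and is long enough
            start = None
    if start is not None:                   # run reaching the chromosome end
        seg = positions[-1] - positions[start]
        if seg >= min_len:
            roh += seg
    return roh / total

# Toy chromosome: 11 SNPs spaced 1 Mb apart, homozygous block over 5 Mb
pos = np.arange(11) * 1_000_000
geno = np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1])
f = froh(geno, pos)
```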
Path analysis of the genetic integration of traits in the sand cricket: a novel use of BLUPs.
Roff, D A; Fairbairn, D J
2011-09-01
This study combines path analysis with quantitative genetics to analyse a key life history trade-off in the cricket, Gryllus firmus. We develop a path model connecting five traits associated with the trade-off between flight capability and reproduction and test this model using phenotypic data and estimates of breeding values (best linear unbiased predictors) from a half-sibling experiment. Strong support by both types of data validates our causal model and indicates concordance between the phenotypic and genetic expression of the trade-off. Comparisons of the trade-off between sexes and wing morphs reveal that these discrete phenotypes are not genetically independent and that the evolutionary trajectories of the two wing morphs are more tightly constrained to covary than those of the two sexes. Our results illustrate the benefits of combining a quantitative genetic analysis, which examines statistical correlations between traits, with a path model that focuses upon the causal components of variation. © 2011 The Authors. Journal of Evolutionary Biology © 2011 European Society For Evolutionary Biology.
NASA Astrophysics Data System (ADS)
Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.
2017-05-01
Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach to plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and of higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, is able to model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model, using whole-genome Single Nucleotide Polymorphisms (SNPs) as training and testing data. The case study was a set of traits in maize. The maize dataset was acquired from the Global Maize program of CIMMYT (International Maize and Wheat Improvement Center). Based on Pearson correlation, the DBN outperformed the other methods, reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased prediction (BLUP), in the case of presumably non-additive traits. The DBN achieved a correlation of 0.579 on the -1 to 1 scale.
Schrag, Tobias A; Westhues, Matthias; Schipprack, Wolfgang; Seifert, Felix; Thiemann, Alexander; Scholten, Stefan; Melchinger, Albrecht E
2018-04-01
The ability to predict the agronomic performance of single-crosses with high precision is essential for selecting superior candidates for hybrid breeding. With recent technological advances, thousands of new parent lines, and, consequently, millions of new hybrid combinations are possible in each breeding cycle, yet only a few hundred can be produced and phenotyped in multi-environment yield trials. Well-established prediction approaches, such as best linear unbiased prediction (BLUP) using pedigree data and whole-genome prediction using genomic data, are limited in capturing epistasis and interactions occurring within and among downstream biological strata such as the transcriptome and metabolome. Because mRNA and small RNA (sRNA) sequences are involved in transcriptional, translational and post-translational processes, we expect them to provide information influencing several biological strata. However, using sRNA data of parent lines to predict hybrid performance has not yet been addressed. Here, we gathered genomic, transcriptomic (mRNA and sRNA) and metabolomic data of parent lines to evaluate the ability of the data to predict the performance of untested hybrids for important agronomic traits in grain maize. We found a considerable interaction for predictive ability between predictor and trait, with mRNA data being a superior predictor for grain yield and genomic data for grain dry matter content, while sRNA performed relatively poorly for both traits. Combining mRNA and genomic data as predictors resulted in high predictive abilities across both traits, and combining other predictors improved prediction over that of the individual predictors alone. We conclude that downstream "omics" can complement genomics for hybrid prediction, and, thereby, contribute to more efficient selection of hybrid candidates. Copyright © 2018 by the Genetics Society of America.
Genetic evaluation using single-step genomic best linear unbiased predictor in American Angus.
Lourenco, D A L; Tsuruta, S; Fragomeni, B O; Masuda, Y; Aguilar, I; Legarra, A; Bertrand, J K; Amen, T S; Wang, L; Moser, D W; Misztal, I
2015-06-01
Predictive ability of genomic EBV when using single-step genomic BLUP (ssGBLUP) in Angus cattle was investigated. Over 6 million records were available on birth weight (BiW) and weaning weight (WW), almost 3.4 million on postweaning gain (PWG), and over 1.3 million on calving ease (CE). Genomic information was available on, at most, 51,883 animals, which included animals of both high and low EBV accuracy. Traditional EBV were computed by BLUP, genomic EBV by ssGBLUP, and indirect predictions from SNP effects derived from ssGBLUP; SNP effects were calculated based on the following reference populations: ref_2k (top bulls and top cows with an EBV accuracy for BiW ≥0.85), ref_8k (all genotyped parents), and ref_33k (all genotyped animals born up to 2012). Indirect prediction was obtained as a direct genomic value (DGV) or as an index of DGV and parent average (PA). Additionally, runs with ssGBLUP used the inverse of the genomic relationship matrix calculated by an algorithm for proven and young animals (APY) that uses recursions on a small subset of reference animals. An extra reference subset, ref_4k, included 3,872 genotyped parents of genotyped animals. Cross-validation was used to assess predictive ability on a validation population of 18,721 animals born in 2013. Computations for growth traits used a multiple-trait linear model and, for CE, a bivariate CE-BiW threshold-linear model. With BLUP, predictivities were 0.29, 0.34, 0.23, and 0.12 for BiW, WW, PWG, and CE, respectively. With ssGBLUP and ref_2k, predictivities were 0.34, 0.35, 0.27, and 0.13 for BiW, WW, PWG, and CE, respectively, and with ssGBLUP and ref_33k, they were 0.39, 0.38, 0.29, and 0.13, respectively. The low predictivity for CE was due to the low incidence rate of difficult calvings. Indirect predictions with ref_33k were as accurate as full ssGBLUP.
Using the APY with recursions on ref_4k realized 88% of the gains of full ssGBLUP, and using the APY with recursions on ref_8k realized 97% of those gains. Genomic evaluation in beef cattle with ssGBLUP is feasible while keeping the models (maternal, multiple-trait, and threshold) already used in regular BLUP. Gains in predictivity depend on the composition of the reference population. Indirect predictions via SNP effects derived from ssGBLUP allow accurate genomic predictions for young animals, with no advantage to including PA in the index if the reference population is large. With the APY conditioning on about 10,000 reference animals, ssGBLUP is potentially applicable to a large number of genotyped animals without compromising predictive ability.
Masuda, Y; Misztal, I; Tsuruta, S; Legarra, A; Aguilar, I; Lourenco, D A L; Fragomeni, B O; Lawlor, T J
2016-03-01
The objectives of this study were to develop and evaluate an efficient implementation for computing the inverse of the genomic relationship matrix with a recursion algorithm, called the algorithm for proven and young (APY), in single-step genomic BLUP. We validated genomic predictions for young bulls with more than 500,000 genotyped animals for final score in US Holsteins. Phenotypic data included 11,626,576 final scores on 7,093,380 US Holstein cows, and genotypes were available for 569,404 animals. Daughter deviations for young bulls with no classified daughters in 2009, but at least 30 classified daughters in 2014, were computed using all the phenotypic data. Genomic predictions for the same bulls were calculated with single-step genomic BLUP using phenotypes up to 2009. We calculated the inverse of the genomic relationship matrix, G_APY^(-1), based on a direct inversion of the genomic relationship matrix for a small subset of genotyped animals (core animals), extending that information to noncore animals by recursion. We tested several sets of core animals, including 9,406 bulls with at least 1 classified daughter; 9,406 bulls and 1,052 classified dams of bulls; 9,406 bulls and 7,422 classified cows; and random samples of 5,000 to 30,000 animals. Validation reliability was assessed by the coefficient of determination from the regression of daughter deviations on genomic predictions for the predicted young bulls. The reliabilities were 0.39 with 5,000 randomly chosen core animals, 0.45 with the 9,406 bulls and 7,422 cows as core animals, and 0.44 with the remaining sets. With phenotypes truncated in 2009 and the preconditioned conjugate gradient used to solve the mixed model equations, the number of rounds to convergence was 1,343 for core animals defined by bulls; 2,066 for bulls and cows; and at most 1,629 for 10,000 random animals. With complete phenotype data, the number of rounds decreased to 858, 1,299, and at most 1,092, respectively.
Setting up G_APY^(-1) for 569,404 genotyped animals with 10,000 core animals took 1.3 h and 57 GB of memory. The validation reliability with APY reached a plateau when the number of core animals was at least 10,000. Predictions with APY showed little difference in reliability among definitions of core animals. Single-step genomic BLUP with APY is applicable to millions of genotyped animals. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
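The APY recursion described above can be sketched numerically. The toy example below constructs a G for which the APY assumption holds exactly (each noncore animal is a linear combination of core animals plus an independent Mendelian-sampling deviation), so the APY inverse coincides with the direct inverse; all sizes and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
nc, nn = 8, 12                         # core and noncore animals (toy sizes)

# Build a G satisfying the APY assumption exactly: noncore rows are linear
# combinations of core rows plus independent diagonal "Mendelian sampling"
# variance, so the diagonal-M approximation below is not an approximation here.
Gcc = np.cov(rng.standard_normal((nc, 50))) + np.eye(nc)   # SPD core block
P = 0.3 * rng.standard_normal((nn, nc))
d = rng.uniform(0.5, 1.5, nn)
Gnc = P @ Gcc
G = np.block([[Gcc, Gnc.T], [Gnc, P @ Gcc @ P.T + np.diag(d)]])

# APY inverse: only the core block is inverted directly; noncore animals
# enter through the recursion Pnc = G_nc Gcc^-1 and the diagonal
# conditional (Mendelian sampling) variances M.
Gcc_inv = np.linalg.inv(Gcc)
Pnc = G[nc:, :nc] @ Gcc_inv
Mdiag = np.diag(G[nc:, nc:]) - np.einsum('ij,ij->i', Pnc, G[nc:, :nc])
Minv = np.diag(1.0 / Mdiag)
G_apy_inv = np.block([
    [Gcc_inv + Pnc.T @ Minv @ Pnc, -Pnc.T @ Minv],
    [-Minv @ Pnc, Minv],
])
```

The computational appeal is visible in the shapes: the only dense inversion is of the nc x nc core block, while everything involving the (potentially millions of) noncore animals is a matrix product plus a diagonal.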
Fragomeni, B O; Lourenco, D A L; Tsuruta, S; Bradford, H L; Gray, K A; Huang, Y; Misztal, I
2016-12-01
The purposes of this study were to analyze the impact of seasonal losses due to heat stress in pigs from different breeds raised in different environments and to evaluate the accuracy improvement from adding genomic information to genetic evaluations. Data were available for 2 different swine populations: purebred Duroc animals raised in Texas and North Carolina, and commercial crosses of Duroc and F1 females (Landrace × Large White) raised in Missouri and North Carolina; pedigrees provided links between animals from different states. Pedigree information was available for 553,442 animals, of which 8,232 purebred animals were genotyped. Traits were BW at 170 d for purebred animals and HCW for crossbred animals. Analyses were done with an animal model, fitted as either a single- or a 2-trait model with phenotypes measured in different states treated as separate traits. Additionally, reaction norm models were fitted for 1 or 2 traits using a heat load index as a covariable. Heat load was calculated as the temperature-humidity index in excess of 70, averaged over the 30 d prior to data collection. Variance components were estimated with average information REML, and EBV and genomic EBV (GEBV) with BLUP or single-step genomic BLUP (ssGBLUP). Validation was assessed for 146 genotyped sires with progeny in the last generation. Accuracy was calculated as the correlation between EBV and GEBV using reduced data (all animals except the last generation) and using complete data. Heritability estimates for purebred animals were similar across states (varying from 0.23 to 0.26), and reaction norm models showed no evidence of a heat stress effect. Genetic correlations between states for heat loads were always strong (>0.91). For crossbred animals, no differences in heritability were found in single- or 2-trait analyses (0.17 to 0.18), and the genetic correlation between states was moderate (0.43).
In the reaction norm model for crossbreds, heritabilities ranged from 0.15 to 0.30, and genetic correlations between heat loads were as weak as 0.36, with heat load ranging from 0 to 12. Accuracies with ssGBLUP were, on average, 25% greater than with BLUP. Accuracies were greater in 2-trait reaction norm models and at extreme heat load values. Impacts of seasonality were evident only for crossbred animals. Genomic information can help producers mitigate heat stress in swine by identifying superior sires that are more resistant to heat stress.
Mazo Lopera, Mauricio A; Coombes, Brandon J; de Andrade, Mariza
2017-09-27
Gene-environment (GE) interaction has important implications in the etiology of complex diseases that are caused by a combination of genetic factors and environmental variables. Several authors have developed GE analyses in the context of independent subjects or longitudinal data using a gene-set. In this paper, we propose to analyze GE interaction for discrete and continuous phenotypes in family studies by incorporating the relatedness among the relatives in each family into a generalized linear mixed model (GLMM) and by using a gene-based variance component test. In addition, we deal with collinearity problems arising from linkage disequilibrium among single nucleotide polymorphisms (SNPs) by treating their coefficients as random effects under the null model estimation. We show that the best linear unbiased predictor (BLUP) of such random effects in the GLMM is equivalent to the ridge regression estimator. This equivalence provides a simple method to estimate the ridge penalty parameter, in contrast to other computationally demanding estimation approaches based on cross-validation schemes. We evaluated the proposed test using simulation studies and applied it to real data from the Baependi Heart Study, consisting of 76 families. Using our approach, we identified an interaction between BMI and the Peroxisome Proliferator-Activated Receptor Gamma (PPARG) gene associated with diabetes.
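The stated BLUP-ridge equivalence can be verified numerically in the simplest linear (identity-link) case with no fixed effects; the simulated data and variance components below are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 40, 15                          # subjects, SNPs in the gene-set (toy)
X = rng.standard_normal((n, m))
y = X @ rng.standard_normal(m) + rng.standard_normal(n)
sigma2_e, sigma2_u = 1.0, 0.25         # residual and SNP-effect variances

# BLUP of the random SNP coefficients: the conditional mean
# E[u | y] = sigma2_u X' (sigma2_u X X' + sigma2_e I)^-1 y
u_blup = sigma2_u * X.T @ np.linalg.solve(
    sigma2_u * X @ X.T + sigma2_e * np.eye(n), y)

# Ridge regression with penalty lam = sigma2_e / sigma2_u:
# u = (X'X + lam I)^-1 X'y
lam = sigma2_e / sigma2_u
u_ridge = np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)
```

Because the two solutions coincide, the ridge penalty is pinned down by the variance ratio estimated under the null GLMM, which is what removes the need for a cross-validation search over the penalty.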
Oliveira, E J; Santana, F A; Oliveira, L A; Santos, V S
2014-08-28
The aim of this study was to estimate genetic parameters and predict the genotypic values of root quality traits in cassava (Manihot esculenta Crantz) using restricted maximum likelihood (REML) and best linear unbiased prediction (BLUP). A total of 471 cassava accessions were evaluated over two years of cultivation. The evaluated traits included amylose content (AML), root dry matter content (DMC), cyanogenic compounds (CyC), and starch yield (StYi). Estimates of individual broad-sense heritability were low for AML (hg² = 0.07 ± 0.02), medium for StYi and DMC, and high for CyC. The heritability of AML was substantially improved when based on the mean of accessions (hm² = 0.28), indicating that strategies such as increasing the number of repetitions can be used to increase selective efficiency. In general, the observed genotypic values were very close to the predicted average of the improved population, most likely due to the high accuracy (>0.90), especially for DMC, CyC, and StYi. Gains via selection of the 30 best genotypes for each trait were 4.8 and 3.2% for an increase and a decrease in AML, respectively; increases of 10.75 and 74.62% for DMC and StYi, respectively; and a decrease of 89.60% for CyC, relative to the overall mean of the genotypic values. Genotypic correlations between the quality traits of the cassava roots were generally favorable, although low in magnitude. The REML/BLUP method was adequate for estimating genetic parameters and predicting genotypic values, making it useful for cassava breeding.
Genomic selection for fruit quality traits in apple (Malus×domestica Borkh.).
Kumar, Satish; Chagné, David; Bink, Marco C A M; Volz, Richard K; Whitworth, Claire; Carlisle, Charmaine
2012-01-01
The genome sequence of apple (Malus×domestica Borkh.) was published more than a year ago, which helped develop an 8K SNP chip to assist in implementing genomic selection (GS). In apple breeding programmes, GS can be used to obtain genomic breeding values (GEBV) for choosing next-generation parents or selections for further testing as potential commercial cultivars at a very early stage. Thus GS has the potential to accelerate breeding efficiency significantly because of a decreased generation interval or an increased selection intensity. We evaluated the accuracy of GS in a population of 1,120 seedlings generated from a factorial mating design of four female and two male parents. All seedlings were genotyped using an Illumina Infinium chip comprising 8,000 single nucleotide polymorphisms (SNPs), and were phenotyped for various fruit quality traits. Random-regression best linear unbiased prediction (RR-BLUP) and the Bayesian LASSO method were used to obtain GEBV, and their accuracy in predicting unobserved BLUP-BV was compared using a cross-validation approach. Accuracies were very similar for both methods, varying from 0.70 to 0.90 across fruit quality traits. The selection response per unit time using GS compared with traditional BLUP-based selection was very high (>100%), especially for low-heritability traits. The genome-wide average estimated linkage disequilibrium (LD) between adjacent SNPs was 0.32, with a relatively slow decay of LD over the long range (r² = 0.33 and 0.19 at 100 kb and 1,000 kb, respectively), contributing to the high accuracy of GS. The distribution of estimated SNP effects revealed the involvement of large-effect genes with likely pleiotropic effects. These results demonstrate that genomic selection is a credible alternative to conventional selection for fruit quality traits.
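The adjacent-SNP LD statistic reported above (average r² between neighbouring markers) can be sketched directly from a genotype matrix. The helper below is a hypothetical implementation using the squared Pearson correlation of 0/1/2 genotype codes, a common composite measure of LD:

```python
import numpy as np

def adjacent_r2(M):
    """Mean squared Pearson correlation between adjacent SNP columns
    of a 0/1/2 genotype matrix (a common composite r^2 measure of LD)."""
    Mc = M - M.mean(axis=0)
    num = (Mc[:, :-1] * Mc[:, 1:]).sum(axis=0)
    den = np.sqrt((Mc[:, :-1] ** 2).sum(axis=0) * (Mc[:, 1:] ** 2).sum(axis=0))
    return float(np.mean((num / den) ** 2))

# Two identical columns are in complete LD, so the statistic is 1
col = np.array([0, 1, 2, 0, 1, 2], dtype=float).reshape(-1, 1)
r2_dup = adjacent_r2(np.hstack([col, col]))
```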
Vallejo, Roger L; Leeds, Timothy D; Gao, Guangtu; Parsons, James E; Martin, Kyle E; Evenhuis, Jason P; Fragomeni, Breno O; Wiens, Gregory D; Palti, Yniv
2017-02-01
Previously, we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative that enables exploitation of within-family genetic variation. We compared three GS models [single-step genomic best linear unbiased prediction (ssGBLUP), weighted ssGBLUP (wssGBLUP), and BayesB] for predicting genomic-enabled breeding values (GEBV) for BCWD resistance in a commercial rainbow trout population, and compared the accuracy of GEBV to traditional estimates of breeding values (EBV) from a pedigree-based BLUP (P-BLUP) model. We also assessed the impact of sampling design on the accuracy of GEBV predictions. For these comparisons, we used BCWD survival phenotypes recorded on 7,893 fish from 102 families, of which 1,473 fish from 50 families had genotypes [57K single nucleotide polymorphism (SNP) array]. Naïve siblings of the training fish (n = 930 testing fish) were genotyped to predict their GEBV and mated to produce 138 progeny testing families. In the following generation, 9,968 progeny were phenotyped to empirically assess the accuracy of the GEBV predictions made on their non-phenotyped parents. The accuracy of GEBV from all tested GS models was substantially higher than that of the P-BLUP model EBV. The highest increase in accuracy relative to the P-BLUP model was achieved with BayesB (97.2 to 108.8%), followed by wssGBLUP at iterations 2 (94.4 to 97.1%) and 3 (88.9 to 91.2%) and ssGBLUP (83.3 to 85.3%). Reducing the training sample size to n = ~1000 had no negative impact on accuracy (0.67 to 0.72), but with n = ~500 the accuracy dropped to 0.53 to 0.61 if the training and testing fish were full-sibs, and substantially lower, to 0.22 to 0.25, when they were not.
Using progeny performance data, we showed that the accuracy of genomic predictions is substantially higher than estimates obtained from the traditional pedigree-based BLUP model for BCWD resistance. Overall, we found that using a much smaller training sample size compared to similar studies in livestock, GS can substantially improve the selection accuracy and genetic gains for this trait in a commercial rainbow trout breeding population.
The accuracy of genomic selection in Norwegian Red cattle assessed by cross-validation.
Luan, Tu; Woolliams, John A; Lien, Sigbjørn; Kent, Matthew; Svendsen, Morten; Meuwissen, Theo H E
2009-11-01
Genomic selection (GS) is a newly developed tool for the estimation of breeding values for quantitative traits through the use of dense markers covering the whole genome. For a successful application of GS, the accuracy of the prediction of genome-wide breeding values (GW-EBV) is a key issue to consider. Here we investigated the accuracy and possible bias of GW-EBV prediction, using real bovine SNP genotyping (18,991 SNPs) and phenotypic data of 500 Norwegian Red bulls. The study was performed on milk yield, fat yield, protein yield, first-lactation mastitis traits, and calving ease. Three methods, genomic best linear unbiased prediction (G-BLUP), Bayesian statistics (BayesB), and a mixture model approach (MIXTURE), were used to estimate marker effects, and their accuracy and bias were estimated by cross-validation. The accuracies of GW-EBV prediction were found to vary widely, between 0.12 and 0.62. G-BLUP gave the highest accuracy overall. We observed a strong relationship between the accuracy of the prediction and the heritability of the trait. GW-EBV prediction for production traits with high heritability achieved higher accuracy and lower bias than for health traits with low heritability. To achieve similar accuracy for the health traits, more records will probably be needed.
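The cross-validation scheme used above to estimate accuracy can be sketched for G-BLUP: mask one fold of animals, solve the mixed model on the rest, and correlate predictions with the held-out phenotypes. The simulated population, variance ratio, and 5-fold split below are illustrative assumptions, not the Norwegian Red data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 200, 400                        # animals, SNPs (toy sizes)
M = rng.binomial(2, 0.5, size=(n, m)).astype(float)
g = M @ (0.1 * rng.standard_normal(m))           # true genetic values
y = g + rng.standard_normal(n) * g.std()         # heritability ~ 0.5

Z = M - M.mean(axis=0)
G = Z @ Z.T / m                        # genomic relationship matrix
lam = 1.0                              # sigma2_e / sigma2_g, assumed known here

# 5-fold cross-validation: predict each fold from the remaining animals via
# gebv_test = G_test,train (G_train,train + lam I)^-1 (y_train - mean)
folds = np.array_split(rng.permutation(n), 5)
accs = []
for test in folds:
    train = np.setdiff1d(np.arange(n), test)
    K = G[np.ix_(train, train)] + lam * np.eye(len(train))
    gebv = G[np.ix_(test, train)] @ np.linalg.solve(K, y[train] - y[train].mean())
    accs.append(np.corrcoef(gebv, y[test])[0, 1])
acc = float(np.mean(accs))
```

With all markers causal and moderate heritability, the cross-validated accuracy comes out clearly positive but well below 1, mirroring the wide range reported in the abstract.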
Application of Response Surface Methods To Determine Conditions for Optimal Genomic Prediction
Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.
2017-01-01
An epistatic genetic architecture can have a significant impact on the prediction accuracies of genomic prediction (GP) methods. Machine learning methods predict traits with epistatic genetic architectures more accurately than statistical methods based on additive mixed linear models. The differences between these types of GP methods suggest a diagnostic for revealing the genetic architectures underlying traits of interest. In addition to genetic architecture, the performance of GP methods may be influenced by the sample size of the training population, the number of QTL, and the proportion of phenotypic variability due to genotypic variability (heritability). The possible values for these factors and the number of combinations of factor levels that influence the performance of GP methods can be large. Thus, efficient methods for identifying the combinations of factor levels that produce the most accurate GPs are needed. Herein, we employ response surface methods (RSM) to find the experimental conditions that produce the most accurate GPs. We illustrate RSM with an example of simulated doubled haploid populations and identify the combination of factors that maximizes the difference between the prediction accuracies of the best linear unbiased prediction (BLUP) and support vector machine (SVM) GP methods. The greatest impact on the response is due to the genetic architecture of the population, the heritability of the trait, and the sample size. When epistasis is responsible for all of the genotypic variance, heritability is equal to one, and the sample size of the training population is large, the advantage of using the SVM method over the BLUP method is greatest. However, except for values close to the maximum, most of the response surface shows little difference between the methods. We also determined that the conditions resulting in the greatest prediction accuracy for BLUP occurred when the genetic architecture consisted solely of additive effects and heritability was equal to one.
PMID:28720710
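The RSM workflow described above, fitting a second-order model to an observed response (e.g. the accuracy difference between methods) over a design of factor levels, then locating the stationary point, can be sketched as follows. The response function is hypothetical and noise-free for clarity; real RSM works with replicated, noisy evaluations:

```python
import numpy as np

# Hypothetical response: accuracy difference as a function of two coded
# factors (say, heritability and training-set size), peaking at (0.5, 0.2)
def response(x1, x2):
    return 0.3 - (x1 - 0.5) ** 2 - 0.5 * (x2 - 0.2) ** 2

# Evaluate the response over a 9 x 9 factorial design on [-1, 1]^2
g = np.linspace(-1, 1, 9)
x1, x2 = (a.ravel() for a in np.meshgrid(g, g))
z = response(x1, x2)

# Fit the full second-order response-surface model by least squares:
# z = c0 + c1 x1 + c2 x2 + c3 x1^2 + c4 x2^2 + c5 x1 x2
D = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
c = np.linalg.lstsq(D, z, rcond=None)[0]

# Stationary point of z = c0 + b'x + x'Bx  is  x* = -B^{-1} b / 2
B = np.array([[c[3], c[5] / 2], [c[5] / 2, c[4]]])
b = c[1:3]
x_star = -0.5 * np.linalg.solve(B, b)
```

Because the true surface here is exactly quadratic, the fitted stationary point recovers the designed optimum (0.5, 0.2); checking the eigenvalues of B tells you whether that point is a maximum, minimum, or saddle.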
Morgante, Fabio; Huang, Wen; Maltecca, Christian; Mackay, Trudy F C
2018-06-01
Predicting complex phenotypes from genomic data is a fundamental aim of animal and plant breeding, where we wish to predict the genetic merits of selection candidates, and of human genetics, where we wish to predict disease risk. While genomic prediction models work well for populations of related individuals with high linkage disequilibrium (LD) (e.g., livestock), comparable models perform poorly for populations of unrelated individuals with low LD (e.g., humans). We hypothesized that low prediction accuracies in the latter situation may occur when the genetic architecture of the trait departs from the infinitesimal, additive architecture assumed by most prediction models. We used simulated data for 10,000 lines based on sequence data from a population of unrelated, inbred Drosophila melanogaster lines to evaluate this hypothesis. We show that, even in very simplified scenarios meant as a stress test of the commonly used genomic best linear unbiased predictor (G-BLUP) method, using all common variants yields low prediction accuracy regardless of the trait's genetic architecture. However, prediction accuracy increases when predictions are informed by the genetic architecture inferred from mapping the top variants affecting main effects and interactions in the training data, provided there is sufficient power for mapping. When the true genetic architecture is largely or partially due to epistatic interactions, the additive model may not perform well, whereas models that explicitly account for interactions generally increase prediction accuracy. Our results indicate that accounting for genetic architecture can improve prediction accuracy for quantitative traits.
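A common way to let such models account explicitly for interactions is to add an additive-by-additive kernel formed as the Hadamard (element-wise) product of the additive genomic relationship matrix with itself. This is a standard construction, not necessarily the exact model used in the study, and the sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 100, 300                        # lines, variants (toy sizes)
M = rng.binomial(2, 0.5, size=(n, m)).astype(float)
Z = M - M.mean(axis=0)
G = Z @ Z.T / m                        # additive (G-BLUP) kernel
G_aa = G * G                           # Hadamard product: additive-by-additive
                                       # epistatic kernel

# By the Schur product theorem, G_aa inherits positive semidefiniteness
# from G, so it is a valid covariance kernel for a second random effect in
# y = mu + g_add + g_aa + e, with cov(g_add) = s2_a * G and
# cov(g_aa) = s2_aa * G_aa.
```

Fitting both variance components then lets the data apportion genetic signal between additive and pairwise-epistatic terms, which is the sense in which such models "account explicitly for interactions".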
Gamal El-Dien, Omnia; Ratcliffe, Blaise; Klápště, Jaroslav; Chen, Charles; Porth, Ilga; El-Kassaby, Yousry A
2015-05-09
Genomic selection (GS) in forestry can substantially reduce the length of the breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies have made it possible to genotype large numbers of trees at a reasonable cost. Genotyping-by-sequencing was used to genotype 1,126 Interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared: mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived, family-based k-nearest neighbor method (kNN-Fam). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using ridge regression best linear unbiased prediction (RR-BLUP) and generalized ridge regression (GRR) to test different assumptions about trait architecture. Finally, multi-trait GS prediction models were developed using PCA. The EM and kNN-Fam imputation methods were superior for 30 and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than GRR, indicating that the genetic architecture of these traits is complex. GS prediction accuracies for multi-site models were high and better than those of single-site models, whereas cross-site predictions produced the lowest accuracies, reflecting type-B genetic correlations, and were deemed unreliable. The incorporation of genomic information in quantitative genetic analyses produced more realistic heritability estimates, as the half-sib pedigree tended to inflate the additive genetic variance and, consequently, both heritability and gain estimates. Principal component scores, as representatives of multi-trait GS prediction models, produced surprising results: negatively correlated traits could be concurrently selected for using PCA2 and PCA3.
The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation, was shown to be effective. The prediction accuracies obtained for all traits strongly support the integration of GS into tree breeding. While within-site GS prediction accuracies were high, the results clearly indicate that single-site GS models' ability to predict other sites is unreliable, supporting the use of a multi-site approach. Principal component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.
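The RR-BLUP model named above can be illustrated with a minimal numerical sketch; this is a generic ridge formulation, not the authors' implementation, and the shrinkage parameter `lam` is an assumed value rather than one derived from estimated variance components:

```python
import numpy as np

def rr_blup(Z, y, lam):
    """Ridge-regression BLUP of marker effects.

    Z   : (n, m) centered genotype matrix (individuals x markers)
    y   : (n,) centered phenotypes
    lam : shrinkage parameter (residual-to-marker variance ratio)
    """
    m = Z.shape[1]
    # RR-BLUP normal equations: (Z'Z + lam*I) beta = Z'y
    return np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)

rng = np.random.default_rng(0)
n, m = 200, 500
Z = rng.choice([0.0, 1.0, 2.0], size=(n, m))  # simulated genotype counts
Z -= Z.mean(axis=0)
beta_true = np.zeros(m)
beta_true[:20] = rng.normal(0.0, 0.5, 20)     # a few causal markers
y = Z @ beta_true + rng.normal(0.0, 1.0, n)
y -= y.mean()

beta_hat = rr_blup(Z, y, lam=100.0)
gebv = Z @ beta_hat                           # genomic estimated breeding values
```

The GEBV of an unphenotyped individual is then just its centered genotype row multiplied by the estimated marker effects.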
Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen
2017-12-27
Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iterative and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability π and no effect with probability (1 − π). Marker effects and their PEV are estimated by using SVD, and the posterior probability of each marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction.
For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
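The SVD shortcut can be sketched for the plain SNP-BLUP (ridge) case; this shows the generic SVD identity for ridge solutions, not the authors' full BayesC approximation, and `lam` is an assumed shrinkage value:

```python
import numpy as np

def ridge_via_svd(Z, y, lam):
    # Thin SVD of the genotype matrix: Z = U diag(d) V'
    U, d, Vt = np.linalg.svd(Z, full_matrices=False)
    # SNP-BLUP solution rewritten in the SVD basis:
    # beta = V diag(d / (d^2 + lam)) U' y
    return Vt.T @ ((d / (d**2 + lam)) * (U.T @ y))

rng = np.random.default_rng(1)
Z = rng.normal(size=(100, 300))
y = rng.normal(size=100)

b_svd = ridge_via_svd(Z, y, lam=50.0)
b_direct = np.linalg.solve(Z.T @ Z + 50.0 * np.eye(300), Z.T @ y)
# both routes yield the same SNP-BLUP marker effects
```

The payoff is that once U and d are stored, re-solving with new marker-specific shrinkage values (as in the BayesC approximation described above) costs only matrix-vector products instead of a fresh matrix inversion.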
Zhang, Zhe; Erbe, Malena; He, Jinlong; Ober, Ulrike; Gao, Ning; Zhang, Hao; Simianer, Henner; Li, Jiaqi
2015-02-09
Obtaining accurate predictions of unobserved genetic or phenotypic values for complex traits in animal, plant, and human populations is possible through whole-genome prediction (WGP), a combined analysis of genotypic and phenotypic data. Because the underlying genetic architecture of the trait of interest is an important factor affecting model selection, we propose a new strategy, termed BLUP|GA (BLUP-given genetic architecture), which can use genetic architecture information within the dataset at hand rather than from public sources. This is achieved by using a trait-specific covariance matrix T, a weighted sum of a genetic architecture part (the S matrix) and the realized relationship matrix G. The BLUP|GA algorithm is provided and illustrated with real and simulated datasets. The predictive ability of BLUP|GA was validated with three model traits in a dairy cattle dataset and 11 traits in three public datasets with a variety of genetic architectures, and compared with GBLUP and other approaches. Results show that BLUP|GA outperformed GBLUP in 20 of 21 scenarios in the dairy cattle dataset and outperformed GBLUP, BayesA, and BayesB in 12 of 13 traits in the analyzed public datasets. Further analyses showed that the difference in accuracy between BLUP|GA and GBLUP correlates significantly with the distance between the T and G matrices. The new strategy applied in BLUP|GA is a favorable and flexible alternative to the standard GBLUP model, allowing the genetic architecture of the quantitative trait under consideration to be accounted for when necessary. This feature is mainly due to the increased similarity between the trait-specific relationship matrix T and the genetic relationship matrix at unobserved causal loci. Applying BLUP|GA in WGP would ease the burden of model selection. Copyright © 2015 Zhang et al.
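The weighted-sum construction of the trait-specific matrix can be sketched as below; `omega` and `lam` are assumed tuning values, and the S matrix shown (built from arbitrary per-SNP weights) stands in for any genetic-architecture part, not the paper's exact construction:

```python
import numpy as np

def blup_ga_predict(G, S, y, omega, lam):
    """BLUP with trait-specific covariance T = omega*S + (1-omega)*G.

    omega = 0 recovers standard GBLUP; lam is the residual-to-genetic
    variance ratio (treated as known in this sketch).
    """
    T = omega * S + (1.0 - omega) * G
    # BLUP of genetic values for centered phenotypes y:
    # u = T (T + lam*I)^{-1} y
    return T @ np.linalg.solve(T + lam * np.eye(len(y)), y)

rng = np.random.default_rng(2)
n, m = 50, 400
Z = rng.choice([0.0, 1.0, 2.0], size=(n, m))
Zc = Z - Z.mean(axis=0)
G = Zc @ Zc.T / m                       # simplified realized relationship matrix
w = rng.random(m)                       # e.g. per-SNP weights from a GWAS scan
S = (Zc * w) @ Zc.T / w.sum()           # architecture part built from the weights
y = rng.normal(size=n)

u_ga = blup_ga_predict(G, S, y, omega=0.3, lam=1.0)
u_gblup = blup_ga_predict(G, S, y, omega=0.0, lam=1.0)   # plain GBLUP
```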
Genome wide selection in Citrus breeding.
Gois, I B; Borém, A; Cristofani-Yaly, M; de Resende, M D V; Azevedo, C F; Bastianel, M; Novelli, V M; Machado, M A
2016-10-17
Genome wide selection (GWS) is essential for the genetic improvement of perennial species such as Citrus because of its ability to increase gain per unit time and to enable the efficient selection of characteristics with low heritability. This study assessed GWS efficiency in a population of Citrus and compared it with selection based on phenotypic data. A total of 180 individual trees from a cross between Pera sweet orange (Citrus sinensis Osbeck) and Murcott tangor (Citrus sinensis Osbeck x Citrus reticulata Blanco) were evaluated for 10 characteristics related to fruit quality. The hybrids were genotyped using 5287 DArTseq™ (diversity arrays technology) molecular markers and their effects on phenotypes were predicted using the random regression-best linear unbiased predictor (rr-BLUP) method. The predictive ability, prediction bias, and accuracy of GWS were estimated to verify its effectiveness for phenotype prediction. The proportion of genetic variance explained by the markers was also computed. The heritability of the traits, as determined by markers, was 16-28%. The predictive ability of these markers ranged from 0.53 to 0.64, and the regression coefficients between predicted and observed phenotypes were close to unity. Over 35% of the genetic variance was accounted for by the markers. Accuracy estimates with GWS were lower than those obtained by phenotypic analysis; however, GWS was superior in terms of genetic gain per unit time. Thus, GWS may be useful for Citrus breeding as it can predict phenotypes early and accurately, and reduce the length of the selection cycle. This study demonstrates the feasibility of genomic selection in Citrus.
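The two validation quantities reported above, predictive ability and prediction bias, can be computed as follows (a generic sketch of these standard metrics, not the authors' exact pipeline):

```python
import numpy as np

def prediction_metrics(y_obs, y_pred):
    """Predictive ability and bias as commonly reported in GWS validation.

    ability : Pearson correlation between observed and predicted phenotypes
    bias    : slope of the regression of observed on predicted values;
              values near 1 indicate unbiased (well-scaled) predictions
    """
    ability = np.corrcoef(y_obs, y_pred)[0, 1]
    bias = np.polyfit(y_pred, y_obs, 1)[0]
    return ability, bias

rng = np.random.default_rng(3)
y_pred = rng.normal(size=100)
y_obs = y_pred + rng.normal(0.0, 0.5, 100)   # noisy but unbiased predictions
ability, bias = prediction_metrics(y_obs, y_pred)
```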
Nicodemus, Kristin K; Malley, James D; Strobl, Carolin; Ziegler, Andreas
2010-02-27
Random forests (RF) have been increasingly used in applications such as genome-wide association and microarray studies where predictor correlation is frequently observed. Recent work on the permutation-based variable importance measures (VIMs) used in RF has come to apparently contradictory conclusions. We present an extended simulation study to synthesize results. When predictor correlation was present and predictors were associated with the outcome (HA), the unconditional RF VIM attributed a higher share of importance to correlated predictors, while under the null hypothesis that no predictors are associated with the outcome (H0) the unconditional RF VIM was unbiased. Conditional VIMs showed a decrease in VIM values for correlated predictors versus the unconditional VIMs under HA and were unbiased under H0. Scaled VIMs were clearly biased under both HA and H0. Unconditional unscaled VIMs are a computationally tractable choice for large datasets and are unbiased under the null hypothesis. Whether the observed increase in VIMs for correlated predictors should be considered a "bias" - because the VIMs do not directly reflect the coefficients in the generating model - or a beneficial attribute of these VIMs depends on the application. For example, in genetic association studies, where correlation between markers may help to localize the functionally relevant variant, the increased importance of correlated predictors may be an advantage. On the other hand, we show examples where this increased importance may result in spurious signals.
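The unconditional permutation VIM discussed above can be sketched in a model-agnostic way (a sketch of the underlying idea, not the RF-specific implementation the study evaluates):

```python
import numpy as np

def permutation_vim(predict, X, y, rng, n_rep=20):
    """Unconditional permutation importance: mean increase in squared
    error when one predictor's column is permuted at a time."""
    base_mse = np.mean((predict(X) - y) ** 2)
    vim = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_rep):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            vim[j] += np.mean((predict(Xp) - y) ** 2) - base_mse
    return vim / n_rep

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.1, 300)   # only the first predictor matters
vim = permutation_vim(lambda A: 2.0 * A[:, 0], X, y, rng)
```

With correlated columns, this unconditional scheme is exactly where the importance of a predictor's correlated neighbors inflates, which is the behavior the study characterizes.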
Sun, Jin; Rutkoski, Jessica E; Poland, Jesse A; Crossa, José; Jannink, Jean-Luc; Sorrells, Mark E
2017-07-01
High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in multivariate pedigree and genomic prediction models would be desirable to improve indirect selection for grain yield. In this study, we evaluated three statistical models, simple repeatability (SR), multitrait (MT), and random regression (RR), for the longitudinal data of secondary traits and compared the impact of the proposed models for secondary traits on their predictive abilities for grain yield. Grain yield and the secondary traits, canopy temperature (CT) and normalized difference vegetation index (NDVI), were collected in five diverse environments for 557 wheat lines with available pedigree and genomic information. A two-stage analysis was applied for pedigree and genomic selection (GS). First, secondary traits were fitted by SR, MT, or RR models, separately, within each environment. Then, best linear unbiased predictions (BLUPs) of secondary traits from the above models were used in the multivariate prediction models to compare predictive abilities for grain yield. Predictive ability improved substantially, by 70% on average, with multivariate pedigree and genomic models when secondary traits were included in both training and test populations. Additionally, (i) predictive abilities varied only slightly among the MT, RR, and SR models in this data set, (ii) results indicated that including BLUPs of secondary traits from the MT model was best under severe drought, and (iii) the RR model was slightly better than the SR and MT models under drought environments. Copyright © 2017 Crop Science Society of America.
Pintus, M A; Gaspa, G; Nicolazzi, E L; Vicario, D; Rossoni, A; Ajmone-Marsan, P; Nardone, A; Dimauro, C; Macciotta, N P P
2012-06-01
The large number of markers available compared with phenotypes represents one of the main issues in genomic selection. In this work, principal component analysis was used to reduce the number of predictors for calculating genomic breeding values (GEBV). Bulls of 2 cattle breeds farmed in Italy (634 Brown and 469 Simmental) were genotyped with the 54K Illumina beadchip (Illumina Inc., San Diego, CA). After data editing, 37,254 and 40,179 single nucleotide polymorphisms (SNP) were retained for Brown and Simmental, respectively. Principal component analysis carried out on the SNP genotype matrix extracted 2,257 and 3,596 new variables in the 2 breeds, respectively. Bulls were sorted by birth year to create reference and prediction populations. The effect of principal components on deregressed proofs in reference animals was estimated with a BLUP model. Results were compared with those obtained by using SNP genotypes as predictors with either the BLUP or Bayes_A method. Traits considered were milk, fat, and protein yields, fat and protein percentages, and somatic cell score. The GEBV were obtained for the prediction population by blending direct genomic prediction and pedigree indexes. No substantial differences were observed in squared correlations between GEBV and EBV in prediction animals among the 3 methods in either breed. The principal component analysis method allowed for a reduction of about 90% in the number of independent variables when predicting direct genomic values, with a substantial decrease in calculation time and without loss of accuracy. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
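The dimension-reduction step can be sketched with an SVD-based PCA of the genotype matrix; the retained-variance threshold here is an assumed choice, not the criterion used in the study:

```python
import numpy as np

def pc_scores(M, var_kept=0.95):
    """Principal-component scores of a SNP genotype matrix.

    Returns an (n, k) score matrix, with k the smallest number of
    components explaining at least var_kept of the total variance.
    """
    Mc = M - M.mean(axis=0)                    # center each SNP column
    U, d, _ = np.linalg.svd(Mc, full_matrices=False)
    explained = np.cumsum(d**2) / np.sum(d**2)
    k = int(np.searchsorted(explained, var_kept)) + 1
    return U[:, :k] * d[:k]

rng = np.random.default_rng(5)
M = rng.choice([0, 1, 2], size=(150, 2000)).astype(float)   # n << number of SNPs
scores = pc_scores(M)
```

These scores then replace the raw SNP genotypes as predictors in the BLUP model, shrinking the number of independent variables from thousands to at most n − 1.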
Mendes, M P; Ramalho, M A P; Abreu, A F B
2012-04-10
The objective of this study was to compare the BLUP selection method with different selection strategies in F(2:4) and to assess the efficiency of this method for the early choice of the best common bean (Phaseolus vulgaris) lines. Fifty-one F(2:4) progenies were produced from a cross between the CVIII8511 and RP-26 lines. A randomized block design was used with 20 replications and one-plant field plots. Character data on plant architecture and grain yield were obtained, and the sum of the standardized variables was then estimated for simultaneous selection of both traits. Analysis was carried out by mixed models (BLUP) and the least squares method to compare different selection strategies, such as mass selection, stratified mass selection, and between- and within-progeny selection. The progenies selected by BLUP were assessed in advanced generations, always selecting the greatest and smallest sums of the standardized variables. Analyses by the least squares method and the BLUP procedure ranked the progenies in the same way. The coincidence of the individuals identified by BLUP and by between- and within-progeny selection was high, and was of the greatest magnitude when BLUP was compared with mass selection. Although BLUP is the best estimator of genotypic value, its efficiency in the response to long-term selection is not different from that of the other methods, because it is also unable to predict the future effect of the progeny x environment interaction. It was inferred that selection success will always depend on the most accurate possible progeny assessment and on using alternatives to reduce the progeny x environment interaction effect.
Ni, Guiyan; Cavero, David; Fangmann, Anna; Erbe, Malena; Simianer, Henner
2017-01-16
With the availability of next-generation sequencing technologies, genomic prediction based on whole-genome sequencing (WGS) data is now feasible in animal breeding schemes and was expected to lead to higher predictive ability, since such data may contain all genomic variants including causal mutations. Our objective was to compare prediction ability with high-density (HD) array data and WGS data in a commercial brown layer line with genomic best linear unbiased prediction (GBLUP) models using various approaches to weight single nucleotide polymorphisms (SNPs). A total of 892 chickens from a commercial brown layer line were genotyped with 336K segregating SNPs (array data) that included 157K genic SNPs (i.e., SNPs in or around a gene). For these individuals, genome-wide sequence information was imputed based on data from re-sequencing runs of 25 individuals, leading to 5.2 million (M) imputed SNPs (WGS data), including 2.6 M genic SNPs. De-regressed proofs (DRP) for eggshell strength, feed intake, and laying rate were used as quasi-phenotypic data in genomic prediction analyses. Four weighting factors for building a trait-specific genomic relationship matrix were investigated: identical weights, −(log₁₀P) from genome-wide association study results, squares of SNP effects from random regression BLUP, and variable-selection-based weights (known as BLUP|GA). Predictive ability was measured as the correlation between DRP and direct genomic breeding values in five replications of a fivefold cross-validation. Averaged over the three traits, the highest predictive ability (0.366 ± 0.075) was obtained when only genic SNPs from WGS data were used. Predictive abilities with genic SNPs and all SNPs from HD array data were 0.361 ± 0.072 and 0.353 ± 0.074, respectively.
Prediction with −(log₁₀P) or squares of SNP effects as weighting factors for building a genomic relationship matrix, or with BLUP|GA, did not increase accuracy compared to identical weights, regardless of the SNP set used. Our results show that little or no benefit was gained from using all imputed WGS data for genomic prediction compared to using HD array data, regardless of the weighting factors tested. However, using only genic SNPs from WGS data had a positive effect on prediction ability.
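The trait-specific genomic relationship matrices compared above share one construction, a SNP-weighted cross-product of centered genotypes; a generic sketch follows (the scaling convention is an assumption, and the weights shown are arbitrary placeholders for GWAS- or SNP-effect-derived weights):

```python
import numpy as np

def weighted_grm(Z, w):
    """Genomic relationship matrix with per-SNP weights w.

    Identical weights recover an (unweighted) VanRaden-style G up to scale.
    """
    Zc = Z - Z.mean(axis=0)            # center genotype codes
    G = (Zc * w) @ Zc.T                # Z diag(w) Z'
    return G / np.mean(np.diag(G))     # scale so the average diagonal is 1

rng = np.random.default_rng(6)
Z = rng.choice([0.0, 1.0, 2.0], size=(80, 1000))
w_flat = np.ones(1000)                       # identical weights
w_gwas = rng.random(1000) ** 2               # e.g. squared SNP effects
G_flat = weighted_grm(Z, w_flat)
G_gwas = weighted_grm(Z, w_gwas)
```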
Cabezas, José Antonio; González-Martínez, Santiago C; Collada, Carmen; Guevara, María Angeles; Boury, Christophe; de María, Nuria; Eveno, Emmanuelle; Aranda, Ismael; Garnier-Géré, Pauline H; Brach, Jean; Alía, Ricardo; Plomion, Christophe; Cervera, María Teresa
2015-09-01
We have carried out a candidate-gene-based association genetic study in Pinus pinaster Aiton and evaluated the predictive performance for genetic merit gain of the most significantly associated genes and single nucleotide polymorphisms (SNPs). We used a second-generation 384-SNP array enriched with candidate genes for growth and wood properties to genotype mother trees collected in 20 natural populations covering most of the European distribution of the species. Phenotypic data for total height, polycyclism, root-collar diameter, and biomass were obtained from a replicated provenance-progeny trial located in two sites with contrasting environments (Atlantic vs Mediterranean climate). General linear models identified strong associations between growth traits (total height and polycyclism) and four SNPs from the korrigan candidate gene, after multiple-testing correction using the false discovery rate. The combined genomic breeding value predictions assessed for the four associated korrigan SNPs by ridge regression-best linear unbiased prediction (RR-BLUP) and cross-validation accounted for up to 8 and 15% of the phenotypic variance for height and polycyclic growth, respectively, and did not improve when SNPs from other growth-related candidate genes were added. For root-collar diameter and total biomass, they accounted for 1.6 and 1.1% of the phenotypic variance, respectively, but these values increased to 15 and 4.1% when other SNPs from lp3.1, lp3.3, and cad were included in the RR-BLUP models. These results point towards a desirable integration of candidate-gene studies as a means to pre-select relevant markers, and aid genomic selection in maritime pine breeding programs. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Vitezica, Zulma G; Varona, Luis; Elsen, Jean-Michel; Misztal, Ignacy; Herring, William; Legarra, Andrès
2016-01-29
Most developments in quantitative genetics theory focus on intra-breed/line concepts. With the availability of massive genomic information, it becomes necessary to revisit the theory for crossbred populations. We propose methods to construct genomic covariances with additive and non-additive (dominance) inheritance in the case of pure lines and crossbred populations. We describe substitution effects and dominance deviations across two pure parental populations and the crossbred population. Gene effects are assumed to be independent of the origin of alleles, and allelic frequencies can differ between parental populations. Based on these assumptions, the theoretical variance components (additive and dominance) are obtained as a function of marker effects and allelic frequencies. The additive genetic variance in the crossbred population includes the biological additive and dominance effects of a gene and a covariance term. Dominance variance in the crossbred population is proportional to the product of the heterozygosity coefficients of both parental populations. A genomic BLUP (best linear unbiased prediction) equivalent model is presented. We illustrate this approach using pig data (two pure lines and their cross, including 8265 phenotyped and genotyped sows). For the total number of piglets born, the dominance variance in the crossbred population represented about 13% of the total genetic variance. Dominance variation is thus only marginally important for litter size in the crossbred population. We present a coherent marker-based model that includes purebred and crossbred data and additive and dominance actions. Using this model, it is possible to estimate breeding values, dominance deviations, and variance components in a dataset that comprises data on purebred and crossbred individuals. These methods can be exploited to plan assortative mating in pig, maize, or other species, in order to generate crossbred individuals with superior performance.
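The additive and dominance marker design matrices underlying such a model can be coded as below; this is the standard biological (genotypic) coding sketched for illustration, not the paper's exact parameterization, which also involves breed-specific allele frequencies:

```python
import numpy as np

def additive_dominance_design(genotypes):
    """Additive and dominance design matrices from 0/1/2 genotype counts.

    Additive coding counts copies of one allele; dominance coding flags
    heterozygotes. Both are column-centered before model fitting.
    """
    Z_a = genotypes.astype(float)               # 0, 1, 2 allele copies
    Z_d = (genotypes == 1).astype(float)        # heterozygote indicator
    return Z_a - Z_a.mean(axis=0), Z_d - Z_d.mean(axis=0)

geno = np.array([[0, 1, 2],
                 [1, 1, 0],
                 [2, 0, 1]])
Z_a, Z_d = additive_dominance_design(geno)
```

Additive and dominance relationship matrices are then built from these two design matrices separately, giving the two variance components partitioned in the abstract.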
Guo, Guang; Liu, Hexuan; Wang, Ling; Shen, Haipeng; Hu, Wen
2015-10-01
In this analysis, guided by an evolutionary framework, we investigate how the human genome as a whole interacts with historical period, age, and physical activity to influence body mass index (BMI). The genomic influence is estimated by (1) heritability, or the proportion of variance in BMI explained by genome-wide genotype data, and (2) the random effects, or the best linear unbiased predictors (BLUPs), of genome-wide association study (GWAS) data on BMI. Data were used from the Framingham Heart Study (FHS) in the United States. The study was initiated in 1948, and the obesity data were collected repeatedly over the subsequent decades. The analyses draw samples from a pool of >8,000 individuals in the FHS. Hypothesis testing based on the Pitman test, permutation Pitman test, F test, and permutation F test produces three sets of significant findings. First, the genomic influence on BMI is substantially larger after the mid-1980s than in the few decades before the mid-1980s within each age group of 21-40, 41-50, 51-60, and >60. Second, the genomic influence on BMI weakens as one ages across the life course; that is, the genomic influence on BMI tends to be more important during reproductive ages than after reproductive ages within each of the two historical periods. Third, within the age group of 21-50, but not in the age group of >50, the genomic influence on BMI among physically active individuals is substantially smaller than among those who are not physically active. In summary, this study provides evidence that the influence of the human genome as a whole on obesity depends on historical period, age, and level of physical activity.
Jacquin, Laval; Cao, Tuong-Vi; Ahmadi, Nourollah
2016-01-01
One objective of this study was to provide readers with a clear and unified understanding of the parametric statistical and kernel methods used for genomic prediction, and to compare some of these in the context of rice breeding for quantitative traits. A further objective was to provide a simple and user-friendly R package, named KRMM, which allows users to perform RKHS regression with several kernels. After introducing the concept of regularized empirical risk minimization, the connections between well-known parametric and kernel methods such as ridge regression [i.e., the genomic best linear unbiased predictor (GBLUP)] and reproducing kernel Hilbert space (RKHS) regression were reviewed. Ridge regression was then reformulated so as to show and emphasize the advantage of the kernel "trick" concept, exploited by kernel methods in the context of epistatic genetic architectures, over the parametric frameworks used by conventional methods. Several parametric and kernel methods, namely the least absolute shrinkage and selection operator (LASSO), GBLUP, support vector machine regression (SVR), and RKHS regression, were then compared for their genomic predictive ability in the context of rice breeding using three real data sets. Among the compared methods, RKHS regression and SVR were often the most accurate, followed by GBLUP and LASSO. An R function which allows users to perform RR-BLUP of marker effects, GBLUP, and RKHS regression with a Gaussian, Laplacian, polynomial, or ANOVA kernel in a reasonable computation time has been developed. Moreover, a modified version of this function, which allows users to tune kernels for RKHS regression, has also been developed and parallelized for HPC Linux clusters. The corresponding KRMM package and all scripts have been made publicly available.
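The kernel "trick" discussed above can be illustrated with a bare-bones Gaussian-kernel RKHS (kernel ridge) regression in Python; the bandwidth and regularization values are assumptions, and this is not the KRMM package implementation:

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth):
    """Gaussian kernel matrix: K_ij = exp(-||a_i - b_j||^2 / bandwidth)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / bandwidth)

def krr_fit_predict(X, y, X_new, bandwidth, lam):
    """RKHS regression: alpha = (K + lam*I)^{-1} y, f(x) = k(x, X) alpha."""
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
    return gaussian_kernel(X_new, X, bandwidth) @ alpha

rng = np.random.default_rng(7)
X = rng.normal(size=(120, 5))
y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(0.0, 0.1, 120)  # epistatic-like signal
y_fit = krr_fit_predict(X, y, X, bandwidth=5.0, lam=0.1)
```

Note that the model is fitted entirely through the n × n kernel matrix, never through an explicit (possibly infinite-dimensional) feature expansion, which is exactly the advantage over the parametric formulation.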
Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value.
Shin, Donghyun; Lee, Chul; Park, Kyoung-Do; Kim, Heebal; Cho, Kwang-Hyeon
2017-03-01
Holsteins are known as the world's highest-milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values for each individual based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. We identified 9, 6, and 17 significant genetic regions related to milk production, fat, and protein, respectively. The genes in these regions are newly reported as associated with milk traits in Holsteins. This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand knowledge of the polygenic nature of milk production in Holsteins.
Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan
2014-02-10
Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors and, separately, on methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association, in interactive form, with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome), in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify the accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.
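A deliberately simplified, cross-sectional sketch of the regression-calibration idea may help (no mixed model, no interaction term; the simulated variables are illustrative stand-ins for the instrument/surrogate setup described above, not the study's data):

```python
import numpy as np

def regression_calibration_slope(T, W, y):
    """Two-stage regression calibration for one error-prone exposure.

    Stage 1: predict the unbiased surrogate W from the instrument T.
    Stage 2: regress the outcome y on the calibrated exposure.
    """
    D1 = np.column_stack([np.ones_like(T), T])
    gamma = np.linalg.lstsq(D1, W, rcond=None)[0]
    x_cal = D1 @ gamma                               # approximates E[X | T]
    D2 = np.column_stack([np.ones_like(x_cal), x_cal])
    beta = np.linalg.lstsq(D2, y, rcond=None)[0]
    return beta[1]

rng = np.random.default_rng(8)
n = 5000
x = rng.normal(size=n)                    # true (unobserved) exposure
T = x + rng.normal(0.0, 0.7, n)           # instrumental variable
W = x + rng.normal(0.0, 0.7, n)           # unbiased surrogate, independent errors
y = 2.0 * x + rng.normal(0.0, 0.5, n)     # true slope is 2

naive = np.polyfit(W, y, 1)[0]            # attenuated by measurement error
calibrated = regression_calibration_slope(T, W, y)
```

The naive regression on the surrogate is attenuated toward zero, while the calibrated estimate is consistent for the true slope.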
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2013-09-04
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
Estimation of genomic breeding values for milk yield in UK dairy goats.
Mucha, S; Mrode, R; MacLaren-Lee, I; Coffey, M; Conington, J
2015-11-01
The objective of this study was to estimate genomic breeding values for milk yield in crossbred dairy goats. The research was based on data provided by 2 commercial goat farms in the UK comprising 590,409 milk yield records on 14,453 dairy goats kidding between 1987 and 2013. The population was created by crossing 3 breeds: Alpine, Saanen, and Toggenburg. In each generation the best-performing animals were selected for breeding, and as a result, a synthetic breed was created. The pedigree file contained 30,139 individuals, of which 2,799 were founders. The data set contained test-day records of milk yield, lactation number, farm, age at kidding, and year and season of kidding. Data on milk composition were unavailable. In total 1,960 animals were genotyped with the Illumina 50K caprine chip. Two methods for estimation of genomic breeding value were compared: BLUP at the single nucleotide polymorphism level (BLUP-SNP) and single-step BLUP. The highest accuracy of 0.61 was obtained with single-step BLUP, and the lowest (0.36) with BLUP-SNP. Linkage disequilibrium (r², the squared correlation of the alleles at 2 loci) at 50 kb (distance between 2 SNP) was 0.18. This is the first attempt to implement genomic selection in UK dairy goats. Results indicate that the single-step method provides the highest accuracy for populations with a small number of genotyped individuals, where the number of genotyped males is low and females are predominant in the reference population. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Chauvel, Louis; Leist, Anja K
2015-11-14
Health inequalities reflect multidimensional inequality (income, education, and other indicators of socioeconomic position) and vary across countries and welfare regimes. The extent to which there is intergenerational transmission of health via parental socioeconomic status has rarely been investigated from a comparative perspective. The study sought to explore whether different measures of stratification produce the same health gradient and to what extent health gradients of income and of social origins vary with level of living and income inequality. A total of 299,770 observations were available from 18 countries assessed in EU-SILC 2005 and 2011 data, which contain information on social origins. Income inequality (Gini) and level of living were calculated from EU-SILC. Logit rank transformation provided normalized inequalities and distributions of income and social origins up to the extremes of the distribution and was used to investigate net comparable health gradients in detail. Multilevel random-slope models were run to post-estimate best linear unbiased predictors (BLUPs) and related standard deviations of residual intercepts (median health) and slopes (income-health gradients) per country and survey year. Health gradients varied across different measures of stratification, with origins and income producing significant slopes after controls. Income inequality was associated with worse average health, but income inequality and steepness of the health gradient were only marginally associated. Linear health gradients suggest gains in health per rank of income and of origins even at the very extremes of the distribution. Intergenerational transmission of status gains in importance in countries with higher income inequality. Countries differ in the association of income inequality and income-related health gradient, and low income inequality may mask health problems of vulnerable individuals with low status.
Not only income inequality but also other country characteristics, such as familial orientation, play a considerable role in explaining the steepness of the health gradient.
USDA-ARS?s Scientific Manuscript database
The objective of this study was to compare genetic trends from a single-step genomic BLUP (ssGBLUP) and the traditional BLUP models for milk production traits in US Holstein. Phenotypes were 305-day milk, fat, and protein yield from 21,527,040 cows recorded between January, 1990 and August, 2015. Th...
Guo, X; Christensen, O F; Ostersen, T; Wang, Y; Lund, M S; Su, G
2015-02-01
A single-step method allows genetic evaluation using information of phenotypes, pedigree, and markers from genotyped and nongenotyped individuals simultaneously. This paper compared genomic predictions obtained from a single-step BLUP (SSBLUP) method, a genomic BLUP (GBLUP) method, a selection index blending (SELIND) method, and a traditional pedigree-based method (BLUP) for total number of piglets born (TNB), litter size at d 5 after birth (LS5), and mortality rate before d 5 (Mort; including stillbirth) in Danish Landrace and Yorkshire pigs. Data sets of 778,095 litters from 309,362 Landrace sows and 472,001 litters from 190,760 Yorkshire sows were used for the analysis. There were 332,795 Landrace and 207,255 Yorkshire animals in the pedigree data, among which 3,445 Landrace pigs (1,366 boars and 2,079 sows) and 3,372 Yorkshire pigs (1,241 boars and 2,131 sows) were genotyped with the Illumina PorcineSNP60 BeadChip. The results showed that the 3 methods with marker information (SSBLUP, GBLUP, and SELIND) produced more accurate predictions for genotyped animals than the pedigree-based method. For genotyped animals, the average of reliabilities for all traits in both breeds using traditional BLUP was 0.091, which increased to 0.171 when using GBLUP and to 0.179 when using SELIND and further increased to 0.209 when using SSBLUP. Furthermore, the average reliability of EBV for nongenotyped animals was increased from 0.091 for traditional BLUP to 0.105 for the SSBLUP. The results indicate that the SSBLUP is a good approach to practical genomic prediction of litter size and piglet mortality in Danish Landrace and Yorkshire populations.
A geostatistical approach to data harmonization - Application to radioactivity exposure data
NASA Astrophysics Data System (ADS)
Baume, O.; Skøien, J. O.; Heuvelink, G. B. M.; Pebesma, E. J.; Melles, S. J.
2011-06-01
Environmental issues such as air and groundwater pollution and climate change are frequently studied at spatial scales that cross boundaries between political and administrative regions. It is common for different administrations to employ different data collection methods. If these differences are not taken into account in spatial interpolation procedures, then biases may appear and cause unrealistic results. The resulting maps may show misleading patterns and lead to wrong interpretations. Also, errors will propagate when these maps are used as input to environmental process models. In this paper we present and apply a geostatistical model that generalizes the universal kriging model such that it can handle heterogeneous data sources. The associated best linear unbiased estimation and prediction (BLUE and BLUP) equations are presented and it is shown that these lead to harmonized maps from which estimated biases are removed. The methodology is illustrated with an example of country bias removal in a radioactivity exposure assessment for four European countries. The application also addresses multicollinearity problems in data harmonization, which arise when both artificial bias factors and natural drifts are present and cannot easily be distinguished. Solutions for handling multicollinearity are suggested, and directions for further investigation are proposed.
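The BLUE/BLUP pair underlying this approach can be written compactly: the drift coefficients (which may include per-network bias indicator columns, as in the harmonization setting) are estimated by generalized least squares, and the prediction adds the kriged residual. A minimal numpy sketch under assumed inputs (all names hypothetical; the residual covariance matrix C is taken as known):

```python
import numpy as np

def universal_kriging_predict(X, C, z, x0, c0):
    """Universal-kriging BLUP at one target location.

    X  : n x p drift matrix (may include bias indicator columns)
    C  : n x n covariance matrix of the residuals (assumed known)
    z  : n observations
    x0 : p drift values at the target location
    c0 : n covariances between the target and the observations
    """
    Ci = np.linalg.inv(C)
    # GLS estimate of the drift/bias coefficients (the BLUE)
    beta = np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ z)
    # BLUP: trend at the target plus the kriged residual
    return x0 @ beta + c0 @ Ci @ (z - X @ beta)
```

As a sanity check, with uncorrelated residuals (C = I) and a target uncorrelated with the data (c0 = 0), the predictor collapses to the ordinary least-squares trend evaluated at x0.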
Strategies for implementing genomic selection for feed efficiency in dairy cattle breeding schemes.
Wallén, S E; Lillehammer, M; Meuwissen, T H E
2017-08-01
Alternative genomic selection and traditional BLUP breeding schemes were compared for the genetic improvement of feed efficiency in simulated Norwegian Red dairy cattle populations. The change in genetic gain over time and achievable selection accuracy were studied for milk yield and residual feed intake, as a measure of feed efficiency. When including feed efficiency in genomic BLUP schemes, it was possible to achieve high selection accuracies for genomic selection, and all genomic BLUP schemes gave better genetic gain for feed efficiency than BLUP using a pedigree relationship matrix. However, introducing a second trait in the breeding goal caused a reduction in the genetic gain for milk yield. When using contracted test herds with genotyped and feed efficiency recorded cows as a reference population, adding an additional 4,000 new heifers per year to the reference population gave accuracies that were comparable to a male reference population that used progeny testing with 250 daughters per sire. When the test herd consisted of 500 or 1,000 cows, lower genetic gain was found than using progeny test records to update the reference population. It was concluded that to improve difficult to record traits, the use of contracted test herds that had additional recording (e.g., measurements required to calculate feed efficiency) is a viable option, possibly through international collaborations. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Genomic prediction of reproduction traits for Merino sheep.
Bolormaa, S; Brown, D J; Swan, A A; van der Werf, J H J; Hayes, B J; Daetwyler, H D
2017-06-01
Economically important reproduction traits in sheep, such as number of lambs weaned and litter size, are expressed only in females and later in life after most selection decisions are made, which makes them ideal candidates for genomic selection. Accurate genomic predictions would lead to greater genetic gain for these traits by enabling accurate selection of young rams with high genetic merit. The aim of this study was to design and evaluate the accuracy of a genomic prediction method for female reproduction in sheep using daughter trait deviations (DTD) for sires and ewe phenotypes (when individual ewes were genotyped) for three reproduction traits: number of lambs born (NLB), litter size (LSIZE) and number of lambs weaned. Genomic best linear unbiased prediction (GBLUP), BayesR and pedigree BLUP analyses of the three reproduction traits measured on 5340 sheep (4503 ewes and 837 sires) with real and imputed genotypes for 510 174 SNPs were performed. The prediction of breeding values using both sire and ewe trait records was validated in Merino sheep. Prediction accuracy was evaluated by across sire family and random cross-validations. Accuracies of genomic estimated breeding values (GEBVs) were assessed as the mean Pearson correlation adjusted by the accuracy of the input phenotypes. The addition of sire DTD into the prediction analysis resulted in higher accuracies compared with using only ewe records in genomic predictions or pedigree BLUP. Using GBLUP, the average accuracy based on the combined records (ewes and sire DTD) was 0.43 across traits, but the accuracies varied by trait and type of cross-validations. The accuracies of GEBVs from random cross-validations (range 0.17-0.61) were higher than were those from sire family cross-validations (range 0.00-0.51). The GEBV accuracies of 0.41-0.54 for NLB and LSIZE based on the combined records were amongst the highest in the study. 
Although BayesR was not significantly different from GBLUP in prediction accuracy, it identified several candidate genes which are known to be associated with NLB and LSIZE. The approach provides a way to make use of all data available in genomic prediction for traits that have limited recording. © 2017 Stichting International Foundation for Animal Genetics.
Shahinfar, Saleh; Mehrabani-Yeganeh, Hassan; Lucas, Caro; Kalhor, Ahmad; Kazemian, Majid; Weigel, Kent A.
2012-01-01
Developments in machine learning and soft computing techniques have provided many opportunities for researchers to establish new analytical methods in different areas of science. The objective of this study was to investigate the potential of two types of intelligent learning methods, artificial neural networks (ANN) and neuro-fuzzy systems (NFS), for estimating breeding values (EBV) of Iranian dairy cattle. Initially, the breeding values of lactating Holstein cows for milk and fat yield were estimated using conventional best linear unbiased prediction (BLUP) with an animal model. Once that was established, a multilayer perceptron was used to build an ANN to predict breeding values from the performance data of selection candidates. Subsequently, fuzzy logic was used to form an NFS, a hybrid intelligent system that was implemented via a local linear model tree algorithm. For milk yield the correlations between EBV and EBV predicted by the ANN and NFS were 0.92 and 0.93, respectively. Corresponding correlations for fat yield were 0.93 and 0.93, respectively. Correlations between multitrait predictions of EBVs for milk and fat yield when predicted simultaneously by ANN were 0.93 and 0.93, respectively, whereas corresponding correlations with reference EBV for multitrait NFS were 0.94 and 0.95, respectively, for milk and fat production. PMID:22991575
Genetic Dissection of End-Use Quality Traits in Adapted Soft White Winter Wheat
Jernigan, Kendra L.; Godoy, Jayfred V.; Huang, Meng; Zhou, Yao; Morris, Craig F.; Garland-Campbell, Kimberly A.; Zhang, Zhiwu; Carter, Arron H.
2018-01-01
Soft white wheat is used in domestic and foreign markets for various end products requiring specific quality profiles. Phenotyping for end-use quality traits can be costly, time-consuming, and destructive in nature, so it is advantageous to use molecular markers to select experimental lines with superior traits. An association mapping panel of 469 soft white winter wheat cultivars and advanced generation breeding lines was developed from regional breeding programs in the U.S. Pacific Northwest. This panel was genotyped on a wheat-specific 90K iSelect single nucleotide polymorphism (SNP) chip. A total of 15,229 high-quality SNPs were selected and combined with best linear unbiased predictions (BLUPs) from historical phenotypic data of the genotypes in the panel. Genome-wide association mapping was conducted using the Fixed and random model Circulating Probability Unification (FarmCPU). A total of 105 significant marker-trait associations were detected across 19 chromosomes. Potentially new loci for total flour yield, lactic acid solvent retention capacity, flour sodium dodecyl sulfate sedimentation and flour swelling volume were also detected. Better understanding of the genetic factors impacting end-use quality enables breeders to more effectively discard poor quality germplasm and increase frequencies of favorable end-use quality alleles in their breeding populations. PMID:29593752
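FarmCPU itself iterates between fixed-effect and random-effect models to control for population structure; as a deliberately simplified illustration of the underlying marker-trait association idea (not the FarmCPU algorithm), a naive single-marker regression scan over BLUP-adjusted phenotypes might look like this (all names and data are hypothetical):

```python
import numpy as np
from scipy import stats

def single_marker_scan(genotypes, blups):
    """Naive single-marker association scan.

    genotypes : n x m matrix of 0/1/2 SNP allele-dosage codes
    blups     : n phenotypic BLUPs used as the response
    Returns per-marker p-values from simple linear regression.
    Illustration only: no structure or kinship correction is applied.
    """
    n, m = genotypes.shape
    pvals = np.empty(m)
    for j in range(m):
        result = stats.linregress(genotypes[:, j], blups)
        pvals[j] = result.pvalue
    return pvals

# Toy data with one simulated causal marker (index 7, an assumption).
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(200, 50)).astype(float)
y = 2.0 * G[:, 7] + rng.normal(size=200)
pvals = single_marker_scan(G, y)
```

A real analysis would additionally fit principal components or a kinship matrix as covariates; the scan above will flag spurious associations in structured populations.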
Diallel analysis for sex-linked and maternal effects.
Zhu, J; Weir, B S
1996-01-01
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
Brinker, T; Raymond, B; Bijma, P; Vereijken, A; Ellen, E D
2017-02-01
Mortality of laying hens due to cannibalism is a major problem in the egg-laying industry. Survival depends on two genetic effects: the direct genetic effect of the individual itself (DGE) and the indirect genetic effects of its group mates (IGE). For hens housed in sire-family groups, DGE and IGE cannot be estimated using pedigree information, but the combined effect of DGE and IGE is estimated in the total breeding value (TBV). Genomic information captures actual genetic relationships between individuals and might be a tool to improve TBV accuracy. We investigated whether genomic information of the sire increased TBV accuracy compared with pedigree information, and we estimated genetic parameters for survival time. A sire model with pedigree information (BLUP) and a sire model with genomic information (ssGBLUP) were used. We used survival time records of 7290 crossbred offspring with intact beaks from four crosses. Cross-validation was used to compare the models. Using ssGBLUP did not improve TBV accuracy compared with BLUP, which is probably due to the limited number of sires available per cross (~50). Genetic parameter estimates were similar for BLUP and ssGBLUP. For both BLUP and ssGBLUP, total heritable variance (T²), expressed as a proportion of phenotypic variance, ranged from 0.03 ± 0.04 to 0.25 ± 0.09. Further research is needed on breeding value estimation for socially affected traits measured on individuals kept in single-family groups. © 2016 The Authors. Journal of Animal Breeding and Genetics Published by Blackwell Verlag GmbH.
Technical note: Equivalent genomic models with a residual polygenic effect.
Liu, Z; Goddard, M E; Hayes, B J; Reinhardt, F; Reents, R
2016-03-01
Routine genomic evaluations in animal breeding are usually based on either a BLUP with genomic relationship matrix (GBLUP) or single nucleotide polymorphism (SNP) BLUP model. For a multi-step genomic evaluation, these 2 alternative genomic models were proven to give equivalent predictions for genomic reference animals. The model equivalence was verified also for young genotyped animals without phenotypes. Because SNP markers are in incomplete linkage disequilibrium with the genes or causal mutations responsible for the genetic inheritance of quantitative traits, they cannot explain all the genetic variance. A residual polygenic effect is normally fitted in the genomic model to account for the incomplete linkage disequilibrium. In this study, we first prove that the multi-step GBLUP and SNP BLUP models are equivalent for reference animals when a residual polygenic effect is included. Second, the equivalence of both multi-step genomic models with a residual polygenic effect was also verified for young genotyped animals without phenotypes. Additionally, we derived formulas to convert genomic estimated breeding values of the GBLUP model to its components, direct genomic values and residual polygenic effect. Third, we prove that the equivalence of these 2 genomic models with a residual polygenic effect also holds for single-step genomic evaluation. Both the single-step GBLUP and SNP BLUP models lead to equal prediction for genotyped animals with phenotypes (e.g., reference animals), as well as for (young) genotyped animals without phenotypes. Finally, these 2 single-step genomic models with a residual polygenic effect were also proven to be equivalent for the estimation of SNP effects. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
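The core GBLUP/SNP-BLUP equivalence (here without the residual polygenic term, for brevity) follows from the push-through identity (M'M + λI)⁻¹M' = M'(MM' + λI)⁻¹. A small numerical sketch with simulated data (all names and values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 200                        # animals, SNP markers
M = rng.integers(0, 3, size=(n, p)).astype(float)
M -= M.mean(axis=0)                   # centre the genotype codes
y = rng.normal(size=n)                # toy (pre-corrected) phenotypes
lam = 5.0                             # variance ratio sigma_e^2 / sigma_snp^2

# SNP-BLUP: ridge solution for marker effects, summed per animal
a_hat = np.linalg.solve(M.T @ M + lam * np.eye(p), M.T @ y)
gebv_snp = M @ a_hat

# GBLUP: direct prediction from the (unscaled) genomic relationship K = MM'
K = M @ M.T
gebv_g = K @ np.linalg.solve(K + lam * np.eye(n), y)

# The two parameterizations yield identical genomic breeding values.
assert np.allclose(gebv_snp, gebv_g)
```

Adding the residual polygenic effect, as the paper does, augments both models with a pedigree-based term but leaves this equivalence intact.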
ERIC Educational Resources Information Center
Brown, Steven D.; Tramayne, Selena; Hoxha, Denada; Telander, Kyle; Fan, Xiaoyan; Lent, Robert W.
2008-01-01
This study tested Social Cognitive Career Theory's (SCCT) academic performance model using a two-stage approach that combined meta-analytic and structural equation modeling methodologies. Unbiased correlations obtained from a previously published meta-analysis [Robbins, S. B., Lauver, K., Le, H., Davis, D., & Langley, R. (2004). Do psychosocial…
Athletic Departments' Operating Expenses as a Predictor of Their Directors' Cup Standing
ERIC Educational Resources Information Center
Magner, Amber
2014-01-01
The NACDA Directors' Cup is a competition utilizing an unbiased scoring system that encourages a broad based athletic department as the standard for defining intercollegiate athletic success. Therefore, for NCAA DI athletic administrators the Directors' Cup should be the standard for defining intercollegiate athletic success. The purpose of this…
USDA-ARS?s Scientific Manuscript database
Single-step Genomic Best Linear Unbiased Predictor (ssGBLUP) has become increasingly popular for whole-genome prediction (WGP) modeling as it utilizes any available pedigree and phenotypes on both genotyped and non-genotyped individuals. The WGP accuracy of ssGBLUP has been demonstrated to be greate...
Genetic Basis for Variation in Wheat Grain Yield in Response to Varying Nitrogen Application.
Mahjourimajd, Saba; Taylor, Julian; Sznajder, Beata; Timmins, Andy; Shahinnia, Fahimeh; Rengel, Zed; Khabaz-Saberi, Hossein; Kuchel, Haydn; Okamoto, Mamoru; Langridge, Peter
2016-01-01
Nitrogen (N) is a major nutrient needed to attain optimal grain yield (GY) in all environments. Nitrogen fertilisers represent a significant production cost, in both monetary and environmental terms. Developing genotypes capable of taking up N early during development while limiting biomass production after establishment and showing high N-use efficiency (NUE) would be economically beneficial. Genetic variation in NUE has been shown previously. Here we describe the genetic characterisation of NUE and identify genetic loci underlying N response under different N fertiliser regimes in a bread wheat population of doubled-haploid lines derived from a cross between two Australian genotypes (RAC875 × Kukri) bred for a similar production environment. NUE field trials were carried out at four sites in South Australia and two in Western Australia across three seasons. There was genotype-by-environment-by-treatment interaction across the sites and also good transgressive segregation for yield under different N supply in the population. We detected some significant Quantitative Trait Loci (QTL) associated with NUE and N response at different rates of N application across the sites and years. It was also possible to identify lines showing positive N response based on the rankings of their Best Linear Unbiased Predictions (BLUPs) within a trial. Dissecting the complexity of the N effect on yield through QTL analysis is a key step towards elucidating the molecular and physiological basis of NUE in wheat.
Moreira-Ascarrunz, Sergio Daniel; Larsson, Hans; Prieto-Linde, Maria Luisa; Johansson, Eva
2016-01-01
The aim of the present investigation was to assess the nutritional yield, nutrient density, stability, and adaptability of organically produced wheat for sustainable and nutritionally high-value food production. This study evaluated the nutritional yield of four minerals (Fe, Zn, Cu, and Mg) in 19 wheat genotypes, selected as being locally adapted under organic agriculture conditions. The new metric of nutritional yield was calculated for each genotype and they were evaluated for stability using the Additive Main effects and Multiplicative Interaction (AMMI) stability analysis and for genotypic value, stability, and adaptability using the Best Linear Unbiased Prediction (BLUP) procedure. The results indicated that there were genotypes suitable for production under organic agriculture conditions with satisfactory yields (>4000 kg·ha−1). Furthermore, these genotypes showed high nutritional yield and nutrient density for the four minerals studied. Additionally, since these genotypes were stable and adaptable over three environmentally different years, they were designated “balanced genotypes” for the four minerals and for the aforementioned characteristics. Selection and breeding of such “balanced genotypes” may offer an alternative to producing nutritious food under low-input agriculture conditions. Furthermore, the type of evaluation presented here may also be of interest for implementation in research conducted in developing countries, following the objectives of producing enough nutrients for a growing population. PMID:28231184
De Cremer, David
2004-03-01
The present research examined the combined effect of accuracy of procedures and leader's bias on fairness judgments and the experience of positive emotions. The results of two studies showed that the strongest positive effects on both types of reactions were found when procedures were accurate and the leader was unbiased. In addition, accuracy of procedures only revealed an impact when the leader was perceived as unbiased rather than biased. Moreover, this interactive effect was found to be mediated, at least partly, by perceptions of trustworthiness. These findings show that more research is needed on examining different types of procedural fairness, both as single and combined predictors of people's reactions.
Cow genotyping strategies for genomic selection in a small dairy cattle population.
Jenko, J; Wiggans, G R; Cooper, T A; Eaglen, S A E; Luff, W G de L; Bichard, M; Pong-Wong, R; Woolliams, J A
2017-01-01
This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds, few sires have progeny records, and genotyping cows can improve the accuracy of genomic EBV. The Guernsey breed is a small dairy cattle breed with approximately 14,000 recorded individuals worldwide. Predictions of phenotypes of milk yield, fat yield, protein yield, and calving interval were made for Guernsey cows from England and Guernsey Island using genomic EBV, with training sets including 197 de-regressed proofs of genotyped bulls, with cows selected from among 1,440 genotyped cows using different genotyping strategies. Accuracies of predictions were tested using 10-fold cross-validation among the cows. Genomic EBV were predicted using 4 different methods: (1) pedigree BLUP, (2) genomic BLUP using only bulls, (3) univariate genomic BLUP using bulls and cows, and (4) bivariate genomic BLUP. Genotyping cows with phenotypes and using their data for the prediction of single nucleotide polymorphism effects increased the correlation between genomic EBV and phenotypes compared with using only bulls by 0.163±0.022 for milk yield, 0.111±0.021 for fat yield, and 0.113±0.018 for protein yield; a decrease of 0.014±0.010 for calving interval from a low base was the only exception. Genetic correlation between phenotypes from bulls and cows were approximately 0.6 for all yield traits and significantly different from 1. Only a very small change occurred in correlation between genomic EBV and phenotypes when using the bivariate model. It was always better to genotype all the cows, but when only half of the cows were genotyped, a divergent selection strategy was better compared with the random or directional selection approach. Divergent selection of 30% of the cows remained superior for the yield traits in 8 of 10 folds. Copyright © 2017 American Dairy Science Association. 
Published by Elsevier Inc. All rights reserved.
Incorporation of causative quantitative trait nucleotides in single-step GBLUP.
Fragomeni, Breno O; Lourenco, Daniela A L; Masuda, Yutaka; Legarra, Andres; Misztal, Ignacy
2017-07-26
Much effort is put into identifying causative quantitative trait nucleotides (QTN) in animal breeding, empowered by the availability of dense single nucleotide polymorphism (SNP) information. Genomic selection using traditional SNP information is easily implemented for any number of genotyped individuals using single-step genomic best linear unbiased predictor (ssGBLUP) with the algorithm for proven and young (APY). Our aim was to investigate whether ssGBLUP is useful for genomic prediction when some or all QTN are known. Simulations included 180,000 animals across 11 generations. Phenotypes were available for all animals in generations 6 to 10. Genotypes for 60,000 SNPs across 10 chromosomes were available for 29,000 individuals. The genetic variance was fully accounted for by 100 or 1000 biallelic QTN. Raw genomic relationship matrices (GRM) were computed from (a) unweighted SNPs, (b) unweighted SNPs and causative QTN, (c) SNPs and causative QTN weighted with results obtained with genome-wide association studies, (d) unweighted SNPs and causative QTN with simulated weights, (e) only unweighted causative QTN, (f-h) as in (b-d) but using only the top 10% causative QTN, and (i) using only causative QTN with simulated weight. Predictions were computed by pedigree-based BLUP (PBLUP) and ssGBLUP. Raw GRM were blended with 1 or 5% of the numerator relationship matrix, or 1% of the identity matrix. Inverses of GRM were obtained directly or with APY. Accuracy of breeding values for 5000 genotyped animals in the last generation with PBLUP was 0.32, and for ssGBLUP it increased to 0.49 with an unweighted GRM, 0.53 after adding unweighted QTN, 0.63 when QTN weights were estimated, and 0.89 when QTN weights were based on true effects known from the simulation. When the GRM was constructed from causative QTN only, accuracy was 0.95 and 0.99 with blending at 5 and 1%, respectively. Accuracies simulating 1000 QTN were generally lower, with a similar trend. 
Accuracies using the APY inverse were equal or higher than those with a regular inverse. Single-step GBLUP can account for causative QTN via a weighted GRM. Accuracy gains are maximum when variances of causative QTN are known and blending is at 1%.
F. Mauro; Vicente Monleon; H. Temesgen
2015-01-01
Small area estimation (SAE) techniques have been successfully applied in forest inventories to provide reliable estimates for domains where the sample size is small (i.e. small areas). Previous studies have explored the use of either Area Level or Unit Level Empirical Best Linear Unbiased Predictors (EBLUPs) in a univariate framework, modeling each variable of interest...
Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.
2014-01-01
Parametric and nonparametric methods have been developed for purposes of predicting phenotypes. These methods are based on retrospective analyses of empirical data consisting of genotypic and phenotypic scores. Recent reports have indicated that parametric methods are unable to predict phenotypes of traits with known epistatic genetic architectures. Herein, we review parametric methods including least squares regression, ridge regression, Bayesian ridge regression, least absolute shrinkage and selection operator (LASSO), Bayesian LASSO, best linear unbiased prediction (BLUP), Bayes A, Bayes B, Bayes C, and Bayes Cπ. We also review nonparametric methods including the Nadaraya-Watson estimator, reproducing kernel Hilbert space, support vector machine regression, and neural networks. We assess the relative merits of these 14 methods in terms of accuracy and mean squared error (MSE) using simulated genetic architectures consisting of completely additive or two-way epistatic interactions in an F2 population derived from crosses of inbred lines. Each simulated genetic architecture explained either 30% or 70% of the phenotypic variability. The greatest impact on estimates of accuracy and MSE was due to genetic architecture. Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis. Parametric methods were slightly better than nonparametric methods for additive genetic architectures. Distinctions among parametric methods for additive genetic architectures were incremental. Heritability, i.e., the proportion of phenotypic variability explained by the genetic architecture, had the second greatest impact on estimates of accuracy and MSE. PMID:24727289
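The ridge/BLUP family of parametric methods reviewed above shrinks all marker effects toward zero. A minimal sketch of that idea on simulated additive marker data follows; the simulation settings, variable names, and penalty value are illustrative assumptions, not the study's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an F2-like marker matrix (coded -1/0/1) and an additive trait.
n, p = 200, 100
X = rng.choice([-1, 0, 1], size=(n, p), p=[0.25, 0.5, 0.25]).astype(float)
beta = np.zeros(p)
beta[:10] = rng.normal(0, 1, 10)             # 10 additive QTL
g = X @ beta                                  # true genetic values
y = g + rng.normal(0, np.sqrt(g.var()), n)   # noise sized for ~50% heritability

# Ridge-regression / RR-BLUP-style shrinkage estimator:
#   beta_hat = (X'X + lambda * I)^-1 X'y
lam = 10.0
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
y_hat = X @ beta_hat

accuracy = np.corrcoef(y_hat, g)[0, 1]  # correlation with true genetic value
mse = np.mean((y_hat - y) ** 2)
```

Because the penalty is uniform across markers, a model like this captures additive signal well but, as the review reports for parametric methods generally, has no mechanism for purely epistatic architectures.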
Comparison of methods for the implementation of genome-assisted evaluation of Spanish dairy cattle.
Jiménez-Montero, J A; González-Recio, O; Alenda, R
2013-01-01
The aim of this study was to evaluate methods for genomic evaluation of the Spanish Holstein population as an initial step toward the implementation of routine genomic evaluations. This study provides a description of the population structure of progeny tested bulls in Spain at the genomic level and compares different genomic evaluation methods with regard to accuracy and bias. Two Bayesian linear regression models, Bayes-A and Bayesian LASSO (B-LASSO), as well as a machine learning algorithm, Random Boosting (R-Boost), and BLUP using a realized genomic relationship matrix (G-BLUP), were compared. Five traits that are currently under selection in the Spanish Holstein population were used: milk yield, fat yield, protein yield, fat percentage, and udder depth. In total, genotypes from 1859 progeny tested bulls were used. The training sets were composed of bulls born before 2005, including 1601 bulls for production and 1574 bulls for type, whereas the testing sets contained 258 and 235 bulls born in 2005 or later for production and type, respectively. Deregressed proofs (DRP) from the January 2009 Interbull (Uppsala, Sweden) evaluation were used as the dependent variables for bulls in the training sets, whereas DRP from the December 2011 Interbull evaluation were used to compare genomic predictions with progeny test results for bulls in the testing set. Genomic predictions were more accurate than traditional pedigree indices for predicting future progeny test results of young bulls. The gain in accuracy due to inclusion of genomic data varied by trait and ranged from 0.04 to 0.42 Pearson correlation units. Results averaged across traits showed that B-LASSO had the highest accuracy, with an advantage of 0.01, 0.03 and 0.03 points in Pearson correlation compared with R-Boost, Bayes-A, and G-BLUP, respectively.
The B-LASSO predictions also showed the least bias (0.02, 0.03 and 0.10 SD units less than Bayes-A, R-Boost and G-BLUP, respectively) as measured by the mean difference between genomic predictions and progeny test results. The R-Boost algorithm provided genomic predictions with regression coefficients closer to unity, which is an alternative measure of bias, for 4 out of 5 traits, and also resulted in mean squared error estimates that were 2%, 10%, and 12% smaller than those of B-LASSO, Bayes-A, and G-BLUP, respectively. The observed prediction accuracy obtained with these methods was within the range of values expected for a population of similar size, suggesting that the prediction method and reference population described herein are appropriate for implementation of routine genome-assisted evaluations in Spanish dairy cattle. R-Boost is a competitive marker regression methodology in terms of predictive ability that can accommodate large data sets. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
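The three comparison criteria used above (Pearson correlation for accuracy; mean difference and regression slope for bias) can be collected in a small helper. The function name is an illustration, and regressing DRP on the predictions to judge inflation is an assumption based on the description in the text:

```python
import numpy as np

def evaluate_predictions(dgv, drp):
    """Compare genomic predictions (dgv) against later progeny-test
    deregressed proofs (drp) using three criteria: Pearson correlation
    (accuracy), mean difference (bias), and the regression coefficient
    of DRP on predictions (a slope near 1 means no inflation)."""
    dgv, drp = np.asarray(dgv, float), np.asarray(drp, float)
    accuracy = np.corrcoef(dgv, drp)[0, 1]          # Pearson correlation
    bias = np.mean(dgv - drp)                       # mean difference
    slope = np.cov(drp, dgv)[0, 1] / np.var(dgv, ddof=1)
    return accuracy, bias, slope
```

For example, predictions that are perfectly ranked but deflated by half would give accuracy 1 and a slope of 2, which is the kind of deviation from unity the abstract uses as its alternative bias measure.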
On the degrees of freedom of reduced-rank estimators in multivariate regression
Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.
2015-01-01
We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155
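For a plain rank-r reduced-rank regression fit, the "naive" degrees of freedom simply counts the free parameters of a rank-r p x q coefficient matrix, r(p + q − r); the paper's exact unbiased estimator corrects this count using the thresholded singular values. A hedged sketch of the naive count and of a standard SVD-truncation fit (not the paper's more general estimator class) follows:

```python
import numpy as np

def reduced_rank_fit(X, Y, r):
    """Rank-r reduced-rank regression fit: truncate the SVD of the
    least-squares fitted values (a standard construction; the estimator
    class in the paper is more general)."""
    B_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)      # p x q least-squares solution
    U, s, Vt = np.linalg.svd(X @ B_ls, full_matrices=False)
    fit_r = (U[:, :r] * s[:r]) @ Vt[:r, :]            # keep r leading singular values
    return fit_r, s

def naive_df(p, q, r):
    # Parameter count of a p x q matrix of rank r: the "naive" degrees
    # of freedom that the paper's unbiased estimator is compared against.
    return r * (p + q - r)
```

The naive count ignores the data-dependence of the selected rank, which is exactly the gap the Stein-based unbiased estimator in the paper closes.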
Rincent, R; Laloë, D; Nicolas, S; Altmann, T; Brunel, D; Revilla, P; Rodríguez, V M; Moreno-Gonzalez, J; Melchinger, A; Bauer, E; Schoen, C-C; Meyer, N; Giauffret, C; Bauland, C; Jamin, P; Laborde, J; Monod, H; Flament, P; Charcosset, A; Moreau, L
2012-10-01
Genomic selection refers to the use of genotypic information for predicting breeding values of selection candidates. A prediction formula is calibrated with the genotypes and phenotypes of reference individuals constituting the calibration set. The size and the composition of this set are essential parameters affecting the prediction reliabilities. The objective of this study was to maximize reliabilities by optimizing the calibration set. Different criteria based on the diversity or on the prediction error variance (PEV) derived from the realized additive relationship matrix-best linear unbiased predictions model (RA-BLUP) were used to select the reference individuals. For the latter, we considered the mean of the PEV of the contrasts between each selection candidate and the mean of the population (PEVmean) and the mean of the expected reliabilities of the same contrasts (CDmean). These criteria were tested with phenotypic data collected on two diversity panels of maize (Zea mays L.) genotyped with a 50k SNPs array. In the two panels, samples chosen based on CDmean gave higher reliabilities than random samples for various calibration set sizes. CDmean also appeared superior to PEVmean, which can be explained by the fact that it takes into account the reduction of variance due to the relatedness between individuals. Selected samples were close to optimality for a wide range of trait heritabilities, which suggests that the strategy presented here can efficiently sample subsets in panels of inbred lines. A script to optimize reference samples based on CDmean is available on request.
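A simplified sketch of how PEV and CD values could be computed from the mixed-model equations for a given calibration set follows. The published PEVmean/CDmean criteria use contrasts between each candidate and the population mean under the RA-BLUP model; the version below drops the fixed mean and those contrasts for brevity, so it is a sketch of the underlying quantities rather than the exact published criteria:

```python
import numpy as np

def pev_and_cd(A, idx_cal, h2=0.5):
    """Prediction error variance (PEV) and expected reliability (CD) of
    breeding values for all individuals, given a calibration set idx_cal
    phenotyped once each, under the model y = u + e with
    u ~ N(0, A * sigma_u^2) and relationship matrix A."""
    n = A.shape[0]
    sigma_u2, sigma_e2 = h2, 1.0 - h2
    lam = sigma_e2 / sigma_u2
    Z = np.zeros((len(idx_cal), n))
    Z[np.arange(len(idx_cal)), idx_cal] = 1.0
    # Random-effect block of the mixed-model equations: Z'Z + lambda * A^-1
    C = Z.T @ Z + lam * np.linalg.inv(A)
    pev = sigma_e2 * np.diag(np.linalg.inv(C))
    cd = 1.0 - pev / (sigma_u2 * np.diag(A))
    return pev, cd
```

Averaging `pev` (or `cd`) over the selection candidates for different candidate calibration sets gives PEVmean-like (or CDmean-like) scores to minimize (or maximize); CDmean's advantage noted above comes from the variance-reduction term that the reliability, unlike raw PEV, accounts for.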
Rodrigues, E V; Daher, R F; Dos Santos, A; Vivas, M; Machado, J C; Gravina, G do A; de Souza, Y P; Vidal, A K; Rocha, A Dos S; Freitas, R S
2017-05-18
Brazil has great potential to produce bioenergy since it is located in a tropical region that receives a high incidence of solar energy and presents favorable climatic conditions for such purpose. However, the use of bioenergy in the country is below its productivity potential. The aim of the current study was to select full-sib progenies and families of elephant grass (Pennisetum purpureum S.) to optimize phenotypes relevant to bioenergy production through mixed models (REML/BLUP). A circulant diallel cross of ten elephant grass genotypes was performed. A randomized block design with three replicates was used to assess both the hybrids and the parents. Each plot comprised 14-m rows, with 1.40 m spacing between rows and 1.40 m spacing between plants. The number of tillers, plant height, culm diameter, fresh biomass production, dry biomass rate, and dry biomass production were assessed. Genetic-statistical analyses were performed through mixed models (REML/BLUP). The genetic variance in the assessed families was explained by additive and dominance genetic effects; the dominance variance was prevalent. Families such as Capim Cana D'África x Guaçu/I.Z.2, Cameroon x Cuba-115, CPAC x Cuba-115, Cameroon x Guaçu/I.Z.2, and IAC-Campinas x CPAC showed the highest dry biomass production. The family derived from the cross between Cana D'África and Guaçu/I.Z.2 showed the largest number of potential individuals for traits such as plant height, culm diameter, fresh biomass production, dry biomass production, and dry biomass rate. Individual 5 of the family Cana D'África x Guaçu/I.Z.2, planted in blocks 1 and 2, showed the highest dry biomass production.
Gourdine, J L; Sørensen, A C; Rydhmer, L
2012-01-01
Selection progress must be carefully balanced against the conservation of genetic variation in small populations of local breeds. Well-defined breeding programs with specified selection traits are rare in local pig breeds. Given the small population size, the focus is often on the management of genetic diversity. However, in local breeds, optimum contribution selection (OCS) can be applied to control the rate of inbreeding and to avoid reduced performance in traits with high market value. The aim of this study was to assess the extent to which a breeding program aiming for improved product quality in a small local breed would be feasible. We used stochastic simulations to compare 25 scenarios. The scenarios differed in size of population, selection intensity of boars, type of selection (random selection, truncation selection based on BLUP breeding values, or optimum contribution selection based on BLUP breeding values), and heritability of the selection trait. It was assumed that the local breed is used in an extensive system for a high-meat-quality market. The simulations showed that in the smallest population (300 female reproducers), inbreeding increased by 0.8% when selection was performed at random. With optimum contribution selection, genetic progress can be achieved that is almost as great as that with truncation selection based on BLUP breeding values (0.2 to 0.5 vs. 0.3 to 0.5 genetic SD, P < 0.05), but at a considerably decreased rate of inbreeding (0.7 to 1.2 vs. 2.3 to 5.7%, P < 0.01). This confirmation of the potential utilization of OCS even in small populations is important in the context of sustainable management and the use of animal genetic resources.
Gao, Yong-Ming; Wan, Ping
2002-06-01
Screening markers efficiently is the foundation of mapping QTLs by composite interval mapping. Besides being used for background control of genetic variation, the main-effect and interaction markers identified can also be used to construct the intervals for a two-way search when mapping QTLs with epistasis, which saves considerable computation time. The efficiency of marker screening therefore affects the power and precision of QTL mapping. A doubled-haploid population with 200 individuals and 5 chromosomes was constructed, with 50 markers evenly distributed at 10-cM intervals. Of a total of 6 QTLs, one was placed on chromosome I, two linked on chromosome II, and the other three linked on chromosome IV. The simulated QTLs had additive effects and additive x additive epistatic effects; the corresponding QTL x environment interaction effects were included when data were simulated under multiple environments. The heritability was assumed to be 0.5 unless otherwise specified. The power of marker screening by stepwise regression, forward regression, and three methods of random-effect prediction, namely best linear unbiased prediction (BLUP), linear unbiased prediction (LUP), and adjusted unbiased prediction (AUP), was studied and compared through 100 Monte Carlo simulations. The results indicated that the marker screening power of stepwise regression at the 0.1, 0.05, and 0.01 significance levels ranged from 2% to 68%, and that of forward regression from 2% to 72%. The larger the QTL effects, the higher the marker screening power. In contrast, the power of marker screening by the three random-effect prediction methods was very low, with a maximum of only 13%. This suggests that regression methods are much better than random-effect prediction approaches at identifying efficient markers flanking QTLs, and that forward selection is simpler and more efficient.
The simulation study on heritability showed that increasing either the general heritability or the genotype x environment interaction heritability enhanced marker screening power; the former had a greater influence on QTLs with larger main and/or epistatic effects, while the latter had a greater influence on QTLs with small main and/or epistatic effects. Another 100 simulations were conducted to study the influence of marker number and density on marker screening power. Marker screening power decreased when there were too many markers, especially at high density, which suggests that a mapping population with a given number of individuals can support only a limited number of markers. According to the simulation study, the number of markers should not exceed the number of individuals. The simulation of marker screening under multiple environments showed high total power. To relieve the problem that marker screening power restricts the efficiency of QTL mapping, markers identified in multiple environments can be used to construct the intervals for the two-way search.
Methods to approximate reliabilities in single-step genomic evaluation
USDA-ARS?s Scientific Manuscript database
Reliability of predictions from single-step genomic BLUP (ssGBLUP) can be calculated by inversion, but that is not feasible for large data sets. Two methods of approximating reliability were developed based on decomposition of a function of reliability into contributions from records, pedigrees, and...
USDA-ARS?s Scientific Manuscript database
Colletotrichum gloeosporioides f. sp. salsolae (Penz.) Penz. & Sacc. in Penz. (CGS) is a facultative parasitic fungus being evaluated as a classical biological control agent of Russian thistle or tumbleweed (Salsola tragus L.). In initial host range determination tests, Henderson’s mixed model equat...
Variation and BLUPs in a novel source of orchardgrass germplasm with increased winter hardiness
USDA-ARS?s Scientific Manuscript database
The production potential of orchardgrass (Dactylis glomerata L.) is limited by winter injury at high latitudes and elevations. Evaluation of orchardgrass families at two Utah (US) locations identified significant genetic variation for two measures of tolerance to winter injury, but not for flowering...
Performance of genomic prediction within and across generations in maritime pine.
Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent
2016-08-11
Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.
How redesigning AD clinical trials might increase study partners’ willingness to participate
Karlawish, Jason; Cary, Mark S.; Rubright, Jonathan; TenHave, Tom
2008-01-01
Background: Timely recruiting and retaining participants into Alzheimer disease (AD) clinical trials is a challenge. We used conjoint analysis to identify how alterations in attributes of clinical trial design improve willingness to participate: risk, home visits, car service, or increased chance of receiving intervention. Method: A total of 108 study partners of patients with very mild to severe stage AD rated willingness to allow their relative to participate in eight clinical trials that varied combinations of the four attributes. Results: The highest utility was for home visits (0.89), which essentially compensated for the disutility of high risk (−0.85). The combination of home visits and car service was redundant, with almost no increase in utility over home visits alone. Seventeen percent were willing to participate in a trial with no amenities; the addition of home visits increased predicted willingness to participate to 27%; low risk, home visits, and higher chance of active treatment increased predicted willingness to 60%. The value of reducing the hassles of travel correlated well with measures of AD severity (activities of daily living r = 0.41, p < 0.001; basic activities of daily living r = 0.38, p < 0.001; Neuropsychiatric Inventory severity r = 0.24, p = 0.01; Neuropsychiatric Inventory distress r = 0.23, p < 0.02). No association was found between degree of study partner burden and willingness to tolerate risk of an intervention. Conclusion: Clinical trials that reduce travel inconvenience may offset the disincentive of study features such as the risk of intervention and may also increase willingness to participate. Redesigning trials may also help recruit patients with more severe Alzheimer disease. Shorter recruitment periods and increased retention rates may offset costs of these changes. GLOSSARY AD = Alzheimer disease; BLUP = best linear unbiased prediction; RAQ = Research Attitude Questionnaire. PMID:19047560
Ma, Langlang; Liu, Min; Yan, Yuanyuan; Qing, Chunyan; Zhang, Xiaoling; Zhang, Yanling; Long, Yun; Wang, Lei; Pan, Lang; Zou, Chaoying; Li, Zhaoling; Wang, Yanli; Peng, Huanwei; Pan, Guangtang; Jiang, Zhou; Shen, Yaou
2018-01-01
The regenerative capacity of the embryonic callus, a complex quantitative trait, is one of the main limiting factors for maize transformation. This trait was decomposed into five traits, namely, green callus rate (GCR), callus differentiating rate (CDR), callus plantlet number (CPN), callus rooting rate (CRR), and callus browning rate (CBR). To dissect the genetic foundation of maize transformation, in this study multi-locus genome-wide association studies (GWAS) for the five traits were performed in a population of 144 inbred lines genotyped with 43,427 SNPs. Using the phenotypic values in three environments and best linear unbiased prediction (BLUP) values, a total of 127, 56, 160, and 130 significant quantitative trait nucleotides (QTNs) were identified by mrMLM, FASTmrEMMA, ISIS EM-BLASSO, and pLARmEB, respectively. Of these QTNs, 63 were commonly detected, including 15 across multiple environments and 58 across multiple methods. Allele distribution analysis showed that the proportion of superior alleles for 36 QTNs was <50% in 31 elite inbred lines. Meanwhile, these superior alleles had a clearly additive effect on the regenerative capacity. This indicates that the regenerative capacity-related traits can be improved by proper integration of the superior alleles using marker-assisted selection. Moreover, a total of 40 candidate genes were found based on these common QTNs. Some annotated genes were previously reported to relate to auxin transport, cell fate, seed germination, or embryo development; in particular, GRMZM2G108933 (WOX2) was found to promote maize transgenic embryonic callus regeneration. These identified candidate genes will contribute to a further understanding of the genetic foundation of maize embryonic callus regeneration. PMID:29755499
Variation in cassava germplasm for tolerance to post-harvest physiological deterioration.
Venturini, M T; Santos, L R; Vildoso, C I A; Santos, V S; Oliveira, E J
2016-05-06
Tolerant varieties can effectively control post-harvest physiological deterioration (PPD) of cassava, although knowledge on the genetic variability and inheritance of this trait is needed. The objective of this study was to estimate genetic parameters and identify sources of tolerance to PPD and their stability in cassava accessions. Roots from 418 cassava accessions, grown in four independent experiments, were evaluated for PPD tolerance 0, 2, 5, and 10 days post-harvest. Data were transformed into the area under the PPD-progress curve (AUP-PPD) to quantify tolerance. Genetic parameters, stability (Si), adaptability (Ai), and the joint analysis of stability and adaptability (Zi) were obtained via residual maximum likelihood (REML) and best linear unbiased prediction (BLUP) methods. Variance in the genotype (G) x environment (E) interaction and genotypic variance were important for PPD tolerance. Individual broad-sense heritability (h²g = 0.38 ± 0.04) and average heritability in accessions (h²mg = 0.52) showed high genetic control of PPD tolerance. Genotypic correlation of AUP-PPD in different experiments was of medium magnitude (ȓgA = 0.42), indicating significant G x E interaction. The predicted genotypic values free of G x E interaction (û + ĝi) showed high variation. Of the 30 accessions with high Zi, 19 were common to the û + ĝi, Si, and Ai parameters. The genetic gain with selection of these 19 cassava accessions was -55.94, -466.86, -397.72, and -444.03% for û + ĝi, Si, Ai, and Zi, respectively, compared with the overall mean for each parameter. These results demonstrate the variability and potential of cassava germplasm to introduce PPD tolerance in commercial varieties.
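The transformation of repeated PPD scores into an area under the progress curve can be sketched with the trapezoidal rule over the four evaluation days; the exact scaling used in the study may differ, so this is an illustrative assumption:

```python
def aupppd(scores, days=(0, 2, 5, 10)):
    """Area under the PPD-progress curve by the trapezoidal rule.
    `scores` are deterioration scores at each evaluation day; a lower
    area indicates a more tolerant accession."""
    area = 0.0
    for (d0, s0), (d1, s1) in zip(zip(days, scores), zip(days[1:], scores[1:])):
        area += (s0 + s1) / 2.0 * (d1 - d0)
    return area
```

Collapsing the four time points into a single area gives one tolerance value per plot, which is what makes the trait amenable to the REML/BLUP analysis described above.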
Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I
2017-01-01
This paper evaluates an efficient implementation to multiply the inverse of the numerator relationship matrix for genotyped animals (A22) by a vector. The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG) method. The inverse of A22 can be decomposed into sparse matrices that are blocks of the sparse inverse of the full numerator relationship matrix (A) including genotyped animals and their ancestors. The elements of the inverse of A were rapidly calculated with Henderson's rule and stored as sparse matrices in memory. The multiplication was implemented as a series of sparse matrix-vector multiplications. Diagonal elements of the inverse of A22, which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation was compared with explicit inversion of A22 for 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 s, 3 min, and 5 min, respectively, for setting up. Less than 1 s was required for the multiplication in each PCG iteration for any of the data sets. When the equations in ssGBLUP are solved with the PCG algorithm, inversion of A22 is no longer a limiting factor in the computations.
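Henderson's rule, mentioned above, builds the inverse of a numerator relationship matrix directly from the pedigree without ever forming the matrix itself. A minimal dense sketch ignoring inbreeding follows (the paper's implementation additionally uses sparse storage and the block decomposition; function and variable names are illustrative):

```python
import numpy as np

def a_inverse(pedigree):
    """Henderson's rules for the inverse of the numerator relationship
    matrix, ignoring inbreeding. `pedigree` is a list of (sire, dam)
    tuples indexed by animal id; unknown parents are None. Parents must
    appear before their offspring."""
    n = len(pedigree)
    Ainv = np.zeros((n, n))
    for i, (s, d) in enumerate(pedigree):
        parents = [p for p in (s, d) if p is not None]
        # alpha is 1, 4/3, or 2 for 0, 1, or 2 known parents
        alpha = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}[len(parents)]
        Ainv[i, i] += alpha
        for p in parents:
            Ainv[i, p] += -alpha / 2.0
            Ainv[p, i] += -alpha / 2.0
            for q in parents:
                Ainv[p, q] += alpha / 4.0
    return Ainv
```

Each animal contributes a handful of entries, so the inverse is sparse by construction, which is exactly the property the paper's sparse matrix-vector products exploit.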
Genomic predictability of single-step GBLUP for production traits in US Holstein
USDA-ARS?s Scientific Manuscript database
The objective of this study was to validate genomic predictability of single-step genomic BLUP for 305-day protein yield for US Holsteins. The genomic relationship matrix was created with the Algorithm of Proven and Young (APY) with 18,359 core animals. The full data set consisted of phenotypes coll...
Li, X; Lund, M S; Zhang, Q; Costa, C N; Ducrocq, V; Su, G
2016-06-01
The present study investigated the improvement of prediction reliabilities for 3 production traits in Brazilian Holsteins that had no genotype information by adding information from Nordic and French Holstein bulls that had genotypes. The estimated across-country genetic correlations (ranging from 0.604 to 0.726) indicated that an important genotype by environment interaction exists between Brazilian and Nordic (or Nordic and French) populations. Prediction reliabilities for Brazilian genotyped bulls were greatly increased by including data of Nordic and French bulls, and a 2-trait single-step genomic BLUP performed much better than the corresponding pedigree-based BLUP. However, only a minor improvement in prediction reliabilities was observed in nongenotyped Brazilian cows. The results indicate that although there is a large genotype by environment interaction, inclusion of a foreign reference population can improve accuracy of genetic evaluation for the Brazilian Holstein population. However, a Brazilian reference population is necessary to obtain a more accurate genomic evaluation. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Spindel, J E; Begum, H; Akdemir, D; Collard, B; Redoña, E; Jannink, J-L; McCouch, S
2016-01-01
To address the multiple challenges to food security posed by global climate change, population growth and rising incomes, plant breeders are developing new crop varieties that can enhance both agricultural productivity and environmental sustainability. Current breeding practices, however, are unable to keep pace with demand. Genomic selection (GS) is a new technique that helps accelerate the rate of genetic gain in breeding by using whole-genome data to predict the breeding value of offspring. Here, we describe a new GS model that combines RR-BLUP with markers fit as fixed effects selected from the results of a genome-wide-association study (GWAS) on the RR-BLUP training data. We term this model GS + de novo GWAS. In a breeding population of tropical rice, GS + de novo GWAS outperformed six other models for a variety of traits and in multiple environments. On the basis of these results, we propose an extended, two-part breeding design that can be used to efficiently integrate novel variation into elite breeding populations, thus expanding genetic diversity and enhancing the potential for sustainable productivity gains. PMID:26860200
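The core idea of the GS + de novo GWAS model above, namely fitting GWAS-selected markers as fixed (unshrunk) effects alongside ridge-shrunk remaining markers, can be sketched as a single penalized solve. The function name and penalty handling are simplifying assumptions, and the GWAS selection step itself (run on the RR-BLUP training data) is omitted:

```python
import numpy as np

def gs_fixed_plus_rrblup(X, y, fixed_idx, lam=1.0):
    """Sketch of an RR-BLUP model with selected markers fit as fixed
    effects: markers in fixed_idx receive no shrinkage, while all
    remaining markers are shrunk ridge-style with penalty lam."""
    p = X.shape[1]
    pen = np.full(p, lam)
    pen[list(fixed_idx)] = 0.0   # no penalty on the fixed-effect markers
    beta = np.linalg.solve(X.T @ X + np.diag(pen), X.T @ y)
    return beta
```

Leaving the selected markers unpenalized lets large-effect loci found by the GWAS express their full effect, while the ridge penalty still pools the polygenic background, which matches the intuition for why the combined model outperformed plain RR-BLUP.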
A Concept-Wide Association Study of Clinical Notes to Discover New Predictors of Kidney Failure.
Singh, Karandeep; Betensky, Rebecca A; Wright, Adam; Curhan, Gary C; Bates, David W; Waikar, Sushrut S
2016-12-07
Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. We used natural language processing tools to extract concepts from the preceding year's clinical notes among patients newly referred to a tertiary care center's outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value <0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q<0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q<0.001). Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record. Copyright © 2016 by the American Society of Nephrology.
USDA-ARS?s Scientific Manuscript database
Host range tests were conducted with Colletotrichum gloeosporioides f. sp. salsolae (CGS) in quarantine to determine whether the fungus is safe to release in N. America for biological control of tumbleweed (Salsola tragus L., Chenopodiaceae). Ninety-two accessions were analyzed from 19 families and...
Accuracy and training population design for genomic selection in elite north american oats
USDA-ARS?s Scientific Manuscript database
Genomic selection (GS) is a method to estimate the breeding values of individuals by using markers throughout the genome. We evaluated the accuracies of GS using data from five traits on 446 oat lines genotyped with 1005 Diversity Array Technology (DArT) markers and two GS methods (RR-BLUP and Bayes...
Selection of core animals in the Algorithm for Proven and Young using a simulation model.
Bradford, H L; Pocrnić, I; Fragomeni, B O; Lourenco, D A L; Misztal, I
2017-12-01
The Algorithm for Proven and Young (APY) enables the implementation of single-step genomic BLUP (ssGBLUP) in large, genotyped populations by separating genotyped animals into core and non-core subsets and creating a computationally efficient inverse for the genomic relationship matrix (G). As APY has become the method of choice for large-scale genomic evaluations in BLUP-based methods, a common question is how to choose the animals in the core subset. We compared several core definitions to answer this question. Simulations comprised a moderately heritable trait for 95,010 animals across five generations, of which 50,000 were genotyped. Genotypes consisted of 25,500 SNP distributed across 15 chromosomes. Genotyping errors and missing pedigree were also mimicked. Core animals were defined based on individual generations, equal representation across generations, and at random. For a sufficiently large core size, the core definitions had the same accuracies and biases, even if the core animals had imperfect genotypes. When genotyped animals had unknown parents, accuracy and bias were significantly better (p ≤ .05) for the random and across-generation core definitions. © 2017 The Authors. Journal of Animal Breeding and Genetics Published by Blackwell Verlag GmbH.
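The APY inverse itself can be sketched as follows: non-core breeding values are regressed on the core animals, and the conditional variance block M_nn is approximated by its diagonal, which is what makes the inverse sparse and cheap for large non-core sets. With only one non-core animal the diagonal approximation is exact. Names below are illustrative; this is a dense sketch, not a production implementation:

```python
import numpy as np

def apy_inverse(G, core):
    """APY inverse of a genomic relationship matrix G for given core
    indices. P holds the regression coefficients of non-core on core
    animals; M_nn (conditional variance of non-core given core) is
    approximated by its diagonal."""
    n = G.shape[0]
    noncore = [i for i in range(n) if i not in core]
    Gcc = G[np.ix_(core, core)]
    Gnc = G[np.ix_(noncore, core)]
    Gcc_inv = np.linalg.inv(Gcc)
    P = Gnc @ Gcc_inv                                        # regression coefficients
    m = np.diag(G)[noncore] - np.einsum('ij,ij->i', P, Gnc)  # diag of M_nn
    Minv = np.diag(1.0 / m)
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + P.T @ Minv @ P
    Ginv[np.ix_(core, noncore)] = -P.T @ Minv
    Ginv[np.ix_(noncore, core)] = -Minv @ P
    Ginv[np.ix_(noncore, noncore)] = Minv
    return Ginv
```

Only the core block is ever inverted, so the cost grows with the core size rather than the full number of genotyped animals, which is why the choice of core animals studied above matters in the first place.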
Kwong, Qi Bin; Ong, Ai Ling; Teh, Chee Keng; Chew, Fook Tim; Tammi, Martti; Mayes, Sean; Kulaveerasingam, Harikrishna; Yeoh, Suat Hui; Harikrishna, Jennifer Ann; Appleton, David Ross
2017-06-06
Genomic selection (GS) uses genome-wide markers to select individuals with the desired overall combination of breeding traits. A total of 1,218 individuals from a commercial population of Ulu Remis x AVROS (UR x AVROS) were genotyped using the OP200K array. The traits of interest included: shell-to-fruit ratio (S/F, %), mesocarp-to-fruit ratio (M/F, %), kernel-to-fruit ratio (K/F, %), fruit per bunch (F/B, %), oil per bunch (O/B, %) and oil per palm (O/P, kg/palm/year). Genomic heritabilities of these traits were estimated to be in the range of 0.40 to 0.80. The GS methods assessed were RR-BLUP, Bayes A (BA), Bayes Cπ (BC), Bayesian Lasso (BL) and Bayesian Ridge Regression (BRR). All methods resulted in almost equal prediction accuracy. The accuracy achieved ranged from 0.40 to 0.70, correlating with the heritability of the traits. By selecting the most important markers, RR-BLUP B has the potential to outperform the other methods. The marker density for certain traits can be further reduced based on linkage disequilibrium (LD). Together with in silico breeding, GS is now being used in oil palm breeding programs to hasten parental palm selection.
A Concept–Wide Association Study of Clinical Notes to Discover New Predictors of Kidney Failure
Betensky, Rebecca A.; Wright, Adam; Curhan, Gary C.; Bates, David W.; Waikar, Sushrut S.
2016-01-01
Background and objectives Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, & measurements We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value <0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high–dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q<0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q<0.001). Conclusions Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record. PMID:27927892
Multiscale measurement error models for aggregated small area health data.
Aregay, Mehreteab; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Carroll, Rachel; Watjou, Kevin
2016-08-01
Spatial data are often aggregated from a finer (smaller) to a coarser (larger) geographical level. The process of data aggregation induces a scaling effect which smoothes the variation in the data. To address the scaling problem, multiscale models that link the convolution models at different scale levels via a shared random effect have been proposed. One of the main goals in analysing aggregated health data is to investigate the relationship between predictors and an outcome at different geographical levels. In this paper, we extend multiscale models to examine whether a predictor effect at a finer level holds true at a coarser level. To adjust for predictor uncertainty due to aggregation, we applied measurement error models within the multiscale framework. To assess the benefit of using multiscale measurement error models, we compared the performance of multiscale models with and without measurement error in both real and simulated data. We found that ignoring the measurement error in multiscale models underestimates the regression coefficient, while it overestimates the variance of the spatially structured random effect. On the other hand, accounting for the measurement error in multiscale models provides a better model fit and unbiased parameter estimates. © The Author(s) 2016.
USDA-ARS?s Scientific Manuscript database
The objective of this study was to provide initial results in an application of single-step genomic BLUP with a genomic relationship matrix (G^-1APY) calculated using the Algorithm of Proven and Young (APY) to 305-day protein yield for US Holsteins. Two G^-1APY were tested; one was from 139,057 geno...
Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S
2015-06-01
This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single-marker analysis based on whole-genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle, plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed, and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (the same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included.
Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome sequence data alongside the 54k SNP set. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Modeling longitudinal data, I: principles of multivariate analysis.
Ravani, Pietro; Barrett, Brendan; Parfrey, Patrick
2009-01-01
Statistical models are used to study the relationship between exposure and disease while accounting for the potential role of other factors' impact on outcomes. This adjustment is useful to obtain unbiased estimates of true effects or to predict future outcomes. Statistical models include a systematic component and an error component. The systematic component explains the variability of the response variable as a function of the predictors and is summarized in the effect estimates (model coefficients). The error element of the model represents the variability in the data unexplained by the model and is used to build measures of precision around the point estimates (confidence intervals).
Genomic Prediction of Testcross Performance in Canola (Brassica napus)
Jan, Habib U.; Abbadi, Amine; Lücke, Sophie; Nichols, Richard A.; Snowdon, Rod J.
2016-01-01
Genomic selection (GS) is a modern breeding approach where genome-wide single-nucleotide polymorphism (SNP) marker profiles are simultaneously used to estimate performance of untested genotypes. In this study, the potential of genomic selection methods to predict testcross performance for hybrid canola breeding was assessed for various agronomic traits based on genome-wide marker profiles. A total of 475 genetically diverse spring-type canola pollinator lines were genotyped at 24,403 single-copy, genome-wide SNP loci. In parallel, the 950 F1 testcross combinations between the pollinators and two representative testers were evaluated for a number of important agronomic traits including seedling emergence, days to flowering, lodging, oil yield and seed yield along with essential seed quality characters including seed oil content and seed glucosinolate content. A ridge-regression best linear unbiased prediction (RR-BLUP) model was applied in combination with 500 cross-validations for each trait to predict testcross performance, both across the whole population as well as within individual subpopulations or clusters, based solely on SNP profiles. Subpopulations were determined using multidimensional scaling and K-means clustering. Genomic prediction accuracy across the whole population was highest for seed oil content (0.81) followed by oil yield (0.75) and lowest for seedling emergence (0.29). For seed yield, seed glucosinolate, lodging resistance and days to onset of flowering (DTF), prediction accuracies were 0.45, 0.61, 0.39 and 0.56, respectively. Prediction accuracies could be increased for some traits by treating subpopulations separately; a strategy which only led to moderate improvements for some traits with low heritability, like seedling emergence. No useful or consistent increase in accuracy was obtained by inclusion of a population substructure covariate in the model. 
Testcross performance prediction using genome-wide SNP markers shows considerable potential for pre-selection of promising hybrid combinations prior to resource-intensive field testing over multiple locations and years. PMID:26824924
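The RR-BLUP model used in studies like the two above reduces to a single ridge system for the marker effects: û = (Z′Z + λI)⁻¹Z′y, with genomic values then predicted as Zû. A minimal, self-contained sketch on simulated data (NumPy; the shrinkage ratio λ is taken as known here, and all names are illustrative):

```python
import numpy as np

def rr_blup(Z, y, lam):
    """Ridge-regression BLUP of marker effects.

    Z: centred marker matrix (n x m), y: centred phenotypes,
    lam: shrinkage ratio sigma2_e / sigma2_marker (assumed known here).
    """
    m = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)

# Tiny illustration: simulate phenotypes from true marker effects, then
# check that predicted breeding values correlate with the true ones.
rng = np.random.default_rng(1)
n, m = 200, 50
Z = rng.integers(0, 3, size=(n, m)).astype(float)   # 0/1/2 genotype codes
Z -= Z.mean(axis=0)                                  # centre each marker
u_true = rng.normal(0.0, 1.0, m)
g = Z @ u_true                                       # true breeding values
y = g + rng.normal(0.0, g.std(), n)                  # heritability ~ 0.5
u_hat = rr_blup(Z, y - y.mean(), lam=10.0)
accuracy = np.corrcoef(Z @ u_hat, g)[0, 1]
```

In real programs the accuracy is estimated by cross-validation, as in the canola study's 500 cross-validation runs, rather than against known true values.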
Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter
2017-08-10
A better understanding of the genetic architecture underlying complex traits (e.g., the distribution of causal variants and their effects) may aid genomic prediction. Here, we hypothesized that the genomic variants of complex traits might be enriched in a subset of genomic regions defined by genes grouped on the basis of "Gene Ontology" (GO), and that incorporating this independent biological information into genomic prediction models might improve their predictive ability. Four complex traits (i.e., milk, fat and protein yields, and mastitis) together with imputed sequence variants in Holstein (HOL) and Jersey (JER) cattle were analysed. We first carried out a post-GWAS analysis in a HOL training population to assess the degree of enrichment of the association signals in the gene regions defined by each GO term. We then extended the genomic best linear unbiased prediction (GBLUP) model to a genomic feature BLUP (GFBLUP) model, including an additional genomic effect quantifying the joint effect of a group of variants located in a genomic feature. The GBLUP model, using a single random effect, assumes that all genomic variants contribute equally to the genomic relationship, whereas GFBLUP attributes different weights to the individual genomic relationships in the prediction equation based on the estimated genomic parameters. Our results demonstrate that the immune-relevant GO terms were more strongly associated with mastitis than with milk production, and several biologically meaningful GO terms improved the prediction accuracy with GFBLUP for the four traits, as compared with GBLUP. The improvement of the genomic prediction between breeds (an average increase of 0.161 across the four traits) was more apparent than that within HOL (an average increase of 0.020 across the four traits). 
Our genomic feature modelling approaches provide a framework to simultaneously explore the genetic architecture and genomic prediction of complex traits by taking advantage of independent biological knowledge.
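The GBLUP-to-GFBLUP extension described above amounts to adding a second random genomic effect whose relationship matrix is built only from the feature markers. A toy sketch of that idea, assuming the variance components are known rather than REML-estimated as in the study (all names and the simple mean-centring are illustrative):

```python
import numpy as np

def grm(Z):
    """VanRaden-style genomic relationship matrix from a marker matrix."""
    Zc = Z - Z.mean(axis=0)
    return (Zc @ Zc.T) / Zc.shape[1]

def gfblup_predict(y_train, G_all, G_feat, train, test, v_g, v_f, v_e):
    """BLUP of total genomic values for `test` individuals under a
    GFBLUP-style model with two random genomic effects: one for all
    markers (G_all) and one for a genomic feature subset (G_feat).
    """
    n = G_all.shape[0]
    V = v_g * G_all + v_f * G_feat + v_e * np.eye(n)   # phenotypic covariance
    Vtt = V[np.ix_(train, train)]
    # Covariance between test genomic values and training phenotypes
    C = (v_g * G_all + v_f * G_feat)[np.ix_(test, train)]
    return C @ np.linalg.solve(Vtt, y_train - y_train.mean())
```

Setting v_f to zero recovers ordinary GBLUP; a well-chosen feature with a large v_f is what drives the accuracy gains reported above.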
Mutually unbiased product bases for multiple qudits
DOE Office of Scientific and Technical Information (OSTI.GOV)
McNulty, Daniel; Pammer, Bogdan; Weigert, Stefan
We investigate the interplay between mutual unbiasedness and product bases for multiple qudits of possibly different dimensions. A product state of such a system is shown to be mutually unbiased to a product basis only if each of its factors is mutually unbiased to all the states which occur in the corresponding factors of the product basis. This result implies both a tight limit on the number of mutually unbiased product bases which the system can support and a complete classification of mutually unbiased product bases for multiple qubits or qutrits. In addition, only maximally entangled states can be mutually unbiased to a maximal set of mutually unbiased product bases.
Psychosocial predictors of energy underreporting in a large doubly labeled water study.
Tooze, Janet A; Subar, Amy F; Thompson, Frances E; Troiano, Richard; Schatzkin, Arthur; Kipnis, Victor
2004-05-01
Underreporting of energy intake is associated with self-reported diet measures and appears to be selective according to personal characteristics. Doubly labeled water is an unbiased reference biomarker for energy intake that may be used to assess underreporting. Our objective was to determine which factors are associated with underreporting of energy intake on food-frequency questionnaires (FFQs) and 24-h dietary recalls (24HRs). The study participants were 484 men and women aged 40-69 y who resided in Montgomery County, MD. Using the doubly labeled water method to measure total energy expenditure, we considered numerous psychosocial, lifestyle, and sociodemographic factors in multiple logistic regression models for prediction of the probability of underreporting on the FFQ and 24HR. In the FFQ models, fear of negative evaluation, weight-loss history, and percentage of energy from fat were the best predictors of underreporting in women (R(2) = 0.09); body mass index, comparison of activity level with that of others of the same sex and age, and eating frequency were the best predictors in men (R(2) = 0.10). In the 24HR models, social desirability, fear of negative evaluation, body mass index, percentage of energy from fat, usual activity, and variability in number of meals per day were the best predictors of underreporting in women (R(2) = 0.22); social desirability, dietary restraint, body mass index, eating frequency, dieting history, and education were the best predictors in men (R(2) = 0.25). Although the final models were significantly related to underreporting on both the FFQ and the 24HR, the amount of variation explained by these models was relatively low, especially for the FFQ.
Cuyabano, B C D; Su, G; Rosa, G J M; Lund, M S; Gianola, D
2015-10-01
This study compared the accuracy of genome-enabled prediction models using individual single nucleotide polymorphisms (SNP) or haplotype blocks as covariates when using either a single breed or a combined population of Nordic Red cattle. The main objective was to compare predictions of breeding values of complex traits using a combined training population with haplotype blocks, with predictions using a single breed as training population and individual SNP as predictors. To compare the prediction reliabilities, bootstrap samples were taken from the test data set. With the bootstrapped samples of prediction reliabilities, we built and graphed confidence ellipses to allow comparisons. Finally, measures of statistical distances were used to calculate the gain in predictive ability. Our analyses are innovative in the context of assessment of predictive models, allowing a better understanding of prediction reliabilities and providing a statistical basis to effectively calibrate whether one prediction scenario is indeed more accurate than another. An ANOVA indicated that use of haplotype blocks produced significant gains mainly when Bayesian mixture models were used but not when Bayesian BLUP was fitted to the data. Furthermore, when haplotype blocks were used to train prediction models in a combined Nordic Red cattle population, we obtained up to a statistically significant 5.5% average gain in prediction accuracy, over predictions using individual SNP and training the model with a single breed. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Gao, Lei; Xi, Qian Qian; Wu, Jun; Han, Yu; Dai, Wei; Su, Yuan Yuan; Zhang, Xin
2015-09-01
To investigate the association between autism and prenatal environmental risk factors, a case-control study was conducted among 193 children with autism from special educational schools and 733 typically developing controls matched by age and gender, using a questionnaire, in Tianjin from 2007 to 2012. Statistical analysis included the Quick, Unbiased, Efficient Statistical Tree (QUEST) and logistic regression in SPSS 20.0. QUEST and the logistic regression analysis identified four predictors: maternal air conditioner use during pregnancy (OR=0.316, 95% CI: 0.215-0.463) was the single first-level node (χ²=50.994, P=0.000); newborn complications (OR=4.277, 95% CI: 2.314-7.908) and paternal consumption of freshwater fish (OR=0.383, 95% CI: 0.256-0.573) were second-level predictors (χ²=45.248, P=0.000; χ²=24.212, P=0.000); and maternal depression (OR=4.822, 95% CI: 3.047-7.631) was the single third-level predictor (χ²=23.835, P=0.000). The prediction accuracy of the tree was 89.2%. Air conditioner use during pregnancy and a paternal freshwater fish diet might be beneficial for the prevention of autism, while newborn complications and maternal depression might be risk factors. Copyright © 2015 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
Mutually unbiased projectors and duality between lines and bases in finite quantum systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shalaby, M.; Vourdas, A., E-mail: a.vourdas@bradford.ac.uk
2013-10-15
Quantum systems with variables in the ring Z(d) are considered, and the concepts of weak mutually unbiased bases and mutually unbiased projectors are discussed. The lines through the origin in the Z(d)×Z(d) phase space are classified into maximal lines (sets of d points) and sublines (sets of d_i points where d_i|d). The sublines are intersections of maximal lines. It is shown that there exists a duality between the properties of lines (resp., sublines) and the properties of weak mutually unbiased bases (resp., mutually unbiased projectors). Highlights: •Lines in discrete phase space. •Bases in finite quantum systems. •Duality between bases and lines. •Weak mutually unbiased bases.
Wakasaki, Rumie; Eiwaz, Mahaba; McClellan, Nicholas; Matsushita, Katsuyuki; Golgotiu, Kirsti; Hutchens, Michael P
2018-06-14
A technical challenge in translational models of kidney injury is determining the extent of cell death. Histologic sections are commonly analyzed by area morphometry or unbiased stereology, but stereology requires specialized equipment. An unbiased stereology tool with reduced equipment dependence would therefore address a barrier to rigorous quantification. We hypothesized that it would be feasible to build a novel software component to facilitate unbiased stereologic quantification on scanned slides, and that unbiased stereology would demonstrate greater precision and decreased bias compared with 2D morphometry. We developed a macro for the widely used image analysis program ImageJ and performed cardiac arrest with cardiopulmonary resuscitation (CA/CPR, a model of acute cardiorenal syndrome) in mice. Fluorojade-B-stained kidney sections were analyzed using three methods to quantify cell death: gold-standard stereology using a controlled stage and commercially available software, unbiased stereology using the novel ImageJ macro, and quantitative 2D morphometry, also using the novel macro. There was strong agreement between the two methods of unbiased stereology (bias -0.004±0.006 with 95% limits of agreement -0.015 to 0.007). 2D morphometry demonstrated poor agreement and significant bias compared with either method of unbiased stereology. Unbiased stereology is facilitated by a novel macro for ImageJ, and its results agree with those obtained using gold-standard methods. Automated 2D morphometry overestimated tubular epithelial cell death and correlated only modestly with values obtained from unbiased stereology. These results support widespread use of unbiased stereology for the analysis of histologic outcomes in injury models.
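The unbiasedness of stereologic counting comes from the sampling design, not the software: a systematic point grid with a uniformly random offset yields an unbiased estimate of area fraction for any feature shape. A minimal sketch of that counting rule (our own illustration, not the authors' ImageJ macro):

```python
import numpy as np

def point_count_fraction(mask, spacing, rng):
    """Estimate the area fraction of `mask` (2D boolean image) with a
    systematic uniform random point grid -- the unbiased counting rule
    behind classical stereology.
    """
    h, w = mask.shape
    y0, x0 = rng.integers(0, spacing, size=2)   # uniformly random grid offset
    ys = np.arange(y0, h, spacing)
    xs = np.arange(x0, w, spacing)
    hits = mask[np.ix_(ys, xs)]                 # grid points landing on the feature
    return hits.mean()
```

Because every pixel has the same probability of being sampled, the expectation of this estimator over random offsets equals the true area fraction exactly; averaging over repeated offsets (or sections) drives the estimate to that value.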
Entropic uncertainty relations and locking: Tight bounds for mutually unbiased bases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ballester, Manuel A.; Wehner, Stephanie
We prove tight entropic uncertainty relations for a large number of mutually unbiased measurements. In particular, we show that a bound derived from the result by Maassen and Uffink [Phys. Rev. Lett. 60, 1103 (1988)] for two such measurements can in fact be tight for up to √d measurements in mutually unbiased bases. We then show that using more mutually unbiased bases does not always lead to a better locking effect. We prove that the optimal bound for the accessible information using up to √d specific mutually unbiased bases is log d/2, which is the same as can be achieved by using only two bases. Our result indicates that merely using mutually unbiased bases is not sufficient to achieve a strong locking effect, and we need to look for additional properties.
NASA Astrophysics Data System (ADS)
Maack, Joachim; Lingenfelder, Marcus; Weinacker, Holger; Koch, Barbara
2016-07-01
Remote sensing-based timber volume estimation is key for modelling the regional potential, accessibility and price of lignocellulosic raw material for an emerging bioeconomy. We used a unique wall-to-wall airborne LiDAR dataset and Landsat 7 satellite images in combination with terrestrial inventory data derived from the National Forest Inventory (NFI), and applied generalized additive models (GAM) to estimate spatially explicit timber distribution and volume in forested areas. Since the NFI data showed an underlying structure regarding size and ownership, we additionally constructed a socio-economic predictor to enhance the accuracy of the analysis. Furthermore, we balanced the training dataset with a bootstrap method to achieve unbiased regression weights for interpolating timber volume. Finally, we compared and discussed the model performance of the original approach (r2 = 0.56, NRMSE = 9.65%), the approach with balanced training data (r2 = 0.69, NRMSE = 12.43%) and the final approach with balanced training data and the additional socio-economic predictor (r2 = 0.72, NRMSE = 12.17%). The results demonstrate the usefulness of remote sensing techniques for mapping timber volume for a future lignocellulose-based bioeconomy.
Rodgers, S; Vandeleur, C L; Strippoli, M-P F; Castelao, E; Tesic, A; Glaus, J; Lasserre, A M; Müller, M; Rössler, W; Ajdacic-Gross, V; Preisig, M
2017-09-01
Given the broad range of biopsychosocial difficulties resulting from major depressive disorder (MDD), reliable evidence for predictors of improved mental health is essential, particularly from unbiased prospective community samples. Consequently, a broad spectrum of potential clinical and non-clinical predictors of improved mental health, defined as an absence of current major depressive episode (MDE) at follow-up, were examined over a 5-year period in an adult community sample. The longitudinal population-based PsyCoLaus study from the city of Lausanne, Switzerland, was used. Subjects having a lifetime MDD with a current MDE at baseline assessment were selected, resulting in a subsample of 210 subjects. Logistic regressions were applied to the data. Coping styles were the most important predictive factors in the present study. More specifically, low emotion-oriented coping and informal help-seeking behaviour at baseline were associated with the absence of an MDD diagnosis at follow-up. Surprisingly, neither formal help-seeking behaviour, nor psychopharmacological treatment, nor childhood adversities, nor depression subtypes turned out to be relevant predictors in the current study. The paramount role of coping styles as predictors of improvement in depression found in the present study might be a valuable target for resource-oriented therapeutic models. On the one hand, the positive impact of low emotion-oriented coping highlights the utility of clinical interventions interrupting excessive mental ruminations during MDE. On the other hand, the importance of informal social networks raises questions regarding how to enlarge the personal network of affected subjects and on how to best support informal caregivers.
NASA Astrophysics Data System (ADS)
Chardon, Jérémy; Hingray, Benoit; Favre, Anne-Catherine
2016-04-01
Scenarios of surface weather required for impact studies have to be unbiased and adapted to the space and time scales of the hydro-systems considered. Raw surface weather scenarios obtained from global climate models and/or numerical weather prediction models are therefore not directly appropriate. Outputs of these models have to be post-processed, which is often carried out with Statistical Downscaling Methods (SDMs). Among these SDMs, approaches based on regression are often applied. For a given station, a regression link can be established between a set of large-scale atmospheric predictors and the surface weather variable, and this link is then used for prediction of the latter. However, the physical processes generating surface weather vary in time; this is well known for precipitation, for instance. The most relevant predictors and the regression link are therefore also likely to vary in time. A better prediction skill is thus classically obtained with a seasonal stratification of the data. Another strategy is to identify the most relevant predictor set and establish the regression link from dates that are similar - or analog - to the target date. In practice, these dates can be selected with an analog model. In this study, we explore the possibility of improving the local performance of an analog model - where the analogy is applied to the geopotential heights at 1000 and 500 hPa - using additional local-scale predictors for the probabilistic prediction of Safran precipitation over France. For each prediction day, the prediction is obtained from two GLM regression models - for the occurrence and the quantity of precipitation, respectively - whose predictors and parameters are estimated from the analog dates. Firstly, the resulting combined model noticeably increases prediction performance by adapting the downscaling link for each prediction day. 
Secondly, the selected predictors for a given prediction depend on the large scale situation and on the considered region. Finally, even with such an adaptive predictor identification, the downscaling link appears to be robust: for a same prediction day, predictors selected for different locations of a given region are similar and the regression parameters are consistent within the region of interest.
Application of single-step genomic evaluation for crossbred performance in pig.
Xiang, T; Nielsen, B; Su, G; Legarra, A; Christensen, O F
2016-03-01
Crossbreeding is predominant and intensively used in commercial meat production systems, especially in poultry and swine. Genomic evaluation has been successfully applied for breeding within purebreds, but it also offers opportunities for selecting purebreds for crossbred performance by combining information from purebreds with information from crossbreds. However, it generally requires that all relevant animals are genotyped, which is costly and presently does not seem to be feasible in practice. Recently, a novel single-step BLUP method for genomic evaluation of both purebred and crossbred performance has been developed that can incorporate marker genotypes into a traditional animal model. This new method had not been validated on real data sets. In this study, we applied this single-step method to analyze data for the maternal trait of total number of piglets born in Danish Landrace, Yorkshire, and two-way crossbred pigs under different scenarios. The genetic correlation between purebred and crossbred performances was investigated first, and then the impact of (crossbred) genomic information on prediction reliability for crossbred performance was explored. The results confirm the existence of a moderate genetic correlation, and the standard errors of the estimates were reduced when genomic information was included. Models with marker information, especially crossbred genomic information, improved model-based reliabilities for crossbred performance of purebred boars, improved the predictive ability for crossbred animals and, to some extent, reduced the bias of prediction. We conclude that the new single-step BLUP method is a good tool for the genetic evaluation of crossbred performance in purebred animals.
A two step Bayesian approach for genomic prediction of breeding values.
Shariati, Mohammad M; Sørensen, Peter; Janss, Luc
2012-05-21
In genomic models that assign an individual variance to each marker, the contribution of one marker to the posterior distribution of its variance is only one degree of freedom (df), which introduces many variance parameters with only little information per parameter. A better alternative could be to form clusters of markers with similar effects, where markers in a cluster share a common variance. The influence of each marker group of size p on the posterior distribution of the marker variances will then be p df. The simulated data from the 15th QTL-MAS workshop were analyzed such that SNP markers were ranked on their effects and markers with similar estimated effects were grouped together. In step 1, all markers with a minor allele frequency above 0.01 were included in a SNP-BLUP prediction model. In step 2, markers were ranked on their estimated variance contribution to the trait in step 1, and each set of 150 markers was assigned to one group with a common variance. In further analyses, subsets of the 1500 and 450 markers with the largest effects in step 2 were kept in the prediction model. Grouping markers outperformed the SNP-BLUP model in terms of accuracy of predicted breeding values. However, the accuracies were lower than those of Bayesian methods with marker-specific variances. Grouping markers is less flexible than allowing each marker its own variance, but grouping increases the power to estimate marker variances. Prior knowledge of the genetic architecture of the trait is necessary for clustering markers and for appropriate prior parameterization.
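Step 2 above, ranking markers by their estimated variance contribution and assigning each run of 150 ranked markers to a common-variance group, can be sketched as follows (an illustrative implementation, not the authors' code; `allele_var` stands for any per-marker variance weight such as 2p(1-p)):

```python
import numpy as np

def group_markers(u_hat, allele_var, group_size=150):
    """Rank markers by estimated variance contribution (effect^2 times the
    allele variance) and assign consecutive blocks of `group_size` ranked
    markers to common-variance groups (group 0 = largest contributions).
    """
    contrib = u_hat ** 2 * allele_var
    order = np.argsort(contrib)[::-1]           # largest contribution first
    groups = np.empty(len(u_hat), dtype=int)
    groups[order] = np.arange(len(u_hat)) // group_size
    return groups
```

Each group's markers would then share one variance parameter in the step-2 prediction model, giving that parameter group_size degrees of freedom instead of one.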
A statistical test of unbiased evolution of body size in birds.
Bokma, Folmer
2002-12-01
Of the approximately 9500 bird species, the vast majority is small-bodied. That is a general feature of evolutionary lineages, also observed for instance in mammals and plants. The avian interspecific body size distribution is right-skewed even on a logarithmic scale. That has previously been interpreted as evidence that body size evolution has been biased. However, a procedure to test for unbiased evolution from the shape of body size distributions was lacking. In the present paper unbiased body size evolution is defined precisely, and a statistical test is developed based on Monte Carlo simulation of unbiased evolution. Application of the test to birds suggests that it is highly unlikely that avian body size evolution has been unbiased as defined. Several possible explanations for this result are discussed. A plausible explanation is that the general model of unbiased evolution assumes that population size and generation time do not affect the evolutionary variability of body size; that is, that micro- and macroevolution are decoupled, which theory suggests is not likely to be the case.
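A Monte Carlo test of this kind can be sketched by simulating driftless (unbiased) evolution of log body size within clades, building a null distribution of tip skewness, and locating an observed skewness in it. This toy version uses our own simplified branching scheme, not the paper's model:

```python
import numpy as np

def simulate_clade_skewness(n_species, step_sd, rng):
    """Skewness of log body size in one clade under unbiased evolution:
    each speciation copies the parent's value, then every lineage takes
    an independent, driftless random-walk step.
    """
    sizes = np.array([0.0])
    while len(sizes) < n_species:
        i = rng.integers(len(sizes))
        sizes = np.append(sizes, sizes[i])                  # speciation
        sizes = sizes + rng.normal(0.0, step_sd, len(sizes))  # unbiased walk
    x = sizes - sizes.mean()
    return (x ** 3).mean() / (x ** 2).mean() ** 1.5

rng = np.random.default_rng(4)
null = np.array([simulate_clade_skewness(50, 0.2, rng) for _ in range(500)])
# One-sided Monte Carlo p-value for a hypothetical observed right skew of 1.5
observed = 1.5
p_value = (null >= observed).mean()
```

A small p-value, as the paper reports for the avian distribution, indicates that skewness as extreme as observed is unlikely under unbiased evolution.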
Mutually unbiased bases and semi-definite programming
NASA Astrophysics Data System (ADS)
Brierley, Stephen; Weigert, Stefan
2010-11-01
A complex Hilbert space of dimension six supports at least three but not more than seven mutually unbiased bases. Two computer-aided analytical methods to tighten these bounds are reviewed, based on a discretization of parameter space and on Gröbner bases. A third algorithmic approach is presented: the non-existence of more than three mutually unbiased bases in composite dimensions can be decided by a global optimization method known as semidefinite programming. The method is used to confirm that the spectral matrix cannot be part of a complete set of seven mutually unbiased bases in dimension six.
Link, W.A.; Armitage, Peter; Colton, Theodore
1998-01-01
Unbiasedness is probably the best known criterion for evaluating the performance of estimators. This note describes unbiasedness, demonstrating various failings of the criterion. It is shown that unbiased estimators might not exist, or might not be unique; an example of a unique but clearly unacceptable unbiased estimator is given. It is shown that unbiased estimators are not translation invariant. Various alternative criteria are described, and are illustrated through examples.
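The note's own examples are not reproduced here, but a classic textbook illustration of a unique yet clearly unacceptable unbiased estimator (our choice of example, not necessarily the article's) is (-1)^X for e^(-2λ) from a single Poisson(λ) observation: it is the only unbiased estimator, yet it estimates a quantity in (0, 1) by either -1 or +1:

```python
import numpy as np

rng = np.random.default_rng(42)
lam = 1.5

# For X ~ Poisson(lam), E[a^X] = exp(lam*(a-1)); with a = -1 this gives
# E[(-1)^X] = exp(-2*lam). So T(X) = (-1)^X is the unique unbiased
# estimator of exp(-2*lam) -- absurd, since it only takes values -1, +1.
x = rng.poisson(lam, 1_000_000)
t = (-1.0) ** x
print("mean of (-1)^X :", t.mean())
print("target exp(-2λ):", np.exp(-2 * lam))
```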
FAST TRACK COMMUNICATION: Affine constellations without mutually unbiased counterparts
NASA Astrophysics Data System (ADS)
Weigert, Stefan; Durt, Thomas
2010-10-01
It has been conjectured that a complete set of mutually unbiased bases in a space of dimension d exists if and only if there is an affine plane of order d. We introduce affine constellations and compare their existence properties with those of mutually unbiased constellations. The observed discrepancies make a deeper relation between the two existence problems unlikely.
Response to Selection in Finite Locus Models with Nonadditive Effects.
Esfandyari, Hadi; Henryon, Mark; Berg, Peer; Thomasen, Jørn Rind; Bijma, Piter; Sørensen, Anders Christian
2017-05-01
Under the finite-locus model in the absence of mutation, the additive genetic variation is expected to decrease when directional selection acts on a population, according to quantitative-genetic theory. However, some theoretical studies of selection suggest that the level of additive variance can be sustained or even increased when nonadditive genetic effects are present. We tested the hypothesis that finite-locus models with both additive and nonadditive genetic effects maintain more additive genetic variance (VA) and realize larger medium- to long-term genetic gains than models with only additive effects when the trait under selection is subject to truncation selection. Four genetic models that included additive, dominance, and additive-by-additive epistatic effects were simulated. The simulated genome for individuals consisted of 25 chromosomes, each with a length of 1 M. One hundred bi-allelic QTL, 4 on each chromosome, were considered. In each generation, 100 sires and 100 dams were mated, producing 5 progeny per mating. The population was selected for a single trait (h2 = 0.1) for 100 discrete generations with selection on phenotype or BLUP-EBV. VA decreased with directional truncation selection even in the presence of nonadditive genetic effects. Nonadditive effects influenced the long-term response to selection, and among the genetic models additive gene action yielded the highest response. In addition, in all genetic models, BLUP-EBV resulted in a greater fixation of favorable and unfavorable alleles and a higher response than phenotypic selection. In conclusion, for the schemes we simulated, the presence of nonadditive genetic effects had little effect on changes in additive variance, and VA decreased under directional selection. © The American Genetic Association 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
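A minimal additive-only version of such a truncation-selection simulation might look as follows. It is deliberately simplified relative to the study: unlinked loci, no dominance or epistasis, a single pool of selected parents instead of separate sires and dams, and illustrative population sizes:

```python
import numpy as np

rng = np.random.default_rng(7)
n_loci, n_ind, n_gen = 100, 500, 20
effects = rng.normal(0, 0.3, n_loci)          # additive allele effects
geno = rng.binomial(2, 0.5, (n_ind, n_loci))  # allele counts 0/1/2

va_init = (geno @ effects).var()              # additive variance, gen 0
for gen in range(n_gen):
    bv = geno @ effects                       # true breeding values
    pheno = bv + rng.normal(0, np.sqrt(bv.var() * 9), n_ind)  # h2 ~ 0.1
    parents = geno[np.argsort(pheno)[-100:]]  # truncation on phenotype
    # Random mating: each offspring draws one allele per locus per parent.
    sires = parents[rng.integers(100, size=n_ind)]
    dams = parents[rng.integers(100, size=n_ind)]
    geno = rng.binomial(1, sires / 2) + rng.binomial(1, dams / 2)
    if gen % 5 == 0:
        print(f"gen {gen:2d}  VA = {(geno @ effects).var():.3f}")
```

Under sustained truncation selection the printed VA trends downward as favorable alleles move toward fixation, the qualitative pattern the study reports.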
Ridge, Lasso and Bayesian additive-dominance genomic models.
Azevedo, Camila Ferreira; de Resende, Marcos Deon Vilela; E Silva, Fabyano Fonseca; Viana, José Marcelo Soriano; Valente, Magno Sávio Ferreira; Resende, Márcio Fernando Ribeiro; Muñoz, Patricio
2015-08-25
A complete approach for genome-wide selection (GWS) involves reliable statistical genetics models and methods. Reports on this topic are common for additive genetic models but not for additive-dominance models. The objective of this paper was (i) to compare the performance of 10 additive-dominance predictive models (including current models and proposed modifications), fitted using Bayesian, Lasso and Ridge regression approaches; and (ii) to decompose genomic heritability and accuracy in terms of three quantitative genetic information sources, namely, linkage disequilibrium (LD), co-segregation (CS) and pedigree relationships or family structure (PR). The simulation study considered two broad-sense heritability levels (0.30 and 0.50, associated with narrow-sense heritabilities of 0.20 and 0.35, respectively) and two genetic architectures for traits (the first consisting of small gene effects and the second consisting of a mixed inheritance model with five major genes). G-REML/G-BLUP and a modified Bayesian/Lasso method (called BayesA*B* or t-BLASSO) performed best in the prediction of genomic breeding values as well as the total genotypic values of individuals in all four scenarios (two heritabilities × two genetic architectures). The BayesA*B*-type method showed a better ability to recover the dominance variance/additive variance ratio. Decomposition of genomic heritability and accuracy revealed the following descending order of importance of the information sources: LD, CS and PR not captured by markers, the last two being very close. Amongst the 10 models/methods evaluated, the G-BLUP, BayesA*B* (-2,8) and BayesA*B* (4,6) methods presented the best results and were found to be adequate for accurately predicting genomic breeding values and total genotypic values, as well as for estimating additive and dominance effects in additive-dominance genomic models.
Genomic prediction in a nuclear population of layers using single-step models.
Yan, Yiyuan; Wu, Guiqin; Liu, Aiqiao; Sun, Congjiao; Han, Wenpeng; Li, Guangqi; Yang, Ning
2018-02-01
Single-step genomic prediction methods have been proposed to improve the accuracy of genomic prediction by incorporating information on both genotyped and ungenotyped animals. The objective of this study is to compare the prediction performance of single-step models with that of 2-step models and pedigree-based models in a nuclear population of layers. A total of 1,344 chickens across 4 generations were genotyped with a 600 K SNP chip. Four traits were analyzed, i.e., body weight at 28 wk (BW28), egg weight at 28 wk (EW28), laying rate at 38 wk (LR38), and Haugh unit at 36 wk (HU36). When predicting offspring, individuals from generations 1 to 3 were used as training data and females from generation 4 were used as the validation set. The accuracies of breeding values predicted by pedigree BLUP (PBLUP), genomic BLUP (GBLUP), single-step GBLUP (SSGBLUP) and single-step blending (SSBlending) were compared for both genotyped and ungenotyped individuals. For genotyped females, GBLUP performed no better than PBLUP because of the small size of the training data, while the 2 single-step models predicted more accurately than the PBLUP model. The average predictive abilities of SSGBLUP and SSBlending were 16.0% and 10.8% higher than those of the PBLUP model across traits, respectively. Furthermore, the predictive abilities for ungenotyped individuals were also enhanced. The average improvements of predictive ability were 5.9% and 1.5% for the SSGBLUP and SSBlending models, respectively. It was concluded that single-step models, especially the SSGBLUP model, can yield more accurate prediction of genetic merit and are preferable for practical implementation of genomic selection in layers. © 2017 Poultry Science Association Inc.
Optimal reconstruction of the states in qutrit systems
NASA Astrophysics Data System (ADS)
Yan, Fei; Yang, Ming; Cao, Zhuo-Liang
2010-10-01
Based on mutually unbiased measurements, an optimal tomographic scheme for the multiqutrit states is presented explicitly. Because the reconstruction process of states based on mutually unbiased states is free of information waste, we refer to our scheme as the optimal scheme. By optimal we mean that the number of the required conditional operations reaches the minimum in this tomographic scheme for the states of qutrit systems. Special attention will be paid to how those different mutually unbiased measurements are realized; that is, how to decompose each transformation that connects each mutually unbiased basis with the standard computational basis. It is found that all those transformations can be decomposed into several basic implementable single- and two-qutrit unitary operations. For the three-qutrit system, there exist five different mutually unbiased-bases structures with different entanglement properties, so we introduce the concept of physical complexity to minimize the number of nonlocal operations needed over the five different structures. This scheme is helpful for experimental scientists to realize the most economical reconstruction of quantum states in qutrit systems.
Aspects of mutually unbiased bases in odd-prime-power dimensions
NASA Astrophysics Data System (ADS)
Chaturvedi, S.
2002-04-01
We rephrase the Wootters-Fields construction [W. K. Wootters and B. C. Fields, Ann. Phys. 191, 363 (1989)] of a full set of mutually unbiased bases in a complex vector space of dimension N = p^r, where p is an odd prime, in terms of the character vectors of the cyclic group G of order p. This form may be useful in explicitly writing down mutually unbiased bases for N = p^r.
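For the prime case (r = 1) the construction is short enough to verify numerically: basis b has vectors whose k-th components are ω^(bk² + ak)/√p with ω = e^(2πi/p), and together with the computational basis this yields p + 1 mutually unbiased bases. The sketch below checks this for p = 5:

```python
import numpy as np

p = 5                                    # an odd prime (the r = 1 case)
omega = np.exp(2j * np.pi / p)
k = np.arange(p)

# Basis b has vectors indexed by a, with components omega^(b*k^2 + a*k)/sqrt(p).
bases = [np.array([[omega ** ((b * kk**2 + a * kk) % p) for kk in k]
                   for a in range(p)]) / np.sqrt(p) for b in range(p)]
bases.append(np.eye(p, dtype=complex))   # computational basis -> p + 1 bases

# Mutual unbiasedness: |<u|v>|^2 = 1/p for vectors from different bases.
for i in range(len(bases)):
    for j in range(i + 1, len(bases)):
        overlaps = np.abs(bases[i].conj() @ bases[j].T) ** 2
        assert np.allclose(overlaps, 1 / p)
print(f"verified {len(bases)} mutually unbiased bases in dimension {p}")
```

The cross-basis overlaps reduce to quadratic Gauss sums, whose magnitude √p for odd prime p is exactly what makes the construction work.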
Comparative modeling and benchmarking data sets for human histone deacetylases and sirtuin families.
Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon
2015-02-23
Histone deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases, and other types of diseases. Virtual screening (VS) has become a fairly effective approach for the discovery of novel and highly selective histone deacetylase inhibitors (HDACIs). To facilitate the process, we constructed maximal unbiased benchmarking data sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs cover all four classes including Class III (the Sirtuin family) and 14 HDAC isoforms, comprising 631 inhibitors and 24,609 unbiased decoys. The ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matched with the ligands and maximally unbiased in terms of "artificial enrichment" and "analogue bias". We also conducted comparative studies with the DUD-E and DEKOIS 2.0 sets against the HDAC2 and HDAC8 targets and demonstrate that our MUBD-HDACs are unique in that they can be applied without bias to both LBVS and SBVS approaches. In addition, we defined a novel metric, NLBScore, to detect the "2D bias" and "LBVS favorable" effects within benchmarking sets. In summary, the MUBD-HDACs are the only comprehensive and maximally unbiased benchmarking data sets for HDACs (including Sirtuins) available so far. MUBD-HDACs are freely available at http://www.xswlab.org/ .
Fangmann, A; Sharifi, R A; Heinkel, J; Danowski, K; Schrade, H; Erbe, M; Simianer, H
2017-04-01
Currently used multi-step methods to incorporate genomic information in the prediction of breeding values (BV) implicitly involve many assumptions which, if violated, may result in loss of information, inaccuracies and bias. To overcome this, single-step genomic best linear unbiased prediction (ssGBLUP) was proposed combining pedigree, phenotype and genotype of all individuals for genetic evaluation. Our objective was to implement ssGBLUP for genomic predictions in pigs and to compare the accuracy of ssGBLUP with that of multi-step methods with empirical data of moderately sized pig breeding populations. Different predictions were performed: conventional parent average (PA), direct genomic value (DGV) calculated with genomic BLUP (GBLUP), a GEBV obtained by blending the DGV with PA, and ssGBLUP. Data comprised individuals from a German Landrace (LR) and Large White (LW) population. The trait 'number of piglets born alive' (NBA) was available for 182,054 litters of 41,090 LR sows and 15,750 litters from 4534 LW sows. The pedigree contained 174,021 animals, of which 147,461 (26,560) animals were LR (LW) animals. In total, 526 LR and 455 LW animals were genotyped with the Illumina PorcineSNP60 BeadChip. After quality control and imputation, 495 LR (424 LW) animals with 44,368 (43,678) SNP on 18 autosomes remained for the analysis. Predictive abilities, i.e., correlations between de-regressed proofs and genomic BV, were calculated with a five-fold cross validation and with a forward prediction for young genotyped validation animals born after 2011. Generally, predictive abilities for LR were rather small (0.08 for GBLUP, 0.19 for GEBV and 0.18 for ssGBLUP). For LW, ssGBLUP had the greatest predictive ability (0.45). For both breeds, assessment of reliabilities for young genotyped animals indicated that genomic prediction outperforms PA with ssGBLUP providing greater reliabilities (0.40 for LR and 0.32 for LW) than GEBV (0.35 for LR and 0.29 for LW). 
Grouping of animals according to information sources revealed that genomic prediction had the highest potential benefit for genotyped animals without their own phenotype. Although ssGBLUP did not generally outperform GBLUP or GEBV, the results suggest that ssGBLUP can be a useful and conceptually convincing approach for practical genomic prediction of NBA in moderately sized LR and LW populations.
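Stripped of the single-step machinery, the GBLUP component that such comparisons build on can be sketched in a few lines, using VanRaden's genomic relationship matrix on simulated data. All sizes and variance ratios below are illustrative assumptions, not the study's settings:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 300, 1000                         # animals, SNPs (illustrative sizes)
M = rng.binomial(2, 0.3, (n, m)).astype(float)
u_true = (M - M.mean(0)) @ rng.normal(0, 0.05, m)   # true breeding values
y = 10 + u_true + rng.normal(0, u_true.std(), n)    # heritability about 0.5

# VanRaden genomic relationship matrix G = ZZ' / (2 * sum p(1-p)).
p = M.mean(axis=0) / 2
Z = M - 2 * p
G = Z @ Z.T / (2 * (p * (1 - p)).sum())
G += np.eye(n) * 1e-3                    # small ridge to stabilise inversion

# GBLUP: u_hat = G (G + lambda*I)^-1 (y - mean), lambda = sigma_e^2/sigma_u^2.
lam = 1.0
u_hat = G @ np.linalg.solve(G + lam * np.eye(n), y - y.mean())
print("accuracy:", np.corrcoef(u_hat, u_true)[0, 1])
```

Single-step methods extend this by replacing G with an H matrix that blends genomic and pedigree relationships, so ungenotyped animals enter the same system.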
Bangera, Rama; Correa, Katharina; Lhorente, Jean P; Figueroa, René; Yáñez, José M
2017-01-31
Salmon Rickettsial Syndrome (SRS) caused by Piscirickettsia salmonis is a major disease affecting the Chilean salmon industry. Genomic selection (GS) is a method wherein genome-wide markers and phenotype information of full-sibs are used to predict genomic EBV (GEBV) of selection candidates, and it is expected to have increased accuracy and response to selection over traditional pedigree-based best linear unbiased prediction (PBLUP). Widely used GS methods such as genomic BLUP (GBLUP), SNP-BLUP, Bayes C and Bayesian Lasso may perform differently with respect to accuracy of GEBV prediction. Our aim was to compare the accuracy, in terms of reliability of genome-enabled prediction, of different GS methods with PBLUP for resistance to SRS in an Atlantic salmon breeding program. Number of days to death (DAYS) and binary survival status (STATUS) phenotypes, together with 50 K SNP array genotypes, were obtained from 2601 smolts challenged with P. salmonis. The reliabilities of the different GS methods at different SNP densities, with and without pedigree, were compared to PBLUP using a five-fold cross-validation scheme. Heritability estimated from the GS methods was significantly higher than that from PBLUP. Pearson's correlation between GEBV predicted from PBLUP and GS models ranged from 0.79 to 0.91 for DAYS and from 0.79 to 0.95 for STATUS. The relative increase in reliability from the different GS methods with 50 K SNP ranged from 8 to 25% for DAYS and from 27 to 30% for STATUS. All GS methods outperformed PBLUP at all marker densities. DAYS and STATUS showed superior reliability over PBLUP even at the lowest marker densities of 3 K and 500 SNP, respectively. 20 K SNP showed close to maximal reliability for both traits, with little improvement using higher densities. These results indicate that genomic predictions can accelerate genetic progress for SRS resistance in Atlantic salmon, and implementation of this approach will contribute to the control of SRS in Chile.
We recommend GBLUP for routine GS evaluation because this method is computationally faster and the results are very similar with other GS methods. The use of lower density SNP or the combination of low density SNP and an imputation strategy may help to reduce genotyping costs without compromising gain in reliability.
Stanley, Thomas R.; Aldridge, Cameron L.; Saher, Joanne; Childers, Theresa
2015-01-01
The Gunnison Sage-Grouse (Centrocercus minimus) is a species of conservation concern and is a candidate for listing under the U.S. Endangered Species Act because of substantial declines in populations from historic levels. It is thought that loss, fragmentation, and deterioration of sagebrush (Artemisia spp.) habitat have contributed to the decline and isolation of this species into seven geographically distinct subpopulations. Nest survival is known to be a primary driver of demography of Greater Sage-Grouse (C. urophasianus), but no unbiased estimates of daily nest survival rates (hereafter nest survival) exist for Gunnison Sage-Grouse, nor are there published studies identifying factors that influence nest survival. We estimated nest survival of Gunnison Sage-Grouse for the western portion of Colorado's Gunnison Basin subpopulation, and assessed the effects and relative importance of local- and landscape-scale habitat characteristics on nest survival. Our top performing model was one that allowed variation in nest survival among areas, suggesting a larger landscape-area effect. Overall nest success during a 38-day nesting period (egg-laying plus incubation) was 50% (daily survival rate [SE] = 0.982 [0.003]), which is higher than previous estimates for Gunnison Sage-Grouse and generally higher than published for the closely related Greater Sage-Grouse. We did not find strong evidence that local-scale habitat variables were better predictors of nest survival than landscape-scale predictors, nor did we find strong evidence that any of the habitat variables we measured were good predictors of nest survival. Nest success of Gunnison Sage-Grouse in the western portion of the Gunnison Basin was higher than previously believed.
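The reported figures are internally consistent: compounding the daily survival rate over the 38-day nesting period recovers the overall success estimate.

```python
# Period nest success is the daily survival rate compounded over the
# 38-day nesting period (egg-laying plus incubation).
daily = 0.982
success = daily ** 38
print(f"nest success over 38 days: {success:.2f}")
```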
An AUC-based permutation variable importance measure for random forests
2013-01-01
Background The random forest (RF) method is a commonly used tool for classification with high dimensional data as well as for ranking candidate predictors based on the so-called random forest variable importance measures (VIMs). However the classification performance of RF is known to be suboptimal in case of strongly unbalanced data, i.e. data where response class sizes differ considerably. Suggestions were made to obtain better classification performance based either on sampling procedures or on cost sensitivity analyses. However to our knowledge the performance of the VIMs has not yet been examined in the case of unbalanced response classes. In this paper we explore the performance of the permutation VIM for unbalanced data settings and introduce an alternative permutation VIM based on the area under the curve (AUC) that is expected to be more robust towards class imbalance. Results We investigated the performance of the standard permutation VIM and of our novel AUC-based permutation VIM for different class imbalance levels using simulated data and real data. The results suggest that the new AUC-based permutation VIM outperforms the standard permutation VIM for unbalanced data settings while both permutation VIMs have equal performance for balanced data settings. Conclusions The standard permutation VIM loses its ability to discriminate between associated predictors and predictors not associated with the response for increasing class imbalance. It is outperformed by our new AUC-based permutation VIM for unbalanced data settings, while the performance of both VIMs is very similar in the case of balanced classes. The new AUC-based VIM is implemented in the R package party for the unbiased RF variant based on conditional inference trees. The codes implementing our study are available from the companion website: http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/070_drittmittel/janitza/index.html. PMID:23560875
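The AUC-based permutation importance can be sketched as follows. This simplified version permutes a predictor and measures the in-sample drop in AUC; the published method works tree-wise on out-of-bag samples and is implemented in the R package party. The toy data and all parameters are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Unbalanced toy data: column 0 is informative, column 1 is noise; ~10% positives.
n = 2000
y = (rng.random(n) < 0.1).astype(int)
X = rng.normal(size=(n, 2))
X[:, 0] += 1.5 * y

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X, y)

def auc_vim(rf, X, y, j, n_perm=20):
    """AUC-based permutation importance: mean drop in AUC after
    permuting predictor j (a sketch of the idea, not the party code)."""
    base = roc_auc_score(y, rf.predict_proba(X)[:, 1])
    drops = []
    for _ in range(n_perm):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        drops.append(base - roc_auc_score(y, rf.predict_proba(Xp)[:, 1]))
    return float(np.mean(drops))

print("VIM informative predictor:", auc_vim(rf, X, y, 0))
print("VIM noise predictor      :", auc_vim(rf, X, y, 1))
```

Because the AUC is computed over all positive-negative pairs, it is insensitive to the class ratio, which is the intuition behind the VIM's robustness to imbalance.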
Hierarchical thinking in network biology: the unbiased modularization of biochemical networks.
Papin, Jason A; Reed, Jennifer L; Palsson, Bernhard O
2004-12-01
As reconstructed biochemical reaction networks continue to grow in size and scope, there is a growing need to describe the functional modules within them. Such modules facilitate the study of biological processes by deconstructing complex biological networks into conceptually simple entities. The definition of network modules is often based on intuitive reasoning. As an alternative, methods are being developed for defining biochemical network modules in an unbiased fashion. These unbiased network modules are mathematically derived from the structure of the whole network under consideration.
Gutreuter, S.; Boogaard, M.A.
2007-01-01
Predictors of percentile lethal/effective concentrations/doses are commonly used measures of efficacy and toxicity. Typically such quantal-response predictors (e.g., the exposure required to kill 50% of some population) are estimated from simple bioassays wherein organisms are exposed to a gradient of several concentrations of a single agent. The toxicity of an agent may be influenced by auxiliary covariates, however, and more complicated experimental designs may introduce multiple variance components. Prediction methods lag behind examples of those cases. A conventional two-stage approach consists of multiple bivariate predictions of, say, median lethal concentration, followed by regression of those predictions on the auxiliary covariates. We propose a more effective and parsimonious class of generalized nonlinear mixed-effects models for prediction of lethal/effective dose/concentration from auxiliary covariates. We demonstrate examples using data from a study regarding the effects of pH and additions of variable quantities of 2′,5′-dichloro-4′-nitrosalicylanilide (niclosamide) on the toxicity of 3-trifluoromethyl-4-nitrophenol to larval sea lamprey (Petromyzon marinus). The new models yielded unbiased predictions, and root-mean-squared errors (RMSEs) of prediction for the exposures required to kill 50 and 99.9% of some population were 29 to 82% smaller, respectively, than those from the conventional two-stage procedure. The model class is flexible and easily implemented using commonly available software. © 2007 SETAC.
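A sketch of the basic covariate-adjusted quantal-response fit: a plain logistic regression in log concentration and pH, rather than the authors' generalized nonlinear mixed-effects models. The simulated data, coefficients, and pH effect are made-up assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Simulated quantal-response data: mortality depends on log10 concentration
# of the toxicant and on pH (auxiliary covariate). True model is invented.
n = 3000
logc = rng.uniform(-1, 1, n)
ph = rng.uniform(6.5, 9.0, n)
eta = 4.0 * logc - 2.0 * (ph - 7.8)          # toxicity falls at high pH
died = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(int)

X = np.column_stack([logc, ph])
fit = LogisticRegression(C=1e6, max_iter=1000).fit(X, died)  # near-unpenalised
b0 = fit.intercept_[0]
b1, b2 = fit.coef_[0]

# LC50 at a given pH: solve b0 + b1*log10(c) + b2*pH = 0 for concentration c.
for ph0 in (7.0, 8.5):
    lc50 = 10 ** (-(b0 + b2 * ph0) / b1)
    print(f"pH {ph0}: LC50 = {lc50:.3f} (concentration units)")
```

The covariate enters the linear predictor directly, so a single fit gives LC50 as a smooth function of pH, instead of separate bivariate fits followed by a second-stage regression.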
NASA Astrophysics Data System (ADS)
Frolov, Sergey; Garau, Bartolame; Bellingham, James
2014-08-01
Regular grid ("lawnmower") survey is a classical strategy for synoptic sampling of the ocean. Is it possible to achieve a more effective use of available resources if one takes into account a priori knowledge about variability in magnitudes of uncertainty and decorrelation scales? In this article, we develop and compare the performance of several path-planning algorithms: optimized "lawnmower," a graph-search algorithm (A*), and a fully nonlinear genetic algorithm. We use the machinery of the best linear unbiased estimator (BLUE) to quantify the ability of a vehicle fleet to synoptically map distribution of phytoplankton off the central California coast. We used satellite and in situ data to specify covariance information required by the BLUE estimator. Computational experiments showed that two types of sampling strategies are possible: a suboptimal space-filling design (produced by the "lawnmower" and the A* algorithms) and an optimal uncertainty-aware design (produced by the genetic algorithm). Unlike the space-filling designs that attempted to cover the entire survey area, the optimal design focused on revisiting areas of high uncertainty. Results of the multivehicle experiments showed that fleet performance predictors, such as cumulative speed or the weight of the fleet, predicted the performance of a homogeneous fleet well; however, these were poor predictors for comparing the performance of different platforms.
Weintraub, Marc J; Hall, Daniel L; Carbonella, Julia Y; Weisman de Mamani, Amy; Hooley, Jill M
2017-06-01
There is growing concern that much published research may have questionable validity due to phenomena such as publication bias and p-hacking. Within the psychiatric literature, the construct of expressed emotion (EE) is widely assumed to be a reliable predictor of relapse across a range of mental illnesses. EE is an index of the family climate, measuring how critical, hostile, and overinvolved a family member is toward a mentally ill patient. No study to date has examined the evidential value of this body of research as a whole. That is to say, although many studies have shown a link between EE and symptom relapse, the integrity of the literature from which this claim is derived has not been tested. In an effort to confirm the integrity of the literature of EE predicting psychiatric relapse in patients with schizophrenia, we conducted a p-curve analysis on all known studies examining EE (using the Camberwell Family Interview) to predict psychiatric relapse over a 9- to 12-month follow-up period. Results suggest that the body of literature on EE is unbiased and has integrity, as there was a significant right skew of p-values, a nonsignificant left skew of p-values, and a nonsignificant test of flatness. We conclude that EE is a robust and valuable predictor of symptom relapse in schizophrenia. © 2016 Family Process Institute.
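A simplified stand-in for the right-skew part of a p-curve analysis is a binomial test of how many significant p-values fall below .025; the full method uses continuous pp-values and a Stouffer test. The p-values below are illustrative assumptions, not the study's data:

```python
from scipy.stats import binomtest

# Hypothetical significant p-values from a set of studies (made up).
p_values = [0.001, 0.004, 0.008, 0.012, 0.015, 0.021, 0.024, 0.031, 0.042]

# Under the null of no true effect, significant p-values are uniform on
# (0, .05), so half should fall below .025. An excess of very small
# p-values (right skew) indicates evidential value.
k = sum(p < 0.025 for p in p_values)
res = binomtest(k, n=len(p_values), p=0.5, alternative="greater")
print(f"{k}/{len(p_values)} p-values below .025; right-skew p = {res.pvalue:.3f}")
```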
Harris, Alexandre M.; DeGiorgio, Michael
2016-01-01
Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator's variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, H̃_BLUE, relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error relative to others. Further, we empirically assess the performance of H̃_BLUE on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of H̃_BLUE leads to improved estimates of the population differentiation statistic, FST, which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data. PMID:28040781
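The core of a best linear unbiased estimator of an allele frequency can be sketched in a few lines: given a kinship-type covariance matrix K among the sampled individuals' allele counts, the minimum-variance unbiased weights are proportional to K⁻¹1. The kinship matrix and genotypes below are hypothetical, and the paper's full H̃_BLUE estimator adds an unbiasedness correction for heterozygosity that is not shown here:

```python
import numpy as np

def blue_allele_freq(genotypes, K):
    """BLUE of an allele frequency for related individuals.
    genotypes: allele counts in {0,1,2}; K: n x n kinship-type covariance
    structure. Weights w = K^-1 1 / (1' K^-1 1) minimise the variance
    among linear unbiased estimators."""
    ones = np.ones(len(genotypes))
    Kinv1 = np.linalg.solve(K, ones)
    w = Kinv1 / (ones @ Kinv1)
    return w @ genotypes / 2

# Toy example: two full sibs plus one unrelated individual (hypothetical K).
K = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
g = np.array([2, 1, 0])
p_hat = blue_allele_freq(g, K)
print("BLUE allele frequency:", round(p_hat, 4))
print("sample proportion    :", g.sum() / 6)
```

The BLUE downweights the correlated sibs relative to the unrelated individual, which is where its variance advantage over the plain sample proportion comes from.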
Spectroscopic observation of SN2017gkk by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Onori, F.; Benetti, S.; Cappellaro, E.; Losada, Illa R.; Gafton, E.; NUTS Collaboration
2017-09-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of supernova SN2017gkk (=MASTER OT J091344.71762842.5) in host galaxy NGC 2748.
Spectroscopic observation of ASASSN-17he by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Kostrzewa-Rutkowska, Z.; Benetti, S.; Dong, S.; Stritzinger, M.; Stanek, K.; Brimacombe, J.; Sagues, A.; Galindo, P.; Losada, I. Rivero
2017-10-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17he. The candidate was discovered by the All-Sky Automated Survey for Supernovae.
Convergence of Free Energy Profile of Coumarin in Lipid Bilayer.
Paloncýová, Markéta; Berka, Karel; Otyepka, Michal
2012-04-10
Atomistic molecular dynamics (MD) simulations of druglike molecules embedded in lipid bilayers are of considerable interest as models for drug penetration and positioning in biological membranes. Here we analyze partitioning of coumarin in dioleoylphosphatidylcholine (DOPC) bilayer, based on both multiple, unbiased 3 μs MD simulations (total length) and free energy profiles along the bilayer normal calculated by biased MD simulations (∼7 μs in total). The convergences in time of free energy profiles calculated by both umbrella sampling and z-constraint techniques are thoroughly analyzed. Two sets of starting structures are also considered, one from unbiased MD simulation and the other from "pulling" coumarin along the bilayer normal. The structures obtained by pulling simulation contain water defects on the lipid bilayer surface, while those acquired from unbiased simulation have no membrane defects. The free energy profiles converge more rapidly when starting frames from unbiased simulations are used. In addition, z-constraint simulation leads to more rapid convergence than umbrella sampling, due to quicker relaxation of membrane defects. Furthermore, we show that the choice of RESP, PRODRG, or Mulliken charges considerably affects the resulting free energy profile of our model drug along the bilayer normal. We recommend using z-constraint biased MD simulations based on starting geometries acquired from unbiased MD simulations for efficient calculation of convergent free energy profiles of druglike molecules along bilayer normals. The calculation of free energy profile should start with an unbiased simulation, though the polar molecules might need a slow pulling afterward. Results obtained with the recommended simulation protocol agree well with available experimental data for two coumarin derivatives.
Spectroscopic classification of Gaia18adv by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Gall, C.; Benetti, S.; Wyrzykowski, L.; Stritzinger, M.; Holmbo, S.; Dong, S.; Siltala, Lauri; NUTS Collaboration
2018-01-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of Gaia18adv (SN2018hh) near the host galaxy SDSS J121341.37+282640.0.
NASA Astrophysics Data System (ADS)
Cannizzaro, G.; Kuncarayakti, H.; Fraser, M.; Hamanowicz, A.; Jonker, P.; Kankare, E.; Kostrzewa-Rutkowska, Z.; Onori, F.; Wevers, T.; Wyrzykowski, L.; Galbany, L.
2018-03-01
The NOT Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of supernovae SN 2018aei and SN 2018aej, discovered by the Pan-STARRS Survey for Transients (ATel #11408).
Reconfigurable generation and measurement of mutually unbiased bases for time-bin qudits
NASA Astrophysics Data System (ADS)
Lukens, Joseph M.; Islam, Nurul T.; Lim, Charles Ci Wen; Gauthier, Daniel J.
2018-03-01
We propose a method for implementing mutually unbiased generation and measurement of time-bin qudits using a cascade of electro-optic phase modulator-coded fiber Bragg grating pairs. Our approach requires only a single spatial mode and can switch rapidly between basis choices. We obtain explicit solutions for dimensions d = 2, 3, and 4 that realize all d + 1 possible mutually unbiased bases and analyze the performance of our approach in quantum key distribution. Given its practicality and compatibility with current technology, our approach provides a promising springboard for scalable processing of high-dimensional time-bin states.
Tomashek, Kay M.; Lorenzi, Olga D.; Andújar-Pérez, Doris A.; Torres-Velásquez, Brenda C.; Hunsperger, Elizabeth A.; Munoz-Jordan, Jorge Luis; Perez-Padilla, Janice; Rivera, Aidsa; Gonzalez-Zeno, Gladys E.; Sharp, Tyler M.; Galloway, Renee L.; Glass Elrod, Mindy; Mathis, Demetrius L.; Oberste, M. Steven; Nix, W. Allan; Henderson, Elizabeth; McQuiston, Jennifer; Singleton, Joseph; Kato, Cecilia; García Gubern, Carlos; Santiago-Rivera, William; Cruz-Correa, Jesús; Muns-Sosa, Robert; Ortiz-Rivera, Juan D.; Jiménez, Gerson; Galarza, Ivonne E.; Horiuchi, Kalanthe; Margolis, Harold S.; Alvarado, Luisa I.
2017-01-01
Identifying etiologies of acute febrile illnesses (AFI) is challenging due to non-specific presentation and limited availability of diagnostics. Prospective AFI studies provide a methodology to describe the syndrome by age and etiology, findings that can be used to develop case definitions and multiplexed diagnostics to optimize management. We conducted a 3-year prospective AFI study in Puerto Rico. Patients with fever ≤7 days were offered enrollment, and clinical data and specimens were collected at enrollment and upon discharge or follow-up. Blood and oro-nasopharyngeal specimens were tested by RT-PCR and immunodiagnostic methods for infection with dengue viruses (DENV) 1–4, chikungunya virus (CHIKV), influenza A and B viruses (FLU A/B), 12 other respiratory viruses (ORV), enterovirus, Leptospira spp., and Burkholderia pseudomallei. Clinical presentation and laboratory findings of participants infected with DENV were compared to those infected with CHIKV, FLU A/B, and ORV. Clinical predictors of laboratory-positive dengue compared to all other AFI etiologies were determined by age and day post-illness onset (DPO) at presentation. Of 8,996 participants enrolled from May 7, 2012 through May 6, 2015, more than half (54.8%, 4,930) had a pathogen detected. Pathogens most frequently detected were CHIKV (1,635, 18.2%), FLU A/B (1,074, 11.9%), DENV 1–4 (970, 10.8%), and ORV (904, 10.3%). Participants with DENV infection presented later and a higher proportion were hospitalized than those with other diagnoses (46.7% versus 27.3% with ORV, 18.8% with FLU A/B, and 11.2% with CHIKV). Predictors of dengue in participants presenting <3 DPO included leukopenia, thrombocytopenia, headache, eye pain, nausea, and dizziness, while negative predictors were irritability and rhinorrhea. 
Predictors of dengue in participants presenting 3–5 DPO were leukopenia, thrombocytopenia, facial/neck erythema, nausea, eye pain, signs of poor circulation, and diarrhea; presence of rhinorrhea, cough, and red conjunctiva predicted non-dengue AFI. By enrolling febrile patients at clinical presentation, we identified unbiased predictors of laboratory-positive dengue as compared to other common causes of AFI. These findings can be used to assist in early identification of dengue patients, as well as direct anticipatory guidance and timely initiation of correct clinical management. PMID:28902845
Fragomeni, B O; Lourenco, D A L; Tsuruta, S; Masuda, Y; Aguilar, I; Misztal, I
2015-10-01
The purpose of this study was to examine the accuracy of genomic selection via single-step genomic BLUP (ssGBLUP) when the direct inverse of the genomic relationship matrix (G) is replaced by an approximation of G(-1) based on recursions for young genotyped animals conditioned on a subset of proven animals, termed the algorithm for proven and young animals (APY). With an efficient implementation, this algorithm has a cost that is cubic in the number of proven animals and linear in the number of young animals. Ten duplicate data sets mimicking a dairy cattle population were simulated. In a first scenario, genomic information for 20k genotyped bulls, divided into 7k proven and 13k young bulls, was generated for each replicate. In a second scenario, 5k genotyped cows with phenotypes were included in the analysis as young animals. Accuracies (averaged over the 10 replicates) in regular EBV were 0.72 and 0.34 for proven and young animals, respectively. When genomic information was included, they increased to 0.75 and 0.50. No differences were observed between genomic EBV (GEBV) obtained with the regular G(-1) and the approximated G(-1) via the recursive method. In the second scenario, accuracies in GEBV (0.76, 0.51 and 0.59 for proven bulls, young males and young females, respectively) were also higher than those in EBV (0.72, 0.35 and 0.49). Again, no differences between GEBV with regular G(-1) and with recursions were observed. With the recursive algorithm, the number of iterations to achieve convergence was reduced from 227 to 206 in the first scenario and from 232 to 209 in the second scenario. Cows can be treated as young animals in APY without reducing the accuracy. The proposed algorithm can be implemented to reduce computing costs and to overcome current limitations on the number of genotyped animals in the ssGBLUP method.
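The block structure behind APY-style recursion can be sketched numerically. This is a minimal illustration, not the authors' implementation: young animals' genomic values are conditioned on a core (proven) subset, and the conditional covariance among young animals is taken as diagonal, so only the core block needs a dense inverse. The toy G below is constructed so that this conditional independence holds exactly, which lets the assembled inverse be checked against the true one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_core, n_young = 5, 8

# Core (proven) block: a random symmetric positive-definite relationship matrix
A = rng.normal(size=(n_core, n_core))
G_cc = A @ A.T + n_core * np.eye(n_core)
G_yc = rng.normal(size=(n_young, n_core))        # young-core relationships
P = G_yc @ np.linalg.inv(G_cc)                   # recursion coefficients
D = np.diag(rng.uniform(0.5, 1.5, n_young))      # diagonal conditional variances

# Build G so that young animals are conditionally independent given the core;
# for such a G the APY-style inverse is exact (in general it is an approximation).
G_yy = G_yc @ P.T + D
G = np.block([[G_cc, G_yc.T], [G_yc, G_yy]])

# Assemble the inverse: only G_cc is inverted densely; the young part is diagonal.
Dinv = np.diag(1.0 / np.diag(D))
G_inv_apy = np.block([
    [np.linalg.inv(G_cc) + P.T @ Dinv @ P, -P.T @ Dinv],
    [-Dinv @ P,                             Dinv],
])

print(np.allclose(G_inv_apy @ G, np.eye(n_core + n_young)))  # True
```

The memory and compute savings come from never forming or inverting the dense young-young block, which is what makes the method scale to large numbers of genotyped animals.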
Atagi, Y; Onogi, A; Kinukawa, M; Ogino, A; Kurogi, K; Uchiyama, K; Yasumori, T; Adachi, K; Togashi, K; Iwata, H
2017-05-01
The semen production traits of bulls from 2 major cattle breeds in Japan, Holstein and Japanese Black, were analyzed comprehensively using genome-wide markers. Weaker genetic correlations were observed between the 2 age groups (1 to 3 yr old and 4 to 6 yr old) for semen volume and sperm motility compared with those observed for sperm number and motility after freeze-thawing. The preselection of collected semen for freezing had a limited effect. Given the increasing importance of bull proofs at a young age because of genomic selection, and the results from preliminary studies, we used a multiple-trait model that included motility after freeze-thawing with records collected at young ages. Based on variations in contemporary group effects, accounting for both seasonal and management factors, Holstein bulls may be more sensitive than Japanese Black bulls to seasonal environmental variations; however, the seasonal variations of contemporary group effects were smaller than the overall contemporary group effects. Improvements in motility, recorded immediately after collection and after freeze-thawing, were observed in recent years; thus, good management and a better freeze-thawing protocol may alleviate seasonal phenotypic differences. The detrimental effects of inbreeding were observed in all traits of both breeds; accordingly, the selection of candidate bulls with high inbreeding coefficients should be avoided per general recommendations. Semen production traits have never been considered for bull selection. However, negative genetic trends were observed. The magnitudes of the estimated heritabilities (h2) were comparable to those of other economically important traits. A single-step genomic BLUP will provide more accurate predictions of breeding values compared with BLUP; thus, marker genotype information is useful for estimating the genetic merits of bulls for semen production traits.
The selection of these traits would improve sperm viability, a component related to breeding success, and alleviate negative genetic trends.
Genotyping by sequencing for genomic prediction in a soybean breeding population.
Jarquín, Diego; Kocak, Kyle; Posadas, Luis; Hyma, Katie; Jedlicka, Joseph; Graef, George; Lorenz, Aaron
2014-08-29
Advances in genotyping technology, such as genotyping by sequencing (GBS), are making genomic prediction more attractive to reduce breeding cycle times and costs associated with phenotyping. Genomic prediction and selection have been studied in several crop species, but no reports exist in soybean. The objectives of this study were (i) to evaluate prospects for genomic selection using GBS in a typical soybean breeding program and (ii) to evaluate the effect of GBS marker selection and imputation on genomic prediction accuracy. To achieve these objectives, a set of soybean lines sampled from the University of Nebraska Soybean Breeding Program were genotyped using GBS and evaluated for yield and other agronomic traits at multiple Nebraska locations. Genotyping by sequencing scored 16,502 single nucleotide polymorphisms (SNPs) with minor-allele frequency (MAF) > 0.05 and percentage of missing values ≤ 5% on 301 elite soybean breeding lines. When SNPs with up to 80% missing values were included, 52,349 SNPs were scored. Prediction accuracy for grain yield, assessed using cross validation, was estimated to be 0.64, indicating good potential for using genomic selection for grain yield in soybean. Filtering SNPs based on missing data percentage had little to no effect on prediction accuracy, especially when random forest imputation was used to impute missing values. The highest accuracies were observed when random forest imputation was used on all SNPs, but differences were not significant. A standard additive G-BLUP model was robust; modeling additive-by-additive epistasis did not provide any improvement in prediction accuracy. The effect of training population size on accuracy began to plateau around 100, but accuracy steadily climbed until the largest possible size was used in this analysis. Including only SNPs with MAF > 0.30 provided higher accuracies when training populations were smaller. 
Using GBS for genomic prediction in soybean holds good potential to expedite genetic gain. Our results suggest that standard additive G-BLUP models can be used on unfiltered, imputed GBS data without loss in accuracy.
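A standard additive G-BLUP of the kind evaluated above can be sketched in a few lines. The code below is a simulation-based illustration with assumed parameter values, not the study's pipeline: it builds a VanRaden-style genomic relationship matrix from SNP genotypes and predicts held-out lines by solving the mixed-model equations in kernel form.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 300, 1000                       # lines, SNP markers (assumed sizes)
p = rng.uniform(0.05, 0.5, m)          # allele frequencies
M = rng.binomial(2, p, size=(n, m)).astype(float)   # genotypes coded 0/1/2

# VanRaden genomic relationship matrix
Z = M - 2 * p
G = Z @ Z.T / (2 * np.sum(p * (1 - p)))

# Simulate phenotypes: standardized true genetic values plus unit noise (h2 ~ 0.5)
beta = rng.normal(0, 1, m)
g_true = Z @ beta
g_true = (g_true - g_true.mean()) / g_true.std()
y = g_true + rng.normal(0, 1, n)

# G-BLUP in kernel form: solve (G_tt + lambda*I) a = y_t, predict g_hat = G_vt a
lam = 1.0                              # sigma_e^2 / sigma_g^2 for h2 = 0.5
train = np.arange(200)                 # fit on 200 lines, predict the other 100
test = np.arange(200, n)
a = np.linalg.solve(G[np.ix_(train, train)] + lam * np.eye(len(train)),
                    y[train] - y[train].mean())
g_hat_test = G[np.ix_(test, train)] @ a

acc = np.corrcoef(g_hat_test, g_true[test])[0, 1]
print(round(acc, 2))                   # cross-validation-style predictive accuracy
```

In practice lambda would be estimated by REML rather than fixed, and accuracy would be averaged over many cross-validation folds.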
Belay, T K; Dagnachew, B S; Kowalski, Z M; Ådnøy, T
2017-08-01
Fourier transform mid-infrared (FT-MIR) spectra of milk are commonly used for phenotyping traits of interest through links developed between the traits and the milk FT-MIR spectra. Predicted traits are then used in genetic analysis for ultimate phenotypic prediction using a single-trait mixed model that accounts for cows' circumstances at a given test day. Here, this approach is referred to as indirect prediction (IP). Alternatively, the FT-MIR spectral variables can be kept multivariate in the form of factor scores in REML and BLUP analyses. These BLUP predictions, including the phenotype (predicted factor scores), were then converted to a single trait using the calibration outputs; this method is referred to as direct prediction (DP). The main aim of this study was to verify whether mixed modeling of milk spectra in the form of factor scores (DP) gives better prediction of blood β-hydroxybutyrate (BHB) than the univariate approach (IP). Models to predict blood BHB from milk spectra were also developed. Two data sets that contained milk FT-MIR spectra and other information on Polish dairy cattle were used in this study. Data set 1 (n = 826) also contained BHB measured in blood samples, whereas data set 2 (n = 158,028) did not contain measured blood values. Part of data set 1 was used to calibrate a prediction model (n = 496) and the remaining part of data set 1 (n = 330) was used to validate the calibration models, as well as to evaluate the DP and IP approaches. Dimensions of FT-MIR spectra in data set 2 were reduced either into 5 or 10 factor scores (DP) or into a single trait (IP) with calibration outputs. The REML estimates for these factor scores were found using WOMBAT. The BLUP values and predicted BHB for observations in the validation set were computed using the REML estimates. Blood BHB predicted from milk FT-MIR spectra by both approaches was regressed on reference blood BHB that had not been used in the model development. 
Coefficients of determination in cross-validation for untransformed blood BHB were from 0.21 to 0.32, whereas those for log-transformed BHB were from 0.31 to 0.38. The corresponding estimates in validation were from 0.29 to 0.37 and from 0.21 to 0.43 for untransformed and log-transformed BHB, respectively. Contrary to expectation, slightly better predictions of BHB were found when the univariate variance structure was used (IP) than when multivariate covariance structures were used (DP). Conclusive remarks on the importance of keeping spectral data in multivariate form for phenotype prediction may require data sets in which the trait of interest has strong relationships with the spectral variables.
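The contrast between reducing spectra to multivariate factor scores and then calibrating a trait can be illustrated with a simplified sketch. Here, simulated spectra and ordinary least squares stand in for the FT-MIR data and the REML/BLUP machinery; all names and parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, wavenumbers = 400, 120
# Simulated spectra: a few latent components plus measurement noise
scores_true = rng.normal(size=(n, 5))
loadings = rng.normal(size=(5, wavenumbers))
spectra = scores_true @ loadings + 0.1 * rng.normal(size=(n, wavenumbers))
# Trait (a stand-in for blood BHB) linked to the first latent component
bhb = 0.8 * scores_true[:, 0] + 0.3 * rng.normal(size=n)

# Reduce spectra to factor scores via PCA (SVD on centered spectra)
X = spectra - spectra.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 5
F = U[:, :k] * s[:k]                   # first k factor scores per sample

# Calibration: least squares of the trait on the factor scores
train, test = np.arange(300), np.arange(300, n)
A = np.column_stack([np.ones(len(train)), F[train]])
coef, *_ = np.linalg.lstsq(A, bhb[train], rcond=None)
pred = np.column_stack([np.ones(len(test)), F[test]]) @ coef

r2 = np.corrcoef(pred, bhb[test])[0, 1] ** 2    # validation R^2
print(round(r2, 2))
```

In the paper's DP approach, the factor scores themselves would additionally be modeled with a multivariate mixed model before conversion back to the trait scale.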
Spectroscopic observation of SN 2017jzp and SN 2018bf by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Kuncarayakti, H.; Mattila, S.; Kotak, R.; Harmanen, J.; Reynolds, T.; Wyrzykowski, L.; Stritzinger, M.; Onori, F.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.
2018-01-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of SNe 2017jzp and 2018bf in host galaxies KUG 1326+679 and SDSS J225746.53+253833.5, respectively.
NASA Astrophysics Data System (ADS)
Harmanen, J.; Mattila, S.; Kuncarayakti, H.; Reynolds, T.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.; Dong, S.; Pastorello, A.; Pursimo, T.; NUTS Collaboration
2017-10-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17nb in MCG+06-17-007 and CSS170922:172546+342249 in an unknown host galaxy.
Losing the rose tinted glasses: neural substrates of unbiased belief updating in depression
Garrett, Neil; Sharot, Tali; Faulkner, Paul; Korn, Christoph W.; Roiser, Jonathan P.; Dolan, Raymond J.
2014-01-01
Recent evidence suggests that a state of good mental health is associated with biased processing of information that supports a positively skewed view of the future. Depression, on the other hand, is associated with unbiased processing of such information. Here, we use brain imaging in conjunction with a belief update task administered to clinically depressed patients and healthy controls to characterize brain activity that supports unbiased belief updating in clinically depressed individuals. Our results reveal that unbiased belief updating in depression is mediated by strong neural coding of estimation errors in response to both good news (in left inferior frontal gyrus and bilateral superior frontal gyrus) and bad news (in right inferior parietal lobule and right inferior frontal gyrus) regarding the future. In contrast, intact mental health was linked to a relatively attenuated neural coding of bad news about the future. These findings identify a neural substrate mediating the breakdown of biased updating in major depressive disorder, which may be essential for mental health. PMID:25221492
Allowable SEM noise for unbiased LER measurement
NASA Astrophysics Data System (ADS)
Papavieros, George; Constantoudis, Vassilios; Gogolides, Evangelos
2018-03-01
Recently, a novel method for the calculation of unbiased line-edge roughness (LER) based on power spectral density (PSD) analysis has been proposed. In this paper, we first discuss and investigate an alternative method that utilizes the height-height correlation function (HHCF) of the edges. The HHCF-based method enables unbiased determination of the whole triplet of LER parameters: besides the rms value, the correlation length and the roughness exponent. The key to both methods is the sensitivity of the PSD and the HHCF to noise at high frequencies and short distances, respectively. Secondly, we develop a testbed of synthesized SEM images with controlled LER and noise to justify the effectiveness of the proposed unbiased methods. Our main objective is to find the boundaries, with respect to noise levels and roughness characteristics, within which the methods remain reliable, i.e., the maximum amount of noise for which the output results agree with the known, controlled inputs. At the same time, we also establish the extremes of the roughness parameters for which the methods retain their accuracy.
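The short-distance noise sensitivity of the HHCF that such methods exploit can be shown numerically. In the sketch below (an illustration with assumed roughness parameters, not the authors' code), white edge-detection noise of rms sigma adds an offset of approximately 2*sigma^2 to the HHCF of the clean edge at every nonzero distance; an unbiased method must estimate and subtract exactly this offset.

```python
import numpy as np

def hhcf(h, max_lag):
    """Height-height correlation function G(r) = <(h(x+r) - h(x))^2>."""
    return np.array([np.mean((h[r:] - h[:-r]) ** 2)
                     for r in range(1, max_lag + 1)])

rng = np.random.default_rng(3)
n, xi, sigma = 4096, 20.0, 1.5         # points, correlation length, rms roughness
# Rough edge model: white noise smoothed by an exponential kernel, rescaled to rms
x = np.arange(n)
kernel = np.exp(-np.abs(x - n // 2) / xi)
edge = np.convolve(rng.normal(size=n), kernel, mode="same")
edge *= sigma / edge.std()

noise = 0.8                            # SEM edge-detection noise (rms), assumed
edge_noisy = edge + rng.normal(0, noise, n)

g_clean = hhcf(edge, 100)
g_noisy = hhcf(edge_noisy, 100)
# The noisy HHCF sits above the clean one by about 2 * noise^2 at all lags,
# which is the bias an unbiased LER method removes.
offset = np.mean(g_noisy[50:] - g_clean[50:])
print(offset, 2 * noise ** 2)
```

The same additive structure appears in the PSD as a flat noise floor at high frequencies, which is why both representations allow the bias to be separated from the true roughness.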
Mastitis of periparturient Holstein cattle: a phenotypic and genetic study.
Detilleux, J C; Kehrli, M E; Freeman, A E; Fox, L K; Kelley, D H
1995-10-01
Environmental and genetic factors affecting somatic cell scores, clinical mastitis, and IMI by minor and major pathogens were studied on 137 periparturient Holstein cows selected for milk production. Environmental effects were obtained by generalized least squares and logistic regression. Genetic parameters were from BLUP and threshold animal models. Lactation number affected the number of quarters with clinical mastitis and the number of quarters infected with minor pathogens. The DIM affected somatic cell score and number of quarters infected with major pathogens. Heritabilities for all mastitis indicators averaged 10%, but differences occurred among the indicators. Correlations between breeding values of the number of quarters infected with minor pathogens and the number infected with major pathogens were antagonistic and statistically significant.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Combescure, Monique
2009-03-15
In our previous paper [Combescure, M., "Circulant matrices, Gauss sums and the mutually unbiased bases. I. The prime number case," Cubo A Mathematical Journal (unpublished)] we have shown that the theory of circulant matrices allows one to recover the result that there exist p+1 mutually unbiased bases in dimension p, p being an arbitrary prime number. Two orthonormal bases B, B′ of C^d are said to be mutually unbiased if for all b ∈ B and all b′ ∈ B′ one has |b·b′| = 1/√d (b·b′ being the Hermitian scalar product in C^d). In this paper we show that the theory of block-circulant matrices with circulant blocks allows one to show very simply the known result that if d = p^n (p a prime number and n any integer) there exist d+1 mutually unbiased bases in C^d. Our result relies heavily on an idea of Klimov et al. ["Geometrical approach to the discrete Wigner function," J. Phys. A 39, 14471 (2006)]. As a by-product we recover properties of quadratic Weil sums for p ≥ 3, which generalizes the fact that in the prime case the quadratic Gauss sum properties follow from our results.
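The existence of d+1 mutually unbiased bases in prime dimension can be checked numerically. The sketch below uses the standard quadratic-exponent (Gauss-sum) construction rather than the circulant-matrix derivation of the paper: for an odd prime p, the computational basis together with the p bases whose vectors have components ω^(a·j²+k·j)/√p (ω = e^(2πi/p)) are pairwise mutually unbiased.

```python
import numpy as np

p = 7                                        # any odd prime
omega = np.exp(2j * np.pi / p)
j = np.arange(p)

# Standard basis plus p bases with vectors v_{a,k}[j] = omega^(a*j^2 + k*j)/sqrt(p)
bases = [np.eye(p, dtype=complex)]
for a in range(p):
    B = np.array([[omega ** (a * jj * jj + k * jj) for jj in j]
                  for k in range(p)]).T / np.sqrt(p)   # columns are basis vectors
    bases.append(B)

# Each basis is orthonormal ...
for B in bases:
    assert np.allclose(B.conj().T @ B, np.eye(p))
# ... and every pair of bases is mutually unbiased: |<b|b'>| = 1/sqrt(p)
for i in range(len(bases)):
    for l in range(i + 1, len(bases)):
        overlaps = np.abs(bases[i].conj().T @ bases[l])
        assert np.allclose(overlaps, 1 / np.sqrt(p))
print(f"{len(bases)} = d+1 mutually unbiased bases verified for d = {p}")
```

The cross-basis overlap condition reduces to the magnitude of a quadratic Gauss sum, |Σ_j ω^(c·j²+e·j)| = √p for c ≢ 0 (mod p), which is exactly the number-theoretic fact the paper's approach recovers.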
Examinations of tRNA Range of Motion Using Simulations of Cryo-EM Microscopy and X-Ray Data
Caulfield, Thomas R.; Devkota, Batsal; Rollins, Geoffrey C.
2011-01-01
We examined tRNA flexibility using a combination of steered and unbiased molecular dynamics simulations. Using Maxwell's demon algorithm, molecular dynamics was used to steer X-ray structure data toward that from an alternative state obtained from cryogenic-electron microscopy density maps. Thus, we were able to fit X-ray structures of tRNA onto cryogenic-electron microscopy density maps for hybrid states of tRNA. Additionally, we employed both Maxwell's demon molecular dynamics simulations and unbiased simulation methods to identify possible ribosome-tRNA contact areas where the ribosome may discriminate tRNAs during translation. Herein, we collected >500 ns of simulation data to assess the global range of motion for tRNAs. Biased simulations can be used to steer between known conformational stop points, while unbiased simulations allow for a general testing of conformational space previously unexplored. The unbiased molecular dynamics data describe the global conformational changes of tRNA on a sub-microsecond time scale for comparison with steered data. Additionally, the unbiased molecular dynamics data were used to identify putative contacts between tRNA and the ribosome during the accommodation step of translation. We found that the primary contact regions were H71 and H92 of the 50S subunit and ribosomal proteins L14 and L16. PMID:21716650
Unbiased Estimates of Variance Components with Bootstrap Procedures
ERIC Educational Resources Information Center
Brennan, Robert L.
2007-01-01
This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…
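For context, the simplest case of the kind of adjustment involved (a textbook special case, not Brennan's general procedure): the bootstrap plug-in variance divides by n and so has expectation (n-1)/n · σ², and multiplying by n/(n-1) restores unbiasedness. A minimal sketch:

```python
def plug_in_variance(xs):
    """Biased (maximum-likelihood / bootstrap plug-in) variance: divide by n."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

def unbiased_variance(xs):
    """Unbiased sample variance: divide by n - 1."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
# The multiplicative correction n/(n-1) removes the plug-in bias exactly.
assert abs(unbiased_variance(xs) - plug_in_variance(xs) * n / (n - 1)) < 1e-12
```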
Definition and Measurement of Selection Bias: From Constant Ratio to Constant Difference
ERIC Educational Resources Information Center
Cahan, Sorel; Gamliel, Eyal
2006-01-01
Despite its intuitive appeal and popularity, Thorndike's constant ratio (CR) model for unbiased selection is inherently inconsistent in "n"-free selection. Satisfaction of the condition for unbiased selection, when formulated in terms of success/acceptance probabilities, usually precludes satisfaction by the converse probabilities of…
Spectroscopic observation of Gaia17dht and Gaia17diu by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Fraser, M.; Dyrbye, S.; Cappella, E.
2017-12-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of Gaia17dht/SN2017izz and Gaia17diu/SN2017jdb (in host galaxies SDSS J145121.24+283521.6 and LEDA 2753585 respectively).
Statistics as Unbiased Estimators: Exploring the Teaching of Standard Deviation
ERIC Educational Resources Information Center
Wasserman, Nicholas H.; Casey, Stephanie; Champion, Joe; Huey, Maryann
2017-01-01
This manuscript presents findings from a study about the knowledge for and planned teaching of standard deviation. We investigate how understanding variance as an unbiased (inferential) estimator--not just a descriptive statistic for the variation (spread) in data--is related to teachers' instruction regarding standard deviation, particularly…
Unbiased symmetric metrics provide a useful measure to quickly compare two datasets, with similar interpretations for both under- and overestimation. Two examples include the normalized mean bias factor and normalized mean absolute error factor. However, the original formulations...
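A sketch of the two metrics as commonly defined (the truncated passage does not show the exact formulations, so treat these forms as an assumption): the bias factor normalizes by the observed total when the model overestimates and by the modeled total when it underestimates, which is what makes the metric symmetric.

```python
def nmbf(model, obs):
    """Normalized mean bias factor: equal-magnitude, opposite-sign values
    for a factor-of-k overestimate and a factor-of-k underestimate."""
    sm, so = sum(model), sum(obs)
    return sm / so - 1.0 if sm >= so else 1.0 - so / sm

def nmaef(model, obs):
    """Normalized mean absolute error factor: normalize by the smaller total."""
    sm, so = sum(model), sum(obs)
    err = sum(abs(m - o) for m, o in zip(model, obs))
    return err / so if sm >= so else err / sm

obs = [1.0, 2.0, 3.0]
# Doubling and halving every value give NMBF = +1 and -1 respectively.
assert abs(nmbf([2 * o for o in obs], obs) - 1.0) < 1e-12
assert abs(nmbf([o / 2 for o in obs], obs) + 1.0) < 1e-12
```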
Mutually unbiased bases in six dimensions: The four most distant bases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Raynal, Philippe; Lue Xin; Englert, Berthold-Georg
2011-06-15
We consider the average distance between four bases in six dimensions. The distance between two orthonormal bases vanishes when the bases are the same, and the distance reaches its maximal value of unity when the bases are unbiased. We perform a numerical search for the maximum average distance and find it to be strictly smaller than unity. This is strong evidence that no four mutually unbiased bases exist in six dimensions. We also provide a two-parameter family of three bases which, together with the canonical basis, reach the numerically found maximum of the average distance, and we conduct a detailed study of the structure of the extremal set of bases.
Extreme Mean and Its Applications
NASA Technical Reports Server (NTRS)
Swaroop, R.; Brownlow, J. D.
1979-01-01
Extreme value statistics obtained from normally distributed data are considered. An extreme mean is defined as the mean of the p-th probability truncated normal distribution. An unbiased estimate of this extreme mean and its large sample distribution are derived. The distribution of this estimate, even for very large samples, is found to be nonnormal. Further, as the sample size increases, the variance of the unbiased estimate converges to the Cramer-Rao lower bound. The computer program used to obtain the density and distribution functions of the standardized unbiased estimate, and the confidence intervals of the extreme mean for any data, is included for ready application. An example is included to demonstrate the usefulness of extreme mean application.
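As an illustration of the quantity being estimated, and assuming the "p-th probability truncated" distribution means the upper (1-p) tail of a standard normal (an interpretation, not stated in the abstract), the tail mean has the closed form φ(z_p)/(1-p), the inverse Mills ratio at the p-th quantile:

```python
import math
from statistics import NormalDist

def extreme_mean(p):
    """Mean of a standard normal truncated to its upper (1 - p) tail:
    E[X | X > z_p] = phi(z_p) / (1 - p)."""
    nd = NormalDist()
    z = nd.inv_cdf(p)      # the p-th quantile z_p
    return nd.pdf(z) / (1.0 - p)

# Truncating at the median (p = 0.5) gives the half-normal mean sqrt(2/pi).
assert abs(extreme_mean(0.5) - math.sqrt(2 / math.pi)) < 1e-9
```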
Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design
ERIC Educational Resources Information Center
Smith, William C.
2014-01-01
The ability of regression discontinuity (RD) designs to provide an unbiased treatment effect while overcoming the ethical concerns plagued by Random Control Trials (RCTs) make it a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education…
Five instruments for measuring tree height: an evaluation
Michael S. Williams; William A. Bechtold; V.J. LaBau
1994-01-01
Five instruments were tested for reliability in measuring tree heights under realistic conditions. Four linear models were used to determine if tree height can be measured unbiasedly over all tree sizes and if any of the instruments were more efficient in estimating tree height. The laser height finder was the only instrument to produce unbiased estimates of the true...
NASA Astrophysics Data System (ADS)
Dong, Subo; Bose, Subhash; Stritzinger, M.; Holmbo, S.; Fraser, M.; Fedorets, G.
2017-10-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ATLAS17lcs (SN 2017guv) and ASASSN-17mq (AT 2017gvo) in host galaxies 2MASX J19132225-1648031 and CGCG 225-050, respectively.
NASA Astrophysics Data System (ADS)
Pastorello, Andrea; Benetti, Stefano; Cappellaro, Enrico; Terreran, Giacomo; Tomasella, Lina; Fedorets, Grigori; NUTS Collaboration
2017-07-01
The Nordic Optical Telescope (NOT) Unbiased Transient Survey (NUTS; ATel #8992) reports the spectroscopic classification of ASASSN-17io in the galaxy CGCG 316-010, along with the reclassification of ATLAS17hpt (SN 2017faf), which was previously classified as a SLSN-I (ATel #10549).
NASA Technical Reports Server (NTRS)
Tilton, J. C.; Swain, P. H. (Principal Investigator); Vardeman, S. B.
1981-01-01
A key input to a statistical classification algorithm, which exploits the tendency of certain ground cover classes to occur more frequently in some spatial context than in others, is a statistical characterization of the context: the context distribution. An unbiased estimator of the context distribution is discussed which, besides having the advantage of statistical unbiasedness, has the additional advantage over other estimation techniques of being amenable to an adaptive implementation in which the context distribution estimate varies according to local contextual information. Results from applying the unbiased estimator to the contextual classification of three real LANDSAT data sets are presented and contrasted with results from non-contextual classifications and from contextual classifications utilizing other context distribution estimation techniques.
Estimation of the simple correlation coefficient.
Shieh, Gwowen
2010-11-01
This article investigates some unfamiliar properties of the Pearson product-moment correlation coefficient for the estimation of the simple correlation coefficient. Although Pearson's r is biased, except in limited situations, and the minimum variance unbiased estimator has been proposed in the literature, researchers routinely employ the sample correlation coefficient in their practical applications because of its simplicity and popularity. In order to support such practice, this study examines the mean squared errors of r and several prominent formulas. The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating the recommendation of r as an effect size index for the strength of linear association between two variables. In addition, related issues of estimating the squared simple correlation coefficient are also considered.
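One prominent nearly unbiased formula of the kind compared here is the Olkin-Pratt approximation r* = r[1 + (1 - r²)/(2(n - 3))]; whether it is among the exact estimators examined in this study is an assumption, but it shows the shape of the correction, which expands r away from zero to offset its bias toward zero:

```python
def olkin_pratt(r, n):
    """Olkin-Pratt approximately unbiased estimator of the correlation:
    r* = r * [1 + (1 - r^2) / (2 * (n - 3))]."""
    if n <= 3:
        raise ValueError("requires n > 3")
    return r * (1.0 + (1.0 - r * r) / (2.0 * (n - 3)))

# The correction matters for small samples and vanishes as n grows.
assert abs(olkin_pratt(0.5, 10) - 0.5267857142857143) < 1e-12
assert abs(olkin_pratt(0.5, 1000) - 0.5) < 1e-3
```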
Wells, Brian J; Chagin, Kevin M; Li, Liang; Hu, Bo; Yu, Changhong; Kattan, Michael W
2015-03-01
With the integration of electronic health records (EHRs), health data have become easily accessible and abundant. The EHR has the potential to provide important healthcare information to researchers by creating study cohorts. However, accessing this information comes with three major issues: 1) predictor variables often change over time, 2) patients have various lengths of follow-up within the EHR, and 3) the size of the EHR data can be computationally challenging. Landmark analyses provide a perfect complement to EHR data and help to alleviate these three issues. We present two examples that utilize patient birthdays as landmark times for creating dynamic datasets for predicting clinical outcomes. The use of landmark times helps to solve these three issues by incorporating information that changes over time, by creating unbiased reference points that are not related to a patient's exposure within the EHR, and by reducing the size of a dataset compared to a true time-varying analysis. These techniques are shown using two example cohort studies from the Cleveland Clinic that utilized 4.5 million and 17,787 unique patients, respectively.
Functional mixed effects spectral analysis
KRAFTY, ROBERT T.; HALL, MARTICA; GUO, WENSHENG
2011-01-01
In many experiments, time series data can be collected from multiple units and multiple time series segments can be collected from the same unit. This article introduces a mixed effects Cramér spectral representation which can be used to model the effects of design covariates on the second-order power spectrum while accounting for potential correlations among the time series segments collected from the same unit. The transfer function is composed of a deterministic component to account for the population-average effects and a random component to account for the unit-specific deviations. The resulting log-spectrum has a functional mixed effects representation where both the fixed effects and random effects are functions in the frequency domain. It is shown that, when the replicate-specific spectra are smooth, the log-periodograms converge to a functional mixed effects model. A data-driven iterative estimation procedure is offered for the periodic smoothing spline estimation of the fixed effects, penalized estimation of the functional covariance of the random effects, and unit-specific random effects prediction via the best linear unbiased predictor. PMID:26855437
Software engineering the mixed model for genome-wide association studies on large samples.
Zhang, Zhiwu; Buckler, Edward S; Casstevens, Terry M; Bradbury, Peter J
2009-11-01
Mixed models improve the ability to detect phenotype-genotype associations in the presence of population stratification and multiple levels of relatedness in genome-wide association studies (GWAS), but for large data sets the resource consumption becomes impractical. At the same time, the sample size and number of markers used for GWAS is increasing dramatically, resulting in greater statistical power to detect those associations. The use of mixed models with increasingly large data sets depends on the availability of software for analyzing those models. While multiple software packages implement the mixed model method, no single package provides the best combination of fast computation, ability to handle large samples, flexible modeling and ease of use. Key elements of association analysis with mixed models are reviewed, including modeling phenotype-genotype associations using mixed models, population stratification, kinship and its estimation, variance component estimation, use of best linear unbiased predictors or residuals in place of raw phenotype, improving efficiency and software-user interaction. The available software packages are evaluated, and suggestions made for future software development.
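The best linear unbiased predictors mentioned above are solutions of Henderson's mixed model equations. A minimal pure-Python sketch for y = Xβ + Zu + e with a known variance ratio λ = σ²_e/σ²_u (toy data, not any package's API):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def blup(X, Z, y, lam):
    """Henderson's mixed model equations for y = X*beta + Z*u + e:
        [X'X   X'Z        ] [beta]   [X'y]
        [Z'X   Z'Z + lam*I] [ u  ] = [Z'y]
    with lam = sigma_e^2 / sigma_u^2 assumed known."""
    t = lambda M: [list(col) for col in zip(*M)]
    mm = lambda A, B: [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                       for row in A]
    mv = lambda A, v: [sum(a * b for a, b in zip(row, v)) for row in A]
    Xt, Zt = t(X), t(Z)
    p, q = len(Xt), len(Zt)
    XtX, XtZ, ZtX, ZtZ = mm(Xt, X), mm(Xt, Z), mm(Zt, X), mm(Zt, Z)
    for i in range(q):
        ZtZ[i][i] += lam
    A = [XtX[i] + XtZ[i] for i in range(p)] + [ZtX[i] + ZtZ[i] for i in range(q)]
    rhs = mv(Xt, y) + mv(Zt, y)
    sol = solve(A, rhs)
    return sol[:p], sol[p:]

# Four observations, one overall mean, two group effects (toy numbers).
X = [[1.0], [1.0], [1.0], [1.0]]
Z = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
y = [1.0, 2.0, 3.0, 4.0]
beta, u = blup(X, Z, y, lam=1.0)

# Group means 1.5 and 3.5 are shrunk toward the grand mean 2.5: u = (-2/3, +2/3).
assert abs(beta[0] - 2.5) < 1e-9
assert abs(u[0] + 2.0 / 3.0) < 1e-9 and abs(u[1] - 2.0 / 3.0) < 1e-9
```

The shrinkage of group means toward the grand mean is what distinguishes the BLUP of a random effect from the BLUE of a fixed effect; GWAS packages exploit the same equations at much larger scale.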
Use of generalized linear models and digital data in a forest inventory of Northern Utah
Moisen, Gretchen G.; Edwards, Thomas C.
1999-01-01
Forest inventories, like those conducted by the Forest Service's Forest Inventory and Analysis Program (FIA) in the Rocky Mountain Region, are under increased pressure to produce better information at reduced costs. Here we describe our efforts in Utah to merge satellite-based information with forest inventory data for the purposes of reducing the costs of estimates of forest population totals and providing spatial depiction of forest resources. We illustrate how generalized linear models can be used to construct approximately unbiased and efficient estimates of population totals while providing a mechanism for prediction in space for mapping of forest structure. We model forest type and timber volume of five tree species groups as functions of a variety of predictor variables in the northern Utah mountains. Predictor variables include elevation, aspect, slope, geographic coordinates, as well as vegetation cover types based on satellite data from both the Advanced Very High Resolution Radiometer (AVHRR) and Thematic Mapper (TM) platforms. We examine the relative precision of estimates of area by forest type and mean cubic-foot volumes under six different models, including the traditional double sampling for stratification strategy. Only very small gains in precision were realized through the use of expensive photointerpreted or TM-based data for stratification, while models based on topography and spatial coordinates alone were competitive. We also compare the predictive capability of the models through various map accuracy measures. The models including the TM-based vegetation performed best overall, while topography and spatial coordinates alone provided substantial information at very low cost.
Key seabird areas in southern New England identified using a community occupancy model
O'Connell, Allan F.; Flanders, Nicholas P.; Gardner, Beth; Winiarski, Kristopher J.; Paton, Peter W. C.; Allison, Taber
2015-01-01
Seabirds are of conservation concern, and as new potential risks to seabirds are arising, the need to provide unbiased estimates of species’ distributions is growing. We applied community occupancy models to detection/non-detection data collected from repeated aerial strip-transect surveys conducted in 2 large study plots off southern New England, USA; one off the coast of Rhode Island and the other in Nantucket Sound. A total of 17 seabird species were observed at least once in each study plot. We found that detection varied by survey date and effort for most species and the average detection probability across species was less than 0.4. We estimated the influence of water depth, sea surface temperature, and sea surface chl a concentration on species-specific occupancy. Diving species showed large differences between the 2 study plots in their predicted winter distributions, which were largely explained by water depth acting as a stronger predictor of occupancy in Rhode Island than in Nantucket Sound. Conversely, similarities between the 2 study plots in predicted winter distributions of surface-feeding species were explained by sea surface temperature or chlorophyll a concentration acting as predictors of these species’ occupancy in both study plots. We predicted the number of species at each site using the observed data in order to detect ‘hot-spots’ of seabird diversity and use in the 2 study plots. These results provide new information on detection of species, areas of use, and relationships with environmental variables that will be valuable for biologists and planners interested in seabird conservation in the region.
Big data and computational biology strategy for personalized prognosis.
Ow, Ghim Siong; Tang, Zhiqun; Kuznetsov, Vladimir A
2016-06-28
The era of big data and precision medicine has led to the accumulation of massive datasets of gene expression data and clinical information of patients. For a new patient, we propose that identification of a highly similar reference patient from an existing patient database via similarity matching of both clinical and expression data could be useful for predicting the prognostic risk or therapeutic efficacy. Here, we propose a novel methodology to predict disease/treatment outcome via analysis of the similarity between any pair of patients who are each characterized by a certain set of pre-defined biological variables (biomarkers or clinical features) represented initially as a prognostic binary variable vector (PBVV) and subsequently transformed to a prognostic signature vector (PSV). Our analyses revealed that the Euclidean distance, rather than the correlation distance measure, was effective in defining an unbiased similarity measure calculated between two PSVs. We applied our methods to high-grade serous ovarian cancer (HGSC) based on a 36-mRNA predictor that was previously shown to stratify patients into 3 distinct prognostic subgroups. We studied and revealed that patient's age, when converted into a binary variable, was positively correlated with the overall risk of succumbing to the disease. When applied to an independent testing dataset, the inclusion of age into the molecular predictor provided more robust personalized prognosis of overall survival correlated with the therapeutic response of HGSC and provided benefit for treatment targeting of the tumors in HGSC patients. Finally, our method can be generalized and implemented in many other diseases to accurately predict personalized patients' outcomes.
Maximal Unbiased Benchmarking Data Sets for Human Chemokine Receptors and Comparative Analysis.
Xia, Jie; Reid, Terry-Elinor; Wu, Song; Zhang, Liangren; Wang, Xiang Simon
2018-05-29
Chemokine receptors (CRs) have long been druggable targets for the treatment of inflammatory diseases and HIV-1 infection. As a powerful technique, virtual screening (VS) has been widely applied to identifying small molecule leads for modern drug targets including CRs. For rational selection of a wide variety of VS approaches, ligand enrichment assessment based on a benchmarking data set has become an indispensable practice. However, the lack of versatile benchmarking sets for the whole CR family that are able to unbiasedly evaluate every single approach, including both structure- and ligand-based VS, somewhat hinders modern drug discovery efforts. To address this issue, we constructed Maximal Unbiased Benchmarking Data sets for human Chemokine Receptors (MUBD-hCRs) using our recently developed tool, MUBD-DecoyMaker. The MUBD-hCRs encompass 13 subtypes out of 20 chemokine receptors, comprising 404 ligands and 15756 decoys so far, and are readily expandable in the future. It has been thoroughly validated that MUBD-hCRs ligands are chemically diverse while the decoys are maximally unbiased in terms of "artificial enrichment" and "analogue bias". In addition, we studied the performance of MUBD-hCRs, in particular the CXCR4 and CCR5 data sets, in ligand enrichment assessments of both structure- and ligand-based VS approaches in comparison with other benchmarking data sets available in the public domain, and demonstrated that MUBD-hCRs is very capable of designating the optimal VS approach. MUBD-hCRs is a unique and maximally unbiased benchmarking set that covers major CR subtypes so far.
Critical point relascope sampling for unbiased volume estimation of downed coarse woody debris
Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey; Mark J. Ducey
2005-01-01
Critical point relascope sampling is developed and shown to be design-unbiased for the estimation of log volume when used with point relascope sampling for downed coarse woody debris. The method is closely related to critical height sampling for standing trees when trees are first sampled with a wedge prism. Three alternative protocols for determining the critical...
Retransformation bias in a stem profile model
Raymond L. Czaplewski; David Bruce
1990-01-01
An unbiased profile model, fit to diameter divided by diameter at breast height, overestimated volume of 5.3-m log sections by 0.5 to 3.5%. Another unbiased profile model, fit to squared diameter divided by squared diameter at breast height, underestimated bole diameters by 0.2 to 2.1%. These biases are caused by retransformation of the predicted dependent variable;...
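The retransformation bias described arises because an unbiased prediction does not survive a nonlinear back-transformation (Jensen's inequality): the square root of a mean is not the mean of the square roots. A made-up numeric illustration (the direction and size of the bias in the actual profile models depend on the model and error variance):

```python
import math

# Two stems with squared diameters 4 and 16 (units arbitrary, values made up).
d_squared = [4.0, 16.0]
diameters = [math.sqrt(v) for v in d_squared]  # 2.0 and 4.0

mean_of_sqrt = sum(diameters) / len(diameters)             # E[d] = 3.0
sqrt_of_mean = math.sqrt(sum(d_squared) / len(d_squared))  # sqrt(E[d^2]) ~ 3.162

# Back-transforming an unbiased d^2 prediction does not reproduce the mean
# diameter, so predictions on a transformed scale need a retransformation
# correction.
assert sqrt_of_mean > mean_of_sqrt
```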
Unbiased nonorthogonal bases for tomographic reconstruction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sainz, Isabel; Klimov, Andrei B.; Roa, Luis
2010-05-15
We have developed a general method for constructing a set of nonorthogonal bases with equal separations between all different basis states in prime dimensions. We find that the corresponding biorthogonal counterparts are pairwise unbiased with respect to the original bases. Using these bases, we derive an explicit expression for optimal tomography in nonorthogonal bases. A special two-dimensional case is analyzed separately.
Mark J. Ducey; Jeffrey H. Gove; Harry T. Valentine
2008-01-01
Perpendicular distance sampling (PDS) is a fast probability-proportional-to-size method for inventory of downed wood. However, previous development of PDS had limited the method to estimating only one variable (such as volume per hectare, or surface area per hectare) at a time. Here, we develop a general design-unbiased estimator for PDS. We then show how that...
The dependability of medical students' performance ratings as documented on in-training evaluations.
van Barneveld, Christina
2005-03-01
To demonstrate an approach to obtain an unbiased estimate of the dependability of students' performance ratings during training, when the data-collection design includes nesting of student in rater, unbalanced nest sizes, and dependent observations. In 2003, two variance components analyses of in-training evaluation (ITE) report data were conducted using urGENOVA software. In the first analysis, the dependability for the nested and unbalanced data-collection design was calculated. In the second analysis, an approach using multiple generalizability studies was used to obtain an unbiased estimate of the student variance component, resulting in an unbiased estimate of dependability. Results suggested that there is bias in estimates of the dependability of students' performance on ITEs that are attributable to the data-collection design. When the bias was corrected, the results indicated that the dependability of ratings of student performance was almost zero. The combination of the multiple generalizability studies method and the use of specialized software provides an unbiased estimate of the dependability of ratings of student performance on ITE scores for data-collection designs that include nesting of student in rater, unbalanced nest sizes, and dependent observations.
Apparatus bias and place conditioning with ethanol in mice.
Cunningham, Christopher L; Ferree, Nikole K; Howard, MacKenzie A
2003-12-01
Although the distinction between "biased" and "unbiased" is generally recognized as an important methodological issue in place conditioning, previous studies have not adequately addressed the distinction between a biased/unbiased apparatus and a biased/unbiased stimulus assignment procedure. Moreover, a review of the recent literature indicates that many reports (70% of 76 papers published in 2001) fail to provide adequate information about apparatus bias. This issue is important because the mechanisms underlying a drug's effect in the place-conditioning procedure may differ depending on whether the apparatus is biased or unbiased. The present studies were designed to assess the impact of apparatus bias and stimulus assignment procedure on ethanol-induced place conditioning in mice (DBA/2 J). A secondary goal was to compare various dependent variables commonly used to index conditioned place preference. Apparatus bias was manipulated by varying the combination of tactile (floor) cues available during preference tests. Experiment 1 used an unbiased apparatus in which the stimulus alternatives were equally preferred during a pre-test as indicated by the group average. Experiment 2 used a biased apparatus in which one of the stimuli was strongly preferred by most mice (mean % time on cue = 67%) during the pre-test. In both studies, the stimulus paired with drug (CS+) was assigned randomly (i.e., an "unbiased" stimulus assignment procedure). Experimental mice received four pairings of CS+ with ethanol (2 g/kg, i.p.) and four pairings of the alternative stimulus (CS-) with saline; control mice received saline on both types of trial. Each experiment concluded with a 60-min choice test. With the unbiased apparatus (experiment 1), significant place conditioning was obtained regardless of whether drug was paired with the subject's initially preferred or non-preferred stimulus. 
However, with the biased apparatus (experiment 2), place conditioning was apparent only when ethanol was paired with the initially non-preferred cue, and not when it was paired with the initially preferred cue. These conclusions held regardless of which dependent variable was used to index place conditioning, but only if the counterbalancing factor was included in statistical analyses. These studies indicate that apparatus bias plays a major role in determining whether biased assignment of an ethanol-paired stimulus affects ability to demonstrate conditioned place preference. Ethanol's ability to produce conditioned place preference in an unbiased apparatus, regardless of the direction of the initial cue bias, supports previous studies that interpret such findings as evidence of a primary rewarding drug effect. Moreover, these studies suggest that the asymmetrical outcome observed in the biased apparatus is most likely due to a measurement problem (e.g., ceiling effect) rather than to an interaction between the drug's effect and an unconditioned motivational response (e.g., "anxiety") to the initially non-preferred stimulus. More generally, these findings illustrate the importance of providing clear information on apparatus bias in all place-conditioning studies.
NASA Astrophysics Data System (ADS)
Mutai, C. C.; Ward, M. N.; Colman, A. W.
1998-07-01
It is shown that the July-September sea-surface temperature (SST) pattern contains moderately strong relationships with the October-December (OND) seasonal rainfall total averaged across East Africa 15°S-5°N, 30°-41.25°E. The relations can be described using three rotated global SST empirical orthogonal functions (EOFs), mainly measuring aspects of SST patterns in the tropical Pacific (related to El Niño/Southern Oscillation), tropical Indian and, to a lesser extent, tropical Atlantic. Confidence in the relationships is raised because the three EOFs correlate significantly with OND near-surface divergence over the tropical Pacific, Indian and Atlantic Oceans (extending into Northern mid-latitudes), as well as with the rainfall in East Africa and also with rainfall across southern and western tropical Africa. For the East African region, multiple linear regression (MLR) and linear discriminant analysis prediction models are tested. The predictors are pre-rainfall-season values of the three rotated SST EOFs, using information through September. Validating MLR hindcasts with a 1945-1966 (1967-1988) training period and a 1967-1988 (1945-1966) testing period, between 30% and 60% of the area-averaged rainfall variance is explained. To achieve unbiased estimates of the expected skill of a forecast system, it is safest to keep model training and testing periods completely separate. The above strategy achieves this in the most important step of ensuring that the models fit the SST predictors to the rainfall predictand using years independent of the testing period. However, the EOFs were calculated over 1901-1980, so for hindcasts prior to 1981, the EOFs describe the SST variability a little better than could be achieved in real-time, which could inflate skill estimates. Tests in the years 1981-1994, independent of the 1901-1980 eigenvector analysis period, do produce similar levels of skill, but a few more forecast years are needed to confirm this result. 
It is shown that the mean verification at each individual location within East Africa is somewhat lower, which is important to consider for some applications. The need to monitor the prediction relationships and update the models is emphasised. Furthermore, these forecasts only become available as the OND season is underway, though some evidence is found for one of the EOF predictors having skill as early as June.
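The train/test protocol described above can be sketched with ordinary least squares on toy data (illustrative numbers only, not the paper's SST predictors or rainfall series): the key point is that the held-out years play no part in fitting, so the variance explained on them is an unbiased look at expected forecast skill.

```python
def ols_fit(xs, ys):
    """Least-squares intercept a and slope b for y ~ a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def variance_explained(xs, ys, a, b):
    """1 - SSE/SST on held-out data; unbiased as a skill estimate only if
    (xs, ys) played no part in fitting a and b."""
    my = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1.0 - sse / sst

# Fit on the "training period", score on fully held-out "testing period" years.
a, b = ols_fit([0, 1, 2, 3], [1.1, 2.9, 5.1, 6.9])
skill = variance_explained([4, 5], [9.0, 11.0], a, b)
assert 0.0 <= skill <= 1.0
```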
2010-01-01
Introduction: Various multigene predictors of breast cancer clinical outcome have been commercialized, but proved to be prognostic only for hormone receptor (HR) subsets overexpressing estrogen or progesterone receptors. Hormone receptor negative (HRneg) breast cancers, particularly those lacking HER2/ErbB2 overexpression and known as triple-negative (Tneg) cases, are heterogeneous and generally aggressive breast cancer subsets in need of prognostic subclassification, since most early stage HRneg and Tneg breast cancer patients are cured with conservative treatment yet invariably receive aggressive adjuvant chemotherapy. Methods: An unbiased search for genes predictive of distant metastatic relapse was undertaken using a training cohort of 199 node-negative, adjuvant treatment naïve HRneg (including 154 Tneg) breast cancer cases curated from three public microarray datasets. Prognostic gene candidates were subsequently validated using a different cohort of 75 node-negative, adjuvant naïve HRneg cases curated from three additional datasets. The HRneg/Tneg gene signature was prognostically compared with eight other previously reported gene signatures, and evaluated for cancer network associations by two commercial pathway analysis programs. Results: A novel set of 14 prognostic gene candidates was identified as outcome predictors: CXCL13, CLIC5, RGS4, RPS28, RFX7, EXOC7, HAPLN1, ZNF3, SSX3, HRBL, PRRG3, ABO, PRTN3, MATN1. A composite HRneg/Tneg gene signature index proved more accurate than any individual candidate gene or other reported multigene predictors in identifying cases likely to remain free of metastatic relapse. Significant positive correlations between the HRneg/Tneg index and three independent immune-related signatures (STAT1, IFN, and IR) were observed, as were consistent negative associations between the three immune-related signatures and five other proliferation module-containing signatures (MS-14, ONCO-RS, GGI, CSR/wound and NKI-70). 
Network analysis identified 8 genes within the HRneg/Tneg signature as being functionally linked to immune/inflammatory chemokine regulation. Conclusions: A multigene HRneg/Tneg signature linked to immune/inflammatory cytokine regulation was identified from pooled expression microarray data and shown to be superior to other reported gene signatures in predicting the metastatic outcome of early stage and conservatively managed HRneg and Tneg breast cancer. Further validation of this prognostic signature may lead to new therapeutic insights and spare many newly diagnosed breast cancer patients the need for aggressive adjuvant chemotherapy. PMID:20946665
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paterek, Tomasz; Dakic, Borivoje; Brukner, Caslav
In this Reply to the preceding Comment by Hall and Rao [Phys. Rev. A 83, 036101 (2011)], we motivate the terminology of our original paper and point out that further research is needed in order to (dis)prove the claimed link between orthogonal Latin squares of order a power of a prime and mutually unbiased bases.
Generation and evaluation of an ultra-high-field atlas with applications in DBS planning
NASA Astrophysics Data System (ADS)
Wang, Brian T.; Poirier, Stefan; Guo, Ting; Parrent, Andrew G.; Peters, Terry M.; Khan, Ali R.
2016-03-01
Purpose: Deep brain stimulation (DBS) is a common treatment for Parkinson's disease (PD) and involves the use of brain atlases or intrinsic landmarks to estimate the location of target deep brain structures, such as the subthalamic nucleus (STN) and the globus pallidus pars interna (GPi). However, these structures can be difficult to localize with conventional clinical magnetic resonance imaging (MRI), and thus targeting can be prone to error. Ultra-high-field imaging at 7T has the ability to clearly resolve these structures and thus atlases built with these data have the potential to improve targeting accuracy. Methods: T1- and T2-weighted images of 12 healthy control subjects were acquired using a 7T MR scanner. These images were then used with groupwise registration to generate an unbiased average template with T1w and T2w contrast. Deep brain structures were manually labelled in each subject by two raters and rater reliability was assessed. We compared the use of this unbiased atlas with two other methods of atlas-based segmentation (single-template and multi-template) for subthalamic nucleus (STN) segmentation on 7T MRI data. We also applied this atlas to clinical DBS data acquired at 1.5T to evaluate its efficacy for DBS target localization as compared to using a standard atlas. Results: The unbiased templates provide superb detail of subcortical structures. Through one-way ANOVA tests, the unbiased template is significantly (p < 0.05) more accurate than a single template in atlas-based segmentation and DBS target localization tasks. Conclusion: The generated unbiased averaged templates provide better visualization of deep brain nuclei and an increase in accuracy over single-template and lower field strength atlases.
Extending unbiased stereology of brain ultrastructure to three-dimensional volumes
NASA Technical Reports Server (NTRS)
Fiala, J. C.; Harris, K. M.; Koslow, S. H. (Principal Investigator)
2001-01-01
OBJECTIVE: Analysis of brain ultrastructure is needed to reveal how neurons communicate with one another via synapses and how disease processes alter this communication. In the past, such analyses have usually been based on single or paired sections obtained by electron microscopy. Reconstruction from multiple serial sections provides a much needed, richer representation of the three-dimensional organization of the brain. This paper introduces a new reconstruction system and new methods for analyzing in three dimensions the location and ultrastructure of neuronal components, such as synapses, which are distributed non-randomly throughout the brain. DESIGN AND MEASUREMENTS: Volumes are reconstructed by defining transformations that align the entire area of adjacent sections. Whole-field alignment requires rotation, translation, skew, scaling, and second-order nonlinear deformations. Such transformations are implemented by a linear combination of bivariate polynomials. Computer software for generating transformations based on user input is described. Stereological techniques for assessing structural distributions in reconstructed volumes are the unbiased bricking, disector, unbiased ratio, and per-length counting techniques. A new general method, the fractional counter, is also described. This unbiased technique relies on the counting of fractions of objects contained in a test volume. A volume of brain tissue from stratum radiatum of hippocampal area CA1 is reconstructed and analyzed for synaptic density to demonstrate and compare the techniques. RESULTS AND CONCLUSIONS: Reconstruction makes practicable volume-oriented analysis of ultrastructure using such techniques as the unbiased bricking and fractional counter methods. These analysis methods are less sensitive to the section-to-section variations in counts and section thickness, factors that contribute to the inaccuracy of other stereological methods. 
In addition, volume reconstruction facilitates visualization and modeling of structures and analysis of three-dimensional relationships such as synaptic connectivity.
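The unbiased counting idea above lends itself to a toy simulation. The sketch below is an editorial illustration, not the authors' software: each synapse is reduced to a single reference point, and the real bricking rule's inclusion/exclusion faces for extended objects are omitted. Counting reference points that fall inside a test brick and dividing by the brick volume yields an unbiased density estimate:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy "tissue" volume with randomly placed synapse reference points.
L, true_density, reps = 10.0, 2.0, 2000
estimates = []
for _ in range(reps):
    n_obj = rng.poisson(true_density * L**3)
    pts = rng.uniform(0, L, size=(n_obj, 3))
    # Count only reference points inside a 4 x 4 x 4 test brick;
    # dividing the count by the brick volume estimates the density.
    inside = (pts < 4.0).all(axis=1)
    estimates.append(inside.sum() / 4.0**3)

print(float(np.mean(estimates)))   # close to true_density
```

Averaged over many synthetic volumes, the estimate converges on the true density, which is the sense in which such a counting rule is unbiased.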
Xia, Jie; Jin, Hongwei; Liu, Zhenming; Zhang, Liangren; Wang, Xiang Simon
2014-05-27
Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the "artificial enrichment" and "analogue bias" of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD.
Current and efficiency of Brownian particles under oscillating forces in entropic barriers
NASA Astrophysics Data System (ADS)
Nutku, Ferhat; Aydıner, Ekrem
2015-04-01
In this study, considering a temporarily unbiased force and different forms of oscillating forces, we investigate the current and efficiency of Brownian particles in an entropic tube structure and present numerically obtained results. We show that different force forms give rise to different current and efficiency profiles over different optimized parameter intervals. We find that the current and efficiency produced by an unbiased oscillating force or an unbiased temporal force depend on the force parameters. We also observe that the current and efficiency caused by temporal and different oscillating forces attain their maximum and minimum values in different parameter intervals. We conclude that the current and efficiency can be controlled dynamically by adjusting the parameters of the entropic barriers and the applied force. Project supported by funds from Istanbul University (Grant No. 45662).
Galili, Tal; Meilijson, Isaac
2016-01-02
The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated.
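The uniform distribution makes the Rao-Blackwell mechanism easy to see in code. The sketch below uses the classical U(0, θ) textbook case (not the paper's own counterexample family): conditioning the crude unbiased estimator 2X₁ on the minimal sufficient statistic max(X) gives (n+1)max(X)/n, which stays unbiased while the variance collapses:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 3.0, 10, 20000

x = rng.uniform(0, theta, size=(reps, n))
crude = 2 * x[:, 0]            # unbiased for theta, but ignores most of the sample
m = x.max(axis=1)              # minimal sufficient (and here complete) statistic
rb = (n + 1) / n * m           # E[crude | max] -- the Rao-Blackwell improvement

print(crude.mean(), rb.mean()) # both hover around theta: unbiasedness is preserved
print(crude.var(), rb.var())   # the conditioned estimator is far less variable
```

The conditional expectation E[2X₁ | max(X) = m] = (n+1)m/n follows because X₁ equals the maximum with probability 1/n and is otherwise uniform on (0, m).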
Quantum key distribution for composite dimensional finite systems
NASA Astrophysics Data System (ADS)
Shalaby, Mohamed; Kamal, Yasser
2017-06-01
The application of quantum mechanics contributes to the field of cryptography with a very important advantage, as it offers a mechanism for detecting the eavesdropper. The pioneering work on quantum key distribution uses mutually unbiased bases (MUBs) to prepare and measure qubits (or qudits). Weak mutually unbiased bases (WMUBs) have weaker properties than MUBs; however, unlike MUBs, a complete set of WMUBs can be constructed for systems with composite dimensions. In this paper, we study the use of WMUBs in quantum key distribution for composite dimensional finite systems. We prove that the security analysis of using a complete set of WMUBs to prepare and measure the quantum states in the generalized BB84 protocol gives better results than using the maximum number of MUBs that can be constructed, when both are analyzed against the intercept-and-resend attack.
Stabilized determination of geopotential coefficients by the mixed hom-BLUP approach
NASA Technical Reports Server (NTRS)
Middel, B.; Schaffrin, B.
1989-01-01
For the determination of geopotential coefficients, data can be used from rather different sources, e.g., satellite tracking, gravimetry, or altimetry. As each data type is particularly sensitive to certain wavelengths of the spherical harmonic spectrum, how the data are treated in a combination solution is of essential importance. For example, the longer wavelengths are well described by the coefficients of a model derived from satellite tracking, while other observation types, such as gravity anomalies (delta g) and geoid heights (N) from altimetry, contain only poor information at these long wavelengths. Therefore, the lower-degree coefficients of the satellite model should be treated as superior in the combination. A new combination method is presented which turns out to be highly suitable for this purpose due to its great flexibility combined with robustness.
Unbiased Estimation of Refractive State of Aberrated Eyes
Martin, Jesson; Vasudevan, Balamurali; Himebaugh, Nikole; Bradley, Arthur; Thibos, Larry
2011-01-01
To identify unbiased methods for estimating the target vergence required to maximize visual acuity based on wavefront aberration measurements. Experiments were designed to minimize the impact of confounding factors that have hampered previous research. Objective wavefront refractions and subjective acuity refractions were obtained for the same monochromatic wavelength. Accommodation and pupil fluctuations were eliminated by cycloplegia. Unbiased subjective refractions that maximize visual acuity for high contrast letters were performed with a computer-controlled forced-choice staircase procedure, using 0.125 diopter steps of defocus. All experiments were performed for two pupil diameters (3 mm and 6 mm). As reported in the literature, subjective refractive error does not change appreciably when the pupil dilates. For 3 mm pupils most metrics yielded objective refractions that were about 0.1 D more hyperopic than subjective acuity refractions. When pupil diameter increased to 6 mm, this bias changed in the myopic direction and the variability between metrics also increased. These inaccuracies were small compared to the precision of the measurements, which implies that most metrics provided unbiased estimates of refractive state for medium and large pupils. A variety of image quality metrics may be used to determine ocular refractive state for monochromatic (635 nm) light, thereby achieving accurate results without the need for empirical correction factors. PMID:21777601
Xu, Yan; Liu, Biao; Ding, Fengan; Zhou, Xiaodie; Tu, Pin; Yu, Bo; He, Yan; Huang, Peilin
2017-06-01
Circulating tumor cells (CTCs), isolated as a 'liquid biopsy', may provide important diagnostic and prognostic information. Therefore, rapid, reliable and unbiased detection of CTCs is required for routine clinical analyses. It was demonstrated that negative enrichment, an epithelial marker-independent technique for isolating CTCs, exhibits a better efficiency in the detection of CTCs compared with positive enrichment techniques that only use specific anti-epithelial cell adhesion molecules. However, negative enrichment techniques incur significant cell loss during the isolation procedure, and as a method that uses only one type of antibody, negative enrichment is inherently biased. The detection procedure and identification of cell types also rely on skilled and experienced technicians. In the present study, the detection sensitivity of negative enrichment and a previously described unbiased detection method was compared. The results revealed that unbiased detection methods may efficiently detect >90% of cancer cells in blood samples containing CTCs. By contrast, only 40-60% of CTCs were detected by negative enrichment. Additionally, CTCs were identified in >65% of patients with stage I/II lung cancer. This simple yet efficient approach may achieve a high level of sensitivity. It demonstrates a potential for the large-scale clinical implementation of CTC-based diagnostic and prognostic strategies.
Building unbiased estimators from non-Gaussian likelihoods with application to shear estimation
Madhavacheril, Mathew S.; McDonald, Patrick; Sehgal, Neelima; ...
2015-01-15
We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming the choice of the fiducial model is independent of the data). Next we apply the approach to estimation of shear lensing, closely following the work of Bernstein and Armstrong (2014). Our first-order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong's estimator exhibit a bias which is quadratic in the true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g| = 0.2.
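The generic claim here, that a first-order update built from the score and the Fisher matrix is unbiased to first order with a residual bias quadratic in the distance from the fiducial point, can be checked on a toy likelihood. The sketch below is an editorial stand-in (a one-parameter Poisson model with rate lambda = exp(g), not the shear problem): it applies the update g_fid + score/Fisher and shows the bias tracking delta^2/2:

```python
import numpy as np

rng = np.random.default_rng(4)

g_fid = 0.0                          # fiducial parameter; Poisson rate lam = exp(g)
lam_fid = np.exp(g_fid)
n, reps = 50, 100000

biases = {}
for delta in (0.05, 0.10, 0.20):     # true parameter g = g_fid + delta
    x = rng.poisson(np.exp(g_fid + delta), size=(reps, n))
    score = (x - lam_fid).sum(axis=1)     # score of the log-likelihood at g_fid
    fisher = n * lam_fid                  # Fisher information at the fiducial point
    est = g_fid + score / fisher          # first-order estimator
    biases[delta] = float(est.mean()) - (g_fid + delta)
    print(f"delta={delta:.2f}  bias={biases[delta]:+.4f}  delta^2/2={delta**2 / 2:.4f}")
```

The exact bias in this toy model is exp(delta) - 1 - delta, i.e. approximately delta^2/2: negligible close to the fiducial point and growing quadratically away from it, mirroring the quadratic-in-shear bias described above.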
2016-10-01
Goals of the Project (SOW): Aim 1: To perform phage display library (PDL) screening in PSA-/lo PCa cells to identify PCSC-specific homing peptides; and Aim 2: To perform unbiased drug library screening to identify novel PCSC-targeting chemicals.
ERIC Educational Resources Information Center
Raudenbush, Stephen
2013-01-01
This brief considers the problem of using value-added scores to compare teachers who work in different schools. The author focuses on whether such comparisons can be regarded as fair, or, in statistical language, "unbiased." An unbiased measure does not systematically favor teachers because of the backgrounds of the students they are…
Long Term Follow up of the Delayed Effects of Acute Radiation Exposure in Primates
2017-10-01
We will then use shRNAs and/or CRISPR constructs targeting the gene of interest to knock down its expression in stem cells prior to… Exome sequencing in 1,001 DLBCL patients comprehensively identifies 150 driver genes; gene expression identifies subgroups, including cell of origin; and an unbiased CRISPR screen in DLBCL cell lines identifies essential genes.
Four photon parametric amplification [in unbiased Josephson junction]
NASA Technical Reports Server (NTRS)
Parrish, P. T.; Feldman, M. J.; Ohta, H.; Chiao, R. Y.
1974-01-01
An analysis is presented describing four-photon parametric amplification in an unbiased Josephson junction. Central to the theory is the model of the Josephson effect as a nonlinear inductance. Linear, small signal analysis is applied to the two-fluid model of the Josephson junction. The gain, gain-bandwidth product, high frequency limit, and effective noise temperature are calculated for a cavity reflection amplifier. The analysis is extended to multiple (series-connected) junctions and subharmonic pumping.
NASA Astrophysics Data System (ADS)
Kwon, Ki-Won; Cho, Yongsoo
This letter presents a simple joint estimation method for residual frequency offset (RFO) and sampling frequency offset (SFO) in OFDM-based digital video broadcasting (DVB) systems. The proposed method selects a continual pilot (CP) subset from an unsymmetrically and non-uniformly distributed CP set to obtain an unbiased estimator. Simulation results show that the proposed method using a properly selected CP subset is unbiased and performs robustly.
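The linear model underlying such pilot-based estimators can be sketched as a least-squares line fit. In the common approximation, the inter-symbol phase increment on pilot subcarrier k is an intercept term (from the RFO) plus a slope proportional to k (from the SFO); fitting both jointly recovers the two offsets. The pilot indices and noise level below are invented for illustration (they are not the DVB continual-pilot positions), and the letter's CP-subset selection rule is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical continual-pilot subcarrier indices (made up, centered around 0).
pilots = np.array([-45, -30, -12, -3, 7, 18, 33, 48])

theta_r, theta_s = 0.02, 5e-4        # true intercept (RFO term) and slope (SFO term)
phase = theta_r + theta_s * pilots + 0.002 * rng.standard_normal(pilots.size)

# Joint least-squares fit of intercept and slope over the pilot phases.
A = np.column_stack([np.ones(pilots.size), pilots])
est_r, est_s = np.linalg.lstsq(A, phase, rcond=None)[0]
print(est_r, est_s)
```

With zero-mean phase noise the least-squares intercept and slope are unbiased; an asymmetric pilot set only couples the two estimates, which is why subset selection matters in the letter's setting.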
Test of mutually unbiased bases for six-dimensional photonic quantum systems
D'Ambrosio, Vincenzo; Cardano, Filippo; Karimi, Ebrahim; Nagali, Eleonora; Santamato, Enrico; Marrucci, Lorenzo; Sciarrino, Fabio
2013-01-01
In quantum information, complementarity of quantum mechanical observables plays a key role. The eigenstates of two complementary observables form a pair of mutually unbiased bases (MUBs). More generally, a set of MUBs consists of bases that are all pairwise unbiased. Except for specific dimensions of the Hilbert space, the maximal sets of MUBs are unknown in general. Even for a dimension as low as six, the identification of a maximal set of MUBs remains an open problem, although there is strong numerical evidence that no more than three simultaneous MUBs do exist. Here, by exploiting a newly developed holographic technique, we implement and test different sets of three MUBs for a single photon six-dimensional quantum state (a “qusix”), encoded exploiting polarization and orbital angular momentum of photons. A close agreement is observed between theory and experiments. Our results can find applications in state tomography, quantitative wave-particle duality, and quantum key distribution. PMID:24067548
Graph-state formalism for mutually unbiased bases
NASA Astrophysics Data System (ADS)
Spengler, Christoph; Kraus, Barbara
2013-11-01
A pair of orthonormal bases is called mutually unbiased if all mutual overlaps between any element of one basis and an arbitrary element of the other basis coincide. In case the dimension, d, of the considered Hilbert space is a power of a prime number, complete sets of d+1 mutually unbiased bases (MUBs) exist. Here we present a method based on the graph-state formalism to construct such sets of MUBs. We show that for n p-level systems, with p being prime, one particular graph suffices to easily construct a set of p^n + 1 MUBs. In fact, we show that a single n-dimensional vector, which is associated with this graph, can be used to generate a complete set of MUBs and demonstrate that this vector can be easily determined. Finally, we discuss some advantages of our formalism regarding the analysis of entanglement structures in MUBs, as well as experimental realizations.
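For a prime dimension, a complete set of d+1 MUBs can be written down explicitly and checked numerically. The sketch below uses the standard quadratic-phase construction for an odd prime d (not the graph-state construction of this paper) and verifies the defining property: every cross-basis overlap has squared magnitude 1/d:

```python
import numpy as np

d = 5                                   # an odd prime dimension
w = np.exp(2j * np.pi / d)
k = np.arange(d)

# Computational basis plus d Fourier-like bases with quadratic phases:
# basis a, vector m has components w**(a*k^2 + m*k) / sqrt(d).
bases = [np.eye(d, dtype=complex)]
for a in range(d):
    B = np.array([[w ** (a * kk**2 + m * kk) for m in range(d)] for kk in k])
    bases.append(B / np.sqrt(d))

# Orthonormal within each basis; |<u|v>|^2 = 1/d across distinct bases.
for i, Bi in enumerate(bases):
    assert np.allclose(Bi.conj().T @ Bi, np.eye(d))
    for Bj in bases[i + 1:]:
        assert np.allclose(np.abs(Bi.conj().T @ Bj) ** 2, 1 / d)
print(f"{len(bases)} mutually unbiased bases verified for d = {d}")
```

The cross-basis overlaps reduce to quadratic Gauss sums, whose magnitude is sqrt(d) for odd prime d, which is exactly the mutual-unbiasedness condition.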
Mutually unbiased coarse-grained measurements of two or more phase-space variables
NASA Astrophysics Data System (ADS)
Paul, E. C.; Walborn, S. P.; Tasca, D. S.; Rudnicki, Łukasz
2018-05-01
Mutual unbiasedness of the eigenstates of phase-space operators—such as position and momentum, or their standard coarse-grained versions—exists only in the limiting case of infinite squeezing. In Phys. Rev. Lett. 120, 040403 (2018), 10.1103/PhysRevLett.120.040403, it was shown that mutual unbiasedness can be recovered for periodic coarse graining of these two operators. Here we investigate mutual unbiasedness of coarse-grained measurements for more than two phase-space variables. We show that mutual unbiasedness can be recovered between periodic coarse graining of any two nonparallel phase-space operators. We illustrate these results through optics experiments, using the fractional Fourier transform to prepare and measure mutually unbiased phase-space variables. The differences between two and three mutually unbiased measurements are discussed. Our results contribute to bridging the gap between continuous and discrete quantum mechanics, and they could be useful in quantum-information protocols.
A Bayesian method for assessing multiscale species-habitat relationships
Stuber, Erica F.; Gruber, Lutz F.; Fontaine, Joseph J.
2017-01-01
Context: Scientists face several theoretical and methodological challenges in appropriately describing fundamental wildlife-habitat relationships in models. The spatial scales of habitat relationships are often unknown, and are expected to follow a multi-scale hierarchy. Typical frequentist or information-theoretic approaches often suffer under collinearity in multi-scale studies, fail to converge when models are complex, or represent an intractable computational burden when candidate model sets are large. Objectives: Our objective was to implement an automated, Bayesian method for inference on the spatial scales of habitat variables that best predict animal abundance. Methods: We introduce Bayesian latent indicator scale selection (BLISS), a Bayesian method to select spatial scales of predictors using latent scale indicator variables that are estimated with reversible-jump Markov chain Monte Carlo sampling. BLISS does not suffer from collinearity, and substantially reduces the computation time of studies. We present a simulation study to validate our method and apply it to a case study of land cover predictors for ring-necked pheasant (Phasianus colchicus) abundance in Nebraska, USA. Results: Our method returns accurate descriptions of the explanatory power of multiple spatial scales, and unbiased and precise parameter estimates under commonly encountered data limitations including spatial scale autocorrelation, effect size, and sample size. BLISS outperforms commonly used model selection methods including stepwise selection and AIC, and reduces runtime by 90%. Conclusions: Given the pervasiveness of scale-dependency in ecology, and the implications of mismatches between the scales of analyses and ecological processes, identifying the spatial scales over which species are integrating habitat information is an important step in understanding species-habitat relationships.
BLISS is a widely applicable method for identifying important spatial scales, propagating scale uncertainty, and testing hypotheses of scaling relationships.
Endotoxin Exposure: Predictors and Prevalence of Associated Asthma Outcomes in the United States
Mendy, Angelico; Metwali, Nervana; Salo, Päivi; Co, Caroll; Jaramillo, Renee; Rose, Kathryn M.; Zeldin, Darryl C.
2015-01-01
Rationale: Inhaled endotoxin induces airway inflammation and is an established risk factor for asthma. The 2005–2006 National Health and Nutrition Examination Survey included measures of endotoxin and allergens in homes as well as specific IgE to inhalant allergens. Objectives: To understand the relationships between endotoxin exposure, asthma outcomes, and sensitization status for 15 aeroallergens in a nationally representative sample. Methods: Participants were administered questionnaires in their homes. Reservoir dust was vacuum sampled to generate composite bedding and bedroom floor samples. We analyzed 7,450 National Health and Nutrition Examination Survey dust and quality assurance samples for their endotoxin content using extreme quality assurance measures. Data for 6,963 subjects were available, making this the largest study of endotoxin exposure to date. Log-transformed endotoxin concentrations were analyzed using logistic models and forward stepwise linear regression. Analyses were weighted to provide national prevalence estimates and unbiased variances. Measurements and Main Results: Endotoxin exposure was significantly associated with wheeze in the past 12 months, wheeze during exercise, doctor and/or emergency room visits for wheeze, and use of prescription medications for wheeze. Models adjusted for age, sex, race and/or ethnicity, and poverty-to-income ratio and stratified by allergy status showed that these relationships were not dependent upon sensitization status but were worsened among those living in poverty. Significant predictors of higher endotoxin exposures were lower family income; Hispanic ethnicity; participant age; dog(s), cat(s), cockroaches, and/or smoker(s) in the home; and carpeted floors. Conclusions: In this U.S. nationwide representative sample, higher endotoxin exposure was significantly associated with measures of wheeze, with no observed protective effect regardless of sensitization status. PMID:26258643
Jang, In Sock; Dienstmann, Rodrigo; Margolin, Adam A; Guinney, Justin
2015-01-01
Complex mechanisms involving genomic aberrations in numerous proteins and pathways are believed to be a key cause of many diseases such as cancer. With recent advances in genomics, elucidating the molecular basis of cancer at a patient level is now feasible, and has led to personalized treatment strategies whereby a patient is treated according to his or her genomic profile. However, there is growing recognition that existing treatment modalities are overly simplistic, and do not fully account for the deep genomic complexity associated with sensitivity or resistance to cancer therapies. To overcome these limitations, large-scale pharmacogenomic screens of cancer cell lines--in conjunction with modern statistical learning approaches--have been used to explore the genetic underpinnings of drug response. While these analyses have demonstrated the ability to infer genetic predictors of compound sensitivity, to date most modeling approaches have been data-driven, i.e. they do not explicitly incorporate domain-specific knowledge (priors) in the process of learning a model. While a purely data-driven approach offers an unbiased perspective of the data--and may yield unexpected or novel insights--this strategy introduces challenges for both model interpretability and accuracy. In this study, we propose a novel prior-incorporated sparse regression model in which the choice of informative predictor sets is carried out by knowledge-driven priors (gene sets) in a stepwise fashion. Under regularization in a linear regression model, our algorithm is able to incorporate prior biological knowledge across the predictive variables thereby improving the interpretability of the final model with no loss--and often an improvement--in predictive performance. 
We evaluate the performance of our algorithm compared to well-known regularization methods such as LASSO, Ridge and Elastic net regression in the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (Sanger) pharmacogenomics datasets, demonstrating that incorporation of the biological priors selected by our model confers improved predictability and interpretability, despite much fewer predictors, over existing state-of-the-art methods.
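The flavor of knowledge-driven, stepwise predictor-set selection can be sketched generically. Everything below is invented for illustration (synthetic data, hypothetical gene sets, a plain ridge penalty, and a crude stopping rule; it is not the authors' algorithm): candidate sets are added one at a time, keeping a set only while it materially reduces the regularized residual:

```python
import numpy as np

rng = np.random.default_rng(3)

n, p = 120, 12
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[0, 1, 2]] = [2.0, -1.5, 1.0]       # only the first "gene set" is truly active
y = X @ beta + rng.standard_normal(n)

# Hypothetical prior knowledge: predictors grouped into gene sets.
groups = {"setA": [0, 1, 2], "setB": [3, 4, 5], "setC": [6, 7, 8], "setD": [9, 10, 11]}

def ridge_rss(cols, lam=1.0):
    """Residual sum of squares of a ridge fit restricted to the given columns."""
    Xs = X[:, cols]
    b = np.linalg.solve(Xs.T @ Xs + lam * np.eye(len(cols)), Xs.T @ y)
    r = y - Xs @ b
    return float(r @ r)

chosen_sets, chosen_cols = [], []
rss_path = [float(y @ y)]
remaining = dict(groups)
while remaining:
    name = min(remaining, key=lambda g: ridge_rss(chosen_cols + remaining[g]))
    rss = ridge_rss(chosen_cols + remaining[name])
    if rss > 0.9 * rss_path[-1]:         # stop once a set no longer helps much
        break
    chosen_sets.append(name)
    chosen_cols += remaining.pop(name)
    rss_path.append(rss)

print(chosen_sets)                       # the truly active set is picked first
```

Selecting whole prior-defined groups rather than individual columns is what keeps the final model small and interpretable, which is the design point the abstract emphasizes.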
Cross-Layer Resource Allocation for Wireless Visual Sensor Networks and Mobile Ad Hoc Networks
2014-10-01
…(MMD), minimizes the maximum distortion among all nodes of the network, promoting a rather unbiased treatment of the nodes. We employed the Particle… to achieve the ideal tradeoff between the transmitted video quality and energy consumption. Each sensor node has a bit rate that can be used for both…
Unbiased estimators for spatial distribution functions of classical fluids
NASA Astrophysics Data System (ADS)
Adib, Artur B.; Jarzynski, Christopher
2005-01-01
We use a statistical-mechanical identity, closely related to the familiar virial theorem, to derive unbiased estimators for spatial distribution functions of classical fluids. In particular, we obtain estimators for both the fluid density ρ(r) in the vicinity of a fixed solute and the pair correlation g(r) of a homogeneous classical fluid. We illustrate the utility of our estimators with numerical examples, which reveal advantages over traditional histogram-based methods of computing such distributions.
Drug testing for newborn exposure to illicit substances in pregnancy: pitfalls and pearls.
Farst, Karen J; Valentine, Jimmie L; Hall, R Whit
2011-01-01
Estimates of the prevalence of drug usage during pregnancy vary by region and survey tool used. Clinicians providing care to newborns should be equipped to recognize a newborn who has been exposed to illicit drugs during pregnancy by the effects the exposure might cause at the time of delivery and/or by drug testing of the newborn. The purpose of this paper is to provide an overview of the literature and assess the clinical role of drug testing in the newborn. Accurate recognition of a newborn whose mother has used illicit drugs in pregnancy can not only impact decisions for healthcare in the nursery around the time of delivery, but can also provide a key opportunity to assess the mother for needed services. While drug use in pregnancy is not an independent predictor of the mother's ability to provide a safe and nurturing environment for her newborn, other issues that often co-occur in the life of a mother with a substance abuse disorder raise concerns for the safety of the discharge environment and should be assessed. Healthcare providers in these roles should advocate for unbiased and effective treatment services for affected families.
Temko, Andriy; Doyle, Orla; Murray, Deirdre; Lightbody, Gordon; Boylan, Geraldine; Marnane, William
2015-08-01
Automated multimodal prediction of outcome in newborns with hypoxic-ischaemic encephalopathy (HIE) is investigated in this work. Routine clinical measures and 1-h EEG and ECG recordings 24 h after birth were obtained from 38 newborns with different grades of HIE. Each newborn was reassessed at 24 months to establish their neurodevelopmental outcome. A set of multimodal features is extracted from the clinical, heart rate and EEG measures and is fed into a support vector machine classifier. Performance is reported using the leave-one-patient-out assessment routine, the statistically least biased option. A subset of informative features, whose rankings are consistent across all patients, is identified. The best performance is obtained using a subset of 9 EEG, 2 heart rate and 1 clinical feature, leading to an area under the ROC curve of 87% and an accuracy of 84%, which compares favourably to the EEG-based clinical outcome prediction previously reported on the same data. The work presents a promising step towards the use of multimodal data in building an objective decision support tool for clinical prediction of neurodevelopmental outcome in newborns with HIE. Copyright © 2015 Elsevier Ltd. All rights reserved.
Multilevel structural equation models for assessing moderation within and across levels of analysis.
Preacher, Kristopher J; Zhang, Zhen; Zyphur, Michael J
2016-06-01
Social scientists are increasingly interested in multilevel hypotheses, data, and statistical models as well as moderation or interactions among predictors. The result is a focus on hypotheses and tests of multilevel moderation within and across levels of analysis. Unfortunately, existing approaches to multilevel moderation have a variety of shortcomings, including conflated effects across levels of analysis and bias due to using observed cluster averages instead of latent variables (i.e., "random intercepts") to represent higher-level constructs. To overcome these problems and elucidate the nature of multilevel moderation effects, we introduce a multilevel structural equation modeling (MSEM) logic that clarifies the nature of the problems with existing practices and remedies them with latent variable interactions. This remedy uses random coefficients and/or latent moderated structural equations (LMS) for unbiased tests of multilevel moderation. We describe our approach and provide an example using the publicly available High School and Beyond data, with Mplus syntax in the Appendix. Our MSEM method eliminates problems of conflated multilevel effects and reduces bias in parameter estimates while offering a coherent framework for conceptualizing and testing multilevel moderation effects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Simultaneous grouping pursuit and feature selection over an undirected graph*
Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei
2013-01-01
In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph, with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients whose absolute values are the same or close can be grouped. This is motivated by gene network analysis, where genes tend to work in groups according to their biological functionalities. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into analysis through an undirected graph. PMID:24098061
D'Lima, Danielle M; Moore, Joanna; Bottle, Alex; Brett, Stephen J; Arnold, Glenn M; Benn, Jonathan
2015-01-01
Research suggests that better feedback from quality and safety indicators leads to enhanced capability of clinicians and departments to improve care and change behaviour. The aim of the current study was to investigate the characteristics of feedback perceived by clinicians to be of most value. Data were collected using a survey designed as part of a wider evaluation of a data feedback initiative in anaesthesia. Eighty-nine consultant anaesthetists from two English NHS acute Trusts completed the survey. Multiple linear regression with hierarchical variable entry was used to investigate which characteristics of feedback predict its perceived usefulness for monitoring variation and improving care. The final model demonstrated that the relevance of the quality indicators to the specific service area (β=0.64, p=0.01) and the credibility of the data as coming from a trustworthy, unbiased source (β=0.55, p=0.01) were the significant predictors, having controlled for all other covariates. For clinicians to engage with effective quality monitoring and feedback, the perceived local relevance of indicators and trust in the credibility of the resulting data are paramount. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
NASA Astrophysics Data System (ADS)
Moreira, Joao; Zeng, Xiaohan; Amaral, Luis
2013-03-01
Assessing the career performance of scientists has become essential to modern science. Bibliometric indicators like the h-index are becoming more and more decisive in evaluating grants and approving publication of articles. However, many of the most widely used indicators can be manipulated or falsified, for instance by publishing with very prolific researchers or by self-citing papers with a certain number of citations. Accounting for these factors is possible, but it introduces unwanted complexity that drives us further from the purpose of the indicator: to represent in a clear way the prestige and importance of a given scientist. Here we try to overcome this challenge. We used Thomson Reuters' Web of Science database and analyzed all the papers published until 2000 by ~1500 researchers in the top 30 departments of seven scientific fields. We find that over 97% of them have a citation distribution that is consistent with a discrete lognormal model. This suggests that our model can be used to accurately predict the performance of a researcher. Furthermore, this predictor does not depend on the individual number of publications and is not easily ``gamed''. The authors acknowledge support from FCT Portugal and NSF grants
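A sketch of the kind of lognormal fit the abstract describes, on synthetic data; the parameters and the simple continuous-approximation MLE below are my assumptions for illustration, not the authors' fitting procedure.

```python
import numpy as np

# Fit a (discretised) lognormal model to simulated per-paper citation
# counts. mu and sigma are invented "prestige" parameters.
rng = np.random.default_rng(1)
mu, sigma = 2.0, 1.0
cites = np.floor(np.exp(rng.normal(mu, sigma, size=5000))).astype(int)

# Continuous-approximation MLE: mean/std of log(c + 1/2); the 1/2 offset
# crudely accounts for discretisation and zero counts.
logs = np.log(cites + 0.5)
mu_hat, sigma_hat = logs.mean(), logs.std(ddof=1)
```

With enough papers the two fitted parameters recover the generating values closely, which is what makes a per-researcher lognormal fit usable as a predictor.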
Farias, F J C; Carvalho, L P; Silva Filho, J L; Teodoro, P E
2016-08-19
The harmonic mean of the relative performance of genotypic predicted value (HMRPGV) method has been used to measure the genotypic stability and adaptability of various crops. However, its use in cotton is still restricted. This study aimed to use mixed models to select cotton genotypes that simultaneously result in longer fiber length, higher fiber yield, and phenotypic stability in both of these traits. Eight trials with 16 cotton genotypes were conducted in the 2008/2009 harvest in Mato Grosso State. The experimental design was randomized complete blocks with four replicates of each of the 16 genotypes. In each trial, we evaluated fiber yield and fiber length. The genetic parameters were estimated using the restricted maximum likelihood/best linear unbiased predictor method. Joint selection considering, simultaneously, fiber length, fiber yield, stability, and adaptability is possible with the HMRPGV method. Our results suggested that genotypes CNPA MT 04 2080 and BRS CEDRO may be grown in environments similar to those tested here and may be predicted to result in greater fiber length, fiber yield, adaptability, and phenotypic stability. These genotypes may constitute a promising population base in breeding programs aimed at increasing these trait values.
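The HMRPGV criterion mentioned above can be illustrated with a toy computation. The formula used here, the harmonic mean over environments of each genotype's value relative to the environment mean, follows the commonly used Resende formulation, which I assume is the one intended; the genotypic values are invented.

```python
# Hypothetical predicted genotypic values (e.g. kg/ha) in 3 environments.
gv = {
    "G1": [3200.0, 2800.0, 3100.0],
    "G2": [3600.0, 1900.0, 3500.0],   # high mean but unstable
    "G3": [3000.0, 2950.0, 3050.0],   # stable
}
env_means = [sum(vals[i] for vals in gv.values()) / len(gv)
             for i in range(3)]

def hmrpgv(values):
    # Relative performance of genotypic value in each environment,
    # combined with a harmonic mean, which penalises instability.
    rpgv = [v / m for v, m in zip(values, env_means)]
    return len(rpgv) / sum(1.0 / r for r in rpgv)

ranking = sorted(gv, key=lambda g: hmrpgv(gv[g]), reverse=True)
```

Because the harmonic mean is dragged down by poor environments, the unstable G2 ranks last even though its best yields are highest, which is the point of selecting on HMRPGV rather than on the plain mean.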
Genome-wide association study for birth, weaning and yearling weight in Colombian Brahman cattle
Martínez, Rodrigo; Bejarano, Diego; Gómez, Yolanda; Dasoneville, Romain; Jiménez, Ariel; Even, Gael; Sölkner, Johann; Mészáros, Gabor
2017-01-01
Genotypic and phenotypic data of 1,562 animals were analyzed to find genomic regions that potentially influence the birth weight (BW), weaning weight at seven months of age (WW) and yearling weight (YW) of Colombian Brahman cattle, with genotyping conducted using an Illumina BeadChip array with 74,669 SNPs. A single-step genomic BLUP (ssGBLUP) approach was used to estimate the proportion of variance explained by each marker. Multiple regions scattered across the genome were found to influence weights at different ages, also depending on the trait component (direct or maternal). The most interesting regions were connected to previously identified QTLs and genes, such as ADAMTSL3, CAPN2, FABP6 and ZEB2, which influence growth and weight traits. The identified regions will contribute to the development and refinement of genomic selection programs for Zebu Brahman cattle in Colombia. PMID:28534927
Mogensen, Kris M; Andrew, Benjamin Y; Corona, Jasmine C; Robinson, Malcolm K
2016-07-01
The Society of Critical Care Medicine (SCCM) and American Society for Parenteral and Enteral Nutrition (ASPEN) recommend that obese, critically ill patients receive 11-14 kcal/kg/d using actual body weight (ABW) or 22-25 kcal/kg/d using ideal body weight (IBW), because feeding these patients 50%-70% maintenance needs while administering high protein may improve outcomes. It is unknown whether these equations achieve this target when validated against indirect calorimetry, perform equally across all degrees of obesity, or compare well with other equations. Measured resting energy expenditure (MREE) was determined in obese (body mass index [BMI] ≥30 kg/m(2)), critically ill patients. Resting energy expenditure was predicted (PREE) using several equations: 12.5 kcal/kg ABW (ASPEN-Actual BW), 23.5 kcal/kg IBW (ASPEN-Ideal BW), Harris-Benedict (adjusted-weight and 1.5 stress-factor), and Ireton-Jones for obesity. Correlation of PREE to 65% MREE, predictive accuracy, precision, bias, and large error incidence were calculated. All equations were significantly correlated with 65% MREE but had poor predictive accuracy, had excessive large error incidence, were imprecise, and were biased in the entire cohort (N = 31). In the obesity cohort (n = 20, BMI 30-50 kg/m(2)), ASPEN-Actual BW had acceptable predictive accuracy and large error incidence, was unbiased, and was nearly precise. In super obesity (n = 11, BMI >50 kg/m(2)), ASPEN-Ideal BW had acceptable predictive accuracy and large error incidence and was precise and unbiased. SCCM/ASPEN-recommended body weight equations are reasonable predictors of 65% MREE depending on the equation and degree of obesity. Assuming that feeding 65% MREE is appropriate, this study suggests that patients with a BMI 30-50 kg/m(2) should receive 11-14 kcal/kg/d using ABW and those with a BMI >50 kg/m(2) should receive 22-25 kcal/kg/d using IBW. © 2015 American Society for Parenteral and Enteral Nutrition.
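The weight-based targets in the study can be sketched as simple functions. The Devine formula for ideal body weight is my assumption, since the abstract does not state which IBW formula was used; the 12.5 and 23.5 kcal/kg factors are the midpoints the study reports.

```python
def ibw_devine_kg(height_cm, male=True):
    """Ideal body weight by the Devine formula (assumed, not stated
    in the abstract): 50 kg (men) / 45.5 kg (women) + 2.3 kg per inch
    over 5 feet."""
    inches_over_5ft = max(height_cm / 2.54 - 60.0, 0.0)
    base = 50.0 if male else 45.5
    return base + 2.3 * inches_over_5ft

def aspen_actual_kcal(abw_kg):
    # Midpoint of the 11-14 kcal/kg/d range on actual body weight.
    return 12.5 * abw_kg

def aspen_ideal_kcal(height_cm, male=True):
    # Midpoint of the 22-25 kcal/kg/d range on ideal body weight.
    return 23.5 * ibw_devine_kg(height_cm, male)
```

For a 170 cm, 120 kg man this gives roughly 1500 kcal/d on the ABW rule and roughly 1550 kcal/d on the IBW rule; per the study's conclusion, the first would apply at BMI 30-50 and the second at BMI > 50.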
Chen, Charles H; Wiedman, Gregory; Khan, Ayesha; Ulmschneider, Martin B
2014-09-01
Unbiased molecular simulation is a powerful tool to study the atomic details driving functional structural changes or folding pathways of highly fluid systems, which present great challenges experimentally. Here we apply unbiased long-timescale molecular dynamics simulation to study the ab initio folding and partitioning of melittin, a template amphiphilic membrane active peptide. The simulations reveal that the peptide binds strongly to the lipid bilayer in an unstructured configuration. Interfacial folding results in a localized bilayer deformation. Akin to purely hydrophobic transmembrane segments the surface bound native helical conformer is highly resistant against thermal denaturation. Circular dichroism spectroscopy experiments confirm the strong binding and thermostability of the peptide. The study highlights the utility of molecular dynamics simulations for studying transient mechanisms in fluid lipid bilayer systems. This article is part of a Special Issue entitled: Interfacially Active Peptides and Proteins. Guest Editors: William C. Wimley and Kalina Hristova. Copyright © 2014. Published by Elsevier B.V.
Mixed model approaches for diallel analysis based on a bio-model.
Zhu, J; Weir, B S
1996-12-01
A MINQUE(1) procedure, which is the minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all the prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML) and MINQUE(θ), which uses the parameter values as the prior values. MINQUE(1) is almost as efficient as MINQUE(θ) for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jack-knife procedure is suggested for estimating the sampling variances of the estimated variance and covariance components and of the predicted genetic effects. Worked examples are given for estimation of variance and covariance components and for prediction of genetic merits.
Uncertainty relation based on unbiased parameter estimations
NASA Astrophysics Data System (ADS)
Sun, Liang-Liang; Song, Yong-Shun; Qiao, Cong-Feng; Yu, Sixia; Chen, Zeng-Bing
2017-02-01
Heisenberg's uncertainty relation has been extensively studied in the spirit of its well-known original form, in which the inaccuracy measures used exhibit some controversial properties and do not conform to quantum metrology, where measurement precision is well defined in terms of estimation theory. In this paper, we treat the joint measurement of incompatible observables as a parameter estimation problem, i.e., estimating the parameters characterizing the statistics of the incompatible observables. Our crucial observation is that, in a sequential measurement scenario, the bias induced by the first unbiased measurement in the subsequent measurement can be eradicated by the information acquired, allowing one to extract unbiased information of the second measurement of an incompatible observable. In terms of Fisher information, we propose a kind of information comparison measure and explore various types of trade-offs between information gains and measurement precisions, which interpret the uncertainty relation as a surplus-variance trade-off over individual perfect measurements instead of a constraint on extracting complete information of incompatible observables.
Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data.
Rohrer, Sebastian G; Baumann, Knut
2009-02-01
Refined nearest neighbor analysis was recently introduced for the analysis of virtual screening benchmark data sets. It constitutes a technique from the field of spatial statistics and provides a mathematical framework for the nonparametric analysis of mapped point patterns. Here, refined nearest neighbor analysis is used to design benchmark data sets for virtual screening based on PubChem bioactivity data. A workflow is devised that purges data sets of compounds active against pharmaceutically relevant targets from unselective hits. Topological optimization using experimental design strategies monitored by refined nearest neighbor analysis functions is applied to generate corresponding data sets of actives and decoys that are unbiased with regard to analogue bias and artificial enrichment. These data sets provide a tool for Maximum Unbiased Validation (MUV) of virtual screening methods. The data sets and a software package implementing the MUV design workflow are freely available at http://www.pharmchem.tu-bs.de/lehre/baumann/MUV.html.
Houel, Julien; Doan, Quang T; Cajgfinger, Thomas; Ledoux, Gilles; Amans, David; Aubret, Antoine; Dominjon, Agnès; Ferriol, Sylvain; Barbier, Rémi; Nasilowski, Michel; Lhuillier, Emmanuel; Dubertret, Benoît; Dujardin, Christophe; Kulzer, Florian
2015-01-27
We present an unbiased and robust analysis method for power-law blinking statistics in the photoluminescence of single nanoemitters, allowing us to extract both the bright- and dark-state power-law exponents from the emitters' intensity autocorrelation functions. As opposed to the widely used threshold method, our technique therefore does not require discriminating the emission levels of bright and dark states in the experimental intensity timetraces. We rely on the simultaneous recording of 450 emission timetraces of single CdSe/CdS core/shell quantum dots at a frame rate of 250 Hz with single photon sensitivity. Under these conditions, our approach can determine ON and OFF power-law exponents with a precision of 3% from a comparison to numerical simulations, even for shot-noise-dominated emission signals with an average intensity below 1 photon per frame and per quantum dot. These capabilities pave the way for the unbiased, threshold-free determination of blinking power-law exponents at the microsecond time scale.
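For illustration, here is a standard maximum-likelihood fit of a power-law exponent from dwell times, the Clauset-style continuous MLE on synthetic Pareto data. This is not the autocorrelation-based method of the paper above, only a sketch of the kind of exponent estimation at stake; all parameters are invented.

```python
import numpy as np

# Draw ON-time dwell durations from a power law p(t) ~ t^(-alpha) for
# t >= t_min, via inverse-CDF sampling, then recover alpha by MLE:
#   alpha_hat = 1 + n / sum(ln(t_i / t_min))
rng = np.random.default_rng(2)
alpha_true, t_min = 1.6, 1e-3          # typical blinking-exponent range
u = rng.uniform(size=20000)
t = t_min * (1.0 - u) ** (-1.0 / (alpha_true - 1.0))

alpha_hat = 1.0 + len(t) / np.log(t / t_min).sum()
```

Unlike threshold-based exponent fits, this MLE uses every dwell time directly; the paper's contribution is obtaining such exponents without even segmenting the trace into dwell times.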
Breeding and Genetics Symposium: really big data: processing and analysis of very large data sets.
Cole, J B; Newman, S; Foertter, F; Aguilar, I; Coffey, M
2012-03-01
Modern animal breeding data sets are large and getting larger, due in part to recent availability of high-density SNP arrays and cheap sequencing technology. High-performance computing methods for efficient data warehousing and analysis are under development. Financial and security considerations are important when using shared clusters. Sound software engineering practices are needed, and it is better to use existing solutions when possible. Storage requirements for genotypes are modest, although full-sequence data will require greater storage capacity. Storage requirements for intermediate and results files for genetic evaluations are much greater, particularly when multiple runs must be stored for research and validation studies. The greatest gains in accuracy from genomic selection have been realized for traits of low heritability, and there is increasing interest in new health and management traits. The collection of sufficient phenotypes to produce accurate evaluations may take many years, and high-reliability proofs for older bulls are needed to estimate marker effects. Data mining algorithms applied to large data sets may help identify unexpected relationships in the data, and improved visualization tools will provide insights. Genomic selection using large data requires a lot of computing power, particularly when large fractions of the population are genotyped. Theoretical improvements have made possible the inversion of large numerator relationship matrices, permitted the solving of large systems of equations, and produced fast algorithms for variance component estimation. Recent work shows that single-step approaches combining BLUP with a genomic relationship (G) matrix have similar computational requirements to traditional BLUP, and the limiting factor is the construction and inversion of G for many genotypes. 
A naïve algorithm for creating G for 14,000 individuals required almost 24 h to run, but custom libraries and parallel computing reduced that to 15 min. Large data sets also create challenges for the delivery of genetic evaluations that must be overcome in a way that does not disrupt the transition from conventional to genomic evaluations. Processing time is important, especially as real-time systems for on-farm decisions are developed. The ultimate value of these systems is to decrease time-to-results in research, increase accuracy in genomic evaluations, and accelerate rates of genetic improvement.
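The genomic relationship matrix G whose construction and inversion the abstract identifies as the limiting factor can be sketched at toy scale; this uses VanRaden's method 1 (an assumption about which G is meant) on simulated genotypes.

```python
import numpy as np

# Build G = Z Z' / (2 * sum p_k (1 - p_k)), where Z is the 0/1/2
# genotype matrix centred by twice the allele frequencies (VanRaden
# method 1). Sizes here are toy-scale; real evaluations use many
# thousands of animals, which is why construction/inversion dominates.
rng = np.random.default_rng(3)
n, m = 20, 2000                        # animals x SNP markers
p = rng.uniform(0.1, 0.9, size=m)      # true allele frequencies
M = rng.binomial(2, p, size=(n, m)).astype(float)

p_hat = M.mean(axis=0) / 2.0           # observed frequencies
Z = M - 2.0 * p_hat
G = Z @ Z.T / (2.0 * (p_hat * (1.0 - p_hat)).sum())
```

G is symmetric with diagonal elements near 1 for non-inbred animals; at n genotyped animals, storage is O(n²) and inversion O(n³), the scaling the abstract is concerned with.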
Farmer, Jocelyn R; Ong, Mei-Sing; Barmettler, Sara; Yonker, Lael M; Fuleihan, Ramsay; Sullivan, Kathleen E; Cunningham-Rundles, Charlotte; Walter, Jolan E
2017-01-01
Common variable immunodeficiency (CVID) is increasingly recognized for its association with autoimmune and inflammatory complications. Despite recent advances in immunophenotypic and genetic discovery, clinical care of CVID remains limited by our inability to accurately model risk for non-infectious disease development. Herein, we demonstrate the utility of unbiased network clustering as a novel method to analyze inter-relationships between non-infectious disease outcomes in CVID using databases at the United States Immunodeficiency Network (USIDNET), the centralized immunodeficiency registry of the United States, and Partners, a tertiary care network in Boston, MA, USA, with a shared electronic medical record amenable to natural language processing. Immunophenotypes were comparable in terms of native antibody deficiencies, low titer response to pneumococcus, and B cell maturation arrest. However, recorded non-infectious disease outcomes were more substantial in the Partners cohort across the spectrum of lymphoproliferation, cytopenias, autoimmunity, atopy, and malignancy. Using unbiased network clustering to analyze 34 non-infectious disease outcomes in the Partners cohort, we further identified unique patterns of lymphoproliferative (two clusters), autoimmune (two clusters), and atopic (one cluster) disease that were defined as CVID non-infectious endotypes according to discrete and non-overlapping immunophenotypes. Markers were both previously described {high serum IgE in the atopic cluster [odds ratio (OR) 6.5] and low class-switched memory B cells in the total lymphoproliferative cluster (OR 9.2)} and novel [low serum C3 in the total lymphoproliferative cluster (OR 5.1)]. Mortality risk in the Partners cohort was significantly associated with individual non-infectious disease outcomes as well as lymphoproliferative cluster 2, specifically (OR 5.9). In contrast, unbiased network clustering failed to associate known comorbidities in the adult USIDNET cohort. 
Together, these data suggest that unbiased network clustering can be used in CVID to redefine non-infectious disease inter-relationships; however, applicability may be limited to datasets well annotated through mechanisms such as natural language processing. The lymphoproliferative, autoimmune, and atopic Partners CVID endotypes herein described can be used moving forward to streamline genetic and biomarker discovery and to facilitate early screening and intervention in CVID patients at highest risk for autoimmune and inflammatory progression.
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
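One of the cohort results above, that adjusting a linear outcome model for the propensity score yields an unbiased conditional exposure effect, can be checked with a toy simulation; the data-generating process here is invented, and the propensity score is made linear in the confounder so the adjusted model is correctly specified.

```python
import numpy as np

# Single confounder x, exposure t with true propensity score ps linear
# in x, and a linear outcome with true exposure effect 2.0.
rng = np.random.default_rng(4)
n, effect = 20000, 2.0
x = rng.uniform(size=n)
ps = 0.2 + 0.6 * x                     # true propensity score
t = rng.binomial(1, ps)
y = effect * t + 1.0 * x + rng.normal(scale=1.0, size=n)

# OLS of y on [1, t, ps]; since ps is linear in x, adjusting for the
# summary score recovers the conditional effect without bias.
X = np.column_stack([np.ones(n), t, ps])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
effect_hat = beta[1]
```

Replacing the linear outcome with, say, a logistic one is exactly where the abstract's asymptotic results show the score-adjusted estimate drifting from the conditional effect.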
Implementation of a Comprehensive Curriculum in Personal Finance for Medical Fellows
Bar-Or, Yuval D; Fessler, Henry E; Desai, Dipan A; Zakaria, Sammy
2018-01-01
Introduction: Many residents and fellows complete graduate medical education having received minimal unbiased financial planning guidance. This places them at risk of making ill-informed financial decisions, which may lead to significant harm to them and their families. Therefore, we sought to provide fellows with comprehensive unbiased financial education and empower them to make timely, constructive financial decisions. Methods: A self-selected cohort of cardiovascular disease, pulmonary and critical care, and infectious disease fellows (n = 18) at a single institution attended a live, eight-hour interactive course on personal finance. The course consisted of four two-hour sessions delivered over four weeks, facilitated by an unbiased business school faculty member with expertise in personal finance. Prior to the course, all participants completed a demographic survey. After course completion, participants were offered an exit survey evaluating the course, which also asked respondents for any tangible financial decisions made as a result of the course learning. Results: Participants included 12 women and six men, with a mean age of 33 and varying amounts of debt and financial assets. Twelve respondents completed the exit survey, and all “Strongly Agreed” that courses on financial literacy are important for trainees. In addition, 11 reported that the course helped them make important financial decisions, providing 21 examples. Conclusions: Fellows derive a significant benefit from objective financial literacy education. Graduate medical education programs should offer comprehensive financial literacy education to all graduating trainees, and that education should be provided by an unbiased expert who has no incentive to sell financial products and services. PMID:29515942
Detection of seizures from small samples using nonlinear dynamic system theory.
Yaylali, I; Koçak, H; Jayakar, P
1996-07-01
The electroencephalogram (EEG), like many other biological phenomena, is quite likely governed by nonlinear dynamics. Certain characteristics of the underlying dynamics have recently been quantified by computing the correlation dimensions (D2) of EEG time series data. In this paper, D2 of the unbiased autocovariance function of the scalp EEG data was used to detect electrographic seizure activity. Digital EEG data were acquired at a sampling rate of 200 Hz per channel and organized in continuous frames (duration 2.56 s, 512 data points). To increase the reliability of D2 computations with short duration data, raw EEG data were initially simplified using unbiased autocovariance analysis to highlight the periodic activity that is present during seizures. The D2 computation was then performed from the unbiased autocovariance function of each channel using the Grassberger-Procaccia method with Theiler's box-assisted correlation algorithm. Even with short duration data, this preprocessing proved to be computationally robust and displayed no significant sensitivity to implementation details such as the choices of embedding dimension and box size. The system successfully identified various types of seizures in clinical studies.
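The unbiased autocovariance preprocessing described above can be sketched directly; the normalization by N − k at lag k is what makes the estimator unbiased, and the rhythmic test signal below is invented to mimic the periodic activity highlighted during seizures.

```python
import numpy as np

def unbiased_autocov(x):
    """Unbiased autocovariance: lag k is normalised by (N - k),
    the number of products actually summed, rather than by N."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    return np.array([(x[:n - k] * x[k:]).sum() / (n - k)
                     for k in range(n // 2)])

t = np.arange(512)                     # one 2.56 s frame at 200 Hz
x = np.sin(2 * np.pi * t / 20.0)       # rhythmic 10 Hz-like activity
c = unbiased_autocov(x)
# c peaks again at the period (lag 20) and is negative at the
# half-period (lag 10), exposing the periodicity for D2 analysis.
```

Because later lags are averaged over fewer terms, the (N − k) denominator keeps their amplitude undistorted, which matters for short frames like the 512-point windows used here.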
Biased and unbiased perceptual decision-making on vocal emotions.
Dricu, Mihai; Ceravolo, Leonardo; Grandjean, Didier; Frühholz, Sascha
2017-11-24
Perceptual decision-making on emotions involves gathering sensory information about the affective state of another person and forming a decision on the likelihood of a particular state. These perceptual decisions can be of varying complexity as determined by different contexts. We used functional magnetic resonance imaging and a region of interest approach to investigate the brain activation and functional connectivity behind two forms of perceptual decision-making. More complex unbiased decisions on affective voices recruited an extended bilateral network consisting of the posterior inferior frontal cortex, the orbitofrontal cortex, the amygdala, and voice-sensitive areas in the auditory cortex. Less complex biased decisions on affective voices distinctly recruited the right mid inferior frontal cortex, pointing to a functional distinction in this region following decisional requirements. Furthermore, task-induced neural connectivity revealed stronger connections between these frontal, auditory, and limbic regions during unbiased relative to biased decision-making on affective voices. Together, the data shows that different types of perceptual decision-making on auditory emotions have distinct patterns of activations and functional coupling that follow the decisional strategies and cognitive mechanisms involved during these perceptual decisions.
NASA Astrophysics Data System (ADS)
Schaffrin, Burkhard
2008-02-01
In a linear Gauss-Markov model, the parameter estimates from BLUUE (Best Linear Uniformly Unbiased Estimate) are not robust against possible outliers in the observations. Moreover, by giving up the unbiasedness constraint, the mean squared error (MSE) risk may be further reduced, in particular when the problem is ill-posed. In this paper, the α-weighted S-homBLE (Best homogeneously Linear Estimate) is derived via formulas originally used for variance component estimation on the basis of the repro-BIQUUE (reproducing Best Invariant Quadratic Uniformly Unbiased Estimate) principle in a model with stochastic prior information. In the present model, however, such prior information is not included, which allows the comparison of the stochastic approach (α-weighted S-homBLE) with the well-established algebraic approach of Tykhonov-Phillips regularization, also known as R-HAPS (Hybrid APproximation Solution), whenever the inverse of the “substitute matrix” S exists and is chosen as the R matrix that defines the relative impact of the regularizing term on the final result.
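The MSE gain from giving up unbiasedness on an ill-posed problem can be demonstrated numerically. The sketch below compares ordinary least squares with a plain ridge (Tykhonov-Phillips-type) estimate on an invented near-collinear design; it is not the α-weighted S-homBLE itself, and the design, true parameters, and regularization weight are all assumptions.

```python
import numpy as np

# Near-collinear design makes X'X ill-conditioned, so the unbiased
# least-squares estimate has huge variance; ridge trades a small bias
# for a large variance reduction.
rng = np.random.default_rng(5)
n, alpha = 50, 1.0
beta_true = np.array([1.0, 1.0])
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=n)])

sse_ols, sse_ridge = 0.0, 0.0
for _ in range(300):                   # Monte Carlo estimate of MSE
    y = X @ beta_true + rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    b_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(2), X.T @ y)
    sse_ols += ((b_ols - beta_true) ** 2).sum()
    sse_ridge += ((b_ridge - beta_true) ** 2).sum()
```

Over the replicates, the accumulated squared error of the biased ridge estimate is far below that of the unbiased OLS estimate, the same MSE-versus-unbiasedness trade-off the abstract formalizes.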
Babaoglu, Kerim; Simeonov, Anton; Irwin, John J.; Nelson, Michael E.; Feng, Brian; Thomas, Craig J.; Cancian, Laura; Costi, M. Paola; Maltby, David A.; Jadhav, Ajit; Inglese, James; Austin, Christopher P.; Shoichet, Brian K.
2009-01-01
High-throughput screening (HTS) is widely used in drug discovery. Especially for screens of unbiased libraries, false positives can dominate “hit lists”; their origins are much debated. Here we determine the mechanism of every active hit from a screen of 70,563 unbiased molecules against β-lactamase using quantitative HTS (qHTS). Of the 1274 initial inhibitors, 95% were detergent-sensitive and were classified as aggregators. Among the 70 remaining were 25 potent, covalent-acting β-lactams. Mass spectra, counter-screens, and crystallography identified 12 as promiscuous covalent inhibitors. The remaining 33 were either aggregators or irreproducible. No specific reversible inhibitors were found. We turned to molecular docking to prioritize molecules from the same library for testing at higher concentrations. Of 16 tested, 2 were modest inhibitors. Subsequent X-ray structures corresponded to the docking prediction. Analog synthesis improved affinity to 8 µM. These results suggest that it may be the physical behavior of organic molecules, not their reactivity, that accounts for most screening artifacts. Structure-based methods may prioritize weak-but-novel chemotypes in unbiased library screens. PMID:18333608
Construction of mutually unbiased bases with cyclic symmetry for qubit systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seyfarth, Ulrich; Ranade, Kedar S.
2011-10-15
For the complete estimation of arbitrary unknown quantum states by measurements, the use of mutually unbiased bases has been well established in theory and experiment for the past 20 years. However, most constructions of these bases make heavy use of abstract algebra and the mathematical theory of finite rings and fields, and no simple and generally accessible construction is available. This is particularly true in the case of a system composed of several qubits, which is arguably the most important case in quantum information science and quantum computation. In this paper, we close this gap by providing a simple and straightforward method for the construction of mutually unbiased bases in the case of a qubit register. We show that our construction is also accessible to experiments, since only Hadamard and controlled-phase gates are needed, which are available in most practical realizations of a quantum computer. Moreover, our scheme possesses the optimal scaling possible, i.e., the number of gates scales only linearly in the number of qubits.
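For the single-qubit case the construction reduces to something directly checkable: the eigenbases of the Pauli Z, X and Y operators (the X basis being the Hadamard transform of the Z basis) form a complete set of d + 1 = 3 mutually unbiased bases. A minimal numerical check:

```python
import numpy as np

# Eigenbases of Pauli Z, X and Y: any two vectors drawn from different
# bases have overlap |<e|f>|^2 = 1/d = 1/2, the defining MUB property.
z_basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
x_basis = [np.array([1, 1]) / np.sqrt(2),      # Hadamard applied to z_basis
           np.array([1, -1]) / np.sqrt(2)]
y_basis = [np.array([1, 1j]) / np.sqrt(2),
           np.array([1, -1j]) / np.sqrt(2)]

bases = [z_basis, x_basis, y_basis]
for i in range(3):
    for j in range(i + 1, 3):
        for e in bases[i]:
            for f in bases[j]:
                overlap = abs(np.vdot(e, f)) ** 2
                assert abs(overlap - 0.5) < 1e-12
print("3 mutually unbiased bases verified for d = 2")
```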
Efficiency optimization in a correlation ratchet with asymmetric unbiased fluctuations
NASA Astrophysics Data System (ADS)
Ai, Bao-Quan; Wang, Xian-Ju; Liu, Guo-Tao; Wen, De-Hua; Xie, Hui-Zhang; Chen, Wei; Liu, Liang-Gang
2003-12-01
The efficiency of a Brownian particle moving in a periodic potential in the presence of asymmetric unbiased fluctuations is investigated. We found that even in the quasistatic limit there is a regime where the efficiency can be a peaked function of temperature, which proves that thermal fluctuations facilitate the efficiency of energy transformation, contradicting the earlier findings [H. Kamegawa et al., Phys. Rev. Lett. 80, 5251 (1998)]. It is also found that the mutual interplay between temporal asymmetry and spatial asymmetry may induce optimized efficiency at finite temperatures. The ratchet is not most efficient when it gives maximum current.
AD620SQ/883B Total Ionizing Dose Radiation Lot Acceptance Report for RESTORE-LEO
NASA Technical Reports Server (NTRS)
Burton, Noah; Campola, Michael
2017-01-01
A Radiation Lot Acceptance Test was performed on the AD620SQ/883B, Lot 1708D, in accordance with MIL-STD-883, Method 1019, Condition D. Using a Co-60 source, 4 biased parts and 4 unbiased parts were irradiated at 10 mrad/s (0.036 krad/hr) in steps of approximately 1 krad from 3 to 10 krad, and in steps of 5 krad from 10 to 25 krad. The parts were then annealed unbiased at 25 degrees Celsius for 2 days and subsequently annealed biased at 25 degrees Celsius for another 7 days.
Quasi interpolation with Voronoi splines.
Mirzargar, Mahsa; Entezari, Alireza
2011-12-01
We present a quasi interpolation framework that attains the optimal approximation-order of Voronoi splines for reconstruction of volumetric data sampled on general lattices. The quasi interpolation framework of Voronoi splines provides an unbiased reconstruction method across various lattices. Therefore this framework allows us to analyze and contrast the sampling-theoretic performance of general lattices, using signal reconstruction, in an unbiased manner. Our quasi interpolation methodology is implemented as an efficient FIR filter that can be applied online or as a preprocessing step. We present visual and numerical experiments that demonstrate the improved accuracy of reconstruction across lattices, using the quasi interpolation framework. © 2011 IEEE
Vicini, P; Fields, O; Lai, E; Litwack, E D; Martin, A-M; Morgan, T M; Pacanowski, M A; Papaluca, M; Perez, O D; Ringel, M S; Robson, M; Sakul, H; Vockley, J; Zaks, T; Dolsten, M; Søgaard, M
2016-02-01
High throughput molecular and functional profiling of patients is a key driver of precision medicine. DNA and RNA characterization has been enabled at unprecedented cost and scale through rapid, disruptive progress in sequencing technology, but challenges persist in data management and interpretation. We analyze the state of the art of large-scale unbiased sequencing (LUS) in drug discovery and development, including technology, application, ethical, regulatory, policy and commercial considerations, and discuss issues of LUS implementation in clinical and regulatory practice. © 2015 American Society for Clinical Pharmacology and Therapeutics.
Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset.
Seashore-Ludlow, Brinton; Rees, Matthew G; Cheah, Jaime H; Cokol, Murat; Price, Edmund V; Coletti, Matthew E; Jones, Victor; Bodycombe, Nicole E; Soule, Christian K; Gould, Joshua; Alexander, Benjamin; Li, Ava; Montgomery, Philip; Wawer, Mathias J; Kuru, Nurdan; Kotz, Joanne D; Hon, C Suk-Yee; Munoz, Benito; Liefeld, Ted; Dančík, Vlado; Bittker, Joshua A; Palmer, Michelle; Bradner, James E; Shamji, Alykhan F; Clemons, Paul A; Schreiber, Stuart L
2015-11-01
Identifying genetic alterations that prime a cancer cell to respond to a particular therapeutic agent can facilitate the development of precision cancer medicines. Cancer cell-line (CCL) profiling of small-molecule sensitivity has emerged as an unbiased method to assess the relationships between genetic or cellular features of CCLs and small-molecule response. Here, we developed annotated cluster multidimensional enrichment analysis to explore the associations between groups of small molecules and groups of CCLs in a new, quantitative sensitivity dataset. This analysis reveals insights into small-molecule mechanisms of action, and genomic features that associate with CCL response to small-molecule treatment. We are able to recapitulate known relationships between FDA-approved therapies and cancer dependencies and to uncover new relationships, including for KRAS-mutant cancers and neuroblastoma. To enable the cancer community to explore these data, and to generate novel hypotheses, we created an updated version of the Cancer Therapeutic Response Portal (CTRP v2). We present the largest CCL sensitivity dataset yet available, and an analysis method integrating information from multiple CCLs and multiple small molecules to identify CCL response predictors robustly. We updated the CTRP to enable the cancer research community to leverage these data and analyses. ©2015 American Association for Cancer Research.
AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.
Wang, Sheng; Sun, Siqi; Xu, Jinbo
2016-09-01
Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, since their label distributions are highly imbalanced, and has similar performance to the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors for these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.
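The pairwise-ranking view of AUC described above can be sketched in a few lines. Note the paper approximates the step function by a polynomial; for brevity this illustration uses a sigmoid surrogate instead:

```python
import numpy as np

def empirical_auc(scores_pos, scores_neg):
    """Empirical AUC in its pairwise-ranking form: the fraction of
    (positive, negative) pairs ranked correctly (ties count 1/2)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return np.mean((diff > 0) + 0.5 * (diff == 0))

def smooth_auc(scores_pos, scores_neg, gamma=2.0):
    """Differentiable surrogate: replace the 0/1 step with a sigmoid,
    so the objective admits gradient-based maximization."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return np.mean(1.0 / (1.0 + np.exp(-gamma * diff)))

pos = np.array([0.9, 0.8, 0.4])
neg = np.array([0.7, 0.3])
print(empirical_auc(pos, neg))  # 5 of 6 pairs correctly ordered -> 5/6
```

Because every positive is compared with every negative, the measure is insensitive to how imbalanced the label counts are, which is the motivation given in the abstract.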
Threshold models for genome-enabled prediction of ordinal categorical traits in plant breeding.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo; Eskridge, Kent; Crossa, José
2014-12-23
Categorical scores for disease susceptibility or resistance often are recorded in plant breeding. The aim of this study was to introduce genomic models for analyzing ordinal characters and to assess the predictive ability of genomic predictions for ordered categorical phenotypes using a threshold model counterpart of the Genomic Best Linear Unbiased Predictor (i.e., TGBLUP). The threshold model was used to relate a hypothetical underlying scale to the outward categorical response. We present an empirical application where a total of nine models, five without interaction and four with genomic × environment interaction (G×E) and genomic additive × additive × environment interaction (G×G×E), were used. We assessed the proposed models using data consisting of 278 maize lines genotyped with 46,347 single-nucleotide polymorphisms and evaluated for disease resistance [with ordinal scores from 1 (no disease) to 5 (complete infection)] in three environments (Colombia, Zimbabwe, and Mexico). Models with G×E captured a sizeable proportion of the total variability, which indicates the importance of introducing interaction to improve prediction accuracy. Relative to models based on main effects only, the models that included G×E achieved 9-14% gains in prediction accuracy; adding additive × additive interactions did not increase prediction accuracy consistently across locations. Copyright © 2015 Montesinos-López et al.
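The threshold model's link between the latent scale and the observed categories can be sketched directly: ordered cutpoints tau_k partition a latent liability, with P(y <= k) = Phi(tau_k - eta). A minimal probit sketch with hypothetical cutpoints (not estimates from the study):

```python
import numpy as np
from math import erf

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / 2.0 ** 0.5))

def ordinal_probs(eta, cuts):
    """Threshold (probit) model: a latent liability eta is mapped to
    K ordered categories by cutpoints, P(y <= k) = Phi(tau_k - eta)."""
    cum = np.array([phi(t - eta) for t in cuts] + [1.0])
    return np.diff(np.concatenate(([0.0], cum)))

# Hypothetical cutpoints for 5 disease scores (1 = none ... 5 = complete):
cuts = [-1.5, -0.5, 0.5, 1.5]
p = ordinal_probs(eta=0.8, cuts=cuts)   # eta: genomic value of a line
print(np.round(p, 3))
```

In TGBLUP, eta would be the genomic (plus G×E) predicted value of a line, and the cutpoints are estimated jointly with the variance components.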
Physics GRE Scores of Prize Postdoctoral Fellows in Astronomy
NASA Astrophysics Data System (ADS)
Levesque, Emily M.; Bezanson, Rachel; Tremblay, Grant
2017-01-01
The Physics GRE has long been a required element of the graduate admissions process in many U.S. astronomy programs; however, its predictive power and utility as a means of selecting "successful" applicants had not been quantitatively examined until recently. In the fall of 2015 we circulated a short questionnaire to 271 people who have held U.S. prize postdoctoral fellowships in astrophysics between 2010-2015, asking them to report their Physics GRE scores. The response rate was 64%, and the responding sample was unbiased with respect to the overall gender distribution of prize fellows. The responses revealed that the Physics GRE scores of prize fellows do not adhere to any minimum percentile score and show no statistically significant correlation with the number of first author papers published. As an example, a Physics GRE percentile cutoff of 60% would have eliminated 44% of 2010-2015 U.S. prize postdoctoral fellows, including 60% of the female fellows. From these data, we found no evidence that the Physics GRE could be used as an effective predictor of "success" either in or beyond graduate school. Following this work and last year's official recommendation from the AAS, several astronomy departments have recently decided to eliminate the Physics GRE as a requirement for graduate applicants.
Arnold Anteraper, Sheeba; Guell, Xavier; D'Mello, Anila; Joshi, Neha; Whitfield-Gabrieli, Susan; Joshi, Gagan
2018-06-13
To examine the resting-state functional-connectivity (RsFc) in young adults with high-functioning autism spectrum disorder (HF-ASD) using state-of-the-art fMRI data acquisition and analysis techniques. Simultaneous multi-slice, high temporal resolution fMRI acquisition; unbiased whole-brain connectome-wide multivariate pattern analysis (MVPA) techniques for assessing RsFc; and post-hoc whole-brain seed-to-voxel analyses using MVPA results as seeds. MVPA revealed two clusters of abnormal connectivity in the cerebellum. Whole-brain seed-based functional connectivity analyses informed by MVPA-derived clusters showed significant underconnectivity between the cerebellum and social, emotional, and language brain regions in the HF-ASD group compared to healthy controls. The results we report are consistent with existing structural, functional, and RsFc literature in autism, extend previous literature reporting cerebellar abnormalities in the neuropathology of autism, and highlight the cerebellum as a potential target for therapeutic, diagnostic, predictive, and prognostic developments in ASD. The description of functional connectivity abnormalities using whole-brain, data-driven analyses as reported in the present study may crucially advance the development of ASD biomarkers, targets for therapeutic interventions, and neural predictors for measuring treatment response.
A calibration hierarchy for risk models was defined: from utopia to empirical data.
Van Calster, Ben; Nieboer, Daan; Vergouwe, Yvonne; De Cock, Bavo; Pencina, Michael J; Steyerberg, Ewout W
2016-06-01
Calibrated risk models are vital for valid decision support. We define four levels of calibration and describe implications for model development and external validation of predictions. We present results based on simulated data sets. A common definition of calibration is "having an event rate of R% among patients with a predicted risk of R%," which we refer to as "moderate calibration." Weaker forms of calibration only require the average predicted risk (mean calibration) or the average prediction effects (weak calibration) to be correct. "Strong calibration" requires that the event rate equals the predicted risk for every covariate pattern. This implies that the model is fully correct for the validation setting. We argue that this is unrealistic: the model type may be incorrect, the linear predictor is only asymptotically unbiased, and all nonlinear and interaction effects would have to be correctly modeled. In addition, we prove that moderate calibration guarantees nonharmful decision making. Finally, results indicate that a flexible assessment of calibration in small validation data sets is problematic. Strong calibration is desirable for individualized decision support but unrealistic and counterproductive, stimulating the development of overly complex models. Model development and external validation should focus on moderate calibration. Copyright © 2016 Elsevier Inc. All rights reserved.
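The notion of moderate calibration, an event rate of R% among patients with predicted risk R%, can be checked with a simple binned comparison. A minimal sketch on simulated predictions that are perfectly calibrated by construction:

```python
import numpy as np

def calibration_table(pred_risk, outcome, n_bins=4):
    """Moderate-calibration check: within groups of similar predicted
    risk, the observed event rate should match the mean prediction."""
    order = np.argsort(pred_risk)
    return [(pred_risk[chunk].mean(), outcome[chunk].mean())
            for chunk in np.array_split(order, n_bins)]

# Simulated setting: outcomes are drawn exactly at the stated risk,
# so every bin's event rate should track its mean predicted risk.
rng = np.random.default_rng(1)
p = rng.uniform(0.05, 0.95, size=20000)
y = (rng.uniform(size=p.size) < p).astype(int)
for mean_pred, event_rate in calibration_table(p, y):
    print(round(mean_pred, 2), round(event_rate, 2))
```

Mean calibration, the weakest level defined above, only requires the overall averages p.mean() and y.mean() to agree; the binned check is strictly stronger.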
A high-resolution map of the three-dimensional chromatin interactome in human cells.
Jin, Fulai; Li, Yan; Dixon, Jesse R; Selvaraj, Siddarth; Ye, Zhen; Lee, Ah Young; Yen, Chia-An; Schmitt, Anthony D; Espinoza, Celso A; Ren, Bing
2013-11-14
A large number of cis-regulatory sequences have been annotated in the human genome, but defining their target genes remains a challenge. One strategy is to identify the long-range looping interactions at these elements with the use of chromosome conformation capture (3C)-based techniques. However, previous studies lack either the resolution or coverage to permit a whole-genome, unbiased view of chromatin interactions. Here we report a comprehensive chromatin interaction map generated in human fibroblasts using a genome-wide 3C analysis method (Hi-C). We determined over one million long-range chromatin interactions at 5-10-kb resolution, and uncovered general principles of chromatin organization at different types of genomic features. We also characterized the dynamics of promoter-enhancer contacts after TNF-α signalling in these cells. Unexpectedly, we found that TNF-α-responsive enhancers are already in contact with their target promoters before signalling. Such pre-existing chromatin looping, which also exists in other cell types with different extracellular signalling, is a strong predictor of gene induction. Our observations suggest that the three-dimensional chromatin landscape, once established in a particular cell type, is relatively stable and could influence the selection or activation of target genes by a ubiquitous transcription activator in a cell-specific manner.
Epidemiologic research using probabilistic outcome definitions.
Cai, Bing; Hennessy, Sean; Lo Re, Vincent; Small, Dylan S
2015-01-01
Epidemiologic studies using electronic healthcare data often define the presence or absence of binary clinical outcomes by using algorithms with imperfect specificity, sensitivity, and positive predictive value. This results in misclassification and bias in study results. We describe and evaluate a new method called probabilistic outcome definition (POD) that uses logistic regression to estimate the probability of a clinical outcome using multiple potential algorithms and then uses multiple imputation to make valid inferences about the risk ratio or other epidemiologic parameters of interest. We conducted a simulation to evaluate the performance of the POD method with two variables that can predict the true outcome and compared the POD method with the conventional method. The simulation results showed that when the true risk ratio is equal to 1.0 (null), the conventional method based on a binary outcome provides unbiased estimates. However, when the risk ratio is not equal to 1.0, the traditional method, either using one predictive variable or both predictive variables to define the outcome, is biased when the positive predictive value is <100%, and the bias is very severe when the sensitivity or positive predictive value is poor (less than 0.75 in our simulation). In contrast, the POD method provides unbiased estimates of the risk ratio both when this measure of effect is equal to 1.0 and not equal to 1.0. Even when the sensitivity and positive predictive value are low, the POD method continues to provide unbiased estimates of the risk ratio. The POD method provides an improved way to define outcomes in database research. This method has a major advantage over the conventional method in that it provided unbiased estimates of risk ratios and it is easy to use. Copyright © 2014 John Wiley & Sons, Ltd.
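The POD idea can be illustrated schematically: replace the 0/1 outcome with a predicted probability, impute outcomes from it several times, and average the resulting estimates (a full analysis would also combine variances via Rubin's rules). A toy simulation with a hypothetical well-calibrated outcome probability, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_imputations = 50000, 20
exposed = rng.integers(0, 2, size=n).astype(bool)
# True risks: 0.10 unexposed vs 0.15 exposed (risk ratio 1.5).
p_true = np.where(exposed, 0.15, 0.10)
# Pretend a (hypothetical) logistic model of imperfect algorithms
# yields a well-calibrated outcome probability instead of a 0/1 label:
p_hat = np.clip(p_true + rng.normal(0.0, 0.02, size=n), 0.01, 0.99)
# Multiple imputation: draw the outcome from p_hat, estimate the risk
# ratio in each completed dataset, then average the estimates.
rrs = []
for _ in range(n_imputations):
    y_imp = rng.uniform(size=n) < p_hat
    rrs.append(y_imp[exposed].mean() / y_imp[~exposed].mean())
print(f"imputed risk ratio ~ {np.mean(rrs):.2f} (true value 1.5)")
```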
NASA Astrophysics Data System (ADS)
Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.
2017-12-01
Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as an increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications for remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
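The unbiased stratified estimator used for activity data follows the standard design-based form: the class proportion is a weighted sum of within-stratum sample proportions, with a variance that combines the within-stratum binomial variances. A sketch with hypothetical numbers, not the study's data:

```python
import math

def stratified_area_estimate(strata):
    """Unbiased stratified estimator of an area proportion.
    strata: list of (W_h, n_h, x_h) tuples giving the stratum weight
    (mapped area fraction), sample size, and count of reference-labeled
    class members in that stratum."""
    p_hat = sum(W * x / n for W, n, x in strata)
    var = sum(W**2 * (x / n) * (1 - x / n) / (n - 1) for W, n, x in strata)
    return p_hat, math.sqrt(var)

# Hypothetical two-stratum design: mapped "stable forest" (90% of the
# area) and mapped "forest loss" (10%), estimating true loss area.
strata = [(0.90, 300, 6),   # 6 of 300 stable-forest samples were loss
          (0.10, 100, 80)]  # 80 of 100 loss samples confirmed as loss
p, se = stratified_area_estimate(strata)
print(f"loss area proportion: {p:.3f} +/- {1.96 * se:.3f} (95% CI)")
```

Because the estimate uses the reference labels within each map stratum, map commission and omission errors are corrected rather than propagated, which is what makes the estimator unbiased.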
Constructing statistically unbiased cortical surface templates using feature-space covariance
NASA Astrophysics Data System (ADS)
Parvathaneni, Prasanna; Lyu, Ilwoo; Huo, Yuankai; Blaber, Justin; Hainline, Allison E.; Kang, Hakmook; Woodward, Neil D.; Landman, Bennett A.
2018-03-01
The choice of surface template plays an important role in cross-sectional subject analyses involving cortical brain surfaces because there is a tendency toward registration bias given variations in inter-individual and inter-group sulcal and gyral patterns. In order to account for the bias and spatial smoothing, we propose a feature-based unbiased average template surface. In contrast to prior approaches, we factor in the sample population covariance and assign weights based on feature information to minimize the influence of covariance in the sampled population. The mean surface is computed by applying the weights obtained from an inverse covariance matrix, which guarantees that multiple representations from similar groups (e.g., involving imaging, demographic, diagnosis information) are down-weighted to yield an unbiased mean in feature space. Results are validated by applying this approach in two different applications. For evaluation, the proposed unbiased weighted surface mean is compared with un-weighted means both qualitatively and quantitatively (mean squared error and absolute relative distance of both the means with baseline). In the first application, we validated the stability of the proposed optimal mean on a scan-rescan reproducibility dataset by incrementally adding duplicate subjects. In the second application, we used clinical research data to evaluate the difference between the weighted and unweighted mean when different numbers of subjects were included in control versus schizophrenia groups. In both cases, the proposed method achieved greater stability, indicating a reduced impact of sampling bias. The weighted mean is built based on covariance information in feature space as opposed to spatial location, thus making this a generic approach applicable to any feature of interest.
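The inverse-covariance weighting described above can be illustrated directly: solving C w proportional to 1 and normalizing gives weights that split the influence of near-duplicate samples. A toy sketch of the weighting step only, not the authors' surface pipeline:

```python
import numpy as np

def covariance_weighted_mean(X, cov):
    """Weighted mean with weights from an inverse covariance matrix:
    w = C^{-1} 1 / (1^T C^{-1} 1), so clusters of highly similar
    samples are down-weighted and do not dominate the template."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    w /= w.sum()
    return w @ X, w

# Toy feature-space covariance: subjects 0 and 1 are near-duplicates
# (e.g., a scan-rescan pair), subject 2 is independent.
cov = np.array([[1.0, 0.9, 0.0],
                [0.9, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
X = np.array([[0.0], [0.2], [3.0]])  # one feature per subject
mean, w = covariance_weighted_mean(X, cov)
# The duplicated pair shares roughly the weight the third subject
# carries alone, so duplicates cannot bias the mean.
print(np.round(w, 3), np.round(mean, 3))
```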
Yu, Sheng; Liao, Katherine P; Shaw, Stanley Y; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Cai, Tianxi
2015-09-01
Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. The authors applied our method to develop algorithms to identify patients with rheumatoid arthritis and coronary artery disease cases among those with rheumatoid arthritis from a large multi-institutional EHR. The area under the receiver operating characteristic curves (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features. Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. 
All rights reserved.
Liu, Dajiang J; Leal, Suzanne M
2012-10-05
Next-generation sequencing has led to many complex-trait rare-variant (RV) association studies. Although single-variant association analysis can be performed, it is grossly underpowered. Therefore, researchers have developed many RV association tests that aggregate multiple variant sites across a genetic region (e.g., gene), and test for the association between the trait and the aggregated genotype. After these aggregate tests detect an association, it is only possible to estimate the average genetic effect for a group of RVs. As a result of the "winner's curse," such an estimate can be biased. Although for common variants one can obtain unbiased estimates of genetic parameters by analyzing a replication sample, for RVs it is desirable to obtain unbiased genetic estimates for the study where the association is identified. This is because there can be substantial heterogeneity of RV sites and frequencies even among closely related populations. In order to obtain an unbiased estimate for aggregated RV analysis, we developed bootstrap-sample-split algorithms to reduce the bias of the winner's curse. The unbiased estimates are greatly important for understanding the population-specific contribution of RVs to the heritability of complex traits. We also demonstrate both theoretically and via simulations that for aggregate RV analysis the genetic variance for a gene or region will always be underestimated, sometimes substantially, because of the presence of noncausal variants or because of the presence of causal variants with effects of different magnitudes or directions. Therefore, even if RVs play a major role in the complex-trait etiologies, a portion of the heritability will remain missing, and the contribution of RVs to the complex-trait etiologies will be underestimated. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
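The sample-splitting idea behind the bootstrap-sample-split algorithms can be demonstrated with a toy simulation: estimating an effect on the same data used to declare it significant inflates the estimate, while estimating on a held-out half does not. This is a schematic illustration of the winner's curse, not the authors' algorithm:

```python
import numpy as np

rng = np.random.default_rng(7)
true_effect, n, trials = 0.2, 100, 2000
naive, split = [], []
for _ in range(trials):
    x = rng.normal(true_effect, 1.0, size=n)
    # Naive: report the effect only when the same data flagged it as
    # "significant" (one-sided z-test) -- the winner's curse inflates it.
    if x.mean() > 1.645 / np.sqrt(n):
        naive.append(x.mean())
    # Sample splitting: detect on one half, estimate on the held-out
    # half, so the estimate is independent of the selection event.
    half = n // 2
    if x[:half].mean() > 1.645 / np.sqrt(half):
        split.append(x[half:].mean())
naive_bias = np.mean(naive) - true_effect
split_bias = np.mean(split) - true_effect
print(f"naive bias {naive_bias:+.3f}, split-sample bias {split_bias:+.3f}")
```

The bootstrap variant in the paper recovers the efficiency lost by discarding half the data, but the debiasing mechanism is the same independence between detection and estimation.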
Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information
NASA Technical Reports Server (NTRS)
Howell, L. W., Jr.
2003-01-01
A simple power law model consisting of a single spectral index, sigma_1, is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10^13 eV, with a transition at the knee energy, E_k, to a steeper spectral index sigma_2 > sigma_1 above E_k. The maximum likelihood (ML) procedure was developed for estimating the single parameter sigma_1 of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The statistical properties of the ML estimator were investigated and shown to have the three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotically normally distributed, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine if a given estimation procedure provides an unbiased estimate of the spectral information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort in calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRBs for both the simple and broken power law energy spectra are derived herein and the conditions under which they are attained in practice are investigated.
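For the simple (single-index) power law with no detector response folded in, the ML estimator has a closed form, sigma_hat = 1 + n / sum(ln(E_i / E_min)), with asymptotic standard error (sigma_hat - 1) / sqrt(n) at the Cramer-Rao bound. A quick simulation check of this idealized case:

```python
import numpy as np

rng = np.random.default_rng(42)
sigma_true, e_min, n = 2.7, 1.0, 200000
# Inverse-transform sampling from N(E) ~ E^(-sigma) for E >= e_min.
u = 1.0 - rng.uniform(size=n)          # u in (0, 1]
energies = e_min * u ** (-1.0 / (sigma_true - 1.0))
# Closed-form ML estimator for the single-index power law:
sigma_hat = 1.0 + n / np.log(energies / e_min).sum()
# Asymptotic standard error at the Cramer-Rao bound:
se = (sigma_hat - 1.0) / np.sqrt(n)
print(f"sigma_hat = {sigma_hat:.3f} +/- {se:.3f}")
```

The simulated estimate lands within a few standard errors of the true index, illustrating the consistency and efficiency properties (P1)-(P2); the broken-power-law case in the report requires numerical maximization instead.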
Extension of the Haseman-Elston regression model to longitudinal data.
Won, Sungho; Elston, Robert C; Park, Taesung
2006-01-01
We propose an extension to longitudinal data of the Haseman and Elston regression method for linkage analysis. The proposed model is a mixed model having several random effects. As response variable, we investigate the sibship sample mean corrected cross-product (smHE) and the BLUP-mean corrected cross product (pmHE), comparing them with the original squared difference (oHE), the overall mean corrected cross-product (rHE), and the weighted average of the squared difference and the squared mean-corrected sum (wHE). The proposed model allows for the correlation structure of longitudinal data. Also, the model can test for gene x time interaction to discover genetic variation over time. The model was applied in an analysis of the Genetic Analysis Workshop 13 (GAW13) simulated dataset for a quantitative trait simulating systolic blood pressure. Independence models did not preserve the test sizes, while the mixed models with both family and sibpair random effects tended to preserve size well. Copyright 2006 S. Karger AG, Basel.
Fackler, MaryJo S.; Zhang, Zhe; Lopez-Bujanda, Zoila A.; Jeter, Stacie C.; Sokoll, Lori J.; Garrett-Mayer, Elizabeth; Cope, Leslie M.; Umbricht, Christopher B.; Euhus, David M.; Forero, Andres; Storniolo, Anna M.; Nanda, Rita; Lin, Nancy U.; Carey, Lisa A.; Ingle, James N.; Sukumar, Saraswati; Wolff, Antonio C.
2017-01-01
Purpose Epigenetic alterations measured in blood may help guide breast cancer treatment. The multisite prospective study TBCRC 005 was conducted to examine the ability of a novel panel of cell-free DNA methylation markers to predict survival outcomes in metastatic breast cancer (MBC) using a new quantitative multiplex assay (cMethDNA). Patients and Methods Ten genes were tested in duplicate serum samples from 141 women at baseline, at week 4, and at first restaging. A cumulative methylation index (CMI) was generated on the basis of six of the 10 genes tested. Methylation cut points were selected to maximize the log-rank statistic, and cross-validation was used to obtain unbiased point estimates. Logistic regression or Cox proportional hazard models were used to test associations between the CMI and progression-free survival (PFS), overall survival (OS), and disease status at first restaging. The added value of the CMI in predicting survival outcomes was evaluated and compared with circulating tumor cells (CellSearch). Results Median PFS and OS were significantly shorter in women with a high CMI (PFS, 2.1 months; OS, 12.3 months) versus a low CMI (PFS, 5.8 months; OS, 21.7 months). In multivariable models, among women with MBC, a high versus low CMI at week 4 was independently associated with worse PFS (hazard ratio, 1.79; 95% CI, 1.23 to 2.60; P = .002) and OS (hazard ratio, 1.75; 95% CI, 1.21 to 2.54; P = .003). An increase in the CMI from baseline to week 4 was associated with worse PFS (P < .001) and progressive disease at first restaging (P < .001). Week 4 CMI was a strong predictor of PFS, even in the presence of circulating tumor cells (P = .004). Conclusion Methylation of this gene panel is a strong predictor of survival outcomes in MBC and may have clinical usefulness in risk stratification and disease monitoring. PMID:27870562
NASA Astrophysics Data System (ADS)
Steger, Stefan; Brenning, Alexander; Bell, Rainer; Petschko, Helene; Glade, Thomas
2016-06-01
Empirical models are frequently applied to produce landslide susceptibility maps for large areas. Subsequent quantitative validation results are routinely used as the primary criteria to infer the validity and applicability of the final maps or to select one of several models. This study hypothesizes that such direct deductions can be misleading. The main objective was to explore discrepancies between the predictive performance of a landslide susceptibility model and the geomorphic plausibility of the resulting landslide susceptibility maps, with particular emphasis on the influence of incomplete landslide inventories on modelling and validation results. The study was conducted within the Flysch Zone of Lower Austria (1,354 km2), which is known to be highly susceptible to landslides of the slide-type movement. Sixteen susceptibility models were generated by applying two statistical classifiers (logistic regression and generalized additive model) and two machine learning techniques (random forest and support vector machine) separately for two landslide inventories of differing completeness and two predictor sets. The results were validated quantitatively by estimating the area under the receiver operating characteristic curve (AUROC) with single holdout and spatial cross-validation techniques. The heuristic evaluation of the geomorphic plausibility of the final results was supported by findings of an exploratory data analysis, an estimation of odds ratios and an evaluation of the spatial structure of the final maps. The results showed that maps generated from different inventories, classifiers and predictors appeared different, while holdout validation revealed similarly high predictive performances. Spatial cross-validation proved useful for exposing spatially varying inconsistencies in the modelling results, while additionally providing evidence of slightly overfitted machine learning-based models.
However, the highest predictive performances were obtained for maps that explicitly expressed geomorphically implausible relationships, indicating that the predictive performance of a model can be misleading in cases where a predictor systematically relates to a spatially consistent bias of the inventory. Furthermore, we observed that random forest-based maps displayed spatial artifacts. The most plausible susceptibility map of the study area showed smooth prediction surfaces, while the underlying model revealed a high predictive capability and was generated with an accurate landslide inventory and predictors that did not directly describe a bias. However, none of the presented models was found to be completely unbiased. This study showed that high predictive performances cannot be equated with high plausibility and applicability of the resulting landslide susceptibility maps. We suggest that greater emphasis should be placed on identifying confounding factors and biases in landslide inventories. A joint discussion in the field between modelers and decision makers of the spatial pattern of the final susceptibility maps might increase their acceptance and applicability.
Spectroscopic classification of supernova SN 2018Z by NUTS (NOT Un-biased Transient Survey)
NASA Astrophysics Data System (ADS)
Kuncarayakti, H.; Mattila, S.; Kotak, R.; Harmanen, J.; Reynolds, T.; Pastorello, A.; Benetti, S.; Stritzinger, M.; Onori, F.; Somero, A.; Kangas, T.; Lundqvist, P.; Taddia, F.; Ergon, M.
2018-01-01
The NOT Unbiased Transient Survey (NUTS; ATel #8992) collaboration reports the spectroscopic classification of supernova SN 2018Z in host galaxy SDSS J231809.76+212553.5. The observations were performed with the 2.56 m Nordic Optical Telescope equipped with ALFOSC (range 350-950 nm; resolution 1.6 nm) on 2018-01-09.9 UT.
Survey Name | IAU Name | Discovery (UT) | Discovery mag | Observation (UT) | Redshift | Type | Phase | Notes
PS18ao | SN 2018Z | 2018-01-01.2 | 19.96 | 2018-01-09.9 | 0.102 | Ia | post-maximum? | (1)
(1) Redshift was derived from the SN and host absorption features.
On the mathematical foundations of mutually unbiased bases
NASA Astrophysics Data System (ADS)
Thas, Koen
2018-02-01
In order to describe a setting to handle Zauner's conjecture on mutually unbiased bases (MUBs) (stating that in C^d, a set of MUBs of the theoretical maximal size d + 1 exists only if d is a prime power), we pose some fundamental questions which naturally arise. Some of these questions have important consequences for the construction theory of (new) sets of maximal MUBs. Partial answers will be provided in particular cases; more specifically, we will analyze MUBs with associated operator groups that have nilpotence class 2, and consider MUBs of height 1. We will also confirm Zauner's conjecture for MUBs with associated finite nilpotent operator groups.
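For reference, the defining condition standard in the MUB literature, which the abstract above presupposes:

```latex
% Orthonormal bases \{|e_i\rangle\}_{i=1}^{d} and \{|f_j\rangle\}_{j=1}^{d}
% of \mathbb{C}^d are mutually unbiased iff every overlap has the same modulus:
\left|\langle e_i \mid f_j \rangle\right|^2 = \frac{1}{d}
\qquad \text{for all } i, j \in \{1, \dots, d\}.
```

Zauner's conjecture concerns whether the theoretical maximum of d + 1 pairwise mutually unbiased bases is attainable only when d is a prime power.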
Analysis of conditional genetic effects and variance components in developmental genetics.
Zhu, J
1995-12-01
A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by the minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects. PMID:8601500
Lange, Rense; Thalbourne, Michael A
2002-12-01
Research on the relation between demographic variables and paranormal belief remains controversial, given the possible semantic distortions introduced by item- and test-level biases. We illustrate how Rasch scaling can be used to detect such biases and to quantify their effects, using the Australian Sheep-Goat Scale as a substantive example. Based on data from 1,822 respondents, this test was Rasch scalable, reliable, and unbiased at the test level. Consistent with other research in which unbiased measures of paranormal belief were used, extremely weak age and sex effects were found (partial eta2 = .005 and .012, respectively).
Large deviations in the presence of cooperativity and slow dynamics
NASA Astrophysics Data System (ADS)
Whitelam, Stephen
2018-06-01
We study simple models of intermittency, involving switching between two states, within the dynamical large-deviation formalism. Singularities appear in the formalism when switching is cooperative or when its basic time scale diverges. In the first case the unbiased trajectory distribution undergoes a symmetry breaking, leading to a change in shape of the large-deviation rate function for a particular dynamical observable. In the second case the symmetry of the unbiased trajectory distribution remains unbroken. Comparison of these models suggests that singularities of the dynamical large-deviation formalism can signal the dynamical equivalent of an equilibrium phase transition but do not necessarily do so.
Unbiased classification of spatial strategies in the Barnes maze.
Illouz, Tomer; Madar, Ravit; Clague, Charlotte; Griffioen, Kathleen J; Louzoun, Yoram; Okun, Eitan
2016-11-01
Spatial learning is one of the most widely studied cognitive domains in neuroscience. The Morris water maze and the Barnes maze are the most commonly used techniques to assess spatial learning and memory in rodents. Despite the fact that these tasks are well-validated paradigms for testing spatial learning abilities, manual categorization of performance into behavioral strategies is subject to individual interpretation, and thus to bias. We have previously described an unbiased machine-learning algorithm to classify spatial strategies in the Morris water maze. Here, we offer a support vector machine-based, automated, Barnes-maze unbiased strategy (BUNS) classification algorithm, as well as a cognitive score scale that can be used for memory acquisition, reversal training and probe trials. The BUNS algorithm can greatly benefit Barnes maze users as it provides a standardized method of strategy classification and cognitive scoring scale, which cannot be derived from typical Barnes maze data analysis. Freely available on the web at http://okunlab.wix.com/okunlab as a MATLAB application. Contact: eitan.okun@biu.ac.il. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Detection of sea otters in boat-based surveys of Prince William Sound, Alaska
Udevitz, Mark S.; Bodkin, James L.; Costa, Daniel P.
1995-01-01
Boat-based surveys have been commonly used to monitor sea otter populations, but there has been little quantitative work to evaluate detection biases that may affect these surveys. We used ground-based observers to investigate sea otter detection probabilities in a boat-based survey of Prince William Sound, Alaska. We estimated that 30% of the otters present on surveyed transects were not detected by boat crews. Approximately half (53%) of the undetected otters were missed because the otters left the transects, apparently in response to the approaching boat. Unbiased estimates of detection probabilities will be required for obtaining unbiased population estimates from boat-based surveys of sea otters. Therefore, boat-based surveys should include methods to estimate sea otter detection probabilities under the conditions specific to each survey. Unbiased estimation of detection probabilities with ground-based observers requires either that the ground crews detect all of the otters in observed subunits, or that there are no errors in determining which crews saw each detected otter. Ground-based observer methods may be appropriate in areas where nearly all of the sea otter habitat is potentially visible from ground-based vantage points.
Russell, Joseph A; Campos, Brittany; Stone, Jennifer; Blosser, Erik M; Burkett-Cadena, Nathan; Jacobs, Jonathan L
2018-04-03
The future of infectious disease surveillance and outbreak response is trending towards smaller hand-held solutions for point-of-need pathogen detection. Here, samples of Culex cedecei mosquitoes collected in Southern Florida, USA were tested for Venezuelan Equine Encephalitis Virus (VEEV), a previously-weaponized arthropod-borne RNA-virus capable of causing acute and fatal encephalitis in animal and human hosts. A single 20-mosquito pool tested positive for VEEV by quantitative reverse transcription polymerase chain reaction (RT-qPCR) on the Biomeme two3. The virus-positive sample was subjected to unbiased metatranscriptome sequencing on the Oxford Nanopore MinION and shown to contain Everglades Virus (EVEV), an alphavirus in the VEEV serocomplex. Our results demonstrate, for the first time, the use of unbiased sequence-based detection and subtyping of a high-consequence biothreat pathogen directly from an environmental sample using field-forward protocols. The development and validation of methods designed for field-based diagnostic metagenomics and pathogen discovery, such as those suitable for use in mobile "pocket laboratories", will address a growing demand for public health teams to carry out their mission where it is most urgent: at the point-of-need.
Using Maximum Entropy to Find Patterns in Genomes
NASA Astrophysics Data System (ADS)
Liu, Sophia; Hockenberry, Adam; Lancichinetti, Andrea; Jewett, Michael; Amaral, Luis
The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. To accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. This approach can also easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes. National Institute of General Medical Science, Northwestern University Presidential Fellowship, National Science Foundation, David and Lucile Packard Foundation, Camille Dreyfus Teacher Scholar Award.
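A minimal sketch of one maximum-entropy construction consistent with the abstract (our own illustration, not the authors' published tool): among synonymous codons, the max-ent distribution under a GC-content constraint is an exponentially tilted one, with the tilt parameter controlling GC usage. The codon table here is truncated to four amino acids for brevity.

```python
import math
import random

# Truncated codon table for illustration only (the full genetic code
# covers all 20 amino acids plus stop codons).
SYNONYMS = {
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "K": ["AAA", "AAG"],
}

def gc_count(codon):
    return sum(base in "GC" for base in codon)

def sample_codon(aa, lam, rng=random):
    """Max-ent (exponentially tilted) choice among synonymous codons:
    lam = 0 is uniform; lam > 0 biases toward GC-rich codons while
    always preserving the encoded amino acid."""
    codons = SYNONYMS[aa]
    weights = [math.exp(lam * gc_count(c)) for c in codons]
    r = rng.random() * sum(weights)
    for c, w in zip(codons, weights):
        r -= w
        if r < 0:
            return c
    return codons[-1]

def sample_sequence(protein, lam, rng=random):
    return "".join(sample_codon(aa, lam, rng) for aa in protein)

random.seed(1)
seq_uniform = sample_sequence("LSGK" * 50, lam=0.0)   # unconstrained GC
seq_tilted = sample_sequence("LSGK" * 50, lam=1.5)    # GC-enriched
```

In practice lam would be solved for numerically so that the expected GC fraction matches the target, which is the standard Lagrange-multiplier step in a maximum-entropy fit.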
2016-01-01
We report a theoretical description and numerical tests of the extended-system adaptive biasing force method (eABF), together with an unbiased estimator of the free energy surface from eABF dynamics. Whereas the original ABF approach uses its running estimate of the free energy gradient as the adaptive biasing force, eABF is built on the idea that the exact free energy gradient is not necessary for efficient exploration, and that it is still possible to recover the exact free energy separately with an appropriate estimator. eABF does not directly bias the collective coordinates of interest, but rather fictitious variables that are harmonically coupled to them; therefore it does not require second-derivative estimates, making it easily applicable to a wider range of problems than ABF. Furthermore, the extended variables present a smoother, coarse-grain-like sampling problem on a mollified free energy surface, leading to faster exploration and convergence. We also introduce CZAR, a simple, unbiased free energy estimator from eABF trajectories. eABF/CZAR converges to the physical free energy surface faster than standard ABF for a wide range of parameters. PMID:27959559
Power Generation from a Radiative Thermal Source Using a Large-Area Infrared Rectenna
NASA Astrophysics Data System (ADS)
Shank, Joshua; Kadlec, Emil A.; Jarecki, Robert L.; Starbuck, Andrew; Howell, Stephen; Peters, David W.; Davids, Paul S.
2018-05-01
Electrical power generation from a moderate-temperature thermal source by means of direct conversion of infrared radiation is important and highly desirable for energy harvesting from waste heat and micropower applications. Here, we demonstrate direct rectified power generation from an unbiased large-area nanoantenna-coupled tunnel diode rectifier called a rectenna. Using a vacuum radiometric measurement technique with irradiation from a temperature-stabilized thermal source, a generated power density of 8 nW/cm2 is observed at a source temperature of 450 °C for the unbiased rectenna across an optimized load resistance. The optimized load resistance for peak power generation at each temperature coincides with the tunnel diode resistance at zero bias and corresponds to the impedance-matching condition for a rectifying antenna. Current-voltage measurements of a thermally illuminated large-area rectenna show current zero-crossing shifts into the second quadrant, indicating rectification. Photon-assisted tunneling in the unbiased rectenna is modeled as the mechanism for the large short-circuit photocurrents observed, where the photon energy serves as an effective bias across the tunnel junction. The measured current and voltage across the load resistor as a function of the thermal source temperature represent direct-current electrical power generation.
Prioritizing causal disease genes using unbiased genomic features.
Deo, Rahul C; Musso, Gabriel; Tasan, Murat; Tang, Paul; Poon, Annie; Yuan, Christiana; Felix, Janine F; Vasan, Ramachandran S; Beroukhim, Rameen; De Marco, Teresa; Kwok, Pui-Yan; MacRae, Calum A; Roth, Frederick P
2014-12-03
Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
Nie, Binbin; Liang, Shengxiang; Jiang, Xiaofeng; Duan, Shaofeng; Huang, Qi; Zhang, Tianhao; Li, Panlong; Liu, Hua; Shan, Baoci
2018-06-07
Positron emission tomography (PET) imaging of functional metabolism has been widely used to investigate functional recovery and to evaluate therapeutic efficacy after stroke. The voxel intensity of a PET image is the most important indicator of cellular activity, but is affected by other factors such as the basal metabolic ratio of each subject. In order to locate dysfunctional regions accurately, intensity normalization by a scale factor is a prerequisite in the data analysis, for which the global mean value is most widely used. However, this is unsuitable for stroke studies. Alternatively, a specified scale factor calculated from a reference region is also used, comprising neither hyper- nor hypo-metabolic voxels. But there is no such recognized reference region for stroke studies. Therefore, we proposed a totally data-driven automatic method for unbiased scale factor generation. This factor was generated iteratively until the residual deviation of two adjacent scale factors was reduced by < 5%. Moreover, both simulated and real stroke data were used for evaluation, and these suggested that our proposed unbiased scale factor has better sensitivity and accuracy for stroke studies.
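A minimal sketch of one plausible reading of the iterative scheme above (the z-score exclusion rule is our assumption; the abstract specifies only the data-driven iteration and the < 5% stopping criterion):

```python
import statistics

def unbiased_scale_factor(voxels, z_cut=2.0, tol=0.05, max_iter=50):
    """Data-driven intensity-normalization factor: iteratively exclude
    hyper-/hypo-metabolic voxels (beyond z_cut SDs of the current factor)
    and recompute the mean, stopping once two successive factors differ
    by less than tol (the abstract's <5% criterion)."""
    scale = statistics.mean(voxels)
    for _ in range(max_iter):
        sd = statistics.pstdev(voxels)
        kept = [v for v in voxels if abs(v - scale) <= z_cut * sd]
        new_scale = statistics.mean(kept)
        if abs(new_scale - scale) <= tol * abs(scale):
            return new_scale
        scale = new_scale
    return scale

# 90 normal voxels plus 10 hyper-metabolic outliers: the iterative factor
# converges to the bulk level rather than the outlier-inflated global mean.
factor = unbiased_scale_factor([1.0] * 90 + [10.0] * 10)
```

The point of such a factor, as in the abstract, is that the global mean (1.9 here) would be dragged upward by lesion-affected voxels, biasing the normalization.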
Mohammad-Zamani, Mohammad Javad; Neshat, Mohammad; Moravvej-Farshi, Mohammad Kazem
2016-01-15
A new generation of unbiased antennaless CW terahertz (THz) photomixer emitter arrays made of asymmetric metal-semiconductor-metal (MSM) gratings with a subwavelength pitch, operating in the optical near-field regime, is proposed. We take advantage of size effects in near-field optics and electrostatics to demonstrate the possibility of enhancing the THz power by 4 orders of magnitude, compared to a similar unbiased antennaless array of the same size that operates in the far-field regime. We show that, with the appropriate choice of grating parameters in such THz sources, the first plasmonic resonant cavity mode in the nanoslit between two adjacent MSMs can enhance the optical near-field absorption and, hence, the generation of photocarriers under the slit in the active medium. These photocarriers, on the other hand, are accelerated by the large built-in electric field sustained under the nanoslits by two dissimilar Schottky barriers to create the desired large THz power that is mainly radiated downward. The proposed structure can be tuned in a broadband frequency range of 0.1-3 THz, with output power increasing with frequency.
Probability Theory Plus Noise: Descriptive Estimation and Inferential Judgment.
Costello, Fintan; Watts, Paul
2018-01-01
We describe a computational model of two central aspects of people's probabilistic reasoning: descriptive probability estimation and inferential probability judgment. This model assumes that people's reasoning follows standard frequentist probability theory, but it is subject to random noise. This random noise has a regressive effect in descriptive probability estimation, moving probability estimates away from normative probabilities and toward the center of the probability scale. This random noise has an anti-regressive effect in inferential judgment, however. These regressive and anti-regressive effects explain various reliable and systematic biases seen in people's descriptive probability estimation and inferential probability judgment. This model predicts that these contrary effects will tend to cancel out in tasks that involve both descriptive estimation and inferential judgment, leading to unbiased responses in those tasks. We test this model by applying it to one such task, described by Gallistel et al. Participants' median responses in this task were unbiased, agreeing with normative probability theory over the full range of responses. Our model captures the pattern of unbiased responses in this task, while simultaneously explaining systematic biases away from normatively correct probabilities seen in other tasks. Copyright © 2018 Cognitive Science Society, Inc.
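The regressive effect in descriptive estimation can be sketched in a few lines, assuming the standard probability-theory-plus-noise reading in which each stored instance of an event is misread with some flip probability d (the parameter names are ours):

```python
import random

def noisy_estimate(p, d, n=10_000, rng=random):
    """Descriptive probability estimate under random read noise: the event
    holds with probability p, and each of n stored instances is misread
    with probability d, so E[estimate] = (1 - 2d)*p + d, which is pulled
    toward the center of the probability scale (0.5)."""
    hits = 0
    for _ in range(n):
        occurred = rng.random() < p
        flipped = rng.random() < d
        hits += occurred != flipped  # recorded value = occurred XOR flip
    return hits / n

random.seed(2)
est = noisy_estimate(0.9, 0.1)  # analytic expectation (1 - 0.2)*0.9 + 0.1 = 0.82
```

Note how a true probability of 0.9 is regressed to an expected estimate of 0.82, matching the abstract's claim that noise moves descriptive estimates toward the center of the scale.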
Comparison of estimators of standard deviation for hydrologic time series
Tasker, Gary D.; Gilroy, Edward J.
1982-01-01
Unbiasing factors as a function of serial correlation, ρ, and sample size, n, for the sample standard deviation of a lag-one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag-one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ which were much less biased but had greater mean square errors than the usual estimate of σ, s = [Σ(xi − x̄)²/(n − 1)]^(1/2). The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
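The underestimation that motivates these unbiasing factors is easy to reproduce with a small Monte Carlo sketch (our own, not the paper's experiment): for positively correlated lag-one autoregressive data, the usual sample standard deviation s falls noticeably below σ.

```python
import math
import random

def ar1_series(n, rho, sigma=1.0, rng=random):
    """Stationary lag-one autoregressive series with marginal SD sigma."""
    innov_sd = sigma * math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, sigma)
    out = [x]
    for _ in range(n - 1):
        x = rho * x + rng.gauss(0.0, innov_sd)
        out.append(x)
    return out

def sample_sd(xs):
    """The usual estimate s = [sum((x_i - xbar)^2) / (n - 1)]^(1/2)."""
    n = len(xs)
    m = sum(xs) / n
    return math.sqrt(sum((v - m) ** 2 for v in xs) / (n - 1))

random.seed(3)
reps = 2000
# Average s over many replicates of a short (n = 20), rho = 0.6 series:
mean_s = sum(sample_sd(ar1_series(20, 0.6)) for _ in range(reps)) / reps
# mean_s comes out well below the true sigma = 1.0, illustrating why an
# unbiasing factor depending on rho and n is needed.
```

The ratio sigma / mean_s from such simulations is precisely the kind of unbiasing factor tabulated as a function of ρ and n.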
Unbiased multi-fidelity estimate of failure probability of a free plane jet
NASA Astrophysics Data System (ADS)
Marques, Alexandre; Kramer, Boris; Willcox, Karen; Peherstorfer, Benjamin
2017-11-01
Estimating failure probability related to fluid flows is a challenge because it requires a large number of evaluations of expensive models. We address this challenge by leveraging multiple low fidelity models of the flow dynamics to create an optimal unbiased estimator. In particular, we investigate the effects of uncertain inlet conditions in the width of a free plane jet. We classify a condition as failure when the corresponding jet width is below a small threshold, such that failure is a rare event (failure probability is smaller than 0.001). We estimate failure probability by combining the frameworks of multi-fidelity importance sampling and optimal fusion of estimators. Multi-fidelity importance sampling uses a low fidelity model to explore the parameter space and create a biasing distribution. An unbiased estimate is then computed with a relatively small number of evaluations of the high fidelity model. In the presence of multiple low fidelity models, this framework offers multiple competing estimators. Optimal fusion combines all competing estimators into a single estimator with minimal variance. We show that this combined framework can significantly reduce the cost of estimating failure probabilities, and thus can have a large impact in fluid flow applications. This work was funded by DARPA.
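A toy single-model sketch of the importance-sampling half of this framework (our illustration on a Gaussian stand-in, not the jet model): the biasing density below plays the role of one fitted by a cheap low-fidelity model, and the likelihood-ratio reweighting keeps the rare-event estimate unbiased.

```python
import math
import random

def failure_prob_is(threshold, n=20_000, rng=random):
    """Importance-sampling estimate of p = P(X > threshold) for X ~ N(0,1).
    Samples are drawn from the biasing density N(threshold, 1), which
    concentrates on the failure region, and each sample is reweighted by
    the likelihood ratio so the estimator remains unbiased."""
    def log_pdf(x, mu):  # log density of N(mu, 1)
        return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2.0 * math.pi)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)      # draw from the biasing density
        if x > threshold:                  # failure check ("high fidelity" model)
            total += math.exp(log_pdf(x, 0.0) - log_pdf(x, threshold))
    return total / n

random.seed(4)
p_hat = failure_prob_is(3.0)  # true value is 1 - Phi(3) ~ 1.35e-3
```

Naive Monte Carlo would need millions of samples to resolve a probability this small; the shifted proposal achieves percent-level relative error with 20,000, which is the efficiency gain the multi-fidelity framework then fuses across several competing estimators.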
An examination of effect estimation in factorial and standardly-tailored designs
Allore, Heather G; Murphy, Terrence E
2012-01-01
Background Many clinical trials are designed to test an intervention arm against a control arm wherein all subjects are equally eligible for all interventional components. Factorial designs have extended this to test multiple intervention components and their interactions. A newer design referred to as a ‘standardly-tailored’ design, is a multicomponent interventional trial that applies individual interventional components to modify risk factors identified a priori and tests whether health outcomes differ between treatment arms. Standardly-tailored designs do not require that all subjects be eligible for every interventional component. Although standardly-tailored designs yield an estimate for the net effect of the multicomponent intervention, it has not yet been shown if they permit separate, unbiased estimation of individual component effects. The ability to estimate the most potent interventional components has direct bearing on conducting second stage translational research. Purpose We present statistical issues related to the estimation of individual component effects in trials of geriatric conditions using factorial and standardly-tailored designs. The medical community is interested in second stage translational research involving the transfer of results from a randomized clinical trial to a community setting. Before such research is undertaken, main effects and synergistic and/or antagonistic interactions between them should be identified. Knowledge of the relative strength and direction of the effects of the individual components and their interactions facilitates the successful transfer of clinically significant findings and may potentially reduce the number of interventional components needed. Therefore, the current inability of the standardly-tailored design to provide unbiased estimates of individual interventional components is a serious limitation in their applicability to second stage translational research.
Methods We discuss estimation of individual component effects from the family of factorial designs and this limitation for standardly-tailored designs. We use the phrase ‘factorial designs’ to describe full-factorial designs and their derivatives including the fractional factorial, partial factorial, incomplete factorial and modified reciprocal designs. We suggest two potential directions for designing multicomponent interventions to facilitate unbiased estimates of individual interventional components. Results Full factorial designs and their variants are the most common multicomponent trial design described in the literature and differ meaningfully from standardly-tailored designs. Factorial and standardly-tailored designs result in similar estimates of net effect with different levels of precision. Unbiased estimation of individual component effects from a standardly-tailored design will require new methodology. Limitations Although clinically relevant in geriatrics, previous applications of standardly-tailored designs have not provided unbiased estimates of the effects of individual interventional components. Discussion Future directions to estimate individual component effects from standardly-tailored designs include applying D-optimal designs and creating independent linear combinations of risk factors analogous to factor analysis. Conclusion Methods are needed to extract unbiased estimates of the effects of individual interventional components from standardly-tailored designs. PMID:18375650
Solution to the mean king's problem with mutually unbiased bases for arbitrary levels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kimura, Gen; Tanaka, Hajime; Ozawa, Masanao
2006-05-15
The mean king's problem with mutually unbiased bases is reconsidered for arbitrary d-level systems. Hayashi et al. [Phys. Rev. A 71, 052331 (2005)] related the problem to the existence of a maximal set of d-1 mutually orthogonal Latin squares, in their restricted setting that allows only measurements of projection-valued measures. However, we then cannot find a solution to the problem when, e.g., d=6 or d=10. In contrast to their result, we show that the king's problem always has a solution for arbitrary levels if we also allow positive operator-valued measures. In constructing the solution, we use orthogonal arrays in combinatorial design theory.
Human systems immunology: hypothesis-based modeling and unbiased data-driven approaches.
Arazi, Arnon; Pendergraft, William F; Ribeiro, Ruy M; Perelson, Alan S; Hacohen, Nir
2013-10-31
Systems immunology is an emerging paradigm that aims at a more systematic and quantitative understanding of the immune system. Two major approaches have been utilized to date in this field: unbiased data-driven modeling to comprehensively identify molecular and cellular components of a system and their interactions; and hypothesis-based quantitative modeling to understand the operating principles of a system by extracting a minimal set of variables and rules underlying them. In this review, we describe applications of the two approaches to the study of viral infections and autoimmune diseases in humans, and discuss possible ways by which these two approaches can synergize when applied to human immunology. Copyright © 2012 Elsevier Ltd. All rights reserved.
A cross-correlation-based estimate of the galaxy luminosity function
NASA Astrophysics Data System (ADS)
van Daalen, Marcel P.; White, Martin
2018-06-01
We extend existing methods for using cross-correlations to derive redshift distributions for photometric galaxies, without using photometric redshifts. The model presented in this paper simultaneously yields highly accurate and unbiased redshift distributions and, for the first time, redshift-dependent luminosity functions, using only clustering information and the apparent magnitudes of the galaxies as input. In contrast to many existing techniques for recovering unbiased redshift distributions, the output of our method is not degenerate with the galaxy bias b(z), which is achieved by modelling the shape of the luminosity bias. We successfully apply our method to a mock galaxy survey and discuss improvements to be made before applying our model to real data.
Unbiased Sampling of Globular Lattice Proteins in Three Dimensions
NASA Astrophysics Data System (ADS)
Jacobsen, Jesper Lykke
2008-03-01
We present a Monte Carlo method that allows efficient and unbiased sampling of Hamiltonian walks on a cubic lattice. Such walks are self-avoiding and visit each lattice site exactly once. They are often used as simple models of globular proteins, upon adding suitable local interactions. Our algorithm can easily be equipped with such interactions, but we study here mainly the flexible homopolymer case where each conformation is generated with uniform probability. We argue that the algorithm is ergodic and has dynamical exponent z=0. We then use it to study polymers of size up to 64^3 = 262144 monomers. Results are presented for the effective interaction between end points, and the interaction with the boundaries of the system.
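As a toy illustration of the objects being sampled (though not of the authors' Monte Carlo algorithm itself), Hamiltonian walks on a tiny cubic lattice can be enumerated exhaustively by depth-first search; the 2×2×2 case below is an invented example small enough to count directly, whereas the 64^3 systems in the abstract are far beyond enumeration.

```python
from itertools import product

L = 2  # 2 x 2 x 2 lattice: the only size where exhaustive enumeration is trivial
sites = list(product(range(L), repeat=3))

def neighbors(p):
    x, y, z = p
    for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
        q = (x + dx, y + dy, z + dz)
        if all(0 <= c < L for c in q):
            yield q

def count_from(start):
    # DFS over self-avoiding walks that visit every lattice site exactly once
    def dfs(p, visited):
        if len(visited) == len(sites):
            return 1
        return sum(dfs(q, visited | {q}) for q in neighbors(p) if q not in visited)
    return dfs(start, {start})

counts = {s: count_from(s) for s in sites}
total = sum(counts.values())  # directed walks; each walk and its reversal both counted
print(total, set(counts.values()))
```

Because the lattice is vertex-transitive, every starting site yields the same count, and the total is even since a walk and its reversal are distinct directed walks.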
Code of Federal Regulations, 2010 CFR
2010-01-01
... AND TECHNOLOGY, DEPARTMENT OF COMMERCE ACCREDITATION AND ASSESSMENT PROGRAMS NATIONAL VOLUNTARY... as an unbiased third party to accredit both testing and calibration laboratories. Supplementary...
Code of Federal Regulations, 2014 CFR
2014-01-01
... AND TECHNOLOGY, DEPARTMENT OF COMMERCE ACCREDITATION AND ASSESSMENT PROGRAMS NATIONAL VOLUNTARY... as an unbiased third party to accredit both testing and calibration laboratories. Supplementary...
Code of Federal Regulations, 2013 CFR
2013-01-01
... AND TECHNOLOGY, DEPARTMENT OF COMMERCE ACCREDITATION AND ASSESSMENT PROGRAMS NATIONAL VOLUNTARY... as an unbiased third party to accredit both testing and calibration laboratories. Supplementary...
Code of Federal Regulations, 2011 CFR
2011-01-01
... AND TECHNOLOGY, DEPARTMENT OF COMMERCE ACCREDITATION AND ASSESSMENT PROGRAMS NATIONAL VOLUNTARY... as an unbiased third party to accredit both testing and calibration laboratories. Supplementary...
Code of Federal Regulations, 2012 CFR
2012-01-01
... AND TECHNOLOGY, DEPARTMENT OF COMMERCE ACCREDITATION AND ASSESSMENT PROGRAMS NATIONAL VOLUNTARY... as an unbiased third party to accredit both testing and calibration laboratories. Supplementary...
Unbiased and targeted mass spectrometry for the HDL proteome.
Singh, Sasha A; Aikawa, Masanori
2017-02-01
Mass spectrometry is an ever-evolving technology that is equipped with a variety of tools for protein research. Some lipoprotein studies, especially those pertaining to HDL biology, have been exploiting the versatility of mass spectrometry to understand HDL function through its proteome. Despite the role of mass spectrometry in advancing research as a whole, however, the technology remains obscure to those without hands-on experience who still wish to understand it. In this review, we walk the reader through the coevolution of common mass spectrometry workflows and HDL research, starting from the basic unbiased mass spectrometry methods used to profile the HDL proteome to the most recent targeted methods that have enabled an unprecedented view of HDL metabolism. Unbiased global proteomics has demonstrated that the HDL proteome is organized into subgroups across the HDL size fractions, providing further evidence that HDL functional heterogeneity is in part governed by its varying protein constituents. Parallel reaction monitoring, a novel targeted mass spectrometry method, was used to monitor the metabolism of HDL apolipoproteins in humans and revealed that apolipoproteins contained within the same HDL size fraction exhibit diverse metabolic properties. Mass spectrometry provides a variety of tools and strategies to facilitate understanding, through its proteins, the complex biology of HDL.
Geldsetzer, Pascal; Fink, Günther; Vaikath, Maria; Bärnighausen, Till
2018-02-01
(1) To evaluate the operational efficiency of various sampling methods for patient exit interviews; (2) to discuss under what circumstances each method yields an unbiased sample; and (3) to propose a new, operationally efficient, and unbiased sampling method. Literature review, mathematical derivation, and Monte Carlo simulations. Our simulations show that in patient exit interviews it is most operationally efficient if the interviewer, after completing an interview, selects the next patient exiting the clinical consultation. We demonstrate mathematically that this method yields a biased sample: patients who spend a longer time with the clinician are overrepresented. This bias can be removed by selecting the next patient who enters, rather than exits, the consultation room. We show that this sampling method is operationally more efficient than alternative methods (systematic and simple random sampling) in most primary health care settings. Under the assumption that the order in which patients enter the consultation room is unrelated to the length of time spent with the clinician and the interviewer, selecting the next patient entering the consultation room tends to be the operationally most efficient unbiased sampling method for patient exit interviews. © 2016 The Authors. Health Services Research published by Wiley Periodicals, Inc. on behalf of Health Research and Educational Trust.
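The overrepresentation of long consultations under the "next patient to exit" rule is an instance of length-biased (inspection-paradox) sampling, and it can be seen in a toy simulation; everything below (one clinician, exponential consultation times, a fixed interview length of 2 time units) is an invented sketch, not the authors' own simulation code.

```python
import random

random.seed(42)

# Hypothetical clinic: one clinician, i.i.d. Exponential(mean = 1) consultations.
N = 200_000
durations = [random.expovariate(1.0) for _ in range(N)]

# Exit times form a renewal process.
exit_times, t = [], 0.0
for d in durations:
    t += d
    exit_times.append(t)

# "Next patient to exit" rule: after finishing each interview (fixed length),
# the interviewer selects the next patient who leaves the consultation room.
INTERVIEW_LEN = 2.0
sampled, now, i = [], 0.0, 0
while i < N:
    while i < N and exit_times[i] <= now:
        i += 1  # these patients exited during the interview and are never sampled
    if i == N:
        break
    sampled.append(durations[i])  # consultation length of the sampled patient
    now = exit_times[i] + INTERVIEW_LEN
    i += 1

mean_all = sum(durations) / N
mean_sampled = sum(sampled) / len(sampled)
print(f"population mean consultation:  {mean_all:.2f}")
print(f"sampled patients' mean:        {mean_sampled:.2f}")
```

The sampled patients' mean consultation time comes out well above the population mean of 1: the consultation in progress when the interviewer becomes free is length-biased, which is exactly the bias the authors remove by sampling the next patient to enter instead.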
Transfer of location-specific control to untrained locations.
Weidler, Blaire J; Bugg, Julie M
2016-11-01
Recent research highlights a seemingly flexible and automatic form of cognitive control that is triggered by potent contextual cues, as exemplified by the location-specific proportion congruence effect--reduced compatibility effects in locations associated with a high as compared to low likelihood of conflict. We investigated just how flexible location-specific control is by examining whether novel locations effectively cue control for congruency-unbiased stimuli. In two experiments, biased (mostly compatible or mostly incompatible) training stimuli appeared in distinct locations. During a final block, unbiased (50% compatible) stimuli appeared in novel untrained locations spatially linked to biased locations. The flanker compatibility effect was reduced for unbiased stimuli in novel locations linked to a mostly incompatible compared to a mostly compatible location, indicating transfer. Transfer was observed when stimuli appeared along a linear function (Experiment 1) or in rings of a bullseye (Experiment 2). The novel transfer effects imply that location-specific control is more flexible than previously reported and further counter the complex stimulus-response learning account of location-specific proportion congruence effects. We propose that the representation and retrieval of control settings in untrained locations may depend on environmental support and the presentation of stimuli in novel locations that fall within the same categories of space as trained locations.
Inoue, Naoki; Hirouchi, Taisei; Kasai, Atsushi; Higashi, Shintaro; Hiraki, Natsumi; Tanaka, Shota; Nakazawa, Takanobu; Nunomura, Kazuto; Lin, Bangzhong; Omori, Akiko; Hayata-Takano, Atsuko; Kim, Yoon-Jeong; Doi, Takefumi; Baba, Akemichi; Hashimoto, Hitoshi; Shintani, Norihito
2018-01-08
We recently showed that a 13-kDa protein (p13), the homolog protein of formation of mitochondrial complex V assembly factor 1 in yeast, acts as a potential protective factor in pancreatic islets under diabetes. Here, we aimed to identify known compounds regulating p13 mRNA expression to obtain therapeutic insight into the cellular stress response. A luciferase reporter system was developed using the putative promoter region of the human p13 gene. Overexpression of peroxisome proliferator-activated receptor gamma coactivator 1α, a master player regulating mitochondrial metabolism, increased both reporter activity and p13 expression. Following unbiased screening with 2320 known compounds in HeLa cells, 12 pharmacological agents (including 8 cardiotonics and 2 anthracyclines) that elicited >2-fold changes in p13 mRNA expression were identified. Among them, four cardiac glycosides decreased p13 expression and concomitantly elevated cellular oxidative stress. Additional database analyses showed highest p13 expression in heart, with typically decreased expression in cardiac disease. Accordingly, our results illustrate the usefulness of unbiased compound screening as a method for identifying novel functional roles of unfamiliar genes. Our findings also highlight the importance of p13 in the cellular stress response in heart. Copyright © 2017. Published by Elsevier Inc.
Sayers, A; Heron, J; Smith, Adac; Macdonald-Wallis, C; Gilthorpe, M S; Steele, F; Tilling, K
2017-02-01
There is a growing debate with regards to the appropriate methods of analysis of growth trajectories and their association with prospective dependent outcomes. Using the example of childhood growth and adult BP, we conducted an extensive simulation study to explore four two-stage and two joint modelling methods, and compared their bias and coverage in estimation of the (unconditional) association between birth length and later BP, and the association between growth rate and later BP (conditional on birth length). We show that the two-stage method of using multilevel models to estimate growth parameters and relating these to outcome gives unbiased estimates of the conditional associations between growth and outcome. Using simulations, we demonstrate that the simple methods resulted in bias in the presence of measurement error, as did the two-stage multilevel method when looking at the total (unconditional) association of birth length with outcome. The two joint modelling methods gave unbiased results, but using the re-inflated residuals led to undercoverage of the confidence intervals. We conclude that either joint modelling or the simpler two-stage multilevel approach can be used to estimate conditional associations between growth and later outcomes, but that only joint modelling is unbiased with nominal coverage for unconditional associations.
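The attenuation the authors report for the simple methods under measurement error can be reproduced in a few lines; the toy regression below (true slope 1, measurement-error variance 1, both invented) is an illustrative sketch of classical measurement-error bias, not the paper's simulation design.

```python
import random

random.seed(1)
n, true_slope = 100_000, 1.0
x_true = [random.gauss(0.0, 1.0) for _ in range(n)]               # true growth measure
y = [true_slope * x + random.gauss(0.0, 0.5) for x in x_true]     # later outcome
# Observed growth measure carries classical measurement error (variance 1, arbitrary):
x_obs = [x + random.gauss(0.0, 1.0) for x in x_true]

def ols_slope(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

slope_true = ols_slope(x_true, y)  # ~1.0: unbiased when the exposure is error-free
slope_obs = ols_slope(x_obs, y)    # ~0.5: attenuated by reliability 1 / (1 + 1)
print(slope_true, slope_obs)
```

Multilevel or joint modelling of the growth trajectory recovers the truth precisely because it models this error rather than plugging in the noisy observation directly.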
Hirose, Kensuke; Mikawa, Satoshi; Okumura, Naohiko; Noguchi, Go; Fukawa, Kazuo; Kanaya, Naoe; Mikawa, Ayumi; Arakawa, Aisaku; Ito, Tetsuya; Hayashi, Yoichi; Tachibana, Fumio; Awata, Takashi
2013-03-01
Vertnin (VRTN) is involved in the variation of vertebral number in pigs and it is located on Sus scrofa chromosome 7. Vertebral number is related to body size in pigs, and many reports have suggested presence of an association between body length (BL) and meat production traits. Therefore, we analyzed the relationship between the VRTN genotype and the production and body composition traits in purebred Duroc pigs. Intramuscular fat content (IMF) in the Longissimus muscle was significantly associated with the VRTN genotype. The mean IMF of individuals with the wild-type genotype (Wt/Wt) (5.22%) was greater than that of individuals with the Wt/Q (4.99%) and Q/Q genotypes (4.79%). In addition, best linear unbiased prediction under a multiple-trait animal model showed that the Wt allele had a positive effect on the IMF breeding value. No associations were observed between the VRTN genotype and other production traits. The VRTN genotype was related to BL. The Q/Q genotype individuals (100.0 cm) were longer than individuals with the Wt/Q (99.5 cm) and Wt/Wt genotypes (98.9 cm). These results suggest that in addition to the maintenance of an appropriate backfat thickness value, VRTN has the potential to act as a genetic marker of IMF. © 2012 Japanese Society of Animal Science.
Decision rules for unbiased inventory estimates
NASA Technical Reports Server (NTRS)
Argentiero, P. D.; Koch, D.
1979-01-01
An efficient and accurate procedure for estimating inventories from remote sensing scenes is presented. In place of the conventional and expensive full dimensional Bayes decision rule, a one-dimensional feature extraction and classification technique was employed. It is shown that this efficient decision rule can be used to develop unbiased inventory estimates and that for large sample sizes typical of satellite derived remote sensing scenes, resulting accuracies are comparable or superior to more expensive alternative procedures. Mathematical details of the procedure are provided in the body of the report and in the appendix. Results of a numerical simulation of the technique using statistics obtained from an observed LANDSAT scene are included. The simulation demonstrates the effectiveness of the technique in computing accurate inventory estimates.
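A standard way to turn (possibly biased) per-pixel classifications into an unbiased inventory is to invert the classifier's known confusion matrix; the sketch below uses that classic confusion-matrix correction as a stand-in for the report's specific one-dimensional decision-rule procedure, with made-up class proportions and error rates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-class scene: 30% of pixels are the crop of interest (made up).
p_true = np.array([0.3, 0.7])
# Classifier error rates, assumed known from training data:
# P[i, j] = Pr(classified as j | true class i)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Simulate classifying one million pixels.
n = 1_000_000
true_labels = rng.choice(2, size=n, p=p_true)
# Vectorized classification: predict class 0 with probability P[true, 0]
pred_labels = (rng.random(n) > P[true_labels, 0]).astype(int)

# Raw pixel counting over-reports the over-predicted class...
q_raw = np.bincount(pred_labels, minlength=2) / n
# ...but since E[q] = P.T @ p, solving the linear system de-biases the inventory.
p_hat = np.linalg.solve(P.T, q_raw)
print("raw:", q_raw, "corrected:", p_hat)
```

With these numbers the raw proportions concentrate near (0.41, 0.59), while the corrected estimate recovers the true (0.3, 0.7); the large sample sizes of satellite scenes are what make this asymptotic argument work so well.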
The effects of rehabilitative voir dire on juror bias and decision making.
Crocker, Caroline B; Kovera, Margaret Bull
2010-06-01
During voir dire, judges frequently attempt to "rehabilitate" venirepersons who express an inability to be impartial. Venirepersons who agree to ignore their biases and base their verdict on the evidence and the law are eligible for jury service. In Experiment 1, biased and unbiased mock jurors participated in either a standard or rehabilitative voir dire conducted by a judge and watched a trial video. Rehabilitation influenced insanity defense attitudes and perceptions of the defendant's mental state, and decreased scaled guilt judgments compared to standard questioning. Although rehabilitation is intended to correct for partiality among biased jurors, rehabilitation similarly influenced biased and unbiased jurors. Experiment 2 found that watching rehabilitation did not influence jurors' perceptions of the judge's personal beliefs about the case.
Prediction and measurement results of radiation damage to CMOS devices on board spacecraft
NASA Technical Reports Server (NTRS)
Stassinopoulos, E. G.; Danchenko, V.; Cliff, R. A.; Sing, M.; Brucker, G. J.; Ohanian, R. S.
1977-01-01
Final results from the CMOS Radiation Effects Measurement (CREM) experiment flown on Explorer 55 are presented and discussed, based on about 15 months of observations and measurements. Conclusions are given relating to long-range annealing, effects of operating temperature on semiconductor performance in space, biased and unbiased P-MOS device degradation, unbiased n-channel device performance, changes in device transconductance, and the difference in ionization efficiency between Co-60 gamma rays and 1-Mev Van de Graaff electrons. The performance of devices in a heavily shielded electronic subsystem box within the spacecraft is evaluated and compared. Environment models and computational methods and their impact on device-degradation estimates are being reviewed to determine whether they permit cost-effective design of spacecraft.
Bases for qudits from a nonstandard approach to SU(2)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kibler, M. R., E-mail: kibler@ipnl.in2p3.fr
2011-06-15
Bases of finite-dimensional Hilbert spaces (in dimension d) of relevance for quantum information and quantum computation are constructed from angular momentum theory and su(2) Lie algebraic methods. We report on a formula for deriving in one step the (1 + p)p qupits (i.e., qudits with d = p a prime integer) of a complete set of 1 + p mutually unbiased bases in C^p. Repeated application of the formula can be used for generating mutually unbiased bases in C^d with d = p^e (e >= 2) a power of a prime integer. A connection between mutually unbiased bases and the unitary group SU(d) is briefly discussed in the case d = p^e.
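For readers unfamiliar with the defining property: two orthonormal bases {e_i}, {f_j} of C^d are mutually unbiased when |<e_i|f_j>|^2 = 1/d for all i, j. The snippet below numerically checks this for the smallest complete set, the 1 + p = 3 qubit bases given by the Pauli Z, X, Y eigenbases; it is a special-case illustration, not the su(2) construction of the paper.

```python
import numpy as np

s = 1 / np.sqrt(2)
# Eigenbases of the Pauli Z, X, Y operators (columns are basis vectors):
bases = [
    np.array([[1, 0], [0, 1]], dtype=complex),             # Z eigenbasis
    np.array([[s, s], [s, -s]], dtype=complex),            # X eigenbasis
    np.array([[s, s], [1j * s, -1j * s]], dtype=complex),  # Y eigenbasis
]

d = 2
for a in range(len(bases)):
    for b in range(a + 1, len(bases)):
        # Matrix of squared overlaps between the two bases
        overlaps = np.abs(bases[a].conj().T @ bases[b]) ** 2
        assert np.allclose(overlaps, 1 / d), (a, b)
print("all pairs mutually unbiased: |<e_i|f_j>|^2 = 1/2")
```

Every pairwise overlap matrix is flat at 1/d, which is what "a measurement in one basis reveals nothing about a state prepared in another" means; the constructions in the abstract produce 1 + p such bases for any prime p and prime-power extensions.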
NASA Astrophysics Data System (ADS)
Bartkiewicz, Karol; Černoch, Antonín; Lemr, Karel; Miranowicz, Adam; Nori, Franco
2016-06-01
Temporal steering, which is a temporal analog of Einstein-Podolsky-Rosen steering, refers to temporal quantum correlations between the initial and final state of a quantum system. Our analysis of temporal steering inequalities in relation to the average quantum bit error rates reveals the interplay between temporal steering and quantum cloning, which guarantees the security of quantum key distribution based on mutually unbiased bases against individual attacks. The key distributions analyzed here include the Bennett-Brassard 1984 protocol and the six-state 1998 protocol by Bruss. Moreover, we define a temporal steerable weight, which enables us to identify a kind of monogamy of temporal correlation that is essential to quantum cryptography and useful for analyzing various scenarios of quantum causality.
Silva, V B; Daher, R F; Araújo, M S B; Souza, Y P; Cassaro, S; Menezes, B R S; Gravina, L M; Novo, A A C; Tardin, F D; Júnior, A T Amaral
2017-09-27
Genetically improved cultivars of elephant grass need to be adapted to different ecosystems with a faster growth speed and lower seasonality of biomass production over the year. This study aimed to use selection indices using mixed models (REML/BLUP) for selecting families and progenies within full-sib families of elephant grass (Pennisetum purpureum) for biomass production. One hundred and twenty full-sib progenies were assessed from 2014 to 2015 in a randomized block design with three replications. During this period, the traits dry matter production, the number of tillers, plant height, stem diameter, and neutral detergent fiber were assessed. Families 3 and 1 were the best classified, being the most indicated for selection effect. Progenies 40, 45, 46, and 49 got the first positions in the three indices assessed in the first cut. The gain for individual 40 was 161.76% using Mulamba and Mock index. The use of selection indices using mixed models is advantageous in elephant grass since they provide high gains with the selection, which are distributed among all the assessed traits in the most appropriate situation to breeding programs.
An Appraisal of Some Energy Films.
ERIC Educational Resources Information Center
Dowling, John
1979-01-01
Discusses problems of achieving balance and objectivity in dealing with the energy crisis, and describes ten instructional films that might be useful in achieving objective, unbiased instruction. (CMV)
Wendel, Jeanne; Dumitras, Diana
2005-06-01
This paper describes an analytical methodology for obtaining statistically unbiased outcomes estimates for programs in which participation decisions may be correlated with variables that impact outcomes. This methodology is particularly useful for intraorganizational program evaluations conducted for business purposes. In this situation, data is likely to be available for a population of managed care members who are eligible to participate in a disease management (DM) program, with some electing to participate while others eschew the opportunity. The most pragmatic analytical strategy for in-house evaluation of such programs is likely to be the pre-intervention/post-intervention design in which the control group consists of people who were invited to participate in the DM program, but declined the invitation. Regression estimates of program impacts may be statistically biased if factors that impact participation decisions are correlated with outcomes measures. This paper describes an econometric procedure, the Treatment Effects model, developed to produce statistically unbiased estimates of program impacts in this type of situation. Two equations are estimated to (a) estimate the impacts of patient characteristics on decisions to participate in the program, and then (b) use this information to produce a statistically unbiased estimate of the impact of program participation on outcomes. This methodology is well-established in economics and econometrics, but has not been widely applied in the DM outcomes measurement literature; hence, this paper focuses on one illustrative application.
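The core problem, participation correlated with unobserved severity that also drives outcomes, can be demonstrated with a toy simulation; all numbers below are invented, and for simplicity the confounder is made observable so that an adjusted regression can recover the truth, standing in for the two-equation Treatment Effects model described here.

```python
import numpy as np

rng = np.random.default_rng(3)
n, true_effect = 50_000, -1.0   # hypothetical: DM participation lowers the outcome by 1

health = rng.normal(size=n)     # latent severity
# Sicker members are more likely to accept the DM program invitation:
enroll = (health + rng.normal(size=n) > 0).astype(float)
outcome = true_effect * enroll + 2.0 * health + rng.normal(size=n)

# Naive comparison of participants vs decliners confounds severity with the program:
naive = outcome[enroll == 1].mean() - outcome[enroll == 0].mean()

# Regression adjusting for the selection-relevant covariate recovers the effect:
X = np.column_stack([np.ones(n), enroll, health])
beta = np.linalg.lstsq(X, outcome, rcond=None)[0]

print(f"naive: {naive:.2f}, adjusted: {beta[1]:.2f}, truth: {true_effect}")
```

With these numbers the naive estimate even gets the sign wrong (participants look worse off because they were sicker to begin with); the Treatment Effects model achieves the same correction when the confounder is unobserved, by jointly modeling the participation and outcome equations.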
Species richness in soil bacterial communities: a proposed approach to overcome sample size bias.
Youssef, Noha H; Elshahed, Mostafa S
2008-09-01
Estimates of species richness based on 16S rRNA gene clone libraries are increasingly utilized to gauge the level of bacterial diversity within various ecosystems. However, previous studies have indicated that regardless of the utilized approach, species richness estimates obtained are dependent on the size of the analyzed clone libraries. We here propose an approach to overcome sample size bias in species richness estimates in complex microbial communities. Parametric (Maximum likelihood-based and rarefaction curve-based) and non-parametric approaches were used to estimate species richness in a library of 13,001 near full-length 16S rRNA clones derived from soil, as well as in multiple subsets of the original library. Species richness estimates obtained increased with the increase in library size. To obtain a sample size-unbiased estimate of species richness, we calculated the theoretical clone library sizes required to encounter the estimated species richness at various clone library sizes, used curve fitting to determine the theoretical clone library size required to encounter the "true" species richness, and subsequently determined the corresponding sample size-unbiased species richness value. Using this approach, sample size-unbiased estimates of 17,230, 15,571, and 33,912 were obtained for the ML-based, rarefaction curve-based, and ACE-1 estimators, respectively, compared to bias-uncorrected values of 15,009, 11,913, and 20,909.
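The sample-size dependence the authors describe is easy to reproduce with a synthetic community; the sketch below draws clones from an invented 1,000-species community with geometric abundances and computes observed richness and the nonparametric Chao1 estimator at increasing library sizes. It illustrates the bias, not the authors' curve-fitting correction.

```python
import random

random.seed(7)
S_TRUE = 1000
# Invented community: 1000 species with geometrically declining abundances
weights = [0.99 ** i for i in range(S_TRUE)]

def chao1(library):
    counts = {}
    for sp in library:
        counts[sp] = counts.get(sp, 0) + 1
    s_obs = len(counts)                               # observed richness
    f1 = sum(1 for c in counts.values() if c == 1)    # singletons
    f2 = sum(1 for c in counts.values() if c == 2)    # doubletons
    return s_obs, s_obs + f1 * f1 / (2 * max(f2, 1))  # guard against f2 = 0

results = []
for n in (100, 1000, 13000):
    library = random.choices(range(S_TRUE), weights=weights, k=n)
    s_obs, est = chao1(library)
    results.append((n, s_obs, est))
    print(n, s_obs, round(est))
```

Both the observed richness and the estimator climb steadily with library size, which is the dependence the proposed extrapolation procedure is designed to remove.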
Alibay, Irfan; Burusco, Kepa K; Bruce, Neil J; Bryce, Richard A
2018-03-08
Determining the conformations accessible to carbohydrate ligands in aqueous solution is important for understanding their biological action. In this work, we evaluate the conformational free-energy surfaces of Lewis oligosaccharides in explicit aqueous solvent using a multidimensional variant of the swarm-enhanced sampling molecular dynamics (msesMD) method; we compare with multi-microsecond unbiased MD simulations, umbrella sampling, and accelerated MD approaches. For the sialyl Lewis A tetrasaccharide, msesMD simulations in aqueous solution predict conformer landscapes in general agreement with the other biased methods and with triplicate unbiased 10 μs trajectories; these simulations find a predominance of closed conformer and a range of low-occupancy open forms. The msesMD simulations also suggest closed-to-open transitions in the tetrasaccharide are facilitated by changes in ring puckering of its GlcNAc residue away from the ⁴C₁ form, in line with previous work. For sialyl Lewis X tetrasaccharide, msesMD simulations predict a minor population of an open form in solution corresponding to a rare lectin-bound pose observed crystallographically. Overall, from comparison with biased MD calculations, we find that triplicate 10 μs unbiased MD simulations may not be enough to fully sample glycan conformations in aqueous solution. However, the computational efficiency and intuitive approach of the msesMD method suggest potential for its application in glycomics as a tool for analysis of oligosaccharide conformation.
Ip, Hon S.; Wiley, Michael R.; Long, Renee; Palacios, Gustavo; Shearn-Bochsler, Valerie; Whitehouse, Chris A.
2014-01-01
Advances in massively parallel DNA sequencing platforms, commonly termed next-generation sequencing (NGS) technologies, have greatly reduced time, labor, and cost associated with DNA sequencing. Thus, NGS has become a routine tool for new viral pathogen discovery and will likely become the standard for routine laboratory diagnostics of infectious diseases in the near future. This study demonstrated the application of NGS for the rapid identification and characterization of a virus isolated from the brain of an endangered Mississippi sandhill crane. This bird was part of a population restoration effort and was found in an emaciated state several days after Hurricane Isaac passed over the refuge in Mississippi in 2012. Post-mortem examination had identified trichostrongyliasis as the possible cause of death, but because a virus with morphology consistent with a togavirus was isolated from the brain of the bird, an arboviral etiology was strongly suspected. Because individual molecular assays for several known arboviruses were negative, unbiased NGS by Illumina MiSeq was used to definitively identify and characterize the causative viral agent. Whole genome sequencing and phylogenetic analysis revealed the viral isolate to be the Highlands J virus, a known avian pathogen. This study demonstrates the use of unbiased NGS for the rapid detection and characterization of an unidentified viral pathogen and the application of this technology to wildlife disease diagnostics and conservation medicine.
Sodium Binding Sites and Permeation Mechanism in the NaChBac Channel: A Molecular Dynamics Study.
Guardiani, Carlo; Rodger, P Mark; Fedorenko, Olena A; Roberts, Stephen K; Khovanov, Igor A
2017-03-14
NaChBac was the first discovered bacterial sodium voltage-dependent channel, yet computational studies are still limited due to the lack of a crystal structure. In this work, a pore-only construct built using the NavMs template was investigated using unbiased molecular dynamics and metadynamics. The potential of mean force (PMF) from the unbiased run features four minima, three of which correspond to sites IN, CEN, and HFS discovered in NavAb. During the run, the selectivity filter (SF) is spontaneously occupied by two ions, and frequent access of a third one is often observed. In the innermost sites IN and CEN, Na+ is fully hydrated by six water molecules and occupies an on-axis position. In site HFS, sodium interacts with a glutamate and a serine from the same subunit and is forced to adopt an off-axis placement. Metadynamics simulations biasing one and two ions show an energy barrier in the SF that prevents single-ion permeation. An analysis of the permeation mechanism was performed both computing minimum energy paths in the axial-axial PMF and through a combination of Markov state modeling and transition path theory. Both approaches reveal a knock-on mechanism involving at least two but possibly three ions. The currents predicted from the unbiased simulation using linear response theory are in excellent agreement with single-channel patch-clamp recordings.
Power and On-Board Propulsion System Benefit Studies at NASA GRC
NASA Technical Reports Server (NTRS)
Hoffman, David J.
2000-01-01
This paper discusses the value of systems studies that provide unbiased 'honest broker' assessments of the quantified benefits afforded by advanced technologies for specific missions. The organization, format, and approach used by the NASA Glenn Research Center (GRC) Systems Assessment Team (SAT) to perform system studies for the GRC advanced power and on-board propulsion technology development program is described. Three levels of assessments and a sensitivity analysis are explained and example results are presented. The impact of system studies results and some of the main challenges associated with systems studies are identified. A call for collaboration is made where system studies of all types from all organizations can be reviewed, providing a forum for the widest peer review to ensure accurate and unbiased technical content, and to avoid needless duplication.
Cai, Xin; Liu, Jinsong; Wang, Shenglie
2009-02-16
This paper presents calculations for a new kind of photorefractive spatial soliton: a dissipative holographic soliton and a Hamiltonian soliton that form in one dimension in an unbiased series photorefractive crystal circuit consisting of two photorefractive crystals, of which at least one must be photovoltaic. The two solitons are known collectively as a separate holographic-Hamiltonian spatial soliton pair, and two types arise, dark-dark and bright-dark, if only one crystal of the circuit is photovoltaic. The numerical results show that the Hamiltonian soliton in a soliton pair can affect the holographic one through the light-induced current, whereas the effect of the holographic soliton on the Hamiltonian soliton is weak enough to be ignored, i.e., the holographic soliton cannot appreciably affect the Hamiltonian one.
Host Galaxy Properties of SWIFT Hard X-ray Selected AGN
NASA Astrophysics Data System (ADS)
Koss, Michael; Mushotzky, R.; Veilleux, S.; Winter, L.
2010-01-01
Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of 258 AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. In 2008, we observed 110 of these targets with the Kitt Peak 2.1-m telescope in the SDSS ugriz bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, star formation, and AGN luminosity.
Communication: Improved ab initio molecular dynamics by minimally biasing with experimental data
NASA Astrophysics Data System (ADS)
White, Andrew D.; Knight, Chris; Hocky, Glen M.; Voth, Gregory A.
2017-01-01
Accounting for electrons and nuclei simultaneously is a powerful capability of ab initio molecular dynamics (AIMD). However, AIMD is often unable to accurately reproduce properties of systems such as water due to inaccuracies in the underlying electronic density functionals. This shortcoming is often addressed by added empirical corrections and/or increasing the simulation temperature. We present here a maximum-entropy approach to directly incorporate limited experimental data via a minimal bias. Biased AIMD simulations of water and an excess proton in water are shown to give significantly improved properties both for observables which were biased to match experimental data and for unbiased observables. This approach also yields new physical insight into inaccuracies in the underlying density functional theory as utilized in the unbiased AIMD.
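The maximum-entropy idea, apply the smallest bias that makes a simulated average match experiment, can be sketched by reweighting samples with exp(λx) and tuning the single Lagrange multiplier λ; this stand-alone toy (a standard-normal "simulation" and an invented target average of 0.3) only illustrates the principle, not the authors' AIMD implementation.

```python
import math
import random

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(20_000)]  # "simulated" observable samples
target = 0.3                                         # "experimental" average (invented)

def biased_mean(lam):
    # Maximum-entropy reweighting: w_i proportional to exp(lam * x_i)
    w = [math.exp(lam * xi) for xi in x]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

# The reweighted mean is strictly increasing in lam (its derivative is the
# weighted variance), so a simple bisection finds the multiplier.
lo, hi = -5.0, 5.0
for _ in range(50):
    mid = (lo + hi) / 2
    if biased_mean(mid) < target:
        lo = mid
    else:
        hi = mid

lam = (lo + hi) / 2
print(f"lambda = {lam:.3f}, reweighted mean = {biased_mean(lam):.3f}")
```

For standard-normal samples the solution sits near λ = 0.3; the biased ensemble reproduces the target observable while deviating minimally (in relative-entropy terms) from the unbiased one, which is the sense in which the authors' bias is "minimal."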
Bipartite entangled stabilizer mutually unbiased bases as maximum cliques of Cayley graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dam, Wim van; Howard, Mark; Department of Physics, University of California, Santa Barbara, California 93106
2011-07-15
We examine the existence and structure of particular sets of mutually unbiased bases (MUBs) in bipartite qudit systems. In contrast to well-known power-of-prime MUB constructions, we restrict ourselves to using maximally entangled stabilizer states as MUB vectors. Consequently, these bipartite entangled stabilizer MUBs (BES MUBs) provide no local information, but are sufficient and minimal for decomposing a wide variety of interesting operators including (mixtures of) Jamiolkowski states, entanglement witnesses, and more. The problem of finding such BES MUBs can be mapped, in a natural way, to that of finding maximum cliques in a family of Cayley graphs. Some relationships with known power-of-prime MUB constructions are discussed, and observables for BES MUBs are given explicitly in terms of Pauli operators.
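The defining property of mutually unbiased bases — every cross-basis overlap satisfies |⟨a|b⟩|² = 1/d — is easy to verify numerically. A minimal sketch (illustrative only; it is not the paper's Cayley-graph construction) checks the computational and Fourier bases in dimension d = 3:

```python
import numpy as np

def are_mutually_unbiased(B1, B2, tol=1e-9):
    """Bases given as unitary matrices whose columns are the basis vectors.
    Mutually unbiased  <=>  |<a|b>|^2 = 1/d for every pair of vectors."""
    d = B1.shape[0]
    overlaps = np.abs(B1.conj().T @ B2) ** 2
    return bool(np.allclose(overlaps, 1.0 / d, atol=tol))

d = 3
identity = np.eye(d)                                    # computational basis
omega = np.exp(2j * np.pi / d)
fourier = np.array([[omega ** (j * k) for k in range(d)]
                    for j in range(d)]) / np.sqrt(d)    # Fourier basis

print(are_mutually_unbiased(identity, fourier))  # True
```

The computational/Fourier pair is the standard d = p example; the stabilizer MUBs of the paper impose the additional constraint that every basis vector be a maximally entangled stabilizer state.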
Intrinsic Atomic Orbitals: An Unbiased Bridge between Quantum Theory and Chemical Concepts.
Knizia, Gerald
2013-11-12
Modern quantum chemistry can make quantitative predictions on an immense array of chemical systems. However, the interpretation of those predictions is often complicated by the complex wave function expansions used. Here we show that an exceptionally simple algebraic construction allows for defining atomic core and valence orbitals, polarized by the molecular environment, which can exactly represent self-consistent field wave functions. This construction provides an unbiased and direct connection between quantum chemistry and empirical chemical concepts, and can be used, for example, to calculate the nature of bonding in molecules, in chemical terms, from first principles. In particular, we find consistency with electronegativities (χ), C 1s core-level shifts, resonance substituent parameters (σR), Lewis structures, and oxidation states of transition-metal complexes.
Soto, Horacio; Tong, Miriam A; Domínguez, Juan C; Muraoka, Ramón
2017-09-04
We have inserted into an unbiased semiconductor optical amplifier (SOA) a powerful control beam, with photon energy slightly smaller than that of the band-gap of its active region, for exciting two-photon absorption and the quadratic Stark effect. For the available SOA, we estimated that these phenomena generated a nonlinear absorption coefficient β = -865 cm/GW and induced an appreciable birefringence inside the amplifier waveguide, which significantly modified the polarization state of a probe beam. Based on these effects, we have experimentally demonstrated the operation of an all-optical buffer, using an 80 Gb/s optical pulse comb, as well as an unbiased SOA, which was therefore devoid of amplified spontaneous emission and pattern effects.
Chamberlain, Michael Dean; Wells, Laura A.; Lisovsky, Alexandra; Guo, Hongbo; Isserlin, Ruth; Talior-Volodarsky, Ilana; Mahou, Redouan; Emili, Andrew; Sefton, Michael V.
2015-01-01
An unbiased phosphoproteomic method was used to identify biomaterial-associated changes in the phosphorylation patterns of macrophage-like cells. The phosphorylation differences between differentiated THP1 (dTHP1) cells treated for 10, 20, or 30 min with a vascular regenerative methacrylic acid (MAA) copolymer or a control methyl methacrylate (MM) copolymer were determined by MS. There were 1,470 peptides (corresponding to 729 proteins) that were differentially phosphorylated in dTHP1 cells treated with the two materials with a greater cellular response to MAA treatment. In addition to identifying pathways (such as integrin signaling and cytoskeletal arrangement) that are well known to change with cell–material interaction, previously unidentified pathways, such as apoptosis and mRNA splicing, were also discovered. PMID:26261332
Belyaev, Orlin; Müller, Christophe; Uhl, Waldemar
2006-01-01
Up until about 15 years ago the only realistic option for end-stage fecal incontinence was the creation of a permanent stoma. There have since been several developments. Dynamic graciloplasty (DGP) and artificial bowel sphincter (ABS) are well-established surgical techniques, which offer the patient a chance for continence restoration and improved quality of life; however, they are unfortunately associated with high morbidity and low success rates. Several trials have been done in an attempt to clarify the advantages and disadvantages of these methods and define their place in the second-line treatment of severe, refractory fecal incontinence. This review presents a critical and unbiased overview of the current status of neosphincter surgery according to the available data in the world literature.
Gap-filling methods to impute eddy covariance flux data by preserving variance.
NASA Astrophysics Data System (ADS)
Kunwor, S.; Staudhammer, C. L.; Starr, G.; Loescher, H. W.
2015-12-01
To represent carbon dynamics, in terms of the exchange of CO2 between the terrestrial ecosystem and the atmosphere, eddy covariance (EC) data have been collected using eddy flux towers at sites across the globe for more than two decades. However, EC measurements are missing for various reasons: precipitation, routine maintenance, or lack of vertical turbulence. In order to obtain estimates of net ecosystem exchange of carbon dioxide (NEE) with high precision and accuracy, robust gap-filling methods to impute missing data are required. While the methods used so far have provided robust estimates of the mean value of NEE, little attention has been paid to preserving the variance structures embodied by the flux data. Preserving the variance of these data will provide unbiased and precise estimates of NEE over time, which mimic natural fluctuations. We used a non-linear regression approach with moving windows of different lengths (15, 30, and 60 days) to estimate non-linear regression parameters for one year of flux data from a longleaf pine site at the Joseph Jones Ecological Research Center. We used the Michaelis-Menten and Van't Hoff functions as our base. We assessed the potential physiological drivers of these parameters with linear models using micrometeorological predictors. We then used a parameter prediction approach to refine the non-linear gap-filling equations based on micrometeorological conditions. This provides an opportunity to incorporate additional variables, such as vapor pressure deficit (VPD) and volumetric water content (VWC), into the equations. Our preliminary results indicate that gains in gap-filling can be achieved with a 30-day moving window and additional micrometeorological predictors (as indicated by a lower root mean square error (RMSE) of the predicted values of NEE).
Our next steps are to use these parameter predictions from moving windows to gap-fill the data with and without the incorporation of potential driver variables of the traditionally used parameters. Comparisons of the predicted values from these methods with 'traditional' gap-filling methods (using 12 fixed monthly windows) will then be made to show the extent to which variance is preserved. Further, this method will be applied to impute artificially created gaps to analyze whether variance is preserved.
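A minimal sketch of a Michaelis-Menten-style gap-filling step of the kind described above, on synthetic data (a coarse grid search stands in for the nonlinear least-squares fit; the parameter names, values, and noise level are illustrative, not the study's):

```python
import numpy as np

rng = np.random.default_rng(1)

def michaelis_menten(par, alpha, pmax, rd):
    """Daytime NEE light-response: CO2 uptake saturates with PAR, plus respiration rd."""
    return -(alpha * par * pmax) / (alpha * par + pmax) + rd

# synthetic 30-day window: PAR driver plus noisy NEE, with some gaps
par = rng.uniform(0, 2000, 500)
nee = michaelis_menten(par, alpha=0.03, pmax=25.0, rd=4.0) + rng.normal(0, 1.0, 500)
gaps = rng.random(500) < 0.2            # 20% missing, as if the tower went down
obs = ~gaps

# coarse grid search (stand-in for nonlinear least squares) on observed data only
best, best_sse = None, np.inf
for alpha in np.linspace(0.005, 0.08, 30):
    for pmax in np.linspace(5, 45, 30):
        for rd in np.linspace(0, 10, 15):
            sse = np.sum((nee[obs] - michaelis_menten(par[obs], alpha, pmax, rd)) ** 2)
            if sse < best_sse:
                best, best_sse = (alpha, pmax, rd), sse

filled = nee.copy()
filled[gaps] = michaelis_menten(par[gaps], *best)   # impute the gaps
```

In the study's parameter-prediction approach, the fitted parameters would themselves be modeled on micrometeorological predictors (e.g. VPD, VWC) within each moving window before imputation.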
Gladine, Cécile; Newman, John W; Durand, Thierry; Pedersen, Theresa L; Galano, Jean-Marie; Demougeot, Céline; Berdeaux, Olivier; Pujos-Guillot, Estelle; Mazur, Andrzej; Comte, Blandine
2014-01-01
The anti-atherogenic effects of omega 3 fatty acids, namely eicosapentaenoic (EPA) and docosahexaenoic acids (DHA), are well recognized, but the impact of dietary intake on bioactive lipid mediator profiles remains unclear. Such a profiling effort may offer novel targets for future studies into the mechanism of action of omega 3 fatty acids. The present study aimed to determine the impact of DHA supplementation on the profiles of polyunsaturated fatty acid (PUFA) oxygenated metabolites and to investigate their contribution to atherosclerosis prevention. A special emphasis was given to the non-enzymatic metabolites, given the high susceptibility of DHA to free radical-mediated peroxidation and the increased oxidative stress associated with plaque formation. Atherosclerosis-prone mice (LDLR(-/-)) received increasing doses of DHA (0, 0.1, 1 or 2% of energy) for 20 weeks, leading to a dose-dependent reduction of atherosclerosis (R(2) = 0.97, p = 0.02), triglyceridemia (R(2) = 0.97, p = 0.01) and cholesterolemia (R(2) = 0.96, p<0.01). Targeted lipidomic analyses revealed that both the profiles of EPA and DHA and their corresponding oxygenated metabolites were substantially modulated in plasma and liver. Notably, the hepatic level of F4-neuroprostanes, a specific class of DHA peroxidized metabolites, was strongly correlated with the hepatic DHA level. Moreover, unbiased statistical analysis, including correlation analyses, hierarchical clustering and projection to latent structures discriminant analysis, revealed that the hepatic level of F4-neuroprostanes was the variable most negatively correlated with the plaque extent (p<0.001) and, along with plasma EPA-derived diols, was an important mathematical positive predictor of atherosclerosis prevention. Thus, oxygenated n-3 PUFAs, and F4-neuroprostanes in particular, are potential biomarkers of DHA-associated atherosclerosis prevention.
While these may contribute to the anti-atherogenic effects of DHA, further in vitro investigations are needed to confirm such a contention and to decipher the molecular mechanisms of action.
Terence L. Wagner; Joe Mulrooney; Chris Petereson
2002-01-01
The United States Department of Agriculture Forest Service's termiticide testing program provides unbiased efficacy data for product registration using standardized tests, sites and evaluation procedures. Virtually all termiticides undergo Forest Service tests prior to registration.
Engagement of the medical-technology sector with society.
Williams, David; Edelman, Elazer R; Radisic, Milica; Laurencin, Cato; Untereker, Darrel
2017-04-12
The medical-technology sector must educate society in an unbiased rational way about the successes and benefits of biotechnology innovation. Copyright © 2017, American Association for the Advancement of Science.
Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information
NASA Technical Reports Server (NTRS)
Howell, L. W.
2002-01-01
A simple power law model consisting of a single spectral index, alpha(sub 1), is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10(exp 13) eV, with a transition at the knee energy, E(sub k), to a steeper spectral index alpha(sub 2) greater than alpha(sub 1) above E(sub k). The maximum likelihood (ML) procedure was developed for estimating the single parameter alpha(sub 1) of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The statistical properties of the ML estimator were investigated and shown to have the three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotic normality, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine whether a given estimation procedure provides an unbiased estimate of the spectral information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort in calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRBs for both the simple and broken power law energy spectra are derived herein and the conditions under which they are attained in practice are investigated.
The ML technique is then extended to estimate spectra information from an arbitrary number of astrophysics data sets produced by vastly different science instruments. This theory and its successful implementation will facilitate the interpretation of spectral information from multiple astrophysics missions and thereby permit the derivation of superior spectral parameter estimates based on the combination of data sets.
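For the simple (single-index) power law, the ML estimator has a well-known closed form, alpha-hat = 1 + n / Σ ln(x_i/x_min), with Cramer-Rao standard deviation (alpha - 1)/√n. A sketch on simulated data (idealized: unlike the paper, no detector response function is folded in):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_power_law(alpha, xmin, n):
    # inverse-transform sampling for p(x) ~ x^-alpha, x >= xmin
    u = rng.random(n)
    return xmin * u ** (-1.0 / (alpha - 1.0))

def mle_index(x, xmin):
    # closed-form ML estimator of a single spectral index
    return 1.0 + len(x) / np.sum(np.log(x / xmin))

alpha_true, xmin, n = 2.7, 1.0, 100_000
x = simulate_power_law(alpha_true, xmin, n)
alpha_hat = mle_index(x, xmin)
crb_sd = (alpha_true - 1) / np.sqrt(n)   # Cramer-Rao bound on the std. deviation
```

Repeating this over many simulated data sets would exhibit the three properties listed in the abstract: the estimates are asymptotically unbiased, their spread approaches `crb_sd`, and their distribution is approximately normal.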
Unbiased water and methanol maser surveys of NGC 1333
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lyo, A-Ran; Kim, Jongsoo; Byun, Do-Young
2014-11-01
We present the results of unbiased 22 GHz H2O water and 44 GHz class I CH3OH methanol maser surveys in the central 7' × 10' area of NGC 1333 and two additional mapping observations of a 22 GHz water maser in a ∼3' × 3' area of the IRAS4A region. In the 22 GHz water maser survey of NGC 1333 with a sensitivity of σ ∼ 0.3 Jy, we confirmed the detection of masers toward H2O(B) in the region of HH 7-11 and IRAS4B. We also detected new water masers located ∼20'' away in the western direction of IRAS4B or ∼25'' away in the southern direction of IRAS4A. We could not, however, find young stellar objects or molecular outflows associated with them. They showed two different velocity components of ∼0 and ∼16 km s^-1, which are blue- and redshifted relative to the adopted systemic velocity of ∼7 km s^-1 for NGC 1333. They also showed time variabilities in both intensity and velocity from multi-epoch observations and an anti-correlation between the intensities of the blue- and redshifted velocity components. We suggest that the unidentified power source of these masers might be found in the earliest evolutionary stage of star formation, before the onset of molecular outflows. Finding this kind of water maser is only possible through an unbiased blind survey. In the 44 GHz methanol maser survey with a sensitivity of σ ∼ 0.5 Jy, we confirmed masers toward IRAS4A2 and the eastern shock region of IRAS2A. Both sources are also detected in 95 and 132 GHz methanol maser lines. In addition, we had new detections of methanol masers at 95 and 132 GHz toward IRAS4B. In terms of the isotropic luminosity, we detected methanol maser sources brighter than ∼5 × 10^25 erg s^-1 from our unbiased survey.
Waples, R S
2016-10-01
The relationship between life-history traits and the key eco-evolutionary parameters effective population size (Ne) and Ne/N is revisited for iteroparous species with overlapping generations, with a focus on the annual rate of adult mortality (d). Analytical methods based on populations with arbitrarily long adult lifespans are used to evaluate the influence of d on Ne, Ne/N and the factors that determine these parameters: adult abundance (N), generation length (T), age at maturity (α), the ratio of variance to mean reproductive success in one season by individuals of the same age (φ) and lifetime variance in reproductive success of individuals in a cohort (Vk•). Although the resulting estimators of N, T and Vk• are upwardly biased for species with short adult lifespans, the estimate of Ne/N is largely unbiased because biases in T are compensated for by biases in Vk• and N. For the first time, the contrasting effects of T and Vk• on Ne and Ne/N are jointly considered with respect to d and φ. A simple function of d and α based on the assumption of constant vital rates is shown to be a robust predictor (R(2)=0.78) of Ne/N in an empirical data set of life tables for 63 animal and plant species with diverse life histories. Results presented here should provide important context for interpreting the surge of genetically based estimates of Ne that has been fueled by the genomics revolution.
Feng, Wen; Li, Hong-Chang; Xu, Ke; Chen, Ya-Feng; Pan, Li-Yun; Mei, Yi; Cai, Han; Jiang, Yi-Ming; Chen, Teng; Feng, Dian-Xu
2016-08-01
SHC SH2-binding protein 1 (SHCBP1), a member of the Src homolog and collagen homolog (Shc) family, has recently been identified in different contexts in unbiased screening assays. It has been reported to be over-expressed in several malignant cancers. Immunohistochemistry of SHCBP1 on 128 breast cancer tissues and adjacent normal tissues was used to evaluate the prognostic significance of SHCBP1. Survival analyses were performed by the Kaplan-Meier method. CRISPR/CAS9 technology was used to knock out SHCBP1 in two breast cancer cell lines. MTT assay, BrdU assay, colony formation assay, cell cycle assay and apoptosis analysis in the MCF-7 and MDA-MB-231 cell lines were carried out to evaluate the effects of SHCBP1 on breast cancer in vitro. Immunohistochemical analysis revealed that SHCBP1 was significantly up-regulated in breast cancer tissues compared with adjacent normal tissues (82 of 128, 64%). Over-expression of SHCBP1 was correlated with advanced clinical stage and poorer survival. Ablation of SHCBP1 inhibited proliferation in vitro. SHCBP1 knockout increased the cyclin-dependent kinase inhibitor p21 and decreased Cyclin B1 and CDK1. Our study suggests that SHCBP1 is aberrantly expressed in breast cancer and plays a critical role in cancer progression, and that it may be a potential prognostic predictor of breast cancer. Copyright © 2016. Published by Elsevier B.V.
Guilloux, Jean-Philippe; Bassi, Sabrina; Ding, Ying; Walsh, Chris; Turecki, Gustavo; Tseng, George; Cyranowski, Jill M; Sibille, Etienne
2015-02-01
Major depressive disorder (MDD) in general, and anxious-depression in particular, are characterized by poor rates of remission with first-line treatments, contributing to the chronic illness burden suffered by many patients. Prospective research is needed to identify the biomarkers predicting nonremission prior to treatment initiation. We collected blood samples from a discovery cohort of 34 adult MDD patients with co-occurring anxiety and 33 matched, nondepressed controls at baseline and after 12 weeks (of citalopram plus psychotherapy treatment for the depressed cohort). Samples were processed on gene arrays and group differences in gene expression were investigated. Exploratory analyses suggest that at pretreatment baseline, nonremitting patients differ from controls, with gene function and transcription factor analyses pointing to elevated inflammation and immune activation. In a second phase, we applied an unbiased machine learning prediction model and corrected for model-selection bias. Results show that baseline gene expression predicted nonremission with 79.4% corrected accuracy with a 13-gene model. The same gene-only model predicted nonremission after 8 weeks of citalopram treatment with 76% corrected accuracy in an independent validation cohort of 63 MDD patients treated with citalopram at another institution. Together, these results demonstrate the potential, but also the limitations, of baseline peripheral blood-based gene expression to predict nonremission after citalopram treatment. These results not only support their use in future prediction tools but also suggest that increased accuracy may be obtained with the inclusion of additional predictors (e.g., genetics and clinical scales).
Lamadrid-Figueroa, Héctor; Téllez-Rojo, Martha M; Angeles, Gustavo; Hernández-Ávila, Mauricio; Hu, Howard
2011-01-01
In-vivo measurement of bone lead by means of K-X-ray fluorescence (KXRF) is the preferred biological marker of chronic exposure to lead. Unfortunately, considerable measurement error associated with KXRF estimations can introduce bias in estimates of the effect of bone lead when this variable is included as the exposure in a regression model. Estimates of uncertainty reported by the KXRF instrument reflect the variance of the measurement error and, although they can be used to correct the measurement error bias, they are seldom used in epidemiological statistical analyzes. Errors-in-variables regression (EIV) allows for correction of bias caused by measurement error in predictor variables, based on the knowledge of the reliability of such variables. The authors propose a way to obtain reliability coefficients for bone lead measurements from uncertainty data reported by the KXRF instrument and compare, by the use of Monte Carlo simulations, results obtained using EIV regression models vs. those obtained by the standard procedures. Results of the simulations show that Ordinary Least Square (OLS) regression models provide severely biased estimates of effect, and that EIV provides nearly unbiased estimates. Although EIV effect estimates are more imprecise, their mean squared error is much smaller than that of OLS estimates. In conclusion, EIV is a better alternative than OLS to estimate the effect of bone lead when measured by KXRF. Copyright © 2010 Elsevier Inc. All rights reserved.
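The attenuation bias and its reliability-based correction can be sketched on simulated data (the variances and the true effect size below are illustrative; in practice the reliability coefficient would be derived from the KXRF-reported uncertainties, since the error-free exposure is unobservable):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
beta_true = 0.5

x = rng.normal(10, 5, n)                 # true (unobservable) bone lead
w = x + rng.normal(0, 3, n)              # KXRF measurement with error sd = 3
y = beta_true * x + rng.normal(0, 1, n)  # health outcome

# OLS of y on the error-prone w is attenuated toward zero
beta_ols = np.cov(w, y)[0, 1] / np.var(w, ddof=1)

# reliability coefficient; here computed from the simulation's known variances,
# i.e. var(x) / (var(x) + error variance) ≈ 25 / (25 + 9)
reliability = np.var(x, ddof=1) / np.var(w, ddof=1)
beta_eiv = beta_ols / reliability        # errors-in-variables correction
```

With these settings OLS recovers roughly beta_true × 25/34 ≈ 0.37, while the corrected estimate returns to about 0.5, mirroring the paper's contrast between severely biased OLS and nearly unbiased EIV estimates.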
A Highly Efficient Design Strategy for Regression with Outcome Pooling
Mitchell, Emily M.; Lyles, Robert H.; Manatunga, Amita K.; Perkins, Neil J.; Schisterman, Enrique F.
2014-01-01
The potential for research involving biospecimens can be hindered by the prohibitive cost of performing laboratory assays on individual samples. To mitigate this cost, strategies such as randomly selecting a portion of specimens for analysis or randomly pooling specimens prior to performing laboratory assays may be employed. These techniques, while effective in reducing cost, are often accompanied by a considerable loss of statistical efficiency. We propose a novel pooling strategy based on the k-means clustering algorithm to reduce laboratory costs while maintaining a high level of statistical efficiency when predictor variables are measured on all subjects, but the outcome of interest is assessed in pools. We perform simulations motivated by the BioCycle study to compare this k-means pooling strategy with current pooling and selection techniques under simple and multiple linear regression models. While all of the methods considered produce unbiased estimates and confidence intervals with appropriate coverage, pooling under k-means clustering provides the most precise estimates, closely approximating results from the full data and losing minimal precision as the total number of pools decreases. The benefits of k-means clustering evident in the simulation study are then applied to an analysis of the BioCycle dataset. In conclusion, when the number of lab tests is limited by budget, pooling specimens based on k-means clustering prior to performing lab assays can be an effective way to save money with minimal information loss in a regression setting. PMID:25220822
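A minimal sketch of the proposed strategy for a single predictor: cluster subjects on the predictor with k-means, form one pooled outcome assay per cluster, and regress pool-level means (a tiny Lloyd's iteration stands in for a library k-means; all data are synthetic and the effect sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 600, 30
x = rng.normal(0, 1, n)                  # predictor measured on every subject
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)  # outcome, assayed only in pools

# tiny Lloyd's k-means on the predictor (stand-in for a library implementation)
centers = np.sort(rng.choice(x, k, replace=False))
for _ in range(25):
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    for j in range(k):
        if np.any(labels == j):
            centers[j] = x[labels == j].mean()

# pool specimens within each cluster: one assay (the mean outcome) per pool
keep = [j for j in range(k) if np.any(labels == j)]
pool_x = np.array([x[labels == j].mean() for j in keep])
pool_y = np.array([y[labels == j].mean() for j in keep])

slope, intercept = np.polyfit(pool_x, pool_y, 1)  # regression on pool means
```

Because pooling averages the outcome noise within homogeneous clusters, the pool-level regression stays unbiased while losing little precision relative to the full data, which is the effect the simulation study quantifies.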
van Rein, Nienke; Lijfering, Willem M; Bos, Mettine H A; Herruer, Martien H; Vermaas, Helga W; van der Meer, Felix J M; Reitsma, Pieter H
2016-01-01
Risk scores for patients who are at high risk for major bleeding complications during treatment with vitamin K antagonists (VKAs) do not perform that well. BLEEDS was initiated to search for new biomarkers that predict bleeding in these patients. To describe the outline and objectives of BLEEDS and to examine whether the study population is generalizable to other VKA-treated populations. A cohort was created consisting of all patients starting VKA treatment at three Dutch anticoagulation clinics between January 2012 and July 2014. We stored leftover plasma and DNA following analysis of the INR. Of 16,706 eligible patients, 16,570 (99%) were included in BLEEDS and plasma was stored from 13,779 patients (83%). Patients had a mean age of 70 years (SD 14), and 8713 were male (53%). The most common VKA indications were atrial fibrillation (10,876 patients, 66%) and venous thrombosis (3920 patients, 24%). A total of 326 major bleeds occurred during 17,613 years of follow-up (incidence rate 1.85/100 person years, 95%CI 1.66-2.06). The risk for major bleeding was highest in the initial three months of VKA treatment and increased when the international normalized ratio increased. These results and characteristics are in concordance with results from other VKA-treated populations. BLEEDS is generalizable to other VKA-treated populations and will permit innovative and unbiased research of biomarkers that may predict major bleeding during VKA treatment.
Zervantonakis, Ioannis K; Iavarone, Claudia; Chen, Hsing-Yu; Selfors, Laura M; Palakurthi, Sangeetha; Liu, Joyce F; Drapkin, Ronny; Matulonis, Ursula; Leverson, Joel D; Sampath, Deepak; Mills, Gordon B; Brugge, Joan S
2017-08-28
The lack of effective chemotherapies for high-grade serous ovarian cancers (HGS-OvCa) has motivated a search for alternative treatment strategies. Here, we present an unbiased systems approach to interrogate a panel of 14 well-annotated HGS-OvCa patient-derived xenografts for sensitivity to PI3K and PI3K/mTOR inhibitors and uncover cell death vulnerabilities. Proteomic analysis reveals that PI3K/mTOR inhibition in HGS-OvCa patient-derived xenografts induces both pro-apoptotic and anti-apoptotic signaling responses that limit cell killing, but also primes cells for inhibitors of anti-apoptotic proteins. In-depth quantitative analysis of BCL-2 family proteins and other apoptotic regulators, together with computational modeling and selective anti-apoptotic protein inhibitors, uncovers new mechanistic details about apoptotic regulators that are predictive of drug sensitivity (BIM, caspase-3, BCL-XL) and resistance (MCL-1, XIAP). Our systems approach presents a strategy for systematic analysis of the mechanisms that limit effective tumor cell killing and the identification of apoptotic vulnerabilities to overcome drug resistance in ovarian and other cancers. High-grade serous ovarian cancers (HGS-OvCa) frequently develop chemotherapy resistance. Here, the authors, through a systematic analysis of proteomic and drug response data from 14 HGS-OvCa PDXs, demonstrate that targeting apoptosis regulators can improve the response of these tumors to inhibitors of the PI3K/mTOR pathway.
An aerial sightability model for estimating ferruginous hawk population size
Ayers, L.W.; Anderson, S.H.
1999-01-01
Most raptor aerial survey projects have focused on numerical description of visibility bias without identifying the contributing factors or developing predictive models to account for imperfect detection rates. Our goal was to develop a sightability model for nesting ferruginous hawks (Buteo regalis) that could account for nests missed during aerial surveys and provide more accurate population estimates. Eighteen observers, all unfamiliar with nest locations in a known population, searched for nests within 300 m of flight transects from a Maule fixed-wing aircraft. Flight variables tested for their influence on nest-detection rates included aircraft speed, height, direction of travel, time of day, light condition, distance to nest, and observer experience level. Nest variables included status (active vs. inactive), condition (i.e., excellent, good, fair, poor, bad), substrate type, topography, and tree density. A multiple logistic regression model identified nest substrate type, distance to nest, and observer experience level as significant predictors of detection rates (P < 0.05). The overall model was significant (χ2(6) = 124.4, P < 0.001, n = 255 nest observations), and the correct classification rate was 78.4%. During 2 validation surveys, observers saw 23.7% (14/59) and 36.5% (23/63) of the actual population. Sightability model predictions, with 90% confidence intervals, captured the true population in both tests. Our results indicate standardized aerial surveys, when used in conjunction with the predictive sightability model, can provide unbiased population estimates for nesting ferruginous hawks.
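The correction such a sightability model enables can be sketched with a Horvitz-Thompson-style estimator: each detected nest is weighted by the inverse of its modeled detection probability. The logistic coefficients and covariates below are hypothetical, not the fitted values from the study:

```python
import numpy as np

rng = np.random.default_rng(5)

def detection_prob(dist, experienced, b0=2.0, b_dist=-0.01, b_exp=1.0):
    """Hypothetical logistic sightability model: detection falls with distance
    to the nest and rises with observer experience (coefficients assumed)."""
    eta = b0 + b_dist * dist + b_exp * experienced
    return 1.0 / (1.0 + np.exp(-eta))

# simulate a survey over a known population of 200 nests
n = 200
dist = rng.uniform(0, 300, n)          # distance from transect (m)
exp_obs = rng.integers(0, 2, n)        # experienced observer? 0/1
p = detection_prob(dist, exp_obs)
seen = rng.random(n) < p               # imperfect detection

# Horvitz-Thompson-style correction: each seen nest stands for 1/p nests
n_hat = np.sum(1.0 / p[seen])
```

The raw count `seen.sum()` underestimates the population, while the weighted estimate `n_hat` is unbiased when the detection model is correct; in practice the probabilities would come from the fitted logistic regression rather than known coefficients.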
Crighton, Eric J; Elliott, Susan J; Moineddin, Rahim; Kanaroglou, Pavlos; Upshur, Ross
2007-04-01
Previous research on the determinants of pneumonia and influenza has focused primarily on the role of individual level biological and behavioural risk factors resulting in partial explanations and largely curative approaches to reducing the disease burden. This study examines the geographic patterns of pneumonia and influenza hospitalizations and the role that broad ecologic-level factors may have in determining them. We conducted a county level, retrospective, ecologic study of pneumonia and influenza hospitalizations in the province of Ontario, Canada, between 1992 and 2001 (N=241,803), controlling for spatial dependence in the data. Non-spatial and spatial regression models were estimated using a range of environmental, social, economic, behavioural, and health care predictors. Results revealed low education to be positively associated with hospitalization rates over all age groups and both genders. The Aboriginal population variable was also positively associated in most models except for the 65+-year age group. Behavioural factors (daily smoking and heavy drinking), environmental factors (passive smoking, poor housing, temperature), and health care factors (influenza vaccination) were all significantly associated in different age and gender-specific models. The use of spatial error regression models allowed for unbiased estimation of regression parameters and their significance levels. These findings demonstrate the importance of broad age and gender-specific population-level factors in determining pneumonia and influenza hospitalizations, and illustrate the need for place and population-specific policies that take these factors into consideration.
Morsanyi, Kinga; Busdraghi, Chiara; Primi, Caterina
2014-09-01
When asked to solve mathematical problems, some people experience anxiety and threat, which can lead to impaired mathematical performance (Curr Dir Psychol Sci 11:181-185, 2002). The present studies investigated the link between mathematical anxiety and performance on the cognitive reflection test (CRT; J Econ Perspect 19:25-42, 2005). The CRT is a measure of a person's ability to resist intuitive response tendencies, and it correlates strongly with important real-life outcomes, such as time preferences, risk-taking, and rational thinking. In Experiments 1 and 2 the relationships between maths anxiety, mathematical knowledge/mathematical achievement, test anxiety and cognitive reflection were analysed using mediation analyses. Experiment 3 included a manipulation of working memory load. The effects of anxiety and working memory load were analysed using ANOVAs. Our experiments with university students (Experiments 1 and 3) and secondary school students (Experiment 2) demonstrated that mathematical anxiety was a significant predictor of cognitive reflection, even after controlling for the effects of general mathematical knowledge (in Experiment 1), school mathematical achievement (in Experiment 2) and test anxiety (in Experiments 1-3). Furthermore, Experiment 3 showed that mathematical anxiety and burdening working memory resources with a secondary task had similar effects on cognitive reflection. Given earlier findings that showed a close link between cognitive reflection, unbiased decisions and rationality, our results suggest that mathematical anxiety might be negatively related to individuals' ability to make advantageous choices and good decisions.
Allard, Alix; Bink, Marco C.A.M.; Martinez, Sébastien; Kelner, Jean-Jacques; Legave, Jean-Michel; di Guardo, Mario; Di Pierro, Erica A.; Laurens, François; van de Weg, Eric W.; Costes, Evelyne
2016-01-01
In temperate trees, growth resumption in springtime results from chilling and heat requirements, and is an adaptive trait under global warming. Here, the genetic determinism of budbreak and flowering time was deciphered using five related full-sib apple families. Both traits were observed over 3 years and two sites and expressed in calendar and degree-days. Best linear unbiased predictors of genotypic effect or interaction with climatic year were extracted from mixed linear models and used for quantitative trait locus (QTL) mapping, performed with an integrated genetic map containing 6849 single nucleotide polymorphisms (SNPs), grouped into haplotypes, and with a Bayesian pedigree-based analysis. Four major regions, on linkage group (LG) 7, LG10, LG12, and LG9, the latter being the most stable across families, sites, and years, explained 5.6–21.3% of trait variance. Co-localizations for traits in calendar days or growing degree hours (GDH) suggested common genetic determinism for chilling and heating requirements. Homologs of two major flowering genes, AGL24 and FT, were predicted close to LG9 and LG12 QTLs, respectively, whereas Dormancy-Associated MADS-box (DAM) genes were near additional QTLs on LG8 and LG15. This suggests that chilling perception mechanisms could be common among perennial and annual plants. Progenitors with favorable alleles depending on trait and LG were identified and could benefit new breeding strategies for apple adaptation to temperature increase. PMID:27034326
FALCON: fast and unbiased reconstruction of high-density super-resolution microscopy data
NASA Astrophysics Data System (ADS)
Min, Junhong; Vonesch, Cédric; Kirshner, Hagai; Carlini, Lina; Olivier, Nicolas; Holden, Seamus; Manley, Suliana; Ye, Jong Chul; Unser, Michael
2014-04-01
Super-resolution microscopy methods such as STORM and (F)PALM are now well-established tools for biological studies at the nanometer scale. However, conventional imaging schemes based on sparse activation of photo-switchable fluorescent probes have inherently poor temporal resolution, which is a serious limitation when investigating live-cell dynamics. Here, we present an algorithm for high-density super-resolution microscopy which combines a sparsity-promoting formulation with a Taylor series approximation of the PSF. Our algorithm is designed to provide unbiased localization in continuous space and high recall rates for high-density imaging, and to have orders-of-magnitude shorter run times than previous high-density algorithms. We validated the algorithm on both simulated and experimental data, and demonstrated live-cell imaging with a temporal resolution of 2.5 seconds by recovering fast ER dynamics.
Maximum likelihood: Extracting unbiased information from complex networks
NASA Astrophysics Data System (ADS)
Garlaschelli, Diego; Loffredo, Maria I.
2008-07-01
The choice of free parameters in network models is subjective, since it depends on what topological properties are being monitored. However, we show that the maximum likelihood (ML) principle indicates a unique, statistically rigorous parameter choice, associated with a well-defined topological feature. We then find that, if the ML condition is incompatible with the built-in parameter choice, network models turn out to be intrinsically ill-defined or biased. To overcome this problem, we construct a class of safely unbiased models. We also propose an extension of these results that leads to the intriguing possibility of extracting, from topological data alone, the “hidden variables” underlying network organization, making them “no longer hidden.” We test our method on World Trade Web data, where we recover the empirical gross domestic product using only topological information.
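As a toy illustration of the ML principle (ours, not taken from the paper): for the simplest one-parameter model, independent edges with a uniform connection probability p, the log-likelihood L log p + (P − L) log(1 − p) over P node pairs is maximized exactly at the observed link density, a well-defined topological feature.

```python
# Toy example: the ML estimate of p in an Erdos-Renyi-style model is the
# observed link density L / (N(N-1)/2). The tiny network here is made up.
edges = {(0, 1), (0, 2), (1, 2), (2, 3)}   # undirected toy network
nodes = {u for e in edges for u in e}
n = len(nodes)
pairs = n * (n - 1) // 2                    # number of possible links

# Log-likelihood L*log(p) + (pairs-L)*log(1-p) is maximized at L/pairs.
p_hat = len(edges) / pairs
print(p_hat)  # 4 edges over 6 pairs
```

The paper's point is the converse: if a model's built-in parameter is tied to some other quantity than the one the likelihood singles out, the model is biased by construction.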
Personalized recommendation based on unbiased consistence
NASA Astrophysics Data System (ADS)
Zhu, Xuzhen; Tian, Hui; Zhang, Ping; Hu, Zheng; Zhou, Tao
2015-08-01
Mass-diffusion-based recommendation algorithms on bipartite networks, drawn from physical dynamics, provide an efficient solution by automatically pushing potentially relevant items to users according to their past preferences. However, traditional mass-diffusion-based algorithms consider only unidirectional mass diffusion, from the objects a user has already collected to those that are candidates for recommendation, resulting in biased similarity estimation and mediocre performance. In this letter, we argue that in many cases a user's interests are stable, and thus diffusion in either direction, whether originating from already-collected objects or from candidate objects, should be equally informative, showing unbiased consistence. We further propose a consistence-based mass diffusion algorithm that uses bidirectional diffusion to counter this bias, outperforming state-of-the-art recommendation algorithms on several real data sets, including Netflix, MovieLens, Amazon and Rate Your Music.
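For concreteness, here is a minimal sketch of the classical unidirectional mass-diffusion baseline (ProbS-style) that the letter improves upon; the toy ratings data and names are ours, and the paper's bidirectional variant is not reproduced here.

```python
# Unidirectional mass diffusion on a user-item bipartite network:
# resource flows items -> users -> items, and uncollected items are
# ranked by received resource. Toy data, hypothetical names.
ratings = {  # user -> set of collected items
    "u1": {"a", "b"},
    "u2": {"b", "c"},
    "u3": {"a", "c", "d"},
}
items = sorted({i for s in ratings.values() for i in s})
deg_item = {i: sum(i in s for s in ratings.values()) for i in items}
deg_user = {u: len(s) for u, s in ratings.items()}

def recommend(target):
    # Step 1: each of the target's items spreads one unit of resource
    # evenly to the users who collected it.
    user_res = {u: sum(1 / deg_item[i] for i in ratings[target] if i in s)
                for u, s in ratings.items()}
    # Step 2: each user redistributes the received resource evenly
    # over his or her own items.
    score = {i: 0.0 for i in items}
    for u, s in ratings.items():
        for i in s:
            score[i] += user_res[u] / deg_user[u]
    # Rank the items the target has not yet collected.
    return sorted((i for i in items if i not in ratings[target]),
                  key=lambda i: -score[i])

print(recommend("u1"))
```

The bias the authors target arises because this flow runs in one direction only; their consistence-based variant also diffuses from candidate objects back toward the collected ones.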
Morton, Christine H
2009-01-01
The trend toward evidence-based information in childbirth education has been ongoing for some time. Lamaze educators are encouraged to present evidence for the Six Care Practices That Support Normal Birth to pregnant women in their childbirth classes. In a previous article published in The Journal of Perinatal Education, my colleague and I provided an overview of the dilemmas facing American childbirth educators. Childbirth education is a domain in which many types of authoritative knowledge are used: evidence, beliefs, and experience. In our study, educators told us their goal is to provide class participants with unbiased information that allows women to choose what is best for them. In this article, I further analyze educators’ dilemmas and challenges in presenting unbiased information, and I discuss some ethical considerations in educators’ practices. PMID:19436597
Host Galaxy Properties Of The Swift Bat Hard X-ray Survey Of Agn
NASA Astrophysics Data System (ADS)
Koss, Michael; Mushotzky, R.; Veilleux, S.; Winter, L.
2010-03-01
Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN visible only at hard X-ray and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. In 2008, we observed 90 of these targets at Kitt Peak with the 2.1-m telescope in the SDSS ugriz bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, stellar mass, star formation, and AGN luminosity for a sample of 145 hard-X-ray-selected AGN.
An unbiased X-ray sampling of stars within 25 parsecs of the Sun
NASA Technical Reports Server (NTRS)
Johnson, H. M.
1985-01-01
A search of all of the Einstein Observatory IPC and HRI fields for untargeted stars in the Woolley et al. Catalogue of Nearby Stars is reported. Optical data and IPC coordinates, flux density F_x, and luminosity L_x (or upper limits) are tabulated for 126 single or blended systems, with HRI results for a few of them. IPC luminosity functions are derived for the systems, for 193 individual stars in the systems (with L_x shared equally among blended components), and for 63 individual M dwarfs. Because they are nearby, these stars have relatively large X-ray flux densities that are free of interstellar extinction, but they are otherwise unbiased with respect to the X-ray properties that are found in a defined small space around the Sun.
Biased and unbiased strategies to identify biologically active small molecules.
Abet, Valentina; Mariani, Angelica; Truscott, Fiona R; Britton, Sébastien; Rodriguez, Raphaël
2014-08-15
Small molecules are central players in chemical biology studies. They promote the perturbation of cellular processes underlying diseases and enable the identification of biological targets that can be validated for therapeutic intervention. Small molecules have been shown to accurately tune a single function of pluripotent proteins in a reversible manner with exceptional temporal resolution. The identification of molecular probes and drugs remains a worthy challenge that can be addressed by the use of biased and unbiased strategies. Hypothesis-driven methodologies employ a known biological target to synthesize complementary hits, while discovery-driven strategies offer the additional means of identifying previously unanticipated biological targets. This review article provides a general overview of recent synthetic frameworks that gave rise to an impressive arsenal of biologically active small molecules with unprecedented cellular mechanisms. Copyright © 2014. Published by Elsevier Ltd.
Epidemiologic Evaluation of Measurement Data in the Presence of Detection Limits
Lubin, Jay H.; Colt, Joanne S.; Camann, David; Davis, Scott; Cerhan, James R.; Severson, Richard K.; Bernstein, Leslie; Hartge, Patricia
2004-01-01
Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of “fill-in” values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5–10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case–control study of non-Hodgkin lymphoma. PMID:15579415
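The substitution bias discussed above can be made concrete with a small simulation (ours, not the study's data): when a measurement with a lower detection limit (DL) is regressed on a covariate after replacing nondetects by DL/2, the estimated slope is attenuated once a sizable fraction of values is censored.

```python
# Hedged illustration: half-DL substitution under ~30% censoring
# attenuates the OLS slope. Simulated data; true slope is 1.
import random

random.seed(7)
n = 20000
x = [random.gauss(0, 1) for _ in range(n)]               # covariate
y = [1.0 + xi + random.gauss(0, 0.5) for xi in x]        # measurement
dl = 0.5                      # roughly 30% of y falls below this DL
y_sub = [yi if yi >= dl else dl / 2 for yi in y]         # half-DL fill-in

def ols_slope(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))

slope_full = ols_slope(x, y)     # close to the true slope of 1
slope_sub = ols_slope(x, y_sub)  # noticeably attenuated
print(round(slope_full, 2), round(slope_sub, 2))
```

Tobit regression or multiple imputation, as the abstract notes, avoids this attenuation by modeling the censoring mechanism rather than filling in a fixed value.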
Missing continuous outcomes under covariate dependent missingness in cluster randomised trials
Diaz-Ordaz, Karla; Bartlett, Jonathan W
2016-01-01
Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for small number of clusters in each intervention group. PMID:27177885
NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.
Liu, Sophia S; Hockenberry, Adam J; Lancichinetti, Andrea; Jewett, Michael C; Amaral, Luís A N
2016-11-01
The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
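A highly simplified sketch of the idea (our toy, not the NullSeq package): for a fixed amino acid sequence, sample each synonymous codon with weight exp(beta * GC(codon)). Tuning the single parameter beta steers the global GC content while leaving codon choice otherwise maximally unconstrained. The codon table subset and the beta values below are our assumptions.

```python
# Maximum-entropy-style codon sampling: weight synonymous codons by
# exp(beta * GC fraction). Small standard-genetic-code subset only.
import math
import random

CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],  # glycine
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "K": ["AAA", "AAG"],                # lysine
    "F": ["TTT", "TTC"],                # phenylalanine
}
gc = lambda codon: sum(b in "GC" for b in codon) / 3

def sample_sequence(aa_seq, beta, rng):
    out = []
    for aa in aa_seq:
        opts = CODONS[aa]
        weights = [math.exp(beta * gc(c)) for c in opts]
        out.append(rng.choices(opts, weights=weights)[0])
    return "".join(out)

rng = random.Random(0)
low = sample_sequence("GAKF" * 50, beta=-6, rng=rng)
high = sample_sequence("GAKF" * 50, beta=+6, rng=rng)
frac = lambda s: sum(b in "GC" for b in s) / len(s)
print(frac(low) < frac(high))  # more positive beta -> higher GC content
```

In practice one would solve for the beta that hits a target GC content exactly; the published tool handles that calibration, which this sketch omits.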
NASA Astrophysics Data System (ADS)
Raley, Angélique; Lee, Joe; Smith, Jeffrey T.; Sun, Xinghua; Farrell, Richard A.; Shearer, Jeffrey; Xu, Yongan; Ko, Akiteru; Metz, Andrew W.; Biolsi, Peter; Devilliers, Anton; Arnold, John; Felix, Nelson
2018-04-01
We report a sub-30nm pitch self-aligned double patterning (SADP) integration scheme with EUV lithography coupled with self-aligned block technology (SAB) targeting the back end of line (BEOL) metal line patterning applications for logic nodes beyond 5nm. The integration demonstration is a validation of the scalability of a previously reported flow, which used 193nm immersion SADP targeting a 40nm pitch with the same material sets (Si3N4 mandrel, SiO2 spacer, Spin on carbon, spin on glass). The multi-color integration approach is successfully demonstrated and provides a valuable method to address overlay concerns and more generally edge placement error (EPE) as a whole for advanced process nodes. Unbiased LER/LWR analysis comparison between EUV SADP and 193nm immersion SADP shows that both integrations follow the same trend throughout the process steps. While EUV SADP shows increased LER after mandrel pull, metal hardmask open and dielectric etch compared to 193nm immersion SADP, the final process performance is matched in terms of LWR (1.08nm 3 sigma unbiased) and is only 6% higher than 193nm immersion SADP for average unbiased LER. Using EUV SADP enables almost doubling the line density while keeping most of the remaining processes and films unchanged, and provides a compelling alternative to other multipatterning integrations, which present their own sets of challenges.
Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.
Hossain, Anower; Diaz-Ordaz, Karla; Bartlett, Jonathan W
2017-06-01
Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for small number of clusters in each intervention group.
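The unadjusted cluster-level analysis compared above can be sketched in a few lines (toy numbers and names are ours, not the paper's): collapse each cluster to its outcome mean, then estimate the intervention effect as the difference between arms in the mean of those cluster means.

```python
# Unadjusted cluster-level analysis: intervention effect as the
# difference in mean cluster-level outcome means. Toy data.
from statistics import mean

clusters = {
    # arm -> list of clusters, each a list of individual outcomes
    "control":      [[4.1, 3.9, 4.4], [3.6, 4.0], [4.2, 4.3, 3.8]],
    "intervention": [[5.0, 4.8], [5.3, 4.9, 5.1], [4.7, 5.2]],
}
cluster_means = {arm: [mean(c) for c in cs] for arm, cs in clusters.items()}
effect = mean(cluster_means["intervention"]) - mean(cluster_means["control"])
print(round(effect, 2))  # -> 0.97
```

Under covariate-dependent missingness, dropping incomplete individuals before taking these means is exactly where the bias the authors describe can enter, unless the missingness mechanism is the same in both arms.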
Testing assumptions for unbiased estimation of survival of radiomarked harlequin ducks
Esler, Daniel N.; Mulcahy, Daniel M.; Jarvis, Robert L.
2000-01-01
Unbiased estimates of survival based on individuals outfitted with radiotransmitters require meeting the assumptions that radios do not affect survival, and animals for which the radio signal is lost have the same survival probability as those for which fate is known. In most survival studies, researchers have made these assumptions without testing their validity. We tested these assumptions by comparing interannual recapture rates (and, by inference, survival) between radioed and unradioed adult female harlequin ducks (Histrionicus histrionicus), and for radioed females, between right-censored birds (i.e., those for which the radio signal was lost during the telemetry monitoring period) and birds with known fates. We found that recapture rates of birds equipped with implanted radiotransmitters (21.6 ± 3.0%; x̄ ± SE) were similar to unradioed birds (21.7 ± 8.6%), suggesting that radios did not affect survival. Recapture rates also were similar between right-censored (20.6 ± 5.1%) and known-fate individuals (22.1 ± 3.8%), suggesting that missing birds were not subject to differential mortality. We also determined that capture and handling resulted in short-term loss of body mass for both radioed and unradioed females and that this effect was more pronounced for radioed birds (the difference between groups was 15.4 ± 7.1 g). However, no difference existed in body mass after recapture 1 year later. Our study suggests that implanted radios are an unbiased method for estimating survival of harlequin ducks and likely other species under similar circumstances.
Evaluation and construction of diagnostic criteria for inclusion body myositis
Mammen, Andrew L.; Amato, Anthony A.; Weiss, Michael D.; Needham, Merrilee
2014-01-01
Objective: To use patient data to evaluate and construct diagnostic criteria for inclusion body myositis (IBM), a progressive disease of skeletal muscle. Methods: The literature was reviewed to identify all previously proposed IBM diagnostic criteria. These criteria were applied through medical records review to 200 patients diagnosed as having IBM and 171 patients diagnosed as having a muscle disease other than IBM by neuromuscular specialists at 2 institutions, and to a validating set of 66 additional patients with IBM from 2 other institutions. Machine learning techniques were used for unbiased construction of diagnostic criteria. Results: Twenty-four previously proposed IBM diagnostic categories were identified. Twelve categories all performed with high (≥97%) specificity but varied substantially in their sensitivities (11%–84%). The best performing category was European Neuromuscular Centre 2013 probable (sensitivity of 84%). Specialized pathologic features and newly introduced strength criteria (comparative knee extension/hip flexion strength) performed poorly. Unbiased data-directed analysis of 20 features in 371 patients resulted in construction of higher-performing data-derived diagnostic criteria (90% sensitivity and 96% specificity). Conclusions: Published expert consensus–derived IBM diagnostic categories have uniformly high specificity but wide-ranging sensitivities. High-performing IBM diagnostic category criteria can be developed directly from principled unbiased analysis of patient data. Classification of evidence: This study provides Class II evidence that published expert consensus–derived IBM diagnostic categories accurately distinguish IBM from other muscle disease with high specificity but wide-ranging sensitivities. PMID:24975859
León, Ileana R.; Schwämmle, Veit; Jensen, Ole N.; Sprenger, Richard R.
2013-01-01
The majority of mass spectrometry-based protein quantification studies use peptide-centric analytical methods and thus strongly rely on efficient and unbiased protein digestion protocols for sample preparation. We present a novel objective approach to assess protein digestion efficiency using a combination of qualitative and quantitative liquid chromatography-tandem MS methods and statistical data analysis. In contrast to previous studies we employed both standard qualitative as well as data-independent quantitative workflows to systematically assess trypsin digestion efficiency and bias using mitochondrial protein fractions. We evaluated nine trypsin-based digestion protocols, based on standard in-solution or on spin filter-aided digestion, including new optimized protocols. We investigated various reagents for protein solubilization and denaturation (dodecyl sulfate, deoxycholate, urea), several trypsin digestion conditions (buffer, RapiGest, deoxycholate, urea), and two methods for removal of detergents before analysis of peptides (acid precipitation or phase separation with ethyl acetate). Our data-independent quantitative liquid chromatography-tandem MS workflow quantified over 3700 distinct peptides with 96% completeness between all protocols and replicates, with an average 40% protein sequence coverage and an average of 11 peptides identified per protein. Systematic quantitative and statistical analysis of physicochemical parameters demonstrated that deoxycholate-assisted in-solution digestion combined with phase transfer allows for efficient, unbiased generation and recovery of peptides from all protein classes, including membrane proteins. This deoxycholate-assisted protocol was also optimal for spin filter-aided digestions as compared with existing methods. PMID:23792921
AN UNBIASED 1.3 mm EMISSION LINE SURVEY OF THE PROTOPLANETARY DISK ORBITING LkCa 15
DOE Office of Scientific and Technical Information (OSTI.GOV)
Punzi, K. M.; Kastner, J. H.; Hily-Blant, P.
2015-06-01
The outer (>30 AU) regions of the dusty circumstellar disk orbiting the ∼2–5 Myr old, actively accreting solar analog LkCa 15 are known to be chemically rich, and the inner disk may host a young protoplanet within its central cavity. To obtain a complete census of the brightest molecular line emission emanating from the LkCa 15 disk over the 210–270 GHz (1.4–1.1 mm) range, we have conducted an unbiased radio spectroscopic survey with the Institut de Radioastronomie Millimétrique (IRAM) 30 m telescope. The survey demonstrates that in this spectral region, the most readily detectable lines are those of CO and its isotopologues 13CO and C18O, as well as HCO+, HCN, CN, C2H, CS, and H2CO. All of these species had been previously detected in the LkCa 15 disk; however, the present survey includes the first complete coverage of the CN (2–1) and C2H (3–2) hyperfine complexes. Modeling of these emission complexes indicates that the CN and C2H either reside in the coldest regions of the disk or are subthermally excited, and that their abundances are enhanced relative to molecular clouds and young stellar object environments. These results highlight the value of unbiased single-dish line surveys in guiding future high-resolution interferometric imaging of disks.
Impact of using scatterometer and altimeter data on storm surge forecasting
NASA Astrophysics Data System (ADS)
Bajo, Marco; De Biasio, Francesco; Umgiesser, Georg; Vignudelli, Stefano; Zecchetto, Stefano
2017-05-01
Satellite data are rarely used in storm surge models because of the lack of established methodologies. Nevertheless, they can provide useful information on surface wind and sea level, which can potentially improve the forecast. In this paper satellite wind data are used to correct the bias of wind originating from a global atmospheric model, while satellite sea level data are used to improve the initial conditions of the model simulations. In a first step, the capability of global winds (biased and unbiased) to adequately force a storm surge model are assessed against that of a high resolution local wind. Then, the added value of direct assimilation of satellite altimeter data in the storm surge model is tested. Eleven storm surge events, recorded in Venice from 2008 to 2012, are simulated using different configurations of wind forcing and altimeter data assimilation. Focusing on the maximum surge peak, results show that the relative error, averaged over the eleven cases considered, decreases from 13% to 7%, using both the unbiased wind and assimilating the altimeter data, while, if the high resolution local wind is used to force the hydrodynamic model, the altimeter data assimilation reduces the error from 9% to 6%. Yet, the overall capabilities in reproducing the surge in the first day of forecast, measured by the correlation and by the rms error, improve only with the use of the unbiased global wind and not with the use of high resolution local wind and altimeter data assimilation.
48 CFR 31.201-6 - Accounting for unallowable costs.
Code of Federal Regulations, 2012 CFR
2012-10-01
... unbiased sample that is a reasonable representation of the sampling universe. (ii) Any large dollar value... universe from that sampled cost is also subject to the same penalty provisions. (4) Use of statistical...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-11-04
... HMS stock assessment. In order to ensure that the peer review is unbiased, individuals who... fisheries, related industries, research, teaching, writing, conservation, or management of marine organisms...
48 CFR 25.703-1 - Definitions.
Code of Federal Regulations, 2012 CFR
2012-10-01
... that is to be used specifically— (i) To restrict the free flow of unbiased information in Iran; or (ii) To disrupt, monitor, or otherwise restrict speech of the people of Iran; and (2) Does not include...
Is Science Biased Toward Natural?
Having widely available, accurate, understandable, and unbiased scientific information is central to the successful resolution of the typically contentious, divisive, and litigious natural resource policy issue. Three examples are offered to illustrate how science is often misus...
ERIC Educational Resources Information Center
Gould, Stephen Jay
1980-01-01
Challenges Jensen's arguments (set forth in the book "Bias in Mental Testing") that intelligence tests are scientifically unbiased and that IQ and other mental tests measure something called "intelligence" by refuting Jensen's reading of the psychometric research literature. (EF)
Larzul, Catherine; Gondret, Florence; Combes, Sylvie; de Rochambeau, Hubert
2005-01-01
The effects of selection for growth rate on weights and qualitative carcass and muscle traits were assessed by comparing two lines selected for live body weight at 63 days of age and a cryopreserved control population (C) raised contemporaneously with generation 5 selected rabbits. The animals were divergently selected for five generations for either a high (H line) or a low (L line) body weight, based on their BLUP breeding value. Heritability (h2) was 0.22 for 63-d body weight (N = 4754). Growth performance and quantitative carcass traits in the C group were intermediate between the H and L lines (N = 390). Perirenal fat proportion (h2 = 0.64) and dressing out percentage (h2 = 0.55) ranked in the order L < H = C (from low to high). The weight and cross-sectional area of the Semitendinosus muscle, and the mean diameter of the constitutive myofibres, were reduced in the L line only (N = 140). In the Longissimus muscle (N = 180), the ultimate pH (h2 = 0.16) and the maximum shear force reached in the Warner-Bratzler test (h2 = 0.57) were slightly modified by selection. PMID:15588570
Trait-specific long-term consequences of genomic selection in beef cattle.
de Rezende Neves, Haroldo Henrique; Carvalheiro, Roberto; de Queiroz, Sandra Aidar
2018-02-01
Simulation studies allow addressing the consequences of selection schemes, helping to identify effective strategies to enable genetic gain and maintain genetic diversity. The aim of this study was to evaluate the long-term impact of genomic selection (GS) on the genetic progress and genetic diversity of beef cattle. Forward-in-time simulation generated a population with a pattern of linkage disequilibrium close to that previously reported for real beef cattle populations. Different scenarios of GS and traditional pedigree-based BLUP (PBLUP) selection were simulated for 15 generations, mimicking selection for female reproduction and meat quality. For the GS scenarios, an alternative selection criterion (wGBLUP) was also simulated, intended to enhance long-term gains by attributing more weight to favorable alleles with low frequency. GS allowed genetic progress up to 40% greater than PBLUP for female reproduction and meat quality. The alternative criterion wGBLUP did not increase long-term response, although it reduced inbreeding rates and the loss of favorable alleles. The results suggest that GS outperforms PBLUP when the selected trait has a less polygenic background, and that attributing more weight to low-frequency favorable alleles can reduce inbreeding rates and the loss of favorable alleles in GS.
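The weighting idea behind wGBLUP can be sketched in a few lines. The p^(-1/2) weighting below is a commonly used (Jannink-style) choice assumed here for illustration; the study's exact weighting scheme may differ, and the marker effects and allele frequencies are made up.

```python
import numpy as np

def weighted_marker_effects(effects, favorable_allele_freq):
    """Up-weight marker effects whose favorable allele is rare: w = p**-0.5."""
    effects = np.asarray(effects, dtype=float)
    p = np.asarray(favorable_allele_freq, dtype=float)
    return effects / np.sqrt(p)

# Two markers with equal estimated effects; the second marker's favorable
# allele is rare, so it receives more weight in selection decisions.
eff = weighted_marker_effects([0.2, 0.2], favorable_allele_freq=[0.5, 0.05])
print(eff[0] < eff[1])  # the rare favorable allele is up-weighted
```

Variants of this scheme trade some short-term accuracy for retention of rare favorable alleles, which is consistent with the reduced loss of favorable alleles reported above.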
Solar Economics for Policymakers
NREL's Solar Technical Assistance program works with regions to give policymakers up-to-date, accurate, and unbiased information on solar economics and likely...
Best practices for traffic impact studies : final report.
DOT National Transportation Integrated Search
2006-06-01
For many years there have been concerns that some traffic engineers may approach traffic impact studies with an eye toward assisting developers in expediting their development approval rather than delivering an unbiased evaluation of the impact of the...
Alternative Fuel Transit Bus Evaluation Program Results
DOT National Transportation Integrated Search
1996-05-06
The objective of this program, which is supported by the U.S. Department of Energy (DOE) through the National Renewable Energy Laboratory (NREL), is to provide an unbiased and comprehensive comparison of transit buses operating on alternative f...
Motor activity as an unbiased variable to assess anaphylaxis in allergic rats
Abril-Gil, Mar; Garcia-Just, Alba; Cambras, Trinitat; Pérez-Cano, Francisco J; Castellote, Cristina; Franch, Àngels
2015-01-01
The release of mediators by mast cells triggers allergic symptoms involving various physiological systems and, in the most severe cases, the development of anaphylactic shock compromising mainly the nervous and cardiovascular systems. We aimed to establish variables to objectively study the anaphylactic response (AR) after an oral challenge in an allergy model. Brown Norway rats were immunized by intraperitoneal injection of ovalbumin with alum and toxin from Bordetella pertussis. Specific immunoglobulin (Ig) E antibodies were developed in immunized animals. Forty days after immunization, the rats were orally challenged with the allergen, and motor activity, body temperature and serum mast cell protease concentration were determined. The anaphylaxis induced a reduction in body temperature and a decrease in the number of animal movements, which was inversely correlated with serum mast cell protease release. In summary, motor activity is a reliable tool for assessing AR and also an unbiased method for screening new anti-allergic drugs. PMID:25716015
Xiao, Mengli; Zhang, Yongbo; Fu, Huimin; Wang, Zhihua
2018-05-01
A high-precision navigation algorithm is essential for a future Mars pinpoint landing mission. Unknown inputs caused by large uncertainties in atmospheric density and aerodynamic coefficients, as well as unknown measurement biases, can cause large estimation errors in conventional Kalman filters. This paper proposes a derivative-free version of the nonlinear unbiased minimum variance filter for Mars entry navigation. The filter addresses this problem by estimating the state and the unknown measurement biases simultaneously, without requiring derivatives, yielding a high-precision algorithm for Mars entry navigation. An IMU/radio-beacon integrated navigation scheme is used in the simulation, and the results show that, with or without radio blackout, the proposed filter achieves accurate state estimation, much better than the conventional unscented Kalman filter, demonstrating its suitability for high-precision Mars entry navigation. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
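The core idea of estimating the state and a measurement bias simultaneously can be illustrated with a much simpler, standard linear Kalman filter whose state vector is augmented with a constant bias term. The dynamics, noise levels, and bias value below are synthetic assumptions, not the paper's Mars entry model or its derivative-free filter.

```python
import numpy as np

rng = np.random.default_rng(2)
F = np.diag([0.9, 1.0])        # mean-reverting state x, constant bias b
H = np.array([[1.0, 1.0]])     # the sensor measures x + b
Q = np.diag([0.01, 0.0])       # no process noise on the constant bias
R = np.array([[0.25]])         # measurement noise variance

x_true, b_true = 0.0, 1.5      # synthetic truth; b is the unknown sensor bias
xhat, P = np.zeros(2), np.eye(2) * 10.0
for _ in range(2000):
    x_true = 0.9 * x_true + rng.normal(0.0, 0.1)
    z = x_true + b_true + rng.normal(0.0, 0.5)
    # predict
    xhat = F @ xhat
    P = F @ P @ F.T + Q
    # update (scalar measurement, so the gain is a simple division)
    S = H @ P @ H.T + R
    K = P @ H.T / S
    xhat = xhat + (K * (z - H @ xhat)).ravel()
    P = (np.eye(2) - K @ H) @ P

print(xhat[1])  # converges close to the true bias 1.5
```

Note the augmented pair (F, H) is observable here because the state and the bias have different dynamics; with identical dynamics the bias would not be separable from the state.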
Beilina, Alexandria; Rudenko, Iakov N.; Kaganovich, Alice; Civiero, Laura; Chau, Hien; Kalia, Suneil K.; Kalia, Lorraine V.; Lobbestael, Evy; Chia, Ruth; Ndukwe, Kelechi; Ding, Jinhui; Nalls, Mike A.; Olszewski, Maciej; Hauser, David N.; Kumaran, Ravindran; Lozano, Andres M.; Baekelandt, Veerle; Greene, Lois E.; Taymans, Jean-Marc; Greggio, Elisa; Cookson, Mark R.; Nalls, Mike A.; Plagnol, Vincent; Martinez, Maria; Hernandez, Dena G; Sharma, Manu; Sheerin, Una-Marie; Saad, Mohamad; Simón-Sánchez, Javier; Schulte, Claudia; Lesage, Suzanne; Sveinbjörnsdóttir, Sigurlaug; Arepalli, Sampath; Barker, Roger; Ben-Shlomo, Yoav; Berendse, Henk W; Berg, Daniela; Bhatia, Kailash; de Bie, Rob M A; Biffi, Alessandro; Bloem, Bas; Bochdanovits, Zoltan; Bonin, Michael; Bras, Jose M; Brockmann, Kathrin; Brooks, Janet; Burn, David J; Charlesworth, Gavin; Chen, Honglei; Chong, Sean; Clarke, Carl E; Cookson, Mark R; Cooper, J Mark; Corvol, Jean Christophe; Counsell, Carl; Damier, Philippe; Dartigues, Jean-François; Deloukas, Panos; Deuschl, Günther; Dexter, David T; van Dijk, Karin D; Dillman, Allissa; Durif, Frank; Dürr, Alexandra; Edkins, Sarah; Evans, Jonathan R; Foltynie, Thomas; Gao, Jianjun; Gardner, Michelle; Gibbs, J Raphael; Goate, Alison; Gray, Emma; Guerreiro, Rita; Gústafsson, Ómar; Harris, Clare; van Hilten, Jacobus J; Hofman, Albert; Hollenbeck, Albert; Holton, Janice; Hu, Michele; Huang, Xuemei; Huber, Heiko; Hudson, Gavin; Hunt, Sarah E; Huttenlocher, Johanna; Illig, Thomas; München, Helmholtz Zentrum; Jónsson, Pálmi V; Lambert, Jean-Charles; Langford, Cordelia; Lees, Andrew; Lichtner, Peter; München, Helmholtz Zentrum; Limousin, Patricia; Lopez, Grisel; Lorenz, Delia; McNeill, Alisdair; Moorby, Catriona; Moore, Matthew; Morris, Huw R; Morrison, Karen E; Mudanohwo, Ese; O’Sullivan, Sean S; Pearson, Justin; Perlmutter, Joel S; Pétursson, Hjörvar; Pollak, Pierre; Post, Bart; Potter, Simon; Ravina, Bernard; Revesz, Tamas; Riess, Olaf; Rivadeneira, Fernando; Rizzu, 
Patrizia; Ryten, Mina; Sawcer, Stephen; Schapira, Anthony; Scheffer, Hans; Shaw, Karen; Shoulson, Ira; Sidransky, Ellen; Smith, Colin; Spencer, Chris C A; Stefánsson, Hreinn; Steinberg, Stacy; Stockton, Joanna D; Strange, Amy; Talbot, Kevin; Tanner, Carlie M; Tashakkori-Ghanbaria, Avazeh; Tison, François; Trabzuni, Daniah; Traynor, Bryan J; Uitterlinden, André G; Velseboer, Daan; Vidailhet, Marie; Walker, Robert; van de Warrenburg, Bart; Wickremaratchi, Mirdhu; Williams, Nigel; Williams-Gray, Caroline H; Winder-Rhodes, Sophie; Stefánsson, Kári; Hardy, John; Heutink, Peter; Brice, Alexis; Gasser, Thomas; Singleton, Andrew B; Wood, Nicholas W; Chinnery, Patrick F; Arepalli, Sampath; Cookson, Mark R; Dillman, Allissa; Ferrucci, Luigi; Gibbs, J Raphael; Hernandez, Dena G; Johnson, Robert; Longo, Dan L; Majounie, Elisa; Nalls, Michael A; O’Brien, Richard; Singleton, Andrew B; Traynor, Bryan J; Troncoso, Juan; van der Brug, Marcel; Zielke, H Ronald; Zonderman, Alan B
2014-01-01
Mutations in leucine-rich repeat kinase 2 (LRRK2) cause inherited Parkinson disease (PD), and common variants around LRRK2 are a risk factor for sporadic PD. Using protein–protein interaction arrays, we identified BCL2-associated athanogene 5, Rab7L1 (RAB7, member RAS oncogene family-like 1), and Cyclin-G–associated kinase as binding partners of LRRK2. The latter two genes are candidate genes for risk for sporadic PD identified by genome-wide association studies. These proteins form a complex that promotes clearance of Golgi-derived vesicles through the autophagy–lysosome system both in vitro and in vivo. We propose that three different genes for PD have a common biological function. More generally, data integration from multiple unbiased screens can provide insight into human disease mechanisms. PMID:24510904
GTARG - The TOPEX/Poseidon ground track maintenance maneuver targeting program
NASA Technical Reports Server (NTRS)
Shapiro, Bruce E.; Bhat, Ramachandra S.
1993-01-01
GTARG is a computer program used to design orbit maintenance maneuvers for the TOPEX/Poseidon satellite. These maneuvers ensure that the ground track is kept within +/-1 km of an exact 9.9-day repeat pattern. Maneuver parameters are determined using either of two targeting strategies: longitude targeting, which maximizes the time between maneuvers, and time targeting, in which maneuvers are targeted to occur at specific intervals. The GTARG algorithm propagates nonsingular mean elements, taking into account anticipated errors in orbit determination, Delta-v execution, drag prediction, and Delta-v quantization. A satellite-unique drag model is used which incorporates an approximate mean orbital Jacchia-Roberts atmosphere and a variable mean area model. Maneuver Delta-v magnitudes are targeted to precisely maintain either the unbiased ground track itself, or a comfortable (3-sigma) error envelope about the unbiased ground track.
The New Peabody Picture Vocabulary Test-III: An Illusion of Unbiased Assessment?
Stockman, Ida J
2000-10-01
This article examines whether changes in the ethnic minority composition of the standardization sample for the latest edition of the Peabody Picture Vocabulary Test (PPVT-III, Dunn & Dunn, 1997) can be used as the sole explanation for children's better test scores when compared to an earlier edition, the Peabody Picture Vocabulary Test-Revised (PPVT-R, Dunn & Dunn, 1981). Results from a comparative analysis of these two test editions suggest that other factors may explain improved performances. Among these factors are the number of words and age levels sampled, the types of words and pictures used, and characteristics of the standardization sample other than its ethnic minority composition. This analysis also raises questions regarding the usefulness of converting scores from one edition to the other and the type of criteria that could be used to evaluate whether the PPVT-III is an unbiased test of vocabulary for children from diverse cultural and linguistic backgrounds.
Image denoising in mixed Poisson-Gaussian noise.
Luisier, Florian; Blu, Thierry; Unser, Michael
2011-03-01
We propose a general methodology (PURE-LET) to design and optimize a wide class of transform-domain thresholding algorithms for denoising images corrupted by mixed Poisson-Gaussian noise. We express the denoising process as a linear expansion of thresholds (LET) that we optimize by relying on a purely data-adaptive unbiased estimate of the mean-squared error (MSE), derived in a non-Bayesian framework (PURE: Poisson-Gaussian unbiased risk estimate). We provide a practical approximation of this theoretical MSE estimate for the tractable optimization of arbitrary transform-domain thresholding. We then propose a pointwise estimator for undecimated filterbank transforms, which consists of subband-adaptive thresholding functions with signal-dependent thresholds that are globally optimized in the image domain. We finally demonstrate the potential of the proposed approach through extensive comparisons with state-of-the-art techniques that are specifically tailored to the estimation of Poisson intensities. We also present denoising results obtained on real images of low-count fluorescence microscopy.
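The mixed Poisson-Gaussian observation model underlying PURE-LET can be sketched numerically. The intensity, detector gain, and Gaussian noise level below are illustrative assumptions; the snippet verifies only the variance decomposition of the noise model, not the thresholding algorithm itself.

```python
import numpy as np

# Mixed Poisson-Gaussian model: y = alpha * Poisson(x) + N(0, sigma^2),
# where alpha is a detector gain. All parameter values are illustrative.
rng = np.random.default_rng(1)
x = np.full(200_000, 20.0)          # true (constant) intensity
alpha, sigma = 1.0, 2.0
y = alpha * rng.poisson(x) + rng.normal(0.0, sigma, x.size)

# The observation variance combines both sources: alpha^2 * x + sigma^2.
ratio = y.var() / (alpha**2 * x.mean() + sigma**2)
print(ratio)  # close to 1
```

The signal-dependent Poisson part of this variance is what distinguishes the PURE risk estimate from the purely Gaussian SURE setting.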
Host Galaxy Morphologies Of Hard X-ray Selected AGN From The Swift BAT Survey
NASA Astrophysics Data System (ADS)
Koss, Michael; Mushotzky, R.; Veilleux, S.
2009-01-01
Surveys of AGN taken in the optical, UV, and soft X-rays miss an important population of obscured AGN only visible in the hard X-rays and mid-IR wavelengths. The SWIFT BAT survey in the hard X-ray range (14-195 keV) has provided a uniquely unbiased sample of 258 AGN unaffected by galactic or circumnuclear absorption. Optical imaging of this unbiased sample provides a new opportunity to understand how the environments of the host galaxies are linked to AGN. For these host galaxies, only a fraction, 29%, have high-quality optical images, predominantly from the SDSS. In addition, about 33% show peculiar morphologies and interactions. In 2008, we observed 110 of these targets with the Kitt Peak 2.1-m telescope in the SDSS bands over 17 nights. Using these observations and SDSS data we review the relationships between color, morphology, merger activity, star formation, and AGN luminosity.
Darré, Leonardo; Machado, Matías Rodrigo; Brandner, Astrid Febe; González, Humberto Carlos; Ferreira, Sebastián; Pantano, Sergio
2015-02-10
Modeling of macromolecular structures and interactions represents an important challenge for computational biology, involving different time and length scales. However, this task can be facilitated through the use of coarse-grained (CG) models, which reduce the number of degrees of freedom and allow efficient exploration of complex conformational spaces. This article presents a new CG protein model named SIRAH, developed to work with explicit solvent and to capture sequence, temperature, and ionic strength effects in a topologically unbiased manner. SIRAH is implemented in GROMACS, and interactions are calculated using a standard pairwise Hamiltonian for classical molecular dynamics simulations. We present a set of simulations that test the capability of SIRAH to produce a qualitatively correct solvation on different amino acids, hydrophilic/hydrophobic interactions, and long-range electrostatic recognition leading to spontaneous association of unstructured peptides and stable structures of single polypeptides and protein-protein complexes.
Biased lineup instructions and face identification from video images.
Thompson, W Burt; Johnson, Jaime
2008-01-01
Previous eyewitness memory research has shown that biased lineup instructions reduce identification accuracy, primarily by increasing false-positive identifications in target-absent lineups. Because some attempts at identification do not rely on a witness's memory of the perpetrator but instead involve matching photos to images on surveillance video, the authors investigated the effects of biased instructions on identification accuracy in a matching task. In Experiment 1, biased instructions did not affect the overall accuracy of participants who used video images as an identification aid, but nearly all correct decisions occurred with target-present photo spreads. Both biased and unbiased instructions resulted in high false-positive rates. In Experiment 2, which focused on video-photo matching accuracy with target-absent photo spreads, unbiased instructions led to more correct responses (i.e., fewer false positives). These findings suggest that investigators should not relax precautions against biased instructions when people attempt to match photos to an unfamiliar person recorded on video.
Unbiased approaches to biomarker discovery in neurodegenerative diseases
Chen-Plotkin, Alice S.
2014-01-01
Neurodegenerative diseases such as Alzheimer’s disease, Parkinson’s disease, amyotrophic lateral sclerosis, and frontotemporal dementia have several important features in common. They are progressive, they affect a relatively inaccessible organ, and we have no disease-modifying therapies for them. For these brain-based diseases, current diagnosis and evaluation of disease severity rely almost entirely on clinical examination, which may only be a rough approximation of disease state. Thus, the development of biomarkers – objective, relatively easily measured and precise indicators of pathogenic processes – could improve patient care and accelerate therapeutic discovery. Yet existing, rigorously tested neurodegenerative disease biomarkers are few, and even fewer biomarkers have translated into clinical use. To find new biomarkers for these diseases, an unbiased, high-throughput screening approach may be needed. In this review, I will describe the potential utility of such an approach to biomarker discovery, using Parkinson’s disease as a case example. PMID:25442938
Fault Tolerant Characteristics of Artificial Neural Network Electronic Hardware
NASA Technical Reports Server (NTRS)
Zee, Frank
1995-01-01
The fault tolerant characteristics of analog-VLSI artificial neural network chips (with 32 neurons and 532 synapses) were studied by exposing them to high-energy electrons, high-energy protons, and ionizing gamma radiation under biased and unbiased conditions. The biased chips became nonfunctional after receiving a cumulative dose of less than 20 krad, while the unbiased chips only started to show degradation at a cumulative dose of over 100 krad. As the total radiation dose increased, all the components demonstrated graceful degradation. The analog sigmoidal function of the neuron became steeper (an increase in gain), current leakage from the synapses progressively shifted the sigmoidal curve, and the digital memory of the synapses and the memory-addressing circuits began to gradually fail. From these radiation experiments, we can learn how to modify certain designs of the neural network electronic hardware, without using radiation-hardening techniques, to increase its reliability and fault tolerance.
Directed current in the Holstein system.
Hennig, D; Burbanks, A D; Osbaldestin, A H
2011-03-01
We propose a mechanism to rectify charge transport in the semiclassical Holstein model. It is shown that localized initial conditions associated with a polaron solution, in conjunction with static electron on-site potential not having inversion symmetry, constitute minimal prerequisites for the emergence of a directed current in the underlying periodic lattice system. In particular, we demonstrate that for unbiased spatially localized initial conditions (constituted by kicked static polaron states), violation of parity prevents the existence of pairs of counterpropagating trajectories, thus allowing for a directed current despite the time reversibility of the equations of motion. Nevertheless, propagating polaron solutions associated with sets of unbiased localized initial conditions which eventually leave the region of localized initial conditions do not exhibit time reversibility. Since the initial conditions belonging to the corresponding counterpropagating, current-compensating polaron solutions are not contained in the set, this gives rise to the emergence of a current. Occurrence of long-range coherent charge transport is demonstrated.
The Swift GRB Host Galaxy Legacy Survey
NASA Astrophysics Data System (ADS)
Perley, Daniel A.
2015-01-01
I introduce the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population across its entire redshift range. Using unbiased selection criteria we have designated a subset of 130 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, and Gemini to obtain complementary optical/NIR photometry to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass functions and their evolution with redshift between z=0 and z=5, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to probe cosmic star-formation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balci, Soner; Czaplewski, David A.; Jung, Il Woong
Besides achieving precise control over structural features, such as vertical alignment and uniform distribution, by fabricating the wires via e-beam lithography and etching, we also investigated the THz emission from these fabricated nanowires under an applied DC bias voltage. To apply a voltage bias, an interdigitated gold (Au) electrode was patterned on a high-quality InGaAs epilayer grown on an InP substrate by molecular beam epitaxy. Vertically aligned and uniformly distributed nanowires were then fabricated between the electrodes of this interdigitated pattern so that a voltage bias could be applied to improve the THz emission. As a result, we achieved an enhancement of the emitted THz radiation by about four times, an increase of about 12 dB in power ratio at 0.25 THz, with a DC-biased electric field compared with unbiased NWs.
Conformational free energy modeling of druglike molecules by metadynamics in the WHIM space.
Spiwok, Vojtěch; Hlat-Glembová, Katarína; Tvaroška, Igor; Králová, Blanka
2012-03-26
Protein-ligand affinities can be significantly influenced not only by the interaction itself but also by the conformational equilibrium of both binding partners, free ligand and free protein. Identification of the important conformational families of a ligand and prediction of their thermodynamics is important for efficient ligand design. Here we report conformational free energy modeling of nine small-molecule drugs in explicitly modeled water by metadynamics, with a bias potential applied in the space of weighted holistic invariant molecular (WHIM) descriptors. Application of metadynamics enhances conformational sampling compared to unbiased molecular dynamics simulation and allows prediction of the relative free energies of key conformations. Selected free energy minima and one example of a transition state were tested by a series of unbiased molecular dynamics simulations. Comparison of the free energy surfaces of free and target-bound Imatinib provides an estimate of the free energy penalty of the conformational change induced by binding to the target. © 2012 American Chemical Society
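The geometric core of a WHIM-style descriptor, the principal-axis variances of (optionally weighted) atomic coordinates, can be sketched as follows. The coordinates below are made up, and full WHIM descriptors involve several weighting schemes and derived shape indices beyond this sketch.

```python
import numpy as np

def whim_eigenvalues(coords, weights=None):
    """Eigenvalues of the weighted covariance of centered atomic coordinates,
    sorted in descending order (the raw ingredients of WHIM size/shape terms)."""
    coords = np.asarray(coords, dtype=float)
    w = np.ones(len(coords)) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    centered = coords - (w[:, None] * coords).sum(axis=0)
    cov = (w[:, None] * centered).T @ centered
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

# Hypothetical 4-atom fragment, elongated along x.
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0], [1.5, 1.2, 0.0]]
lam = whim_eigenvalues(coords)
print(lam[0] > lam[1] >= lam[2])  # the first principal axis dominates
```

Because the eigenvalues are invariant to rotation and translation of the molecule, biasing metadynamics along such descriptors avoids tying the collective variables to any particular reference frame.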
Unbiased free energy estimates in fast nonequilibrium transformations using Gaussian mixtures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Procacci, Piero
2015-04-21
In this paper, we present an improved method for obtaining unbiased estimates of the free energy difference between two thermodynamic states using the work distribution measured in nonequilibrium driven experiments connecting these states. The method is based on the assumption that any observed work distribution is given by a mixture of Gaussian distributions, whose normal components are identical in either direction of the nonequilibrium process, with weights regulated by the Crooks theorem. Using the prototypical example for the driven unfolding/folding of deca-alanine, we show that the predicted behavior of the forward and reverse work distributions, assuming a combination of only two Gaussian components with Crooks-derived weights, explains surprisingly well the striking asymmetry in the observed distributions at fast pulling speeds. The proposed methodology opens the way for a perfectly parallel implementation of Jarzynski-based free energy calculations in complex systems.
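For context, the plain exponential-average (Jarzynski) estimator that such methods build on can be sketched in a few lines. The Gaussian work distribution below is synthetic and chosen so the exact answer is known: for Gaussian work with variance sigma^2, the mean work is Delta_F + sigma^2/(2 kT).

```python
import numpy as np

def jarzynski_delta_f(work, kT=1.0):
    """Jarzynski estimator: Delta_F = -kT * ln <exp(-W/kT)>, in log-sum-exp
    form for numerical stability."""
    w = np.asarray(work, dtype=float) / kT
    return -kT * (np.logaddexp.reduce(-w) - np.log(len(w)))

rng = np.random.default_rng(0)
dF_true, sigma = 2.0, 1.0   # synthetic values, in units of kT
work = rng.normal(dF_true + sigma**2 / 2.0, sigma, size=200_000)
print(jarzynski_delta_f(work))  # close to dF_true = 2.0
```

At fast pulling speeds the work variance grows, the exponential average becomes dominated by rare low-work trajectories, and this estimator becomes strongly biased at finite sample sizes, which is the regime the Gaussian-mixture approach above is designed to handle.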
Functional annotation of chemical libraries across diverse biological processes.
Piotrowski, Jeff S; Li, Sheena C; Deshpande, Raamesh; Simpkins, Scott W; Nelson, Justin; Yashiroda, Yoko; Barber, Jacqueline M; Safizadeh, Hamid; Wilson, Erin; Okada, Hiroki; Gebre, Abraham A; Kubo, Karen; Torres, Nikko P; LeBlanc, Marissa A; Andrusiak, Kerry; Okamoto, Reika; Yoshimura, Mami; DeRango-Adem, Eva; van Leeuwen, Jolanda; Shirahige, Katsuhiko; Baryshnikova, Anastasia; Brown, Grant W; Hirano, Hiroyuki; Costanzo, Michael; Andrews, Brenda; Ohya, Yoshikazu; Osada, Hiroyuki; Yoshida, Minoru; Myers, Chad L; Boone, Charles
2017-09-01
Chemical-genetic approaches offer the potential for unbiased functional annotation of chemical libraries. Mutations can alter the response of cells in the presence of a compound, revealing chemical-genetic interactions that can elucidate a compound's mode of action. We developed a highly parallel, unbiased yeast chemical-genetic screening system involving three key components. First, in a drug-sensitive genetic background, we constructed an optimized diagnostic mutant collection that is predictive for all major yeast biological processes. Second, we implemented a multiplexed (768-plex) barcode-sequencing protocol, enabling the assembly of thousands of chemical-genetic profiles. Finally, based on comparison of the chemical-genetic profiles with a compendium of genome-wide genetic interaction profiles, we predicted compound functionality. Applying this high-throughput approach, we screened seven different compound libraries and annotated their functional diversity. We further validated biological process predictions, prioritized a diverse set of compounds, and identified compounds that appear to have dual modes of action.
Ehrenberg, A J; Nguy, A K; Theofilas, P; Dunlop, S; Suemoto, C K; Di Lorenzo Alho, A T; Leite, R P; Diehl Rodriguez, R; Mejia, M B; Rüb, U; Farfel, J M; de Lucena Ferretti-Rebustini, R E; Nascimento, C F; Nitrini, R; Pasquallucci, C A; Jacob-Filho, W; Miller, B; Seeley, W W; Heinsen, H; Grinberg, L T
2017-08-01
Hyperphosphorylated tau neuronal cytoplasmic inclusions (ht-NCI) are the best protein correlate of clinical decline in Alzheimer's disease (AD). Qualitative evidence identifies ht-NCI accumulating in the isodendritic core before the entorhinal cortex. Here, we used unbiased stereology to quantify ht-NCI burden in the locus coeruleus (LC) and dorsal raphe nucleus (DRN), aiming to characterize the impact of AD pathology in these nuclei with a focus on early stages. We utilized unbiased stereology in a sample of 48 well-characterized subjects enriched for controls and early AD stages. ht-NCI counts were estimated in 60-μm-thick sections immunostained for p-tau throughout LC and DRN. Data were integrated with unbiased estimates of the LC and DRN neuronal populations for a subset of cases. In Braak stage 0, 7.9% and 2.6% of neurons in LC and DRN, respectively, harbour ht-NCIs. Although the number of ht-NCI+ neurons significantly increased by about 1.9× from Braak stage 0 to I in the LC (P = 0.02), we failed to detect any significant difference between Braak stages I and II. Also, the number of ht-NCI+ neurons remained stable in the DRN between stages 0 and II. Finally, the differential susceptibility to tau inclusions among nuclear subdivisions was more notable in LC than in DRN. LC and DRN neurons exhibited ht-NCI during AD precortical stages. The ht-NCI burden increases with AD progression in both nuclei, but quantitative changes in LC precede DRN changes. © 2017 British Neuropathological Society.
A Stereological Method for the Quantitative Evaluation of Cartilage Repair Tissue
Nyengaard, Jens Randel; Lind, Martin; Spector, Myron
2015-01-01
Objective: To implement stereological principles to develop an easily applicable algorithm for unbiased and quantitative evaluation of cartilage repair. Design: Design-unbiased sampling was performed by systematically sectioning the defect perpendicular to the joint surface in parallel planes, providing 7 to 10 hematoxylin-eosin-stained histological sections. Counting windows were systematically selected and converted into image files (40-50 per defect). The quantification was performed by two-step point counting: (1) calculation of defect volume and (2) quantitative analysis of tissue composition. Step 2 was performed by assigning each point to one of the following categories based on validated and easily distinguishable morphological characteristics: (1) hyaline cartilage (rounded cells in lacunae in hyaline matrix), (2) fibrocartilage (rounded cells in lacunae in fibrous matrix), (3) fibrous tissue (elongated cells in fibrous tissue), (4) bone, (5) scaffold material, and (6) other. The ability to discriminate between the tissue types was determined using conventional or polarized light microscopy, and the interobserver variability was evaluated. Results: We describe the application of the stereological method. In the example, we assessed the defect repair tissue volume to be 4.4 mm3 (CE = 0.01). The tissue fractions were subsequently evaluated. Polarized light illumination of the slides improved discrimination between hyaline cartilage and fibrocartilage and increased the interobserver agreement compared with conventional transmitted light. Conclusion: We have applied a design-unbiased method for quantitative evaluation of cartilage repair, and we propose this algorithm as a natural supplement to existing descriptive semiquantitative scoring systems. We also propose that polarized light is effective for discrimination between hyaline cartilage and fibrocartilage. PMID:26069715
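The two-step point-counting estimate described above can be sketched as follows. The section spacing, grid area per point, and counts are hypothetical numbers chosen for illustration, not the study's data; step 1 is the classical Cavalieri volume estimate and step 2 the tissue-fraction tally.

```python
def cavalieri_volume(points_per_section, section_spacing_mm, area_per_point_mm2):
    """Step 1: V = section spacing * grid area per point * total points counted."""
    return section_spacing_mm * area_per_point_mm2 * sum(points_per_section)

def tissue_fractions(counts):
    """Step 2: fraction of grid points falling on each tissue category."""
    total = sum(counts.values())
    return {tissue: n / total for tissue, n in counts.items()}

# Hypothetical counts: points hitting the defect on each of 7 sections,
# 0.5 mm section spacing, a 0.3 x 0.3 mm point grid (0.09 mm^2 per point).
points = [12, 18, 22, 20, 15, 8, 5]
vol = cavalieri_volume(points, section_spacing_mm=0.5, area_per_point_mm2=0.09)
frac = tissue_fractions({"hyaline": 40, "fibrocartilage": 25,
                         "fibrous": 20, "bone": 10, "other": 5})
print(vol, frac["hyaline"])  # 4.5 mm^3; 40% hyaline cartilage
```

Because every section and counting window is selected systematically rather than by eye, the resulting volume and fraction estimates are design-unbiased.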
Guenole, Nigel
2018-01-01
The test for item level cluster bias examines the improvement in model fit that results from freeing an item's between-level residual variance from a baseline model with equal within- and between-level factor loadings and between-level residual variances fixed at zero. A potential problem is that this approach may include a misspecified unrestricted model if any non-invariance is present, but the log-likelihood difference test requires that the unrestricted model is correctly specified. A free baseline approach, where the unrestricted model includes only the restrictions needed for model identification, should lead to better decision accuracy, but no studies have examined this yet. We ran a Monte Carlo study to investigate this issue. When the referent item is unbiased, compared to the free baseline approach, the constrained baseline approach led to similar true positive (power) rates but much higher false positive (Type I error) rates. The free baseline approach should therefore be preferred when the referent indicator is unbiased. When the referent assumption is violated, the false positive rate was unacceptably high for both the free and constrained baseline approaches, and the true positive rate was poor regardless of which approach was used. Neither the free nor the constrained baseline approach can be recommended when the referent indicator is biased. We recommend paying close attention to ensuring the referent indicator is unbiased in tests of cluster bias. All Mplus input and output files and the R and short Python scripts used to execute this simulation study have been uploaded to an open access repository.
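The log-likelihood difference test underlying the free- versus constrained-baseline comparison can be sketched for a single freed parameter (df = 1); the log-likelihood values below are made-up numbers for illustration, not from the study.

```python
import math

def lr_test_df1(loglik_restricted, loglik_free):
    """Likelihood-ratio test for nested models differing by one parameter.
    For df = 1, the chi-square survival function is erfc(sqrt(stat / 2))."""
    stat = 2.0 * (loglik_free - loglik_restricted)
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical fit: baseline model vs. a model freeing one item's
# between-level residual variance.
stat, p = lr_test_df1(loglik_restricted=-1052.3, loglik_free=-1048.1)
print(round(stat, 1), p < 0.05)  # 8.4, significant at the 5% level
```

The test is only valid when the less restricted model is correctly specified, which is exactly why a misspecified constrained baseline inflates the Type I error rates reported above.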
Griaud, François; Denefeld, Blandine; Lang, Manuel; Hensinger, Héloïse; Haberl, Peter; Berg, Matthias
2017-07-01
Characterization of charge-based variants by mass spectrometry (MS) is required for the analytical development of a new biologic entity and its marketing approval by health authorities. However, standard peak-based data analysis approaches are time-consuming and biased toward the detection, identification, and quantification of main variants only. The aim of this study was to characterize in-depth acidic and basic species of a stressed IgG1 monoclonal antibody using comprehensive and unbiased MS data evaluation tools. Fractions collected from cation exchange (CEX) chromatography were analyzed as intact, after reduction of disulfide bridges, and after proteolytic cleavage using Lys-C. Data from both intact and reduced samples were evaluated consistently using a time-resolved deconvolution algorithm. Peptide mapping data were processed simultaneously, quantified, and compared in a systematic manner for all MS signals and fractions. Differences observed between the fractions were then further characterized and assigned. Time-resolved deconvolution enhanced pattern visualization and data interpretation of main and minor modifications in 3-dimensional maps across CEX fractions. Relative quantification of all MS signals across CEX fractions before peptide assignment enabled the detection of fraction-specific chemical modifications at abundances below 1%. Acidic fractions were shown to be heterogeneous, containing antibody fragments, glycated as well as deamidated forms of the heavy and light chains. In contrast, the basic fractions contained mainly modifications of the C-terminus and pyroglutamate formation at the N-terminus of the heavy chain. Systematic data evaluation was performed to investigate multiple data sets and comprehensively extract main and minor differences between each CEX fraction in an unbiased manner.
Gao, Wen; Yang, Hua; Qi, Lian-Wen; Liu, E-Hu; Ren, Mei-Ting; Yan, Yu-Ting; Chen, Jun; Li, Ping
2012-07-06
Plant-based medicines are becoming increasingly popular around the world. Authentication of herbal raw materials is important to ensure their safety and efficacy. Some herbs belonging to closely related species but differing in medicinal properties are difficult to identify because of similar morphological and microscopic characteristics. Chromatographic fingerprinting is an alternative method to distinguish them. Existing approaches do not allow a comprehensive analysis for herbal authentication. We have now developed a strategy consisting of (1) full metabolic profiling of herbal medicines by rapid resolution liquid chromatography (RRLC) combined with quadrupole time-of-flight mass spectrometry (QTOF MS), (2) global analysis of non-targeted compounds by a molecular feature extraction algorithm, (3) multivariate statistical analysis for classification and prediction, and (4) characterization of marker compounds. This approach has provided a fast and unbiased comparative multivariate analysis of the metabolite composition of 33 batches of samples covering seven Lonicera species. Individual metabolic profiles are obtained at the level of molecular fragments without prior structural assignment. In the entire set, the obtained classifier for the flower buds of seven Lonicera species showed good prediction performance, and a total of 82 statistically different components were rapidly obtained by the strategy. The elemental compositions of discriminative metabolites were characterized by accurate mass measurement of the pseudomolecular ions, and their chemical types were assigned from the MS/MS spectra. The high-resolution, comprehensive and unbiased strategy for metabolite data analysis presented here is powerful and opens a new direction for authentication in herbal analysis. Copyright © 2012 Elsevier B.V. All rights reserved.
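Step (3), multivariate classification of non-targeted profiles, can be illustrated with a minimal PCA sketch on simulated peak intensities. Everything here is a toy stand-in: two hypothetical species and five artificial "marker" features, whereas the study used 33 batches across seven species:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated peak-intensity table: two hypothetical species, five of
# forty features shifted to act as discriminative "marker" compounds.
n_per_species, n_features = 10, 40
species_a = rng.normal(0.0, 1.0, (n_per_species, n_features))
species_b = rng.normal(0.0, 1.0, (n_per_species, n_features))
species_b[:, :5] += 4.0

X = np.vstack([species_a, species_b])
X -= X.mean(axis=0)                       # center before PCA

# PCA via SVD; rows of Vt are principal axes, scores are projections.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = X @ Vt[:2].T

# With a shift this large, PC1 should separate the two species cleanly.
pc1_a = scores[:n_per_species, 0]
pc1_b = scores[n_per_species:, 0]
separated = pc1_a.max() < pc1_b.min() or pc1_b.max() < pc1_a.min()
```

The features with the largest loadings on the separating component play the role of the statistically different components that are then characterized by MS/MS.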
Estimating unbiased economies of scale of HIV prevention projects: a case study of Avahan.
Lépine, Aurélia; Vassall, Anna; Chandrashekar, Sudha; Blanc, Elodie; Le Nestour, Alexis
2015-04-01
Governments and donors are investing considerable resources on HIV prevention in order to scale up these services rapidly. Given the current economic climate, providers of HIV prevention services increasingly need to demonstrate that these investments offer good 'value for money'. One of the primary routes to efficiency is to take advantage of economies of scale (a reduction in the average cost of a health service as provision scales up), yet empirical evidence on economies of scale is scarce. Methodologically, the estimation of economies of scale is hampered by several statistical issues that prevent causal inference. In order to estimate unbiased economies of scale when scaling up HIV prevention services, we apply our analysis to one of the few HIV prevention programmes delivered at large scale globally: the Indian Avahan initiative. We costed the project by collecting data from the 138 Avahan NGOs and the supporting partners during the first four years of its scale-up, between 2004 and 2007. We develop a parsimonious empirical model and apply system Generalized Method of Moments (GMM) and fixed-effects Instrumental Variable (IV) estimators to estimate unbiased economies of scale. At the programme level, we find that, after controlling for the endogeneity of scale, the scale-up of Avahan has generated high economies of scale. Our findings suggest that average cost reductions per person reached are achievable when scaling up HIV prevention in low- and middle-income countries. Copyright © 2015 Elsevier Ltd. All rights reserved.
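The quantity being estimated, economies of scale, corresponds to a cost elasticity with respect to scale below one; a minimal sketch on simulated cost data follows. Note this uses naive OLS purely for illustration: the study applies system GMM and fixed-effects IV precisely because plain OLS is biased when scale is endogenous. All numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated programme costs: a cost elasticity below one means total
# cost grows more slowly than scale, i.e. average cost per person
# reached falls as provision scales up (economies of scale).
n = 500
log_scale = rng.uniform(4, 9, n)              # log of people reached
true_elasticity = 0.7
log_cost = 2.0 + true_elasticity * log_scale + rng.normal(0, 0.2, n)

# Naive OLS of log cost on log scale. The study instead uses system
# GMM / fixed-effects IV because scale is endogenous in real data.
X = np.column_stack([np.ones(n), log_scale])
beta, *_ = np.linalg.lstsq(X, log_cost, rcond=None)
elasticity = beta[1]
economies_of_scale = elasticity < 1.0
```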
A Stereological Method for the Quantitative Evaluation of Cartilage Repair Tissue.
Foldager, Casper Bindzus; Nyengaard, Jens Randel; Lind, Martin; Spector, Myron
2015-04-01
To implement stereological principles to develop an easily applicable algorithm for unbiased and quantitative evaluation of cartilage repair. Design-unbiased sampling was performed by systematically sectioning the defect perpendicular to the joint surface in parallel planes, providing 7 to 10 hematoxylin-eosin stained histological sections. Counting windows were systematically selected and converted into image files (40-50 per defect). The quantification was performed by two-step point counting: (1) calculation of defect volume and (2) quantitative analysis of tissue composition. Step 2 was performed by assigning each point to one of the following categories based on validated and easily distinguishable morphological characteristics: (1) hyaline cartilage (rounded cells in lacunae in hyaline matrix), (2) fibrocartilage (rounded cells in lacunae in fibrous matrix), (3) fibrous tissue (elongated cells in fibrous tissue), (4) bone, (5) scaffold material, and (6) others. The ability to discriminate between the tissue types was determined using conventional or polarized light microscopy, and the interobserver variability was evaluated. We describe the application of the stereological method. In the example, we assessed the defect repair tissue volume to be 4.4 mm³ (CE = 0.01). The tissue fractions were subsequently evaluated. Polarized light illumination of the slides improved discrimination between hyaline cartilage and fibrocartilage and increased the interobserver agreement compared with conventional transmitted light. We have applied a design-unbiased method for quantitative evaluation of cartilage repair, and we propose this algorithm as a natural supplement to existing descriptive semiquantitative scoring systems. We also propose that polarized light is effective for discrimination between hyaline cartilage and fibrocartilage.
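A minimal sketch of the two-step point counting described above; all counts and grid dimensions are hypothetical, not the study's data:

```python
# Step 1: Cavalieri-type volume estimate from systematic parallel
# sections: V = (total points) * (area per point) * (section spacing).
points_per_section = [12, 18, 25, 24, 20, 15, 9]   # hits on repair tissue
area_per_point_mm2 = 0.04                          # 0.2 mm x 0.2 mm grid
section_spacing_mm = 0.8

volume_mm3 = sum(points_per_section) * area_per_point_mm2 * section_spacing_mm

# Step 2: tissue composition as point fractions per category.
category_counts = {"hyaline cartilage": 61, "fibrocartilage": 34,
                   "fibrous tissue": 18, "bone": 7, "scaffold": 3}
total_points = sum(category_counts.values())
tissue_fractions = {k: v / total_points for k, v in category_counts.items()}
```

Because both steps reduce to unweighted point counts on systematically sampled sections, the estimates stay design-unbiased regardless of defect shape.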
Griffen, Edward J; Dossetter, Alexander G; Leach, Andrew G; Montague, Shane
2018-03-22
AI comes to lead optimization: medicinal chemistry in all disease areas can be accelerated by exploiting our pre-competitive knowledge in an unbiased way. Copyright © 2018 Elsevier Ltd. All rights reserved.
DOT National Transportation Integrated Search
2003-10-29
The objective of the DOE/NREL evaluation program is to provide comprehensive, unbiased evaluation results of advanced technology vehicle development and operations, evaluation of hydrogen infrastructure development and operation, and descriptions of ...
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even, unbiased linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has a larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in the finite set {n1, n2, …, nL}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased.
We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
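The point about conditional versus marginal bias can be illustrated with a small simulation: conditioning on whether the trial stopped early makes the sample average look badly biased, while the bias marginalized over both stopping outcomes is an order of magnitude smaller and vanishes asymptotically. The one-interim design and all parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative one-interim design: observe n1 outcomes, stop if the
# stage-1 mean exceeds c, otherwise continue to n2; estimate with the
# ordinary sample average of whatever was observed. True mean is 0.
mu, n1, n2, c = 0.0, 20, 40, 0.2
n_sims = 20000
estimates = np.empty(n_sims)
stopped = np.empty(n_sims, dtype=bool)
for i in range(n_sims):
    x = rng.normal(mu, 1.0, n2)
    if x[:n1].mean() > c:
        estimates[i], stopped[i] = x[:n1].mean(), True
    else:
        estimates[i], stopped[i] = x.mean(), False

mean_if_stopped = estimates[stopped].mean()      # looks badly biased
mean_if_continued = estimates[~stopped].mean()   # also off, the other way
marginal_mean = estimates.mean()                 # much closer to mu
```

Summarizing only within the stopped (or only within the continued) trials gives the false impression of a large bias that the marginal summary does not support.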
Hozé, C; Fritz, S; Phocas, F; Boichard, D; Ducrocq, V; Croiseau, P
2014-01-01
Single-breed genomic selection (GS) based on medium single nucleotide polymorphism (SNP) density (~50,000; 50K) is now routinely implemented in several large cattle breeds. However, building large enough reference populations remains a challenge for many medium or small breeds. The high-density BovineHD BeadChip (HD chip; Illumina Inc., San Diego, CA) containing 777,609 SNP developed in 2010 is characterized by short-distance linkage disequilibrium expected to be maintained across breeds. Therefore, combining reference populations can be envisioned. A population of 1,869 influential ancestors from 3 dairy breeds (Holstein, Montbéliarde, and Normande) was genotyped with the HD chip. Using this sample, 50K genotypes were imputed within breed to high-density genotypes, leading to a large HD reference population. This population was used to develop a multi-breed genomic evaluation. The goal of this paper was to investigate the gain of multi-breed genomic evaluation for a small breed. The advantage of using a large breed (Normande in the present study) to mimic a small breed is the large potential validation population to compare alternative genomic selection approaches more reliably. In the Normande breed, 3 training sets were defined with 1,597, 404, and 198 bulls, and a unique validation set included the 394 youngest bulls. For each training set, estimated breeding values (EBV) were computed using pedigree-based BLUP, single-breed BayesC, or multi-breed BayesC for which the reference population was formed by any of the Normande training data sets and 4,989 Holstein and 1,788 Montbéliarde bulls. Phenotypes were standardized by within-breed genetic standard deviation, the proportion of polygenic variance was set to 30%, and the estimated number of SNP with a nonzero effect was about 7,000. The 2 genomic selection (GS) approaches were performed using either the 50K or HD genotypes. 
The correlations between EBV and observed daughter yield deviations (DYD) were computed for 6 traits and using the different prediction approaches. Compared with pedigree-based BLUP, the average gain in accuracy with GS in small populations was 0.057 for the single-breed and 0.086 for the multi-breed approach. This gain was up to 0.193 and 0.209, respectively, with the large reference population. Improvement of EBV prediction due to the multi-breed evaluation was higher for animals not closely related to the reference population. In the case of a breed with a small reference population size, the increase in correlation due to multi-breed GS was 0.141 for bulls without their sire in the reference population compared with 0.016 for bulls with their sire in the reference population. These results demonstrate that multi-breed GS can contribute to increasing genomic evaluation accuracy in small breeds. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
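As a toy illustration of SNP-based prediction accuracy (not the BayesC model used in the study), genomic BLUP can be written as ridge regression of phenotypes on centered genotypes, with accuracy measured as the correlation between predicted and true breeding values in a validation set. All sizes and variances are simulated assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy SNP data: GBLUP is equivalent to ridge regression of phenotypes
# on centered genotype codes {0, 1, 2}.
n_train, n_valid, n_snp = 400, 100, 500
M = rng.binomial(2, 0.3, size=(n_train + n_valid, n_snp)).astype(float)
M -= M.mean(axis=0)
snp_effects = rng.normal(0.0, 0.05, n_snp)
g = M @ snp_effects                                  # true breeding values
y = g + rng.normal(0.0, g.std(), n_train + n_valid)  # heritability ~ 0.5

Xtr, ytr = M[:n_train], y[:n_train]
lam = 210.0   # ridge parameter, roughly sigma_e^2 / sigma_snp^2 here
beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(n_snp), Xtr.T @ ytr)

# Accuracy: correlation of predicted and true breeding values
# in the validation animals.
ebv_valid = M[n_train:] @ beta
accuracy = np.corrcoef(ebv_valid, g[n_train:])[0, 1]
```

Enlarging `n_train`, as a multi-breed reference population does in effect, is what raises this correlation.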
News Sources on Rhodesia: A Comparative Analysis.
ERIC Educational Resources Information Center
McCoy, Jennifer; Cholawsky, Elizabeth
1982-01-01
Concludes that the "London Times" and the Foreign Broadcast Information Service of the United States government provide both comprehensive and unbiased coverage of events in Rhodesia, while the "New York Times" is less complete and the "Christian Science Monitor" is selective. (FL)
The Battle for Legitimacy: "Jazz" Versus Academia
ERIC Educational Resources Information Center
Baker, David
1973-01-01
Jazz is black music; black man gave this music to the world, and every advancement and major innovation of this music has come from him. This is perhaps the reason that has hindered the unbiased acceptance of jazz into academia. (Author/RJ)
2014-01-01
Background When asked to solve mathematical problems, some people experience anxiety and threat, which can lead to impaired mathematical performance (Curr Dir Psychol Sci 11:181–185, 2002). The present studies investigated the link between mathematical anxiety and performance on the cognitive reflection test (CRT; J Econ Perspect 19:25–42, 2005). The CRT is a measure of a person’s ability to resist intuitive response tendencies, and it correlates strongly with important real-life outcomes, such as time preferences, risk-taking, and rational thinking. Methods In Experiments 1 and 2 the relationships between maths anxiety, mathematical knowledge/mathematical achievement, test anxiety and cognitive reflection were analysed using mediation analyses. Experiment 3 included a manipulation of working memory load. The effects of anxiety and working memory load were analysed using ANOVAs. Results Our experiments with university students (Experiments 1 and 3) and secondary school students (Experiment 2) demonstrated that mathematical anxiety was a significant predictor of cognitive reflection, even after controlling for the effects of general mathematical knowledge (in Experiment 1), school mathematical achievement (in Experiment 2) and test anxiety (in Experiments 1–3). Furthermore, Experiment 3 showed that mathematical anxiety and burdening working memory resources with a secondary task had similar effects on cognitive reflection. Conclusions Given earlier findings that showed a close link between cognitive reflection, unbiased decisions and rationality, our results suggest that mathematical anxiety might be negatively related to individuals’ ability to make advantageous choices and good decisions. PMID:25179230
Parresol, B. R.; Scott, D. A.; Zarnoch, S. J.; ...
2017-12-15
Spatially explicit mapping of forest productivity is important to assess many forest management alternatives. We assessed the relationship between mapped variables and site index of forests ranging from southern pine plantations to natural hardwoods on a 74,000-ha landscape in South Carolina, USA. Mapped features used in the analysis were soil association, land use condition in 1951, depth to groundwater, slope and aspect. Basal area, species composition, age and height were the tree variables measured. Linear modelling identified that plot basal area, depth to groundwater, soil association and the interactions between depth to groundwater and forest group, and between land use in 1951 and forest group were related to site index (SI) (R² = 0.37), but this model had regression attenuation. We then used structural equation modeling to incorporate error-in-measurement corrections for basal area and groundwater to remove bias in the model. We validated this model using 89 independent observations and found the 95% confidence intervals for the slope and intercept of an observed vs. predicted site index error-corrected regression included zero and one, respectively, indicating a good fit. With error in measurement incorporated, only basal area, soil association, and the interaction between forest groups and land use were important predictors (R² = 0.57). Thus, we were able to develop an unbiased model of SI that could be applied to create a spatially explicit map based primarily on soils as modified by past (land use and forest type) and recent forest management (basal area).
Influences on choice of surgery as a career: a study of consecutive cohorts in a medical school.
Sobral, Dejano T
2006-06-01
To examine the differential impact of person-based and programme-related features on graduates' dichotomous choice between surgical or non-surgical field specialties for first-year residency. A 10-year cohort study was conducted, following 578 students (55.4% male) who graduated from a university medical school during 1994-2003. Data were collected as follows: at the beginning of medical studies, on career preference and learning frame; during medical studies, on academic achievement, cross-year peer tutoring and selective clinical traineeship, and at graduation, on the first-year residency selected. Contingency and logistic regression analyses were performed, with graduates grouped by the dichotomous choice of surgery or not. Overall, 23% of graduates selected a first-year residency in surgery. Seven time-steady features related to this choice: male sex, high self-confidence, option of surgery at admission, active learning style, preference for surgery after Year 1, peer tutoring on clinical surgery, and selective training in clinical surgery. Logistic regression analysis, including all features, predicted 87.1% of the graduates' choices. Male sex, updated preference, peer tutoring and selective training were the most significant predictors in the pathway to choice. The relative roles of person-based and programme-related factors in the choice process are discussed. The findings suggest that for most students the choice of surgery derives from a temporal summation of influences that encompass entry and post-entry factors blended in variable patterns. It is likely that sex-unbiased peer tutoring and selective training supported the students' search process for personal compatibility with specialty-related domains of content and process.
NASA Astrophysics Data System (ADS)
Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd
2017-08-01
The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation. In other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its property of minimum variance in the class of linear unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the area of Kedah and Perlis. The results indicated that the factors affecting paddy production are mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle, and the interaction between pest and disease and NPK fertilizer application cycle.
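The weighted least squares estimator described above has the closed form beta = (X'WX)^(-1) X'Wy, with W the diagonal matrix of inverse error variances. A minimal sketch on simulated heteroscedastic data (true coefficients and the variance function are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)

# Heteroscedastic data: the error standard deviation grows with x, so
# OLS stays unbiased but is no longer minimum-variance; WLS reweights
# each observation by the inverse of its error variance.
n = 200
x = rng.uniform(1, 10, n)
sigma = 0.3 * x
y = 2.0 + 0.5 * x + rng.normal(0.0, sigma)

X = np.column_stack([np.ones(n), x])
w = 1.0 / sigma**2                    # weights = inverse variances

# Closed form: beta = (X'WX)^{-1} X'Wy, with W = diag(w).
XtW = X.T * w
beta_wls = np.linalg.solve(XtW @ X, XtW @ y)
intercept, slope = beta_wls
```

The weights give the precisely measured small-x points the greater influence that an unweighted fit would deny them.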
van Rein, Nienke; Lijfering, Willem M.; Bos, Mettine H. A.; Herruer, Martien H.; Vermaas, Helga W.; van der Meer, Felix J. M.; Reitsma, Pieter H.
2016-01-01
Background Risk scores for patients who are at high risk for major bleeding complications during treatment with vitamin K antagonists (VKAs) do not perform that well. BLEEDS was initiated to search for new biomarkers that predict bleeding in these patients. Objectives To describe the outline and objectives of BLEEDS and to examine whether the study population is generalizable to other VKA treated populations. Methods A cohort was created consisting of all patients starting VKA treatment at three Dutch anticoagulation clinics between January-2012 and July-2014. We stored leftover plasma and DNA following analysis of the INR. Results Of 16,706 eligible patients, 16,570 (99%) were included in BLEEDS and plasma was stored from 13,779 patients (83%). Patients had a mean age of 70 years (SD 14), 8713 were male (53%). The most common VKA indications were atrial fibrillation (10,876 patients, 66%) and venous thrombosis (3920 patients, 24%). 326 Major bleeds occurred during 17,613 years of follow-up (incidence rate 1.85/100 person years, 95%CI 1.66–2.06). The risk for major bleeding was highest in the initial three months of VKA treatment and increased when the international normalized ratio increased. These results and characteristics are in concordance with results from other VKA treated populations. Conclusion BLEEDS is generalizable to other VKA treated populations and will permit innovative and unbiased research of biomarkers that may predict major bleeding during VKA treatment. PMID:27935941
Coser, S M; Motoike, S Y; Corrêa, T R; Pires, T P; Resende, M D V
2016-10-17
Macaw palm (Acrocomia aculeata) is a promising species for use in biofuel production, and establishing breeding programs is important for the development of commercial plantations. The aim of the present study was to analyze genetic diversity, verify correlations between traits, estimate genetic parameters, and select different accessions of A. aculeata in the Macaw Palm Germplasm Bank located in Universidade Federal de Viçosa, to develop a breeding program for this species. Accessions were selected based on precocity (PREC), total spathe (TS), diameter at breast height (DBH), height of the first spathe (HFS), and canopy area (CA). The traits were evaluated in 52 accessions during the 2012/2013 season and analyzed by restricted estimation maximum likelihood/best linear unbiased predictor procedures. Genetic diversity resulted in the formation of four groups by Tocher's clustering method. The correlation analysis showed it was possible to have indirect and early selection for the traits PREC and DBH. Estimated genetic parameters strengthened the genetic variability verified by cluster analysis. Narrow-sense heritability was classified as moderate (PREC, TS, and CA) to high (HFS and DBH), resulting in strong genetic control of the traits and success in obtaining genetic gains by selection. Accuracy values were classified as moderate (PREC and CA) to high (TS, HFS, and DBH), reinforcing the success of the selection process. Selection of accessions for PREC, TS, and HFS by the rank-average method permits selection gains of over 100%, emphasizing the successful use of the accessions in breeding programs and obtaining superior genotypes for commercial plantations.
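The genetic parameters behind these results can be sketched with the standard definitions: narrow-sense heritability h² = Va/(Va + Ve), and the expected gain from selection via the breeder's equation R = h²S. The variance components and selection differential below are illustrative, not the study's REML estimates:

```python
# Illustrative values only, not the study's REML estimates.
def narrow_sense_h2(var_additive, var_environment):
    """h^2 = Va / (Va + Ve): fraction of phenotypic variance that
    responds to selection."""
    return var_additive / (var_additive + var_environment)

def selection_response(h2, selection_differential):
    """Breeder's equation: R = h^2 * S."""
    return h2 * selection_differential

h2_dbh = narrow_sense_h2(var_additive=6.0, var_environment=4.0)   # "high"
gain_dbh = selection_response(h2_dbh, selection_differential=2.5)
```

High heritability traits such as HFS and DBH in the abstract translate directly into larger expected responses for the same selection pressure.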
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo
2016-01-01
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970
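The covariance structure of the first model can be sketched directly: the stacked genetic effects u have covariance given by the Kronecker product of a between-environment correlation matrix and a genomic kernel built from markers, either linear (GBLUP-type) or Gaussian (GK). All matrices below are simulated toys, and the environment correlations are assumed values:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy marker matrix for 6 genotypes; both kernels are simulated stand-ins.
n_geno, n_snp = 6, 50
M = rng.normal(size=(n_geno, n_snp))

G = M @ M.T / n_snp                        # linear (GBLUP-type) kernel

# Gaussian kernel (GK) from squared Euclidean marker distances.
d2 = ((M[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1)
K_gk = np.exp(-d2 / np.median(d2[d2 > 0]))

# Assumed genetic correlations between three environments.
env_corr = np.array([[1.0, 0.6, 0.3],
                     [0.6, 1.0, 0.5],
                     [0.3, 0.5, 1.0]])

# Covariance of the stacked genetic effects u (environment-major order).
cov_u = np.kron(env_corr, G)
```

The second model of the abstract adds a further random effect f with its own covariance on top of this Kronecker-structured component.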
Velstra, Inge-Marie; Bolliger, Marc; Krebs, Jörg; Rietman, Johan S; Curt, Armin
2016-05-01
To determine which single or combined upper limb muscles, as defined by the International Standards for the Neurological Classification of Spinal Cord Injury (ISNCSCI) upper extremity motor score (UEMS) and the Graded Redefined Assessment of Strength, Sensibility, and Prehension (GRASSP), best predict upper limb function and independence in activities of daily living (ADLs), and to assess the predictive value of qualitative grasp movements (QlG) on upper limb function in individuals with acute tetraplegia. As part of a Europe-wide, prospective, longitudinal, multicenter study, ISNCSCI, GRASSP, and Spinal Cord Independence Measure (SCIM III) scores were recorded at 1 and 6 months after SCI. For prediction of upper limb function and ADLs, a logistic regression model and unbiased recursive partitioning conditional inference tree (URP-CTREE) were used. Logistic regression and URP-CTREE revealed that a combination of ISNCSCI and GRASSP muscles (to a maximum of 4) demonstrated the best prediction (specificity and sensitivity ranged from 81.8% to 96.0%) of upper limb function and identified homogeneous outcome cohorts at 6 months. The URP-CTREE model with the QlG predictors for upper limb function showed similar results. Prediction of upper limb function can be achieved through a combination of defined, specific upper limb muscles assessed in the ISNCSCI and GRASSP. A combination of a limited number of proximal and distal muscles along with an assessment of grasping movements can be applied for clinical decision making for rehabilitation interventions and clinical trials. © The Author(s) 2015.
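The specificity and sensitivity figures reported above come from a 2 x 2 classification table. A minimal sketch with invented outcome and prediction vectors (the real study dichotomized upper limb function at 6 months):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary outcomes coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predictions of regaining upper limb function at 6 months
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 0.75
```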
Meppelink, Renée; de Bruin, Esther I; Wanders-Mulder, Femy H; Vennik, Corinne J; Bögels, Susan M
Mindful parenting training is an application of mindfulness-based interventions that allows parents to perceive their children with unbiased and open attention, without prejudgment, and to become more attentive and less reactive in their parenting. This study examined the effectiveness of mindful parenting training in a clinical setting on child and parental psychopathology, and examined mindfulness as a predictor of these outcomes. Seventy parents of 70 children (mean age = 8.7) who were referred to a mental health care clinic because of their children's psychopathology participated in an 8-week mindful parenting training. Parents completed questionnaires at pre-test, post-test and 8-week follow-up. A significant decrease was found in children's and parents' psychopathology and a significant increase in mindful parenting and in general mindful awareness. Improvement in general mindful awareness, but not mindful parenting, was found to predict a reduction in parental psychopathology, whereas improvement in mindful parenting, but not general mindful awareness, predicted the reduction of child psychopathology. This study adds to the emerging body of evidence indicating that mindful parenting training is effective for parents themselves and, indirectly, for their children suffering from psychopathology. As improvement in mindful parenting, but not in general mindfulness, predicted the reduction of child psychopathology, mindful parenting training rather than general mindfulness training appears to be the training of choice. However, RCTs comparing mindful parenting to general mindfulness training and to parent management training are needed in order to shed more light on the effects of mindful parenting and mechanisms of change.
Allard, Alix; Bink, Marco C A M; Martinez, Sébastien; Kelner, Jean-Jacques; Legave, Jean-Michel; di Guardo, Mario; Di Pierro, Erica A; Laurens, François; van de Weg, Eric W; Costes, Evelyne
2016-04-01
In temperate trees, growth resumption in springtime results from chilling and heat requirements, and is an adaptive trait under global warming. Here, the genetic determinism of budbreak and flowering time was deciphered using five related full-sib apple families. Both traits were observed over 3 years and two sites and expressed in calendar and degree-days. Best linear unbiased predictors of genotypic effect or interaction with climatic year were extracted from mixed linear models and used for quantitative trait locus (QTL) mapping, performed with an integrated genetic map containing 6849 single nucleotide polymorphisms (SNPs), grouped into haplotypes, and with a Bayesian pedigree-based analysis. Four major regions, on linkage group (LG) 7, LG10, LG12, and LG9, the latter being the most stable across families, sites, and years, explained 5.6-21.3% of trait variance. Co-localizations for traits in calendar days or growing degree hours (GDH) suggested common genetic determinism for chilling and heating requirements. Homologs of two major flowering genes, AGL24 and FT, were predicted close to LG9 and LG12 QTLs, respectively, whereas Dormancy Associated MADS-box (DAM) genes were near additional QTLs on LG8 and LG15. This suggests that chilling perception mechanisms could be common among perennial and annual plants. Progenitors with favorable alleles depending on trait and LG were identified and could benefit new breeding strategies for apple adaptation to temperature increase. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Molecular andrology as related to sperm DNA fragmentation/sperm chromatin biotechnology.
Shafik, A; Shafik, A A; Shafik, I; El Sibai, O
2006-01-01
Genetic male infertility occurs throughout the life cycle from genetic traits carried by the sperm, to fertilization and post-fertilization genome alterations, and subsequent developmental changes in the blastocyst and fetus, as well as errors in meiosis and abnormalities in spermiogenesis/spermatogenesis. Genes encoding proteins for normal development include SRY, SOX9, INSL3 and LGR8. Genetic abnormalities affect spermatogenesis whereas polymorphisms affect receptor affinity and hormone bioactivity. Transgenic animal models, the human genome project, and other techniques have identified numerous genes related to male fertility. Several techniques have been developed to measure the amount of sperm DNA damage in an effort to identify more objective parameters for evaluation of infertile men. The integrity of sperm DNA influences a couple's fertility and helps predict the chances of pregnancy and its successful outcome. The available tests of sperm DNA damage require additional large-scale clinical trials before their integration into routine clinical practice. The physiological/molecular integrity of sperm DNA is a novel parameter of semen quality and a potential fertility predictor. Although DNA integrity assessment appears to be a logical biomarker of sperm quality, it is not being assessed as a routine part of semen analysis by clinical andrologists. Extensive investigation has been conducted for the comparative evaluation of these techniques. However, some of these techniques require expensive instrumentation for optimal and unbiased analysis, are labor intensive, or require the use of enzymes whose activity and accessibility to DNA breaks may be irregular. Thus, these techniques are recommended for basic research rather than for routine andrology laboratories.
The purpose of the Center is to provide timely, unbiased, scientifically sound evaluations of human and experimental evidence for adverse effects on reproduction and development caused by agents to which humans may be exposed.
Some Unintended Consequences of "Top Down" Organization Development
ERIC Educational Resources Information Center
White, Bernard J.; Ramsey, V. Jean
1978-01-01
An organizational development consultant is expected to perform a thorough and unbiased diagnosis of the organization's functioning. This is an account of a case study of the effect of top management influence on the consultant's awareness and definition of problems. (Author/MLF)
Minimum variance geographic sampling
NASA Technical Reports Server (NTRS)
Terrell, G. R. (Principal Investigator)
1980-01-01
Resource inventories require samples with geographical scatter, sometimes not as widely spaced as would be hoped. A simple model of correlation over distances is used to create a minimum variance unbiased estimate of population means. The fitting procedure is illustrated with data used to estimate Missouri corn acreage.
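A minimal sketch of the idea, assuming (purely for illustration) an exponential correlation-over-distance model: the minimum variance unbiased (best linear unbiased) estimate of the mean weights observations by the inverse of their covariance matrix, so tightly clustered samples are down-weighted relative to isolated ones.

```python
import numpy as np

def blue_mean_weights(coords, range_param=1.0):
    """BLUE weights for the population mean under an assumed exponential
    correlation model rho(d) = exp(-d / range_param)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    Sigma = np.exp(-d / range_param)
    w = np.linalg.solve(Sigma, np.ones(len(coords)))
    return w / w.sum()   # weights sum to 1 -> unbiased for the mean

# two clustered points plus one isolated point
coords = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
w = blue_mean_weights(coords)
print(w)  # the isolated point gets roughly twice the weight of each clustered one
```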
Nurse Practitioners, Certified Nurse Midwives, and Physician Assistants in Physician Offices
... each sample visit that takes all stages of design into account. The survey data are inflated or weighted to produce unbiased ...
Unbiased Quantitative Models of Protein Translation Derived from Ribosome Profiling Data
Gritsenko, Alexey A.; Hulsman, Marc; Reinders, Marcel J. T.; de Ridder, Dick
2015-01-01
Translation of RNA to protein is a core process for any living organism. While for some steps of this process the effect on protein production is understood, a holistic understanding of translation still remains elusive. In silico modelling is a promising approach for elucidating the process of protein synthesis. Although a number of computational models of the process have been proposed, their application is limited by the assumptions they make. Ribosome profiling (RP), a relatively new sequencing-based technique capable of recording snapshots of the locations of actively translating ribosomes, is a promising source of information for deriving unbiased data-driven translation models. However, quantitative analysis of RP data is challenging due to high measurement variance and the inability to discriminate between the number of ribosomes measured on a gene and their speed of translation. We propose a solution in the form of a novel multi-scale interpretation of RP data that allows for deriving models with translation dynamics extracted from the snapshots. We demonstrate the usefulness of this approach by simultaneously determining for the first time per-codon translation elongation and per-gene translation initiation rates of Saccharomyces cerevisiae from RP data for two versions of the Totally Asymmetric Exclusion Process (TASEP) model of translation. We do this in an unbiased fashion, by fitting the models using only RP data with a novel optimization scheme based on Monte Carlo simulation to keep the problem tractable. The fitted models match the data significantly better than existing models and their predictions show better agreement with several independent protein abundance datasets than existing models. Results additionally indicate that the tRNA pool adaptation hypothesis is incomplete, with evidence suggesting that tRNA post-transcriptional modifications and codon context may play a role in determining codon elongation rates. PMID:26275099
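The TASEP model of translation referenced above can be illustrated with a toy simulation. This sketch simplifies heavily (single-codon ribosome footprints, discrete time, invented rates) and is not the authors' fitting procedure: ribosomes initiate at the first codon when it is free, hop forward with per-codon probabilities, and cannot overtake one another.

```python
import numpy as np

def simulate_tasep(elong_rates, init_rate, steps, seed=0):
    """Discrete-time TASEP with single-site ribosomes: a ribosome at codon
    i advances with probability elong_rates[i] if codon i+1 is free; a new
    ribosome initiates at codon 0 when it is free. Returns the number of
    completed proteins."""
    rng = np.random.default_rng(seed)
    L = len(elong_rates)
    occupied = np.zeros(L, dtype=bool)
    completed = 0
    for _ in range(steps):
        # sweep from 3' to 5' so each ribosome moves at most once per step
        for i in range(L - 1, -1, -1):
            if occupied[i] and rng.random() < elong_rates[i]:
                if i == L - 1:
                    occupied[i] = False
                    completed += 1
                elif not occupied[i + 1]:
                    occupied[i] = False
                    occupied[i + 1] = True
        if not occupied[0] and rng.random() < init_rate:
            occupied[0] = True
    return completed

rates = np.full(50, 0.8)   # uniform per-codon elongation probabilities (invented)
print(simulate_tasep(rates, init_rate=0.1, steps=2000))
```

Because initiation stalls whenever the first codon is occupied, protein output saturates with increasing initiation rate, which is the coupling between ribosome density and flux that makes RP data hard to interpret.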
Griaud, François; Denefeld, Blandine; Lang, Manuel; Hensinger, Héloïse; Haberl, Peter; Berg, Matthias
2017-01-01
Characterization of charge-based variants by mass spectrometry (MS) is required for the analytical development of a new biologic entity and its marketing approval by health authorities. However, standard peak-based data analysis approaches are time-consuming and biased toward the detection, identification, and quantification of main variants only. The aim of this study was to characterize in depth the acidic and basic species of a stressed IgG1 monoclonal antibody using comprehensive and unbiased MS data evaluation tools. Fractions collected from cation exchange (CEX) chromatography were analyzed intact, after reduction of disulfide bridges, and after proteolytic cleavage using Lys-C. Data from both intact and reduced samples were evaluated consistently using a time-resolved deconvolution algorithm. Peptide mapping data were processed simultaneously, quantified, and compared in a systematic manner for all MS signals and fractions. Differences observed between the fractions were then further characterized and assigned. Time-resolved deconvolution enhanced pattern visualization and data interpretation of main and minor modifications in 3-dimensional maps across CEX fractions. Relative quantification of all MS signals across CEX fractions before peptide assignment enabled the detection of fraction-specific chemical modifications at abundances below 1%. Acidic fractions were shown to be heterogeneous, containing antibody fragments, glycated as well as deamidated forms of the heavy and light chains. In contrast, the basic fractions contained mainly modifications of the C-terminus and pyroglutamate formation at the N-terminus of the heavy chain. Systematic data evaluation was performed to investigate multiple data sets and comprehensively extract main and minor differences between each CEX fraction in an unbiased manner. PMID:28379786
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. 
The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
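The idea of optimally combining more unbiased estimating equations than parameters can be sketched with a generic two-step GMM-style weighting. This is a toy illustration of the overdetermined-moments principle, not the authors' empirical likelihood estimator: for the center mu of a symmetric distribution, both g1 = x - mu and g2 = sign(x - mu) have mean zero, and the second step weights the two sample moments by the inverse of their covariance.

```python
import numpy as np

def gmm_mean(x, grid):
    """Two-step GMM combining two unbiased moment conditions for the
    center mu of a symmetric distribution: g1 = x - mu, g2 = sign(x - mu).
    Two equations, one parameter; step 2 weights the stacked sample
    moments by the inverse of their estimated covariance."""
    def objective(mu, W):
        g = np.stack([x - mu, np.sign(x - mu)])     # 2 x n moment matrix
        m = g.mean(axis=1)
        return m @ W @ m
    # step 1: identity weight gives a consistent first-stage estimate
    W = np.eye(2)
    mu1 = grid[np.argmin([objective(mu_c, W) for mu_c in grid])]
    # step 2: efficient weight from the moment covariance at mu1
    g = np.stack([x - mu1, np.sign(x - mu1)])
    W = np.linalg.inv(np.cov(g))
    return grid[np.argmin([objective(mu_c, W) for mu_c in grid])]

rng = np.random.default_rng(1)
x = rng.standard_t(df=3, size=500)    # heavy-tailed, symmetric about 0
grid = np.linspace(-1, 1, 401)
print(gmm_mean(x, grid))              # close to the true center 0
```

With heavy tails the sign-based moment carries real information, so the weighted combination can beat the sample mean alone; this mirrors the abstract's point that extra unbiased equations can improve efficiency even when no single working model is optimal.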
NASA Astrophysics Data System (ADS)
Nüske, Feliks; Wu, Hao; Prinz, Jan-Hendrik; Wehmeyer, Christoph; Clementi, Cecilia; Noé, Frank
2017-03-01
Many state-of-the-art methods for the thermodynamic and kinetic characterization of large and complex biomolecular systems by simulation rely on ensemble approaches, where data from large numbers of relatively short trajectories are integrated. In this context, Markov state models (MSMs) are extremely popular because they can be used to compute stationary quantities and long-time kinetics from ensembles of short simulations, provided that these short simulations are in "local equilibrium" within the MSM states. However, over the last 15 years since the inception of MSMs, it has been controversially discussed and not yet answered how deviations from local equilibrium can be detected, whether these deviations induce a practical bias in MSM estimation, and how to correct for them. In this paper, we address these issues: We systematically analyze the estimation of MSMs from short non-equilibrium simulations, and we provide an expression for the error between unbiased transition probabilities and the expected estimate from many short simulations. We show that the unbiased MSM estimate can be obtained even from relatively short non-equilibrium simulations in the limit of long lag times and good discretization. Further, we exploit observable operator model (OOM) theory to derive an unbiased estimator for the MSM transition matrix that corrects for the effect of starting out of equilibrium, even when short lag times are used. Finally, we show how the OOM framework can be used to estimate the exact eigenvalues or relaxation time scales of the system without estimating an MSM transition matrix, which allows us to practically assess the discretization quality of the MSM. Applications to model systems and molecular dynamics simulation data of alanine dipeptide are included for illustration. The improved MSM estimator is implemented in PyEMMA as of version 2.3.
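The basic (uncorrected) MSM estimate that the paper improves on can be sketched as: count transitions at lag tau, row-normalize to a transition matrix, and convert its eigenvalues to implied relaxation time scales via t_i = -tau / ln(lambda_i). The two-state trajectory and switching rate below are invented for the example.

```python
import numpy as np

def msm_timescales(dtraj, lag, n_states):
    """Estimate an MSM transition matrix from a discrete trajectory at the
    given lag time and return implied relaxation time scales
    t_i = -lag / ln(lambda_i) from its nontrivial eigenvalues."""
    C = np.zeros((n_states, n_states))
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        C[a, b] += 1.0
    T = C / C.sum(axis=1, keepdims=True)         # row-stochastic matrix
    lam = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(lam[1:])                # skip stationary eigenvalue 1

# two metastable states with a rare 2% switching probability per step
rng = np.random.default_rng(0)
dtraj = [0]
for _ in range(20000):
    dtraj.append(1 - dtraj[-1] if rng.random() < 0.02 else dtraj[-1])
print(msm_timescales(np.array(dtraj), lag=1, n_states=2))
```

For this chain the exact slow eigenvalue is 1 - 2(0.02) = 0.96, so the implied time scale should come out near -1/ln(0.96), about 24.5 steps; the OOM correction in the paper addresses the bias this naive count-based estimate incurs when trajectories start out of equilibrium.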
Rapid Evolution of Ovarian-Biased Genes in the Yellow Fever Mosquito (Aedes aegypti).
Whittle, Carrie A; Extavour, Cassandra G
2017-08-01
Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti. The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female-female competition or male-mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system (e.g., sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male-male sperm competition. Copyright © 2017 by the Genetics Society of America.
Rosenberger, Amanda E.; Dunham, Jason B.
2005-01-01
Estimation of fish abundance in streams using the removal model or the Lincoln-Peterson mark-recapture model is a common practice in fisheries. These models produce misleading results if their assumptions are violated. We evaluated the assumptions of these two models via electrofishing of rainbow trout Oncorhynchus mykiss in central Idaho streams. For one-, two-, three-, and four-pass sampling effort in closed sites, we evaluated the influences of fish size and habitat characteristics on sampling efficiency and the accuracy of removal abundance estimates. We also examined the use of models to generate unbiased estimates of fish abundance through adjustment of total catch or biased removal estimates. Our results suggested that the assumptions of the mark-recapture model were satisfied and that abundance estimates based on this approach were unbiased. In contrast, the removal model assumptions were not met. Decreasing sampling efficiencies over removal passes resulted in underestimated population sizes and overestimates of sampling efficiency. This bias decreased, but was not eliminated, with increased sampling effort. Biased removal estimates based on different levels of effort were highly correlated with each other but were less correlated with unbiased mark-recapture estimates. Stream size decreased sampling efficiency, and stream size and instream wood increased the negative bias of removal estimates. We found that reliable estimates of population abundance could be obtained from models of sampling efficiency for different levels of effort. Validation of abundance estimates requires extra attention to routine sampling considerations but can help fisheries biologists avoid pitfalls associated with biased data and facilitate standardized comparisons among studies that employ different sampling methods.
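The two classical estimators compared above can be written down directly. A minimal sketch with invented catch numbers; the Chapman form of the Lincoln-Peterson estimator is used here because it is nearly unbiased for small samples.

```python
def removal_two_pass(c1, c2):
    """Two-pass removal estimate: N = c1^2 / (c1 - c2), valid when
    catchability is constant across passes and c1 > c2."""
    if c1 <= c2:
        raise ValueError("removal model requires declining catches")
    return c1 ** 2 / (c1 - c2)

def lincoln_petersen_chapman(marked, caught, recaptured):
    """Chapman's nearly unbiased version of the Lincoln-Peterson
    mark-recapture estimate."""
    return (marked + 1) * (caught + 1) / (recaptured + 1) - 1

print(removal_two_pass(60, 30))               # 120.0
print(lincoln_petersen_chapman(60, 50, 25))   # about 118.7
```

Declining sampling efficiency over passes, as reported in the abstract, inflates the denominator c1 - c2 too little and biases the removal estimate downward, while the mark-recapture estimate is unaffected as long as marked and unmarked fish are caught at equal rates.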
Gao, Hongying; Deng, Shibing; Obach, R Scott
2015-12-01
An unbiased scanning methodology using ultra high-performance liquid chromatography coupled with high-resolution mass spectrometry was used to bank data and plasma samples for comparing the data generated at different dates. This method was applied to bank the data generated earlier in animal samples and then to compare the exposure to metabolites in animal versus human for safety assessment. With neither authentic standards nor prior knowledge of the identities and structures of metabolites, full scans for precursor ions and all ion fragments (AIF) were employed with a generic gradient LC method to analyze plasma samples at positive and negative polarity, respectively. In a total of 22 tested drugs and metabolites, 21 analytes were detected using this unbiased scanning method except that naproxen was not detected due to low sensitivity at negative polarity and interference at positive polarity; and 4'- or 5-hydroxy diclofenac was not separated by a generic UPLC method. Statistical analysis of the peak area ratios of the analytes versus the internal standard in five repetitive analyses over approximately 1 year demonstrated that the analysis variation was significantly different from sample instability. The confidence limits for comparing the exposure using peak area ratio of metabolites in animal plasma versus human plasma measured over approximately 1 year apart were comparable to the analysis undertaken side by side on the same days. These statistical analysis results showed it was feasible to compare data generated at different dates with neither authentic standards nor prior knowledge of the analytes.
Density estimation in wildlife surveys
Bart, Jonathan; Droege, Sam; Geissler, Paul E.; Peterjohn, Bruce G.; Ralph, C. John
2004-01-01
Several authors have recently discussed the problems with using index methods to estimate trends in population size. Some have expressed the view that index methods should virtually never be used. Others have responded by defending index methods and questioning whether better alternatives exist. We suggest that index methods are often a cost-effective component of valid wildlife monitoring but that double-sampling or another procedure that corrects for bias or establishes bounds on bias is essential. The common assertion that index methods require constant detection rates for trend estimation is mathematically incorrect; the requirement is no long-term trend in detection "ratios" (index result/parameter of interest), a requirement that is probably approximately met by many well-designed index surveys. We urge that more attention be given to defining bird density rigorously and in ways useful to managers. Once this is done, 4 sources of bias in density estimates may be distinguished: coverage, closure, surplus birds, and detection rates. Distance, double-observer, and removal methods do not reduce bias due to coverage, closure, or surplus birds. These methods may yield unbiased estimates of the number of birds present at the time of the survey, but only if their required assumptions are met, which we doubt occurs very often in practice. Double-sampling, in contrast, produces unbiased density estimates if the plots are randomly selected and estimates on the intensive surveys are unbiased. More work is needed, however, to determine the feasibility of double-sampling in different populations and habitats. We believe the tension that has developed over appropriate survey methods can best be resolved through increased appreciation of the mathematical aspects of indices, especially the effects of bias, and through studies in which candidate methods are evaluated against known numbers determined through intensive surveys.
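Double-sampling as advocated above can be sketched as a ratio estimator: cheap index counts on all plots, intensive (assumed complete) counts on a random subsample, and the index mean adjusted by the estimated detection ratio. All counts below are invented for the example.

```python
def double_sampling_estimate(index_all, index_sub, true_sub):
    """Ratio-type double-sampling estimator: scale the mean index count
    over all plots by the detection ratio (index/true) estimated on the
    subsample that also received intensive, complete counts."""
    detection_ratio = sum(index_sub) / sum(true_sub)
    return (sum(index_all) / len(index_all)) / detection_ratio

index_all = [8, 6, 10, 7, 9, 6, 8, 10]    # quick index counts on 8 plots
index_sub = [8, 6, 10]                    # index counts on the 3 intensive plots
true_sub = [16, 13, 21]                   # intensive "true" counts on those plots
print(double_sampling_estimate(index_all, index_sub, true_sub))  # about 16.7 per plot
```

Note that the correction requires no constant detection rate over time, only that the intensive counts be unbiased and the subsample randomly chosen, which matches the abstract's argument that trend estimation needs no long-term trend in the detection ratio rather than a constant detection rate.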
Solomon, Isaac H; Spera, Kristyn M; Ryan, Sophia L; Helgager, Jeffrey; Andrici, Juliana; Zaki, Sherif R; Vaitkevicius, Henrikas; Leon, Kristoffer E; Wilson, Michael R; DeRisi, Joseph L; Koo, Sophia; Smirnakis, Stelios M; De Girolami, Umberto
2018-06-01
Powassan virus is a rare but increasingly recognized cause of severe neurological disease. To highlight the diagnostic challenges and neuropathological findings in a fatal case of Powassan encephalitis caused by deer tick virus (lineage II) in a patient with follicular lymphoma receiving rituximab, with nonspecific anti-GAD65 antibodies, who was initially seen with fever and orchiepididymitis. Comparison of clinical, radiological, histological, and laboratory findings, including immunohistochemistry, real-time polymerase chain reaction, antibody detection, and unbiased sequencing assays, in a single case report (first seen in December 2016) at an academic medical center. Infection with Powassan virus. Results of individual assays compared retrospectively. In a 63-year-old man with fatal Powassan encephalitis, serum and cerebrospinal fluid IgM antibodies were not detected via standard methods, likely because of rituximab exposure. Neuropathological findings were extensive, including diffuse leptomeningeal and parenchymal lymphohistiocytic infiltration, microglial proliferation, marked neuronal loss, and white matter microinfarctions most severely involving the cerebellum, thalamus, and basal ganglia. Diagnosis was made after death by 3 independent methods, including demonstration of Powassan virus antigen in brain biopsy and autopsy tissue, detection of viral RNA in serum and cerebrospinal fluid by targeted real-time polymerase chain reaction, and detection of viral RNA in cerebrospinal fluid by unbiased sequencing. Extensive testing for other etiologies yielded negative results, including mumps virus owing to prodromal orchiepididymitis. Low-titer anti-GAD65 antibodies identified in serum, suggestive of limbic encephalitis, were not detected in cerebrospinal fluid. 
Owing to the rarity of Powassan encephalitis, a high degree of suspicion is required to make the diagnosis, particularly in an immunocompromised patient, in whom antibody-based assays may be falsely negative. Unbiased sequencing assays have the potential to detect uncommon infectious agents and may prove useful in similar scenarios.
Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas
2017-04-15
The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used. Copyright © 2017 John Wiley & Sons, Ltd.
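The split criterion can be sketched for a single candidate variable using uncensored data and Wilcoxon scores (a simplification; the paper works with survival outcomes and several p-value approximations): rank the response, and for each cutpoint standardize the left-group rank sum by its hypergeometric mean and variance, taking the cutpoint with the largest absolute statistic.

```python
import numpy as np

def maximally_selected_rank_stat(x, y, minprop=0.1):
    """For each candidate cutpoint on x, standardize the sum of ranks of y
    falling left of the cut (Wilcoxon scores, hypergeometric mean and
    variance) and return the cutpoint with the largest |T| together with
    that maximum. Cutpoints in the outer `minprop` tails are skipped."""
    n = len(x)
    order = np.argsort(x)
    r = np.empty(n)
    r[np.argsort(y)] = np.arange(1, n + 1)       # ranks of y
    csum = np.cumsum(r[order])                   # rank sums in x-order
    lo, hi = int(n * minprop), int(n * (1 - minprop))
    best = (None, 0.0)
    for k in range(max(lo, 1), min(hi, n - 1)):  # left group = first k obs
        mean = k * (n + 1) / 2.0
        var = k * (n - k) * (n + 1) / 12.0
        t = abs(csum[k - 1] - mean) / var ** 0.5
        if t > best[1]:
            best = (x[order][k - 1], t)
    return best

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = (x > 0.6) * 2.0 + rng.normal(size=200)       # true split at x = 0.6
cut, stat = maximally_selected_rank_stat(x, y)
print(cut, stat)                                 # cutpoint near 0.6
```

Because the statistic is standardized at every cutpoint, variables with many possible split points gain no automatic advantage, which is the source of the unbiased split variable selection claimed in the abstract.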
Lineup Administrator Influences on Eyewitness Identification Decisions
ERIC Educational Resources Information Center
Clark, Steven E.; Marshall, Tanya E.; Rosenthal, Robert
2009-01-01
The present research examines how a lineup administrator may influence eyewitness identification decisions through different forms of influence, after providing the witness with standard, unbiased instructions. Participant-witnesses viewed a staged crime and were later shown a target-present or target-absent lineup. The lineup administrators…
The Problem of Biased Data and Potential Solutions for Environmental Assessments
The utility and credibility of environmental assessments depend on the use of unbiased data. However, it is increasingly clear that, despite peer review, much of the scientific literature is biased. Sources of bias include fraud, publication bias, research designs, funding bias...
How to Hire Fund-Raising Counsel.
ERIC Educational Resources Information Center
Hayes, Joanne
1991-01-01
As objective outsiders, consultants can bring a fresh and unbiased view to institutional needs and perspectives. However, careful preliminary screening of consulting firms by colleges and universities considering their use is important, addressing a variety of cost considerations; prospective firms' experience and success record; and the specific…
A sampling strategy to estimate the area and perimeter of irregularly shaped planar regions
Timothy G. Gregoire; Harry T. Valentine
1995-01-01
The length of a randomly oriented ray emanating from an interior point of a planar region can be used to unbiasedly estimate the region's area and perimeter. Estimators and corresponding variance estimators under various selection strategies are presented.
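The area half of this estimator rests on the classical identity A = ∫₀^{2π} ½ ℓ(θ)² dθ for a region that is star-shaped about the interior point, so π·ℓ(θ)² with θ uniform on [0, 2π) is unbiased for the area. A minimal sketch on a disk (an illustrative geometry, not from the paper; the perimeter estimator and variance estimators are omitted):

```python
import math
import random

def ray_length(p, theta, R=1.0):
    """Length of the ray from interior point p in direction theta to the
    boundary of a disk of radius R centred at the origin."""
    ux, uy = math.cos(theta), math.sin(theta)
    b = p[0] * ux + p[1] * uy           # p . u
    c = p[0] ** 2 + p[1] ** 2 - R ** 2  # |p|^2 - R^2 (< 0 inside)
    return -b + math.sqrt(b * b - c)

def estimate_area(p, n, R=1.0, rng=random):
    """Monte Carlo area estimate: pi * L(theta)^2 with theta uniform is an
    unbiased estimator of the area of a region star-shaped about p."""
    total = 0.0
    for _ in range(n):
        L = ray_length(p, rng.uniform(0.0, 2.0 * math.pi), R)
        total += math.pi * L * L
    return total / n
```

Averaged over many random ray orientations from an off-centre interior point, the estimate converges to the true area πR² of the disk.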
47 CFR 52.13 - North American Numbering Plan Administrator.
Code of Federal Regulations, 2012 CFR
2012-10-01
... North American Numbering Plan Administrator (NANPA) shall be an independent and impartial non-government... section. It shall assign and administer NANP resources in an efficient, effective, fair, unbiased, and non... additional functions, including but not limited to: (1) Ensuring the efficient and effective administration...
Roofing: Don't Let What's Over Head Kill Your Bottom Line.
ERIC Educational Resources Information Center
Shannon, James W., Jr.
1983-01-01
A Colorado school district employs a professional consulting firm to give an unbiased opinion on the district's roofing needs. Built-up, single-ply, and modified asphalt roofing systems have all been utilized. Preventive maintenance keeps roofing bills to a minimum. (MLF)
Iowa crop variety yield testing: A history and annotated bibliography
USDA-ARS?s Scientific Manuscript database
Variety testing by U.S. agricultural universities, often in cooperation with experiment stations, and professional crop associations is recognized as an independent, unbiased validation of the viability of commercial crop varieties. In Iowa, variety testing has also been conducted by many private ag...
Overy, Catherine; Booth, George H; Blunt, N S; Shepherd, James J; Cleland, Deidre; Alavi, Ali
2014-12-28
Properties that are necessarily formulated within pure (symmetric) expectation values are difficult to calculate for projector quantum Monte Carlo approaches, but are critical in order to compute many of the important observable properties of electronic systems. Here, we investigate an approach for the sampling of unbiased reduced density matrices within the full configuration interaction quantum Monte Carlo dynamic, which requires only small computational overheads. This is achieved via an independent replica population of walkers in the dynamic, sampled alongside the original population. The resulting reduced density matrices are free from systematic error (beyond those present via constraints on the dynamic itself) and can be used to compute a variety of expectation values and properties, with rapid convergence to an exact limit. A quasi-variational energy estimate derived from these density matrices is proposed as an accurate alternative to the projected estimator for multiconfigurational wavefunctions, while its variational property could potentially lend itself to accurate extrapolation approaches in larger systems.
Donovan, Rory M.; Tapia, Jose-Juan; Sullivan, Devin P.; Faeder, James R.; Murphy, Robert F.; Dittrich, Markus; Zuckerman, Daniel M.
2016-01-01
The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation—by orders of magnitude for some observables. PMID:26845334
Hetero-type dual photoanodes for unbiased solar water splitting with extended light harvesting
Kim, Jin Hyun; Jang, Ji-Wook; Jo, Yim Hyun; Abdi, Fatwa F.; Lee, Young Hye; van de Krol, Roel; Lee, Jae Sung
2016-01-01
Metal oxide semiconductors are promising photoelectrode materials for solar water splitting due to their robustness in aqueous solutions and low cost. Yet, their solar-to-hydrogen conversion efficiencies are still not high enough for practical applications. Here we present a strategy to enhance the efficiency of metal oxides, hetero-type dual photoelectrodes, in which two photoanodes of different bandgaps are connected in parallel for extended light harvesting. Thus, a photoelectrochemical device made of modified BiVO4 and α-Fe2O3 as dual photoanodes utilizes visible light up to 610 nm for water splitting, and shows stable photocurrents of 7.0±0.2 mA cm−2 at 1.23 VRHE under 1 sun irradiation. A tandem cell composed of the dual photoanodes and a silicon solar cell demonstrates unbiased water splitting efficiency of 7.7%. These results and concept represent a significant step forward en route to the goal of >10% efficiency required for practical solar hydrogen production. PMID:27966548
Geophysical Interpretation of Venus Gravity Data
NASA Technical Reports Server (NTRS)
Reasenberg, R. D.
1985-01-01
The subsurface mass distribution of Venus was investigated through the analysis of data from the Pioneer Venus Orbiter (PVO). In particular, the Doppler tracking data were used to map the gravitational potential. The resulting maps were compared with the topographic data from the PVO radar (ORAD). In order to obtain an unbiased comparison, the topography data obtained from PVO-ORAD were filtered to introduce distortions which are the same as those of the gravity models. Both the gravity and filtered topography maps are derived by two-stage processes with a common second stage. In the first stage, the topography was used to calculate a corresponding spacecraft acceleration under the assumptions that the topography has a uniform given density and no compensation. In the second stage, the accelerations found in the first stage were passed through a linear inverter to yield maps of gravity and topography. Because these maps are the result of the same inversion process, they contain the same distortion; a comparison between them is unbiased to first order.
Grand canonical validation of the bipartite international trade network.
Straka, Mika J; Caldarelli, Guido; Saracco, Fabio
2017-08-01
Devising strategies for economic development in a globally competitive landscape requires a solid and unbiased understanding of countries' technological advancements and similarities among export products. Both can be addressed through the bipartite representation of the International Trade Network. In this paper, we apply the recently proposed grand canonical projection algorithm to uncover country and product communities. Contrary to past endeavors, our methodology, based on information theory, creates monopartite projections in an unbiased and analytically tractable way. Single links between countries or products represent statistically significant signals, which are not accounted for by null models such as the bipartite configuration model. We find stable country communities reflecting the socioeconomic distinction in developed, newly industrialized, and developing countries. Furthermore, we observe product clusters based on the aforementioned country groups. Our analysis reveals the existence of a complicated structure in the bipartite International Trade Network: apart from the diversification of export baskets from the most basic to the most exclusive products, we observe a statistically significant signal of an export specialization mechanism towards more sophisticated products.
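The paper's null model is the entropy-based bipartite configuration model; as a much simpler, analytically tractable stand-in, a single projected country-country link can be scored by testing the observed number of co-exported products against a fixed-degree hypergeometric null (this is not the paper's method, only a sketch of the link-validation idea):

```python
from math import comb

def cooccurrence_pvalue(M, d1, d2, k_obs):
    """P(X >= k_obs) where X ~ Hypergeometric(M, d1, d2): the number of
    products exported by both countries if each country's export basket
    were drawn uniformly at random with its degree (d1, d2) fixed,
    out of M products in total."""
    denom = comb(M, d2)
    p = 0.0
    for k in range(k_obs, min(d1, d2) + 1):
        p += comb(d1, k) * comb(M - d1, d2 - k) / denom
    return p

def validated(M, d1, d2, k_obs, alpha=0.01):
    """Keep the projected link only if the observed co-export count is
    statistically significant under the null."""
    return cooccurrence_pvalue(M, d1, d2, k_obs) < alpha
```

With 10 products and two countries each exporting 5, a complete overlap of 5 products has null probability 1/252 ≈ 0.004, so the link survives validation at the 1% level.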
An unbiased risk estimator for image denoising in the presence of mixed poisson-gaussian noise.
Le Montagner, Yoann; Angelini, Elsa D; Olivo-Marin, Jean-Christophe
2014-03-01
The behavior and performance of denoising algorithms are governed by one or several parameters, whose optimal settings depend on the content of the processed image and the characteristics of the noise, and are generally designed to minimize the mean squared error (MSE) between the denoised image returned by the algorithm and a virtual ground truth. In this paper, we introduce a new Poisson-Gaussian unbiased risk estimator (PG-URE) of the MSE applicable to a mixed Poisson-Gaussian noise model that unifies the widely used Gaussian and Poisson noise models in fluorescence bioimaging applications. We propose a stochastic methodology to evaluate this estimator in the case when little is known about the internal machinery of the considered denoising algorithm, and we analyze both theoretically and empirically the characteristics of the PG-URE estimator. Finally, we evaluate the PG-URE-driven parametrization for three standard denoising algorithms, with and without variance stabilizing transforms, and different characteristics of the Poisson-Gaussian noise mixture.
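The pure-Gaussian special case that PG-URE generalizes has a closed form for a soft-thresholding denoiser, which conveys the flavor of the approach (the mixed Poisson-Gaussian estimator and its stochastic evaluation are not reproduced here):

```python
import math

def soft(y, t):
    """Soft-thresholding of a single sample."""
    return math.copysign(max(abs(y) - t, 0.0), y)

def sure_soft(y, t, sigma):
    """Stein's unbiased risk estimate of E||soft(y, t) - x||^2 for
    y = x + N(0, sigma^2 I):

        SURE = sum_i min(y_i^2, t^2) + 2*sigma^2*#{|y_i| > t} - n*sigma^2

    The divergence of the soft-threshold map is simply the number of
    coordinates above the threshold."""
    n = len(y)
    residual = sum(min(v * v, t * t) for v in y)
    divergence = sum(1 for v in y if abs(v) > t)
    return residual + 2.0 * sigma ** 2 * divergence - n * sigma ** 2
```

Minimizing this quantity over the threshold t tunes the denoiser without access to the ground truth, which is exactly the role the PG-URE plays for the mixed noise model.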
Robertson, David S; Prevost, A Toby; Bowden, Jack
2016-09-30
Seamless phase II/III clinical trials offer an efficient way to select an experimental treatment and perform confirmatory analysis within a single trial. However, combining the data from both stages in the final analysis can induce bias into the estimates of treatment effects. Methods for bias adjustment developed thus far have made restrictive assumptions about the design and selection rules followed. In order to address these shortcomings, we apply recent methodological advances to derive the uniformly minimum variance conditionally unbiased estimator for two-stage seamless phase II/III trials. Our framework allows for the precision of the treatment arm estimates to take arbitrary values, can be utilised for all treatments that are taken forward to phase III and is applicable when the decision to select or drop treatment arms is driven by a multiplicity-adjusted hypothesis testing procedure. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
A novel SURE-based criterion for parametric PSF estimation.
Xue, Feng; Blu, Thierry
2015-02-01
We propose an unbiased estimate of a filtered version of the mean squared error--the blur-SURE (Stein's unbiased risk estimate)--as a novel criterion for estimating an unknown point spread function (PSF) from the degraded image only. The PSF is obtained by minimizing this new objective functional over a family of Wiener processings. Based on this estimated blur kernel, we then perform nonblind deconvolution using our recently developed algorithm. The SURE-based framework is exemplified with a number of parametric PSF, involving a scaling factor that controls the blur size. A typical example of such parametrization is the Gaussian kernel. The experimental results demonstrate that minimizing the blur-SURE yields highly accurate estimates of the PSF parameters, which also result in a restoration quality that is very similar to the one obtained with the exact PSF, when plugged into our recent multi-Wiener SURE-LET deconvolution algorithm. The highly competitive results obtained outline the great potential of developing more powerful blind deconvolution algorithms based on SURE-like estimates.
Drug discovery for Diamond-Blackfan anemia using reprogrammed hematopoietic progenitors
Doulatov, Sergei; Vo, Linda T.; Macari, Elizabeth R.; Wahlster, Lara; Kinney, Melissa A.; Taylor, Alison M.; Barragan, Jessica; Gupta, Manav; McGrath, Katherine; Lee, Hsiang-Ying; Humphries, Jessica M.; DeVine, Alex; Narla, Anupama; Alter, Blanche P.; Beggs, Alan H.; Agarwal, Suneet; Ebert, Benjamin L.; Gazda, Hanna T.; Lodish, Harvey F.; Sieff, Colin A.; Schlaeger, Thorsten M.; Zon, Leonard I.; Daley, George Q.
2017-01-01
Diamond-Blackfan anemia (DBA) is a congenital disorder characterized by the failure of erythroid progenitor differentiation, severely curtailing red blood cell production. Because many DBA patients fail to respond to corticosteroid therapy, there is considerable need for therapeutics for this disorder. Identifying therapeutics for DBA requires circumventing the paucity of primary patient blood stem and progenitor cells. To this end, we adopted a reprogramming strategy to generate expandable hematopoietic progenitor cells from induced pluripotent stem cells (iPSCs) from DBA patients. Reprogrammed DBA progenitors recapitulate defects in erythroid differentiation, which were rescued by gene complementation. Unbiased chemical screens identified SMER28, a small-molecule inducer of autophagy, which enhanced erythropoiesis in a range of in vitro and in vivo models of DBA. SMER28 acted through autophagy factor ATG5 to stimulate erythropoiesis and up-regulate expression of globin genes. These findings present an unbiased drug screen for hematological disease using iPSCs and identify autophagy as a therapeutic pathway in DBA. PMID:28179501
Benchmarking methods and data sets for ligand enrichment assessment in virtual screening.
Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon
2015-01-01
Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in prospective (i.e. real-world) efforts. However, the intrinsic differences of benchmarking sets to the real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods as well as data sets and highlight three main types of biases found in benchmarking sets, i.e. "analogue bias", "artificial enrichment" and "false negative". In addition, we introduce our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations to three important human histone deacetylase (HDAC) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The leave-one-out cross-validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased as measured by property matching, ROC curves and AUCs. Copyright © 2014 Elsevier Inc. All rights reserved.
Efficient Stochastic Rendering of Static and Animated Volumes Using Visibility Sweeps.
von Radziewsky, Philipp; Kroes, Thomas; Eisemann, Martin; Eisemann, Elmar
2017-09-01
Stochastically solving the rendering integral (particularly visibility) is the de-facto standard for physically-based light transport, but it is computationally expensive, especially when displaying heterogeneous volumetric data. In this work, we present efficient techniques to speed up the rendering process via a novel visibility-estimation method in concert with an unbiased importance sampling (involving environmental lighting and visibility inside the volume), filtering, and update techniques for both static and animated scenes. Our major contributions include a progressive estimate of partial occlusions based on a fast sweeping-plane algorithm. These occlusions are stored in an octahedral representation, which can be conveniently transformed into a quadtree-based hierarchy suited for a joint importance sampling. Further, we propose sweep-space filtering, which suppresses the occurrence of fireflies, and investigate different update schemes for animated scenes. Our technique is unbiased, requires little precomputation, is highly parallelizable, and is applicable to various volume data sets, dynamic transfer functions, animated volumes and changing environmental lighting.
Meng, Yilin; Roux, Benoît
2015-08-11
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of states is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimensions. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.
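For reference, the self-consistent iteration that the proposed regression strategy avoids can be sketched on a 1-D grid with synthetic window data (the double-well potential, window placement, and force constant below are all illustrative, not from the paper):

```python
import math
import random

# Self-consistent WHAM on a discrete grid.  Synthetic "umbrella" samples
# are drawn directly from the biased distributions on the grid, so the
# recovered density can be checked against the known unbiased one.
random.seed(0)
grid = [i * 0.05 - 2.0 for i in range(81)]            # x in [-2, 2]
u0 = [2.0 * (x * x - 1.0) ** 2 for x in grid]         # double-well PMF
p0 = [math.exp(-u) for u in u0]
z0 = sum(p0)
p0 = [p / z0 for p in p0]                             # true unbiased density

centers = [-1.5 + 0.5 * k for k in range(7)]          # umbrella windows
K = 10.0                                              # force constant
bias = [[0.5 * K * (x - c) ** 2 for x in grid] for c in centers]

# Draw N samples per window from p_k(x) ~ p0(x) * exp(-w_k(x)).
N = 20000
counts = []
for k in range(len(centers)):
    w = [p0[i] * math.exp(-bias[k][i]) for i in range(len(grid))]
    hist = [0] * len(grid)
    for i in random.choices(range(len(grid)), weights=w, k=N):
        hist[i] += 1
    counts.append(hist)

# WHAM self-consistent iteration:
#   P(x)      = sum_k n_k(x) / sum_k N exp(f_k - w_k(x))
#   exp(-f_k) = sum_x P(x) exp(-w_k(x))
f = [0.0] * len(centers)
for _ in range(500):
    P = []
    for i in range(len(grid)):
        num = sum(counts[k][i] for k in range(len(centers)))
        den = sum(N * math.exp(f[k] - bias[k][i]) for k in range(len(centers)))
        P.append(num / den)
    s = sum(P)
    P = [p / s for p in P]
    f = [-math.log(sum(P[i] * math.exp(-bias[k][i]) for i in range(len(grid))))
         for k in range(len(centers))]
```

After convergence, `P` reproduces the known unbiased density `p0` from the biased window histograms alone, which is the fixed point the regression alternative reaches without iterating.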
Conformational free energies of methyl-α-L-iduronic and methyl-β-D-glucuronic acids in water
NASA Astrophysics Data System (ADS)
Babin, Volodymyr; Sagui, Celeste
2010-03-01
We present a simulation protocol that allows for efficient sampling of the degrees of freedom of a solute in explicit solvent. The protocol involves using a nonequilibrium umbrella sampling method, in this case, the recently developed adaptively biased molecular dynamics method, to compute an approximate free energy for the slow modes of the solute in explicit solvent. This approximate free energy is then used to set up a Hamiltonian replica exchange scheme that samples both from biased and unbiased distributions. The final accurate free energy is recovered via the weighted histogram analysis technique applied to all the replicas, and equilibrium properties of the solute are computed from the unbiased trajectory. We illustrate the approach by applying it to the study of the puckering landscapes of the methyl glycosides of α-L-iduronic acid and its C5 epimer β-D-glucuronic acid in water. Big savings in computational resources are gained in comparison to the standard parallel tempering method.
Overlap between treatment and control distributions as an effect size measure in experiments.
Hedges, Larry V; Olkin, Ingram
2016-03-01
The proportion π of treatment group observations that exceed the control group mean has been proposed as an effect size measure for experiments that randomly assign independent units into 2 groups. We give the exact distribution of a simple estimator of π based on the standardized mean difference and use it to study the small sample bias of this estimator. We also give the minimum variance unbiased estimator of π under 2 models, one in which the variance of the mean difference is known and one in which the variance is unknown. We show how to use the relation between the standardized mean difference and the overlap measure to compute confidence intervals for π and show that these results can be used to obtain unbiased estimators, large sample variances, and confidence intervals for 3 related effect size measures based on the overlap. Finally, we show how the effect size π can be used in a meta-analysis. (c) 2016 APA, all rights reserved.
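Under normality with equal variances, π = Φ(δ), where δ is the standardized mean difference, so a simple plug-in estimator is Φ(d). A minimal sketch of that estimator (the exact small-sample distribution and the minimum variance unbiased estimators of the paper are not reproduced here):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def overlap_pi(treatment, control):
    """Plug-in estimate of pi = P(a treatment observation exceeds the
    control mean): pi_hat = Phi(d), with d the standardized mean
    difference computed from the pooled standard deviation."""
    nt, nc = len(treatment), len(control)
    mt = sum(treatment) / nt
    mc = sum(control) / nc
    ss = (sum((v - mt) ** 2 for v in treatment) +
          sum((v - mc) ** 2 for v in control))
    sp = math.sqrt(ss / (nt + nc - 2))   # pooled SD
    d = (mt - mc) / sp
    return phi(d)
```

When the two groups are identical, d = 0 and the estimate is exactly 0.5, i.e. half of the treatment observations are expected to exceed the control mean.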
[Application of ordinary Kriging method in entomologic ecology].
Zhang, Runjie; Zhou, Qiang; Chen, Cuixian; Wang, Shousong
2003-01-01
Geostatistics is a statistical method based on regionalized variables that uses the variogram as a tool to analyze the spatial structure and patterns of organisms. When simulating the variogram over a large range, an optimal simulation cannot always be obtained directly, but an interactive human-computer dialogue can be used to optimize the parameters of the spherical models. In this paper, this method and weighted polynomial regression were used to fit the one-step spherical model, the two-step spherical model and the linear function model, and the available nearby samples were used in the ordinary Kriging procedure, which provides the best linear unbiased estimate under the constraint of unbiasedness. The sums of squared deviations between estimated and measured values for the different theoretical models were computed, and the corresponding graphs are shown. The results showed that the simulation based on the two-step spherical model was the best, and that the one-step spherical model was better than the linear function model.
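A minimal ordinary Kriging sketch with a spherical variogram model, showing the unbiasedness constraint (weights summing to one) enforced through a Lagrange multiplier; all parameter values and sample locations are illustrative:

```python
import math

def spherical(h, nugget, sill, a):
    """Spherical variogram model gamma(h) with range a."""
    if h == 0.0:
        return 0.0
    if h >= a:
        return nugget + sill
    r = h / a
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def solve(A, b):
    """Tiny Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= fac * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ordinary_kriging(pts, vals, x0, nugget=0.0, sill=1.0, a=2.0):
    """Ordinary Kriging prediction at x0 from 2-D sample points.

    The unbiasedness constraint sum(w) = 1 is appended to the kriging
    system as an extra row/column with a Lagrange multiplier."""
    n = len(pts)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    A = [[spherical(dist(pts[i], pts[j]), nugget, sill, a) for j in range(n)]
         + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [spherical(dist(p, x0), nugget, sill, a) for p in pts] + [1.0]
    sol = solve(A, b)
    w = sol[:n]
    return sum(w[i] * vals[i] for i in range(n)), w
```

With zero nugget the predictor interpolates the data exactly, and by symmetry a prediction at the centre of a unit square of samples weights each corner equally.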
Unbiased mean direction of paleomagnetic data and better estimate of paleolatitude
NASA Astrophysics Data System (ADS)
Hatakeyama, T.; Shibuya, H.
2010-12-01
In paleomagnetism, when only paleodirection data are available without paleointensities, Fisher-mean directions (I, D) and Fisher-mean VGP positions are calculated as descriptions of the mean field. However, Kono (1997) and Hatakeyama and Kono (2001) showed that these averaged directions are not unbiased estimates of the mean direction of the time-averaged field (TAF). Hatakeyama and Kono (2002) calculated TAF and paleosecular variation (PSV) models for the past 5 Myr, taking into account the biases that arise from averaging nonlinear functions, such as the summation of unit vectors in Fisher statistics. Here we present a zonal TAF model based on the Hatakeyama and Kono TAF model. Moreover, we introduce the bias angles in the mean direction due to PSV and a method for determining true paleolatitudes, representative of the TAF, from paleodirections. This method will help tectonic studies, especially the estimation of accurate paleolatitudes in middle-latitude regions.
The Swift GRB Host Galaxy Legacy Survey
NASA Astrophysics Data System (ADS)
Perley, Daniel
2015-08-01
I will describe the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population and its redshift evolution from z=0 to z=7. Using unbiased selection criteria we have designated a subset of 119 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, Gemini, VLT, and Magellan to obtain complementary optical/NIR photometry and spectroscopy to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass distributions and their evolution with redshift, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to serve as tools for measuring and studying cosmic star-formation in the distant universe.
Quantum ratchet in two-dimensional semiconductors with Rashba spin-orbit interaction
Ang, Yee Sin; Ma, Zhongshui; Zhang, Chao
2015-01-01
Ratchet is a device that produces a direct current of particles when driven by an unbiased force. We demonstrate a simple scattering quantum ratchet based on an asymmetrical quantum tunneling effect in a two-dimensional electron gas with Rashba spin-orbit interaction (R2DEG). We consider the tunneling of electrons across a square potential barrier sandwiched by interface scattering potentials of unequal strengths on either side. It is found that while the intra-spin tunneling probabilities remain unchanged, the inter-spin-subband tunneling probabilities of electrons crossing the barrier in one direction are unequal to those for the opposite direction. Hence, when the system is driven by an unbiased periodic force, a directional flow of electron current is generated. The scattering quantum ratchet in R2DEG is conceptually simple and is capable of converting an a.c. driving force into a rectified current without the need for an additional symmetry-breaking mechanism or external magnetic field. PMID:25598490
Dynamic properties of molecular motors in burnt-bridge models
NASA Astrophysics Data System (ADS)
Artyomov, Maxim N.; Morozov, Alexander Yu; Pronina, Ekaterina; Kolomeisky, Anatoly B.
2007-08-01
Dynamic properties of molecular motors that fuel their motion by actively interacting with underlying molecular tracks are studied theoretically via discrete-state stochastic 'burnt-bridge' models. The transport of the particles is viewed as an effective diffusion along one-dimensional lattices with periodically distributed weak links. When an unbiased random walker passes the weak link it can be destroyed ('burned') with probability p, providing a bias in the motion of the molecular motor. We present a theoretical approach that allows one to calculate exactly all dynamic properties of motor proteins, such as velocity and dispersion, under general conditions. It is found that dispersion is a decreasing function of the concentration of bridges, while the dependence of dispersion on the burning probability is more complex. Our calculations also show a gap in dispersion for very low concentrations of weak links or for very low burning probabilities which indicates a dynamic phase transition between unbiased and biased diffusion regimes. Theoretical findings are supported by Monte Carlo computer simulations.
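A minimal Monte Carlo sketch of a burnt-bridge walk; for simplicity, bridges in this sketch burn only on left-to-right crossings (one common convention), which fixes the drift direction, and all parameter values are illustrative:

```python
import random

def burnt_bridge_velocity(p, spacing=10, steps=200000, seed=0):
    """Mean velocity of an unbiased +/-1 walker on a 1-D lattice whose
    weak bonds sit at multiples of `spacing` (bond b joins sites b-1, b).

    Crossing an intact weak bond left-to-right burns it with probability
    p; a burnt bond can never be recrossed, which rectifies the
    otherwise unbiased motion into a net rightward drift."""
    rng = random.Random(seed)
    pos = 0
    burnt = set()
    for _ in range(steps):
        step = 1 if rng.random() < 0.5 else -1
        bond = pos + 1 if step == 1 else pos   # bond the move would cross
        if bond % spacing == 0:                # a weak link
            if bond in burnt:
                continue                       # blocked by a burnt bridge
            if step == 1 and rng.random() < p:
                burnt.add(bond)                # burn it behind the walker
        pos += step
    return pos / steps
```

With p = 0 the walk stays unbiased (velocity near zero), while any p > 0 produces a positive drift, consistent with the dynamic transition between unbiased and biased diffusion regimes described above.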
Hetero-type dual photoanodes for unbiased solar water splitting with extended light harvesting.
Kim, Jin Hyun; Jang, Ji-Wook; Jo, Yim Hyun; Abdi, Fatwa F; Lee, Young Hye; van de Krol, Roel; Lee, Jae Sung
2016-12-14
Metal oxide semiconductors are promising photoelectrode materials for solar water splitting due to their robustness in aqueous solutions and low cost. Yet, their solar-to-hydrogen conversion efficiencies are still not high enough for practical applications. Here we present a strategy to enhance the efficiency of metal oxides, hetero-type dual photoelectrodes, in which two photoanodes of different bandgaps are connected in parallel for extended light harvesting. Thus, a photoelectrochemical device made of modified BiVO4 and α-Fe2O3 as dual photoanodes utilizes visible light up to 610 nm for water splitting, and shows stable photocurrents of 7.0±0.2 mA cm−2 at 1.23 V vs. RHE under 1 sun irradiation. A tandem cell composed of the dual photoanodes and a silicon solar cell demonstrates an unbiased water splitting efficiency of 7.7%. These results and this concept represent a significant step forward en route to the goal of >10% efficiency required for practical solar hydrogen production.
Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A
2014-10-01
Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.
Motor activity as an unbiased variable to assess anaphylaxis in allergic rats.
Abril-Gil, Mar; Garcia-Just, Alba; Cambras, Trinitat; Pérez-Cano, Francisco J; Castellote, Cristina; Franch, Àngels; Castell, Margarida
2015-10-01
The release of mediators by mast cells triggers allergic symptoms involving various physiological systems and, in the most severe cases, the development of anaphylactic shock compromising mainly the nervous and cardiovascular systems. We aimed to establish variables to objectively study the anaphylactic response (AR) after an oral challenge in an allergy model. Brown Norway rats were immunized by intraperitoneal injection of ovalbumin with alum and toxin from Bordetella pertussis. Specific immunoglobulin (Ig) E antibodies were developed in immunized animals. Forty days after immunization, the rats were orally challenged with the allergen, and motor activity, body temperature and serum mast cell protease concentration were determined. The anaphylaxis induced a reduction in body temperature and a decrease in the number of animal movements, which was inversely correlated with serum mast cell protease release. In summary, motor activity is a reliable tool for assessing AR and also an unbiased method for screening new anti-allergic drugs. © 2015 by the Society for Experimental Biology and Medicine.
Functional renormalization group and bosonization as a solver for 2D fermionic Hubbard models
NASA Astrophysics Data System (ADS)
Schuetz, Florian; Marston, Brad
2007-03-01
The functional renormalization group (fRG) provides an unbiased framework to analyze competing instabilities in two-dimensional electron systems and has been used extensively over the past decade [1]. In order to obtain an equally unbiased tool to interpret the flow, we investigate the combination of a many-patch, one-loop calculation with higher dimensional bosonization [2] of the resulting low-energy action. Subsequently a semi-classical approximation [3] can be used to describe the resulting phases. The spinless Hubbard model on a square lattice with nearest neighbor repulsion is investigated as a test case. [1] M. Salmhofer and C. Honerkamp, Prog. Theor. Phys. 105, 1 (2001). [2] A. Houghton, H.-J. Kwon, J. B. Marston, Adv. Phys. 49, 141 (2000); P. Kopietz, Bosonization of interacting fermions in arbitrary dimensions, (Springer, Berlin, 1997). [3] H.-H. Lin, L. Balents, M. P. A. Fisher, Phys. Rev. B 56, 6569–6593 (1997); J. O. Fjaerestad, J. B. Marston, U. Schollwoeck, Ann. Phys. (N.Y.) 321, 894 (2006).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Overy, Catherine; Blunt, N. S.; Shepherd, James J.
2014-12-28
Properties that are necessarily formulated within pure (symmetric) expectation values are difficult to calculate for projector quantum Monte Carlo approaches, but are critical in order to compute many of the important observable properties of electronic systems. Here, we investigate an approach for the sampling of unbiased reduced density matrices within the full configuration interaction quantum Monte Carlo dynamic, which requires only small computational overheads. This is achieved via an independent replica population of walkers in the dynamic, sampled alongside the original population. The resulting reduced density matrices are free from systematic error (beyond those present via constraints on the dynamic itself) and can be used to compute a variety of expectation values and properties, with rapid convergence to an exact limit. A quasi-variational energy estimate derived from these density matrices is proposed as an accurate alternative to the projected estimator for multiconfigurational wavefunctions, while its variational property could potentially lend itself to accurate extrapolation approaches in larger systems.
Artificial Intelligence based technique for BTS placement
NASA Astrophysics Data System (ADS)
Alenoghena, C. O.; Emagbetere, J. O.; Aibinu, A. M.
2013-12-01
The increase of base transceiver stations (BTS) in most urban areas can be traced to the drive by network providers to meet demand for coverage and capacity. In traditional network planning, the final decision on BTS placement is taken by a team of radio planners, and this decision is not foolproof against regulatory requirements. In this paper, an artificial-intelligence-based algorithm for optimal BTS site placement is proposed. The proposed technique objectively takes neighbour and regulatory considerations into account while determining cell sites, leading to a quantitatively unbiased decision-making process in BTS placement. Experimental data for a 2 km by 3 km territory were simulated to test the new algorithm; the results show 100% performance of the neighbour-constrained algorithm in BTS placement optimization. Results on the application of a genetic algorithm (GA) with the neighbourhood constraint indicate that the choice of location can be unbiased and that optimization of facility placement for network design can be carried out.
Rosenblum, Michael; van der Laan, Mark J.
2010-01-01
Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
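The special case highlighted in the abstract can be checked by simulation. In a Poisson working model containing only an intercept and the treatment term, the MLE of the treatment coefficient reduces in closed form to the log ratio of group mean outcomes. The data-generating rates below are an illustrative assumption (treatment acting multiplicatively on a baseline that is not log-linear in the covariate), under which the true marginal log rate ratio is 0.3.

```python
import math, random

def poisson_sample(lam, rng):
    """Knuth's multiplication method; adequate for the small rates here."""
    limit = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

rng = random.Random(7)
sums, counts = {0: 0.0, 1: 0.0}, {0: 0, 1: 0}
for _ in range(200_000):
    w = rng.random()                                # baseline covariate
    a = rng.randrange(2)                            # randomized treatment
    rate = math.exp(0.3 * a + math.sin(3.0 * w))    # truth: not log-linear in w
    sums[a] += poisson_sample(rate, rng)
    counts[a] += 1

# MLE of the treatment coefficient in the main-terms-only working model:
# the log ratio of group means, which targets the marginal log rate ratio.
est_log_rr = math.log((sums[1] / counts[1]) / (sums[0] / counts[0]))
```

Despite the working model ignoring the covariate entirely, randomization makes the estimate concentrate around the marginal log rate ratio of 0.3, illustrating the robustness-to-misspecification claim.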
Strengthening syntheses on fire: Increasing their usefulness for managers
Jane Kapler Smith
2015-01-01
A synthesis for fire managers summarizes and interprets a body of information, presents its meaning in an objective, unbiased way, and describes its implications for decisionmakers. Following are suggestions for ways to strengthen syntheses on fire and on other natural resource issues:Include managers, scientists, and...
Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.
ERIC Educational Resources Information Center
Kromrey, Jeffrey D.; Hines, Constance V.
1995-01-01
The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…
U.S. Forest Service termiticide tests
Terence Wagner
2003-01-01
The U.S. Forest Service has been testing chemicals for termite control since 1939. Today its termiticide testing program is nationally recognized for providing unbiased efficacy data for product registration using standardized tests, sites, and evaluation procedures. Virtually all termiticides undergo Forest Service testing before being registered by EPA. Termiticides...
NTP-CERHR EXPERT PANEL REPORT ON THE REPRODUCTIVE AND DEVELOPMENTAL TOXICITY OF 1-BROMOPROPANE
The National Toxicology Program (NTP) and the National Institute of Environmental Health Sciences (NIEHS) established the NTP Center for the Evaluation of Risks to Human Reproduction (CERHR) in order to provide timely, unbiased, scientifically sound evaluations of human and exper...
The mystery shopper: an anonymous review of your services.
Steiner, K
1986-06-01
Mystery shoppers can provide an unbiased report on the day-to-day functioning of hospital activities. For this study, six shoppers "used" various hospital services and reported their impressions through a questionnaire. The general findings may be applicable to many other well-run health care organizations.
NTP-CERHR EXPERT PANEL REPORT ON REPRODUCTIVE AND DEVELOPMENTAL TOXICITY OF METHYLPHENIDATE.
A manuscript describes the results of an expert panel meeting of the NTP Center for the Evaluation of Risks to Human Reproduction (CERHR). The purpose of CERHR is to provide timely, unbiased, scientifically sound evaluations of human and experimental evidence for adverse effects on...
Constructing Aligned Assessments Using Automated Test Construction
ERIC Educational Resources Information Center
Porter, Andrew; Polikoff, Morgan S.; Barghaus, Katherine M.; Yang, Rui
2013-01-01
We describe an innovative automated test construction algorithm for building aligned achievement tests. By incorporating the algorithm into the test construction process, along with other procedures for building reliable and unbiased assessments, the result is tests that are much more valid than those resulting from current test construction…
NTP-CERHR EXPERT PANEL REPORT ON THE REPRODUCTIVE AND DEVELOPMENTAL TOXICITY OF 2-BROMOPROPANE
The National Toxicology Program (NTP) and the National Institute of Environmental Health Sciences (NIEHS) established the NTP Center for the Evaluation of Risks to Human Reproduction (CERHR) in order to provide timely, unbiased, scientifically sound evaluations of human and exper...
Computer program uses Monte Carlo techniques for statistical system performance analysis
NASA Technical Reports Server (NTRS)
Wohl, D. P.
1967-01-01
Computer program with Monte Carlo sampling techniques determines the effect of a component part of a unit upon the overall system performance. It utilizes the full statistics of the disturbances and misalignments of each component to provide unbiased results through simulated random sampling.
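The approach (drawing each component's disturbance from its full statistical distribution and observing the induced spread in overall system performance) can be sketched as follows. The two-resistor divider and its tolerances are hypothetical stand-ins for the program's component models.

```python
import random, statistics

def divider_ratio(r1, r2):
    """Hypothetical system response: output ratio of a two-resistor divider."""
    return r1 / (r1 + r2)

# Simulated random sampling: each component value is drawn from its
# disturbance distribution, and the system response is recorded.
rng = random.Random(0)
samples = [divider_ratio(rng.gauss(100.0, 5.0), rng.gauss(200.0, 10.0))
           for _ in range(20_000)]
mean_perf = statistics.fmean(samples)
spread = statistics.stdev(samples)
```

The sample mean and standard deviation estimate how component tolerances propagate to system performance without any linearization of the response.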
Decision Support | Solar Research | NREL
NREL supports informed solar decision making with credible, objective, accessible, and timely resources, providing technical and analytical support and unbiased information on solar policies and issues for state and local government decision makers.
NTP-CERHR Expert Panel Report on the reproductive and developmental toxicity of hydroxyurea
The National Toxicology Program (NTP) and the National Institute of Environmental Health Sciences (NIEHS) established the NTP Center for the Evaluation of Risks to Human Reproduction (CERHR) in June 1998. The purpose of CERHR is to provide timely, unbiased, scientifically sound e...
Fast and unbiased estimator of the time-dependent Hurst exponent.
Pianese, Augusto; Bianchi, Sergio; Palazzo, Anna Maria
2018-03-01
We combine two existing estimators of the local Hurst exponent to improve both the goodness of fit and the computational speed of the algorithm. An application with simulated time series is implemented, and a Monte Carlo simulation is performed to provide evidence of the improvement.
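Local Hurst estimation of the kind being accelerated here typically rests on a scaling law for the increments. The minimal single-window estimator below is an illustration (not the paper's combined estimator): it regresses the log mean absolute increment on the log lag, exploiting E|X(t+lag) − X(t)| ~ lag^H.

```python
import math, random

def hurst_exponent(series, max_lag=20):
    """Estimate H by least-squares fit of log E|increment| vs. log lag."""
    xs, ys = [], []
    for lag in range(1, max_lag + 1):
        diffs = [abs(series[i + lag] - series[i])
                 for i in range(len(series) - lag)]
        xs.append(math.log(lag))
        ys.append(math.log(sum(diffs) / len(diffs)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Sanity check on a plain Gaussian random walk, where H should be near 0.5.
rng = random.Random(0)
walk = [0.0]
for _ in range(10_000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
h = hurst_exponent(walk)
```

Applying such an estimator over a sliding window yields a time-dependent H(t), which is what combined estimators of this type aim to compute quickly and with low bias.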
Genomic evaluation of regional dairy cattle breeds in single-breed and multibreed contexts.
Jónás, D; Ducrocq, V; Fritz, S; Baur, A; Sanchez, M-P; Croiseau, P
2017-02-01
An important prerequisite for high prediction accuracy in genomic prediction is the availability of a large training population, which allows accurate marker-effect estimation. This requirement is not fulfilled in the case of regional breeds with a limited number of breeding animals. We assessed the efficiency of the current French routine genomic evaluation procedure in four regional breeds (Abondance, Tarentaise, French Simmental and Vosgienne), as well as the potential benefits when the training populations, consisting of males and females of these breeds, are merged to form a multibreed training population. Genomic evaluation was 5-11% more accurate than pedigree-based BLUP in three of the four breeds, while the numerically smallest breed showed a < 1% increase in accuracy. Multibreed genomic evaluation was beneficial for two breeds (Abondance and French Simmental), with maximum gains of 5 and 8% in the correlation coefficients between yield deviations and genomic estimated breeding values compared to the single-breed genomic evaluation results. Inflation of the genomic evaluation of young candidates was also reduced. Our results indicate that genomic selection can be effective in regional breeds as well. Here, we provide empirical evidence that genetic distance between breeds is only one of the factors affecting the efficiency of multibreed genomic evaluation. © 2016 Blackwell Verlag GmbH.
Barbosa, M H P; Ferreira, A; Peixoto, L A; Resende, M D V; Nascimento, M; Silva, F F
2014-03-12
This study evaluated different strategies to select sugarcane families and obtain clones adapted to the conditions of the Brazilian savannah. Specifically, 7 experiments were conducted, each with 10 full-sib families and 2 check varieties common to all experiments. The plants were grown in randomized blocks with common checks (incomplete blocks), with 6 replications per experiment. The data were analyzed with mixed-model methodology, in which the kinship matrices between families were estimated by restricted maximum likelihood. The evaluated traits included soluble solids content (BRIX), BRIX tons/ha, average mass of a culm, number of culms/m, and tons of culms/ha. A multivariate alternative based on cluster analysis using the UPGMA method was used to identify the most promising families for selection, considering the genotypic effects on all traits. This method proved suitable for the selection of families, with 5 family groups being formed. The families forming Group 2 were superior to all other families for all evaluated traits. It is recommended that the families in Group 2 be used preferentially in sugarcane improvement programs to obtain varieties optimally adapted to the conditions of the Brazilian savannah.
Contextualized Teacher-Training and Racial/Ethnic Tensions in U.S. Schools
ERIC Educational Resources Information Center
Guinn, Cameron S.
2017-01-01
Public school teachers are required to have specialized training to appropriately address discipline problems and foster positive school culture. Despite good intentions, many teacher-training initiatives fall short of creating an unbiased school atmosphere. This study used data collected by the National Center for Educational Statistics through…
Beyond Bigotry: Teaching about Unconscious Prejudice
ERIC Educational Resources Information Center
Ghoshal, Raj Andrew; Lippard, Cameron; Ribas, Vanesa; Muir, Ken
2013-01-01
Researchers have demonstrated that unconscious prejudices around characteristics such as race, gender, and class are common, even among people who avow themselves unbiased. The authors present a method for teaching about implicit racial bias using online Implicit Association Tests. The authors do not claim that their method rids students of…
Efforts Toward the Development of Unbiased Selection and Assessment Instruments.
ERIC Educational Resources Information Center
Rudner, Lawrence M.
Investigations into item bias provide an empirical basis for the identification and elimination of test items which appear to measure different traits across populations or cultural groups. The psychometric rationales for six approaches to the identification of biased test items are reviewed: (1) Transformed item difficulties: within-group…
Intravenous Cocaine Priming Reinstates Cocaine-Induced Conditioned Place Preference
ERIC Educational Resources Information Center
Lombas, Andres S.; Freeman, Kevin B.; Roma, Peter G.; Riley, Anthony L.
2007-01-01
Separate groups of rats underwent an unbiased conditioned place preference (CPP) procedure involving alternate pairings of distinct environments with intravenous (IV) injections of cocaine (0.75 mg/kg) or saline immediately or 15 min after injection. A subsequent extinction phase consisted of exposure to both conditioning environments preceded by…
On the Bias-Amplifying Effect of Near Instruments in Observational Studies
ERIC Educational Resources Information Center
Steiner, Peter M.; Kim, Yongnam
2014-01-01
In contrast to randomized experiments, the estimation of unbiased treatment effects from observational data requires an analysis that conditions on all confounding covariates. Conditioning on covariates can be done via standard parametric regression techniques or nonparametric matching like propensity score (PS) matching. The regression or…
Toward Developing an Unbiased Scoring Algorithm for "NASA" and Similar Ranking Tasks.
ERIC Educational Resources Information Center
Lane, Irving M.; And Others
1981-01-01
Presents both logical and empirical evidence to illustrate that the conventional scoring algorithm for ranking tasks significantly underestimates the initial level of group ability and that Slevin's alternative scoring algorithm significantly overestimates the initial level of ability. Presents a modification of Slevin's algorithm which authors…
Confidence Judgments in Children's and Adults' Event Recall and Suggestibility.
ERIC Educational Resources Information Center
Roebers, Claudia M.
2002-01-01
Three studies investigated the role of 8- and 10-year-olds' and adults' metacognitive monitoring and control processes for unbiased event recall tasks and suggestibility. Findings suggested strong tendencies to overestimate confidence regardless of age and question format. Children did not lack principal metacognitive competencies when questions…
NTP-CERHR Expert Panel Report on the Reproductive and Developmental Toxicity of Bisphenol A
The National Toxicology Program (NTP) established the NTP Center for the Evaluation of Risks to Human Reproduction (CERHR) in June 1998. The purpose of the CERHR is to provide timely, unbiased, scientifically sound evaluations of the potential for adverse effects on reproduction...
A Superintendent's Perspective of the Grievance Process.
ERIC Educational Resources Information Center
Salmon, Hanford A.
Grievance procedures must be fast and fair to satisfy everyone and to disrupt the normal workflow as little as possible. The following factors help make for effective procedures: (1) binding arbitration, because it is orderly and unbiased; (2) contributions to contract language by the contract administrators; (3) absolute honesty about contract…
The Acceptability and Representativeness of Standardized Parent-Child Interaction Tasks
ERIC Educational Resources Information Center
Rhule, Dana M.; McMahon, Robert J.; Vando, Jessica
2009-01-01
Analogue behavioral observation of structured parent-child interactions has often been used to obtain a standardized, unbiased measure of child noncompliance and parenting behavior. However, for assessment information to be clinically relevant, it is essential that the behavior observed be similar to that which the child normally experiences and…
Suicide and Homosexual Teens: What Can Biology Teachers Do to Help?
ERIC Educational Resources Information Center
Smith, Mike U.; Drake, Mary Ann
2001-01-01
Discusses the teacher's role in helping students deal with homosexuality and suicide. Teachers can provide unbiased information about personal relevant biological issues; be good listeners and confidantes; and value each student without regard to race, gender, class, or sexual orientation. Provides useful information on addressing homosexuality in…
Unbiased survival estimates and evidence for skipped breeding opportunities in females
Muths, Erin L.; Scherer, Rick D.; Lambert, Brad A.
2010-01-01
Establishing the occurrence of temporary emigration not only reduces bias in estimates of survival probabilities but also provides information about expected breeding attempts by females, a critical element in understanding the ecology of an organism and the impacts of outside stressors and conservation actions.
Estimating total suspended sediment yield with probability sampling
Robert B. Thomas
1985-01-01
The "Selection At List Time" (SALT) scheme controls sampling of concentration for estimating total suspended sediment yield. The probability of taking a sample is proportional to its estimated contribution to total suspended sediment discharge. This procedure gives unbiased estimates of total suspended sediment yield and the variance of the...
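Sampling with probability proportional to an estimated contribution, as in SALT, admits a simple unbiased total estimator: weight each sampled value by the inverse of its selection probability. The sketch below is a generic with-replacement (Hansen-Hurwitz-type) illustration, not the SALT scheme itself; the toy values and sizes are assumptions.

```python
import random

def pps_total_estimate(values, sizes, n, seed=0):
    """Unbiased estimate of sum(values) from n draws made with
    probability proportional to the auxiliary `sizes`, with replacement."""
    rng = random.Random(seed)
    total_size = sum(sizes)
    probs = [s / total_size for s in sizes]
    draws = rng.choices(range(len(values)), weights=sizes, k=n)
    # Inverse-probability weighting makes the sample mean unbiased
    # for the population total.
    return sum(values[i] / probs[i] for i in draws) / n

# If the auxiliary sizes were exactly proportional to the true values,
# every sample would return the true total (20 here) with zero variance;
# good size measures are what make the SALT-style estimator efficient.
exact = pps_total_estimate([2, 4, 6, 8], [1, 2, 3, 4], n=3)
```

The closer the size measure tracks the true contribution of each unit, the smaller the variance of the estimate, which is the rationale for sampling proportional to estimated sediment discharge.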
Car telephone use and road safety : an overview prepared for the European Commission
DOT National Transportation Integrated Search
2009-06-01
This is a fairly unbiased review of the issue of driver distraction framed for use in the EU. From the perspective of providing new data this paper is lacking; however, it does provide framing for the EU, and the more international perspective is interestin...
Empirically Driven Variable Selection for the Estimation of Causal Effects with Observational Data
ERIC Educational Resources Information Center
Keller, Bryan; Chen, Jianshen
2016-01-01
Observational studies are common in educational research, where subjects self-select or are otherwise non-randomly assigned to different interventions (e.g., educational programs, grade retention, special education). Unbiased estimation of a causal effect with observational data depends crucially on the assumption of ignorability, which specifies…
Guidelines for Nonsexist Language in APA Journals
ERIC Educational Resources Information Center
American Psychological Association
1978-01-01
Sexism in journal writing may be classified as a problem of evaluation. Endeavoring to change language is a difficult task, and few attempts exist to end sexist language. Careful rephrasing can often result in accurate, unbiased communication. The APA Guidelines attempt to develop awareness and competence in using nonsexist language. (Author/MFD)
An Unbiased Estimate of Global Interrater Agreement
ERIC Educational Resources Information Center
Cousineau, Denis; Laurencelle, Louis
2017-01-01
Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…
Essays on Policy Evaluation with Endogenous Adoption
ERIC Educational Resources Information Center
Gentile, Elisabetta
2011-01-01
Over the last decade, experimental and quasi-experimental methods have been favored by researchers in empirical economics, as they provide unbiased causal estimates. However, when implementing a program, it is often not possible to randomly assign subjects to treatment, leading to a possible endogeneity bias. This dissertation consists of two…