Genomic-Enabled Prediction in Maize Using Kernel Models with Genotype × Environment Interaction
Bandeira e Sousa, Massaine; Cuevas, Jaime; de Oliveira Couto, Evellyn Giselly; Pérez-Rodríguez, Paulino; Jarquín, Diego; Fritsche-Neto, Roberto; Burgueño, Juan; Crossa, Jose
2017-01-01
Multi-environment trials are routinely conducted in plant breeding to select candidates for the next selection cycle. In this study, we compare the prediction accuracy of four developed genomic-enabled prediction models: (1) single-environment, main genotypic effect model (SM); (2) multi-environment, main genotypic effects model (MM); (3) multi-environment, single variance G×E deviation model (MDs); and (4) multi-environment, environment-specific variance G×E deviation model (MDe). Each of these four models was fitted using two kernel methods: a linear kernel, the Genomic Best Linear Unbiased Predictor (GBLUP, denoted GB), and a nonlinear Gaussian kernel (GK). The eight model-method combinations were applied to two extensive Brazilian maize data sets (HEL and USP), having different numbers of maize hybrids evaluated in different environments for grain yield (GY), plant height (PH), and ear height (EH). Results show that the MDe and MDs models fitted with the Gaussian kernel (MDe-GK and MDs-GK) had the highest prediction accuracy. For GY in the HEL data set, the increase in prediction accuracy of SM-GK over SM-GB ranged from 9 to 32%. For the MM, MDs, and MDe models, the increase in prediction accuracy of GK over GB ranged from 9 to 49%. For GY in the USP data set, the increase in prediction accuracy of SM-GK over SM-GB ranged from 0 to 7%. For the MM, MDs, and MDe models, the increase in prediction accuracy of GK over GB ranged from 34 to 70%. For traits PH and EH, gains in prediction accuracy of models with GK compared to models with GB were smaller than those achieved in GY. These gains in prediction accuracy also decreased when a more difficult prediction problem was studied. PMID:28455415
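The two kernels compared above can be sketched numerically. A minimal pure-Python illustration, assuming the standard forms — a linear kernel K = XX'/p for GBLUP and a Gaussian kernel K_ij = exp(-h·d²_ij/q), with d the Euclidean distance between marker profiles and q a bandwidth scale such as the median squared distance (the toy marker matrix and bandwidth h are illustrative, not the paper's data):

```python
import math

def linear_kernel(X):
    """GBLUP-style kernel: K = X X' / p for an n x p centered marker matrix."""
    p = len(X[0])
    return [[sum(xi * xj for xi, xj in zip(a, b)) / p for b in X] for a in X]

def gaussian_kernel(X, h=1.0):
    """Gaussian kernel: K_ij = exp(-h * d_ij^2 / q), q = median nonzero d^2."""
    d2 = [[sum((xi - xj) ** 2 for xi, xj in zip(a, b)) for b in X] for a in X]
    flat = sorted(v for row in d2 for v in row if v > 0)
    q = flat[len(flat) // 2]  # median squared distance as bandwidth scale
    return [[math.exp(-h * v / q) for v in row] for row in d2]

# toy centered marker matrix: three hybrids scored at four loci
X = [[1, -1, 0, 1], [-1, 1, 0, -1], [0, 0, 1, 1]]
KL = linear_kernel(X)
KG = gaussian_kernel(X)
```

Fitting either kernel in a GBLUP-type mixed model then differs only in which K is supplied as the genomic relationship; the nonlinear GK lets genetically distant pairs decay toward zero covariance faster than the linear kernel does.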
Outcome Prediction in Mathematical Models of Immune Response to Infection.
Mai, Manuel; Wang, Kun; Huber, Greg; Kirby, Michael; Shattuck, Mark D; O'Hern, Corey S
2015-01-01
Clinicians need to predict patient outcomes with high accuracy as early as possible after disease inception. In this manuscript, we show that patient-to-patient variability sets a fundamental limit on outcome prediction accuracy for a general class of mathematical models for the immune response to infection. However, accuracy can be increased at the expense of delayed prognosis. We investigate several systems of ordinary differential equations (ODEs) that model the host immune response to a pathogen load. Advantages of systems of ODEs for investigating the immune response to infection include the ability to collect data on large numbers of 'virtual patients', each with a given set of model parameters, and obtain many time points during the course of the infection. We implement patient-to-patient variability v in the ODE models by randomly selecting the model parameters from distributions with coefficients of variation v that are centered on physiological values. We use logistic regression with one-versus-all classification to predict the discrete steady-state outcomes of the system. We find that the prediction algorithm achieves near 100% accuracy for v = 0, and the accuracy decreases with increasing v for all ODE models studied. The fact that multiple steady-state outcomes can be obtained for a given initial condition, i.e. the basins of attraction overlap in the space of initial conditions, limits the prediction accuracy for v > 0. Increasing the elapsed time of the variables used to train and test the classifier, increases the prediction accuracy, while adding explicit external noise to the ODE models decreases the prediction accuracy. Our results quantify the competition between early prognosis and high prediction accuracy that is frequently encountered by clinicians.
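The mechanism described above — overlapping basins of attraction limiting outcome prediction once parameters vary patient-to-patient — can be illustrated with a deliberately simple bistable ODE (not one of the paper's immune-response models, and thresholding rather than logistic regression): a cubic pathogen-load equation whose two stable steady states stand in for clearance and chronic infection, with the threshold parameter perturbed per "virtual patient" at coefficient of variation v.

```python
import random

def simulate(p0, a, dt=0.01, steps=5000):
    """Euler-integrate dP/dt = -P(P - a)(P - 1): P = 0 (cleared) and P = 1
    (chronic) are stable steady states; P = a is the threshold between them."""
    p = p0
    for _ in range(steps):
        p += dt * (-p * (p - a) * (p - 1.0))
    return p

def outcome(p0, a0=0.5, v=0.0, rng=random):
    """One 'virtual patient': the threshold a varies across patients with
    coefficient of variation v; the outcome is the discrete steady state."""
    a = a0 * (1.0 + v * rng.gauss(0.0, 1.0))
    return 1 if simulate(p0, a) > 0.5 else 0
```

With v = 0 every initial load below a0 is cleared and every load above it becomes chronic, so outcomes are perfectly predictable; as v grows, initial conditions near the threshold can go either way, which is the basin-overlap effect that caps prediction accuracy.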
Correa, Katharina; Bangera, Rama; Figueroa, René; Lhorente, Jean P; Yáñez, José M
2017-01-31
Sea lice infestations caused by Caligus rogercresseyi are a main concern to the salmon farming industry due to associated economic losses. Resistance to this parasite was shown to have low to moderate genetic variation and its genetic architecture was suggested to be polygenic. The aim of this study was to compare accuracies of breeding value predictions obtained with pedigree-based best linear unbiased prediction (P-BLUP) methodology against different genomic prediction approaches: genomic BLUP (G-BLUP), Bayesian Lasso, and Bayes C. To achieve this, 2404 individuals from 118 families were measured for C. rogercresseyi count after a challenge and genotyped using 37 K single nucleotide polymorphisms. Accuracies were assessed using fivefold cross-validation and SNP densities of 0.5, 1, 5, 10, 25 and 37 K. Accuracy of genomic predictions increased with increasing SNP density and was higher than pedigree-based BLUP predictions by up to 22%. Both Bayesian and G-BLUP methods can predict breeding values with higher accuracies than pedigree-based BLUP, however, G-BLUP may be the preferred method because of reduced computation time and ease of implementation. A relatively low marker density (i.e. 10 K) is sufficient for maximal increase in accuracy when using G-BLUP or Bayesian methods for genomic prediction of C. rogercresseyi resistance in Atlantic salmon.
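Fivefold cross-validation of the kind used above amounts to dealing individuals into disjoint validation folds, training on the remainder, and correlating predicted with observed values in each held-out fold. A minimal sketch of the bookkeeping (toy sizes; the Pearson helper stands in for the accuracy computation, not for any particular G-BLUP or Bayesian fit):

```python
import random

def pearson(x, y):
    """Pearson correlation between predicted and observed values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def kfold(n, k=5, seed=7):
    """Shuffle individual indices and deal them into k disjoint validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

folds = kfold(20, k=5)  # each fold is held out once; the rest train the model
```

Accuracy is then typically reported as the mean of the per-fold correlations, repeated over several random fold assignments to reduce sampling noise.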
Final Technical Report: Increasing Prediction Accuracy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
King, Bruce Hardison; Hansen, Clifford; Stein, Joshua
2015-12-01
PV performance models are used to quantify the value of PV plants in a given location. They combine the performance characteristics of the system, the measured or predicted irradiance and weather at a site, and the system configuration and design into a prediction of the amount of energy that will be produced by a PV system. These predictions must be as accurate as possible in order for finance charges to be minimized. Higher accuracy equals lower project risk. The Increasing Prediction Accuracy project at Sandia focuses on quantifying and reducing uncertainties in PV system performance models.
The effect of using genealogy-based haplotypes for genomic prediction
Edriss, Vahid; Fernando, Rohan L; Su, Guosheng; Lund, Mogens S; Guldbrandtsen, Bernt
2013-01-01
Background Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. Methods A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. Results About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. Conclusions Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy. PMID:23496971
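Regression on haplotypes replaces each marker column with counts of the haplotype alleles observed in a chromosomal window, which is why many more covariate effects are estimated than with single markers. A toy sketch of building such a design matrix from phased haplotypes (the two-locus data are illustrative; this shows only the covariate construction, not the genealogy-based clustering itself):

```python
def haplotype_design(phased, start, end):
    """Covariate matrix of haplotype-allele counts for one window.

    phased: per-individual pairs of phased marker strings (maternal, paternal).
    Returns the distinct window haplotypes and, per individual, the count
    (0, 1 or 2) of each haplotype allele carried."""
    alleles = sorted({h[start:end] for pair in phased for h in pair})
    design = [[sum(h[start:end] == a for h in pair) for a in alleles]
              for pair in phased]
    return alleles, design

# three individuals, two biallelic markers in the window (toy data)
phased = [("01", "01"), ("01", "11"), ("11", "11")]
alleles, design = haplotype_design(phased, 0, 2)
```

With more markers per window, the number of distinct haplotype alleles — and hence covariates — grows well beyond the marker count, consistent with the roughly 7.5-fold increase reported above.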
Evaluation of approaches for estimating the accuracy of genomic prediction in plant breeding
Ould Estaghvirou, Sidi Boubacar; Ogutu, Joseph O; Schulz-Streeck, Torben; Knaak, Carsten; Ouzunova, Milena; Gordillo, Andres; Piepho, Hans-Peter
2013-01-01
Background In genomic prediction, an important measure of accuracy is the correlation between the predicted and the true breeding values. Direct computation of this quantity for real datasets is not possible, because the true breeding value is unknown. Instead, the correlation between the predicted breeding values and the observed phenotypic values, called predictive ability, is often computed. In order to indirectly estimate predictive accuracy, this latter correlation is usually divided by an estimate of the square root of heritability. In this study we use simulation to evaluate estimates of predictive accuracy for seven methods, four (1 to 4) of which use an estimate of heritability to divide predictive ability computed by cross-validation. Between them the seven methods cover balanced and unbalanced datasets as well as correlated and uncorrelated genotypes. We propose one new indirect method (4) and two direct methods (5 and 6) for estimating predictive accuracy and compare their performances and those of four other existing approaches (three indirect (1 to 3) and one direct (7)) with simulated true predictive accuracy as the benchmark and with each other. Results The size of the estimated genetic variance and hence heritability exerted the strongest influence on the variation in the estimated predictive accuracy. Increasing the number of genotypes considerably increases the time required to compute predictive accuracy by all the seven methods, most notably for the five methods that require cross-validation (Methods 1, 2, 3, 4 and 6). A new method that we propose (Method 5) and an existing method (Method 7) used in animal breeding programs were the fastest and gave the least biased, most precise and stable estimates of predictive accuracy. Of the methods that use cross-validation Methods 4 and 6 were often the best. Conclusions The estimated genetic variance and the number of genotypes had the greatest influence on predictive accuracy. 
Methods 5 and 7 were the fastest and produced the least biased, the most precise, robust and stable estimates of predictive accuracy. These properties argue for routinely using Methods 5 and 7 to assess predictive accuracy in genomic selection studies. PMID:24314298
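The indirect estimate discussed above divides predictive ability — the cross-validated correlation between predicted and observed phenotypes — by the square root of heritability. A worked sketch (the numeric values are illustrative):

```python
import math

def predictive_accuracy(predictive_ability, h2):
    """Indirect estimate of accuracy: predictive ability r(y_hat, y)
    divided by the square root of heritability h2."""
    return predictive_ability / math.sqrt(h2)

# e.g. a cross-validated predictive ability of 0.45 for a trait with h2 = 0.5
acc = predictive_accuracy(0.45, 0.5)
```

Note the estimate inherits any error in the heritability estimate — and can even exceed 1 when h2 is underestimated — which is one motivation for the direct methods the study recommends.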
Auinger, Hans-Jürgen; Schönleben, Manfred; Lehermeier, Christina; Schmidt, Malthe; Korzun, Viktor; Geiger, Hartwig H; Piepho, Hans-Peter; Gordillo, Andres; Wilde, Peer; Bauer, Eva; Schön, Chris-Carolin
2016-11-01
Genomic prediction accuracy can be significantly increased by model calibration across multiple breeding cycles as long as selection cycles are connected by common ancestors. In hybrid rye breeding, application of genome-based prediction is expected to increase selection gain because of long selection cycles in population improvement and development of hybrid components. Essentially two prediction scenarios arise: (1) prediction of the genetic value of lines from the same breeding cycle in which model training is performed and (2) prediction of lines from subsequent cycles. It is the latter from which a reduction in cycle length and consequently the strongest impact on selection gain is expected. We empirically investigated genome-based prediction of grain yield, plant height and thousand kernel weight within and across four selection cycles of a hybrid rye breeding program. Prediction performance was assessed using genomic and pedigree-based best linear unbiased prediction (GBLUP and PBLUP). A total of 1040 S2 lines were genotyped with 16k SNPs and each year testcrosses of 260 S2 lines were phenotyped in seven or eight locations. The performance gap between GBLUP and PBLUP increased significantly for all traits when model calibration was performed on aggregated data from several cycles. Prediction accuracies obtained from cross-validation were in the order of 0.70 for all traits when data from all cycles (N_CS = 832) were used for model training and exceeded within-cycle accuracies in all cases. As long as selection cycles are connected by a sufficient number of common ancestors and prediction accuracy has not reached a plateau when increasing sample size, aggregating data from several preceding cycles is recommended for predicting genetic values in subsequent cycles despite decreasing relatedness over time.
Morgante, Fabio; Huang, Wen; Maltecca, Christian; Mackay, Trudy F C
2018-06-01
Predicting complex phenotypes from genomic data is a fundamental aim of animal and plant breeding, where we wish to predict genetic merits of selection candidates; and of human genetics, where we wish to predict disease risk. While genomic prediction models work well with populations of related individuals and high linkage disequilibrium (LD) (e.g., livestock), comparable models perform poorly for populations of unrelated individuals and low LD (e.g., humans). We hypothesized that low prediction accuracies in the latter situation may occur when the genetic architecture of the trait departs from the infinitesimal and additive architecture assumed by most prediction models. We used simulated data for 10,000 lines based on sequence data from a population of unrelated, inbred Drosophila melanogaster lines to evaluate this hypothesis. We show that, even in very simplified scenarios meant as a stress test of the commonly used Genomic Best Linear Unbiased Predictor (G-BLUP) method, using all common variants yields low prediction accuracy regardless of the trait genetic architecture. However, prediction accuracy increases when predictions are informed by the genetic architecture inferred from mapping the top variants affecting main effects and interactions in the training data, provided there is sufficient power for mapping. When the true genetic architecture is largely or partially due to epistatic interactions, the additive model may not perform well, while models that account explicitly for interactions generally increase prediction accuracy. Our results indicate that accounting for genetic architecture can improve prediction accuracy for quantitative traits.
Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?
2017-01-01
Assessing the accuracy of predictive models is critical because predictive models have been increasingly used across various disciplines and predictive accuracy determines the quality of resultant predictions. Pearson product-moment correlation coefficient (r) and the coefficient of determination (r2) are among the most widely used measures for assessing predictive models for numerical data, although they are argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate what is used in the calculation of r and r2, and simulations were used to demonstrate the behaviour of r and r2 and to compare three accuracy measures under various scenarios. Relevant confusions about r and r2 have been clarified. The calculation of r and r2 is not based on the differences between the predicted and observed values. The existing error measures suffer various limitations and are unable to tell the accuracy. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe's efficiency (E1) is also an alternative accuracy measure. The r and r2 do not measure the accuracy and are incorrect accuracy measures. The existing error measures suffer limitations. VEcv and E1 are recommended for assessing the accuracy. The application of these accuracy measures would encourage accuracy-improved predictive models to be developed to generate predictions for evidence-informed decision-making. PMID:28837692
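The two recommended measures can be computed directly from held-out predictions. A minimal sketch, assuming the usual definitions — VEcv as 1 minus cross-validated SSE over SST, and E1 as its absolute-error analogue (the observed/predicted values are toy data):

```python
def vecv(obs, pred):
    """Variance explained by cross-validated predictions (VEcv):
    1 - SSE/SST, with pred taken from held-out cross-validation folds."""
    mean = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst

def e1(obs, pred):
    """Legates and McCabe's efficiency: the absolute-error analogue of VEcv."""
    mean = sum(obs) / len(obs)
    ae = sum(abs(o - p) for o, p in zip(obs, pred))
    am = sum(abs(o - mean) for o in obs)
    return 1.0 - ae / am

obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]  # held-out predictions (toy values)
```

Unlike r and r2, both measures score the differences between predicted and observed values directly, so systematically biased predictions lower the score even when they correlate perfectly with the observations.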
Correcting Memory Improves Accuracy of Predicted Task Duration
ERIC Educational Resources Information Center
Roy, Michael M.; Mitten, Scott T.; Christenfeld, Nicholas J. S.
2008-01-01
People are often inaccurate in predicting task duration. The memory bias explanation holds that this error is due to people having incorrect memories of how long previous tasks have taken, and these biased memories cause biased predictions. Therefore, the authors examined the effect on increasing predictive accuracy of correcting memory through…
Prospects for Genomic Selection in Cassava Breeding.
Wolfe, Marnin D; Del Carpio, Dunia Pino; Alabi, Olumide; Ezenwaka, Lydia C; Ikeogu, Ugochukwu N; Kayondo, Ismail S; Lozano, Roberto; Okeke, Uche G; Ozimati, Alfred A; Williams, Esuma; Egesi, Chiedozie; Kawuki, Robert S; Kulakow, Peter; Rabbi, Ismail Y; Jannink, Jean-Luc
2017-11-01
Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on previous analyses by assessing the accuracy of seven prediction models for seven traits in three prediction scenarios: cross-validation within populations, cross-population prediction and cross-generation prediction. We also evaluated the impact of increasing the training population (TP) size by phenotyping progenies selected either at random or with a genetic algorithm. Cross-validation results were mostly consistent across programs, with nonadditive models predicting 10% better on average. Cross-population accuracy was generally low (mean = 0.18), but prediction of cassava mosaic disease increased by up to 57% in one Nigerian population when data from another related population were combined. Accuracy across generations was poorer than within-generation accuracy, as expected, but accuracy for dry matter content and mosaic disease severity should be sufficient for rapid-cycling GS. Selection of a prediction model made some difference across generations, but increasing TP size was more important. With a genetic algorithm, selection of one-third of progeny could achieve an accuracy equivalent to phenotyping all progeny. We are in the early stages of GS for this crop but the results are promising for some traits. General guidelines that are emerging are that TPs need to continue to grow but phenotyping can be done on a cleverly selected subset of individuals, reducing the overall phenotyping burden. Copyright © 2017 Crop Science Society of America.
Influence of outliers on accuracy estimation in genomic prediction in plant breeding.
Estaghvirou, Sidi Boubacar Ould; Ogutu, Joseph O; Piepho, Hans-Peter
2014-10-01
Outliers often pose problems in analyses of data in plant breeding, but their influence on the performance of methods for estimating predictive accuracy in genomic prediction studies has not yet been evaluated. Here, we evaluate the influence of outliers on the performance of methods for accuracy estimation in genomic prediction studies using simulation. We simulated 1000 datasets for each of 10 scenarios to evaluate the influence of outliers on the performance of seven methods for estimating accuracy. These scenarios are defined by the number of genotypes, marker effect variance, and magnitude of outliers. To mimic outliers, we added to one observation in each simulated dataset, in turn, 5-, 8-, and 10-times the error SD used to simulate small and large phenotypic datasets. The effect of outliers on accuracy estimation was evaluated by comparing deviations in the estimated and true accuracies for datasets with and without outliers. Outliers adversely influenced accuracy estimation, more so at small values of genetic variance or number of genotypes. A method for estimating heritability and predictive accuracy in plant breeding and another used to estimate accuracy in animal breeding were the most accurate and resistant to outliers across all scenarios and are therefore preferable for accuracy estimation in genomic prediction studies. The performances of the other five methods that use cross-validation were less consistent and varied widely across scenarios. The computing time for the methods increased as the size of outliers and sample size increased and the genetic variance decreased. Copyright © 2014 Ould Estaghvirou et al.
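The perturbation design above — adding a multiple of the error SD to a single observation — can be mimicked deterministically to see how one outlier degrades a correlation-based accuracy estimate (toy data; the shift of 5 units stands in for 5 times the error SD, not the study's simulation settings):

```python
def pearson(x, y):
    """Pearson correlation, used here as the accuracy statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

true_bv = [float(i) for i in range(10)]  # stand-in 'true' breeding values
clean = list(true_bv)                    # phenotypes matching them perfectly
contaminated = list(true_bv)
contaminated[0] += 5.0                   # one observation shifted by 5 'error SD' units

r_clean = pearson(true_bv, clean)            # ~1.0
r_contaminated = pearson(true_bv, contaminated)
```

Even this single shifted value pulls the correlation well below 1, and the hit grows as the sample shrinks or the genetic variance falls, matching the pattern the study reports.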
Waide, Emily H; Tuggle, Christopher K; Serão, Nick V L; Schroyen, Martine; Hess, Andrew; Rowland, Raymond R R; Lunney, Joan K; Plastow, Graham; Dekkers, Jack C M
2018-02-01
Genomic prediction of the pig's response to the porcine reproductive and respiratory syndrome (PRRS) virus (PRRSV) would be a useful tool in the swine industry. This study investigated the accuracy of genomic prediction based on porcine SNP60 Beadchip data using training and validation datasets from populations with different genetic backgrounds that were challenged with different PRRSV isolates. Genomic prediction accuracy averaged 0.34 for viral load (VL) and 0.23 for weight gain (WG) following experimental PRRSV challenge, which demonstrates that genomic selection could be used to improve response to PRRSV infection. Training on WG data during infection with a less virulent PRRSV, KS06, resulted in poor accuracy of prediction for WG during infection with a more virulent PRRSV, NVSL. Inclusion of single nucleotide polymorphisms (SNPs) that are in linkage disequilibrium with a major quantitative trait locus (QTL) on chromosome 4 was vital for accurate prediction of VL. Overall, SNPs that were significantly associated with either trait in single SNP genome-wide association analysis were unable to predict the phenotypes with an accuracy as high as that obtained by using all genotyped SNPs across the genome. Inclusion of data from close relatives into the training population increased whole genome prediction accuracy by 33% for VL and by 37% for WG but did not affect the accuracy of prediction when using only SNPs in the major QTL region. Results show that genomic prediction of response to PRRSV infection is moderately accurate and, when using all SNPs on the porcine SNP60 Beadchip, is not very sensitive to differences in virulence of the PRRSV in training and validation populations. Including close relatives in the training population increased prediction accuracy when using the whole genome or SNPs other than those near a major QTL.
Hahn, Sowon; Buttaccio, Daniel R; Hahn, Jungwon; Lee, Taehun
2015-01-01
The present study demonstrates that levels of extraversion and neuroticism can predict attentional performance during a change detection task. After completing a change detection task built on the flicker paradigm, participants were assessed for personality traits using the Revised Eysenck Personality Questionnaire (EPQ-R). Multiple regression analyses revealed that higher levels of extraversion predict increased change detection accuracies, while higher levels of neuroticism predict decreased change detection accuracies. In addition, neurotic individuals exhibited decreased sensitivity A' and increased fixation dwell times. Hierarchical regression analyses further revealed that eye movement measures mediate the relationship between neuroticism and change detection accuracies. Based on the current results, we propose that neuroticism is associated with decreased attentional control over the visual field, presumably due to decreased attentional disengagement. Extraversion can predict increased attentional performance, but the effect is smaller than the relationship between neuroticism and attention.
The Influence of Delaying Judgments of Learning on Metacognitive Accuracy: A Meta-Analytic Review
ERIC Educational Resources Information Center
Rhodes, Matthew G.; Tauber, Sarah K.
2011-01-01
Many studies have examined the accuracy of predictions of future memory performance solicited through judgments of learning (JOLs). Among the most robust findings in this literature is that delaying predictions serves to substantially increase the relative accuracy of JOLs compared with soliciting JOLs immediately after study, a finding termed the…
Belay, T K; Dagnachew, B S; Boison, S A; Ådnøy, T
2018-03-28
Milk infrared spectra are routinely used for phenotyping traits of interest through links developed between the traits and spectra. Predicted individual traits are then used in genetic analyses for estimated breeding value (EBV) or for phenotypic predictions using a single-trait mixed model; this approach is referred to as indirect prediction (IP). An alternative approach [direct prediction (DP)] is a direct genetic analysis of (a reduced dimension of) the spectra using a multitrait model to predict multivariate EBV of the spectral components and, ultimately, also to predict the univariate EBV or phenotype for the traits of interest. We simulated 3 traits under different genetic (low: 0.10 to high: 0.90) and residual (zero to high: ±0.90) correlation scenarios between the 3 traits and assumed the first trait is a linear combination of the other 2 traits. The aim was to compare the IP and DP approaches for predictions of EBV and phenotypes under the different correlation scenarios. We also evaluated relationships between performances of the 2 approaches and the accuracy of calibration equations. Moreover, the effect of using different regression coefficients estimated from simulated phenotypes (βp), true breeding values (βg), and residuals (βr) on performance of the 2 approaches was evaluated. The simulated data contained 2,100 parents (100 sires and 2,000 cows) and 8,000 offspring (4 offspring per cow). Of the 8,000 observations, 2,000 were randomly selected and used to develop links between the first and the other 2 traits using partial least squares (PLS) regression analysis. The different PLS regression coefficients, such as βp, βg, and βr, were used in subsequent predictions following the IP and DP approaches. We used BLUP analyses for the remaining 6,000 observations using the true (co)variance components that had been used for the simulation.
Accuracy of prediction (of EBV and phenotype) was calculated as a correlation between predicted and true values from the simulations. The results showed that accuracies of EBV prediction were higher in the DP than in the IP approach. The reverse was true for accuracy of phenotypic prediction when using β p but not when using β g and β r , where accuracy of phenotypic prediction in the DP was slightly higher than in the IP approach. Within the DP approach, accuracies of EBV when using β g were higher than when using β p only at the low genetic correlation scenario. However, we found no differences in EBV prediction accuracy between the β p and β g in the IP approach. Accuracy of the calibration models increased with an increase in genetic and residual correlations between the traits. Performance of both approaches increased with an increase in accuracy of the calibration models. In conclusion, the DP approach is a good strategy for EBV prediction but not for phenotypic prediction, where the classical PLS regression-based equations or the IP approach provided better results. The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
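The accuracy measure used above, and in several of the studies that follow, is simply the Pearson correlation between predicted and true values. A minimal from-scratch sketch; the breeding values and the two sets of predictions below are made-up illustration data, not from the study:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy example: true breeding values vs. two hypothetical prediction sets.
true_ebv = [1.2, -0.5, 0.3, 2.0, -1.1, 0.8]
pred_dp  = [1.0, -0.4, 0.5, 1.8, -0.9, 0.6]   # hypothetical DP predictions
pred_ip  = [0.9, -0.1, 0.8, 1.2, -0.6, 0.2]   # hypothetical IP predictions

acc_dp = pearson(true_ebv, pred_dp)
acc_ip = pearson(true_ebv, pred_ip)
```

With these toy numbers the DP predictions track the true values more closely, so `acc_dp` exceeds `acc_ip`, mirroring the qualitative result reported for EBV prediction.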
Rutkoski, Jessica; Poland, Jesse; Mondal, Suchismita; Autrique, Enrique; Pérez, Lorena González; Crossa, José; Reynolds, Matthew; Singh, Ravi
2016-01-01
Genomic selection can be applied prior to phenotyping, enabling shorter breeding cycles and greater rates of genetic gain relative to phenotypic selection. Traits measured using high-throughput phenotyping based on proximal or remote sensing could be useful for improving pedigree and genomic prediction model accuracies for traits not yet possible to phenotype directly. We tested if using aerial measurements of canopy temperature, and green and red normalized difference vegetation index as secondary traits in pedigree and genomic best linear unbiased prediction models could increase accuracy for grain yield in wheat, Triticum aestivum L., using 557 lines in five environments. Secondary traits on training and test sets, and grain yield on the training set were modeled as multivariate, and compared to univariate models with grain yield on the training set only. Cross validation accuracies were estimated within and across-environment, with and without replication, and with and without correcting for days to heading. We observed that, within environment, with unreplicated secondary trait data, and without correcting for days to heading, secondary traits increased accuracies for grain yield by 56% in pedigree, and 70% in genomic prediction models, on average. Secondary traits increased accuracy slightly more when replicated, and considerably less when models corrected for days to heading. In across-environment prediction, trends were similar but less consistent. These results show that secondary traits measured in high-throughput could be used in pedigree and genomic prediction to improve accuracy. This approach could improve selection in wheat during early stages if validated in early-generation breeding plots. PMID:27402362
Zhou, L; Lund, M S; Wang, Y; Su, G
2014-08-01
This study investigated genomic predictions across Nordic Holstein and Nordic Red using various genomic relationship matrices. Different sources of information, such as consistencies of linkage disequilibrium (LD) phase and marker effects, were used to construct the genomic relationship matrices (G-matrices) across these two breeds. A single-trait genomic best linear unbiased prediction (GBLUP) model and a two-trait GBLUP model were used for single-breed and two-breed genomic predictions. The data included 5215 Nordic Holstein bulls and 4361 Nordic Red bulls; the Nordic Red data comprised three populations: Danish Red, Swedish Red and Finnish Ayrshire. The bulls were genotyped with a 50,000-SNP chip. Using the two-breed predictions with a joint Nordic Holstein and Nordic Red reference population, accuracies increased slightly for all traits in Nordic Red, but only for some traits in Nordic Holstein. Among the three subpopulations of Nordic Red, accuracies increased more for Danish Red than for Swedish Red and Finnish Ayrshire, because closer genetic relationships exist between Danish Red and Nordic Holstein. Among Danish Red, individuals with higher genomic relationship coefficients with Nordic Holstein showed greater increases in accuracy in the two-breed predictions. Weighting the two-breed G-matrices by LD phase consistencies, marker effects or both did not further improve accuracies of the two-breed predictions. © 2014 Blackwell Verlag GmbH.
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies
David, Maria Pamela C; Concepcion, Gisela P; Padlan, Eduardo A
2010-02-08
All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 amyloidogenic (AM) and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study is limited, and is consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. 
The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general. PMID:20144194
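The leave-one-out (LOO) accuracies quoted above come from retraining the classifier once per sequence and testing on the single held-out item. A minimal sketch of the LOO procedure itself, using a trivial majority-class stand-in and toy data (not the paper's Bayesian classifier or decision tree):

```python
def loo_accuracy(data, train_fn):
    """Leave-one-out cross validation: train on all-but-one item,
    test on the held-out item, and average over all folds."""
    correct = 0
    for i, (x, label) in enumerate(data):
        model = train_fn(data[:i] + data[i + 1:])
        correct += model(x) == label
    return correct / len(data)

def train_majority(data):
    """Toy stand-in classifier: always predict the majority training label."""
    labels = [lab for _, lab in data]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

# Toy data set: features are placeholders, labels mimic AM / non-AM classes.
data = [(0, "AM"), (1, "AM"), (2, "AM"), (3, "non-AM")]
acc = loo_accuracy(data, train_majority)   # 3 of 4 folds correct -> 0.75
```

Any real classifier with the same `train_fn` signature (returning a callable predictor) can be dropped in place of the majority stand-in.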
Putz, A M; Tiezzi, F; Maltecca, C; Gray, K A; Knauer, M T
2018-02-01
The objective of this study was to compare and determine the optimal validation method when comparing accuracy from single-step GBLUP (ssGBLUP) to traditional pedigree-based BLUP. Field data included six litter size traits. Simulated data included ten replicates designed to mimic the field data in order to determine the method that was closest to the true accuracy. Data were split into training and validation sets. The methods used were as follows: (i) theoretical accuracy derived from the prediction error variance (PEV) of the direct inverse (iLHS), (ii) approximated accuracies from the accf90(GS) program in the BLUPF90 family of programs (Approx), (iii) correlation between predictions and the single-step GEBVs from the full data set (GEBV_full), (iv) correlation between predictions and the corrected phenotypes of females from the full data set (Y_c), (v) correlation from method iv divided by the square root of the heritability (Y_ch) and (vi) correlation between sire predictions and the average of their daughters' corrected phenotypes (Y_cs). Accuracies from iLHS increased from 0.27 to 0.37 (37%) in the Large White. Approximation accuracies were very consistent and close in absolute value (0.41 to 0.43). Both iLHS and Approx were much less variable than the corrected phenotype methods (ranging from 0.04 to 0.27). On average, simulated data showed an increase in accuracy from 0.34 to 0.44 (29%) using ssGBLUP. Both iLHS and Y_ch approximated the increase well, 0.30 to 0.46 and 0.36 to 0.45, respectively. GEBV_full performed poorly in both data sets and is not recommended. Results suggest that for within-breed selection, theoretical accuracy using PEV was consistent and accurate. When direct inversion is infeasible to get the PEV, correlating predictions to the corrected phenotypes divided by the square root of heritability is adequate given a large enough validation data set. © 2017 Blackwell Verlag GmbH.
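Two of the validation methods above are one-line formulas. Method (i) converts prediction error variance into a theoretical accuracy; a common form of this relationship (assumed here, not quoted from the paper) is r = sqrt(1 - PEV/σ²_a). Method (v) rescales a correlation with corrected phenotypes by the square root of heritability. A sketch with illustrative numbers:

```python
import math

def theoretical_accuracy(pev, sigma2_a):
    """Theoretical accuracy implied by prediction error variance:
    reliability = 1 - PEV / sigma2_a, accuracy = sqrt(reliability)."""
    return math.sqrt(max(0.0, 1.0 - pev / sigma2_a))

def validation_accuracy(corr_pred_pheno, h2):
    """Correlation between predictions and corrected phenotypes,
    rescaled by sqrt(h2) to approximate accuracy on the breeding-value scale."""
    return corr_pred_pheno / math.sqrt(h2)

# Illustrative numbers (not from the study):
acc_pev = theoretical_accuracy(0.75, 1.0)   # -> 0.5
acc_val = validation_accuracy(0.30, 0.25)   # -> 0.6
```

The sqrt(h2) rescaling is why method (v) needs a large validation set: the raw correlation with noisy phenotypes is small, and dividing by a small sqrt(h2) amplifies its sampling error.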
Linkage disequilibrium among commonly genotyped SNP and variants detected from bull sequence
USDA-ARS?s Scientific Manuscript database
Genomic prediction utilizing causal variants could increase selection accuracy above that achieved with SNP genotyped by commercial assays. A number of variants detected from sequencing influential sires are likely to be causal, but noticeable improvements in prediction accuracy using imputed sequen...
Lado, Bettina; Matus, Ivan; Rodríguez, Alejandra; Inostroza, Luis; Poland, Jesse; Belzile, François; del Pozo, Alejandro; Quincke, Martín; Castro, Marina; von Zitzewitz, Jarislav
2013-12-09
In crop breeding, the interest of predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed-model using moving-means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models.
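The Gaussian kernel model that gave the highest accuracy here, and the linear GBLUP kernel it is usually compared against (as in the maize study heading this collection), can both be built from a centred marker matrix. A minimal sketch; the bandwidth h and the median-distance scaling are common conventions, not necessarily the exact choices in these studies:

```python
import math

def linear_kernel(X):
    """GBLUP-style genomic relationship: G = X X' / m, for m markers
    (assumes marker codes are already centred)."""
    m = len(X[0])
    return [[sum(xi * xj for xi, xj in zip(a, b)) / m for b in X] for a in X]

def gaussian_kernel(X, h=1.0):
    """Gaussian kernel K_ij = exp(-h * d_ij^2 / q), where d_ij is the
    Euclidean marker distance and q the median off-diagonal squared distance."""
    n = len(X)
    d2 = [[sum((xi - xj) ** 2 for xi, xj in zip(a, b)) for b in X] for a in X]
    off = sorted(d2[i][j] for i in range(n) for j in range(i + 1, n))
    q = off[len(off) // 2] or 1.0  # guard against all-identical genotypes
    return [[math.exp(-h * v / q) for v in row] for row in d2]

# Toy centred marker matrix: 3 genotypes x 4 markers (coded -1/0/1).
X = [[1, -1, 0, 1],
     [1, 0, 0, 1],
     [-1, 1, 1, -1]]
G = linear_kernel(X)     # linear (GBLUP) relationships
K = gaussian_kernel(X)   # nonlinear similarities in (0, 1]
```

Either matrix can then serve as the covariance structure of the genetic effects in a mixed model; the Gaussian kernel's nonlinearity is what lets it capture non-additive similarity that the linear kernel misses.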
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. 
Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
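The wall-to-wall accuracy maps above are produced by interpolating sample correctness (0/1 agreement between map and reference label) from test locations to every pixel. A sketch of one of the interpolation functions mentioned, a Gaussian-weighted mean over the spatial domain; the coordinates, bandwidth, and sample values are illustrative, not the paper's:

```python
import math

def gaussian_interpolate(samples, pixel, bandwidth=2.0):
    """Predict classification accuracy at `pixel` as a Gaussian-weighted
    mean of sample correctness. `samples` is a list of ((x, y), correct)
    pairs with correct in {0, 1}; nearby samples get the most weight."""
    wsum = vsum = 0.0
    for (x, y), correct in samples:
        d2 = (x - pixel[0]) ** 2 + (y - pixel[1]) ** 2
        w = math.exp(-d2 / (2 * bandwidth ** 2))
        wsum += w
        vsum += w * correct
    return vsum / wsum if wsum else 0.0

# Two correctly classified samples near the origin, one error far away.
samples = [((0, 0), 1), ((1, 0), 1), ((5, 5), 0)]
p = gaussian_interpolate(samples, (0, 1))   # pixel near the correct samples
```

Swapping the (x, y) coordinates for per-pixel spectral values gives the spectral-domain variant evaluated in the study; the per-class variant simply runs this interpolation separately on each mapped class.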
de Saint Laumer, Jean‐Yves; Leocata, Sabine; Tissot, Emeline; Baroux, Lucie; Kampf, David M.; Merle, Philippe; Boschung, Alain; Seyfried, Markus
2015-01-01
We previously showed that the relative response factors of volatile compounds were predictable from either combustion enthalpies or their molecular formulae alone [1]. We now extend this prediction to silylated derivatives by adding an increment in the ab initio calculation of combustion enthalpies. The accuracy of the experimental relative response factors database was also improved and its population increased to 490 values. In particular, more brominated compounds were measured, and their prediction accuracy was improved by adding a correction factor in the algorithm. The correlation coefficient between predicted and measured values increased from 0.936 to 0.972, leading to a mean prediction accuracy of ±6%. Thus, 93% of the relative response factor values were predicted with an accuracy of better than ±10%. The capabilities of the extended algorithm are exemplified by (i) the quick and accurate quantification of hydroxylated metabolites resulting from a biodegradation test after silylation and prediction of their relative response factors, without having the reference substances available; and (ii) rapid purity determinations of volatile compounds. This study confirms that gas chromatography with a flame ionization detector, using predicted relative response factors, is one of the few techniques that enables quantification of volatile compounds without calibrating the instrument with the pure reference substance. PMID:26179324
Maintenance of equilibrium point control during an unexpectedly loaded rapid limb movement.
Simmons, R W; Richardson, C
1984-06-08
Two experiments investigated whether the equilibrium point hypothesis or the mass-spring model of motor control subserves positioning accuracy during spring-loaded, rapid, bi-articulated movement. For intact preparations, the equilibrium point hypothesis predicts response accuracy to be determined by a mixture of afferent and efferent information, whereas the mass-spring model predicts positioning to be under a direct control system. Subjects completed a series of load-resisted training trials to a spatial target. The magnitude of a sustained spring load was unexpectedly increased on selected trials. Results indicated that positioning accuracy and applied force varied with increases in load, which suggests that the original efferent commands are modified by afferent information during the movement, as predicted by the equilibrium point hypothesis.
Gonzalez, Maritza G; Reed, Kathryn L; Center, Katherine E; Hill, Meghan G
2017-05-01
The purpose of this study was to investigate the relationship between the maternal body mass index (BMI) and the accuracy of ultrasound-derived birth weight. A retrospective chart review was performed on women who had an ultrasound examination between 36 and 43 weeks' gestation and had complete delivery data available through electronic medical records. The ultrasound-derived fetal weight was adjusted by 30 g per day of gestation that elapsed between the ultrasound examination and delivery to arrive at the predicted birth weight. A total of 403 pregnant women met inclusion criteria. Age ranged from 13-44 years (mean ± SD, 28.38 ± 5.97 years). The mean BMI was 32.62 ± 8.59 kg/m². Most of the women did not have diabetes (n = 300 [74.0%]). The sample was primarily white (n = 165 [40.9%]) and Hispanic (n = 147 [36.5%]). The predicted weight of neonates at delivery (3677.07 ± 540.51 g) was higher than the actual birth weight (3335.92 ± 585.46 g). Based on regression analyses, as the BMI increased, so did the predicted weight (P < .01) and weight at delivery (P < .01). The accuracy of the estimated ultrasound-derived birth weight was not predicted by the maternal BMI (P = .22). Maternal race and diabetes status were not associated with the accuracy of ultrasound in predicting birth weight. Both predicted and actual birth weight increased as the BMI increased. However, the BMI did not affect the accuracy of the estimated ultrasound-derived birth weight. Maternal race and diabetes status did not influence the accuracy of the ultrasound-derived predicted birth weight. © 2017 by the American Institute of Ultrasound in Medicine.
On the accuracy of ERS-1 orbit predictions
NASA Technical Reports Server (NTRS)
Koenig, Rolf; Li, H.; Massmann, Franz-Heinrich; Raimondo, J. C.; Rajasenan, C.; Reigber, C.
1993-01-01
Since the launch of ERS-1, the D-PAF (German Processing and Archiving Facility) has regularly provided orbit predictions for the worldwide SLR (Satellite Laser Ranging) tracking network. The weekly distributed orbital elements are so-called tuned IRVs and tuned SAO elements. The tuning procedure, designed to improve the accuracy of the recovery of the orbit at the stations, is discussed based on numerical results, which show that tuning of elements is essential for ERS-1 with the currently applied tracking procedures. The orbital elements are updated by daily distributed time bias functions. The generation of the time bias function is explained, and problems and numerical results are presented. The time bias function increases the prediction accuracy considerably. Finally, the quality assessment of ERS-1 orbit predictions is described. The accuracy is compiled for about 250 days since launch. The average accuracy lies in the range of 50-100 ms and has improved considerably.
Karzmark, Peter; Deutsch, Gayle K
2018-01-01
This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and a brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predictive power, and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and comprised the Halstead-Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
Bernecker, Samantha L; Rosellini, Anthony J; Nock, Matthew K; Chiu, Wai Tat; Gutierrez, Peter M; Hwang, Irving; Joiner, Thomas E; Naifeh, James A; Sampson, Nancy A; Zaslavsky, Alan M; Stein, Murray B; Ursano, Robert J; Kessler, Ronald C
2018-04-03
High rates of mental disorders, suicidality, and interpersonal violence early in the military career have raised interest in implementing preventive interventions with high-risk new enlistees. The Army Study to Assess Risk and Resilience in Servicemembers (STARRS) developed risk-targeting systems for these outcomes based on machine learning methods using administrative data predictors. However, administrative data omit many risk factors, raising the question whether risk targeting could be improved by adding self-report survey data to prediction models. If so, the Army may gain from routinely administering surveys that assess additional risk factors. The STARRS New Soldier Survey was administered to 21,790 Regular Army soldiers who agreed to have survey data linked to administrative records. As reported previously, machine learning models using administrative data as predictors found that small proportions of high-risk soldiers accounted for high proportions of negative outcomes. Other machine learning models using self-report survey data as predictors were developed previously for three of these outcomes: major physical violence and sexual violence perpetration among men and sexual violence victimization among women. Here we examined the extent to which this survey information increases prediction accuracy, over models based solely on administrative data, for those three outcomes. We used discrete-time survival analysis to estimate a series of models predicting first occurrence, assessing how model fit improved and concentration of risk increased when adding the predicted risk score based on survey data to the predicted risk score based on administrative data. The addition of survey data improved prediction significantly for all outcomes. 
In the most extreme case, the percentage of reported sexual violence victimization among the 5% of female soldiers with highest predicted risk increased from 17.5% using only administrative predictors to 29.4% adding survey predictors, a 67.9% proportional increase in prediction accuracy. Other proportional increases in concentration of risk ranged from 4.8% to 49.5% (median = 26.0%). Data from an ongoing New Soldier Survey could substantially improve accuracy of risk models compared to models based exclusively on administrative predictors. Depending upon the characteristics of interventions used, the increase in targeting accuracy from survey data might offset survey administration costs.
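The "proportional increase" figures above are relative gains over the administrative-data baseline. Recomputing the headline example from the rounded percentages gives ≈68.0% (the reported 67.9% presumably uses unrounded values):

```python
def proportional_increase(base, improved):
    """Relative improvement of `improved` over `base`, in percent."""
    return 100.0 * (improved - base) / base

# Figures from the abstract: concentration of risk rises from 17.5% of
# victimization captured in the top 5% of risk scores to 29.4%.
gain = proportional_increase(17.5, 29.4)   # ~68.0, vs. the reported 67.9
```

The same calculation reproduces the range of the other reported gains (4.8% to 49.5%) from their respective baseline and improved concentrations.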
Weng, Ziqing; Wolc, Anna; Shen, Xia; Fernando, Rohan L; Dekkers, Jack C M; Arango, Jesus; Settar, Petek; Fulton, Janet E; O'Sullivan, Neil P; Garrick, Dorian J
2016-03-19
Genomic estimated breeding values (GEBV) based on single nucleotide polymorphism (SNP) genotypes are widely used in animal improvement programs. It is typically assumed that the larger the number of animals is in the training set, the higher is the prediction accuracy of GEBV. The aim of this study was to quantify genomic prediction accuracy depending on the number of ancestral generations included in the training set, and to determine the optimal number of training generations for different traits in an elite layer breeding line. Phenotypic records for 16 traits on 17,793 birds were used. All parents and some selection candidates from nine non-overlapping generations were genotyped for 23,098 segregating SNPs. An animal model with pedigree relationships (PBLUP) and the BayesB genomic prediction model were applied to predict EBV or GEBV at each validation generation (progeny of the most recent training generation) based on varying numbers of immediately preceding ancestral generations. Prediction accuracy of EBV or GEBV was assessed as the correlation between EBV and phenotypes adjusted for fixed effects, divided by the square root of trait heritability. The optimal number of training generations that resulted in the greatest prediction accuracy of GEBV was determined for each trait. The relationship between optimal number of training generations and heritability was investigated. On average, accuracies were higher with the BayesB model than with PBLUP. Prediction accuracies of GEBV increased as the number of closely-related ancestral generations included in the training set increased, but reached an asymptote or slightly decreased when distant ancestral generations were used in the training set. The optimal number of training generations was 4 or more for high heritability traits but less than that for low heritability traits. 
For less heritable traits, limiting the training datasets to individuals closely related to the validation population resulted in the best predictions. The effect of adding distant ancestral generations in the training set on prediction accuracy differed between traits and the optimal number of necessary training generations is associated with the heritability of traits.
ERIC Educational Resources Information Center
Borgmeier, Chris; Horner, Robert H.
2006-01-01
Faced with limited resources, schools require tools that increase the accuracy and efficiency of functional behavioral assessment. Yarbrough and Carr (2000) provided evidence that informant confidence ratings of the likelihood of problem behavior in specific situations offered a promising tool for predicting the accuracy of function-based…
The effect of concurrent hand movement on estimated time to contact in a prediction motion task.
Zheng, Ran; Maraj, Brian K V
2018-04-27
In many activities, we need to predict the arrival of an occluded object. This action is called prediction motion or motion extrapolation. Previous researchers have found that both eye tracking and the internal clocking model are involved in the prediction motion task. Additionally, it is reported that concurrent hand movement facilitates the eye tracking of an externally generated target in a tracking task, even if the target is occluded. The present study examined the effect of concurrent hand movement on the estimated time to contact (TTC) in a prediction motion task. We found that different (accurate/inaccurate) concurrent hand movements had opposite effects on eye tracking accuracy and estimated TTC in the prediction motion task. That is, accurate concurrent hand tracking enhanced eye tracking accuracy and tended to increase the precision of estimated TTC, whereas inaccurate concurrent hand tracking decreased eye tracking accuracy and disrupted estimated TTC. However, eye tracking accuracy does not determine the precision of estimated TTC.
Chen, L; Schenkel, F; Vinsky, M; Crews, D H; Li, C
2013-10-01
In beef cattle, phenotypic data that are difficult and/or costly to measure, such as feed efficiency, and DNA marker genotypes are usually available on a small number of animals of different breeds or populations. To achieve a maximal accuracy of genomic prediction using the phenotype and genotype data, strategies for forming a training population to predict genomic breeding values (GEBV) of the selection candidates need to be evaluated. In this study, we examined the accuracy of predicting GEBV for residual feed intake (RFI) based on 522 Angus and 395 Charolais steers genotyped on SNP with the Illumina Bovine SNP50 Beadchip for 3 training population forming strategies: within breed, across breed, and by pooling data from the 2 breeds (i.e., combined). Two other scenarios with the training and validation data split by birth year and by sire family within a breed were also investigated to assess the impact of genetic relationships on the accuracy of genomic prediction. Three statistical methods including the best linear unbiased prediction with the relationship matrix defined based on the pedigree (PBLUP), based on the SNP genotypes (GBLUP), and a Bayesian method (BayesB) were used to predict the GEBV. The results showed that the accuracy of the GEBV prediction was the highest when the prediction was within breed and when the validation population had greater genetic relationships with the training population, with a maximum of 0.58 for Angus and 0.64 for Charolais. The within-breed prediction accuracies dropped to 0.29 and 0.38, respectively, when the validation populations had a minimal pedigree link with the training population. When the training population of a different breed was used to predict the GEBV of the validation population, that is, across-breed genomic prediction, the accuracies were further reduced to 0.10 to 0.22, depending on the prediction method used. 
Pooling data from the 2 breeds to form the training population increased the accuracies to 0.31 and 0.43, respectively, for the Angus and Charolais validation populations. The results suggest that the genetic relationship of selection candidates with the training population has a major impact on the accuracy of GEBV obtained using the Illumina BovineSNP50 BeadChip. Pooling data from different breeds to form the training population will improve the accuracy of across-breed genomic prediction for RFI in beef cattle.
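The GBLUP approach compared above can be sketched in a few lines. The following is a minimal, self-contained illustration on simulated toy genotypes (not the Angus/Charolais data): it builds VanRaden's genomic relationship matrix and predicts GEBV of validation animals from training phenotypes. Population sizes, marker counts, and the shrinkage parameter are arbitrary assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated genotypes (0/1/2) for a small toy population; the real data
# (Angus/Charolais steers, 50K SNPs) is only mimicked in shape.
n_train, n_valid, n_snp = 60, 20, 500
M = rng.binomial(2, 0.3, size=(n_train + n_valid, n_snp)).astype(float)

# VanRaden's genomic relationship matrix: G = ZZ' / (2 * sum p(1-p))
p = M.mean(axis=0) / 2.0
Z = M - 2.0 * p
G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Simulate phenotypes for training animals from a few causal SNPs (h2 ~ 0.5)
beta = np.zeros(n_snp)
beta[rng.choice(n_snp, 20, replace=False)] = rng.normal(0.0, 1.0, 20)
g_true = Z @ beta
y = g_true[:n_train] + rng.normal(0.0, g_true.std(), n_train)

# GBLUP: gebv_valid = G_vt (G_tt + lambda I)^-1 (y - mean), lambda = sigma_e^2/sigma_g^2
lam = 1.0
Gtt = G[:n_train, :n_train]
alpha = np.linalg.solve(Gtt + lam * np.eye(n_train), y - y.mean())
gebv_valid = G[n_train:, :n_train] @ alpha

# Accuracy here is the correlation with the (simulated) true genetic values
acc = np.corrcoef(gebv_valid, g_true[n_train:])[0, 1]
```

In practice lambda would be estimated (e.g., by REML) rather than fixed, and accuracy would be approximated against adjusted phenotypes rather than unobservable true genetic values.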
Multivariate prediction of motor diagnosis in Huntington's disease: 12 years of PREDICT-HD.
Long, Jeffrey D; Paulsen, Jane S
2015-10-01
It is well known in Huntington's disease that cytosine-adenine-guanine expansion and age at study entry are predictive of the timing of motor diagnosis. The goal of this study was to assess whether additional motor, imaging, cognitive, functional, psychiatric, and demographic variables measured at study entry increased the ability to predict the risk of motor diagnosis over 12 years. One thousand seventy-eight Huntington's disease gene-expanded carriers (64% female) from the Neurobiological Predictors of Huntington's Disease study were followed up for up to 12 y (mean = 5, standard deviation = 3.3) covering 2002 to 2014. No one had a motor diagnosis at study entry, but 225 (21%) carriers prospectively received a motor diagnosis. Analysis was performed with random survival forests, a machine learning method for right-censored data. Adding 34 variables along with cytosine-adenine-guanine and age substantially increased predictive accuracy relative to cytosine-adenine-guanine and age alone. Adding six of the common motor and cognitive variables (total motor score, diagnostic confidence level, Symbol Digit Modalities Test, three Stroop tests) resulted in lower predictive accuracy than the full set, but still achieved twice the 5-y predictive accuracy of cytosine-adenine-guanine and age alone. Additional analysis suggested interactions and nonlinear effects that were characterized in a post hoc Cox regression model. Measurement of clinical variables can substantially increase the accuracy of predicting motor diagnosis over and above cytosine-adenine-guanine and age (and their interaction). Estimated probabilities can be used to characterize progression level and aid in future studies' sample selection. © 2015 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society.
Predicting School Enrollments Using the Modified Regression Technique.
ERIC Educational Resources Information Center
Grip, Richard S.; Young, John W.
This report is based on a study in which a regression model was constructed to increase accuracy in enrollment predictions. A model, known as the Modified Regression Technique (MRT), was used to examine K-12 enrollment over the past 20 years in 2 New Jersey school districts of similar size and ethnicity. To test the model's accuracy, MRT was…
Isma’eel, Hussain A.; Sakr, George E.; Almedawar, Mohamad M.; Fathallah, Jihan; Garabedian, Torkom; Eddine, Savo Bou Zein
2015-01-01
Background High dietary salt intake is directly linked to hypertension and cardiovascular diseases (CVDs). Predicting behaviors regarding salt intake habits is vital to guide interventions and increase their effectiveness. We aim to compare the accuracy of an artificial neural network (ANN) based tool that predicts behavior from key knowledge questions along with clinical data in a high cardiovascular risk cohort relative to the least squares models (LSM) method. Methods We collected knowledge, attitude and behavior data on 115 patients. A behavior score was calculated to classify patients' behavior towards reducing salt intake. Accuracy comparison between ANN and regression analysis was performed using the bootstrap technique with 200 iterations. Results Starting from a 69-item questionnaire, a reduced model was developed that included eight knowledge items found to yield the highest accuracy of 62% (CI 58-67%). The best prediction accuracy in the full and reduced models was attained by ANN, at 66% and 62%, respectively, compared to the full and reduced LSM at 40% and 34%, respectively. The average relative increase in accuracy of ANN over LSM was 82% in the full model and 102% in the reduced model. Conclusions Using ANN modeling, we can predict salt reduction behaviors with 66% accuracy. The statistical model has been implemented in an online calculator and can be used in clinics to estimate a patient's behavior. This will support future research to further establish the clinical utility of this tool for guiding therapeutic salt reduction interventions in high cardiovascular risk individuals. PMID:26090333
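The bootstrap accuracy comparison described in the Methods can be illustrated as follows. The per-patient predictions below are simulated stand-ins for the ANN and LSM outputs (the accuracy levels are set only to mimic the reported 66% vs. 40%); the resampling logic is the part being demonstrated.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical test-set results: a true behavior class per patient plus
# predicted classes from two competing models (stand-ins for ANN and LSM).
n = 115
y_true = rng.integers(0, 2, n)
pred_ann = np.where(rng.random(n) < 0.66, y_true, 1 - y_true)  # ~66% correct
pred_lsm = np.where(rng.random(n) < 0.40, y_true, 1 - y_true)  # ~40% correct

def bootstrap_accuracy(y, pred, n_iter=200, rng=rng):
    """Bootstrap the classification accuracy; return mean and 95% CI."""
    accs = np.empty(n_iter)
    for i in range(n_iter):
        idx = rng.integers(0, len(y), len(y))   # resample patients with replacement
        accs[i] = np.mean(y[idx] == pred[idx])
    return accs.mean(), np.percentile(accs, [2.5, 97.5])

acc_ann, ci_ann = bootstrap_accuracy(y_true, pred_ann)
acc_lsm, ci_lsm = bootstrap_accuracy(y_true, pred_lsm)
```

Non-overlapping bootstrap confidence intervals are the usual informal criterion for declaring one model's accuracy reliably higher than the other's.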
Predicting grain yield using canopy hyperspectral reflectance in wheat breeding data.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; de Los Campos, Gustavo; Alvarado, Gregorio; Suchismita, Mondal; Rutkoski, Jessica; González-Pérez, Lorena; Burgueño, Juan
2017-01-01
Modern agriculture uses hyperspectral cameras to obtain hundreds of reflectance measurements at discrete narrow bands covering the whole visible light spectrum and part of the infrared and ultraviolet light spectra, depending on the camera. This information is used to construct vegetation indices (VIs) (e.g., the green normalized difference vegetation index, GNDVI, or the simple ratio, SRa) which are used for the prediction of primary traits (e.g., biomass). However, these indices use only some bands and are cultivar-specific; therefore, they lose considerable information and are not robust across cultivars. This study proposes models that use all available bands as predictors to increase prediction accuracy; we compared these approaches with eight conventional VIs constructed using only some bands. The data set comes from CIMMYT's global wheat program and comprises 1170 genotypes evaluated for grain yield (ton/ha) in five environments (Drought, Irrigated, EarlyHeat, Melgas and Reduced Irrigated); the reflectance data were measured in 250 discrete narrow bands ranging between 392 and 851 nm. The proposed models for the simultaneous analysis of all bands were ordinary least squares (OLS), Bayes B, principal components with Bayes B, functional B-spline, functional Fourier and functional partial least squares. The results of these models were compared with OLS performed using each of the eight VIs individually and combined as predictors. We found that using all bands simultaneously increased prediction accuracy more than using VIs alone. The spline and Fourier models had the best prediction accuracy at each of the nine time points under study. Combining image data collected at different time points led to a small increase in prediction accuracy relative to models that use data from a single time point.
Also, using only bands with heritabilities larger than 0.5 as predictor variables improved prediction accuracy, although this was observed only in the Drought environment.
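Regressing a trait on all bands simultaneously, as proposed above, is typically done with a latent-variable method such as partial least squares. A numpy-only PLS1 (NIPALS) sketch on simulated spectra follows; the number of plots, bands, and latent components are illustrative, not the CIMMYT values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated reflectance: 200 plots x 250 bands, with yield driven by a
# low-dimensional latent signal (a toy stand-in for real canopy spectra).
n, n_bands = 200, 250
latent = rng.normal(size=(n, 3))
loadings = rng.normal(size=(3, n_bands))
X = latent @ loadings + 0.1 * rng.normal(size=(n, n_bands))
y = latent @ np.array([1.0, -0.5, 0.25]) + 0.1 * rng.normal(size=n)

def pls1_fit(X, y, n_comp=3):
    """PLS1 via NIPALS; returns regression coefficients for centered data."""
    Xc, yc = X - X.mean(0), y - y.mean()
    W, P, Q = [], [], []
    for _ in range(n_comp):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)                      # weight vector
        t = Xc @ w                                  # score
        p = Xc.T @ t / (t @ t)                      # X loading
        q = (yc @ t) / (t @ t)                      # y loading
        Xc = Xc - np.outer(t, p)                    # deflate X
        yc = yc - q * t                             # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    return W @ np.linalg.solve(P.T @ W, Q)          # B so that yhat = Xc @ B

B = pls1_fit(X[:150], y[:150])
yhat = (X[150:] - X[:150].mean(0)) @ B + y[:150].mean()
r = np.corrcoef(yhat, y[150:])[0, 1]
```

The functional B-spline and Fourier variants in the paper additionally smooth the 250-band coefficient curve over wavelength; the PLS core is the same.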
Genomic selection across multiple breeding cycles in applied bread wheat breeding.
Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann
2016-06-01
We evaluated genomic selection across five breeding cycles of bread wheat breeding. Bias of within-cycle cross-validation and methods for improving the prediction accuracy were assessed. The prospect of genomic selection has been frequently shown by cross-validation studies using the same genetic material across multiple environments, but studies investigating genomic selection across multiple breeding cycles in applied bread wheat breeding are lacking. We estimated the prediction accuracy of grain yield, protein content and protein yield of 659 inbred lines across five independent breeding cycles and assessed the bias of within-cycle cross-validation. We investigated the influence of outliers on the prediction accuracy and predicted protein yield by its component traits. A high average heritability was estimated for protein content, followed by grain yield and protein yield. The bias of the prediction accuracy using populations from individual cycles with fivefold cross-validation was accordingly substantial for protein yield (17-712%) and less pronounced for protein content (8-86%). Cross-validation using the cycles as folds aimed to avoid this bias and reached a maximum prediction accuracy of r = 0.51 for protein content, r = 0.38 for grain yield and r = 0.16 for protein yield. Dropping outlier cycles increased the prediction accuracy of grain yield to r = 0.41 as estimated by cross-validation, while dropping outlier environments did not have a significant effect on the prediction accuracy. Independent validation suggests, on the other hand, that careful consideration is necessary before undertaking an outlier correction that removes lines from the training population. Predicting protein yield by multiplying genomic estimated breeding values of grain yield and protein content raised the prediction accuracy to r = 0.19 for this derived trait.
Lee, S Hong; Clark, Sam; van der Werf, Julius H J
2017-01-01
Genomic prediction is emerging in a wide range of fields, including animal and plant breeding, risk prediction in human precision medicine, and forensics. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consist of information sources with varying degrees of relationship to the target individuals. A reference set can contain both close and distant relatives as well as 'unrelated' individuals from the wider population. The various sources of information were modeled as different populations with different effective population sizes (Ne). Both the effective number of chromosome segments (Me) and Ne are considered to be functions of the data used for prediction. We validate our theory with analyses of simulated as well as real data, and illustrate that the variation in genomic relationships with the target is a predictor of the information content of the reference set. With a similar amount of data available from each source, we show that close relatives can have a substantially larger effect on genomic prediction accuracy than less-related individuals. We also illustrate that when prediction relies on closer relatives, there is less improvement in prediction accuracy with an increase in training data or marker panel density. We release software that can estimate the expected prediction accuracy and power when combining different reference sources with various degrees of relationship to the target, which is useful when planning genomic prediction (before or after collecting data) in animal, plant and human genetics.
All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences.
Hayat, Sikander; Sander, Chris; Marks, Debora S; Elofsson, Arne
2015-04-28
Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand-strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.
Accuracy of fetal sex determination on ultrasound examination in the first trimester of pregnancy.
Manzanares, Sebastián; Benítez, Adara; Naveiro-Fuentes, Mariña; López-Criado, María Setefilla; Sánchez-Gila, Mar
2016-06-01
The aim of this study was to evaluate the feasibility and success rate of sex determination on transabdominal sonographic examination at 11-13 weeks' gestation and to identify factors influencing accuracy. In this prospective observational evaluation of 672 fetuses between 11 weeks' and 13 weeks + 6 days' gestational age (GA), we determined fetal sex according to the angle of the genital tubercle viewed on the midsagittal plane. We also analyzed maternal, fetal, and operator factors possibly influencing the accuracy of the determination. Fetal sex determination was feasible in 608 of the 672 fetuses (90.5%), and the prediction was correct in 532 of those 608 cases (87.5%). Fetal sex was predicted more accurately as the fetal crown-rump length (CRL) and GA increased, and less accurately as the maternal body mass index increased. A CRL greater than 55.7 mm, a GA of more than 12 weeks + 2 days, and a body mass index below 23.8 were identified as the best cutoff values for sex prediction. None of the other analyzed factors influenced the feasibility or accuracy of sex determination. The sex of a fetus can be accurately determined on sonographic examination in the first trimester of pregnancy; the accuracy of this prediction is influenced by the fetal CRL and GA and by the maternal body mass index. © 2015 Wiley Periodicals, Inc. J Clin Ultrasound 44:272-277, 2016.
Edwards, Stefan M.; Sørensen, Izel F.; Sarup, Pernille; Mackay, Trudy F. C.; Sørensen, Peter
2016-01-01
Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, plant and animal breeding, and adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weights the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits. PMID:27235308
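The GFBLUP idea of weighting feature markers differently from background markers can be sketched as follows. This toy numpy example builds two relationship matrices, one from a "feature" marker subset and one from the rest, and compares kernel BLUP with equal versus up-weighted feature markers; the weights are fixed by hand here, whereas GFBLUP estimates the two variance components from data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy genotypes; markers 0..49 form the "genomic feature" (e.g., SNPs in a
# gene ontology category enriched for causal variants), the rest are background.
n, m = 100, 400
M = rng.binomial(2, 0.4, size=(n, m)).astype(float)
Z = M - M.mean(0)
feature, others = np.arange(50), np.arange(50, m)

def grm(Zsub):
    """Genomic relationship matrix from a marker subset."""
    return Zsub @ Zsub.T / Zsub.shape[1]

G_f, G_o = grm(Z[:, feature]), grm(Z[:, others])

# Phenotype: causal variants sit entirely inside the feature set (h2 ~ 0.5).
beta = np.zeros(m)
beta[rng.choice(feature, 10, replace=False)] = 1.0
g = Z @ beta
y = g + rng.normal(0.0, g.std(), n)

def kernel_blup_acc(K, y, n_train=70, lam=1.0):
    """Predict the last n - n_train records from the first n_train via kernel BLUP."""
    Ktt = K[:n_train, :n_train]
    alpha = np.linalg.solve(Ktt + lam * np.eye(n_train),
                            y[:n_train] - y[:n_train].mean())
    pred = K[n_train:, :n_train] @ alpha
    return np.corrcoef(pred, y[n_train:])[0, 1]

# All-marker GRM equals the marker-count-weighted average of the subset GRMs.
acc_gblup = kernel_blup_acc((50 * G_f + 350 * G_o) / 400, y)   # GBLUP: equal weights
acc_gfblup = kernel_blup_acc(0.8 * G_f + 0.2 * G_o, y)         # feature up-weighted
```

When the feature really is enriched for causal variants, the up-weighted kernel is expected to predict better, which is the mechanism behind the reported GFBLUP gains.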
Yao, Chen; Zhu, Xiaojin; Weigel, Kent A
2016-11-07
Genomic prediction for novel traits, which can be costly and labor-intensive to measure, is often hampered by low accuracy due to the limited size of the reference population. As an option to improve prediction accuracy, we introduced a semi-supervised learning strategy known as the self-training model, and applied this method to genomic prediction of residual feed intake (RFI) in dairy cattle. We describe a self-training model that is wrapped around a support vector machine (SVM) algorithm, which enables it to use data from animals with and without measured phenotypes. Initially, an SVM model was trained using data from 792 animals with measured RFI phenotypes. Then, the resulting SVM was used to generate self-trained phenotypes for 3000 animals for which RFI measurements were not available. Finally, the SVM model was re-trained using data from up to 3792 animals, including those with measured and self-trained RFI phenotypes. Incorporation of additional animals with self-trained phenotypes enhanced the accuracy of genomic predictions compared with predictions derived from the subset of animals with measured phenotypes alone. The optimal ratio of animals with self-trained phenotypes to animals with measured phenotypes (2.5, 2.0, and 1.8) and the maximum increase achieved in prediction accuracy, measured as the correlation between predicted and actual RFI phenotypes (5.9, 4.1, and 2.4%), decreased as the size of the initial training set (300, 400, and 500 animals with measured phenotypes) increased. The optimal number of animals with self-trained phenotypes may be smaller when prediction accuracy is measured as the mean squared error rather than as the correlation between predicted and actual RFI phenotypes.
Our results demonstrate that semi-supervised learning models that incorporate self-trained phenotypes can achieve genomic prediction accuracies that are comparable to those obtained with models using larger training sets that include only animals with measured phenotypes. Semi-supervised learning can be helpful for genomic prediction of novel traits, such as RFI, for which the size of reference population is limited, in particular, when the animals to be predicted and the animals in the reference population originate from the same herd-environment.
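The three-step self-training wrapper described above can be sketched independently of the base learner. The toy example below uses kernel ridge regression as a stand-in for the SVM, with simulated features in place of real genotypes; all sizes and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for RFI: labeled animals (measured phenotypes) plus a larger
# unlabeled pool, with phenotype a smooth function of two "genomic" features.
def make_data(n):
    X = rng.normal(size=(n, 2))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)
    return X, y

X_lab, y_lab = make_data(80)      # animals with measured phenotypes
X_unl, _ = make_data(200)         # animals without measurements
X_test, y_test = make_data(100)

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_predict(X_tr, y_tr, X_te, lam=0.1):
    """Kernel ridge regression (stand-in for the SVM in the paper)."""
    K = rbf(X_tr, X_tr)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_tr)), y_tr)
    return rbf(X_te, X_tr) @ alpha

# Step 1: train on measured phenotypes only.
pred_base = fit_predict(X_lab, y_lab, X_test)

# Step 2: generate self-trained phenotypes for the unlabeled animals.
y_self = fit_predict(X_lab, y_lab, X_unl)

# Step 3: retrain on measured + self-trained phenotypes together.
X_aug = np.vstack([X_lab, X_unl])
y_aug = np.concatenate([y_lab, y_self])
pred_self = fit_predict(X_aug, y_aug, X_test)

r_base = np.corrcoef(pred_base, y_test)[0, 1]
r_self = np.corrcoef(pred_self, y_test)[0, 1]
```

The paper's key empirical finding is that the step-3 model can match or beat the step-1 model up to some ratio of self-trained to measured records, beyond which the self-generated labels add mostly noise.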
Predicting Earth orientation changes from global forecasts of atmosphere-hydrosphere dynamics
NASA Astrophysics Data System (ADS)
Dobslaw, Henryk; Dill, Robert
2018-02-01
Effective Angular Momentum (EAM) functions obtained from global numerical simulations of atmosphere, ocean, and land surface dynamics are routinely processed by the Earth System Modelling group at Deutsches GeoForschungsZentrum. EAM functions are available since January 1976 with up to 3 h temporal resolution. Additionally, 6-day EAM forecasts are published every day. Based on hindcast experiments with 305 individual predictions distributed over 15 months, we demonstrate that EAM forecasts improve the prediction accuracy of the Earth Orientation Parameters at all forecast horizons between 1 and 6 days. At day 6, prediction accuracy improves to 1.76 mas for the terrestrial pole offset and 2.6 mas for ΔUT1, which corresponds to an accuracy increase of about 41% over predictions published in Bulletin A by the International Earth Rotation and Reference Systems Service.
Zylberberg, Ariel; Fetsch, Christopher R; Shadlen, Michael N
2016-01-01
Many decisions are thought to arise via the accumulation of noisy evidence to a threshold or bound. In perception, the mechanism explains the effect of stimulus strength, characterized by signal-to-noise ratio, on decision speed, accuracy and confidence. It also makes intriguing predictions about the noise itself. An increase in noise should lead to faster decisions, reduced accuracy and, paradoxically, higher confidence. To test these predictions, we introduce a novel sensory manipulation that mimics the addition of unbiased noise to motion-selective regions of visual cortex, which we verified with neuronal recordings from macaque areas MT/MST. For both humans and monkeys, increasing the noise induced faster decisions and greater confidence over a range of stimuli for which accuracy was minimally impaired. The magnitude of the effects was in agreement with predictions of a bounded evidence accumulation model. DOI: http://dx.doi.org/10.7554/eLife.17688.001 PMID:27787198
Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C
2014-06-01
Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection.
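The W-BLUP idea of shrinking known functional markers less than anonymous genome-wide markers can be sketched as a weighted ridge regression. The toy numpy example below designates two markers as "functional" with large effects; all sizes, effect values, and shrinkage constants are illustrative assumptions, not the wheat study's settings.

```python
import numpy as np

rng = np.random.default_rng(11)

# Toy data: 200 lines, 300 SNPs; markers 0 and 1 play the role of known
# functional markers (e.g., major heading-time genes) with large effects.
n, m = 200, 300
X = rng.binomial(2, 0.5, size=(n, m)).astype(float)
X -= X.mean(0)
beta = rng.normal(0.0, 0.05, m)      # small polygenic background effects
beta[0], beta[1] = 1.0, -0.8         # large functional-marker effects
y = X @ beta + rng.normal(0.0, 1.0, n)

def weighted_ridge(X, y, lam_vec):
    """Solve (X'X + diag(lam_vec)) b = X'y: marker-specific shrinkage."""
    return np.linalg.solve(X.T @ X + np.diag(lam_vec), X.T @ y)

n_tr = 150
lam_rr = np.full(m, 100.0)           # RR-BLUP-like: equal shrinkage everywhere
lam_w = lam_rr.copy()
lam_w[:2] = 1.0                      # W-BLUP-like: functional markers barely shrunk

b_rr = weighted_ridge(X[:n_tr], y[:n_tr], lam_rr)
b_w = weighted_ridge(X[:n_tr], y[:n_tr], lam_w)

r_rr = np.corrcoef(X[n_tr:] @ b_rr, y[n_tr:])[0, 1]
r_w = np.corrcoef(X[n_tr:] @ b_w, y[n_tr:])[0, 1]
```

Treating functional markers as near-fixed effects (tiny shrinkage) while keeping ridge shrinkage on the rest is what lets a W-BLUP-type model bridge marker-assisted and genomic selection.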
ShinyGPAS: interactive genomic prediction accuracy simulator based on deterministic formulas.
Morota, Gota
2017-12-20
Deterministic formulas for the accuracy of genomic predictions highlight the relationships among prediction accuracy and potential factors influencing prediction accuracy prior to performing computationally intensive cross-validation. Visualizing such deterministic formulas in an interactive manner may lead to a better understanding of how genetic factors control prediction accuracy. The software to simulate deterministic formulas for genomic prediction accuracy was implemented in R and encapsulated as a web-based Shiny application. Shiny genomic prediction accuracy simulator (ShinyGPAS) simulates various deterministic formulas and delivers dynamic scatter plots of prediction accuracy versus genetic factors impacting prediction accuracy, while requiring only mouse navigation in a web browser. ShinyGPAS is available at https://chikudaisei.shinyapps.io/shinygpas/. ShinyGPAS is a Shiny-based interactive genomic prediction accuracy simulator using deterministic formulas. It can be used for interactively exploring potential factors that influence prediction accuracy in genome-enabled prediction, simulating achievable prediction accuracy prior to genotyping individuals, or supporting in-class teaching. ShinyGPAS is open source software and it is hosted online as a freely available web-based resource with an intuitive graphical user interface.
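A representative deterministic formula of the kind such simulators visualize is the widely used Daetwyler-type expression, where expected accuracy depends on training size, heritability, and the effective number of independent chromosome segments. A minimal sketch (parameter values are arbitrary illustrations, not defaults of ShinyGPAS):

```python
import math

def expected_accuracy(n, h2, me):
    """Expected genomic prediction accuracy (Daetwyler-type formula):
    r = sqrt(n * h2 / (n * h2 + me)), with n the training population size,
    h2 the trait heritability, and me the effective number of
    independently segregating chromosome segments."""
    return math.sqrt(n * h2 / (n * h2 + me))

# Accuracy grows with training size and heritability, and shrinks as Me grows.
acc_small = expected_accuracy(n=1000, h2=0.5, me=5000)
acc_large = expected_accuracy(n=10000, h2=0.5, me=5000)
```

Plotting `expected_accuracy` over a grid of `n`, `h2`, and `me` reproduces the kind of interactive scatter plot the abstract describes.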
Dissolved oxygen content prediction in crab culture using a hybrid intelligent method
Yu, Huihui; Chen, Yingyi; Hassan, ShahbazGul; Li, Daoliang
2016-01-01
A precise predictive model is needed to obtain a clear understanding of the changing dissolved oxygen content in outdoor crab ponds, to assess how to reduce risk and to optimize water quality management. The uncertainties in the data from multiple sensors are a significant factor when building a dissolved oxygen content prediction model. To increase prediction accuracy, a new hybrid dissolved oxygen content forecasting model based on the radial basis function neural networks (RBFNN) data fusion method and a least squares support vector machine (LSSVM) with an optimal improved particle swarm optimization (IPSO) is developed. In the modelling process, the RBFNN data fusion method is used to improve information accuracy and provide more trustworthy training samples for the IPSO-LSSVM prediction model. The LSSVM is a powerful tool for achieving nonlinear dissolved oxygen content forecasting. In addition, an improved particle swarm optimization algorithm is developed to determine the optimal parameters for the LSSVM with high accuracy and generalizability. In this study, the comparison of the prediction results of different traditional models validates the effectiveness and accuracy of the proposed hybrid RBFNN-IPSO-LSSVM model for dissolved oxygen content prediction in outdoor crab ponds. PMID:27270206
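The LSSVM core of such a hybrid model is attractive because training reduces to solving one linear system rather than a quadratic program. A numpy sketch on a toy dissolved-oxygen-like series follows; the PSO hyperparameter search is replaced by fixed, hand-picked values, and the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy dissolved-oxygen-like series: smooth daily cycle plus sensor noise.
t = np.linspace(0.0, 4.0 * np.pi, 120)
y = 7.0 + 1.5 * np.sin(t) + 0.1 * rng.normal(size=t.size)
X = t[:, None]
train, test = np.arange(120)[::2], np.arange(120)[1::2]

def rbf(A, B, sigma=1.0):
    return np.exp(-((A - B.T) ** 2) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """LS-SVM regression: training solves the single linear system
    [[0, 1'], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = y.size
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]          # bias b, dual coefficients alpha

b, alpha = lssvm_fit(X[train], y[train])
pred = rbf(X[test], X[train]) @ alpha + b
rmse = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
```

In the paper's pipeline, `gamma` and `sigma` would be tuned by the improved PSO, and the training targets would come from the RBFNN sensor-fusion step rather than a single simulated series.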
NASA Astrophysics Data System (ADS)
Dyar, M. Darby; Giguere, Stephen; Carey, CJ; Boucher, Thomas
2016-12-01
This project examines the causes, effects, and optimization of continuum removal in laser-induced breakdown spectroscopy (LIBS) to produce the best possible prediction accuracy of elemental composition in geological samples. We compare prediction accuracy resulting from several different techniques for baseline removal, including asymmetric least squares (ALS), adaptive iteratively reweighted penalized least squares (Air-PLS), fully automatic baseline correction (FABC), continuous wavelet transformation, median filtering, polynomial fitting, the iterative thresholding Dietrich method, convex hull/rubber band techniques, and a newly-developed technique for Custom baseline removal (BLR). We assess the predictive performance of these methods using partial least-squares analysis for 13 elements of geological interest, expressed as the weight percentages of SiO2, Al2O3, TiO2, FeO, MgO, CaO, Na2O, K2O, and the parts per million concentrations of Ni, Cr, Zn, Mn, and Co. We find that previously published methods for baseline subtraction generally produce equivalent prediction accuracies for major elements. When those pre-existing methods are used, automated optimization of their adjustable parameters is always necessary to wring the best predictive accuracy out of a data set; ideally, it should be done for each individual variable. The new technique of Custom BLR produces significant improvements in prediction accuracy over existing methods across varying geological data sets, instruments, and varying analytical conditions. These results also demonstrate the dual objectives of the continuum removal problem: removing a smooth underlying signal to fit individual peaks (univariate analysis) versus using feature selection to select only those channels that contribute to best prediction accuracy for multivariate analyses. Overall, the current practice of using generalized, one-method-fits-all-spectra baseline removal results in poorer predictive performance for all methods. 
The extra steps needed to optimize baseline removal for each predicted variable and empower multivariate techniques with the best possible input data for optimal prediction accuracy are shown to be well worth the slight increase in necessary computations and complexity.
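Of the baseline-removal techniques listed, asymmetric least squares (ALS) is the simplest to sketch. Below is a minimal NumPy/SciPy implementation in the style of Eilers and Boelens; the parameter names and default values are illustrative, not those optimized in the study:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def als_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Asymmetric least squares baseline (Eilers & Boelens style).

    A second-difference penalty (weighted by lam) keeps the baseline
    smooth, while asymmetric weights (p for points above the fit,
    1 - p for points below) pull it toward the lower envelope of the
    spectrum, so peaks are largely ignored.
    """
    L = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, -1, -2], shape=(L, L - 2))
    w = np.ones(L)
    for _ in range(n_iter):
        W = sparse.spdiags(w, 0, L, L)
        z = spsolve(sparse.csc_matrix(W + lam * (D @ D.T)), w * y)
        w = p * (y > z) + (1 - p) * (y < z)
    return z
```

Subtracting `als_baseline(y)` from `y` leaves the peaks; `lam` controls smoothness and `p` the asymmetry, and, as the abstract stresses, both typically need automated per-variable optimization rather than a one-size-fits-all setting.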
Improving Genomic Prediction in Cassava Field Experiments Using Spatial Analysis.
Elias, Ani A; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc
2018-01-04
Cassava (Manihot esculenta Crantz) is an important staple food in sub-Saharan Africa. Breeding experiments were conducted at the International Institute of Tropical Agriculture in cassava to select elite parents. Taking into account the heterogeneity in the field while evaluating these trials can increase the accuracy in estimation of breeding values. We used an exploratory approach using the parametric spatial kernels Power, Spherical, and Gaussian to determine the best kernel for a given scenario. The spatial kernel was fit simultaneously with a genomic kernel in a genomic selection model. Predictability of these models was tested through a 10-fold cross-validation method repeated five times. The best model was chosen as the one with the lowest prediction root mean squared error compared to that of the base model having no spatial kernel. Results from our real and simulated data studies indicated that predictability can be increased by accounting for spatial variation irrespective of the heritability of the trait. In real data scenarios we observed that the accuracy can be increased by a median value of 3.4%. Through simulations, we showed that a 21% increase in accuracy can be achieved. We also found that Range (row) directional spatial kernels, mostly Gaussian, explained the spatial variance in 71% of the scenarios when spatial correlation was significant. Copyright © 2018 Elias et al.
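The three parametric spatial kernels named above have standard geostatistical forms. A sketch of those forms follows; the range/decay parameterizations are common conventions and may differ in detail from the ones fitted in the study:

```python
import numpy as np

def power_kernel(d, rho):
    # Correlation rho**d for distance d, with 0 < rho < 1.
    return rho ** d

def spherical_kernel(d, r):
    # Polynomial decay that drops to exactly zero beyond the range r.
    k = 1 - 1.5 * (d / r) + 0.5 * (d / r) ** 3
    return np.where(d < r, k, 0.0)

def gaussian_kernel(d, r):
    # Smooth squared-exponential decay with range parameter r.
    return np.exp(-((d / r) ** 2))
```

In a genomic selection model of the kind described, these functions are evaluated on inter-plot (row/column) distances to build a spatial covariance matrix, which is then fitted alongside the genomic kernel.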
He, Jun; Xu, Jiaqi; Wu, Xiao-Lin; Bauck, Stewart; Lee, Jungjae; Morota, Gota; Kachman, Stephen D; Spangler, Matthew L
2018-04-01
SNP chips are commonly used for genotyping animals in genomic selection but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequency (MAF), and SNP-trait associations either for single traits independently or for all the three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821-0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825-0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications for the design of LD SNP chips for imputation-enabled genomic prediction.
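The genomic predictions referred to throughout these records typically follow the GBLUP pattern: build a genomic relationship matrix (GRM) from SNP genotypes and solve a ridge-type linear system. A minimal sketch follows, using the VanRaden method 1 GRM on simulated data; the h² value and dimensions are illustrative assumptions, not figures from this study:

```python
import numpy as np

def vanraden_grm(M):
    # M: n x m genotype matrix coded 0/1/2 copies of one allele.
    p = M.mean(axis=0) / 2
    Z = M - 2 * p                       # center by allele frequencies
    return (Z @ Z.T) / (2 * np.sum(p * (1 - p)))

def gblup_predict(G, y_train, train, test, h2=0.3):
    # The mixed-model equations collapse to a kernel-ridge solve:
    # (G_tt + lambda * I) a = y - mean, prediction = mean + G_st a.
    lam = (1 - h2) / h2
    Gtt = G[np.ix_(train, train)]
    Gst = G[np.ix_(test, train)]
    a = np.linalg.solve(Gtt + lam * np.eye(len(train)),
                        y_train - y_train.mean())
    return y_train.mean() + Gst @ a
```

Prediction accuracy, as reported in these abstracts, is then the correlation between these predicted genomic values and the (de-regressed) phenotypes or true breeding values of the test animals.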
Dynamic Filtering Improves Attentional State Prediction with fNIRS
NASA Technical Reports Server (NTRS)
Harrivel, Angela R.; Weissman, Daniel H.; Noll, Douglas C.; Huppert, Theodore; Peltier, Scott J.
2016-01-01
Brain activity can predict a person's level of engagement in an attentional task. However, estimates of brain activity are often confounded by measurement artifacts and systemic physiological noise. The optimal method for filtering this noise - thereby increasing such state prediction accuracy - remains unclear. To investigate this, we asked study participants to perform an attentional task while we monitored their brain activity with functional near infrared spectroscopy (fNIRS). We observed higher state prediction accuracy when noise in the fNIRS hemoglobin [Hb] signals was filtered with a non-stationary (adaptive) model as compared to static regression (84% ± 6% versus 72% ± 15%).
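A non-stationary filter of the general kind compared here can be sketched as a least-mean-squares (LMS) adaptive canceller that tracks a measured noise reference, in contrast to a one-shot static regression. This is a generic illustration of adaptive noise cancellation, not the specific filter used in the fNIRS study:

```python
import numpy as np

def lms_cancel(signal, noise_ref, mu=0.01, order=4):
    # Adaptively regress a measured noise reference out of the signal.
    # The filter weights w are updated sample by sample, so the
    # cancellation can track non-stationary noise coupling.
    w = np.zeros(order)
    out = np.zeros_like(signal)
    for n in range(order - 1, len(signal)):
        x = noise_ref[n - order + 1:n + 1][::-1]  # most recent sample first
        e = signal[n] - w @ x                     # error = cleaned output
        out[n] = e
        w += 2 * mu * e * x                       # LMS weight update
    return out
```

A static regression would instead estimate one fixed set of weights from the whole record; when the noise coupling drifts over time, the adaptive version keeps cancelling it while the static one degrades.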
Using Word Prediction Software to Increase Typing Fluency with Students with Physical Disabilities
ERIC Educational Resources Information Center
Tumlin, Jennifer; Heller, Kathryn Wolff
2004-01-01
The purpose of this study was to examine the use of word prediction software to increase typing speed and decrease spelling errors for students who have physical disabilities that affect hand use. Student perceptions regarding the effectiveness of word prediction was examined as well as their typing rates and spelling accuracy. Four students with…
Mauya, Ernest William; Hansen, Endre Hofstad; Gobakken, Terje; Bollandsås, Ole Martin; Malimbwi, Rogers Ernest; Næsset, Erik
2015-12-01
Airborne laser scanning (ALS) has recently emerged as a promising tool to acquire auxiliary information for improving aboveground biomass (AGB) estimation in sample-based forest inventories. Under design-based and model-assisted inferential frameworks, the estimation relies on a model that relates the auxiliary ALS metrics to AGB estimated on ground plots. The size of the field plots has been identified as one source of model uncertainty because of the so-called boundary effects, which increase with decreasing plot size. Recent research in tropical forests has aimed to quantify the boundary effects on model prediction accuracy, but evidence of the consequences for the final AGB estimates is lacking. In this study, we analyzed the effect of field plot size on model prediction accuracy and its implication when used in a model-assisted inferential framework. The results showed that the prediction accuracy of the model improved as the plot size increased. The adjusted R² increased from 0.35 to 0.74, while the relative root mean square error decreased from 63.6 to 29.2%. Indicators of boundary effects were identified and confirmed to have significant effects on the model residuals. Variance estimates of model-assisted mean AGB relative to corresponding variance estimates of pure field-based AGB decreased with increasing plot size in the range from 200 to 3000 m². The variance ratio of field-based estimates relative to model-assisted variance ranged from 1.7 to 7.7. This study showed that the relative improvement in precision of AGB estimation when increasing field-plot size was greater for an ALS-assisted inventory than for a pure field-based inventory.
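The two fit statistics quoted (adjusted R² and relative RMSE) are standard. For reference, a direct implementation of their usual definitions:

```python
import numpy as np

def adjusted_r2(y, yhat, n_params):
    # R^2 penalized for the number of fitted model parameters.
    n = len(y)
    r2 = 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)
    return 1 - (1 - r2) * (n - 1) / (n - n_params - 1)

def relative_rmse(y, yhat):
    # RMSE expressed as a percentage of the mean observed value,
    # as in the 63.6% and 29.2% figures above.
    return 100 * np.sqrt(np.mean((y - yhat) ** 2)) / np.mean(y)
```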
Danner, Omar K; Hendren, Sandra; Santiago, Ethel; Nye, Brittany; Abraham, Prasad
2017-04-01
Enhancing the efficiency of diagnosis and treatment of severe sepsis by using physiologically-based, predictive analytical strategies has not been fully explored. We hypothesized that assessment of the heart-rate-to-systolic ratio significantly increases the timeliness and accuracy of sepsis prediction after emergency department (ED) presentation. We evaluated the records of 53,313 ED patients from a large, urban teaching hospital between January and June 2015. The HR-to-systolic ratio was compared to SIRS criteria for sepsis prediction. There were 884 patients with discharge diagnoses of sepsis, severe sepsis, and/or septic shock. Variations in three presenting variables (heart rate, systolic BP, and temperature) were determined to be primary early predictors of sepsis, with a 74% (654/884) accuracy compared to 34% (304/884) using SIRS criteria (p < 0.0001) in confirmed septic patients. Physiologically-based predictive analytics improved the accuracy and expediency of sepsis identification via detection of variations in the HR-to-systolic ratio. This approach may lead to earlier sepsis workup and life-saving interventions. Copyright © 2017 Elsevier Inc. All rights reserved.
Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E
2017-07-01
The leaf spotting diseases in wheat that include Septoria tritici blotch (STB) caused by Zymoseptoria tritici, Stagonospora nodorum blotch (SNB) caused by Parastagonospora nodorum, and tan spot (TS) caused by Pyrenophora tritici-repentis pose challenges to breeding programs in selecting for resistance. A promising approach that could enable selection prior to phenotyping is genomic selection, which uses genome-wide markers to estimate breeding values (BVs) for quantitative traits. To evaluate this approach for seedling and/or adult plant resistance (APR) to STB, SNB, and TS, we compared the predictive ability of the least-squares (LS) approach with genomic-enabled prediction models including genomic best linear unbiased predictor (GBLUP), Bayesian ridge regression (BRR), Bayes A (BA), Bayes B (BB), Bayes Cπ (BC), Bayesian least absolute shrinkage and selection operator (BL), and reproducing kernel Hilbert space markers (RKHS-M), a pedigree-based model (RKHS-P), and RKHS markers and pedigree (RKHS-MP). We observed that LS gave the lowest prediction accuracies and RKHS-MP, the highest. The genomic-enabled prediction models and RKHS-P gave similar accuracies. The increase in accuracy using genomic prediction models over LS was 48%. The mean genomic prediction accuracies were 0.45 for STB (APR), 0.55 for SNB (seedling), 0.66 for TS (seedling), and 0.48 for TS (APR). We also compared markers from two whole-genome profiling approaches, genotyping by sequencing (GBS) and diversity arrays technology sequencing (DArTseq), for prediction. While GBS markers performed slightly better than DArTseq markers, combining markers from the two approaches did not improve accuracies. We conclude that implementing GS in breeding for these diseases would help to achieve higher accuracies and rapid gains from selection. Copyright © 2017 Crop Science Society of America.
Hidalgo, A M; Bastiaansen, J W M; Lopes, M S; Veroneze, R; Groenen, M A M; de Koning, D-J
2015-07-01
Genomic selection is applied to dairy cattle breeding to improve the genetic progress of purebred (PB) animals, whereas in pigs and poultry the target is a crossbred (CB) animal for which a different strategy appears to be needed. The source of information used to estimate the breeding values, i.e., using phenotypes of CB or PB animals, may affect the accuracy of prediction. The objective of our study was to assess the direct genomic value (DGV) accuracy of CB and PB pigs using different sources of phenotypic information. Data used were from 3 populations: 2,078 Dutch Landrace-based, 2,301 Large White-based, and 497 crossbreds from an F1 cross between the 2 lines. Two female reproduction traits were analyzed: gestation length (GLE) and total number of piglets born (TNB). Phenotypes used in the analyses originated from offspring of genotyped individuals. Phenotypes collected on CB and PB animals were analyzed as separate traits using a single-trait model. Breeding values were estimated separately for each trait in a pedigree BLUP analysis and subsequently deregressed. Deregressed EBV for each trait originating from different sources (CB or PB offspring) were used to study the accuracy of genomic prediction. Accuracy of prediction was computed as the correlation between DGV and the DEBV of the validation population. Accuracy of prediction within PB populations ranged from 0.43 to 0.62 across GLE and TNB. Accuracies to predict genetic merit of CB animals with one PB population in the training set ranged from 0.12 to 0.28, with the exception of using the CB offspring phenotype of the Dutch Landrace that resulted in an accuracy estimate around 0 for both traits. Accuracies to predict genetic merit of CB animals with both parental PB populations in the training set ranged from 0.17 to 0.30. 
We conclude that prediction within population and trait had good predictive ability regardless of the trait being the PB or CB performance, whereas using PB population(s) to predict genetic merit of CB animals had zero to moderate predictive ability. We observed that the DGV accuracy of CB animals when training on PB data was greater than or equal to training on CB data. However, when results are corrected for the different levels of reliabilities in the PB and CB training data, we showed that training on CB data does outperform PB data for the prediction of CB genetic merit, indicating that more CB animals should be phenotyped to increase the reliability and, consequently, accuracy of DGV for CB genetic merit.
Naderi, S; Yin, T; König, S
2016-09-01
A simulation study was conducted to investigate the performance of random forest (RF) and genomic BLUP (GBLUP) for genomic predictions of binary disease traits based on cow calibration groups. Training and testing sets were modified in different scenarios according to disease incidence, the quantitative-genetic background of the trait (h(2)=0.30 and h(2)=0.10), and the genomic architecture [725 quantitative trait loci (QTL) and 290 QTL, populations with high and low levels of linkage disequilibrium (LD)]. For all scenarios, 10,005 SNP (depicting a low-density 10K SNP chip) and 50,025 SNP (depicting a 50K SNP chip) were evenly spaced along 29 chromosomes. Training and testing sets included 20,000 cows (4,000 sick, 16,000 healthy, disease incidence 20%) from the last 2 generations. Initially, 4,000 sick cows were assigned to the testing set, and the remaining 16,000 healthy cows represented the training set. In the ongoing allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick animals from the testing set to the training set, and vice versa. The size of the training and testing sets was kept constant. Evaluation criteria for both GBLUP and RF were the correlations between genomic breeding values and true breeding values (prediction accuracy), and the area under the receiver operating characteristic curve (AUROC). Prediction accuracy and AUROC increased for both methods and all scenarios as increasing percentages of sick cows were allocated to the training set. Highest prediction accuracies were observed for disease incidences in training sets that reflected the population disease incidence of 0.20. For this allocation scheme, the largest prediction accuracies of 0.53 for RF and of 0.51 for GBLUP, and the largest AUROC of 0.66 for RF and of 0.64 for GBLUP, were achieved using 50,025 SNP, a heritability of 0.30, and 725 QTL.
Decreasing the heritability from 0.30 to 0.10 and reducing the number of QTL from 725 to 290 were associated with lower prediction accuracy and lower AUROC in all scenarios. This decrease was more pronounced for RF. Also, the increase in LD had a stronger effect on RF results than on GBLUP results. The highest prediction accuracy from the low LD scenario was 0.30 from RF and 0.36 from GBLUP, and increased to 0.39 for both methods in the high LD population. Random forest successfully identified important SNP in close map distance to QTL explaining a high proportion of the phenotypic trait variations. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
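The AUROC criterion used above has a convenient rank interpretation: it is the probability that a randomly chosen sick animal receives a higher genomic score than a randomly chosen healthy one. A small sketch of that Mann–Whitney formulation:

```python
import numpy as np

def auroc(scores, labels):
    # P(score of a positive > score of a negative), ties counted as 1/2.
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties
```

An AUROC of 0.5 corresponds to random ranking, which is why values such as 0.64-0.66 represent modest but real discrimination between sick and healthy cows.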
Can plantar soft tissue mechanics enhance prognosis of diabetic foot ulcer?
Naemi, R; Chatzistergos, P; Suresh, S; Sundar, L; Chockalingam, N; Ramachandran, A
2017-04-01
To investigate whether the assessment of the mechanical properties of plantar soft tissue can increase the accuracy of predicting Diabetic Foot Ulceration (DFU). 40 patients with diabetic neuropathy and no DFU were recruited. Commonly assessed clinical parameters along with plantar soft tissue stiffness and thickness were measured at baseline using an ultrasound elastography technique. 7 patients developed foot ulceration during a 12-month follow-up. Logistic regression was used to identify parameters that contribute to predicting the DFU incidence. The effect of using parameters related to the mechanical behaviour of plantar soft tissue on the specificity, sensitivity, prediction strength and accuracy of the predicting models for DFU was assessed. Patients with higher plantar soft tissue thickness and lower stiffness at the 1st metatarsal head area showed an increased risk of DFU. Adding plantar soft tissue stiffness and thickness to the model improved its specificity (by 3%), sensitivity (by 14%), prediction accuracy (by 5%) and prognosis strength (by 1%). The model containing all predictors was able to effectively (χ²(8, N = 40) = 17.55, P < 0.05) distinguish between the patients with and without DFU incidence. The mechanical properties of plantar soft tissue can be used to improve the predictability of DFU in moderate/high risk patients. Copyright © 2017 Elsevier B.V. All rights reserved.
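The logistic-regression screening step can be sketched in plain NumPy. The two predictors and effect directions below mirror the reported finding (higher thickness and lower stiffness imply higher risk), but the data are simulated and the effect sizes are invented for illustration:

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, n_iter=3000):
    # Batch gradient ascent on the Bernoulli log-likelihood.
    X1 = np.c_[np.ones(len(X)), X]       # prepend intercept column
    w = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))
        w += lr * X1.T @ (y - p) / len(y)
    return w

def sensitivity_specificity(y, yhat):
    tp = np.sum((y == 1) & (yhat == 1))
    fn = np.sum((y == 1) & (yhat == 0))
    tn = np.sum((y == 0) & (yhat == 0))
    fp = np.sum((y == 0) & (yhat == 1))
    return tp / (tp + fn), tn / (tn + fp)
```

Varying the decision threshold on the fitted probabilities trades sensitivity against specificity, which is how adding predictors can improve both metrics at a fixed operating point, as reported above.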
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we establish several results about the prediction accuracy of ensemble methods for binary classification that were missed or misinterpreted in previous literature. First, we derive the upper and lower bounds of the prediction accuracies (i.e., the best and worst possible prediction accuracies) of ensemble methods. Next, we show that an ensemble method can achieve > 0.5 prediction accuracy even when the individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that the upper- and lower-bound accuracies are hard to achieve with random individual classifiers, so better algorithms need to be developed. PMID:21853162
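For independent classifiers with a common accuracy p, the majority-vote accuracy follows directly from the binomial distribution; this is the simplest special case of the ensemble accuracies the paper bounds:

```python
from math import comb

def majority_vote_accuracy(p, n):
    # For odd n: probability that more than half of n independent
    # classifiers, each correct with probability p, are correct.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))
```

With p = 0.6, three voters already reach 0.648 and 101 voters exceed 0.97, showing how an ensemble can sit well above its members. Under independence, p < 0.5 drives the vote below 0.5 by the same formula, so the paper's result that an ensemble can exceed 0.5 with sub-0.5 members relies on dependent (error-anticorrelated) classifiers.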
Dynamic filtering improves attentional state prediction with fNIRS
Harrivel, Angela R.; Weissman, Daniel H.; Noll, Douglas C.; Huppert, Theodore; Peltier, Scott J.
2016-01-01
Brain activity can predict a person’s level of engagement in an attentional task. However, estimates of brain activity are often confounded by measurement artifacts and systemic physiological noise. The optimal method for filtering this noise – thereby increasing such state prediction accuracy – remains unclear. To investigate this, we asked study participants to perform an attentional task while we monitored their brain activity with functional near infrared spectroscopy (fNIRS). We observed higher state prediction accuracy when noise in the fNIRS hemoglobin [Hb] signals was filtered with a non-stationary (adaptive) model as compared to static regression (84% ± 6% versus 72% ± 15%). PMID:27231602
Farinati, F; Cardin, F; Di Mario, F; Sava, G A; Piccoli, A; Costa, F; Penon, G; Naccarato, R
1987-08-01
The endoscopic diagnosis of chronic atrophic gastritis is often underestimated, and most of the procedures adopted to increase diagnostic accuracy are time consuming and complex. In this study, we evaluated the usefulness of the determination of gastric juice pH by means of litmus paper. Values obtained by this method correlate well with gastric acid secretory capacity as measured by gastric acid analysis (r = -0.64, P < 0.001) and are not affected by the presence of bile. Gastric juice pH determination increases sensitivity and other diagnostic parameters such as performance index (Youden J test), positive predictive value, and post-test probability difference by 50%. Furthermore, the negative predictive value is very high, the probability of missing a patient with chronic atrophic gastritis with this simple method being 2% for fundic and 15% for antral atrophic change. We conclude that gastric juice pH determination, which substantially increases diagnostic accuracy and is very simple to perform, should be routinely adopted.
Torres-Dowdall, J.; Farmer, A.H.; Bucher, E.H.; Rye, R.O.; Landis, G.
2009-01-01
Stable isotope analyses have revolutionized the study of migratory connectivity. However, as with all tools, their limitations must be understood in order to derive the maximum benefit of a particular application. The goal of this study was to evaluate the efficacy of stable isotopes of C, N, H, O and S for assigning known-origin feathers to the molting sites of migrant shorebird species wintering and breeding in Argentina. Specific objectives were to: 1) compare the efficacy of the technique for studying shorebird species with different migration patterns, life histories and habitat-use patterns; 2) evaluate the grouping of species with similar migration and habitat use patterns in a single analysis to potentially improve prediction accuracy; and 3) evaluate the potential gains in prediction accuracy that might be achieved from using multiple stable isotopes. The efficacy of stable isotope ratios to determine origin was found to vary with species. While one species (White-rumped Sandpiper, Calidris fuscicollis) had high levels of accuracy assigning samples to known origin (91% of samples correctly assigned), another (Collared Plover, Charadrius collaris) showed low levels of accuracy (52% of samples correctly assigned). Intra-individual variability may account for this difference in efficacy. The prediction model for three species with similar migration and habitat-use patterns performed poorly compared with the model for just one of the species (71% versus 91% of samples correctly assigned). Thus, combining multiple sympatric species may not improve model prediction accuracy. Increasing the number of stable isotopes in the analyses increased the accuracy of assigning shorebirds to their molting origin, but the best combination - involving a subset of all the isotopes analyzed - varied among species.
NASA Astrophysics Data System (ADS)
Prince, John R.
1982-12-01
Sensitivity, specificity, and predictive accuracy have been shown to be useful measures of the clinical efficacy of diagnostic tests and can be used to predict the potential improvement in diagnostic certitude resulting from the introduction of a competing technology. This communication demonstrates how the informal use of clinical decision analysis may guide health planners in the allocation of resources, purchasing decisions, and implementation of high technology. For didactic purposes the focus is on a comparison between conventional planar radioscintigraphy (RS) and single photon transverse section emission computed tomography (SPECT). For example, positive predictive accuracy (PPA) for brain RS in a specialist hospital with a 50% disease prevalence is about 95%. SPECT should increase this predictive accuracy to 96%. In a primary care hospital with only a 15% disease prevalence the PPA is only 77%, and SPECT may increase this accuracy to about 79%. Similar calculations based on published data show that marginal improvements are expected with SPECT in the liver. It is concluded that: a) The decision to purchase a high technology imaging modality such as SPECT for clinical purposes should be analyzed on an individual organ system and institutional basis. High technology may be justified in specialist hospitals but not necessarily in primary care hospitals. This is more dependent on disease prevalence than on procedure volume; b) It is questionable whether SPECT imaging will be competitive with standard RS procedures. Research should concentrate on the development of different medical applications.
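The prevalence dependence described here is just Bayes' rule. With sensitivity and specificity both near 95% the quoted figures reproduce almost exactly; that 95%/95% operating point is an assumption consistent with the numbers, not stated in the text:

```python
def positive_predictive_accuracy(sens, spec, prevalence):
    # Bayes' rule: P(disease | positive test).
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)
```

At 50% prevalence this gives 0.95 (the specialist-hospital case); at 15% prevalence it falls to about 0.77, matching the primary-care figure, which is why the same test looks much weaker in a low-prevalence setting.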
Calus, M P L; de Haas, Y; Veerkamp, R F
2013-10-01
Genomic selection holds the promise of being particularly beneficial for traits that are difficult or expensive to measure, such that access to phenotypes on large daughter groups of bulls is limited. Instead, cow reference populations can be generated, potentially supplemented with existing information from the same or (highly) correlated traits available on bull reference populations. The objective of this study, therefore, was to develop a model to perform genomic predictions and genome-wide association studies based on a combined cow and bull reference data set, with the accuracy of the phenotypes differing between the cow and bull genomic selection reference populations. The developed bivariate Bayesian stochastic search variable selection model allowed for an unbalanced design by imputing residuals in the residual updating scheme for all missing records. The performance of this model is demonstrated on a real data example, where the analyzed trait, being milk fat or protein yield, was either measured only on a cow or a bull reference population, or recorded on both. The developed bivariate Bayesian stochastic search variable selection model was able to analyze 2 traits even when animals had measurements on only 1 of the 2 traits. The Bayesian stochastic search variable selection model yielded consistently higher accuracy for fat yield compared with a model without variable selection, both for the univariate and bivariate analyses, whereas the accuracy of both models was very similar for protein yield. The bivariate model identified several additional quantitative trait loci peaks compared with the single-trait models on either trait. In addition, the bivariate models showed a marginal increase in accuracy of genomic predictions for the cow traits (0.01-0.05), although a greater increase in accuracy is expected as the size of the bull population increases.
Our results emphasize that the chosen values of the priors in Bayesian genomic prediction models are especially important in small data sets. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Non-additive genetic variation in growth, carcass and fertility traits of beef cattle.
Bolormaa, Sunduimijid; Pryce, Jennie E; Zhang, Yuandan; Reverter, Antonio; Barendse, William; Hayes, Ben J; Goddard, Michael E
2015-04-02
A better understanding of non-additive variance could lead to increased knowledge on the genetic control and physiology of quantitative traits, and to improved prediction of the genetic value and phenotype of individuals. Genome-wide panels of single nucleotide polymorphisms (SNPs) have been mainly used to map additive effects for quantitative traits, but they can also be used to investigate non-additive effects. We estimated dominance and epistatic effects of SNPs on various traits in beef cattle and the variance explained by dominance, and quantified the increase in accuracy of phenotype prediction by including dominance deviations in its estimation. Genotype data (729 068 real or imputed SNPs) and phenotypes on up to 16 traits of 10 191 individuals from Bos taurus, Bos indicus and composite breeds were used. A genome-wide association study was performed by fitting the additive and dominance effects of single SNPs. The dominance variance was estimated by fitting a dominance relationship matrix constructed from the 729 068 SNPs. The accuracy of predicted phenotypic values was evaluated by best linear unbiased prediction using the additive and dominance relationship matrices. Epistatic interactions (additive × additive) were tested between each of the 28 SNPs that are known to have additive effects on multiple traits, and each of the other remaining 729 067 SNPs. The number of significant dominance effects was greater than expected by chance and most of them were in the direction that is presumed to increase fitness and in the opposite direction to inbreeding depression. Estimates of dominance variance explained by SNPs varied widely between traits, but had large standard errors. The median dominance variance across the 16 traits was equal to 5% of the phenotypic variance. Including a dominance deviation in the prediction did not significantly increase its accuracy for any of the phenotypes. 
The number of additive × additive epistatic effects that were statistically significant was greater than expected by chance. Significant dominance and epistatic effects occur for growth, carcass and fertility traits in beef cattle but they are difficult to estimate precisely and including them in phenotype prediction does not increase its accuracy.
Rath, Timo; Tontini, Gian E; Nägel, Andreas; Vieth, Michael; Zopf, Steffen; Günther, Claudia; Hoffman, Arthur; Neurath, Markus F; Neumann, Helmut
2015-10-22
Distal diminutive colorectal polyps are common, and accurate endoscopic prediction of hyperplastic or adenomatous polyp histology could reduce procedural time, costs and potential risks associated with the resection. In this study, we assessed whether digital chromoendoscopy can accurately predict the histology of distal diminutive colorectal polyps according to the ASGE PIVI statement. In this prospective cohort study, 224 consecutive patients undergoing screening or surveillance colonoscopy were included. Real-time histology of 121 diminutive distal colorectal polyps was evaluated using high-definition endoscopy with digital chromoendoscopy, and the accuracy of predicting histology with digital chromoendoscopy was assessed. The overall accuracy of digital chromoendoscopy for prediction of adenomatous polyp histology was 90.1%. Sensitivity, specificity, positive and negative predictive values were 93.3, 88.7, 88.7, and 93.2%, respectively. In high-confidence predictions, the accuracy increased to 96.3%, while sensitivity, specificity, positive and negative predictive values were calculated as 98.1, 94.4, 94.5, and 98.1%, respectively. Surveillance intervals with digital chromoendoscopy were correctly predicted with >90% accuracy. High-definition endoscopy in combination with digital chromoendoscopy allowed real-time in vivo prediction of distal colorectal polyp histology and is accurate enough to leave distal colorectal polyps in place without resection or to resect and discard them without pathologic assessment. This approach has the potential to reduce costs and risks associated with the redundant removal of diminutive colorectal polyps. ClinicalTrials NCT02217449.
Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter
2017-05-12
A better understanding of the genetic architecture of complex traits can contribute to improve genomic prediction. We hypothesized that genomic variants associated with mastitis and milk production traits in dairy cattle are enriched in hepatic transcriptomic regions that are responsive to intra-mammary infection (IMI). Genomic markers [e.g. single nucleotide polymorphisms (SNPs)] from those regions, if included, may improve the predictive ability of a genomic model. We applied a genomic feature best linear unbiased prediction model (GFBLUP) to implement the above strategy by considering the hepatic transcriptomic regions responsive to IMI as genomic features. GFBLUP, an extension of GBLUP, includes a separate genomic effect of SNPs within a genomic feature, and allows differential weighting of the individual marker relationships in the prediction equation. Since GFBLUP is computationally intensive, we investigated whether a SNP set test could be a computationally fast way to preselect predictive genomic features. The SNP set test assesses the association between a genomic feature and a trait based on single-SNP genome-wide association studies. We applied these two approaches to mastitis and milk production traits (milk, fat and protein yield) in Holstein (HOL, n = 5056) and Jersey (JER, n = 1231) cattle. We observed that a majority of genomic features were enriched in genomic variants that were associated with mastitis and milk production traits. Compared to GBLUP, the accuracy of genomic prediction with GFBLUP was marginally improved (3.2 to 3.9%) in within-breed prediction. The highest increase (164.4%) in prediction accuracy was observed in across-breed prediction. The significance of genomic features based on the SNP set test was correlated with changes in prediction accuracy of GFBLUP (P < 0.05).
GFBLUP provides a framework for integrating multiple layers of biological knowledge to provide novel insights into the biological basis of complex traits, and to improve the accuracy of genomic prediction. The SNP set test might be used as a first-step to improve GFBLUP models. Approaches like GFBLUP and SNP set test will become increasingly useful, as the functional annotations of genomes keep accumulating for a range of species and traits.
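The core of the GFBLUP setup described above is a partition of markers into a genomic feature and the remainder, each contributing its own genomic relationship matrix (and, in the full model, its own variance component). A toy sketch of the partition step, assuming VanRaden-style relationships and made-up 0/1/2 genotype codes; this is an illustration, not the authors' implementation:

```python
def grm(genotypes):
    """VanRaden-style genomic relationship matrix G = W W' / (2*sum p*q),
    where W is the column-centred 0/1/2 genotype matrix."""
    n = len(genotypes)
    m = len(genotypes[0])
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    w = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    denom = 2 * sum(pj * (1 - pj) for pj in p) or 1.0
    return [[sum(w[i][k] * w[j][k] for k in range(m)) / denom
             for j in range(n)] for i in range(n)]

# Toy genotypes for 3 animals at 4 SNPs; two SNPs lie in the "feature".
geno = [[0, 1, 2, 1], [2, 1, 0, 0], [1, 2, 1, 2]]
feature_cols = {0, 1}
g_feat = grm([[row[j] for j in sorted(feature_cols)] for row in geno])
g_rest = grm([[row[j] for j in range(4) if j not in feature_cols]
              for row in geno])
```

GFBLUP would then fit the two relationship matrices jointly, letting the data assign more variance to the feature markers when they are genuinely enriched for causal variants.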
Revolutionizing Toxicity Testing For Predicting Developmental Outcomes (DNT4)
Characterizing risk from environmental chemical exposure currently requires extensive animal testing; however, alternative approaches are being researched to increase throughput of chemicals screened, decrease reliance on animal testing, and improve accuracy in predicting adverse...
Li, Jin; Tran, Maggie; Siwabessy, Justy
2016-01-01
Spatially continuous predictions of seabed hardness are important baseline environmental information for sustainable management of Australia's marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy and can be inferred from underwater video footage at limited locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models to predict seabed hardness using random forest (RF) based on the point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods, namely variable importance (VI), averaged variable importance (AVI), knowledge-informed AVI (KIAVI), Boruta and regularized RF (RRF), were tested based on predictive accuracy. Effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach used to pre-select predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of the important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) instead of the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to 'small p and large n' problems in environmental sciences.
Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency, and caution should be taken when applying filter FS methods to select predictive models. PMID:26890307
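Variable-importance feature selection of the kind tested above (VI/AVI) can be illustrated with a minimal permutation-importance sketch. Everything below is invented for illustration, and a 1-nearest-neighbour classifier stands in for the random forest used in the study:

```python
import random

def knn1_predict(train_x, train_y, x):
    """Predict the label of the single nearest training point."""
    d = [(sum((a - b) ** 2 for a, b in zip(row, x)), y)
         for row, y in zip(train_x, train_y)]
    return min(d)[1]

def accuracy(train_x, train_y, test_x, test_y):
    hits = sum(knn1_predict(train_x, train_y, x) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)

def permutation_importance(train_x, train_y, test_x, test_y, col, rng):
    """Drop in accuracy after shuffling one feature column of the test set."""
    base = accuracy(train_x, train_y, test_x, test_y)
    shuffled = [row[:] for row in test_x]
    vals = [row[col] for row in shuffled]
    rng.shuffle(vals)
    for row, v in zip(shuffled, vals):
        row[col] = v
    return base - accuracy(train_x, train_y, shuffled, test_y)

# Toy data: feature 0 separates the classes, feature 1 is noise.
train_x = [[0.0, 0.5], [0.1, 0.9], [1.0, 0.2], [0.9, 0.7]]
train_y = [0, 0, 1, 1]
test_x = [[0.05, 0.4], [0.95, 0.6]]
test_y = [0, 1]
imp_noise = permutation_importance(train_x, train_y, test_x, test_y,
                                   col=1, rng=random.Random(0))
```

Permuting an uninformative column leaves accuracy unchanged (importance 0), while permuting an informative one would typically lower it; AVI-style methods average such importances over repeated runs before ranking predictors.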
Radiomics-based Prognosis Analysis for Non-Small Cell Lung Cancer
NASA Astrophysics Data System (ADS)
Zhang, Yucheng; Oikonomou, Anastasia; Wong, Alexander; Haider, Masoom A.; Khalvati, Farzad
2017-04-01
Radiomics characterizes tumor phenotypes by extracting large numbers of quantitative features from radiological images. Radiomic features have been shown to provide prognostic value in predicting clinical outcomes in several studies. However, several challenges including feature redundancy, unbalanced data, and small sample sizes have led to relatively low predictive accuracy. In this study, we explore different strategies for overcoming these challenges and improving predictive performance of radiomics-based prognosis for non-small cell lung cancer (NSCLC). CT images of 112 patients (mean age 75 years) with NSCLC who underwent stereotactic body radiotherapy were used to predict recurrence, death, and recurrence-free survival using a comprehensive radiomics analysis. Different feature selection and predictive modeling techniques were used to determine the optimal configuration of prognosis analysis. To address feature redundancy, comprehensive analysis indicated that Random Forest models and Principal Component Analysis were optimum predictive modeling and feature selection methods, respectively, for achieving high prognosis performance. To address unbalanced data, Synthetic Minority Over-sampling technique was found to significantly increase predictive accuracy. A full analysis of variance showed that data endpoints, feature selection techniques, and classifiers were significant factors in affecting predictive accuracy, suggesting that these factors must be investigated when building radiomics-based predictive models for cancer prognosis.
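The Synthetic Minority Over-sampling technique mentioned above generates synthetic minority-class samples by interpolation. A rough sketch with invented data; real SMOTE interpolates toward one of the k nearest minority neighbours, whereas this toy version interpolates toward any other minority sample:

```python
import random

def smote_like(minority, n_new, rng):
    """Create n_new synthetic samples by linear interpolation between
    pairs of minority-class samples (simplified SMOTE idea)."""
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        b = rng.choice([m for m in minority if m is not a])
        t = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic

minority = [[1.0, 2.0], [1.5, 1.8], [0.8, 2.2]]
new_samples = smote_like(minority, n_new=4, rng=random.Random(42))
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside its original feature-space envelope rather than simply duplicating points.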
Ngo, L; Ho, H; Hunter, P; Quinn, K; Thomson, A; Pearson, G
2016-02-01
Post-mortem measurements (cold weight, grade and external carcass linear dimensions) as well as live animal data (age, breed, sex) were used to predict ovine primal and retail cut weights for 792 lamb carcases. Significant levels of variance could be explained using these predictors. The predictive power of those measurements on primal and retail cut weights was studied by using the results from principal component analysis and the absolute value of the t-statistics of the linear regression model. High prediction accuracy for primal cut weight was achieved (adjusted R(2) up to 0.95), as well as moderate accuracy for key retail cut weight: tenderloins (adj-R(2)=0.60), loin (adj-R(2)=0.62), French rack (adj-R(2)=0.76) and rump (adj-R(2)=0.75). The carcass cold weight had the best predictive power, with the accuracy increasing by around 10% after including the next three most significant variables. Copyright © 2015 Elsevier Ltd. All rights reserved.
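The adjusted R-squared values quoted above penalize raw R-squared for the number of predictors p relative to the sample size n. A minimal sketch of the standard formula (the example numbers are illustrative):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n samples."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# e.g. a raw R^2 of 0.95 with 792 carcases and 4 predictors
adj = adjusted_r2(0.95, n=792, p=4)
```

With n large relative to p the penalty is tiny; with small samples and many predictors the adjustment can pull R-squared down substantially.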
Mancuso, Renzo; Osta, Rosario; Navarro, Xavier
2014-12-01
We assessed the predictive value of electrophysiological tests as a marker of clinical disease onset and survival in superoxide-dismutase 1 (SOD1)(G93A) mice. We evaluated the accuracy of electrophysiological tests in differentiating transgenic versus wild-type mice. We made a correlation analysis of electrophysiological parameters and the onset of symptoms, survival, and number of spinal motoneurons. Presymptomatic electrophysiological tests show great accuracy in differentiating transgenic versus wild-type mice, with the most sensitive parameter being the tibialis anterior compound muscle action potential (CMAP) amplitude. The CMAP amplitude at age 10 weeks correlated significantly with clinical disease onset and survival. Electrophysiological tests increased their survival prediction accuracy when evaluated at later stages of the disease and also predicted the amount of lumbar spinal motoneuron preservation. Electrophysiological tests predict clinical disease onset, survival, and spinal motoneuron preservation in SOD1(G93A) mice. This is a methodological improvement for preclinical studies. © 2014 Wiley Periodicals, Inc.
Increased genomic prediction accuracy in wheat breeding using a large Australian panel.
Norman, Adam; Taylor, Julian; Tanaka, Emi; Telfer, Paul; Edwards, James; Martinant, Jean-Pierre; Kuchel, Haydn
2017-12-01
Genomic prediction accuracy within a large panel was found to be substantially higher than that previously observed in smaller populations, and also higher than QTL-based prediction. In recent years, genomic selection for wheat breeding has been widely studied, but this has typically been restricted to population sizes under 1000 individuals. To assess its efficacy in germplasm representative of commercial breeding programmes, we used a panel of 10,375 Australian wheat breeding lines to investigate the accuracy of genomic prediction for grain yield, physical grain quality and other physiological traits. To achieve this, the complete panel was phenotyped in a dedicated field trial and genotyped using a custom Affymetrix Axiom SNP array. A high-quality consensus map was also constructed, allowing the linkage disequilibrium present in the germplasm to be investigated. Using the complete SNP array, genomic prediction accuracies were found to be substantially higher than those previously observed in smaller populations, and also higher than those of prediction approaches using a finite number of selected quantitative trait loci. Multi-trait genetic correlations were also assessed at an additive and residual genetic level, identifying a negative genetic correlation between grain yield and protein as well as a positive genetic correlation between grain size and test weight.
In general, the accuracy of a predicted toxicity value increases with increase in similarity between the query chemical and the chemicals used to develop a QSAR model. A toxicity estimation methodology employing this finding has been developed. A hierarchical based clustering t...
Data Prediction for Public Events in Professional Domains Based on Improved RNN- LSTM
NASA Astrophysics Data System (ADS)
Song, Bonan; Fan, Chunxiao; Wu, Yuexin; Sun, Juanjuan
2018-02-01
The traditional data services for prediction of emergency or non-periodic events usually cannot generate satisfying results. However, such events are influenced by external causes, which means that certain a priori information about them can generally be collected through the Internet. This paper studied the above problems and proposed an improved model: an LSTM (Long Short-Term Memory) dynamic prediction and a priori information sequence generation model, combining RNN-LSTM with a priori information about public events. In prediction tasks, the model is qualified for determining trends, and its accuracy was validated. It generates better performance and prediction results than the previous model. Using a priori information can increase the accuracy of prediction; LSTM can better adapt to changes in a time sequence; and LSTM can be widely applied to the same type of prediction task as well as other prediction tasks related to time sequences.
Predictors of change in depressive symptoms from preschool to first grade.
Reinfjell, Trude; Kårstad, Silja Berg; Berg-Nielsen, Turid Suzanne; Luby, Joan L; Wichstrøm, Lars
2016-11-01
Children's depressive symptoms in the transition from preschool to school are rarely investigated. We therefore tested whether children's temperament (effortful control and negative affect), social skills, child psychopathology, environmental stressors (life events), parental accuracy of predicting their child's emotion understanding (parental accuracy), parental emotional availability, and parental depression predict changes in depressive symptoms from preschool to first grade. Parents of a community sample of 995 4-year-olds were interviewed using the Preschool Age Psychiatric Assessment. The children and parents were reassessed when the children started first grade (n = 795). The results showed that DSM-5 defined depressive symptoms increased. Child temperamental negative affect and parental depression predicted increased, whereas social skills predicted decreased, depressive symptoms. However, such social skills were only protective among children with low and medium effortful control. Further, high parental accuracy proved protective among children with low effortful control and high negative affect. Thus, interventions that treat parental depression may be important for young children. Children with low effortful control and high negative affect may especially benefit from having parents who accurately perceive their emotional understanding. Efforts to enhance social skills may prove particularly important for children with low or medium effortful control.
Poore, Joshua C; Forlines, Clifton L; Miller, Sarah M; Regan, John R; Irvine, John M
2014-12-01
The decision sciences are increasingly challenged to advance methods for modeling analysts, accounting for both analytic strengths and weaknesses, to improve inferences taken from increasingly large and complex sources of data. We examine whether psychometric measures-personality, cognitive style, motivated cognition-predict analytic performance and whether psychometric measures are competitive with aptitude measures (i.e., SAT scores) as analyst sample selection criteria. A heterogeneous, national sample of 927 participants completed an extensive battery of psychometric measures and aptitude tests and was asked 129 geopolitical forecasting questions over the course of 1 year. Factor analysis reveals four dimensions among psychometric measures; dimensions characterized by differently motivated "top-down" cognitive styles predicted distinctive patterns in aptitude and forecasting behavior. These dimensions were not better predictors of forecasting accuracy than aptitude measures. However, multiple regression and mediation analysis reveals that these dimensions influenced forecasting accuracy primarily through bias in forecasting confidence. We also found that these facets were competitive with aptitude tests as forecast sampling criteria designed to mitigate biases in forecasting confidence while maximizing accuracy. These findings inform the understanding of individual difference dimensions at the intersection of analytic aptitude and demonstrate that they wield predictive power in applied, analytic domains.
Legarra, A; Baloche, G; Barillet, F; Astruc, J M; Soulas, C; Aguerre, X; Arrese, F; Mintegi, L; Lasarte, M; Maeztu, F; Beltrán de Heredia, I; Ugarte, E
2014-05-01
Genotypes, phenotypes and pedigrees of 6 breeds of dairy sheep (including subdivisions of Latxa, Manech, and Basco-Béarnaise) from the Western Pyrenees of Spain and France were used to estimate genetic relationships across breeds (together with genotypes from the Lacaune dairy sheep) and to verify by forward cross-validation single-breed or multiple-breed genetic evaluations. The number of rams genotyped fluctuated between 100 and 1,300 but generally represented the last 10 cohorts of progeny-tested rams within each breed. Genetic relationships were assessed by principal components analysis of the genomic relationship matrices and also by the conservation of linkage disequilibrium patterns at given physical distances in the genome. Genomic and pedigree-based evaluations used daughter yield performances of all rams, although some of them were not genotyped; a pseudo-single-step method was used in this case for genomic predictions. Results showed a clear structure in blond and black breeds for Manech and Latxa, reflecting historical exchanges, and isolation of Basco-Béarnaise and Lacaune. Relatedness between any 2 breeds was, however, lower than expected. Single-breed genomic predictions had accuracies comparable with other breeds of dairy sheep or small breeds of dairy cattle. They were more accurate than pedigree predictions for 5 out of 6 breeds, with absolute increases in accuracy ranging from 0.05 to 0.30 points, and were significantly better, as assessed by bootstrapping of candidates, for 2 of the breeds. Predictions using multiple populations only marginally increased the accuracy for a couple of breeds. Pooling populations does not increase the accuracy of genomic evaluations in dairy sheep; however, single-breed genomic predictions are more accurate, even for small breeds, and make the consideration of genomic schemes in dairy sheep interesting. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Analysis of near infrared spectra for age-grading of wild populations of Anopheles gambiae.
Krajacich, Benjamin J; Meyers, Jacob I; Alout, Haoues; Dabiré, Roch K; Dowell, Floyd E; Foy, Brian D
2017-11-07
Understanding the age structure of mosquito populations, especially malaria vectors such as Anopheles gambiae, is important for assessing the risk of infectious mosquitoes and how vector control interventions may affect this risk. The use of near-infrared spectroscopy (NIRS) for age-grading has been demonstrated previously on laboratory and semi-field mosquitoes, but to date it has not been applied to wild-caught mosquitoes whose age is externally validated via parity status or parasite infection stage. In this study, we developed regression and classification models using NIRS on datasets of wild An. gambiae (s.l.) reared from larvae collected from the field in Burkina Faso, and on two laboratory strains. We compared the accuracy of these models for predicting the ages of wild-caught mosquitoes that had been scored for their parity status as well as for positivity for Plasmodium sporozoites. Regression models utilizing variable selection increased predictive accuracy over the more common full-spectrum partial least squares (PLS) approach for cross-validation of the datasets, validation, and independent test sets. Models produced from datasets that included the greatest range of mosquito samples (i.e. different sampling locations and times) had the highest predictive accuracy on independent test sets, though overall accuracy on these samples was low. For classification, intramodel accuracy ranged from 73.5 to 97.0% for grouping mosquitoes into "early" and "late" age classes, with the highest prediction accuracy found in laboratory-colonized mosquitoes. However, this accuracy decreased on test sets, with the highest classification accuracy on an independent set of wild-caught larvae reared to set ages being 69.6%. Variation in NIRS data, likely from dietary, genetic, and other factors, limits the accuracy of this technique with wild-caught mosquitoes.
Alternative algorithms may help improve prediction accuracy, but care should be taken to either maximize variety in models or minimize confounders.
Accuracy of active chirp linearization for broadband frequency modulated continuous wave ladar.
Barber, Zeb W; Babbitt, Wm Randall; Kaylor, Brant; Reibel, Randy R; Roos, Peter A
2010-01-10
As the bandwidth and linearity of frequency-modulated continuous-wave chirp ladar increase, the resulting range resolution, precision, and accuracy improve correspondingly. An analysis of a very broadband (several THz) and linear (<1 ppm) chirped ladar system based on active chirp linearization is presented. Residual chirp nonlinearity and material dispersion are analyzed for their effect on the dynamic range, precision, and accuracy of the system. Measurement precision and accuracy approaching the part-per-billion level are predicted.
Development of machine learning models for diagnosis of glaucoma.
Kim, Seong Jae; Cho, Kyong Jin; Oh, Sejong
2017-01-01
The study aimed to develop machine learning models that have strong prediction power and interpretability for diagnosis of glaucoma based on retinal nerve fiber layer (RNFL) thickness and visual field (VF). We collected various candidate features from examinations of RNFL thickness and VF, and developed synthesized features from the original features. We then selected the features best suited for classification (diagnosis) through feature evaluation. We used 100 cases of data as a test dataset and 399 cases of data as a training and validation dataset. To develop the glaucoma prediction model, we considered four machine learning algorithms: C5.0, random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN). We repeatedly composed learning models using the training dataset, evaluated them using the validation dataset, and selected the learning model with the highest validation accuracy. We analyzed the quality of the models using several measures. The random forest model shows the best performance, while the C5.0, SVM, and KNN models show similar accuracy. In the random forest model, the classification accuracy is 0.98, sensitivity is 0.983, specificity is 0.975, and AUC is 0.979. The developed prediction models show high accuracy, sensitivity, specificity, and AUC in classifying glaucomatous and healthy eyes, and can be used to predict glaucoma from unseen examination records. Clinicians may reference the prediction results to make better decisions, and multiple learning models may be combined to increase prediction accuracy. The C5.0 model includes decision rules for prediction, which can be used to explain the reasons for specific predictions.
Ahnlide, I; Zalaudek, I; Nilsson, F; Bjellerup, M; Nielsen, K
2016-10-01
Prediction of the histopathological subtype of basal cell carcinoma (BCC) is important for tailoring optimal treatment, especially in patients with suspected superficial BCC (sBCC). To assess the accuracy of the preoperative prediction of subtypes of BCC in clinical practice, to evaluate whether dermoscopic examination enhances accuracy and to find dermoscopic criteria for discriminating sBCC from other subtypes. The main presurgical diagnosis was compared with the histopathological, postoperative diagnosis of routinely excised skin tumours in a predominantly fair-skinned patient cohort of northern Europe during a study period of 3 years (2011-13). The study period was split in two: during period 1, dermoscopy was optional (850 cases with a pre- or postoperative diagnosis of BCC), while during period 2 (after an educational dermoscopic update) dermoscopy was mandatory (651 cases). A classification tree based on clinical and dermoscopic features for prediction of sBCC was applied. For a total of 3544 excised skin tumours, the sensitivity for the diagnosis of BCC (any subtype) was 93·3%, specificity 91·8%, and the positive predictive value (PPV) 89·0%. The diagnostic accuracy as well as the PPV and the positive likelihood ratio for sBCC were significantly higher when dermoscopy was mandatory. A flat surface and multiple small erosions predicted sBCC. The study shows a high accuracy for an overall diagnosis of BCC and increased accuracy in prediction of sBCC for the period when dermoscopy was applied in all cases. The most discriminating findings for sBCC, based on clinical and dermoscopic features in this fair-skinned population, were a flat surface and multiple small erosions. © 2016 British Association of Dermatologists.
Children's Memories for Painful Cancer Treatment Procedures: Implications for Distress.
ERIC Educational Resources Information Center
Chen, Edith; Zeltzer, Lonnie K.; Craske, Michelle G.; Katz, Ernest R.
2000-01-01
Examined memory of 3- to 18-year-olds with leukemia regarding lumbar punctures (LP). Found that children displayed considerable accuracy for event details, with accuracy increasing with age. Use of Versed (anxiolytic medication described as a "memory blocker") was not related to recall. Higher distress predicted greater exaggerations in…
Systematic bias of correlation coefficient may explain negative accuracy of genomic prediction.
Zhou, Yao; Vales, M Isabel; Wang, Aoxue; Zhang, Zhiwu
2017-09-01
Accuracy of genomic prediction is commonly calculated as the Pearson correlation coefficient between the predicted and observed phenotypes in the inference population by using cross-validation analysis. More frequently than expected, significant negative accuracies of genomic prediction have been reported in genomic selection studies. These negative values are surprising, given that the minimum value for prediction accuracy should hover around zero when randomly permuted data sets are analyzed. We reviewed the two common approaches for calculating the Pearson correlation and hypothesized that these negative accuracy values reflect potential bias owing to artifacts caused by the mathematical formulas used to calculate prediction accuracy. The first approach, Instant accuracy, calculates correlations for each fold and reports prediction accuracy as the mean of correlations across folds. The other approach, Hold accuracy, predicts all phenotypes in all folds and calculates the correlation between the observed and predicted phenotypes at the end of the cross-validation process. Using simulated and real data, we demonstrated that our hypothesis is true. Both approaches are biased downward under certain conditions. The biases become larger when more folds are employed and when the expected accuracy is low. The bias of Instant accuracy can be corrected using a modified formula. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
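The two accuracy definitions contrasted above can be written out directly. A sketch with made-up fold data; "Instant" averages per-fold correlations, while "Hold" pools all out-of-fold predictions before computing a single correlation:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def instant_accuracy(obs_folds, pred_folds):
    """Mean of the per-fold correlations ("Instant" accuracy)."""
    rs = [pearson(o, p) for o, p in zip(obs_folds, pred_folds)]
    return sum(rs) / len(rs)

def hold_accuracy(obs_folds, pred_folds):
    """One correlation over all pooled out-of-fold predictions ("Hold")."""
    obs = [v for fold in obs_folds for v in fold]
    pred = [v for fold in pred_folds for v in fold]
    return pearson(obs, pred)

# Invented observed/predicted phenotypes in two folds.
obs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
pred = [[1.2, 1.9, 3.1], [3.8, 5.2, 6.1]]
ia = instant_accuracy(obs, pred)
ha = hold_accuracy(obs, pred)
```

Per-fold correlations are computed from far fewer points than the pooled correlation, which is one reason the paper finds Instant accuracy more prone to downward bias as the number of folds grows.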
Genomic Prediction Accounting for Residual Heteroskedasticity
Ou, Zhining; Tempelman, Robert J.; Steibel, Juan P.; Ernst, Catherine W.; Bates, Ronald O.; Bello, Nora M.
2015-01-01
Whole-genome prediction (WGP) models that use single-nucleotide polymorphism marker information to predict genetic merit of animals and plants typically assume homogeneous residual variance. However, variability is often heterogeneous across agricultural production systems and may subsequently bias WGP-based inferences. This study extends classical WGP models based on normality, heavy-tailed specifications and variable selection to explicitly account for environmentally-driven residual heteroskedasticity under a hierarchical Bayesian mixed-models framework. WGP models assuming homogeneous or heterogeneous residual variances were fitted to training data generated under simulation scenarios reflecting a gradient of increasing heteroskedasticity. Model fit was based on pseudo-Bayes factors and also on prediction accuracy of genomic breeding values computed on a validation data subset one generation removed from the simulated training dataset. Homogeneous vs. heterogeneous residual variance WGP models were also fitted to two quantitative traits, namely 45-min postmortem carcass temperature and loin muscle pH, recorded in a swine resource population dataset prescreened for high and mild residual heteroskedasticity, respectively. Fit of competing WGP models was compared using pseudo-Bayes factors. Predictive ability, defined as the correlation between predicted and observed phenotypes in validation sets of a five-fold cross-validation was also computed. Heteroskedastic error WGP models showed improved model fit and enhanced prediction accuracy compared to homoskedastic error WGP models although the magnitude of the improvement was small (less than two percentage points net gain in prediction accuracy). Nevertheless, accounting for residual heteroskedasticity did improve accuracy of selection, especially on individuals of extreme genetic merit. PMID:26564950
ERIC Educational Resources Information Center
Owen, Amanda J.
2010-01-01
Purpose: The author examined the influence of sentence type, clause order, and verb transitivity on the accuracy of children's past tense productions. All groups of children, but especially children with specific language impairment (SLI), were predicted to decrease accuracy as linguistic complexity increased. Method: The author elicited past…
Lopes, F B; Wu, X-L; Li, H; Xu, J; Perkins, T; Genho, J; Ferretti, R; Tait, R G; Bauck, S; Rosa, G J M
2018-02-01
Reliable genomic prediction of breeding values for quantitative traits requires a sufficient number of animals with genotypes and phenotypes in the training set. As of 31 October 2016, there were 3,797 Brangus animals with genotypes and phenotypes. These Brangus animals were genotyped using different commercial SNP chips. Of them, the largest group consisted of 1,535 animals genotyped with the GGP-LDV4 SNP chip. The remaining 2,262 genotypes were imputed to the SNP content of the GGP-LDV4 chip, so that the number of animals available for training the genomic prediction models was more than doubled. The present study showed that pooling animals with original or imputed 40K SNP genotypes substantially increased genomic prediction accuracies on the ten traits. By supplementing imputed genotypes, the relative gains in genomic prediction accuracy on estimated breeding values (EBV) ranged from 12.60% to 31.27%, and the relative gains in accuracy on de-regressed EBV were somewhat smaller (0.87% to 18.75%). The present study also compared the performance of five genomic prediction models and two cross-validation methods. The five genomic models predicted EBV and de-regressed EBV of the ten traits similarly well. Of the two cross-validation methods, leave-one-out cross-validation maximized the number of animals available for training. Genomic prediction accuracy (GPA) on the ten quantitative traits was validated in 1,106 newly genotyped Brangus animals based on the SNP effects estimated in the previous set of 3,797 Brangus animals, and the accuracies were slightly lower than in the original data. The present study was the first to leverage currently available genotype and phenotype resources in order to harness genomic prediction in Brangus beef cattle. © 2018 Blackwell Verlag GmbH.
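The leave-one-out cross-validation scheme mentioned above trains on all animals but one and predicts the one held out, maximizing training-set size at the cost of n model fits. A toy sketch with invented data, where a simple least-squares line stands in for the genomic model:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def loocv_predictions(xs, ys):
    """Predict each observation from a model trained on all the others."""
    preds = []
    for i in range(len(xs)):
        tx = xs[:i] + xs[i + 1:]
        ty = ys[:i] + ys[i + 1:]
        a, b = fit_line(tx, ty)
        preds.append(a + b * xs[i])
    return preds

# Invented predictor/phenotype pairs.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
preds = loocv_predictions(xs, ys)
```

Prediction accuracy would then be the correlation between `preds` and `ys`; because every animal is predicted exactly once from the largest possible training set, leave-one-out is attractive when, as here, genotyped animals are scarce.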
Increased Accuracy of Ligand Sensing by Receptor Internalization and Lateral Receptor Diffusion
NASA Astrophysics Data System (ADS)
Aquino, Gerardo; Endres, Robert
2010-03-01
Many types of cells can sense external ligand concentrations with cell-surface receptors at extremely high accuracy. Interestingly, ligand-bound receptors are often internalized, a process also known as receptor-mediated endocytosis. While internalization is involved in a vast number of functions important for the life of a cell, it was recently suggested also to increase the accuracy of sensing ligand, since overcounting of the same ligand molecules is reduced. A similar role may be played by receptor diffusion on the cell membrane. Fast, lateral receptor diffusion is known to be relevant in neurotransmission initiated by release of the neurotransmitter glutamate in the synaptic cleft between neurons. By binding ligand and then moving by diffusion away from the region of neurotransmitter release, diffusing receptors can reasonably be expected to reduce local overcounting of the same ligand molecules in the region of signaling. By extending simple ligand-receptor models to out-of-equilibrium thermodynamics, we show that both receptor internalization and lateral diffusion increase the accuracy with which cells can measure external ligand concentrations. We give quantitative predictions for experimental parameter values, which compare favorably to data from real receptors.
Genomic Prediction of Testcross Performance in Canola (Brassica napus)
Jan, Habib U.; Abbadi, Amine; Lücke, Sophie; Nichols, Richard A.; Snowdon, Rod J.
2016-01-01
Genomic selection (GS) is a modern breeding approach in which genome-wide single-nucleotide polymorphism (SNP) marker profiles are used to estimate the performance of untested genotypes. In this study, the potential of genomic selection methods to predict testcross performance for hybrid canola breeding was assessed for various agronomic traits based on genome-wide marker profiles. A total of 475 genetically diverse spring-type canola pollinator lines were genotyped at 24,403 single-copy, genome-wide SNP loci. In parallel, the 950 F1 testcross combinations between the pollinators and two representative testers were evaluated for a number of important agronomic traits, including seedling emergence, days to flowering, lodging, oil yield, and seed yield, along with essential seed quality characters including seed oil content and seed glucosinolate content. A ridge-regression best linear unbiased prediction (RR-BLUP) model was applied in combination with 500 cross-validations for each trait to predict testcross performance, both across the whole population and within individual subpopulations or clusters, based solely on SNP profiles. Subpopulations were determined using multidimensional scaling and K-means clustering. Genomic prediction accuracy across the whole population was highest for seed oil content (0.81), followed by oil yield (0.75), and lowest for seedling emergence (0.29). For seed yield, seed glucosinolate, lodging resistance, and days to onset of flowering (DTF), prediction accuracies were 0.45, 0.61, 0.39, and 0.56, respectively. Prediction accuracies could be increased for some traits by treating subpopulations separately, a strategy which only led to moderate improvements for some traits with low heritability, like seedling emergence. No useful or consistent increase in accuracy was obtained by inclusion of a population substructure covariate in the model.
Testcross performance prediction using genome-wide SNP markers shows considerable potential for pre-selection of promising hybrid combinations prior to resource-intensive field testing over multiple locations and years. PMID:26824924
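The RR-BLUP with repeated random cross-validation described above can be sketched as follows; the study ran 500 cross-validations per trait, while this toy version runs 50. The simulated lines, marker counts, and ridge parameter are assumptions for illustration, not the canola data:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy stand-in for the canola panel: lines x SNP markers, one trait.
n, m = 120, 400
X = rng.integers(0, 3, size=(n, m)).astype(float)
y = X @ rng.normal(0, 0.1, m) + rng.normal(0, 1, n)

def rrblup_predict(Xtr, ytr, Xte, lam=50.0):
    """RR-BLUP: ridge regression on marker genotypes."""
    b = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1]), Xtr.T @ ytr)
    return Xte @ b

# Repeated random 80/20 splits; prediction accuracy is the correlation
# between predicted and observed values in the held-out lines.
accs = []
for _ in range(50):
    perm = rng.permutation(n)
    tr, te = perm[:96], perm[96:]
    accs.append(float(np.corrcoef(rrblup_predict(X[tr], y[tr], X[te]), y[te])[0, 1]))

print(round(float(np.mean(accs)), 2))
```

Averaging accuracy over many random splits, as here, is what makes the per-trait accuracies (e.g. 0.81 for seed oil content) stable estimates rather than artifacts of one split.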
Short-arc measurement and fitting based on the bidirectional prediction of observed data
NASA Astrophysics Data System (ADS)
Fei, Zhigen; Xu, Xiaojie; Georgiadis, Anthimos
2016-02-01
Measuring a short arc is a notoriously difficult problem. In this study, a bidirectional prediction method based on the Radial Basis Function Neural Network (RBFNN) is proposed for the observed data distributed along a short arc, in order to increase the corresponding arc length and thus improve its fitting accuracy. Firstly, the rationality of regarding observed data as a time series is discussed in accordance with the definition of a time series. Secondly, the RBFNN is constructed to predict the observed data, where an interpolation method is used to enlarge the size of the training examples and thereby improve the learning accuracy of the RBFNN's parameters. Finally, in the numerical simulation section, we focus on simulating how the size of the training sample and the noise level influence the learning error and prediction error of the built RBFNN. Observed data from a 5° short arc are used to evaluate the performance of the Hyper method, known as the ‘unbiased fitting method of circle’, at different noise levels before and after prediction. A number of simulation experiments reveal that the fitting stability and accuracy of the Hyper method after prediction are far superior to those before prediction.
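Why short arcs are hard can be illustrated with a simple algebraic (Kåsa) circle fit, used here as a stand-in for the Hyper fit evaluated in the paper; the unit circle, noise level, and arc spans are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def kasa_circle_fit(x, y):
    """Algebraic (Kasa) circle fit: solve x^2 + y^2 = D*x + E*y + F
    by linear least squares; the centre is (D/2, E/2)."""
    A = np.column_stack([x, y, np.ones_like(x)])
    D, E, F = np.linalg.lstsq(A, x**2 + y**2, rcond=None)[0]
    cx, cy = D / 2, E / 2
    return cx, cy, np.sqrt(F + cx**2 + cy**2)

# Points on a true unit circle: a 90-degree arc is fitted accurately,
# while a 5-degree arc is highly sensitive to the same small noise,
# because its sagitta is comparable to the noise level.
results = {}
for span_deg in (90, 5):
    t = np.deg2rad(np.linspace(0, span_deg, 200))
    x = np.cos(t) + rng.normal(0, 1e-3, t.size)
    y = np.sin(t) + rng.normal(0, 1e-3, t.size)
    results[span_deg] = kasa_circle_fit(x, y)[2]
    print(span_deg, round(float(results[span_deg]), 3))
```

Extending the effective arc by prediction, as the paper proposes, attacks exactly this ill-conditioning.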
USDA-ARS's Scientific Manuscript database
Small reference populations limit the accuracy of genomic prediction in numerically small breeds, such as the Danish Jersey. The objective of this study was to investigate two approaches to improve genomic prediction by increasing the size of the reference population for Danish Jerseys. The first ap...
Blum, Meike; Distl, Ottmar
2014-01-01
In the present study, breeding values for canine congenital sensorineural deafness, the presence of blue eyes, and patches were predicted using multivariate animal models to test the reliability of the breeding values for planned matings. The dataset consisted of 6669 German Dalmatian dogs born between 1988 and 2009. Data were provided by the Dalmatian kennel clubs that are members of the German Association for Dog Breeding and Husbandry (VDH). The hearing status of all dogs was evaluated using brainstem auditory evoked potentials. The reliability based on the prediction error variance of breeding values and the realized reliability of the prediction of the phenotype of future progeny born in each year between 2006 and 2009 were used as parameters to evaluate the goodness of prediction through breeding values. All animals from the previous birth years were used for prediction of the breeding values of the progeny in each of the upcoming birth years. The breeding values based on pedigree records achieved an average reliability of 0.19 for the future 1951 progeny. The predictive accuracy (R2) for the hearing status of a single future progeny was 1.3%. Combining breeding values for littermates increased the predictive accuracy to 3.5%. Corresponding values for maternal and paternal half-sib groups were 3.2% and 7.3%. The use of breeding values for planned matings increases the phenotypic selection response over mass selection. The breeding values of sires may be used for planned matings because reliabilities and predictive accuracies for future paternal progeny groups were highest.
Hu, Chen; Steingrimsson, Jon Arni
2018-01-01
A crucial component of making individualized treatment decisions is to accurately predict each patient's disease risk. In clinical oncology, disease risks are often measured through time-to-event data, such as overall survival and progression/recurrence-free survival, and are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple interpretable tree structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly use both traditional demographic and tumor pathological factors as well as high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specifications, the choice of tuning parameters, and the degrees of missing covariates.
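Prediction accuracy for censored time-to-event outcomes like these is commonly summarized by Harrell's concordance index; a minimal sketch (the small survival data set below is invented for illustration):

```python
import numpy as np

def c_index(time, event, risk):
    """Harrell's concordance index: over usable pairs (those where the
    earlier observed time is an event), the fraction in which the
    higher predicted risk belongs to the subject who failed earlier."""
    num = den = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i] == 1:
                den += 1
                if risk[i] > risk[j]:
                    num += 1
                elif risk[i] == risk[j]:
                    num += 0.5   # ties in predicted risk count half
    return num / den

# Invented example: predicted risks perfectly ordered against times.
time = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
event = np.array([1, 1, 0, 1, 0])   # 0 = censored
risk = np.array([0.9, 0.7, 0.6, 0.4, 0.1])
print(c_index(time, event, risk))  # -> 1.0
```

The same index applies whether the risk scores come from a Cox model, a single survival tree, or an averaged forest, which is what makes comparisons like the article's simulations possible.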
Advanced turboprop noise prediction based on recent theoretical results
NASA Technical Reports Server (NTRS)
Farassat, F.; Padula, S. L.; Dunn, M. H.
1987-01-01
The development of a high-speed propeller noise prediction code at Langley Research Center is described. The code utilizes two recent acoustic formulations in the time domain for subsonic and supersonic sources. The structure and capabilities of the code are discussed. A grid-size study of accuracy and execution speed is also presented. The code is tested against an earlier Langley code; a considerable increase in accuracy and speed of execution is observed. Some examples of noise prediction for a high-speed propeller for which acoustic test data are available are given. A brief derivation of the formulations used is given in an appendix.
Jia, Cang-Zhi; He, Wen-Ying; Yao, Yu-Hua
2017-03-01
Hydroxylation of proline or lysine residues in proteins is a common post-translational modification event, and such modifications are found in many physiological and pathological processes. Nonetheless, the exact molecular mechanism of hydroxylation remains under investigation. Because experimental identification of hydroxylation is time-consuming and expensive, bioinformatics tools with high accuracy represent desirable alternatives for large-scale rapid identification of protein hydroxylation sites. In view of this, we developed a support vector machine-based tool, OH-PRED, for the prediction of protein hydroxylation sites using the adapted normal distribution bi-profile Bayes feature extraction in combination with the physicochemical property indexes of the amino acids. In a jackknife cross-validation, OH-PRED yields an accuracy of 91.88% and a Matthews correlation coefficient (MCC) of 0.838 for the prediction of hydroxyproline sites, and an accuracy of 97.42% and an MCC of 0.949 for the prediction of hydroxylysine sites. These results demonstrate that OH-PRED significantly increased the prediction accuracy for hydroxyproline and hydroxylysine sites, by 7.37 and 14.09% respectively, when compared with the latest predictor PredHydroxy. In independent tests, OH-PRED also outperforms previously published methods.
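The core bi-profile Bayes idea of representing a residue window by its per-position probabilities under positive and negative training profiles can be sketched as follows. The peptide windows and window length are invented, and this omits the normal-distribution adaptation and the physicochemical indexes that OH-PRED adds:

```python
import numpy as np

# Invented toy data: fixed-length peptide windows, labelled
# hydroxylated (pos) or not (neg).
pos = ["APGPA", "GPGPA", "APGQA"]
neg = ["LKVLI", "VKKLL", "LLVKI"]
AA = "ACDEFGHIKLMNPQRSTVWY"
idx = {a: i for i, a in enumerate(AA)}
L = 5

def position_profile(seqs):
    """Per-position amino-acid frequencies with add-one smoothing."""
    counts = np.ones((L, len(AA)))
    for s in seqs:
        for j, a in enumerate(s):
            counts[j, idx[a]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

Ppos, Pneg = position_profile(pos), position_profile(neg)

def bpb_features(seq):
    """Bi-profile Bayes features: each residue's probability under the
    positive profile and under the negative profile, concatenated."""
    f_pos = [Ppos[j, idx[a]] for j, a in enumerate(seq)]
    f_neg = [Pneg[j, idx[a]] for j, a in enumerate(seq)]
    return np.array(f_pos + f_neg)

f = bpb_features("APGPA")
print(f.shape)
```

The resulting 2L-dimensional vectors would then be fed to an SVM or similar classifier.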
Gullick, Margaret M; Wolford, George
2013-01-01
We examined the brain activity underlying the development of our understanding of negative numbers, which are amounts lacking direct physical counterparts. Children performed a paired comparison task with positive and negative numbers during an fMRI session. As previously shown in adults, both pre-instruction fifth-graders and post-instruction seventh-graders demonstrated typical behavioral and neural distance effects to negative numbers, where response times and parietal and frontal activity increased as comparison distance decreased. We then determined the factors impacting the distance effect in each age group. Behaviorally, the fifth-grader distance effect for negatives was significantly predicted only by positive comparison accuracy, indicating that children who were generally better at working with numbers were better at comparing negatives. In seventh-graders, negative number comparison accuracy significantly predicted their negative number distance effect, indicating that children who were better at working with negative numbers demonstrated a more typical distance effect. Across children, as age increased, the negative number distance effect increased in the bilateral IPS and decreased frontally, indicating a frontoparietal shift consistent with previous numerical development literature. In contrast, as negative comparison task accuracy increased, the parietal distance effect increased in the left IPS and decreased in the right, possibly indicating a change from an approximate understanding of negatives' values to a more exact, precise representation (particularly supported by the left IPS) with increasing expertise. These shifts separately indicate the effects of increasing maturity generally in numeric processing and specifically in negative number understanding.
Improving prediction accuracy of cooling load using EMD, PSR and RBFNN
NASA Astrophysics Data System (ADS)
Shen, Limin; Wen, Yuanmei; Li, Xiaohong
2017-08-01
To increase the accuracy of cooling load demand prediction, this work presents an EMD (empirical mode decomposition)-PSR (phase space reconstruction) based RBFNN (radial basis function neural network) method. Firstly, we analyzed the chaotic nature of real cooling load demand and transformed the non-stationary cooling load historical data into several stationary intrinsic mode functions (IMFs) using EMD. Secondly, we compared the RBFNN prediction accuracies of the individual IMFs and proposed an IMF combining scheme that merges the lower-frequency components (IMF4-IMF6) while keeping the higher-frequency components (IMF1, IMF2, IMF3) and the residual unchanged. Thirdly, we reconstructed the phase space for each combined component separately, processed the highest-frequency component (IMF1) by a differential method, and predicted with RBFNN in the reconstructed phase spaces. Real cooling load data from a centralized ice storage cooling system in Guangzhou are used for simulation. The results show that the proposed hybrid method outperforms the traditional methods.
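The phase space reconstruction step can be sketched as a time-delay embedding followed by a Gaussian RBF network. The synthetic periodic series, embedding dimension, delay, and kernel width below are illustrative assumptions, and the EMD decomposition stage is omitted:

```python
import numpy as np

def embed(series, dim, tau):
    """Phase space reconstruction: time-delay embedding of a scalar
    series; returns state vectors and their next-step targets."""
    n = len(series) - (dim - 1) * tau - 1
    X = np.column_stack([series[i * tau: i * tau + n] for i in range(dim)])
    y = series[(dim - 1) * tau + 1: (dim - 1) * tau + 1 + n]
    return X, y

def rbf_fit_predict(Xtr, ytr, Xte, gamma=1.0, lam=1e-3):
    """Gaussian RBF network with centres at all training states."""
    def K(A, B):
        return np.exp(-gamma * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    w = np.linalg.solve(K(Xtr, Xtr) + lam * np.eye(len(Xtr)), ytr)
    return K(Xte, Xtr) @ w

# Synthetic periodic 'load' series standing in for real cooling load data.
t = np.arange(400)
s = np.sin(2 * np.pi * t / 50) + 0.05 * np.random.default_rng(2).normal(size=t.size)
X, y = embed(s, dim=3, tau=2)
yhat = rbf_fit_predict(X[:300], y[:300], X[300:])
rmse = float(np.sqrt(np.mean((yhat - y[300:]) ** 2)))
print(round(rmse, 3))
```

In the hybrid method, each combined IMF component would be embedded and predicted in this way and the forecasts summed back into a load prediction.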
Infrared Imagery of Shuttle (IRIS). Task 2, summary report
NASA Technical Reports Server (NTRS)
Chocol, C. J.
1978-01-01
End-to-end tests of a 16-element indium antimonide sensor array and 10 channels of associated electronic signal processing were completed. Quantitative data were gathered on system responsivity, frequency response, noise, stray capacitance effects, and sensor paralleling. These tests verify that the temperature accuracies predicted in the Task 1 study can be obtained with a very carefully designed electro-optical flight system. High-quality pre-flight and in-flight calibration is mandatory to obtain these accuracies. Also, optical crosstalk in the array-dewar assembly must be carefully eliminated by its design. Tests of the scaled-up tracking system reticle also demonstrate that the predicted tracking system accuracies can be met in the flight system. In addition, improvements in the reticle pattern and electronics are possible, which will reduce the complexity of the flight system and increase tracking accuracy.
Karuppiah Ramachandran, Vignesh Raja; Alblas, Huibert J; Le, Duc V; Meratnia, Nirvana
2018-05-24
In the last decade, seizure prediction systems have gained a lot of attention because of their enormous potential to improve the quality of life of epileptic patients. The accuracy of prediction algorithms for detecting seizures in real-world applications is largely limited because brain signals are inherently uncertain and affected by various factors, such as environment, age, and drug intake, in addition to the internal artefacts that occur during the process of recording the brain signals. To deal with such ambiguity, researchers traditionally use active learning, which selects ambiguous data to be annotated by an expert and updates the classification model dynamically. However, selecting the particular data to be labelled by an expert from a large pool of ambiguous data is still a challenging problem. In this paper, we propose an active learning-based prediction framework that aims to improve prediction accuracy with a minimum number of labelled data. The core technique of our framework is to employ the Bernoulli-Gaussian Mixture model (BGMM) to determine the feature samples that have the most ambiguity to be annotated by an expert. By doing so, our approach facilitates expert intervention as well as increasing medical reliability. We evaluate seven different classifiers in terms of classification time and memory required. An active learning framework built on top of the best performing classifier is evaluated in terms of the annotation effort required to achieve a high level of prediction accuracy. The results show that our approach can achieve the same accuracy as a Support Vector Machine (SVM) classifier using only 20% of the labelled data, and also improves prediction accuracy even under noisy conditions.
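Ambiguity-based sample selection can be sketched with a simple two-component Gaussian model standing in for the paper's Bernoulli-Gaussian mixture; the toy features, class means, and query size are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy two-class feature samples standing in for EEG-derived features.
X = np.vstack([rng.normal(-1.0, 1.0, size=(200, 2)),
               rng.normal(+1.0, 1.0, size=(200, 2))])

def gauss_logpdf(X, mu, var):
    """Log-density of an isotropic Gaussian with mean mu, variance var."""
    return -0.5 * (((X - mu) ** 2) / var + np.log(2 * np.pi * var)).sum(axis=1)

# Posterior under a two-component Gaussian model; the entropy of the
# posterior measures how ambiguous each unlabelled sample is.
lp0 = gauss_logpdf(X, np.array([-1.0, -1.0]), 1.0)
lp1 = gauss_logpdf(X, np.array([+1.0, +1.0]), 1.0)
p1 = 1.0 / (1.0 + np.exp(lp0 - lp1))
entropy = -(p1 * np.log(p1 + 1e-12) + (1 - p1) * np.log(1 - p1 + 1e-12))

# Query the k most ambiguous samples for expert annotation.
k = 20
query = np.argsort(entropy)[-k:]
print(len(query))
```

The classifier would then be retrained with the newly labelled samples, repeating until accuracy plateaus, which is how a small fraction of labels can match a fully supervised SVM.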
A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures
2014-01-01
Background Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information. Results We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0. Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure. Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets. Conclusions Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. 
Iterative HFold and all data used in this work are freely available at http://www.cs.ubc.ca/~hjabbari/software.php. PMID:24884954
Automated detection of brain atrophy patterns based on MRI for the prediction of Alzheimer's disease
Plant, Claudia; Teipel, Stefan J.; Oswald, Annahita; Böhm, Christian; Meindl, Thomas; Mourao-Miranda, Janaina; Bokde, Arun W.; Hampel, Harald; Ewers, Michael
2010-01-01
Subjects with mild cognitive impairment (MCI) have an increased risk to develop Alzheimer's disease (AD). Voxel-based MRI studies have demonstrated that widely distributed cortical and subcortical brain areas show atrophic changes in MCI, preceding the onset of AD-type dementia. Here we developed a novel data mining framework in combination with three different classifiers including support vector machine (SVM), Bayes statistics, and voting feature intervals (VFI) to derive a quantitative index of pattern matching for the prediction of the conversion from MCI to AD. MRI was collected in 32 AD patients, 24 MCI subjects and 18 healthy controls (HC). Nine out of 24 MCI subjects converted to AD after an average follow-up interval of 2.5 years. Using feature selection algorithms, brain regions showing the highest accuracy for the discrimination between AD and HC were identified, reaching a classification accuracy of up to 92%. The extracted AD clusters were used as a search region to extract those brain areas that are predictive of conversion to AD within MCI subjects. The most predictive brain areas included the anterior cingulate gyrus and orbitofrontal cortex. The best prediction accuracy, which was cross-validated via train-and-test, was 75% for the prediction of the conversion from MCI to AD. The present results suggest that novel multivariate methods of pattern matching reach a clinically relevant accuracy for the a priori prediction of the progression from MCI to AD. PMID:19961938
Foley, Alana E; Vasilyeva, Marina; Laski, Elida V
2017-06-01
This study examined the mediating role of children's use of decomposition strategies in the relation between visuospatial memory (VSM) and arithmetic accuracy. Children (N = 78; Age M = 9.36) completed assessments of VSM, arithmetic strategies, and arithmetic accuracy. Consistent with previous findings, VSM predicted arithmetic accuracy in children. Extending previous findings, the current study showed that the relation between VSM and arithmetic performance was mediated by the frequency of children's use of decomposition strategies. Identifying the role of arithmetic strategies in this relation has implications for increasing the math performance of children with lower VSM. Statement of contribution What is already known on this subject? The link between children's visuospatial working memory and arithmetic accuracy is well documented. Frequency of decomposition strategy use is positively related to children's arithmetic accuracy. Children's spatial skill positively predicts the frequency with which they use decomposition. What does this study add? Short-term visuospatial memory (VSM) positively relates to the frequency of children's decomposition use. Decomposition use mediates the relation between short-term VSM and arithmetic accuracy. Children with limited short-term VSM may struggle to use decomposition, decreasing accuracy. © 2016 The British Psychological Society.
Blower, Sally; Go, Myong-Hyun
2011-07-19
Mathematical models are useful tools for understanding and predicting epidemics. A recent innovative modeling study by Stehle and colleagues addressed the issue of how complex models need to be to ensure accuracy. The authors collected data on face-to-face contacts during a two-day conference. They then constructed a series of dynamic social contact networks, each of which was used to model an epidemic generated by a fast-spreading airborne pathogen. Intriguingly, Stehle and colleagues found that increasing model complexity did not always increase accuracy. Specifically, the most detailed contact network and a simplified version of this network generated very similar results. These results are extremely interesting and require further exploration to determine their generalizability.
Social Power Increases Interoceptive Accuracy
Moeini-Jazani, Mehrad; Knoeferle, Klemens; de Molière, Laura; Gatti, Elia; Warlop, Luk
2017-01-01
Building on recent psychological research showing that power increases self-focused attention, we propose that having power increases accuracy in perception of bodily signals, a phenomenon known as interoceptive accuracy. Consistent with our proposition, participants in a high-power experimental condition outperformed those in the control and low-power conditions in the Schandry heartbeat-detection task. We demonstrate that the effect of power on interoceptive accuracy is not explained by participants' physiological arousal, affective state, or general intention for accuracy. Rather, consistent with our reasoning that experiencing power shifts attentional resources inward, we show that the effect of power on interoceptive accuracy depends on individuals' chronic tendency to focus on their internal sensations. Moreover, we demonstrate that individuals' chronic sense of power also predicts interoceptive accuracy similarly to, and independently of, their situationally induced feeling of power. We therefore provide further support for the relation between power and enhanced perception of bodily signals. Our findings offer a novel perspective, a psychophysiological account, of how power might affect judgments and behavior. We highlight and discuss some of these intriguing possibilities for future research. PMID:28824501
A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction
Spencer, Matt; Eickholt, Jesse; Cheng, Jianlin
2014-01-01
Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80% and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures, which we call DNSS. Graphical processing units and CUDA software optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test data set of 198 proteins, achieving a Q3 accuracy of 80.7% and a Sov accuracy of 74.2%. PMID:25750595
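The Q3 score reported above is simply per-residue three-state agreement; a minimal sketch (the example strings are invented, not DNSS output):

```python
import numpy as np

def q3(true_ss, pred_ss):
    """Q3 accuracy: fraction of residues whose predicted 3-state label
    (H=helix, E=strand, C=coil) matches the true label."""
    t, p = np.array(list(true_ss)), np.array(list(pred_ss))
    return float((t == p).mean())

# Invented 15-residue example with one mismatch (position 7, 1-based).
true_ss = "CCHHHHHCCEEEECC"
pred_ss = "CCHHHHCCCEEEECC"
print(round(q3(true_ss, pred_ss), 3))  # -> 0.933
```

The Sov score the paper also reports is segment-based rather than per-residue, so it penalizes fragmented predictions that Q3 ignores.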
Context Memory Decline in Middle Aged Adults is Related to Changes in Prefrontal Cortex Function
Kwon, Diana; Maillet, David; Pasvanis, Stamatoula; Ankudowich, Elizabeth; Grady, Cheryl L.; Rajah, M. Natasha
2016-01-01
The ability to encode and retrieve spatial and temporal contextual details of episodic memories (context memory) begins to decline at midlife. In the current study, event-related fMRI was used to investigate the neural correlates of context memory decline in healthy middle aged adults (MA) compared with young adults (YA). Participants were scanned while performing easy and hard versions of spatial and temporal context memory tasks. Scans were obtained at encoding and retrieval. Significant reductions in context memory retrieval accuracy were observed in MA, compared with YA. The fMRI results revealed that overall, both groups exhibited similar patterns of brain activity in parahippocampal cortex, ventral occipito-temporal regions and prefrontal cortex (PFC) during encoding. In contrast, at retrieval, there were group differences in ventral occipito-temporal and PFC activity, due to these regions being more activated in MA, compared with YA. Furthermore, only in YA, increased encoding activity in ventrolateral PFC, and increased retrieval activity in occipital cortex, predicted increased retrieval accuracy. In MA, increased retrieval activity in anterior PFC predicted increased retrieval accuracy. These results suggest that there are changes in PFC contributions to context memory at midlife. PMID:25882039
Kim, Da-Eun; Yang, Hyeri; Jang, Won-Hee; Jung, Kyoung-Mi; Park, Miyoung; Choi, Jin Kyu; Jung, Mi-Sook; Jeon, Eun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Park, Jung Eun; Sohn, Soo Jung; Kim, Tae Sung; Ahn, Il Young; Jeong, Tae-Cheon; Lim, Kyung-Min; Bae, SeungJin
2016-01-01
In order for a novel test method to be applied for regulatory purposes, its reliability and relevance, i.e., reproducibility and predictive capacity, must be demonstrated. Here, we examine the predictive capacity of a novel non-radioisotopic local lymph node assay, LLNA:BrdU-FCM (5-bromo-2'-deoxyuridine-flow cytometry), with a cutoff approach and inferential statistics as prediction models. 22 reference substances in OECD TG429 were tested with a concurrent positive control, hexylcinnamaldehyde 25% (PC), and the stimulation index (SI) representing the fold increase in lymph node cells over the vehicle control was obtained. The optimal cutoff SI (2.7 ≤ cutoff < 3.5), with respect to predictive capacity, was obtained from a receiver operating characteristic curve, and produced 90.9% accuracy for the 22 substances. To address the inter-test variability in responsiveness, SI values standardized against the PC were employed to obtain the optimal percentage cutoff (42.6 ≤ cutoff < 57.3% of PC), which produced 86.4% accuracy. A test substance may be diagnosed as a sensitizer if a statistically significant increase in SI is elicited. The parametric one-sided t-test and the non-parametric Wilcoxon rank-sum test produced 77.3% accuracy. Similarly, a test substance could be defined as a sensitizer if the SI means of the vehicle control and of the low, middle, and high concentrations were statistically significantly different, tested using ANOVA or Kruskal-Wallis with post hoc analysis (Dunnett or DSCF (Dwass-Steel-Critchlow-Fligner), respectively, depending on the equal variance test), producing 81.8% accuracy. The absolute SI-based cutoff approach produced the best predictive capacity; however, the discordant decisions between prediction models need to be examined further. Copyright © 2015 Elsevier Inc. All rights reserved.
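Choosing a cutoff that maximizes accuracy over reference substances can be sketched as a scan over candidate SI thresholds; the SI values and labels below are invented, not the 22 OECD TG429 reference substances:

```python
import numpy as np

# Invented SI values for known sensitizers (1) and non-sensitizers (0).
si = np.array([1.1, 1.4, 1.8, 2.0, 2.5, 2.6, 3.1, 3.4, 4.0, 5.2, 6.8, 9.0])
label = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1])

# Scan candidate cutoffs (classify SI >= cutoff as a sensitizer) and
# keep the one with the highest accuracy; a simple stand-in for
# reading the optimum off a ROC curve.
cutoffs = np.sort(si)
acc = [float(np.mean((si >= c) == (label == 1))) for c in cutoffs]
best = float(cutoffs[int(np.argmax(acc))])
print(best, round(max(acc), 3))
```

In practice the optimum is an interval rather than a point, as in the paper's 2.7 ≤ cutoff < 3.5: any threshold between the highest non-sensitizer SI and the lowest sensitizer SI gives the same accuracy.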
Vallejo, Roger L; Leeds, Timothy D; Gao, Guangtu; Parsons, James E; Martin, Kyle E; Evenhuis, Jason P; Fragomeni, Breno O; Wiens, Gregory D; Palti, Yniv
2017-02-01
Previously, we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative that enables exploitation of within-family genetic variation. We compared three GS models [single-step genomic best linear unbiased prediction (ssGBLUP), weighted ssGBLUP (wssGBLUP), and BayesB] to predict genomic-enabled breeding values (GEBV) for BCWD resistance in a commercial rainbow trout population, and compared the accuracy of GEBV to traditional estimates of breeding values (EBV) from a pedigree-based BLUP (P-BLUP) model. We also assessed the impact of sampling design on the accuracy of GEBV predictions. For these comparisons, we used BCWD survival phenotypes recorded on 7893 fish from 102 families, of which 1473 fish from 50 families had genotypes [57 K single nucleotide polymorphism (SNP) array]. Naïve siblings of the training fish (n = 930 testing fish) were genotyped to predict their GEBV and mated to produce 138 progeny testing families. In the following generation, 9968 progeny were phenotyped to empirically assess the accuracy of GEBV predictions made on their non-phenotyped parents. The accuracy of GEBV from all tested GS models was substantially higher than that of EBV from the P-BLUP model. The highest increase in accuracy relative to the P-BLUP model was achieved with BayesB (97.2 to 108.8%), followed by wssGBLUP at iterations 2 (94.4 to 97.1%) and 3 (88.9 to 91.2%) and ssGBLUP (83.3 to 85.3%). Reducing the training sample size to n = ~1000 had no negative impact on the accuracy (0.67 to 0.72), but with n = ~500 the accuracy dropped to 0.53 to 0.61 if the training and testing fish were full-sibs, and even substantially lower, to 0.22 to 0.25, when they were not full-sibs.
Using progeny performance data, we showed that the accuracy of genomic predictions is substantially higher than estimates obtained from the traditional pedigree-based BLUP model for BCWD resistance. Overall, we found that using a much smaller training sample size compared to similar studies in livestock, GS can substantially improve the selection accuracy and genetic gains for this trait in a commercial rainbow trout breeding population.
Bridge Structure Deformation Prediction Based on GNSS Data Using Kalman-ARIMA-GARCH Model
Xin, Jingzhou; Zhou, Jianting; Yang, Simon X; Li, Xiaoqing; Wang, Yu
2018-01-01
Bridges are an essential part of the ground transportation system. Health monitoring is fundamentally important for the safety and service life of bridges. A large amount of structural information is obtained from various sensors using sensing technology, and the data processing has become a challenging issue. To improve the prediction accuracy of bridge structure deformation based on data mining and to accurately evaluate the time-varying characteristics of bridge structure performance evolution, this paper proposes a new method for bridge structure deformation prediction, which integrates the Kalman filter, the autoregressive integrated moving average (ARIMA) model, and generalized autoregressive conditional heteroskedasticity (GARCH). Firstly, the raw deformation data is directly pre-processed using the Kalman filter to reduce the noise. After that, the linear recursive ARIMA model is established to analyze and predict the structure deformation. Finally, the nonlinear recursive GARCH model is introduced to further improve the accuracy of the prediction. Simulation results based on measured sensor data from the Global Navigation Satellite System (GNSS) deformation monitoring system demonstrated that: (1) the Kalman filter is capable of denoising the bridge deformation monitoring data; (2) the prediction accuracy of the proposed Kalman-ARIMA-GARCH model is satisfactory, with the mean absolute error increasing only from 3.402 mm to 5.847 mm as the prediction step increases; and (3) in comparison to the Kalman-ARIMA model, the Kalman-ARIMA-GARCH model results in superior prediction accuracy as it includes partial nonlinear characteristics (heteroscedasticity); the mean absolute error of five-step prediction using the proposed model is improved by 10.12%.
This paper provides a new way for structural behavior prediction based on data processing, which can lay a foundation for the early warning of bridge health monitoring system based on sensor data using sensing technology. PMID:29351254
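The Kalman pre-filtering step can be illustrated with a minimal scalar random-walk filter. This assumes a constant-position state model with tuned process and measurement variances; the paper's actual filter design is not reproduced here.

```python
# Minimal sketch of Kalman denoising for a 1-D deformation series
# (assumed random-walk model; q and r are illustrative tuning values).
def kalman_denoise(z, q=1e-4, r=0.25):
    """z: raw measurements; q: process variance; r: measurement variance."""
    x, p = z[0], 1.0          # initial state estimate and error covariance
    out = [x]
    for zk in z[1:]:
        p = p + q             # predict: state unchanged, uncertainty grows
        k = p / (p + r)       # Kalman gain
        x = x + k * (zk - x)  # update toward the new measurement
        p = (1 - k) * p
        out.append(x)
    return out

noisy = [10.0, 10.4, 9.7, 10.2, 9.9, 10.1]   # hypothetical readings, mm
smooth = kalman_denoise(noisy)
```

The smoothed series has a visibly smaller spread than the raw one, which is the property the ARIMA stage benefits from.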
Bayesian decision support for coding occupational injury data.
Nanda, Gaurav; Grattan, Kathleen M; Chu, MyDzung T; Davis, Letitia K; Lehto, Mark R
2016-06-01
Studies on autocoding injury data have found that machine learning algorithms perform well for categories that occur frequently but often struggle with rare categories. Therefore, manual coding, although resource-intensive, cannot be eliminated. We propose a Bayesian decision support system to autocode a large portion of the data, filter cases for manual review, and assist human coders by presenting them the top k prediction choices and a confusion matrix of predictions from Bayesian models. We studied the prediction performance of Single-Word (SW) and Two-Word-Sequence (TW) Naïve Bayes models on a sample of data from the 2011 Survey of Occupational Injury and Illness (SOII). We used the agreement in prediction results of the SW and TW models, and various prediction strength thresholds, for autocoding and filtering cases for manual review. We also studied the sensitivity of the top k predictions of the SW model, TW model, and SW-TW combination, and then compared the accuracy of the manually assigned codes to SOII data with that of the proposed system. The accuracy of the proposed system, assuming well-trained coders review the subset of only 26% of cases flagged for review, was estimated to be comparable (86.5%) to the accuracy of the original coding of the data set (range: 73%-86.8%). Overall, the TW model had higher sensitivity than the SW model, and the accuracy of the prediction results increased when the two models agreed, and for higher prediction strength thresholds. The sensitivity of the top five predictions was 93%. The proposed system seems promising for coding injury data as it offers comparable accuracy and less manual coding. Accurate and timely coded occupational injury data is useful for surveillance as well as prevention activities that aim to make workplaces safer. Copyright © 2016 Elsevier Ltd and National Safety Council. All rights reserved.
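The single-word Naïve Bayes idea with top-k choices for the coder can be sketched as follows. The injury narratives and codes are made up for illustration, not drawn from SOII data, and this is a generic Naïve Bayes with Laplace smoothing rather than the authors' implementation.

```python
# Sketch of a Single-Word Naive Bayes coder returning top-k candidate codes,
# mirroring the decision-support idea of offering coders several choices.
import math
from collections import Counter, defaultdict

def train_nb(narratives, codes):
    word_counts = defaultdict(Counter)   # code -> word frequencies
    code_counts = Counter(codes)
    vocab = set()
    for text, code in zip(narratives, codes):
        words = text.lower().split()
        word_counts[code].update(words)
        vocab.update(words)
    return word_counts, code_counts, vocab

def top_k(text, model, k=2):
    word_counts, code_counts, vocab = model
    n = sum(code_counts.values())
    scores = {}
    for code, cc in code_counts.items():
        total = sum(word_counts[code].values())
        s = math.log(cc / n)                 # log prior
        for w in text.lower().split():
            # Laplace smoothing over the shared vocabulary
            s += math.log((word_counts[code][w] + 1) / (total + len(vocab)))
        scores[code] = s
    return sorted(scores, key=scores.get, reverse=True)[:k]

model = train_nb(
    ["fell from ladder", "fell on wet floor",
     "cut hand on saw", "cut finger with knife"],
    ["FALL", "FALL", "CUT", "CUT"])
```

A coder reviewing `top_k("worker fell from scaffold", model)` would see FALL ranked first, with the runner-up shown for comparison, as in the proposed system's top-k display.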
Paroxysmal atrial fibrillation prediction method with shorter HRV sequences.
Boon, K H; Khalil-Hani, M; Malarvili, M B; Sia, C W
2016-10-01
This paper proposes a method that predicts the onset of paroxysmal atrial fibrillation (PAF), using heart rate variability (HRV) segments that are shorter than those applied in existing methods, while maintaining good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of stabilizing (electrically) and preventing the onset of atrial arrhythmias with different pacing techniques. We investigate the effect of HRV features extracted from different lengths of HRV segments prior to PAF onset with the proposed PAF prediction method. The pre-processing stage of the predictor includes QRS detection, HRV quantification and ectopic beat correction. Time-domain, frequency-domain, non-linear and bispectrum features are then extracted from the quantified HRV. In the feature selection, the HRV feature set and classifier parameters are optimized simultaneously using an optimization procedure based on a genetic algorithm (GA). Both the full feature set and a statistically significant feature subset are optimized by the GA. For the statistically significant feature subset, the Mann-Whitney U test is used to filter out features that fail the statistical test at the 20% significance level. The final stage of the predictor is a classifier based on a support vector machine (SVM). A 10-fold cross-validation is applied in performance evaluation, and the proposed method achieves 79.3% prediction accuracy using 15-minute HRV segments. This accuracy is comparable to that achieved by existing methods that use 30-minute HRV segments, most of which achieve accuracy of around 80%. More importantly, our method significantly outperforms those that applied segments shorter than 30 minutes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
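The statistical filtering step can be sketched with a Mann-Whitney U test under a normal approximation, retaining features whose two groups differ at the 20% level. The feature values are illustrative, and the sketch assumes no tied observations (a production implementation would apply a tie correction and, for small samples, an exact p-value).

```python
# Sketch of Mann-Whitney U feature filtering at the 20% significance level
# (normal approximation, no tie handling; toy data, not the paper's features).
import math

def mann_whitney_p(a, b):
    """Two-sided p-value via the normal approximation (assumes no ties)."""
    n1, n2 = len(a), len(b)
    pooled = sorted(a + b)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    r1 = sum(rank[v] for v in a)
    u = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    # two-sided p from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def filter_features(features_by_group, alpha=0.20):
    """features_by_group: name -> (values in PAF group, values in controls)."""
    return [name for name, (a, b) in features_by_group.items()
            if mann_whitney_p(a, b) < alpha]

features = {
    "meanNN": ([612, 630, 655, 671, 690], [710, 733, 748, 760, 790]),
    "noise":  ([1, 3, 5, 7, 9], [2, 4, 6, 8, 10]),
}
kept = filter_features(features)
```

The fully separated feature survives the filter while the interleaved one is discarded, which is exactly the role this step plays before GA-based selection.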
Annamalai, Alagappan; Harada, Megan Y; Chen, Melissa; Tran, Tram; Ko, Ara; Ley, Eric J; Nuno, Miriam; Klein, Andrew; Nissen, Nicholas; Noureddin, Mazen
2017-03-01
Critically ill cirrhotics require liver transplantation urgently, but are at high risk for perioperative mortality. The Model for End-stage Liver Disease (MELD) score, recently updated to incorporate serum sodium, estimates survival probability in patients with cirrhosis, but needs additional evaluation in the critically ill. The purpose of this study was to evaluate the predictive power of ICU admission MELD scores and identify clinical risk factors associated with increased mortality. This was a retrospective review of cirrhotic patients admitted to the ICU between January 2011 and December 2014. Patients who were discharged or underwent transplantation (survivors) were compared with those who died (nonsurvivors). Demographic characteristics, admission MELD scores, and clinical risk factors were recorded. Multivariate regression was used to identify independent predictors of mortality, and measures of model performance were assessed to determine predictive accuracy. Of 276 patients who met inclusion criteria, 153 were considered survivors and 123 were nonsurvivors. Survivor and nonsurvivor cohorts had similar demographic characteristics. Nonsurvivors had increased MELD, gastrointestinal bleeding, infection, mechanical ventilation, encephalopathy, vasopressors, dialysis, renal replacement therapy, requirement of blood products, and ICU length of stay. The MELD demonstrated low predictive power (c-statistic 0.73). Multivariate analysis identified MELD score (adjusted odds ratio [AOR] = 1.05), mechanical ventilation (AOR = 4.55), vasopressors (AOR = 3.87), and continuous renal replacement therapy (AOR = 2.43) as independent predictors of mortality, with stronger predictive accuracy (c-statistic 0.87). The MELD demonstrated relatively poor predictive accuracy in critically ill patients with cirrhosis and might not be the best indicator for prognosis in the ICU population. 
Prognostic accuracy is significantly improved when variables indicating organ support (mechanical ventilation, vasopressors, and continuous renal replacement therapy) are included in the model. Copyright © 2016. Published by Elsevier Inc.
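The c-statistic quoted above is simply the probability that a randomly chosen nonsurvivor carries a higher risk score than a randomly chosen survivor, which a small sketch makes concrete (toy scores, not the study's MELD data).

```python
# Sketch of the c-statistic (area under the ROC curve) as the proportion of
# survivor/nonsurvivor pairs in which the nonsurvivor has the higher score.
def c_statistic(scores, died):
    pairs = concordant = 0.0
    for si, di in zip(scores, died):
        for sj, dj in zip(scores, died):
            if di == 1 and dj == 0:          # one nonsurvivor, one survivor
                pairs += 1
                if si > sj:
                    concordant += 1
                elif si == sj:
                    concordant += 0.5        # ties count half
    return concordant / pairs

meld  = [12, 18, 22, 25, 30, 35]             # hypothetical admission scores
death = [0,  0,  1,  0,  1,  1]
c = c_statistic(meld, death)                 # 8 of 9 pairs concordant
```

Adding predictors such as organ-support indicators raises the c-statistic precisely by making more of these pairs concordant.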
Genomic Prediction Accounting for Residual Heteroskedasticity.
Ou, Zhining; Tempelman, Robert J; Steibel, Juan P; Ernst, Catherine W; Bates, Ronald O; Bello, Nora M
2015-11-12
Whole-genome prediction (WGP) models that use single-nucleotide polymorphism marker information to predict genetic merit of animals and plants typically assume homogeneous residual variance. However, variability is often heterogeneous across agricultural production systems and may subsequently bias WGP-based inferences. This study extends classical WGP models based on normality, heavy-tailed specifications and variable selection to explicitly account for environmentally-driven residual heteroskedasticity under a hierarchical Bayesian mixed-models framework. WGP models assuming homogeneous or heterogeneous residual variances were fitted to training data generated under simulation scenarios reflecting a gradient of increasing heteroskedasticity. Model fit was based on pseudo-Bayes factors and also on prediction accuracy of genomic breeding values computed on a validation data subset one generation removed from the simulated training dataset. Homogeneous vs. heterogeneous residual variance WGP models were also fitted to two quantitative traits, namely 45-min postmortem carcass temperature and loin muscle pH, recorded in a swine resource population dataset prescreened for high and mild residual heteroskedasticity, respectively. Fit of competing WGP models was compared using pseudo-Bayes factors. Predictive ability, defined as the correlation between predicted and observed phenotypes in validation sets of a five-fold cross-validation was also computed. Heteroskedastic error WGP models showed improved model fit and enhanced prediction accuracy compared to homoskedastic error WGP models although the magnitude of the improvement was small (less than two percentage points net gain in prediction accuracy). Nevertheless, accounting for residual heteroskedasticity did improve accuracy of selection, especially on individuals of extreme genetic merit. Copyright © 2016 Ou et al.
Confined turbulent swirling recirculating flow predictions. Ph.D. Thesis. Final Report
NASA Technical Reports Server (NTRS)
Abujelala, M. T.; Lilley, D. G.
1985-01-01
The capability and accuracy of the STARPIC computer code in predicting confined turbulent swirling recirculating flows are presented. Inlet flow boundary conditions were demonstrated to be extremely important in simulating a flowfield via numerical calculations. The degree of swirl strength and the expansion ratio have strong effects on the characteristics of swirling flow. In a nonswirling flow, a large corner recirculation zone exists in the flowfield with an expansion ratio greater than one. However, as the degree of inlet swirl increases, the size of this zone decreases and a central recirculation zone appears near the inlet. Generally, the size of the central zone increased with swirl strength and expansion ratio. Neither the standard k-epsilon turbulence model nor its previous extensions show effective capability for predicting confined turbulent swirling recirculating flows. However, either reduced optimum values of three parameters in the model, or the empirical C_mu formulation obtained via careful analysis of available turbulence measurements, can provide more acceptable accuracy in the prediction of these swirling flows.
Lee, J; Kachman, S D; Spangler, M L
2017-08-01
Genomic selection (GS) has become an integral part of genetic evaluation methodology and has been applied to all major livestock species, including beef and dairy cattle, pigs, and chickens. Significant contributions in increased accuracy of selection decisions have been clearly illustrated in dairy cattle after practical application of GS. In the majority of U.S. beef cattle breeds, similar efforts have also been made to increase the accuracy of genetic merit estimates through the inclusion of genomic information into routine genetic evaluations using a variety of methods. However, prediction accuracies can vary relative to panel density, the number of folds used for cross-validation, and the choice of dependent variables (e.g., EBV, deregressed EBV, adjusted phenotypes). The aim of this study was to evaluate the accuracy of genomic predictors for Red Angus beef cattle with different strategies used in training and evaluation. The reference population consisted of 9,776 Red Angus animals whose genotypes were imputed to 2 medium-density panels consisting of over 50,000 (50K) and approximately 80,000 (80K) SNP. Using the imputed panels, we determined the influence of marker density, exclusion (deregressed EPD adjusting for parental information [DEPD-PA]) or inclusion (deregressed EPD without adjusting for parental information [DEPD]) of parental information in the deregressed EPD used as the dependent variable, and the number of clusters used to partition training animals (3, 5, or 10). A BayesC model with π set to 0.99 was used to predict molecular breeding values (MBV) for 13 traits for which EPD existed. The prediction accuracies were measured as genetic correlations between MBV and weighted deregressed EPD. The average accuracies across all traits were 0.540 and 0.552 when using the 50K and 80K SNP panels, respectively, and 0.538, 0.541, and 0.561 when using 3, 5, and 10 folds, respectively, for cross-validation.
Using DEPD-PA as the response variable resulted in higher accuracies of MBV than those obtained by DEPD for growth and carcass traits. When DEPD were used as the response variable, accuracies were greater for threshold traits and those that are sex limited, likely due to the fact that these traits suffer from a lack of information content and excluding animals in training with only parental information substantially decreases the training population size. It is recommended that the contribution of parental average to deregressed EPD should be removed in the construction of genomic prediction equations. The difference in terms of prediction accuracies between the 2 SNP panels or the number of folds compared herein was negligible.
Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions
Sükösd, Zsuzsanna; Swenson, M. Shel; Kjems, Jørgen; Heitsch, Christine E.
2013-01-01
Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence. PMID:23325843
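The distinction drawn above between prediction accuracy (sensitivity) and positive predictive value of base pairs can be made concrete by treating each structure as a set of (i, j) index pairs (toy structures below, not 16S rRNA).

```python
# Sketch of base-pair sensitivity and PPV: predicted vs. reference structures
# represented as sets of (i, j) paired positions.
def pair_metrics(predicted, reference):
    tp = len(predicted & reference)           # correctly predicted pairs
    sens = tp / len(reference)                # fraction of true pairs found
    ppv = tp / len(predicted)                 # fraction of predictions correct
    return sens, ppv

ref  = {(1, 20), (2, 19), (3, 18), (5, 15), (6, 14)}
pred = {(1, 20), (2, 19), (3, 18), (7, 12)}
sens, ppv = pair_metrics(pred, ref)           # 3/5 sensitivity, 3/4 PPV
```

Here PPV exceeds sensitivity because the prediction misses true pairs but rarely asserts wrong ones, the same pattern the study reports for common base pairs.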
NASA Technical Reports Server (NTRS)
Farassat, F.; Dunn, M. H.; Padula, S. L.
1986-01-01
The development of a high speed propeller noise prediction code at Langley Research Center is described. The code utilizes two recent acoustic formulations in the time domain for subsonic and supersonic sources. The structure and capabilities of the code are discussed. A grid size study for accuracy and speed of execution is also presented. The code is tested against an earlier Langley code. Considerable increases in accuracy and speed of execution are observed. Some examples of noise prediction for a high speed propeller for which acoustic test data are available are given. A brief derivation of the formulations used is given in an appendix.
ERIC Educational Resources Information Center
Townsend, James T.; Altieri, Nicholas
2012-01-01
Measures of human efficiency under increases in mental workload or attentional limitations are vital in studying human perception, cognition, and action. Assays of efficiency as workload changes have typically been confined to either reaction times (RTs) or accuracy alone. Within the realm of RTs, a nonparametric measure called the "workload…
When high working memory capacity is and is not beneficial for predicting nonlinear processes.
Fischer, Helen; Holt, Daniel V
2017-04-01
Predicting the development of dynamic processes is vital in many areas of life. Previous findings are inconclusive as to whether higher working memory capacity (WMC) is always associated with using more accurate prediction strategies, or whether higher WMC can also be associated with using overly complex strategies that do not improve accuracy. In this study, participants predicted a range of systematically varied nonlinear processes based on exponential functions where prediction accuracy could or could not be enhanced using well-calibrated rules. Results indicate that higher WMC participants seem to rely more on well-calibrated strategies, leading to more accurate predictions for processes with highly nonlinear trajectories in the prediction region. Predictions of lower WMC participants, in contrast, point toward an increased use of simple exemplar-based prediction strategies, which perform just as well as more complex strategies when the prediction region is approximately linear. These results imply that with respect to predicting dynamic processes, working memory capacity limits are not generally a strength or a weakness, but that this depends on the process to be predicted.
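A "well-calibrated rule" for an exponential process can be sketched as a log-linear least-squares fit followed by extrapolation into the prediction region (illustrative data, not the study's stimuli).

```python
# Sketch of a calibrated rule for exponential processes: fit y = a * exp(b*t)
# by ordinary least squares on log(y), then extrapolate.
import math

def fit_exponential(ts, ys):
    n = len(ts)
    logs = [math.log(y) for y in ys]
    tbar = sum(ts) / n
    lbar = sum(logs) / n
    b = sum((t - tbar) * (l - lbar) for t, l in zip(ts, logs)) / \
        sum((t - tbar) ** 2 for t in ts)
    a = math.exp(lbar - b * tbar)
    return a, b

ts = [0, 1, 2, 3, 4]
ys = [2.0, 2.44, 2.98, 3.64, 4.45]      # roughly 2 * exp(0.2 * t)
a, b = fit_exponential(ts, ys)
pred = a * math.exp(b * 6)              # extrapolate two steps ahead
```

An exemplar-based strategy (e.g., repeating the last observed increment) would track such a process only while its trajectory stays approximately linear, which matches the WMC result reported above.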
Yang, Jing; He, Bao-Ji; Jang, Richard; Zhang, Yang; Shen, Hong-Bin
2015-01-01
Motivation: Cysteine-rich proteins cover many important families in nature but there are currently no methods specifically designed for modeling the structure of these proteins. The accuracy of disulfide connectivity pattern prediction, particularly for the proteins of higher-order connections, e.g. >3 bonds, is too low to effectively assist structure assembly simulations. Results: We propose a new hierarchical order reduction protocol called Cyscon for disulfide-bonding prediction. The most confident disulfide bonds are first identified and bonding prediction is then focused on the remaining cysteine residues based on SVR training. Compared with purely machine learning-based approaches, Cyscon improved the average accuracy of connectivity pattern prediction by 21.9%. For proteins with more than 5 disulfide bonds, Cyscon improved the accuracy by 585% on the benchmark set of PDBCYS. When applied to 158 non-redundant cysteine-rich proteins, Cyscon predictions helped increase (or decrease) the TM-score (or RMSD) of the ab initio QUARK modeling by 12.1% (or 14.4%). This result demonstrates a new avenue to improve the ab initio structure modeling for cysteine-rich proteins. Availability and implementation: http://www.csbio.sjtu.edu.cn/bioinf/Cyscon/ Contact: zhng@umich.edu or hbshen@sjtu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254435
Jeong, Jae Yoon; Kim, Tae Yeob; Sohn, Joo Hyun; Kim, Yongsoo; Jeong, Woo Kyoung; Oh, Young-Ha; Yoo, Kyo-Sang
2014-01-01
AIM: To evaluate the correlation between liver stiffness measurement (LSM) by real-time shear wave elastography (SWE) and liver fibrosis stage, and the accuracy of LSM for predicting significant and advanced fibrosis, in comparison with serum markers. METHODS: We consecutively analyzed 70 patients with various chronic liver diseases. Liver fibrosis was staged from F0 to F4 according to the Batts and Ludwig scoring system. Significant and advanced fibrosis was defined as stage F ≥ 2 and F ≥ 3, respectively. The accuracy of prediction for fibrosis was analyzed using receiver operating characteristic curves. RESULTS: Of the 70 patients, 15 were in stage F0-F1, 20 in F2, 13 in F3, and 22 in F4. LSM increased with progression of fibrosis stage (F0-F1: 6.77 ± 1.72, F2: 9.98 ± 3.99, F3: 15.80 ± 7.73, and F4: 22.09 ± 10.09, P < 0.001). Diagnostic accuracies of LSM for prediction of F ≥ 2 and F ≥ 3 were 0.915 (95%CI: 0.824-0.968, P < 0.001) and 0.913 (95%CI: 0.821-0.967, P < 0.001), respectively. The cut-off values of LSM for prediction of F ≥ 2 and F ≥ 3 were 8.6 kPa with 78.2% sensitivity and 93.3% specificity, and 10.46 kPa with 88.6% sensitivity and 80.0% specificity, respectively. However, there were no significant differences in diagnostic accuracy between LSM and serum hyaluronic acid and type IV collagen. CONCLUSION: SWE showed a significant correlation with the severity of liver fibrosis and was useful and accurate for predicting significant and advanced fibrosis, comparable with serum markers. PMID:25320528
Applicability of linear regression equation for prediction of chlorophyll content in rice leaves
NASA Astrophysics Data System (ADS)
Li, Yunmei
2005-09-01
A modeling approach is used to assess the applicability of derived equations capable of predicting the chlorophyll content of rice leaves at a given view direction. Two radiative transfer models are used in the study: the PROSPECT model, operated at the leaf level, and the FCR model, operated at the canopy level. The study consists of three steps: (1) simulation of bidirectional reflectance from canopies with different leaf chlorophyll contents, leaf-area-index (LAI) values and understorey configurations; (2) establishment of prediction relations for chlorophyll content by stepwise regression; and (3) assessment of the applicability of these relations. The result shows that the accuracy of prediction is affected by different understorey configurations; however, the accuracy tends to improve greatly as LAI increases.
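Step (2), regressing chlorophyll content on simulated reflectance, can be sketched with a one-predictor ordinary least-squares fit. The index and chlorophyll values below are made up for illustration, not PROSPECT/FCR output.

```python
# Toy sketch of the regression step: chlorophyll content vs. a reflectance
# index via ordinary least squares (data fabricated to lie on chl = 70*x - 3).
def ols(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    return ybar - b * xbar, b                # intercept, slope

index = [0.30, 0.42, 0.55, 0.61, 0.74]       # e.g. a red-edge reflectance ratio
chl   = [18.0, 26.4, 35.5, 39.7, 48.8]       # ug/cm2, exactly 70*index - 3
a, b = ols(index, chl)
pred = a + b * 0.50                          # predict chlorophyll at index 0.50
```

A stepwise procedure as used in the study would repeat such fits while adding or dropping candidate bands, keeping only predictors that improve the relation.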
Influence of sex and ethnic tooth-size differences on mixed-dentition space analysis
Altherr, Edward R.; Koroluk, Lorne D.; Phillips, Ceib
2013-01-01
Introduction Most mixed-dentition space analyses were developed by using subjects of northwestern European descent and unspecified sex. The purpose of this study was to determine the predictive accuracy of the Tanaka-Johnston analysis in white and black subjects in North Carolina. Methods A total of 120 subjects (30 males and 30 females in each ethnic group) were recruited from clinics at the University of North Carolina School of Dentistry. Ethnicity was verified to 2 previous generations. All subjects were less than 21 years of age and had a full complement of permanent teeth. Digital calipers were used to measure the mesiodistal widths of all teeth on study models fabricated from alginate impressions. The predicted widths of the canines and the premolars in both arches were compared with the actual measured widths. Results In the maxillary arch, there was a significant interaction of ethnicity and sex on the predictive accuracy of the Tanaka-Johnston analysis (P = .03, factorial ANOVA). The predictive accuracy was significantly overestimated in the white female group (P <.001, least square means). In the mandibular arch, there was no significant interaction between ethnicity and sex (P = .49). Conclusions The Tanaka-Johnston analysis significantly overestimated in females (P <.0001) and underestimated in blacks (P <.0001) (factorial ANOVA). Regression equations were developed to increase the predictive accuracy in both arches. (Am J Orthod Dentofacial Orthop 2007;132:332-9) PMID:17826601
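The Tanaka-Johnston estimate, as commonly stated, predicts the combined canine-premolar width in one quadrant from half the summed mesiodistal widths of the four mandibular incisors plus a constant (10.5 mm for a mandibular quadrant, 11.0 mm for a maxillary one). The constants below are from this commonly cited form and should be verified against the original source before any clinical use; the incisor widths are hypothetical.

```python
# Sketch of the Tanaka-Johnston prediction (constants as commonly cited;
# verify against the original publication).
def tanaka_johnston(incisor_widths_mm, arch="mandibular"):
    half = sum(incisor_widths_mm) / 2
    return half + (10.5 if arch == "mandibular" else 11.0)

widths = [5.3, 5.5, 5.6, 5.4]                    # hypothetical incisor widths, mm
lower = tanaka_johnston(widths, "mandibular")    # 21.8/2 + 10.5 = 21.4 mm
upper = tanaka_johnston(widths, "maxillary")     # 21.8/2 + 11.0 = 21.9 mm
```

The study's finding of over- and underestimation in particular sex/ethnic groups amounts to saying these constants are miscalibrated for those groups, hence the group-specific regression equations the authors develop.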
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classify patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their modest predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
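The boosting scheme described above, reweighting misclassified cases and combining trees by a weighted majority vote, can be illustrated with a minimal AdaBoost on one-level trees (decision stumps) over toy 1-D data. This is a generic sketch of the algorithm, not the authors' implementation or data.

```python
# Minimal AdaBoost with decision stumps: reweight misclassified cases each
# round, then combine stumps by a weighted majority vote (toy 1-D data).
import math

def stump_fit(xs, ys, w):
    """Pick the weighted-error-minimizing threshold/polarity stump."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(wi for wi, x, y in zip(w, xs, ys)
                      if (pol if x >= thr else -pol) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, thr, pol = stump_fit(xs, ys, w)
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        model.append((alpha, thr, pol))
        # Boost the weight of misclassified cases for the next round.
        w = [wi * math.exp(-alpha * y * (pol if x >= thr else -pol))
             for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return model

def predict(model, x):
    vote = sum(a * (pol if x >= thr else -pol) for a, thr, pol in model)
    return 1 if vote >= 0 else -1

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [-1, -1, -1, 1, 1, 1, -1, 1]   # x = 7 defeats any single threshold
clf = adaboost(xs, ys, rounds=3)
acc = sum(predict(clf, x) == y for x, y in zip(xs, ys)) / len(xs)
```

A single stump caps at 7/8 on this data; three boosting rounds fit it exactly, which is the kind of gain the data-mining literature reports and the study found largely absent in its clinical cohorts.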
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one-bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and an area under the receiver operating characteristic curve of 87%. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and an area under the receiver operating characteristic curve of 98%. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and were built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
Carlson, Andrew K.; Taylor, William W.; Hartikainen, Kelsey M.; Infante, Dana M.; Beard, Douglas; Lynch, Abigail
2017-01-01
Global climate change is predicted to increase air and stream temperatures and alter thermal habitat suitability for growth and survival of coldwater fishes, including brook charr (Salvelinus fontinalis), brown trout (Salmo trutta), and rainbow trout (Oncorhynchus mykiss). In a changing climate, accurate stream temperature modeling is increasingly important for sustainable salmonid management throughout the world. However, finite resource availability (e.g. funding, personnel) drives a tradeoff between thermal model accuracy and efficiency (i.e. cost-effective applicability at management-relevant spatial extents). Using different projected climate change scenarios, we compared the accuracy and efficiency of stream-specific and generalized (i.e. region-specific) temperature models for coldwater salmonids within and outside the State of Michigan, USA, a region with long-term stream temperature data and productive coldwater fisheries. Projected stream temperature warming between 2016 and 2056 ranged from 0.1 to 3.8 °C in groundwater-dominated streams and from 0.2 to 6.8 °C in surface-runoff-dominated systems in the State of Michigan. Despite their generally lower accuracy in predicting exact stream temperatures, generalized models accurately projected salmonid thermal habitat suitability in 82% of groundwater-dominated streams, including those with brook charr (80% accuracy), brown trout (89% accuracy), and rainbow trout (75% accuracy). In contrast, generalized models predicted thermal habitat suitability in runoff-dominated streams with much lower accuracy (54%). These results suggest that, amidst climate change and constraints in resource availability, generalized models are appropriate to forecast thermal conditions in groundwater-dominated streams within and outside Michigan and inform regional-level salmonid management strategies that are practical for coldwater fisheries managers, policy makers, and the public.
We recommend fisheries professionals reserve resource-intensive stream-specific models for runoff-dominated systems containing high-priority fisheries resources (e.g. trophy individuals, endangered species) that will be directly impacted by projected stream warming.
Gamal El-Dien, Omnia; Ratcliffe, Blaise; Klápště, Jaroslav; Chen, Charles; Porth, Ilga; El-Kassaby, Yousry A
2015-05-09
Genomic selection (GS) in forestry can substantially reduce the length of the breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies have made it possible to genotype large numbers of trees at a reasonable cost. Genotyping-by-sequencing was used to genotype 1,126 Interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared: mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived, family-based k-nearest neighbor (kNN-Fam). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) and the Generalized Ridge Regression (GRR) to test different assumptions about trait architecture. Finally, using PCA, multi-trait GS prediction models were developed. The EM and kNN-Fam imputation methods were superior for 30 and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than the GRR, indicating that the genetic architecture for these traits is complex. Multi-site GS prediction accuracies were high and better than those of single sites, whereas cross-site predictions produced the lowest accuracies, reflecting type-b genetic correlations, and were deemed unreliable. The incorporation of genomic information in quantitative genetics analyses produced more realistic heritability estimates, as the half-sib pedigree tended to inflate the additive genetic variance and subsequently both heritability and gain estimates. Principal component scores as representatives of multi-trait GS prediction models produced surprising results where negatively correlated traits could be concurrently selected for using PCA2 and PCA3.
The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation methods, was proven to be effective. Prediction accuracies obtained for all traits greatly support the integration of GS in tree breeding. While the within-site GS prediction accuracies were high, the results clearly indicate that single-site GS models' ability to predict other sites is unreliable, supporting the utilization of a multi-site approach. Principal component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.
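Of the four imputation methods compared above, mean imputation (MI) is the simplest and can be sketched directly; the genotype matrix below is hypothetical (rows are trees, columns are markers, None marks a missing call).

```python
# Mean imputation (MI) sketch: replace each missing genotype call with
# that marker's mean across genotyped individuals. Data are hypothetical.
genos = [[0, 1, None], [2, None, 1], [1, 1, 0], [None, 2, 1]]

def mean_impute(matrix):
    out = [row[:] for row in matrix]
    for j in range(len(matrix[0])):
        col = [row[j] for row in matrix if row[j] is not None]
        mean = sum(col) / len(col)       # per-marker mean of observed calls
        for row in out:
            if row[j] is None:
                row[j] = mean
    return out
```

The kNN-Fam method replaces the column mean with an average over the nearest relatives within a family, which is why it benefits from pedigree structure at high missing-data rates.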
USDA-ARS's Scientific Manuscript database
Genomic selection (GS) models use genome-wide genetic information to predict genetic values of candidates for selection. Originally these models were developed without considering genotype × environment interaction (GE). Several authors have proposed extensions of the canonical GS model that accomm...
Baker, Erich J; Walter, Nicole A R; Salo, Alex; Rivas Perea, Pablo; Moore, Sharon; Gonzales, Steven; Grant, Kathleen A
2017-03-01
The Monkey Alcohol Tissue Research Resource (MATRR) is a repository and analytics platform for detailed data derived from well-documented nonhuman primate (NHP) alcohol self-administration studies. This macaque model has demonstrated categorical drinking norms reflective of human drinking populations, resulting in consumption pattern classifications of very heavy drinking (VHD), heavy drinking (HD), binge drinking (BD), and low drinking (LD) individuals. Here, we expand on previous findings that suggest ethanol drinking patterns during initial drinking to intoxication can reliably predict future drinking category assignment. The classification strategy uses a machine-learning approach to examine an extensive set of daily drinking attributes during 90 sessions of induction across 7 cohorts of 5 to 8 monkeys for a total of 50 animals. A Random Forest classifier is employed to accurately predict categorical drinking after 12 months of self-administration. Predictive outcome accuracy is approximately 78% when classes are aggregated into 2 groups, "LD and BD" and "HD and VHD." A subsequent 2-step classification model distinguishes individual LD and BD categories with 90% accuracy and between HD and VHD categories with 95% accuracy. Average 4-category classification accuracy is 74%, and provides putative distinguishing behavioral characteristics between groupings. We demonstrate that data derived from the induction phase of this ethanol self-administration protocol have significant predictive power for future ethanol consumption patterns. Importantly, numerous predictive factors are longitudinal, measuring the change of drinking patterns through 3 stages of induction. Factors during induction that predict future heavy drinkers include being younger at the time of first intoxication and developing a shorter latency to first ethanol drink. 
Overall, this analysis identifies predictive characteristics in future very heavy drinkers that optimize intoxication, such as having increasingly fewer bouts with more drinks. This analysis also identifies characteristic avoidance of intoxicating topographies in future low drinkers, such as increasing number of bouts and waiting longer before the first ethanol drink. Copyright © 2017 The Authors Alcoholism: Clinical & Experimental Research published by Wiley Periodicals, Inc. on behalf of Research Society on Alcoholism.
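The 2-step classification strategy described above (first predict an aggregate group, then the category within that group) can be sketched generically; here a trivial 1-nearest-neighbour rule stands in for the Random Forest, and all feature values and category assignments are hypothetical.

```python
# Two-step classification sketch: step 1 assigns an aggregate group,
# step 2 uses a group-specific classifier to pick the final category.
# A 1-NN rule stands in for the Random Forest; data are hypothetical.
def nn_predict(train, x):
    """Return the label of the closest training point (1-NN)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

# (drinks_per_day, 4-way category): hypothetical induction-phase data
train = [(1.0, "LD"), (1.5, "LD"), (2.5, "BD"), (3.0, "BD"),
         (4.5, "HD"), (5.0, "HD"), (7.0, "VHD"), (8.0, "VHD")]

# Step 1: aggregate categories into two groups.
group_of = {"LD": "low", "BD": "low", "HD": "high", "VHD": "high"}
step1 = [(x, group_of[y]) for x, y in train]
# Step 2: one classifier per group to separate the categories inside it.
step2 = {"low": [(x, y) for x, y in train if group_of[y] == "low"],
         "high": [(x, y) for x, y in train if group_of[y] == "high"]}

def predict(x):
    return nn_predict(step2[nn_predict(step1, x)], x)
```

The appeal of the two-step scheme is that each sub-problem is an easier binary split, which is consistent with the higher 2-group accuracies reported above.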
Simulated biologic intelligence used to predict length of stay and survival of burns.
Frye, K E; Izenberg, S D; Williams, M D; Luterman, A
1996-01-01
From July 13, 1988, to May 14, 1995, 1585 patients with burns and no other injuries besides inhalation were treated; 4.5% did not survive. Artificial neural networks were trained on patient presentation data with known outcomes on 90% of the randomized cases. The remaining cases were then used to predict survival and length of stay in cases not trained on. Survival was predicted with more than 98% accuracy, and length of stay was predicted to within a week with 72% accuracy in these cases. Among anatomic areas involved by burn, burns involving the feet, scalp, or both had the largest negative effect on the survival prediction. In survivors, burns involving the buttocks, transport to this burn center by the military or by helicopter, electrical burns, hot tar burns, and inhalation were associated with an increased length-of-stay prediction. Neural networks can be used to accurately predict the clinical outcome of a burn, and the factors that affect that prediction can be investigated.
Pinder, John E; Rowan, David J; Rasmussen, Joseph B; Smith, Jim T; Hinton, Thomas G; Whicker, F W
2014-08-01
Data from published studies and World Wide Web sources were combined to produce and test a regression model to predict Cs concentration ratios for freshwater fish species. The accuracies of predicted concentration ratios, which were computed using 1) species trophic levels obtained from random resampling of known food items and 2) K concentrations in the water for 207 fish from 44 species and 43 locations, were tested against independent observations of ratios for 57 fish from 17 species from 25 locations. Accuracy was assessed as the percentage of observed-to-predicted ratios within factors of 2 or 3. Conservatism, expressed as the lack of underprediction, was assessed as the percentage of observed-to-predicted ratios that were less than 2 or less than 3. The model's median observed-to-predicted ratio was 1.26, which was not significantly different from 1, and 50% of the ratios were between 0.73 and 1.85. The percentages of ratios within factors of 2 or 3 were 67 and 82%, respectively. The percentages of ratios that were <2 or <3 were 79 and 88%, respectively. An example for Perca fluviatilis demonstrated that increased prediction accuracy could be obtained when more detailed knowledge of diet was available to estimate trophic level. Copyright © 2014 Elsevier Ltd. All rights reserved.
Predicting online ratings based on the opinion spreading process
NASA Astrophysics Data System (ADS)
He, Xing-Sheng; Zhou, Ming-Yang; Zhuo, Zhao; Fu, Zhong-Qian; Liu, Jian-Guo
2015-10-01
Predicting users' online ratings is a long-standing challenge and has drawn much attention. In this paper, we present a rating prediction method by combining the user opinion spreading process with the collaborative filtering algorithm, where user similarity is defined by measuring the amount of opinion a user transfers to another based on the primitive user-item rating matrix. The proposed method could produce a more precise rating prediction for each unrated user-item pair. In addition, we introduce a tunable parameter λ to regulate the preferential diffusion relevant to the degree of both opinion sender and receiver. The numerical results for the Movielens and Netflix data sets show that this algorithm has better accuracy than the standard user-based collaborative filtering algorithm using cosine and Pearson correlation, without increasing computational complexity. By tuning λ, our method could further boost the prediction accuracy when using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) as measurements. In the optimal cases, the corresponding accuracies (MAE and RMSE) are improved by 11.26% and 8.84% on the Movielens data set and by 13.49% and 10.52% on the Netflix data set, respectively, compared to the item average method.
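The baseline against which the opinion-spreading method is compared, user-based collaborative filtering with cosine similarity, can be sketched as follows; the ratings are hypothetical.

```python
# User-based collaborative filtering sketch: predict an unrated user-item
# pair as the similarity-weighted average of neighbours' ratings.
import math

ratings = {  # user -> {item: rating}; hypothetical data
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 4, "B": 2, "C": 5},
    "u3": {"A": 1, "B": 5, "C": 2},
}

def cosine(u, v):
    common = set(u) & set(v)
    num = sum(u[i] * v[i] for i in common)
    den = (math.sqrt(sum(r * r for r in u.values()))
           * math.sqrt(sum(r * r for r in v.values())))
    return num / den if den else 0.0

def predict(user, item):
    """Similarity-weighted average of neighbours' ratings on `item`."""
    num = den = 0.0
    for other, rs in ratings.items():
        if other == user or item not in rs:
            continue
        s = cosine(ratings[user], rs)
        num += s * rs[item]
        den += abs(s)
    return num / den if den else None
```

The opinion-spreading method replaces `cosine` with a diffusion-based similarity over the rating matrix; the surrounding prediction machinery stays the same.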
Factoring vs linear modeling in rate estimation: a simulation study of relative accuracy.
Maldonado, G; Greenland, S
1998-07-01
A common strategy for modeling dose-response in epidemiology is to transform ordered exposures and covariates into sets of dichotomous indicator variables (that is, to factor the variables). Factoring tends to increase estimation variance, but it also tends to decrease bias and thus may increase or decrease total accuracy. We conducted a simulation study to examine the impact of factoring on the accuracy of rate estimation. Factored and unfactored Poisson regression models were fit to follow-up study datasets that were randomly generated from 37,500 population model forms that ranged from subadditive to supramultiplicative. In the situations we examined, factoring sometimes substantially improved accuracy relative to fitting the corresponding unfactored model, sometimes substantially decreased accuracy, and sometimes made little difference. The difference in accuracy between factored and unfactored models depended in a complicated fashion on the difference between the true and fitted model forms, the strength of exposure and covariate effects in the population, and the study size. It may be difficult in practice to predict when factoring is increasing or decreasing accuracy. We recommend, therefore, that the strategy of factoring variables be supplemented with other strategies for modeling dose-response.
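The factoring strategy discussed above, replacing an ordered exposure by a set of dichotomous indicators, looks like this in a minimal sketch; the dose categories are hypothetical.

```python
# Factoring sketch: an ordered exposure becomes one indicator column per
# non-reference level, instead of a single linear dose term.
doses = [0, 1, 2, 2, 1, 0, 2]            # hypothetical dose categories
levels = sorted(set(doses))[1:]          # reference level 0 gets no column

def factor(x):
    """One 0/1 indicator per non-reference level."""
    return [1 if x == lvl else 0 for lvl in levels]

design = [factor(x) for x in doses]      # design matrix for regression
```

Fitting a Poisson regression on `design` estimates a separate rate for each level (less bias, more variance); fitting it on `doses` as a single column imposes a log-linear trend (more bias, less variance), which is the accuracy trade-off the abstract examines.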
ERIC Educational Resources Information Center
Bol, Linda; Hacker, Douglas J.; Walck, Camilla C.; Nunnery, John A.
2012-01-01
A 2 x 2 factorial design was employed in a quasi-experiment to investigate the effects of guidelines in group or individual settings on the calibration accuracy and achievement of 82 high school biology students. Significant main effects indicated that calibration practice with guidelines and practice in group settings increased prediction and…
Context Memory Decline in Middle Aged Adults is Related to Changes in Prefrontal Cortex Function.
Kwon, Diana; Maillet, David; Pasvanis, Stamatoula; Ankudowich, Elizabeth; Grady, Cheryl L; Rajah, M Natasha
2016-06-01
The ability to encode and retrieve spatial and temporal contextual details of episodic memories (context memory) begins to decline at midlife. In the current study, event-related fMRI was used to investigate the neural correlates of context memory decline in healthy middle aged adults (MA) compared with young adults (YA). Participants were scanned while performing easy and hard versions of spatial and temporal context memory tasks. Scans were obtained at encoding and retrieval. Significant reductions in context memory retrieval accuracy were observed in MA, compared with YA. The fMRI results revealed that overall, both groups exhibited similar patterns of brain activity in parahippocampal cortex, ventral occipito-temporal regions and prefrontal cortex (PFC) during encoding. In contrast, at retrieval, there were group differences in ventral occipito-temporal and PFC activity, due to these regions being more activated in MA, compared with YA. Furthermore, only in YA, increased encoding activity in ventrolateral PFC, and increased retrieval activity in occipital cortex, predicted increased retrieval accuracy. In MA, increased retrieval activity in anterior PFC predicted increased retrieval accuracy. These results suggest that there are changes in PFC contributions to context memory at midlife. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
An ensemble framework for identifying essential proteins.
Zhang, Xue; Xiao, Wangxin; Acencio, Marcio Luis; Lemke, Ney; Wang, Xujing
2016-08-25
Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the number of common predicted essential proteins by different methods is very small. In this paper, an ensemble framework is proposed which integrates gene expression data and protein-protein interaction networks (PINs). It aims to improve the prediction accuracy of basic centrality measures. The idea behind this ensemble framework is that different protein-protein interactions (PPIs) may show different contributions to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are integrated into the ensemble framework respectively. We evaluated the performance of the proposed ensemble framework using yeast PINs and gene expression data. The results show that it can considerably improve the prediction accuracy of the five centrality measures individually. It can also remarkably increase the number of common predicted essential proteins among those predicted by each centrality measure individually and enable each centrality measure to find more low-degree essential proteins. This paper demonstrates that it is valuable to differentiate the contributions of different PPIs for identifying essential proteins based on network topological characteristics. The proposed ensemble framework is a successful paradigm to this end.
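A generic sketch of combining centrality measures on a toy network is given below, using rank averaging across two measures; the paper's framework additionally uses gene expression data to differentiate interaction contributions, which is omitted here.

```python
# Combine degree and closeness centrality by averaging ranks on a toy
# undirected network; edges and the rank-average rule are illustrative.
from collections import defaultdict

edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v); adj[v].add(u)
nodes = sorted(adj)

def bfs_dist(src):
    """Shortest-path distances from src via breadth-first search."""
    dist, frontier = {src: 0}, [src]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    return dist

degree = {n: len(adj[n]) for n in nodes}
closeness = {n: (len(nodes) - 1) / sum(bfs_dist(n).values()) for n in nodes}

def ranks(score):   # higher score -> smaller (better) rank
    order = sorted(nodes, key=lambda n: -score[n])
    return {n: i for i, n in enumerate(order)}

ensemble = {n: (ranks(degree)[n] + ranks(closeness)[n]) / 2 for n in nodes}
top = min(ensemble, key=ensemble.get)   # best-ranked candidate
```

Any further centrality (betweenness, eigenvector, subgraph) slots into the same rank-average without changing the surrounding code.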
Threshold models for genome-enabled prediction of ordinal categorical traits in plant breeding.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo; Eskridge, Kent; Crossa, José
2014-12-23
Categorical scores for disease susceptibility or resistance often are recorded in plant breeding. The aim of this study was to introduce genomic models for analyzing ordinal characters and to assess the predictive ability of genomic predictions for ordered categorical phenotypes using a threshold model counterpart of the Genomic Best Linear Unbiased Predictor (i.e., TGBLUP). The threshold model was used to relate a hypothetical underlying scale to the outward categorical response. We present an empirical application where a total of nine models, five without interaction and four with genomic × environment interaction (G×E) and genomic additive × additive × environment interaction (G×G×E), were used. We assessed the proposed models using data consisting of 278 maize lines genotyped with 46,347 single-nucleotide polymorphisms and evaluated for disease resistance [with ordinal scores from 1 (no disease) to 5 (complete infection)] in three environments (Colombia, Zimbabwe, and Mexico). Models with G×E captured a sizeable proportion of the total variability, which indicates the importance of introducing interaction to improve prediction accuracy. Relative to models based on main effects only, the models that included G×E achieved 9-14% gains in prediction accuracy; adding additive × additive interactions did not increase prediction accuracy consistently across locations. Copyright © 2015 Montesinos-López et al.
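The threshold-model link described above, cutting a continuous latent liability at fixed thresholds to produce the ordinal score, can be sketched directly; the thresholds and liability values below are hypothetical.

```python
# Threshold-model link sketch: a continuous latent "liability" is cut at
# fixed thresholds to yield an ordinal score from 1 (no disease) to 5
# (complete infection). Thresholds and liabilities are hypothetical.
import bisect

thresholds = [-1.5, -0.5, 0.5, 1.5]    # 4 cuts -> 5 ordered categories

def category(liability):
    return bisect.bisect_right(thresholds, liability) + 1   # scores 1..5
```

In TGBLUP the liability itself is modeled as a genomic regression (with optional G×E terms), and the thresholds are estimated jointly rather than fixed as here.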
NASA Astrophysics Data System (ADS)
Lee, Soon Hwan; Kim, Ji Sun; Lee, Kang Yeol; Shon, Keon Tae
2017-04-01
Air quality in Korea is worsening due to increasing particulate matter (PM). At present, the PM forecast is announced based on the PM concentration predicted by a numerical air quality prediction model. However, forecast accuracy is not as high as expected because of various uncertainties in the physical and chemical characteristics of PM. The purpose of this study was to develop a combined numerical-statistical ensemble model to improve the accuracy of PM10 concentration predictions. The numerical models used in this study are the three-dimensional atmospheric Weather Research and Forecasting (WRF) model and the Community Multiscale Air Quality (CMAQ) model. The target areas for the PM forecast are the Seoul, Busan, Daegu, and Daejeon metropolitan areas in Korea. The data used in the model development are observed PM concentrations and CMAQ predictions over a 3-month period (March 1 - May 31, 2014). To reduce the systematic error of the CMAQ predictions, a dynamic-statistical technique was applied using a dynamic linear model (DLM) based on Bayesian Kalman filtering. Applying the corrections generated by the dynamic linear model to the forecasts of PM concentration improved accuracy. In particular, excellent improvement was shown at high PM concentrations, where the damage is relatively large.
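The bias-correction idea, tracking the slowly varying systematic error of a raw model forecast with a Kalman filter and subtracting the current estimate, can be illustrated with a scalar sketch; all numbers are hypothetical and this is not the study's actual DLM.

```python
# Scalar Kalman-filter bias correction: estimate the raw forecast's
# systematic error recursively and subtract it before issuing the next
# forecast. Values are hypothetical PM10 concentrations.
raw = [55, 60, 58, 62, 59, 61]        # raw model forecasts
obs = [45, 49, 47, 50, 48, 50]        # observed values

bias, p = 0.0, 1.0                    # state estimate and its variance
q, r = 0.01, 4.0                      # process and observation noise
corrected = []
for f, o in zip(raw, obs):
    corrected.append(f - bias)        # forecast issued before seeing obs
    p += q                            # predict step: variance grows
    k = p / (p + r)                   # Kalman gain
    bias += k * ((f - o) - bias)      # update with the newly seen error
    p *= 1 - k

mae_raw = sum(abs(f - o) for f, o in zip(raw, obs)) / len(raw)
mae_cor = sum(abs(c - o) for c, o in zip(corrected, obs)) / len(obs)
```

Because the raw forecasts here carry a persistent positive bias, the filtered correction steadily removes most of it after a few steps.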
Fagerberg, Marie C; Maršál, Karel; Källén, Karin
2015-05-01
We aimed to validate a widely used US prediction model for vaginal birth after cesarean (Grobman et al. [8]) and modify it to suit Swedish conditions. Women having experienced one cesarean section and at least one subsequent delivery (n=49,472) in the Swedish Medical Birth Registry 1992-2011 were randomly divided into two data sets. In the development data set, variables associated with successful trial of labor were identified using multiple logistic regression. The predictive ability of the estimates previously published by Grobman et al., and of our modified and new estimates, respectively, was then evaluated using the validation data set. The accuracy of the models for prediction of vaginal birth after cesarean was measured by the area under the receiver operating characteristic curve. For maternal age, body mass index, prior vaginal delivery, and prior labor arrest, the odds ratio estimates for vaginal birth after cesarean were similar to those previously published. The prediction accuracy increased when information on the indication for the previous cesarean section was added (area under the receiver operating characteristic curve increased from 0.69 to 0.71), and increased further when maternal height and delivery unit cesarean section rates were included (area under the receiver operating characteristic curve=0.74). The correlation between the individual predicted vaginal birth after cesarean probability and the observed trial of labor success rate was high in all the respective predicted probability deciles. Customization of prediction models for vaginal birth after cesarean is of considerable value. Choosing relevant indicators for a Swedish setting made it possible to achieve excellent prediction accuracy for success in trial of labor after cesarean.
During the delicate process of counseling about preferred delivery mode after one cesarean section, considering the results of our study may facilitate the choice between a trial of labor or an elective repeat cesarean section. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
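The accuracy metric used above, the area under the receiver operating characteristic curve, can be computed directly from predicted probabilities via the rank (Mann-Whitney) identity; the predictions below are hypothetical.

```python
# AUC via the Mann-Whitney identity: the fraction of positive/negative
# pairs ranked correctly (ties count half). Scores are hypothetical
# predicted probabilities with 1 = successful trial of labour.
probs = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.4, 0), (0.2, 0)]

def auc(scored):
    pos = [s for s, y in scored if y == 1]
    neg = [s for s, y in scored if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.74, as reported for the extended model, means a randomly chosen success is ranked above a randomly chosen failure 74% of the time.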
Model Predictions and Observed Performance of JWST's Cryogenic Position Metrology System
NASA Technical Reports Server (NTRS)
Lunt, Sharon R.; Rhodes, David; DiAntonio, Andrew; Boland, John; Wells, Conrad; Gigliotti, Trevis; Johanning, Gary
2016-01-01
Cryogenic testing of the James Webb Space Telescope requires measurement systems that are both highly accurate and able to function in that environment. Close-range photogrammetry was identified as meeting those criteria. Testing the capability of a close-range photogrammetric system prior to its existence is a challenging problem. Computer simulation was chosen over building a scaled mock-up to allow for increased flexibility in testing various configurations. Extensive validation work was done to ensure that the actual as-built system met accuracy and repeatability requirements. The simulated image data predicted the uncertainty in measurement to be within specification, and this prediction was borne out experimentally. Uncertainty at all levels was verified experimentally to be less than 0.1 millimeters.
Shetty, N; Løvendahl, P; Lund, M S; Buitenhuis, A J
2017-01-01
The present study explored the effectiveness of Fourier transform mid-infrared (FT-IR) spectral profiles as a predictor for dry matter intake (DMI) and residual feed intake (RFI). The partial least squares regression method was used to develop the prediction models. The models were validated using different external test sets, one randomly leaving out 20% of the records (validation A), the second randomly leaving out 20% of cows (validation B), and a third (for DMI prediction models) randomly leaving out one cow (validation C). The data included 1,044 records from 140 cows; 97 were Danish Holstein and 43 Danish Jersey. Results showed better accuracies for validation A compared with the other validation methods. Milk yield (MY) contributed largely to DMI prediction; MY explained 59% of the variation, and the validated model root mean square error of prediction (RMSEP) was 2.24 kg. The model was improved by adding live weight (LW) as an additional predictor trait, where the accuracy (R²) increased from 0.59 to 0.72 and the RMSEP decreased from 2.24 to 1.83 kg. When only the milk FT-IR spectral profile was used in DMI prediction, a lower prediction ability was obtained, with R²=0.30 and RMSEP=2.91 kg. However, once the spectral information was added, along with MY and LW as predictors, model accuracy improved: R² increased to 0.81 and RMSEP decreased to 1.49 kg. Prediction accuracies of RFI changed throughout lactation. The RFI prediction model for the early-lactation stage was better compared with across-lactation or mid- and late-lactation stages, with R²=0.46 and RMSEP=1.70. The most important spectral wavenumbers that contributed to DMI and RFI prediction models included fat, protein, and lactose peaks. Comparable prediction results were obtained when using infrared-predicted fat, protein, and lactose instead of full spectra, indicating that FT-IR spectral data do not add significant new information to improve DMI and RFI prediction models.
Therefore, in practice, if full FT-IR spectral data are not stored, it is possible to achieve similar DMI or RFI prediction results based on standard milk control data. For DMI, the milk fat region was responsible for the major variation in milk spectra; for RFI, the major variation in milk spectra was within the milk protein region. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
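The two accuracy metrics reported throughout this abstract, R² and RMSEP, are straightforward to compute; the observed and predicted intake values below are hypothetical.

```python
# R-squared and root mean square error of prediction (RMSEP) on
# hypothetical observed vs. predicted dry matter intake values (kg).
observed = [18.0, 20.5, 22.0, 19.5, 21.0]
predicted = [17.5, 21.0, 21.0, 20.0, 20.5]

def rmsep(obs, pred):
    """Root mean squared prediction error, in the trait's units."""
    return (sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)) ** 0.5

def r_squared(obs, pred):
    """Fraction of the observed variance explained by the predictions."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1 - ss_res / ss_tot
```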
Ensemble-based prediction of RNA secondary structures.
Aghaeepour, Nima; Hoos, Holger H
2013-04-24
Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach.
In addition, AveRNA allows an intuitive and effective control of the trade-off between false negative and false positive base pair predictions. Finally, AveRNA can make use of arbitrary sets of secondary structure prediction procedures and can therefore be used to leverage improvements in prediction accuracy offered by algorithms and energy models developed in the future. Our data, MATLAB software and a web-based version of AveRNA are publicly available at http://www.cs.ubc.ca/labs/beta/Software/AveRNA.
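A generic sketch of the ensemble idea is shown below, combining several predictors' base-pair sets by vote; the predictions are hypothetical, and AveRNA's actual combination scheme is more sophisticated than a plain vote.

```python
# Ensemble-of-predictors sketch: keep base pairs supported by more than
# a threshold fraction of component predictors. Hypothetical predictions
# of base pairs (i, j) from three stand-in structure predictors.
from collections import Counter

predictions = [
    {(1, 20), (2, 19), (5, 15)},
    {(1, 20), (2, 19), (6, 14)},
    {(1, 20), (3, 18), (5, 15)},
]

def ensemble_pairs(preds, threshold=0.5):
    votes = Counter(p for pred in preds for p in pred)
    need = threshold * len(preds)
    return {pair for pair, v in votes.items() if v > need}
```

Raising `threshold` trades false positives for false negatives, which is the tunable trade-off the abstract describes.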
Dias, Kaio Olímpio Das Graças; Gezan, Salvador Alejandro; Guimarães, Claudia Teixeira; Nazarian, Alireza; da Costa E Silva, Luciano; Parentoni, Sidney Netto; de Oliveira Guimarães, Paulo Evaristo; de Oliveira Anoni, Carina; Pádua, José Maria Villela; de Oliveira Pinto, Marcos; Noda, Roberto Willians; Ribeiro, Carlos Alexandre Gomes; de Magalhães, Jurandir Vieira; Garcia, Antonio Augusto Franco; de Souza, João Cândido; Guimarães, Lauro José Moreira; Pastina, Maria Marta
2018-07-01
Breeding for drought tolerance is a challenging task that requires costly, extensive, and precise phenotyping. Genomic selection (GS) can be used to maximize selection efficiency and the genetic gains in maize (Zea mays L.) breeding programs for drought tolerance. Here, we evaluated the accuracy of genomic selection (GS) using additive (A) and additive + dominance (AD) models to predict the performance of untested maize single-cross hybrids for drought tolerance in multi-environment trials. Phenotypic data of five drought tolerance traits were measured in 308 hybrids along eight trials under water-stressed (WS) and well-watered (WW) conditions over two years and two locations in Brazil. Hybrids' genotypes were inferred based on their parents' genotypes (inbred lines) using single-nucleotide polymorphism markers obtained via genotyping-by-sequencing. GS analyses were performed using genomic best linear unbiased prediction by fitting a factor analytic (FA) multiplicative mixed model. Two cross-validation (CV) schemes were tested: CV1 and CV2. The FA framework allowed for investigating the stability of additive and dominance effects across environments, as well as the additive-by-environment and the dominance-by-environment interactions, with interesting applications for parental and hybrid selection. Results showed differences in the predictive accuracy between A and AD models, using both CV1 and CV2, for the five traits in both water conditions. For grain yield (GY) under WS and using CV1, the AD model doubled the predictive accuracy in comparison to the A model. Through CV2, GS models benefit from borrowing information of correlated trials, resulting in an increase of 40% and 9% in the predictive accuracy of GY under WS for A and AD models, respectively. These results highlight the importance of multi-environment trial analyses using GS models that incorporate additive and dominance effects for genomic predictions of GY under drought in maize single-cross hybrids.
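The two kernels recurring in these genomic-selection studies, the linear (GBLUP-type) kernel and the Gaussian kernel, can be sketched on a toy marker matrix; the median-distance bandwidth rule used here is one common choice, not necessarily the one used by these authors.

```python
# Linear (GBLUP-type) kernel K_ij = x_i . x_j / m versus the Gaussian
# kernel K_ij = exp(-d_ij^2 / q), on a hypothetical marker matrix X.
import math

X = [[1, 0, 2], [0, 1, 2], [2, 2, 0]]   # 3 hybrids x 3 markers (toy)
m = len(X[0])

def linear_kernel(a, b):
    return sum(x * y for x, y in zip(a, b)) / m

def gaussian_kernel(a, b, q):
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / q)

# Bandwidth q set to the median squared distance between lines (one
# common heuristic).
d2s = [sum((x - y) ** 2 for x, y in zip(X[i], X[j]))
       for i in range(len(X)) for j in range(i + 1, len(X))]
q = sorted(d2s)[len(d2s) // 2]

K_lin = [[linear_kernel(a, b) for b in X] for a in X]
K_gau = [[gaussian_kernel(a, b, q) for b in X] for a in X]
```

Either kernel matrix is then plugged into the same mixed-model machinery; the Gaussian kernel's nonlinearity is what produces the GK-over-GBLUP accuracy gains reported in the headline study.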
NASA Astrophysics Data System (ADS)
Huang, Bing; von Lilienfeld, O. Anatole
2016-10-01
The predictive accuracy of machine learning (ML) models of molecular properties depends on the choice of molecular representation. Inspired by the postulates of quantum mechanics, we introduce a hierarchy of representations that meet uniqueness and target similarity criteria. To systematically control target similarity, we rely on interatomic many-body expansions, as implemented in universal force fields, including Bonding, Angular (BA), and higher-order terms. Adding higher-order contributions systematically increases similarity to the true potential energy and the predictive accuracy of the resulting ML models. We report numerical evidence for the performance of BAML models trained on molecular properties pre-calculated at electron-correlated and density functional levels of theory for thousands of small organic molecules. Properties studied include enthalpies and free energies of atomization, heat capacity, zero-point vibrational energies, dipole moment, polarizability, HOMO/LUMO energies and gap, ionization potential, electron affinity, and electronic excitations. After training, BAML predicts energies or electronic properties of out-of-sample molecules with unprecedented accuracy and speed.
Labrenz, Franziska; Icenhour, Adriane; Benson, Sven; Elsenbruch, Sigrid
2015-01-01
As a fundamental learning process, fear conditioning promotes the formation of associations between predictive cues and biologically significant signals. In its application to pain, conditioning may provide important insight into mechanisms underlying pain-related fear, although knowledge especially in interoceptive pain paradigms remains scarce. Furthermore, while the influence of contingency awareness on excitatory learning is subject of ongoing debate, its role in pain-related acquisition is poorly understood and essentially unknown regarding extinction as inhibitory learning. Therefore, we addressed the impact of contingency awareness on learned emotional responses to pain- and safety-predictive cues in a combined dataset of two pain-related conditioning studies. In total, 75 healthy participants underwent differential fear acquisition, during which rectal distensions as interoceptive unconditioned stimuli (US) were repeatedly paired with a predictive visual cue (conditioned stimulus; CS+) while another cue (CS−) was presented unpaired. During extinction, both CS were presented without US. CS valence, indicating learned emotional responses, and CS-US contingencies were assessed on visual analog scales (VAS). Based on an integrative measure of contingency accuracy, a median-split was performed to compare groups with low vs. high contingency accuracy regarding learned emotional responses. To investigate predictive value of contingency accuracy, regression analyses were conducted. Highly accurate individuals revealed more pronounced negative emotional responses to CS+ and increased positive responses to CS− when compared to participants with low contingency accuracy. Following extinction, highly accurate individuals had fully extinguished pain-predictive cue properties, while exhibiting persistent positive emotional responses to safety signals. In contrast, individuals with low accuracy revealed equally positive emotional responses to both, CS+ and CS−. 
Contingency accuracy predicted variance in the formation of positive responses to safety cues while no predictive value was found for danger cues following acquisition and for neither cue following extinction. Our findings underscore specific roles of learned danger and safety in pain-related acquisition and extinction. Contingency accuracy appears to distinctly impact learned emotional responses to safety and danger cues, supporting aversive learning to occur independently from CS-US awareness. The interplay of cognitive and emotional factors in shaping excitatory and inhibitory pain-related learning may contribute to altered pain processing, underscoring its clinical relevance in chronic pain. PMID:26640433
Genomic Prediction of Gene Bank Wheat Landraces.
Crossa, José; Jarquín, Diego; Franco, Jorge; Pérez-Rodríguez, Paulino; Burgueño, Juan; Saint-Pierre, Carolina; Vikram, Prashant; Sansaloni, Carolina; Petroli, Cesar; Akdemir, Deniz; Sneller, Clay; Reynolds, Matthew; Tattaris, Maria; Payne, Thomas; Guzman, Carlos; Peña, Roberto J; Wenzl, Peter; Singh, Sukhwinder
2016-07-07
This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, "diversity" and "prediction", including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15-20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm into elite materials. 
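The evaluation protocol above (random TRN20-TST80 partitions scored by correlation-based prediction accuracy) can be sketched as follows. The correlation-based accuracy metric is the common convention in this literature and is assumed here; the core-set construction is not reproduced.

```python
import numpy as np

def trn20_tst80_split(n, rng):
    """Random 20% training / 80% testing partition (TRN20-TST80)."""
    idx = rng.permutation(n)
    n_trn = int(round(0.2 * n))
    return idx[:n_trn], idx[n_trn:]

def prediction_accuracy(y_pred, y_obs):
    """Accuracy reported as the Pearson correlation between predicted
    and observed values."""
    return float(np.corrcoef(y_pred, y_obs)[0, 1])
```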
Copyright © 2016 Crossa et al.
A scoring algorithm for predicting the presence of adult asthma: a prospective derivation study.
Tomita, Katsuyuki; Sano, Hiroyuki; Chiba, Yasutaka; Sato, Ryuji; Sano, Akiko; Nishiyama, Osamu; Iwanaga, Takashi; Higashimoto, Yuji; Haraguchi, Ryuta; Tohda, Yuji
2013-03-01
To predict the presence of asthma in adult patients with respiratory symptoms, we developed a scoring algorithm using clinical parameters. We prospectively analysed 566 adult outpatients who visited Kinki University Hospital for the first time with complaints of nonspecific respiratory symptoms. Asthma was comprehensively diagnosed by specialists using symptoms, signs, and objective tools including bronchodilator reversibility and/or the assessment of bronchial hyperresponsiveness (BHR). Multiple logistic regression analysis was performed to categorise patients and determine the accuracy of diagnosing asthma. A scoring algorithm using the symptom-sign score was developed, based on diurnal variation of symptoms (1 point), recurrent episodes (2 points), medical history of allergic diseases (1 point), and wheeze sound (2 points). A score of >3 had 35% sensitivity and 97% specificity for discriminating between patients with and without asthma and assigned a high probability of having asthma (accuracy 90%). A score of 1 or 2 points assigned intermediate probability (accuracy 68%). After providing additional data of forced expiratory volume in 1 second/forced vital capacity (FEV1/FVC) ratio <0.7, the post-test probability of having asthma was increased to 93%. A score of 0 points assigned low probability (accuracy 31%). After providing additional data of positive reversibility, the post-test probability of having asthma was increased to 88%. This pragmatic diagnostic algorithm is useful for predicting the presence of adult asthma and for determining the appropriate time for consultation with a pulmonologist.
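The point-based rule above is concrete enough to transcribe directly. The weights and bands are taken from the abstract; how a score of exactly 3 is assigned is ambiguous there (it reports >3, 1-2 points, and 0 points), so grouping 3 with the intermediate band below is an assumption.

```python
def symptom_sign_score(diurnal_variation, recurrent_episodes,
                       allergic_history, wheeze):
    """Symptom-sign score with the point weights from the abstract."""
    return (1 * diurnal_variation + 2 * recurrent_episodes
            + 1 * allergic_history + 2 * wheeze)

def probability_band(score):
    """Pre-test probability band for asthma."""
    if score > 3:
        return "high"          # reported accuracy 90%
    if score >= 1:
        return "intermediate"  # abstract states this band for 1-2 points
    return "low"               # reported accuracy 31%
```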
Competition Processes and Proactive Interference in Short-Term Memory
ERIC Educational Resources Information Center
Bennett, Raymond W.; Kurzeja, Paul L.
1976-01-01
In an experiment using single-word items, subjects are run under three different speed-accuracy trade-off conditions. A competition model would predict that when subjects are forced to respond quickly, there will be an increase in errors, and these will be from recent past items. The prediction was confirmed. (CHK)
USDA-ARS's Scientific Manuscript database
Single-step Genomic Best Linear Unbiased Predictor (ssGBLUP) has become increasingly popular for whole-genome prediction (WGP) modeling as it utilizes any available pedigree and phenotypes on both genotyped and non-genotyped individuals. The WGP accuracy of ssGBLUP has been demonstrated to be greate...
CD-Based Indices for Link Prediction in Complex Network.
Wang, Tao; Wang, Hongjue; Wang, Xiaoxia
2016-01-01
Many similarity-based algorithms have been designed to address the problem of link prediction over the past decade. To improve prediction accuracy, a novel cosine similarity index, CD, based on the distance between nodes and the cosine value between vectors, is proposed in this paper. First, a node coordinate matrix can be obtained from node distances, which differ from the distance matrix, and the row vectors of this matrix are regarded as the coordinates of the nodes. The cosine value between node coordinates is then used as their similarity index. A local community density index, LD, is also proposed. A series of CD-based indices, including CD-LD-k, CD*LD-k, CD-k, and CDI, is then presented and applied to ten real networks. Experimental results demonstrate the effectiveness of the CD-based indices. The effects of the network clustering coefficient and assortative coefficient on the prediction accuracy of the indices are analyzed. CD-LD-k and CD*LD-k can improve prediction accuracy regardless of whether the network's assortative coefficient is negative or positive. An analysis of the relative precision of each method on each network shows that the CD-LD-k and CD*LD-k indices have excellent average performance and robustness. The CD and CD-k indices perform better on positive assortative networks than on negative assortative networks. For negative assortative networks, we improve and refine the CD index, referred to as the CDI index, combining the advantages of the CD index and the evolutionary mechanism of the BA network model. Experimental results reveal that the CDI index can increase the prediction accuracy of CD on negative assortative networks.
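The cosine-similarity scoring idea can be illustrated as below. Note the abstract says its node coordinates differ from the raw distance matrix; using shortest-path distance-matrix rows as stand-in coordinates here is a simplification to show the mechanics, not the paper's exact CD construction.

```python
import numpy as np
from itertools import combinations

def shortest_path_matrix(adj):
    """All-pairs shortest-path distances (Floyd-Warshall) for an
    unweighted adjacency matrix of a connected graph."""
    D = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(len(adj)):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def cd_scores(adj):
    """Score every non-adjacent node pair by the cosine of their
    'coordinate' vectors; a higher score suggests a likelier link."""
    D = shortest_path_matrix(adj)
    scores = {}
    for i, j in combinations(range(len(adj)), 2):
        if adj[i, j] == 0:
            cos = D[i] @ D[j] / (np.linalg.norm(D[i]) * np.linalg.norm(D[j]))
            scores[(i, j)] = cos
    return scores
```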
Modified linear predictive coding approach for moving target tracking by Doppler radar
NASA Astrophysics Data System (ADS)
Ding, Yipeng; Lin, Xiaoyi; Sun, Ke-Hui; Xu, Xue-Mei; Liu, Xi-Yao
2016-07-01
Doppler radar is a cost-effective tool for moving target tracking, which can support a large range of civilian and military applications. A modified linear predictive coding (LPC) approach is proposed to increase the target localization accuracy of Doppler radar. Based on time-frequency analysis of the received echo, the proposed approach first estimates the noise statistical parameters in real time and constructs an adaptive filter to suppress the noise interference. A linear predictive model is then applied to extend the available data, which helps improve the resolution of the target localization result. Compared with the traditional LPC method, which chooses the extension data length empirically, the proposed approach develops an error array to evaluate the prediction accuracy and thus adaptively selects the optimum extension data length. Finally, the prediction error array is superimposed on the predictor output to correct the prediction error. A series of experiments is conducted to illustrate the validity and performance of the proposed techniques.
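The data-extension step of LPC can be sketched on its own, independently of the radar-specific noise filtering. The least-squares coefficient fit and the fixed predictor order here are assumptions; the paper's contribution is precisely the adaptive choice of extension length via an error array, which this sketch omits.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Least-squares fit of an order-p linear predictor
    x[n] ~ sum_k a[k] * x[n - 1 - k]."""
    A = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    b = x[order:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

def lpc_extend(x, order, n_extra):
    """Extend the signal by iterating the fitted predictor."""
    a = lpc_coefficients(x, order)
    y = list(x)
    for _ in range(n_extra):
        y.append(float(a @ np.array(y[-order:][::-1])))
    return np.array(y)
```

A pure sinusoid satisfies an exact order-2 recursion, so the extension reproduces it almost perfectly; noisy echoes would need the adaptive filtering described above first.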
Holland, Katherine D; Bouley, Thomas M; Horn, Paul S
2017-07-01
Variants in the neuronal voltage-gated sodium channel α-subunit genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of the variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict the epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining whether epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine whether a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetics criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had high sensitivity but low specificity, with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity.
Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a combination of algorithms increased the accuracy to 0.92. Potentially pathogenic variants are present in population-based sources. Most computational algorithms overestimate pathogenicity; however, a weighted combination of several algorithms increased classification accuracy to >0.90. Wiley Periodicals, Inc. © 2017 International League Against Epilepsy.
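The two ideas above (combining several algorithms' scores with logistic regression, and tuning the decision threshold to trade sensitivity against specificity) can be sketched as follows. The toy data and the plain gradient-ascent fit are assumptions for illustration; the study's actual predictor inputs and fitted weights are not reproduced.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Plain gradient-ascent logistic regression with an intercept.
    Each column of X is one algorithm's pathogenicity score."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w += lr * Xb.T @ (y - p) / len(y)
    return w

def classify(w, X, threshold=0.5):
    """Label variants pathogenic (1) at an adjustable probability
    threshold; raising it trades sensitivity for specificity."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    return (p >= threshold).astype(int)
```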
Toth, Jeffrey P.; Daniels, Karen A.; Solinger, Lisa A.
2011-01-01
How do aging and prior knowledge affect memory and metamemory? We explored this question in the context of a dual-process approach to Judgments of Learning (JOLs) which require people to predict their ability to remember information at a later time. Young and older adults (n's = 36, mean ages = 20.2 & 73.1) studied the names of actors that were famous in the 1950s or 1990s, providing a JOL for each. Recognition memory for studied and unstudied actors was then assessed using a Recollect/Know/No-Memory (R/K/N) judgment task. Results showed that prior knowledge increased recollection in both age groups such that older adults recollected significantly more 1950s actors than younger adults. Also, for both age groups and both decades, actors judged R at test garnered significantly higher JOLs at study than actors judged K or N. However, while the young showed benefits of prior knowledge on relative JOL accuracy, older adults did not, showing lower levels of JOL accuracy for 1950s actors despite having higher recollection for, and knowledge about, those actors. Overall, the data suggest that prior knowledge can be a double-edged sword, increasing the availability of details that can support later recollection, but also increasing non-diagnostic feelings of familiarity that can reduce the accuracy of memory predictions. PMID:21480715
Uemoto, Yoshinobu; Sasaki, Shinji; Kojima, Takatoshi; Sugimoto, Yoshikazu; Watanabe, Toshio
2015-11-19
Genetic variance that is not captured by single-nucleotide polymorphisms (SNPs) is due to imperfect linkage disequilibrium (LD) between SNPs and quantitative trait loci (QTLs), and the extent of LD between SNPs and QTLs depends on differences in minor allele frequency (MAF) between them. To evaluate the impact of QTL MAF on genomic evaluation, we performed a simulation study using real cattle genotype data. In total, 1368 Japanese Black cattle and 592,034 SNPs (Illumina BovineHD BeadChip) were used. We simulated phenotypes using real genotypes under different scenarios, varying the MAF categories, QTL heritability, number of QTLs, and distribution of QTL effects. After generating true breeding values and phenotypes, QTL heritability was estimated and the prediction accuracy of genomic estimated breeding values (GEBV) was assessed under different SNP densities, prediction models, and population sizes using a reference-test validation design. The extent of LD between SNPs and QTLs in this population was higher for QTLs with high MAF than for those with low MAF. The effect of QTL MAF on genomic evaluation depended on the genetic architecture, evaluation strategy, and population size. Regarding genetic architecture, genomic evaluation was affected by QTL MAF in combination with QTL heritability and the distribution of QTL effects; the number of QTLs did not affect genomic evaluation when it exceeded 50. Regarding evaluation strategy, we showed that SNP density and the prediction model affect heritability estimation and genomic prediction, and that this dependence varies with QTL MAF. Accurate QTL heritability estimates and GEBV were obtained when denser SNP information was used and the prediction model accounted for SNPs with both low and high MAF. Regarding population size, a large sample is needed to increase the accuracy of GEBV. In summary, QTL MAF had an impact on heritability estimation and prediction accuracy. Most genetic variance can be captured using denser SNPs and a prediction model that accounts for MAF, but a large sample size is needed to increase GEBV accuracy across all QTL MAF categories.
Accuracy of three-dimensional multislice view Doppler in diagnosis of morbid adherent placenta
Abdel Moniem, Alaa M.; Ibrahim, Ahmed; Akl, Sherif A.; Aboul-Enen, Loay; Abdelazim, Ibrahim A.
2015-01-01
Objective To detect the accuracy of the three-dimensional multislice view (3D MSV) Doppler in the diagnosis of morbid adherent placenta (MAP). Material and Methods Fifty pregnant women at ≥28 weeks gestation with suspected MAP were included in this prospective study. Two dimensional (2D) trans-abdominal gray-scale ultrasound scan was performed for the subjects to confirm the gestational age, placental location, and findings suggestive of MAP, followed by the 3D power Doppler and then the 3D MSV Doppler to confirm the diagnosis of MAP. Intraoperative findings and histopathology results of removed uteri in cases managed by emergency hysterectomy were compared with preoperative sonographic findings to detect the accuracy of the 3D MSV Doppler in the diagnosis of MAP. Results The 3D MSV Doppler increased the accuracy and predictive values of the diagnostic criteria of MAP compared with the 3D power Doppler. The sensitivity and negative predictive value (NPV) (79.6% and 82.2%, respectively) of crowded vessels over the peripheral sub-placental zone to detect difficult placental separation and considerable intraoperative blood loss in cases of MAP using the 3D power Doppler was increased to 82.6% and 84%, respectively, using the 3D MSV Doppler. In addition, the sensitivity, specificity, and positive predictive value (PPV) (90.9%, 68.8%, and 47%, respectively) of the disruption of the uterine serosa-bladder interface for the detection of emergency hysterectomy in cases of MAP using the 3D power Doppler was increased to 100%, 71.8%, and 50%, respectively, using the 3D MSV Doppler. Conclusion The 3D MSV Doppler is a useful adjunctive tool to the 3D power Doppler or color Doppler to refine the diagnosis of MAP. PMID:26401104
Accuracy of Predicted Genomic Breeding Values in Purebred and Crossbred Pigs.
Hidalgo, André M; Bastiaansen, John W M; Lopes, Marcos S; Harlizius, Barbara; Groenen, Martien A M; de Koning, Dirk-Jan
2015-05-26
Genomic selection has been widely implemented in dairy cattle breeding when the aim is to improve performance of purebred animals. In pigs, however, the final product is a crossbred animal. This may affect the efficiency of methods that are currently implemented for dairy cattle. Therefore, the objective of this study was to determine the accuracy of predicted breeding values in crossbred pigs using purebred genomic and phenotypic data. A second objective was to compare the predictive ability of SNPs when training is done in either single or multiple populations for four traits: age at first insemination (AFI); total number of piglets born (TNB); litter birth weight (LBW); and litter variation (LVR). We performed marker-based and pedigree-based predictions. Within-population predictions for the four traits ranged from 0.21 to 0.72. Multi-population prediction yielded accuracies ranging from 0.18 to 0.67. Predictions across purebred populations as well as predicting genetic merit of crossbreds from their purebred parental lines for AFI performed poorly (not significantly different from zero). In contrast, accuracies of across-population predictions and accuracies of purebred to crossbred predictions for LBW and LVR ranged from 0.08 to 0.31 and 0.11 to 0.31, respectively. Accuracy for TNB was zero for across-population prediction, whereas for purebred to crossbred prediction it ranged from 0.08 to 0.22. In general, marker-based outperformed pedigree-based prediction across populations and traits. However, in some cases pedigree-based prediction performed similarly or outperformed marker-based prediction. There was predictive ability when purebred populations were used to predict crossbred genetic merit using an additive model in the populations studied. AFI was the only exception, indicating that predictive ability depends largely on the genetic correlation between PB and CB performance, which was 0.31 for AFI. 
Multi-population prediction was no better than within-population prediction for the purebred validation set. Accuracy of prediction was very trait-dependent. Copyright © 2015 Hidalgo et al.
Oshida, Sotaro; Ogasawara, Kuniaki; Saura, Hiroaki; Yoshida, Koji; Fujiwara, Shunro; Kojima, Daigo; Kobayashi, Masakazu; Yoshida, Kenji; Kubo, Yoshitaka; Ogawa, Akira
2015-01-01
The purpose of the present study was to determine whether preoperative measurement of cerebral blood flow (CBF) with acetazolamide in addition to preoperative measurement of CBF at the resting state increases the predictive accuracy of development of cerebral hyperperfusion after carotid endarterectomy (CEA). CBF at the resting state and cerebrovascular reactivity (CVR) to acetazolamide were quantitatively assessed using N-isopropyl-p-[(123)I]-iodoamphetamine (IMP)-autoradiography method with single-photon emission computed tomography (SPECT) before CEA in 500 patients with ipsilateral internal carotid artery stenosis (≥ 70%). CBF measurement using (123)I-IMP SPECT was also performed immediately and 3 days after CEA. A region of interest (ROI) was automatically placed in the middle cerebral artery territory in the affected cerebral hemisphere using a three-dimensional stereotactic ROI template. Preoperative decreases in CBF at the resting state [95% confidence intervals (CIs), 0.855 to 0.967; P = 0.0023] and preoperative decreases in CVR to acetazolamide (95% CIs, 0.844 to 0.912; P < 0.0001) were significant independent predictors of post-CEA hyperperfusion. The area under the receiver operating characteristic curve for prediction of the development of post-CEA hyperperfusion was significantly greater for CVR to acetazolamide than for CBF at the resting state (difference between areas, 0.173; P < 0.0001). Sensitivity, specificity, and positive- and negative-predictive values for the prediction of the development of post-CEA hyperperfusion were significantly greater for CVR to acetazolamide than for CBF at the resting state (P < 0.05, respectively). The present study demonstrated that preoperative measurement of CBF with acetazolamide in addition to preoperative measurement of CBF at the resting state increases the predictive accuracy of the development of post-CEA hyperperfusion.
The accuracy of Genomic Selection in Norwegian red cattle assessed by cross-validation.
Luan, Tu; Woolliams, John A; Lien, Sigbjørn; Kent, Matthew; Svendsen, Morten; Meuwissen, Theo H E
2009-11-01
Genomic selection (GS) is a newly developed tool for the estimation of breeding values for quantitative traits through the use of dense markers covering the whole genome. For a successful application of GS, the accuracy of the prediction of genome-wide breeding values (GW-EBV) is a key issue to consider. Here we investigated the accuracy and possible bias of GW-EBV prediction, using real bovine SNP genotypes (18,991 SNPs) and phenotypic data of 500 Norwegian Red bulls. The study was performed on milk yield, fat yield, protein yield, first-lactation mastitis traits, and calving ease. Three methods, genomic best linear unbiased prediction (G-BLUP), Bayesian statistics (BayesB), and a mixture model approach (MIXTURE), were used to estimate marker effects, and their accuracy and bias were estimated by cross-validation. The accuracies of GW-EBV prediction varied widely, between 0.12 and 0.62, with G-BLUP giving the highest accuracy overall. We observed a strong relationship between prediction accuracy and the heritability of the trait: GW-EBV prediction for production traits with high heritability achieved higher accuracy and lower bias than for health traits with low heritability. To achieve similar accuracy for the health traits, more records will probably be needed.
Simulating Memory Impairment for Child Sexual Abuse.
Newton, Jeremy W; Hobbs, Sue D
2015-08-01
The current study investigated effects of simulated memory impairment on recall of child sexual abuse (CSA) information. A total of 144 adults were tested for memory of a written CSA scenario in which they role-played as the victim. There were four experimental groups and two testing sessions. During Session 1, participants read a CSA story and recalled it truthfully (Genuine group), omitted CSA information (Omission group), exaggerated CSA information (Commission group), or did not recall the story at all (No Rehearsal group). One week later, at Session 2, all participants were told to recount the scenario truthfully, and their memory was then tested using free recall and cued recall questions. The Session 1 manipulation affected memory accuracy during Session 2. Specifically, compared with the Genuine group's performance, the Omission, Commission, or No Rehearsal groups' performance was characterized by increased omission and commission errors and decreased reporting of correct details. Victim blame ratings (i.e., victim responsibility and provocativeness) and participant gender predicted increased error and decreased accuracy, whereas perpetrator blame ratings predicted decreased error and increased accuracy. Findings are discussed in relation to factors that may affect memory for CSA information. Copyright © 2015 John Wiley & Sons, Ltd.
Scheid, Anika; Nebel, Markus E
2012-07-09
Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. 
Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
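The robustness question studied above can be illustrated on a toy distribution: perturb a set of "exact" sampling probabilities with absolute versus relative errors and measure how far the perturbed distribution drifts. The Dirichlet stand-in and the error magnitude `eps` are assumptions; in the real method the probabilities come from SCFG computations.

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in "exact" sampling distribution over 50 candidate foldings.
p_exact = rng.dirichlet(np.full(50, 0.3))

def perturb(p, eps, mode):
    if mode == "relative":          # p_i * (1 + delta), |delta| <= eps
        q = p * (1 + rng.uniform(-eps, eps, p.size))
    else:                           # p_i + delta, |delta| <= eps (absolute)
        q = p + rng.uniform(-eps, eps, p.size)
    q = np.clip(q, 1e-12, None)     # keep probabilities positive
    return q / q.sum()              # renormalize

def tv_distance(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * np.abs(p - q).sum()

eps = 0.5
tv_rel = np.mean([tv_distance(p_exact, perturb(p_exact, eps, "relative")) for _ in range(200)])
tv_abs = np.mean([tv_distance(p_exact, perturb(p_exact, eps, "absolute")) for _ in range(200)])
print(f"TV(relative errors) = {tv_rel:.3f}, TV(absolute errors) = {tv_abs:.3f}")
```

Relative errors preserve the scale of each probability, so small probabilities stay small; absolute errors of the same nominal size swamp them, which matches the qualitative finding reported above.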
2012-01-01
Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. 
Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case – without sacrificing much of the accuracy of the results. Conclusions Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms. PMID:22776037
EVALUATING RISK-PREDICTION MODELS USING DATA FROM ELECTRONIC HEALTH RECORDS.
Wang, L E; Shaw, Pamela A; Mathelier, Hansie M; Kimmel, Stephen E; French, Benjamin
2016-03-01
The availability of data from electronic health records facilitates the development and evaluation of risk-prediction models, but estimation of prediction accuracy could be limited by outcome misclassification, which can arise if events are not captured. We evaluate the robustness of prediction accuracy summaries, obtained from receiver operating characteristic curves and risk-reclassification methods, if events are not captured (i.e., "false negatives"). We derive estimators for sensitivity and specificity if misclassification is independent of marker values. In simulation studies, we quantify the potential for bias in prediction accuracy summaries if misclassification depends on marker values. We compare the accuracy of alternative prognostic models for 30-day all-cause hospital readmission among 4548 patients discharged from the University of Pennsylvania Health System with a primary diagnosis of heart failure. Simulation studies indicate that if misclassification depends on marker values, then the estimated accuracy improvement is also biased, but the direction of the bias depends on the direction of the association between markers and the probability of misclassification. In our application, 29% of the 1143 readmitted patients were readmitted to a hospital elsewhere in Pennsylvania, which reduced prediction accuracy. Outcome misclassification can result in erroneous conclusions regarding the accuracy of risk-prediction models.
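The attenuation described above can be shown in a small simulation, assuming missingness independent of the marker; the event rate, marker effect and the 29% capture failure are illustrative stand-ins for the readmission application.

```python
import numpy as np

def auc(score, y):
    """Rank-based (Mann-Whitney) area under the ROC curve; assumes no ties."""
    r = score.argsort().argsort() + 1.0
    n1 = int(y.sum()); n0 = y.size - n1
    return (r[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

rng = np.random.default_rng(3)
n = 20000
y = rng.binomial(1, 0.25, n)                  # true readmission status
marker = rng.normal(0, 1, n) + 1.0 * y        # risk score, shifted up for events

# A fraction of true events (29%, as in the application) is never captured and
# gets labelled as a non-event; missingness is independent of the marker here.
captured = np.where((y == 1) & (rng.random(n) < 0.29), 0, y)

a_true, a_obs = auc(marker, y), auc(marker, captured)
print(f"AUC with true outcomes: {a_true:.3f}, with misclassified outcomes: {a_obs:.3f}")
```

The observed AUC is pulled toward 0.5 because the apparent non-event group is contaminated with uncaptured events; if missingness instead depends on the marker, the bias can go in either direction, as the simulation studies above report.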
Boon, K H; Khalil-Hani, M; Malarvili, M B
2018-01-01
This paper presents a method that is able to predict paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals than existing methods while achieving good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of electrically stabilizing the heart and preventing the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm, based on the non-dominated sorting genetic algorithm III, for optimizing the baseline PAF prediction system, which consists of pre-processing, HRV feature extraction, and support vector machine (SVM) stages. The pre-processing stage comprises heart rate correction, interpolation, and signal detrending. Time-domain, frequency-domain, and non-linear HRV features are then extracted from the pre-processed data in the feature extraction stage, and these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm simultaneously optimizes the parameters and settings of the various HRV feature extraction algorithms, selects the best feature subsets, and tunes the SVM parameters for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most previous works. This accuracy is achieved even though the HRV signal length is reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, the sensitivity rate, which is considered more important than the other performance metrics in this paper, can be improved at the cost of lower specificity. Copyright © 2017 Elsevier B.V. All rights reserved.
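The time-domain part of the HRV feature-extraction stage can be sketched as follows. The interval series is synthetic, and the feature set (SDNN, RMSSD, pNN50) is a standard subset, not necessarily the exact features used in the paper.

```python
import numpy as np

def hrv_time_features(rr_ms):
    """Standard time-domain HRV features from RR intervals (milliseconds)."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    return {
        "mean_rr": rr.mean(),                         # mean beat interval
        "sdnn": rr.std(ddof=1),                       # overall variability
        "rmssd": np.sqrt(np.mean(diff ** 2)),         # short-term variability
        "pnn50": np.mean(np.abs(diff) > 50) * 100,    # % successive diffs > 50 ms
    }

# 5 minutes of synthetic RR intervals around 800 ms (75 bpm).
rng = np.random.default_rng(0)
rr = 800 + rng.normal(0, 40, 375)
feats = hrv_time_features(rr)
print({k: round(v, 1) for k, v in feats.items()})
```

In the proposed system a feature vector like this (extended with frequency-domain and non-linear features) would be standardized and passed to the SVM stage.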
Improved method for predicting protein fold patterns with ensemble classifiers.
Chen, W; Liu, X; Huang, Y; Jiang, Y; Zou, Q; Lin, C
2012-01-27
Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physical-chemical property of proteins and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five cross-validations. The accuracy rose to 77% when we used a 20-dimensional feature vector. These methods were used on recent data, with 54.2% accuracy. Source codes and dataset, together with web server and software tools for prediction, are available at: http://datamining.xmu.edu.cn/main/~cwc/ProteinPredict.html.
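The benefit of combining classifiers, as exploited above, can be sketched with a toy majority-vote ensemble of three independent weak learners; the 0.70 base accuracy is an assumption, not a figure from the study.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
y = rng.binomial(1, 0.5, n)

# Three weak, independent "base classifiers": each sees the label through a
# different noisy view (stand-ins for composition- and PSSM-based feature sets).
views = [np.where(rng.random(n) < 0.70, y, 1 - y) for _ in range(3)]

def accuracy(pred):
    return (pred == y).mean()

vote = (np.sum(views, axis=0) >= 2).astype(int)     # majority vote
base = [accuracy(v) for v in views]
print(f"base accuracies ≈ {[round(a, 2) for a in base]}, ensemble = {accuracy(vote):.2f}")
```

With independent errors at accuracy p, majority voting of three learners gives p³ + 3p²(1 − p), which exceeds p for p > 0.5; correlated base classifiers gain less, which is why diverse feature sets matter.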
Parsimonious data: How a single Facebook like predicts voting behavior in multiparty systems.
Kristensen, Jakob Bæk; Albrechtsen, Thomas; Dahl-Nielsen, Emil; Jensen, Michael; Skovrind, Magnus; Bornakke, Tobias
2017-01-01
This study shows how liking politicians' public Facebook posts can be used as an accurate measure for predicting present-day voter intention in a multiparty system. We highlight that a few selective digital traces produce prediction accuracies that are on par with or even greater than most current approaches based upon bigger and broader datasets. Combining the online and the offline, we connect a subsample of surveyed respondents to their public Facebook activity and apply machine learning classifiers to explore the link between their political liking behaviour and actual voting intention. Through this work, we show that even a single selective Facebook like can reveal as much about political voter intention as hundreds of heterogeneous likes. Further, by including the entire political like history of the respondents, our model reaches prediction accuracies above previous multiparty studies (60-70%). The main contribution of this paper is to show how public like-activity on Facebook allows political profiling of individual users in a multiparty system with accuracies above previous studies. Beyond increased accuracy, the paper shows how such parsimonious measures allow us to generalize our findings to the entire population of a country and even across national borders, to other political multiparty systems. The approach in this study relies on data that are publicly available, and the simple setup we propose can, with some limitations, be generalized to millions of users in other multiparty systems.
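A hedged sketch of how a handful of selective likes can identify party preference, using a naive-Bayes score over a simulated like matrix; the like probabilities and the five-party setup are assumptions, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(11)
parties, n = 5, 2000
party = rng.integers(0, parties, n)

# Each user likes their own party leader's page with probability 0.6 and any
# other leader's page with probability 0.05 (a single, highly selective trace).
likes = (rng.random((n, parties)) <
         np.where(np.eye(parties)[party], 0.6, 0.05)).astype(int)

train = np.arange(n) < 1500
test = ~train

# P(like page j | party k), estimated on the training half with smoothing.
p_like = np.vstack([(likes[train][party[train] == k].sum(0) + 1) /
                    ((party[train] == k).sum() + 2) for k in range(parties)])

# Naive-Bayes log-likelihood over the (few) observed likes; uniform prior.
log_lik = likes[test] @ np.log(p_like).T + (1 - likes[test]) @ np.log(1 - p_like).T
pred = log_lik.argmax(axis=1)
acc = (pred == party[test]).mean()
print(f"accuracy = {acc:.2f} (chance = {1 / parties:.2f})")
```

Accuracy well above the 0.2 chance level here is driven almost entirely by the single selective like, mirroring the paper's point about parsimonious traces.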
Improved Short-Term Clock Prediction Method for Real-Time Positioning.
Lv, Yifei; Dai, Zhiqiang; Zhao, Qile; Yang, Sheng; Zhou, Jinning; Liu, Jingnan
2017-06-06
The application of real-time precise point positioning (PPP) requires real-time precise orbit and clock products that should be predicted within a short time to compensate for communication delays or data gaps. Unlike orbit corrections, clock corrections are difficult to model and predict. The widely used linear model hardly fits long periodic trends with a small data set and exhibits significant accuracy degradation in real-time prediction when a large data set is used. This study proposes a new prediction model for maintaining short-term satellite clocks to meet the high-precision requirements of real-time clocks and provide clock extrapolation without interrupting the real-time data stream. Fast Fourier transform (FFT) is used to analyze the linear prediction residuals of real-time clocks. The periodic terms obtained through FFT are adopted in the sliding window prediction to achieve a significant improvement in short-term prediction accuracy. This study also analyzes and compares the accuracy of short-term forecasts (less than 3 h) using observations of different lengths. Experimental results obtained from International GNSS Service (IGS) final products and our own real-time clocks show that the 3-h prediction accuracy is better than 0.85 ns. The new model can replace IGS ultra-rapid products in the application of real-time PPP. A positive correlation is also found between the prediction accuracy and the short-term stability of on-board clocks. Compared with the traditional linear model, the new model improves the accuracy of static PPP using 2-h clock predictions by about 50% in the N, E, and U directions. Furthermore, the static PPP accuracy with 2-h clock products is better than 0.1 m. When an interruption occurs in the real-time stream, the accuracy of the kinematic PPP solution using the 1-h clock prediction product is better than 0.2 m, without significant accuracy degradation.
This model is of practical significance because it solves the problems of interruption and delay in data broadcast in real-time clock estimation and can meet the requirements of real-time PPP.
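The trend-plus-FFT idea above can be sketched on a synthetic clock series: remove a linear fit, find the dominant residual period by FFT, and extrapolate both components. The drift, period and noise levels are invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic clock offsets: linear drift + one periodic term + white noise,
# sampled every 30 s for 24 h (a stand-in for a real-time clock series).
dt, n = 30.0, 2880
t = np.arange(n) * dt
clock = 1e-9 * t + 5e-9 * np.sin(2 * np.pi * t / 21600) + rng.normal(0, 2e-10, n)

# Step 1: fit and remove the linear trend.
A = np.c_[t, np.ones(n)]
coef, *_ = np.linalg.lstsq(A, clock, rcond=None)
resid = clock - A @ coef

# Step 2: locate the dominant period of the residuals by FFT.
spec = np.abs(np.fft.rfft(resid))
freqs = np.fft.rfftfreq(n, d=dt)
f = freqs[np.argmax(spec[1:]) + 1]                 # skip the DC bin

# Step 3: fit sine/cosine at that frequency and extrapolate 1 h ahead.
B = np.c_[np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)]
ab, *_ = np.linalg.lstsq(B, resid, rcond=None)

tp = t[-1] + dt * np.arange(1, 121)                # 1 h of 30 s epochs
truth = 1e-9 * tp + 5e-9 * np.sin(2 * np.pi * tp / 21600)
lin = np.c_[tp, np.ones(tp.size)] @ coef
per = np.c_[np.sin(2 * np.pi * f * tp), np.cos(2 * np.pi * f * tp)] @ ab

rms_lin = np.sqrt(np.mean((lin - truth) ** 2))
rms_fft = np.sqrt(np.mean((lin + per - truth) ** 2))
print(f"1-h prediction RMS: linear only {rms_lin:.2e} s, linear+FFT {rms_fft:.2e} s")
```

Adding the detected periodic term sharply reduces the extrapolation error relative to the linear model alone; in the real method this fit is refreshed in a sliding window as new epochs arrive.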
Can machine-learning improve cardiovascular risk prediction using routine clinical data?
Kai, Joe; Garibaldi, Jonathan M.; Qureshi, Nadeem
2017-01-01
Background Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction. Methods Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the ‘receiver operating curve’ (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins). Findings 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723–0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739–0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755–0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755–0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759–0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm. Conclusions Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others. PMID:28376093
Can machine-learning improve cardiovascular risk prediction using routine clinical data?
Weng, Stephen F; Reps, Jenna; Kai, Joe; Garibaldi, Jonathan M; Qureshi, Nadeem
2017-01-01
Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction. Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the 'receiver operating curve' (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins). 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723-0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739-0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755-0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755-0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759-0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm. Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others.
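Comparing the AUCs of two risk models, as done above, can be sketched with a rank-based AUC and a bootstrap interval for the improvement; the synthetic scores only mimic the setting (6.6% event rate), not the actual models.

```python
import numpy as np

rng = np.random.default_rng(9)

def auc(score, y):
    """Rank-based (Mann-Whitney) AUC for continuous scores."""
    r = score.argsort().argsort() + 1.0
    n1 = int(y.sum()); n0 = y.size - n1
    return (r[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

# Synthetic cohort: 6.6% event rate, two risk scores of different strength
# (stand-ins for the established algorithm and a machine-learning model).
n = 8000
y = rng.binomial(1, 0.066, n)
established = 0.6 * y + rng.normal(0, 1, n)
ml_model = 1.2 * y + rng.normal(0, 1, n)

diffs = []
for _ in range(200):                               # bootstrap the AUC improvement
    i = rng.integers(0, n, n)
    diffs.append(auc(ml_model[i], y[i]) - auc(established[i], y[i]))
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"AUC improvement {np.mean(diffs):+.3f} (95% CI {lo:+.3f} to {hi:+.3f})")
```

Bootstrapping the paired difference, rather than each AUC separately, keeps the comparison on the same resampled patients, which is what a DeLong-type test also achieves analytically.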
Evaluation of an ensemble of genetic models for prediction of a quantitative trait.
Milton, Jacqueline N; Steinberg, Martin H; Sebastiani, Paola
2014-01-01
Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to build a genetic prediction model without fitting a multiple linear regression model with possibly hundreds of genetic markers as predictors, researchers often summarize the joint effect of risk alleles into a genetic score that is used as a covariate in the genetic prediction model. However, the prediction accuracy can be highly variable and selecting the optimal number of markers to be included in the genetic score is challenging. In this manuscript we present a strategy to build an ensemble of genetic prediction models from data and we show that the ensemble-based method makes the challenge of choosing the number of genetic markers more amenable. Using simulated data with varying heritability and number of genetic markers, we compare the predictive accuracy and inclusion of true positive and false positive markers of a single genetic prediction model and our proposed ensemble method. The results show that the ensemble of genetic models tends to include a larger number of genetic variants than a single genetic model and it is more likely to include all of the true genetic markers. This increased sensitivity is obtained at the price of a lower specificity that appears to minimally affect the predictive accuracy of the ensemble.
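The genetic-score ensemble idea can be sketched by averaging score models built with different numbers of top-ranked markers, sidestepping the choice of a single cutoff; the simulated marker effects and the sign-weighted allele-count scoring are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 1000, 200
X = rng.binomial(2, 0.3, (n, p)).astype(float)     # allele counts
beta = np.zeros(p); beta[:20] = rng.normal(0, 0.25, 20)   # 20 true markers
y = X @ beta + rng.normal(0, 1, n)

train = np.arange(n) < 600
test = ~train

# Rank markers by univariate association on the training half.
Xt = X[train] - X[train].mean(0)
r = Xt.T @ (y[train] - y[train].mean()) / (np.sqrt((Xt ** 2).sum(0)) + 1e-12)
order = np.argsort(-np.abs(r))

def score_model(k):
    """Genetic score from the top-k markers, weighted by sign of association."""
    w = np.zeros(p); w[order[:k]] = np.sign(r[order[:k]])
    s_tr, s_te = X[train] @ w, X[test] @ w
    a, b = np.polyfit(s_tr, y[train], 1)           # calibrate score -> trait
    return a * s_te + b

ks = [5, 10, 20, 40, 80]
single = {k: np.corrcoef(score_model(k), y[test])[0, 1] for k in ks}
ensemble = np.corrcoef(np.mean([score_model(k) for k in ks], 0), y[test])[0, 1]
print({k: round(v, 2) for k, v in single.items()}, "ensemble:", round(ensemble, 2))
```

Averaging across cutoffs behaves like the ensemble described above: it keeps markers that any reasonable cutoff would include, trading some specificity for sensitivity to true markers.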
Barbieri, Christopher E; Cha, Eugene K; Chromecki, Thomas F; Dunning, Allison; Lotan, Yair; Svatek, Robert S; Scherr, Douglas S; Karakiewicz, Pierre I; Sun, Maxine; Mazumdar, Madhu; Shariat, Shahrokh F
2012-03-01
• To employ decision curve analysis to determine the impact of nuclear matrix protein 22 (NMP22) on clinical decision making in the detection of bladder cancer using data from a prospective trial. • The study included 1303 patients at risk for bladder cancer who underwent cystoscopy, urine cytology and measurement of urinary NMP22 levels. • We constructed several prediction models to estimate risk of bladder cancer. The base model was generated using patient characteristics (age, gender, race, smoking and haematuria); cytology and NMP22 were added to the base model to determine effects on predictive accuracy. • Clinical net benefit was calculated by summing the benefits and subtracting the harms and weighting these by the threshold probability at which a patient or clinician would opt for cystoscopy. • In all, 72 patients were found to have bladder cancer (5.5%). In univariate analyses, NMP22 was the strongest predictor of bladder cancer presence (predictive accuracy 71.3%), followed by age (67.5%) and cytology (64.3%). • In multivariable prediction models, NMP22 improved the predictive accuracy of the base model by 8.2% (area under the curve 70.2-78.4%) and of the base model plus cytology by 4.2% (area under the curve 75.9-80.1%). • Decision curve analysis revealed that adding NMP22 to other models increased clinical benefit, particularly at higher threshold probabilities. • NMP22 is a strong, independent predictor of bladder cancer. • Addition of NMP22 improves the accuracy of standard predictors by a statistically and clinically significant margin. • Decision curve analysis suggests that integration of NMP22 into clinical decision making helps avoid unnecessary cystoscopies, with minimal increased risk of missing a cancer. © 2011 THE AUTHORS. BJU INTERNATIONAL © 2011 BJU INTERNATIONAL.
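Decision curve analysis reduces to a simple net-benefit formula. The sketch below applies it to two simulated risk scores standing in for the base model and the base model plus NMP22; the prevalence is matched to the study's 5.5%, but all other parameters are invented.

```python
import numpy as np

def net_benefit(risk, y, thresholds):
    """Net benefit = TP/n - FP/n * pt/(1-pt) at each threshold probability pt."""
    n = y.size
    out = []
    for pt in thresholds:
        treat = risk >= pt                         # patients sent to cystoscopy
        tp = np.sum(treat & (y == 1)) / n
        fp = np.sum(treat & (y == 0)) / n
        out.append(tp - fp * pt / (1 - pt))
    return np.array(out)

rng = np.random.default_rng(6)
n = 5000
y = rng.binomial(1, 0.055, n)                      # ~5.5% cancer prevalence
base = 1 / (1 + np.exp(-(-3.2 + 1.0 * y + rng.normal(0, 1, n))))
nmp = 1 / (1 + np.exp(-(-3.2 + 2.0 * y + rng.normal(0, 1, n))))  # stronger model

pts = np.array([0.05, 0.10, 0.15, 0.20])
nb_base, nb_nmp = net_benefit(base, y, pts), net_benefit(nmp, y, pts)
for pt, a, b in zip(pts, nb_base, nb_nmp):
    print(f"pt={pt:.2f}  base={a:+.4f}  base+marker={b:+.4f}")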
Improved fibrosis staging by elastometry and blood test in chronic hepatitis C.
Calès, Paul; Boursier, Jérôme; Ducancelle, Alexandra; Oberti, Frédéric; Hubert, Isabelle; Hunault, Gilles; de Lédinghen, Victor; Zarski, Jean-Pierre; Salmon, Dominique; Lunel, Françoise
2014-07-01
Our main objective was to improve non-invasive fibrosis staging accuracy by resolving the limits of previous methods via new test combinations. Our secondary objectives were to improve staging precision, by developing a detailed fibrosis classification, and reliability (personalized accuracy) determination. All patients (729) included in the derivation population had chronic hepatitis C, liver biopsy, 6 blood tests and Fibroscan. Validation populations included 1584 patients. The most accurate combination was provided by using most markers of FibroMeter and Fibroscan results targeted for significant fibrosis, i.e. 'E-FibroMeter'. Its classification accuracy (91.7%) and precision (assessed by F difference with Metavir: 0.62 ± 0.57) were better than those of FibroMeter (84.1%, P < 0.001; 0.72 ± 0.57, P < 0.001), Fibroscan (88.2%, P = 0.011; 0.68 ± 0.57, P = 0.020), and a previous CSF-SF classification of FibroMeter + Fibroscan (86.7%, P < 0.001; 0.65 ± 0.57, P = 0.044). The accuracy for fibrosis absence (F0) was increased, e.g. from 16.0% with Fibroscan to 75.0% with E-FibroMeter (P < 0.001). Cirrhosis sensitivity was improved, e.g. E-FibroMeter: 92.7% vs. Fibroscan: 83.3%, P = 0.004. The combination improved reliability by deleting unreliable results (accuracy <50%) observed with a single test (1.2% of patients) and increasing optimal reliability (accuracy ≥85%) from 80.4% of patients with Fibroscan (accuracy: 90.9%) to 94.2% of patients with E-FibroMeter (accuracy: 92.9%), P < 0.001. The patient rate with 100% predictive values for cirrhosis by the best combination was twice (36.2%) that of the best single test (FibroMeter: 16.2%, P < 0.001). The new test combination increased: accuracy, globally and especially in patients without fibrosis, staging precision, cirrhosis prediction, and even reliability, thus offering improved fibrosis staging. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Boomerang: A method for recursive reclassification.
Devlin, Sean M; Ostrovnaya, Irina; Gönen, Mithat
2016-09-01
While there are many validated prognostic classifiers used in practice, often their accuracy is modest and heterogeneity in clinical outcomes exists in one or more risk subgroups. Newly available markers, such as genomic mutations, may be used to improve the accuracy of an existing classifier by reclassifying patients from a heterogenous group into a higher or lower risk category. The statistical tools typically applied to develop the initial classifiers are not easily adapted toward this reclassification goal. In this article, we develop a new method designed to refine an existing prognostic classifier by incorporating new markers. The two-stage algorithm called Boomerang first searches for modifications of the existing classifier that increase the overall predictive accuracy and then merges to a prespecified number of risk groups. Resampling techniques are proposed to assess the improvement in predictive accuracy when an independent validation data set is not available. The performance of the algorithm is assessed under various simulation scenarios where the marker frequency, degree of censoring, and total sample size are varied. The results suggest that the method selects few false positive markers and is able to improve the predictive accuracy of the classifier in many settings. Lastly, the method is illustrated on an acute myeloid leukemia data set where a new refined classifier incorporates four new mutations into the existing three category classifier and is validated on an independent data set. © 2016, The International Biometric Society.
Boomerang: A Method for Recursive Reclassification
Devlin, Sean M.; Ostrovnaya, Irina; Gönen, Mithat
2016-01-01
Summary While there are many validated prognostic classifiers used in practice, often their accuracy is modest and heterogeneity in clinical outcomes exists in one or more risk subgroups. Newly available markers, such as genomic mutations, may be used to improve the accuracy of an existing classifier by reclassifying patients from a heterogenous group into a higher or lower risk category. The statistical tools typically applied to develop the initial classifiers are not easily adapted towards this reclassification goal. In this paper, we develop a new method designed to refine an existing prognostic classifier by incorporating new markers. The two-stage algorithm called Boomerang first searches for modifications of the existing classifier that increase the overall predictive accuracy and then merges to a pre-specified number of risk groups. Resampling techniques are proposed to assess the improvement in predictive accuracy when an independent validation data set is not available. The performance of the algorithm is assessed under various simulation scenarios where the marker frequency, degree of censoring, and total sample size are varied. The results suggest that the method selects few false positive markers and is able to improve the predictive accuracy of the classifier in many settings. Lastly, the method is illustrated on an acute myeloid leukemia dataset where a new refined classifier incorporates four new mutations into the existing three category classifier and is validated on an independent dataset. PMID:26754051
Chemically intuited, large-scale screening of MOFs by machine learning techniques
NASA Astrophysics Data System (ADS)
Borboudakis, Giorgos; Stergiannakos, Taxiarchis; Frysali, Maria; Klontzas, Emmanuel; Tsamardinos, Ioannis; Froudakis, George E.
2017-10-01
A novel computational methodology for large-scale screening of MOFs is applied to gas storage with the use of machine learning technologies. This approach is a promising trade-off between the accuracy of ab initio methods and the speed of classical approaches, strategically combined with chemical intuition. The results demonstrate that the chemical properties of MOFs are indeed predictable (stochastically, not deterministically) using machine learning methods and automated analysis protocols, with the accuracy of predictions increasing with sample size. Our initial results indicate that this methodology is promising to apply not only to gas storage in MOFs but in many other material science projects.
Adjusted Clinical Groups: Predictive Accuracy for Medicaid Enrollees in Three States
Adams, E. Kathleen; Bronstein, Janet M.; Raskind-Hood, Cheryl
2002-01-01
Actuarial split-sample methods were used to assess predictive accuracy of adjusted clinical groups (ACGs) for Medicaid enrollees in Georgia, Mississippi (lagging in managed care penetration), and California. Accuracy for two non-random groups—high-cost and located in urban poor areas—was assessed. Measures for random groups were derived with and without short-term enrollees to assess the effect of turnover on predictive accuracy. ACGs improved predictive accuracy for high-cost conditions in all States, but did so only for those in Georgia's poorest urban areas. Higher and more unpredictable expenses of short-term enrollees moderated the predictive power of ACGs. This limitation was significant in Mississippi due in part, to that State's very high proportion of short-term enrollees. PMID:12545598
Kusumoto, Dai; Lachmann, Mark; Kunihiro, Takeshi; Yuasa, Shinsuke; Kishino, Yoshikazu; Kimura, Mai; Katsuki, Toshiomi; Itoh, Shogo; Seki, Tomohisa; Fukuda, Keiichi
2018-06-05
Deep learning technology is rapidly advancing and is now used to solve complex problems. Here, we used deep learning in convolutional neural networks to establish an automated method to identify endothelial cells derived from induced pluripotent stem cells (iPSCs), without the need for immunostaining or lineage tracing. Networks were trained to predict whether phase-contrast images contain endothelial cells based on morphology only. Predictions were validated by comparison to immunofluorescence staining for CD31, a marker of endothelial cells. Method parameters were then automatically and iteratively optimized to increase prediction accuracy. We found that prediction accuracy was correlated with network depth and pixel size of images to be analyzed. Finally, K-fold cross-validation confirmed that optimized convolutional neural networks can identify endothelial cells with high performance, based only on morphology. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
Mammographic density, breast cancer risk and risk prediction
Vachon, Celine M; van Gils, Carla H; Sellers, Thomas A; Ghosh, Karthik; Pruthi, Sandhya; Brandt, Kathleen R; Pankratz, V Shane
2007-01-01
In this review, we examine the evidence for mammographic density as an independent risk factor for breast cancer, describe the risk prediction models that have incorporated density, and discuss the current and future implications of using mammographic density in clinical practice. Mammographic density is a consistent and strong risk factor for breast cancer in several populations and across age at mammogram. Recently, this risk factor has been added to existing breast cancer risk prediction models, increasing the discriminatory accuracy with its inclusion, albeit slightly. With validation, these models may replace the existing Gail model for clinical risk assessment. However, absolute risk estimates resulting from these improved models are still limited in their ability to characterize an individual's probability of developing cancer. Promising new measures of mammographic density, including volumetric density, which can be standardized using full-field digital mammography, will likely result in a stronger risk factor and improve accuracy of risk prediction models. PMID:18190724
Page, Richard B; Scrivani, Peter V; Dykes, Nathan L; Erb, Hollis N; Hobbs, Jeff M
2006-01-01
Our purpose was to determine the accuracy of increased thyroid activity for diagnosing hyperthyroidism during pertechnetate scintigraphy in cats suspected of having the disease, using subcutaneous rather than intravenous radioisotope administration. Increased thyroid activity was determined by two methods: the thyroid:salivary ratio (T:S) and visual inspection. These assessments were made on the ventral scintigram of the head and neck. Scintigraphy was performed by injecting sodium pertechnetate (111 MBq, SQ) in the right-dorsal-lumbar region; static-acquisition images were obtained 20 min after injection. We used 49 cats; 34 (69%) had hyperthyroidism based on serum-chemistry analysis. Using a Wilcoxon's rank-sum test, a significant difference (P < 0.0001) was detected in the T:S between cats with and without hyperthyroidism. Using a decision criterion of 2.0 for the T:S, the test accurately predicted hyperthyroidism in 32/34 cats (sensitivity, 94%; 95% confidence interval (CI), 85-100%) and correctly predicted that hyperthyroidism was absent in 15/15 cats (specificity, 100%; CI, 97-100%). Using visual inspection, the test accurately predicted hyperthyroidism in 34/34 cats (sensitivity, 100%; CI, 99-100%) and correctly predicted that hyperthyroidism was absent in 12/15 cats (specificity, 80%; CI, 56-100%). The positive and negative predictive values were high across a wide range of prevalences of hyperthyroidism, and the test showed excellent agreement within and between examiners. Therefore, detecting increased thyroid activity during pertechnetate scintigraphy by subcutaneous injection is an accurate and reproducible test for feline hyperthyroidism.
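The sensitivity and specificity figures above derive directly from the reported 2x2 counts. A minimal stdlib sketch is shown below; it uses a normal-approximation confidence interval, which the abstract does not confirm was the method used, so the CI bounds are illustrative rather than a reproduction of the published intervals:

```python
import math

def sens_spec_ci(tp, fn, tn, fp, z=1.96):
    """Sensitivity, specificity, and normal-approximation 95% CIs
    from a 2x2 diagnostic table (clipped to [0, 1])."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    def ci(p, n):
        half = z * math.sqrt(p * (1 - p) / n)
        return (max(0.0, p - half), min(1.0, p + half))
    return sens, ci(sens, tp + fn), spec, ci(spec, tn + fp)

# T:S ratio criterion from the abstract: 32/34 true positives, 15/15 true negatives
sens, sens_ci, spec, spec_ci = sens_spec_ci(tp=32, fn=2, tn=15, fp=0)
print(round(sens, 2), round(spec, 2))  # 0.94 1.0
```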
Multi-Stage Target Tracking with Drift Correction and Position Prediction
NASA Astrophysics Data System (ADS)
Chen, Xin; Ren, Keyan; Hou, Yibin
2018-04-01
Most existing tracking methods struggle to combine accuracy with performance, and do not account for the shifts between clarity and blur that often occur. In this paper, we propose a multi-stage tracking framework with two particular modules: position prediction and corrective measure. We conduct tracking based on a correlation filter, with a corrective-measure module to increase both performance and accuracy. Specifically, a convolutional network is used to address the blur problem in realistic scenes; it is trained on a dataset augmented with blurred images generated by three blur algorithms. We then propose a position prediction module to reduce the computation cost and make the tracker more capable of handling fast motion. Experimental results show that our tracking method is more robust than others and more accurate on the benchmark sequences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tennenberg, S.D.; Jacobs, M.P.; Solomkin, J.S.
1987-04-01
Two methods for predicting adult respiratory distress syndrome (ARDS) were evaluated prospectively in a group of 81 multitrauma and sepsis patients considered at clinical high risk. A popular ARDS risk-scoring method, employing discriminant analysis equations (weighted risk criteria and oxygenation characteristics), yielded a predictive accuracy of 59% and a false-negative rate of 22%. Pulmonary alveolar-capillary permeability (PACP) was determined with a radioaerosol lung-scan technique in 23 of these 81 patients, representing a statistically similar subgroup. Lung scanning achieved a predictive accuracy of 71% (after excluding patients with unilateral pulmonary contusion) and gave no false-negatives. We propose a combination of clinical risk identification and functional determination of PACP to assess a patient's risk of developing ARDS.
Porto, William F; Pires, Állan S; Franco, Octavio L
2017-08-07
Antimicrobial activity prediction tools aim to aid the discovery of novel antimicrobial peptide (AMP) sequences by means of machine learning methods. Such approaches have gained increasing importance in the generation of novel synthetic peptides through rational design techniques. This study focused on the ability of such approaches to predict the antimicrobial activity of sequences previously characterized at the protein level by in vitro studies. Using four web servers and one standalone software package, we evaluated 78 sequences generated by the so-called linguistic model: 40 designed and 38 shuffled sequences, with ∼60 and ∼25% identity to AMPs, respectively. Ab initio molecular modelling of these sequences indicated that structure does not affect the predictions, as both sets present similar structures. Overall, the systems failed at predicting the shuffled versions of the designed peptides, as the two sets are identical in amino acid composition, which results in accuracies below 30%. The prediction accuracy is negatively affected by the low specificity of all systems evaluated here, which, on the other hand, reached 100% sensitivity. Our results suggest that complementary approaches with high specificity, not necessarily high accuracy, should be developed to be used together with the current systems, overcoming their limitations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Spittle, Alicia J; Lee, Katherine J; Spencer-Smith, Megan; Lorefice, Lucy E; Anderson, Peter J; Doyle, Lex W
2015-01-01
The primary aim of this study was to investigate the accuracy of the Alberta Infant Motor Scale (AIMS) and Neuro-Sensory Motor Developmental Assessment (NSMDA) over the first year of life for predicting motor impairment at 4 years in preterm children. The secondary aims were to assess the predictive value of serial assessments over the first year and when using a combination of these two assessment tools in follow-up. Children born <30 weeks' gestation were prospectively recruited and assessed at 4, 8 and 12 months' corrected age using the AIMS and NSMDA. At 4 years' corrected age children were assessed for cerebral palsy (CP) and motor impairment using the Movement Assessment Battery for Children 2nd-edition (MABC-2). We calculated accuracy of the AIMS and NSMDA for predicting CP and MABC-2 scores ≤15th (at-risk of motor difficulty) and ≤5th centile (significant motor difficulty) for each test (AIMS and NSMDA) at 4, 8 and 12 months, for delay on one, two or all three of the time points over the first year, and finally for delay on both tests at each time point. Accuracy for predicting motor impairment was good for each test at each age, although false positives were common. Motor impairment on the MABC-2 (scores ≤5th and ≤15th) was most accurately predicted by the AIMS at 4 months, whereas CP was most accurately predicted by the NSMDA at 12 months. In regards to serial assessments, the likelihood ratio for motor impairment increased with the number of delayed assessments. When combining both the NSMDA and AIMS the best accuracy was achieved at 4 months, although results were similar at 8 and 12 months. Motor development during the first year of life in preterm infants assessed with the AIMS and NSMDA is predictive of later motor impairment at preschool age. However, false positives are common and therefore it is beneficial to follow-up children at high risk of motor impairment at more than one time point, or to use a combination of assessment tools. 
ACTR.org.au ACTRN12606000252516.
Research on Improved Depth Belief Network-Based Prediction of Cardiovascular Diseases
Zhang, Hongpo
2018-01-01
Quantitative analysis and prediction can help to reduce the risk of cardiovascular disease. Quantitative prediction based on traditional models has low accuracy, and predictions based on shallow neural networks have larger variance. In this paper, a cardiovascular disease prediction model based on an improved deep belief network (DBN) is proposed. Using the reconstruction error, the network depth is determined automatically, and unsupervised training is combined with supervised optimization. This ensures the accuracy of model prediction while guaranteeing stability. Thirty experiments were performed independently on the Statlog (Heart) and Heart Disease Database data sets in the UCI database. Experimental results showed that the mean prediction accuracy was 91.26% and 89.78%, respectively, and the variance of prediction accuracy was 5.78 and 4.46, respectively. PMID:29854369
Predicting metabolic syndrome using decision tree and support vector machine methods.
Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh
2016-05-01
Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence based health-care systems have been highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome; specifically, it aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This is an applied study in which data from 2107 participants of the Isfahan Cohort Study were utilized. Subjects without metabolic syndrome according to the ATPIII criteria were selected. The features used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high density lipoprotein-cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria, and the decision tree and SVM methods were used to predict it, with sensitivity, specificity and accuracy used for validation. Sensitivity, specificity and accuracy were 0.774 (0.758), 0.74 (0.72) and 0.757 (0.739) for the SVM (decision tree) method. The results show that the SVM method is more efficient than the decision tree in terms of sensitivity, specificity and accuracy. The decision tree results show that TG is the most important feature in predicting metabolic syndrome.
According to this study, in cases where only the final result of the decision is regarded as significant, the SVM method can be used with acceptable accuracy for medical decision making. This method had not been implemented in previous research.
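As a toy illustration of the decision-tree side of the comparison above, the sketch below fits a depth-1 tree (a decision stump) by exhaustive search over feature/threshold splits. The data and the TG/age features are invented for illustration, not taken from the Isfahan Cohort:

```python
def best_stump(X, y):
    """Exhaustively pick the (feature, threshold, polarity) split that
    maximizes training accuracy -- effectively a depth-1 decision tree."""
    n = len(y)
    best = (0, None, 0.0, 1)  # (feature index, threshold, accuracy, polarity)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):  # predict class 1 above or below the threshold
                pred = [1 if (row[j] > t) == (pol == 1) else 0 for row in X]
                acc = sum(p == yy for p, yy in zip(pred, y)) / n
                if acc > best[2]:
                    best = (j, t, acc, pol)
    return best

# Hypothetical data: feature 0 = triglycerides (TG), feature 1 = age;
# TG cleanly separates the classes, mirroring the study's finding that
# TG was the most informative feature.
X = [[90, 40], [110, 55], [210, 45], [250, 60], [95, 62], [230, 38]]
y = [0, 0, 1, 1, 0, 1]
feat, thresh, acc, pol = best_stump(X, y)
print(feat, thresh, acc)  # 0 110 1.0
```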
Duan, Liwei; Zhang, Sheng; Lin, Zhaofen
2017-02-01
To explore the method and performance of using multiple indices to diagnose sepsis and to predict the prognosis of severe ill patients. Critically ill patients at first admission to intensive care unit (ICU) of Changzheng Hospital, Second Military Medical University, from January 2014 to September 2015 were enrolled if the following conditions were satisfied: (1) patients were 18-75 years old; (2) the length of ICU stay was more than 24 hours; (3) All records of the patients were available. Data of the patients was collected by searching the electronic medical record system. Logistic regression model was formulated to create the new combined predictive indicator and the receiver operating characteristic (ROC) curve for the new predictive indicator was built. The area under the ROC curve (AUC) for both the new indicator and original ones were compared. The optimal cut-off point was obtained where the Youden index reached the maximum value. Diagnostic parameters such as sensitivity, specificity and predictive accuracy were also calculated for comparison. Finally, individual values were substituted into the equation to test the performance in predicting clinical outcomes. A total of 362 patients (218 males and 144 females) were enrolled in our study and 66 patients died. The average age was (48.3±19.3) years old. (1) For the predictive model only containing categorical covariants [including procalcitonin (PCT), lipopolysaccharide (LPS), infection, white blood cells count (WBC) and fever], increased PCT, increased WBC and fever were demonstrated to be independent risk factors for sepsis in the logistic equation. The AUC for the new combined predictive indicator was higher than that of any other indictor, including PCT, LPS, infection, WBC and fever (0.930 vs. 0.661, 0.503, 0.570, 0.837, 0.800). The optimal cut-off value for the new combined predictive indicator was 0.518. 
Using the new indicator to diagnose sepsis, the sensitivity, specificity and diagnostic accuracy rate were 78.00%, 93.36% and 87.47%, respectively. One patient was randomly selected, and the clinical data were substituted into the probability equation for prediction. The calculated value was 0.015, which was less than the cut-off value (0.518), indicating a predicted outcome of no sepsis, at an accuracy of 87.47%. (2) For the predictive model only containing continuous covariants, in which the logistic model combined the acute physiology and chronic health evaluation II (APACHE II) score and the sequential organ failure assessment (SOFA) score to predict in-hospital death events, both the APACHE II score and the SOFA score were independent risk factors for death. The AUC for the new predictive indicator was higher than that of the APACHE II score and the SOFA score (0.834 vs. 0.812, 0.813). The optimal cut-off value for the new combined predictive indicator in predicting in-hospital death events was 0.236, and the corresponding sensitivity, specificity and diagnostic accuracy for the combined predictive indicator were 73.12%, 76.51% and 75.70%, respectively. One patient was randomly selected, and the APACHE II score and SOFA score were substituted into the probability equation for prediction. The calculated value was 0.570, which was higher than the cut-off value (0.236), indicating a predicted in-hospital death, at an accuracy of 75.70%. The combined predictive indicator, formulated by logistic regression models, is superior to any single indicator in predicting sepsis or in-hospital death events.
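The cut-off selection described above — taking the point on the ROC curve where the Youden index (sensitivity + specificity − 1) is maximal — can be sketched in stdlib Python. The scores and labels below are hypothetical stand-ins for the logistic model's predicted sepsis probabilities, not the study's data:

```python
def youden_cutoff(scores, labels):
    """Scan every observed score as a candidate cutoff and return the one
    maximizing Youden's J = sensitivity + specificity - 1."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= cut and l == 1)
        tn = sum(1 for s, l in zip(scores, labels) if s < cut and l == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j

# Hypothetical predicted probabilities from a fitted logistic model
scores = [0.05, 0.10, 0.30, 0.45, 0.52, 0.60, 0.75, 0.90]
labels = [0,    0,    0,    1,    0,    1,    1,    1]
cut, j = youden_cutoff(scores, labels)
print(cut, j)  # 0.45 0.75
```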
Tests for predicting complications of pre-eclampsia: A protocol for systematic reviews
Thangaratinam, Shakila; Coomarasamy, Arri; Sharp, Steve; O'Mahony, Fidelma; O'Brien, Shaughn; Ismail, Khaled MK; Khan, Khalid S
2008-01-01
Background Pre-eclampsia is associated with several complications. Early prediction of complications and timely management are needed in the clinical care of these patients to avert fetal and maternal mortality and morbidity. There is a need to identify the best testing strategies in pre-eclampsia to identify women at increased risk of complications. We aim to determine the accuracy of various tests to predict complications of pre-eclampsia through systematic quantitative reviews. Method We performed extensive searches in MEDLINE (1951–2004) and EMBASE (1974–2004), and will also include manual searches of the bibliographies of primary and review articles. An initial search revealed 19500 citations. Two reviewers will independently select studies and extract data on study characteristics, quality and accuracy. Accuracy data will be used to construct 2 × 2 tables. Data synthesis will involve assessment of heterogeneity and appropriate pooling of results to produce a summary Receiver Operating Characteristic (ROC) curve and summary likelihood ratios. Discussion This review will generate predictive information and integrate it with therapeutic effectiveness to determine the absolute benefit and harm of available therapy in reducing complications in women with pre-eclampsia. PMID:18694494
Aboagye-Sarfo, Patrick; Mai, Qun; Sanfilippo, Frank M; Preen, David B; Stewart, Louise M; Fatovich, Daniel M
2015-10-01
To develop multivariate vector-ARMA (VARMA) forecast models for predicting emergency department (ED) demand in Western Australia (WA) and compare them to the benchmark univariate autoregressive moving average (ARMA) and Winters' models. Seven-year monthly WA state-wide public hospital ED presentation data from 2006/07 to 2012/13 were modelled. Graphical and VARMA modelling methods were used for descriptive analysis and model fitting. The VARMA models were compared to the benchmark univariate ARMA and Winters' models to determine their accuracy to predict ED demand. The best models were evaluated by using error correction methods for accuracy. Descriptive analysis of all the dependent variables showed an increasing pattern of ED use with seasonal trends over time. The VARMA models provided a more precise and accurate forecast with smaller confidence intervals and better measures of accuracy in predicting ED demand in WA than the ARMA and Winters' method. VARMA models are a reliable forecasting method to predict ED demand for strategic planning and resource allocation. While the ARMA models are a closely competing alternative, they under-estimated future ED demand. Copyright © 2015 Elsevier Inc. All rights reserved.
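As a much-simplified, stdlib-only stand-in for the univariate ARMA benchmark above, the sketch below fits an AR(1) model by ordinary least squares and forecasts forward. The monthly demand figures are invented; a real VARMA/ARMA analysis would use a dedicated package such as statsmodels:

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] -- an AR(1) model,
    a much-simplified stand-in for the ARMA benchmark."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    c = my - phi * mx
    return c, phi

def forecast(series, steps, c, phi):
    """Iterate the fitted recurrence forward from the last observation."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out

# Invented monthly ED presentation counts with an upward trend
demand = [100, 104, 109, 112, 118, 121, 127, 130]
c, phi = fit_ar1(demand)
print(forecast(demand, 3, c, phi))
```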
ERIC Educational Resources Information Center
Gordon, Roberta R.
1988-01-01
Investigation into the most effective use of a kindergarten screening battery to predict second-grade reading and mathematics achievement found that a combination of 10 readiness subtests resulted in the same degree of accuracy as that obtained using the entire battery. However, neither version was accurate enough to be useful. (Author/CB)
A novel feature extraction scheme with ensemble coding for protein-protein interaction prediction.
Du, Xiuquan; Cheng, Jiaxing; Zheng, Tingting; Duan, Zheng; Qian, Fulan
2014-07-18
Protein-protein interactions (PPIs) play key roles in most cellular processes, such as cell metabolism, immune response, endocrine function, DNA replication, and transcription regulation. PPI prediction is one of the most challenging problems in functional genomics. Although PPI data have been increasing because of the development of high-throughput technologies and computational methods, many problems are still far from being solved. In this study, a novel predictor was designed by using the Random Forest (RF) algorithm with the ensemble coding (EC) method. To reduce computational time, a feature selection method (DX) was adopted to rank the features and search the optimal feature combination. The DXEC method integrates many features and physicochemical/biochemical properties to predict PPIs. On the Gold Yeast dataset, the DXEC method achieves 67.2% overall precision, 80.74% recall, and 70.67% accuracy. On the Silver Yeast dataset, the DXEC method achieves 76.93% precision, 77.98% recall, and 77.27% accuracy. On the human dataset, the prediction accuracy reaches 80% for the DXEC-RF method. We extended the experiment to a bigger and more realistic dataset that maintains 50% recall on the Yeast All dataset and 80% recall on the Human All dataset. These results show that the DXEC method is suitable for performing PPI prediction. The prediction service of the DXEC-RF classifier is available at http://ailab.ahu.edu.cn:8087/DXECPPI/index.jsp.
The accuracy of new wheelchair users' predictions about their future wheelchair use.
Hoenig, Helen; Griffiths, Patricia; Ganesh, Shanti; Caves, Kevin; Harris, Frances
2012-06-01
This study examined the accuracy of new wheelchair user predictions about their future wheelchair use. This was a prospective cohort study of 84 community-dwelling veterans provided a new manual wheelchair. The association between predicted and actual wheelchair use was strong at 3 mos (ϕ coefficient = 0.56), with 90% of those who anticipated using the wheelchair at 3 mos still using it (i.e., positive predictive value = 0.96) and 60% of those who anticipated not using it indeed no longer using the wheelchair (i.e., negative predictive value = 0.60, overall accuracy = 0.92). Predictive accuracy diminished over time, with overall accuracy declining from 0.92 at 3 mos to 0.66 at 6 mos. At all time points, and for all types of use, patients better predicted use as opposed to disuse, with correspondingly higher positive than negative predictive values. Accuracy of prediction of use in specific indoor and outdoor locations varied according to location. This study demonstrates the importance of better understanding the potential mismatch between the anticipated and actual patterns of wheelchair use. The findings suggest that users can be relied upon to accurately predict their basic wheelchair-related needs in the short-term. Further exploration is needed to identify characteristics that will aid users and their providers in more accurately predicting mobility needs for the long-term.
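The PPV, NPV, overall accuracy, and phi coefficient reported above all derive from a single 2x2 table of predicted versus actual wheelchair use. A minimal sketch follows; the counts are an illustrative reconstruction chosen to roughly match the reported 3-month values, not the study's actual table:

```python
import math

def two_by_two_stats(tp, fp, fn, tn):
    """PPV, NPV, overall accuracy, and the phi coefficient for a
    2x2 table of predicted vs actual use."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    acc = (tp + tn) / (tp + fp + fn + tn)
    phi = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return ppv, npv, acc, phi

# Illustrative counts only (predicted use vs actual use at 3 months)
ppv, npv, acc, phi = two_by_two_stats(tp=72, fp=3, fn=4, tn=6)
print(round(ppv, 2), round(npv, 2), round(acc, 2))  # 0.96 0.6 0.92
```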
Pang, Hui; Han, Bing; Fu, Qiang; Zong, Zhenkun
2017-07-05
The presence of acute myocardial infarction (AMI) confers a poor prognosis in atrial fibrillation (AF), being associated with dramatically increased mortality. This study aimed to evaluate the predictive value of the CHADS2 and CHA2DS2-VASc scores for AMI in patients with AF. This retrospective study enrolled 5140 consecutive nonvalvular AF patients: 300 patients with AMI and 4840 patients without AMI. We identified the optimal cut-off values of the CHADS2 and CHA2DS2-VASc scores based on receiver operating characteristic curves to predict the risk of AMI. Both the CHADS2 score and the CHA2DS2-VASc score were associated with an increased odds ratio of the prevalence of AMI in patients with AF, after adjustment for hyperlipidaemia, hyperuricemia, hyperthyroidism, hypothyroidism and obstructive sleep apnea. The present results showed that the area under the curve (AUC) for the CHADS2 score was 0.787, with a similar accuracy for the CHA2DS2-VASc score (AUC 0.750), in predicting "high-risk" AF patients who developed AMI. However, the predictive accuracy of the two clinical risk scores was only fair. The CHA2DS2-VASc score has fair predictive value for identifying high-risk patients with AF and is not significantly superior to CHADS2 in predicting patients who develop AMI.
Fuzzy regression modeling for tool performance prediction and degradation detection.
Li, X; Er, M J; Lim, B S; Zhou, J H; Gan, O P; Rutkowski, L
2010-10-01
In this paper, the viability of using Fuzzy-Rule-Based Regression Modeling (FRM) algorithm for tool performance and degradation detection is investigated. The FRM is developed based on a multi-layered fuzzy-rule-based hybrid system with Multiple Regression Models (MRM) embedded into a fuzzy logic inference engine that employs Self Organizing Maps (SOM) for clustering. The FRM converts a complex nonlinear problem to a simplified linear format in order to further increase the accuracy in prediction and rate of convergence. The efficacy of the proposed FRM is tested through a case study - namely to predict the remaining useful life of a ball nose milling cutter during a dry machining process of hardened tool steel with a hardness of 52-54 HRc. A comparative study is further made between four predictive models using the same set of experimental data. It is shown that the FRM is superior as compared with conventional MRM, Back Propagation Neural Networks (BPNN) and Radial Basis Function Networks (RBFN) in terms of prediction accuracy and learning speed.
The Importance of Calibration in Clinical Psychology.
Lindhiem, Oliver; Petersen, Isaac T; Mentch, Lucas K; Youngstrom, Eric A
2018-02-01
Accuracy has several elements, not all of which have received equal attention in the field of clinical psychology. Calibration, the degree to which a probabilistic estimate of an event reflects the true underlying probability of the event, has largely been neglected in the field of clinical psychology in favor of other components of accuracy such as discrimination (e.g., sensitivity, specificity, area under the receiver operating characteristic curve). Although it is frequently overlooked, calibration is a critical component of accuracy with particular relevance for prognostic models and risk-assessment tools. With advances in personalized medicine and the increasing use of probabilistic (0% to 100%) estimates and predictions in mental health research, the need for careful attention to calibration has become increasingly important.
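A minimal sketch of the calibration check described above: bin probabilistic predictions, compare the mean predicted probability with the observed event rate in each bin, and compute the Brier score. The probabilities and outcomes below are invented for illustration:

```python
def calibration_report(probs, outcomes, n_bins=5):
    """Bin probabilistic predictions, compare mean predicted probability
    with observed event rate per bin, and return the Brier score
    (mean squared error of the probabilities against 0/1 outcomes)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # clip p == 1.0 into last bin
        bins[idx].append((p, y))
    report = []
    for cell in bins:
        if cell:
            mean_p = sum(p for p, _ in cell) / len(cell)
            obs = sum(y for _, y in cell) / len(cell)
            report.append((round(mean_p, 2), round(obs, 2)))
    brier = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)
    return report, brier

# Invented risk estimates and observed outcomes
probs    = [0.1, 0.2, 0.15, 0.8, 0.9, 0.85, 0.7, 0.3]
outcomes = [0,   0,   1,    1,   1,   0,    1,   0]
report, brier = calibration_report(probs, outcomes)
print(report, round(brier, 3))
```

A well-calibrated model shows bin pairs whose two numbers track each other; large gaps flag miscalibration even when discrimination (e.g., AUC) looks good.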
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christian, Mark H; Hadjerioua, Boualem; Lee, Kyutae
2015-01-01
The following paper represents the results of an investigation into the impact of the number and placement of Current Meter (CM) flow sensors on the accuracy to which they are capable of predicting the overall flow rate. Flow measurement accuracy is of particular importance in multiunit plants because it plays a pivotal role in determining the operational efficiency characteristics of each unit, allowing the operator to select the unit (or combination of units) which most efficiently meets demand. Several case studies have demonstrated that optimization of unit dispatch has the potential to increase plant efficiencies by between 1 and 4.4 percent [2][3]. Unfortunately, current industry standards do not have an established methodology to measure the flow rate through hydropower units with short converging intakes (SCI); the only direction provided is that CM sensors should be used. The most common application of CM is horizontal, along a trolley which is incrementally lowered across a measurement cross section. As such, the measurement resolution is defined horizontally and vertically by the number of CM and the number of measurement increments, respectively. There has not been any published research on the role of resolution in either direction on the accuracy of flow measurement. The work below investigates the effectiveness of flow measurement in a SCI by performing a case study in which point velocity measurements were extracted from a physical plant and then used to calculate a series of reference flow distributions. These distributions were then used to perform sensitivity studies on the relation between the number of CM and the accuracy to which the flow rate was predicted. The research uncovered that a minimum of 795 plants contain SCI, a quantity which represents roughly 12% of total domestic hydropower capacity.
With regard to measurement accuracy, it was determined that accuracy ceases to increase considerably with further increases in vertical resolution beyond 49 transects. Moreover, the research found that 5 CM (when applied at 49 vertical transects) resulted in an average accuracy of 95.6%, and that additional sensors yielded a linear increase in accuracy up to 17 CM, which had an average accuracy of 98.5%. Beyond 17 CM, the incremental gains in accuracy from adding CM were found to decrease exponentially. Future work in this area will investigate the use of computational fluid dynamics to acquire a broader range of flow fields within SCI.
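The resolution-versus-accuracy behaviour described above can be mimicked with a toy numerical experiment: estimate the flow rate Q by midpoint-rule sampling of a velocity field at a grid of sensor positions, and watch the error shrink as the sensor count grows. The velocity profile and intake dimensions below are invented, not the case-study plant's data:

```python
import math

def flow_estimate(vel, width, height, n_sensors, n_transects):
    """Midpoint-rule estimate of Q = integral of v over the cross-section,
    sampling velocity at n_sensors x n_transects grid points."""
    dx = width / n_sensors
    dy = height / n_transects
    q = 0.0
    for i in range(n_sensors):
        for j in range(n_transects):
            q += vel((i + 0.5) * dx, (j + 0.5) * dy) * dx * dy
    return q

# Hypothetical smooth profile: zero at the walls, peak mid-section
def vel(x, y, W=10.0, H=5.0):
    return 4.0 * math.sin(math.pi * x / W) * math.sin(math.pi * y / H)

true_q = 4.0 * (2 * 10.0 / math.pi) * (2 * 5.0 / math.pi)  # exact integral
for m in (3, 5, 9, 17):
    est = flow_estimate(vel, 10.0, 5.0, m, 49)
    print(m, "sensors ->", round(100 * est / true_q, 2), "% of true Q")
```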
Tamura, Takeyuki; Akutsu, Tatsuya
2007-11-30
Subcellular location prediction of proteins is an important and well-studied problem in bioinformatics: given an amino acid sequence as input, predict which part of the cell the protein is transported to. The problem is becoming more important since information on subcellular location is helpful for the annotation of proteins and genes, and the number of complete genomes is rapidly increasing. Since existing predictors are based on various heuristics, it is important to develop a simple method with high prediction accuracy. In this paper, we propose a novel and general prediction method that combines techniques for sequence alignment with feature vectors based on amino acid composition. We implemented this method with support vector machines on plant data sets extracted from the TargetP database. Through fivefold cross-validation tests, the obtained overall accuracy and average MCC were 0.9096 and 0.8655, respectively. We also applied our method to other datasets, including that of WoLF PSORT. Although there is a predictor that uses gene ontology information and yields higher accuracy than ours, our accuracies are higher than those of existing predictors that use only sequence information. Since information such as gene ontology can be obtained only for known proteins, our predictor is considered useful for subcellular location prediction of newly discovered proteins. Furthermore, the idea of combining alignment and amino acid frequency is novel and general, so it may be applied to other problems in bioinformatics. Our method for plants is also implemented as a web system, available at http://sunflower.kuicr.kyoto-u.ac.jp/~tamura/slpfa.html.
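The Matthews correlation coefficient (MCC) reported above is computed from confusion-matrix counts. A minimal sketch for the binary case (the counts in the example are hypothetical; the paper's multi-class average would apply this per class):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts;
    returns 0.0 when any marginal is empty (the usual convention)."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

print(round(mcc(90, 85, 10, 15), 2))  # 0.75
```

Unlike plain accuracy, MCC stays informative on class-imbalanced data, which is why it is commonly reported alongside accuracy for location prediction.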
Predicting Football Matches Results using Bayesian Networks for English Premier League (EPL)
NASA Astrophysics Data System (ADS)
Razali, Nazim; Mustapha, Aida; Yatim, Faiz Ahmad; Aziz, Ruhaya Ab
2017-08-01
The issue of modeling association football prediction has become increasingly popular in the last few years, and many different prediction approaches have been proposed with the aim of evaluating the attributes that lead a football team to lose, draw or win a match. Three types of approaches have been considered for predicting football match results: statistical approaches, machine learning approaches and Bayesian approaches. Lately, many studies on football prediction models have been produced using Bayesian approaches. This paper proposes Bayesian Networks (BNs) to predict the results of football matches in terms of home win (H), away win (A) and draw (D). The English Premier League (EPL) for the three seasons 2010-2011, 2011-2012 and 2012-2013 has been selected and reviewed. K-fold cross-validation has been used to test the accuracy of the prediction model. The required football data are sourced from a legitimate site at http://www.football-data.co.uk. BNs achieved a predictive accuracy of 75.09% on average across the three seasons. It is hoped that the results could be used as the benchmark output for future research in predicting football match results.
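The k-fold cross-validation procedure used above can be sketched generically in stdlib Python. The trivial majority-class "model" and the H/A/D label distribution below are placeholders for illustration, not the paper's Bayesian Network or the EPL data:

```python
import random

def kfold_accuracy(data, labels, train_fn, k=10, seed=0):
    """Generic k-fold cross-validation: train on k-1 folds, score on the
    held-out fold, and average the per-fold accuracies."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = train_fn([data[i] for i in train], [labels[i] for i in train])
        correct = sum(model(data[i]) == labels[i] for i in fold)
        accs.append(correct / len(fold))
    return sum(accs) / k

# Trivial stand-in classifier: always predict the majority training label
def majority_trainer(X, y):
    maj = max(set(y), key=y.count)
    return lambda _: maj

data = list(range(20))
labels = ['H'] * 14 + ['A'] * 3 + ['D'] * 3  # home/away/draw, skewed toward H
print(round(kfold_accuracy(data, labels, majority_trainer, k=5), 2))  # 0.7
```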
Accuracy and Calibration of High Explosive Thermodynamic Equations of State
2010-08-01
[Only fragments of this abstract survived extraction: the report discusses calibration complexity of Jones-Wilkins-Lee (JWL) based thermodynamic equations of state with a generalized extent of aluminum reaction, and lists JWL and JWLB cylinder test predictions compared against experiments for LX-14, PAX-30, PAX-29, and an HMX/Al 85/15 formulation.]
Performance of genomic prediction within and across generations in maritime pine.
Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent
2016-08-11
Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.
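The GBLUP model referenced in this and several other abstracts here can be sketched with numpy: build a VanRaden genomic relationship matrix from centered marker genotypes, then shrink phenotypes toward the mean to obtain genomic breeding values. The marker counts, simulated effects, and variance ratio below are illustrative assumptions, not values from the study:

```python
# GBLUP sketch: genomic relationship matrix + mixed-model shrinkage.
import numpy as np

rng = np.random.default_rng(0)
n_ind, n_snp = 50, 200
M = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)   # 0/1/2 genotypes

p = M.mean(axis=0) / 2.0                       # allele frequencies
W = M - 2 * p                                  # centered genotypes
G = W @ W.T / (2 * (p * (1 - p)).sum())        # VanRaden genomic relationship

true_effects = rng.normal(0, 0.1, n_snp)
y = W @ true_effects + rng.normal(0, 1.0, n_ind)   # simulated phenotypes

lam = 1.0                                      # assumed var_e / var_g ratio
gebv = G @ np.linalg.solve(G + lam * np.eye(n_ind), y - y.mean())

# "prediction accuracy" as used in these abstracts: a correlation between
# predicted breeding values and (here, simulated) true genetic values
accuracy = np.corrcoef(gebv, W @ true_effects)[0, 1]
```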
Guilloux, Jean-Philippe; Bassi, Sabrina; Ding, Ying; Walsh, Chris; Turecki, Gustavo; Tseng, George; Cyranowski, Jill M; Sibille, Etienne
2015-02-01
Major depressive disorder (MDD) in general, and anxious-depression in particular, are characterized by poor rates of remission with first-line treatments, contributing to the chronic illness burden suffered by many patients. Prospective research is needed to identify the biomarkers predicting nonremission prior to treatment initiation. We collected blood samples from a discovery cohort of 34 adult MDD patients with co-occurring anxiety and 33 matched, nondepressed controls at baseline and after 12 weeks (of citalopram plus psychotherapy treatment for the depressed cohort). Samples were processed on gene arrays and group differences in gene expression were investigated. Exploratory analyses suggest that at pretreatment baseline, nonremitting patients differ from controls with gene function and transcription factor analyses potentially related to elevated inflammation and immune activation. In a second phase, we applied an unbiased machine learning prediction model and corrected for model-selection bias. Results show that baseline gene expression predicted nonremission with 79.4% corrected accuracy with a 13-gene model. The same gene-only model predicted nonremission after 8 weeks of citalopram treatment with 76% corrected accuracy in an independent validation cohort of 63 MDD patients treated with citalopram at another institution. Together, these results demonstrate the potential, but also the limitations, of baseline peripheral blood-based gene expression to predict nonremission after citalopram treatment. These results not only support their use in future prediction tools but also suggest that increased accuracy may be obtained with the inclusion of additional predictors (eg, genetics and clinical scales).
Janoff, Daniel M; Davol, Patrick; Hazzard, James; Lemmers, Michael J; Paduch, Darius A; Barry, John M
2004-01-01
Computerized tomography (CT) with 3-dimensional (3-D) reconstruction has gained acceptance as an imaging study to evaluate living renal donors. We report our experience with this technique in 199 consecutive patients to validate its predictions of arterial anatomy and kidney volumes. Between January 1997 and March 2002, 199 living donor nephrectomies were performed at our institution using an open technique. During the operation arterial anatomy was recorded as well as kidney weight in 98 patients and displacement volume in 27. Each donor had been evaluated preoperatively by CT angiography with 3-D reconstruction. Arterial anatomy described by a staff radiologist was compared with intraoperative findings. CT estimated volumes were reported. Linear correlation graphs were generated to assess the reliability of CT volume predictions. The accuracy of CT angiography for predicting arterial anatomy was 90.5%. However, as the number of renal arteries increased, predictive accuracy decreased. The ability of CT to predict multiple arteries remained high with a positive predictive value of 95.2%. Calculated CT volume and kidney weight significantly correlated (0.654). However, the coefficient of variation index (how much average CT volume differed from measured intraoperative volume) was 17.8%. CT angiography with 3-D reconstruction accurately predicts arterial vasculature in more than 90% of patients and it can be used to compare renal volumes. However, accuracy decreases with multiple renal arteries and volume comparisons may be inaccurate when the difference in kidney volumes is within 17.8%.
Wager, Tor D.; Atlas, Lauren Y.; Leotti, Lauren A.; Rilling, James K.
2012-01-01
Recent studies have identified brain correlates of placebo analgesia, but none have assessed how accurately patterns of brain activity can predict individual differences in placebo responses. We reanalyzed data from two fMRI studies of placebo analgesia (N = 47), using patterns of fMRI activity during the anticipation and experience of pain to predict new subjects’ scores on placebo analgesia and placebo-induced changes in pain processing. We used a cross-validated regression procedure, LASSO-PCR, which provided both unbiased estimates of predictive accuracy and interpretable maps of which regions are most important for prediction. Increased anticipatory activity in a frontoparietal network and decreases in a posterior insular/temporal network predicted placebo analgesia. Patterns of anticipatory activity across the cortex predicted a moderate amount of variance in the placebo response (~12% overall, ~40% for study 2 alone), which is substantial considering the multiple likely contributing factors. The most predictive regions were those associated with emotional appraisal, rather than cognitive control or pain processing. During pain, decreases in limbic and paralimbic regions most strongly predicted placebo analgesia. Responses within canonical pain-processing regions explained significant variance in placebo analgesia, but the pattern of effects was inconsistent with widespread decreases in nociceptive processing. Together, the findings suggest that engagement of emotional appraisal circuits drives individual variation in placebo analgesia, rather than early suppression of nociceptive processing. This approach provides a framework that will allow prediction accuracy to increase as new studies provide more precise information for future predictive models. PMID:21228154
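The LASSO-PCR procedure above (principal-component scores fed to an L1-penalised regression) can be sketched as follows; the coordinate-descent solver, data dimensions, and simulated outcome are toy assumptions, not the authors' implementation:

```python
# LASSO-PCR sketch: PCA via SVD, then L1-penalised regression on scores.
import numpy as np

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimal coordinate-descent LASSO (columns assumed roughly scaled)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            resid = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ resid / n
            z = (X[:, j] @ X[:, j]) / n
            beta[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 100))                 # 40 subjects x 100 "voxels"

# PCA via SVD; keep the top 10 component scores
U, s, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
scores = U[:, :10] * s[:10]

# simulate an outcome carried by the first two components
y = 0.5 * scores[:, 0] - 0.3 * scores[:, 1] + rng.normal(0, 0.5, 40)

beta = lasso_cd(scores, y - y.mean(), alpha=0.1)
pred = scores @ beta + y.mean()
```

In the actual study the fit was evaluated on held-out subjects; here the in-sample fit only illustrates the pipeline's shape.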
Grossi, D A; Brito, L F; Jafarikia, M; Schenkel, F S; Feng, Z
2018-04-30
The uptake of genomic selection (GS) by the swine industry is still limited by the costs of genotyping. A feasible alternative to overcome this challenge is to genotype animals using an affordable low-density (LD) single nucleotide polymorphism (SNP) chip panel followed by accurate imputation to a high-density panel. Therefore, the main objective of this study was to screen incremental densities of LD panels in order to systematically identify one that balances the tradeoffs among imputation accuracy, prediction accuracy of genomic estimated breeding values (GEBVs), and genotype density (directly associated with genotyping costs). Genotypes using the Illumina Porcine60K BeadChip were available for 1378 Duroc (DU), 2361 Landrace (LA) and 3192 Yorkshire (YO) pigs. In addition, pseudo-phenotypes (de-regressed estimated breeding values) for five economically important traits were provided for the analysis. The reference population for genotyping imputation consisted of 931 DU, 1631 LA and 2103 YO animals and the remainder individuals were included in the validation population of each breed. A LD panel of 3000 evenly spaced SNPs (LD3K) yielded high imputation accuracy rates: 93.78% (DU), 97.07% (LA) and 97.00% (YO) and high correlations (>0.97) between the predicted GEBVs using the actual 60 K SNP genotypes and the imputed 60 K SNP genotypes for all traits and breeds. The imputation accuracy was influenced by the reference population size as well as the amount of parental genotype information available in the reference population. However, parental genotype information became less important when the LD panel had at least 3000 SNPs. The correlation of the GEBVs directly increased with an increase in imputation accuracy. When genotype information for both parents was available, a panel of 300 SNPs (imputed to 60 K) yielded GEBV predictions highly correlated (⩾0.90) with genomic predictions obtained based on the true 60 K panel, for all traits and breeds. 
For a small reference population with no parental genotypes available, use of a panel at least as dense as LD3K is recommended; when both parents are genotyped in the reference population, a panel as small as LD300 may be a feasible option. These findings are of great importance for the development of LD panels for swine, helping to reduce genotyping costs, increase the uptake of GS and, therefore, optimize the profitability of the swine industry.
Electrophysiological evidence for preserved primacy of lexical prediction in aging.
Dave, Shruti; Brothers, Trevor A; Traxler, Matthew J; Ferreira, Fernanda; Henderson, John M; Swaab, Tamara Y
2018-05-28
Young adults show consistent neural benefits of predictable contexts when processing upcoming words, but these benefits are less clear-cut in older adults. Here we disentangle the neural correlates of prediction accuracy and contextual support during word processing, in order to test current theories that suggest that neural mechanisms underlying predictive processing are specifically impaired in older adults. During a sentence comprehension task, older and younger readers were asked to predict passage-final words and report the accuracy of these predictions. Age-related reductions were observed for N250 and N400 effects of prediction accuracy, as well as for N400 effects of contextual support independent of prediction accuracy. Furthermore, temporal primacy of predictive processing (i.e., earlier facilitation for successful predictions) was preserved across the lifespan, suggesting that predictive mechanisms are unlikely to be uniquely impaired in older adults. In addition, older adults showed prediction effects on frontal post-N400 positivities (PNPs) that were similar in amplitude to PNPs in young adults. Previous research has shown correlations between verbal fluency and lexical prediction in older adult readers, suggesting that the production system may be linked to capacity for lexical prediction, especially in aging. The current study suggests that verbal fluency modulates PNP effects of contextual support, but not prediction accuracy. Taken together, our findings suggest that aging does not result in specific declines in lexical prediction. Copyright © 2018 Elsevier Ltd. All rights reserved.
Strain dependency of the effects of nicotine and mecamylamine in a rat model of attention.
Hahn, Britta; Riegger, Katelyn E; Elmer, Greg I
2016-04-01
Processes of attention have a heritable component, suggesting that genetic predispositions may predict variability in the response to attention-enhancing drugs. Among lead compounds with attention-enhancing properties are nicotinic acetylcholine receptor (nAChR) agonists. This study aims to test, by comparing three rat strains, whether genotype may influence the sensitivity to nicotine in the 5-choice serial reaction time task (5-CSRTT), a rodent model of attention. Strains tested were Long Evans (LE), Sprague Dawley (SD), and Wistar rats. The 5-CSRTT requires responses to light stimuli presented randomly in one of five locations. The effect of interest was an increased percentage of responses in the correct location (accuracy), the strongest indicator of improved attention. Nicotine (0.05-0.2 mg/kg s.c.) reduced omission errors and response latency and increased anticipatory responding in all strains. In contrast, nicotine dose-dependently increased accuracy in Wistar rats only. The nAChR antagonist mecamylamine (0.75-3 mg/kg s.c.) increased omissions, slowed responses, and reduced anticipatory responding in all strains. There were no effects on accuracy, which was surprising given the clear improvement with nicotine in the Wistar group. The findings suggest strain differences in the attention-enhancing effects of nicotine, which would indicate that genetic predispositions predict variability in the efficacy of nAChR compounds for enhancing attention. The absence of an effect of mecamylamine on response accuracy may suggest a contribution of nAChR desensitization to the attention-enhancing effects of nicotine.
Lessons in molecular recognition. 2. Assessing and improving cross-docking accuracy.
Sutherland, Jeffrey J; Nandigam, Ravi K; Erickson, Jon A; Vieth, Michal
2007-01-01
Docking methods are used to predict the manner in which a ligand binds to a protein receptor. Many studies have assessed the success rate of programs in self-docking tests, whereby a ligand is docked into the protein structure from which it was extracted. Cross-docking, or using a protein structure from a complex containing a different ligand, provides a more realistic assessment of a docking program's ability to reproduce X-ray results. In this work, cross-docking was performed with CDocker, Fred, and Rocs using multiple X-ray structures for eight proteins (two kinases, one nuclear hormone receptor, one serine protease, two metalloproteases, and two phosphodiesterases). While average cross-docking accuracy is not encouraging, it is shown that using the protein structure from the complex that contains the bound ligand most similar to the docked ligand increases docking accuracy for all methods ("similarity selection"). Identifying the most successful protein conformer ("best selection") and similarity selection substantially reduce the difference between self-docking and average cross-docking accuracy. We identify universal predictors of docking accuracy (i.e., showing consistent behavior across most protein-method combinations), and show that models for predicting docking accuracy built using these parameters can be used to select the most appropriate docking method.
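The "similarity selection" heuristic above (dock into the protein structure whose co-crystallized ligand is most similar to the query ligand) reduces to a Tanimoto comparison over ligand fingerprints. The bit-set fingerprints and structure IDs below are invented for illustration; real fingerprints would come from a cheminformatics toolkit:

```python
# Similarity selection sketch: pick the receptor structure whose bound
# ligand has the highest Tanimoto similarity to the query ligand.
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprint bit sets."""
    return len(a & b) / len(a | b)

structures = {                 # structure id -> bound-ligand fingerprint
    "1abc": {1, 2, 3, 7},
    "2def": {2, 3, 4, 8, 9},
    "3ghi": {1, 2, 3, 4, 7},
}
query = {1, 2, 3, 4}           # fingerprint of the ligand to be docked

best = max(structures, key=lambda s: tanimoto(structures[s], query))
```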
Genomic prediction of reproduction traits for Merino sheep.
Bolormaa, S; Brown, D J; Swan, A A; van der Werf, J H J; Hayes, B J; Daetwyler, H D
2017-06-01
Economically important reproduction traits in sheep, such as number of lambs weaned and litter size, are expressed only in females and later in life after most selection decisions are made, which makes them ideal candidates for genomic selection. Accurate genomic predictions would lead to greater genetic gain for these traits by enabling accurate selection of young rams with high genetic merit. The aim of this study was to design and evaluate the accuracy of a genomic prediction method for female reproduction in sheep using daughter trait deviations (DTD) for sires and ewe phenotypes (when individual ewes were genotyped) for three reproduction traits: number of lambs born (NLB), litter size (LSIZE) and number of lambs weaned. Genomic best linear unbiased prediction (GBLUP), BayesR and pedigree BLUP analyses of the three reproduction traits measured on 5340 sheep (4503 ewes and 837 sires) with real and imputed genotypes for 510 174 SNPs were performed. The prediction of breeding values using both sire and ewe trait records was validated in Merino sheep. Prediction accuracy was evaluated by across sire family and random cross-validations. Accuracies of genomic estimated breeding values (GEBVs) were assessed as the mean Pearson correlation adjusted by the accuracy of the input phenotypes. The addition of sire DTD into the prediction analysis resulted in higher accuracies compared with using only ewe records in genomic predictions or pedigree BLUP. Using GBLUP, the average accuracy based on the combined records (ewes and sire DTD) was 0.43 across traits, but the accuracies varied by trait and type of cross-validations. The accuracies of GEBVs from random cross-validations (range 0.17-0.61) were higher than were those from sire family cross-validations (range 0.00-0.51). The GEBV accuracies of 0.41-0.54 for NLB and LSIZE based on the combined records were amongst the highest in the study. 
Although BayesR was not significantly different from GBLUP in prediction accuracy, it identified several candidate genes which are known to be associated with NLB and LSIZE. The approach provides a way to make use of all data available in genomic prediction for traits that have limited recording. © 2017 Stichting International Foundation for Animal Genetics.
Pośpiech, Ewelina; Wojas-Pelc, Anna; Walsh, Susan; Liu, Fan; Maeda, Hitoshi; Ishikawa, Takaki; Skowron, Małgorzata; Kayser, Manfred; Branicki, Wojciech
2014-07-01
The role of epistatic effects in the determination of complex traits is often underlined but its significance in the prediction of pigmentation phenotypes has not been evaluated so far. The prediction of pigmentation from genetic data can be useful in forensic science to describe the physical appearance of an unknown offender, victim, or missing person who cannot be identified via conventional DNA profiling. Available forensic DNA prediction systems enable the reliable prediction of several eye and hair colour categories. However, there is still space for improvement. Here we verified the association of 38 candidate DNA polymorphisms from 13 genes and explored the extent to which interactions between them may be involved in human pigmentation and their impact on forensic DNA prediction in particular. The model-building set included 718 Polish samples and the model-verification set included 307 independent Polish samples and additional 72 samples from Japan. In total, 29 significant SNP-SNP interactions were found with 5 of them showing an effect on phenotype prediction. For predicting green eye colour, interactions between HERC2 rs12913832 and OCA2 rs1800407 as well as TYRP1 rs1408799 raised the prediction accuracy expressed by AUC from 0.667 to 0.697 and increased the prediction sensitivity by >3%. Interaction between MC1R 'R' variants and VDR rs731236 increased the sensitivity for light skin by >1% and by almost 3% for dark skin colour prediction. Interactions between VDR rs1544410 and TYR rs1042602 as well as between MC1R 'R' variants and HERC2 rs12913832 provided an increase in red/non-red hair prediction accuracy from an AUC of 0.902-0.930. Our results thus underline epistasis as a common phenomenon in human pigmentation genetics and demonstrate that considering SNP-SNP interactions in forensic DNA phenotyping has little impact on eye, hair and skin colour prediction. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Task experience and children’s working memory performance: A perspective from recall timing
Towse, John N.; Cowan, Nelson; Horton, Neil J.; Whytock, Shealagh
2008-01-01
Working memory is an important theoretical construct among children, and measures of its capacity predict a range of cognitive skills and abilities. Data from 9- and 11-year-old children illustrate how a chronometric analysis of recall can complement and elaborate recall accuracy in advancing our understanding of working memory. A reading span task was completed by 130 children, 75 of whom were tested on two occasions, with sequence length either increasing or decreasing during test administration. Substantial pauses occur during participants’ recall sequences and they represent consistent performance traits over time, whilst also varying with recall circumstances and task history. Recall pauses help to predict reading and number skills, alongside as well as separate from levels of recall accuracy. The task demands of working memory change as a function of task experience, with a combination of accuracy and response timing in novel task situations being the strongest predictor of cognitive attainment. PMID:18473637
Study design requirements for RNA sequencing-based breast cancer diagnostics.
Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias
2016-02-01
Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.
A Coupled Surface Nudging Scheme for use in Retrospective ...
A surface analysis nudging scheme coupling atmospheric and land surface thermodynamic parameters has been implemented into WRF v3.8 (latest version) for use with retrospective weather and climate simulations, as well as for applications in air quality, hydrology, and ecosystem modeling. This scheme is known as the flux-adjusting surface data assimilation system (FASDAS), developed by Alapaty et al. (2008). The scheme provides continuous adjustments for soil moisture and temperature (via indirect nudging) and for surface air temperature and water vapor mixing ratio (via direct nudging). The simultaneous application of indirect and direct nudging maintains greater consistency between the soil temperature-moisture and the atmospheric surface layer mass-field variables. The new method, FASDAS, consistently improved the accuracy of model simulations at weather prediction scales for different horizontal grid resolutions, as well as for high resolution regional climate predictions. This capability has been released in WRF Version 3.8 as option grid_sfdda = 2. It increases the accuracy of atmospheric inputs for use in air quality, hydrology, and ecosystem modeling research, improving the accuracy of the respective end-point research outcomes. IMPACT: A new method, FASDAS, was implemented into the WRF model to consistently improve the accuracy of model simulations at weather prediction scales for different horizontal grid resolutions, as well as for high resolution regional climate predictions.
The advantages of the surface Laplacian in brain-computer interface research.
McFarland, Dennis J
2015-09-01
Brain-computer interface (BCI) systems frequently use signal processing methods, such as spatial filtering, to enhance performance. The surface Laplacian can reduce spatial noise and aid in identification of sources. In BCI research, these two functions of the surface Laplacian correspond to prediction accuracy and signal orthogonality. In the present study, an off-line analysis of data from a sensorimotor rhythm-based BCI task dissociated these functions of the surface Laplacian by comparing nearest-neighbor and next-nearest neighbor Laplacian algorithms. The nearest-neighbor Laplacian produced signals that were more orthogonal while the next-nearest Laplacian produced signals that resulted in better accuracy. Both prediction and signal identification are important for BCI research. Better prediction of user's intent produces increased speed and accuracy of communication and control. Signal identification is important for ruling out the possibility of control by artifacts. Identifying the nature of the control signal is relevant both to understanding exactly what is being studied and in terms of usability for individuals with limited motor control. Copyright © 2014 Elsevier B.V. All rights reserved.
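A nearest-neighbour surface Laplacian of the kind compared above subtracts the mean of a channel's adjacent channels from that channel; here is a minimal sketch on a hypothetical 3x3 electrode grid:

```python
# Nearest-neighbour surface Laplacian sketch on a toy electrode grid.
import numpy as np

def laplacian(grid):
    """Each channel minus the mean of its in-bounds orthogonal neighbours."""
    out = np.zeros_like(grid)
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            nbrs = [grid[rr, cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols]
            out[r, c] = grid[r, c] - np.mean(nbrs)
    return out

sample = np.array([[1.0, 1.0, 1.0],
                   [1.0, 5.0, 1.0],
                   [1.0, 1.0, 1.0]])
filtered = laplacian(sample)     # sharpens the central "source"
```

A next-nearest-neighbour variant, as in the study, would simply draw its neighbour offsets from two grid steps away.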
Genomic Prediction of Seed Quality Traits Using Advanced Barley Breeding Lines
Nielsen, Nanna Hellum; Jahoor, Ahmed; Jensen, Jens Due; Orabi, Jihad; Cericola, Fabio; Edriss, Vahid; Jensen, Just
2016-01-01
Genomic selection was recently introduced in plant breeding. The objective of this study was to develop genomic prediction for important seed quality parameters in spring barley. The aim was to predict breeding values without expensive phenotyping of large sets of lines. A total of 309 advanced spring barley lines tested at two locations, each with three replicates, were phenotyped, and each line was genotyped with the Illumina iSelect 9K barley chip. The population originated from two different breeding sets, which were phenotyped in two different years. Phenotypic measurements considered were: seed size, protein content, protein yield, test weight and ergosterol content. A leave-one-out cross-validation strategy revealed high prediction accuracies ranging between 0.40 and 0.83. Prediction across breeding sets resulted in reduced accuracies compared to the leave-one-out strategy. Furthermore, predicting across full- and half-sib families resulted in reduced prediction accuracies. Additionally, predictions were performed using reduced marker sets and reduced training population sets. In conclusion, using fewer than 200 lines in the training set can result in low prediction accuracy, and the accuracy will then be highly dependent on the family structure of the selected training set. However, the results also indicate that relatively small training sets (200 lines) are sufficient for genomic prediction in commercial barley breeding. In addition, our results indicate a minimum marker set of 1,000 to decrease the risk of low prediction accuracy for some traits or some families. PMID:27783639
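The leave-one-out cross-validation strategy used for the barley lines can be sketched generically; the 1-nearest-neighbour predictor and simulated marker data below are illustrative stand-ins for the genomic prediction model:

```python
# Leave-one-out cross-validation sketch: each line is predicted from a
# model "trained" on all remaining lines; a 1-nearest-neighbour lookup
# on toy marker data stands in for the genomic model.
import numpy as np

rng = np.random.default_rng(2)
geno = rng.integers(0, 3, size=(30, 50)).astype(float)       # 30 lines, 50 SNPs
pheno = geno[:, :5].sum(axis=1) + rng.normal(0, 0.5, 30)     # 5 causal SNPs

preds = []
for i in range(len(pheno)):
    train = np.delete(np.arange(len(pheno)), i)              # leave line i out
    dists = np.abs(geno[train] - geno[i]).sum(axis=1)        # Manhattan distance
    nearest = train[np.argmin(dists)]
    preds.append(pheno[nearest])

accuracy = np.corrcoef(preds, pheno)[0, 1]   # predictive ability
```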
An automated decision-tree approach to predicting protein interaction hot spots.
Darnell, Steven J; Page, David; Mitchell, Julie C
2007-09-01
Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface. 2007 Wiley-Liss, Inc.
Low, Yen S.; Sedykh, Alexander; Rusyn, Ivan; Tropsha, Alexander
2017-01-01
Cheminformatics approaches such as Quantitative Structure Activity Relationship (QSAR) modeling have been used traditionally for predicting chemical toxicity. In recent years, high throughput biological assays have been increasingly employed to elucidate mechanisms of chemical toxicity and predict toxic effects of chemicals in vivo. The data generated in such assays can be considered as biological descriptors of chemicals that can be combined with molecular descriptors and employed in QSAR modeling to improve the accuracy of toxicity prediction. In this review, we discuss several approaches for integrating chemical and biological data for predicting biological effects of chemicals in vivo and compare their performance across several data sets. We conclude that while no method consistently shows superior performance, the integrative approaches rank consistently among the best yet offer enriched interpretation of models over those built with either chemical or biological data alone. We discuss the outlook for such interdisciplinary methods and offer recommendations to further improve the accuracy and interpretability of computational models that predict chemical toxicity. PMID:24805064
Improving CSF biomarker accuracy in predicting prevalent and incident Alzheimer disease
Fagan, A.M.; Williams, M.M.; Ghoshal, N.; Aeschleman, M.; Grant, E.A.; Marcus, D.S.; Mintun, M.A.; Holtzman, D.M.; Morris, J.C.
2011-01-01
Objective: To investigate factors, including cognitive and brain reserve, which may independently predict prevalent and incident dementia of the Alzheimer type (DAT) and to determine whether inclusion of identified factors increases the predictive accuracy of the CSF biomarkers Aβ42, tau, ptau181, tau/Aβ42, and ptau181/Aβ42. Methods: Logistic regression identified variables that predicted prevalent DAT when considered together with each CSF biomarker in a cross-sectional sample of 201 participants with normal cognition and 46 with DAT. The area under the receiver operating characteristic curve (AUC) from the resulting model was compared with the AUC generated using the biomarker alone. In a second sample with normal cognition at baseline and longitudinal data available (n = 213), Cox proportional hazards models identified variables that predicted incident DAT together with each biomarker, and the models' concordance probability estimate (CPE) was compared to the CPE generated using the biomarker alone. Results: APOE genotype including an ε4 allele, male gender, and smaller normalized whole brain volumes (nWBV) were cross-sectionally associated with DAT when considered together with every biomarker. In the longitudinal sample (mean follow-up = 3.2 years), 14 participants (6.6%) developed DAT. Older age predicted a faster time to DAT in every model, and greater education predicted a slower time in 4 of 5 models. Inclusion of ancillary variables resulted in better cross-sectional prediction of DAT for all biomarkers (p < 0.0021), and better longitudinal prediction for 4 of 5 biomarkers (p < 0.0022). Conclusions: The predictive accuracy of CSF biomarkers is improved by including age, education, and nWBV in analyses. PMID:21228296
Lu, Liqiang; Gopalan, Balaji; Benyahia, Sofiane
2017-06-21
Several discrete particle methods exist in the open literature to simulate fluidized bed systems, such as the discrete element method (DEM), time driven hard sphere (TDHS), coarse-grained particle method (CGPM), coarse-grained hard sphere (CGHS), and multi-phase particle-in-cell (MP-PIC). These approaches usually solve the fluid phase in a Eulerian fixed frame of reference and the particle phase using the Lagrangian method. The first difference between these models lies in tracking either real particles or lumped parcels. The second difference is in the treatment of particle-particle interactions: by calculating collision forces (DEM and CGPM), using momentum conservation laws (TDHS and CGHS), or based on a particle stress model (MP-PIC). These major model differences lead to a wide range of accuracy and computation speed. However, these models have never been compared directly using the same experimental dataset. In this research, a small-scale fluidized bed is simulated with these methods using the same open-source code MFIX. The results indicate that modeling particle-particle collisions by TDHS increases the computation speed while maintaining good accuracy. Also, lumping a few particles into a parcel increases the computation speed with little loss in accuracy. However, modeling particle-particle interactions with a solids stress model leads to a large loss in accuracy with only a small increase in computation speed. The MP-PIC method predicts an unphysical particle-particle overlap, which results in incorrect voidage distribution and incorrect overall bed hydrodynamics. Based on this study, we recommend using the CGHS method for fluidized bed simulations due to its computational speed, which rivals that of MP-PIC while maintaining much better accuracy.
Prediction of recovery of motor function after stroke.
Stinear, Cathy
2010-12-01
Stroke is a leading cause of disability. The ability to live independently after stroke depends largely on the reduction of motor impairment and the recovery of motor function. Accurate prediction of motor recovery assists rehabilitation planning and supports realistic goal setting by clinicians and patients. Initial impairment is negatively related to degree of recovery, but inter-individual variability makes accurate prediction difficult. Neuroimaging and neurophysiological assessments can be used to measure the extent of stroke damage to the motor system and predict subsequent recovery of function, but these techniques are not yet used routinely. The use of motor impairment scores and neuroimaging has been refined by two recent studies in which these investigations were used at multiple time points early after stroke. Voluntary finger extension and shoulder abduction within 5 days of stroke predicted subsequent recovery of upper-limb function. Diffusion-weighted imaging within 7 days detected the effects of stroke on caudal motor pathways and was predictive of lasting motor impairment. Thus, investigations done soon after stroke had good prognostic value. The potential prognostic value of cortical activation and neural plasticity has been explored for the first time by two recent studies. Functional MRI detected a pattern of cortical activation at the acute stage that was related to subsequent reduction in motor impairment. Transcranial magnetic stimulation enabled measurement of neural plasticity in the primary motor cortex, which was related to subsequent disability. These studies open interesting new lines of enquiry. Where next? The accuracy of prediction might be increased by taking into account the motor system's capacity for functional reorganisation in response to therapy, in addition to the extent of stroke-related damage.
Improved prognostic accuracy could also be gained by combining simple tests of motor impairment with neuroimaging, genotyping, and neurophysiological assessment of neural plasticity. The development of algorithms to guide the sequential combinations of these assessments could also further increase accuracy, in addition to improving rehabilitation planning and outcomes. Copyright © 2010 Elsevier Ltd. All rights reserved.
Clinical versus actuarial judgment.
Dawes, R M; Faust, D; Meehl, P E
1989-03-31
Professionals are frequently consulted to diagnose and predict human behavior; optimal treatment and planning often hinge on the consultant's judgmental accuracy. The consultant may rely on one of two contrasting approaches to decision-making--the clinical and actuarial methods. Research comparing these two approaches shows the actuarial method to be superior. Factors underlying the greater accuracy of actuarial methods, sources of resistance to the scientific findings, and the benefits of increased reliance on actuarial approaches are discussed.
Practical approach to subject-specific estimation of knee joint contact force.
Knarr, Brian A; Higginson, Jill S
2015-08-20
Compressive forces experienced at the knee can significantly contribute to cartilage degeneration. Musculoskeletal models enable predictions of the internal forces experienced at the knee, but validation is often not possible, as experimental data detailing loading at the knee joint is limited. Recently available data reporting compressive knee force through direct measurement using instrumented total knee replacements offer a unique opportunity to evaluate the accuracy of models. Previous studies have highlighted the importance of subject-specificity in increasing the accuracy of model predictions; however, these techniques may be unrealistic outside of a research setting. Therefore, the goal of our work was to identify a practical approach for accurate prediction of tibiofemoral knee contact force (KCF). Four methods for prediction of knee contact force were compared: (1) standard static optimization, (2) uniform muscle coordination weighting, (3) subject-specific muscle coordination weighting and (4) subject-specific strength adjustments. Walking trials for three subjects with instrumented knee replacements were used to evaluate the accuracy of model predictions. Predictions utilizing subject-specific muscle coordination weighting yielded the best agreement with experimental data; however, this method required in vivo data for weighting factor calibration. Including subject-specific strength adjustments improved models' predictions compared to standard static optimization, with errors in peak KCF less than 0.5 body weight for all subjects. Overall, combining clinical assessments of muscle strength with standard tools available in the OpenSim software package, such as inverse kinematics and static optimization, appears to be a practical method for predicting joint contact force that can be implemented for many applications. Copyright © 2015 Elsevier Ltd. All rights reserved.
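The static-optimization step mentioned above can be sketched in a few lines: given moment arms and maximum isometric forces for a handful of muscles (all numbers below are hypothetical, not the study's data), the activations that reproduce a required joint moment while minimizing the sum of squared activations have a closed form via the pseudoinverse.

```python
import numpy as np

# Minimal static-optimization sketch at one time frame (hypothetical values).
R = np.array([[0.04, 0.06, 0.03]])          # moment arms (m) of 3 muscles
Fmax = np.array([3000.0, 2000.0, 2500.0])   # max isometric forces (N)
tau = np.array([90.0])                      # required joint moment (N*m)

# Minimize ||a||^2 subject to R @ (Fmax * a) = tau.
M = R * Fmax                                # moment produced per unit activation
a = M.T @ np.linalg.solve(M @ M.T, tau)     # minimum-norm solution

forces = Fmax * a
print(a, np.allclose(R @ forces, tau))      # activations; moment is reproduced
```

Real pipelines (e.g., static optimization in OpenSim) solve this per time frame with bounds 0 ≤ a ≤ 1 and physiological cost functions; the closed form here omits those bounds for brevity.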
Davey, James A; Chica, Roberto A
2015-04-01
Computational protein design (CPD) predictions are highly dependent on the structure of the input template used. However, it is unclear how small differences in template geometry translate to large differences in stability prediction accuracy. Herein, we explored how structural changes to the input template affect the outcome of stability predictions by CPD. To do this, we prepared alternate templates by Rotamer Optimization followed by energy Minimization (ROM) and used them to recapitulate the stability of 84 protein G domain β1 mutant sequences. In the ROM process, side-chain rotamers for wild-type (WT) or mutant sequences are optimized on crystal or nuclear magnetic resonance (NMR) structures prior to template minimization, resulting in alternate structures termed ROM templates. We show that use of ROM templates prepared from sequences known to be stable results predominantly in improved prediction accuracy compared to using the minimized crystal or NMR structures. Conversely, ROM templates prepared from sequences that are less stable than the WT reduce prediction accuracy by increasing the number of false positives. These observed changes in prediction outcomes are attributed to differences in side-chain contacts made by rotamers in ROM templates. Finally, we show that ROM templates prepared from sequences that are unfolded or that adopt a nonnative fold result in the selective enrichment of sequences that are also unfolded or that adopt a nonnative fold, respectively. Our results demonstrate the existence of a rotamer bias caused by the input template that can be harnessed to skew predictions toward sequences displaying desired characteristics. © 2014 The Protein Society.
Towards Cooperative Predictive Data Mining in Competitive Environments
NASA Astrophysics Data System (ADS)
Lisý, Viliam; Jakob, Michal; Benda, Petr; Urban, Štěpán; Pěchouček, Michal
We study the problem of predictive data mining in a competitive multi-agent setting, in which each agent is assumed to have some partial knowledge required for correctly classifying a set of unlabelled examples. The agents are self-interested and therefore need to reason about the trade-offs between increasing their classification accuracy by collaborating with other agents and disclosing their private classification knowledge to other agents through such collaboration. We analyze the problem and propose a set of components which can enable cooperation in this otherwise competitive task. These components include measures for quantifying private knowledge disclosure, data-mining models suitable for multi-agent predictive data mining, and a set of strategies by which agents can improve their classification accuracy through collaboration. The overall framework and its individual components are validated on a synthetic experimental domain.
Mitigating Errors in External Respiratory Surrogate-Based Models of Tumor Position
DOE Office of Scientific and Technical Information (OSTI.GOV)
Malinowski, Kathleen T.; Fischell Department of Bioengineering, University of Maryland, College Park, MD; McAvoy, Thomas J.
2012-04-01
Purpose: To investigate the effect of tumor site, measurement precision, tumor-surrogate correlation, training data selection, model design, and interpatient and interfraction variations on the accuracy of external marker-based models of tumor position. Methods and Materials: Cyberknife Synchrony system log files comprising synchronously acquired positions of external markers and the tumor from 167 treatment fractions were analyzed. The accuracy of Synchrony, ordinary-least-squares regression, and partial-least-squares regression models for predicting the tumor position from the external markers was evaluated. The quantity and timing of the data used to build the predictive model were varied. The effects of tumor-surrogate correlation and the precision in both the tumor and the external surrogate position measurements were explored by adding noise to the data. Results: The tumor position prediction errors increased during the duration of a fraction. Increasing the training data quantities did not always lead to more accurate models. Adding uncorrelated noise to the external marker-based inputs degraded the tumor-surrogate correlation models by 16% for partial-least-squares and 57% for ordinary-least-squares. External marker and tumor position measurement errors led to tumor position prediction changes 0.3-3.6 times the magnitude of the measurement errors, varying widely with model algorithm. The tumor position prediction errors were significantly associated with the patient index but not with the fraction index or tumor site. Partial-least-squares was as accurate as Synchrony and more accurate than ordinary-least-squares. Conclusions: The accuracy of surrogate-based inferential models of tumor position was affected by all the investigated factors, except for the tumor site and fraction index.
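The sensitivity of ordinary-least-squares surrogate models to input noise, one of the effects quantified above, can be illustrated with a synthetic sketch (all signals, coupling coefficients, and noise levels below are invented, not Synchrony log data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic fraction: tumor position is a linear function of three
# external-marker channels plus measurement noise (hypothetical model).
n = 400
markers = rng.normal(size=(n, 3)).cumsum(axis=0) * 0.01   # slow surrogate motion
true_map = np.array([[0.8], [0.3], [-0.5]])               # assumed coupling
tumor = markers @ true_map + rng.normal(scale=0.02, size=(n, 1))

def fit_ols(X, y):
    """Ordinary-least-squares map from external markers to tumor position."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def rmse(X, y, coef):
    return float(np.sqrt(np.mean((X @ coef - y) ** 2)))

train, test = slice(0, 200), slice(200, 400)
coef = fit_ols(markers[train], tumor[train])
clean_err = rmse(markers[test], tumor[test], coef)

# Degrade the surrogate inputs with uncorrelated noise and refit:
# prediction error grows with surrogate measurement noise.
noisy = markers + rng.normal(scale=0.05, size=markers.shape)
coef_noisy = fit_ols(noisy[train], tumor[train])
noisy_err = rmse(noisy[test], tumor[test], coef_noisy)

print(clean_err, noisy_err)
```

The study found partial-least-squares to be more robust to such degradation than ordinary-least-squares; this sketch only shows the baseline effect for OLS.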
Comparing ordinary kriging and inverse distance weighting for soil As pollution in Beijing.
Qiao, Pengwei; Lei, Mei; Yang, Sucai; Yang, Jun; Guo, Guanghui; Zhou, Xiaoyong
2018-06-01
Spatial interpolation methods are the basis of soil heavy metal pollution assessment and remediation. Existing evaluation indices for interpolation accuracy are not tied to the actual situation being mapped, so the choice of interpolation method needs to be based on the specific research purpose and the characteristics of the research object. In this paper, As pollution in soils of Beijing was taken as an example. The prediction accuracy of ordinary kriging (OK) and inverse distance weighting (IDW) was evaluated based on cross-validation results and the spatial distribution characteristics of influencing factors. The results showed that, under the condition of specific spatial correlation, the cross-validation results of OK and IDW for every soil point and the prediction accuracy of the spatial distribution trend are similar. However, the prediction accuracy of OK for the maximum and minimum values is lower than that of IDW, and OK identifies fewer high-pollution areas than IDW. OK thus has difficulty fully identifying high-pollution areas, which shows that its smoothing effect is pronounced. In addition, as the spatial correlation of the As concentration increases, the cross-validation error of OK and IDW decreases, and the high-pollution area identified by OK approaches the result of IDW, identifying high-pollution areas more comprehensively. However, because constructing the semivariogram for OK is more subjective and requires a larger number of soil samples, IDW is more suitable for spatial prediction of heavy metal pollution in soils.
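A minimal IDW sketch shows the behavior contrasted with kriging above: predictions are distance-weighted averages of the samples, so a high-valued sample keeps its influence nearby rather than being smoothed away (coordinates and concentrations below are hypothetical):

```python
import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0, eps=1e-12):
    """Inverse-distance-weighted interpolation (a minimal sketch)."""
    # Pairwise distances between query points and sampled points.
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power          # closer samples weigh more
    w /= w.sum(axis=1, keepdims=True)     # normalize weights per query
    return w @ z_known

# Hypothetical As concentrations (mg/kg) at four sampled sites.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
conc = np.array([5.0, 9.0, 7.0, 30.0])    # one hot spot in a corner

center = idw(pts, conc, np.array([[0.5, 0.5]]))[0]      # equidistant: plain mean
near_hot = idw(pts, conc, np.array([[0.95, 0.95]]))[0]  # dominated by hot spot
print(center, near_hot)
```

At the grid center all weights are equal, so the estimate is the arithmetic mean (12.75); near the hot spot the estimate stays close to 30, illustrating why IDW retains extreme values that kriging tends to smooth.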
The need to approximate the use-case in clinical machine learning
Saeb, Sohrab; Jayaraman, Arun; Mohr, David C.; Kording, Konrad P.
2017-01-01
The availability of smartphone and wearable sensor technology is leading to a rapid accumulation of human subject data, and machine learning is emerging as a technique to map those data into clinical predictions. As machine learning algorithms are increasingly used to support clinical decision making, it is vital to reliably quantify their prediction accuracy. Cross-validation (CV) is the standard approach where the accuracy of such algorithms is evaluated on part of the data the algorithm has not seen during training. However, for this procedure to be meaningful, the relationship between the training and the validation set should mimic the relationship between the training set and the dataset expected for the clinical use. Here we compared two popular CV methods: record-wise and subject-wise. While the subject-wise method mirrors the clinically relevant use-case scenario of diagnosis in newly recruited subjects, the record-wise strategy has no such interpretation. Using both a publicly available dataset and a simulation, we found that record-wise CV often massively overestimates the prediction accuracy of the algorithms. We also conducted a systematic review of the relevant literature, and found that this overly optimistic method was used by almost half of the retrieved studies that used accelerometers, wearable sensors, or smartphones to predict clinical outcomes. As we move towards an era of machine learning-based diagnosis and treatment, using proper methods to evaluate their accuracy is crucial, as inaccurate results can mislead both clinicians and data scientists. PMID:28327985
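The record-wise versus subject-wise gap can be reproduced with a toy dataset in which the features identify the subject but carry no real information about the label; record-wise CV then looks excellent while subject-wise CV stays near chance (a scikit-learn sketch on invented data, not the study's dataset):

```python
import numpy as np
from sklearn.model_selection import KFold, GroupKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)

# 20 subjects x 30 records. The label is a per-subject trait; features
# carry a subject "fingerprint" but no information about the trait itself.
n_sub, n_rec = 20, 30
groups = np.repeat(np.arange(n_sub), n_rec)
y = np.repeat(rng.integers(0, 2, n_sub), n_rec)
fingerprint = rng.normal(size=(n_sub, 5))
X = fingerprint[groups] + rng.normal(scale=0.3, size=(n_sub * n_rec, 5))

clf = KNeighborsClassifier(n_neighbors=1)

# Record-wise CV: records of the same subject appear in train and test,
# so the classifier just matches the fingerprint and looks near-perfect.
record_wise = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0)).mean()

# Subject-wise CV: whole subjects are held out, mimicking diagnosis of
# newly recruited subjects; accuracy collapses toward chance.
subject_wise = cross_val_score(clf, X, y, cv=GroupKFold(5), groups=groups).mean()

print(record_wise, subject_wise)
```

The gap between the two numbers is exactly the overestimation the review warns about.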
Parsimonious data: How a single Facebook like predicts voting behavior in multiparty systems
Albrechtsen, Thomas; Dahl-Nielsen, Emil; Jensen, Michael; Skovrind, Magnus
2017-01-01
This study shows how liking politicians’ public Facebook posts can be used as an accurate measure for predicting present-day voter intention in a multiparty system. We highlight that a few, but selective, digital traces produce prediction accuracies that are on par with or even greater than most current approaches based upon bigger and broader datasets. Combining the online and offline, we connect a subsample of surveyed respondents to their public Facebook activity and apply machine learning classifiers to explore the link between their political liking behaviour and actual voting intention. Through this work, we show that even a single selective Facebook like can reveal as much about political voter intention as hundreds of heterogeneous likes. Further, by including the entire political like history of the respondents, our model reaches prediction accuracies above previous multiparty studies (60–70%). The main contribution of this paper is to show how public like-activity on Facebook allows political profiling of individual users in a multiparty system with accuracies above previous studies. Besides increased accuracies, the paper shows how such parsimonious measures allow us to generalize our findings to the entire population of a country and even across national borders, to other political multiparty systems. The approach in this study relies on data that are publicly available, and the simple setup we propose can, with some limitations, be generalized to millions of users in other multiparty systems. PMID:28931023
Achamrah, Najate; Jésus, Pierre; Grigioni, Sébastien; Rimbert, Agnès; Petit, André; Déchelotte, Pierre; Folope, Vanessa; Coëffier, Moïse
2018-01-01
Predictive equations have been specifically developed for obese patients to estimate resting energy expenditure (REE). Body composition (BC) assessment is needed for some of these equations. We assessed the impact of BC methods on the accuracy of specific predictive equations developed in obese patients. REE was measured (mREE) by indirect calorimetry and BC assessed by bioelectrical impedance analysis (BIA) and dual-energy X-ray absorptiometry (DXA). mREE and percentages of prediction accuracy (±10% of mREE) were compared. Predictive equations were studied in 2588 obese patients. Mean mREE was 1788 ± 6.3 kcal/24 h. Only the Müller (BIA) and Harris & Benedict (HB) equations provided REE estimates not different from mREE. The Huang, Müller, Horie-Waitzberg, and HB formulas provided accurate predictions in a higher proportion of cases (>60%). The use of BIA provided better predictions of REE than DXA for the Huang and Müller equations. Inversely, the Horie-Waitzberg and Lazzer formulas provided higher accuracy using DXA. Accuracy decreased when the equations were applied to patients with BMI ≥ 40, except for the Horie-Waitzberg and Lazzer (DXA) formulas. The Müller equations based on BIA markedly improved REE prediction accuracy compared with equations not based on BC. The value of BC assessment for improving the accuracy of REE predictive equations in obese patients remains to be confirmed. PMID:29320432
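The accuracy criterion used above, the percentage of patients whose predicted REE falls within ±10% of measured REE, is straightforward to compute (the kcal values below are hypothetical, not study data):

```python
import numpy as np

def prediction_accuracy(measured, predicted, tol=0.10):
    """Share (%) of patients whose predicted REE falls within
    +/-10% of measured REE, the criterion used in the abstract."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    within = np.abs(predicted - measured) <= tol * measured
    return within.mean() * 100.0

# Hypothetical measured vs. predicted REE (kcal/24 h) for five patients.
mree = [1700, 1850, 2100, 1600, 1950]
pree = [1750, 1600, 2080, 1900, 1980]
print(prediction_accuracy(mree, pree))  # → 60.0 (3 of 5 within ±10%)
```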
The efficiency of genome-wide selection for genetic improvement of net merit.
Togashi, K; Lin, C Y; Yamazaki, T
2011-10-01
Four methods of selection for net merit comprising 2 correlated traits were compared in this study: 1) EBV-only index (I₁), which consists of the EBV of both traits (i.e., traditional 2-trait BLUP selection); 2) GEBV-only index (I₂), which comprises the genomic EBV (GEBV) of both traits; 3) GEBV-assisted index (I₃), which combines both the EBV and the GEBV of both traits; and 4) GBV-assisted index (I₄), which combines both the EBV and the true genomic breeding value (GBV) of both traits. Comparisons of these indices were based on 3 evaluation criteria [selection accuracy, genetic response (ΔH), and relative efficiency] under 64 scenarios that arise from combining 2 levels of genetic correlation (r(G)), 2 ratios of genetic variances between traits, 2 ratios of the genomic variance to total genetic variances for trait 1, 4 accuracies of EBV, and 2 proportions of r(G) explained by the GBV. Both selection accuracy and genetic responses of the indices I₁, I₃, and I₄ increased as the accuracy of EBV increased, but the efficiency of the indices I₃ and I₄ relative to I₁ decreased as the accuracy of EBV increased. The relative efficiency of both I₃ and I₄ was generally greater when the accuracy of EBV was 0.6 than when it was 0.9, suggesting that the genomic markers are most useful to assist selection when the accuracy of EBV is low. The GBV-assisted index I₄ was superior to the GEBV-assisted I₃ in all 64 cases examined, indicating the importance of improving the accuracy of prediction of genomic breeding values. Other parameters being identical, increasing the genetic variance of a high heritability trait would increase the genetic response of the genomic indices (I₂, I₃, and I₄). The genetic responses to I₂, I₃, and I₄ were greater when the genetic correlation between traits was positive (r(G) = 0.5) than when it was negative (r(G) = -0.5).
The results of this study indicate that the effectiveness of the GEBV-assisted index I₃ is affected by the heritability of and genetic correlation between traits, the ratio of genetic variances between traits, the genomic-genetic variance ratio of each index trait, the proportion of genetic correlation accounted for by the genomic markers, and the accuracy of predictions of both EBV and GBV. However, most of these affecting factors are genetic characteristics of a population that are beyond the control of breeders. The key factor subject to manipulation is maximizing both the proportion of the genetic variance explained by GEBV and the accuracy of both GEBV and EBV. The developed procedures provide means to investigate the efficiency of various genomic indices for any given combination of the genetic factors studied.
Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp
2015-01-01
Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
Tseng, Chien-Hsun
2018-06-01
This paper aims to develop a multidimensional wave digital filtering network for predicting the static and dynamic behaviors of composite laminates based on the FSDT. The resultant network is thus an integrated platform that can compute not only the free vibration but also the bending deflection of moderately thick symmetric laminated plates with low plate side-to-thickness ratios (≤ 20). Safeguarded by the Courant-Friedrichs-Lewy stability condition with the least restriction in terms of optimization technique, the present method offers numerically high accuracy, stability and efficiency across a wide range of modulus ratios for FSDT laminated plates. Instead of using a constant shear correction factor (SCF) with limited numerical accuracy for the bending deflection, an optimum SCF is sought by looking for a minimum ratio of change in the transverse shear energy. In this way, the method can predict comparably accurate results for certain cases of bending deflection. Extensive simulations carried out for the prediction of maximum bending deflection demonstrate that the present method outperforms those based on the higher-order shear deformation and layerwise plate theories. To the best of our knowledge, this is the first work to show that an optimal selection of the SCF can significantly increase the accuracy of FSDT-based laminates, especially compared to the higher-order theory, which disclaims any correction. The highest-accuracy overall solution is compared to the 3D elasticity equilibrium one.
A New Scheme to Characterize and Identify Protein Ubiquitination Sites.
Nguyen, Van-Nui; Huang, Kai-Yao; Huang, Chien-Hsun; Lai, K Robert; Lee, Tzong-Yi
2017-01-01
Protein ubiquitination, involving the conjugation of ubiquitin on a lysine residue, serves as an important modulator of many cellular functions in eukaryotes. Recent advancements in proteomic technology have stimulated increasing interest in identifying ubiquitination sites. However, most computational tools for predicting ubiquitination sites are focused on small-scale data. With an increasing number of experimentally verified ubiquitination sites, we were motivated to design a predictive model for identifying lysine ubiquitination sites in large-scale proteome datasets. This work assessed not only single features, such as amino acid composition (AAC), amino acid pair composition (AAPC) and evolutionary information, but also the effectiveness of incorporating two or more features into a hybrid approach to model construction. The support vector machine (SVM) was applied to generate the prediction models for ubiquitination site identification. Evaluation by five-fold cross-validation showed that the SVM models learned from the combination of hybrid features delivered a better prediction performance. Additionally, a motif discovery tool, MDDLogo, was adopted to characterize the potential substrate motifs of ubiquitination sites. The SVM models integrating the MDDLogo-identified substrate motifs could yield an average accuracy of 68.70 percent. Furthermore, the independent testing result showed that the MDDLogo-clustered SVM models could provide a promising accuracy (78.50 percent) and perform better than other prediction tools. Two case studies demonstrated effective prediction of ubiquitination sites with their corresponding substrate motifs.
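Of the features assessed, amino acid composition (AAC) is the simplest: each sequence window around a candidate lysine becomes a 20-dimensional frequency vector that can then be fed to an SVM. A minimal sketch (the 9-residue window below is invented for illustration):

```python
import numpy as np

# The 20 standard amino acids, one feature dimension each.
AA = "ACDEFGHIKLMNPQRSTVWY"

def aac(window):
    """Amino acid composition (AAC): frequency of each residue type
    in a sequence window centered on a candidate lysine."""
    counts = np.array([window.count(a) for a in AA], dtype=float)
    return counts / max(len(window), 1)

# Hypothetical window around a candidate lysine (the central K).
v = aac("MKLVKSTRA")
print(v[AA.index("K")])  # → 2/9: two lysines among nine residues
```

Vectors like this one (optionally concatenated with AAPC and evolutionary features, as in the hybrid models above) form the input matrix for the SVM classifier.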
Beaulieu, Jean; Doerksen, Trevor K; MacKay, John; Rainville, André; Bousquet, Jean
2014-12-02
Genomic selection (GS) may improve selection response over conventional pedigree-based selection if markers capture more detailed information than pedigrees in recently domesticated tree species and/or make selection more cost-effective. Genomic prediction accuracies using 1748 trees and 6932 SNPs representative of as many distinct gene loci were determined for growth and wood traits in white spruce, within and between environments and breeding groups (BG), each with an effective size of Ne ≈ 20. Marker subsets were also tested. Model fits and/or cross-validation (CV) prediction accuracies for ridge regression (RR) and the least absolute shrinkage and selection operator models approached those of pedigree-based models. With strong relatedness between CV sets, prediction accuracies for RR within environment and BG were high for wood (r = 0.71-0.79) and moderately high for growth (r = 0.52-0.69) traits, in line with trends in heritabilities. For both classes of traits, these accuracies achieved between 83% and 92% of those obtained with phenotypes and pedigree information. Prediction into untested environments remained moderately high for wood (r ≥ 0.61) but dropped significantly for growth (r ≥ 0.24) traits, emphasizing the need to phenotype in all test environments and model genotype-by-environment interactions for growth traits. Removing relatedness between CV sets sharply decreased prediction accuracies for all traits and subpopulations, falling near zero between BGs with no known shared ancestry. For marker subsets, similar patterns were observed but with lower prediction accuracies. Given the need for high relatedness between CV sets to obtain good prediction accuracies, we recommend building GS models for prediction within the same breeding population only.
Breeding groups could be merged to build genomic prediction models as long as the total effective population size does not exceed 50 individuals, in order to obtain prediction accuracy as high as that obtained in the present study. Limiting the number of markers to a few hundred would not negatively impact prediction accuracies, but these could decrease more rapidly over generations. The most promising short-term approach for genomic selection would likely be the selection of superior individuals within large full-sib families that are vegetatively propagated for multiclonal forestry.
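The ridge regression (RR) prediction scheme described above can be sketched on simulated data: all marker effects are shrunk uniformly, and accuracy is reported as the correlation between predicted and observed phenotypes in a held-out set. Everything here is simulated; the sample sizes, shrinkage parameter, and train/test split are illustrative assumptions, not the study's settings.

```python
import numpy as np
from numpy.linalg import solve

rng = np.random.default_rng(0)
n_trees, n_snps = 200, 500

# Simulated 0/1/2 genotype codes and an additive polygenic trait.
M = rng.integers(0, 3, size=(n_trees, n_snps)).astype(float)
beta = rng.normal(0.0, 0.1, n_snps)
y = M @ beta + rng.normal(0.0, 1.0, n_trees)

# Ridge regression: solve (M'M + lambda*I) b = M'y on centred training data.
lam = 10.0
train, test = np.arange(150), np.arange(150, 200)
col_means = M[train].mean(axis=0)
Mt, yt = M[train] - col_means, y[train] - y[train].mean()
b_hat = solve(Mt.T @ Mt + lam * np.eye(n_snps), Mt.T @ yt)

# Prediction accuracy: correlation between predicted and observed phenotypes.
y_pred = (M[test] - col_means) @ b_hat
r = float(np.corrcoef(y_pred, y[test])[0, 1])
```

The correlation r plays the role of the accuracies (r = 0.52-0.79) quoted in the abstract, though here it reflects only the simulated heritability.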
The diagnosis of acute appendicitis in a pediatric population: to CT or not to CT.
Stephen, Antonia E; Segev, Dorry L; Ryan, Daniel P; Mullins, Mark E; Kim, Samuel H; Schnitzer, Jay J; Doody, Daniel P
2003-03-01
The aim of this study was to determine if focused appendiceal computed tomography with colon contrast (FACT-CC) increases the accuracy of the preoperative diagnosis of acute appendicitis in children. A 5-year retrospective review was conducted of a university hospital database of 283 patients (age 0.8 to 19.3 years; mean, 11.3 years) treated with appendectomy for presumed acute appendicitis. Of the 283 patients in whom appendectomies were performed, 268 were confirmed by pathologic analysis of the specimen to have acute appendicitis for a diagnostic accuracy in our institution of 94.7%. Ninety-six patients (34%) underwent FACT-CC scans as part of their preoperative evaluation. The sensitivity of the computed tomography (CT) scan was 94.6%, and the positive predictive value was 95.6%. In girls older than 10 years, CT imaging was not significantly more accurate in predicting appendicitis than examination alone (93.9% v. 87.5%; P =.46). Preoperative FACT-CC did not increase the accuracy in diagnosing appendicitis when compared with patients diagnosed by history, physical examination and laboratory studies. If there was a strong suspicion of appendicitis, a negative CT scan did not exclude the diagnosis of appendicitis. However, focused appendiceal CT scan is a sensitive test with a high positive predictive value and may be useful in a patient with an atypical history or examination. Copyright 2003, Elsevier Science (USA). All rights reserved.
2009-01-01
Background Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Methods Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. Results For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods, which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree-based predictions gave 1.05-1.34 times higher accuracies compared to predictions based on pedigree alone.
The methods differ greatly in their computational requirements, with PLSR and RR-BLUP requiring the least computing time. Conclusions The four methods that use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended. PMID:20043835
Uribe-Rivera, David E; Soto-Azat, Claudio; Valenzuela-Sánchez, Andrés; Bizama, Gustavo; Simonetti, Javier A; Pliscoff, Patricio
2017-07-01
Climate change is a major threat to biodiversity; the development of models that reliably predict its effects on species distributions is a priority for conservation biogeography. Two of the main issues for accurate temporal predictions from Species Distribution Models (SDMs) are model extrapolation and unrealistic dispersal scenarios. We assessed the consequences of these issues for the accuracy of climate-driven SDM predictions for the dispersal-limited Darwin's frog Rhinoderma darwinii in South America. We calibrated models using historical data (1950-1975) and projected them across 40 yr to predict distribution under current climatic conditions, assessing predictive accuracy through the area under the ROC curve (AUC) and the True Skill Statistic (TSS), contrasting binary model predictions against a temporally independent validation data set (i.e., current presences/absences). To assess the effects of incorporating dispersal processes, we compared the predictive accuracy of dispersal-constrained models with SDMs assuming unlimited dispersal; and to assess the effects of model extrapolation, we compared the predictive accuracy of SDMs between extrapolated and non-extrapolated areas. The incorporation of dispersal processes enhanced predictive accuracy, mainly due to a decrease in the false presence rate of model predictions, which is consistent with discrimination of suitable but inaccessible habitat. This also had consequences for range size changes over time, the most widely used proxy for extinction risk from climate change. The area of current climatic conditions that was absent in the baseline conditions (i.e., extrapolated areas) represents 39% of the study area, leading to a significant decrease in the predictive accuracy of model predictions for those areas.
Our results highlight that (1) incorporating dispersal processes can improve the predictive accuracy of temporal transference of SDMs and reduce uncertainties in extinction risk assessments from global change; and (2) because geographical areas subject to novel climates are expected to arise, these areas must be reported, as model predictions are less accurate there under future climate scenarios. Consequently, environmental extrapolation and dispersal processes should be explicitly incorporated to report and to reduce uncertainties in temporal predictions of SDMs, respectively. In doing so, we expect to improve the reliability of the information we provide to conservation decision makers under future climate change scenarios. © 2017 by the Ecological Society of America.
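The two accuracy diagnostics used above, AUC and TSS, can be computed from presence/absence observations and model scores in a few lines. A minimal sketch, assuming a fixed 0.5 threshold for binarizing scores before computing TSS; the toy observations are invented for illustration.

```python
import numpy as np

def auc(obs, scores):
    """Area under the ROC curve via the Mann-Whitney rank formulation."""
    obs, scores = np.asarray(obs), np.asarray(scores)
    pos, neg = scores[obs == 1], scores[obs == 0]
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def tss(obs, pred):
    """True Skill Statistic = sensitivity + specificity - 1."""
    obs, pred = np.asarray(obs), np.asarray(pred)
    sens = np.mean(pred[obs == 1] == 1)
    spec = np.mean(pred[obs == 0] == 0)
    return sens + spec - 1.0

# Toy presences/absences and model suitability scores.
obs = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
a = auc(obs, scores)
t = tss(obs, [int(s >= 0.5) for s in scores])
```

TSS, unlike raw accuracy, is insensitive to prevalence, which is why it is favoured for validating binary SDM predictions.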
Morphological Awareness and Children's Writing: Accuracy, Error, and Invention
McCutchen, Deborah; Stull, Sara
2014-01-01
This study examined the relationship between children's morphological awareness and their ability to produce accurate morphological derivations in writing. Fifth-grade U.S. students (n = 175) completed two writing tasks that invited or required morphological manipulation of words. We examined both accuracy and error, specifically errors in spelling and errors of the sort we termed morphological inventions, which entailed inappropriate, novel pairings of stems and suffixes. Regressions were used to determine the relationship between morphological awareness, morphological accuracy, and spelling accuracy, as well as between morphological awareness and morphological inventions. Linear regressions revealed that morphological awareness uniquely predicted children's generation of accurate morphological derivations, regardless of whether or not accurate spelling was required. A logistic regression indicated that morphological awareness was also uniquely predictive of morphological invention, with higher morphological awareness increasing the probability of morphological invention. These findings suggest that morphological knowledge may not only assist children with spelling during writing, but may also assist with word production via generative experimentation with morphological rules during sentence generation. Implications are discussed for the development of children's morphological knowledge and relationships with writing. PMID:25663748
Trueman, Rebecca C; Brooks, Simon P; Dunnett, Stephen B
2005-04-30
Within a broader programme developing murine models of Huntington's disease (HD), we have sought to develop a test of implicit learning for the mouse. Mice were trained in a novel serial visual discrimination task in the '9-hole box' operant test apparatus, followed by retesting after either bilateral quinolinic acid striatal lesions or sham lesions. In the task, each trial involves two sequential responses: an initial light stimulus is presented randomly in one of five holes, to which a nose-poke response results in the first light being extinguished and a second light is illuminated in a different hole. Response to the second light results in food reward, followed by a brief interval before the next trial. When the first light was in one of three of the five holes, the location of the second light was unpredictable in any of the remaining four holes; by contrast, if the first light occurred in one of the other two of the five holes, then the location of the second light was entirely predictable, being the hole two steps to the left or to the right, respectively. Reaction times and accuracy of responding were recorded to both stimuli. The mice learned the task with a degree of accuracy, and they demonstrated clear implicit learning, as measured by increased accuracy and reduced latency to respond to the presentation of the predictable stimulus. Striatal lesions disrupted performance, reducing accuracy for both the first and second stimuli and increasing response latencies for the second stimuli. The decrease in accuracy by the lesioned animals was accompanied by increases in perseverative nose-poking and inappropriate magazine entries throughout the trials, but the lesioned mice still showed a similar benefit (albeit, against a lower baseline of performance) from the implicit knowledge provided on predictable trials. 
The data validate the task as a sensitive probe for detecting implicit learning deficits in the mouse, and suggest that the consequences of striatal lesions, while disrupting the performance of skilled stimulus-response habits, are not selective to the process underlying implicit learning.
Application and analysis of debris-flow early warning system in Wenchuan earthquake-affected area
NASA Astrophysics Data System (ADS)
Liu, D. L.; Zhang, S. J.; Yang, H. J.; Zhao, L. Q.; Jiang, Y. H.; Tang, D.; Leng, X. P.
2016-02-01
The activity of debris flows (DFs) in the Wenchuan earthquake-affected area increased significantly after the earthquake on 12 May 2008, threatening the lives and property of local people. A physics-based early warning system (EWS) for DF forecasting was developed and applied in this earthquake area. This paper introduces an application of the system in the Wenchuan earthquake-affected area and analyzes the prediction results via a comparison to the DF events triggered by the strong rainfall events reported by the local government. The prediction accuracy and efficiency were first compared with those of a contribution-factor-based system currently used by the weather bureau of Sichuan province, using the storm of 17 August 2012 as a case study. The comparison shows that the false negative rate and false positive rate of the new system are, respectively, 19% and 21% lower than those of the contribution-factor-based system. Consequently, the prediction accuracy is markedly higher than that of the contribution-factor-based system, with higher operational efficiency. At the invitation of the weather bureau of Sichuan province, the authors upgraded the DF prediction system to this new system before the 2013 monsoon in the Wenchuan earthquake-affected area. Two prediction cases, on 9 July 2013 and 10 July 2014, are presented to further demonstrate that the new EWS has high stability, efficiency, and prediction accuracy.
Sankey, Joel B.; McVay, Jason C.; Kreitler, Jason R.; Hawbaker, Todd J.; Vaillant, Nicole; Lowe, Scott
2015-01-01
Increased sedimentation following wildland fire can negatively impact water supply and water quality. Understanding how changing fire frequency, extent, and location will affect watersheds and the ecosystem services they supply to communities is of great societal importance in the western USA and throughout the world. In this work we assess the utility of the InVEST (Integrated Valuation of Ecosystem Services and Tradeoffs) Sediment Retention Model to accurately characterize erosion and sedimentation of burned watersheds. InVEST was developed by the Natural Capital Project at Stanford University (Tallis et al., 2014) and is a suite of GIS-based implementations of common process models, engineered for high-end computing to allow faster simulation of larger landscapes and incorporation into decision-making. The InVEST Sediment Retention Model is based on common soil erosion models (e.g., the USLE, Universal Soil Loss Equation); it determines which areas of the landscape contribute the greatest sediment loads to a hydrological network and, conversely, evaluates the ecosystem service of sediment retention on a watershed basis. In this study, we evaluate the accuracy and uncertainties of InVEST predictions of increased sedimentation after fire, using measured post-fire sediment yields available for many watersheds throughout the western USA from an existing, published large database. We show that the model can be parameterized in a relatively simple fashion to predict post-fire sediment yield accurately. Our ultimate goal is to use the model to accurately predict variability in post-fire sediment yield at a watershed scale as a function of future wildfire conditions.
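The USLE underlying the InVEST Sediment Retention Model is a simple product of factors, A = R · K · LS · C · P, and a fire's effect can be represented as an increase in the cover-management factor C once vegetation is burned off. The sketch below uses illustrative, uncalibrated parameter values, not values from the study.

```python
def usle_soil_loss(R, K, LS, C, P):
    """Annual soil loss A (t/ha/yr): A = R * K * LS * C * P.

    R: rainfall erosivity, K: soil erodibility, LS: slope length/steepness,
    C: cover-management factor, P: support-practice factor.
    """
    return R * K * LS * C * P

# Illustrative, uncalibrated factors for a single hillslope.
R, K, LS, P = 100.0, 0.3, 1.5, 1.0
pre_fire = usle_soil_loss(R, K, LS, C=0.05, P=P)   # vegetated cover
post_fire = usle_soil_loss(R, K, LS, C=0.35, P=P)  # burned, sparse cover
```

With these toy numbers the burn multiplies predicted soil loss sevenfold, which is the kind of post-fire increase the study evaluates against measured sediment yields.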
NASA Astrophysics Data System (ADS)
Johnson, Traci L.; Sharon, Keren
2016-11-01
Until now, systematic errors in strong gravitational lens modeling have been acknowledged but have never been fully quantified. Here, we launch an investigation into the systematics induced by constraint selection. We model the simulated cluster Ares 362 times using random selections of image systems with and without spectroscopic redshifts and quantify the systematics using several diagnostics: image predictability, accuracy of model-predicted redshifts, enclosed mass, and magnification. We find that for models with >15 image systems, the image plane rms does not decrease significantly when more systems are added; however, the rms values quoted in the literature may be misleading as to the ability of a model to predict new multiple images. The mass is well constrained near the Einstein radius in all cases, and systematic error drops to <2% for models using >10 image systems. Magnification errors are smallest along the straight portions of the critical curve, and the value of the magnification is systematically lower near curved portions. For >15 systems, the systematic error on magnification is ∼2%. We report no trend in magnification error with the fraction of spectroscopic image systems when selecting constraints at random; however, when using the same selection of constraints, increasing this fraction up to ∼0.5 will increase model accuracy. The results suggest that the selection of constraints, rather than quantity alone, determines the accuracy of the magnification. We note that spectroscopic follow-up of at least a few image systems is crucial because models without any spectroscopic redshifts are inaccurate across all of our diagnostics.
NASA Astrophysics Data System (ADS)
Wang, H. B.; Zhao, C. Y.; Zhang, W.; Zhan, J. W.; Yu, S. X.
2015-09-01
The Earth gravitational field model is an important dynamical model in satellite orbit computation. In recent years, several space gravity missions have achieved great success, prompting the publication of many gravitational field models. In this paper, two classical models (JGM3, EGM96) and four recent models, including EIGEN-CHAMP05S, GGM03S, GOCE02S, and EGM2008, are evaluated by employing them in precision orbit determination (POD) and prediction, based on laser ranging observations of four low earth orbit (LEO) satellites: CHAMP, GFZ-1, GRACE-A, and SWARM-A. The residual error of the observations in POD is adopted to describe the accuracy of the six gravitational field models. The main results are as follows: (1) for LEO POD, the accuracies of the four recent models (EIGEN-CHAMP05S, GGM03S, GOCE02S, and EGM2008) are at the same level, and better than those of the two classical models (JGM3, EGM96); (2) taking JGM3 as the reference, the EGM96 model's accuracy is better in most situations, and the accuracies of the four recent models are improved by 12%-47% in POD and 63% in prediction, respectively. We also confirm that a model's accuracy in POD increases with increasing degree and order below 70; beyond 70 the accuracy remains stable and is unrelated to further increases in degree, meaning that truncating the model's degree and order to 70 is sufficient to meet the requirement of LEO orbit computation with centimeter-level precision.
Bhimarao; Bhat, Venkataramana; Gowda, Puttanna VN
2015-01-01
Background The high incidence of IUGR and its low recognition lead to increasing perinatal morbidity and mortality, for which prediction of IUGR with timely management decisions is of paramount importance. Many studies have compared the efficacy of several gestational age-independent parameters and found that TCD/AC is a better predictor of asymmetric IUGR. Aim To compare the accuracy of the transcerebellar diameter/abdominal circumference (TCD/AC) ratio with the head circumference/abdominal circumference (HC/AC) ratio in predicting asymmetric intrauterine growth retardation after 20 weeks of gestation. Materials and Methods The prospective study was conducted over a period of one year on 50 clinically suspected IUGR pregnancies, evaluated with a 3.5 MHz ultrasound scanner by a single sonologist. BPD, HC, AC, and FL, along with TCD, were measured to assess the sonological gestational age. Two morphometric ratios, TCD/AC and HC/AC, were calculated. Estimated fetal weight was calculated for all these pregnancies and its percentile was determined. Statistical Methods The TCD/AC and HC/AC ratios were correlated with advancing gestational age to determine whether they were related to GA. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy (DA) of the TCD/AC and HC/AC ratios in evaluating IUGR fetuses were calculated. Results A linear relation of TCD and HC with gestation was noted in IUGR fetuses. The sensitivity, specificity, PPV, NPV, and DA were 88%, 93.5%, 77.1%, 96.3%, and 92.4%, respectively, for the TCD/AC ratio, versus 84%, 92%, 72.4%, 95.8%, and 90.4%, respectively, for the HC/AC ratio in predicting IUGR. Conclusion Both ratios were gestational age independent and can be used in detecting IUGR with good diagnostic accuracy. However, the TCD/AC ratio had better diagnostic validity and accuracy compared to the HC/AC ratio in predicting asymmetric IUGR. PMID:26557588
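The diagnostic metrics reported above all follow directly from a 2×2 table of test outcomes. A minimal sketch with hypothetical counts (not the study's data):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic-test metrics from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts for 71 screened pregnancies (not the study's table).
m = diagnostic_metrics(tp=22, fp=3, fn=3, tn=43)
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on the prevalence of IUGR in the screened sample, which is why the two ratios can rank differently on different metrics.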
Genomic and pedigree-based prediction for leaf, stem, and stripe rust resistance in wheat.
Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Huerta-Espino, Julio; Lan, Caixia; Bhavani, Sridhar; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E
2017-07-01
Genomic prediction for seedling and adult plant resistance to wheat rusts was compared to prediction using a few markers as fixed effects in a least-squares approach and to pedigree-based prediction. The unceasing plant-pathogen arms race and the ephemeral nature of some rust resistance genes have been challenging for wheat (Triticum aestivum L.) breeding programs and farmers. Hence, it is important to devise strategies for effective evaluation and exploitation of quantitative rust resistance. One promising approach that could accelerate gain from selection for rust resistance is 'genomic selection', which utilizes dense genome-wide markers to estimate the breeding values (BVs) for quantitative traits. Our objective was to compare genomic prediction models, including genomic best linear unbiased prediction (GBLUP), GBLUP A (GBLUP with selected loci as fixed effects), reproducing kernel Hilbert spaces-markers (RKHS-M), RKHS-pedigree (RKHS-P), and RKHS markers and pedigree (RKHS-MP), with a least-squares (LS) approach, to determine the BVs for seedling and/or adult plant resistance (APR) to leaf rust (LR), stem rust (SR), and stripe rust (YR). The 333 lines in the 45th IBWSN and the 313 lines in the 46th IBWSN were genotyped using genotyping-by-sequencing and phenotyped in replicated trials. The mean prediction accuracies ranged from 0.31 to 0.74 for LR seedling, 0.12 to 0.56 for LR APR, 0.31 to 0.65 for SR APR, 0.70 to 0.78 for YR seedling, and 0.34 to 0.71 for YR APR. For most datasets, the RKHS-MP model gave the highest accuracies, while LS gave the lowest. The GBLUP, GBLUP A, RKHS-M, and RKHS-P models gave similar accuracies. Using genome-wide marker-based models resulted in an average 42% increase in accuracy over LS. We conclude that GS is a promising approach for improvement of quantitative rust resistance and can be implemented in the breeding pipeline.
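The two marker-based kernels contrasted in models such as GBLUP and RKHS-M can be sketched from a genotype matrix: a linear, VanRaden-style genomic relationship matrix, and a Gaussian kernel of scaled squared distances. The genotypes below are simulated, and the median-distance bandwidth is a common heuristic assumed for illustration, not necessarily the study's choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 300
M = rng.integers(0, 3, size=(n, p)).astype(float)  # 0/1/2 genotype codes

# Linear (GBLUP) kernel: VanRaden-style genomic relationship matrix G.
freq = M.mean(axis=0) / 2.0                  # allele frequencies
Z = M - 2.0 * freq                           # centred genotypes
G = Z @ Z.T / (2.0 * np.sum(freq * (1.0 - freq)))

# Gaussian (RKHS) kernel: exp(-d^2 / h), with bandwidth h set to the
# median off-diagonal squared distance (a common heuristic).
d2 = ((M[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
h = np.median(d2[np.triu_indices(n, k=1)])
K = np.exp(-d2 / h)
```

The linear kernel G captures additive relationships, while the Gaussian kernel K can additionally capture non-additive similarity, which is one explanation offered in this literature for the accuracy gains of RKHS-type models.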
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
Improving transmembrane protein consensus topology prediction using inter-helical interaction.
Wang, Han; Zhang, Chao; Shi, Xiaohu; Zhang, Li; Zhou, You
2012-11-01
Alpha-helical transmembrane proteins (αTMPs) represent roughly 30% of all open reading frames (ORFs) in a typical genome and are involved in many critical biological processes. Due to their special physicochemical properties, it is hard to crystallize them and obtain high-resolution structures experimentally; thus, sequence-based topology prediction is highly desirable for the study of transmembrane proteins (TMPs), in both structure prediction and function prediction. Various model-based topology prediction methods have been developed, but the accuracy of individual predictors remains poor due to the limitations of the methods or the features they use. Consensus topology prediction therefore becomes practical for high-accuracy applications by combining the advantages of the individual predictors. Here, based on the observation that inter-helical interactions are commonly found within transmembrane helices (TMHs) and strongly indicate their existence, we present a novel consensus topology prediction method for αTMPs, CNTOP, which incorporates four leading individual topology predictors and further improves prediction accuracy by using the predicted inter-helical interactions. The method achieved 87% prediction accuracy on a benchmark dataset and 78% accuracy on a non-redundant dataset composed of polytopic αTMPs. Our method achieves higher topology accuracy than any other individual or consensus predictor; at the same time, the TMHs are more accurately predicted in their lengths and locations, with both false positives (FPs) and false negatives (FNs) decreasing dramatically. CNTOP is available at: http://ccst.jlu.edu.cn/JCSB/cntop/CNTOP.html. Copyright © 2012 Elsevier B.V. All rights reserved.
Artificial Intelligence Systems as Prognostic and Predictive Tools in Ovarian Cancer.
Enshaei, A; Robson, C N; Edmondson, R J
2015-11-01
The ability to provide accurate prognostic and predictive information to patients is becoming increasingly important as clinicians enter an era of personalized medicine. For a disease as heterogeneous as epithelial ovarian cancer, conventional algorithms become too complex for routine clinical use. This study therefore investigated the potential for an artificial intelligence model to provide this information and compared it with conventional statistical approaches. The authors created a database comprising 668 cases of epithelial ovarian cancer during a 10-year period and collected data routinely available in a clinical environment. They also collected survival data for all the patients, then constructed an artificial intelligence model capable of comparing a variety of algorithms and classifiers alongside conventional statistical approaches such as logistic regression. The model was used to predict overall survival and demonstrated that an artificial neural network (ANN) algorithm was capable of predicting survival with high accuracy (93 %) and an area under the curve (AUC) of 0.74 and that this outperformed logistic regression. The model also was used to predict the outcome of surgery and again showed that ANN could predict outcome (complete/optimal cytoreduction vs. suboptimal cytoreduction) with 77 % accuracy and an AUC of 0.73. These data are encouraging and demonstrate that artificial intelligence systems may have a role in providing prognostic and predictive data for patients. The performance of these systems likely will improve with increasing data set size, and this needs further investigation.
Mapping water table depth using geophysical and environmental variables.
Buchanan, S; Triantafilis, J
2009-01-01
Despite its importance, accurate representation of the spatial distribution of water table depth remains one of the greatest deficiencies in many hydrological investigations. Historically, both inverse distance weighting (IDW) and ordinary kriging (OK) have been used to interpolate depths. These methods, however, have major limitations: they require large numbers of measurements to represent the spatial variability of water table depth, and they do not represent the variation between measurement points. We address this issue by assessing the benefits of using stepwise multiple linear regression (MLR) with three different ancillary data sets to predict the water table depth at 100-m intervals. The ancillary data sets used are electromagnetic data (EM34 and EM38); gamma radiometric data: potassium (K), uranium (eU), thorium (eTh), and total count (TC); and morphometric data. Results show that MLR offers significant precision and accuracy benefits over OK and IDW. Inclusion of the morphometric data set yielded the greatest improvement in prediction accuracy compared with IDW (16%), followed by the electromagnetic data set (5%). Use of the gamma radiometric data set showed no improvement. The greatest improvement, however, resulted when all data sets were combined (a 37% increase in prediction accuracy over IDW). Significantly, the use of MLR also allows prediction of variations in water table depth between measurement points, which is crucial for land management.
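For contrast with the MLR approach, the IDW baseline discussed above is straightforward to implement: each prediction is a distance-weighted average of observed depths, with weights falling off as an inverse power of distance. A minimal sketch with invented well coordinates and depths:

```python
import numpy as np

def idw(xy_obs, z_obs, xy_new, power=2.0):
    """Inverse distance weighting: weights fall off as 1/d**power."""
    xy_obs, z_obs, xy_new = map(np.asarray, (xy_obs, z_obs, xy_new))
    # Pairwise distances between prediction points and observation points.
    d = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    d = np.where(d == 0, 1e-12, d)   # avoid division by zero at data points
    w = 1.0 / d ** power
    return (w * z_obs).sum(axis=1) / w.sum(axis=1)

# Four wells with measured water table depth (m); predict at the midpoint.
xy = [(0, 0), (0, 100), (100, 0), (100, 100)]
depth = [5.0, 6.0, 7.0, 8.0]
z = idw(xy, depth, [(50, 50)])
```

Because IDW is an exact interpolator, predictions at the wells themselves return the measured depths; between wells it can only average, which is the "variation between measurement points" limitation the abstract notes.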
Increasing the predictive accuracy of amyloid-β blood-borne biomarkers in Alzheimer's disease.
Watt, Andrew D; Perez, Keyla A; Faux, Noel G; Pike, Kerryn E; Rowe, Christopher C; Bourgeat, Pierrick; Salvado, Olivier; Masters, Colin L; Villemagne, Victor L; Barnham, Kevin J
2011-01-01
Diagnostic measures for Alzheimer's disease (AD) commonly rely on evaluating the levels of amyloid-β (Aβ) peptides within the cerebrospinal fluid (CSF) of affected individuals. These levels are often combined with levels of an additional non-Aβ marker to increase predictive accuracy. Recent efforts to overcome the invasive nature of CSF collection led to the observation of Aβ species within the blood cellular fraction; however, little is known of what additional biomarkers may be found in this membranous fraction. The current study aimed to undertake a discovery-based proteomic investigation of the blood cellular fraction from AD patients (n = 18) and healthy controls (HC; n = 15) using copper immobilized metal affinity capture and Surface Enhanced Laser Desorption/Ionisation Time-Of-Flight Mass Spectrometry. Three candidate biomarkers were observed which could differentiate AD patients from HC (ROC AUC > 0.8). Bivariate pairwise comparisons revealed significant correlations between these markers and measures of AD severity, including MMSE, composite memory, brain amyloid burden, and hippocampal volume. A partial least squares regression model was generated using the three candidate markers along with blood levels of Aβ. This model was able to distinguish AD from HC with high specificity (90%) and sensitivity (77%), and was able to separate individuals with mild cognitive impairment (MCI) who converted to AD from MCI non-converters. While requiring further characterization, these candidate biomarkers reaffirm the potential efficacy of blood-based investigations into neurodegenerative conditions. Furthermore, the findings indicate that the incorporation of non-amyloid markers into predictive models functions to increase the accuracy of the diagnostic potential of Aβ.
Taxi Time Prediction at Charlotte Airport Using Fast-Time Simulation and Machine Learning Techniques
NASA Technical Reports Server (NTRS)
Lee, Hanbong
2016-01-01
Accurate taxi time prediction is required for enabling efficient runway scheduling that can increase runway throughput and reduce taxi times and fuel consumption on the airport surface. Currently, NASA and American Airlines are jointly developing a decision-support tool called Spot and Runway Departure Advisor (SARDA) that assists airport ramp controllers in making gate pushback decisions and improving the overall efficiency of airport surface traffic. In this presentation, we propose to use Linear Optimized Sequencing (LINOS), a discrete-event fast-time simulation tool, to predict taxi times and provide the estimates to the runway scheduler in real-time airport operations. To assess its prediction accuracy, we also introduce a data-driven analytical method using machine learning techniques. These two taxi time prediction methods are evaluated with actual taxi time data obtained from the SARDA human-in-the-loop (HITL) simulation for Charlotte Douglas International Airport (CLT) using various performance measurement metrics. Based on the taxi time prediction results, we also discuss how the prediction accuracy can be affected by the operational complexity at this airport and how we can improve the fast-time simulation model before implementing it with an airport scheduling algorithm in a real-time environment.
Integrative genetic risk prediction using non-parametric empirical Bayes classification.
Zhao, Sihai Dave
2017-06-01
Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This article proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free non-parametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa. © 2016, The International Biometric Society.
Genotyping by sequencing for genomic prediction in a soybean breeding population.
Jarquín, Diego; Kocak, Kyle; Posadas, Luis; Hyma, Katie; Jedlicka, Joseph; Graef, George; Lorenz, Aaron
2014-08-29
Advances in genotyping technology, such as genotyping by sequencing (GBS), are making genomic prediction more attractive to reduce breeding cycle times and costs associated with phenotyping. Genomic prediction and selection have been studied in several crop species, but no reports exist in soybean. The objectives of this study were (i) to evaluate prospects for genomic selection using GBS in a typical soybean breeding program and (ii) to evaluate the effect of GBS marker selection and imputation on genomic prediction accuracy. To achieve these objectives, a set of soybean lines sampled from the University of Nebraska Soybean Breeding Program were genotyped using GBS and evaluated for yield and other agronomic traits at multiple Nebraska locations. Genotyping by sequencing scored 16,502 single nucleotide polymorphisms (SNPs) with minor-allele frequency (MAF) > 0.05 and percentage of missing values ≤ 5% on 301 elite soybean breeding lines. When SNPs with up to 80% missing values were included, 52,349 SNPs were scored. Prediction accuracy for grain yield, assessed using cross validation, was estimated to be 0.64, indicating good potential for using genomic selection for grain yield in soybean. Filtering SNPs based on missing data percentage had little to no effect on prediction accuracy, especially when random forest imputation was used to impute missing values. The highest accuracies were observed when random forest imputation was used on all SNPs, but differences were not significant. A standard additive G-BLUP model was robust; modeling additive-by-additive epistasis did not provide any improvement in prediction accuracy. The effect of training population size on accuracy began to plateau around 100, but accuracy steadily climbed until the largest possible size was used in this analysis. Including only SNPs with MAF > 0.30 provided higher accuracies when training populations were smaller.
Using GBS for genomic prediction in soybean holds good potential to expedite genetic gain. Our results suggest that standard additive G-BLUP models can be used on unfiltered, imputed GBS data without loss in accuracy.
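The standard additive G-BLUP model used above is statistically equivalent to ridge regression on marker codes (RR-BLUP). A minimal sketch under that equivalence, assuming a fixed shrinkage parameter `lam` rather than one derived from estimated variance components:

```python
import numpy as np

def rrblup_predict(Z_train, y_train, Z_test, lam=10.0):
    # Ridge solution for marker effects: (Z'Z + lam*I) beta = Z'(y - mean).
    # Predictions for new lines are the phenotype mean plus marker effects.
    m = Z_train.shape[1]
    A = Z_train.T @ Z_train + lam * np.eye(m)
    beta = np.linalg.solve(A, Z_train.T @ (y_train - y_train.mean()))
    return y_train.mean() + Z_test @ beta
```

Cross-validation accuracy is then the correlation between these predictions and the held-out phenotypes.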
Wren, Christopher; Vogel, Melanie; Lord, Stephen; Abrams, Dominic; Bourke, John; Rees, Philip; Rosenthal, Eric
2012-02-01
The aim of this study was to examine the accuracy in predicting pathway location in children with Wolff-Parkinson-White syndrome for each of seven published algorithms. ECGs from 100 consecutive children with Wolff-Parkinson-White syndrome undergoing electrophysiological study were analysed by six investigators using seven published algorithms, six of which had been developed in adult patients. Accuracy and concordance of predictions were adjusted for the number of pathway locations. Accessory pathways were left-sided in 49, septal in 20 and right-sided in 31 children. Overall accuracy of prediction was 30-49% for the exact location and 61-68% including adjacent locations. Concordance between investigators varied between 41% and 86%. No algorithm was better at predicting septal pathways (accuracy 5-35%, improving to 40-78% including adjacent locations), but one was significantly worse. Predictive accuracy was 24-53% for the exact location of right-sided pathways (50-71% including adjacent locations) and 32-55% for the exact location of left-sided pathways (58-73% including adjacent locations). All algorithms were less accurate in our hands than in other authors' own assessment. None performed well in identifying midseptal or right anteroseptal accessory pathway locations.
Technow, Frank; Schrag, Tobias A.; Schipprack, Wolfgang; Bauer, Eva; Simianer, Henner; Melchinger, Albrecht E.
2014-01-01
Maize (Zea mays L.) serves as model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill–Robertson effect and support the hypothesis of differential fixation of alleles due to pseudo-overdominance in these regions. In pericentromeric regions we also found indications for consistent marker–QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest, if both parents were parents of other hybrids in the training set, and lowest, if none of them were involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in testcrosses. PMID:24850820
Heidaritabar, M; Wolc, A; Arango, J; Zeng, J; Settar, P; Fulton, J E; O'Sullivan, N P; Bastiaansen, J W M; Fernando, R L; Garrick, D J; Dekkers, J C M
2016-10-01
Most genomic prediction studies fit only additive effects in models to estimate genomic breeding values (GEBV). However, if dominance genetic effects are an important source of variation for complex traits, accounting for them may improve the accuracy of GEBV. We investigated the effect of fitting dominance and additive effects on the accuracy of GEBV for eight egg production and quality traits in a purebred line of brown layers using pedigree or genomic information (42K single-nucleotide polymorphism (SNP) panel). Phenotypes were corrected for the effect of hatch date. Additive and dominance genetic variances were estimated using genomic-based [genomic best linear unbiased prediction (GBLUP)-REML and BayesC] and pedigree-based (PBLUP-REML) methods. Breeding values were predicted using a model that included both additive and dominance effects and a model that included only additive effects. The reference population consisted of approximately 1800 animals hatched between 2004 and 2009, while approximately 300 young animals hatched in 2010 were used for validation. Accuracy of prediction was computed as the correlation between phenotypes and estimated breeding values of the validation animals divided by the square root of the estimate of heritability in the whole population. The proportion of dominance variance to total phenotypic variance ranged from 0.03 to 0.22 with PBLUP-REML across traits, from 0 to 0.03 with GBLUP-REML and from 0.01 to 0.05 with BayesC. Accuracies of GEBV ranged from 0.28 to 0.60 across traits. Inclusion of dominance effects did not improve the accuracy of GEBV, and differences in their accuracies between genomic-based methods were small (0.01-0.05), with GBLUP-REML yielding higher prediction accuracies than BayesC for egg production, egg colour and yolk weight, while BayesC yielded higher accuracies than GBLUP-REML for the other traits. 
In conclusion, fitting dominance effects did not impact accuracy of genomic prediction of breeding values in this population. © 2016 Blackwell Verlag GmbH.
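The accuracy definition in this record (correlation between validation phenotypes and estimated breeding values, divided by the square root of heritability) translates directly into code; a minimal sketch:

```python
import numpy as np

def prediction_accuracy(phenotypes, gebv, h2):
    # Accuracy = cor(y, GEBV) / sqrt(h2), as defined in the abstract.
    r = np.corrcoef(phenotypes, gebv)[0, 1]
    return r / np.sqrt(h2)
```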
Modalities, Relations, and Learning
NASA Astrophysics Data System (ADS)
Müller, Martin Eric
While the popularity of statistical, probabilistic, and exhaustive machine learning techniques continues to increase, relational and logic-based approaches remain a niche in research. The former focus on predictive accuracy; the latter prove indispensable in knowledge discovery.
Spittle, Alicia J.; Lee, Katherine J.; Spencer-Smith, Megan; Lorefice, Lucy E.; Anderson, Peter J.; Doyle, Lex W.
2015-01-01
Aim The primary aim of this study was to investigate the accuracy of the Alberta Infant Motor Scale (AIMS) and Neuro-Sensory Motor Developmental Assessment (NSMDA) over the first year of life for predicting motor impairment at 4 years in preterm children. The secondary aims were to assess the predictive value of serial assessments over the first year and when using a combination of these two assessment tools in follow-up. Method Children born <30 weeks’ gestation were prospectively recruited and assessed at 4, 8 and 12 months’ corrected age using the AIMS and NSMDA. At 4 years’ corrected age, children were assessed for cerebral palsy (CP) and motor impairment using the Movement Assessment Battery for Children, 2nd edition (MABC-2). We calculated accuracy of the AIMS and NSMDA for predicting CP and MABC-2 scores ≤15th (at-risk of motor difficulty) and ≤5th centile (significant motor difficulty) for each test (AIMS and NSMDA) at 4, 8 and 12 months, for delay on one, two or all three of the time points over the first year, and finally for delay on both tests at each time point. Results Accuracy for predicting motor impairment was good for each test at each age, although false positives were common. Motor impairment on the MABC-2 (scores ≤5th and ≤15th) was most accurately predicted by the AIMS at 4 months, whereas CP was most accurately predicted by the NSMDA at 12 months. With regard to serial assessments, the likelihood ratio for motor impairment increased with the number of delayed assessments. When combining both the NSMDA and AIMS, the best accuracy was achieved at 4 months, although results were similar at 8 and 12 months. Interpretation Motor development during the first year of life in preterm infants assessed with the AIMS and NSMDA is predictive of later motor impairment at preschool age.
However, false positives are common and therefore it is beneficial to follow-up children at high risk of motor impairment at more than one time point, or to use a combination of assessment tools. Trial Registration ACTR.org.au ACTRN12606000252516 PMID:25970619
Beaulieu, J; Doerksen, T; Clément, S; MacKay, J; Bousquet, J
2014-01-01
Genomic selection (GS) is of interest in breeding because of its potential for predicting the genetic value of individuals and increasing genetic gains per unit of time. To date, very few studies have reported empirical results of GS potential in the context of large population sizes and long breeding cycles such as for boreal trees. In this study, we assessed the effectiveness of marker-aided selection in an undomesticated white spruce (Picea glauca (Moench) Voss) population of large effective size using a GS approach. A discovery population of 1694 trees representative of 214 open-pollinated families from 43 natural populations was phenotyped for 12 wood and growth traits and genotyped for 6385 single-nucleotide polymorphisms (SNPs) mined in 2660 gene sequences. GS models were built to predict estimated breeding values using all the available SNPs or SNP subsets of the largest absolute effects, and they were validated using various cross-validation schemes. The accuracy of genomic estimated breeding values (GEBVs) varied from 0.327 to 0.435 when the training and validation data sets shared half-sibs, values that were on average 90% of the accuracies achieved through traditionally estimated breeding values. The trend was also the same for validation across sites. As expected, the accuracy of GEBVs obtained after cross-validation with individuals of unknown relatedness was lower, at about half the accuracy achieved when half-sibs were present. We showed that with the marker densities used in the current study, predictions with low to moderate accuracy could be obtained within a large undomesticated population of related individuals, potentially resulting in larger gains per unit of time with GS than with the traditional approach. PMID:24781808
Mehrban, Hossein; Lee, Deuk Hwan; Moradi, Mohammad Hossein; IlCho, Chung; Naserkheil, Masoumeh; Ibáñez-Escriche, Noelia
2017-01-04
Hanwoo beef is known for its marbled fat, tenderness, juiciness and characteristic flavor, as well as for its low cholesterol and high omega 3 fatty acid contents. As yet, there has been no comprehensive investigation to estimate genomic selection accuracy for carcass traits in Hanwoo cattle using dense markers. This study aimed at evaluating the accuracy of alternative statistical methods that differed in assumptions about the underlying genetic model for various carcass traits: backfat thickness (BT), carcass weight (CW), eye muscle area (EMA), and marbling score (MS). Accuracies of direct genomic breeding values (DGV) for carcass traits were estimated by applying fivefold cross-validation to a dataset including 1183 animals and approximately 34,000 single nucleotide polymorphisms (SNPs). Accuracies of BayesC, Bayesian LASSO (BayesL) and genomic best linear unbiased prediction (GBLUP) methods were similar for BT, EMA and MS. However, for CW, DGV accuracy was 7% higher with BayesC than with BayesL and GBLUP. The increased accuracy of BayesC, compared to GBLUP and BayesL, was maintained for CW, regardless of the training sample size, but not for BT, EMA, and MS. Genome-wide association studies detected consistent large effects for SNPs on chromosomes 6 and 14 for CW. The predictive performance of the models depended on the trait analyzed. For CW, the results showed a clear superiority of BayesC compared to GBLUP and BayesL. These findings indicate the importance of using a proper variable selection method for genomic selection of traits and also suggest that the genetic architecture that underlies CW differs from that of the other carcass traits analyzed. Thus, our study provides significant new insights into the carcass traits of Hanwoo cattle.
Modi, Payal; Glavis-Bloom, Justin; Nasrin, Sabiha; Guy, Allysia; Chowa, Erika P; Dvor, Nathan; Dworkis, Daniel A; Oh, Michael; Silvestri, David M; Strasberg, Stephen; Rege, Soham; Noble, Vicki E; Alam, Nur H; Levine, Adam C
2016-01-01
Although dehydration from diarrhea is a leading cause of morbidity and mortality in children under five, existing methods of assessing dehydration status in children have limited accuracy. To assess the accuracy of point-of-care ultrasound measurement of the aorta-to-IVC ratio as a predictor of dehydration in children. A prospective cohort study of children under five years with acute diarrhea was conducted in the rehydration unit of the International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b). Ultrasound measurements of aorta-to-IVC ratio and dehydrated weight were obtained on patient arrival. Percent weight change was monitored during rehydration to classify children as having "some dehydration" with weight change 3-9% or "severe dehydration" with weight change > 9%. Logistic regression analysis and Receiver Operating Characteristic (ROC) curves were used to evaluate the accuracy of aorta-to-IVC ratio as a predictor of dehydration severity. 850 children were enrolled, of whom 771 were included in the final analysis. Aorta-to-IVC ratio was a significant predictor of the percent dehydration in children with acute diarrhea, with each 1-point increase in the aorta-to-IVC ratio predicting a 1.1% increase in the percent dehydration of the child. However, the area under the ROC curve (0.60), sensitivity (67%), and specificity (49%) for predicting severe dehydration were all poor. Point-of-care ultrasound of the aorta-to-IVC ratio was statistically associated with volume status, but was not accurate enough to be used as an independent screening tool for dehydration in children under five years presenting with acute diarrhea in a resource-limited setting.
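The reported AUC of 0.60 has a direct rank interpretation: the probability that a randomly chosen severely dehydrated child has a higher aorta-to-IVC ratio than a randomly chosen child without severe dehydration. A minimal pure-Python sketch of that computation (not the authors' analysis code):

```python
def roc_auc(scores, labels):
    # AUC as the Mann-Whitney probability that a positive case
    # outscores a negative one; ties count half.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```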
NASA Technical Reports Server (NTRS)
Berman, A. L.
1976-01-01
In the last two decades, increasingly sophisticated deep space missions have placed correspondingly stringent requirements on navigational accuracy. As part of the effort to increase navigational accuracy, and hence the quality of radiometric data, much effort has been expended in an attempt to understand and compute the tropospheric effect on range (and hence range rate) data. The general approach adopted has been that of computing a zenith range refraction, and then mapping this refraction to any arbitrary elevation angle via an empirically derived function of elevation. The prediction of zenith range refraction derived from surface measurements of meteorological parameters is presented. Refractivity is separated into wet (water vapor pressure) and dry (atmospheric pressure) components. The integration of dry refractivity is shown to be exact. Attempts to integrate wet refractivity directly prove ineffective; however, several empirical models developed by the author and other researchers at JPL are discussed. The best current wet refraction model is here considered to be a separate day/night model, which is proportional to surface water vapor pressure and inversely proportional to surface temperature. Methods are suggested that might improve the accuracy of the wet range refraction model.
Dopamine reward prediction-error signalling: a two-component response
Schultz, Wolfram
2017-01-01
Environmental stimuli and objects, including rewards, are often processed sequentially in the brain. Recent work suggests that the phasic dopamine reward prediction-error response follows a similar sequential pattern. An initial brief, unselective and highly sensitive increase in activity unspecifically detects a wide range of environmental stimuli, then quickly evolves into the main response component, which reflects subjective reward value and utility. This temporal evolution allows the dopamine reward prediction-error signal to optimally combine speed and accuracy. PMID:26865020
ERIC Educational Resources Information Center
Kwon, Heekyung
2011-01-01
The objective of this study is to provide a systematic account of three typical phenomena surrounding absolute accuracy of metacomprehension assessments: (1) the absolute accuracy of predictions is typically quite low; (2) there exist individual differences in absolute accuracy of predictions as a function of reading skill; and (3) postdictions…
Sweat loss prediction using a multi-model approach
NASA Astrophysics Data System (ADS)
Xu, Xiaojiang; Santee, William R.
2011-07-01
A new multi-model approach (MMA) for sweat loss prediction is proposed to improve prediction accuracy. MMA was computed as the average of sweat loss predicted by two existing thermoregulation models: the rational model SCENARIO and the empirical model Heat Strain Decision Aid (HSDA). Three independent physiological datasets, a total of 44 trials, were used to compare predictions by MMA, SCENARIO, and HSDA. The observed sweat losses were collected under different combinations of uniform ensembles, environmental conditions (15-40°C, RH 25-75%), and exercise intensities (250-600 W). Root mean square deviation (RMSD), residual plots, and paired t tests were used to compare predictions with observations. Overall, MMA reduced RMSD by 30-39% in comparison with either SCENARIO or HSDA, and increased the prediction accuracy to 66% from 34% or 55%. Of the MMA predictions, 70% fell within the range of mean observed value ± SD, while only 43% of SCENARIO and 50% of HSDA predictions fell within the same range. Paired t tests showed that differences between observations and MMA predictions were not significant, but differences between observations and SCENARIO or HSDA predictions were significantly different for two datasets. Thus, MMA predicted sweat loss more accurately than either of the two single models for the three datasets used. Future work will evaluate MMA using additional physiological data to expand the scope of populations and conditions.
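As described, MMA is simply the unweighted mean of the two models' outputs, and models are compared by RMSD against observations; a minimal sketch of both computations:

```python
import numpy as np

def multi_model_average(pred_a, pred_b):
    # MMA: unweighted mean of the two single-model predictions.
    return (np.asarray(pred_a, float) + np.asarray(pred_b, float)) / 2.0

def rmsd(pred, obs):
    # Root-mean-square deviation between predictions and observations.
    d = np.asarray(pred, float) - np.asarray(obs, float)
    return float(np.sqrt(np.mean(d ** 2)))
```

When one model over-predicts and the other under-predicts, their opposing biases partially cancel, which is why the average can beat both single models.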
HIV-1 protease cleavage site prediction based on two-stage feature selection method.
Niu, Bing; Yuan, Xiao-Cheng; Roeper, Preston; Su, Qiang; Peng, Chun-Rong; Yin, Jing-Yuan; Ding, Juan; Li, HaiPeng; Lu, Wen-Cong
2013-03-01
Knowledge of the mechanism of HIV protease cleavage specificity is critical to the design of specific and effective HIV inhibitors. An accurate, robust, and rapid method to correctly predict the cleavage sites in proteins is crucial in the search for possible HIV inhibitors. In this article, HIV-1 protease specificity was studied using the correlation-based feature subset (CfsSubset) selection method combined with a genetic algorithm. Thirty important biochemical features were found based on a jackknife test from the original data set containing 4,248 features. Using the AdaBoost method with the thirty selected features, the prediction model yields an accuracy of 96.7% for the jackknife test and 92.1% for an independent set test, improvements over models built on the original feature set of 6.7% and 77.4%, respectively. Our feature selection scheme could be a useful technique for finding effective competitive inhibitors of HIV protease.
Yee, Susan Harrell; Barron, Mace G
2010-02-01
Coral reefs have experienced extensive mortality over the past few decades as a result of temperature-induced mass bleaching events. There is an increasing realization that other environmental factors, including water mixing, solar radiation, water depth, and water clarity, interact with temperature to either exacerbate bleaching or protect coral from mass bleaching. The relative contribution of these factors to variability in mass bleaching at a global scale has not been quantified, but can provide insights when making large-scale predictions of mass bleaching events. Using data from 708 bleaching surveys across the globe, a framework was developed to predict the probability of moderate or severe bleaching as a function of key environmental variables derived from global-scale remote-sensing data. The ability of models to explain spatial and temporal variability in mass bleaching events was quantified. Results indicated approximately 20% improved accuracy of predictions of bleaching when solar radiation and water mixing, in addition to elevated temperature, were incorporated into models, but predictive accuracy was variable among regions. Results provide insights into the effects of environmental parameters on bleaching at a global scale.
The influence of delaying judgments of learning on metacognitive accuracy: a meta-analytic review.
Rhodes, Matthew G; Tauber, Sarah K
2011-01-01
Many studies have examined the accuracy of predictions of future memory performance solicited through judgments of learning (JOLs). Among the most robust findings in this literature is that delaying predictions serves to substantially increase the relative accuracy of JOLs compared with soliciting JOLs immediately after study, a finding termed the delayed JOL effect. The meta-analyses reported in the current study examined the predominant theoretical accounts as well as potential moderators of the delayed JOL effect. The first meta-analysis examined the relative accuracy of delayed compared with immediate JOLs across 4,554 participants (112 effect sizes) through gamma correlations between JOLs and memory accuracy. Those data showed that delaying JOLs leads to robust benefits to relative accuracy (g = 0.93). The second meta-analysis examined memory performance for delayed compared with immediate JOLs across 3,807 participants (98 effect sizes). Those data showed that delayed JOLs result in a modest but reliable benefit for memory performance relative to immediate JOLs (g = 0.08). Findings from these meta-analyses are well accommodated by theories suggesting that delayed JOL accuracy reflects access to more diagnostic information from long-term memory rather than being a by-product of a retrieval opportunity. However, these data also suggest that theories proposing that the delayed JOL effect results from a memorial benefit or the match between the cues available for JOLs and those available at test may also provide viable explanatory mechanisms necessary for a comprehensive account.
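Relative accuracy in this literature is the Goodman-Kruskal gamma correlation between JOLs and later recall; a minimal sketch of the statistic (concordant minus discordant item pairs over their sum, with ties skipped):

```python
def gamma_correlation(jols, recall):
    # Goodman-Kruskal gamma: (C - D) / (C + D) over all item pairs,
    # where pairs tied on either variable are skipped.
    conc = disc = 0
    n = len(jols)
    for i in range(n):
        for j in range(i + 1, n):
            s = (jols[i] - jols[j]) * (recall[i] - recall[j])
            if s > 0:
                conc += 1
            elif s < 0:
                disc += 1
    return (conc - disc) / (conc + disc)
```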
Accuracies of univariate and multivariate genomic prediction models in African cassava.
Okeke, Uche Godfrey; Akdemir, Deniz; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc
2017-12-04
Genomic selection (GS) promises to accelerate genetic gain in plant breeding programs especially for crop species such as cassava that have long breeding cycles. Practically, to implement GS in cassava breeding, it is necessary to evaluate different GS models and to develop suitable models for an optimized breeding pipeline. In this paper, we compared (1) prediction accuracies from a single-trait (uT) and a multi-trait (MT) mixed model for a single-environment genetic evaluation (Scenario 1), and (2) accuracies from a compound symmetric multi-environment model (uE) parameterized as a univariate multi-kernel model to a multivariate (ME) multi-environment mixed model that accounts for genotype-by-environment interaction for multi-environment genetic evaluation (Scenario 2). For these analyses, we used 16 years of public cassava breeding data for six target cassava traits and a fivefold cross-validation scheme with 10-repeat cycles to assess model prediction accuracies. In Scenario 1, the MT models had higher prediction accuracies than the uT models for all traits and locations analyzed, which amounted to on average a 40% improved prediction accuracy. For Scenario 2, we observed that the ME model had on average (across all locations and traits) a 12% improved prediction accuracy compared to the uE model. We recommend the use of multivariate mixed models (MT and ME) for cassava genetic evaluation. These models may be useful for other plant species.
The Use of Linear Programming for Prediction.
ERIC Educational Resources Information Center
Schnittjer, Carl J.
The purpose of the study was to develop a linear programming model to be used for prediction, test the accuracy of the predictions, and compare the accuracy with that produced by curvilinear multiple regression analysis. (Author)
Hu, Xuefei; Waller, Lance A; Lyapustin, Alexei; Wang, Yujie; Liu, Yang
2014-10-16
Multiple studies have developed surface PM2.5 (particle size less than 2.5 µm in aerodynamic diameter) prediction models using satellite-derived aerosol optical depth as the primary predictor and meteorological and land use variables as secondary variables. To our knowledge, satellite-retrieved fire information has not been used for PM2.5 concentration prediction in statistical models. Fire data could be a useful predictor since fires are significant contributors of PM2.5. In this paper, we examined whether remotely sensed fire count data could improve PM2.5 prediction accuracy in the southeastern U.S. in a spatial statistical model setting. A sensitivity analysis showed that when the radius of the buffer zone centered at each PM2.5 monitoring site reached 75 km, fire count data generally have the greatest predictive power of PM2.5 across the models considered. Cross validation (CV) generated an R2 of 0.69, a mean prediction error of 2.75 µg/m3, and root-mean-square prediction errors (RMSPEs) of 4.29 µg/m3, indicating a good fit between the dependent and predictor variables. A comparison showed that the prediction accuracy was improved more substantially from the nonfire model to the fire model at sites with higher fire counts. With increasing fire counts, CV RMSPE decreased by values up to 1.5 µg/m3, exhibiting a maximum improvement of 13.4% in prediction accuracy. Fire count data were shown to have better performance in southern Georgia and in the spring season due to higher fire occurrence. Our findings indicate that fire count data provide a measurable improvement in PM2.5 concentration estimation, especially in areas and seasons prone to fire events.
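The cross-validation summaries reported here (R2 and RMSPE) can be reproduced from paired predictions and observations. A minimal sketch, assuming R2 is the squared Pearson correlation between CV predictions and observations (one common definition; the authors may have used another):

```python
import math

def rmspe(pred, obs):
    # Root-mean-square prediction error of cross-validated estimates.
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def cv_r2(pred, obs):
    # Squared Pearson correlation between CV predictions and observations.
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    vp = sum((p - mp) ** 2 for p in pred)
    vo = sum((o - mo) ** 2 for o in obs)
    return cov * cov / (vp * vo)
```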
Common polygenic variation enhances risk prediction for Alzheimer’s disease
Sims, Rebecca; Bannister, Christian; Harold, Denise; Vronskaya, Maria; Majounie, Elisa; Badarinarayan, Nandini; Morgan, Kevin; Passmore, Peter; Holmes, Clive; Powell, John; Brayne, Carol; Gill, Michael; Mead, Simon; Goate, Alison; Cruchaga, Carlos; Lambert, Jean-Charles; van Duijn, Cornelia; Maier, Wolfgang; Ramirez, Alfredo; Holmans, Peter; Jones, Lesley; Hardy, John; Seshadri, Sudha; Schellenberg, Gerard D.; Amouyel, Philippe
2015-01-01
The identification of subjects at high risk for Alzheimer’s disease is important for prognosis and early intervention. We investigated the polygenic architecture of Alzheimer’s disease and the accuracy of Alzheimer’s disease prediction models, including and excluding the polygenic component in the model. This study used genotype data from the powerful dataset comprising 17 008 cases and 37 154 controls obtained from the International Genomics of Alzheimer’s Project (IGAP). Polygenic score analysis tested whether the alleles identified to associate with disease in one sample set were significantly enriched in the cases relative to the controls in an independent sample. The disease prediction accuracy was investigated in a subset of the IGAP data, a sample of 3049 cases and 1554 controls (for whom APOE genotype data were available) by means of sensitivity, specificity, area under the receiver operating characteristic curve (AUC) and positive and negative predictive values. We observed significant evidence for a polygenic component enriched in Alzheimer’s disease (P = 4.9 × 10(-26)). This enrichment remained significant after APOE and other genome-wide associated regions were excluded (P = 3.4 × 10(-19)). The best prediction accuracy, AUC = 78.2% (95% confidence interval 77–80%), was achieved by a logistic regression model with APOE, the polygenic score, sex and age as predictors. In conclusion, Alzheimer’s disease has a significant polygenic component, which has predictive utility for Alzheimer’s disease risk and could be a valuable research tool complementing experimental designs, including preventative clinical trials, stem cell selection and high/low risk clinical studies. In modelling a range of sample disease prevalences, we found that polygenic scores almost double case prediction from chance, with increased prediction at polygenic extremes. PMID:26490334
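As a concrete sketch of the scoring step, the snippet below builds a toy polygenic score as a weighted sum of risk-allele dosages and compares AUC with and without it. All numbers (cohort size, SNP count, effect sizes, the "APOE-like" locus) are invented for illustration and are not IGAP estimates:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy cohort: many small-effect SNPs plus one large-effect locus standing in
# for APOE. Sizes and weights are illustrative assumptions only.
n, m = 4000, 500
betas = rng.normal(0, 0.05, m)               # per-SNP log-odds weights
geno = rng.binomial(2, 0.3, size=(n, m))     # risk-allele dosages 0/1/2
apoe = rng.binomial(2, 0.15, n)              # dosage at the APOE-like locus
liability = geno @ betas + 0.8 * apoe + rng.normal(0.0, 1.0, n)
case = liability > np.quantile(liability, 0.5)   # top half of liability = "cases"

def auc(score, label):
    """AUC via the Mann-Whitney rank-sum identity."""
    ranks = score.argsort().argsort() + 1
    n_pos, n_neg = label.sum(), (~label).sum()
    return (ranks[label].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

prs = geno @ betas                # polygenic score (true weights, for illustration)
auc_apoe = auc(apoe.astype(float), case)
auc_both = auc(0.8 * apoe + prs, case)
print(f"AUC, APOE-like locus only: {auc_apoe:.3f}")
print(f"AUC, plus polygenic score: {auc_both:.3f}")
```

Adding the polygenic component to the single large-effect locus should raise the AUC noticeably, mirroring the study's finding that the PRS carries predictive information beyond APOE.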
Pothula, Venu M.; Yuan, Stanley C.; Maerz, David A.; Montes, Lucresia; Oleszkiewicz, Stephen M.; Yusupov, Albert; Perline, Richard
2015-01-01
Background Advanced predictive analytical techniques are being increasingly applied to clinical risk assessment. This study compared a neural network model to several other models in predicting the length of stay (LOS) in the cardiac surgical intensive care unit (ICU) based on pre-incision patient characteristics. Methods Thirty-six variables collected from 185 cardiac surgical patients were analyzed for contribution to ICU LOS. The Automatic Linear Modeling (ALM) module of IBM-SPSS software identified 8 factors with statistically significant associations with ICU LOS; these factors were also analyzed with the Artificial Neural Network (ANN) module of the same software. The weighted contributions of each factor (“trained” data) were then applied to data for a “new” patient to predict ICU LOS for that individual. Results Factors identified in the ALM model were: use of an intra-aortic balloon pump; O2 delivery index; age; use of positive cardiac inotropic agents; hematocrit; serum creatinine ≥ 1.3 mg/deciliter; gender; and arterial pCO2. The r2 value for ALM prediction of ICU LOS in the initial (training) model was 0.356, p < 0.0001. Cross validation in prediction of a “new” patient yielded r2 = 0.200, p < 0.0001. The same 8 factors analyzed with ANN yielded a training prediction r2 of 0.535 (p < 0.0001) and a cross-validation prediction r2 of 0.410, p < 0.0001. Two additional predictive algorithms were studied, but they had lower prediction accuracies. Our validated neural network model identified the upper quartile of ICU LOS with an odds ratio of 9.8 (p < 0.0001). Conclusions ANN demonstrated 2-fold greater accuracy than ALM in prediction of observed ICU LOS. This greater accuracy would be presumed to result from the capacity of ANN to capture nonlinear effects and higher order interactions. Predictive modeling may be of value in early anticipation of risks of post-operative morbidity and utilization of ICU facilities. PMID:26710254
PPCM: Combining multiple classifiers to improve protein-protein interaction prediction
Yao, Jianzhuang; Guo, Hong; Yang, Xiaohan
2015-08-01
Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM); this method combines output from two PPI prediction tools, GO2PPI and Phyloprof, using the Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross-species PPCM could achieve competitive and even better prediction accuracy compared to the single-species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using the Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.
Kebede, Mihiretu; Zegeye, Desalegn Tigabu; Zeleke, Berihun Megabiaw
2017-12-01
To monitor the progress of therapy and disease progression, periodic CD4 counts are required throughout the course of HIV/AIDS care and support. The demand for CD4 count measurement has increased as ART programs have expanded over the last decade. This study aimed to predict CD4 count changes and to identify the predictors of CD4 count changes among patients on ART. A cross-sectional study was conducted at the University of Gondar Hospital on 3,104 adult patients on ART with CD4 counts measured at least twice (baseline and most recent). Data were retrieved from the HIV care clinic electronic database and patients' charts. Descriptive data were analyzed with SPSS version 20. The Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology was followed. WEKA version 3.8 was used for predictive data mining. Before building the predictive data mining models, information gain values and correlation-based feature selection methods were used for attribute selection. Variables were ranked according to their relevance based on their information gain values. J48, Neural Network, and Random Forest algorithms were evaluated for model accuracy. The median duration of ART was 191.5 weeks. The mean CD4 count change was 243 (SD 191.14) cells per microliter. Overall, 2427 (78.2%) patients had their CD4 counts increase by at least 100 cells per microliter, while 4% had a decline from the baseline CD4 value. Baseline variables including age, educational status, CD8 count, ART regimen, and hemoglobin levels predicted CD4 count changes, with predictive accuracies of J48, Neural Network, and Random Forest being 87.1%, 83.5%, and 99.8%, respectively. The Random Forest algorithm outperformed both J48 and the Artificial Neural Network; its precision, sensitivity, and recall values were also above 99%. Near-perfect prediction results were obtained using the Random Forest algorithm.
This algorithm could be used in a low-resource setting to build a web-based prediction model for CD4 count changes. Copyright © 2017 Elsevier B.V. All rights reserved.
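The attribute-selection step described above, ranking discretised baseline attributes by information gain against the outcome, can be sketched in a few lines. The attribute names and the synthetic relationships below are illustrative stand-ins, not the Gondar data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-ins: a binary outcome ("CD4 gain >= 100 cells/uL or not")
# and four discretised baseline attributes with assumed relationships.
n = 2000
outcome = rng.integers(0, 2, n)
attrs = {
    "age_band": rng.integers(0, 4, n),                  # unrelated to outcome
    "cd8_band": (outcome + rng.integers(0, 3, n)) % 4,  # informative
    "regimen":  rng.integers(0, 3, n),                  # unrelated to outcome
    "hgb_band": np.where(rng.random(n) < 0.7,           # strongly informative
                         outcome, rng.integers(0, 2, n)),
}

def entropy(labels):
    """Shannon entropy in bits of a discrete label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def info_gain(attr, labels):
    """H(labels) - sum_v P(attr = v) * H(labels | attr = v)."""
    h = entropy(labels)
    for v in np.unique(attr):
        mask = attr == v
        h -= mask.mean() * entropy(labels[mask])
    return h

ranking = sorted(attrs, key=lambda a: info_gain(attrs[a], outcome), reverse=True)
for name in ranking:
    print(f"{name:10s} {info_gain(attrs[name], outcome):.4f}")
```

This roughly corresponds to running Weka's InfoGainAttributeEval with the Ranker search before handing the top attributes to J48 or Random Forest.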
Blanche, Paul; Proust-Lima, Cécile; Loubère, Lucie; Berr, Claudine; Dartigues, Jean-François; Jacqmin-Gadda, Hélène
2015-03-01
Thanks to the growing interest in personalized medicine, joint modeling of longitudinal marker and time-to-event data has recently started to be used to derive dynamic individual risk predictions. Individual predictions are called dynamic because they are updated when information on the subject's health profile grows with time. We focus in this work on statistical methods for quantifying and comparing dynamic predictive accuracy of this kind of prognostic models, accounting for right censoring and possibly competing events. Dynamic area under the ROC curve (AUC) and Brier Score (BS) are used to quantify predictive accuracy. Nonparametric inverse probability of censoring weighting is used to estimate dynamic curves of AUC and BS as functions of the time at which predictions are made. Asymptotic results are established and both pointwise confidence intervals and simultaneous confidence bands are derived. Tests are also proposed to compare the dynamic prediction accuracy curves of two prognostic models. The finite sample behavior of the inference procedures is assessed via simulations. We apply the proposed methodology to compare various prediction models using repeated measures of two psychometric tests to predict dementia in the elderly, accounting for the competing risk of death. Models are estimated on the French Paquid cohort and predictive accuracies are evaluated and compared on the French Three-City cohort. © 2014, The International Biometric Society.
The need to approximate the use-case in clinical machine learning.
Saeb, Sohrab; Lonini, Luca; Jayaraman, Arun; Mohr, David C; Kording, Konrad P
2017-05-01
The availability of smartphone and wearable sensor technology is leading to a rapid accumulation of human subject data, and machine learning is emerging as a technique to map those data into clinical predictions. As machine learning algorithms are increasingly used to support clinical decision making, it is vital to reliably quantify their prediction accuracy. Cross-validation (CV) is the standard approach where the accuracy of such algorithms is evaluated on part of the data the algorithm has not seen during training. However, for this procedure to be meaningful, the relationship between the training and the validation set should mimic the relationship between the training set and the dataset expected for the clinical use. Here we compared two popular CV methods: record-wise and subject-wise. While the subject-wise method mirrors the clinically relevant use-case scenario of diagnosis in newly recruited subjects, the record-wise strategy has no such interpretation. Using both a publicly available dataset and a simulation, we found that record-wise CV often massively overestimates the prediction accuracy of the algorithms. We also conducted a systematic review of the relevant literature, and found that this overly optimistic method was used by almost half of the retrieved studies that used accelerometers, wearable sensors, or smartphones to predict clinical outcomes. As we move towards an era of machine learning-based diagnosis and treatment, using proper methods to evaluate their accuracy is crucial, as inaccurate results can mislead both clinicians and data scientists. © The Author 2017. Published by Oxford University Press.
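The gap between the two CV schemes is easy to reproduce in simulation: give each subject a feature "fingerprint" unrelated to the label, and a classifier evaluated record-wise can look up the subject instead of the outcome. A minimal sketch with synthetic data, where a 1-nearest-neighbour classifier and a single split stand in for the full CV of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 20 subjects, 30 records each. The label is fixed per subject,
# and features carry a strong subject "fingerprint" but only a weak label
# signal -- the identity-confound setting the abstract warns about.
n_subj, n_rec = 20, 30
subj = np.repeat(np.arange(n_subj), n_rec)
y_subj = rng.integers(0, 2, n_subj)                 # one label per subject
y = y_subj[subj]
X = rng.normal(size=(n_subj * n_rec, 5))
X += 3.0 * rng.normal(size=(n_subj, 5))[subj]       # subject fingerprint
X[:, 0] += 0.2 * y                                  # weak true signal

def nn1_accuracy(train_idx, test_idx):
    """1-nearest-neighbour accuracy for one train/test split."""
    d = ((X[test_idx, None, :] - X[None, train_idx, :]) ** 2).sum(-1)
    pred = y[train_idx][d.argmin(axis=1)]
    return (pred == y[test_idx]).mean()

# Record-wise split: records of the same subject land on both sides.
idx = rng.permutation(len(y))
half = len(idx) // 2
acc_record = nn1_accuracy(idx[:half], idx[half:])

# Subject-wise split: whole subjects are held out, mimicking new recruits.
test_mask = (np.arange(n_subj) < n_subj // 2)[subj]
acc_subject = nn1_accuracy(np.where(~test_mask)[0], np.where(test_mask)[0])

print(f"record-wise accuracy:  {acc_record:.2f}")
print(f"subject-wise accuracy: {acc_subject:.2f}")
```

With the fingerprint dominating the features, the record-wise estimate approaches 100% while the subject-wise estimate typically stays near chance, mirroring the overestimation reported above.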
Joint genomic evaluation of French dairy cattle breeds using multiple-trait models.
Karoui, Sofiene; Carabaño, María Jesús; Díaz, Clara; Legarra, Andrés
2012-12-07
Using a multi-breed reference population might be a way of increasing the accuracy of genomic breeding values in small breeds. Models involving mixed-breed data do not take into account the fact that marker effects may differ among breeds. This study was aimed at investigating the impact on accuracy of increasing the number of genotyped candidates in the training set by using a multi-breed reference population, in contrast to single-breed genomic evaluations. Three traits (milk production, fat content and female fertility) were analyzed by genomic mixed linear models and Bayesian methodology. Three breeds of French dairy cattle were used: Holstein, Montbéliarde and Normande with 2976, 950 and 970 bulls in the training population, respectively and 964, 222 and 248 bulls in the validation population, respectively. All animals were genotyped with the Illumina Bovine SNP50 array. Accuracy of genomic breeding values was evaluated under three scenarios for the correlation of genomic breeding values between breeds (r(g)): uncorrelated (1), r(g) = 0; estimated r(g) (2); high, r(g) = 0.95 (3). Accuracy and bias of predictions obtained in the validation population with the multi-breed training set were assessed by the coefficient of determination (R(2)) and by the regression coefficient of daughter yield deviations of validation bulls on their predicted genomic breeding values, respectively. The genetic variation captured by the markers for each trait was similar to that estimated for routine pedigree-based genetic evaluation. Posterior means for r(g) ranged from -0.01 for fertility between Montbéliarde and Normande to 0.79 for milk yield between Montbéliarde and Holstein. Differences in R(2) between the three scenarios were notable only for fat content in the Montbéliarde breed: from 0.27 in scenario (1) to 0.33 in scenarios (2) and (3). Accuracies for fertility were lower than for other traits.
Using a multi-breed reference population resulted in small or no increases in accuracy. Only the breed with a small data set and large genetic correlation with the breed with a large data set showed increased accuracy for the traits with moderate (milk) to high (fat content) heritability. No benefit was observed for fertility, a lowly heritable trait.
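In its simplest single-breed form, the genomic evaluation above (and GBLUP generally) is equivalent to ridge regression on centred marker dosages. A toy sketch with simulated marker effects — the marker count, heritability and shrinkage constant are arbitrary choices, not values estimated from the French data:

```python
import numpy as np

rng = np.random.default_rng(5)

# Minimal GBLUP-style sketch: genomic prediction as ridge regression on SNP
# dosages, scored by squared correlation between predicted and simulated true
# breeding values in a held-out validation set.
n, m = 600, 1000
X = rng.binomial(2, 0.4, size=(n, m)).astype(float)
X -= X.mean(axis=0)                       # centre marker dosages
beta = rng.normal(0, 0.05, m)             # simulated true marker effects
g = X @ beta                              # true breeding values (known here only
                                          # because the data are simulated)
y = g + rng.normal(0, g.std(), n)         # phenotypes, heritability ~ 0.5

train, valid = np.arange(400), np.arange(400, 600)
lam = 1.0 * m                             # shrinkage constant; loosely tuned
beta_hat = np.linalg.solve(X[train].T @ X[train] + lam * np.eye(m),
                           X[train].T @ y[train])
pred = X[valid] @ beta_hat
r2 = np.corrcoef(pred, g[valid])[0, 1] ** 2
print(f"validation accuracy (R2): {r2:.3f}")
```

A multi-breed analysis then amounts to fitting such effects jointly across breeds with an assumed between-breed correlation, which is exactly the r(g) scenarios compared in the study.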
Bio-knowledge based filters improve residue-residue contact prediction accuracy.
Wozniak, P P; Pelc, J; Skrzypecki, M; Vriend, G; Kotulska, M
2018-05-29
Residue-residue contact prediction through direct coupling analysis has reached impressive accuracy, yet still higher accuracy will be needed to allow routine modelling of protein structures. One way to improve prediction accuracy is to filter predicted contacts using knowledge about the particular protein of interest or knowledge about protein structures in general. We focus on the latter and discuss a set of filters that can be used to remove false positive contact predictions. Each filter depends on one or a few cut-off parameters for which the filter performance was investigated. Combining all filters with default parameters resulted, for a test set of 851 protein domains, in the removal of 29% of the predictions, of which 92% were indeed false positives. All data and scripts are available from http://comprec-lin.iiar.pwr.edu.pl/FPfilter/. Contact: malgorzata.kotulska@pwr.edu.pl. Supplementary data are available at Bioinformatics online.
Tsukagoshi, Mariko; Araki, Kenichiro; Saito, Fumiyoshi; Kubo, Norio; Watanabe, Akira; Igarashi, Takamichi; Ishii, Norihiro; Yamanaka, Takahiro; Shirabe, Ken; Kuwano, Hiroyuki
2018-04-01
International consensus guidelines for intraductal papillary mucinous neoplasms (IPMNs) were revised in 2012. We aimed to evaluate the clinical utility of each predictor in the 2006 and 2012 guidelines and validate the diagnostic value and surgical indications. Forty-two patients with surgically resected IPMNs were included. Each predictor was applied to evaluate its diagnostic value. The 2012 guidelines had greater accuracy for invasive carcinoma than the 2006 guidelines (64.3 vs. 31.0%). Moreover, the accuracy for high-grade dysplasia also increased (from 48.6 to 77.1%). When a main pancreatic duct (MPD) size ≥8 mm was substituted for MPD size ≥10 mm in the 2012 guidelines, the accuracy for high-grade dysplasia was 80.0%. The 2012 guidelines exhibited increased diagnostic accuracy for invasive IPMN. It is important to consider surgical resection before progression to invasive carcinoma, and high-risk stigmata might be a useful diagnostic criterion. Furthermore, MPD size ≥8 mm may be predictive of high-grade dysplasia.
Compound activity prediction using models of binding pockets or ligand properties in 3D
Kufareva, Irina; Chen, Yu-Chen; Ilatovskiy, Andrey V.; Abagyan, Ruben
2014-01-01
Transient interactions of endogenous and exogenous small molecules with flexible binding sites in proteins or macromolecular assemblies play a critical role in all biological processes. Current advances in high-resolution protein structure determination, database development, and docking methodology make it possible to design three-dimensional models for prediction of such interactions with increasing accuracy and specificity. Using the data collected in the Pocketome encyclopedia, we here provide an overview of two types of three-dimensional ligand activity models, pocket-based and ligand property-based, for two important classes of proteins, nuclear and G-protein coupled receptors. For half the targets, the pocket models discriminate actives from property-matched decoys with acceptable accuracy (area under the ROC curve, AUC, exceeding 84%), and for about one fifth of the targets with high accuracy (AUC > 95%). The 3D ligand property field models performed better than 95% in half of the cases. The high-performance models can already serve as a basis for activity predictions for new chemicals. Family-wide benchmarking of the models highlights the strengths of both approaches and helps identify their inherent bottlenecks and challenges. PMID:23116466
The Neural-fuzzy Thermal Error Compensation Controller on CNC Machining Center
NASA Astrophysics Data System (ADS)
Tseng, Pai-Chung; Chen, Shen-Len
The geometric errors and structural thermal deformation are factors that influence the machining accuracy of a Computer Numerical Control (CNC) machining center. Therefore, researchers pay attention to thermal error compensation technologies for CNC machine tools. Some real-time error compensation techniques have been successfully demonstrated in both laboratories and industrial sites, but the compensation results still need to be enhanced. In this research, neural-fuzzy theory is used to derive a thermal prediction model. An IC-type thermometer detects the temperature variation of the heat sources. The thermal drifts are measured online by a touch-triggered probe with a standard bar. A thermal prediction model is then derived by neural-fuzzy theory from the temperature variation and the thermal drifts. A Graphical User Interface (GUI) providing a user-friendly operation interface is also built with Inprise C++ Builder. The experimental results show that the thermal prediction model developed with the neural-fuzzy methodology can improve machining accuracy from 80 µm to 3 µm. Compared with multi-variable linear regression analysis, the compensation accuracy is improved from ±10 µm to ±3 µm.
Protein contact prediction using patterns of correlation.
Hamilton, Nicholas; Burrage, Kevin; Ragan, Mark A; Huber, Thomas
2004-09-01
We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two "windows" of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. Copyright 2004 Wiley-Liss, Inc.
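The evaluation convention used above (accuracy of the best L/2 or L/10 predictions) is simply precision among the top-ranked residue pairs after excluding sequence-local pairs. A sketch on a synthetic contact map — the 5% contact density, the |i - j| >= 6 cut-off and the score model are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

L = 100                                          # sequence length
# Synthetic "true" contact map: ~5% of pairs with |i - j| >= 6 are contacts.
true_map = np.triu(rng.random((L, L)) < 0.05, k=6)
# Imperfect predictor: noisy scores, shifted upward on true contacts.
scores = rng.normal(size=(L, L)) + 1.5 * true_map

def top_k_precision(scores, true_map, k):
    """Fraction of true contacts among the k highest-scoring eligible pairs."""
    iu = np.triu_indices(L, k=6)                 # drop near-diagonal pairs
    order = np.argsort(scores[iu])[::-1][:k]
    return true_map[iu][order].mean()

acc_half = top_k_precision(scores, true_map, L // 2)
acc_tenth = top_k_precision(scores, true_map, L // 10)
print(f"best L/2 : {acc_half:.2f}")
print(f"best L/10: {acc_tenth:.2f}")
```

Because only the top-ranked pairs are scored, a predictor can be useful at L/10 even when its pairwise signal is weak overall, which is why the paper reports both cut-offs.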
Karpušenkaitė, Aistė; Ruzgas, Tomas; Denafas, Gintaras
2018-05-01
The aim of the study was to create a hybrid forecasting method that could produce more accurate forecasts than the previously used 'pure' time series methods. These methods had already been tested on total automotive waste, hazardous automotive waste, and total medical waste generation, but demonstrated at least a 6% error rate in different cases, and efforts were made to decrease it further. The newly developed hybrid models used a random start generation method to combine the advantages of different time series methods, which increased forecast accuracy by 3%-4% in the hazardous automotive waste and total medical waste generation cases; the new model did not increase the accuracy of total automotive waste generation forecasts. The developed models' short- and mid-term forecasting abilities were tested over different prediction horizons.
Abawajy, Jemal; Kelarev, Andrei; Chowdhury, Morshed U; Jelinek, Herbert F
2016-01-01
Blood biochemistry attributes form an important class of tests, routinely collected several times per year for many patients with diabetes. The objective of this study is to investigate the role of blood biochemistry for improving the predictive accuracy of the diagnosis of cardiac autonomic neuropathy (CAN) progression. Blood biochemistry contributes to CAN, and so it is a causative factor that can provide additional power for the diagnosis of CAN especially in the absence of a complete set of Ewing tests. We introduce automated iterative multitier ensembles (AIME) and investigate their performance in comparison to base classifiers and standard ensemble classifiers for blood biochemistry attributes. AIME incorporate diverse ensembles into several tiers simultaneously and combine them into one automatically generated integrated system so that one ensemble acts as an integral part of another ensemble. We carried out extensive experimental analysis using large datasets from the diabetes screening research initiative (DiScRi) project. The results of our experiments show that several blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in situations where one or more of the Ewing tests cannot be completed because of the individual difficulties faced by each patient in performing the tests. The results show that AIME provide higher accuracy as a multitier CAN classification paradigm. The best predictive accuracy of 99.57% was obtained by the AIME combining Decorate on the top tier with Bagging on the middle tier, based on Random Forest. Practitioners can use these findings to increase the accuracy of CAN diagnosis.
Cantiello, Francesco; Russo, Giorgio Ivan; Cicione, Antonio; Ferro, Matteo; Cimino, Sebastiano; Favilla, Vincenzo; Perdonà, Sisto; De Cobelli, Ottavio; Magno, Carlo; Morgia, Giuseppe; Damiano, Rocco
2016-04-01
To assess the performance of the prostate health index (PHI) and prostate cancer antigen 3 (PCA3) when added to the PRIAS or Epstein criteria in predicting the presence of pathologically insignificant prostate cancer (IPCa) in patients who underwent radical prostatectomy (RP) but were eligible for active surveillance (AS). An observational retrospective study was performed in 188 PCa patients treated with laparoscopic or robot-assisted RP but eligible for AS according to the Epstein or PRIAS criteria. Blood and urinary specimens were collected before initial prostate biopsy for PHI and PCA3 measurements. Multivariate logistic regression analyses and decision curve analysis (DCA) were carried out to identify predictors of IPCa using the updated ERSPC definition. In the multivariate analyses, the inclusion of PCA3 and PHI significantly increased the accuracy of the Epstein multivariate model in predicting IPCa, with increases of 17% (AUC = 0.77) and 32% (AUC = 0.92), respectively. The inclusion of PCA3 and PHI also increased the predictive accuracy of the PRIAS multivariate model, with increases of 29% (AUC = 0.87) and 39% (AUC = 0.97), respectively. DCA revealed that the multivariable models with the addition of PHI or PCA3 showed a greater net benefit and performed better than the reference models. In a direct comparison, PHI outperformed PCA3, resulting in higher net benefit. In the same cohort of patients eligible for AS, the addition of PHI and PCA3 to the Epstein or PRIAS models improved their prognostic performance. PHI resulted in greater net benefit in predicting IPCa compared to PCA3.
2015-06-07
…annotations for these 5 attributes we achieve 65.18% accuracy, better than human performance (60.12%) at predicting relative virality directly…
Hartman, Joshua D; Day, Graeme M; Beran, Gregory J O
2016-11-02
Chemical shift prediction plays an important role in the determination or validation of crystal structures with solid-state nuclear magnetic resonance (NMR) spectroscopy. One of the fundamental theoretical challenges lies in discriminating variations in chemical shifts resulting from different crystallographic environments. Fragment-based electronic structure methods provide an alternative to the widely used plane wave gauge-including projector augmented wave (GIPAW) density functional technique for chemical shift prediction. Fragment methods allow hybrid density functionals to be employed routinely in chemical shift prediction, and we have recently demonstrated appreciable improvements in the accuracy of the predicted shifts when using the hybrid PBE0 functional instead of generalized gradient approximation (GGA) functionals like PBE. Here, we investigate the solid-state 13C and 15N NMR spectra for multiple crystal forms of acetaminophen, phenobarbital, and testosterone. We demonstrate that the use of the hybrid density functional instead of a GGA provides both higher accuracy in the chemical shifts and increased discrimination among the different crystallographic environments. Finally, these results also provide compelling evidence for the transferability of the linear regression parameters mapping predicted chemical shieldings to chemical shifts that were derived in an earlier study.
Cortical Thickness Predicts the First Onset of Major Depression in Adolescence
Foland-Ross, Lara C.; Sacchet, Matthew D.; Prasad, Gautam; Gilbert, Brooke; Thompson, Paul M.; Gotlib, Ian H.
2015-01-01
Given the increasing prevalence of Major Depressive Disorder and recent advances in preventative treatments for this disorder, an important challenge in pediatric neuroimaging is the early identification of individuals at risk for depression. We examined whether machine learning can be used to predict the onset of depression at the individual level. Thirty-three never-disordered adolescents (10–15 years old) underwent structural MRI. Participants were followed for 5 years to monitor the emergence of clinically significant depressive symptoms. We used support vector machines (SVMs) to test whether baseline cortical thickness could reliably distinguish adolescents who develop depression from adolescents who remained free of any Axis I disorder. Accuracies from subsampled cross-validated classification were used to assess classifier performance. Baseline cortical thickness correctly predicted the future onset of depression with an overall accuracy of 70% (69% sensitivity, 70% specificity; p = 0.021). Examination of SVM feature weights indicated that the right medial orbitofrontal, right precentral, left anterior cingulate, and bilateral insular cortex contributed most strongly to this classification. These findings indicate that cortical gray matter structure can predict the subsequent onset of depression. An important direction for future research is to elucidate mechanisms by which these anomalies in gray matter structure increase risk for developing this disorder. PMID:26315399
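The subsampled cross-validated SVM classification described in this record can be sketched as follows. This is a minimal illustration on synthetic "cortical thickness" features with a made-up onset label, not the study's MRI data or processing pipeline.

```python
# Sketch of cross-validated SVM classification on synthetic "cortical
# thickness" features; data, labels, and dimensions are illustrative only.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
n_subjects, n_regions = 60, 8
X = rng.normal(size=(n_subjects, n_regions))      # thickness per region
# toy "future onset" label driven mostly by the first region
y = (X[:, 0] + 0.5 * rng.normal(size=n_subjects) > 0).astype(int)

clf = SVC(kernel="linear")
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)        # per-fold accuracy
print(round(float(scores.mean()), 2))
```

Feature weights of a fitted linear SVM (`clf.coef_`) can then be inspected to see which regions drive the classification, as the study did.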
Cui, Zaixu; Gong, Gaolang
2018-06-02
Individualized behavioral/cognitive prediction using machine learning (ML) regression approaches is becoming increasingly applied. The specific ML regression algorithm and sample size are two key factors that non-trivially influence prediction accuracies. However, the effects of the ML regression algorithm and sample size on individualized behavioral/cognitive prediction performance have not been comprehensively assessed. To address this issue, the present study included six commonly used ML regression algorithms: ordinary least squares (OLS) regression, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic-net regression, linear support vector regression (LSVR), and relevance vector regression (RVR), to perform specific behavioral/cognitive predictions based on different sample sizes. Specifically, the publicly available resting-state functional MRI (rs-fMRI) dataset from the Human Connectome Project (HCP) was used, and whole-brain resting-state functional connectivity (rsFC) or rsFC strength (rsFCS) were extracted as prediction features. Twenty-five sample sizes (ranging from 20 to 700) were studied by sub-sampling from the entire HCP cohort. The analyses showed that rsFC-based LASSO regression performed remarkably worse than the other algorithms, and rsFCS-based OLS regression performed markedly worse than the other algorithms. Regardless of the algorithm and feature type, both the prediction accuracy and its stability exponentially increased with increasing sample size. The specific patterns of the observed algorithm and sample size effects were well replicated in the prediction using re-testing fMRI data, data processed by different imaging preprocessing schemes, and different behavioral/cognitive scores, thus indicating excellent robustness/generalization of the effects. The current findings provide critical insight into how the selected ML regression algorithm and sample size influence individualized predictions of behavior/cognition and offer important guidance for choosing the ML regression algorithm or sample size in relevant investigations. Copyright © 2018 Elsevier Inc. All rights reserved.
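The algorithm-versus-sample-size comparison above can be sketched minimally as follows: prediction accuracy, measured as the Pearson correlation between predicted and observed scores, for ridge regression at two training sample sizes. The data are synthetic stand-ins, not HCP rs-fMRI features.

```python
# Sketch of the sample-size effect on prediction accuracy for one ML
# regression algorithm (ridge). Synthetic linear data, illustrative only.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n, p = 800, 50
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + rng.normal(scale=2.0, size=n)      # score = signal + noise

def accuracy(n_train):
    """Pearson r between predicted and observed scores on a held-out set."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=n_train, test_size=200, random_state=0)
    pred = Ridge(alpha=1.0).fit(X_tr, y_tr).predict(X_te)
    return float(np.corrcoef(pred, y_te)[0, 1])

small, large = accuracy(50), accuracy(400)        # accuracy rises with n
print(round(small, 2), round(large, 2))
```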
Prediction of Dementia in Primary Care Patients
Jessen, Frank; Wiese, Birgitt; Bickel, Horst; Eiffländer-Gorfer, Sandra; Fuchs, Angela; Kaduszkiewicz, Hanna; Köhler, Mirjam; Luck, Tobias; Mösch, Edelgard; Pentzek, Michael; Riedel-Heller, Steffi G.; Wagner, Michael; Weyerer, Siegfried; Maier, Wolfgang; van den Bussche, Hendrik
2011-01-01
Background Current approaches for AD prediction are based on biomarkers, which are, however, of restricted availability in primary care. AD prediction tools for primary care are therefore needed. We present a prediction score based on information that can be obtained in the primary care setting. Methodology/Principal Findings We performed a longitudinal cohort study in 3,055 non-demented individuals above 75 years of age recruited via primary care chart registries (Study on Aging, Cognition and Dementia, AgeCoDe). After the baseline investigation we performed three follow-up investigations at 18-month intervals with incident dementia as the primary outcome. The best set of predictors was extracted from the baseline variables in one randomly selected half of the sample. This set included age, subjective memory impairment, and performance on delayed verbal recall, verbal fluency, the Mini-Mental State Examination, and an instrumental activities of daily living scale. These variables were aggregated into a prediction score, which achieved a prediction accuracy of 0.84 for AD. The score was applied to the second half of the sample (test cohort). Here, the prediction accuracy was 0.79. With a cut-off of at least 80% sensitivity in the first cohort, 79.6% sensitivity, 66.4% specificity, 14.7% positive predictive value (PPV) and 97.8% negative predictive value (NPV) for AD were achieved in the test cohort. At a cut-off for a high-risk population (the 5% of individuals with the highest risk scores in the first cohort) the PPV for AD was 39.1% (52% for any dementia) in the test cohort. Conclusions The prediction score has useful prediction accuracy. It can define individuals (1) sensitively, for low-cost, low-risk interventions, or (2) more specifically and with increased PPV, for preventive measures with greater costs or risks. As it is independent of technical aids, it may be used within large-scale prevention programs. PMID:21364746
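The cutoff-based evaluation reported for this prediction score can be sketched as follows: sensitivity, specificity, PPV, and NPV of a risk score dichotomized at a fixed cutoff. The scores, outcomes, and cutoff below are simulated toy values, not the AgeCoDe data.

```python
# Sketch of evaluating a risk score at a fixed cutoff: sensitivity,
# specificity, PPV, and NPV from binary outcomes. Toy data only.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
outcome = rng.random(n) < 0.1                           # ~10% develop dementia
score = outcome * 1.0 + rng.normal(scale=0.8, size=n)   # noisy risk score

cutoff = 0.8                                            # arbitrary threshold
pred = score >= cutoff
tp = np.sum(pred & outcome)
fp = np.sum(pred & ~outcome)
fn = np.sum(~pred & outcome)
tn = np.sum(~pred & ~outcome)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)     # low when prevalence is low, as in the abstract
npv = tn / (tn + fn)     # correspondingly high
print(round(float(sensitivity), 2), round(float(specificity), 2),
      round(float(ppv), 2), round(float(npv), 2))
```

The large NPV/PPV gap in the toy output mirrors the abstract's 97.8% NPV versus 14.7% PPV: with a rare outcome, negative predictions are almost always right while positive ones are often false alarms.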
Jiang, Y; Zhao, Y; Rodemann, B; Plieske, J; Kollers, S; Korzun, V; Ebmeyer, E; Argillier, O; Hinze, M; Ling, J; Röder, M S; Ganal, M W; Mette, M F; Reif, J C
2015-03-01
Genome-wide mapping approaches in diverse populations are powerful tools to unravel the genetic architecture of complex traits. The main goals of our study were to investigate the potential and limits to unravel the genetic architecture and to identify the factors determining the accuracy of prediction of the genotypic variation of Fusarium head blight (FHB) resistance in wheat (Triticum aestivum L.) based on data collected with a diverse panel of 372 European varieties. The wheat lines were phenotyped in multi-location field trials for FHB resistance and genotyped with 782 simple sequence repeat (SSR) markers, and 9k and 90k single-nucleotide polymorphism (SNP) arrays. We applied genome-wide association mapping in combination with fivefold cross-validations and observed surprisingly high accuracies of prediction for marker-assisted selection based on the detected quantitative trait loci (QTLs). Using a random sample of markers not selected for marker-trait associations revealed only a slight decrease in prediction accuracy compared with marker-based selection exploiting the QTL information. The same picture was confirmed in a simulation study, suggesting that relatedness is a main driver of the accuracy of prediction in marker-assisted selection of FHB resistance. When the accuracy of prediction of three genomic selection models was contrasted for the three marker data sets, no significant differences in accuracies among marker platforms and genomic selection models were observed. Marker density impacted the accuracy of prediction only marginally. Consequently, genomic selection of FHB resistance can be implemented most cost-efficiently based on low- to medium-density SNP arrays.
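The fivefold cross-validated genomic prediction described here can be sketched as follows, with ridge regression standing in for a genomic selection model and a simulated marker matrix; accuracy is the correlation between predicted and observed phenotypes. All sizes and effect scales are illustrative assumptions.

```python
# Sketch of fivefold cross-validated genomic prediction on a simulated
# 0/1/2 marker matrix; ridge regression is a stand-in for GBLUP-type models.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(3)
n_lines, n_markers = 372, 500
M = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)  # genotypes
effects = rng.normal(scale=0.1, size=n_markers)                  # marker effects
y = M @ effects + rng.normal(size=n_lines)        # phenotype = genetics + noise

accs = []
for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(M):
    pred = Ridge(alpha=100.0).fit(M[tr], y[tr]).predict(M[te])
    accs.append(float(np.corrcoef(pred, y[te])[0, 1]))  # prediction accuracy
print(round(float(np.mean(accs)), 2))
```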
Space debris tracking at San Fernando laser station
NASA Astrophysics Data System (ADS)
Catalán, M.; Quijano, M.; Pazos, A.; Martín Davila, J.; Cortina, L. M.
2016-12-01
For years to come, space debris will be a major issue for society. It has a negative impact on active artificial satellites and implications for future missions. Tracking space debris as accurately as possible is the first step towards controlling this problem, yet it presents a challenge for science. The main limitation is the relatively low accuracy of the methods used to date for tracking these objects. Clearly, improving the accuracy of predicted orbits is crucial (avoiding unnecessary anti-collision maneuvers). A new line of research was recently started at our satellite laser ranging station: tracking decommissioned artificial satellites equipped with retroreflectors. To this end, we work in conjunction with international space agencies, which are paying increasing attention to this problem. We proposed to share our satellite laser ranging station's observing schedule to obtain data that would make orbital element predictions far more accurate (meter-level accuracy), while maintaining our tracking routines for active satellites. This manuscript reports on the actions carried out so far.
Prediction algorithms for urban traffic control
DOT National Transportation Integrated Search
1979-02-01
The objectives of this study are to 1) review and assess the state-of-the-art of prediction algorithms for urban traffic control in terms of their accuracy and application, and 2) determine the prediction accuracy obtainable by examining the performa...
Rath, Timo; Tontini, Gian E; Vieth, Michael; Nägel, Andreas; Neurath, Markus F; Neumann, Helmut
2016-06-01
In order to reduce time, costs, and risks associated with resection of diminutive colorectal polyps, the American Society for Gastrointestinal Endoscopy (ASGE) recently proposed performance thresholds that new technologies should meet for the accurate real-time assessment of histology of colorectal polyps. In this study, we prospectively assessed whether laser-induced fluorescence spectroscopy (LIFS), using the new WavSTAT4 optical biopsy system, can meet the ASGE criteria. 27 patients undergoing screening or surveillance colonoscopy were included. The histology of 137 diminutive colorectal polyps was predicted in real time using LIFS and findings were compared with the results of conventional histopathological examination. The accuracy of predicting polyp histology with WavSTAT4 was assessed according to the ASGE criteria. The overall accuracy of LIFS using WavSTAT4 for predicting polyp histology was 84.7 % with sensitivity, specificity, and negative predictive value (NPV) of 81.8 %, 85.2 %, and 96.1 %. When only distal colorectal diminutive polyps were considered, the NPV for excluding adenomatous histology increased to 100 % (accuracy 82.4 %, sensitivity 100 %, specificity 80.6 %). On-site, LIFS correctly predicted the recommended surveillance intervals with an accuracy of 88.9 % (24/27 patients) when compared with histology-based United States guideline recommendations; in the 3 patients for whom LIFS- and histopathology-based recommended surveillance intervals differed, LIFS predicted shorter surveillance intervals. From the data of this pilot study, LIFS using the WavSTAT4 system appears accurate enough to allow distal colorectal polyps to be left in place and nearly reaches the threshold to "resect and discard" them without pathologic assessment. WavSTAT4 therefore has the potential to reduce costs and risks associated with the removal of diminutive colorectal polyps. © Georg Thieme Verlag KG Stuttgart · New York.
Yock, Adam D; Rao, Arvind; Dong, Lei; Beadle, Beth M; Garden, Adam S; Kudchadker, Rajat J; Court, Laurence E
2014-05-01
The purpose of this work was to develop and evaluate the accuracy of several predictive models of variation in tumor volume throughout the course of radiation therapy. Nineteen patients with oropharyngeal cancers were imaged daily with CT-on-rails for image-guided alignment per an institutional protocol. The daily volumes of 35 tumors in these 19 patients were determined and used to generate (1) a linear model in which tumor volume changed at a constant rate, (2) a general linear model that utilized the power fit relationship between the daily and initial tumor volumes, and (3) a functional general linear model that identified and exploited the primary modes of variation between time series describing the changing tumor volumes. Primary and nodal tumor volumes were examined separately. The accuracy of these models in predicting daily tumor volumes were compared with those of static and linear reference models using leave-one-out cross-validation. In predicting the daily volume of primary tumors, the general linear model and the functional general linear model were more accurate than the static reference model by 9.9% (range: -11.6% to 23.8%) and 14.6% (range: -7.3% to 27.5%), respectively, and were more accurate than the linear reference model by 14.2% (range: -6.8% to 40.3%) and 13.1% (range: -1.5% to 52.5%), respectively. In predicting the daily volume of nodal tumors, only the 14.4% (range: -11.1% to 20.5%) improvement in accuracy of the functional general linear model compared to the static reference model was statistically significant. A general linear model and a functional general linear model trained on data from a small population of patients can predict the primary tumor volume throughout the course of radiation therapy with greater accuracy than standard reference models. These more accurate models may increase the prognostic value of information about the tumor garnered from pretreatment computed tomography images and facilitate improved treatment management.
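The leave-one-out comparison of a static reference model against a linear model can be sketched as follows. Tumor volume trajectories here are simulated with a shared shrinkage trend; the patient counts and rates are illustrative assumptions, not the study's CT measurements.

```python
# Sketch of leave-one-out comparison: a "static" model (volume stays at its
# initial value) versus a linear model whose shared daily shrinkage rate is
# fitted on the other patients. Simulated volumes, illustrative only.
import numpy as np

rng = np.random.default_rng(4)
n_patients, n_days = 19, 30
days = np.arange(n_days)
init = rng.uniform(10, 40, size=n_patients)               # initial volumes (cc)
rates = rng.normal(-0.3, 0.05, size=n_patients)           # cc change per day
vols = (init[:, None] + rates[:, None] * days
        + rng.normal(scale=0.5, size=(n_patients, n_days)))

static_err, linear_err = [], []
for i in range(n_patients):
    others = np.delete(np.arange(n_patients), i)
    # shared rate estimated from the training patients (leave one out)
    rate = np.mean([np.polyfit(days, vols[j], 1)[0] for j in others])
    pred_static = np.full(n_days, vols[i, 0])             # no change over time
    pred_linear = vols[i, 0] + rate * days                # shared linear trend
    static_err.append(np.mean(np.abs(pred_static - vols[i])))
    linear_err.append(np.mean(np.abs(pred_linear - vols[i])))
print(round(float(np.mean(static_err)), 2), round(float(np.mean(linear_err)), 2))
```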
Medium- and Long-term Prediction of LOD Change by the Leap-step Autoregressive Model
NASA Astrophysics Data System (ADS)
Wang, Qijie
2015-08-01
The accuracy of medium- and long-term prediction of length-of-day (LOD) change based on the combined least-squares and autoregressive (LS+AR) model deteriorates gradually. The leap-step autoregressive (LSAR) model can significantly reduce the edge effect of the observation sequence; in particular, it greatly improves the resolution of the signal's low-frequency components, and can therefore improve prediction efficiency. In this work, LSAR is used to forecast LOD change. The LOD series from EOP 08 C04, provided by the IERS, is modeled by both the LSAR and AR models, and the results of the two models are analyzed and compared. When the prediction length is between 10 and 30 days, the accuracy improvement is less than 10%. When the prediction length exceeds 30 days, the accuracy improves markedly, with a maximum improvement of around 19%. The results show that the LSAR model has higher prediction accuracy and stability in medium- and long-term prediction.
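The autoregressive building block underlying both the LS+AR and LSAR schemes can be sketched as a least-squares fit on lagged values. The series below is a synthetic AR process, not the EOP 08 C04 LOD series, and the model order is an arbitrary choice.

```python
# Sketch of an AR model fitted by least squares on lagged values, with a
# one-step-ahead forecast. Synthetic series; order chosen arbitrarily.
import numpy as np

rng = np.random.default_rng(5)
n, order = 500, 3
x = np.zeros(n)
for t in range(1, n):                        # simulate an AR(1)-like series
    x[t] = 0.9 * x[t - 1] + rng.normal(scale=0.1)

# Lagged design matrix: column k holds x[t-1-k] for targets y[t] = x[t].
X = np.column_stack([x[order - k - 1:n - k - 1] for k in range(order)])
y = x[order:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # AR coefficients

# One-step-ahead forecast from the last `order` observations.
forecast = coef @ x[-1:-order - 1:-1]
print(round(float(forecast), 3))
```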
2014-01-01
Introduction Prolonged ventilation and failed extubation are associated with increased harm and cost. The added value of heart and respiratory rate variability (HRV and RRV) during spontaneous breathing trials (SBTs) to predict extubation failure remains unknown. Methods We enrolled 721 patients in a multicenter (12 sites), prospective, observational study, evaluating clinical estimates of risk of extubation failure, physiologic measures recorded during SBTs, HRV and RRV recorded before and during the last SBT prior to extubation, and extubation outcomes. We excluded 287 patients because of protocol or technical violations, or poor data quality. Measures of variability (97 HRV, 82 RRV) were calculated from electrocardiogram and capnography waveforms followed by automated cleaning and variability analysis using Continuous Individualized Multiorgan Variability Analysis (CIMVA™) software. Repeated randomized subsampling with training, validation, and testing were used to derive and compare predictive models. Results Of 434 patients with high-quality data, 51 (12%) failed extubation. Two HRV and eight RRV measures showed statistically significant association with extubation failure (P <0.0041, 5% false discovery rate). An ensemble average of five univariate logistic regression models using RRV during SBT, yielding a probability of extubation failure (called WAVE score), demonstrated optimal predictive capacity. With repeated random subsampling and testing, the model showed mean receiver operating characteristic area under the curve (ROC AUC) of 0.69, higher than heart rate (0.51), rapid shallow breathing index (RSBI; 0.61) and respiratory rate (0.63). After deriving a WAVE model based on all data, training-set performance demonstrated that the model increased its predictive power when applied to patients conventionally considered high risk: a WAVE score >0.5 in patients with RSBI >105 and perceived high risk of failure yielded a fold increase in risk of extubation failure of 3.0 (95% confidence interval (CI) 1.2 to 5.2) and 3.5 (95% CI 1.9 to 5.4), respectively. Conclusions Altered HRV and RRV (during the SBT prior to extubation) are significantly associated with extubation failure. A predictive model using RRV during the last SBT provided optimal accuracy of prediction in all patients, with improved accuracy when combined with clinical impression or RSBI. This model requires a validation cohort to evaluate accuracy and generalizability. Trial registration ClinicalTrials.gov NCT01237886. Registered 13 October 2010. PMID:24713049
Song, Hao; Ruan, Dan; Liu, Wenyang; Stenger, V Andrew; Pohmann, Rolf; Fernández-Seara, Maria A; Nair, Tejas; Jung, Sungkyu; Luo, Jingqin; Motai, Yuichi; Ma, Jingfei; Hazle, John D; Gach, H Michael
2017-03-01
Respiratory motion prediction using an artificial neural network (ANN) was integrated with pseudocontinuous arterial spin labeling (pCASL) MRI to allow free-breathing perfusion measurements in the kidney. In this study, we evaluated the performance of the ANN to accurately predict the location of the kidneys during image acquisition. A pencil-beam navigator was integrated with a pCASL sequence to measure lung/diaphragm motion during ANN training and the pCASL transit delay. The ANN algorithm ran concurrently in the background to predict organ location during the 0.7-s 15-slice acquisition based on the navigator data. The predictions were supplied to the pulse sequence to prospectively adjust the axial slice acquisition to match the predicted organ location. Additional navigators were acquired immediately after the multislice acquisition to assess the performance and accuracy of the ANN. The technique was tested in eight healthy volunteers. The root-mean-square error (RMSE) and mean absolute error (MAE) for the eight volunteers were 1.91 ± 0.17 mm and 1.43 ± 0.17 mm, respectively, for the ANN. The RMSE increased with transit delay. The MAE typically increased from the first to last prediction in the image acquisition. The overshoot was 23.58% ± 3.05% using the target prediction accuracy of ± 1 mm. Respiratory motion prediction with prospective motion correction was successfully demonstrated for free-breathing perfusion MRI of the kidney. The method serves as an alternative to multiple breathholds and requires minimal effort from the patient. © 2017 American Association of Physicists in Medicine.
NASA Astrophysics Data System (ADS)
Takayama, T.; Iwasaki, A.
2016-06-01
Above-ground biomass prediction of tropical rain forest using remote sensing data is of paramount importance to continuous large-area forest monitoring. Hyperspectral data can provide rich spectral information for biomass prediction; however, prediction accuracy is affected by the small-sample-size problem, which commonly manifests as overfitting when the number of training samples is smaller than the dimensionality of the data, owing to the time, cost, and human resources required for field surveys. A common approach to addressing this problem is reducing the dimensionality of the dataset. In addition, acquired hyperspectral data usually have a low signal-to-noise ratio due to narrow bandwidths, and exhibit local or global peak shifts due to instrumental instability or small differences in practical measurement conditions. In this work, we propose a methodology based on fused lasso regression that selects optimal bands for the biomass prediction model by encouraging sparsity and grouping: the sparsity addresses the small-sample-size problem through dimensionality reduction, and the grouping addresses the noise and peak-shift problems. The prediction model provided higher accuracy, with a root-mean-square error (RMSE) of 66.16 t/ha in cross-validation, than other methods: multiple linear regression, partial least squares regression, and lasso regression. Furthermore, fusing spectral information with spatial information derived from a texture index increased the prediction accuracy, with an RMSE of 62.62 t/ha. This analysis demonstrates the efficiency of fused lasso and image texture in biomass estimation of tropical forests.
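A simplified sketch of sparse band selection for biomass regression follows. Plain lasso is used as a stand-in for fused lasso (scikit-learn ships no fused lasso implementation), so only the sparsity property is illustrated, not the grouping of adjacent bands; plot counts, band counts, and effect sizes are made up.

```python
# Simplified sketch of sparse band selection in the n << p regime: lasso
# stands in for fused lasso, illustrating sparsity only. Toy spectra.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(8)
n_plots, n_bands = 40, 150                    # small-sample-size regime
X = rng.normal(size=(n_plots, n_bands))       # toy "reflectance" spectra
# biomass driven by two informative bands plus noise
biomass = 5 * X[:, 10] + 3 * X[:, 80] + rng.normal(size=n_plots)

model = Lasso(alpha=0.5).fit(X, biomass)
selected = np.flatnonzero(model.coef_)        # bands with nonzero weight
print(len(selected), 10 in selected, 80 in selected)
```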
NASA Technical Reports Server (NTRS)
Orme, John S.; Schkolnik, Gerard S.
1995-01-01
Performance Seeking Control (PSC), an onboard, adaptive, real-time optimization algorithm, relies upon an onboard propulsion system model. Flight results illustrated propulsion system performance improvements as calculated by the model. These improvements were subject to uncertainty arising from modeling error. Thus to quantify uncertainty in the PSC performance improvements, modeling accuracy must be assessed. A flight test approach to verify PSC-predicted increases in thrust (FNP) and absolute levels of fan stall margin is developed and applied to flight test data. Application of the excess thrust technique shows that increases of FNP agree to within 3 percent of full-scale measurements for most conditions. Accuracy to these levels is significant because uncertainty bands may now be applied to the performance improvements provided by PSC. Assessment of PSC fan stall margin modeling accuracy was completed with analysis of in-flight stall tests. Results indicate that the model overestimates the stall margin by between 5 to 10 percent. Because PSC achieves performance gains by using available stall margin, this overestimation may represent performance improvements to be recovered with increased modeling accuracy. Assessment of thrust and stall margin modeling accuracy provides a critical piece for a comprehensive understanding of PSC's capabilities and limitations.
Muleta, Kebede T; Bulli, Peter; Zhang, Zhiwu; Chen, Xianming; Pumphrey, Michael
2017-11-01
Harnessing diversity from germplasm collections is more feasible today because of the development of lower-cost and higher-throughput genotyping methods. However, the cost of phenotyping is still generally high, so efficient methods of sampling and exploiting useful diversity are needed. Genomic selection (GS) has the potential to enhance the use of desirable genetic variation in germplasm collections through predicting the genomic estimated breeding values (GEBVs) for all traits that have been measured. Here, we evaluated the effects of various scenarios of population genetic properties and marker density on the accuracy of GEBVs in the context of applying GS for wheat (Triticum aestivum L.) germplasm use. Empirical data for adult plant resistance to stripe rust (caused by Puccinia striiformis f. sp. tritici) collected on 1163 spring wheat accessions and genotypic data based on the wheat 9K single nucleotide polymorphism (SNP) iSelect assay were used for various genomic prediction tests. Unsurprisingly, the results of the cross-validation tests demonstrated that prediction accuracy increased with an increase in training population size and marker density. It was evident that using all the available markers (5619) was unnecessary for capturing the trait variation in the germplasm collection, with no further gain in prediction accuracy beyond 1 SNP per 3.2 cM (∼1850 markers), which is close to the linkage disequilibrium decay rate in this population. Collectively, our results suggest that larger germplasm collections may be efficiently sampled via lower-density genotyping methods, whereas genetic relationships between the training and validation populations remain critical when exploiting GS to select from germplasm collections. Copyright © 2017 Crop Science Society of America.
KANAZAWA, Tomomi; SEKI, Motohide; ISHIYAMA, Keiki; ARASEKI, Masao; IZAIKE, Yoshiaki; TAKAHASHI, Toru
2017-01-01
This study assessed the effects of gonadotropin-releasing hormone (GnRH) treatment on Day 5 (Day 0 = estrus) on luteal blood flow and accuracy of pregnancy prediction in recipient cows. On Day 5, 120 lactating Holstein cows were randomly assigned to a control group (n = 63) or GnRH group treated with 100 μg of GnRH agonist (n = 57). On Days 3, 5, 7, and 14, each cow underwent ultrasound examination to measure the blood flow area (BFA) and time-averaged maximum velocity (TAMV) at the spiral arteries at the base of the corpus luteum using color Doppler ultrasonography. Cows with a corpus luteum diameter ≥ 20 mm (n = 120) received embryo transfers on Day 7. The BFA values in the GnRH group were significantly higher than those in the control group on Days 7 and 14. TAMV did not differ between these groups. According to receiver operating characteristic analyses to predict pregnancy, a BFA cutoff of 0.52 cm2 yielded the highest sensitivity (83.3%) and specificity (90.5%) on Day 7, and BFA and TAMV values of 0.94 cm2 and 44.93 cm/s, respectively, yielded the highest sensitivity (97.1%) and specificity (100%) on Day 14 in the GnRH group. The areas under the curve for the paired BFA and TAMV in the GnRH group were 0.058 higher than those in the control group (0.996 and 0.938, respectively; P < 0.05). In conclusion, GnRH treatment on Day 5 increased the luteal BFA in recipient cows on Days 7 and 14, and improved the accuracy of pregnancy prediction on Day 14. PMID:28552886
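The receiver operating characteristic analyses used above, which pick the cutoff yielding the best sensitivity/specificity trade-off, can be sketched with a Youden-index search. The blood flow area (BFA) values, pregnancy outcomes, and group means below are simulated, not the study's Doppler measurements.

```python
# Sketch of ROC-based cutoff selection: the threshold maximizing the
# Youden index (sensitivity + specificity - 1). Simulated toy data.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(6)
pregnant = rng.random(120) < 0.5                          # outcomes
bfa = np.where(pregnant,                                  # toy BFA (cm^2)
               rng.normal(0.9, 0.2, 120),
               rng.normal(0.5, 0.2, 120))

fpr, tpr, thresholds = roc_curve(pregnant, bfa)
best = np.argmax(tpr - fpr)                               # Youden index
auc = roc_auc_score(pregnant, bfa)
print(round(float(thresholds[best]), 2), round(float(auc), 2))
```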
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Traci L.; Sharon, Keren, E-mail: tljohn@umich.edu
Until now, systematic errors in strong gravitational lens modeling have been acknowledged but have never been fully quantified. Here, we launch an investigation into the systematics induced by constraint selection. We model the simulated cluster Ares 362 times using random selections of image systems with and without spectroscopic redshifts and quantify the systematics using several diagnostics: image predictability, accuracy of model-predicted redshifts, enclosed mass, and magnification. We find that for models with >15 image systems, the image plane rms does not decrease significantly when more systems are added; however, the rms values quoted in the literature may be misleading as to the ability of a model to predict new multiple images. The mass is well constrained near the Einstein radius in all cases, and systematic error drops to <2% for models using >10 image systems. Magnification errors are smallest along the straight portions of the critical curve, and the value of the magnification is systematically lower near curved portions. For >15 systems, the systematic error on magnification is ∼2%. We report no trend in magnification error with the fraction of spectroscopic image systems when selecting constraints at random; however, when using the same selection of constraints, increasing this fraction up to ∼0.5 will increase model accuracy. The results suggest that the selection of constraints, rather than quantity alone, determines the accuracy of the magnification. We note that spectroscopic follow-up of at least a few image systems is crucial because models without any spectroscopic redshifts are inaccurate across all of our diagnostics.
Krendl, Anne C; Rule, Nicholas O; Ambady, Nalini
2014-09-01
Young adults can be surprisingly accurate at making inferences about people from their faces. Although these first impressions have important consequences for both the perceiver and the target, it remains an open question whether first impression accuracy is preserved with age. Specifically, could age differences in impressions toward others stem from age-related deficits in accurately detecting complex social cues? Research on aging and impression formation suggests that young and older adults show relative consensus in their first impressions, but it is unknown whether they differ in accuracy. It has been widely shown that aging disrupts emotion recognition accuracy, and that these impairments may predict deficits in other social judgments, such as detecting deceit. However, it is unclear whether general impression formation accuracy (e.g., emotion recognition accuracy, detecting complex social cues) relies on similar or distinct mechanisms. It is important to examine this question to evaluate how, if at all, aging might affect overall accuracy. Here, we examined whether aging impaired first impression accuracy in predicting real-world outcomes and categorizing social group membership. Specifically, we studied whether emotion recognition accuracy and age-related cognitive decline (which has been implicated in exacerbating deficits in emotion recognition) predict first impression accuracy. Our results revealed that emotion recognition accuracy did not predict first impression accuracy, nor did age-related cognitive decline impair it. These findings suggest that domains of social perception outside of emotion recognition may rely on mechanisms that are relatively unimpaired by aging. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Training set selection for the prediction of essential genes.
Cheng, Jian; Xu, Zhao; Wu, Wenwu; Zhao, Li; Li, Xiangchen; Liu, Yanlin; Tao, Shiheng
2014-01-01
Various computational models have been developed to transfer annotations of gene essentiality between organisms. However, despite the increasing number of microorganisms with well-characterized sets of essential genes, selection of appropriate training sets for predicting the essential genes of poorly-studied or newly sequenced organisms remains challenging. In this study, a machine learning approach was applied reciprocally to predict the essential genes in 21 microorganisms. Results showed that training set selection greatly influenced predictive accuracy. We determined four criteria for training set selection: (1) essential genes in the selected training set should be reliable; (2) the growth conditions in which essential genes are defined should be consistent in training and prediction sets; (3) species used as training set should be closely related to the target organism; and (4) organisms used as training and prediction sets should exhibit similar phenotypes or lifestyles. We then analyzed the performance of an incomplete training set and an integrated training set with multiple organisms. We found that the size of the training set should be at least 10% of the total genes to yield accurate predictions. Additionally, the integrated training sets exhibited a remarkable increase in stability and accuracy compared with single sets. Finally, we compared the performance of the integrated training sets with the four criteria and with random selection. The results revealed that a rational selection of training sets based on our criteria yields better performance than random selection. Thus, our results provide empirical guidance on training set selection for the identification of essential genes on a genome-wide scale.
Francesconi, M; Minichino, A; Carrión, R E; Delle Chiaie, R; Bevilacqua, A; Parisi, M; Rullo, S; Bersani, F Saverio; Biondi, M; Cadenhead, K
2017-02-01
Accuracy of risk algorithms for psychosis prediction in "at risk mental state" (ARMS) samples may differ according to the recruitment setting. Standardized criteria used to detect ARMS individuals may lack specificity if the recruitment setting is a secondary mental health service. The authors tested a modified strategy to predict psychosis conversion in this setting by using a systematic selection of trait-markers of the psychosis prodrome in a sample with a heterogeneous ARMS status. 138 non-psychotic outpatients (aged 17-31) were consecutively recruited in secondary mental health services and followed-up for up to 3 years (mean follow-up time, 2.2 years; SD=0.9). Baseline ARMS status, clinical, demographic, cognitive, and neurological soft signs measures were collected. Cox regression was used to derive a risk index. 48% of individuals met ARMS criteria (ARMS-Positive, ARMS+). Conversion rate to psychosis was 21% for the overall sample, 34% for ARMS+, and 9% for ARMS-Negative (ARMS-). The final predictor model with a positive predictive validity of 80% consisted of four variables: Disorder of Thought Content, visuospatial/constructional deficits, sensory-integration, and theory-of-mind abnormalities. Removing Disorder of Thought Content from the model only slightly modified the predictive accuracy (-6.2%), but increased the sensitivity (+9.5%). These results suggest that in a secondary mental health setting the use of trait-markers of the psychosis prodrome may predict psychosis conversion with great accuracy despite the heterogeneity of the ARMS status. The use of the proposed predictive algorithm may enable a selective recruitment, potentially reducing duration of untreated psychosis and improving prognostic outcomes. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Zhang, Huiling; Huang, Qingsheng; Bei, Zhendong; Wei, Yanjie; Floudas, Christodoulos A
2016-03-01
In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on the contact definition of 14 Å between Cα-Cα atoms. First, using a rigorous leave-one-protein-out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and a MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM protein increase. COMSAT is freely accessible at http://hpcc.siat.ac.cn/COMSAT/. © 2016 Wiley Periodicals, Inc.
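The accuracy, coverage, specificity, and MCC figures quoted for COMSAT are standard confusion-matrix summaries. A minimal sketch of how they are computed from true/false positive and negative counts; the function name and the example counts below are illustrative assumptions, not COMSAT's code or data:

```python
import math

def contact_metrics(tp, fp, tn, fn):
    """Confusion-matrix summaries as used to score residue contact predictions.

    In the contact-prediction convention, "accuracy" is the fraction of
    predicted contacts that are true (precision), and "coverage" is the
    fraction of true contacts recovered (recall).
    """
    accuracy = tp / (tp + fp)      # precision over predicted contacts
    coverage = tp / (tp + fn)      # recall over true contacts
    specificity = tn / (tn + fp)   # true-negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, coverage, specificity, mcc

# hypothetical counts for one protein's predicted contact map
acc, cov, spec, mcc = contact_metrics(tp=40, fp=20, tn=900, fn=40)
```

The low coverage / high specificity pattern reported in the abstract is typical: contact maps are dominated by non-contacting pairs, so even a conservative predictor scores high specificity.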
Can multi-subpopulation reference sets improve the genomic predictive ability for pigs?
Fangmann, A; Bergfelder-Drüing, S; Tholen, E; Simianer, H; Erbe, M
2015-12-01
In most countries and for most livestock species, genomic evaluations are obtained from within-breed analyses. To achieve reliable breeding values, however, a sufficient reference sample size is essential. To increase this size, the use of multibreed reference populations for small populations is considered a suitable option in other species. Over decades, the separate breeding work of different pig breeding organizations in Germany has led to stratified subpopulations in the breed German Large White. Due to this fact and the limited number of Large White animals available in each organization, there was a pressing need for ascertaining if multi-subpopulation genomic prediction is superior compared with within-subpopulation prediction in pigs. Direct genomic breeding values were estimated with genomic BLUP for the trait "number of piglets born alive" using genotype data (Illumina Porcine 60K SNP BeadChip) from 2,053 German Large White animals from five different commercial pig breeding companies. To assess the prediction accuracy of within- and multi-subpopulation reference sets, a random 5-fold cross-validation with 20 replications was performed. The five subpopulations considered were only slightly differentiated from each other. However, the prediction accuracy of the multi-subpopulations approach was not better than that of the within-subpopulation evaluation, for which the predictive ability was already high. Reference sets composed of closely related multi-subpopulation sets performed better than sets of distantly related subpopulations but not better than the within-subpopulation approach. Despite the low differentiation of the five subpopulations, the genetic connectedness between these different subpopulations seems to be too small to improve the prediction accuracy by applying multi-subpopulation reference sets. Consequently, resources should be used for enlarging the reference population within subpopulation, for example, by adding genotyped females.
Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Akgöl, Batuhan; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann
2017-02-01
Early generation genomic selection is superior to conventional phenotypic selection in line breeding and can be strongly improved by including additional information from preliminary yield trials. The selection of lines that enter resource-demanding multi-environment trials is a crucial decision in every line breeding program as a large amount of resources are allocated for thoroughly testing these potential varietal candidates. We compared conventional phenotypic selection with various genomic selection approaches across multiple years as well as the merit of integrating phenotypic information from preliminary yield trials into the genomic selection framework. The prediction accuracy using only phenotypic data was rather low (r = 0.21) for grain yield but could be improved by modeling genetic relationships in unreplicated preliminary yield trials (r = 0.33). Genomic selection models were nevertheless found to be superior to conventional phenotypic selection for predicting grain yield performance of lines across years (r = 0.39). We subsequently simplified the problem of predicting untested lines in untested years to predicting tested lines in untested years by combining breeding values from preliminary yield trials and predictions from genomic selection models by a heritability index. This genomic assisted selection led to a 20% increase in prediction accuracy, which could be further enhanced by an appropriate marker selection for both grain yield (r = 0.48) and protein content (r = 0.63). The easy to implement and robust genomic assisted selection thus gave a higher prediction accuracy than either conventional phenotypic or genomic selection alone. The proposed method took the complex inheritance of both low and high heritable traits into account and appears capable of supporting breeders in their selection decisions to develop enhanced varieties more efficiently.
Technow, Frank; Schrag, Tobias A; Schipprack, Wolfgang; Bauer, Eva; Simianer, Henner; Melchinger, Albrecht E
2014-08-01
Maize (Zea mays L.) serves as model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill-Robertson effect and support the hypothesis of differential fixation of alleles due to pseudo-overdominance in these regions. In pericentromeric regions we also found indications for consistent marker-QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest, if both parents were parents of other hybrids in the training set, and lowest, if none of them were involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in test crosses. Copyright © 2014 by the Genetics Society of America.
Kaiju, Taro; Doi, Keiichi; Yokota, Masashi; Watanabe, Kei; Inoue, Masato; Ando, Hiroshi; Takahashi, Kazutaka; Yoshida, Fumiaki; Hirata, Masayuki; Suzuki, Takafumi
2017-01-01
Electrocorticogram (ECoG) has great potential as a source signal, especially for clinical BMI. Until recently, ECoG electrodes were commonly used for identifying epileptogenic foci in clinical situations, and such electrodes were low-density and large. Increasing the number and density of recording channels could enable the collection of richer motor/sensory information, and may enhance the precision of decoding and increase opportunities for controlling external devices. Several reports have aimed to increase the number and density of channels. However, few studies have discussed the actual validity of high-density ECoG arrays. In this study, we developed novel high-density flexible ECoG arrays and conducted decoding analyses with monkey somatosensory evoked potentials (SEPs). Using MEMS technology, we made 96-channel Parylene electrode arrays with an inter-electrode distance of 700 μm and recording site area of 350 μm2. The arrays were mainly placed onto the finger representation area in the somatosensory cortex of the macaque, and partially inserted into the central sulcus. With electrical finger stimulation, we successfully recorded and visualized finger SEPs with a high spatiotemporal resolution. We conducted offline analyses in which the stimulated fingers and intensity were predicted from recorded SEPs using a support vector machine. We obtained the following results: (1) Very high accuracy (~98%) was achieved with just a short segment of data (~15 ms from stimulus onset). (2) High accuracy (~96%) was achieved even when only a single channel was used. This result indicated placement optimality for decoding. (3) Higher channel counts generally improved prediction accuracy, but the efficacy was small for predictions with feature vectors that included time-series information. These results suggest that ECoG signals with high spatiotemporal resolution could enable greater decoding precision or external device control. PMID:28442997
Matuszewski, Szymon; Frątczak-Łagiewska, Katarzyna
2018-02-05
Insects colonizing human or animal cadavers may be used to estimate post-mortem interval (PMI) usually by aging larvae or pupae sampled on a crime scene. The accuracy of insect age estimates in a forensic context is reduced by large intraspecific variation in insect development time. Here we test the concept that insect size at emergence may be used to predict insect physiological age and accordingly to improve the accuracy of age estimates in forensic entomology. Using results of laboratory study on development of forensically-useful beetle Creophilus maxillosus (Linnaeus, 1758) (Staphylinidae) we demonstrate that its physiological age at emergence [i.e. thermal summation value (K) needed for emergence] fall with an increase of beetle size. In the validation study it was found that K estimated based on the adult insect size was significantly closer to the true K as compared to K from the general thermal summation model. Using beetle length at emergence as a predictor variable and male or female specific model regressing K against beetle length gave the most accurate predictions of age. These results demonstrate that size of C. maxillosus at emergence improves accuracy of age estimates in a forensic context.
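The thermal summation value K in the abstract is an accumulated degree-day total above a base temperature, and the proposed refinement replaces the general K with one regressed on adult body length. A minimal sketch, assuming a simple degree-day model; the regression coefficients below are illustrative placeholders, not the published C. maxillosus parameters:

```python
def thermal_summation(temps_c, base_temp_c, dt_days=1.0):
    """Accumulated degree-days above the base temperature (thermal summation K).

    temps_c: sequence of mean temperatures, one per interval of dt_days.
    """
    return sum(max(t - base_temp_c, 0.0) * dt_days for t in temps_c)

def size_adjusted_k(length_mm, intercept, slope):
    """K predicted from adult body length via a linear model K = a + b * length.

    Coefficients are hypothetical here; the study fits sex-specific models.
    """
    return intercept + slope * length_mm

# four days of development at a base temperature of 10 °C (illustrative)
k = thermal_summation([20.0, 22.0, 25.0, 18.0], base_temp_c=10.0)
# illustrative negative slope: larger beetles emerge at lower K
k_adj = size_adjusted_k(20.0, intercept=400.0, slope=-5.0)
```

The negative slope in the sketch mirrors the abstract's finding that K at emergence falls as beetle size increases.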
Hwang, Yoo Na; Lee, Ju Hwan; Kim, Ga Young; Jiang, Yuan Yuan; Kim, Sung Min
2015-01-01
This paper focuses on the improvement of the diagnostic accuracy of focal liver lesions by quantifying the key features of cysts, hemangiomas, and malignant lesions on ultrasound images. The focal liver lesions were divided into 29 cysts, 37 hemangiomas, and 33 malignancies. A total of 42 hybrid textural features composed of 5 first-order statistics, 18 gray level co-occurrence matrices, 18 Law's, and echogenicity were extracted. A total of 29 key features that were selected by principal component analysis were used as a set of inputs for a feed-forward neural network. For each lesion, the performance of the diagnosis was evaluated by using the positive predictive value, negative predictive value, sensitivity, specificity, and accuracy. The results of the experiment indicate that the proposed method exhibits great performance, with a high diagnostic accuracy of over 96% among all focal liver lesion groups (cyst vs. hemangioma, cyst vs. malignant, and hemangioma vs. malignant) on ultrasound images. The accuracy was slightly increased when echogenicity was included in the optimal feature set. These results indicate that it is possible for the proposed method to be applied clinically.
Posterior Predictive Checks for Conditional Independence between Response Time and Accuracy
ERIC Educational Resources Information Center
Bolsinova, Maria; Tijmstra, Jesper
2016-01-01
Conditional independence (CI) between response time and response accuracy is a fundamental assumption of many joint models for time and accuracy used in educational measurement. In this study, posterior predictive checks (PPCs) are proposed for testing this assumption. These PPCs are based on three discrepancy measures reflecting different…
The microcomputer scientific software series 4: testing prediction accuracy.
H. Michael Rauscher
1986-01-01
A computer program, ATEST, is described in this combination user's guide / programmer's manual. ATEST provides users with an efficient and convenient tool to test the accuracy of predictors. As input ATEST requires observed-predicted data pairs. The output reports the two components of accuracy, bias and precision.
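ATEST itself is not reproduced here, but its accuracy decomposition can be sketched from observed-predicted pairs: bias as the mean error and precision as the spread of the errors. This is a hedged reimplementation of the reported idea, not the original program:

```python
import statistics

def accuracy_components(observed, predicted):
    """Decompose predictor accuracy into bias and precision, as ATEST reports.

    bias: mean signed error (systematic over- or under-prediction).
    precision: sample standard deviation of the errors (scatter around the bias).
    """
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = statistics.mean(errors)
    precision = statistics.stdev(errors)
    return bias, precision

# illustrative observed-predicted data pairs
bias, prec = accuracy_components([10, 12, 14, 16], [11, 12, 15, 18])
# errors = [1, 0, 1, 2]: the predictor runs high by 1.0 on average
```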
Neurocognitive and Behavioral Predictors of Math Performance in Children with and without ADHD
Antonini, Tanya N.; O’Brien, Kathleen M.; Narad, Megan E.; Langberg, Joshua M.; Tamm, Leanne; Epstein, Jeff N.
2014-01-01
Objective: This study examined neurocognitive and behavioral predictors of math performance in children with and without attention-deficit/hyperactivity disorder (ADHD). Method: Neurocognitive and behavioral variables were examined as predictors of (1) standardized mathematics achievement scores, (2) productivity on an analog math task, and (3) accuracy on an analog math task. Results: Children with ADHD had lower achievement scores but did not significantly differ from controls on math productivity or accuracy. N-back accuracy and parent-rated attention predicted math achievement. N-back accuracy and observed attention predicted math productivity. Alerting scores on the Attentional Network Task predicted math accuracy. Mediation analyses indicated that n-back accuracy significantly mediated the relationship between diagnostic group and math achievement. Conclusion: Neurocognition, rather than behavior, may account for the deficits in math achievement exhibited by many children with ADHD. PMID:24071774
Neurocognitive and Behavioral Predictors of Math Performance in Children With and Without ADHD.
Antonini, Tanya N; Kingery, Kathleen M; Narad, Megan E; Langberg, Joshua M; Tamm, Leanne; Epstein, Jeffery N
2016-02-01
This study examined neurocognitive and behavioral predictors of math performance in children with and without ADHD. Neurocognitive and behavioral variables were examined as predictors of (a) standardized mathematics achievement scores, (b) productivity on an analog math task, and (c) accuracy on an analog math task. Children with ADHD had lower achievement scores but did not significantly differ from controls on math productivity or accuracy. N-back accuracy and parent-rated attention predicted math achievement. N-back accuracy and observed attention predicted math productivity. Alerting scores on the attentional network task predicted math accuracy. Mediation analyses indicated that n-back accuracy significantly mediated the relationship between diagnostic group and math achievement. Neurocognition, rather than behavior, may account for the deficits in math achievement exhibited by many children with ADHD. © The Author(s) 2013.
Gaussian Processes for Prediction of Homing Pigeon Flight Trajectories
NASA Astrophysics Data System (ADS)
Mann, Richard; Freeman, Robin; Osborne, Michael; Garnett, Roman; Meade, Jessica; Armstrong, Chris; Biro, Dora; Guilford, Tim; Roberts, Stephen
2009-12-01
We construct and apply a stochastic Gaussian Process (GP) model of flight trajectory generation for pigeons trained to home from specific release sites. The model shows increasing predictive power as the birds become familiar with the sites, mirroring the animal's learning process. We show how the increasing similarity between successive flight trajectories can be used to infer, with increasing accuracy, an idealised route that captures the repeated spatial aspects of the bird's flight. We subsequently use techniques associated with reduced-rank GP approximations to objectively identify the key waypoints used by each bird to memorise its idiosyncratic habitual route between the release site and the home loft.
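A minimal GP regression sketch in NumPy with a squared-exponential covariance illustrates the posterior-mean prediction underlying such trajectory models; it is a generic one-dimensional example under assumed hyperparameters, not the paper's reduced-rank formulation or its actual flight data:

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-3):
    """Posterior mean of a zero-mean GP with observation noise at x_test."""
    K = sq_exp_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = sq_exp_kernel(x_test, x_train)
    return K_s @ np.linalg.solve(K, y_train)

# toy "trajectory": lateral position as a smooth function of distance flown
x = np.linspace(0.0, 5.0, 20)
y = np.sin(x)
mean = gp_predict(x, y, np.array([2.5]))
```

As successive flights accumulate, refitting on more trajectories tightens the posterior around the habitual route, which is the learning effect the paper exploits.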
Artificial neural network prediction of ischemic tissue fate in acute stroke imaging
Huang, Shiliang; Shen, Qiang; Duong, Timothy Q
2010-01-01
Multimodal magnetic resonance imaging of acute stroke provides predictive value that can be used to guide stroke therapy. A flexible artificial neural network (ANN) algorithm was developed and applied to predict ischemic tissue fate on three stroke groups: 30-, 60-minute, and permanent middle cerebral artery occlusion in rats. Cerebral blood flow (CBF), apparent diffusion coefficient (ADC), and spin–spin relaxation time constant (T2) were acquired during the acute phase up to 3 hours and again at 24 hours followed by histology. Infarct was predicted on a pixel-by-pixel basis using only acute (30-minute) stroke data. In addition, neighboring pixel information and infarction incidence were also incorporated into the ANN model to improve prediction accuracy. Receiver-operating characteristic analysis was used to quantify prediction accuracy. The major findings were the following: (1) CBF alone poorly predicted the final infarct across three experimental groups; (2) ADC alone adequately predicted the infarct; (3) CBF+ADC improved the prediction accuracy; (4) inclusion of neighboring pixel information and infarction incidence further improved the prediction accuracy; and (5) prediction was more accurate for permanent occlusion, followed by 60- and 30-minute occlusion. The ANN predictive model could thus provide a flexible and objective framework for clinicians to evaluate stroke treatment options on an individual patient basis. PMID:20424631
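The receiver-operating characteristic analysis used above reduces, for a scalar predictor, to the probability that a random infarcted pixel scores higher than a random non-infarcted one (the Mann-Whitney interpretation of AUC). A small rank-based sketch with illustrative scores, not the study's pixel data:

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Counts the fraction of positive/negative score pairs in which the
    positive outranks the negative; ties contribute one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# hypothetical model outputs for infarcted vs. spared pixels
auc = roc_auc([0.9, 0.8, 0.6], [0.7, 0.4, 0.3])
```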
Can stability really predict an impending slip-related fall among older adults?
Yang, Feng; Pai, Yi-Chung
2015-01-01
The primary purpose of this study was to systematically evaluate and compare the predictive power of falls for a battery of stability indices, obtained during normal walking among community-dwelling older adults. One hundred and eighty-seven community-dwelling older adults participated in the study. After walking regularly for 20 strides on a walkway, participants were subjected to an unannounced slip during gait under the protection of a safety harness. Full body kinematics and kinetics were monitored during walking using a motion capture system synchronized with force plates. Stability variables, including feasible-stability-region measurement, margin of stability, the maximum Floquet multiplier, the Lyapunov exponents (short- and long-term), and the variability of gait parameters (including the step length, step width, and step time) were calculated for each subject. Accuracy of predicting slip outcome (fall vs. recovery) was examined for each stability variable using logistic regression. Results showed that the feasible-stability-region measurement predicted fall incidence among these subjects with the highest accuracy (68.4%). Except for the step width (with an accuracy of 60.2%), no other stability variables could differentiate fallers from those who did not fall for the sample studied. The findings from the present study could provide guidance to identify individuals at increased risk of falling using the feasible-stability-region measurement or variability of the step width. PMID:25458148
Biggerstaff, Matthew; Alper, David; Dredze, Mark; Fox, Spencer; Fung, Isaac Chun-Hai; Hickmann, Kyle S; Lewis, Bryan; Rosenfeld, Roni; Shaman, Jeffrey; Tsou, Ming-Hsiang; Velardi, Paola; Vespignani, Alessandro; Finelli, Lyn
2016-07-22
Early insights into the timing of the start, peak, and intensity of the influenza season could be useful in planning influenza prevention and control activities. To encourage development and innovation in influenza forecasting, the Centers for Disease Control and Prevention (CDC) organized a challenge to predict the 2013-14 United States influenza season. Challenge contestants were asked to forecast the start, peak, and intensity of the 2013-2014 influenza season at the national level and at any or all Health and Human Services (HHS) region level(s). The challenge ran from December 1, 2013-March 27, 2014; contestants were required to submit 9 biweekly forecasts at the national level to be eligible. The selection of the winner was based on expert evaluation of the methodology used to make the prediction and the accuracy of the prediction as judged against the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet). Nine teams submitted 13 forecasts for all required milestones. The first forecast was due on December 2, 2013; 3/13 forecasts received correctly predicted the start of the influenza season within one week, 1/13 predicted the peak within 1 week, 3/13 predicted the peak ILINet percentage within 1%, and 4/13 predicted the season duration within 1 week. For the prediction due on December 19, 2013, the number of forecasts that correctly forecasted the peak week increased to 2/13, the peak percentage to 6/13, and the duration of the season to 6/13. As the season progressed, the forecasts became more stable and were closer to the season milestones. Forecasting has become technically feasible, but further efforts are needed to improve forecast accuracy so that policy makers can reliably use these predictions. CDC and challenge contestants plan to build upon the methods developed during this contest to improve the accuracy of influenza forecasts.
Affective processes in human-automation interactions.
Merritt, Stephanie M
2011-08-01
This study contributes to the literature on automation reliance by illuminating the influences of user moods and emotions on reliance on automated systems. Past work has focused predominantly on cognitive and attitudinal variables, such as perceived machine reliability and trust. However, recent work on human decision making suggests that affective variables (i.e., moods and emotions) are also important. Drawing from the affect infusion model, significant effects of affect are hypothesized. Furthermore, a new affectively laden attitude termed liking is introduced. Participants watched video clips selected to induce positive or negative moods, then interacted with a fictitious automated system on an X-ray screening task. At five time points, important variables were assessed including trust, liking, perceived machine accuracy, user self-perceived accuracy, and reliance. These variables, along with propensity to trust machines and state affect, were integrated in a structural equation model. Happiness significantly increased trust and liking for the system throughout the task. Liking was the only variable that significantly predicted reliance early in the task. Trust predicted reliance later in the task, whereas perceived machine accuracy and user self-perceived accuracy had no significant direct effects on reliance at any time. Affective influences on automation reliance are demonstrated, suggesting that this decision-making process may be less rational and more emotional than previously acknowledged. Liking for a new system may be key to appropriate reliance, particularly early in the task. Positive affect can be easily induced and may be a lever for increasing liking.
Cross-validation of recent and longstanding resting metabolic rate prediction equations
USDA-ARS?s Scientific Manuscript database
Resting metabolic rate (RMR) measurement is time consuming and requires specialized equipment. Prediction equations provide an easy method to estimate RMR; however, their accuracy likely varies across individuals. Understanding the factors that influence predicted RMR accuracy at the individual lev...
An alternative covariance estimator to investigate genetic heterogeneity in populations.
Heslot, Nicolas; Jannink, Jean-Luc
2015-11-26
For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. However, some empirical studies showed that increasing training population size decreased prediction accuracy. Recently, results from theoretical models indicated that even if marker density is high and the genetic architecture of traits is controlled by many loci with small additive effects, the covariance between individuals, which depends on relationships at causal loci, is not always well estimated by the whole-genome kinship. We propose an alternative covariance estimator named K-kernel, to account for potential genetic heterogeneity between populations that is characterized by a lack of genetic correlation, and to limit the information flow between a priori unknown populations in a trait-specific manner. This is similar to a multi-trait model and parameters are estimated by REML and, in extreme cases, it can allow for an independent genetic architecture between populations. As such, K-kernel is useful to study the problem of the design of training populations. K-kernel was compared to other covariance estimators or kernels to examine its fit to the data, cross-validated accuracy and suitability for GWAS on several datasets. It provides a significantly better fit to the data than the genomic best linear unbiased prediction model and, in some cases it performs better than other kernels such as the Gaussian kernel, as shown by an empirical null distribution. In GWAS simulations, alternative kernels control type I errors as well as or better than the classical whole-genome kinship and increase statistical power. No or small gains were observed in cross-validated prediction accuracy. 
This alternative covariance estimator can be used to gain insight into trait-specific genetic heterogeneity by identifying relevant sub-populations that lack genetic correlation between them. Because the genetic correlation between identified sub-populations can be 0, the method performs an automatic selection of the relevant sets of individuals to include in the training population. It may also increase statistical power in GWAS.
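The linear (GBLUP) and Gaussian kernels contrasted in several of the genomic-prediction records above can be written down compactly. The following is a minimal sketch, not any paper's implementation: it assumes a column-centered marker matrix, a VanRaden-style linear relationship matrix, and a Gaussian kernel with a median-scaled bandwidth; the toy genotype matrix and the bandwidth value `h` are illustrative assumptions.

```python
import numpy as np

def linear_kernel(X):
    """GBLUP-style (VanRaden-like) genomic relationship matrix K = Xc Xc' / m,
    where Xc is the column-centered marker matrix (n individuals x m markers)."""
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T / X.shape[1]

def gaussian_kernel(X, h=1.0):
    """Gaussian kernel K_ij = exp(-h * d_ij^2 / q), with d_ij the Euclidean
    marker distance between individuals and q the median squared distance
    (a common bandwidth normalization; the choice of h is an assumption)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    q = np.median(sq[np.triu_indices_from(sq, k=1)])
    return np.exp(-h * sq / q)

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(6, 50)).astype(float)  # toy 0/1/2 genotype codes
K_lin = linear_kernel(X)
K_gauss = gaussian_kernel(X)
```

Both constructions return symmetric n×n matrices that can be plugged into the same mixed-model machinery; the Gaussian kernel differs only in measuring similarity nonlinearly through marker distances.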
The Ship Movement Trajectory Prediction Algorithm Using Navigational Data Fusion.
Borkowski, Piotr
2017-06-20
It is essential for the marine navigator conducting maneuvers of his ship at sea to know future positions of himself and target ships in a specific time span to effectively solve collision situations. This article presents an algorithm of ship movement trajectory prediction, which, through data fusion, takes into account measurements of the ship's current position from a number of doubled autonomous devices. This increases the reliability and accuracy of prediction. The algorithm has been implemented in NAVDEC, a navigation decision support system and practically used on board ships.
Sixty-five years of the long march in protein secondary structure prediction: the final stretch?
Yang, Yuedong; Gao, Jianzhao; Wang, Jihua; Heffernan, Rhys; Hanson, Jack; Paliwal, Kuldip; Zhou, Yaoqi
2018-01-01
Protein secondary structure prediction began in 1951 when Pauling and Corey predicted helical and sheet conformations for the protein polypeptide backbone even before the first protein structure was determined. Sixty-five years later, powerful new methods breathe new life into this field. The highest three-state accuracy without relying on structure templates is now at 82–84%, a number unthinkable just a few years ago. These improvements came from increasingly larger databases of protein sequences and structures for training, the use of template secondary structure information and more powerful deep learning techniques. As we approach the theoretical limit of three-state prediction (88–90%), alternatives to secondary structure prediction (prediction of backbone torsion angles and Cα-atom-based angles and torsion angles) not only have more room for further improvement but also allow direct prediction of three-dimensional fragment structures with constantly improved accuracy. About 20% of all 40-residue fragments in a database of 1199 non-redundant proteins have <6 Å root-mean-squared distance from the native conformations as predicted by SPIDER2. More powerful deep learning methods with improved capability of capturing long-range interactions are beginning to emerge as the next generation of techniques for secondary structure prediction. The time has come to finish off the final stretch of the long march towards protein secondary structure prediction. PMID:28040746
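The three-state accuracy figures quoted above (82–84%, theoretical limit 88–90%) are Q3 scores: the fraction of residues whose predicted helix/sheet/coil state matches the observed state. A minimal sketch, with made-up toy sequences:

```python
def q3_accuracy(predicted, observed):
    """Q3: fraction of residues whose predicted three-state secondary
    structure (H = helix, E = sheet, C = coil) matches the observed state."""
    assert len(predicted) == len(observed)
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

pred = "HHHHEEECCC"  # hypothetical prediction
obs  = "HHHEEEECCC"  # hypothetical observed states
q3 = q3_accuracy(pred, obs)  # 9 of 10 residues match -> Q3 = 0.9
```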
Kimura, Kenta; Kimura, Motohiro
2016-09-28
The evaluative processing of the valence of action feedback is reflected by an event-related brain potential component called feedback-related negativity (FRN) or reward positivity (RewP). Recent studies have shown that FRN/RewP is markedly reduced when the action-feedback interval is long (e.g. 6000 ms), indicating that an increase in the action-feedback interval can undermine the evaluative processing of the valence of action feedback. The aim of the present study was to investigate whether or not such undermined evaluative processing of delayed action feedback could be restored by improving the accuracy of the prediction in terms of the timing of action feedback. With a typical gambling task in which the participant chose one of two cards and received an action feedback indicating monetary gain or loss, the present study showed that FRN/RewP was significantly elicited even when the action-feedback interval was 6000 ms, when an auditory stimulus sequence was additionally presented during the action-feedback interval as a temporal cue. This result suggests that the undermined evaluative processing of delayed action feedback can be restored by increasing the accuracy of the prediction on the timing of the action feedback.
Powerful Voter Selection for Making Multistep Delegate Ballot Fair
NASA Astrophysics Data System (ADS)
Yamakawa, Hiroshi
In decision-making by majority vote, voters often exercise their right by delegating it to other, trusted voters. A multi-step delegation rule allows indirect delegation through more than one voter, which helps each voter find suitable delegates. In this paper, we propose a powerful-voter selection method based on the multi-step delegation rule. The method sequentially selects the voters to whom the most votes are delegated, directly or indirectly. Multi-agent simulations demonstrate that the proposed method achieves highly fair poll results from a small number of votes, where fairness is the accuracy with which the poll predicts the sum of all voters' preferences over the choices. In the simulations, each voter selects among choices arranged on a one-dimensional preference axis. Acquaintance relationships among voters were generated as a random network, and each voter delegates to acquaintances with similar preferences. We averaged simulation results over various acquaintance networks. First, if each voter has enough acquaintances on average, the proposed method helps predict the sum of all voters' preferences from a small number of votes. Second, if the number of each voter's acquaintances grows with the number of voters, the prediction accuracy (fairness) achievable from a small number of votes can be maintained at an appropriate level.
Siegelaar, Sarah E; Barwari, Temo; Hermanides, Jeroen; van der Voort, Peter H J; Hoekstra, Joost B L; DeVries, J Hans
2013-11-01
Continuous glucose monitoring could be helpful for glucose regulation in critically ill patients; however, its accuracy is uncertain and might be influenced by microcirculation. We investigated the microcirculation and its relation to the accuracy of 2 continuous glucose monitoring devices in patients after cardiac surgery. The present prospective, observational study included 60 patients admitted for cardiac surgery. Two continuous glucose monitoring devices (Guardian Real-Time and FreeStyle Navigator) were placed before surgery. The relative absolute deviation between continuous glucose monitoring and the arterial reference glucose was calculated to assess the accuracy. Microcirculation was measured using the microvascular flow index, perfused vessel density, and proportion of perfused vessels using sublingual sidestream dark-field imaging, and tissue oxygenation using near-infrared spectroscopy. The associations were assessed using a linear mixed-effects model for repeated measures. The median relative absolute deviation of the Navigator was 11% (interquartile range, 8%-16%) and of the Guardian was 14% (interquartile range, 11%-18%; P = .05). Tissue oxygenation significantly increased during the intensive care unit admission (maximum 91.2% [3.9] after 6 hours) and decreased thereafter, stabilizing after 20 hours. A decrease in perfused vessel density accompanied the increase in tissue oxygenation. Microcirculatory variables were not associated with sensor accuracy. A lower peripheral temperature (Navigator, b = -0.008, P = .003; Guardian, b = -0.006, P = .048), and for the Navigator, also a higher Acute Physiology and Chronic Health Evaluation IV predicted mortality (b = 0.017, P < .001) and age (b = 0.002, P = .037) were associated with decreased sensor accuracy. The results of the present study have shown acceptable accuracy for both sensors in patients after cardiac surgery. 
The microcirculation was impaired to a limited extent compared with that in patients with sepsis and in healthy controls. This impairment was not related to sensor accuracy; however, peripheral temperature (for both sensors) and, for the Navigator, patient age and Acute Physiology and Chronic Health Evaluation IV predicted mortality were. Copyright © 2013 The American Association for Thoracic Surgery. Published by Mosby, Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Fu, Xiao Lei; Jin, Bao Ming; Jiang, Xiao Lei; Chen, Cheng
2018-06-01
Data assimilation is an efficient way to improve simulation/prediction accuracy in many fields of geoscience, especially in meteorological and hydrological applications. This study takes the unscented particle filter (UPF) as an example and tests its performance under two probability distributions for the observation error, Gaussian and uniform, in two experiments with different assimilation frequencies: (1) assimilating hourly in situ soil surface temperature, and (2) assimilating the original Moderate Resolution Imaging Spectroradiometer (MODIS) Land Surface Temperature (LST) once per day. The numerical experiments show that the filter performs better when the assimilation frequency is increased. In addition, the UPF is efficient for improving the simulation/prediction accuracy of soil variables (e.g., soil temperature), although it is not sensitive to the assumed probability distribution of the observation error in soil temperature assimilation.
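The record above uses an unscented particle filter; as a rough illustration of the assimilation idea only, here is a plain bootstrap particle filter for a scalar random-walk state with Gaussian observation error. All numbers (particle count, noise levels, the constant "truth") are illustrative assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(42)

def particle_filter(observations, n_particles=500, proc_sd=0.5, obs_sd=1.0):
    """Bootstrap particle filter for a scalar random-walk state:
    propagate, weight by a Gaussian likelihood, resample, estimate."""
    particles = rng.normal(observations[0], 5.0, n_particles)  # diffuse prior
    estimates = []
    for y in observations:
        particles = particles + rng.normal(0.0, proc_sd, n_particles)  # propagate
        w = np.exp(-0.5 * ((y - particles) / obs_sd) ** 2)             # likelihood
        w /= w.sum()
        particles = particles[rng.choice(n_particles, n_particles, p=w)]  # resample
        estimates.append(particles.mean())
    return np.array(estimates)

true_temp = 20.0                              # illustrative constant "truth"
obs = true_temp + rng.normal(0.0, 1.0, 50)    # noisy synthetic observations
est = particle_filter(obs)
```

Feeding the filter more frequent observations tightens the particle cloud faster, which is the mechanism behind the assimilation-frequency result described in the abstract.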
Acosta-Pech, Rocío; Crossa, José; de Los Campos, Gustavo; Teyssèdre, Simon; Claustres, Bruno; Pérez-Elizalde, Sergio; Pérez-Rodríguez, Paulino
2017-07-01
A new genomic model that incorporates genotype × environment interaction gave increased prediction accuracy of untested hybrid response for traits such as percent starch content, percent dry matter content and silage yield of maize hybrids. The prediction of hybrid performance (HP) is very important in agricultural breeding programs. In plant breeding, multi-environment trials play an important role in the selection of important traits, such as stability across environments, grain yield and pest resistance. Environmental conditions modulate gene expression causing genotype × environment interaction (G × E), such that the estimated genetic correlations of the performance of individual lines across environments summarize the joint action of genes and environmental conditions. This article proposes a genomic statistical model that incorporates G × E for general and specific combining ability for predicting the performance of hybrids in environments. The proposed model can also be applied to any other hybrid species with distinct parental pools. In this study, we evaluated the predictive ability of two HP prediction models using a cross-validation approach applied in extensive maize hybrid data, comprising 2724 hybrids derived from 507 dent lines and 24 flint lines, which were evaluated for three traits in 58 environments over 12 years; analyses were performed for each year. On average, genomic models that include the interaction of general and specific combining ability with environments have greater predictive ability than genomic models without interaction with environments (ranging from 12 to 22%, depending on the trait). We concluded that including G × E in the prediction of untested maize hybrids increases the accuracy of genomic models.
NASA Astrophysics Data System (ADS)
Hong-bo, Wang; Chang-yin, Zhao; Wei, Zhang; Jin-wei, Zhan; Sheng-xian, Yu
2016-07-01
The Earth gravitational field model is one of the most important dynamic models in satellite orbit computation. Several space gravity missions have achieved great success in recent years, prompting the publication of several gravitational field models. In this paper, two classical (JGM3, EGM96) and four latest (EIGEN-CHAMP05S, GGM03S, GOCE02S, EGM2008) models are evaluated by employing them in precision orbit determination (POD) and prediction. These calculations are performed on the basis of laser ranging observations of four Low Earth Orbit (LEO) satellites: CHAMP, GFZ-1, GRACE-A, and SWARM-A. The residual error of observation in POD is adopted to describe the accuracy of the six gravitational field models. The main results we obtained are as follows. (1) For the POD of LEOs, the accuracies of the four latest models are at the same level, and better than those of the two classical models; (2) Taking JGM3 as reference, the EGM96 model's accuracy is better in most situations, and the accuracies of the four latest models are improved by 12%-47% in POD and 63% in prediction, respectively. We also confirm that the models' accuracy in POD improves with increasing degree and order up to 70; beyond 70, the accuracy remains constant, implying that truncating the models to degree and order 70 is sufficient for LEO computation at centimeter precision.
Maden, Orhan; Balci, Kevser Gülcihan; Selcuk, Mehmet Timur; Balci, Mustafa Mücahit; Açar, Burak; Unal, Sefa; Kara, Meryem; Selcuk, Hatice
2015-12-01
The aim of this study was to investigate the accuracy of three algorithms in predicting accessory pathway locations in adult patients with Wolff-Parkinson-White syndrome in a Turkish population. A total of 207 adult patients with Wolff-Parkinson-White syndrome were retrospectively analyzed. The most preexcited 12-lead electrocardiogram in sinus rhythm was used for analysis. Two investigators blinded to the patient data used three algorithms for prediction of accessory pathway location. Among all locations, 48.5% were left-sided, 44% were right-sided, and 7.5% were located in the midseptum or anteroseptum. When only exact locations were accepted as a match, predictive accuracy was 71.5% for Chiang, 72.4% for d'Avila, and 71.5% for Arruda. Predictive accuracy did not differ between the algorithms (p = 1.000; p = 0.875; p = 0.885, respectively). The best algorithm for prediction of right-sided, left-sided, and anteroseptal and midseptal accessory pathways was Arruda (p < 0.001). Arruda was significantly better than d'Avila in predicting adjacent sites (p = 0.035), and the percentage of contralateral-site predictions was higher with d'Avila than with Arruda (p = 0.013). All algorithms were similar in predicting accessory pathway location, and the predictive accuracy was lower than previously reported by their authors. However, according to the accessory pathway site, the algorithm designed by Arruda et al. showed better predictions than the other algorithms, and using this algorithm may provide advantages before a planned ablation.
Campos, G S; Reimann, F A; Cardoso, L L; Ferreira, C E R; Junqueira, V S; Schmidt, P I; Braccini Neto, J; Yokoo, M J I; Sollero, B P; Boligon, A A; Cardoso, F F
2018-05-07
The objective of the present study was to evaluate the accuracy and bias of direct and blended genomic predictions using different methods and cross-validation techniques for growth traits (weight and weight gains) and visual scores (conformation, precocity, muscling and size) obtained at weaning and at yearling in Hereford and Braford breeds. Phenotypic data comprised 126,290 animals belonging to the Delta G Connection genetic improvement program, and a set of 3,545 animals genotyped with the 50K chip and 131 sires with the 777K chip. After quality control, 41,045 markers remained for all animals. An animal model was used to estimate (co)variance components and to predict breeding values, which were later used to calculate the deregressed estimated breeding values (DEBV). Animals with genotype and phenotype for the traits studied were divided into four or five groups by random and k-means clustering cross-validation strategies. Accuracies of the direct genomic values (DGV) were of moderate to high magnitude for traits measured at weaning and at yearling, ranging from 0.19 to 0.45 for k-means and from 0.23 to 0.78 for random clustering across all traits. The greatest gain in relation to the pedigree BLUP (PBLUP) was 9.5% with the BayesB method, with both k-means and random clustering. Blended genomic value accuracies ranged from 0.19 to 0.56 for k-means and from 0.21 to 0.82 for random clustering. The analyses using the historical pedigree and phenotypes contributed additional information for calculating the GEBV, and in general the largest gains were for the single-step (ssGBLUP) method in bivariate analyses, with a mean increase of 43.00% across all traits measured at weaning and of 46.27% for those evaluated at yearling. The accuracy values for the marker-effect estimation methods were lower for k-means clustering, indicating that the relationship of the training set to the selection candidates is a major factor affecting the accuracy of genomic predictions.
The gains in accuracy obtained with genomic blending methods, mainly ssGBLUP in bivariate analyses, indicate that genomic predictions should be used as a tool to improve genetic gains in relation to the traditional PBLUP selection.
Accuracy test for link prediction in terms of similarity index: The case of WS and BA models
NASA Astrophysics Data System (ADS)
Ahn, Min-Woo; Jung, Woo-Sung
2015-07-01
Link prediction is a technique that uses the topological information in a given network to infer the missing links in it. Since past research on link prediction has primarily focused on enhancing performance for given empirical systems, negligible attention has been devoted to link prediction with regard to network models. In this paper, we thus apply link prediction to two network models: The Watts-Strogatz (WS) model and Barabási-Albert (BA) model. We attempt to gain a better understanding of the relation between accuracy and each network parameter (mean degree, the number of nodes and the rewiring probability in the WS model) through network models. Six similarity indices are used, with precision and area under the ROC curve (AUC) value as the accuracy metrics. We observe a positive correlation between mean degree and accuracy, and size independence of the AUC value.
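The AUC metric used in the link-prediction records above and below can be estimated by repeatedly sampling one missing link and one nonexistent link and checking which scores higher (ties count 0.5). A sketch using the common-neighbors similarity index on a hypothetical five-node graph; the graph and the candidate link sets are invented for illustration:

```python
import random

def common_neighbors_score(adj, u, v):
    """Similarity index: number of neighbors shared by nodes u and v."""
    return len(adj[u] & adj[v])

def auc_link_prediction(adj, missing, non_edges, n_samples=1000, seed=1):
    """AUC estimated by sampling: the probability that a missing link
    outscores a nonexistent one, with ties counted as 0.5."""
    rng = random.Random(seed)
    hits = 0.0
    for _ in range(n_samples):
        sm = common_neighbors_score(adj, *rng.choice(missing))
        sn = common_neighbors_score(adj, *rng.choice(non_edges))
        hits += 1.0 if sm > sn else 0.5 if sm == sn else 0.0
    return hits / n_samples

# hypothetical 5-node graph with observed edges 0-1, 0-2, 3-4
adj = {0: {1, 2}, 1: {0}, 2: {0}, 3: {4}, 4: {3}}
# missing link (1,2) shares neighbor 0; nonexistent link (1,3) shares none
auc = auc_link_prediction(adj, missing=[(1, 2)], non_edges=[(1, 3)])
```

Here the single missing link always outscores the single non-edge, so the estimated AUC is 1.0; on real networks both sets are large and the estimate falls between 0.5 (random) and 1.0.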
Effectiveness of link prediction for face-to-face behavioral networks.
Tsugawa, Sho; Ohsaki, Hiroyuki
2013-01-01
Research on link prediction for social networks has been actively pursued. In link prediction for a given social network obtained from time-windowed observation, new link formation in the network is predicted from the topology of the obtained network. In contrast, recent advances in sensing technology have made it possible to obtain face-to-face behavioral networks, which are social networks representing face-to-face interactions among people. However, the effectiveness of link prediction techniques for face-to-face behavioral networks has not yet been explored in depth. To clarify this point, here we investigate the accuracy of conventional link prediction techniques for networks obtained from the history of face-to-face interactions among participants at an academic conference. Our findings were (1) that conventional link prediction techniques predict new link formation with a precision of 0.30-0.45 and a recall of 0.10-0.20, (2) that prolonged observation of social networks often degrades the prediction accuracy, (3) that the proposed decaying weight method leads to higher prediction accuracy than can be achieved by observing all records of communication and simply using them unmodified, and (4) that the prediction accuracy for face-to-face behavioral networks is relatively high compared to that for non-social networks, but not as high as for other types of social networks.
Single-Step BLUP with Varying Genotyping Effort in Open-Pollinated Picea glauca.
Ratcliffe, Blaise; El-Dien, Omnia Gamal; Cappa, Eduardo P; Porth, Ilga; Klápště, Jaroslav; Chen, Charles; El-Kassaby, Yousry A
2017-03-10
Maximization of genetic gain in forest tree breeding programs is contingent on the accuracy of the predicted breeding values and precision of the estimated genetic parameters. We investigated the effect of the combined use of contemporary pedigree information and genomic relatedness estimates on the accuracy of predicted breeding values and precision of estimated genetic parameters, as well as rankings of selection candidates, using single-step genomic evaluation (HBLUP). In this study, two traits with diverse heritabilities [tree height (HT) and wood density (WD)] were assessed at various levels of family genotyping efforts (0, 25, 50, 75, and 100%) from a population of white spruce ( Picea glauca ) consisting of 1694 trees from 214 open-pollinated families, representing 43 provenances in Québec, Canada. The results revealed that HBLUP bivariate analysis is effective in reducing the known bias in heritability estimates of open-pollinated populations, as it exposes hidden relatedness, potential pedigree errors, and inbreeding. The addition of genomic information in the analysis considerably improved the accuracy in breeding value estimates by accounting for both Mendelian sampling and historical coancestry that were not captured by the contemporary pedigree alone. Increasing family genotyping efforts were associated with continuous improvement in model fit, precision of genetic parameters, and breeding value accuracy. Yet, improvements were observed even at minimal genotyping effort, indicating that even modest genotyping effort is effective in improving genetic evaluation. The combined utilization of both pedigree and genomic information may be a cost-effective approach to increase the accuracy of breeding values in forest tree breeding programs where shallow pedigrees and large testing populations are the norm. Copyright © 2017 Ratcliffe et al.
Sun, Rui; Cheng, Qi; Xue, Dabin; Wang, Guanyu; Ochieng, Washington Yotto
2017-01-01
The increasing number of vehicles in modern cities brings the problem of increasing crashes. One of the applications or services of Intelligent Transportation Systems (ITS) conceived to improve safety and reduce congestion is collision avoidance. This safety critical application requires sub-meter level vehicle state estimation accuracy with very high integrity, continuity and availability, to detect an impending collision and issue a warning or intervene in the case that the warning is not heeded. Because of the challenging city environment, to date there is no approved method capable of delivering this high level of performance in vehicle state estimation. In particular, the current Global Navigation Satellite System (GNSS) based collision avoidance systems have the major limitation that the real-time accuracy of dynamic state estimation deteriorates during abrupt acceleration and deceleration situations, compromising the integrity of collision avoidance. Therefore, to provide the Required Navigation Performance (RNP) for collision avoidance, this paper proposes a novel Particle Filter (PF) based model for the integration or fusion of real-time kinematic (RTK) GNSS position solutions with electronic compass and road segment data used in conjunction with an Autoregressive (AR) motion model. The real-time vehicle state estimates are used together with distance based collision avoidance algorithms to predict potential collisions. The algorithms are tested by simulation and in the field representing a low density urban environment. The results show that the proposed algorithm meets the horizontal positioning accuracy requirement for collision avoidance and is superior to positioning accuracy of GNSS only, traditional Constant Velocity (CV) and Constant Acceleration (CA) based motion models, with a significant improvement in the prediction accuracy of potential collision. PMID:29186851
ERIC Educational Resources Information Center
Hilton, N. Zoe; Harris, Grant T.
2009-01-01
Prediction effect sizes such as ROC area are important for demonstrating a risk assessment's generalizability and utility. How a study defines recidivism might affect predictive accuracy. Nonrecidivism is problematic when predicting specialized violence (e.g., domestic violence). The present study cross-validates the ability of the Ontario…
Robust Decision-making Applied to Model Selection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hemez, Francois M.
2012-08-06
The scientific and engineering communities are relying more and more on numerical models to simulate increasingly complex phenomena. Selecting a model, from among a family of models that meets the simulation requirements, presents a challenge to modern-day analysts. To address this concern, a framework is adopted anchored in info-gap decision theory. The framework proposes to select models by examining the trade-offs between prediction accuracy and sensitivity to epistemic uncertainty. The framework is demonstrated on two structural engineering applications by asking the following question: Which model, of several numerical models, approximates the behavior of a structure when parameters that define each of those models are unknown? One observation is that models that are nominally more accurate are not necessarily more robust, and their accuracy can deteriorate greatly depending upon the assumptions made. It is posited that, as reliance on numerical models increases, establishing robustness will become as important as demonstrating accuracy.
Improving Fermi Orbit Determination and Prediction in an Uncertain Atmospheric Drag Environment
NASA Technical Reports Server (NTRS)
Vavrina, Matthew A.; Newman, Clark P.; Slojkowski, Steven E.; Carpenter, J. Russell
2014-01-01
Orbit determination and prediction of the Fermi Gamma-ray Space Telescope trajectory is strongly impacted by the unpredictability and variability of atmospheric density and the spacecraft's ballistic coefficient. Operationally, Global Positioning System point solutions are processed with an extended Kalman filter for orbit determination, and predictions are generated for conjunction assessment with secondary objects. When these predictions are compared to Joint Space Operations Center radar-based solutions, the close approach distance between the two predictions can greatly differ ahead of the conjunction. This work explores strategies for improving prediction accuracy and helps to explain the prediction disparities. Namely, a tuning analysis is performed to determine atmospheric drag modeling and filter parameters that can improve orbit determination as well as prediction accuracy. A 45% improvement in three-day prediction accuracy is realized by tuning the ballistic coefficient and atmospheric density stochastic models, measurement frequency, and other modeling and filter parameters.
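The tuning exercise described above, adjusting stochastic models so the filter neither over- nor under-weights measurements, can be illustrated on a toy scalar Kalman filter in which the process-noise variance q plays the role of the tuned parameter. This is a didactic sketch, not the Fermi operational filter; all numbers are invented.

```python
import numpy as np

def kalman_1d(obs, q, r=1.0):
    """Scalar random-walk Kalman filter. q is the process-noise variance,
    the analogue of the ballistic-coefficient/atmospheric-density
    stochastic-model parameters tuned in the text; r is measurement noise."""
    x, p = obs[0], 1.0
    ests = []
    for z in obs[1:]:
        p = p + q                # predict: uncertainty grows by q
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # measurement update
        p = (1.0 - k) * p
        ests.append(x)
    return np.array(ests)

rng = np.random.default_rng(7)
truth = np.zeros(200)                        # illustrative static truth
obs = truth + rng.normal(0.0, 1.0, 200)      # noisy measurements

def rmse(q):
    return float(np.sqrt(np.mean((kalman_1d(obs, q) - truth[1:]) ** 2)))
```

For this static truth, a small q (trust the accumulated estimate) yields a far lower RMSE than a large q (chase every measurement), mirroring how a mistuned stochastic model degrades prediction accuracy.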
Pillai, Rekha N; Konje, Justin C; Richardson, Matthew; Tincello, Douglas G; Potdar, Neelam
2018-01-01
Both ultrasound and biochemical markers, either alone or in combination, have been described in the literature for the prediction of miscarriage. We performed this systematic review and meta-analysis to determine the best combination of biochemical, ultrasound and demographic markers to predict miscarriage in women with viable intrauterine pregnancy. The electronic database search included Medline (1946-June 2017), Embase (1980-June 2017), CINAHL (1981-June 2017) and the Cochrane library. Key MESH and Boolean terms were used for the search. Data extraction and collection were performed based on the eligibility criteria by two authors independently. Quality assessment of the individual studies was done using QUADAS 2 (Quality Assessment for Diagnostic Accuracy Studies-2: A Revised Tool) and statistical analysis was performed using the Cochrane Review Manager 5.3 and STATA v.13.0. Due to the diversity of the combinations used for prediction in the included papers, it was not possible to perform a meta-analysis on combination markers. Therefore, we proceeded to perform a meta-analysis on ultrasound markers alone to determine the best marker that can help to improve the diagnostic accuracy of predicting miscarriage in women with viable intrauterine pregnancy. The systematic review identified 18 eligible studies for the quantitative meta-analysis, with a total of 5584 women. Among the ultrasound scan markers, fetal bradycardia (n=10 studies, n=1762 women) on hierarchical summary receiver operating characteristic analysis showed sensitivity of 68.41%, specificity of 97.84%, positive likelihood ratio of 31.73 (indicating a large effect on increasing the probability of predicting miscarriage) and negative likelihood ratio of 0.32. In studies of women with threatened miscarriage (n=5 studies, n=771 women), fetal bradycardia showed a further increase in sensitivity (84.18%) for miscarriage prediction.
Although there is gestational age dependent variation in the fetal heart rate, a plot of fetal heart rate cut-off level versus log diagnostic odds ratio showed that at ≤110 beats per minute the diagnostic power to predict miscarriage is higher. The other markers, intrauterine hematoma, crown-rump length and yolk sac, had significantly lower predictive value. Therefore, in women with threatened miscarriage and fetal bradycardia on ultrasound scan, there is a role for offering a repeat ultrasound scan at a one-week to ten-day interval. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
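The likelihood ratios reported above follow directly from sensitivity and specificity (LR+ = sens/(1-spec), LR- = (1-sens)/spec, and the diagnostic odds ratio DOR = LR+/LR-). A minimal sketch that also roughly reproduces the reported bradycardia figures; the 2×2 counts in the first example are invented for illustration:

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, LR+, LR-, and diagnostic odds ratio
    from the cells of a 2x2 diagnostic table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1.0 - spec)   # how much a positive test raises the odds
    lr_neg = (1.0 - sens) / spec   # how much a negative test lowers the odds
    dor = lr_pos / lr_neg          # diagnostic odds ratio
    return sens, spec, lr_pos, lr_neg, dor

# plugging in the pooled bradycardia estimates quoted above:
sens, spec = 0.6841, 0.9784
lr_pos = sens / (1.0 - spec)   # approximately 31.7 (reported: 31.73)
lr_neg = (1.0 - sens) / spec   # approximately 0.32 (reported: 0.32)
```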
NASA Astrophysics Data System (ADS)
Wang, Qianxin; Hu, Chao; Xu, Tianhe; Chang, Guobin; Hernández Moraleda, Alberto
2017-12-01
Analysis centers (ACs) for global navigation satellite systems (GNSSs) cannot accurately obtain real-time Earth rotation parameters (ERPs). Thus, the prediction of ultra-rapid orbits in the International Terrestrial Reference System (ITRS) has to utilize the predicted ERPs issued by the International Earth Rotation and Reference Systems Service (IERS) or the International GNSS Service (IGS). In this study, the accuracy of ERPs predicted by IERS and IGS is analyzed. The error of the ERPs predicted for one day can reach 0.15 mas and 0.053 ms in the polar motion and UT1-UTC components, respectively. Then, the impact of ERP errors on ultra-rapid orbit prediction by GNSS is studied. The methods for orbit integration and frame transformation in orbit prediction with introduced ERP errors dominate the accuracy of the predicted orbit. Experimental results show that the transformation from the Geocentric Celestial Reference System (GCRS) to ITRS exerts the strongest effect on the accuracy of the predicted ultra-rapid orbit. To obtain the most accurate predicted ultra-rapid orbit, a corresponding real-time orbit correction method is developed. First, orbits without ERP-related errors are predicted on the basis of the ITRS observed part of the ultra-rapid orbit for use as reference. Then, the corresponding predicted orbit is transformed from GCRS to ITRS to adjust for the predicted ERPs. Finally, the corrected ERPs with error slopes are re-introduced to correct the predicted orbit in ITRS. To validate the proposed method, three experimental schemes are designed: function extrapolation, simulation experiments, and experiments with predicted ultra-rapid orbits and international GNSS Monitoring and Assessment System (iGMAS) products. Experimental results show that using the proposed correction method with IERS products considerably improved the accuracy of ultra-rapid orbit prediction (except the geosynchronous BeiDou orbits).
The accuracy of orbit prediction is enhanced by at least 50% (for the ERP-related error) when a highly accurate observed orbit is used with the correction method. For iGMAS-predicted orbits, the accuracy improvement ranges from 8.5% for the inclined BeiDou orbits to 17.99% for the GPS orbits. This demonstrates that the correction method proposed in this study can improve ultra-rapid orbit prediction.
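The error magnitudes quoted above can be translated into orbit position error with a first-order small-angle approximation: a frame rotation error of Δθ radians displaces a satellite at geocentric radius r by roughly r·Δθ. A minimal sketch of that back-of-the-envelope check (the GPS orbit radius and the small-angle model are illustrative assumptions, not taken from the paper):

```python
import math

OMEGA_EARTH = 7.2921150e-5                   # Earth rotation rate, rad/s
MAS_TO_RAD = math.pi / (180 * 3600 * 1000)   # milliarcseconds -> radians

def orbit_error_from_polar_motion(dpm_mas: float, radius_m: float) -> float:
    """First-order position error (m) from a polar-motion error of dpm_mas."""
    return dpm_mas * MAS_TO_RAD * radius_m

def orbit_error_from_ut1(dut1_s: float, radius_m: float) -> float:
    """First-order position error (m) from a UT1-UTC error of dut1_s seconds."""
    return OMEGA_EARTH * dut1_s * radius_m

R_GPS = 26_560e3  # approximate GPS orbit radius in metres (assumed value)
print(round(orbit_error_from_polar_motion(0.15, R_GPS), 3))  # ~0.019 m
print(round(orbit_error_from_ut1(0.053e-3, R_GPS), 3))       # ~0.103 m
```

The one-day ERP errors cited in the abstract thus correspond to centimetre-to-decimetre level orbit displacements at GPS altitude, which is consistent with the paper's emphasis on the GCRS-to-ITRS transformation.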
NASA Astrophysics Data System (ADS)
Hancock, Matthew C.; Magnan, Jerry F.
2017-03-01
To determine the potential usefulness of quantified diagnostic image features as inputs to a CAD system, we investigate the predictive capabilities of statistical learning methods for classifying nodule malignancy, utilizing the Lung Image Database Consortium (LIDC) dataset and employing only the radiologist-assigned diagnostic feature values for the lung nodules therein, along with our derived estimates of nodule diameter and volume from the radiologists' annotations. We calculate theoretical upper bounds on the classification accuracy achievable by an ideal classifier that uses only the radiologist-assigned feature values, and we obtain an accuracy of 85.74 (+/-1.14)%, which is, on average, 4.43% below the theoretical maximum of 90.17%. The corresponding area-under-the-curve (AUC) score is 0.932 (+/-0.012), which increases to 0.949 (+/-0.007) when diameter and volume features are included, with the accuracy likewise increasing to 88.08 (+/-1.11)%. Our results are comparable to those in the literature that use algorithmically derived image-based features, which supports our hypothesis that lung nodules can be classified as malignant or benign using only quantified diagnostic image features, and indicates the competitiveness of this approach. We also analyze how the classification accuracy depends on specific features and feature subsets, and we rank the features according to their predictive power, statistically demonstrating the top four to be spiculation, lobulation, subtlety, and calcification.
Predicting impending death: inconsistency in speed is a selective and early marker.
Macdonald, Stuart W S; Hultsch, David F; Dixon, Roger A
2008-09-01
Among older adults, deficits in both level and variability of speeded performance are linked to neurological impairment. This study examined whether and when speed (rate), speed (inconsistency), and traditional accuracy-based markers of cognitive performance foreshadow terminal decline and impending death. Victoria Longitudinal Study data spanning 12 years (5 waves) of measurement were assembled for 707 adults aged 59 to 95 years. Whereas 442 survivors completed all waves and relevant measures, 265 decedents participated on at least 1 occasion and subsequently died. Four main results were observed. First, Cox regressions evaluating the 3 cognitive predictors of mortality replicated previous results for cognitive accuracy predictors. Second, level (rate) of speeded performance predicted survival independent of demographic indicators, cardiovascular health, and cognitive performance level. Third, inconsistency in speed predicted survival independent of all influences combined. Fourth, follow-up random-effects models revealed increases in inconsistency in speed per year closer to death, with advancing age further moderating the accelerated growth. Hierarchical prediction patterns support the view that inconsistency in speed is an early behavioral marker of neurological dysfunction associated with impending death. (c) 2008 APA, all rights reserved
Predicting Impending Death: Inconsistency in Speed is a Selective and Early Marker
MacDonald, Stuart W.S.; Hultsch, David F.; Dixon, Roger A.
2008-01-01
Among older adults, deficits in both level and variability of speeded performance are linked to neurological impairment. This study examined whether and when speed (rate), speed (inconsistency), and traditional accuracy-based markers of cognitive performance foreshadow terminal decline and impending death. Victoria Longitudinal Study data spanning 12 years (5 waves) of measurement were assembled for 707 adults aged 59 to 95 years. Whereas 442 survivors completed all waves and relevant measures, 265 decedents participated on at least one occasion and subsequently died. Four main results were observed. First, Cox regressions evaluating the three cognitive predictors of mortality replicated previous results for cognitive accuracy predictors. Second, level (rate) of speeded performance predicted survival independent of demographic indicators, cardiovascular health, and cognitive performance level. Third, inconsistency in speed predicted survival independent of all influences combined. Fourth, follow-up random-effects models revealed increases in inconsistency in speed per year closer to death, with advancing age further moderating the accelerated growth. Hierarchical prediction patterns support the view that inconsistency in speed is an early behavioral marker of neurological dysfunction associated with impending death. PMID:18808249
Fatigue Strength Prediction for Titanium Alloy TiAl6V4 Manufactured by Selective Laser Melting
NASA Astrophysics Data System (ADS)
Leuders, Stefan; Vollmer, Malte; Brenne, Florian; Tröster, Thomas; Niendorf, Thomas
2015-09-01
Selective laser melting (SLM), a metalworking additive manufacturing technique, has received considerable attention from industry and academia due to its unprecedented design freedom and overall balanced material properties. However, the fatigue behavior of SLM-processed materials often suffers from local imperfections such as micron-sized pores. In order to enable robust designs of SLM components used in an industrial environment, further research regarding process-induced porosity and its impact on fatigue behavior is required. Hence, this study aims at transferring fatigue prediction models established for conventional process routes to the field of SLM materials. By using high-resolution computed tomography, load increase tests, and electron microscopy, it is shown that pore-based fatigue strength predictions for the titanium alloy TiAl6V4 have become feasible. However, the obtained accuracies are subject to scatter, which is probably caused by the high defect density still present in SLM materials manufactured following optimized processing routes. Based on thorough examination of crack surfaces and crack initiation sites, implications for optimizing the prediction accuracy of the models in focus are deduced.
Predicting Individual Characteristics from Digital Traces on Social Media: A Meta-Analysis.
Settanni, Michele; Azucar, Danny; Marengo, Davide
2018-04-01
The increasing utilization of social media provides a vast and new source of user-generated ecological data (digital traces), which can be automatically collected for research purposes. The availability of these data sets, combined with the convergence between social and computer sciences, has led researchers to develop automated methods to extract digital traces from social media and use them to predict individual psychological characteristics and behaviors. In this article, we reviewed the literature on this topic and conducted a series of meta-analyses to determine the strength of associations between digital traces and specific individual characteristics: personality, psychological well-being, and intelligence. Potential moderator effects were analyzed with respect to type of social media platform, type of digital traces examined, and study quality. Our findings indicate that digital traces from social media can be studied to assess and predict theoretically distant psychosocial characteristics with remarkable accuracy. Analysis of moderators indicated that the collection of specific types of information (i.e., user demographics), and the inclusion of different types of digital traces, could help improve the accuracy of predictions.
Li, X; Lund, M S; Zhang, Q; Costa, C N; Ducrocq, V; Su, G
2016-06-01
The present study investigated the improvement of prediction reliabilities for 3 production traits in Brazilian Holsteins that had no genotype information by adding information from Nordic and French Holstein bulls that had genotypes. The estimated across-country genetic correlations (ranging from 0.604 to 0.726) indicated that an important genotype by environment interaction exists between Brazilian and Nordic (or Nordic and French) populations. Prediction reliabilities for Brazilian genotyped bulls were greatly increased by including data of Nordic and French bulls, and a 2-trait single-step genomic BLUP performed much better than the corresponding pedigree-based BLUP. However, only a minor improvement in prediction reliabilities was observed in nongenotyped Brazilian cows. The results indicate that although there is a large genotype by environment interaction, inclusion of a foreign reference population can improve accuracy of genetic evaluation for the Brazilian Holstein population. However, a Brazilian reference population is necessary to obtain a more accurate genomic evaluation. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier
NASA Astrophysics Data System (ADS)
Wang, Leilei; Cheng, Jinyong
2018-03-01
Protein secondary structure prediction belongs to bioinformatics and is an important research area. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments address several algorithmic aspects, including the construction of the model and the selection of parameters. The data set is the typical CB513 protein data set. Accuracy is measured as Q3 accuracy under 3-fold cross-validation. The results illustrate that the autoencoder network improved the prediction accuracy of protein secondary structure.
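The Q3 measure mentioned above is simply per-residue accuracy over the three secondary-structure states helix (H), strand (E), and coil (C); a minimal sketch (the example sequences are made up):

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Q3: fraction of residues whose 3-state label (H/E/C) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

# 7 of 8 residues match -> Q3 = 0.875
print(q3_accuracy("HHHHECCE", "HHHHECCC"))  # 0.875
```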
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
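The discharge formula quoted in the abstract can be applied directly once a predicted mFIM effectiveness is available; a sketch (the admission score and effectiveness below are made-up example values, not the study's data):

```python
MFIM_MAX = 91  # maximum motor FIM score, per the formula in the abstract

def mfim_at_discharge(mfim_admission: float, effectiveness: float) -> float:
    """mFIM at discharge = effectiveness x (91 - mFIM at admission) + mFIM at admission."""
    return effectiveness * (MFIM_MAX - mfim_admission) + mfim_admission

# e.g. a patient admitted at 40 points with a predicted effectiveness of 0.6
print(round(mfim_at_discharge(40, 0.6), 1))  # 70.6
```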
Reduced fMRI activity predicts relapse in patients recovering from stimulant dependence.
Clark, Vincent P; Beatty, Gregory K; Anderson, Robert E; Kodituwakku, Piyadassa; Phillips, John P; Lane, Terran D R; Kiehl, Kent A; Calhoun, Vince D
2014-02-01
Relapse presents a significant problem for patients recovering from stimulant dependence. Here we examined the hypothesis that patterns of brain function obtained at an early stage of abstinence differentiates patients who later relapse versus those who remain abstinent. Forty-five recently abstinent stimulant-dependent patients were tested using a randomized event-related functional MRI (ER-fMRI) design that was developed in order to replicate a previous ERP study of relapse using a selective attention task, and were then monitored until 6 months of verified abstinence or stimulant use occurred. SPM revealed smaller absolute blood oxygen level-dependent (BOLD) response amplitude in bilateral ventral posterior cingulate and right insular cortex in 23 patients positive for relapse to stimulant use compared with 22 who remained abstinent. ER-fMRI, psychiatric, neuropsychological, demographic, personal and family history of drug use were compared in order to form predictive models. ER-fMRI was found to predict abstinence with higher accuracy than any other single measure obtained in this study. Logistic regression using fMRI amplitude in right posterior cingulate and insular cortex predicted abstinence with 77.8% accuracy, which increased to 89.9% accuracy when history of mania was included. Using 10-fold cross-validation, Bayesian logistic regression and multilayer perceptron algorithms provided the highest accuracy of 84.4%. These results, combined with previous studies, suggest that the functional organization of paralimbic brain regions including ventral anterior and posterior cingulate and right insula are related to patients' ability to maintain abstinence. Novel therapies designed to target these paralimbic regions identified using ER-fMRI may improve treatment outcome. Copyright © 2012 Wiley Periodicals, Inc.
Kvavilashvili, Lia; Ford, Ruth M
2014-11-01
It is well documented that young children greatly overestimate their performance on tests of retrospective memory (RM), but the current investigation is the first to examine children's prediction accuracy for prospective memory (PM). Three studies were conducted, each testing a different group of 5-year-olds. In Study 1 (N=46), participants were asked to predict their success in a simple event-based PM task (remembering to convey a message to a toy mole if they encountered a particular picture during a picture-naming activity). Before naming the pictures, children listened to either a reminder story or a neutral story. Results showed that children were highly accurate in their PM predictions (78% accuracy) and that the reminder story appeared to benefit PM only in children who predicted they would remember the PM response. In Study 2 (N=80), children showed high PM prediction accuracy (69%) regardless of whether the cue was specific or general and despite typical overoptimism regarding their performance on a 10-item RM task using item-by-item prediction. Study 3 (N=35) showed that children were prone to overestimate RM even when asked about their ability to recall a single item-the mole's unusual name. In light of these findings, we consider possible reasons for children's impressive PM prediction accuracy, including the potential involvement of future thinking in performance predictions and PM. Copyright © 2014 Elsevier Inc. All rights reserved.
Liu, Guozheng; Zhao, Yusheng; Gowda, Manje; Longin, C. Friedrich H.; Reif, Jochen C.; Mette, Michael F.
2016-01-01
Bread-making quality traits are central targets for wheat breeding. The objectives of our study were to (1) examine the presence of major effect QTLs for quality traits in a Central European elite wheat population, (2) explore the optimal strategy for predicting the hybrid performance for wheat quality traits, and (3) investigate the effects of marker density and the composition and size of the training population on the accuracy of prediction of hybrid performance. In total 135 inbred lines of Central European bread wheat (Triticum aestivum L.) and 1,604 hybrids derived from them were evaluated for seven quality traits in up to six environments. The 135 parental lines were genotyped using a 90k single-nucleotide polymorphism array. Genome-wide association mapping initially suggested presence of several quantitative trait loci (QTLs), but cross-validation rather indicated the absence of major effect QTLs for all quality traits except for 1000-kernel weight. Genomic selection substantially outperformed marker-assisted selection in predicting hybrid performance. A resampling study revealed that increasing the effective population size in the estimation set of hybrids is relevant to boost the accuracy of prediction for an unrelated test population. PMID:27383841
Application of the aeroacoustic analogy to a shrouded, subsonic, radial fan
NASA Astrophysics Data System (ADS)
Buccieri, Bryan M.; Richards, Christopher M.
2016-12-01
A study was conducted to investigate the predictive capability of computational aeroacoustics with respect to a shrouded, subsonic, radial fan. A three-dimensional unsteady fluid dynamics simulation was conducted to produce aerodynamic data used as the acoustic source for an aeroacoustics simulation. Two acoustic models were developed: one modeling the forces on the rotating fan blades as a set of rotating dipoles located at the center of mass of each fan blade, and one modeling the forces on the stationary fan shroud as a field of distributed stationary dipoles. Predicted acoustic response was compared to experimental data measured at two operating speeds using three different outlet restrictions. The blade source model predicted overall far-field sound power levels within 5 dB averaged over the six different operating conditions, while the shroud model predicted overall far-field sound power levels within 7 dB averaged over the same conditions. Doubling the density of the computational fluid mesh and using a scale-adaptive simulation turbulence model increased broadband noise accuracy. However, computation time doubled and the accuracy of the overall sound power level prediction improved by only 1 dB.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lacaze, Guilhem; Oefelein, Joseph
Large-eddy simulation (LES) is quickly becoming a method of choice for studying complex thermo-physics in a wide range of propulsion and power systems. It provides a means to study coupled turbulent combustion and flow processes in parameter spaces that are unattainable using direct numerical simulation (DNS), with a degree of fidelity that can be far more accurate than conventional engineering methods such as the Reynolds-averaged Navier-Stokes (RANS) approximation. However, development of predictive LES is complicated by the complex interdependence of different types of errors coming from numerical methods, algorithms, models, and boundary conditions. On the other hand, control of accuracy has become a critical aspect in the development of predictive LES for design. The objective of this project is to create a framework of metrics aimed at quantifying the quality and accuracy of state-of-the-art LES in a manner that addresses the myriad of competing interdependencies. In a typical simulation cycle, only 20% of the computational time is actually usable; the rest is spent in case preparation, assessment, and validation because of the lack of guidelines. This work increases confidence in the accuracy of a given solution while minimizing the time spent obtaining the solution. The approach facilitates control of the tradeoffs between cost, accuracy, and uncertainties as a function of the fidelity and methods employed. The analysis is coupled with advanced uncertainty quantification techniques employed to estimate confidence in model predictions and calibrate model parameters. This work has had positive consequences on the accuracy of the results delivered by LES and will soon have a broad impact on research supported both by the DOE and elsewhere.
Protein subcellular localization prediction using artificial intelligence technology.
Nair, Rajesh; Rost, Burkhard
2008-01-01
Proteins perform many important tasks in living organisms, such as catalysis of biochemical reactions, transport of nutrients, and recognition and transmission of signals. The plethora of aspects of the role of any particular protein is referred to as its "function." One aspect of protein function that has been the target of intensive research by computational biologists is its subcellular localization. Proteins must be localized in the same subcellular compartment to cooperate toward a common physiological function. Aberrant subcellular localization of proteins can result in several diseases, including kidney stones, cancer, and Alzheimer's disease. To date, sequence homology remains the most widely used method for inferring the function of a protein. However, the application of advanced artificial intelligence (AI)-based techniques in recent years has resulted in significant improvements in our ability to predict the subcellular localization of a protein. The prediction accuracy has risen steadily over the years, in large part due to the application of AI-based methods such as hidden Markov models (HMMs), neural networks (NNs), and support vector machines (SVMs), although the availability of larger experimental datasets has also played a role. Automatic methods that mine textual information from the biological literature and molecular biology databases have considerably sped up the process of annotation for proteins for which some information regarding function is available in the literature. State-of-the-art methods based on NNs and HMMs can predict the presence of N-terminal sorting signals extremely accurately. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. The prediction accuracy of these methods has increased by over 30% in the past decade. 
The accuracy of these methods is now on par with high-throughput methods for predicting localization, and they are beginning to play an important role in directing experimental research. In this chapter, we review some of the most important methods for the prediction of subcellular localization.
Maltreated children's memory: accuracy, suggestibility, and psychopathology.
Eisen, Mitchell L; Goodman, Gail S; Qin, Jianjian; Davis, Suzanne; Crayton, John
2007-11-01
Memory, suggestibility, stress arousal, and trauma-related psychopathology were examined in 328 3- to 16-year-olds involved in forensic investigations of abuse and neglect. Children's memory and suggestibility were assessed for a medical examination and venipuncture. Being older and scoring higher in cognitive functioning were related to fewer inaccuracies. In addition, cortisol level and trauma symptoms in children who reported more dissociative tendencies were associated with increased memory error, whereas cortisol level and trauma symptoms were not associated with increased error for children who reported fewer dissociative tendencies. Sexual and/or physical abuse predicted greater accuracy. The study contributes important new information to scientific understanding of maltreatment, psychopathology, and eyewitness memory in children. (c) 2007 APA.
Omran, Dalia; Zayed, Rania A; Nabeel, Mohammed M; Mobarak, Lamiaa; Zakaria, Zeinab; Farid, Azza; Hassany, Mohamed; Saif, Sameh; Mostafa, Muhammad; Saad, Omar Khalid; Yosry, Ayman
2018-05-01
Stage of liver fibrosis is critical for treatment decision and prediction of outcomes in chronic hepatitis C (CHC) patients. We evaluated the diagnostic accuracy of transient elastography (TE)-FibroScan and noninvasive serum marker tests in the assessment of liver fibrosis in CHC patients, in reference to liver biopsy. One hundred treatment-naive CHC patients were subjected to liver biopsy, TE-FibroScan, and eight serum biomarker tests: AST/ALT ratio (AAR), AST to platelet ratio index (APRI), age-platelet index (AP index), fibrosis quotient (FibroQ), fibrosis 4 index (FIB-4), cirrhosis discriminant score (CDS), King score, and Goteborg University Cirrhosis Index (GUCI). Receiver operating characteristic curves were constructed to compare the diagnostic accuracy of these noninvasive methods in predicting significant fibrosis in CHC patients. TE-FibroScan predicted significant fibrosis at a cutoff value of 8.5 kPa with area under the receiver operating characteristic curve (AUROC) 0.90, sensitivity 83%, specificity 91.5%, positive predictive value (PPV) 91.2%, and negative predictive value (NPV) 84.4%. Serum biomarker tests showed that AP index and FibroQ had the highest diagnostic accuracy in predicting significant liver fibrosis at cutoffs of 4.5 and 2.7; AUROC was 0.8 and 0.8 with sensitivity 73.6% and 73.6%, specificity 70.2% and 68.1%, PPV 71.1% and 69.8%, and NPV 72.9% and 72.3%, respectively. Combined AP index and FibroQ had AUROC 0.83 with sensitivity 73.6%, specificity 80.9%, PPV 79.6%, and NPV 75.7% for predicting significant liver fibrosis. APRI, FIB-4, CDS, King score, and GUCI had intermediate accuracy in predicting significant liver fibrosis with AUROC 0.68, 0.78, 0.74, 0.74, and 0.67, respectively, while AAR had low accuracy in predicting significant liver fibrosis. TE-FibroScan is the most accurate noninvasive alternative to liver biopsy.
AP index and FibroQ, either as individual tests or combined, have good accuracy in predicting significant liver fibrosis, and are better combined for higher specificity.
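The sensitivity, specificity, PPV, and NPV figures above all derive from a 2x2 confusion table at a chosen cutoff; a minimal sketch (the counts below are hypothetical, not the study's data):

```python
def diagnostic_indices(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic-accuracy measures for a binary test at a fixed cutoff."""
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# hypothetical counts for 100 patients, 53 of whom have significant fibrosis
ix = diagnostic_indices(tp=44, fp=4, tn=43, fn=9)
print({k: round(v, 3) for k, v in ix.items()})
```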
Thomas, Geoff; Fletcher, Garth J O
2003-12-01
Using a video-review procedure, multiple perceivers carried out mind-reading tasks of multiple targets at different levels of acquaintanceship (50 dating couples, friends of the dating partners, and strangers). As predicted, the authors found that mind-reading accuracy was (a) higher as a function of increased acquaintanceship, (b) relatively unaffected by target effects, (c) influenced by individual differences in perceivers' ability, and (d) higher for female than male perceivers. In addition, superior mind-reading accuracy (for dating couples and friends) was related to higher relationship satisfaction, closeness, and more prior disclosure about the problems discussed, but only under moderating conditions related to sex and relationship length. The authors conclude that the nature of the relationship between the perceiver and the target occupies a pivotal role in determining mind-reading accuracy.
Predicting Individual Fuel Economy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Zhenhong; Greene, David L
2011-01-01
To make informed decisions about travel and vehicle purchase, consumers need unbiased and accurate information of the fuel economy they will actually obtain. In the past, the EPA fuel economy estimates based on its 1984 rules have been widely criticized for overestimating on-road fuel economy. In 2008, EPA adopted a new estimation rule. This study compares the usefulness of the EPA's 1984 and 2008 estimates based on their prediction bias and accuracy and attempts to improve the prediction of on-road fuel economies based on consumer and vehicle attributes. We examine the usefulness of the EPA fuel economy estimates using a large sample of self-reported on-road fuel economy data and develop an Individualized Model for more accurately predicting an individual driver's on-road fuel economy based on easily determined vehicle and driver attributes. Accuracy rather than bias appears to have limited the usefulness of the EPA 1984 estimates in predicting on-road MPG. The EPA 2008 estimates appear to be equally inaccurate and substantially more biased relative to the self-reported data. Furthermore, the 2008 estimates exhibit an underestimation bias that increases with increasing fuel economy, suggesting that the new numbers will tend to underestimate the real-world benefits of fuel economy and emissions standards. By including several simple driver and vehicle attributes, the Individualized Model reduces the unexplained variance by over 55% and the standard error by 33% based on an independent test sample. The additional explanatory variables can be easily provided by the individuals.
Improved numerical methods for turbulent viscous recirculating flows
NASA Technical Reports Server (NTRS)
Turan, A.; Vandoormaal, J. P.
1988-01-01
The performance of discrete methods for the prediction of fluid flows can be enhanced by improving the convergence rate of solvers and by increasing the accuracy of the discrete representation of the equations of motion. This report evaluates the gains in solver performance that are available when various acceleration methods are applied. Various discretizations are also examined, and two are recommended because of their accuracy and robustness. Insertion of the improved discretization and solver accelerator into a TEACH code, which has been widely applied to combustor flows, illustrates the substantial gains to be achieved.
Clark, Samuel A; Hickey, John M; Daetwyler, Hans D; van der Werf, Julius H J
2012-02-09
The theory of genomic selection is based on the prediction of the effects of genetic markers in linkage disequilibrium with quantitative trait loci. However, genomic selection also relies on relationships between individuals to accurately predict genetic value. This study aimed to examine the importance of information on relatives versus that of unrelated or more distantly related individuals on the estimation of genomic breeding values. Simulated and real data were used to examine the effects of various degrees of relationship on the accuracy of genomic selection. Genomic Best Linear Unbiased Prediction (gBLUP) was compared to two pedigree-based BLUP methods, one with a shallow one-generation pedigree and the other with a deep ten-generation pedigree. The accuracy of estimated breeding values for different groups of selection candidates that had varying degrees of relationships to a reference data set of 1750 animals was investigated. The gBLUP method predicted breeding values more accurately than BLUP. The most accurate breeding values were estimated using gBLUP for closely related animals. Similarly, the pedigree-based BLUP methods were also accurate for closely related animals; however, when the pedigree-based BLUP methods were used to predict unrelated animals, the accuracy was close to zero. In contrast, gBLUP breeding values for animals that had no pedigree relationship with animals in the reference data set retained substantial accuracy. An animal's relationship to the reference data set is an important factor for the accuracy of genomic predictions. Animals that share a close relationship to the reference data set had the highest accuracy from genomic predictions. However, a baseline accuracy that is driven by the reference data set size and the overall effective population size enables gBLUP to estimate a breeding value for unrelated animals within a population (breed), using information previously ignored by pedigree-based BLUP methods.
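The gBLUP approach described above replaces the pedigree relationship matrix with a marker-derived genomic relationship matrix (GRM). A toy sketch using a VanRaden-style GRM and an assumed variance ratio (all data are simulated here, not from the study):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 100                                    # animals, markers (toy sizes)
M = rng.integers(0, 3, size=(n, m)).astype(float) # genotypes coded 0/1/2

p = M.mean(axis=0) / 2                            # allele frequencies
Z = M - 2 * p                                     # centered genotype matrix
G = Z @ Z.T / (2 * (p * (1 - p)).sum())           # VanRaden genomic relationship matrix
G += np.eye(n) * 1e-6                             # small ridge to stabilize inversion

y = rng.normal(size=n)                            # phenotypes (simulated)
lam = 1.0                                         # residual-to-genetic variance ratio (assumed)

# GBLUP breeding values: u_hat = G (G + lambda I)^{-1} (y - mean)
u_hat = G @ np.linalg.solve(G + lam * np.eye(n), y - y.mean())
print(u_hat.shape)
```

Because G measures realized genomic similarity rather than expected pedigree relationship, predictions for animals without recorded pedigree links to the reference set remain possible, which is the point the abstract makes.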
Evaluation of in silico tools to predict the skin sensitization potential of chemicals.
Verheyen, G R; Braeken, E; Van Deun, K; Van Miert, S
2017-01-01
Public domain and commercial in silico tools were compared for their performance in predicting the skin sensitization potential of chemicals. The packages were either statistical based (Vega, CASE Ultra) or rule based (OECD Toolbox, Toxtree, Derek Nexus). In practice, several of these in silico tools are used in gap filling and read-across, but here their use was limited to make predictions based on presence/absence of structural features associated to sensitization. The top 400 ranking substances of the ATSDR 2011 Priority List of Hazardous Substances were selected as a starting point. Experimental information was identified for 160 chemically diverse substances (82 positive and 78 negative). The prediction for skin sensitization potential was compared with the experimental data. Rule-based tools perform slightly better, with accuracies ranging from 0.6 (OECD Toolbox) to 0.78 (Derek Nexus), compared with statistical tools that had accuracies ranging from 0.48 (Vega) to 0.73 (CASE Ultra - LLNA weak model). Combining models increased the performance, with positive and negative predictive values up to 80% and 84%, respectively. However, the number of substances that were predicted positive or negative for skin sensitization in both models was low. Adding more substances to the dataset will increase the confidence in the conclusions reached. The insights obtained in this evaluation are incorporated in a web database www.asopus.weebly.com that provides a potential end user context for the scope and performance of different in silico tools with respect to a common dataset of curated skin sensitization data.
Can nutrient status of four woody plant species be predicted using field spectrometry?
NASA Astrophysics Data System (ADS)
Ferwerda, Jelle G.; Skidmore, Andrew K.
This paper demonstrates the potential of hyperspectral remote sensing to predict the chemical composition (i.e., nitrogen, phosphorus, calcium, potassium, sodium, and magnesium) of three tree species (i.e., willow, mopane and olive) and one shrub species (i.e., heather). Reflectance spectra, derivative spectra and continuum-removed spectra were compared in terms of predictive power. Results showed that the best predictions for nitrogen, phosphorus, and magnesium occur when using derivative spectra, and the best predictions for sodium, potassium, and calcium occur when using continuum-removed data. To test whether a general model for multiple species is also valid for individual species, a bootstrapping routine was applied. Prediction accuracies for the individual species were lower than the prediction accuracies obtained for the combined dataset for all but one element/species combination, indicating that indices with high prediction accuracy at the landscape scale are less appropriate for detecting the chemical content of individual species.
Survival prediction of trauma patients: a study on US National Trauma Data Bank.
Sefrioui, I; Amadini, R; Mauro, J; El Fallahi, A; Gabbrielli, M
2017-12-01
Exceptional circumstances such as major incidents or natural disasters may produce a huge number of victims who cannot all be saved immediately and simultaneously. In these cases it is important to define priorities, avoiding the waste of time and resources on victims who cannot be saved. The Trauma and Injury Severity Score (TRISS) methodology is the standard system usually used by practitioners to predict the survival probability of trauma patients. However, practitioners have noted that the accuracy of TRISS predictions is unacceptable, especially for severely injured patients. Thus, alternative methods should be considered. In this work we evaluate different approaches for predicting whether a patient will survive according to simple and easily measurable observations. We conducted a rigorous comparative study of the most important prediction techniques using real clinical data from the US National Trauma Data Bank. Empirical results show that well-known machine learning classifiers can outperform the TRISS methodology. Based on our findings, the best approach we evaluated is Random Forest: it has the best accuracy, the best area under the curve, and the best k-statistic, as well as the second-best sensitivity and specificity. It also has a good calibration curve. Furthermore, its performance increases monotonically as the dataset size grows, meaning that it can effectively exploit incoming knowledge. Considering the whole dataset, it is always better than TRISS. Finally, we implemented a new tool to compute the survival probability of victims, which will help medical practitioners obtain better accuracy than the TRISS tools. Random Forest may be a good candidate for improving survival predictions over the standard TRISS methodology.
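As a sketch of the kind of comparison the study describes, the snippet below pits a random forest against a logistic-regression baseline (standing in here for the fixed-coefficient TRISS formula) on synthetic data, since the National Trauma Data Bank itself is not publicly redistributable; the sample size, features, and class weights are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for simple trauma observations (age, vitals, injury scores...),
# with an imbalanced outcome roughly mimicking survival vs. death
X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# TRISS is a fixed logistic formula; a fitted logistic model plays its role here
logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

auc_logit = roc_auc_score(y_te, logit.predict_proba(X_te)[:, 1])
auc_rf = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
```

On real triage data one would additionally examine calibration and sensitivity/specificity, as the study does, rather than AUC alone.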
NASA Astrophysics Data System (ADS)
Hemmat Esfe, Mohammad; Tatar, Afshin; Ahangar, Mohammad Reza Hassani; Rostamian, Hossein
2018-02-01
Since conventional thermal fluids such as water, oil, and ethylene glycol have poor thermal properties, tiny solid particles are added to these fluids to improve their heat transfer. Because viscosity determines the rheological behavior of a fluid, studying the parameters affecting viscosity is crucial. Since the experimental measurement of viscosity is expensive and time consuming, predicting this parameter is an apt alternative. In this work, three artificial intelligence methods, Genetic Algorithm-Radial Basis Function Neural Networks (GA-RBF), Least Squares Support Vector Machine (LS-SVM) and Gene Expression Programming (GEP), were applied to predict the viscosity of TiO2/SAE 50 nano-lubricant with non-Newtonian power-law behavior using experimental data. The correlation factor (R2), Average Absolute Relative Deviation (AARD), Root Mean Square Error (RMSE), and Margin of Deviation were employed to assess the accuracy of the proposed models. RMSE values of 0.58, 1.28, and 6.59 and R2 values of 0.99998, 0.99991, and 0.99777 demonstrate the accuracy of the GA-RBF, CSA-LSSVM, and GEP methods, respectively. Among the developed models, GA-RBF shows the best accuracy.
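The three fit statistics used above are straightforward to compute; a short sketch follows, using hypothetical viscosity values rather than the paper's measurements.

```python
import numpy as np

def fit_metrics(y_true, y_pred):
    """R^2, RMSE, and average absolute relative deviation (AARD, %)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    aard = 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
    return r2, rmse, aard

# Hypothetical viscosity measurements vs. model predictions (cP)
mu_exp = np.array([120.0, 150.0, 185.0, 230.0, 290.0])
mu_mod = np.array([121.5, 148.0, 187.0, 228.5, 292.0])
r2, rmse, aard = fit_metrics(mu_exp, mu_mod)
```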
Mahshid, Minoo; Saboury, Aboulfazl; Fayaz, Ali; Sadr, Seyed Jalil; Lampert, Friedrich; Mir, Maziar
2012-01-01
Background Mechanical torque devices (MTDs) are among the most commonly recommended devices used to deliver optimal torque to the screw of dental implants. Recently, high variability has been reported in the accuracy of spring-style mechanical torque devices (S-S MTDs). The joint stability and survival rate of fixed implant-supported prostheses depend on the accuracy of these devices. Currently, there is limited information on the influence of steam sterilization on the accuracy of MTDs. The purpose of this study was to assess the effect of steam sterilization on the accuracy (±10% of the target torque) of spring-style mechanical torque devices for dental implants. Materials and methods Fifteen new S-S MTDs and their appropriate drivers from three different manufacturers (Nobel Biocare, Straumann [ITI], and Biomet 3i [3i]) were selected. The peak torque of the devices (5 in each subgroup) was measured before and after autoclaving using a Tohnichi torque gauge. Descriptive statistical analysis was used, and a repeated-measures ANOVA with type of device as a between-subject factor was performed to assess the difference in accuracy among the three groups of spring-style mechanical torque devices after sterilization. A Bonferroni post hoc test was used for pairwise comparisons. Results Before steam sterilization, all the tested devices stayed within 10% of their target values. After 100 sterilization cycles, results did not show any significant difference between raw and absolute error values in the Nobel Biocare and ITI devices; however, the results demonstrated an increase of error values in the 3i group (P < 0.05). Raw error values increased with a predictable pattern in 3i devices and showed more than a 10% difference from target torque values (a maximum difference of 14% from target torque was seen in 17% of peak torque measurements).
Conclusion Within the limitations of this study, steam sterilization did not affect the accuracy (±10% of the target torque) of the Nobel Biocare and ITI MTDs. Raw error values increased with a predictable pattern in 3i devices and showed more than a 10% difference from target torque values. Before expanding upon the clinical implications, the controlled and combined effect of aging (frequency of use) and steam sterilization needs further investigation. PMID:23674923
Song, Wan; Bang, Seok Hwan; Jeon, Hwang Gyun; Jeong, Byong Chang; Seo, Seong Il; Jeon, Seong Soo; Choi, Han Yong; Kim, Chan Kyo; Lee, Hyun Moo
2018-02-23
The objective of this study was to investigate the effect of Prostate Imaging Reporting and Data System version 2 (PI-RADSv2) on prediction of postoperative Gleason score (GS) upgrading for patients with biopsy GS 6 prostate cancer. We retrospectively reviewed 443 patients who underwent magnetic resonance imaging (MRI) and radical prostatectomy for biopsy-proven GS 6 prostate cancer between January 2011 and December 2013. Preoperative clinical variables and pathologic GS were examined, and all MRI findings were assessed with PI-RADSv2. Receiver operating characteristic curves were used to compare the predictive accuracies of multivariate logistic regression models with and without PI-RADSv2. Of the total 443 patients, 297 (67.0%) experienced GS upgrading postoperatively. PI-RADSv2 scores 1 to 3 and 4 to 5 were identified in 157 (35.4%) and 286 (64.6%) patients, respectively, and the rates of GS upgrading were 54.1% and 74.1%, respectively (P < .001). In multivariate analysis, prostate-specific antigen density > 0.16 ng/mL², number of positive cores ≥ 2, maximum percentage of cancer per core > 20%, and PI-RADSv2 score 4 to 5 were independent predictors of GS upgrading (each P < .05). When the predictive accuracies of multivariate models with and without PI-RADSv2 were compared, the model including PI-RADSv2 was shown to have significantly higher accuracy (area under the curve, 0.729 vs. 0.703; P = .041). Use of PI-RADSv2 is an independent predictor of postoperative GS upgrading and increases the predictive accuracy of GS upgrading. PI-RADSv2 might be used as a preoperative imaging tool to determine risk classification and to help counsel patients with regard to treatment decisions and prognosis of disease. Copyright © 2018 Elsevier Inc. All rights reserved.
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments
Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic
2001-01-01
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Carroll, Julia M; Mundy, Ian R; Cunningham, Anna J
2014-09-01
It is well established that speech, language and phonological skills are closely associated with literacy, and that children with a family risk of dyslexia (FRD) tend to show deficits in each of these areas in the preschool years. This paper examines the relationships between FRD and these skills, and whether deficits in speech, language and phonological processing fully account for the increased risk of dyslexia in children with FRD. One hundred and fifty-three 4-6-year-old children, 44 of whom had FRD, completed a battery of speech, language, phonology and literacy tasks. Word reading and spelling were retested 6 months later, and text reading accuracy and reading comprehension were tested 3 years later. The children with FRD were at increased risk of developing difficulties in reading accuracy, but not reading comprehension. Four groups were compared: good and poor readers with and without FRD. In most cases good readers outperformed poor readers regardless of family history, but there was an effect of family history on naming and nonword repetition regardless of literacy outcome, suggesting a role for speech production skills as an endophenotype of dyslexia. Phonological processing predicted spelling, while language predicted text reading accuracy and comprehension. FRD was a significant additional predictor of reading and spelling after controlling for speech production, language and phonological processing, suggesting that children with FRD show additional difficulties in literacy that cannot be fully explained in terms of their language and phonological skills. © 2014 John Wiley & Sons Ltd.
Avsec, Žiga; Cheng, Jun; Gagneur, Julien
2018-01-01
Motivation: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as the transcription start site, exon boundaries or the polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength in learning complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed. Results: Here we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances. Modeling distances to various genomic landmarks with spline transformations significantly increased state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 120 out of 123 proteins. We also developed a deep neural network for the human splice branchpoint, based on spline transformations, that outperformed the current best, already distance-based, machine learning model. Compared to piecewise linear transformation, as obtained by composition of rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. As spline transformation can be applied to further quantities beyond distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox. Availability and implementation: Spline transformation is implemented as a Keras layer in the CONCISE python package: https://github.com/gagneurlab/concise. Analysis code is available at https://github.com/gagneurlab/Manuscript_Avsec_Bioinformatics_2017. Contact: avsec@in.tum.de or gagneur@in.tum.de. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:29155928
Prestimulus EEG Power Predicts Conscious Awareness But Not Objective Visual Performance
Veniero, Domenica
2017-01-01
Prestimulus oscillatory neural activity has been linked to perceptual outcomes during performance of psychophysical detection and discrimination tasks. Specifically, the power and phase of low frequency oscillations have been found to predict whether an upcoming weak visual target will be detected or not. However, the mechanisms by which baseline oscillatory activity influences perception remain unclear. Recent studies suggest that the frequently reported negative relationship between α power and stimulus detection may be explained by changes in detection criterion (i.e., increased target present responses regardless of whether the target was present/absent) driven by the state of neural excitability, rather than changes in visual sensitivity (i.e., more veridical percepts). Here, we recorded EEG while human participants performed a luminance discrimination task on perithreshold stimuli in combination with single-trial ratings of perceptual awareness. Our aim was to investigate whether the power and/or phase of prestimulus oscillatory activity predict discrimination accuracy and/or perceptual awareness on a trial-by-trial basis. Prestimulus power (3–28 Hz) was inversely related to perceptual awareness ratings (i.e., higher ratings in states of low prestimulus power/high excitability) but did not predict discrimination accuracy. In contrast, prestimulus oscillatory phase did not predict awareness ratings or accuracy in any frequency band. These results provide evidence that prestimulus α power influences the level of subjective awareness of threshold visual stimuli but does not influence visual sensitivity when a decision has to be made regarding stimulus features. Hence, we find a clear dissociation between the influence of ongoing neural activity on conscious awareness and objective performance. PMID:29255794
NASA Astrophysics Data System (ADS)
Mannon, Timothy Patrick, Jr.
Improving well design has been and always will be a primary goal of drilling operations in the oil and gas industry. Oil and gas plays continue to move into increasingly hostile drilling environments, including near-salt and/or sub-salt proximities. The ability to reduce the risk and uncertainty involved in drilling operations in unconventional geologic settings starts with improving the techniques for mudweight window modeling. To address this issue, an analysis of wellbore stability and well design improvement has been conducted. This study shows a systematic approach to well design by focusing on best practices for mudweight window projection for a field in Mississippi Canyon, Gulf of Mexico. The field includes depleted reservoirs and is in close proximity to salt intrusions. Analysis of offset wells has been conducted in the interest of developing an accurate picture of the subsurface environment by making connections between depth, non-productive time (NPT) events, and mudweights used. Commonly practiced petrophysical methods of pore pressure, fracture pressure, and shear failure gradient prediction have been applied to key offset wells in order to enhance the well design for two proposed wells. For the first time in the literature, the accuracy of the commonly accepted seismic-interval-velocity-based and the relatively new seismic-frequency-based methodologies for pore pressure prediction are qualitatively and quantitatively compared. Accuracy standards are based on the agreement of the seismic outputs with pressure data obtained while drilling and with petrophysically based pore pressure outputs for each well. The results show significantly higher accuracy for the seismic-frequency-based approach in wells in near-salt/sub-salt environments and higher overall accuracy across all of the wells in the study.
Lee, Sunghee; Lee, Seung Ku; Kim, Jong Yeol; Cho, Namhan; Shin, Chol
2017-09-02
To examine whether the use of Sasang constitutional (SC) types, namely the Tae-yang (TY), Tae-eum (TE), So-yang (SY), and So-eum (SE) types, increases the accuracy of risk prediction for metabolic syndrome. From 2001 to 2014, 3529 individuals aged 40 to 69 years participated in a longitudinal prospective cohort. The Cox proportional hazards model was utilized to predict the risk of developing metabolic syndrome. During the 14-year follow-up, 1591 incident events of metabolic syndrome were observed. Individuals with the TE type had higher body mass indexes and waist circumferences than individuals with the SY and SE types. The risk of developing metabolic syndrome was highest among individuals with the TE type, followed by the SY type and the SE type. When the prediction risk models for incident metabolic syndrome were compared, the area under the curve for the model using SC types was significantly increased, to 0.8173. Significant predictors of incident metabolic syndrome differed according to SC type. For individuals with the TE type, the significant predictors were age, sex, body mass index (BMI), education, smoking, drinking, fasting glucose level, high-density lipoprotein (HDL) cholesterol level, systolic and diastolic blood pressure, and triglyceride level. For individuals with the SE type, the predictors were sex, smoking, fasting glucose level, HDL cholesterol level, systolic and diastolic blood pressure, and triglyceride level, while the predictors in individuals with the SY type were age, sex, BMI, smoking, drinking, total cholesterol level, fasting glucose level, HDL cholesterol level, systolic and diastolic blood pressure, and triglyceride level. In this prospective cohort study of 3529 individuals, we observed that utilizing the SC types significantly increased the accuracy of risk prediction for the development of metabolic syndrome.
Kesorn, Kraisak; Ongruk, Phatsavee; Chompoosri, Jakkrawarn; Phumee, Atchara; Thavara, Usavadee; Tawatsin, Apiwat; Siriyasatien, Padet
2015-01-01
Background In the past few decades, several researchers have proposed highly accurate prediction models that have typically relied on climate parameters. However, climate factors can be unreliable and can lower the effectiveness of prediction when they are applied in locations where climate factors do not differ significantly. The purpose of this study was to improve a dengue surveillance system in areas with similar climate by exploiting the infection rate in the Aedes aegypti mosquito and using the support vector machine (SVM) technique for forecasting the dengue morbidity rate. Methods and Findings Areas with a high incidence of dengue outbreaks in central Thailand were studied. The proposed framework consisted of the following three major parts: 1) data integration, 2) model construction, and 3) model evaluation. We discovered that the Ae. aegypti female and larval mosquito infection rates were significantly positively associated with the morbidity rate. Thus, an increasing infection rate of female mosquitoes and larvae led to a higher number of dengue cases, and the prediction performance increased when those predictors were integrated into a predictive model. In this research, we applied the SVM with the radial basis function (RBF) kernel to forecast the high morbidity rate and take precautions to prevent the development of pervasive dengue epidemics. The experimental results showed that the introduced parameters significantly increased the prediction accuracy to 88.37% when used on the test set data, and these parameters led to the highest performance compared to state-of-the-art forecasting models. Conclusions The infection rates of the Ae. aegypti female mosquitoes and larvae improved the morbidity rate forecasting efficiency better than the climate parameters used in classical frameworks. We demonstrated that the SVM-R-based model has high generalization performance and obtained the highest prediction performance compared to classical models as measured by the accuracy, sensitivity, specificity, and mean absolute error (MAE). PMID:25961289
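A minimal sketch of the modeling approach, an RBF-kernel support vector regressor mapping mosquito infection rates to a morbidity rate, is shown below on synthetic data; the predictor names, sample sizes, functional form, and hyperparameters are illustrative assumptions, not the study's settings.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 300
# Hypothetical predictors: female mosquito infection rate, larval infection rate
X = rng.uniform(0.0, 1.0, size=(n, 2))
# Synthetic morbidity rate rising with both infection rates, plus noise
y = 10 * X[:, 0] + 6 * X[:, 1] ** 2 + rng.normal(0, 0.5, n)

# Standardize inputs, then fit an RBF-kernel support vector regressor
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X[:200], y[:200])

# Held-out mean absolute error, one of the study's reported metrics
mae = mean_absolute_error(y[200:], model.predict(X[200:]))
```

The RBF kernel lets the regressor capture the nonlinear (here quadratic) dependence on the larval infection rate that a linear model would miss.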
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and in tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often applied, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the accuracy of pixel-level tree species classification from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy for six species classes is about 75%.
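The Gaussian-filter step can be illustrated on a single synthetic spectrum: smoothing along the band axis reduces interband noise while leaving the slowly varying reflectance signal largely intact. The wavelength range, noise level, and `sigma` below are assumed values, not the study's settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)
bands = np.linspace(400, 2500, 200)                  # wavelengths (nm), hypothetical
smooth_spectrum = 0.3 + 0.2 * np.sin(bands / 300.0)  # idealized slowly varying reflectance
noisy = smooth_spectrum + rng.normal(0, 0.02, bands.size)  # interband noise

# Smooth along the spectral dimension; sigma is in band units (assumed value)
denoised = gaussian_filter1d(noisy, sigma=2.0)

rmse_before = np.sqrt(np.mean((noisy - smooth_spectrum) ** 2))
rmse_after = np.sqrt(np.mean((denoised - smooth_spectrum) ** 2))
```

Because the underlying reflectance varies slowly relative to the filter width, the noise attenuation dominates the smoothing bias, so the denoised spectrum sits closer to the true one.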
Predictive performance of four frailty measures in an older Australian population
Widagdo, Imaina S.; Pratt, Nicole; Russell, Mary; Roughead, Elizabeth E.
2015-01-01
Background: there are several different frailty measures available for identifying the frail elderly. However, their predictive performance in an Australian population has not been examined. Objective: to examine the predictive performance of four internationally validated frailty measures in an older Australian population. Methods: a retrospective study in the Australian Longitudinal Study of Ageing (ALSA) with 2,087 participants. Frailty was measured at baseline using the frailty phenotype (FP), simplified frailty phenotype (SFP), frailty index (FI) and prognostic frailty score (PFS). Odds ratios (OR) were calculated to measure the association between frailty and outcomes at Wave 3, including mortality, hospitalisation, nursing home admission, fall and a combination of all outcomes. Predictive performance was measured by assessing sensitivity, specificity, positive and negative predictive values (PPV and NPV) and likelihood ratios (LR). The area under the curve (AUC) was examined for both the dichotomised and the multilevel or continuous forms of the measures. Results: prevalence of frailty varied from 2% up to 49% between the measures. Frailty was significantly associated with an increased risk of any outcome, OR (95% confidence interval) for FP: 1.9 (1.4–2.8), SFP: 3.6 (1.5–8.8), FI: 3.4 (2.7–4.3) and PFS: 2.3 (1.8–2.8). PFS had high sensitivity across all outcomes (sensitivity: 55.2–77.1%). The PPV for any outcome was highest for SFP and FI (70.8 and 69.7%, respectively). Only FI had acceptable accuracy in predicting outcomes, AUC: 0.59–0.70. Conclusions: being identified as frail by any of the four measures was associated with an increased risk of outcomes; however, the measures' predictive accuracy varied. PMID:26504118
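The predictive-performance statistics reported above all derive from a 2x2 table of predicted frailty versus observed outcome; a small sketch with hypothetical counts follows.

```python
def predictive_performance(tp, fp, fn, tn):
    """Screening-test summary statistics from a 2x2 table of counts."""
    sens = tp / (tp + fn)               # sensitivity
    spec = tn / (tn + fp)               # specificity
    ppv = tp / (tp + fp)                # positive predictive value
    npv = tn / (tn + fn)                # negative predictive value
    lr_pos = sens / (1 - spec)          # positive likelihood ratio
    lr_neg = (1 - sens) / spec          # negative likelihood ratio
    return sens, spec, ppv, npv, lr_pos, lr_neg

# Hypothetical counts: frail vs. non-frail against any adverse outcome
sens, spec, ppv, npv, lr_pos, lr_neg = predictive_performance(60, 40, 25, 175)
```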
NASA Astrophysics Data System (ADS)
Sembiring, J.; Jones, F.
2018-03-01
The red cell distribution width (RDW) to platelet ratio (RPR) can predict liver fibrosis and cirrhosis in chronic hepatitis B with relatively high accuracy. RPR has been reported to be superior to other non-invasive methods of predicting liver fibrosis, such as the AST/ALT ratio, the AST-to-platelet ratio index, and FIB-4. The aim of this study was to assess the diagnostic accuracy of the RDW and platelet ratio for liver fibrosis in chronic hepatitis B patients, compared with Fibroscan. This cross-sectional study was conducted at Adam Malik Hospital from January to June 2015. We examined 34 patients with chronic hepatitis B, screening RDW, platelet count, and Fibroscan. Data were statistically analyzed. With the ROC procedure, RPR had an accuracy of 72.3% (95% CI: 84.1% - 97%). In this study, the RPR had a moderate ability to predict fibrosis degree (p = 0.029 with AUC > 70%). The cutoff value of RPR was 0.0591; sensitivity and specificity were 71.4% and 60%, the positive predictive value (PPV) was 55.6% and the negative predictive value (NPV) was 75%; the positive likelihood ratio was 1.79 and the negative likelihood ratio was 0.48. RPR can predict the degree of liver fibrosis in chronic hepatitis B patients with moderate accuracy.
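RPR itself is a one-line computation, commonly defined as RDW in percent divided by the platelet count in 10^9/L; applying the abstract's 0.0591 cutoff to a hypothetical patient (not from the study's cohort) looks like this:

```python
def rdw_platelet_ratio(rdw_percent, platelet_count_1e9_per_L):
    """RDW-to-platelet ratio (RPR), a noninvasive fibrosis screening index."""
    return rdw_percent / platelet_count_1e9_per_L

# Hypothetical patient values: RDW 14.2%, platelets 210 x 10^9/L
rpr = rdw_platelet_ratio(14.2, 210.0)
predicted_fibrosis = rpr > 0.0591   # cutoff reported in the abstract
```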
Noise interference with echo delay discrimination in bat biosonar.
Simmons, J A
2017-11-01
Echolocating big brown bats (Eptesicus fuscus) were trained in a two-choice task to discriminate differences in the delay of electronic echoes at 1.7 ms delay (30 cm simulated range). Difference thresholds (∼45 μs) were comparable to previously published results. At selected above-threshold differences (116 and 232 μs delay), performance was measured in the presence of wideband random noise at increasing amplitudes in 10-dB steps to determine the noise level that prevented discrimination. Performance eventually failed, but the bats increased the amplitude and duration of their broadcasts to compensate for increasing noise, which allowed performance to persist at noise levels about 25 dB higher than without compensation. In the 232-μs delay discrimination condition, the echo signal-to-noise ratio (2E/N0) was 8-10 dB at the noise level that depressed performance to chance. Predicted echo-delay accuracy using big brown bat signals follows the Cramér-Rao bound for signal-to-noise ratios above 15 dB, but worsens below 15 dB due to side-peak ambiguity. At 2E/N0 = 7-10 dB, predicted Cramér-Rao delay accuracy would be about 1 μs; considering side-peak ambiguity it would be about 200-300 μs. The bats' 232 μs performance reflects the intrusion of side-peak ambiguity into delay accuracy at low signal-to-noise ratios.
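For intuition, the Cramér-Rao lower bound on delay jitter scales inversely with both the rms signal bandwidth and the square root of 2E/N0. The sketch below evaluates the standard formula with an assumed 30 kHz rms bandwidth; it reproduces the microsecond order of magnitude discussed above, not the paper's exact numbers.

```python
import math

def cramer_rao_delay_std(snr_2e_n0_db, rms_bandwidth_hz):
    """Cramér-Rao lower bound on echo-delay standard deviation.

    snr_2e_n0_db: 2E/N0 expressed in decibels.
    rms_bandwidth_hz: rms (Gabor) bandwidth of the signal.
    Uses sigma_tau = 1 / (2*pi*beta*sqrt(2E/N0)).
    """
    snr = 10.0 ** (snr_2e_n0_db / 10.0)
    return 1.0 / (2.0 * math.pi * rms_bandwidth_hz * math.sqrt(snr))

# Illustrative: assumed ~30 kHz rms bandwidth for a bat FM sweep at 2E/N0 = 15 dB
sigma_tau = cramer_rao_delay_std(15.0, 30e3)   # seconds
```

Note that the bound only describes the high-SNR regime; as the abstract explains, below about 15 dB side-peak ambiguity makes real accuracy far worse than this formula suggests.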
Comparison of Three Risk Scores to Predict Outcomes of Severe Lower Gastrointestinal Bleeding
Camus, Marine; Jensen, Dennis M.; Ohning, Gordon V.; Kovacs, Thomas O.; Jutabha, Rome; Ghassemi, Kevin A.; Machicado, Gustavo A.; Dulai, Gareth S.; Jensen, Mary Ellen; Gornbein, Jeffrey A.
2014-01-01
Background & aims Improved medical decisions through use of a score at the initial patient triage level may lead to improvements in patient management, outcomes, and resource utilization. Unlike for upper GI bleeding, there is no validated score for the management of lower gastrointestinal bleeding (LGIB). The aim of our study was to compare the accuracies of 3 different prognostic scores (the CURE Hemostasis prognosis score, the Charlson index and the ASA score) for the prediction of rebleeding, surgery and death at 30 days in severe LGIB. Methods Data on consecutive patients hospitalized with severe GI bleeding from January 2006 to October 2011 in our two tertiary academic referral centers were prospectively collected. Sensitivities, specificities, accuracies and areas under the receiver operating characteristic curve (AUROC) were computed for the three scores for prediction of rebleeding, surgery and mortality at 30 days. Results 235 consecutive patients with LGIB were included between 2006 and 2011. 23% of patients rebled, 6% had surgery, and 7.7% of patients died. The accuracy of each score never reached 70% for predicting rebleeding or surgery. The ASA score had the highest accuracy for predicting mortality within 30 days (83.5%), whereas the CURE Hemostasis prognosis score and the Charlson index both had accuracies less than 75% for the prediction of death within 30 days. Conclusions The ASA score could be useful for predicting death within 30 days. However, a new score is still warranted to predict all 30-day outcomes (rebleeding, surgery and death) in LGIB. PMID:25599218
Effectiveness of Link Prediction for Face-to-Face Behavioral Networks
Tsugawa, Sho; Ohsaki, Hiroyuki
2013-01-01
Research on link prediction for social networks has been actively pursued. In link prediction for a given social network obtained from time-windowed observation, new link formation in the network is predicted from the topology of the obtained network. In contrast, recent advances in sensing technology have made it possible to obtain face-to-face behavioral networks, which are social networks representing face-to-face interactions among people. However, the effectiveness of link prediction techniques for face-to-face behavioral networks has not yet been explored in depth. To clarify this point, here we investigate the accuracy of conventional link prediction techniques for networks obtained from the history of face-to-face interactions among participants at an academic conference. Our findings were (1) that conventional link prediction techniques predict new link formation with a precision of 0.30–0.45 and a recall of 0.10–0.20, (2) that prolonged observation of social networks often degrades the prediction accuracy, (3) that the proposed decaying weight method leads to higher prediction accuracy than can be achieved by observing all records of communication and simply using them unmodified, and (4) that the prediction accuracy for face-to-face behavioral networks is relatively high compared to that for non-social networks, but not as high as for other types of social networks. PMID:24339956
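Conventional link prediction of the kind evaluated here scores absent node pairs by topological similarity in the observed network; the simplest such baseline, common-neighbor counting, can be sketched as follows on a toy contact graph (the graph itself is hypothetical).

```python
import itertools

def common_neighbor_scores(adj):
    """Score each absent edge by the number of shared neighbors,
    a classic link-prediction baseline."""
    nodes = sorted(adj)
    scores = {}
    for u, v in itertools.combinations(nodes, 2):
        if v not in adj[u]:                      # only score pairs not yet linked
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores

# Toy undirected face-to-face contact graph (adjacency sets, hypothetical)
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
scores = common_neighbor_scores(adj)
best = max(scores, key=scores.get)   # highest-scoring candidate for a new link
```

Ranking candidate pairs by such scores and thresholding the ranking is what yields the precision and recall figures reported in the abstract.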
User Controllability in a Hybrid Recommender System
ERIC Educational Resources Information Center
Parra Santander, Denis Alejandro
2013-01-01
Since the introduction of Tapestry in 1990, research on recommender systems has traditionally focused on the development of algorithms whose goal is to increase the accuracy of predicting users' taste based on historical data. In the last decade, this research has diversified, with "human factors" being one area that has received…
Brightness perception of unrelated self-luminous colors.
Withouck, Martijn; Smet, Kevin A G; Ryckaert, Wouter R; Pointer, Michael R; Deconinck, Geert; Koenderink, Jan; Hanselaer, Peter
2013-06-01
The perception of brightness of unrelated self-luminous colored stimuli of the same luminance has been investigated. The Helmholtz-Kohlrausch (H-K) effect, i.e., an increase in brightness perception due to an increase in saturation, is clearly observed. This brightness perception is compared with the calculated brightness according to six existing vision models, color appearance models, and models based on the concept of equivalent luminance. Although these models included the H-K effect and half of them were developed to work with unrelated colors, none of the models seemed to be able to fully predict the perceived brightness. A tentative solution to increase the prediction accuracy of the color appearance model CAM97u, developed by Hunt, is presented.
Wang, Ming; Long, Qi
2016-09-01
Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
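A minimal sketch of an IPCW-weighted, truncated c-statistic of the kind discussed here, assuming the caller supplies the censoring survival curve G (e.g. a Kaplan-Meier fit to the censoring distribution) and a truncation time tau; the CAR/NCAR sensitivity machinery is omitted:

```python
def ipcw_cindex(time, event, score, G, tau):
    """Uno-style IPCW concordance sketch: pairs (i, j) are comparable when
    subject i has an observed event before tau and before subject j's time;
    each pair is weighted by 1 / G(T_i)^2, with G the censoring survival curve."""
    num = den = 0.0
    for i in range(len(time)):
        if not event[i] or time[i] >= tau:
            continue                        # only observed events before tau
        w = 1.0 / G(time[i]) ** 2
        for j in range(len(time)):
            if time[j] > time[i]:           # j still at risk when i failed
                den += w
                if score[i] > score[j]:     # higher-risk subject failed first
                    num += w
                elif score[i] == score[j]:
                    num += 0.5 * w
    return num / den
```

With no censoring (G identically 1) this reduces to the standard c-statistic.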
Aliloo, Hassan; Pryce, Jennie E; González-Recio, Oscar; Cocks, Benjamin G; Hayes, Ben J
2016-02-01
Dominance effects may contribute to genetic variation of complex traits in dairy cattle, especially for traits closely related to fitness such as fertility. However, traditional genetic evaluations generally ignore dominance effects and consider additive genetic effects only. Availability of dense single nucleotide polymorphisms (SNPs) panels provides the opportunity to investigate the role of dominance in quantitative variation of complex traits at both the SNP and animal levels. Including dominance effects in the genomic evaluation of animals could also help to increase the accuracy of prediction of future phenotypes. In this study, we estimated additive and dominance variance components for fertility and milk production traits of genotyped Holstein and Jersey cows in Australia. The predictive abilities of a model that accounts for additive effects only (additive), and a model that accounts for both additive and dominance effects (additive + dominance) were compared in a fivefold cross-validation. Estimates of the proportion of dominance variation relative to phenotypic variation that is captured by SNPs, for production traits, were up to 3.8% and 7.1% in Holstein and Jersey cows, respectively, whereas, for fertility, they were equal to 1.2% in Holstein and very close to zero in Jersey cows. We found that including dominance in the model was not consistently advantageous. Based on maximum likelihood ratio tests, the additive + dominance model fitted the data better than the additive model, for milk, fat and protein yields in both breeds. However, regarding the prediction of phenotypes assessed with fivefold cross-validation, including dominance effects in the model improved accuracy only for fat yield in Holstein cows. Regression coefficients of phenotypes on genetic values and mean squared errors of predictions showed that the predictive ability of the additive + dominance model was superior to that of the additive model for some of the traits.
In both breeds, dominance effects were significant (P < 0.01) for all milk production traits but not for fertility. Accuracy of prediction of phenotypes was slightly increased by including dominance effects in the genomic evaluation model. Thus, it can help to better identify highly performing individuals and be useful for culling decisions.
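How additive and dominance genomic relationship matrices might be built from 0/1/2 SNP genotypes, as a hedged sketch (a VanRaden-style additive G and a heterozygosity-deviation dominance D; the study's exact parameterization is not given in the abstract):

```python
import numpy as np

def relationship_matrices(M):
    """Additive (VanRaden) and dominance genomic relationship matrices
    from an n x m matrix M of 0/1/2 SNP genotype codes."""
    p = M.mean(axis=0) / 2.0                        # allele frequencies
    Z = M - 2.0 * p                                 # centred additive codes
    G = Z @ Z.T / np.sum(2.0 * p * (1.0 - p))
    # dominance coding: heterozygote indicator minus its expectation 2p(1-p)
    H = (M == 1).astype(float)
    W = H - 2.0 * p * (1.0 - p)
    D = W @ W.T / np.sum((2.0 * p * (1.0 - p)) ** 2)
    return G, D
```

G and D would then enter a two-variance-component mixed model for the additive + dominance evaluation.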
Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.
2007-01-01
To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.
Hydrometeorological model for streamflow prediction
Tangborn, Wendell V.
1979-01-01
The hydrometeorological model described in this manual was developed to predict seasonal streamflow from water in storage in a basin using streamflow and precipitation data. The model, as described, applies specifically to the Skokomish, Nisqually, and Cowlitz Rivers, in Washington State, and more generally to streams in other regions that derive seasonal runoff from melting snow. Thus the techniques demonstrated for these three drainage basins can be used as a guide for applying this method to other streams. Input to the computer program consists of daily averages of gaged runoff of these streams, and daily values of precipitation collected at Longmire, Kid Valley, and Cushman Dam. Predictions are based on estimates of the absolute storage of water, predominantly as snow: storage is approximately equal to basin precipitation less observed runoff. A pre-forecast test season is used to revise the storage estimate and improve the prediction accuracy. To obtain maximum prediction accuracy for operational applications with this model, a systematic evaluation of several hydrologic and meteorologic variables is first necessary. Six input options to the computer program that control prediction accuracy are developed and demonstrated. Predictions of streamflow can be made at any time and for any length of season, although accuracy is usually poor for early-season predictions (before December 1) or for short seasons (less than 15 days). The coefficient of prediction (CP), the chief measure of accuracy used in this manual, approaches zero during the late autumn and early winter seasons and reaches a maximum of about 0.85 during the spring snowmelt season. (Kosco-USGS)
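The storage bookkeeping and the coefficient of prediction can be sketched as follows; defining CP as one minus the ratio of squared prediction error to observed variance is an assumption consistent with its described 0-to-0.85 range, not a formula quoted from the manual:

```python
def basin_storage(precip, runoff):
    """Running estimate of water stored in the basin (mostly as snow):
    cumulative basin precipitation less cumulative observed runoff."""
    s, out = 0.0, []
    for p, q in zip(precip, runoff):
        s += p - q
        out.append(s)
    return out

def coefficient_of_prediction(observed, predicted):
    """CP = 1 - SSE / SStot: 1 is a perfect forecast, 0 is no better
    than predicting the long-term mean."""
    mean = sum(observed) / len(observed)
    sse = sum((o, f) [0] * 0 + (o - f) ** 2 for o, f in zip(observed, predicted))
    sst = sum((o - mean) ** 2 for o in observed)
    return 1.0 - sse / sst
```

A pre-forecast test season would compare early predictions against observed flow and adjust the storage estimate before issuing the seasonal forecast.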
Protein docking prediction using predicted protein-protein interface.
Li, Bin; Kihara, Daisuke
2012-01-10
Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein docking methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Li, Zai-Shang; Chen, Peng; Yao, Kai; Wang, Bin; Li, Jing; Mi, Qi-Wu; Chen, Xiao-Feng; Zhao, Qi; Li, Yong-Hong; Chen, Jie-Ping; Deng, Chuang-Zhong; Ye, Yun-Lin; Zhong, Ming-Zhu; Liu, Zhuo-Wei; Qin, Zi-Ke; Lin, Xiang-Tian; Liang, Wei-Cong; Han, Hui; Zhou, Fang-Jian
2016-04-12
To determine the predictive value and feasibility of the new outcome prediction model for Chinese patients with penile squamous cell carcinoma. The 3-year disease-specific survival (DSS) was 92.3% in patients with < 8.70 mg/L CRP and 54.9% in those with elevated CRP (P < 0.001). The 3-year DSS was 86.5% in patients with a BMI < 22.6 kg/m2 and 69.9% in those with a higher BMI (P = 0.025). In a multivariate analysis, pathological T stage (P < 0.001), pathological N stage (P = 0.002), BMI (P = 0.002), and CRP (P = 0.004) were independent predictors of DSS. A new scoring model was developed, consisting of BMI, CRP, and tumor T and N classification. In our study, we found that the addition of the above-mentioned parameters significantly increased the predictive accuracy of the system of the American Joint Committee on Cancer (AJCC) anatomic stage group. The accuracy of the new prediction category was verified. A total of 172 Chinese patients with penile squamous cell cancer were analyzed retrospectively between November 2005 and November 2014. Statistical data analysis was conducted using the nonparametric method. Survival analysis was performed with the log-rank test and the Cox proportional hazard model. Based on regression estimates of significant parameters in multivariate analysis, a new BMI-, CRP- and pathologic factors-based scoring model was developed to predict disease-specific outcomes. The predictive accuracy of the model was evaluated using internal and external validation. The present study demonstrated that the TNCB score group system may be a precise and easy-to-use tool for predicting outcomes in Chinese penile squamous cell carcinoma patients.
Common polygenic variation enhances risk prediction for Alzheimer's disease.
Escott-Price, Valentina; Sims, Rebecca; Bannister, Christian; Harold, Denise; Vronskaya, Maria; Majounie, Elisa; Badarinarayan, Nandini; Morgan, Kevin; Passmore, Peter; Holmes, Clive; Powell, John; Brayne, Carol; Gill, Michael; Mead, Simon; Goate, Alison; Cruchaga, Carlos; Lambert, Jean-Charles; van Duijn, Cornelia; Maier, Wolfgang; Ramirez, Alfredo; Holmans, Peter; Jones, Lesley; Hardy, John; Seshadri, Sudha; Schellenberg, Gerard D; Amouyel, Philippe; Williams, Julie
2015-12-01
The identification of subjects at high risk for Alzheimer's disease is important for prognosis and early intervention. We investigated the polygenic architecture of Alzheimer's disease and the accuracy of Alzheimer's disease prediction models, including and excluding the polygenic component in the model. This study used genotype data from the powerful dataset comprising 17 008 cases and 37 154 controls obtained from the International Genomics of Alzheimer's Project (IGAP). Polygenic score analysis tested whether the alleles identified to associate with disease in one sample set were significantly enriched in the cases relative to the controls in an independent sample. The disease prediction accuracy was investigated in a subset of the IGAP data, a sample of 3049 cases and 1554 controls (for whom APOE genotype data were available) by means of sensitivity, specificity, area under the receiver operating characteristic curve (AUC) and positive and negative predictive values. We observed significant evidence for a polygenic component enriched in Alzheimer's disease (P = 4.9 × 10(-26)). This enrichment remained significant after APOE and other genome-wide associated regions were excluded (P = 3.4 × 10(-19)). The best prediction accuracy AUC = 78.2% (95% confidence interval 77-80%) was achieved by a logistic regression model with APOE, the polygenic score, sex and age as predictors. In conclusion, Alzheimer's disease has a significant polygenic component, which has predictive utility for Alzheimer's disease risk and could be a valuable research tool complementing experimental designs, including preventative clinical trials, stem cell selection and high/low risk clinical studies. In modelling a range of sample disease prevalences, we found that polygenic scores almost double case prediction relative to chance, with increased prediction at polygenic extremes. © The Author (2015). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.
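A minimal sketch of a polygenic score and the rank-based AUC used to judge it; the 0/1/2 risk-allele coding and log-odds weights are standard conventions assumed here, not details taken from the paper:

```python
import numpy as np

def polygenic_score(genotypes, betas):
    """PRS: weighted sum of risk-allele counts (0/1/2) with per-SNP
    log-odds weights, one score per subject."""
    return genotypes @ betas

def auc(scores, labels):
    """Rank-based AUC: probability that a randomly chosen case
    outscores a randomly chosen control (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In the paper's best model the PRS enters a logistic regression alongside APOE genotype, sex and age.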
Redlin, Matthias; Boettcher, Wolfgang; Dehmel, Frank; Cho, Mi-Young; Kukucka, Marian; Habazettl, Helmut
2017-11-01
When applying a blood-conserving approach in paediatric cardiac surgery with the aim of reducing the transfusion of homologous blood products, the decision to use blood or blood-free priming of the cardiopulmonary bypass (CPB) circuit is often based on the predicted haemoglobin concentration (Hb) as derived from the pre-CPB Hb, the prime volume and the estimated blood volume. We assessed the accuracy of this approach and whether it may be improved by using more sophisticated methods of estimating the blood volume. Data from 522 paediatric cardiac surgery patients treated with CPB with blood-free priming in a 2-year period from May 2013 to May 2015 were collected. Inclusion criteria were body weight <15 kg and available Hb data immediately prior to and after the onset of CPB. The Hb on CPB was predicted according to Fick's principle from the pre-CPB Hb, the prime volume and the patient blood volume. Linear regression analyses and Bland-Altman plots were used to assess the accuracy of the Hb prediction. Different methods to estimate the blood volume were assessed and compared. The initial Hb on CPB correlated well with the predicted Hb (R2 = 0.87, p < 0.001). A Bland-Altman plot revealed little bias at 0.07 g/dL and limits of agreement from -1.35 to 1.48 g/dL. More sophisticated methods of estimating blood volume from lean body mass did not improve the Hb prediction, but rather increased bias. Hb prediction is reasonably accurate, with the best result obtained with the simplest method of estimating the blood volume at 80 mL/kg body weight. When deciding for or against blood-free priming, caution is necessary when the predicted Hb lies in a range of ± 2 g/dL around the transfusion trigger.
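The dilutional prediction described (pre-CPB Hb, prime volume, blood volume estimated at 80 mL/kg) can be sketched as below; the exact formula is an assumption from the stated principle, not quoted from the paper:

```python
def predicted_hb_on_cpb(hb_pre, weight_kg, prime_ml, ebv_ml_per_kg=80.0):
    """Dilutional Hb prediction at onset of CPB with blood-free priming:
    the patient's haemoglobin mass is diluted into the estimated blood
    volume (EBV) plus the prime volume."""
    ebv = ebv_ml_per_kg * weight_kg
    return hb_pre * ebv / (ebv + prime_ml)
```

For example, a 10 kg infant with pre-CPB Hb 12 g/dL and a 400 mL blood-free prime would be predicted to drop to 8 g/dL on bypass, which is where the ± 2 g/dL caution zone around the transfusion trigger becomes relevant.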
Zou, Lingyun; Wang, Zhengzhi; Huang, Jiaomin
2007-12-01
Subcellular location is one of the key biological characteristics of proteins. Position-specific profiles (PSP) have been introduced as important characteristics of proteins in this article. In this study, to obtain position-specific profiles, the Position Specific Iterative-Basic Local Alignment Search Tool (PSI-BLAST) has been used to search for protein sequences in a database. Position-specific scoring matrices are extracted from the profiles as one class of characteristics. Four-part amino acid compositions and 1st-7th order dipeptide compositions have also been calculated as the other two classes of characteristics. Therefore, twelve characteristic vectors are extracted from each of the protein sequences. Next, the characteristic vectors are weighted by a simple weighting function and input into a BP neural network predictor named PSP-Weighted Neural Network (PSP-WNN). The Levenberg-Marquardt algorithm is employed to adjust the weight matrices and thresholds during the network training instead of the error back propagation algorithm. In a jackknife test on the RH2427 dataset, PSP-WNN achieved a higher overall prediction accuracy (88.4%) than the general BP neural network, the Markov model, and the fuzzy k-nearest neighbors algorithm on this dataset. In addition, the prediction performance of PSP-WNN has been evaluated with a five-fold cross validation test on the PK7579 dataset and the prediction results have been consistently better than those of the previous method based on several support vector machines, using compositions of both amino acids and amino acid pairs. These results indicate that PSP-WNN is a powerful tool for subcellular localization prediction. At the end of the article, the influence on prediction accuracy of different weighting proportions among the three characteristic vector categories is discussed, and an appropriate proportion that increases prediction accuracy is identified.
Support vector machine incremental learning triggered by wrongly predicted samples
NASA Astrophysics Data System (ADS)
Tang, Ting-long; Guan, Qiu; Wu, Yi-rong
2018-05-01
According to the classic Karush-Kuhn-Tucker (KKT) theorem, at every step of incremental support vector machine (SVM) learning, a newly added sample that violates the KKT conditions will become a new support vector (SV) and will migrate old samples between the SV set and the non-support-vector (NSV) set, and at the same time the learning model should be updated based on the SVs. However, it is not clear in advance which of the old samples will migrate between SVs and NSVs. Additionally, the learning model may be updated unnecessarily, which does little to increase its accuracy but decreases the training speed. Therefore, how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning. In this work, a new algorithm is proposed to select candidate SVs and to use wrongly predicted samples to trigger the incremental processing. Experimental results show that the proposed algorithm can achieve good performance with high efficiency, high speed and good accuracy.
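The trigger rule can be sketched with a kernel perceptron standing in for the SVM; this only illustrates "update on a wrong prediction, skip otherwise", while the paper's candidate-SV selection and KKT bookkeeping are more involved:

```python
def kernel(a, b):
    """Linear kernel stand-in; any positive-definite kernel would do."""
    return sum(x * y for x, y in zip(a, b))

class ErrorTriggeredLearner:
    """Incremental learner that retrains (here: a perceptron update) only
    when the incoming sample is wrongly predicted, so samples the current
    model already classifies correctly never cost an update."""
    def __init__(self):
        self.sv, self.alpha = [], []        # retained samples and weights

    def predict(self, x):
        s = sum(a * kernel(v, x) for v, a in zip(self.sv, self.alpha))
        return 1 if s >= 0 else -1

    def partial_fit(self, x, y):
        if self.sv and self.predict(x) == y:
            return False                    # correctly predicted: skip update
        self.sv.append(x)
        self.alpha.append(y)                # update triggered by the error
        return True
```

The same skeleton applies to a true incremental SVM: replace the perceptron update with refitting on the current SVs plus the violating sample.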
Starry sky sign: A prevalent sonographic finding in mediastinal tuberculous lymph nodes.
Alici, Ibrahim Onur; Demirci, Nilgün Yilmaz; Yilmaz, Aydin; Karakaya, Jale; Erdogan, Yurdanur
2015-01-01
We report a prevalent finding in tuberculous lymphadenitis (TL): Starry sky sign, hyperechoic foci without acoustic shadows over a hypoechoic background. We retrospectively searched the database for a possible relationship of starry sky sign with a specific diagnosis and also the prevalence and accuracy of the finding. Starry sky sign was found in 16 of 31 tuberculous lymph nodes, while none of other lymph nodes (1,015 lymph nodes) exhibited this finding; giving a sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy of 51.6%, 100%, 100%, 98.5%, and 98.5%, respectively. Bacteriologic and histologic findings are gold standard in the diagnosis of tuberculosis, but this finding may guide the bronchoscopist in choosing the more pathologic node within a station and increase the diagnostic yield as it may relate to actively dividing mycobacteria.
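The reported metrics follow from the standard 2x2 contingency table; with the paper's counts (16 of 31 tuberculous nodes showing the sign, none of the 1,015 other nodes), the quoted figures can be reproduced:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV and accuracy from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

Here `diagnostic_metrics(tp=16, fp=0, fn=15, tn=1015)` yields sensitivity 51.6%, specificity and PPV 100%, and NPV and accuracy both about 98.5%, matching the abstract.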
Utsumi, Takanobu; Oka, Ryo; Endo, Takumi; Yano, Masashi; Kamijima, Shuichi; Kamiya, Naoto; Fujimura, Masaaki; Sekita, Nobuyuki; Mikami, Kazuo; Hiruta, Nobuyuki; Suzuki, Hiroyoshi
2015-11-01
The aim of this study is to validate and compare the predictive accuracy of two nomograms predicting the probability of Gleason sum upgrading between biopsy and radical prostatectomy pathology among representative patients with prostate cancer. We previously developed a nomogram, as did Chun et al. In this validation study, patients originated from two centers: Toho University Sakura Medical Center (n = 214) and Chibaken Saiseikai Narashino Hospital (n = 216). We assessed predictive accuracy using area under the curve values and constructed calibration plots to grasp the tendency for each institution. Both nomograms showed a high predictive accuracy in each institution, although the constructed calibration plots of the two nomograms underestimated the actual probability in Toho University Sakura Medical Center. Clinicians need to use calibration plots for each institution to correctly understand the tendency of each nomogram for their patients, even if each nomogram has a good predictive accuracy. © The Author 2015. Published by Oxford University Press. All rights reserved.
Okasha, Hussein; Elkholy, Shaimaa; El-Sayed, Ramy; Wifi, Mohamed-Naguib; El-Nady, Mohamed; El-Nabawi, Walid; El-Dayem, Waleed A; Radwan, Mohamed I; Farag, Ali; El-Sherif, Yahya; Al-Gemeie, Emad; Salman, Ahmed; El-Sherbiny, Mohamed; El-Mazny, Ahmed; Mahdy, Reem E
2017-08-28
To evaluate the accuracy of the elastography score combined with the strain ratio in the diagnosis of solid pancreatic lesions (SPL). A total of 172 patients with SPL identified by endoscopic ultrasound were enrolled in the study to evaluate the efficacy of elastography and strain ratio in differentiating malignant from benign lesions. The semi-quantitative score of elastography was represented by the strain ratio method. Two areas were selected, area (A) representing the region of interest and area (B) representing the normal area. Area (B) was then divided by area (A). Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were calculated by comparing diagnoses made by elastography, strain ratio and final diagnoses. SPL were shown to be benign in 49 patients and malignant in 123 patients. Elastography alone had a sensitivity of 99%, a specificity of 63%, and an accuracy of 88%, a PPV of 87% and an NPV of 96%. The best cut-off level of strain ratio to obtain the maximal area under the curve was 7.8 with a sensitivity of 92%, specificity of 77%, PPV of 91%, NPV of 80% and an accuracy of 88%. Another estimated cut-off strain ratio level of 3.8 had a higher sensitivity of 99% and NPV of 96%, but with less specificity, PPV and accuracy 53%, 84% and 86%, respectively. Combining elastography with strain ratio resulted in a sensitivity of 98%, specificity of 77%, PPV of 91%, NPV of 95% and accuracy of 92% for the diagnosis of SPL. Combining elastography with strain ratio increases the accuracy of differentiating benign from malignant SPL.
Teklehaimanot, Hailay D; Schwartz, Joel; Teklehaimanot, Awash; Lipsitch, Marc
2004-11-19
Timely and accurate information about the onset of malaria epidemics is essential for effective control activities in epidemic-prone regions. Early warning methods that provide earlier alerts (usually by the use of weather variables) may permit control measures to interrupt transmission earlier in the epidemic, perhaps at the expense of some level of accuracy. Expected case numbers were modeled using a Poisson regression with lagged weather factors in a 4th-degree polynomial distributed lag model. For each week, the numbers of malaria cases were predicted using coefficients obtained using all years except that for which the prediction was being made. The effectiveness of alerts generated by the prediction system was compared against that of alerts based on observed cases. The usefulness of the prediction system was evaluated in cold and hot districts. The system predicts the overall pattern of cases well, yet underestimates the height of the largest peaks. Relative to alerts triggered by observed cases, the alerts triggered by the predicted number of cases performed slightly worse, within 5% of the detection system. The prediction-based alerts were able to prevent 10-25% more cases at a given sensitivity in cold districts than in hot ones. The prediction of malaria cases using lagged weather performed well in identifying periods of increased malaria cases. Weather-derived predictions identified epidemics with reasonable accuracy and better timeliness than early detection systems; therefore, the prediction of malarial epidemics using weather is a plausible alternative to early detection systems.
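The distinctive modelling step, a 4th-degree polynomial distributed lag of weather, can be sketched as a design-matrix construction; the resulting compressed matrix would then enter any Poisson GLM (e.g. statsmodels) as covariates. The lag length below is an assumed placeholder, not the paper's value:

```python
import numpy as np

def pdl_design(weather, max_lag=10, degree=4):
    """Polynomial distributed lag design: lagged copies of a weather series
    are compressed so that lag coefficients follow a degree-4 polynomial in
    the lag index, leaving only degree + 1 free parameters for the GLM."""
    lags = np.column_stack([np.roll(weather, k) for k in range(max_lag + 1)])
    lags = lags[max_lag:]                   # drop rows containing wrapped values
    # column d of P is lag^d, so (lags @ P)[t, d] = sum_l weather[t-l] * l^d
    P = np.vander(np.arange(max_lag + 1), degree + 1, increasing=True)
    return lags @ P
```

Fitting case counts with a Poisson likelihood on these columns, and predicting each week from coefficients trained on the other years, mirrors the leave-one-year-out scheme described.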
Lung Ultrasound for Diagnosing Pneumothorax in the Critically Ill Neonate.
Raimondi, Francesco; Rodriguez Fanjul, Javier; Aversa, Salvatore; Chirico, Gaetano; Yousef, Nadya; De Luca, Daniele; Corsini, Iuri; Dani, Carlo; Grappone, Lidia; Orfeo, Luigi; Migliaro, Fiorella; Vallone, Gianfranco; Capasso, Letizia
2016-08-01
To evaluate the accuracy of lung ultrasound for the diagnosis of pneumothorax in the suddenly decompensating patient. In an international, prospective study, sudden decompensation was defined as a prolonged significant desaturation (oxygen saturation <65% for more than 40 seconds) and bradycardia or sudden increase of oxygen requirement by at least 50% in less than 10 minutes with a final fraction of inspired oxygen ≥0.7 to keep stable saturations. All eligible patients had an ultrasound scan before undergoing a chest radiograph, which was the reference standard. Forty-two infants (birth weight = 1531 ± 812 g; gestational age = 31 ± 3.5 weeks) were enrolled in 6 centers; pneumothorax was detected in 26 (62%). Lung ultrasound accuracy in diagnosing pneumothorax was as follows: sensitivity 100%, specificity 100%, positive predictive value 100%, and negative predictive value 100%. Clinical evaluation of pneumothorax showed sensitivity 84%, specificity 56%, positive predictive value 76%, and negative predictive value 69%. After sudden decompensation, a lung ultrasound scan was performed in an average time of 5.3 ± 5.6 minutes vs 19 ± 11.7 minutes required for chest radiography. Emergency drainage was performed after an ultrasound scan but before radiography in 9 cases. Lung ultrasound shows high accuracy in detecting pneumothorax in the critical infant, outperforming clinical evaluation and reducing time to imaging diagnosis and drainage. Copyright © 2016 Elsevier Inc. All rights reserved.
Wu, Cai; Li, Liang
2018-05-15
This paper focuses on quantifying and estimating the predictive accuracy of prognostic models for time-to-event outcomes with competing events. We consider the time-dependent discrimination and calibration metrics, including the receiver operating characteristics curve and the Brier score, in the context of competing risks. To address censoring, we propose a unified nonparametric estimation framework for both discrimination and calibration measures, by weighting the censored subjects with the conditional probability of the event of interest given the observed data. The proposed method can be extended to time-dependent predictive accuracy metrics constructed from a general class of loss functions. We apply the methodology to a data set from the African American Study of Kidney Disease and Hypertension to evaluate the predictive accuracy of a prognostic risk score in predicting end-stage renal disease, accounting for the competing risk of pre-end-stage renal disease death, and evaluate its numerical performance in extensive simulation studies. Copyright © 2018 John Wiley & Sons, Ltd.
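A simplified sketch of the IPCW-weighted time-dependent Brier score at a horizon t, for a single event type; the paper's competing-risks version, which weights censored subjects by the conditional probability of the event given the observed data, generalizes this:

```python
def ipcw_brier(time, event, pred_risk, G, t):
    """Time-dependent Brier score at horizon t with inverse probability of
    censoring weighting: subjects with an event by t are weighted by
    1/G(T_i), subjects still event-free at t by 1/G(t), and subjects
    censored before t without an event contribute nothing."""
    total = 0.0
    for Ti, di, pi in zip(time, event, pred_risk):
        if Ti <= t and di:                  # event observed by the horizon
            total += (1.0 - pi) ** 2 / G(Ti)
        elif Ti > t:                        # known event-free at the horizon
            total += (0.0 - pi) ** 2 / G(t)
    return total / len(time)
```

Here `pred_risk` is the model's predicted probability of the event by time t, and G is the censoring survival curve (identically 1 when there is no censoring).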
Hybrid feature selection algorithm using symmetrical uncertainty and a harmony search algorithm
NASA Astrophysics Data System (ADS)
Salameh Shreem, Salam; Abdullah, Salwani; Nazri, Mohd Zakree Ahmad
2016-04-01
Microarray technology can be used as an efficient diagnostic system to recognise diseases such as tumours or to discriminate between different types of cancers in normal tissues. This technology has received increasing attention from the bioinformatics community because of its potential in designing powerful decision-making tools for cancer diagnosis. However, the presence of thousands or tens of thousands of genes affects the predictive accuracy of this technology from the perspective of classification. Thus, a key issue in microarray data is identifying or selecting the smallest possible set of genes from the input data that can achieve good predictive accuracy for classification. In this work, we propose a two-stage selection algorithm for gene selection problems in microarray data-sets called the symmetrical uncertainty filter and harmony search algorithm wrapper (SU-HSA). Experimental results show that the SU-HSA is better than HSA in isolation for all data-sets in terms of the accuracy and achieves a lower number of genes on 6 out of 10 instances. Furthermore, the comparison with state-of-the-art methods shows that our proposed approach is able to obtain 5 (out of 10) new best results in terms of the number of selected genes and competitive results in terms of the classification accuracy.
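The filter stage's ranking criterion, symmetrical uncertainty, is a standard normalized mutual information and can be computed for discretized gene/class pairs as:

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy (bits) of a discrete sequence."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]; the SU filter
    ranks genes by this value against the class label before the harmony
    search wrapper refines the subset."""
    hx, hy = entropy(x), entropy(y)
    mi = hx + hy - entropy(list(zip(x, y)))   # I(X;Y) = H(X)+H(Y)-H(X,Y)
    return 2.0 * mi / (hx + hy) if hx + hy else 0.0
```

Genes with SU below a threshold are discarded; the wrapper then searches subsets of the survivors against classifier accuracy.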
Bianchi, Lorenzo; Schiavina, Riccardo; Borghesi, Marco; Bianchi, Federico Mineo; Briganti, Alberto; Carini, Marco; Terrone, Carlo; Mottrie, Alex; Gacci, Mauro; Gontero, Paolo; Imbimbo, Ciro; Marchioro, Giansilvio; Milanese, Giulio; Mirone, Vincenzo; Montorsi, Francesco; Morgia, Giuseppe; Novara, Giacomo; Porreca, Angelo; Volpe, Alessandro; Brunocilla, Eugenio
2018-04-06
To assess the predictive accuracy and the clinical value of a recent nomogram predicting cancer-specific mortality-free survival after surgery in pN1 prostate cancer patients through an external validation. We evaluated 518 prostate cancer patients treated with radical prostatectomy and pelvic lymph node dissection with evidence of nodal metastases at final pathology, at 10 tertiary centers. External validation was carried out using regression coefficients of the previously published nomogram. The performance characteristics of the model were assessed by quantifying predictive accuracy, according to the area under the curve in the receiver operating characteristic curve and model calibration. Furthermore, we systematically analyzed the specificity, sensitivity, positive predictive value and negative predictive value for each nomogram-derived probability cut-off. Finally, we implemented decision curve analysis, in order to quantify the nomogram's clinical value in routine practice. External validation showed inferior predictive accuracy compared with the internal validation (65.8% vs 83.3%). The discrimination (area under the curve) of the multivariable model was 66.7% (95% CI 60.1-73.0%) by testing with receiver operating characteristic curve analysis. The calibration plot showed an overestimation throughout the range of predicted cancer-specific mortality-free survival probabilities. However, in decision curve analysis, the nomogram's use showed a net benefit when compared with the scenarios of treating all patients or none. In an external setting, the nomogram showed inferior predictive accuracy and suboptimal calibration characteristics compared with those reported in the original population. However, decision curve analysis showed a clinical net benefit, suggesting a clinical implication to correctly manage pN1 prostate cancer patients after surgery. © 2018 The Japanese Urological Association.
Assessing and Ensuring GOES-R Magnetometer Accuracy
NASA Technical Reports Server (NTRS)
Kronenwetter, Jeffrey; Carter, Delano R.; Todirita, Monica; Chu, Donald
2016-01-01
The GOES-R magnetometer accuracy requirement is 1.7 nanoteslas (nT). During quiet times (100 nT), accuracy is defined as absolute mean plus 3 sigma. During storms (300 nT), accuracy is defined as absolute mean plus 2 sigma. To achieve this, the sensor itself has better than 1 nT accuracy. Because zero offset and scale factor drift over time, it is also necessary to perform annual calibration maneuvers. To predict performance, we used covariance analysis and attempted to corroborate it with simulations. Although not perfect, the two generally agree and show the expected behaviors. With the annual calibration regimen, these predictions suggest that the magnetometers will meet their accuracy requirements.
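The accuracy definition quoted above (absolute mean error plus k standard deviations, with k = 3 for quiet times and k = 2 for storms) can be computed directly. The residuals below are hypothetical:

```python
from statistics import mean, stdev

def magnetometer_accuracy(errors, k):
    """Accuracy metric: absolute mean error plus k standard deviations.
    Per the text, k = 3 for quiet conditions and k = 2 for storms."""
    return abs(mean(errors)) + k * stdev(errors)

# Hypothetical residual field errors in nT:
errors = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1]
quiet = magnetometer_accuracy(errors, 3)  # |mean| + 3 sigma
storm = magnetometer_accuracy(errors, 2)  # |mean| + 2 sigma
```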
Erbe, M; Hayes, B J; Matukumalli, L K; Goswami, S; Bowman, P J; Reich, C M; Mason, B A; Goddard, M E
2012-07-01
Achieving accurate genomic estimated breeding values for dairy cattle requires a very large reference population of genotyped and phenotyped individuals. Assembling such reference populations has been achieved for breeds such as Holstein, but is challenging for breeds with fewer individuals. An alternative is to use a multi-breed reference population, such that smaller breeds gain some advantage in accuracy of genomic estimated breeding values (GEBV) from information from larger breeds. However, this requires that marker-quantitative trait loci associations persist across breeds. Here, we assessed the gain in accuracy of GEBV in Jersey cattle as a result of using a combined Holstein and Jersey reference population, with either 39,745 or 624,213 single nucleotide polymorphism (SNP) markers. The surrogate used for accuracy was the correlation of GEBV with daughter trait deviations in a validation population. Two methods were used to predict breeding values, either a genomic BLUP (GBLUP_mod), or a new method, BayesR, which used a mixture of normal distributions as the prior for SNP effects, including one distribution that set SNP effects to zero. The GBLUP_mod method scaled both the genomic relationship matrix and the additive relationship matrix to a base at the time the breeds diverged, and regressed the genomic relationship matrix to account for sampling errors in estimating relationship coefficients due to a finite number of markers, before combining the 2 matrices. Although these modifications did result in less biased breeding values for Jerseys compared with an unmodified genomic relationship matrix, BayesR gave the highest accuracies of GEBV for the 3 traits investigated (milk yield, fat yield, and protein yield), with an average increase in accuracy compared with GBLUP_mod across the 3 traits of 0.05 for both Jerseys and Holsteins. 
The advantage was limited for either Jerseys or Holsteins in using 624,213 SNP rather than 39,745 SNP (0.01 for Holsteins and 0.03 for Jerseys, averaged across traits). Even this limited and nonsignificant advantage was only observed when BayesR was used. An alternative panel, which extracted the SNP in the transcribed part of the bovine genome from the 624,213 SNP panel (to give 58,532 SNP), performed better, with an increase in accuracy of 0.03 for Jerseys across traits. This panel captures much of the increased genomic content of the 624,213 SNP panel, with the advantage of a greatly reduced number of SNP effects to estimate. Taken together, using this panel, a combined breed reference and using BayesR rather than GBLUP_mod increased the accuracy of GEBV in Jerseys from 0.43 to 0.52, averaged across the 3 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
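The BayesR prior described above (a mixture of normal distributions for SNP effects, including a point mass at zero) can be illustrated by sampling from it. The mixture proportions and variance scalings below are illustrative assumptions, not the values fitted in the study:

```python
import random

def sample_snp_effect(sigma2_g, proportions=(0.95, 0.03, 0.015, 0.005),
                      var_scales=(0.0, 1e-4, 1e-3, 1e-2)):
    """Draw one SNP effect from a four-component mixture-of-normals prior.
    The first component (variance scale 0) sets the effect exactly to zero;
    proportions and scales here are assumptions for illustration."""
    u, cum = random.random(), 0.0
    for p, scale in zip(proportions, var_scales):
        cum += p
        if u < cum:
            break
    var = scale * sigma2_g
    return random.gauss(0.0, var ** 0.5) if var > 0 else 0.0

random.seed(1)
effects = [sample_snp_effect(1.0) for _ in range(10000)]
zero_fraction = sum(1 for e in effects if e == 0.0) / len(effects)
```

Setting most effects exactly to zero is what distinguishes this prior from GBLUP, which implicitly assumes every SNP has a small nonzero effect.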
Hassanpour, Saeed; Langlotz, Curtis P
2016-01-01
Imaging utilization has significantly increased over the last two decades, and is only recently showing signs of moderating. To help healthcare providers identify patients at risk for high imaging utilization, we developed a prediction model to recognize high imaging utilizers based on their initial imaging reports. The prediction model uses a machine learning text classification framework. In this study, we used radiology reports from 18,384 patients with at least one abdomen computed tomography study in their imaging record at Stanford Health Care as the training set. We modeled the radiology reports in a vector space and trained a support vector machine classifier for this prediction task. We evaluated our model on a separate test set of 4791 patients. In addition to high prediction accuracy, in our method, we aimed at achieving high specificity to identify patients at high risk for high imaging utilization. Our results (accuracy: 94.0%, sensitivity: 74.4%, specificity: 97.9%, positive predictive value: 87.3%, negative predictive value: 95.1%) show that a prediction model can enable healthcare providers to identify in advance patients who are likely to be high utilizers of imaging services. Machine learning classifiers developed from narrative radiology reports are feasible methods to predict imaging utilization. Such systems can be used to identify high utilizers, inform future image ordering behavior, and encourage judicious use of imaging. Copyright © 2016 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
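The five performance measures reported above all derive from a single confusion matrix. A minimal sketch of the definitions, with hypothetical counts rather than the study's actual tallies:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),      # recall of true high utilizers
        "specificity": tn / (tn + fp),      # recall of true low utilizers
        "ppv": tp / (tp + fp),              # positive predictive value
        "npv": tn / (tn + fn),              # negative predictive value
    }

# Hypothetical counts for a test set of high/low imaging utilizers:
m = diagnostic_metrics(tp=744, fp=108, tn=3700, fn=256)
```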
Li, Yaohang; Liu, Hui; Rata, Ionel; Jakobsson, Eric
2013-02-25
The rapidly increasing number of protein crystal structures available in the Protein Data Bank (PDB) has naturally made statistical analyses feasible in studying complex high-order inter-residue correlations. In this paper, we report a context-based secondary structure potential (CSSP) for assessing the quality of predicted protein secondary structures generated by various prediction servers. CSSP is a sequence-position-specific knowledge-based potential generated based on the potentials of mean force approach, where high-order inter-residue interactions are taken into consideration. The CSSP potential is effective in identifying secondary structure predictions with good quality. In 56% of the targets in the CB513 benchmark, the optimal CSSP potential is able to recognize the native secondary structure or a prediction with Q3 accuracy higher than 90% as best scored in the predicted secondary structures generated by 10 widely used secondary structure prediction servers. In more than 80% of the CB513 targets, the predicted secondary structures with the lowest CSSP potential values yield higher than 80% Q3 accuracy. Similar performance of CSSP is found on the CASP9 targets as well. Moreover, our computational results also show that the CSSP potential using triplets outperforms the CSSP potential using doublets and is currently better than the CSSP potential using quartets.
Goo, Yeung-Ja James; Chi, Der-Jang; Shen, Zong-De
2016-01-01
The purpose of this study is to establish rigorous and reliable going concern doubt (GCD) prediction models. This study first uses the least absolute shrinkage and selection operator (LASSO) to select variables and then applies data mining techniques to establish prediction models, such as neural network (NN), classification and regression tree (CART), and support vector machine (SVM). The samples of this study include 48 GCD listed companies and 124 NGCD (non-GCD) listed companies from 2002 to 2013 in the TEJ database. We conduct fivefold cross-validation to assess prediction accuracy. According to the empirical results, the prediction accuracy of the LASSO-NN model is 88.96 % (Type I error rate is 12.22 %; Type II error rate is 7.50 %), the prediction accuracy of the LASSO-CART model is 88.75 % (Type I error rate is 13.61 %; Type II error rate is 14.17 %), and the prediction accuracy of the LASSO-SVM model is 89.79 % (Type I error rate is 10.00 %; Type II error rate is 15.83 %).
Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences
2018-01-01
Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal. PMID:29682424
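The pairwise identity that drives this benchmark is simply the fraction of matching positions between aligned sequences. A minimal sketch (assuming pre-aligned, equal-length sequences; a real pipeline would align first):

```python
def pairwise_identity(a, b):
    """Fraction of matching positions between two pre-aligned sequences
    of equal length. Gap handling is omitted in this sketch."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

# A query at ~97% identity to its closest reference sits in the range
# where genus-level prediction accuracy was found to drop sharply.
identity = pairwise_identity("ACGTACGTAC", "ACGTACGTAA")  # 9 of 10 positions match
```

Cross-validation by identity, as described above, stratifies test queries by this quantity so that accuracy is reported as a function of distance to the reference database rather than as a single average.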
Exploring Mouse Protein Function via Multiple Approaches.
Huang, Guohua; Chu, Chen; Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning; Cai, Yu-Dong
2016-01-01
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification.
Therefore, the accuracy of the presented method may be much higher in reality. PMID:27846315
Experimental and casework validation of ambient temperature corrections in forensic entomology.
Johnson, Aidan P; Wallman, James F; Archer, Melanie S
2012-01-01
This paper expands on Archer (J Forensic Sci 49, 2004, 553), examining additional factors affecting ambient temperature correction of weather station data in forensic entomology. Sixteen hypothetical body discovery sites (BDSs) in Victoria and New South Wales (Australia), both in autumn and in summer, were compared to test whether the accuracy of correlation was affected by (i) length of correlation period; (ii) distance between BDS and weather station; and (iii) periodicity of ambient temperature measurements. The accuracy of correlations in data sets from real Victorian and NSW forensic entomology cases was also examined. Correlations increased weather data accuracy in all experiments, but significant differences in accuracy were found only between periodicity treatments. We found that a >5°C difference between average values of body in situ and correlation period weather station data was predictive of correlations that decreased the accuracy of ambient temperatures estimated using correlation. Practitioners should inspect their weather data sets for such differences. © 2011 American Academy of Forensic Sciences.
Saatchi, Mahdi; McClure, Mathew C; McKay, Stephanie D; Rolf, Megan M; Kim, JaeWoo; Decker, Jared E; Taxis, Tasia M; Chapple, Richard H; Ramey, Holly R; Northcutt, Sally L; Bauck, Stewart; Woodward, Brent; Dekkers, Jack C M; Fernando, Rohan L; Schnabel, Robert D; Garrick, Dorian J; Taylor, Jeremy F
2011-11-28
Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied.
These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age, but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce the most accurate direct genomic values for the industry. PMID:22122853
The urine dipstick test useful to rule out infections. A meta-analysis of the accuracy
Devillé, Walter LJM; Yzermans, Joris C; van Duijn, Nico P; Bezemer, P Dick; van der Windt, Daniëlle AWM; Bouter, Lex M
2004-01-01
Background Many studies have evaluated the accuracy of dipstick tests as rapid detectors of bacteriuria and urinary tract infections (UTI). The lack of an adequate explanation for the heterogeneity of the dipstick accuracy stimulates an ongoing debate. The objective of the present meta-analysis was to summarise the available evidence on the diagnostic accuracy of the urine dipstick test, taking into account various pre-defined potential sources of heterogeneity. Methods Literature from 1990 through 1999 was searched in Medline and Embase, and by reference tracking. Selected publications should be concerned with the diagnosis of bacteriuria or urinary tract infections, investigate the use of dipstick tests for nitrites and/or leukocyte esterase, and present empirical data. A checklist was used to assess methodological quality. Results 70 publications were included. Accuracy of nitrites was high in pregnant women (Diagnostic Odds Ratio = 165) and elderly people (DOR = 108). Positive predictive values were ≥80% in elderly and in family medicine. Accuracy of leukocyte-esterase was high in studies in urology patients (DOR = 276). Sensitivities were highest in family medicine (86%). Negative predictive values were high for both tests in all patient groups and settings, except in family medicine. The combination of both test results showed an important increase in sensitivity. Accuracy was high in studies in urology patients (DOR = 52), in children (DOR = 46), and if clinical information was present (DOR = 28). Sensitivity was highest in studies carried out in family medicine (90%). Predictive values of combinations of positive test results were low in all other situations. Conclusions Overall, this review demonstrates that the urine dipstick test alone seems to be useful in all populations to exclude the presence of infection if the results of both nitrites and leukocyte-esterase are negative.
Sensitivities of the combination of both tests vary between 68 and 88% in different patient groups, but positive test results have to be confirmed. Although the combination of positive test results is very sensitive in family practice, the usefulness of the dipstick test alone to rule in infection remains doubtful, even with high pre-test probabilities. PMID:15175113
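The diagnostic odds ratio (DOR) used throughout this meta-analysis summarises a test's discriminative power in a single number. A minimal sketch of the two equivalent formulations:

```python
def dor_from_counts(tp, fp, tn, fn):
    """Diagnostic odds ratio from 2x2 table counts: (TP*TN) / (FP*FN)."""
    return (tp * tn) / (fp * fn)

def dor_from_rates(sensitivity, specificity):
    """Equivalent DOR from sensitivity and specificity:
    (sens / (1 - sens)) / ((1 - spec) / spec)."""
    return (sensitivity / (1 - sensitivity)) / ((1 - specificity) / specificity)

# Example: a test with 90% sensitivity and 95% specificity
dor = dor_from_rates(0.9, 0.95)  # odds of a positive result are ~171x
                                 # higher in the diseased group
```

A DOR of 1 means the test carries no information; the values of 100+ reported above for nitrites in pregnant and elderly patients indicate strong discrimination in those groups.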
Revealing how network structure affects accuracy of link prediction
NASA Astrophysics Data System (ADS)
Yang, Jin-Xuan; Zhang, Xiao-Dong
2017-08-01
Link prediction plays an important role in network reconstruction and network evolution. How the network structure affects the accuracy of link prediction is an interesting open problem. In this paper we use common neighbors and the Gini coefficient to reveal the relation between them, which can provide a good reference for choosing a suitable link prediction algorithm according to the network structure. Moreover, the statistical analysis reveals correlations between the common neighbors index, the Gini coefficient index and other indices describing the network structure, such as Laplacian eigenvalues, clustering coefficient, degree heterogeneity, and assortativity. Furthermore, a new method to predict missing links is proposed. The experimental results show that the proposed algorithm yields better prediction accuracy and robustness to the network structure than existing methods for a variety of real-world networks.
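The two quantities central to this analysis, the common-neighbors index and the Gini coefficient of the degree distribution, are simple to compute. A sketch on a hypothetical toy graph:

```python
def common_neighbors_score(adj, u, v):
    """Number of shared neighbours of u and v (a basic link-prediction index)."""
    return len(adj[u] & adj[v])

def gini(values):
    """Gini coefficient of a sequence of non-negative values (e.g. node degrees).
    0 means perfectly uniform; values near 1 mean highly heterogeneous."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n

# Toy undirected graph as adjacency sets (hypothetical):
adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
}
score = common_neighbors_score(adj, 1, 3)  # nodes 1 and 3 share neighbours {0, 2}
degree_gini = gini([len(nbrs) for nbrs in adj.values()])
```

Candidate node pairs with higher common-neighbors scores are predicted as more likely missing links; the degree Gini coefficient characterises how heterogeneous the network is, which the paper relates to how well such an index performs.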
Analysis of energy-based algorithms for RNA secondary structure prediction
Hajiaghayi, Monir; Condon, Anne; Hoos, Holger H
2012-02-01
Background RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. Results We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms.
Second, on our large datasets, the algorithm with best overall accuracy is a pseudo MEA-based algorithm of Hamada et al. that uses a generalized centroid estimator of base pairs. However, between MFE and other MEA-based methods, there is no clear winner in the sense that the relative accuracy of the MFE versus MEA-based algorithms changes depending on the underlying energy parameters. Third, of the four parameter sets we considered, the best accuracy for the MFE-, MEA-based, and pseudo-MEA-based methods is 0.686, 0.680, and 0.711, respectively (on a scale from 0 to 1 with 1 meaning perfect structure predictions) and is obtained with a thermodynamic parameter set obtained by Andronescu et al. called BL* (named after the Boltzmann likelihood method by which the parameters were derived). Conclusions Large datasets should be used to obtain reliable measures of the accuracy of RNA structure prediction algorithms, and average accuracies on specific classes (such as Group I introns and Transfer RNAs) should be interpreted with caution, considering the relatively small size of currently available datasets for such classes. The accuracy of the MEA-based methods is significantly higher when using the BL* parameter set of Andronescu et al. than when using the parameters of Mathews and Turner, and there is no significant difference between the accuracy of MEA-based methods and MFE when using the BL* parameters. The pseudo-MEA-based method of Hamada et al. with the BL* parameter set significantly outperforms all other MFE and MEA-based algorithms on our large data sets. PMID:22296803
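The F-measure used to benchmark these structure predictions is simply the harmonic mean of sensitivity and positive predictive value. A minimal sketch, computed both from rates and from base-pair counts:

```python
def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity and positive predictive value."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)

def f_measure_from_pairs(true_pairs, predicted_pairs):
    """F-measure from sets of predicted vs reference base pairs."""
    tp = len(true_pairs & predicted_pairs)
    sens = tp / len(true_pairs)
    ppv = tp / len(predicted_pairs)
    return f_measure(sens, ppv)

# Hypothetical base pairs, each as an (i, j) position tuple:
reference = {(1, 10), (2, 9), (3, 8), (4, 7)}
predicted = {(1, 10), (2, 9), (3, 8), (5, 6)}
f = f_measure_from_pairs(reference, predicted)  # 3 of 4 pairs recovered
```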
On the distance of genetic relationships and the accuracy of genomic prediction in pig breeding.
Meuwissen, Theo H E; Odegard, Jorgen; Andersen-Ranberg, Ina; Grindflek, Eli
2014-08-01
With the advent of genomic selection, alternative relationship matrices are used in animal breeding, which vary in their coverage of distant relationships due to old common ancestors. Relationships based on pedigree (A) and linkage analysis (GLA) cover only recent relationships because of the limited depth of the known pedigree. Relationships based on identity-by-state (G) include relationships up to the age of the SNP (single nucleotide polymorphism) mutations. We hypothesised that the latter relationships were too old, since QTL (quantitative trait locus) mutations for traits under selection were probably more recent than the SNPs on a chip, which are typically selected for high minor allele frequency. In addition, A and GLA relationships are too recent to cover genetic differences accurately. Thus, we devised a relationship matrix that considered intermediate-aged relationships and compared all these relationship matrices for their accuracy of genomic prediction in a pig breeding situation. Haplotypes were constructed and used to build a haplotype-based relationship matrix (GH), which considers more intermediate-aged relationships, since haplotypes recombine more quickly than SNPs mutate. Dense genotypes (38 453 SNPs) on 3250 elite breeding pigs were combined with phenotypes for growth rate (2668 records), lean meat percentage (2618), weight at three weeks of age (7387) and number of teats (5851) to estimate breeding values for all animals in the pedigree (8187 animals) using the aforementioned relationship matrices. Phenotypes on the youngest 424 to 486 animals were masked and predicted in order to assess the accuracy of the alternative genomic predictions. Correlations between the relationships and regressions of older on younger relationships revealed that the age of the relationships increased in the order A, GLA, GH and G. Use of genomic relationship matrices yielded significantly higher prediction accuracies than A. 
GH and G did not differ significantly, but both were significantly more accurate than GLA. Our hypothesis that intermediate-aged relationships yield more accurate genomic predictions than G was supported for two of four traits, although these differences were not statistically significant. Use of estimated genotype probabilities for ungenotyped animals proved to be an efficient method to include the phenotypes of ungenotyped animals.
DOT National Transportation Integrated Search
2015-07-01
Implementing the recommendations of this study is expected to significantly improve the accuracy of camber measurements and predictions and to ultimately help reduce construction delays, improve bridge serviceability, and decrease costs.
Is Gaydar Affected by Attitudes Toward Homosexuality? Confidence, Labeling Bias, and Accuracy.
Brewer, Gayle; Lyons, Minna
2017-01-01
Previous research has largely ignored the relationship between sexual orientation judgement accuracy, confidence, and attitudes toward homosexuality. In an online study, participants (N = 269) judged the sexual orientation of homosexual and heterosexual targets presented via a series of facial photographs. Participants also indicated their confidence in each judgment and completed the Modern Homonegativity Scale (Morrison & Morrison, 2002). We found that (1) homosexual men and heterosexual women were more accurate when judging photographs of women as opposed to photographs of men, and (2) in heterosexual men, negative attitudes toward homosexual men predicted confidence and bias when rating men's photographs. Findings indicate that homosexual men and heterosexual women are similar in terms of accuracy in judging women's sexuality. Further, especially in men, homophobia is associated with cognitive biases in labeling other men but does not have a relationship with increased accuracy.
Genomic selection for crossbred performance accounting for breed-specific effects.
Lopes, Marcos S; Bovenhuis, Henk; Hidalgo, André M; van Arendonk, Johan A M; Knol, Egbert F; Bastiaansen, John W M
2017-06-26
Breed-specific effects are observed when the same allele of a given genetic marker has a different effect depending on its breed origin, which results in different allele substitution effects across breeds. In such a case, single-breed breeding values may not be the most accurate predictors of crossbred performance. Our aim was to estimate the contribution of alleles from each parental breed to the genetic variance of traits that are measured in crossbred offspring, and to compare the prediction accuracies of estimated direct genomic values (DGV) from a traditional genomic selection model (GS) that are trained on purebred or crossbred data, with accuracies of DGV from a model that accounts for breed-specific effects (BS), trained on purebred or crossbred data. The final dataset was composed of 924 Large White, 924 Landrace and 924 two-way cross (F1) genotyped and phenotyped animals. The traits evaluated were litter size (LS) and gestation length (GL) in pigs. The genetic correlation between purebred and crossbred performance was higher than 0.88 for both LS and GL. For both traits, the additive genetic variance was larger for alleles inherited from the Large White breed compared to alleles inherited from the Landrace breed (0.74 and 0.56 for LS, and 0.42 and 0.40 for GL, respectively). The highest prediction accuracies of crossbred performance were obtained when training was done on crossbred data. For LS, prediction accuracies were the same for GS and BS DGV (0.23), while for GL, prediction accuracy for BS DGV was similar to the accuracy of GS DGV (0.53 and 0.52, respectively). In this study, training on crossbred data resulted in higher prediction accuracy than training on purebred data and evidence of breed-specific effects for LS and GL was demonstrated. However, when training was done on crossbred data, both GS and BS models resulted in similar prediction accuracies. 
In future studies, traits with a lower genetic correlation between purebred and crossbred performance should be included to further assess the value of the BS model in genomic predictions.
A new method of power load prediction in electrification railway
NASA Astrophysics Data System (ADS)
Dun, Xiaohong
2018-04-01
Aiming at the characteristics of electrified railways, this paper studies the problem of load prediction in electrified railways. After data preprocessing, similar days are separated on the basis of their statistical characteristics, and the accuracy of different methods is analyzed. The paper provides a new approach to prediction and a new method for judging accuracy in power-system load prediction.
Hengartner, M P; Heekeren, K; Dvorsky, D; Walitza, S; Rössler, W; Theodoridou, A
2017-09-01
The aim of this study was to critically examine the prognostic validity of various clinical high-risk (CHR) criteria alone and in combination with additional clinical characteristics. A total of 188 CHR positive persons from the region of Zurich, Switzerland (mean age 20.5 years; 60.2% male), meeting ultra high-risk (UHR) and/or basic symptoms (BS) criteria, were followed over three years. The test battery included the Structured Interview for Prodromal Syndromes (SIPS), verbal IQ and many other screening tools. Conversion to psychosis was defined according to ICD-10 criteria for schizophrenia (F20) or brief psychotic disorder (F23). Altogether n=24 persons developed manifest psychosis within three years and according to Kaplan-Meier survival analysis, the projected conversion rate was 17.5%. The predictive accuracy of UHR was statistically significant but poor (area under the curve [AUC]=0.65, P<.05), whereas BS did not predict psychosis beyond mere chance (AUC=0.52, P=.730). Sensitivity and specificity were 0.83 and 0.47 for UHR, and 0.96 and 0.09 for BS. UHR plus BS achieved an AUC=0.66, with sensitivity and specificity of 0.75 and 0.56. In comparison, baseline antipsychotic medication yielded a predictive accuracy of AUC=0.62 (sensitivity=0.42; specificity=0.82). A multivariable prediction model comprising continuous measures of positive symptoms and verbal IQ achieved a substantially improved prognostic accuracy (AUC=0.85; sensitivity=0.86; specificity=0.85; positive predictive value=0.54; negative predictive value=0.97). We showed that BS have no predictive accuracy beyond chance, while UHR criteria poorly predict conversion to psychosis. Combining BS with UHR criteria did not improve the predictive accuracy of UHR alone. In contrast, dimensional measures of both positive symptoms and verbal IQ showed excellent prognostic validity. A critical re-thinking of binary at-risk criteria is necessary in order to improve the prognosis of psychotic disorders. 
Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Prior familiarity with components enhances unconscious learning of relations.
Scott, Ryan B; Dienes, Zoltan
2010-03-01
The influence of prior familiarity with components on the implicit learning of relations was examined using artificial grammar learning. Prior to training on grammar strings, participants were familiarized with either the novel symbols used to construct the strings or with irrelevant geometric shapes. Participants familiarized with the relevant symbols showed greater accuracy when judging the correctness of new grammar strings. Familiarity with elemental components did not increase conscious awareness of the basis for discriminations (structural knowledge) but increased accuracy even in its absence. The subjective familiarity of test strings predicted grammaticality judgments. However, prior exposure to relevant symbols did not increase overall test string familiarity or reliance on familiarity when making grammaticality judgments. Familiarity with the symbols increased the learning of relations between them (bigrams and trigrams) thus resulting in greater familiarity for grammatical versus ungrammatical strings. The results have important implications for models of implicit learning.
Abtahi, Shirin; Abtahi, Farhad; Ellegård, Lars; Johannsson, Gudmundur; Bosaeus, Ingvar
2015-01-01
For several decades electrical bioimpedance (EBI) has been used to assess body fluid distribution and body composition. Despite the development of several different approaches for assessing total body water (TBW), it remains uncertain whether bioimpedance spectroscopic (BIS) approaches are more accurate than single-frequency regression equations. The main objective of this study was to answer this question by calculating the expected accuracy of a single measurement for different EBI methods. The results of this study showed that all methods produced similarly high correlation and concordance coefficients, indicating good accuracy as a method. Even the limits of agreement produced from the Bland-Altman analysis indicated that the performance of the single-frequency Sun prediction equations at the population level was close to the performance of both BIS methods; however, when comparing the Mean Absolute Percentage Error value between the single-frequency prediction equations and the BIS methods, a significant difference was obtained, indicating slightly better accuracy for the BIS methods. Despite the higher accuracy of BIS methods over 50 kHz prediction equations at both the population and individual level, the magnitude of the improvement was small. Such a slight improvement in the accuracy of BIS methods is suggested to be insufficient to warrant their clinical use where the most accurate predictions of TBW are required, for example, when assessing fluid-overload status in dialysis. To reach expected errors below 4-5%, novel and individualized approaches must be developed to improve the accuracy of bioimpedance-based methods for the advent of innovative personalized health monitoring applications. PMID:26137489
NASA Astrophysics Data System (ADS)
Lin, Z. D.; Wang, Y. B.; Wang, R. J.; Wang, L. S.; Lu, C. P.; Zhang, Z. Y.; Song, L. T.; Liu, Y.
2017-07-01
A total of 130 topsoil samples collected from Guoyang County, Anhui Province, China, were used to establish a Vis-NIR model for the prediction of organic matter content (OMC) in lime concretion black soils. Different spectral pretreatments were applied for minimizing the irrelevant and useless information of the spectra and increasing the spectra correlation with the measured values. Subsequently, the Kennard-Stone (KS) method and sample set partitioning based on joint x-y distances (SPXY) were used to select the training set. Successive projection algorithm (SPA) and genetic algorithm (GA) were then applied for wavelength optimization. Finally, the principal component regression (PCR) model was constructed, in which the optimal number of principal components was determined using the leave-one-out cross validation technique. The results show that the combination of the Savitzky-Golay (SG) filter for smoothing and multiplicative scatter correction (MSC) can eliminate the effect of noise and baseline drift; the SPXY method is preferable to KS in the sample selection; both the SPA and the GA can significantly reduce the number of wavelength variables and favorably increase the accuracy, especially GA, which greatly improved the prediction accuracy of soil OMC with Rcc, RMSEP, and RPD up to 0.9316, 0.2142, and 2.3195, respectively.
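The two pretreatments named above can be sketched in a few lines. This is a minimal illustration on synthetic spectra (not the paper's data or parameters): Savitzky-Golay smoothing to suppress noise, followed by multiplicative scatter correction (MSC) against the mean spectrum.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(7)
wavelengths = np.linspace(400, 2400, 500)
base = np.exp(-((wavelengths - 1400) / 300) ** 2)      # one broad absorption band
# each synthetic "sample" = multiplicative scatter * base + additive offset + noise
spectra = np.array([b * base + a + rng.normal(0, 0.01, 500)
                    for a, b in rng.uniform(0.5, 1.5, (130, 2))])

# Savitzky-Golay smoothing along the wavelength axis
smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)

# MSC: regress each spectrum on the mean spectrum, then remove offset and scatter
mean_spec = smoothed.mean(axis=0)
msc = np.empty_like(smoothed)
for i, s in enumerate(smoothed):
    b, a = np.polyfit(mean_spec, s, 1)     # s ≈ a + b * mean_spec
    msc[i] = (s - a) / b

# between-sample variability due to scatter/offset is largely removed
print(msc.std(axis=0).max() < smoothed.std(axis=0).max())
```

In a full pipeline the corrected spectra would then feed the KS/SPXY sample selection and PCR model described in the abstract.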
ERIC Educational Resources Information Center
Myers, Jamie S.; Grigsby, Jim; Teel, Cynthia S.; Kramer, Andrew M.
2009-01-01
The goals of this study were to evaluate the accuracy of nurses' predictions of rehabilitation potential in older adults admitted to inpatient rehabilitation facilities and to ascertain whether the addition of a measure of executive cognitive function would enhance predictive accuracy. Secondary analysis was performed on prospective data collected…
Psychopathy, IQ, and Violence in European American and African American County Jail Inmates
ERIC Educational Resources Information Center
Walsh, Zach; Swogger, Marc T.; Kosson, David S.
2004-01-01
The accuracy of the prediction of criminal violence may be improved by combining psychopathy with other variables that have been found to predict violence. Research has suggested that assessing intelligence (i.e., IQ) as well as psychopathy improves the accuracy of violence prediction. In the present study, the authors tested this hypothesis by…
Prediction Accuracy: The Role of Feedback in 6th Graders' Recall Predictions
ERIC Educational Resources Information Center
Al-Harthy, Ibrahim S.
2016-01-01
The current study focused on the role of feedback on students' prediction accuracy (calibration). This phenomenon has been widely studied, but questions remain about how best to improve it. In the current investigation, fifty-seven students from sixth grade were randomly assigned to control and experimental groups. Thirty pictures were chosen from…
Luo, Shanhong; Snider, Anthony G
2009-11-01
There has been a long-standing debate about whether having accurate self-perceptions or holding positive illusions of self is more adaptive. This debate has recently expanded to consider the role of accuracy and bias of partner perceptions in romantic relationships. In the present study, we hypothesized that because accuracy, positivity bias, and similarity bias are likely to serve distinct functions in relationships, they should all make independent contributions to the prediction of marital satisfaction. In a sample of 288 newlywed couples, we tested this hypothesis by simultaneously modeling the actor effects and partner effects of accuracy, positivity bias, and similarity bias in predicting husbands' and wives' satisfaction. Findings across several perceptual domains suggest that all three perceptual indices independently predicted the perceiver's satisfaction. Accuracy and similarity bias, but not positivity bias, made unique contributions to the target's satisfaction. No sex differences were found.
Investigation on the Accuracy of Superposition Predictions of Film Cooling Effectiveness
NASA Astrophysics Data System (ADS)
Meng, Tong; Zhu, Hui-ren; Liu, Cun-liang; Wei, Jian-sheng
2018-05-01
Film cooling effectiveness on flat plates with double rows of holes has been studied experimentally and numerically in this paper. This configuration is widely used to simulate multi-row film cooling on turbine vanes. The film cooling effectiveness of double rows of holes and of each single row was used to study the accuracy of superposition predictions. A stable infrared measurement technique was used to measure the surface temperature on the flat plate. This paper analyzes the factors that affect film cooling effectiveness, including hole shape, hole arrangement, row-to-row spacing, and blowing ratio. Numerical simulations were performed to analyze the flow structure and film cooling mechanisms between the film cooling rows. Results show that the blowing ratio, within the range of 0.5 to 2, has a significant influence on the accuracy of superposition predictions. At low blowing ratios, results obtained by the superposition method agree well with the experimental data, whereas at high blowing ratios the accuracy of superposition predictions decreases. Another significant factor is hole arrangement: results obtained by superposition prediction are nearly the same as the experimental values for staggered arrangements, while for in-line configurations the superposition values of film cooling effectiveness are much higher than the experimental data. Among the hole shapes, the accuracy of superposition predictions for converging-expanding holes is better than for cylindrical holes and compound-angle holes. For the two hole-spacing structures in this paper, predictions show good agreement with the experimental results.
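The abstract does not spell out which superposition rule is meant; the standard choice in the film-cooling literature is Sellers-type superposition, sketched here with invented per-row effectiveness values.

```python
def superpose_effectiveness(row_etas):
    """Sellers-type superposition: combine single-row adiabatic film cooling
    effectiveness values into a multi-row prediction, eta = 1 - prod(1 - eta_i)."""
    remaining = 1.0
    for eta in row_etas:
        remaining *= 1.0 - eta
    return 1.0 - remaining

# Two rows measured separately at the same streamwise station:
print(round(superpose_effectiveness([0.30, 0.20]), 2))  # -> 0.44, i.e. 1 - 0.7 * 0.8
```

The experiments above probe exactly when this multiplicative combination of single-row data matches the measured double-row effectiveness.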
Vathsangam, Harshvardhan; Emken, Adar; Schroeder, E. Todd; Spruijt-Metz, Donna; Sukhatme, Gaurav S.
2011-01-01
This paper describes an experimental study in estimating energy expenditure from treadmill walking using a single hip-mounted triaxial inertial sensor comprising a triaxial accelerometer and a triaxial gyroscope. Typical physical activity characterization using accelerometer-generated counts suffers from two drawbacks: imprecision (due to proprietary counts) and incompleteness (due to incomplete movement description). We address these problems in the context of steady-state walking by directly estimating energy expenditure with data from a hip-mounted inertial sensor. We represent the cyclic nature of walking with a Fourier transform of the sensor streams and show how one can map this representation to energy expenditure (as measured by VO2 consumption, mL/min) using three regression techniques: Least Squares Regression (LSR), Bayesian Linear Regression (BLR), and Gaussian Process Regression (GPR). We perform a comparative analysis of the accuracy of sensor streams in predicting energy expenditure (measured by RMS prediction accuracy). Triaxial information is more accurate than uniaxial information. LSR-based approaches are prone to outlier sensitivity and overfitting. Gyroscopic information showed equivalent, if not better, prediction accuracy compared to accelerometers. Combining accelerometer and gyroscopic information provided better accuracy than using either sensor alone. We also analyze the best algorithmic approach among linear and nonlinear methods, as measured by RMS prediction accuracy and run time. Nonlinear regression methods showed better prediction accuracy but required an order of magnitude more run time. This paper emphasizes the role of probabilistic techniques, in conjunction with joint modeling of triaxial accelerations and rotational rates, in improving energy expenditure prediction for steady-state treadmill walking. PMID:21690001
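The Fourier-features-plus-regression pipeline can be sketched briefly. This is an illustration on synthetic data (everything below is invented, not the authors' code): windowed sensor streams are mapped to Fourier magnitudes, then fitted with Bayesian linear regression, whose posterior mean coincides with ridge regression for a Gaussian prior.

```python
import numpy as np

def fourier_features(window, n_coeffs=5):
    """Low-order Fourier magnitudes of one windowed sensor stream."""
    return np.abs(np.fft.rfft(window))[:n_coeffs]

def blr_fit(X, y, lam=1.0):
    """Posterior mean of Bayesian linear regression with Gaussian prior,
    equivalent to ridge regression with penalty lam."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
windows = rng.normal(size=(40, 64))                   # 40 synthetic sensor windows
X = np.array([fourier_features(w) for w in windows])
true_w = np.array([0.5, 1.0, -0.3, 0.2, 0.1])         # invented ground truth
y = X @ true_w + rng.normal(scale=0.1, size=40)       # synthetic VO2-like targets

w_hat = blr_fit(X, y)
rmse = float(np.sqrt(np.mean((X @ w_hat - y) ** 2)))  # RMS prediction accuracy
print(f"RMSE = {rmse:.3f}")
```

The full study compares this linear probabilistic fit against LSR and the nonlinear GPR under the same RMS criterion.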
High accuracy operon prediction method based on STRING database scores.
Taboada, Blanca; Verde, Cristina; Merino, Enrique
2010-07-01
We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M. et al. (2009) STRING 8-a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412-D416). These two parameters were used to train a neural network on a subset of experimentally characterized Escherichia coli and Bacillus subtilis operons. Our predictive model was successfully tested on the set of experimentally defined operons in E. coli and B. subtilis, with accuracies of 94.6 and 93.3%, respectively. As far as we know, these are the highest accuracies ever obtained for predicting bacterial operons. Furthermore, in order to evaluate the predictive accuracy of our model when using one organism's data set for the training procedure and a different organism's data set for testing, we repeated the E. coli operon prediction analysis using a neural network trained with B. subtilis data, and a B. subtilis analysis using a neural network trained with E. coli data. Even for these cases, the accuracies reached with our method were outstandingly high, 91.5 and 93%, respectively. These results show the potential use of our method for accurately predicting the operons of any other organism. Our operon predictions for fully sequenced genomes are available at http://operons.ibt.unam.mx/OperonPredictor/.
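The two-feature classifier idea can be illustrated compactly. This toy sketch is not the authors' network: it classifies adjacent gene pairs as same-operon vs. not from the two features named above (intergenic distance and a STRING-style association score) using logistic regression, the single-neuron special case of a neural network; all numbers are invented.

```python
import math

# (intergenic_distance_bp, association_score, same_operon) -- invented toy data
pairs = [(20, 0.9, 1), (5, 0.8, 1), (35, 0.7, 1), (15, 0.6, 1),
         (150, 0.1, 0), (300, 0.2, 0), (220, 0.05, 0), (400, 0.3, 0)]

w_d = w_s = b = 0.0
for _ in range(2000):                        # plain stochastic gradient descent
    for dist, score, label in pairs:
        z = w_d * (dist / 100.0) + w_s * score + b
        p = 1.0 / (1.0 + math.exp(-z))       # sigmoid activation
        err = p - label
        w_d -= 0.1 * err * (dist / 100.0)
        w_s -= 0.1 * err * score
        b -= 0.1 * err

def same_operon(dist, score):
    z = w_d * (dist / 100.0) + w_s * score + b
    return 1.0 / (1.0 + math.exp(-z)) > 0.5

# short gap + high score vs. long gap + low score
print(same_operon(10, 0.85), same_operon(350, 0.1))
```

The trained weights come out negative for distance and positive for the association score, mirroring the biological intuition the method encodes.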
Comparison of Three Risk Scores to Predict Outcomes of Severe Lower Gastrointestinal Bleeding.
Camus, Marine; Jensen, Dennis M; Ohning, Gordon V; Kovacs, Thomas O; Jutabha, Rome; Ghassemi, Kevin A; Machicado, Gustavo A; Dulai, Gareth S; Jensen, Mary E; Gornbein, Jeffrey A
2016-01-01
Improved medical decisions by using a score at the initial patient triage level may lead to improvements in patient management, outcomes, and resource utilization. There is no validated score for management of lower gastrointestinal bleeding (LGIB), unlike for upper gastrointestinal bleeding. The aim of our study was to compare the accuracies of 3 different prognostic scores [Center for Ulcer Research and Education Hemostasis prognosis score, Charlson index, and American Society of Anesthesiologists (ASA) score] for the prediction of 30-day rebleeding, surgery, and death in severe LGIB. Data on consecutive patients hospitalized with severe gastrointestinal bleeding from January 2006 to October 2011 in our 2 tertiary academic referral centers were prospectively collected. Sensitivities, specificities, accuracies, and area under the receiver operating characteristic curve were computed for the 3 scores for prediction of rebleeding, surgery, and mortality at 30 days. Two hundred thirty-five consecutive patients with LGIB were included between 2006 and 2011. Twenty-three percent of patients rebled, 6% had surgery, and 7.7% of patients died. The accuracy of each score never reached 70% for predicting rebleeding or surgery in either center. The ASA score had the highest accuracy for predicting mortality within 30 days (83.5%), whereas the Center for Ulcer Research and Education Hemostasis prognosis score and the Charlson index both had accuracies <75% for the prediction of death within 30 days. The ASA score could be useful to predict death within 30 days. However, a new score is still warranted to predict all three 30-day outcomes (rebleeding, surgery, and death) in LGIB.
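A minimal sketch of the metrics such score comparisons rest on: sensitivity, specificity, and overall accuracy computed from a 2x2 table of predicted vs. observed outcomes. The counts below are invented for illustration (not the study's data, though they also sum to 235 patients).

```python
def metrics(tp, fp, fn, tn):
    """Standard 2x2-table summary statistics for a binary predictor."""
    sensitivity = tp / (tp + fn)               # true-positive rate
    specificity = tn / (tn + fp)               # true-negative rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy

sens, spec, acc = metrics(tp=40, fp=30, fn=14, tn=151)
print(f"sens={sens:.2f} spec={spec:.2f} acc={acc:.2f}")
# -> sens=0.74 spec=0.83 acc=0.81
```

Sweeping a score's cut-off and plotting sensitivity against 1 - specificity yields the receiver operating characteristic curve whose area the study reports.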
Massa, Luiz M; Hoffman, Jeanne M; Cardenas, Diana D
2009-01-01
To determine the validity, accuracy, and predictive value of the signs and symptoms of urinary tract infection (UTI) for individuals with spinal cord injury (SCI) using intermittent catheterization (IC) and the accuracy of individuals with SCI on IC at predicting their own UTI. Prospective cohort based on data from the first 3 months of a 1-year randomized controlled trial to evaluate UTI prevention effectiveness of hydrophilic and standard catheters. Fifty-six community-based individuals on IC. Presence of UTI as defined as bacteriuria with a colony count of at least 10(5) colony-forming units/mL and at least 1 sign or symptom of UTI. Analysis of monthly urine culture and urinalysis data combined with analysis of monthly data collected using a questionnaire that asked subjects to self-report on UTI signs and symptoms and whether or not they felt they had a UTI. Overall, "cloudy urine" had the highest accuracy (83.1%), and "leukocytes in the urine" had the highest sensitivity (82.8%). The highest specificity was for "fever" (99.0%); however, it had a very low sensitivity (6.9%). Subjects were able to predict their own UTI with an accuracy of 66.2%, and the negative predictive value (82.8%) was substantially higher than the positive predictive value (32.6%). The UTI signs and symptoms can predict a UTI more accurately than individual subjects can by using subjective impressions of their own signs and symptoms. Subjects were better at predicting when they did not have a UTI than when they did have a UTI.
Attention Modulates Spatial Precision in Multiple-Object Tracking.
Srivastava, Nisheeth; Vul, Ed
2016-01-01
We present a computational model of multiple-object tracking that makes trial-level predictions about the allocation of visual attention and the effect of this allocation on observers' ability to track multiple objects simultaneously. This model follows the intuition that increased attention to a location increases the spatial resolution of its internal representation. Using a combination of empirical and computational experiments, we demonstrate the existence of a tight coupling between cognitive and perceptual resources in this task: Low-level tracking of objects generates bottom-up predictions of error likelihood, and high-level attention allocation selectively reduces error probabilities at attended locations while increasing them at non-attended locations. Whereas earlier models of multiple-object tracking have predicted the big-picture relationship between stimulus complexity and response accuracy, our approach makes accurate predictions of both the macro-scale effect of target number and velocity on tracking difficulty and micro-scale variations in difficulty across individual trials and targets arising from the idiosyncratic within-trial interactions of targets and distractors. Copyright © 2016 Cognitive Science Society, Inc.
Large eddy simulation of fine water sprays: comparative analysis of two models and computer codes
NASA Astrophysics Data System (ADS)
Tsoy, A. S.; Snegirev, A. Yu.
2015-09-01
The FDS model and computer code, although widely used in engineering practice to predict fire development, are not sufficiently validated for fire suppression by fine water sprays. In this work, the effect of the numerical resolution of the large-scale turbulent pulsations on the accuracy of predicted time-averaged spray parameters is evaluated. Comparison of the simulation results obtained with the two versions of the model and code, as well as of the predicted and measured radial distributions of the liquid flow rate, revealed the need to apply monotonic and yet sufficiently accurate discrete approximations of the convective terms. Failure to do so delays jet break-up, otherwise induced by large turbulent eddies, thereby excessively focusing the predicted flow around its axis. The effect of the pressure drop in the spray nozzle is also examined; its increase is shown to cause only a weak increase of the evaporated fraction and vapor concentration despite the significant increase of flow velocity.
Recollection can be Weak and Familiarity can be Strong
Ingram, Katherine M.; Mickes, Laura; Wixted, John T.
2012-01-01
The Remember/Know procedure is widely used to investigate recollection and familiarity in recognition memory, but almost all of the results obtained using that procedure can be readily accommodated by a unidimensional model based on signal-detection theory. The unidimensional model holds that Remember judgments reflect strong memories (associated with high confidence, high accuracy, and fast reaction times), whereas Know judgments reflect weaker memories (associated with lower confidence, lower accuracy, and slower reaction times). Although this is invariably true on average, a new two-dimensional account (the Continuous Dual-Process model) suggests that Remember judgments made with low confidence should be associated with lower old/new accuracy, but higher source accuracy, than Know judgments made with high confidence. We tested this prediction – and found evidence to support it – using a modified Remember/Know procedure in which participants were first asked to indicate a degree of recollection-based or familiarity-based confidence for each word presented on a recognition test and were then asked to recollect the color (red or blue) and screen location (top or bottom) associated with the word at study. For familiarity-based decisions, old/new accuracy increased with old/new confidence, but source accuracy did not (suggesting that stronger old/new memory was supported by higher degrees of familiarity). For recollection-based decisions, both old/new accuracy and source accuracy increased with old/new confidence (suggesting that stronger old/new memory was supported by higher degrees of recollection). These findings suggest that recollection and familiarity are continuous processes and that participants can indicate which process mainly contributed to their recognition decisions. PMID:21967320
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate. In addition, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, this is the first work to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
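The multivariate idea can be sketched in a few lines. This illustration uses invented data (not the paper's deployment): predict a node's humidity from correlated co-located readings (temperature, light) via multiple linear regression, instead of regressing on time alone.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
# synthetic sensor readings with a built-in multivariate relationship
temp = rng.uniform(15, 35, n)
light = rng.uniform(0, 1000, n)
humidity = 80 - 1.2 * temp + 0.01 * light + rng.normal(0, 0.5, n)

# multiple linear regression: humidity ~ intercept + temp + light
X = np.column_stack([np.ones(n), temp, light])
coef, *_ = np.linalg.lstsq(X, humidity, rcond=None)

pred = X @ coef
rmse = float(np.sqrt(np.mean((pred - humidity) ** 2)))
print(f"RMSE = {rmse:.2f}")   # close to the injected 0.5 noise floor
```

In the WSN setting, the sink node evaluates such a model instead of receiving every humidity sample, transmitting only readings that deviate from the prediction.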
CPO Prediction: Accuracy Assessment and Impact on UT1 Intensive Results
NASA Technical Reports Server (NTRS)
Malkin, Zinovy
2010-01-01
The UT1 Intensive results heavily depend on the celestial pole offset (CPO) model used during data processing. Since accurate CPO values are available with a delay of two to four weeks, CPO predictions are necessarily applied to the UT1 Intensive data analysis, and errors in the predictions can influence the operational UT1 accuracy. In this paper we assess the real accuracy of CPO prediction using the actual IERS and PUL predictions made in 2007-2009. Also, results of operational processing were analyzed to investigate the actual impact of EOP prediction errors on the rapid UT1 results. It was found that the impact of CPO prediction errors is at a level of several microseconds, whereas the impact of the inaccuracy in the polar motion prediction may be about one order of magnitude larger for ultra-rapid UT1 results. The situation can be improved if the IERS Rapid solution is updated more frequently.
Prediction of Cerebral Hyperperfusion Syndrome with Velocity Blood Pressure Index.
Lai, Zhi-Chao; Liu, Bao; Chen, Yu; Ni, Leng; Liu, Chang-Wei
2015-06-20
Cerebral hyperperfusion syndrome is an important complication of carotid endarterectomy (CEA). A >100% increase in middle cerebral artery velocity (MCAV) after CEA is used to predict the development of cerebral hyperperfusion syndrome (CHS), but its accuracy is limited. The increase in blood pressure (BP) after surgery is a risk factor for CHS, but no study has used it to predict CHS. The aim of this study was to create a more precise parameter for the prediction of CHS by combining the increases in MCAV and BP after CEA. Systolic MCAV measured by transcranial Doppler and systemic BP were recorded preoperatively and 30 min postoperatively. The new parameter, the velocity BP index (VBI), was calculated from the postoperative increase ratios of MCAV and BP. The prediction powers of the VBI and the increase ratio of MCAV (velocity ratio [VR]) were compared for predicting CHS occurrence. In total, 6/185 cases developed CHS. A best-fit cut-off point of 2.0 for the VBI was identified, which had 83.3% sensitivity, 98.3% specificity, 62.5% positive predictive value and 99.4% negative predictive value for CHS development. This result is significantly better than the VR (33.3%, 97.2%, 28.6% and 97.8%). The areas under the receiver operating characteristic curves were: AUC(VBI) = 0.981, 95% confidence interval [CI] 0.949-0.995; AUC(VR) = 0.935, 95% CI 0.890-0.966, P = 0.02. The new parameter VBI can more accurately identify patients at risk of CHS after CEA. This observation needs to be validated by larger studies.
Conde-Agudelo, A; Papageorghiou, A T; Kennedy, S H; Villar, J
2013-05-01
Several biomarkers for predicting intrauterine growth restriction (IUGR) have been proposed in recent years. However, the predictive performance of these biomarkers has not been systematically evaluated. To determine the predictive accuracy of novel biomarkers for IUGR in women with singleton gestations. Electronic databases, reference list checking and conference proceedings. Observational studies that evaluated the accuracy of novel biomarkers proposed for predicting IUGR. Data were extracted on characteristics, quality and predictive accuracy from each study to construct 2×2 tables. Summary receiver operating characteristic curves, sensitivities, specificities and likelihood ratios (LRs) were generated. A total of 53 studies, including 39,974 women and evaluating 37 novel biomarkers, fulfilled the inclusion criteria. Overall, the predictive accuracy of angiogenic factors for IUGR was minimal (median pooled positive and negative LRs of 1.7, range 1.0-19.8; and 0.8, range 0.0-1.0, respectively). Two small case-control studies reported high predictive values for placental growth factor and angiopoietin-2 only when IUGR was defined as birthweight centile with clinical or pathological evidence of fetal growth restriction. Biomarkers related to endothelial function/oxidative stress, placental protein/hormone, and others such as serum levels of vitamin D, urinary albumin:creatinine ratio, thyroid function tests and metabolomic profile had low predictive accuracy. None of the novel biomarkers evaluated in this review are sufficiently accurate to recommend their use as predictors of IUGR in routine clinical practice. However, the use of biomarkers in combination with biophysical parameters and maternal characteristics could be more useful and merits further research. © 2013 The Authors BJOG An International Journal of Obstetrics and Gynaecology © 2013 RCOG.
Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset
Lipps, David; Devineni, Sree
2016-01-01
MiRNAs are short non-coding RNAs of about 22 nucleotides, which play critical roles in gene expression regulation. The biogenesis of miRNAs is largely determined by the sequence and structural features of their parental RNA molecules. Based on these features, multiple computational tools have been developed to predict whether RNA transcripts contain miRNAs. Although very successful, these predictors have started to face multiple challenges in recent years. Many predictors were optimized using datasets of hundreds of miRNA samples, far smaller than the number of known miRNAs. Consequently, the prediction accuracy of these predictors on large datasets is unknown and needs to be re-tested. In addition, many predictors were optimized for either high sensitivity or high specificity, strategies that can impose serious limitations in applications. Moreover, to meet continuously rising expectations of these computational tools, improving prediction accuracy has become extremely important. In this study, a meta-predictor, mirMeta, was developed by integrating a set of non-linear transformations with a meta-strategy. More specifically, the outputs of five individual predictors were first preprocessed using non-linear transformations, and then fed into an artificial neural network to make the meta-prediction. The prediction accuracy of the meta-predictor was validated using both multi-fold cross-validation and an independent dataset. The final accuracy of the meta-predictor on the newly designed large dataset improved by 7%, to 93%. The meta-predictor also proved to be less dependent on the dataset, and to have a better balance between sensitivity and specificity. This study is important in two ways: First, it shows that the combination of non-linear transformations and artificial neural networks improves the prediction accuracy of individual predictors.
Second, a new miRNA predictor with significantly improved prediction accuracy is developed for the community for identifying novel miRNAs and the complete set of miRNAs. Source code is available at: https://github.com/xueLab/mirMeta PMID:28002428
Predictive modeling of surimi cake shelf life at different storage temperatures
NASA Astrophysics Data System (ADS)
Wang, Yatong; Hou, Yanhua; Wang, Quanfu; Cui, Bingqing; Zhang, Xiangyu; Li, Xuepeng; Li, Yujin; Liu, Yuanping
2017-04-01
An Arrhenius shelf-life prediction model based on the TBARS index was established in this study. The results showed that AV, POV, COV and TBARS changed significantly as temperature increased, and the reaction rate constant k was obtained from a first-order reaction kinetics model. A secondary model was then fitted based on the Arrhenius equation. TBARS gave the best fitting accuracy in both the primary and secondary model fits (R2 ≥ 0.95). The verification test indicated that the relative error between the shelf life predicted by the model and the actual value was within ±10%, suggesting the model can predict the shelf life of surimi cake.
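The two-step fit described above (first-order kinetics for the quality index, then an Arrhenius secondary model) can be sketched as follows; the rate parameters and the TBARS quality limit are invented for illustration, not the paper's fitted values:

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)

def rate_constant(A, Ea, T):
    """Arrhenius equation: k = A * exp(-Ea / (R*T))."""
    return A * math.exp(-Ea / (R * T))

def shelf_life(c0, c_limit, k):
    """First-order quality change C(t) = c0 * exp(k*t); solve for the time
    at which the index reaches the acceptability limit."""
    return math.log(c_limit / c0) / k

# hypothetical kinetics, for illustration only
k_25C = rate_constant(A=2.0e9, Ea=7.0e4, T=298.15)       # per day, assumed
days_at_25C = shelf_life(c0=0.2, c_limit=1.0, k=k_25C)   # assumed TBARS limit
```

Given the fitted A and Ea, the shelf life at any storage temperature follows by inverting the first-order model at the quality limit.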
Predicting missing links in complex networks based on common neighbors and distance
Yang, Jinxuan; Zhang, Xiao-Dong
2016-01-01
Algorithms based on the common-neighbors metric for predicting missing links in complex networks are very popular, but most do not account for missing links between nodes with no common neighbors. In some cases, especially when node pairs have few common neighbors, these methods are not accurate enough to reconstruct networks. In this paper we propose a new algorithm based on common neighbors and distance to improve the accuracy of link prediction. The proposed algorithm is remarkably effective at predicting missing links between nodes with no common neighbors and performs better than most currently used methods on a variety of real-world networks, without increasing complexity. PMID:27905526
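A minimal sketch of the idea: score candidate links by common-neighbor count, adding a distance-based term so that pairs with no common neighbors still receive a nonzero, rankable score. The specific combination and the weight `alpha` are illustrative assumptions, not the paper's exact index:

```python
from collections import deque

def bfs_distance(adj, src, dst):
    """Shortest-path length by breadth-first search; None if unreachable."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, d = queue.popleft()
        if node == dst:
            return d
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return None

def cn_distance_score(adj, u, v, alpha=0.5):
    """Common-neighbor count plus a distance term; pairs with no common
    neighbors still get a nonzero score (illustrative combination)."""
    score = len(adj[u] & adj[v])
    d = bfs_distance(adj, u, v)
    return score if d is None else score + alpha / d

# path graph 0-1-2-3-4 as an adjacency dict of sets
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

Here the pair (0, 4) has no common neighbors, so a plain common-neighbors index would score it zero; the distance term still ranks it above unreachable pairs.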
USDA-ARS?s Scientific Manuscript database
Potato breeding cycles typically last 6-7 years because of the modest seed multiplication rate and large number of traits required of new varieties. Genomic selection has the potential to increase genetic gain per unit of time, through higher accuracy and/or a shorter cycle. Both possibilities were ...
Bangera, Rama; Correa, Katharina; Lhorente, Jean P; Figueroa, René; Yáñez, José M
2017-01-31
Salmon Rickettsial Syndrome (SRS), caused by Piscirickettsia salmonis, is a major disease affecting the Chilean salmon industry. Genomic selection (GS) is a method wherein genome-wide markers and phenotype information of full-sibs are used to predict genomic EBVs (GEBV) of selection candidates; it is expected to give higher accuracy and response to selection than traditional pedigree-based Best Linear Unbiased Prediction (PBLUP). Widely used GS methods such as genomic BLUP (GBLUP), SNPBLUP, Bayes C and Bayesian Lasso may perform differently with respect to accuracy of GEBV prediction. Our aim was to compare the accuracy, in terms of reliability of genome-enabled prediction, of different GS methods with PBLUP for resistance to SRS in an Atlantic salmon breeding program. Number of days to death (DAYS) and binary survival status (STATUS) phenotypes, together with 50 K SNP array genotypes, were obtained from 2601 smolts challenged with P. salmonis. The reliability of the different GS methods at different SNP densities, with and without pedigree, was compared to PBLUP using a five-fold cross-validation scheme. Heritability estimated from GS methods was significantly higher than from PBLUP. Pearson's correlation between predicted GEBV from PBLUP and GS models ranged from 0.79 to 0.91 and from 0.79 to 0.95 for DAYS and STATUS, respectively. The relative increase in reliability from the different GS methods with 50 K SNP ranged from 8 to 25% for DAYS and from 27 to 30% for STATUS. All GS methods outperformed PBLUP at all marker densities. DAYS and STATUS showed superior reliability over PBLUP even at the lowest marker densities of 3 K and 500 SNP, respectively. 20 K SNP gave close to maximal reliability for both traits, with little improvement at higher densities. These results indicate that genomic predictions can accelerate genetic progress for SRS resistance in Atlantic salmon, and implementation of this approach will contribute to the control of SRS in Chile.
We recommend GBLUP for routine GS evaluation because this method is computationally faster and the results are very similar with other GS methods. The use of lower density SNP or the combination of low density SNP and an imputation strategy may help to reduce genotyping costs without compromising gain in reliability.
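For readers unfamiliar with GBLUP, a toy numpy sketch of the linear-kernel prediction it performs: build a VanRaden genomic relationship matrix from centred genotypes, then shrink phenotypes toward it. The simulated data and the variance ratio `lam` are assumptions for illustration; real analyses use REML-estimated variance components:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: n animals x m SNPs coded 0/1/2 (simulated, not the salmon data)
n, m = 100, 500
M = rng.integers(0, 3, size=(n, m)).astype(float)
p = M.mean(axis=0) / 2.0
Z = M - 2.0 * p                                    # centred genotypes
G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))        # VanRaden relationship matrix

# simulate breeding values through the markers, then noisy phenotypes
u_true = Z @ rng.normal(size=m)
u_true /= u_true.std()
y = u_true + rng.normal(size=n)

# GBLUP: u_hat = G (G + lambda I)^-1 (y - mean), lambda = sigma_e^2 / sigma_u^2
lam = 1.0                                          # assumed variance ratio
gebv = G @ np.linalg.solve(G + lam * np.eye(n), y - y.mean())
```

This linear-algebra core is why GBLUP is computationally cheaper than the MCMC-based Bayesian alternatives compared in the study.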
Advances in Homology Protein Structure Modeling
Xiang, Zhexin
2007-01-01
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function. PMID:16787261
Assessing and Ensuring GOES-R Magnetometer Accuracy
NASA Technical Reports Server (NTRS)
Carter, Delano R.; Todirita, Monica; Kronenwetter, Jeffrey; Chu, Donald
2016-01-01
The GOES-R magnetometer subsystem accuracy requirement is 1.7 nanoteslas (nT). During quiet times (100 nT), accuracy is defined as the absolute mean plus 3 sigma. During storms (300 nT), accuracy is defined as the absolute mean plus 2 sigma. Error comes both from outside the magnetometers, e.g. spacecraft fields and misalignments, and from inside, e.g. zero offset and scale factor errors. Because zero offset and scale factor drift over time, it will be necessary to perform annual calibration maneuvers. To predict performance before launch, we have used Monte Carlo simulations and covariance analysis. Both behave as expected, and their accuracy predictions agree within 30%. With the proposed calibration regimen, both suggest that the GOES-R magnetometer subsystem will meet its accuracy requirements.
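The covariance analysis and Monte Carlo simulation mentioned above can be compared on a toy error budget; the individual 1-sigma contributions below are invented placeholders, not GOES-R values:

```python
import math
import random

random.seed(42)

# assumed independent 1-sigma error contributions in nT (illustrative budget)
SOURCES = {"zero_offset": 0.5, "scale_factor": 0.4, "spacecraft_field": 0.3}

def covariance_prediction(sources):
    """Analytic combination: root-sum-square of independent 1-sigma errors."""
    return math.sqrt(sum(s * s for s in sources.values()))

def monte_carlo_prediction(sources, n=200_000):
    """Sample each error source and estimate the combined 1-sigma error."""
    total_sq = 0.0
    for _ in range(n):
        err = sum(random.gauss(0.0, s) for s in sources.values())
        total_sq += err * err
    return math.sqrt(total_sq / n)

sigma_cov = covariance_prediction(SOURCES)   # analytic
sigma_mc = monte_carlo_prediction(SOURCES)   # simulated
```

With a combined sigma in hand, the quiet-time accuracy figure corresponds to the absolute mean error plus 3 sigma, and the storm-time figure to the absolute mean plus 2 sigma, per the definitions above.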
Accuracy of ultrasonography in the detection of severe hepatic lipidosis in cats.
Yeager, A E; Mohammed, H
1992-04-01
The accuracy of ultrasonography in detection of feline hepatic lipidosis was studied retrospectively. The following ultrasonographic criteria were associated positively with severe hepatic lipidosis: the liver hyperechoic, compared with falciform fat; the liver isoechoic or hyperechoic, compared with omental fat; poor visualization of intrahepatic vessel borders; and increased attenuation of sound by the liver. In a group of 36 cats with clinically apparent hepatobiliary disease and in which liver biopsy was done, liver hyperechoic, compared with falciform fat, was the best criterion for diagnosis of severe hepatic lipidosis with 91% sensitivity, 100% specificity, and 100% positive predictive value.
Evaluation of new techniques for the calculation of internal recirculating flows
NASA Technical Reports Server (NTRS)
Van Doormaal, J. P.; Turan, A.; Raithby, G. D.
1987-01-01
The performance of discrete methods for the prediction of fluid flows can be enhanced by improving the convergence rate of solvers and by increasing the accuracy of the discrete representation of the equations of motion. This paper evaluates the gains in solver performance that are available when various acceleration methods are applied. Various discretizations are also examined and two are recommended because of their accuracy and robustness. Insertion of the improved discretization and solver accelerator into a TEACH code, that has been widely applied to combustor flows, illustrates the substantial gains that can be achieved.
A simple method to predict body temperature of small reptiles from environmental temperature.
Vickers, Mathew; Schwarzkopf, Lin
2016-05-01
To study behavioral thermoregulation, it is useful to use thermal sensors and physical models to collect environmental temperatures that are used to predict organism body temperature. Many techniques involve expensive or numerous types of sensors (cast copper models, or temperature, humidity, radiation, and wind speed sensors) to collect the microhabitat data necessary to predict body temperatures. Expense and diversity of requisite sensors can limit sampling resolution and accessibility of these methods. We compare body temperature predictions of small lizards from iButtons, DS18B20 sensors, and simple copper models, in both laboratory and natural conditions. Our aim was to develop an inexpensive yet accurate method for body temperature prediction. Either method was applicable given appropriate parameterization of the heat transfer equation used. The simplest and cheapest method was DS18B20 sensors attached to a small recording computer. There was little if any deficit in precision or accuracy compared to other published methods. We show how the heat transfer equation can be parameterized, and it can also be used to predict body temperature from historically collected data, allowing strong comparisons between current and previous environmental temperatures using the most modern techniques. Our simple method uses very cheap sensors and loggers to extensively sample habitat temperature, improving our understanding of microhabitat structure and thermal variability with respect to small ectotherms. While our method was quite precise, we feel any potential loss in accuracy is offset by the increase in sample resolution, important as it is increasingly apparent that, particularly for small ectotherms, habitat thermal heterogeneity is the strongest influence on transient body temperature.
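A minimal sketch of the kind of heat transfer equation the authors parameterize: Newtonian heating/cooling driven by a series of logged environmental temperatures, stepped with explicit Euler. The rate constant `k` and the temperatures are hypothetical, standing in for the per-sensor parameterization described above:

```python
def body_temperature(env_temps, dt, k, t0):
    """Newtonian heating/cooling, dTb/dt = k * (Te - Tb), integrated with
    explicit Euler over a sequence of logged environmental temperatures."""
    tb, trace = t0, []
    for te in env_temps:
        tb += k * (te - tb) * dt
        trace.append(tb)
    return trace

# hypothetical: a small lizard model warming in a 35 C patch, one reading per minute
trace = body_temperature(env_temps=[35.0] * 60, dt=1.0, k=0.1, t0=20.0)
```

Because only `k` depends on the sensor or model type, the same equation can be re-parameterized and applied to historically collected environmental temperature data, as the abstract notes.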
Hack, Dallas; Huff, J Stephen; Curley, Kenneth; Naunheim, Roseanne; Ghosh Dastidar, Samanwoy; Prichep, Leslie S
2017-07-01
Extremely high accuracy for predicting CT+ traumatic brain injury (TBI) using a quantitative EEG (QEEG) based multivariate classification algorithm was demonstrated in an independent validation trial of Emergency Department (ED) patients, using an easy-to-use handheld device. This study compares the predictive power of that algorithm (which includes LOC and amnesia) to the predictive power of LOC alone or LOC plus traumatic amnesia. ED patients aged 18-85 years presenting within 72 h of closed head injury, with GCS 12-15, were study candidates. 680 patients with known absence or presence of LOC were enrolled (145 CT+ and 535 CT- patients). 5-10 min of eyes-closed EEG was acquired using the Ahead 300 handheld device from frontal and frontotemporal regions. The same classification algorithm methodology was used for both the EEG-based and the LOC-based algorithms. Predictive power was evaluated using the area under the ROC curve (AUC) and odds ratios. The QEEG-based classification algorithm demonstrated significant improvement in predictive power compared with LOC alone, both in improved AUC (83% improvement) and odds ratio (increase from 4.65 to 16.22). Adding RGA and/or PTA to LOC did not improve on LOC alone. Rapid triage of TBI relies on strong initial predictors. An electrophysiology-based marker was shown to outperform report of LOC alone or LOC plus amnesia in determining risk of an intracranial bleed. In addition, its ease of use at point-of-care, non-invasiveness, and rapid results suggest that such technology adds significant value to standard clinical prediction. Copyright © 2017 Elsevier Inc. All rights reserved.
Olexa, Edward M.; Lawrence, Rick L
2014-01-01
Federal land management agencies provide stewardship over much of the rangelands in the arid and semi-arid western United States, but they often lack data of the proper spatiotemporal resolution and extent needed to assess range conditions and monitor trends. Recent advances in the blending of complementary, remotely sensed data could provide public lands managers with the needed information. We applied the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) to five Landsat TM and concurrent Terra MODIS scenes, and used pixel-based regression and difference image analyses to evaluate the quality of synthetic reflectance and NDVI products associated with semi-arid rangeland. Predicted red reflectance data consistently demonstrated higher accuracy, less bias, and stronger correlation with observed data than did analogous near-infrared (NIR) data. The accuracy of both bands tended to decline as the lag between base and prediction dates increased; however, mean absolute errors (MAE) were typically ≤10%. The quality of area-wide NDVI estimates was less consistent than either spectral band, although the MAE of estimates predicted using early season base pairs were ≤10% throughout the growing season. Correlation between known and predicted NDVI values and agreement with the 1:1 regression line tended to decline as the prediction lag increased. Further analyses of NDVI predictions, based on a 22 June base pair and stratified by land cover/land use (LCLU), revealed accurate estimates through the growing season; however, inter-class performance varied. This work demonstrates the successful application of the STARFM algorithm to semi-arid rangeland; however, we encourage evaluation of STARFM's performance on a per-product basis, stratified by LCLU, with attention given to the influence of base pair selection and the impact of the time lag.
[Forest lightning fire forecasting for Daxing'anling Mountains based on MAXENT model].
Sun, Yu; Shi, Ming-Chang; Peng, Huan; Zhu, Pei-Lin; Liu, Si-Lin; Wu, Shi-Lei; He, Cheng; Chen, Feng
2014-04-01
Daxing'anling Mountains is one of the areas with the highest occurrence of forest lightning fires in Heilongjiang Province, and developing a lightning fire forecast model to accurately predict the forest fires in this area is important. Based on data on forest lightning fires and environmental variables, the MAXENT model was used to predict lightning fires in the Daxing'anling region. Firstly, we performed collinearity diagnostics on each environmental variable, evaluated the importance of the environmental variables using training gain and the jackknife method, and then evaluated the prediction accuracy of the MAXENT model using the maximum Kappa value and the AUC value. The results showed that the variance inflation factor (VIF) values of lightning energy and neutralized charge were 5.012 and 6.230, respectively. They were collinear with the other variables, so they could not be used for model training. Daily rainfall, the number of cloud-to-ground lightning strikes, and the current intensity of cloud-to-ground lightning were the three most important factors affecting lightning fires in the forest, while daily average wind speed and slope were of less importance. As the proportion of test data increased, the maximum Kappa and AUC values increased. The maximum Kappa values were above 0.75 with an average of 0.772, while all of the AUC values were above 0.5 with an average of 0.859. With a moderate level of prediction accuracy being achieved, the MAXENT model could be used to predict forest lightning fires in Daxing'anling Mountains.
Liu, Qianying; Lei, Zhixin; Zhu, Feng; Ihsan, Awais; Wang, Xu; Yuan, Zonghui
2017-01-01
Genotoxicity and carcinogenicity testing of pharmaceuticals prior to commercialization is requested by regulatory agencies. The bacterial mutagenicity test has been considered to have the highest accuracy for carcinogenicity prediction. However, some evidence suggests that the bacterial mutagenicity test produces false-positive responses when used to predict carcinogenicity. Along with major changes made to the International Committee on Harmonization guidance on genotoxicity testing [S2 (R1)], older data (especially the cytogenetic data) may not meet current guidelines. This review provides a compendium of retrievable results on the genotoxicity and animal carcinogenicity of 136 antiparasitics. Neither genotoxicity nor carcinogenicity data are available for 84 (61.8%), while 52 (38.2%) have been evaluated in at least one genotoxicity or carcinogenicity study, and only 20 (14.7%) in both genotoxicity and carcinogenicity studies. Among 33 antiparasitics with at least one older in vitro genotoxicity result, 15 (45.5%) are in agreement with the current ICH S2 (R1) guidance for data acceptance. Compared with other genotoxicity assays, DNA lesion assays can significantly increase the accuracy of carcinogenicity prediction. Together, a combination of DNA lesion and bacterial tests is a more accurate way to predict carcinogenicity. PMID:29170735
Hostettler, Isabel Charlotte; Muroi, Carl; Richter, Johannes Konstantin; Schmid, Josef; Neidert, Marian Christoph; Seule, Martin; Boss, Oliver; Pangalu, Athina; Germans, Menno Robbert; Keller, Emanuela
2018-01-19
OBJECTIVE The aim of this study was to create prediction models for outcome parameters by decision tree analysis based on clinical and laboratory data in patients with aneurysmal subarachnoid hemorrhage (aSAH). METHODS The database consisted of clinical and laboratory parameters of 548 patients with aSAH who were admitted to the Neurocritical Care Unit, University Hospital Zurich. To examine the model performance, the cohort was randomly divided into a derivation cohort (60% [n = 329]; training data set) and a validation cohort (40% [n = 219]; test data set). The classification and regression tree prediction algorithm was applied to predict death, functional outcome, and ventriculoperitoneal (VP) shunt dependency. Chi-square automatic interaction detection was applied to predict delayed cerebral infarction on days 1, 3, and 7. RESULTS The overall mortality was 18.4%. The accuracy of the decision tree models was good for survival on day 1 and favorable functional outcome at all time points, with a difference between the training and test data sets of < 5%. Prediction accuracy for survival on day 1 was 75.2%. The most important differentiating factor was the interleukin-6 (IL-6) level on day 1. Favorable functional outcome, defined as Glasgow Outcome Scale scores of 4 and 5, was observed in 68.6% of patients. Favorable functional outcome at all time points had a prediction accuracy of 71.1% in the training data set, with procalcitonin on day 1 being the most important differentiating factor at all time points. A total of 148 patients (27%) developed VP shunt dependency. The most important differentiating factor was hyperglycemia on admission. CONCLUSIONS The multiple variable analysis capability of decision trees enables exploration of dependent variables in the context of multiple changing influences over the course of an illness. 
The decision tree currently generated increases awareness of the early systemic stress response, which is seemingly pertinent for prognostication.
ASME V&V challenge problem: Surrogate-based V&V
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beghini, Lauren L.; Hough, Patricia D.
2015-12-18
The process of verification and validation can be resource intensive. From the computational model perspective, the resource demand typically arises from long simulation run times on multiple cores coupled with the need to characterize and propagate uncertainties. In addition, predictive computations performed for safety and reliability analyses have similar resource requirements. For this reason, there is a tradeoff between the time required to complete the requisite studies and the fidelity or accuracy of the results that can be obtained. At a high level, our approach is cast within a validation hierarchy that provides a framework in which we perform sensitivity analysis, model calibration, model validation, and prediction. The evidence gathered as part of these activities is mapped into the Predictive Capability Maturity Model to assess credibility of the model used for the reliability predictions. With regard to specific technical aspects of our analysis, we employ surrogate-based methods, primarily based on polynomial chaos expansions and Gaussian processes, for model calibration, sensitivity analysis, and uncertainty quantification in order to reduce the number of simulations that must be done. The goal is to tip the tradeoff balance toward improved accuracy without increasing the computational demands.
[Effect of the near infrared spectrum resolution on the nitrogen content model in green tea].
Yang, Dan; Liu, Xin; Liu, Hong-Gang; Zhang, Ying-Bin; Yin, Peng
2013-07-01
The effect of different resolutions (2, 4, 6, 8, 16 cm(-1)) on the near infrared spectrogram and on a nitrogen content model for green tea was studied. Test results showed that instrument resolution influences spectra quality: the higher the resolution, the richer the information, but the greater the noise. At lower resolution, the spectrogram was smoother but seriously distorted, and prediction accuracy decreased. A partial least squares model was built after spectral pretreatment. At a resolution of 4 cm(-1), the RMSEP of the external validation set was 0.0546, obviously lower than the others, and the correlation coefficient was 0.9982; its prediction performance and accuracy were the best. STDEV and RSD were 0.020 and 0.334, respectively. A resolution of 4 cm(-1) was therefore optimal for collecting green tea samples with the near infrared spectrometer. This research can provide a reference for parameter selection when collecting green tea spectra with a near infrared spectrometer, improve the stability and prediction performance of the model, and promote the application of near infrared spectroscopy for tea.
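A compact illustration of the PLS calibration step described above, as a one-component PLS1 fit on synthetic "spectra"; real NIR calibrations typically use several latent variables and spectral pretreatment, and the data here are simulated, not the green tea measurements:

```python
import numpy as np

def pls1_fit(X, y):
    """One-component PLS1: weight vector from X'y, score t = Xc w, and the
    regression of y on t."""
    xm, ym = X.mean(axis=0), y.mean()
    Xc, yc = X - xm, y - ym
    w = Xc.T @ yc
    w /= np.linalg.norm(w)          # unit weight vector
    t = Xc @ w                       # latent-variable scores
    b = (t @ yc) / (t @ t)           # least-squares slope of y on t
    return xm, ym, w, b

def pls1_predict(model, Xnew):
    xm, ym, w, b = model
    return ym + b * ((Xnew - xm) @ w)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 200))       # toy spectra: 40 samples x 200 wavelengths
beta = np.zeros(200)
beta[50:55] = 1.0                    # a few informative wavelengths
y = X @ beta + rng.normal(scale=0.1, size=40)

model = pls1_fit(X, y)
rmsep = float(np.sqrt(np.mean((pls1_predict(model, X) - y) ** 2)))
```

Metrics such as RMSEP and the correlation coefficient quoted in the abstract are computed from exactly this kind of predicted-vs-measured comparison, but on an external validation set rather than the training data.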
Gradient Magnitude Similarity Deviation: A Highly Efficient Perceptual Image Quality Index.
Xue, Wufeng; Zhang, Lei; Mou, Xuanqin; Bovik, Alan C
2014-02-01
It is an important task to faithfully evaluate the perceptual quality of output images in many applications, such as image compression, image restoration, and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high prediction accuracy, but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). Image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradation. This motivates us to explore the use of the global variation of a gradient-based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy. MATLAB source code of GMSD can be downloaded at http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm.
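The GMSD computation is simple enough to sketch directly: a pixel-wise similarity map of gradient magnitudes, pooled by its standard deviation. This version uses `np.gradient` in place of the Prewitt filters of the reference implementation, so the numbers will differ slightly, and the stability constant `c` is an assumed value:

```python
import numpy as np

def gmsd(ref, dist, c=170.0):
    """Gradient Magnitude Similarity Deviation (simplified sketch)."""
    m1 = np.hypot(*np.gradient(ref.astype(float)))    # gradient magnitude, reference
    m2 = np.hypot(*np.gradient(dist.astype(float)))   # gradient magnitude, distorted
    gms = (2.0 * m1 * m2 + c) / (m1**2 + m2**2 + c)   # similarity map, 1 = identical
    return float(gms.std())                           # deviation pooling, 0 = identical

rng = np.random.default_rng(0)
img = rng.uniform(0.0, 255.0, size=(64, 64))
noisy = np.clip(img + rng.normal(scale=25.0, size=img.shape), 0.0, 255.0)
score = gmsd(img, noisy)
```

Pooling by standard deviation rather than the mean is the paper's key design choice: it captures how unevenly the distortion degrades local structures across the image.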
Multivariate prediction of upper limb prosthesis acceptance or rejection.
Biddiss, Elaine A; Chau, Tom T
2008-07-01
To develop a model for prediction of upper limb prosthesis use or rejection. A questionnaire exploring factors in prosthesis acceptance was distributed internationally to individuals with upper limb absence through community-based support groups and rehabilitation hospitals. A total of 191 participants (59 prosthesis rejecters and 132 prosthesis wearers) were included in this study. A logistic regression model, a C5.0 decision tree, and a radial basis function neural network were developed and compared in terms of sensitivity (prediction of prosthesis rejecters), specificity (prediction of prosthesis wearers), and overall cross-validation accuracy. The logistic regression and neural network provided comparable overall accuracies of approximately 84 +/- 3%, specificity of 93%, and sensitivity of 61%. Fitting time-frame emerged as the predominant predictor. Individuals fitted within two years of birth (congenital) or six months of amputation (acquired) were 16 times more likely to continue prosthesis use. To increase rates of prosthesis acceptance, clinical directives should focus on timely, client-centred fitting strategies and the development of improved prostheses and healthcare for individuals with high-level or bilateral limb absence. Multivariate analyses are useful in determining the relative importance of the many factors involved in prosthesis acceptance and rejection.
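To make the reported odds ratio concrete, a logistic-model sketch with a single binary predictor for timely fitting; only the odds ratio of 16 comes from the abstract, while the baseline log-odds is an invented placeholder:

```python
import math

def predict_continued_use(log_odds_base, odds_ratio, timely_fit):
    """Logistic model with one binary predictor: the coefficient for timely
    fitting is the log of its odds ratio."""
    z = log_odds_base + (math.log(odds_ratio) if timely_fit else 0.0)
    return 1.0 / (1.0 + math.exp(-z))

p_late = predict_continued_use(-0.5, 16.0, timely_fit=False)
p_timely = predict_continued_use(-0.5, 16.0, timely_fit=True)
```

The study's actual model included many more factors; this sketch only shows how a single odds ratio translates into predicted probabilities of continued prosthesis use.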
The value of vital sign trends for detecting clinical deterioration on the wards
Churpek, Matthew M; Adhikari, Richa; Edelson, Dana P
2016-01-01
Early detection of clinical deterioration on the wards may improve outcomes, and most early warning scores only utilize a patient's current vital signs. The added value of vital sign trends over time is poorly characterized. We investigated whether adding trends improves accuracy and which methods are optimal for modelling trends. Patients admitted to five hospitals over a five-year period were included in this observational cohort study, with 60% of the data used for model derivation and 40% for validation. Vital signs were utilized to predict the combined outcome of cardiac arrest, intensive care unit transfer, and death. The accuracy of models utilizing both the current value and different trend methods were compared using the area under the receiver operating characteristic curve (AUC). A total of 269,999 patient admissions were included, which resulted in 16,452 outcomes. Overall, trends increased accuracy compared to a model containing only current vital signs (AUC 0.78 vs. 0.74; p<0.001). The methods that resulted in the greatest average increase in accuracy were the vital sign slope (AUC improvement 0.013) and minimum value (AUC improvement 0.012), while the change from the previous value resulted in an average worsening of the AUC (change in AUC -0.002). The AUC increased most for systolic blood pressure when trends were added (AUC improvement 0.05). Vital sign trends increased the accuracy of models designed to detect critical illness on the wards. Our findings have important implications for clinicians at the bedside and for the development of early warning scores. PMID:26898412
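The AUC comparisons above rest on the rank (Mann-Whitney) formulation of the area under the ROC curve, which is easy to compute directly; the labels and model scores below are made up for illustration:

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: the probability that a random
    positive outscores a random negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# made-up risk scores: a model using current vitals only vs one adding trend features
labels = [0, 0, 0, 1, 1]
current_only = [0.2, 0.4, 0.5, 0.45, 0.7]
with_trends = [0.2, 0.3, 0.5, 0.55, 0.7]
```

An AUC gain like the 0.74 to 0.78 reported above means the model ranks deteriorating patients above stable ones more often, regardless of any particular alert threshold.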
Chan, Johanna L; Lin, Li; Feiler, Michael; Wolf, Andrew I; Cardona, Diana M; Gellad, Ziad F
2012-11-07
To evaluate the accuracy of in vivo diagnosis of adenomatous vs non-adenomatous polyps using i-SCAN digital chromoendoscopy compared with high-definition white light. This is a single-center comparative effectiveness pilot study. Polyps (n = 103) from 75 average-risk adult outpatients undergoing screening or surveillance colonoscopy between December 1, 2010 and April 1, 2011 were evaluated by two participating endoscopists in an academic outpatient endoscopy center. Polyps were evaluated both with high-definition white light and with i-SCAN to make an in vivo prediction of adenomatous vs non-adenomatous pathology. We determined diagnostic characteristics of i-SCAN and high-definition white light, including sensitivity, specificity, and accuracy, with regard to identifying adenomatous vs non-adenomatous polyps. Histopathologic diagnosis was the gold standard comparison. One hundred and three small polyps, detected from forty-three patients, were included in the analysis. The average size of the polyps evaluated in the analysis was 3.7 mm (SD 1.3 mm, range 2 mm to 8 mm). Formal histopathology revealed that 54/103 (52.4%) were adenomas, 26/103 (25.2%) were hyperplastic, and 23/103 (22.3%) had other diagnoses, including "lymphoid aggregates," "non-specific colitis," and "no pathologic diagnosis." Overall, the combined accuracy of endoscopists for predicting adenomas was identical between i-SCAN (71.8%, 95%CI: 62.1%-80.3%) and high-definition white light (71.8%, 95%CI: 62.1%-80.3%). However, the accuracy of each endoscopist differed substantially: endoscopist A demonstrated 63.0% overall accuracy (95%CI: 50.9%-74.0%), as compared with endoscopist B demonstrating 93.3% overall accuracy (95%CI: 77.9%-99.2%), irrespective of imaging modality. Neither endoscopist demonstrated a significant learning effect with i-SCAN during the study. Though endoscopist A increased accuracy using i-SCAN from 59% (95%CI: 42.1%-74.4%) in the first half to 67.6% (95%CI: 49.5%-82.6%) in the second half, and endoscopist B decreased accuracy using i-SCAN from 100% (95%CI: 80.5%-100.0%) in the first half to 84.6% (95%CI: 54.6%-98.1%) in the second half, neither of these differences was statistically significant. i-SCAN and high-definition white light had similar efficacy in predicting polyp histology. Endoscopist training likely plays a critical role in diagnostic test characteristics and deserves further study.
The Structure of Scientific Evolution
2013-01-01
Science is the construction and testing of systems that bind symbols to sensations according to rules. Material implication is the primary rule, providing the structure of definition, elaboration, delimitation, prediction, explanation, and control. The goal of science is not to secure truth, which is a binary function of accuracy, but rather to increase the information about data communicated by theory. This process is symmetric and thus entails an increase in the information about theory communicated by data. Important components in this communication are the elevation of data to the status of facts, the descent of models under the guidance of theory, and their close alignment through the evolving retroductive process. The information mutual to theory and data may be measured as the reduction in the entropy, or complexity, of the field of data given the model. It may also be measured as the reduction in the entropy of the field of models given the data. This symmetry explains the important status of parsimony (how thoroughly the data exploit what the model can say) alongside accuracy (how thoroughly the model represents what can be said about the data). Mutual information is increased by increasing model accuracy and parsimony, and by enlarging and refining the data field under purview. PMID:28018043
Cow genotyping strategies for genomic selection in a small dairy cattle population.
Jenko, J; Wiggans, G R; Cooper, T A; Eaglen, S A E; Luff, W G de L; Bichard, M; Pong-Wong, R; Woolliams, J A
2017-01-01
This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with small populations. In these breeds, few sires have progeny records, and genotyping cows can improve the accuracy of genomic EBV. The Guernsey breed is a small dairy cattle breed with approximately 14,000 recorded individuals worldwide. Predictions of phenotypes of milk yield, fat yield, protein yield, and calving interval were made for Guernsey cows from England and Guernsey Island using genomic EBV, with training sets that included 197 de-regressed proofs of genotyped bulls together with cows selected from among 1,440 genotyped cows using different genotyping strategies. Accuracies of predictions were tested using 10-fold cross-validation among the cows. Genomic EBV were predicted using 4 different methods: (1) pedigree BLUP, (2) genomic BLUP using only bulls, (3) univariate genomic BLUP using bulls and cows, and (4) bivariate genomic BLUP. Genotyping cows with phenotypes and using their data for the prediction of single nucleotide polymorphism effects increased the correlation between genomic EBV and phenotypes compared with using only bulls by 0.163±0.022 for milk yield, 0.111±0.021 for fat yield, and 0.113±0.018 for protein yield; a decrease of 0.014±0.010 for calving interval from a low base was the only exception. Genetic correlations between phenotypes from bulls and cows were approximately 0.6 for all yield traits and significantly different from 1. Only a very small change occurred in the correlation between genomic EBV and phenotypes when using the bivariate model. It was always better to genotype all the cows, but when only half of the cows were genotyped, a divergent selection strategy was better than the random or directional selection approach. Divergent selection of 30% of the cows remained superior for the yield traits in 8 of 10 folds. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
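The genotyping strategies compared in the study differ only in which cows are picked from the phenotype distribution. A toy sketch of the three schemes, assuming divergent selection takes equal numbers from both tails (the function names and the even split are illustrative, not the paper's exact protocol):

```python
import random

def divergent(phenotypes, k):
    """Select k individuals from both tails of the phenotype
    distribution: the k//2 lowest and the k - k//2 highest records."""
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    return sorted(order[: k // 2] + order[-(k - k // 2):])

def directional(phenotypes, k):
    """Select the k highest-phenotype individuals."""
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    return sorted(order[-k:])

def random_sample(phenotypes, k, seed=0):
    """Select k individuals uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(phenotypes)), k))
```

Divergent selection maximizes phenotypic variance in the genotyped set, which is why it tends to add more information to SNP-effect estimation than random or directional sampling when only part of the herd can be genotyped.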
Villarreal, Miguel L.; van Riper, Charles; Petrakis, Roy E.
2013-01-01
Riparian vegetation provides important wildlife habitat in the Southwestern United States, but limited distributions and spatial complexity often lead to inaccurate representation in maps used to guide conservation. We test the use of data conflation and aggregation on multiple vegetation/land-cover maps to improve the accuracy of habitat models for the threatened western yellow-billed cuckoo (Coccyzus americanus occidentalis). We used species observations (n = 479) from a state-wide survey to develop habitat models from 1) three vegetation/land-cover maps produced at different geographic scales ranging from state to national, and 2) new aggregate maps defined by the spatial agreement of cover types, which were defined as high (agreement = all data sets), moderate (agreement ≥ 2), and low (no agreement required). Model accuracies, predicted habitat locations, and total area of predicted habitat varied considerably, illustrating the effects of input data quality on habitat predictions and the resulting potential impacts on conservation planning. Habitat models based on aggregated and conflated data were more accurate and had higher model sensitivity than those based on the original vegetation/land-cover maps, but this accuracy came at the cost of a reduced geographic extent of predicted habitat. Using the highest-performing models, we assessed cuckoo habitat preference and distribution in Arizona and found that major watersheds containing high-probability habitat are fragmented by a wide swath of low-probability habitat. A focus on riparian restoration in these areas could provide more breeding habitat for the threatened cuckoo, offset potential future habitat losses in adjacent watersheds, and increase regional connectivity for other threatened vertebrates that also use riparian corridors.
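The aggregation step described here, reclassifying each map cell by how many input land-cover maps agree on a cover type, can be sketched per pixel. The three agreement tiers follow the abstract; the data structures and the plurality rule for the "low" tier are an assumption for illustration:

```python
from collections import Counter

def agreement_class(labels, tier):
    """Given one pixel's cover-type labels from several input maps,
    return the plurality label if it meets the agreement tier,
    otherwise None (cell left unclassified).
    tier: 'high' = all maps must agree, 'moderate' = at least 2 must
    agree, 'low' = any single map suffices."""
    label, count = Counter(labels).most_common(1)[0]
    required = {"high": len(labels), "moderate": 2, "low": 1}[tier]
    return label if count >= required else None
```

Raising the tier trades coverage for confidence: the "high" map classifies fewer cells but with stronger support, mirroring the paper's finding that accuracy gains came at the cost of a smaller mapped extent.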
Herrick, Ariane L; Peytrignet, Sebastien; Lunt, Mark; Pan, Xiaoyan; Hesselstrand, Roger; Mouthon, Luc; Silman, Alan J; Dinsdale, Graham; Brown, Edith; Czirják, László; Distler, Jörg H W; Distler, Oliver; Fligelstone, Kim; Gregory, William J; Ochiel, Rachel; Vonk, Madelon C; Ancuţa, Codrina; Ong, Voon H; Farge, Dominique; Hudson, Marie; Matucci-Cerinic, Marco; Balbir-Gurman, Alexandra; Midtvedt, Øyvind; Jobanputra, Paresh; Jordan, Alison C; Stevens, Wendy; Moinzadeh, Pia; Hall, Frances C; Agard, Christian; Anderson, Marina E; Diot, Elisabeth; Madhok, Rajan; Akil, Mohammed; Buch, Maya H; Chung, Lorinda; Damjanov, Nemanja S; Gunawardena, Harsha; Lanyon, Peter; Ahmad, Yasmeen; Chakravarty, Kuntal; Jacobsen, Søren; MacGregor, Alexander J; McHugh, Neil; Müller-Ladner, Ulf; Riemekasten, Gabriela; Becker, Michael; Roddy, Janet; Carreira, Patricia E; Fauchais, Anne Laure; Hachulla, Eric; Hamilton, Jennifer; İnanç, Murat; McLaren, John S; van Laar, Jacob M; Pathare, Sanjay; Proudman, Susanna M; Rudin, Anna; Sahhar, Joanne; Coppere, Brigitte; Serratrice, Christine; Sheeran, Tom; Veale, Douglas J; Grange, Claire; Trad, Georges-Selim; Denton, Christopher P
2018-01-01
Objectives: Our aim was to use the opportunity provided by the European Scleroderma Observational Study to (1) identify and describe those patients with early diffuse cutaneous systemic sclerosis (dcSSc) with progressive skin thickness, and (2) derive prediction models for progression over 12 months, to inform future randomised controlled trials (RCTs). Methods: The modified Rodnan skin score (mRSS) was recorded every 3 months in 326 patients. ‘Progressors’ were defined as those experiencing a 5-unit and 25% increase in mRSS score over 12 months (±3 months). Logistic models were fitted to predict progression and, using receiver operating characteristic (ROC) curves, were compared on the basis of the area under curve (AUC), accuracy and positive predictive value (PPV). Results: 66 patients (22.5%) progressed, 227 (77.5%) did not (33 could not have their status assessed due to insufficient data). Progressors had shorter disease duration (median 8.1 vs 12.6 months, P=0.001) and lower mRSS (median 19 vs 21 units, P=0.030) than non-progressors. Skin score was highest, and peaked earliest, in the anti-RNA polymerase III (Pol3+) subgroup (n=50). A first predictive model (including mRSS, duration of skin thickening and their interaction) had an accuracy of 60.9%, AUC of 0.666 and PPV of 33.8%. By adding a variable for Pol3 positivity, the model reached an accuracy of 71%, AUC of 0.711 and PPV of 41%. Conclusions: Two prediction models for progressive skin thickening were derived, for use both in clinical practice and for cohort enrichment in RCTs. These models will inform recruitment into the many clinical trials of dcSSc projected for the coming years. Trial registration number NCT02339441. PMID:29306872
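The study's outcome rule and the reported model metrics are simple to restate in code. A sketch of the progressor definition (a 5-unit and 25% increase in mRSS over 12 months) and of accuracy and PPV computed from a confusion table; the thresholds follow the abstract, while the inclusive comparisons and function names are assumptions:

```python
def is_progressor(mrss_baseline, mrss_12m):
    """Progression requires BOTH a >=5-unit and a >=25% increase in
    mRSS over 12 months (inclusive thresholds assumed here)."""
    delta = mrss_12m - mrss_baseline
    return delta >= 5 and delta >= 0.25 * mrss_baseline

def accuracy_and_ppv(tp, fp, tn, fn):
    """Accuracy = correct / all predictions;
    PPV = true positives / all predicted positives."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return acc, ppv
```

Note that for high baseline scores the percentage criterion dominates: a patient starting at 40 units needs a 10-unit rise, so a 6-unit increase counts as progression at baseline 19 but not at baseline 40.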
Roy, Janine; Aust, Daniela; Knösel, Thomas; Rümmele, Petra; Jahnke, Beatrix; Hentrich, Vera; Rückert, Felix; Niedergethmann, Marco; Weichert, Wilko; Bahra, Marcus; Schlitt, Hans J.; Settmacher, Utz; Friess, Helmut; Büchler, Markus; Saeger, Hans-Detlev; Schroeder, Michael; Pilarsky, Christian; Grützmann, Robert
2012-01-01
Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state-of-the-art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state-of-the-art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice. PMID:22615549
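The core idea, propagating a per-gene relevance prior (e.g., correlation of expression with survival) over a gene network PageRank-style, can be sketched with power iteration. The network, the priors, and the damping factor below are illustrative, not the study's actual data or parameters:

```python
def netrank(neighbors, prior, d=0.85, iters=100):
    """PageRank-like gene ranking: each gene's score mixes its own
    normalized prior relevance (weight 1-d) with score flowing in
    from network neighbors (weight d), iterated to a fixed point.
    neighbors: gene -> list of genes it links to; prior: gene -> weight."""
    total = sum(prior.values())
    p = {g: w / total for g, w in prior.items()}  # normalized priors
    rank = dict(p)
    for _ in range(iters):
        new = {}
        for g in rank:
            # score arriving from every gene h that links to g,
            # split evenly over h's outgoing links
            incoming = sum(rank[h] / len(neighbors[h])
                           for h in neighbors if g in neighbors[h])
            new[g] = (1 - d) * p[g] + d * incoming
        rank = new
    return rank
```

A well-connected gene with a modest prior can outrank an isolated gene with a strong prior, which is exactly how network information reshapes a plain correlation ranking.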
Schmaal, Lianne; Marquand, Andre F; Rhebergen, Didi; van Tol, Marie-José; Ruhé, Henricus G; van der Wee, Nic J A; Veltman, Dick J; Penninx, Brenda W J H
2015-08-15
A chronic course of major depressive disorder (MDD) is associated with profound alterations in brain volumes and emotional and cognitive processing. However, no neurobiological markers have been identified that prospectively predict MDD course trajectories. This study evaluated the prognostic value of different neuroimaging modalities, clinical characteristics, and their combination to classify MDD course trajectories. One hundred eighteen MDD patients underwent structural and functional magnetic resonance imaging (MRI) (emotional facial expressions and executive functioning) and were clinically followed up at 2 years. Three MDD trajectories (chronic, n = 23; gradually improving, n = 36; and fast remission, n = 59) were identified based on the Life Chart Interview measuring the presence of symptoms each month. Gaussian process classifiers were employed to evaluate the prognostic value of neuroimaging data and clinical characteristics (including baseline severity, duration, and comorbidity). Chronic patients could be discriminated from patients with more favorable trajectories from neural responses to various emotional faces (up to 73% accuracy) but not from structural MRI and functional MRI related to executive functioning. Chronic patients could also be discriminated from remitted patients based on clinical characteristics (accuracy 69%) but not when age differences between the groups were taken into account. Combining different task contrasts or data sources increased prediction accuracies in some but not all cases. Our findings provide evidence that the prediction of the naturalistic course of depression over 2 years is improved by considering neuroimaging data, especially data derived from neural responses to emotional facial expressions. Neural responses to emotionally salient faces more accurately predicted outcome than clinical data. Copyright © 2015 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Accuracy of unloading with the anti-gravity treadmill.
McNeill, David K P; de Heer, Hendrik D; Bounds, Roger G; Coast, J Richard
2015-03-01
Body weight (BW)-supported treadmill training has become increasingly popular in professional sports and rehabilitation. To date, little is known about the accuracy of the lower-body positive pressure treadmill. This study evaluated the accuracy of the BW support reported on the AlterG "Anti-Gravity" Treadmill across the spectrum of unloading, from full BW (100%) to 20% BW. Thirty-one adults (15 men and 16 women) with a mean age of 29.3 years (SD = 10.9) and a mean weight of 66.55 kg (SD = 12.68) were recruited. Participants were weighed outside the machine and then inside at 100-20% BW in 10% increments. Predicted BW, as presented by the AlterG equipment, was compared with measured BW. Significant differences between predicted and measured BW were found at all settings except 90% through 70% BW. Differences were small (<5%), except at the extreme ends of the unloading spectrum. At the 100% BW setting, the measured weight was lower than predicted (mean = 93.15%, SD = 1.21, p < 0.001 vs. predicted). At the 30 and 20% BW settings, the measured weight was higher than predicted, at 35.75% (SD = 2.89, p < 0.001) and 27.67% (SD = 3.76, p < 0.001), respectively. These findings suggest that there are significant differences between reported and measured BW support on the AlterG Anti-Gravity Treadmill®, with the largest differences (>5%) found at 100% BW and at the greatest BW support (30 and 20% BW). These differences may be associated with changes in metabolic demand and maximum speed during walking or running and should be taken into consideration when using these devices for training and research purposes.
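The comparison in this study reduces to expressing the measured load at each setting as a percentage of full body weight and taking its deviation from the displayed value. A sketch with made-up helper names; the example numbers are chosen only to show the direction of the reported discrepancies:

```python
def measured_percent(full_bw_kg, scale_kg):
    """Measured load expressed as a percentage of full body weight."""
    return 100.0 * scale_kg / full_bw_kg

def unloading_error(predicted_pct, full_bw_kg, scale_kg):
    """Deviation of measured %BW from the treadmill's displayed setting;
    positive means the device unloads LESS than it reports."""
    return measured_percent(full_bw_kg, scale_kg) - predicted_pct
```

For instance, an 80 kg runner at the 100% setting who registers 74.52 kg on the in-chamber scale is carrying 93.15% BW, an error of -6.85 percentage points, matching the direction of the study's finding at full body weight.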
Zheng, Leilei; Chai, Hao; Chen, Wanzhen; Yu, Rongrong; He, Wei; Jiang, Zhengyan; Yu, Shaohua; Li, Huichun; Wang, Wei
2011-12-01
Early parental bonding experiences play a role in emotion recognition and expression in later adulthood, and patients with personality disorder frequently experience inappropriate parental bonding styles; therefore, the aim of the present study was to explore whether parental bonding style is correlated with recognition of facial emotion in personality disorder patients. The Parental Bonding Instrument (PBI) and the Matsumoto and Ekman Japanese and Caucasian Facial Expressions of Emotion (JACFEE) photo set tests were carried out in 289 participants. Patients scored lower on the parental Care subscale but higher on the parental Freedom Control and Autonomy Denial subscales, and they displayed less accuracy when recognizing contempt, disgust, and happiness than the healthy volunteers. In healthy volunteers, maternal Autonomy Denial significantly predicted accuracy when recognizing fear, and maternal Care predicted the accuracy of recognizing sadness. In patients, paternal Care negatively predicted the accuracy of recognizing anger, paternal Freedom Control predicted the perceived intensity of contempt, and maternal Care predicted the accuracy of recognizing sadness and the intensity of disgust. Parental bonding styles have an impact on the decoding process and on sensitivity when recognizing facial emotions, especially in personality disorder patients. © 2011 The Authors. Psychiatry and Clinical Neurosciences © 2011 Japanese Society of Psychiatry and Neurology.