Massa, Luiz M; Hoffman, Jeanne M; Cardenas, Diana D
2009-01-01
To determine the validity, accuracy, and predictive value of the signs and symptoms of urinary tract infection (UTI) for individuals with spinal cord injury (SCI) using intermittent catheterization (IC) and the accuracy of individuals with SCI on IC at predicting their own UTI. Prospective cohort based on data from the first 3 months of a 1-year randomized controlled trial to evaluate UTI prevention effectiveness of hydrophilic and standard catheters. Fifty-six community-based individuals on IC. Presence of UTI, defined as bacteriuria with a colony count of at least 10^5 colony-forming units/mL and at least 1 sign or symptom of UTI. Analysis of monthly urine culture and urinalysis data combined with analysis of monthly data collected using a questionnaire that asked subjects to self-report on UTI signs and symptoms and whether or not they felt they had a UTI. Overall, "cloudy urine" had the highest accuracy (83.1%), and "leukocytes in the urine" had the highest sensitivity (82.8%). The highest specificity was for "fever" (99.0%); however, it had a very low sensitivity (6.9%). Subjects were able to predict their own UTI with an accuracy of 66.2%, and the negative predictive value (82.8%) was substantially higher than the positive predictive value (32.6%). The UTI signs and symptoms can predict a UTI more accurately than individual subjects can by using subjective impressions of their own signs and symptoms. Subjects were better at predicting when they did not have a UTI than when they did have a UTI.
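The measures reported in this abstract (sensitivity, specificity, predictive values, accuracy) all derive from a 2×2 table of a sign or self-prediction against the reference diagnosis. The following is a minimal Python sketch of those definitions; the function name is an assumption, and the counts are hypothetical placeholders chosen for illustration, not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard diagnostic accuracy measures from 2x2 counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives detected among true cases
        "specificity": tn / (tn + fp),   # negatives detected among non-cases
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,   # all correct calls over all calls
    }


# Hypothetical counts (NOT taken from the study).
m = diagnostic_metrics(tp=20, fp=30, fn=10, tn=140)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With a low prevalence, as in these made-up counts, the negative predictive value can be high while the positive predictive value stays low, which is the pattern the abstract describes for self-prediction.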
Analysis of near infrared spectra for age-grading of wild populations of Anopheles gambiae.
Krajacich, Benjamin J; Meyers, Jacob I; Alout, Haoues; Dabiré, Roch K; Dowell, Floyd E; Foy, Brian D
2017-11-07
Understanding the age-structure of mosquito populations, especially malaria vectors such as Anopheles gambiae, is important for assessing the risk of infectious mosquitoes, and how vector control interventions may impact this risk. The use of near-infrared spectroscopy (NIRS) for age-grading has been demonstrated previously on laboratory and semi-field mosquitoes, but to date has not been utilized on wild-caught mosquitoes whose age is externally validated via parity status or parasite infection stage. In this study, we developed regression and classification models using NIRS on datasets of wild An. gambiae (s.l.) reared from larvae collected from the field in Burkina Faso, and two laboratory strains. We compared the accuracy of these models for predicting the ages of wild-caught mosquitoes that had been scored for their parity status as well as for positivity for Plasmodium sporozoites. Regression models utilizing variable selection increased predictive accuracy over the more common full-spectrum partial least squares (PLS) approach for cross-validation of the datasets, validation, and independent test sets. Models produced from datasets that included the greatest range of mosquito samples (i.e. different sampling locations and times) had the highest predictive accuracy on independent testing sets, though overall accuracy on these samples was low. For classification, we found that intramodel accuracy ranged between 73.5-97.0% for grouping of mosquitoes into "early" and "late" age classes, with the highest prediction accuracy found in laboratory colonized mosquitoes. However, this accuracy was decreased on test sets, with the highest classification of an independent set of wild-caught larvae reared to set ages being 69.6%. Variation in NIRS data, likely from dietary, genetic, and other factors limits the accuracy of this technique with wild-caught mosquitoes. 
Alternative algorithms may help improve prediction accuracy, but care should be taken to either maximize variety in models or minimize confounders.
Thandassery, Ragesh B; Al Kaabi, Saad; Soofi, Madiha E; Mohiuddin, Syed A; John, Anil K; Al Mohannadi, Muneera; Al Ejji, Khalid; Yakoob, Rafie; Derbala, Moutaz F; Wani, Hamidullah; Sharma, Manik; Al Dweik, Nazeeh; Butt, Mohammed T; Kamel, Yasser M; Sultan, Khaleel; Pasic, Fuad; Singh, Rajvir
2016-07-01
Many indirect noninvasive scores to predict liver fibrosis are calculated from routine blood investigations. Only limited studies have compared their efficacy head to head. We aimed to compare these scores with liver biopsy fibrosis stages in patients with chronic hepatitis C. From blood investigations of 1602 patients with chronic hepatitis C who underwent a liver biopsy before initiation of antiviral treatment, 19 simple noninvasive scores were calculated. The area under the receiver operating characteristic curves and diagnostic accuracy of each of these scores were calculated (with reference to the Scheuer staging) and compared. The mean age of the patients was 41.8±9.6 years (1365 men). The most common genotype was genotype 4 (65.6%). Significant fibrosis, advanced fibrosis, and cirrhosis were seen in 65.1%, 25.6%, and 6.6% of patients, respectively. All the scores except the aspartate transaminase (AST) to alanine transaminase ratio, Pohl score, mean platelet volume, fibro-alpha, and red cell distribution width to platelet count ratio index showed high predictive accuracy for the stages of fibrosis. King's score (cutoff, 17.5) showed the highest predictive accuracy for significant and advanced fibrosis. King's score, Göteborg university cirrhosis index, APRI (the AST/platelet count ratio index), and Fibrosis-4 (FIB-4) had the highest predictive accuracy for cirrhosis, with the APRI (cutoff, 2) and FIB-4 (cutoff, 3.25) showing the highest diagnostic accuracy. We derived the study score 8.5 - 0.2(albumin, g/dL) + 0.01(AST, IU/L) - 0.02(platelet count, 10^9/L), which at a cutoff of >4.7 had a predictive accuracy of 0.868 (95% confidence interval, 0.833-0.904) for cirrhosis. King's score for significant and advanced fibrosis and the APRI or FIB-4 score for cirrhosis could be the best simple indirect noninvasive scores.
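The scores named in this abstract are simple arithmetic on routine blood values. The Python sketch below computes the study score exactly as printed in the abstract, alongside APRI and FIB-4 in their commonly published forms. Assumptions: the platelet count is taken in units of 10^9/L, the AST upper limit of normal (40 IU/L here) is laboratory-specific, and the patient values and function names are hypothetical illustrations, not data from the study.

```python
import math


def study_score(albumin_g_dl, ast_iu_l, platelets_1e9_l):
    """Study score as printed in the abstract (cirrhosis cutoff > 4.7)."""
    return 8.5 - 0.2 * albumin_g_dl + 0.01 * ast_iu_l - 0.02 * platelets_1e9_l


def apri(ast_iu_l, ast_uln_iu_l, platelets_1e9_l):
    """AST-to-platelet ratio index: (AST / ULN) x 100 / platelets."""
    return (ast_iu_l / ast_uln_iu_l) * 100 / platelets_1e9_l


def fib4(age_years, ast_iu_l, alt_iu_l, platelets_1e9_l):
    """FIB-4: (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_iu_l) / (platelets_1e9_l * math.sqrt(alt_iu_l))


# Hypothetical patient (illustrative values only).
age, albumin, ast, alt, platelets = 50, 3.5, 80.0, 64.0, 100.0
AST_ULN = 40.0  # assumed upper limit of normal; varies by laboratory

print(f"study score: {study_score(albumin, ast, platelets):.2f} (cutoff > 4.7)")
print(f"APRI:        {apri(ast, AST_ULN, platelets):.2f} (cutoff 2)")
print(f"FIB-4:       {fib4(age, ast, alt, platelets):.2f} (cutoff 3.25)")
```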
Accuracy of four commonly used color vision tests in the identification of cone disorders.
Thiadens, Alberta A H J; Hoyng, Carel B; Polling, Jan Roelof; Bernaerts-Biskop, Riet; van den Born, L Ingeborgh; Klaver, Caroline C W
2013-04-01
To determine which color vision test is most appropriate for the identification of cone disorders. In a clinic-based study, four commonly used color vision tests were compared between patients with cone dystrophy (n = 37), controls with normal visual acuity (n = 35), and controls with low vision (n = 39) and legal blindness (n = 11). Main outcome measures were specificity, sensitivity, positive predictive value and discriminative accuracy of the Ishihara test, Hardy-Rand-Rittler (HRR) test, and the Lanthony and Farnsworth Panel D-15 tests. In the comparison between cone dystrophy and all controls, sensitivity, specificity and predictive value were highest for the HRR and Ishihara tests. When patients were compared to controls with normal vision, discriminative accuracy was highest for the HRR test (c-statistic for PD-axes 1, for T-axis 0.851). When compared to controls with poor vision, discriminative accuracy was again highest for the HRR test (c-statistic for PD-axes 0.900, for T-axis 0.766), followed by the Lanthony Panel D-15 test (c-statistic for PD-axes 0.880, for T-axis 0.500) and Ishihara test (c-statistic 0.886). Discriminative accuracies of all tests did not further decrease when patients were compared to controls who were legally blind. The HRR, Lanthony Panel D-15 and Ishihara all have a high discriminative accuracy to identify cone disorders, but the highest scores were for the HRR test. Poor visual acuity slightly decreased the accuracy of all tests. Our advice is to use the HRR test since this test also allows for evaluation of all three color axes and quantification of color defects.
The accuracy of Genomic Selection in Norwegian red cattle assessed by cross-validation.
Luan, Tu; Woolliams, John A; Lien, Sigbjørn; Kent, Matthew; Svendsen, Morten; Meuwissen, Theo H E
2009-11-01
Genomic Selection (GS) is a newly developed tool for the estimation of breeding values for quantitative traits through the use of dense markers covering the whole genome. For a successful application of GS, accuracy of the prediction of genomewide breeding value (GW-EBV) is a key issue to consider. Here we investigated the accuracy and possible bias of GW-EBV prediction, using real bovine SNP genotyping (18,991 SNPs) and phenotypic data of 500 Norwegian Red bulls. The study was performed on milk yield, fat yield, protein yield, first lactation mastitis traits, and calving ease. Three methods, best linear unbiased prediction (G-BLUP), Bayesian statistics (BayesB), and a mixture model approach (MIXTURE), were used to estimate marker effects, and their accuracy and bias were estimated by using cross-validation. The accuracies of the GW-EBV prediction were found to vary widely between 0.12 and 0.62. G-BLUP gave overall the highest accuracy. We observed a strong relationship between the accuracy of the prediction and the heritability of the trait. GW-EBV prediction for production traits with high heritability achieved higher accuracy and also lower bias than health traits with low heritability. To achieve a similar accuracy for the health traits probably more records will be needed.
Gender differences in structured risk assessment: comparing the accuracy of five instruments.
Coid, Jeremy; Yang, Min; Ullrich, Simone; Zhang, Tianqiang; Sizmur, Steve; Roberts, Colin; Farrington, David P; Rogers, Robert D
2009-04-01
Structured risk assessment should guide clinical risk management, but it is uncertain which instrument has the highest predictive accuracy among men and women. In the present study, the authors compared the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991, 2003); the Historical, Clinical, Risk Management-20 (HCR-20; C. D. Webster, K. S. Douglas, D. Eaves, & S. D. Hart, 1997); the Risk Matrix 2000-Violence (RM2000[V]; D. Thornton et al., 2003); the Violence Risk Appraisal Guide (VRAG; V. L. Quinsey, G. T. Harris, M. E. Rice, & C. A. Cormier, 1998); the Offenders Group Reconviction Scale (OGRS; J. B. Copas & P. Marshall, 1998; R. Taylor, 1999); and the total previous convictions among prisoners, prospectively assessed prerelease. The authors compared predischarge measures with subsequent offending and ranked the instruments using multivariate regression. Most instruments demonstrated significant but moderate predictive ability. The OGRS ranked highest for violence among men, and the PCL-R and HCR-20 H subscale ranked highest for violence among women. The OGRS and total previous acquisitive convictions demonstrated greatest accuracy in predicting acquisitive offending among men and women. Actuarial instruments requiring no training to administer performed as well as personality assessment and structured risk assessment and were superior among men for violence.
Genomic-Enabled Prediction in Maize Using Kernel Models with Genotype × Environment Interaction
Bandeira e Sousa, Massaine; Cuevas, Jaime; de Oliveira Couto, Evellyn Giselly; Pérez-Rodríguez, Paulino; Jarquín, Diego; Fritsche-Neto, Roberto; Burgueño, Juan; Crossa, Jose
2017-01-01
Multi-environment trials are routinely conducted in plant breeding to select candidates for the next selection cycle. In this study, we compare the prediction accuracy of four developed genomic-enabled prediction models: (1) single-environment, main genotypic effect model (SM); (2) multi-environment, main genotypic effects model (MM); (3) multi-environment, single variance G×E deviation model (MDs); and (4) multi-environment, environment-specific variance G×E deviation model (MDe). Each of these four models was fitted using two kernel methods: a linear kernel, Genomic Best Linear Unbiased Predictor, GBLUP (GB), and a nonlinear Gaussian kernel (GK). The eight model-method combinations were applied to two extensive Brazilian maize data sets (HEL and USP data sets), having different numbers of maize hybrids evaluated in different environments for grain yield (GY), plant height (PH), and ear height (EH). Results show that the MDe and the MDs models fitted with the Gaussian kernel (MDe-GK and MDs-GK) had the highest prediction accuracy. For GY in the HEL data set, the increase in prediction accuracy of SM-GK over SM-GB ranged from 9 to 32%. For the MM, MDs, and MDe models, the increase in prediction accuracy of GK over GB ranged from 9 to 49%. For GY in the USP data set, the increase in prediction accuracy of SM-GK over SM-GB ranged from 0 to 7%. For the MM, MDs, and MDe models, the increase in prediction accuracy of GK over GB ranged from 34 to 70%. For traits PH and EH, gains in prediction accuracy of models with GK compared to models with GB were smaller than those achieved in GY. Also, these gains in prediction accuracy decreased when a more difficult prediction problem was studied. PMID:28455415
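The GB and GK methods named in this abstract differ in how a marker matrix is turned into a relationship (kernel) matrix between lines. The sketch below builds both in plain Python. The specific scaling is an assumption, since the abstract does not give it: the linear kernel is taken as the cross-product of column-centered markers divided by the number of markers, and the Gaussian kernel as exp(-h · d² / p) with a hypothetical bandwidth `h`. The tiny marker matrix is made up for illustration.

```python
import math


def center_columns(X):
    """Subtract each marker's mean so every column sums to zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def linear_kernel(X):
    """GBLUP-style linear kernel: Z Z' / p on centered markers (assumed scaling)."""
    Z = center_columns(X)
    p = len(Z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / p for zj in Z] for zi in Z]


def gaussian_kernel(X, h=1.0):
    """Gaussian kernel: exp(-h * squared distance / p); h is a tuning assumption."""
    p = len(X[0])

    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [[math.exp(-h * sqdist(xi, xj) / p) for xj in X] for xi in X]


# Hypothetical 3 lines x 4 markers, coded 0/1/2.
X = [[0, 1, 2, 1],
     [0, 1, 2, 0],
     [2, 1, 0, 1]]

G = linear_kernel(X)
K = gaussian_kernel(X, h=1.0)
for row in K:
    print([round(v, 3) for v in row])
```

The Gaussian kernel depends on distances rather than cross-products, so it can capture nonlinear similarity between lines; in practice the bandwidth would be tuned or estimated rather than fixed at 1.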
Assessment of Protein Side-Chain Conformation Prediction Methods in Different Residue Environments
Peterson, Lenna X.; Kang, Xuejiao; Kihara, Daisuke
2016-01-01
Computational prediction of side-chain conformation is an important component of protein structure prediction. Accurate side-chain prediction is crucial for practical applications of protein structure models that need atomic detailed resolution such as protein and ligand design. We evaluated the accuracy of eight side-chain prediction methods in reproducing the side-chain conformations of experimentally solved structures deposited to the Protein Data Bank. Prediction accuracy was evaluated for a total of four different structural environments (buried, surface, interface, and membrane-spanning) in three different protein types (monomeric, multimeric, and membrane). Overall, the highest accuracy was observed for buried residues in monomeric and multimeric proteins. Notably, side-chains at protein interfaces and membrane-spanning regions were better predicted than surface residues even though the methods did not all use multimeric and membrane proteins for training. Thus, we conclude that the current methods are as practically useful for modeling protein docking interfaces and membrane-spanning regions as for modeling monomers. PMID:24619909
Improving transmembrane protein consensus topology prediction using inter-helical interaction.
Wang, Han; Zhang, Chao; Shi, Xiaohu; Zhang, Li; Zhou, You
2012-11-01
Alpha helix transmembrane proteins (αTMPs) represent roughly 30% of all open reading frames (ORFs) in a typical genome and are involved in many critical biological processes. Because of their special physicochemical properties, they are hard to crystallize and high-resolution structures are difficult to obtain experimentally; sequence-based topology prediction is therefore highly desirable for the study of transmembrane proteins (TMPs), both in structure prediction and function prediction. Various model-based topology prediction methods have been developed, but the accuracy of those individual predictors remains poor due to the limitations of the methods or the features they use. Thus, the consensus topology prediction method becomes practical for high-accuracy applications by combining the advantages of the individual predictors. Here, based on the observation that inter-helical interactions are commonly found within the transmembrane helices (TMHs) and strongly indicate their existence, we present a novel consensus topology prediction method for αTMPs, CNTOP, which incorporates four leading individual topology predictors and further improves the prediction accuracy by using the predicted inter-helical interactions. The method achieved 87% prediction accuracy on a benchmark dataset and 78% accuracy on a non-redundant dataset composed of polytopic αTMPs. Our method achieves higher topology accuracy than the individual predictors and other consensus predictors; at the same time, the TMHs are more accurately predicted in their length and location, with both false positives (FPs) and false negatives (FNs) decreasing dramatically. CNTOP is available at: http://ccst.jlu.edu.cn/JCSB/cntop/CNTOP.html. Copyright © 2012 Elsevier B.V. All rights reserved.
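To make the idea of a consensus over individual predictors concrete, the sketch below takes a per-residue majority vote across several topology strings. This is a generic baseline and not CNTOP's algorithm, which additionally uses predicted inter-helical interactions that the abstract does not specify. The label alphabet (`i` inside, `o` outside, `M` membrane), the function name, and the example predictions are all hypothetical.

```python
from collections import Counter


def consensus_topology(predictions):
    """Per-residue majority vote over equal-length topology strings.

    Ties are resolved in favor of the label seen first, i.e. the
    earlier predictor in the list wins.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for labels in zip(*predictions):
        # most_common orders equal counts by first occurrence.
        consensus.append(Counter(labels).most_common(1)[0][0])
    return "".join(consensus)


# Hypothetical outputs from four individual predictors.
preds = ["iiMMMoo", "iiMMMMo", "iiiMMoo", "ooMMMoo"]
print(consensus_topology(preds))
```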
Rosenthal, Eric S; Biswal, Siddharth; Zafar, Sahar F; O'Connor, Kathryn L; Bechek, Sophia; Shenoy, Apeksha V; Boyle, Emily J; Shafi, Mouhsin M; Gilmore, Emily J; Foreman, Brandon P; Gaspard, Nicolas; Leslie-Mazwi, Thabele M; Rosand, Jonathan; Hoch, Daniel B; Ayata, Cenk; Cash, Sydney S; Cole, Andrew J; Patel, Aman B; Westover, M Brandon
2018-04-16
Delayed cerebral ischemia (DCI) is a common, disabling complication of subarachnoid hemorrhage (SAH). Preventing DCI is a key focus of neurocritical care, but interventions carry risk and cannot be applied indiscriminately. Although retrospective studies have identified continuous electroencephalographic (cEEG) measures associated with DCI, no study has characterized the accuracy of cEEG with sufficient rigor to justify using it to triage patients to interventions or clinical trials. We therefore prospectively assessed the accuracy of cEEG for predicting DCI, following the Standards for Reporting Diagnostic Accuracy Studies. We prospectively performed cEEG in nontraumatic, high-grade SAH patients at a single institution. The index test consisted of clinical neurophysiologists prospectively reporting prespecified EEG alarms: (1) decreasing relative alpha variability, (2) decreasing alpha-delta ratio, (3) worsening focal slowing, or (4) late appearing epileptiform abnormalities. The diagnostic reference standard was DCI determined by blinded, adjudicated review. Primary outcome measures were sensitivity and specificity of cEEG for subsequent DCI, determined by multistate survival analysis, adjusted for baseline risk. One hundred three of 227 consecutive patients were eligible and underwent cEEG monitoring (7.7-day mean duration). EEG alarms occurred in 96.2% of patients with and 19.6% without subsequent DCI (1.9-day median latency, interquartile range = 0.9-4.1). Among alarm subtypes, late onset epileptiform abnormalities had the highest predictive value. Prespecified EEG findings predicted DCI among patients with low (91% sensitivity, 83% specificity) and high (95% sensitivity, 77% specificity) baseline risk. cEEG accurately predicts DCI following SAH and may help target therapies to patients at highest risk of secondary brain injury. Ann Neurol 2018. © 2018 American Neurological Association.
Davey, James A; Chica, Roberto A
2014-05-01
Multistate computational protein design (MSD) with backbone ensembles approximating conformational flexibility can predict higher quality sequences than single-state design with a single fixed backbone. However, it is currently unclear what characteristics of backbone ensembles are required for the accurate prediction of protein sequence stability. In this study, we aimed to improve the accuracy of protein stability predictions made with MSD by using a variety of backbone ensembles to recapitulate the experimentally measured stability of 85 Streptococcal protein G domain β1 sequences. Ensembles tested here include an NMR ensemble as well as those generated by molecular dynamics (MD) simulations, by Backrub motions, and by PertMin, a new method that we developed involving the perturbation of atomic coordinates followed by energy minimization. MSD with the PertMin ensembles resulted in the most accurate predictions by providing the highest number of stable sequences in the top 25, and by correctly binning sequences as stable or unstable with the highest success rate (≈90%) and the lowest number of false positives. The performance of PertMin ensembles is due to the fact that their members closely resemble the input crystal structure and have low potential energy. Conversely, the NMR ensemble as well as those generated by MD simulations at 500 or 1000 K reduced prediction accuracy due to their low structural similarity to the crystal structure. The ensembles tested herein thus represent on- or off-target models of the native protein fold and could be used in future studies to design for desired properties other than stability. Copyright © 2013 Wiley Periodicals, Inc.
Genotyping by sequencing for genomic prediction in a soybean breeding population.
Jarquín, Diego; Kocak, Kyle; Posadas, Luis; Hyma, Katie; Jedlicka, Joseph; Graef, George; Lorenz, Aaron
2014-08-29
Advances in genotyping technology, such as genotyping by sequencing (GBS), are making genomic prediction more attractive to reduce breeding cycle times and costs associated with phenotyping. Genomic prediction and selection have been studied in several crop species, but no reports exist in soybean. The objectives of this study were to (i) evaluate prospects for genomic selection using GBS in a typical soybean breeding program and (ii) evaluate the effect of GBS marker selection and imputation on genomic prediction accuracy. To achieve these objectives, a set of soybean lines sampled from the University of Nebraska Soybean Breeding Program was genotyped using GBS and evaluated for yield and other agronomic traits at multiple Nebraska locations. Genotyping by sequencing scored 16,502 single nucleotide polymorphisms (SNPs) with minor-allele frequency (MAF) > 0.05 and percentage of missing values ≤ 5% on 301 elite soybean breeding lines. When SNPs with up to 80% missing values were included, 52,349 SNPs were scored. Prediction accuracy for grain yield, assessed using cross validation, was estimated to be 0.64, indicating good potential for using genomic selection for grain yield in soybean. Filtering SNPs based on missing data percentage had little to no effect on prediction accuracy, especially when random forest imputation was used to impute missing values. The highest accuracies were observed when random forest imputation was used on all SNPs, but differences were not significant. A standard additive G-BLUP model was robust; modeling additive-by-additive epistasis did not provide any improvement in prediction accuracy. The effect of training population size on accuracy began to plateau around 100, but accuracy steadily climbed until the largest possible size was used in this analysis. Including only SNPs with MAF > 0.30 provided higher accuracies when training populations were smaller.
Using GBS for genomic prediction in soybean holds good potential to expedite genetic gain. Our results suggest that standard additive G-BLUP models can be used on unfiltered, imputed GBS data without loss in accuracy.
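The marker filters described in this abstract (MAF > 0.05, missing values ≤ 5%, or the relaxed 80% missing threshold) can be sketched in a few lines of Python. The genotype coding (0/1/2 allele counts with `None` for a missing call), the function names, and the toy matrix are assumptions for illustration; the study's actual pipeline is not described at this level of detail.

```python
def snp_stats(column):
    """Return (missing_rate, minor_allele_frequency) for one SNP column."""
    called = [g for g in column if g is not None]
    missing_rate = 1 - len(called) / len(column)
    if not called:
        return missing_rate, 0.0
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return missing_rate, min(freq, 1 - freq)


def filter_snps(genotypes, min_maf=0.05, max_missing=0.05):
    """Return indices of SNPs with MAF > min_maf and missing rate <= max_missing."""
    keep = []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        missing_rate, maf = snp_stats(column)
        if maf > min_maf and missing_rate <= max_missing:
            keep.append(j)
    return keep


# Hypothetical 4 lines x 3 SNPs; None marks a missing genotype call.
genotypes = [
    [0, 2, None],
    [1, 2, 1],
    [2, 2, None],
    [1, 2, 0],
]

print(filter_snps(genotypes))                   # strict missing-data filter
print(filter_snps(genotypes, max_missing=0.8))  # relaxed filter, as in the 80% case
```

Relaxing `max_missing` admits more SNPs, which mirrors the jump in marker count reported when SNPs with up to 80% missing values were included; those retained markers would then need imputation before model fitting.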
Comparison of Three Risk Scores to Predict Outcomes of Severe Lower Gastrointestinal Bleeding
Camus, Marine; Jensen, Dennis M.; Ohning, Gordon V.; Kovacs, Thomas O.; Jutabha, Rome; Ghassemi, Kevin A.; Machicado, Gustavo A.; Dulai, Gareth S.; Jensen, Mary Ellen; Gornbein, Jeffrey A.
2014-01-01
Background & aims: Improved medical decisions by using a score at the initial patient triage level may lead to improvements in patient management, outcomes, and resource utilization. There is no validated score for management of lower gastrointestinal bleeding (LGIB), unlike for upper GIB. The aim of our study was to compare the accuracies of 3 different prognostic scores (CURE Hemostasis prognosis score, Charlson index and ASA score) for the prediction of 30-day rebleeding, surgery and death in severe LGIB. Methods: Data on consecutive patients hospitalized with severe GI bleeding from January 2006 to October 2011 in our two tertiary academic referral centers were prospectively collected. Sensitivities, specificities, accuracies and areas under the receiver operating characteristic curve (AUROC) were computed for the three scores for predictions of rebleeding, surgery and mortality at 30 days. Results: 235 consecutive patients with LGIB were included between 2006 and 2011. 23% of patients rebled, 6% had surgery, and 7.7% of patients died. The accuracies of each score never reached 70% for predicting rebleeding or surgery in either. The ASA score had the highest accuracy for predicting mortality within 30 days (83.5%), whereas the CURE Hemostasis prognosis score and the Charlson index both had accuracies less than 75% for the prediction of death within 30 days. Conclusions: The ASA score could be useful to predict death within 30 days. However, a new score is still warranted to predict all 30-day outcomes (rebleeding, surgery and death) in LGIB. PMID:25599218
Genomic prediction of reproduction traits for Merino sheep.
Bolormaa, S; Brown, D J; Swan, A A; van der Werf, J H J; Hayes, B J; Daetwyler, H D
2017-06-01
Economically important reproduction traits in sheep, such as number of lambs weaned and litter size, are expressed only in females and later in life after most selection decisions are made, which makes them ideal candidates for genomic selection. Accurate genomic predictions would lead to greater genetic gain for these traits by enabling accurate selection of young rams with high genetic merit. The aim of this study was to design and evaluate the accuracy of a genomic prediction method for female reproduction in sheep using daughter trait deviations (DTD) for sires and ewe phenotypes (when individual ewes were genotyped) for three reproduction traits: number of lambs born (NLB), litter size (LSIZE) and number of lambs weaned. Genomic best linear unbiased prediction (GBLUP), BayesR and pedigree BLUP analyses of the three reproduction traits measured on 5340 sheep (4503 ewes and 837 sires) with real and imputed genotypes for 510 174 SNPs were performed. The prediction of breeding values using both sire and ewe trait records was validated in Merino sheep. Prediction accuracy was evaluated by across sire family and random cross-validations. Accuracies of genomic estimated breeding values (GEBVs) were assessed as the mean Pearson correlation adjusted by the accuracy of the input phenotypes. The addition of sire DTD into the prediction analysis resulted in higher accuracies compared with using only ewe records in genomic predictions or pedigree BLUP. Using GBLUP, the average accuracy based on the combined records (ewes and sire DTD) was 0.43 across traits, but the accuracies varied by trait and type of cross-validations. The accuracies of GEBVs from random cross-validations (range 0.17-0.61) were higher than were those from sire family cross-validations (range 0.00-0.51). The GEBV accuracies of 0.41-0.54 for NLB and LSIZE based on the combined records were amongst the highest in the study. 
Although BayesR was not significantly different from GBLUP in prediction accuracy, it identified several candidate genes which are known to be associated with NLB and LSIZE. The approach provides a way to make use of all data available in genomic prediction for traits that have limited recording. © 2017 Stichting International Foundation for Animal Genetics.
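The abstract reports GEBV accuracy as a Pearson correlation adjusted by the accuracy of the input phenotypes. One common reading, assumed here because the abstract does not spell it out, is to divide the raw correlation between predicted and observed values by the accuracy of the observed phenotype. The Python sketch below follows that assumption; the function names and all numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjusted_accuracy(gebv, phenotype, phenotype_accuracy):
    """Raw correlation divided by phenotype accuracy (assumed adjustment)."""
    return pearson(gebv, phenotype) / phenotype_accuracy


# Hypothetical validation fold: predicted values vs. observed deviations.
gebv = [1.0, 2.0, 3.0, 4.0]
observed = [1.0, 3.0, 2.0, 4.0]
print(f"raw r = {pearson(gebv, observed):.3f}")
print(f"adjusted = {adjusted_accuracy(gebv, observed, 0.9):.3f}")
```

In a cross-validation setting this quantity would be computed per fold on animals held out of training and then averaged, which matches the "mean Pearson correlation" wording.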
Chen, L; Schenkel, F; Vinsky, M; Crews, D H; Li, C
2013-10-01
In beef cattle, phenotypic data that are difficult and/or costly to measure, such as feed efficiency, and DNA marker genotypes are usually available on a small number of animals of different breeds or populations. To achieve a maximal accuracy of genomic prediction using the phenotype and genotype data, strategies for forming a training population to predict genomic breeding values (GEBV) of the selection candidates need to be evaluated. In this study, we examined the accuracy of predicting GEBV for residual feed intake (RFI) based on 522 Angus and 395 Charolais steers genotyped on SNP with the Illumina Bovine SNP50 Beadchip for 3 training population forming strategies: within breed, across breed, and by pooling data from the 2 breeds (i.e., combined). Two other scenarios with the training and validation data split by birth year and by sire family within a breed were also investigated to assess the impact of genetic relationships on the accuracy of genomic prediction. Three statistical methods including the best linear unbiased prediction with the relationship matrix defined based on the pedigree (PBLUP), based on the SNP genotypes (GBLUP), and a Bayesian method (BayesB) were used to predict the GEBV. The results showed that the accuracy of the GEBV prediction was the highest when the prediction was within breed and when the validation population had greater genetic relationships with the training population, with a maximum of 0.58 for Angus and 0.64 for Charolais. The within-breed prediction accuracies dropped to 0.29 and 0.38, respectively, when the validation populations had a minimal pedigree link with the training population. When the training population of a different breed was used to predict the GEBV of the validation population, that is, across-breed genomic prediction, the accuracies were further reduced to 0.10 to 0.22, depending on the prediction method used. 
Pooling data from the 2 breeds to form the training population increased accuracies to 0.31 and 0.43, respectively, for the Angus and Charolais validation populations. The results suggested that the genetic relationship of selection candidates with the training population has a greater impact on the accuracy of GEBV using the Illumina Bovine SNP50 Beadchip. Pooling data from different breeds to form the training population will improve the accuracy of across breed genomic prediction for RFI in beef cattle.
He, Jun; Xu, Jiaqi; Wu, Xiao-Lin; Bauck, Stewart; Lee, Jungjae; Morota, Gota; Kachman, Stephen D; Spangler, Matthew L
2018-04-01
SNP chips are commonly used for genotyping animals in genomic selection but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequencies, and SNP-trait associations either for single traits independently or for all the three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821-0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825-0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications on the design of LD SNP chips for imputation-enabled genomic prediction.
Improving prediction accuracy of cooling load using EMD, PSR and RBFNN
NASA Astrophysics Data System (ADS)
Shen, Limin; Wen, Yuanmei; Li, Xiaohong
2017-08-01
To increase the accuracy of cooling load demand prediction, this work presents an EMD (empirical mode decomposition)-PSR (phase space reconstruction) based RBFNN (radial basis function neural network) method. First, the chaotic nature of the real cooling load demand is analyzed, and the non-stationary historical cooling load data are transformed into several stationary intrinsic mode functions (IMFs) using EMD. Second, the RBFNN prediction accuracies of the individual IMFs are compared, and an IMF combining scheme is proposed that combines the lower-frequency components (IMF4-IMF6) while keeping the higher-frequency components (IMF1, IMF2, IMF3) and the residual unchanged. Third, the phase space is reconstructed for each of the combined components separately, the highest-frequency component (IMF1) is processed by a differential method, and predictions are made with RBFNN in the reconstructed phase spaces. Real cooling load data from a centralized ice storage cooling system in Guangzhou are used for simulation. The results show that the proposed hybrid method outperforms the traditional methods.
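The phase space reconstruction step can be pictured as a time-delay embedding, sketched below. The abstract gives neither the embedding dimension nor the delay, so the values used here, the function name, and the toy load series are hypothetical.

```python
def delay_embed(series, dimension, delay):
    """Return delay vectors [x(t), x(t - delay), ..., x(t - (dimension - 1) * delay)]."""
    if dimension < 1 or delay < 1:
        raise ValueError("dimension and delay must be positive integers")
    start = (dimension - 1) * delay  # first index with a full history
    return [
        [series[t - k * delay] for k in range(dimension)]
        for t in range(start, len(series))
    ]


# Hypothetical cooling-load samples (arbitrary units).
load = [10.0, 12.0, 15.0, 14.0, 11.0, 9.0, 13.0]
vectors = delay_embed(load, dimension=3, delay=2)
```

Each delay vector would then serve as one input pattern for the network, with a later load value as the target.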
Application of GA-SVM method with parameter optimization for landslide development prediction
NASA Astrophysics Data System (ADS)
Li, X. Z.; Kong, J. M.
2013-10-01
Prediction of the landslide development process is a persistent focus of landslide research. So far, many methods for landslide displacement series prediction have been proposed. The support vector machine (SVM) has been shown to be a novel algorithm with good performance. However, the performance strongly depends on the right selection of the parameters (C and γ) of the SVM model. In this study, we present an application of the GA-SVM method with parameter optimization to landslide displacement rate prediction. We selected a typical large-scale landslide in a hydro-electrical engineering area of Southwest China as a case. On the basis of analyzing the basic characteristics and monitoring data of the landslide, a single-factor GA-SVM model and a multi-factor GA-SVM model of the landslide were built. Moreover, the models were compared with single-factor and multi-factor SVM models of the landslide. The results show that the four models have high prediction accuracies, but the accuracies of the GA-SVM models are slightly higher than those of the SVM models, and the accuracies of the multi-factor models are slightly higher than those of the single-factor models for the landslide prediction. The accuracy of the multi-factor GA-SVM model is the highest, with the smallest RMSE of 0.0009 and the biggest RI of 0.9992.
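The root-mean-square error used to rank the landslide models can be computed as in the short sketch below. The displacement-rate values are invented for illustration and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical displacement rates (units are arbitrary here).
observed_rates = [0.10, 0.12, 0.15, 0.11]
predicted_rates = [0.11, 0.12, 0.14, 0.11]
error = rmse(observed_rates, predicted_rates)
```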
Pietraszek-Grzywaczewska, Iwona; Bernas, Szymon; Łojko, Piotr; Piechota, Anna; Piechota, Mariusz
2016-01-01
Scoring systems in critical care patients are essential for predicting patient outcome and evaluating therapy. In this study, we determined the value of the Acute Physiology and Chronic Health Evaluation II (APACHE II), Simplified Acute Physiology Score II (SAPS II), Sequential Organ Failure Assessment (SOFA) and Glasgow Coma Scale (GCS) scoring systems in the prediction of mortality in adult patients admitted to the intensive care unit (ICU) with severe purulent bacterial meningitis. We retrospectively analysed data from 98 adult patients with severe purulent bacterial meningitis who were admitted to a single ICU between March 2006 and September 2015. Univariate logistic regression identified the following risk factors of death in patients with severe purulent bacterial meningitis: APACHE II, SAPS II, SOFA, and GCS scores, and the lengths of ICU stay and hospital stay. The independent risk factors of patient death in multivariate analysis were the SAPS II score, the length of ICU stay and the length of hospital stay. In the prediction of mortality according to the area under the curve, the SAPS II score had the highest accuracy followed by the APACHE II, GCS and SOFA scores. For the prediction of mortality in a patient with severe purulent bacterial meningitis, SAPS II had the highest accuracy.
Tighe, Patrick J.; Harle, Christopher A.; Hurley, Robert W.; Aytug, Haldun; Boezaart, Andre P.; Fillingim, Roger B.
2015-01-01
Background Given their ability to process highly dimensional datasets with hundreds of variables, machine learning algorithms may offer one solution to the vexing challenge of predicting postoperative pain. Methods Here, we report on the application of machine learning algorithms to predict postoperative pain outcomes in a retrospective cohort of 8071 surgical patients using 796 clinical variables. Five algorithms were compared in terms of their ability to forecast moderate to severe postoperative pain: Least Absolute Shrinkage and Selection Operator (LASSO), gradient-boosted decision tree, support vector machine, neural network, and k-nearest neighbor, with logistic regression included for baseline comparison. Results In forecasting moderate to severe postoperative pain for postoperative day (POD) 1, the LASSO algorithm, using all 796 variables, had the highest accuracy with an area under the receiver-operating curve (ROC) of 0.704. Next, the gradient-boosted decision tree had an ROC of 0.665 and the k-nearest neighbor algorithm had an ROC of 0.643. For POD 3, the LASSO algorithm, using all variables, again had the highest accuracy, with an ROC of 0.727. Logistic regression had a lower ROC of 0.5 for predicting pain outcomes on POD 1 and 3. Conclusions Machine learning algorithms, when combined with complex and heterogeneous data from electronic medical record systems, can forecast acute postoperative pain outcomes with accuracies similar to methods that rely only on variables specifically collected for pain outcome prediction. PMID:26031220
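The area under the ROC curve reported above can be read as the probability that a randomly chosen patient with moderate to severe pain receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it that way (the pairwise Mann-Whitney formulation); the labels and scores are hypothetical, and this is not presented as the authors' own evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = moderate to severe pain) and model scores.
pain_labels = [1, 0, 1, 0, 1, 0]
risk_scores = [0.9, 0.3, 0.6, 0.6, 0.4, 0.2]
auc = roc_auc(pain_labels, risk_scores)
```

Under this reading, an AUC of 0.5 corresponds to ranking no better than chance.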
AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad
2016-01-01
Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the area under the curve. The Cochran-Q test was applied to check differences in proportion among methods. To assess the association between the observed and predicted values, the Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts common to all four methods. The SVM method showed the highest accuracy, 0.68 and 0.67 for the training and testing samples, respectively. However, this method resulted in the highest specificity (0.67 for the training and 0.68 for the testing sample) and the highest sensitivity for the training sample (0.85), but the lowest sensitivity for the testing sample (0.53). The Cochran-Q test showed differences between proportions in different methods (P<0.001). For the association between SVM predictions and observed values, the Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance in classifying fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
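The comparison measures named in this abstract all follow from a 2x2 confusion matrix. The sketch below computes them, together with the Ø (phi) coefficient, from hypothetical counts; none of the numbers come from the study.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from true/false positive/negative counts."""
    total = tp + fp + fn + tn
    phi_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
        "phi": (tp * tn - fp * fn) / phi_denominator,
    }


# Hypothetical counts for one classifier on one sample.
metrics = classification_metrics(tp=40, fp=20, fn=10, tn=30)
```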
BDDCS Class Prediction for New Molecular Entities
Broccatelli, Fabio; Cruciani, Gabriele; Benet, Leslie Z.; Oprea, Tudor I.
2012-01-01
The Biopharmaceutics Drug Disposition Classification System (BDDCS) was successfully employed for predicting drug-drug interactions (DDIs) with respect to drug metabolizing enzymes (DMEs), drug transporters and their interplay. The major assumption of BDDCS is that the extent of metabolism (EoM) predicts high versus low intestinal permeability rate, and vice versa, at least when uptake transporters or paracellular transport are not involved. We recently published a collection of over 900 marketed drugs classified for BDDCS. We suggest that a reliable model for predicting BDDCS class, integrated with in vitro assays, could anticipate disposition and potential DDIs of new molecular entities (NMEs). Here we describe a computational procedure for predicting BDDCS class from molecular structures. The model was trained on a set of 300 oral drugs, and validated on an external set of 379 oral drugs, using 17 descriptors calculated or derived from the VolSurf+ software. For each molecule, a probability of BDDCS class membership was given, based on predicted EoM, FDA solubility (FDAS) and their confidence scores. The accuracy in predicting FDAS was 78% in training and 77% in validation, while for EoM prediction the accuracy was 82% in training and 79% in external validation. The actual BDDCS class corresponded to the highest ranked calculated class for 55% of the validation molecules, and it was within the top two ranked more than 92% of the time. The unbalanced stratification of the dataset didn’t affect the prediction, which showed the highest accuracy in predicting classes 2 and 3 with respect to the most populated class 1. For class 4 drugs a general lack of predictability was observed. A linear discriminant analysis (LDA) confirmed that the degree of accuracy for the prediction of the different BDDCS classes is tied to the structure of the dataset.
This model could routinely be used in early drug discovery to prioritize in vitro tests for NMEs (e.g., affinity to transporters, intestinal metabolism, intestinal absorption and plasma protein binding). We further applied the BDDCS prediction model on a large set of medicinal chemistry compounds (over 30,000 chemicals). Based on this application, we suggest that solubility, and not permeability, is the major difference between NMEs and drugs. We anticipate that the forecast of BDDCS categories in early drug discovery may lead to a significant R&D cost reduction. PMID:22224483
Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E
2017-07-01
The leaf spotting diseases in wheat that include Septoria tritici blotch (STB) caused by Zymoseptoria tritici, Stagonospora nodorum blotch (SNB) caused by Parastagonospora nodorum, and tan spot (TS) caused by Pyrenophora tritici-repentis pose challenges to breeding programs in selecting for resistance. A promising approach that could enable selection prior to phenotyping is genomic selection that uses genome-wide markers to estimate breeding values (BVs) for quantitative traits. To evaluate this approach for seedling and/or adult plant resistance (APR) to STB, SNB, and TS, we compared the predictive ability of least-squares (LS) approach with genomic-enabled prediction models including genomic best linear unbiased predictor (GBLUP), Bayesian ridge regression (BRR), Bayes A (BA), Bayes B (BB), Bayes Cπ (BC), Bayesian least absolute shrinkage and selection operator (BL), and reproducing kernel Hilbert spaces markers (RKHS-M), a pedigree-based model (RKHS-P) and RKHS markers and pedigree (RKHS-MP). We observed that LS gave the lowest prediction accuracies and RKHS-MP, the highest. The genomic-enabled prediction models and RKHS-P gave similar accuracies. The increase in accuracy using genomic prediction models over LS was 48%. The mean genomic prediction accuracies were 0.45 for STB (APR), 0.55 for SNB (seedling), 0.66 for TS (seedling) and 0.48 for TS (APR). We also compared markers from two whole-genome profiling approaches: genotyping by sequencing (GBS) and diversity arrays technology sequencing (DArTseq) for prediction. While GBS markers performed slightly better than DArTseq, combining markers from the two approaches did not improve accuracies. We conclude that implementing GS in breeding for these diseases would help to achieve higher accuracies and rapid gains from selection. Copyright © 2017 Crop Science Society of America.
High accuracy operon prediction method based on STRING database scores.
Taboada, Blanca; Verde, Cristina; Merino, Enrique
2010-07-01
We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen, L.J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C., Muller, J., Doerks, T., Julien, P., Roth, A., Simonovic, M. et al. (2009) STRING 8-a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412-D416). These two parameters were used to train a neural network on a subset of experimentally characterized Escherichia coli and Bacillus subtilis operons. Our predictive model was successfully tested on the set of experimentally defined operons in E. coli and B. subtilis, with accuracies of 94.6 and 93.3%, respectively. As far as we know, these are the highest accuracies ever obtained for predicting bacterial operons. Furthermore, in order to evaluate the predictive accuracy of our model when using one organism's data set for the training procedure and a different organism's data set for testing, we repeated the E. coli operon prediction analysis using a neural network trained with B. subtilis data, and a B. subtilis analysis using a neural network trained with E. coli data. Even for these cases, the accuracies reached with our method were outstandingly high, 91.5 and 93%, respectively. These results show the potential use of our method for accurately predicting the operons of any other organism. Our operon predictions for fully-sequenced genomes are available at http://operons.ibt.unam.mx/OperonPredictor/.
Comparison of Three Risk Scores to Predict Outcomes of Severe Lower Gastrointestinal Bleeding.
Camus, Marine; Jensen, Dennis M; Ohning, Gordon V; Kovacs, Thomas O; Jutabha, Rome; Ghassemi, Kevin A; Machicado, Gustavo A; Dulai, Gareth S; Jensen, Mary E; Gornbein, Jeffrey A
2016-01-01
Improved medical decisions by using a score at the initial patient triage level may lead to improvements in patient management, outcomes, and resource utilization. There is no validated score for management of lower gastrointestinal bleeding (LGIB) unlike for upper gastrointestinal bleeding. The aim of our study was to compare the accuracies of 3 different prognostic scores [Center for Ulcer Research and Education Hemostasis prognosis score, Charlson index, and American Society of Anesthesiologists (ASA) score] for the prediction of 30-day rebleeding, surgery, and death in severe LGIB. Data on consecutive patients hospitalized with severe gastrointestinal bleeding from January 2006 to October 2011 in our 2 tertiary academic referral centers were prospectively collected. Sensitivities, specificities, accuracies, and area under the receiver operator characteristic curve were computed for 3 scores for predictions of rebleeding, surgery, and mortality at 30 days. Two hundred thirty-five consecutive patients with LGIB were included between 2006 and 2011. Twenty-three percent of patients rebled, 6% had surgery, and 7.7% of patients died. The accuracies of each score never reached 70% for predicting rebleeding or surgery in either. The ASA score had the highest accuracy for predicting mortality within 30 days (83.5%), whereas the Center for Ulcer Research and Education Hemostasis prognosis score and the Charlson index both had accuracies <75% for the prediction of death within 30 days. The ASA score could be useful to predict death within 30 days. However, a new score is still warranted to predict all 30-day outcomes (rebleeding, surgery, and death) in LGIB.
2009-01-01
Background Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Methods Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. Results For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI, for PPT only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree based predictions gave 1.05 - 1.34 times higher accuracies compared to predictions based on pedigree alone. 
Some methods have largely different computational requirements, with PLSR and RR-BLUP requiring the least computing time. Conclusions The four methods which use information from all SNP namely RR-BLUP, Bayes-R, PLSR and SVR generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended. PMID:20043835
Banzato, T; Cherubini, G B; Atzori, M; Zotti, A
2018-05-01
An established deep neural network (DNN) based on transfer learning and a newly designed DNN were tested to predict the grade of meningiomas from magnetic resonance (MR) images in dogs and to determine the accuracy of classification using pre- and post-contrast T1-weighted (T1W), and T2-weighted (T2W) MR images. The images were randomly assigned to a training set, a validation set and a test set, comprising 60%, 10% and 30% of images, respectively. The combination of DNN and MR sequence displaying the highest discriminating accuracy was used to develop an image classifier to predict the grading of new cases. The algorithm based on transfer learning using the established DNN did not provide satisfactory results, whereas the newly designed DNN had high classification accuracy. On the basis of classification accuracy, an image classifier built on the newly designed DNN using post-contrast T1W images was developed. This image classifier correctly predicted the grading of 8 out of 10 images not included in the data set. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
The development of a probabilistic approach to forecast coastal change
Lentz, Erika E.; Hapke, Cheryl J.; Rosati, Julie D.; Wang, Ping; Roberts, Tiffany M.
2011-01-01
This study demonstrates the applicability of a Bayesian probabilistic model as an effective tool in predicting post-storm beach changes along sandy coastlines. Volume change and net shoreline movement are modeled for two study sites at Fire Island, New York in response to two extratropical storms in 2007 and 2009. Both study areas include modified areas adjacent to unmodified areas in morphologically different segments of coast. Predicted outcomes are evaluated against observed changes to test model accuracy and uncertainty along 163 cross-shore transects. Results show strong agreement in the cross validation of predictions vs. observations, with 70-82% accuracies reported. Although no consistent spatial pattern in inaccurate predictions could be determined, the highest prediction uncertainties appeared in locations that had been recently replenished. Further testing and model refinement are needed; however, these initial results show that Bayesian networks have the potential to serve as important decision-support tools in forecasting coastal change.
Training set optimization under population structure in genomic selection.
Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E
2015-01-01
Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, the most phenotypic variation captured by a sampling method in the TRS is desirable. The wheat dataset showed mild population structure, and CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.
Consumer preferences for the predictive genetic test for Alzheimer disease.
Huang, Ming-Yi; Huston, Sally A; Perri, Matthew
2014-04-01
The purpose of this study was to assess consumer preferences for predictive genetic testing for Alzheimer disease in the United States. A rating conjoint analysis was conducted using an anonymous online survey distributed by Qualtrics to a general population panel in April 2011 in the United States. The study design included three attributes: Accuracy (40%, 80%, and 100%), Treatment Availability (Cure is available/Drug for symptom relief but no cure), and Anonymity (Anonymous/Not anonymous). A total of 12 scenarios were used to elicit people's preferences, assessed on an 11-point scale. The respondents also indicated their highest willingness-to-pay (WTP) for each scenario through open-ended questions. A total of 295 responses were collected over 4 days. The most important attribute for the aggregate model was Accuracy, contributing 64.73% to the preference rating. Treatment Availability and Anonymity contributed 20.72% and 14.59%, respectively, to the preference rating. The median WTP for the highest-rating scenario (Accuracy 100%, a cure is available, test result is anonymous) was $100 (mean = $276). The median WTP for the lowest-rating scenario (40% accuracy, no cure but drugs for symptom relief, not anonymous) was zero (mean = $34). The results of this study highlight attributes people find important when making the hypothetical decision to obtain an AD genetic test. These results should be of interest to policy makers, genetic test developers and health care providers.
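The 12 scenarios are consistent with a full factorial of the three attributes (3 x 2 x 2). The sketch below enumerates that design and shows the conventional range-based way of turning part-worth utilities into relative attribute importance. The abstract does not state how importance was derived, so the method is an assumption, and the part-worth values are hypothetical rather than the study's estimates.

```python
import itertools


def attribute_importance(part_worths):
    """Map {attribute: {level: utility}} to {attribute: percent importance}.

    Importance is each attribute's utility range divided by the sum of ranges.
    """
    ranges = {
        name: max(levels.values()) - min(levels.values())
        for name, levels in part_worths.items()
    }
    total = sum(ranges.values())
    if total == 0:
        raise ValueError("all part-worths are identical; importance is undefined")
    return {name: 100 * r / total for name, r in ranges.items()}


levels = {
    "Accuracy": ["40%", "80%", "100%"],
    "Treatment Availability": ["cure available", "symptom relief only"],
    "Anonymity": ["anonymous", "not anonymous"],
}
# Every combination of levels: 3 * 2 * 2 = 12 profiles.
scenarios = list(itertools.product(*levels.values()))

# Hypothetical part-worth utilities, chosen only to make the arithmetic visible.
hypothetical_part_worths = {
    "Accuracy": {"40%": -1.5, "80%": 0.5, "100%": 1.0},
    "Treatment Availability": {"cure available": 0.5, "symptom relief only": -0.5},
    "Anonymity": {"anonymous": 0.25, "not anonymous": -0.25},
}
importance = attribute_importance(hypothetical_part_worths)
```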
Clark, Samuel A; Hickey, John M; Daetwyler, Hans D; van der Werf, Julius H J
2012-02-09
The theory of genomic selection is based on the prediction of the effects of genetic markers in linkage disequilibrium with quantitative trait loci. However, genomic selection also relies on relationships between individuals to accurately predict genetic value. This study aimed to examine the importance of information on relatives versus that of unrelated or more distantly related individuals on the estimation of genomic breeding values. Simulated and real data were used to examine the effects of various degrees of relationship on the accuracy of genomic selection. Genomic Best Linear Unbiased Prediction (gBLUP) was compared to two pedigree based BLUP methods, one with a shallow one generation pedigree and the other with a deep ten generation pedigree. The accuracy of estimated breeding values for different groups of selection candidates that had varying degrees of relationships to a reference data set of 1750 animals was investigated. The gBLUP method predicted breeding values more accurately than BLUP. The most accurate breeding values were estimated using gBLUP for closely related animals. Similarly, the pedigree based BLUP methods were also accurate for closely related animals, however when the pedigree based BLUP methods were used to predict unrelated animals, the accuracy was close to zero. In contrast, gBLUP breeding values, for animals that had no pedigree relationship with animals in the reference data set, allowed substantial accuracy. An animal's relationship to the reference data set is an important factor for the accuracy of genomic predictions. Animals that share a close relationship to the reference data set had the highest accuracy from genomic predictions. However a baseline accuracy that is driven by the reference data set size and the overall population effective population size enables gBLUP to estimate a breeding value for unrelated animals within a population (breed), using information previously ignored by pedigree based BLUP methods.
Estimating the Uncertainty and Predictive Capabilities of Three-Dimensional Earth Models (Postprint)
2012-03-22
www.isc.ac.uk). This global database includes more than 7,000 events whose epicentral location accuracy is known to at least 5 km. GT events with...region, which illustrates the difficulty of validating a model with travel times alone. However, the IASPEI REL database is currently the highest...S (right) paths in the IASPEI REL ground-truth database . Stations are represented by purple triangles and events by gray circles. Note the sparse
Chara, Liaskou; Eleftherios, Vouzounerakis; Maria, Moirasgenti; Anastasia, Trikoupi; Chryssoula, Staikou
2014-01-01
Background and Aims: Difficult airway assessment is based on various anatomic parameters of upper airway, much of it being concentrated on oral cavity and the pharyngeal structures. The diagnostic value of tests based on neck anatomy in predicting difficult laryngoscopy was assessed in this prospective, open cohort study. Methods: We studied 341 adult patients scheduled to receive general anaesthesia. Thyromental distance (TMD), sternomental distance (STMD), ratio of height to thyromental distance (RHTMD) and neck circumference (NC) were measured pre-operatively. The laryngoscopic view was classified according to the Cormack–Lehane Grade (1-4). Difficult laryngoscopy was defined as Cormack–Lehane Grade 3 or 4. The optimal cut-off points for each variable were identified by using receiver operating characteristic analysis. Sensitivity, specificity and positive predictive value and negative predictive value (NPV) were calculated for each test. Multivariate analysis with logistic regression, including all variables, was used to create a predictive model. Comparisons between genders were also performed. Results: Laryngoscopy was difficult in 12.6% of the patients. The cut-off values were: TMD ≤7 cm, STMD ≤15 cm, RHTMD >18.4 and NC >37.5 cm. The RHTMD had the highest sensitivity (88.4%) and NPV (95.2%), while TMD had the highest specificity (83.9%). The area under curve (AUC) for the TMD, STMD, RHTMD and NC was 0.63, 0.64, 0.62 and 0.54, respectively. The predictive model exhibited a higher and statistically significant diagnostic accuracy (AUC: 0.68, P < 0.001). Gender-specific cut-off points improved the predictive accuracy of NC in women (AUC: 0.65). Conclusions: The TMD, STMD, RHTMD and NC were found to be poor single predictors of difficult laryngoscopy, while a model including all four variables had a significant predictive accuracy. Among the studied tests, gender-specific cut-off points should be used for NC. PMID:24963183
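One common way to pick an "optimal" cut-off from a ROC analysis is to maximise Youden's J (sensitivity + specificity - 1). The abstract does not say which criterion was used, so the sketch below is an assumption, and the measurements are invented for illustration.

```python
def best_cutoff(values, difficult, low_is_positive=True):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    low_is_positive=True flags a patient when value <= cutoff (as for TMD);
    otherwise a patient is flagged when value > cutoff (as for NC).
    """
    if not any(difficult) or all(difficult):
        raise ValueError("need both difficult and non-difficult cases")
    best = None
    for cutoff in sorted(set(values)):
        tp = fp = fn = tn = 0
        for value, is_difficult in zip(values, difficult):
            flagged = value <= cutoff if low_is_positive else value > cutoff
            if flagged and is_difficult:
                tp += 1
            elif flagged:
                fp += 1
            elif is_difficult:
                fn += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1:]


# Hypothetical thyromental distances (cm) and laryngoscopy outcomes.
tmd_cm = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 7.0]
was_difficult = [True, True, True, False, False, False, False, False]
cutoff, sens, spec = best_cutoff(tmd_cm, was_difficult)
```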
Liaskou, Chara; Vouzounerakis, Eleftherios; Moirasgenti, Maria; Trikoupi, Anastasia; Staikou, Chryssoula
2014-03-01
Difficult airway assessment is based on various anatomic parameters of upper airway, much of it being concentrated on oral cavity and the pharyngeal structures. The diagnostic value of tests based on neck anatomy in predicting difficult laryngoscopy was assessed in this prospective, open cohort study. We studied 341 adult patients scheduled to receive general anaesthesia. Thyromental distance (TMD), sternomental distance (STMD), ratio of height to thyromental distance (RHTMD) and neck circumference (NC) were measured pre-operatively. The laryngoscopic view was classified according to the Cormack-Lehane Grade (1-4). Difficult laryngoscopy was defined as Cormack-Lehane Grade 3 or 4. The optimal cut-off points for each variable were identified by using receiver operating characteristic analysis. Sensitivity, specificity and positive predictive value and negative predictive value (NPV) were calculated for each test. Multivariate analysis with logistic regression, including all variables, was used to create a predictive model. Comparisons between genders were also performed. Laryngoscopy was difficult in 12.6% of the patients. The cut-off values were: TMD ≤7 cm, STMD ≤15 cm, RHTMD >18.4 and NC >37.5 cm. The RHTMD had the highest sensitivity (88.4%) and NPV (95.2%), while TMD had the highest specificity (83.9%). The area under curve (AUC) for the TMD, STMD, RHTMD and NC was 0.63, 0.64, 0.62 and 0.54, respectively. The predictive model exhibited a higher and statistically significant diagnostic accuracy (AUC: 0.68, P < 0.001). Gender-specific cut-off points improved the predictive accuracy of NC in women (AUC: 0.65). The TMD, STMD, RHTMD and NC were found to be poor single predictors of difficult laryngoscopy, while a model including all four variables had a significant predictive accuracy. Among the studied tests, gender-specific cut-off points should be used for NC.
Zanderigo, Francesca; Sparacino, Giovanni; Kovatchev, Boris; Cobelli, Claudio
2007-09-01
The aim of this article was to use continuous glucose error-grid analysis (CG-EGA) to assess the accuracy of two time-series modeling methodologies recently developed to predict glucose levels ahead of time using continuous glucose monitoring (CGM) data. We considered subcutaneous time series of glucose concentration monitored every 3 minutes for 48 hours by the minimally invasive CGM sensor Glucoday® (Menarini Diagnostics, Florence, Italy) in 28 type 1 diabetic volunteers. Two prediction algorithms, based on first-order polynomial and autoregressive (AR) models, respectively, were considered with prediction horizons of 30 and 45 minutes and forgetting factors (ff) of 0.2, 0.5, and 0.8. CG-EGA was used on the predicted profiles to assess their point and dynamic accuracies using original CGM profiles as reference. Continuous glucose error-grid analysis showed that the accuracy of both prediction algorithms is overall very good and that their performance is similar from a clinical point of view. However, the AR model seems preferable for hypoglycemia prevention. CG-EGA also suggests that, irrespective of the time-series model, the use of ff = 0.8 yields the most accurate readings in all glucose ranges. For the first time, CG-EGA is proposed as a tool to assess clinically relevant performance of a prediction method separately at hypoglycemia, euglycemia, and hyperglycemia. In particular, we have shown that CG-EGA can be helpful in comparing different prediction algorithms, as well as in optimizing their parameters.
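One way to picture a first-order polynomial predictor with a forgetting factor is a weighted straight-line fit in which a sample k steps in the past gets weight ff**k. The sketch below assumes that weighting scheme and a closed-form weighted least-squares fit; the function name and the synthetic data are hypothetical, and the abstract does not specify the authors' exact estimator.

```python
def predict_ahead(times, values, horizon, ff):
    """Extrapolate a weighted straight-line fit `horizon` minutes past the last sample.

    Assumption: the sample k steps before the latest one is weighted ff**k,
    so a smaller ff forgets old data faster.
    """
    n = len(times)
    weights = [ff ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    # Measure time relative to the most recent sample.
    xs = [t - times[-1] for t in times]
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, values)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * horizon


# Synthetic example: readings every 3 minutes, rising 2 mg/dL per minute.
times = [3 * i for i in range(10)]        # 0, 3, ..., 27 minutes
values = [100 + 2 * t for t in times]     # perfectly linear, hypothetical
print(predict_ahead(times, values, horizon=30, ff=0.8))
```

On perfectly linear data every forgetting factor gives the same extrapolation; the choice of ff only matters when the trend changes, where it trades responsiveness against noise sensitivity.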
Isma’eel, Hussain A.; Sakr, George E.; Almedawar, Mohamad M.; Fathallah, Jihan; Garabedian, Torkom; Eddine, Savo Bou Zein
2015-01-01
Background: High dietary salt intake is directly linked to hypertension and cardiovascular diseases (CVDs). Predicting behaviors regarding salt intake habits is vital to guide interventions and increase their effectiveness. We aim to compare the accuracy of an artificial neural network (ANN) based tool that predicts behavior from key knowledge questions along with clinical data in a high cardiovascular risk cohort relative to the least square models (LSM) method. Methods: We collected knowledge, attitude and behavior data on 115 patients. A behavior score was calculated to classify patients’ behavior towards reducing salt intake. Accuracy comparison between ANN and regression analysis was calculated using the bootstrap technique with 200 iterations. Results: Starting from a 69-item questionnaire, a reduced model was developed and included eight knowledge items found to result in the highest accuracy of 62% (CI 58-67%). The best prediction accuracy in the full and reduced models was attained by ANN at 66% and 62%, respectively, compared to full and reduced LSM at 40% and 34%, respectively. The average relative increase in accuracy in the full and reduced models is 82% and 102%, respectively. Conclusions: Using ANN modeling, we can predict salt reduction behaviors with 66% accuracy. The statistical model has been implemented in an online calculator and can be used in clinics to estimate the patient’s behavior. This will help implementation in future research to further prove clinical utility of this tool to guide therapeutic salt reduction interventions in high cardiovascular risk individuals. PMID:26090333
Development of machine learning models for diagnosis of glaucoma.
Kim, Seong Jae; Cho, Kyong Jin; Oh, Sejong
2017-01-01
The study aimed to develop machine learning models that have strong prediction power and interpretability for diagnosis of glaucoma based on retinal nerve fiber layer (RNFL) thickness and visual field (VF). We collected various candidate features from the examination of retinal nerve fiber layer (RNFL) thickness and visual field (VF). We also developed synthesized features from original features. We then selected the best features proper for classification (diagnosis) through feature evaluation. We used 100 cases of data as a test dataset and 399 cases of data as a training and validation dataset. To develop the glaucoma prediction model, we considered four machine learning algorithms: C5.0, random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN). We repeatedly composed a learning model using the training dataset and evaluated it by using the validation dataset. Finally, we got the best learning model that produces the highest validation accuracy. We analyzed quality of the models using several measures. The random forest model shows best performance and C5.0, SVM, and KNN models show similar accuracy. In the random forest model, the classification accuracy is 0.98, sensitivity is 0.983, specificity is 0.975, and AUC is 0.979. The developed prediction models show high accuracy, sensitivity, specificity, and AUC in classifying among glaucoma and healthy eyes. It will be used for predicting glaucoma against unknown examination records. Clinicians may reference the prediction results and be able to make better decisions. We may combine multiple learning models to increase prediction accuracy. The C5.0 model includes decision rules for prediction. It can be used to explain the reasons for specific predictions.
Steeg, Sarah; Quinlivan, Leah; Nowland, Rebecca; Carroll, Robert; Casey, Deborah; Clements, Caroline; Cooper, Jayne; Davies, Linda; Knipe, Duleeka; Ness, Jennifer; O'Connor, Rory C; Hawton, Keith; Gunnell, David; Kapur, Nav
2018-04-25
Risk scales are used widely in the management of patients presenting to hospital following self-harm. However, there is evidence that their diagnostic accuracy in predicting repeat self-harm is limited. Their predictive accuracy in population settings, and in identifying those at highest risk of suicide is not known. We compared the predictive accuracy of the Manchester Self-Harm Rule (MSHR), ReACT Self-Harm Rule (ReACT), SAD PERSONS Scale (SPS) and Modified SAD PERSONS Scale (MSPS) in an unselected sample of patients attending hospital following self-harm. Data on 4000 episodes of self-harm presenting to Emergency Departments (ED) between 2010 and 2012 were obtained from four established monitoring systems in England. Episodes were assigned a risk category for each scale and followed up for 6 months. The episode-based repeat rate was 28% (1133/4000) and the incidence of suicide was 0.5% (18/3962). The MSHR and ReACT performed with high sensitivity (98% and 94% respectively) and low specificity (15% and 23%). The SPS and the MSPS performed with relatively low sensitivity (24-29% and 9-12% respectively) and high specificity (76-77% and 90%). The area under the curve was 71% for both MSHR and ReACT, 51% for SPS and 49% for MSPS. Differences in predictive accuracy by subgroup were small. The scales were less accurate at predicting suicide than repeat self-harm. The scales failed to accurately predict repeat self-harm and suicide. The findings support existing clinical guidance not to use risk classification scales alone to determine treatment or predict future risk.
Cost-Effective Prediction of Reading Difficulties.
ERIC Educational Resources Information Center
Heath, Steve M.; Hogben, John H.
2004-01-01
This study addressed 2 questions: (a) Can preschoolers who will fail at reading be more efficiently identified by targeting those at highest risk for reading problems? and (b) will auditory temporal processing (ATP) improve the accuracy of identification derived from phonological processing and oral language ability? A sample of 227 preschoolers…
Omran, Dalia; Zayed, Rania A; Nabeel, Mohammed M; Mobarak, Lamiaa; Zakaria, Zeinab; Farid, Azza; Hassany, Mohamed; Saif, Sameh; Mostafa, Muhammad; Saad, Omar Khalid; Yosry, Ayman
2018-05-01
Stage of liver fibrosis is critical for treatment decision and prediction of outcomes in chronic hepatitis C (CHC) patients. We evaluated the diagnostic accuracy of transient elastography (TE)-FibroScan and noninvasive serum markers tests in the assessment of liver fibrosis in CHC patients, in reference to liver biopsy. One-hundred treatment-naive CHC patients were subjected to liver biopsy, TE-FibroScan, and eight serum biomarkers tests; AST/ALT ratio (AAR), AST to platelet ratio index (APRI), age-platelet index (AP index), fibrosis quotient (FibroQ), fibrosis 4 index (FIB-4), cirrhosis discriminant score (CDS), King score, and Goteborg University Cirrhosis Index (GUCI). Receiver operating characteristic curves were constructed to compare the diagnostic accuracy of these noninvasive methods in predicting significant fibrosis in CHC patients. TE-FibroScan predicted significant fibrosis at cutoff value 8.5 kPa with area under the receiver operating characteristic (AUROC) 0.90, sensitivity 83%, specificity 91.5%, positive predictive value (PPV) 91.2%, and negative predictive value (NPV) 84.4%. Serum biomarkers tests showed that AP index and FibroQ had the highest diagnostic accuracy in predicting significant liver fibrosis at cutoff 4.5 and 2.7, AUROC was 0.8 and 0.8 with sensitivity 73.6% and 73.6%, specificity 70.2% and 68.1%, PPV 71.1% and 69.8%, and NPV 72.9% and 72.3%, respectively. Combined AP index and FibroQ had AUROC 0.83 with sensitivity 73.6%, specificity 80.9%, PPV 79.6%, and NPV 75.7% for predicting significant liver fibrosis. APRI, FIB-4, CDS, King score, and GUCI had intermediate accuracy in predicting significant liver fibrosis with AUROC 0.68, 0.78, 0.74, 0.74, and 0.67, respectively, while AAR had low accuracy in predicting significant liver fibrosis. TE-FibroScan is the most accurate noninvasive alternative to liver biopsy. 
AP index and FibroQ, either as individual tests or combined, have good accuracy in predicting significant liver fibrosis, and are better combined for higher specificity.
Automated detection of brain atrophy patterns based on MRI for the prediction of Alzheimer's disease
Plant, Claudia; Teipel, Stefan J.; Oswald, Annahita; Böhm, Christian; Meindl, Thomas; Mourao-Miranda, Janaina; Bokde, Arun W.; Hampel, Harald; Ewers, Michael
2010-01-01
Subjects with mild cognitive impairment (MCI) have an increased risk to develop Alzheimer's disease (AD). Voxel-based MRI studies have demonstrated that widely distributed cortical and subcortical brain areas show atrophic changes in MCI, preceding the onset of AD-type dementia. Here we developed a novel data mining framework in combination with three different classifiers including support vector machine (SVM), Bayes statistics, and voting feature intervals (VFI) to derive a quantitative index of pattern matching for the prediction of the conversion from MCI to AD. MRI was collected in 32 AD patients, 24 MCI subjects and 18 healthy controls (HC). Nine out of 24 MCI subjects converted to AD after an average follow-up interval of 2.5 years. Using feature selection algorithms, brain regions showing the highest accuracy for the discrimination between AD and HC were identified, reaching a classification accuracy of up to 92%. The extracted AD clusters were used as a search region to extract those brain areas that are predictive of conversion to AD within MCI subjects. The most predictive brain areas included the anterior cingulate gyrus and orbitofrontal cortex. The best prediction accuracy, which was cross-validated via train-and-test, was 75% for the prediction of the conversion from MCI to AD. The present results suggest that novel multivariate methods of pattern matching reach a clinically relevant accuracy for the a priori prediction of the progression from MCI to AD. PMID:19961938
Prediction-Oriented Marker Selection (PROMISE): With Application to High-Dimensional Regression.
Kim, Soyeon; Baladandayuthapani, Veerabhadran; Lee, J Jack
2017-06-01
In personalized medicine, biomarkers are used to select therapies with the highest likelihood of success based on an individual patient's biomarker/genomic profile. Two goals are to choose important biomarkers that accurately predict treatment outcomes and to cull unimportant biomarkers to reduce the cost of biological and clinical verifications. These goals are challenging due to the high dimensionality of genomic data. Variable selection methods based on penalized regression (e.g., the lasso and elastic net) have yielded promising results. However, selecting the right amount of penalization is critical to simultaneously achieving these two goals. Standard approaches based on cross-validation (CV) typically provide high prediction accuracy with high true positive rates but at the cost of too many false positives. Alternatively, stability selection (SS) controls the number of false positives, but at the cost of yielding too few true positives. To circumvent these issues, we propose prediction-oriented marker selection (PROMISE), which combines SS with CV to conflate the advantages of both methods. Our application of PROMISE with the lasso and elastic net in data analysis shows that, compared to CV, PROMISE produces sparse solutions, few false positives, and small type I + type II error, and maintains good prediction accuracy, with a marginal decrease in the true positive rates. Compared to SS, PROMISE offers better prediction accuracy and true positive rates. In summary, PROMISE can be applied in many fields to select regularization parameters when the goals are to minimize false positives and maximize prediction accuracy.
The urine dipstick test useful to rule out infections. A meta-analysis of the accuracy
Devillé, Walter LJM; Yzermans, Joris C; van Duijn, Nico P; Bezemer, P Dick; van der Windt, Daniëlle AWM; Bouter, Lex M
2004-01-01
Background: Many studies have evaluated the accuracy of dipstick tests as rapid detectors of bacteriuria and urinary tract infections (UTI). The lack of an adequate explanation for the heterogeneity of the dipstick accuracy stimulates an ongoing debate. The objective of the present meta-analysis was to summarise the available evidence on the diagnostic accuracy of the urine dipstick test, taking into account various pre-defined potential sources of heterogeneity. Methods: Literature from 1990 through 1999 was searched in Medline and Embase, and by reference tracking. Selected publications should be concerned with the diagnosis of bacteriuria or urinary tract infections, investigate the use of dipstick tests for nitrites and/or leukocyte esterase, and present empirical data. A checklist was used to assess methodological quality. Results: 70 publications were included. Accuracy of nitrites was high in pregnant women (Diagnostic Odds Ratio = 165) and elderly people (DOR = 108). Positive predictive values were ≥80% in elderly and in family medicine. Accuracy of leukocyte-esterase was high in studies in urology patients (DOR = 276). Sensitivities were highest in family medicine (86%). Negative predictive values were high in both tests in all patient groups and settings, except for in family medicine. The combination of both test results showed an important increase in sensitivity. Accuracy was high in studies in urology patients (DOR = 52), in children (DOR = 46), and if clinical information was present (DOR = 28). Sensitivity was highest in studies carried out in family medicine (90%). Predictive values of combinations of positive test results were low in all other situations. Conclusions: Overall, this review demonstrates that the urine dipstick test alone seems to be useful in all populations to exclude the presence of infection if the results of both nitrites and leukocyte-esterase are negative. 
Sensitivities of the combination of both tests vary between 68 and 88% in different patient groups, but positive test results have to be confirmed. Although the combination of positive test results is very sensitive in family practice, the usefulness of the dipstick test alone to rule in infection remains doubtful, even with high pre-test probabilities. PMID:15175113
Tural, Cristina; Tor, Jordi; Sanvisens, Arantza; Pérez-Alvarez, Núria; Martínez, Elisenda; Ojanguren, Isabel; García-Samaniego, Javier; Rockstroh, Juergen; Barluenga, Eva; Muga, Robert; Planas, Ramon; Sirera, Guillem; Rey-Joly, Celestino; Clotet, Bonaventura
2009-03-01
We assessed the ability of 3 simple biochemical tests to stage liver fibrosis in patients co-infected with human immunodeficiency virus (HIV) and hepatitis C virus (HCV). We analyzed liver biopsy samples from 324 consecutive HIV/HCV-positive patients (72% men; mean age, 38 y; mean CD4+ T-cell counts, 548 cells/mm³). Scheuer fibrosis scores were as follows: 30% had F0, 22% had F1, 19% had F2, 23% had F3, and 6% had F4. Logistic regression analyses were used to predict the probability of significant (≥F2) or advanced (≥F3) fibrosis, based on numeric scores from the APRI, FORNS, or FIB-4 tests (alone and in combination). Area under the receiver operating characteristic curves were analyzed to assess diagnostic performance. Area under the receiver operating characteristic curves analyses indicated that the 3 tests had similar abilities to identify F2 and F3; the ability of APRI, FORNS, and FIB-4 were as follows: F2 or greater: 0.72, 0.67, and 0.72, respectively; F3 or greater: 0.75, 0.73, and 0.78, respectively. The accuracy of each test in predicting which samples were F3 or greater was significantly higher than for F2 or greater (APRI, FORNS, and FIB-4: ≥F3: 75%, 76%, and 76%, respectively; ≥F2: 66%, 62%, and 68%, respectively). By using the lowest cut-off values for all 3 tests, F3 or greater was ruled out with sensitivity and negative predictive values of 79% to 94% and 87% to 91%, respectively, and 47% to 70% accuracy. Advanced liver fibrosis (≥F3) was identified using the highest cut-off value, with specificity and positive predictive values of 90% to 96% and 63% to 73%, respectively, and 75% to 77% accuracy. Simple biochemical tests accurately predicted liver fibrosis in more than half the HIV/HCV co-infected patients. The absence and presence of liver fibrosis are predicted fairly using the lowest and highest cut-off levels, respectively.
An evaluation of selected (Q)SARs/expert systems for predicting skin sensitisation potential.
Fitzpatrick, J M; Roberts, D W; Patlewicz, G
2018-06-01
Predictive testing to characterise substances for their skin sensitisation potential has historically been based on animal models such as the Local Lymph Node Assay (LLNA) and the Guinea Pig Maximisation Test (GPMT). In recent years, EU regulations have provided a strong incentive to develop non-animal alternatives, such as expert systems software. Here we selected three different types of expert systems: VEGA (statistical), Derek Nexus (knowledge-based) and TIMES-SS (hybrid), and evaluated their performance using two large sets of animal data: one set of 1249 substances from eChemportal and a second set of 515 substances from NICEATM. A model was considered successful at predicting skin sensitisation potential if it had at least the same balanced accuracy as the LLNA and the GPMT had in predicting the other outcomes, which ranged from 79% to 86%. We found that the highest balanced accuracy of any of the expert systems evaluated was 65% when making global predictions. For substances within the domain of TIMES-SS, however, balanced accuracies for the two datasets were found to be 79% and 82%. In those cases where a chemical was within the TIMES-SS domain, the TIMES-SS skin sensitisation hazard prediction had the same confidence as the result from LLNA or GPMT.
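Balanced accuracy, the criterion used in this abstract, is conventionally the mean of sensitivity and specificity, which keeps an imbalanced dataset from flattering a model. The sketch below assumes that standard definition; the function name and the labels are hypothetical, not the study's substances.

```python
def balanced_accuracy(actual, predicted):
    """Mean of sensitivity and specificity for binary labels (1 = sensitiser)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    positives = sum(actual)
    negatives = len(actual) - positives
    sensitivity = tp / positives
    specificity = tn / negatives
    return (sensitivity + specificity) / 2


# Hypothetical labels: 4 sensitisers and 6 non-sensitisers.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
plain = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
print("plain accuracy:   ", plain)
print("balanced accuracy:", balanced_accuracy(actual, predicted))
```

Here the model finds 3 of 4 sensitisers but clears only 3 of 6 non-sensitisers, so the two accuracy figures differ because the classes are unequal in size.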
Genomic selection for crossbred performance accounting for breed-specific effects.
Lopes, Marcos S; Bovenhuis, Henk; Hidalgo, André M; van Arendonk, Johan A M; Knol, Egbert F; Bastiaansen, John W M
2017-06-26
Breed-specific effects are observed when the same allele of a given genetic marker has a different effect depending on its breed origin, which results in different allele substitution effects across breeds. In such a case, single-breed breeding values may not be the most accurate predictors of crossbred performance. Our aim was to estimate the contribution of alleles from each parental breed to the genetic variance of traits that are measured in crossbred offspring, and to compare the prediction accuracies of estimated direct genomic values (DGV) from a traditional genomic selection model (GS) that are trained on purebred or crossbred data, with accuracies of DGV from a model that accounts for breed-specific effects (BS), trained on purebred or crossbred data. The final dataset was composed of 924 Large White, 924 Landrace and 924 two-way cross (F1) genotyped and phenotyped animals. The traits evaluated were litter size (LS) and gestation length (GL) in pigs. The genetic correlation between purebred and crossbred performance was higher than 0.88 for both LS and GL. For both traits, the additive genetic variance was larger for alleles inherited from the Large White breed compared to alleles inherited from the Landrace breed (0.74 and 0.56 for LS, and 0.42 and 0.40 for GL, respectively). The highest prediction accuracies of crossbred performance were obtained when training was done on crossbred data. For LS, prediction accuracies were the same for GS and BS DGV (0.23), while for GL, prediction accuracy for BS DGV was similar to the accuracy of GS DGV (0.53 and 0.52, respectively). In this study, training on crossbred data resulted in higher prediction accuracy than training on purebred data and evidence of breed-specific effects for LS and GL was demonstrated. However, when training was done on crossbred data, both GS and BS models resulted in similar prediction accuracies. 
In future studies, traits with a lower genetic correlation between purebred and crossbred performance should be included to further assess the value of the BS model in genomic predictions.
Research on the Optimum Water Content of Detecting Soil Nitrogen Using Near Infrared Sensor
He, Yong; Nie, Pengcheng; Dong, Tao; Qu, Fangfang; Lin, Lei
2017-01-01
Nitrogen is one of the important indexes to evaluate the physiological and biochemical properties of soil. The level of soil nitrogen content influences the nutrient levels of crops directly. The near infrared sensor can be used to detect the soil nitrogen content rapidly, nondestructively, and conveniently. In order to investigate the effect of the different soil water content on soil nitrogen detection by near infrared sensor, the soil samples were dealt with different drying times and the corresponding water content was measured. The drying time was set from 1 h to 8 h, and every 1 h 90 samples (each nitrogen concentration of 10 samples) were detected. The spectral information of samples was obtained by near infrared sensor, meanwhile, the soil water content was calculated every 1 h. The prediction model of soil nitrogen content was established by two linear modeling methods, including partial least squares (PLS) and uninformative variable elimination (UVE). The experiment shows that the soil has the highest detection accuracy when the drying time is 3 h and the corresponding soil water content is 1.03%. The correlation coefficients of the calibration set are 0.9721 and 0.9656, and the correlation coefficients of the prediction set are 0.9712 and 0.9682, respectively. The prediction accuracy of both models is high, while the prediction effect of PLS model is better and more stable. The results indicate that the soil water content at 1.03% has the minimum influence on the detection of soil nitrogen content using a near infrared sensor while the detection accuracy is the highest and the time cost is the lowest, which is of great significance to develop a portable apparatus detecting nitrogen in the field accurately and rapidly. PMID:28880202
Research on the Optimum Water Content of Detecting Soil Nitrogen Using Near Infrared Sensor.
He, Yong; Xiao, Shupei; Nie, Pengcheng; Dong, Tao; Qu, Fangfang; Lin, Lei
2017-09-07
Nitrogen is one of the important indexes to evaluate the physiological and biochemical properties of soil. The level of soil nitrogen content influences the nutrient levels of crops directly. The near infrared sensor can be used to detect the soil nitrogen content rapidly, nondestructively, and conveniently. In order to investigate the effect of the different soil water content on soil nitrogen detection by near infrared sensor, the soil samples were dealt with different drying times and the corresponding water content was measured. The drying time was set from 1 h to 8 h, and every 1 h 90 samples (each nitrogen concentration of 10 samples) were detected. The spectral information of samples was obtained by near infrared sensor, meanwhile, the soil water content was calculated every 1 h. The prediction model of soil nitrogen content was established by two linear modeling methods, including partial least squares (PLS) and uninformative variable elimination (UVE). The experiment shows that the soil has the highest detection accuracy when the drying time is 3 h and the corresponding soil water content is 1.03%. The correlation coefficients of the calibration set are 0.9721 and 0.9656, and the correlation coefficients of the prediction set are 0.9712 and 0.9682, respectively. The prediction accuracy of both models is high, while the prediction effect of PLS model is better and more stable. The results indicate that the soil water content at 1.03% has the minimum influence on the detection of soil nitrogen content using a near infrared sensor while the detection accuracy is the highest and the time cost is the lowest, which is of great significance to develop a portable apparatus detecting nitrogen in the field accurately and rapidly.
KANAZAWA, Tomomi; SEKI, Motohide; ISHIYAMA, Keiki; ARASEKI, Masao; IZAIKE, Yoshiaki; TAKAHASHI, Toru
2017-01-01
This study assessed the effects of gonadotropin-releasing hormone (GnRH) treatment on Day 5 (Day 0 = estrus) on luteal blood flow and accuracy of pregnancy prediction in recipient cows. On Day 5, 120 lactating Holstein cows were randomly assigned to a control group (n = 63) or GnRH group treated with 100 μg of GnRH agonist (n = 57). On Days 3, 5, 7, and 14, each cow underwent ultrasound examination to measure the blood flow area (BFA) and time-averaged maximum velocity (TAMV) at the spiral arteries at the base of the corpus luteum using color Doppler ultrasonography. Cows with a corpus luteum diameter ≥ 20 mm (n = 120) received embryo transfers on Day 7. The BFA values in the GnRH group were significantly higher than those in the control group on Days 7 and 14. TAMV did not differ between these groups. According to receiver operating characteristic analyses to predict pregnancy, a BFA cutoff of 0.52 cm² yielded the highest sensitivity (83.3%) and specificity (90.5%) on Day 7, and BFA and TAMV values of 0.94 cm² and 44.93 cm/s, respectively, yielded the highest sensitivity (97.1%) and specificity (100%) on Day 14 in the GnRH group. The areas under the curve for the paired BFA and TAMV in the GnRH group were 0.058 higher than those in the control group (0.996 and 0.938, respectively; P < 0.05). In conclusion, GnRH treatment on Day 5 increased the luteal BFA in recipient cows on Days 7 and 14, and improved the accuracy of pregnancy prediction on Day 14. PMID:28552886
Naderi, S; Yin, T; König, S
2016-09-01
A simulation study was conducted to investigate the performance of random forest (RF) and genomic BLUP (GBLUP) for genomic predictions of binary disease traits based on cow calibration groups. Training and testing sets were modified in different scenarios according to disease incidence, the quantitative-genetic background of the trait (h² = 0.30 and h² = 0.10), and the genomic architecture [725 quantitative trait loci (QTL) and 290 QTL, populations with high and low levels of linkage disequilibrium (LD)]. For all scenarios, 10,005 SNP (depicting a low-density 10K SNP chip) and 50,025 SNP (depicting a 50K SNP chip) were evenly spaced along 29 chromosomes. Training and testing sets included 20,000 cows (4,000 sick, 16,000 healthy, disease incidence 20%) from the last 2 generations. Initially, 4,000 sick cows were assigned to the testing set, and the remaining 16,000 healthy cows represented the training set. In the ongoing allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick animals from the testing set to the training set, and vice versa. The size of the training and testing sets was kept constant. Evaluation criteria for both GBLUP and RF were the correlations between genomic breeding values and true breeding values (prediction accuracy), and the area under the receiver operating characteristic curve (AUROC). Prediction accuracy and AUROC increased for both methods and all scenarios as increasing percentages of sick cows were allocated to the training set. Highest prediction accuracies were observed for disease incidences in training sets that reflected the population disease incidence of 0.20. For this allocation scheme, the largest prediction accuracies of 0.53 for RF and of 0.51 for GBLUP, and the largest AUROC of 0.66 for RF and of 0.64 for GBLUP, were achieved using 50,025 SNP, a heritability of 0.30, and 725 QTL. 
Heritability decreases from 0.30 to 0.10 and QTL reduction from 725 to 290 were associated with decreasing prediction accuracy and decreasing AUROC for all scenarios. This decrease was more pronounced for RF. Also, the increase of LD had stronger effect on RF results than on GBLUP results. The highest prediction accuracy from the low LD scenario was 0.30 from RF and 0.36 from GBLUP, and increased to 0.39 for both methods in the high LD population. Random forest successfully identified important SNP in close map distance to QTL explaining a high proportion of the phenotypic trait variations.
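The two evaluation criteria in this abstract, a correlation between predicted and true breeding values and the AUROC, can be sketched in a few lines of plain Python. The pairwise AUROC below is the standard rank interpretation (probability that a sick animal scores higher than a healthy one, ties counted as one half); the function names and all numbers are hypothetical and are not the simulation's output.

```python
import math


def pearson(xs, ys):
    """Pearson correlation, used here as 'prediction accuracy'."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def auroc(pos_scores, neg_scores):
    """Share of (sick, healthy) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted disease scores for sick and healthy cows.
sick = [0.9, 0.6, 0.4]
healthy = [0.7, 0.3, 0.2, 0.1]
print("AUROC:", round(auroc(sick, healthy), 3))
print("accuracy:", round(pearson([1.0, 2.0, 3.0, 4.0], [1.2, 1.9, 3.4, 3.8]), 3))
```

The pairwise loop is quadratic in the number of animals, which is fine for a sketch; a rank-based formula would be the usual choice for testing sets of the size described.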
NASA Astrophysics Data System (ADS)
Nurhidayati, E.; Buchori, I.; Mussadun; Fariz, T. R.
2017-07-01
Pontianak, a water-based waterfront city, has potential in water resources, socio-economic and cultural activity, tourism, and riverine settlements. The settlement areas in the eastern district of the Pontianak waterfront city are located in the triangle formed by the Kapuas and Landak rivers. This study uses quantitative GIS methods that integrate binary logistic regression and Cellular Automata-Markov models. The data used include Quickbird 2003 and Ikonos 2008 satellite imagery and elevation contours at a 1-meter interval. The study aims to identify settlement land use changes in 2003-2014 and to predict the settlement areas in 2020. The prediction of changes in settlement areas achieved an overall accuracy of 79.74% and a highest kappa index of 0.55. The prediction results show a settlement area of 481.98 ha in 2020 and an increase in settlement area of 6.80 ha/year in 2014-2020. The development of settlement areas in 2020 shows the highest land expansion in Parit Mayor Village. The regression coefficient of the flooding variable was 0, so flooding did not influence the development of settlement areas in the eastern district of Pontianak, because the stilt-house (rumah panggung) settlements are well adapted to the height of tidal floods.
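Overall accuracy and the kappa index reported for land-use predictions are both derived from a confusion matrix of predicted versus observed classes. The sketch below assumes the standard Cohen's kappa definition; the function name and the 2x2 matrix are hypothetical, not the study's map comparison.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] = number of cells observed as class i and predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = observed (settlement, other), cols = predicted.
acc, kappa = accuracy_and_kappa([[50, 10], [5, 35]])
print(f"overall accuracy = {acc:.2f}, kappa = {kappa:.3f}")
```

Kappa is lower than overall accuracy because it discounts the agreement that would occur by chance, which is why a map can show roughly 80% accuracy alongside a moderate kappa.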
NASA Astrophysics Data System (ADS)
Janowiecki, Steven; Cortese, Luca; Catinella, Barbara; Goodwin, Adelle J.
2018-05-01
We use galaxies from the Herschel Reference Survey to evaluate commonly used indirect predictors of cold gas masses. We calibrate predictions for cold neutral atomic and molecular gas using infrared dust emission and gas depletion time methods that are self-consistent and have ~20 per cent accuracy (with the highest accuracy in the prediction of total cold gas mass). However, modest systematic residual dependences are found in all calibrations that depend on the partition between molecular and atomic gas, and can over/underpredict gas masses by up to 0.3 dex. As expected, dust-based estimates are best at predicting the total gas mass while depletion time-based estimates are only able to predict the (star-forming) molecular gas mass. Additionally, we advise caution when applying these predictions to high-z galaxies, as significant (0.5 dex or more) errors can arise when incorrect assumptions are made about the dominant gas phase. Any scaling relations derived using predicted gas masses may be more closely related to the calibrations used than to the actual galaxies observed.
Lado, Bettina; Matus, Ivan; Rodríguez, Alejandra; Inostroza, Luis; Poland, Jesse; Belzile, François; del Pozo, Alejandro; Quincke, Martín; Castro, Marina; von Zitzewitz, Jarislav
2013-12-09
In crop breeding, the interest of predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed-model using moving-means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models.
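The abstract does not state the exact form of the Gaussian kernel model. As a sketch only, assuming the common form K_ij = exp(-h * d_ij^2) with d_ij^2 the squared Euclidean distance between marker vectors averaged over markers, the kernel matrix could be built as follows. The function name, the bandwidth parameter h and the genotype codes are hypothetical.

```python
import math


def gaussian_kernel(markers, bandwidth=1.0):
    """Return K with K[i][j] = exp(-bandwidth * mean squared marker difference)."""
    n = len(markers)
    n_markers = len(markers[0])
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(markers[i], markers[j])) / n_markers
            kernel[i][j] = math.exp(-bandwidth * sq_dist)
    return kernel


# Hypothetical SNP genotypes coded 0/1/2 for three lines and four markers.
genotypes = [[0, 1, 2, 0],
             [0, 1, 2, 0],
             [2, 1, 0, 2]]
K = gaussian_kernel(genotypes, bandwidth=1.0)
```

Lines with identical genotypes get a kernel value of 1, and the value decays toward 0 as genotypes diverge, so the kernel acts as a similarity matrix in the prediction model.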
Frouzan, Arash; Masoumi, Kambiz; Delirroyfard, Ali; Mazdaie, Behnaz; Bagherzadegan, Elnaz
2017-08-01
Long bone fractures are common injuries caused by trauma. Some studies have demonstrated that ultrasound has a high sensitivity and specificity in the diagnosis of upper and lower extremity long bone fractures. The aim of this study was to determine the accuracy of ultrasound compared with plain radiography in the diagnosis of upper and lower extremity long bone fractures in trauma patients. This cross-sectional study assessed 100 patients admitted to the emergency department of Imam Khomeini Hospital, Ahvaz, Iran with trauma to the upper and lower extremities, from September 2014 through October 2015. In all patients, ultrasound was performed first, followed by standard plain radiography of the upper and lower limbs. Data were analyzed with SPSS version 21 to determine the specificity and sensitivity. The mean ages of patients with upper and lower limb trauma were 31.43±12.32 years and 29.63±5.89 years, respectively. Radius fracture was the most frequent compared to other fractures (27%). The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of ultrasound compared with plain radiography in the diagnosis of upper extremity long bone fractures were 95.3%, 87.7%, 87.2% and 96.2%, respectively, and the highest accuracy was observed in left arm fractures (100%). Tibia and fibula fractures were the most frequent types compared to other fractures (89.2%). The sensitivity, specificity, PPV and NPV of ultrasound compared with plain radiography in the diagnosis of lower extremity long bone fractures were 98.6%, 83%, 65.4% and 87.1%, respectively, and the highest accuracy was observed in men, lower ages and femoral fractures. The results of this study showed that ultrasound compared with plain radiography has a high accuracy in the diagnosis of upper and lower extremity long bone fractures.
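Sensitivity, specificity, PPV, NPV and accuracy all follow from the 2x2 table of index test (here ultrasound) against reference standard (here plain radiography). A minimal sketch with hypothetical counts, not the study's data:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures from cell counts."""
    return {
        "sensitivity": tp / (tp + fn),   # positives among reference-positive cases
        "specificity": tn / (tn + fp),   # negatives among reference-negative cases
        "ppv": tp / (tp + fp),           # reference-positive among test positives
        "npv": tn / (tn + fn),           # reference-negative among test negatives
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for 100 patients.
metrics = diagnostic_metrics(tp=40, fp=6, fn=2, tn=52)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common fractures are in the sample, so they do not transfer directly to populations with a different prevalence.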
A Comparative Study of Data Mining Techniques on Football Match Prediction
NASA Astrophysics Data System (ADS)
Rosli, Che Mohamad Firdaus Che Mohd; Zainuri Saringat, Mohd; Razali, Nazim; Mustapha, Aida
2018-05-01
Data prediction has become a trend in today's businesses and organizations. This paper sets out to predict match outcomes for association football from the perspective of football club managers and coaches. It explores different data mining techniques for predicting match outcomes, where the target class is win, draw or lose. The main objective of this research is to find the most accurate data mining technique that fits the nature of football data. The techniques tested are Decision Trees, Neural Networks, Bayesian Network, and k-Nearest Neighbors. The results from the comparative experiments showed that Decision Trees produced the highest average prediction accuracy in the domain of football match prediction, at 99.56%.
Kuo, Pao-Jen; Wu, Shao-Chun; Chien, Peng-Chen; Chang, Shu-Shya; Rau, Cheng-Shyuan; Tai, Hsueh-Ling; Peng, Shu-Hui; Lin, Yi-Chun; Chen, Yi-Chun; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua
2018-03-02
The aim of this study was to develop an effective surgical site infection (SSI) prediction model in patients receiving free-flap reconstruction after surgery for head and neck cancer using an artificial neural network (ANN), and to compare its predictive power with that of conventional logistic regression (LR). There were 1,836 patients with 1,854 free-flap reconstructions and 438 postoperative SSIs in the dataset for analysis. They were randomly assigned in a ratio of 7:3 into a training set and a test set. Based on comprehensive characteristics of patients and diseases in the absence or presence of operative data, prediction of SSI was performed at two time points (pre-operatively and post-operatively) with a feed-forward ANN and the LR models. In addition to the calculated accuracy, sensitivity, and specificity, the predictive performance of ANN and LR was assessed based on area under the curve (AUC) measures of receiver operator characteristic curves and the Brier score. ANN had a significantly higher AUC for post-operative prediction (0.892) and for pre-operative prediction (0.808) than LR (both p<0.0001). In addition, ANN had a significantly higher AUC for post-operative prediction than for pre-operative prediction (p<0.0001). With the highest AUC and the lowest Brier score (0.090), the post-operative prediction by ANN had the highest overall predictive performance. The post-operative prediction by ANN had the highest overall performance in predicting SSI after free-flap reconstruction in patients receiving surgery for head and neck cancer.
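The two summary measures used above can be sketched from predicted probabilities and observed outcomes. The AUC is computed here through its rank interpretation (the probability that a randomly chosen case with SSI receives a higher predicted risk than one without), and the Brier score as the mean squared difference between predicted probability and outcome. The function names and data are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared error of predicted probabilities; lower is better."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs, outcomes):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and observed outcomes (1 = infection).
probs = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
model_auc = auc(probs, outcomes)
model_brier = brier_score(probs, outcomes)
print(f"AUC = {model_auc:.3f}, Brier score = {model_brier:.3f}")
```

AUC measures only how well cases are ranked, while the Brier score also penalizes poorly calibrated probabilities, which is one reason to report both.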
Preciat Gonzalez, German A.; El Assal, Lemmer R. P.; Noronha, Alberto; ...
2017-06-14
The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Preciat Gonzalez, German A.; El Assal, Lemmer R. P.; Noronha, Alberto
The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.
Preciat Gonzalez, German A; El Assal, Lemmer R P; Noronha, Alberto; Thiele, Ines; Haraldsdóttir, Hulda S; Fleming, Ronan M T
2017-06-14
The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.
Assessment of the reliability of protein-protein interactions and protein function prediction.
Deng, Minghua; Sun, Fengzhu; Chen, Ting
2003-01-01
As more and more high-throughput protein-protein interaction data are collected, the task of estimating the reliability of different data sets becomes increasingly important. In this paper, we present our study of two groups of protein-protein interaction data, the physical interaction data and the protein complex data, and estimate the reliability of these data sets using three different measurements: (1) the distribution of gene expression correlation coefficients, (2) the reliability based on gene expression correlation coefficients, and (3) the accuracy of protein function predictions. We develop a maximum likelihood method to estimate the reliability of protein interaction data sets according to the distribution of correlation coefficients of gene expression profiles of putative interacting protein pairs. The results of the three measurements are consistent with each other. The MIPS protein complex data have the highest mean gene expression correlation coefficients (0.256) and the highest accuracy in predicting protein functions (70% sensitivity and specificity), while Ito's Yeast two-hybrid data have the lowest mean (0.041) and the lowest accuracy (15% sensitivity and specificity). Uetz's data are more reliable than Ito's data in all three measurements, and the TAP protein complex data are more reliable than the HMS-PCI data in all three measurements as well. The complex data sets generally perform better in function predictions than do the physical interaction data sets. Proteins in complexes are shown to be more highly correlated in gene expression. The results confirm that the components of a protein complex can be assigned to functions that the complex carries out within a cell. There are three interaction data sets different from the above two groups: the genetic interaction data, the in-silico data and the syn-express data. 
Their capability of predicting protein functions generally falls between that of the Y2H data and that of the MIPS protein complex data. The supplementary information is available at the following Web site: http://www-hto.usc.edu/-msms/AssessInteraction/.
Accuracy of Definitions for Linkage to Care in Persons Living with HIV
KELLER, Sara C.; YEHIA, Baligh R.; EBERHART, Michael G.; BRADY, Kathleen A.
2013-01-01
Objective: To compare the accuracy of linkage to care metrics for patients diagnosed with HIV using retention in care and virologic suppression as the gold standards of effective linkage. Design: A retrospective cohort study of patients aged 18 and over with newly-diagnosed HIV infection in the City of Philadelphia, 2007 to 2008. Methods: Times from diagnosis to clinic visits or laboratory testing were used as linkage measures. Outcome variables included being retained in care and achieving virologic suppression, 366-730 days after diagnosis. Positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC) for each linkage measure and retention and virologic suppression outcomes are described. Results: Of the 1781 patients in the study, 503 (28.2%) were retained in care in the Ryan White system and 418 (23.5%) achieved virologic suppression 366-730 days after diagnosis. The linkage measure with the highest PPV for retention was having two clinic visits within 365 days of diagnosis, separated by 90 days (74.2%). Having a clinic visit between 21 and 365 days after diagnosis had both the highest NPV for retention (94.5%) and the highest adjusted AUC for retention (0.872). Having two tests within 365 days of diagnosis, separated by 90 days, had the highest adjusted AUC for virologic suppression (0.780). Conclusions: Linkage measures associated with clinic visits had higher PPV and NPV for retention, while linkage measures associated with laboratory testing had higher PPV and NPV for virologic suppression. Linkage measures should be chosen based on the outcome of interest. PMID:23614992
Mahmood, Hafiz Sultan; Hoogmoed, Willem B.; van Henten, Eldert J.
2013-01-01
Fine-scale spatial information on soil properties is needed to successfully implement precision agriculture. Proximal gamma-ray spectroscopy has recently emerged as a promising tool to collect fine-scale soil information. The objective of this study was to evaluate a proximal gamma-ray spectrometer to predict several soil properties using energy-windows and full-spectrum analysis methods in two differently managed sandy loam fields: conventional and organic. In the conventional field, both methods predicted clay, pH and total nitrogen with a good accuracy (R2 ≥ 0.56) in the top 0–15 cm soil depth, whereas in the organic field, only clay content was predicted with such accuracy. The highest prediction accuracy was found for total nitrogen (R2 = 0.75) in the conventional field in the energy-windows method. Predictions were better in the top 0–15 cm soil depths than in the 15–30 cm soil depths for individual and combined fields. This implies that gamma-ray spectroscopy can generally benefit soil characterisation for annual crops where the condition of the seedbed is important. Small differences in soil structure (conventional vs. organic) cannot be determined. As for the methodology, we conclude that the energy-windows method can establish relations between radionuclide data and soil properties as accurate as the full-spectrum analysis method. PMID:24287541
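The abstract reports prediction accuracy as R2 without saying how it was computed. One common definition, assumed here, is the coefficient of determination between measured and predicted values. The function name and soil values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical measured vs. predicted clay content (%).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
r2 = r_squared(measured, predicted)
print(f"R2 = {r2:.2f}")
```

Some studies instead report the squared Pearson correlation, which coincides with this value only for a least-squares fit evaluated on its own calibration data.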
Da, Yang; Wang, Chunkao; Wang, Shengwen; Hu, Guo
2014-01-01
We established a genomic model of quantitative trait with genomic additive and dominance relationships that parallels the traditional quantitative genetics model, which partitions a genotypic value as breeding value plus dominance deviation and calculates additive and dominance relationships using pedigree information. Based on this genomic model, two sets of computationally complementary but mathematically identical mixed model methods were developed for genomic best linear unbiased prediction (GBLUP) and genomic restricted maximum likelihood estimation (GREML) of additive and dominance effects using SNP markers. These two sets are referred to as the CE and QM sets, where the CE set was designed for large numbers of markers and the QM set was designed for large numbers of individuals. GBLUP and associated accuracy formulations for individuals in training and validation data sets were derived for breeding values, dominance deviations and genotypic values. Simulation study showed that GREML and GBLUP generally were able to capture small additive and dominance effects that each accounted for 0.00005–0.0003 of the phenotypic variance and GREML was able to differentiate true additive and dominance heritability levels. GBLUP of the total genetic value as the summation of additive and dominance effects had higher prediction accuracy than either additive or dominance GBLUP, causal variants had the highest accuracy of GREML and GBLUP, and predicted accuracies were in agreement with observed accuracies. Genomic additive and dominance relationship matrices using SNP markers were consistent with theoretical expectations. The GREML and GBLUP methods can be an effective tool for assessing the type and magnitude of genetic effects affecting a phenotype and for predicting the total genetic value at the whole genome level. PMID:24498162
Yingyongyudha, Anyamanee; Saengsirisuwan, Vitoon; Panichaporn, Wanvisa; Boonsinsukh, Rumpa
2016-01-01
Balance deficits are a significant predictor of falls in older adults. The Balance Evaluation Systems Test (BESTest) and the Mini-Balance Evaluation Systems Test (Mini-BESTest) are tools that may predict the likelihood of a fall, but their capabilities and accuracies have not been adequately addressed. Therefore, this study aimed to examine the capabilities of the BESTest and Mini-BESTest for identifying older adults with a history of falls and to compare their identification accuracy with that of the Berg Balance Scale (BBS) and the Timed Up and Go Test (TUG). Two hundred healthy older adults with a mean age of 70 years were classified into groups with and without a history of falls on the basis of their 12-month fall history. Their balance abilities were assessed using the BESTest, Mini-BESTest, BBS, and TUG. An analysis of the resulting receiver operating characteristic curves was performed to calculate the area under the curve (AUC), sensitivity, specificity, cutoff score, and posttest accuracy of each. The Mini-BESTest showed the highest AUC (0.84) compared with the BESTest (0.74), BBS (0.69), and TUG (0.35), suggesting that the Mini-BESTest had the highest accuracy in identifying older adults with a history of falls. At the cutoff score of 16 (out of 28), the Mini-BESTest demonstrated a posttest accuracy of 85% with a sensitivity of 85% and specificity of 75%. The Mini-BESTest had the highest posttest accuracy, with the others having results of 76% (BESTest), 60% (BBS), and 65% (TUG). The Mini-BESTest is the most accurate tool for identifying older adults with a history of falls compared with the BESTest, BBS, and TUG.
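The abstract does not say how the cutoff score was selected from the ROC analysis. One widely used rule, assumed here purely for illustration, is to pick the cutoff that maximizes Youden's J (sensitivity + specificity - 1). The sketch treats a score at or below the cutoff as a positive screen, since lower balance scores indicate poorer balance. The function name and scores are hypothetical.

```python
def best_cutoff(scores, is_faller):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J."""
    best = None
    for cutoff in sorted(set(scores)):
        pairs = list(zip(scores, is_faller))
        tp = sum(1 for s, f in pairs if f and s <= cutoff)
        fn = sum(1 for s, f in pairs if f and s > cutoff)
        tn = sum(1 for s, f in pairs if not f and s > cutoff)
        fp = sum(1 for s, f in pairs if not f and s <= cutoff)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        youden_j = sensitivity + specificity - 1
        if best is None or youden_j > best[0]:
            best = (youden_j, cutoff, sensitivity, specificity)
    return best[1:]


# Hypothetical balance scores: five fallers followed by five non-fallers.
scores = [10, 12, 14, 16, 17, 15, 20, 22, 24, 26]
is_faller = [True] * 5 + [False] * 5
cutoff, sensitivity, specificity = best_cutoff(scores, is_faller)
print(cutoff, sensitivity, specificity)
```

Other selection rules, such as fixing a minimum sensitivity for screening, would generally give a different cutoff from the same ROC curve.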
Chung, Hyun Sik; Lee, Yu Jung; Jo, Yun Sung
2017-02-21
BACKGROUND Acute liver failure (ALF) is known to be a rapidly progressive and fatal disease. Various models that could help to estimate the post-transplant outcome for ALF have been developed; however, none of them has been proven to be the definitive predictive model. We propose a new predictive model, and investigated which model has the highest predictive accuracy for the short-term outcome in patients who underwent living donor liver transplantation (LDLT) due to ALF. MATERIAL AND METHODS Data from a total of 88 patients were collected retrospectively. King's College Hospital criteria (KCH), Child-Turcotte-Pugh (CTP) classification, and model for end-stage liver disease (MELD) score were calculated. Univariate analysis was performed, and then multivariate statistical adjustment for preoperative variables of ALF prognosis was performed. A new predictive model was developed, called the MELD conjugated serum phosphorus model (MELD-p). The individual diagnostic accuracy and cut-off value of models in predicting 3-month post-transplant mortality were evaluated using the area under the receiver operating characteristic curve (AUC). The difference in AUC between MELD-p and the other models was analyzed. The diagnostic improvement in MELD-p was assessed using the net reclassification improvement (NRI) and integrated discrimination improvement (IDI). RESULTS The MELD-p and MELD scores had high predictive accuracy (AUC >0.9). KCH and serum phosphorus had an acceptable predictive ability (AUC >0.7). The CTP classification failed to show discriminative accuracy in predicting 3-month post-transplant mortality. The difference in AUC between MELD-p and the other models was statistically significant for CTP and KCH. The cut-off value of MELD-p was 3.98 for predicting 3-month post-transplant mortality. The NRI was 9.9% and the IDI was 2.9%. 
CONCLUSIONS MELD-p score can predict 3-month post-transplant mortality better than other scoring systems after LDLT due to ALF. The recommended cut-off value of MELD-p is 3.98.
Samad, Manar D; Ulloa, Alvaro; Wehner, Gregory J; Jing, Linyuan; Hartzel, Dustin; Good, Christopher W; Williams, Brent A; Haggerty, Christopher M; Fornwalt, Brandon K
2018-06-09
The goal of this study was to use machine learning to more accurately predict survival after echocardiography. Predicting patient outcomes (e.g., survival) following echocardiography is primarily based on ejection fraction (EF) and comorbidities. However, there may be significant predictive information within additional echocardiography-derived measurements combined with clinical electronic health record data. Mortality was studied in 171,510 unselected patients who underwent 331,317 echocardiograms in a large regional health system. We investigated the predictive performance of nonlinear machine learning models compared with that of linear logistic regression models using 3 different inputs: 1) clinical variables, including 90 cardiovascular-relevant International Classification of Diseases, Tenth Revision, codes, and age, sex, height, weight, heart rate, blood pressures, low-density lipoprotein, high-density lipoprotein, and smoking; 2) clinical variables plus physician-reported EF; and 3) clinical variables and EF, plus 57 additional echocardiographic measurements. Missing data were imputed using the multivariate imputation by chained equations (MICE) algorithm. We compared the models with each other and with baseline clinical scoring systems by using the mean area under the curve (AUC) over 10 cross-validation folds and across 10 survival durations (6 to 60 months). Machine learning models achieved significantly higher prediction accuracy (all AUC >0.82) than common clinical risk scores (AUC = 0.61 to 0.79), with the nonlinear random forest models outperforming logistic regression (p < 0.01). The random forest model including all echocardiographic measurements yielded the highest prediction accuracy (p < 0.01 across all models and survival durations). Only 10 variables were needed to achieve 96% of the maximum prediction accuracy, with 6 of these variables being derived from echocardiography. 
Tricuspid regurgitation velocity was more predictive of survival than LVEF. In a subset of studies with complete data for the top 10 variables, multivariate imputation by chained equations yielded slightly reduced predictive accuracies (difference in AUC of 0.003) compared with the original data. Machine learning can fully utilize large combinations of disparate input variables to predict survival after echocardiography with superior accuracy. Copyright © 2018 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
Lado, Bettina; Matus, Ivan; Rodríguez, Alejandra; Inostroza, Luis; Poland, Jesse; Belzile, François; del Pozo, Alejandro; Quincke, Martín; Castro, Marina; von Zitzewitz, Jarislav
2013-01-01
In crop breeding, the interest of predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed-model using moving-means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models. PMID:24082033
Beauchet, O; Noublanche, F; Simon, R; Sekhon, H; Chabot, J; Levinoff, E J; Kabeshova, A; Launay, C P
2018-01-01
Identification of the risk of falls is important among older inpatients. This study aims to examine performance criteria (i.e., sensitivity, specificity, positive predictive value, negative predictive value and accuracy) for fall prediction resulting from a nurse assessment and an artificial neural networks (ANNs) analysis in older inpatients hospitalized in acute care medical wards. A total of 848 older inpatients (mean age, 83.0±7.2 years; 41.8% female) admitted to acute care medical wards in Angers University hospital (France) were included in this study using an observational prospective cohort design. Within 24 hours after admission of older inpatients, nurses performed a bedside clinical assessment. Participants were separated into non-fallers and fallers (i.e., ≥1 fall during the hospital stay). The analysis was conducted using three feed-forward ANNs (multilayer perceptron [MLP], averaged neural network, and neuroevolution of augmenting topologies [NEAT]). Seventy-three (8.6%) participants fell at least once during their hospital stay. ANNs showed a high specificity, regardless of which ANN was used, and the highest value reported was with MLP (99.8%). In contrast, sensitivity was lower, with values ranging from 98.4% to 14.8%. MLP had the highest accuracy (99.7). Performance criteria for fall prediction resulting from a bedside nursing assessment and an ANNs analysis were associated with a high specificity but a low sensitivity, suggesting that this combined approach should be used more as a diagnostic test than as a screening test when considering older inpatients in acute care medical wards.
Vallejo, Roger L; Leeds, Timothy D; Gao, Guangtu; Parsons, James E; Martin, Kyle E; Evenhuis, Jason P; Fragomeni, Breno O; Wiens, Gregory D; Palti, Yniv
2017-02-01
Previously, we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative that enables exploitation of within-family genetic variation. We compared three GS models [single-step genomic best linear unbiased prediction (ssGBLUP), weighted ssGBLUP (wssGBLUP), and BayesB] to predict genomic-enabled breeding values (GEBV) for BCWD resistance in a commercial rainbow trout population, and compared the accuracy of GEBV to traditional estimates of breeding values (EBV) from a pedigree-based BLUP (P-BLUP) model. We also assessed the impact of sampling design on the accuracy of GEBV predictions. For these comparisons, we used BCWD survival phenotypes recorded on 7893 fish from 102 families, of which 1473 fish from 50 families had genotypes [57 K single nucleotide polymorphism (SNP) array]. Naïve siblings of the training fish (n = 930 testing fish) were genotyped to predict their GEBV and mated to produce 138 progeny testing families. In the following generation, 9968 progeny were phenotyped to empirically assess the accuracy of GEBV predictions made on their non-phenotyped parents. The accuracy of GEBV from all tested GS models was substantially higher than that of the P-BLUP model EBV. The highest increase in accuracy relative to the P-BLUP model was achieved with BayesB (97.2 to 108.8%), followed by wssGBLUP at iteration 2 (94.4 to 97.1%) and 3 (88.9 to 91.2%) and ssGBLUP (83.3 to 85.3%). Reducing the training sample size to n = ~1000 had no negative impact on the accuracy (0.67 to 0.72), but with n = ~500 the accuracy dropped to 0.53 to 0.61 if the training and testing fish were full-sibs, and even substantially lower, to 0.22 to 0.25, when they were not full-sibs.
Using progeny performance data, we showed that the accuracy of genomic predictions is substantially higher than estimates obtained from the traditional pedigree-based BLUP model for BCWD resistance. Overall, we found that using a much smaller training sample size compared to similar studies in livestock, GS can substantially improve the selection accuracy and genetic gains for this trait in a commercial rainbow trout breeding population.
Can machine-learning improve cardiovascular risk prediction using routine clinical data?
Kai, Joe; Garibaldi, Jonathan M.; Qureshi, Nadeem
2017-01-01
Background Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction. Methods Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the ‘receiver operating curve’ (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins). Findings 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723–0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739–0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755–0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755–0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759–0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm. Conclusions Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others. PMID:28376093
Can machine-learning improve cardiovascular risk prediction using routine clinical data?
Weng, Stephen F; Reps, Jenna; Kai, Joe; Garibaldi, Jonathan M; Qureshi, Nadeem
2017-01-01
Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction. Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the 'receiver operating curve' (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins). 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723-0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739-0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755-0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755-0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759-0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm. Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others.
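The sensitivity, specificity, PPV and NPV quoted for the neural network can be recomputed from the case and non-case counts given in the abstract. The sketch below is only a worked check of that arithmetic; the function name `screening_metrics` is illustrative and the false-negative and false-positive counts are derived by subtraction from the quoted totals.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return the four standard 2x2-table metrics as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # share of cases that were flagged
        "specificity": tn / (tn + fp),  # share of non-cases left unflagged
        "ppv": tp / (tp + fp),          # share of flagged patients who were cases
        "npv": tn / (tn + fn),          # share of unflagged patients who were non-cases
    }


# Counts quoted in the abstract: 4,998 of 7,404 cases and
# 53,458 of 75,585 non-cases were predicted correctly.
metrics = screening_metrics(
    tp=4998,
    fn=7404 - 4998,
    tn=53458,
    fp=75585 - 53458,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The low PPV next to the high NPV follows from the low event rate: even with about 70% specificity, the flagged group contains many more non-cases than cases.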
NASA Technical Reports Server (NTRS)
Beck, L. R.; Rodriguez, M. H.; Dister, S. W.; Rodriguez, A. D.; Washino, R. K.; Roberts, D. R.; Spanner, M. A.
1997-01-01
A blind test of two remote sensing-based models for predicting adult populations of Anopheles albimanus in villages, an indicator of malaria transmission risk, was conducted in southern Chiapas, Mexico. One model was developed using a discriminant analysis approach, while the other was based on regression analysis. The models were developed in 1992 for an area around Tapachula, Chiapas, using Landsat Thematic Mapper (TM) satellite data and geographic information system functions. Using two remotely sensed landscape elements, the discriminant model was able to successfully distinguish between villages with high and low An. albimanus abundance with an overall accuracy of 90%. To test the predictive capability of the models, multitemporal TM data were used to generate a landscape map of the Huixtla area, northwest of Tapachula, where the models were used to predict risk for 40 villages. The resulting predictions were not disclosed until the end of the test. Independently, An. albimanus abundance data were collected in the 40 randomly selected villages for which the predictions had been made. These data were subsequently used to assess the models' accuracies. The discriminant model accurately predicted 79% of the high-abundance villages and 50% of the low-abundance villages, for an overall accuracy of 70%. The regression model correctly identified seven of the 10 villages with the highest mosquito abundance. This test demonstrated that remote sensing-based models generated for one area can be used successfully in another, comparable area.
Sareen, Rateesh; Pandey, C L
2016-01-01
Background: Early diagnosis of lung cancer plays a pivotal role in reducing the lung cancer death rate. Cytological techniques are safer, economical and provide quick results. Bronchoscopic washing, brushing and fine needle aspirations not only complement tissue biopsies in the diagnosis of lung cancer but are also comparable to them. Objectives: (1) To find out diagnostic yields of bronchoalveolar lavage, bronchial brushings, FNAC in diagnosis of lung malignancy. (2) To compare relative accuracy of these three cytological techniques. (3) To correlate the cytologic diagnosis with clinical, bronchoscopic and CT findings. (4) Cytological and histopathological correlation of lung lesions. Methods: All the patients who came with clinical or radiological suspicion of lung malignancy in a two-and-a-half-year period were included in the study. Bronchoalveolar lavage was the most common type of cytological specimen (82.36%), followed by CT guided FNAC (9.45%) and bronchial brushings (8.19%). Sensitivity, specificity, positive and negative predictive values for all techniques and correlation with histopathology were calculated using standard formulas. Results: The most sensitive technique was CT FNAC (87.25%), followed by brushings (77.78%) and BAL (72.69%). CT FNAC had the highest diagnostic yield (90.38%), followed by brushings (86.67%) and BAL (83.67%). Specificity and positive predictive value were 100% for all techniques. The lowest false negative rate was obtained with CT FNAC (12.5%) and the highest with BAL (27.3%). The highest negative predictive value was that of BAL (76.95%), followed by BB (75.59%) and CT FNAC (70.59%). Conclusion: Before administering antitubercular treatment every effort should be made to rule out malignancy. CT FNAC had the highest diagnostic yield among the three cytological techniques. BAL is an important tool in screening central as well as in accessible lesions. It can be used at places where CT guided FNAC is not available or could not be done due to technical or financial limitations. PMID:27890992
NetMHCcons: a consensus method for the major histocompatibility complex class I predictions.
Karosiene, Edita; Lundegaard, Claus; Lund, Ole; Nielsen, Morten
2012-03-01
A key role in cell-mediated immunity is dedicated to the major histocompatibility complex (MHC) molecules that bind peptides for presentation on the cell surface. Several in silico methods capable of predicting peptide binding to MHC class I have been developed. The accuracy of these methods depends on the data available characterizing the binding specificity of the MHC molecules. It has, moreover, been demonstrated that consensus methods defined as combinations of two or more different methods led to improved prediction accuracy. This plethora of methods makes it very difficult for the non-expert user to choose the most suitable method for predicting binding to a given MHC molecule. In this study, we have therefore made an in-depth analysis of combinations of three state-of-the-art MHC-peptide binding prediction methods (NetMHC, NetMHCpan and PickPocket). We demonstrate that a simple combination of NetMHC and NetMHCpan gives the highest performance when the allele in question is included in the training and is characterized by at least 50 data points with at least ten binders. Otherwise, NetMHCpan is the best predictor. When an allele has not been characterized, the performance depends on the distance to the training data. NetMHCpan has the highest performance when close neighbours are present in the training set, while the combination of NetMHCpan and PickPocket outperforms either of the two methods for alleles with more remote neighbours. The final method, NetMHCcons, is publicly available at www.cbs.dtu.dk/services/NetMHCcons , and allows the user in an automatic manner to obtain the most accurate predictions for any given MHC molecule.
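The selection rule described in this abstract can be written as a small decision function. This is a sketch of the rule as stated, not the NetMHCcons implementation: the function name and arguments are hypothetical, and because the abstract gives no numeric distance threshold for "close neighbours", that judgement is passed in as a boolean.

```python
def choose_predictor(in_training, n_data_points=0, n_binders=0,
                     has_close_neighbour=False):
    """Return the method combination the abstract reports as best.

    All parameter names are illustrative assumptions. `has_close_neighbour`
    stands in for the distance-to-training-data criterion, whose threshold
    is not given in the abstract.
    """
    if in_training:
        # Well-characterised allele: at least 50 data points and 10 binders.
        if n_data_points >= 50 and n_binders >= 10:
            return ("NetMHC", "NetMHCpan")
        return ("NetMHCpan",)
    # Allele not characterised: depends on distance to the training data.
    if has_close_neighbour:
        return ("NetMHCpan",)
    return ("NetMHCpan", "PickPocket")


print(choose_predictor(in_training=True, n_data_points=120, n_binders=25))
print(choose_predictor(in_training=False, has_close_neighbour=False))
```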
Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik
2018-05-01
Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with repetition time / echo time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER+ tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance of accounting for heterogeneity.
Our findings suggest that SVM, using features from a combination of diffusion models, improves prediction accuracy for differentiation of benign versus malignant breast tumors, and may further assist in subtyping of breast cancer. Level of Evidence: 3. Technical Efficacy: Stage 3. J. Magn. Reson. Imaging 2018;47:1205-1216. © 2017 International Society for Magnetic Resonance in Medicine.
Kuo, Pao-Jen; Wu, Shao-Chun; Chien, Peng-Chen; Chang, Shu-Shya; Rau, Cheng-Shyuan; Tai, Hsueh-Ling; Peng, Shu-Hui; Lin, Yi-Chun; Chen, Yi-Chun; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua
2018-01-01
Background The aim of this study was to develop an effective surgical site infection (SSI) prediction model in patients receiving free-flap reconstruction after surgery for head and neck cancer using artificial neural network (ANN), and to compare its predictive power with that of conventional logistic regression (LR). Materials and methods There were 1,836 patients with 1,854 free-flap reconstructions and 438 postoperative SSIs in the dataset for analysis. They were randomly assigned in a ratio of 7:3 into a training set and a test set. Based on comprehensive characteristics of patients and diseases in the absence or presence of operative data, prediction of SSI was performed at two time points (pre-operatively and post-operatively) with a feed-forward ANN and the LR models. In addition to the calculated accuracy, sensitivity, and specificity, the predictive performance of ANN and LR was assessed based on area under the curve (AUC) measures of receiver operator characteristic curves and Brier score. Results ANN had a significantly higher AUC (0.892) of post-operative prediction and AUC (0.808) of pre-operative prediction than LR (both P<0.0001). In addition, there was a significantly higher AUC of post-operative prediction than pre-operative prediction by ANN (p<0.0001). With the highest AUC and the lowest Brier score (0.090), the post-operative prediction by ANN had the highest overall predictive performance. Conclusion The post-operative prediction by ANN had the highest overall performance in predicting SSI after free-flap reconstruction in patients receiving surgery for head and neck cancer. PMID:29568393
Galehdari, Hamid; Saki, Najmaldin; Mohammadi-Asl, Javad; Rahim, Fakher
2013-01-01
Crigler-Najjar syndrome (CNS) type I and type II are usually inherited as autosomal recessive conditions that result from mutations in the UGT1A1 gene. The main objective of the present review is to summarize all available evidence on the accuracy of SNP-based pathogenicity detection tools, compared with published clinical results, for the prediction of nsSNPs that lead to disease, using prediction performance methods. A comprehensive search was performed to find all mutations related to CNS. Database searches included dbSNP, SNPdbe, HGMD, Swissvar, Ensembl, and OMIM. All mutations related to CNS were extracted. Pathogenicity prediction was done using SNP-based pathogenicity detection tools including SIFT, PHD-SNP, PolyPhen2, fathmm, Provean, and Mutpred. Overall, 59 different SNPs related to missense mutations in the UGT1A1 gene were reviewed. Comparing the diagnostic odds ratio (OR), PolyPhen2 and Mutpred had the highest value (4.983, 95% CI: 1.24 - 20.02, for both), followed by SIFT (diagnostic OR: 3.25, 95% CI: 1.07 - 9.83). The highest Matthews correlation coefficient (MCC) belonged to SIFT (34.19%), followed by Provean, PolyPhen2, and Mutpred (29.99%, 29.89%, and 29.89%, respectively). Likewise, the highest accuracy (ACC) was obtained with SIFT (62.71%), followed by PolyPhen2 and Mutpred (61.02% for both). Our results suggest that some of the well-established SNP-based pathogenicity detection tools can appropriately reflect the role of a disease-associated SNP in both local and global structures.
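The three summary statistics compared in this abstract (diagnostic OR, MCC and ACC) are all functions of a tool's 2x2 confusion table. The sketch below shows the standard definitions; the counts passed in are hypothetical and chosen only to make the arithmetic easy to follow, since the abstract does not report the per-tool tables.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all variants classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """Odds of a positive call in true positives versus true negatives."""
    return (tp * tn) / (fp * fn)


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient, ranging from -1 to +1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not taken from the review).
table = dict(tp=40, fp=20, fn=10, tn=30)
print("ACC:", accuracy(**table))
print("DOR:", diagnostic_odds_ratio(**table))
print("MCC:", round(matthews_corrcoef(**table), 4))
```

Unlike accuracy, MCC uses all four cells symmetrically, so it can rank tools differently from ACC when the pathogenic and benign classes are unbalanced.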
Influence of cone beam CT enhancement filters on diagnosis ability of longitudinal root fractures
Nascimento, M C C; Nejaim, Y; de Almeida, S M; Bóscolo, F N; Haiter-Neto, F; Sobrinho, L C
2014-01-01
Objectives: To determine whether cone beam CT (CBCT) enhancement filters influence the diagnosis of longitudinal root fractures. Methods: 40 extracted human posterior teeth were endodontically prepared, and fractures with no separation of fragments were made in 20 teeth of this sample. The teeth were placed in a dry mandible and scanned using a Classic i-CAT® CBCT device (Imaging Sciences International, Inc., Hatfield, PA). Evaluations were performed with and without CBCT filters (Sharpen Mild, Sharpen Super Mild, S9, Sharpen, Sharpen 3 × 3, Angio Sharpen Medium 5 × 5, Angio Sharpen High 5 × 5 and Shadow 3 × 3) by three oral radiologists. Inter- and intraobserver agreement was calculated by the kappa test. Accuracy, sensitivity, specificity and positive and negative predictive values were determined. The McNemar test was applied for agreement between all images vs the gold standard and original images vs images with filters (p < 0.05). Results: Means of intraobserver agreement ranged from good to excellent. The Angio Sharpen Medium 5 × 5 filter obtained the highest positive predictive value (80.0%) and specificity value (76.5%). The Angio Sharpen High 5 × 5 filter obtained the highest sensitivity (78.9%) and accuracy (77.5%) values. Negative predictive value was the highest (82.9%) for the S9 filter. The McNemar test showed no statistically significant differences between images with and without CBCT filters (p > 0.05). Conclusions: Although no statistically significant difference was observed in the diagnosis of root fractures when using filters, these filters seem to improve diagnostic capacity for longitudinal root fractures. Further in vitro studies with endodontically treated teeth and research in vivo should be considered. PMID:24408819
Multivariate Models for Prediction of Human Skin Sensitization Hazard
Strickland, Judy; Zang, Qingda; Paris, Michael; Lehmann, David M.; Allen, David; Choksi, Neepa; Matheson, Joanna; Jacobs, Abigail; Casey, Warren; Kleinstreuer, Nicole
2016-01-01
One of ICCVAM’s top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensitization suggests that no single alternative method will replace the currently accepted animal tests. ICCVAM is evaluating an integrated approach to testing and assessment based on the adverse outcome pathway for skin sensitization that uses machine learning approaches to predict human skin sensitization hazard. We combined data from three in chemico or in vitro assays—the direct peptide reactivity assay (DPRA), human cell line activation test (h-CLAT), and KeratinoSens™ assay—six physicochemical properties, and an in silico read-across prediction of skin sensitization hazard into 12 variable groups. The variable groups were evaluated using two machine learning approaches, logistic regression (LR) and support vector machine (SVM), to predict human skin sensitization hazard. Models were trained on 72 substances and tested on an external set of 24 substances. The six models (three LR and three SVM) with the highest accuracy (92%) used: (1) DPRA, h-CLAT, and read-across; (2) DPRA, h-CLAT, read-across, and KeratinoSens; or (3) DPRA, h-CLAT, read-across, KeratinoSens, and log P. The models performed better at predicting human skin sensitization hazard than the murine local lymph node assay (accuracy = 88%), any of the alternative methods alone (accuracy = 63–79%), or test batteries combining data from the individual methods (accuracy = 75%). These results suggest that computational methods are promising tools to effectively identify potential human skin sensitizers without animal testing. PMID:27480324
Brodaty, Henry; Aerts, Liesbeth; Crawford, John D; Heffernan, Megan; Kochan, Nicole A; Reppermund, Simone; Kang, Kristan; Maston, Kate; Draper, Brian; Trollor, Julian N; Sachdev, Perminder S
2017-05-01
Mild cognitive impairment (MCI) is considered an intermediate stage between normal aging and dementia. It is diagnosed in the presence of subjective cognitive decline and objective cognitive impairment without significant functional impairment, although there are no standard operationalizations for each of these criteria. The objective of this study is to determine which operationalization of the MCI criteria is most accurate at predicting dementia. Six-year longitudinal study, part of the Sydney Memory and Ageing Study. Community-based. 873 community-dwelling dementia-free adults between 70 and 90 years of age. Persons from a non-English speaking background were excluded. Seven different operationalizations for subjective cognitive decline and eight measures of objective cognitive impairment (resulting in 56 different MCI operational algorithms) were applied. The accuracy of each algorithm to predict progression to dementia over 6 years was examined for 618 individuals. Baseline MCI prevalence varied between 0.4% and 30.2% and dementia conversion between 15.9% and 61.9% across different algorithms. The predictive accuracy for progression to dementia was poor. The highest accuracy was achieved based on objective cognitive impairment alone. Inclusion of subjective cognitive decline or mild functional impairment did not improve dementia prediction accuracy. Not MCI, but objective cognitive impairment alone, is the best predictor for progression to dementia in a community sample. Nevertheless, clinical assessment procedures need to be refined to improve the identification of pre-dementia individuals. Copyright © 2016 American Association for Geriatric Psychiatry. Published by Elsevier Inc. All rights reserved.
Zhou, Chao; Yin, Kunlong; Cao, Ying; Ahmed, Bayes; Fu, Xiaolin
2018-05-08
Landslide displacement prediction is considered an essential component for developing early warning systems. The modelling of conventional forecast methods requires enormous monitoring data, which limits their application. To conduct accurate displacement prediction with limited data, a novel method is proposed and applied by integrating three computational intelligence algorithms, namely the wavelet transform (WT), the artificial bee colony (ABC), and the kernel-based extreme learning machine (KELM). First, the total displacement was decomposed into several sub-sequences with different frequencies using the WT. Next, each sub-sequence was predicted separately by the KELM, whose parameters were optimized by the ABC. Finally, the predicted total displacement was obtained by adding all the predicted sub-sequences. The Shuping landslide in the Three Gorges Reservoir area in China was taken as a case study. The performance of the new method was compared with the WT-ELM, ABC-KELM, ELM, and the support vector machine (SVM) methods. Results show that the prediction accuracy can be improved by decomposing the total displacement into sub-sequences with various frequencies and by predicting them separately. The ABC-KELM algorithm shows the highest prediction capacity followed by the ELM and SVM. Overall, the proposed method achieved excellent performance both in terms of accuracy and stability.
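The decompose, predict separately, then sum workflow can be illustrated with much simpler stand-ins. In the sketch below a trailing moving average replaces the wavelet transform and a one-step linear extrapolation replaces the ABC-tuned KELM; both substitutions, the function names and the toy series are assumptions made for illustration, not the authors' method or data.

```python
def moving_average(series, window):
    """Trailing moving average; early points use the samples available so far."""
    out = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        out.append(sum(series[start:i + 1]) / (i + 1 - start))
    return out


def decompose(series, window=3):
    """Split a series into a smooth trend and a residual that sum back to it.

    Stand-in for the wavelet decomposition described in the abstract.
    """
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return [trend, residual]


def extrapolate_one_step(component):
    """Stand-in predictor (not KELM): continue the last observed increment."""
    return component[-1] + (component[-1] - component[-2])


def forecast_total(series, window=3):
    """Predict each component separately and add the predictions."""
    return sum(extrapolate_one_step(c) for c in decompose(series, window))


# Toy cumulative displacement series in arbitrary units (made up).
displacement = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(forecast_total(displacement))
```

The point of the structure is that each component is smoother or more regular than the raw series, so a separate model per component has an easier target than one model for the total.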
Ho, Chih-I; Chen, Jau-Yuan; Chen, Shou-Yen; Tsai, Yi-Wen; Weng, Yi-Ming; Tsao, Yu-Chung; Li, Wen-Cheng
2015-10-01
The triglycerides-to-high-density lipoprotein-cholesterol (TG/HDL-C) ratio has been identified as a biomarker of insulin resistance and a predictor for atherosclerosis. The objectives of this study were to investigate whether the TG/HDL-C ratio is useful to detect metabolic syndrome (MS) risk factors and subclinical chronic kidney disease (CKD) in a general population without known CKD or renal impairment and to compare the predictive accuracy of MS risk factors. This was a cross-sectional study. A total of 46,255 subjects aged ≥18 years who underwent health examination during 2010-2011 in Taiwan were included. The independent associations between TG/HDL-C ratio quartiles, waist circumference (WC), waist-to-height ratio (WHtR), mean arterial pressure (MAP), and CKD prevalence were analyzed using logistic regression models. Analyses of the areas under the receiver operating characteristic (ROC) curve were performed to determine the accuracy of MS risk factors in predicting CKD. A dose-response relationship was observed for the prevalence of CKD and measurements of MS risk factors, showing increases from the lowest to the highest quartile of the TG/HDL-C ratio. Males and females in the highest TG/HDL-C ratio quartile (>2.76) had a 1.4-fold and 1.74-fold greater risk of CKD than those in the lowest quartile (≤1.04), independent of confounding factors. MAP had the highest area under the ROC curve (AUC) for predicting CKD among MS risk factors. The TG/HDL-C ratio was an independent risk factor for CKD, but it showed no superiority over MAP in predicting CKD. A TG/HDL-C ratio ≥2.76 may be useful in clinical practice to detect subjects with worsened cardiometabolic profile who need monitoring to prevent CKD. TG/HDL-C ratio is an independent risk factor for CKD in adults aged 18-50 years. MAP was the most powerful predictor among MS risk factors in predicting CKD. However, longitudinal and comparative studies are required to demonstrate the predictive value of TG/HDL-C on the onset and progression of CKD over time.
Copyright © 2014 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Hassan, Mahmoud Fathy; Rund, Nancy Mohamed Ali; Salama, Ahmed Husseiny
2013-01-01
Background. To assess the ability of mid-trimester sFlt-1/PlGF ratio for prediction of preeclampsia in two different Arabic populations. Methods. This study measured levels of sFlt-1, PlGF, and sFlt-1/PlGF ratio at midtrimester in 83 patients who developed preeclampsia and 250 contemporary matched controls. Results. Women who subsequently developed preeclampsia had significantly lower PlGF levels and higher sFlt-1 and sFlt-1/PlGF ratio levels than women with normal pregnancies (P < 0.0001 for all). Women with preterm preeclampsia had significantly higher sFlt-1 and sFlt-1/PlGF ratio than term preeclamptic women (P = 0.01, 0.003, resp.). A cutoff value of 3198 pg/mL for sFlt-1 was able to predict preeclampsia with sensitivity, specificity, and accuracy of 88%, 83.6%, and 84.7%, respectively, with odds ratio (OR) 37.2 [95% confidence interval (CI) 17.7-78.1]. PlGF at a cutoff value of 138 pg/mL was able to predict preeclampsia with sensitivity, specificity, and accuracy of 85.5%, 77.2%, and 79.3%, respectively, with OR 20 [95% CI, 10.2-39.5]. The sFlt-1/PlGF ratio at a cutoff value of 24.5 was able to predict preeclampsia with sensitivity, specificity, and accuracy of 91.6%, 86.4%, and 87.7%, respectively, with OR 67 [95% CI, 29.3-162.1]. Conclusion. Midtrimester sFlt-1/PlGF ratio displayed the highest sensitivity, specificity, accuracy, and OR for prediction of preeclampsia, demonstrating that it may provide more effective prediction of preeclampsia development than individual factor assay.
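The accuracies reported for the three markers follow from the quoted sensitivities and specificities once the group sizes (83 cases, 250 controls) are taken into account: overall accuracy is the average of sensitivity and specificity weighted by group size. The sketch below is a worked check of that relationship; the function and variable names are illustrative.

```python
def overall_accuracy(sensitivity, specificity, n_cases, n_controls):
    """Accuracy as the group-size-weighted mean of sensitivity and specificity."""
    correct = sensitivity * n_cases + specificity * n_controls
    return correct / (n_cases + n_controls)


# (sensitivity, specificity) pairs as quoted in the abstract.
reported = {
    "sFlt-1": (0.880, 0.836),
    "PlGF": (0.855, 0.772),
    "sFlt-1/PlGF ratio": (0.916, 0.864),
}

accuracies = {
    marker: overall_accuracy(se, sp, n_cases=83, n_controls=250)
    for marker, (se, sp) in reported.items()
}
for marker, acc in accuracies.items():
    print(f"{marker}: {acc:.1%}")
```

Because controls outnumber cases about three to one, each accuracy sits closer to the marker's specificity than to its sensitivity.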
Inui, Yoshitaka; Ito, Kengo; Kato, Takashi
2017-01-01
The value of fluorine-18-fluorodeoxyglucose positron emission tomography (18F-FDG-PET) and magnetic resonance imaging (MRI) for predicting conversion of mild cognitive impairment (MCI) to Alzheimer's disease (AD) in the longer term is unclear. To evaluate longer-term prediction of MCI to AD conversion using 18F-FDG-PET and MRI in a multicenter study. One-hundred and fourteen patients with MCI were followed for 5 years. They underwent clinical and neuropsychological examinations, 18F-FDG-PET, and MRI at baseline. PET images were visually classified into predefined dementia patterns. PET scores were calculated as a semi-quantitative index. For structural MRI, z-scores in the medial temporal area were calculated by automated volume-based morphometry (VBM). Overall, 72% of patients with amnestic MCI progressed to AD during the 5-year follow-up. The diagnostic accuracy of PET scores over 5 years was 60% with 53% sensitivity and 84% specificity. Visual interpretation of PET images predicted conversion to AD with an overall 82% diagnostic accuracy, 94% sensitivity, and 53% specificity. The accuracy of VBM analysis presented little fluctuation through 5 years and it was highest (73%) at the 5-year follow-up, with 79% sensitivity and 63% specificity. The best performance (87.9% diagnostic accuracy, 89.8% sensitivity, and 82.4% specificity) was with a combination identified using multivariate logistic regression analysis that included PET visual interpretation, educational level, and neuropsychological tests as predictors. 18F-FDG-PET visual assessment showed high performance for predicting conversion to AD from MCI, particularly in combination with neuropsychological tests. PET scores showed high diagnostic specificity. Structural MRI focused on the medial temporal area showed stable predictive value throughout the 5-year course.
NASA Astrophysics Data System (ADS)
Tseng, Chien-Hsun
2018-06-01
This paper aims to develop a multidimensional wave digital filtering network for predicting static and dynamic behaviors of composite laminates based on the FSDT. The resultant network is, thus, an integrated platform that can handle not only the free vibration but also the bending deflection of moderately thick symmetric laminated plates with low plate side-to-thickness ratios (≤ 20). Safeguarded by the Courant-Friedrichs-Lewy stability condition with the least restriction in terms of optimization technique, the present method offers numerically high accuracy, stability and efficiency across a wide range of modulus ratios for the FSDT laminated plates. Instead of using a constant shear correction factor (SCF) with a limited numerical accuracy for the bending deflection, an optimum SCF is sought by looking for a minimum ratio of change in the transverse shear energy. In this way, it can predict equally accurate results for certain cases of bending deflection. Extensive simulation results carried out for the prediction of maximum bending deflection have demonstrated that the present method outperforms those based on the higher-order shear deformation and layerwise plate theories. To the best of our knowledge, this is the first work that shows an optimal selection of SCF can significantly increase the accuracy of FSDT-based laminates, especially compared to the higher-order theory disclaiming any correction. The highest accuracy of overall solution is compared to the 3D elasticity equilibrium one.
Estimation of stature from hand and foot dimensions in a Korean population.
Kim, Wonjoon; Kim, Yong Min; Yun, Myung Hwan
2018-04-01
The estimation of stature using foot and hand dimensions is essential in the process of personal identification. The shapes of feet and hands vary depending on race and gender, and it is of great importance to design an adequate equation in consideration of variances to estimate stature. This study is based on a total of 5,195 South Korean males and females, aged from 20 to 59 years. Body dimensions of stature, hand length, hand breadth, foot length, and foot breadth were measured according to standard anthropometric procedures. The independent t-test was performed in order to verify significant gender-induced differences and the results showed that there was a significant difference between males and females for all the foot-hand dimensions (p<0.01). All dimensions showed a positive and statistically significant relation with stature in both genders (p<0.01). For both genders, the foot length showed the highest correlation, whereas the hand breadth showed the least correlation. The stepwise regression analysis was conducted, and the results showed that males had the highest prediction accuracy in the regression equation consisting of foot length and hand length (R² = 0.532), whereas females had the highest accuracy in the regression model consisting of foot length and hand breadth (R² = 0.437). The findings of this study indicated that hand and foot dimensions can be used to predict the stature of South Koreans in the forensic science field.
Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers
2010-01-01
Background At the current price, the use of high-density single nucleotide polymorphisms (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls.
Conclusions Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~3,000 to 5,000 evenly spaced SNP. PMID:20950478
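The ridge regression step described in the abstract above can be sketched as follows. This is a minimal illustration on simulated genotypes, not the authors' pipeline: the function names, the penalty value `lam`, the data sizes, and the use of the correlation between predicted and observed values as "accuracy" are all assumptions made for the example.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects: solve (X'X + lam*I) b = X'y."""
    Xc = X - X.mean(axis=0)          # center genotype codes column-wise
    yc = y - y.mean()                # center the phenotype
    p = Xc.shape[1]
    A = Xc.T @ Xc + lam * np.eye(p)  # penalized normal-equation matrix
    return np.linalg.solve(A, Xc.T @ yc)

def prediction_accuracy(X_train, y_train, X_test, y_test, lam):
    """Correlation between predicted genomic values and observed test values."""
    b = ridge_marker_effects(X_train, y_train, lam)
    dgv = (X_test - X_train.mean(axis=0)) @ b  # predicted direct genomic values
    return np.corrcoef(dgv, y_test)[0, 1]

# Simulated data (hypothetical sizes): genotypes coded 0/1/2.
rng = np.random.default_rng(0)
n_train, n_test, n_snp = 200, 50, 300
X = rng.integers(0, 3, size=(n_train + n_test, n_snp)).astype(float)
true_b = rng.normal(0.0, 0.1, n_snp)
y = X @ true_b + rng.normal(0.0, 1.0, n_train + n_test)

acc = prediction_accuracy(X[:n_train], y[:n_train],
                          X[n_train:], y[n_train:], lam=50.0)
print(f"validation accuracy (correlation): {acc:.2f}")
```

Ranking SNP by the absolute value of the returned coefficients, as the abstract describes, would then be a matter of sorting `np.abs(b)`; the choice of `lam` would normally be tuned rather than fixed.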
Blum, Meike; Distl, Ottmar
2014-01-01
In the present study, breeding values for canine congenital sensorineural deafness, the presence of blue eyes and patches have been predicted using multivariate animal models to test the reliability of the breeding values for planned matings. The dataset consisted of 6669 German Dalmatian dogs born between 1988 and 2009. Data were provided by the Dalmatian kennel clubs which are members of the German Association for Dog Breeding and Husbandry (VDH). The hearing status for all dogs was evaluated using brainstem auditory evoked potentials. The reliability using the prediction error variance of breeding values and the realized reliability of the prediction of the phenotype of future progeny born in each year between 2006 and 2009 were used as parameters to evaluate the goodness of prediction through breeding values. All animals from the previous birth years were used for prediction of the breeding values of the progeny in each of the upcoming birth years. The breeding values based on pedigree records achieved an average reliability of 0.19 for the future 1951 progeny. The predictive accuracy (R2) for the hearing status of single future progeny was 1.3%. Combining breeding values for littermates increased the predictive accuracy to 3.5%. Corresponding values for maternal and paternal half-sib groups were 3.2 and 7.3%. The use of breeding values for planned matings increases the phenotypic selection response over mass selection. The breeding values of sires may be used for planned matings because reliabilities and predictive accuracies for future paternal progeny groups were highest.
Conde-Agudelo, Agustin; Romero, Roberto
2015-12-01
To determine the accuracy of changes in transvaginal sonographic cervical length over time in predicting preterm birth in women with singleton and twin gestations. PubMed, Embase, Cinahl, Lilacs, and Medion (all from inception to June 30, 2015), bibliographies, Google scholar, and conference proceedings. Cohort or cross-sectional studies reporting on the predictive accuracy for preterm birth of changes in cervical length over time. Two reviewers independently selected studies, assessed the risk of bias, and extracted the data. Summary receiver-operating characteristic curves, pooled sensitivities and specificities, and summary likelihood ratios were generated. Fourteen studies met the inclusion criteria, of which 7 provided data on singleton gestations (3374 women) and 8 on twin gestations (1024 women). Among women with singleton gestations, the shortening of cervical length over time had a low predictive accuracy for preterm birth at <37 and <35 weeks of gestation with pooled sensitivities and specificities, and summary positive and negative likelihood ratios ranging from 49% to 74%, 44% to 85%, 1.3 to 4.1, and 0.3 to 0.7, respectively. In women with twin gestations, the shortening of cervical length over time had a low to moderate predictive accuracy for preterm birth at <34, <32, <30, and <28 weeks of gestation with pooled sensitivities and specificities, and summary positive and negative likelihood ratios ranging from 47% to 73%, 84% to 89%, 3.8 to 5.3, and 0.3 to 0.6, respectively. There were no statistically significant differences between the predictive accuracies for preterm birth of cervical length shortening over time and the single initial and/or final cervical length measurement in 8 of 11 studies that provided data for making these comparisons. 
In the largest and highest-quality study, a single measurement of cervical length obtained at 24 or 28 weeks of gestation was significantly more predictive of preterm birth than any decrease in cervical length between these gestational ages. Change in transvaginal sonographic cervical length over time is not a clinically useful test to predict preterm birth in women with singleton or twin gestations. A single cervical length measurement obtained between 18 and 24 weeks of gestation appears to be a better test to predict preterm birth than changes in cervical length over time.
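The likelihood ratios reported in the abstract above follow directly from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below shows the arithmetic. The function name is hypothetical, and the input pair (73% sensitivity, 84% specificity) is an illustrative combination taken from within the reported ranges, not a paired result from any single included study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test, given proportions in [0, 1]."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1]; specificity in (0, 1)")
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive test raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative test lowers the odds
    return lr_pos, lr_neg

# Illustrative values only (assumed pairing).
lr_pos, lr_neg = likelihood_ratios(0.73, 0.84)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

An LR+ below about 5 and an LR- above about 0.2 are commonly read as giving only small to moderate shifts in post-test probability, which is consistent with the abstract's conclusion of low to moderate predictive accuracy.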
Hua, Hong-Li; Zhang, Fa-Zhan; Labena, Abraham Alemayehu; Dong, Chuan; Jin, Yan-Ting; Guo, Feng-Biao
Investigation of essential genes is significant to comprehend the minimal gene set of a cell and discover potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning method was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded the highest area under receiver operating characteristic (ROC) curve (AUC) of 0.9716 through tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency of predictions and known essential genes of target species. The highest AUC of 0.9552 and average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus, which was released recently, was obtained for further assessment of the performance of our model. The AUC score of predictions is 0.7855, which is higher than other methods. This research shows that features obtained by homology mapping alone can achieve comparable or even better results than those integrated features. Meanwhile, the work indicates that machine learning-based method can assign more efficient weight coefficients than using empirical formula based on biological knowledge.
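The AUC values quoted in the abstract above can be read as the probability that a randomly chosen essential gene receives a higher score than a randomly chosen non-essential gene. The sketch below computes AUC directly from that definition by pairwise comparison. The function name and the toy scores are hypothetical, and this is not the authors' evaluation code.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking. Runs in O(n_pos * n_neg) time,
    which is fine for a small illustration.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy classifier scores (made up): essential genes vs. non-essential genes.
essential = [0.9, 0.8, 0.4]
non_essential = [0.7, 0.3, 0.2]
auc = auc_from_scores(essential, non_essential)
print(f"AUC = {auc:.3f}")
```

For large gene sets a rank-based formulation gives the same value more efficiently; the pairwise form is used here only because it mirrors the definition.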
Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter
2017-05-12
A better understanding of the genetic architecture of complex traits can contribute to improving genomic prediction. We hypothesized that genomic variants associated with mastitis and milk production traits in dairy cattle are enriched in hepatic transcriptomic regions that are responsive to intra-mammary infection (IMI). Genomic markers [e.g. single nucleotide polymorphisms (SNPs)] from those regions, if included, may improve the predictive ability of a genomic model. We applied a genomic feature best linear unbiased prediction model (GFBLUP) to implement the above strategy by considering the hepatic transcriptomic regions responsive to IMI as genomic features. GFBLUP, an extension of GBLUP, includes a separate genomic effect of SNPs within a genomic feature, and allows differential weighting of the individual marker relationships in the prediction equation. Since GFBLUP is computationally intensive, we investigated whether a SNP set test could be a computationally fast way to preselect predictive genomic features. The SNP set test assesses the association between a genomic feature and a trait based on single-SNP genome-wide association studies. We applied these two approaches to mastitis and milk production traits (milk, fat and protein yield) in Holstein (HOL, n = 5056) and Jersey (JER, n = 1231) cattle. We observed that a majority of genomic features were enriched in genomic variants that were associated with mastitis and milk production traits. Compared to GBLUP, the accuracy of genomic prediction with GFBLUP was marginally improved (3.2 to 3.9%) in within-breed prediction. The highest increase (164.4%) in prediction accuracy was observed in across-breed prediction. The significance of genomic features based on the SNP set test was correlated with changes in prediction accuracy of GFBLUP (P < 0.05).
GFBLUP provides a framework for integrating multiple layers of biological knowledge to provide novel insights into the biological basis of complex traits, and to improve the accuracy of genomic prediction. The SNP set test might be used as a first step to improve GFBLUP models. Approaches like GFBLUP and SNP set test will become increasingly useful, as the functional annotations of genomes keep accumulating for a range of species and traits.
Frouzan, Arash; Masoumi, Kambiz; Delirroyfard, Ali; Mazdaie, Behnaz; Bagherzadegan, Elnaz
2017-01-01
Background Long bone fractures are common injuries caused by trauma. Some studies have demonstrated that ultrasound has a high sensitivity and specificity in the diagnosis of upper and lower extremity long bone fractures. Objective The aim of this study was to determine the accuracy of ultrasound compared with plain radiography in the diagnosis of upper and lower extremity long bone fractures in trauma patients. Methods This cross-sectional study assessed 100 patients admitted to the emergency department of Imam Khomeini Hospital, Ahvaz, Iran with trauma to the upper and lower extremities, from September 2014 through October 2015. In all patients, first ultrasound and then standard plain radiography for the upper and lower limb was performed. Data were analyzed by SPSS version 21 to determine the specificity and sensitivity. Results The mean ages of patients with upper and lower limb trauma were 31.43±12.32 years and 29.63±5.89 years, respectively. Radius fracture was the most frequent compared to other fractures (27%). Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of ultrasound compared with plain radiography in the diagnosis of upper extremity long bone fractures were 95.3%, 87.7%, 87.2% and 96.2%, respectively, and the highest accuracy was observed in left arm fractures (100%). Tibia and fibula fractures were the most frequent types compared to other fractures (89.2%). Sensitivity, specificity, PPV and NPV of ultrasound compared with plain radiography in the diagnosis of lower extremity long bone fractures were 98.6%, 83%, 65.4% and 87.1%, respectively, and the highest accuracy was observed in men, lower ages and femoral fractures. Conclusion The results of this study showed that ultrasound compared with plain radiography has a high accuracy in the diagnosis of upper and lower extremity long bone fractures. PMID:28979747
Modelling invasion for a habitat generalist and a specialist plant species
Evangelista, P.H.; Kumar, S.; Stohlgren, T.J.; Jarnevich, C.S.; Crall, A.W.; Norman, J. B.; Barnett, D.T.
2008-01-01
Predicting suitable habitat and the potential distribution of invasive species is a high priority for resource managers and systems ecologists. Most models are designed to identify habitat characteristics that define the ecological niche of a species with little consideration to individual species' traits. We tested five commonly used modelling methods on two invasive plant species, the habitat generalist Bromus tectorum and habitat specialist Tamarix chinensis, to compare model performances, evaluate predictability, and relate results to distribution traits associated with each species. Most of the tested models performed similarly for each species; however, the generalist species proved to be more difficult to predict than the specialist species. The highest area under the receiver-operating characteristic curve values with independent validation data sets of B. tectorum and T. chinensis were 0.503 and 0.885, respectively. Similarly, a confusion matrix for B. tectorum had the highest overall accuracy of 55%, while the overall accuracy for T. chinensis was 85%. Models for the generalist species had varying performances, poor evaluations, and inconsistent results. This may be a result of a generalist's capability to persist in a wide range of environmental conditions that are not easily defined by the data, independent variables or model design. Models for the specialist species had consistently strong performances, high evaluations, and similar results among different model applications. This is likely a consequence of the specialist's requirement for explicit environmental resources and ecological barriers that are easily defined by predictive models. Although defining new invaders as generalist or specialist species can be challenging, model performances and evaluations may provide valuable information on a species' potential invasiveness.
Genomic Prediction of Testcross Performance in Canola (Brassica napus)
Jan, Habib U.; Abbadi, Amine; Lücke, Sophie; Nichols, Richard A.; Snowdon, Rod J.
2016-01-01
Genomic selection (GS) is a modern breeding approach where genome-wide single-nucleotide polymorphism (SNP) marker profiles are simultaneously used to estimate performance of untested genotypes. In this study, the potential of genomic selection methods to predict testcross performance for hybrid canola breeding was assessed for various agronomic traits based on genome-wide marker profiles. A total of 475 genetically diverse spring-type canola pollinator lines were genotyped at 24,403 single-copy, genome-wide SNP loci. In parallel, the 950 F1 testcross combinations between the pollinators and two representative testers were evaluated for a number of important agronomic traits including seedling emergence, days to flowering, lodging, oil yield and seed yield along with essential seed quality characters including seed oil content and seed glucosinolate content. A ridge-regression best linear unbiased prediction (RR-BLUP) model was applied in combination with 500 cross-validations for each trait to predict testcross performance, both across the whole population as well as within individual subpopulations or clusters, based solely on SNP profiles. Subpopulations were determined using multidimensional scaling and K-means clustering. Genomic prediction accuracy across the whole population was highest for seed oil content (0.81) followed by oil yield (0.75) and lowest for seedling emergence (0.29). For seed yield, seed glucosinolate, lodging resistance and days to onset of flowering (DTF), prediction accuracies were 0.45, 0.61, 0.39 and 0.56, respectively. Prediction accuracies could be increased for some traits by treating subpopulations separately, a strategy which only led to moderate improvements for some traits with low heritability, like seedling emergence. No useful or consistent increase in accuracy was obtained by inclusion of a population substructure covariate in the model.
Testcross performance prediction using genome-wide SNP markers shows considerable potential for pre-selection of promising hybrid combinations prior to resource-intensive field testing over multiple locations and years. PMID:26824924
MEDEX 2015: Heart Rate Variability Predicts Development of Acute Mountain Sickness.
Sutherland, Angus; Freer, Joseph; Evans, Laura; Dolci, Alberto; Crotti, Matteo; Macdonald, Jamie Hugo
2017-09-01
Sutherland, Angus, Joseph Freer, Laura Evans, Alberto Dolci, Matteo Crotti, and Jamie Hugo Macdonald. MEDEX 2015: Heart rate variability predicts development of acute mountain sickness. High Alt Med Biol. 18: 199-208, 2017. Acute mountain sickness (AMS) develops when the body fails to acclimatize to atmospheric changes at altitude. Preascent prediction of susceptibility to AMS would be a useful tool to prevent subsequent harm. Changes to peripheral oxygen saturation (SpO2) on hypoxic exposure have previously been shown to be of poor predictive value. Heart rate variability (HRV) has shown promise in the early prediction of AMS, but its use pre-expedition has not previously been investigated. We aimed to determine whether pre- and intraexpedition HRV assessment could predict susceptibility to AMS at high altitude with better diagnostic accuracy than SpO2. Forty-four healthy volunteers undertook an expedition in the Nepali Himalaya to >5000 m. SpO2 and HRV parameters were recorded at rest in normoxia and in a normobaric hypoxic chamber before the expedition. On the expedition HRV parameters and SpO2 were collected again at 3841 m. A daily Lake Louise Score was obtained to assess AMS symptomology. Low frequency/high frequency (LF/HF) ratio in normoxia (cutpoint ≤2.28 a.u.) and LF following 15 minutes of exposure to normobaric hypoxia had moderate (area under the curve ≥0.8) diagnostic accuracy. LF/HF ratio in normoxia had the highest sensitivity (85%) and specificity (88%) for predicting AMS on subsequent ascent to altitude. In contrast, pre-expedition SpO2 measurements had poor (area under the curve <0.7) diagnostic accuracy and inferior sensitivity and specificity. Pre-ascent measurement of HRV in normoxia was found to be of better diagnostic accuracy for AMS prediction than all measures of HRV in hypoxia, and better than peripheral oxygen saturation monitoring.
Aminsharifi, Alireza; Irani, Dariush; Pooyesh, Shima; Parvin, Hamid; Dehghani, Sakineh; Yousofi, Khalilolah; Fazel, Ebrahim; Zibaie, Fatemeh
2017-05-01
To construct, train, and apply an artificial neural network (ANN) system for prediction of different outcome variables of percutaneous nephrolithotomy (PCNL). We calculated predictive accuracy, sensitivity, and precision for each outcome variable. During the study period, all adult patients who underwent PCNL at our institute were enrolled in the study. Preoperative and postoperative variables were recorded, and stone-free status was assessed perioperatively with computed tomography scans. MATLAB software was used to design and train the network in a feed-forward back-propagation error adjustment scheme. Preoperative and postoperative data from 200 patients (training set) were used to analyze the effect and relative relevance of preoperative values on postoperative parameters. The validated adequately trained ANN was used to predict postoperative outcomes in the subsequent 254 adult patients (test set) whose preoperative values were serially fed into the system. To evaluate system accuracy in predicting each postoperative variable, predicted values were compared with actual outcomes. Two hundred fifty-four patients (155 [61%] males) were considered the test set. Mean stone burden was 6702.86 ± 381.6 mm³. Overall stone-free rate was 76.4%. Fifty-four out of 254 patients (21.3%) required ancillary procedures (shockwave lithotripsy 5.9%, transureteral lithotripsy 10.6%, and repeat PCNL 4.7%). The accuracy and sensitivity of the system in predicting different postoperative variables ranged from 81.0% to 98.2%. As a complex nonlinear mathematical model, our ANN system is an interconnected data mining tool, which prospectively analyzes and "learns" the relationships between variables.
The accuracy and sensitivity of the system for predicting the stone-free rate, the need for blood transfusion, and post-PCNL ancillary procedures ranged from 81.0% to 98.2%. The stone burden and the stone morphometry were among the most significant preoperative characteristics that affected all postoperative outcome variables and they received the highest relative weight by the ANN system.
PREVALENCE OF METABOLIC SYNDROME IN YOUNG MEXICANS: A SENSITIVITY ANALYSIS ON ITS COMPONENTS.
Murguía-Romero, Miguel; Jiménez-Flores, J Rafael; Sigrist-Flores, Santiago C; Tapia-Pancardo, Diana C; Jiménez-Ramos, Arnulfo; Méndez-Cruz, A René; Villalobos-Molina, Rafael
2015-07-28
Obesity is a worldwide epidemic, and the high prevalence of diabetes type II (DM2) and cardiovascular disease (CVD) is in great part a consequence of that epidemic. Metabolic syndrome (MetS) is a useful tool to estimate the risk of a young population to evolve to DM2 and CVD. To estimate the MetS prevalence in young Mexicans, and to evaluate each parameter as an independent indicator through a sensitivity analysis. The prevalence of MetS was estimated in 6,063 young people of the México City metropolitan area. A sensitivity analysis was conducted to estimate the performance of each one of the components of MetS as an indicator of the presence of MetS itself. Five statistics of the sensitivity analysis were calculated for each MetS component and the other parameters included: sensitivity, specificity, positive predictive value or precision, negative predictive value, and accuracy. The prevalence of MetS in the young Mexican population was estimated to be 13.4%. Waist circumference presented the highest sensitivity (96.8% women; 90.0% men), blood pressure presented the highest specificity for women (97.7%) and glucose for men (91.0%). When all five statistics are considered, triglycerides is the component with the highest values, showing a value of 75% or more in four of them. Differences by sex are detected for averages of all components of MetS in young people without alterations. Young Mexicans are highly prone to acquire MetS: 71% have at least one and up to five MetS parameters altered, and 13.4% of them have MetS. Of the five components of MetS, waist circumference presented the highest sensitivity as a predictor of MetS, and triglycerides is the best parameter if a single factor is to be taken as sole predictor of MetS in the young Mexican population; triglycerides is also the parameter with the highest accuracy.
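The five statistics named in the abstract above (sensitivity, specificity, positive predictive value, negative predictive value, and accuracy) all derive from a 2×2 table of a single indicator against the reference diagnosis. The sketch below shows the standard formulas. The function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the five standard statistics from 2x2 confusion-matrix counts.

    tp: indicator positive, condition present
    fp: indicator positive, condition absent
    fn: indicator negative, condition present
    tn: indicator negative, condition absent
    """
    total = tp + fp + fn + tn
    if min(tp + fn, fp + tn, tp + fp, fn + tn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),   # true positives among those with the condition
        "specificity": tn / (tn + fp),   # true negatives among those without it
        "ppv": tp / (tp + fp),           # precision: condition present given a positive
        "npv": tn / (tn + fn),           # condition absent given a negative
        "accuracy": (tp + tn) / total,   # all correct calls over all subjects
    }

# Made-up counts for one indicator in 100 subjects.
metrics = diagnostic_metrics(tp=30, fp=10, fn=5, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because predictive values depend on prevalence while sensitivity and specificity do not, an indicator can rank first on one statistic and poorly on another, which is the pattern the abstract reports for waist circumference and triglycerides.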
Aided diagnosis methods of breast cancer based on machine learning
NASA Astrophysics Data System (ADS)
Zhao, Yue; Wang, Nian; Cui, Xiaoyu
2017-08-01
In the field of medicine, quickly and accurately determining whether a patient's tumor is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor (KNN), Linear Discriminant Analysis (LDA), and Logistic Regression were applied to predict the classification of thyroid, Her-2, PR, ER, Ki67, metastasis and lymph nodes in breast cancer, in order to recognize the benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while the classification performance of KNN and Logistic Regression was better than that of LDA, with the best accuracy reaching 96.30%.
Technow, Frank; Schrag, Tobias A.; Schipprack, Wolfgang; Bauer, Eva; Simianer, Henner; Melchinger, Albrecht E.
2014-01-01
Maize (Zea mays L.) serves as model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill–Robertson effect and support the hypothesis of differential fixation of alleles due to pseudo-overdominance in these regions. In pericentromeric regions we also found indications for consistent marker–QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest, if both parents were parents of other hybrids in the training set, and lowest, if none of them were involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in testcrosses. PMID:24850820
Family-Based Benchmarking of Copy Number Variation Detection Software.
Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael
2015-01-01
The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.
Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen
2017-12-27
Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability [Formula: see text] and no effect with probability (1 - [Formula: see text]). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction. 
For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
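The abstract above notes that a singular value decomposition of the genotype matrix, once computed, can be reused to obtain marker effects cheaply. A minimal sketch of that idea for the SNP-BLUP (ridge) case is given below; it does not reproduce the paper's BayesC approximation. With X = U S V', the ridge solution (X'X + λI)⁻¹X'y equals V diag(s / (s² + λ)) U'y. The function name, the simulated data, and the penalty values are assumptions for illustration, and X is assumed to be centered already.

```python
import numpy as np

def snp_blup_from_svd(U, s, Vt, y, lam):
    """Ridge (SNP-BLUP) marker effects from a precomputed thin SVD X = U @ diag(s) @ Vt."""
    shrink = s / (s**2 + lam)            # per-component shrinkage factor
    return Vt.T @ (shrink * (U.T @ y))   # V diag(s/(s^2+lam)) U'y

# Simulated, centered genotype matrix with more markers than animals.
rng = np.random.default_rng(1)
n_animals, n_markers = 40, 100
X = rng.normal(size=(n_animals, n_markers))
X -= X.mean(axis=0)
y = rng.normal(size=n_animals)

# The SVD is computed once and can be reused for several penalty values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
lam = 10.0
b_svd = snp_blup_from_svd(U, s, Vt, y, lam)
effects_by_lambda = {l: snp_blup_from_svd(U, s, Vt, y, l) for l in (1.0, 10.0, 100.0)}
print(b_svd[:5])
```

The design point is that the expensive step (the SVD) is independent of the penalty, so changing λ or the response vector costs only a few matrix-vector products.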
Genomic prediction of the polled and horned phenotypes in Merino sheep.
Duijvesteijn, Naomi; Bolormaa, Sunduimijid; Daetwyler, Hans D; van der Werf, Julius H J
2018-05-22
In horned sheep breeds, breeding for polledness has been of interest for decades. The objective of this study was to improve prediction of the horned and polled phenotypes using horn scores classified as polled, scurs, knobs or horns. Derived phenotypes polled/non-polled (P/NP) and horned/non-horned (H/NH) were used to test four different strategies for prediction in 4001 purebred Merino sheep. These strategies include the use of single 'single nucleotide polymorphism' (SNP) genotypes, multiple-SNP haplotypes, genome-wide and chromosome-wide genomic best linear unbiased prediction and information from imputed sequence variants from the region including the RXFP2 gene. Low-density genotypes of these animals were imputed to the Illumina Ovine high-density (600k) chip and the 1.78-kb insertion polymorphism in RXFP2 was included in the imputation process to whole-genome sequence. We evaluated the mode of inheritance and validated models by a fivefold cross-validation and across- and between-family prediction. The most significant SNPs for prediction of P/NP and H/NH were OAR10_29546872.1 and OAR10_29458450, respectively, located on chromosome 10 close to the 1.78-kb insertion at 29.5 Mb. The mode of inheritance included an additive effect and a sex-dependent effect for dominance for P/NP and a sex-dependent additive and dominance effect for H/NH. Models with the highest prediction accuracies for H/NH used either single SNPs or 3-SNP haplotypes and included a polygenic effect estimated based on traditional pedigree relationships. Prediction accuracies for H/NH were 0.323 for females and 0.725 for males. For predicting P/NP, the best models were the same as for H/NH but included a genomic relationship matrix with accuracies of 0.713 for females and 0.620 for males. Our results show that prediction accuracy is high using a single SNP, but does not reach 1 since the causative mutation is not genotyped. 
Incomplete penetrance or allelic heterogeneity, which can influence expression of the phenotype, may explain why prediction accuracy did not approach 1 with any of the genetic models tested here. Nevertheless, a breeding program to eradicate horns from Merino sheep can be effective by selecting genotypes GG of SNP OAR10_29458450 or TT of SNP OAR10_29546872.1 since all sheep with these genotypes will be non-horned.
Bellomo-Brandao, Maria Angela; Andrade, Paula D; Costa, Sandra CB; Escanhoela, Cecilia AF; Vassallo, Jose; Porta, Gilda; De Tommaso, Adriana MA; Hessel, Gabriel
2009-01-01
AIM: To determine cytomegalovirus (CMV) frequency in neonatal intrahepatic cholestasis by serology, histological revision (searching for cytomegalic cells), immunohistochemistry, and polymerase chain reaction (PCR), and to verify the relationships among these methods. METHODS: The study comprised 101 non-consecutive infants submitted for hepatic biopsy between March 1982 and December 2005. Serological results were obtained from the patients’ files and the other methods were performed on paraffin-embedded liver samples from hepatic biopsies. The following statistical measures were calculated: frequency, sensitivity, specificity, positive predictive value, negative predictive value, and accuracy. RESULTS: The frequencies of positive results were as follows: serology, 7/64 (11%); histological revision, 0/84; immunohistochemistry, 1/44 (2%), and PCR, 6/77 (8%). Only one patient had positive immunohistochemical findings and a positive PCR. The following statistical measures were calculated between PCR and serology: sensitivity, 33.3%; specificity, 88.89%; positive predictive value, 28.57%; negative predictive value, 90.91%; and accuracy, 82.35%. CONCLUSION: The frequency of positive CMV varied among the tests. Serology presented the highest positive frequency. When compared to PCR, the sensitivity and positive predictive value of serology were low. PMID:19610143
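The five measures reported for serology against PCR all derive from a single 2×2 table. The sketch below shows the arithmetic. The cell counts (2, 5, 4, 40) are not given in the abstract; they are an assumption, chosen because they reproduce the reported percentages when PCR is taken as the reference test.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic-accuracy measures, returned as fractions."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among reference positives
        "specificity": tn / (tn + fp),   # true negatives among reference negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Assumed counts (inferred from the percentages, not stated in the abstract)
metrics = diagnostic_metrics(tp=2, fp=5, fn=4, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

The same function applies to the other diagnostic-accuracy abstracts in this list whenever the four cell counts are available.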
Electroencephalography Predicts Poor and Good Outcomes After Cardiac Arrest: A Two-Center Study.
Rossetti, Andrea O; Tovar Quiroga, Diego F; Juan, Elsa; Novy, Jan; White, Roger D; Ben-Hamouda, Nawfel; Britton, Jeffrey W; Oddo, Mauro; Rabinstein, Alejandro A
2017-07-01
The prognostic role of electroencephalography during and after targeted temperature management in postcardiac arrest patients, relative to other predictors, is incompletely known. We assessed performances of electroencephalography during and after targeted temperature management toward good and poor outcomes, along with other recognized predictors. Cohort study (April 2009 to March 2016). Two academic hospitals (Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland; Mayo Clinic, Rochester, MN). Consecutive comatose adults admitted after cardiac arrest, identified through prospective registries. All patients were managed with targeted temperature management, receiving prespecified standardized clinical, neurophysiologic (particularly, electroencephalography during and after targeted temperature management), and biochemical evaluations. We assessed electroencephalography variables (reactivity, continuity, epileptiform features, and prespecified "benign" or "highly malignant" patterns based on the American Clinical Neurophysiology Society nomenclature) and other clinical, neurophysiologic (somatosensory-evoked potential), and biochemical prognosticators. Good outcome (Cerebral Performance Categories 1 and 2) and mortality predictions at 3 months were calculated. Among 357 patients, early electroencephalography reactivity and continuity and flexor or better motor reaction had greater than 70% positive predictive value for good outcome; reactivity (80.4%; 95% CI, 75.9-84.4%) and motor response (80.1%; 95% CI, 75.6-84.1%) had highest accuracy. Early benign electroencephalography heralded good outcome in 86.2% (95% CI, 79.8-91.1%).
False positive rates for mortality were less than 5% for epileptiform or nonreactive early electroencephalography, nonreactive late electroencephalography, absent somatosensory-evoked potential, absent pupillary or corneal reflexes, presence of myoclonus, and neuron-specific enolase greater than 75 µg/L; accuracy was highest for early electroencephalography reactivity (86.6%; 95% CI, 82.6-90.0%). Early highly malignant electroencephalography had a false positive rate of 1.5% with accuracy of 85.7% (95% CI, 81.7-89.2%). This study provides class III evidence that electroencephalography reactivity predicts both poor and good outcomes, and motor reaction good outcome after cardiac arrest. Electroencephalography reactivity seems to be the best discriminator between good and poor outcomes. Standardized electroencephalography interpretation seems to predict both conditions during and after targeted temperature management.
CAN STABILITY REALLY PREDICT AN IMPENDING SLIP-RELATED FALL AMONG OLDER ADULTS?
Yang, Feng; Pai, Yi-Chung
2015-01-01
The primary purpose of this study was to systematically evaluate and compare the power of a battery of stability indices, obtained during normal walking, to predict falls among community-dwelling older adults. One hundred and eighty-seven community-dwelling older adults participated in the study. After walking regularly for 20 strides on a walkway, participants were subjected to an unannounced slip during gait under the protection of a safety harness. Full body kinematics and kinetics were monitored during walking using a motion capture system synchronized with force plates. Stability variables, including feasible-stability-region measurement, margin of stability, the maximum Floquet multiplier, the Lyapunov exponents (short- and long-term), and the variability of gait parameters (including the step length, step width, and step time) were calculated for each subject. Accuracy of predicting slip outcome (fall vs. recovery) was examined for each stability variable using logistic regression. Results showed that the feasible-stability-region measurement predicted fall incidence among these subjects with the highest accuracy (68.4%). Except for the step width (with an accuracy of 60.2%), no other stability variables could differentiate fallers from those who did not fall in the sample studied. The findings from the present study could provide guidance to identify individuals at increased risk of falling using the feasible-stability-region measurement or variability of the step width. PMID:25458148
Kesorn, Kraisak; Ongruk, Phatsavee; Chompoosri, Jakkrawarn; Phumee, Atchara; Thavara, Usavadee; Tawatsin, Apiwat; Siriyasatien, Padet
2015-01-01
Background In the past few decades, several researchers have proposed highly accurate prediction models that have typically relied on climate parameters. However, climate factors can be unreliable and can lower the effectiveness of prediction when they are applied in locations where climate factors do not differ significantly. The purpose of this study was to improve a dengue surveillance system in areas with similar climate by exploiting the infection rate in the Aedes aegypti mosquito and using the support vector machine (SVM) technique for forecasting the dengue morbidity rate. Methods and Findings Areas with high incidence of dengue outbreaks in central Thailand were studied. The proposed framework consisted of the following three major parts: 1) data integration, 2) model construction, and 3) model evaluation. We discovered that the Ae. aegypti female and larvae mosquito infection rates were significantly positively associated with the morbidity rate. Thus, the increasing infection rate of female mosquitoes and larvae led to a higher number of dengue cases, and the prediction performance increased when those predictors were integrated into a predictive model. In this research, we applied the SVM with the radial basis function (RBF) kernel to forecast the high morbidity rate and take precautions to prevent the development of pervasive dengue epidemics. The experimental results showed that the introduced parameters significantly increased the prediction accuracy to 88.37% when used on the test set data, and these parameters led to the highest performance compared to state-of-the-art forecasting models. Conclusions The infection rates of the Ae. aegypti female mosquitoes and larvae improved the morbidity rate forecasting efficiency better than the climate parameters used in classical frameworks. 
We demonstrated that the SVM-R-based model has high generalization performance and obtained the highest prediction performance compared to classical models as measured by the accuracy, sensitivity, specificity, and mean absolute error (MAE). PMID:25961289
Knowledge discovery by accuracy maximization
Cacciatore, Stefano; Luchinat, Claudio; Tenori, Leonardo
2014-01-01
Here we describe KODAMA (knowledge discovery by accuracy maximization), an unsupervised and semisupervised learning algorithm that performs feature extraction from noisy and high-dimensional data. Unlike other data mining methods, the peculiarity of KODAMA is that it is driven by an integrated procedure of cross-validation of the results. The discovery of a local manifold’s topology is led by a classifier through a Monte Carlo procedure of maximization of cross-validated predictive accuracy. Briefly, our approach differs from previous methods in that it has an integrated procedure of validation of the results. In this way, the method ensures the highest robustness of the obtained solution. This robustness is demonstrated on experimental datasets of gene expression and metabolomics, where KODAMA compares favorably with other existing feature extraction methods. KODAMA is then applied to an astronomical dataset, revealing unexpected features. Interesting and not easily predictable features are also found in the analysis of the State of the Union speeches by American presidents: KODAMA reveals an abrupt linguistic transition sharply separating all post-Reagan from all pre-Reagan speeches. The transition occurs during Reagan’s presidency and not from its beginning. PMID:24706821
New machine-learning algorithms for prediction of Parkinson's disease
NASA Astrophysics Data System (ADS)
Mandal, Indrajit; Sairam, N.
2014-03-01
This article presents a robust inference system that improves the prediction accuracy of Parkinson's disease (PD) diagnosis, with the aim of preventing delay and misdiagnosis of patients. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods applied to PD include sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method comprising the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All experiments are conducted at 95% and 99% confidence levels, and the results are established with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.
Real-data comparison of data mining methods in prediction of diabetes in Iran.
Tapak, Lily; Mahjub, Hossein; Hamidi, Omid; Poorolajal, Jalal
2013-09-01
Diabetes is one of the most common non-communicable diseases in developing countries. Early screening and diagnosis play an important role in effective prevention strategies. This study compared two traditional classification methods (logistic regression and Fisher linear discriminant analysis) and four machine-learning classifiers (neural networks, support vector machines, fuzzy c-mean, and random forests) to classify persons with and without diabetes. The data set used in this study included 6,500 subjects from the Iranian national non-communicable diseases risk factors surveillance obtained through a cross-sectional survey. The obtained sample was based on cluster sampling of the Iran population which was conducted in 2005-2009 to assess the prevalence of major non-communicable disease risk factors. Ten risk factors that are commonly associated with diabetes were selected to compare the performance of six classifiers in terms of sensitivity, specificity, total accuracy, and area under the receiver operating characteristic (ROC) curve criteria. Support vector machines showed the highest total accuracy (0.986) as well as area under the ROC (0.979). Also, this method showed high specificity (1.000) and sensitivity (0.820). All other methods produced total accuracy of more than 85%, but for all methods, the sensitivity values were very low (less than 0.350). The results of this study indicate that, in terms of sensitivity, specificity, and overall classification accuracy, the support vector machine model ranks first among all the classifiers tested in the prediction of diabetes. Therefore, this approach is a promising classifier for predicting diabetes, and it should be further investigated for the prediction of other diseases.
Comparative Analysis of Hybrid Models for Prediction of BP Reactivity to Crossed Legs.
Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar
2017-01-01
Crossing the legs at the knees during BP measurement is one of several physiological stimuli that considerably influence the accuracy of BP measurements. Therefore, it is paramount to develop an appropriate prediction model for interpreting the influence of crossed legs on BP. This research work described the use of principal component analysis- (PCA-) fused forward stepwise regression (FSWR), artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), and least squares support vector machine (LS-SVM) models for prediction of BP reactivity to crossed legs among the normotensive and hypertensive participants. The evaluation of the performance of the proposed prediction models using appropriate statistical indices showed that the PCA-based LS-SVM (PCA-LS-SVM) model has the highest prediction accuracy, with coefficient of determination (R²) = 93.16%, root mean square error (RMSE) = 0.27, and mean absolute percentage error (MAPE) = 5.71 for SBP prediction in normotensive subjects. Furthermore, R² = 96.46%, RMSE = 0.19, and MAPE = 1.76 for SBP prediction and R² = 95.44%, RMSE = 0.21, and MAPE = 2.78 for DBP prediction in hypertensive subjects using the PCA-LS-SVM model. This assessment presents the importance and advantages posed by hybrid computing models for the prediction of variables in biomedical research studies.
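The three indices used to rank the models can be computed as in the sketch below. It assumes the usual textbook definitions (R² as one minus the ratio of residual to total sum of squares, MAPE in percent); the abstract does not state the exact formulas used, and the blood-pressure values are made up for illustration.

```python
import math

def regression_metrics(actual, predicted):
    """Return (R^2, RMSE, MAPE in percent) under the usual definitions."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mape = 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
    return r2, rmse, mape

# Hypothetical systolic readings (mmHg) and model predictions
actual = [100.0, 110.0, 120.0]
predicted = [102.0, 108.0, 121.0]
r2, rmse, mape = regression_metrics(actual, predicted)
```

Note that the abstract reports R² as a percentage, so a value such as 0.9316 from this function corresponds to the reported 93.16%.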
Sixty-five years of the long march in protein secondary structure prediction: the final stretch?
Yang, Yuedong; Gao, Jianzhao; Wang, Jihua; Heffernan, Rhys; Hanson, Jack; Paliwal, Kuldip; Zhou, Yaoqi
2018-01-01
Protein secondary structure prediction began in 1951 when Pauling and Corey predicted helical and sheet conformations for protein polypeptide backbone even before the first protein structure was determined. Sixty-five years later, powerful new methods breathe new life into this field. The highest three-state accuracy without relying on structure templates is now at 82–84%, a number unthinkable just a few years ago. These improvements came from increasingly larger databases of protein sequences and structures for training, the use of template secondary structure information and more powerful deep learning techniques. As we approach the theoretical limit of three-state prediction (88–90%), alternatives to secondary structure prediction (prediction of backbone torsion angles and Cα-atom-based angles and torsion angles) not only have more room for further improvement but also allow direct prediction of three-dimensional fragment structures with constantly improved accuracy. About 20% of all 40-residue fragments in a database of 1199 non-redundant proteins have <6 Å root-mean-squared distance from the native conformations by SPIDER2. More powerful deep learning methods with improved capability of capturing long-range interactions begin to emerge as the next generation of techniques for secondary structure prediction. The time has come to finish off the final stretch of the long march towards protein secondary structure prediction. PMID:28040746
Pires, RES; Pereira, AA; Abreu-e-Silva, GM; Labronici, PJ; Figueiredo, LB; Godoy-Santos, AL; Kfuri, M
2014-01-01
Background: Foot and ankle injuries are frequent in emergency departments. Although only a few patients with foot and ankle sprain present fractures and the fracture patterns are almost always simple, lack of fracture diagnosis can lead to poor functional outcomes. Aim: The present study aims to evaluate the reliability of the Ottawa ankle rules and the orthopedic surgeon subjective perception to assess foot and ankle fractures after sprains. Subjects and Methods: A cross-sectional study was conducted from July 2012 to December 2012. Ethical approval was granted. Two hundred seventy-four adult patients admitted to the emergency department with foot and/or ankle sprain were evaluated by an orthopedic surgeon who completed a questionnaire prior to radiographic assessment. The Ottawa ankle rules and subjective perception of foot and/or ankle fractures were evaluated on the questionnaire. Results: Thirteen percent (36/274) patients presented fracture. Orthopedic surgeon subjective analysis showed 55.6% sensitivity, 90.1% specificity, 46.5% positive predictive value and 92.9% negative predictive value. The general orthopedic surgeon opinion accuracy was 85.4%. The Ottawa ankle rules presented 97.2% sensitivity, 7.8% specificity, 13.9% positive predictive value, 95% negative predictive value and 19.9% accuracy respectively. Weight-bearing inability was the Ottawa ankle rule item that presented the highest reliability, 69.4% sensitivity, 61.6% specificity, 63.1% accuracy, 21.9% positive predictive value and 93% negative predictive value respectively. Conclusion: The Ottawa ankle rules showed high reliability for deciding when to take radiographs in foot and/or ankle sprains. Weight-bearing inability was the most important isolated item to predict fracture presence. Orthopedic surgeon subjective analysis to predict fracture possibility showed a high specificity rate, representing a confident method to exclude unnecessary radiographic exams. PMID:24971221
2010-01-01
Background The binding of peptide fragments of extracellular peptides to class II MHC is a crucial event in the adaptive immune response. Each MHC allotype generally binds a distinct subset of peptides and the enormous number of possible peptide epitopes prevents their complete experimental characterization. Computational methods can utilize the limited experimental data to predict the binding affinities of peptides to class II MHC. Results We have developed the Regularized Thermodynamic Average, or RTA, method for predicting the affinities of peptides binding to class II MHC. RTA accounts for all possible peptide binding conformations using a thermodynamic average and includes a parameter constraint for regularization to improve accuracy on novel data. RTA was shown to achieve higher accuracy, as measured by AUC, than SMM-align on the same data for all 17 MHC allotypes examined. RTA also gave the highest accuracy on all but three allotypes when compared with results from 9 different prediction methods applied to the same data. In addition, the method correctly predicted the peptide binding register of 17 out of 18 peptide-MHC complexes. Finally, we found that suboptimal peptide binding registers, which are often ignored in other prediction methods, made significant contributions of at least 50% of the total binding energy for approximately 20% of the peptides. Conclusions The RTA method accurately predicts peptide binding affinities to class II MHC and accounts for multiple peptide binding registers while reducing overfitting through regularization. The method has potential applications in vaccine design and in understanding autoimmune disorders. A web server implementing the RTA prediction method is available at http://bordnerlab.org/RTA/. PMID:20089173
Genomic and pedigree-based prediction for leaf, stem, and stripe rust resistance in wheat.
Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Huerta-Espino, Julio; Lan, Caixia; Bhavani, Sridhar; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E
2017-07-01
Genomic prediction for seedling and adult plant resistance to wheat rusts was compared to prediction using few markers as fixed effects in a least-squares approach and pedigree-based prediction. The unceasing plant-pathogen arms race and ephemeral nature of some rust resistance genes have been challenging for wheat (Triticum aestivum L.) breeding programs and farmers. Hence, it is important to devise strategies for effective evaluation and exploitation of quantitative rust resistance. One promising approach that could accelerate gain from selection for rust resistance is 'genomic selection' which utilizes dense genome-wide markers to estimate the breeding values (BVs) for quantitative traits. Our objective was to compare three genomic prediction models including genomic best linear unbiased prediction (GBLUP), GBLUP A that was GBLUP with selected loci as fixed effects and reproducing kernel Hilbert spaces-markers (RKHS-M) with least-squares (LS) approach, RKHS-pedigree (RKHS-P), and RKHS markers and pedigree (RKHS-MP) to determine the BVs for seedling and/or adult plant resistance (APR) to leaf rust (LR), stem rust (SR), and stripe rust (YR). The 333 lines in the 45th IBWSN and the 313 lines in the 46th IBWSN were genotyped using genotyping-by-sequencing and phenotyped in replicated trials. The mean prediction accuracies ranged from 0.31-0.74 for LR seedling, 0.12-0.56 for LR APR, 0.31-0.65 for SR APR, 0.70-0.78 for YR seedling, and 0.34-0.71 for YR APR. For most datasets, the RKHS-MP model gave the highest accuracies, while LS gave the lowest. GBLUP, GBLUP A, RKHS-M, and RKHS-P models gave similar accuracies. Using genome-wide marker-based models resulted in an average of 42% increase in accuracy over LS. We conclude that GS is a promising approach for improvement of quantitative rust resistance and can be implemented in the breeding pipeline.
Bashir, Saba; Qamar, Usman; Khan, Farhan Hassan
2016-02-01
Accuracy plays a vital role in the medical field as it concerns the life of an individual. Extensive research has been conducted on disease classification and prediction using machine learning techniques. However, there is no agreement on which classifier produces the best results. A specific classifier may be better than others for a specific dataset, but another classifier could perform better for some other dataset. Ensembles of classifiers have proved to be an effective way to improve classification accuracy. In this research we present an ensemble framework with multi-layer classification using enhanced bagging and optimized weighting. The proposed model, called "HM-BagMoov", overcomes the limitations of conventional performance bottlenecks by utilizing an ensemble of seven heterogeneous classifiers. The framework is evaluated on five different heart disease datasets, four breast cancer datasets, two diabetes datasets, two liver disease datasets and one hepatitis dataset obtained from public repositories. The analysis of the results shows that the ensemble framework achieved the highest accuracy, sensitivity and F-Measure when compared with individual classifiers for all the diseases. In addition, the ensemble framework also achieved the highest accuracy when compared with state-of-the-art techniques. An application named "IntelliHealth" is also developed based on the proposed model that may be used by hospitals/doctors for diagnostic advice.
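The idea of combining heterogeneous classifiers with weights can be illustrated with a generic weighted majority vote. This is only a sketch of the general technique; it is not the HM-BagMoov procedure, whose bagging and weight-optimisation details are not given in the abstract. The function name, labels, and weights are hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Three hypothetical classifiers: two weaker ones outvote one stronger one
label = weighted_vote(["sick", "healthy", "sick"], [0.2, 0.5, 0.4])
```

In practice the weights would be tuned on held-out data, for example in proportion to each classifier's validation accuracy; that choice is an assumption here rather than something the abstract specifies.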
Prediction of Potential Hit Song and Musical Genre Using Artificial Neural Networks
NASA Astrophysics Data System (ADS)
Monterola, Christopher; Abundo, Cheryl; Tugaff, Jeric; Venturina, Lorcel Ericka
Accurately quantifying the goodness of music based on the seemingly subjective taste of the public is a multi-million industry. Recording companies can make sound decisions on which songs or artists to prioritize if accurate forecasting is achieved. We extract 56 single-valued musical features (e.g. pitch and tempo) from 380 Original Pilipino Music (OPM) songs (190 are hit songs) released from 2004 to 2006. Based on an effect size criterion which measures a variable's discriminating power, the 20 highest ranked features are fed to a classifier tasked to predict hit songs. We show that regardless of musical genre, a trained feed-forward neural network (NN) can predict potential hit songs with an average accuracy of ΦNN = 81%. The accuracy is about +20% higher than those of standard classifiers such as linear discriminant analysis (LDA, ΦLDA = 61%) and classification and regression trees (CART, ΦCART = 57%). Both LDA and CART are above the proportional chance criterion (PCC, ΦPCC = 50%) but are slightly below the suggested acceptable classifier requirement of 1.25*ΦPCC = 63%. Utilizing a similar procedure, we demonstrate that different genres (ballad, alternative rock or rock) of OPM songs can be automatically classified with near perfect accuracy using LDA or NN but only around 77% using CART.
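The chance baseline quoted above can be checked with a few lines. The sketch assumes the usual definition of the proportional chance criterion as the sum of squared class proportions, which the abstract does not spell out; the function name is illustrative.

```python
def proportional_chance_criterion(class_counts):
    """Sum of squared class proportions: expected accuracy of proportional guessing."""
    total = sum(class_counts)
    return sum((count / total) ** 2 for count in class_counts)

# 380 songs, of which 190 are hits and 190 are not
pcc = proportional_chance_criterion([190, 190])
threshold = 1.25 * pcc  # suggested minimum for an acceptable classifier

reported = {"NN": 0.81, "LDA": 0.61, "CART": 0.57}
acceptable = {name: acc >= threshold for name, acc in reported.items()}
```

With balanced classes the criterion is 0.5 and the threshold is 0.625, which the abstract rounds to 63%; only the neural network clears it.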
Alborzi, Saeed; Rasekhi, Alireza; Shomali, Zahra; Madadi, Gooya; Alborzi, Mahshid; Kazemi, Mahboobeh; Hosseini Nohandani, Azam
2018-01-01
To determine the diagnostic accuracy of pelvic magnetic resonance imaging (MRI), transvaginal sonography (TVS), and transrectal sonography (TRS) in diagnosis of deep infiltrating endometriosis (DIE). This diagnostic accuracy study was conducted during a 2-year period including a total number of 317 patients with signs and symptoms of endometriosis. All the patients were evaluated by pelvic MRI, TVS, and TRS in the same center. The criterion standard was considered to be the laparoscopy and histopathologic examination. Of the 317 patients included in the present study, 252 tested positive for DIE. The sensitivity, specificity, positive predictive value, and negative predictive value of TVS were found to be 83.3%, 46.1%, 85.7%, and 41.6%, respectively. These variables were 80.5%, 18.6%, 79.3%, and 19.7% for TRS and 90.4%, 66.1%, 91.2%, and 64.1% for MRI, respectively. MRI had the highest accuracy (85.4%) when compared to TVS (75.7%) and TRS (67.8%). The sensitivity of TRS, TVS, and MRI in uterosacral ligament DIE was 82.8%, 70.9%, and 63.6%, respectively. On the contrary, specificity had a reverse trend, favoring MRI (93.9%, 92.8%, and 89.8% for TVS and TRS, respectively). The results of the present study demonstrated that TVS and TRS have appropriate diagnostic accuracy in diagnosis of DIE comparable to MRI. PMID:29465552
Delay functions in trip assignment for transport planning process
NASA Astrophysics Data System (ADS)
Leong, Lee Vien
2017-10-01
In the transportation planning process, volume-delay and turn-penalty functions are needed in traffic assignment to determine travel time on road network links. The volume-delay function is the delay function describing the speed-flow relationship, while the turn-penalty function is the delay function associated with making a turn at an intersection. The volume-delay function used in this study is the revised Bureau of Public Roads (BPR) function with constant parameters α = 0.8298 and β = 3.361, while the turn-penalty functions for signalized intersections were developed based on uniform, random and overflow delay models. Parameters such as green time, cycle time and saturation flow were used in the development of the turn-penalty functions. In order to assess the accuracy of the delay functions, the road network in the areas of Nibong Tebal, Penang and Parit Buntar, Perak was developed and modelled using transportation demand forecasting software. In order to calibrate the models, phase times and traffic volumes at fourteen signalised intersections within the study area were collected during morning and evening peak hours. The assigned volumes predicted using the revised BPR function and the developed turn-penalty functions show close agreement with the actual recorded traffic volumes, with the lowest percentage of accuracy being 80.08% and the highest 93.04% for the morning peak model. For the evening peak model, the lowest and highest percentages of accuracy were 75.59% and 95.33%, respectively. For the yield left-turn lanes, the lowest percentages of accuracy obtained for the morning and evening peak models were 60.94% and 69.74%, respectively, while the highest percentage of accuracy obtained for both models was 100%. Therefore, it can be concluded that the development and utilisation of delay functions based on local road conditions are important, as localised delay functions can produce better estimates of link travel times and hence better planning for future scenarios.
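A volume-delay function with the calibrated parameters above can be sketched as follows. The sketch assumes the standard BPR form, t = t0 · (1 + α · (v/c)^β); the abstract gives only α and β, so the functional form, the function name, and the link values are assumptions for illustration.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.8298, beta=3.361):
    """Link travel time under the standard BPR form (assumed, not stated in the abstract).

    free_flow_time: travel time on the empty link (any time unit)
    volume, capacity: flow and capacity in the same unit, e.g. vehicles per hour
    """
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)

# Hypothetical link: 2 minutes at free flow, loaded to half of its capacity
half_loaded = bpr_travel_time(free_flow_time=2.0, volume=900.0, capacity=1800.0)
# Same link loaded exactly to capacity
at_capacity = bpr_travel_time(free_flow_time=2.0, volume=1800.0, capacity=1800.0)
```

Under this form α is the fractional delay added at capacity, so the calibrated value implies travel time about 83% above free flow when volume equals capacity, and the large β keeps delay small until the link is heavily loaded.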
Bernecker, Samantha L; Rosellini, Anthony J; Nock, Matthew K; Chiu, Wai Tat; Gutierrez, Peter M; Hwang, Irving; Joiner, Thomas E; Naifeh, James A; Sampson, Nancy A; Zaslavsky, Alan M; Stein, Murray B; Ursano, Robert J; Kessler, Ronald C
2018-04-03
High rates of mental disorders, suicidality, and interpersonal violence early in the military career have raised interest in implementing preventive interventions with high-risk new enlistees. The Army Study to Assess Risk and Resilience in Servicemembers (STARRS) developed risk-targeting systems for these outcomes based on machine learning methods using administrative data predictors. However, administrative data omit many risk factors, raising the question whether risk targeting could be improved by adding self-report survey data to prediction models. If so, the Army may gain from routinely administering surveys that assess additional risk factors. The STARRS New Soldier Survey was administered to 21,790 Regular Army soldiers who agreed to have survey data linked to administrative records. As reported previously, machine learning models using administrative data as predictors found that small proportions of high-risk soldiers accounted for high proportions of negative outcomes. Other machine learning models using self-report survey data as predictors were developed previously for three of these outcomes: major physical violence and sexual violence perpetration among men and sexual violence victimization among women. Here we examined the extent to which this survey information increases prediction accuracy, over models based solely on administrative data, for those three outcomes. We used discrete-time survival analysis to estimate a series of models predicting first occurrence, assessing how model fit improved and concentration of risk increased when adding the predicted risk score based on survey data to the predicted risk score based on administrative data. The addition of survey data improved prediction significantly for all outcomes. 
In the most extreme case, the percentage of reported sexual violence victimization among the 5% of female soldiers with highest predicted risk increased from 17.5% using only administrative predictors to 29.4% adding survey predictors, a 67.9% proportional increase in prediction accuracy. Other proportional increases in concentration of risk ranged from 4.8% to 49.5% (median = 26.0%). Data from an ongoing New Soldier Survey could substantially improve accuracy of risk models compared to models based exclusively on administrative predictors. Depending upon the characteristics of interventions used, the increase in targeting accuracy from survey data might offset survey administration costs.
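The "proportional increase" quoted above is a relative change between two concentration-of-risk percentages. A minimal sketch of that arithmetic follows, using the rounded figures from the abstract; the function name is illustrative and not taken from the study. With these rounded inputs the result is about 68%, so the reported 67.9% was presumably computed from unrounded values.

```python
def proportional_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Rounded percentages quoted in the abstract (top 5% of predicted risk).
admin_only = 17.5         # administrative predictors only
admin_plus_survey = 29.4  # administrative plus survey predictors

print(round(proportional_increase(admin_only, admin_plus_survey), 1))
```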
Technow, Frank; Schrag, Tobias A; Schipprack, Wolfgang; Bauer, Eva; Simianer, Henner; Melchinger, Albrecht E
2014-08-01
Maize (Zea mays L.) serves as model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill-Robertson effect and support the hypothesis of differential fixation of alleles due to pseudo-overdominance in these regions. In pericentromeric regions we also found indications for consistent marker-QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest, if both parents were parents of other hybrids in the training set, and lowest, if none of them were involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in test crosses. Copyright © 2014 by the Genetics Society of America.
Prediction of Dementia in Primary Care Patients
Jessen, Frank; Wiese, Birgitt; Bickel, Horst; Eiffländer-Gorfer, Sandra; Fuchs, Angela; Kaduszkiewicz, Hanna; Köhler, Mirjam; Luck, Tobias; Mösch, Edelgard; Pentzek, Michael; Riedel-Heller, Steffi G.; Wagner, Michael; Weyerer, Siegfried; Maier, Wolfgang; van den Bussche, Hendrik
2011-01-01
Background: Current approaches for AD prediction are based on biomarkers, which are, however, of restricted availability in primary care. AD prediction tools for primary care are therefore needed. We present a prediction score based on information that can be obtained in the primary care setting. Methodology/Principal Findings: We performed a longitudinal cohort study in 3,055 non-demented individuals above 75 years recruited via primary care chart registries (Study on Aging, Cognition and Dementia, AgeCoDe). After the baseline investigation we performed three follow-up investigations at 18-month intervals with incident dementia as the primary outcome. The best set of predictors was extracted from the baseline variables in one randomly selected half of the sample. This set included age, subjective memory impairment, performance on delayed verbal recall and verbal fluency, on the Mini-Mental-State-Examination, and on an instrumental activities of daily living scale. These variables were aggregated to a prediction score, which achieved a prediction accuracy of 0.84 for AD. The score was applied to the second half of the sample (test cohort). Here, the prediction accuracy was 0.79. With a cut-off of at least 80% sensitivity in the first cohort, 79.6% sensitivity, 66.4% specificity, 14.7% positive predictive value (PPV) and 97.8% negative predictive value (NPV) for AD were achieved in the test cohort. At a cut-off for a high risk population (5% of individuals with the highest risk score in the first cohort) the PPV for AD was 39.1% (52% for any dementia) in the test cohort. Conclusions: The prediction score has useful prediction accuracy. It can define individuals (1) sensitively for low cost-low risk interventions, or (2) more specifically and with increased PPV for measures of prevention with greater costs or risks. As it is independent of technical aids, it may be used within large scale prevention programs. PMID:21364746
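The gap between a high NPV and a low PPV in the record above follows from Bayes' rule when the outcome is rare. The sketch below shows that relationship. The prevalence of 6.8% is an assumption chosen for illustration (the abstract does not state it); with that value the quoted sensitivity and specificity give predictive values close to the reported 14.7% and 97.8%.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are from the abstract; prevalence is ASSUMED.
ppv, npv = predictive_values(sensitivity=0.796, specificity=0.664, prevalence=0.068)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Raising the assumed prevalence in the call raises the PPV, which is consistent with the higher PPV the abstract reports for the high-risk cut-off.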
Prediction of dementia in primary care patients.
Jessen, Frank; Wiese, Birgitt; Bickel, Horst; Eiffländer-Gorfer, Sandra; Fuchs, Angela; Kaduszkiewicz, Hanna; Köhler, Mirjam; Luck, Tobias; Mösch, Edelgard; Pentzek, Michael; Riedel-Heller, Steffi G; Wagner, Michael; Weyerer, Siegfried; Maier, Wolfgang; van den Bussche, Hendrik
2011-02-18
Current approaches for AD prediction are based on biomarkers, which are, however, of restricted availability in primary care. AD prediction tools for primary care are therefore needed. We present a prediction score based on information that can be obtained in the primary care setting. We performed a longitudinal cohort study in 3,055 non-demented individuals above 75 years recruited via primary care chart registries (Study on Aging, Cognition and Dementia, AgeCoDe). After the baseline investigation we performed three follow-up investigations at 18-month intervals with incident dementia as the primary outcome. The best set of predictors was extracted from the baseline variables in one randomly selected half of the sample. This set included age, subjective memory impairment, performance on delayed verbal recall and verbal fluency, on the Mini-Mental-State-Examination, and on an instrumental activities of daily living scale. These variables were aggregated to a prediction score, which achieved a prediction accuracy of 0.84 for AD. The score was applied to the second half of the sample (test cohort). Here, the prediction accuracy was 0.79. With a cut-off of at least 80% sensitivity in the first cohort, 79.6% sensitivity, 66.4% specificity, 14.7% positive predictive value (PPV) and 97.8% negative predictive value (NPV) for AD were achieved in the test cohort. At a cut-off for a high risk population (5% of individuals with the highest risk score in the first cohort) the PPV for AD was 39.1% (52% for any dementia) in the test cohort. The prediction score has useful prediction accuracy. It can define individuals (1) sensitively for low cost-low risk interventions, or (2) more specifically and with increased PPV for measures of prevention with greater costs or risks. As it is independent of technical aids, it may be used within large scale prevention programs.
External validation of EPIWIN biodegradation models.
Posthumus, R; Traas, T P; Peijnenburg, W J G M; Hulzebos, E M
2005-01-01
The BIOWIN biodegradation models were evaluated for their suitability for regulatory purposes. BIOWIN includes the linear and non-linear BIODEG and MITI models for estimating the probability of rapid aerobic biodegradation and an expert survey model for primary and ultimate biodegradation estimation. Experimental biodegradation data for 110 newly notified substances were compared with the estimations of the different models. The models were applied separately and in combinations to determine which model(s) showed the best performance. The results of this study were compared with the results of other validation studies and other biodegradation models. The BIOWIN models predict not-readily biodegradable substances with high accuracy in contrast to ready biodegradability. In view of the high environmental concern of persistent chemicals and in view of the large number of not-readily biodegradable chemicals compared to the readily biodegradable ones, a model is preferred that gives a minimum of false positives without a corresponding high percentage of false negatives. A combination of the BIOWIN models (BIOWIN2 or BIOWIN6) showed the highest predictive value for not-readily biodegradability. However, the highest score for overall predictivity with the lowest percentage of false predictions was achieved by applying BIOWIN3 (pass level 2.75) and BIOWIN6.
Fisher, Moria E; Huang, Felix C; Wright, Zachary A; Patton, James L
2014-01-01
Manipulation of error feedback has been of great interest to recent studies in motor control and rehabilitation. Typically, motor adaptation is shown as a change in performance with a single scalar metric for each trial, yet such an approach might overlook details about how error evolves through the movement. We believe that statistical distributions of movement error through the extent of the trajectory can reveal unique patterns of adaption and possibly reveal clues to how the motor system processes information about error. This paper describes different possible ordinate domains, focusing on representations in time and state-space, used to quantify reaching errors. We hypothesized that the domain with the lowest amount of variability would lead to a predictive model of reaching error with the highest accuracy. Here we showed that errors represented in a time domain demonstrate the least variance and allow for the highest predictive model of reaching errors. These predictive models will give rise to more specialized methods of robotic feedback and improve previous techniques of error augmentation.
A Demand-Driven Approach for a Multi-Agent System in Supply Chain Management
NASA Astrophysics Data System (ADS)
Kovalchuk, Yevgeniya; Fasli, Maria
This paper presents the architecture of a multi-agent decision support system for Supply Chain Management (SCM) which has been designed to compete in the TAC SCM game. The behaviour of the system is demand-driven and the agents plan, predict, and react dynamically to changes in the market. The main strength of the system lies in the ability of the Demand agent to predict customer winning bid prices - the highest prices the agent can offer customers and still obtain their orders. This paper investigates the effect of the ability to predict customer order prices on the overall performance of the system. Four strategies are proposed and compared for predicting such prices. The experimental results reveal which strategies are better and show that there is a correlation between the accuracy of the models' predictions and the overall system performance: the more accurate the prediction of customer order prices, the higher the profit.
In silico models for predicting ready biodegradability under REACH: a comparative study.
Pizzo, Fabiola; Lombardo, Anna; Manganaro, Alberto; Benfenati, Emilio
2013-10-01
REACH (Registration, Evaluation, Authorisation and Restriction of Chemicals) legislation is a new European law which aims to raise the human protection level and environmental health. Under REACH all chemicals manufactured or imported for more than one ton per year must be evaluated for their ready biodegradability. Ready biodegradability is also used as a screening test for persistent, bioaccumulative and toxic (PBT) substances. REACH encourages the use of non-testing methods such as QSAR (quantitative structure-activity relationship) models in order to save money and time and to reduce the number of animals used for scientific purposes. Some QSAR models are available for predicting ready biodegradability. We used a dataset of 722 compounds to test four models (VEGA, TOPKAT, BIOWIN 5 and 6, and START) and compared their performance on the basis of the following parameters: accuracy, sensitivity, specificity and Matthews correlation coefficient (MCC). Performance was analyzed from different points of view. The first calculation was done on the whole dataset, and VEGA and TOPKAT gave the best accuracy (88% and 87% respectively). Then we considered the compounds inside and outside the training set: BIOWIN 6 and 5 gave the best results for accuracy (81%) outside the training set. Another analysis examined the applicability domain (AD). VEGA had the highest value for compounds inside the AD for all the parameters taken into account. Finally, compounds outside the training set and in the AD of the models were considered to assess predictive ability. VEGA gave the best accuracy results (99%) for this group of chemicals. Generally, the START model gave poor results. Since the BIOWIN, TOPKAT and VEGA models performed well, they may be used to predict ready biodegradability. Copyright © 2013 Elsevier B.V. All rights reserved.
Reduced fMRI activity predicts relapse in patients recovering from stimulant dependence.
Clark, Vincent P; Beatty, Gregory K; Anderson, Robert E; Kodituwakku, Piyadassa; Phillips, John P; Lane, Terran D R; Kiehl, Kent A; Calhoun, Vince D
2014-02-01
Relapse presents a significant problem for patients recovering from stimulant dependence. Here we examined the hypothesis that patterns of brain function obtained at an early stage of abstinence differentiate patients who later relapse versus those who remain abstinent. Forty-five recently abstinent stimulant-dependent patients were tested using a randomized event-related functional MRI (ER-fMRI) design that was developed in order to replicate a previous ERP study of relapse using a selective attention task, and were then monitored until 6 months of verified abstinence or stimulant use occurred. SPM revealed smaller absolute blood oxygen level-dependent (BOLD) response amplitude in bilateral ventral posterior cingulate and right insular cortex in 23 patients positive for relapse to stimulant use compared with 22 who remained abstinent. ER-fMRI, psychiatric, neuropsychological, demographic, personal and family history of drug use were compared in order to form predictive models. ER-fMRI was found to predict abstinence with higher accuracy than any other single measure obtained in this study. Logistic regression using fMRI amplitude in right posterior cingulate and insular cortex predicted abstinence with 77.8% accuracy, which increased to 89.9% accuracy when history of mania was included. Using 10-fold cross-validation, Bayesian logistic regression and multilayer perceptron algorithms provided the highest accuracy of 84.4%. These results, combined with previous studies, suggest that the functional organization of paralimbic brain regions including ventral anterior and posterior cingulate and right insula is related to patients' ability to maintain abstinence. Novel therapies designed to target these paralimbic regions identified using ER-fMRI may improve treatment outcome. Copyright © 2012 Wiley Periodicals, Inc.
Erbe, Malena; Gredler, Birgit; Seefried, Franz Reinhold; Bapst, Beat; Simianer, Henner
2013-01-01
Prediction of genomic breeding values is of major practical relevance in dairy cattle breeding. Deterministic equations have been suggested to predict the accuracy of genomic breeding values in a given design which are based on training set size, reliability of phenotypes, and the number of independent chromosome segments ([Formula: see text]). The aim of our study was to find a general deterministic equation for the average accuracy of genomic breeding values that also accounts for marker density and can be fitted empirically. Two data sets of 5'698 Holstein Friesian bulls genotyped with 50 K SNPs and 1'332 Brown Swiss bulls genotyped with 50 K SNPs and imputed to ∼600 K SNPs were available. Different k-fold (k = 2-10, 15, 20) cross-validation scenarios (50 replicates, random assignment) were performed using a genomic BLUP approach. A maximum likelihood approach was used to estimate the parameters of different prediction equations. The highest likelihood was obtained when using a modified form of the deterministic equation of Daetwyler et al. (2010), augmented by a weighting factor (w) based on the assumption that the maximum achievable accuracy is [Formula: see text]. The proportion of genetic variance captured by the complete SNP sets ([Formula: see text]) was 0.76 to 0.82 for Holstein Friesian and 0.72 to 0.75 for Brown Swiss. When modifying the number of SNPs, w was found to be proportional to the log of the marker density up to a limit which is population and trait specific and was found to be reached with ∼20'000 SNPs in the Brown Swiss population studied.
Determination of sex from various hand dimensions of Koreans.
Jee, Soo-Chan; Bahn, Sangwoo; Yun, Myung Hwan
2015-12-01
In the case of disasters or crime scenes, forensic anthropometric methods have been utilized as a reliable way to quickly confirm the identification of victims using only a few parts of the body. A total of 321 measurement data (from 167 males and 154 females) were analyzed to investigate the suitability of detailed hand dimensions as discriminators of sex. A total of 29 variables including length, breadth, thickness, and circumference of fingers, palm, and wrist were measured. The obtained data were analyzed using descriptive statistics and t-test. The accuracy of sex indication from the hand dimensions data was found using discriminant analysis. The age effect and interaction effect according to age and sex on hand dimensions were analyzed by ANOVA. The prediction accuracy on a wide age range was also compared. According to the results, the maximum hand circumference showed the highest accuracy of 88.6% for predicting sex for males and 89.6% for females. Although the breadth, circumference, and thickness of hand parts generally showed higher accuracy than the lengths of hand parts in predicting the sex of the participant, the breadth and circumference of some finger joints showed a significant difference according to age and gender. Thus, the dimensions of hand parts which are not affected by age or gender, such as hand length, palm length, hand breadth, and maximum hand thickness, are recommended to be used first in sex determination for a wide age range group. The results suggest that the detailed hand dimensions can also be used to identify sex for better accuracy; however, the aging effects need to be considered in estimating aged suspects. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Wong, Kenneth Pak Leung; Han, Audrey XinYun; Wong, Jeannie Leh Ying; Lee, Dave Yee Han
2017-02-01
The accuracy of magnetic resonance (MR) imaging in assessing meniscal and cartilage injuries in anterior cruciate ligament (ACL)-deficient knees as compared to arthroscopy was evaluated in the present study. The results of all preoperative MR imaging performed within 3 months prior to the ACL reconstruction were compared against intraoperative arthroscopic findings. A total of 206 patients were identified. The location and type of meniscal injuries as well as the location and grade of the cartilage injuries were studied. The negative predictive value, positive predictive value, sensitivity, specificity and accuracy of MR imaging for these 206 cases were calculated and analysed. In patients with an ACL injury, the highest incidence of concomitant injury was that of medial meniscus tears, 124 (60.2 %), followed by lateral meniscus tears, 105 (51.0 %), and cartilage injuries, 66 (32.0 %). Twenty-three (11.2 %) patients sustained injuries to all of the previously named structures. MR imaging was most accurate in detecting medial meniscus tears (85.9 %). MR imaging for medial meniscus tears also had the highest sensitivity (88.0 %) and positive predictive value (88.7 %), while MR imaging for cartilage injuries had the largest specificity (84.1 %) and negative predictive value (87.1 %). It was least accurate in evaluating lateral meniscus tears (74.3 %). The diagnostic accuracy of medial meniscus imaging is significantly influenced by age and the presence of lateral meniscus tears, while the duration between MR imaging and surgery has greater impact on the likelihood of lateral meniscus and cartilage injuries actually being present during surgery. The majority of meniscus tears missed by MR imaging affected the posterior horn and were complex in nature. Cartilage injuries affecting the medial femoral condyle or medial patella facet were also often missed by MR imaging. MR imaging remains a reliable tool for assessing meniscus tears and cartilage defects preoperatively. 
It is most accurate when evaluating medial meniscus tears. However, MR imaging should be used with discretion especially if there is a high index of suspicion of lateral meniscus tears. IV.
Lee, Sunghoon; Lee, Byungwook; Jang, Insoo; Kim, Sangsoo; Bhak, Jong
2006-01-01
The Localizome server predicts the transmembrane (TM) helix number and TM topology of a user-supplied eukaryotic protein and presents the result as an intuitive graphic representation. It utilizes hmmpfam to detect the presence of Pfam domains and a prediction algorithm, Phobius, to predict the TM helices. The results are combined and checked against the TM topology rules stored in a protein domain database called LocaloDom. LocaloDom is a curated database that contains TM topologies and TM helix numbers of known protein domains. It was constructed from Pfam domains combined with Swiss-Prot annotations and Phobius predictions. The Localizome server corrects the combined results of the user sequence to conform to the rules stored in LocaloDom. Compared with other programs, this server showed the highest accuracy for TM topology prediction: for soluble proteins, the accuracy and coverage were 99 and 75%, respectively, while for TM protein domain regions, they were 96 and 68%, respectively. With a graphical representation of TM topology and TM helix positions with the domain units, the Localizome server is a highly accurate and comprehensive information source for subcellular localization for soluble proteins as well as membrane proteins. The Localizome server can be found at . PMID:16845118
Liu, Qianying; Lei, Zhixin; Zhu, Feng; Ihsan, Awais; Wang, Xu; Yuan, Zonghui
2017-01-01
Genotoxicity and carcinogenicity testing of pharmaceuticals prior to commercialization is requested by regulatory agencies. The bacterial mutagenicity test was considered to have the highest accuracy of carcinogenic prediction. However, some evidence suggests that it always results in false-positive responses when the bacterial mutagenicity test is used to predict carcinogenicity. Along with major changes made to the International Committee on Harmonization guidance on genotoxicity testing [S2 (R1)], the old data (especially the cytogenetic data) may not meet current guidelines. This review provides a compendium of retrievable results of genotoxicity and animal carcinogenicity of 136 antiparasitics. Neither genotoxicity nor carcinogenicity data are available for 84 (61.8%), while 52 (38.2%) have been evaluated in at least one genotoxicity or carcinogenicity study, and only 20 (14.7%) in both genotoxicity and carcinogenicity studies. Among 33 antiparasitics with at least one old result in in vitro genotoxicity, 15 (45.5%) are in agreement with the current ICH S2 (R1) guidance for data acceptance. Compared with other genotoxicity assays, the DNA lesions can significantly increase the accuracy of prediction of carcinogenicity. Together, a combination of DNA lesion and bacterial tests is a more accurate way to predict carcinogenicity. PMID:29170735
3D Cloud Field Prediction using A-Train Data and Machine Learning Techniques
NASA Astrophysics Data System (ADS)
Johnson, C. L.
2017-12-01
Validation of cloud process parameterizations used in global climate models (GCMs) would greatly benefit from observed 3D cloud fields at the size comparable to that of a GCM grid cell. For the highest resolution simulations, surface grid cells are on the order of 100 km by 100 km. CloudSat/CALIPSO data provides 1 km width of detailed vertical cloud fraction profile (CFP) and liquid and ice water content (LWC/IWC). This work utilizes four machine learning algorithms to create nonlinear regressions of CFP, LWC, and IWC data using radiances, surface type and location of measurement as predictors and applies the regression equations to off-track locations generating 3D cloud fields for 100 km by 100 km domains. The CERES-CloudSat-CALIPSO-MODIS (C3M) merged data set for February 2007 is used. Support Vector Machines, Artificial Neural Networks, Gaussian Processes and Decision Trees are trained on 1000 km of continuous C3M data. Accuracy is computed using existing vertical profiles that are excluded from the training data and occur within 100 km of the training data. Accuracy of the four algorithms is compared. Average accuracy for one day of predicted data is 86% for the most successful algorithm. The methodology for training the algorithms, determining valid prediction regions and applying the equations off-track is discussed. Predicted 3D cloud fields are provided as inputs to the Ed4 NASA LaRC Fu-Liou radiative transfer code and resulting TOA radiances compared to observed CERES/MODIS radiances. Differences in computed radiances using predicted profiles and observed radiances are compared.
Yan, Long; Wang, Hong; Zhang, Xuan; Li, Ming-Yue; He, Juan
2017-01-01
The influence of meteorological variables on the transmission of bacillary dysentery (BD) is an under-investigated topic, and effective forecasting models as public health tools are lacking. This paper aimed to quantify the relationship between meteorological variables and BD cases in Beijing and to establish an effective forecasting model. A time series analysis was conducted in the Beijing area based upon monthly data on weather variables (i.e. temperature, rainfall, relative humidity, vapor pressure, and wind speed) and on the number of BD cases during the period 1970-2012. Autoregressive integrated moving average models with explanatory variables (ARIMAX) were built based on the data from 1970 to 2004. Prediction of monthly BD cases from 2005 to 2012 was made using the established models. The prediction accuracy was evaluated by the mean square error (MSE). Firstly, temperature with 2-month and 7-month lags and rainfall with 12-month lag were found to be positively correlated with the number of BD cases in Beijing. Secondly, the ARIMAX model with covariates of temperature with 7-month lag (β = 0.021, 95% confidence interval (CI): 0.004-0.038) and rainfall with 12-month lag (β = 0.023, 95% CI: 0.009-0.037) displayed the highest prediction accuracy. The ARIMAX model developed in this study showed a good fit and precise short-term predictions, which would be beneficial for government departments to take early public health measures to prevent and control possible BD epidemics.
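The evaluation design in the record above (fit on 1970-2004, forecast 2005-2012, score by MSE) can be sketched as a split by year followed by an error calculation. The sketch covers only that split and the MSE; it does not fit an ARIMAX model, and the record layout and all numbers are made up for illustration.

```python
def split_by_year(records, last_training_year):
    """Split (year, month, cases) records into training and hold-out sets."""
    train = [r for r in records if r[0] <= last_training_year]
    test = [r for r in records if r[0] > last_training_year]
    return train, test


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical monthly case counts: (year, month, cases).
records = [(2003, 7, 120), (2004, 7, 110), (2005, 7, 95), (2006, 7, 90)]
train, test = split_by_year(records, last_training_year=2004)

observed = [cases for _, _, cases in test]
predicted = [98, 86]  # stand-in forecasts; a fitted model would supply these
print(len(train), len(test), mean_squared_error(observed, predicted))
```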
Liu, Guo-Ping; Yan, Jian-Jun; Wang, Yi-Qin; Fu, Jing-Jing; Xu, Zhao-Xia; Guo, Rui; Qian, Peng
2012-01-01
Background. In Traditional Chinese Medicine (TCM), most of the algorithms are used to solve problems of syndrome diagnosis that only focus on one syndrome, that is, single label learning. However, in clinical practice, patients may simultaneously have more than one syndrome, which has its own symptoms (signs). Methods. We employed a multilabel learning using the relevant feature for each label (REAL) algorithm to construct a syndrome diagnostic model for chronic gastritis (CG) in TCM. REAL combines feature selection methods to select the significant symptoms (signs) of CG. The method was tested on 919 patients using the standard scale. Results. The highest prediction accuracy was achieved when 20 features were selected. The features selected with the information gain were more consistent with the TCM theory. The lowest average accuracy was 54% using multi-label neural networks (BP-MLL), whereas the highest was 82% using REAL for constructing the diagnostic model. For coverage, hamming loss, and ranking loss, the values obtained using the REAL algorithm were the lowest at 0.160, 0.142, and 0.177, respectively. Conclusion. REAL extracts the relevant symptoms (signs) for each syndrome and improves its recognition accuracy. Moreover, the studies will provide a reference for constructing syndrome diagnostic models and guide clinical practice. PMID:22719781
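Hamming loss, one of the multilabel measures reported above, is the fraction of patient-syndrome label slots that are predicted incorrectly, so lower is better. The sketch below uses that standard definition; the label matrices are hypothetical and unrelated to the study data.

```python
def hamming_loss(true_labels, predicted_labels):
    """Fraction of individual 0/1 labels that differ between truth and prediction."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(true_labels, predicted_labels):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


# Two hypothetical patients, three candidate syndromes each (1 = present).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 1], [0, 1, 1]]
print(hamming_loss(y_true, y_pred))  # 2 of the 6 labels differ
```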
Deep Learning Accurately Predicts Estrogen Receptor Status in Breast Cancer Metabolomics Data.
Alakwaa, Fadhl M; Chaudhary, Kumardeep; Garmire, Lana X
2018-01-05
Metabolomics holds the promise as a new technology to diagnose highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis is done using various statistical and machine learning based classification methods. However, it remains unknown if deep neural network, a class of increasingly popular machine learning methods, is suitable to classify metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 positive estrogen receptor (ER+), and 67 negative estrogen receptor (ER-) to test the accuracies of feed-forward networks, a deep learning (DL) framework, as well as six widely used machine learning models, namely random forest (RF), support vector machines (SVM), recursive partitioning and regression trees (RPART), linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), and generalized boosted models (GBM). DL framework has the highest area under the curve (AUC) of 0.93 in classifying ER+/ER- patients, compared to the other six machine learning algorithms. Furthermore, the biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value <0.05) that cannot be discovered by other machine learning methods. Among them, protein digestion and absorption and ATP-binding cassette (ABC) transporters pathways are also confirmed in integrated analysis between metabolomics and gene expression data in these samples. In summary, deep learning method shows advantages for metabolomics based breast cancer ER status classification, with both the highest prediction accuracy (AUC = 0.93) and better revelation of disease biology. We encourage the adoption of feed-forward networks based deep learning method in the metabolomics research community for classification.
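The AUC used above to compare classifiers can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition, counting ties as half; the scores are invented for illustration and this is not the authors' evaluation code.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for ER+ (positive) and ER- (negative) samples.
er_positive = [0.9, 0.8, 0.4]
er_negative = [0.7, 0.3]
print(auc_from_scores(er_positive, er_negative))  # 5 of 6 pairs ranked correctly
```

Because AUC depends only on ranking, it is less sensitive than raw accuracy to the class imbalance in a cohort such as 204 ER+ versus 67 ER- samples.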
NASA Astrophysics Data System (ADS)
Castillo, Jose Alan A.; Apan, Armando A.; Maraseni, Tek N.; Salmo, Severino G.
2017-12-01
The recent launch of the Sentinel-1 (SAR) and Sentinel-2 (multispectral) missions offers a new opportunity for land-based biomass mapping and monitoring especially in the tropics where deforestation is highest. Yet, unlike in agriculture and inland land uses, the use of Sentinel imagery has not been evaluated for biomass retrieval in mangrove forest and the non-forest land uses that replaced mangroves. In this study, we evaluated the ability of Sentinel imagery for the retrieval and predictive mapping of above-ground biomass of mangroves and their replacement land uses. We used Sentinel SAR and multispectral imagery to develop biomass prediction models through the conventional linear regression and novel Machine Learning algorithms. We developed models each from SAR raw polarisation backscatter data, multispectral bands, vegetation indices, and canopy biophysical variables. The results show that the model based on biophysical variable Leaf Area Index (LAI) derived from Sentinel-2 was more accurate in predicting the overall above-ground biomass. In contrast, the model which utilised optical bands had the lowest accuracy. However, the SAR-based model was more accurate in predicting the biomass in the usually deficient to low vegetation cover non-forest replacement land uses such as abandoned aquaculture pond, cleared mangrove and abandoned salt pond. These models had 0.82-0.83 correlation/agreement of observed and predicted value, and root mean square error of 27.8-28.5 Mg ha-1. Among the Sentinel-2 multispectral bands, the red and red edge bands (bands 4, 5 and 7), combined with elevation data, were the best variable set combination for biomass prediction. The red edge-based Inverted Red-Edge Chlorophyll Index had the highest prediction accuracy among the vegetation indices. 
Overall, Sentinel-1 SAR and Sentinel-2 multispectral imagery can provide satisfactory results in the retrieval and predictive mapping of the above-ground biomass of mangroves and the replacement non-forest land uses, especially with the inclusion of elevation data. The study demonstrates encouraging results in biomass mapping of mangroves and other coastal land uses in the tropics using the freely accessible and relatively high-resolution Sentinel imagery.
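The two agreement figures quoted in this abstract (correlation of observed and predicted values, and root mean square error) can be sketched as follows. This is a minimal illustration in plain Python; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(observed, predicted):
    """Pearson linear correlation between paired observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_o * ss_p)

# Hypothetical above-ground biomass values in Mg/ha (illustrative only).
observed_biomass = [12.0, 40.0, 85.0, 130.0]
predicted_biomass = [20.0, 35.0, 90.0, 110.0]
print(rmse(observed_biomass, predicted_biomass))
print(pearson_r(observed_biomass, predicted_biomass))
```

RMSE is expressed in the same units as the biomass values, whereas the correlation is unitless, which is why abstracts of this kind usually report both.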
Diagnostic accuracy of [99mTc]Tc-Sestamibi in the assessment of thyroid nodules
Yordanova, Anna; Mahjoob, Soha; Lingohr, Philipp; Kalff, Jörg; Türler, Andreas; Palmedo, Holger; Biersack, Hans-Jürgen; Kristiansen, Glen; Farahati, Jamshid; Essler, Markus; Ahmadzadehfar, Hojjat
2017-01-01
[99mTc]Tc-Sestamibi (MIBI) is an increasingly used tool for evaluation of thyroid nodules. However, there is a lack of evidence about the accuracy of this method in the European population. The aim of this study was to assess the utility of MIBI for the differentiation of thyroid nodules in a large cohort. 161 patients underwent MIBI, followed by a thyroidectomy. We used a dual phase MIBI protocol. Interpretation of the images included a scoring system from 0 (absent) to 3 (increased); this was to provide a scale for the uptake of the thyroid nodule in comparison to the paranodular tissue. Additionally, we evaluated the tracer uptake trend in late images compared to early images. We used the final histopathology as the reference standard. Scores 0-1 in early images, scores 0-2 in late images, and an absence of increasing uptake in the thyroid nodule in late images, showed the best predictive values to exclude malignancy, respectively (negative predictive value (NPV) 89%). Highest sensitivity (91%) for malignant nodules was evident in early images with a score 1-3. Highest specificity (91%) was obtained when the negative was defined as an absence of uptake-increase, in the late images. This study confirms that the most valuable feature of MIBI is the high NPV. Thus, with the appropriate interpretation method, high sensitivity and specificity, and moderate PPV can be obtained. PMID:29212258
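Sensitivity, specificity, PPV and NPV as reported here all derive from a 2x2 table of test result against the histopathological reference standard. A minimal sketch, assuming the four cell counts are available; the counts below are invented for illustration and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of diseased cases detected
        "specificity": tn / (tn + fp),   # share of non-diseased cases cleared
        "ppv": tp / (tp + fp),           # probability of disease given a positive test
        "npv": tn / (tn + fn),           # probability of no disease given a negative test
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts (not from the study).
metrics = diagnostic_metrics(tp=18, fp=12, fn=2, tn=68)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the NPV is high while the PPV is only moderate, the same qualitative pattern the abstract describes for a condition with low prevalence.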
Diagnostic accuracy of [99mTc]Tc-Sestamibi in the assessment of thyroid nodules.
Yordanova, Anna; Mahjoob, Soha; Lingohr, Philipp; Kalff, Jörg; Türler, Andreas; Palmedo, Holger; Biersack, Hans-Jürgen; Kristiansen, Glen; Farahati, Jamshid; Essler, Markus; Ahmadzadehfar, Hojjat
2017-11-07
[99mTc]Tc-Sestamibi (MIBI) is an increasingly used tool for evaluation of thyroid nodules. However, there is a lack of evidence about the accuracy of this method in the European population. The aim of this study was to assess the utility of MIBI for the differentiation of thyroid nodules in a large cohort. 161 patients underwent MIBI, followed by a thyroidectomy. We used a dual phase MIBI protocol. Interpretation of the images included a scoring system from 0 (absent) to 3 (increased); this was to provide a scale for the uptake of the thyroid nodule in comparison to the paranodular tissue. Additionally, we evaluated the tracer uptake trend in late images compared to early images. We used the final histopathology as the reference standard. Scores 0-1 in early images, scores 0-2 in late images, and an absence of increasing uptake in the thyroid nodule in late images, showed the best predictive values to exclude malignancy, respectively (negative predictive value (NPV) 89%). Highest sensitivity (91%) for malignant nodules was evident in early images with a score 1-3. Highest specificity (91%) was obtained when the negative was defined as an absence of uptake-increase, in the late images. This study confirms that the most valuable feature of MIBI is the high NPV. Thus, with the appropriate interpretation method, high sensitivity and specificity, and moderate PPV can be obtained.
Hussain, Lal; Ahmed, Adeel; Saeed, Sharjil; Rathore, Saima; Awan, Imtiaz Ahmed; Shah, Saeed Arif; Majid, Abdul; Idris, Adnan; Awan, Anees Ahmed
2018-02-06
Prostate cancer is the second leading cause of cancer deaths among men. Early detection can effectively reduce the mortality caused by prostate cancer. Because prostate MRI is high-resolution and multiresolution, it requires proper diagnostic systems and tools. In the past, researchers developed computer-aided diagnosis (CAD) systems that help the radiologist detect abnormalities. In this paper, we employed novel machine learning techniques, namely a Bayesian approach, support vector machine (SVM) kernels (polynomial, radial basis function (RBF) and Gaussian) and a decision tree, for detecting prostate cancer. Moreover, different feature extraction strategies are proposed to improve the detection performance. The feature extraction strategies are based on texture, morphological, scale invariant feature transform (SIFT), and elliptic Fourier descriptor (EFD) features. The performance was evaluated for single features as well as combinations of features using machine learning classification techniques. Cross validation (jack-knife k-fold) was performed, and performance was evaluated in terms of the receiver operating characteristic (ROC) curve, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV) and false positive rate (FPR). With single feature extraction strategies, the SVM Gaussian kernel gives the highest accuracy of 98.34% with an AUC of 0.999. With combinations of feature extraction strategies, the SVM Gaussian kernel with texture + morphological and with EFDs + morphological features gives the highest accuracy of 99.71% and an AUC of 1.00.
Smeets, Miek; Degryse, Jan; Janssens, Stefan; Matheï, Catharina; Wallemacq, Pierre; Vanoverschelde, Jean-Louis; Aertgeerts, Bert; Vaes, Bert
2016-10-06
Different diagnostic algorithms for non-acute heart failure (HF) exist. Our aim was to compare the ability of these algorithms to identify HF in symptomatic patients aged 80 years and older and identify those patients at highest risk for mortality. Diagnostic accuracy and validation study. General practice, Belgium. 365 patients with HF symptoms aged 80 years and older (BELFRAIL cohort). Participants underwent a full clinical assessment, including a detailed echocardiographic examination at home. The diagnostic accuracy of 4 different algorithms was compared using an intention-to-diagnose analysis. The European Society of Cardiology (ESC) definition of HF was used as the reference standard for HF diagnosis. Kaplan-Meier curves for 5-year all-cause mortality were plotted and HRs and corresponding 95% CIs were calculated to compare the mortality risk predicting abilities of the different algorithms. Net reclassification improvement (NRI) was calculated. The prevalence of HF was 20% (n=74). The 2012 ESC algorithm yielded the highest sensitivity (92%, 95% CI 83% to 97%) as well as the highest referral rate (71%, n=259), whereas the Oudejans algorithm yielded the highest specificity (73%, 95% CI 68% to 78%) and the lowest referral rate (36%, n=133). These differences could be ascribed to differences in N-terminal probrain natriuretic peptide cut-off values (125 vs 400 pg/mL). The Kelder and Oudejans algorithms exhibited NRIs of 12% (95% CI 0.7% to 22%, p=0.04) and 22% (95% CI 9% to 32%, p<0.001), respectively, compared with the ESC algorithm. All algorithms detected patients at high risk for mortality, with HRs ranging from 1.9 (95% CI 1.4 to 2.5; Kelder) to 2.3 (95% CI 1.7 to 3.1; Oudejans). No significant differences were observed among the algorithms with respect to mortality risk predicting abilities.
Choosing a diagnostic algorithm for non-acute HF in elderly patients represents a trade-off between sensitivity and specificity, mainly depending on differences between cut-off values for natriuretic peptides. Published by the BMJ Publishing Group Limited.
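The net reclassification improvement mentioned in this abstract can be illustrated for the simple two-category case (refer / do not refer). The sketch below uses the common categorical definition of NRI; whether the study computed it in exactly this way is an assumption, and the function name and toy data are hypothetical.

```python
def net_reclassification_improvement(old, new, event):
    """Two-category NRI of a new classifier relative to an old one.

    old, new: 0/1 classifications per patient (1 = classified as diseased).
    event:    0/1 true disease status per patient.
    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for o, n, e in zip(old, new, event):
        if e:
            n_e += 1
            up_e += n > o      # correctly moved to "diseased"
            down_e += n < o    # wrongly moved to "not diseased"
        else:
            n_n += 1
            up_n += n > o      # wrongly moved to "diseased"
            down_n += n < o    # correctly moved to "not diseased"
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n

# Toy data (illustrative only): the new rule is more specific but less sensitive.
old_rule = [1, 1, 0, 1, 1, 0]
new_rule = [1, 0, 0, 0, 0, 0]
has_hf   = [1, 1, 1, 0, 0, 0]
print(net_reclassification_improvement(old_rule, new_rule, has_hf))
```

For two binary rules this quantity equals the change in sensitivity plus the change in specificity, which makes the sensitivity/specificity trade-off described in the abstract explicit.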
A Method for Assessing the Accuracy of a Photogrammetry System for Precision Deployable Structures
NASA Technical Reports Server (NTRS)
Moore, Ashley
2005-01-01
The measurement techniques used to validate analytical models of large deployable structures are an integral part of the technology development process and must be precise and accurate. Photogrammetry and videogrammetry are viable, accurate, and unobtrusive methods for measuring such large structures. Photogrammetry uses software to determine the three-dimensional position of a target using camera images. Videogrammetry is based on the same principle, except a series of timed images are analyzed. This work addresses the accuracy of a digital photogrammetry system used for measurement of large, deployable space structures at JPL. First, photogrammetry tests are performed on a precision space truss test article, and the images are processed using Photomodeler software. The accuracy of the Photomodeler results is determined through comparison with measurements of the test article taken by an external testing group using the VSTARS photogrammetry system. These two measurements are then compared with Australis photogrammetry software that simulates a measurement test to predict its accuracy. The software is then used to study how particular factors, such as camera resolution and placement, affect the system accuracy to help design the setup for the videogrammetry system that will offer the highest level of accuracy for measurement of deploying structures.
Boerner, Vinzent; Johnston, David J; Tier, Bruce
2014-10-24
The major obstacles for the implementation of genomic selection in Australian beef cattle are the variety of breeds and in general, small numbers of genotyped and phenotyped individuals per breed. The Australian Beef Cooperative Research Center (Beef CRC) investigated these issues by deriving genomic prediction equations (PE) from a training set of animals that covers a range of breeds and crosses including Angus, Murray Grey, Shorthorn, Hereford, Brahman, Belmont Red, Santa Gertrudis and Tropical Composite. This paper presents accuracies of genomically estimated breeding values (GEBV) that were calculated from these PE in the commercial pure-breed beef cattle seed stock sector. PE derived by the Beef CRC from multi-breed and pure-breed training populations were applied to genotyped Angus, Limousin and Brahman sires and young animals, but with no pure-breed Limousin in the training population. The accuracy of the resulting GEBV was assessed by their genetic correlation to their phenotypic target trait in a bi-variate REML approach that models GEBV as trait observations. Accuracies of most GEBV for Angus and Brahman were between 0.1 and 0.4, with accuracies for abattoir carcass traits generally greater than for live animal body composition traits and reproduction traits. Estimated accuracies greater than 0.5 were only observed for Brahman abattoir carcass traits and for Angus carcass rib fat. Averaged across traits within breeds, accuracies of GEBV were highest when PE from the pooled across-breed training population were used. However, for the Angus and Brahman breeds the difference in accuracy from using pure-breed PE was small. For the Limousin breed no reasonable results could be achieved for any trait. Although accuracies were generally low compared to published accuracies estimated within breeds, they are in line with those derived in other multi-breed populations. 
Thus PE developed by the Beef CRC can contribute to the implementation of genomic selection in Australian beef cattle breeding.
NASA Astrophysics Data System (ADS)
Vathsala, H.; Koolagudi, Shashidhar G.
2017-01-01
In this paper we discuss a data mining application for predicting peninsular Indian summer monsoon rainfall, and propose an algorithm that combines data mining and statistical techniques. We select likely predictors based on association rules that have the highest confidence levels. We then cluster the selected predictors to reduce their dimensions and use cluster membership values for classification. We derive the predictors from local conditions in southern India, including mean sea level pressure, wind speed, and maximum and minimum temperatures. The global condition variables include southern oscillation and Indian Ocean dipole conditions. The algorithm predicts rainfall in five categories: Flood, Excess, Normal, Deficit and Drought. We use closed itemset mining, cluster membership calculations and a multilayer perceptron function in the algorithm to predict monsoon rainfall in peninsular India. Using Indian Institute of Tropical Meteorology data, we found the prediction accuracy of our proposed approach to be exceptionally good.
Multivariate models for prediction of human skin sensitization hazard.
Strickland, Judy; Zang, Qingda; Paris, Michael; Lehmann, David M; Allen, David; Choksi, Neepa; Matheson, Joanna; Jacobs, Abigail; Casey, Warren; Kleinstreuer, Nicole
2017-03-01
One of the Interagency Coordinating Committee on the Validation of Alternative Methods' (ICCVAM) top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensitization suggests that no single alternative method will replace the currently accepted animal tests. ICCVAM is evaluating an integrated approach to testing and assessment based on the adverse outcome pathway for skin sensitization that uses machine learning approaches to predict human skin sensitization hazard. We combined data from three in chemico or in vitro assays - the direct peptide reactivity assay (DPRA), human cell line activation test (h-CLAT) and KeratinoSens™ assay - six physicochemical properties and an in silico read-across prediction of skin sensitization hazard into 12 variable groups. The variable groups were evaluated using two machine learning approaches, logistic regression and support vector machine, to predict human skin sensitization hazard. Models were trained on 72 substances and tested on an external set of 24 substances. The six models (three logistic regression and three support vector machine) with the highest accuracy (92%) used: (1) DPRA, h-CLAT and read-across; (2) DPRA, h-CLAT, read-across and KeratinoSens; or (3) DPRA, h-CLAT, read-across, KeratinoSens and log P. The models performed better at predicting human skin sensitization hazard than the murine local lymph node assay (accuracy 88%), any of the alternative methods alone (accuracy 63-79%) or test batteries combining data from the individual methods (accuracy 75%). These results suggest that computational methods are promising tools to identify effectively the potential human skin sensitizers without animal testing. Published 2016. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
Rastogi, Amit; Early, Dayna S; Gupta, Neil; Bansal, Ajay; Singh, Vikas; Ansstas, Michael; Jonnalagadda, Sreenivasa S; Hovis, Christine E; Gaddam, Srinivas; Wani, Sachin B; Edmundowicz, Steven A; Sharma, Prateek
2011-09-01
Missing adenomas and the inability to accurately differentiate between polyp histology remain the main limitations of standard-definition white-light (SD-WL) colonoscopy. To compare the adenoma detection rates of SD-WL with those of high-definition white-light (HD-WL) and narrow-band imaging (NBI) as well as the accuracy of predicting polyp histology. Multicenter, prospective, randomized, controlled trial. Two academic medical centers in the United States. Subjects undergoing screening or surveillance colonoscopy. Subjects were randomized to undergo colonoscopy with one of the following: SD-WL, HD-WL, or NBI. The proportion of subjects detected with adenomas, adenomas detected per subject, and the accuracy of predicting polyp histology real time. A total of 630 subjects were included. The proportion of subjects with adenomas was 38.6% with SD-WL compared with 45.7% with HD-WL and 46.2% with NBI (P = .17 and P = .14, respectively). Adenomas detected per subject were 0.69 with SD-WL compared with 1.12 with HD-WL and 1.13 with NBI (P = .016 and P = .014, respectively). HD-WL and NBI detected more subjects with flat and right-sided adenomas compared with SD-WL (all P values <.005). NBI had a superior sensitivity (90%) and accuracy (82%) to predict adenomas compared with SD-WL and HD-WL (all P values <.005). Academic medical centers with experienced endoscopists. There was no difference in the proportion of subjects with adenomas detected with SD-WL, HD-WL, and NBI. However, HD-WL and NBI detected significantly more adenomas per subject (>60%) compared with SD-WL. NBI had the highest accuracy in predicting adenomas in real time during colonoscopy. (NCT00614770.) Copyright © 2011 American Society for Gastrointestinal Endoscopy. Published by Mosby, Inc. All rights reserved.
Wang, Hsin-Wei; Lin, Ya-Chi; Pai, Tun-Wen; Chang, Hao-Teng
2011-01-01
Epitopes are antigenic determinants that are useful because they induce B-cell antibody production and stimulate T-cell activation. Bioinformatics can enable rapid, efficient prediction of potential epitopes. Here, we designed a novel B-cell linear epitope prediction system called LEPS, Linear Epitope Prediction by Propensities and Support Vector Machine, that combined physico-chemical propensity identification and support vector machine (SVM) classification. We tested the LEPS on four datasets: AntiJen, HIV, a newly generated PC, and AHP, a combination of these three datasets. Peptides with globally or locally high physicochemical propensities were first identified as primitive linear epitope (LE) candidates. Then, candidates were classified with the SVM based on the unique features of amino acid segments. This reduced the number of predicted epitopes and enhanced the positive prediction value (PPV). Compared to four other well-known LE prediction systems, the LEPS achieved the highest accuracy (72.52%), specificity (84.22%), PPV (32.07%), and Matthews' correlation coefficient (10.36%).
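Matthews' correlation coefficient, one of the measures listed above, is computed from the four confusion-matrix counts and lies between -1 and 1 (the abstract reports it as a percentage). A minimal sketch follows; the counts are invented for illustration, and returning 0.0 for an empty row or column is a convention chosen here, not something stated in the source.

```python
import math

def matthews_corrcoef_from_counts(tp, fp, fn, tn):
    """Matthews' correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Hypothetical counts (not from the LEPS evaluation).
print(matthews_corrcoef_from_counts(tp=30, fp=60, fn=20, tn=390))
```

Unlike plain accuracy, this coefficient stays near zero for a classifier that performs no better than chance on a strongly imbalanced dataset, which is why it is often reported alongside accuracy and PPV for epitope prediction.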
ShinyGPAS: interactive genomic prediction accuracy simulator based on deterministic formulas.
Morota, Gota
2017-12-20
Deterministic formulas for the accuracy of genomic predictions highlight the relationships among prediction accuracy and potential factors influencing prediction accuracy prior to performing computationally intensive cross-validation. Visualizing such deterministic formulas in an interactive manner may lead to a better understanding of how genetic factors control prediction accuracy. The software to simulate deterministic formulas for genomic prediction accuracy was implemented in R and encapsulated as a web-based Shiny application. Shiny genomic prediction accuracy simulator (ShinyGPAS) simulates various deterministic formulas and delivers dynamic scatter plots of prediction accuracy versus genetic factors impacting prediction accuracy, while requiring only mouse navigation in a web browser. ShinyGPAS is available at: https://chikudaisei.shinyapps.io/shinygpas/ . ShinyGPAS is a shiny-based interactive genomic prediction accuracy simulator using deterministic formulas. It can be used for interactively exploring potential factors that influence prediction accuracy in genome-enabled prediction, simulating achievable prediction accuracy prior to genotyping individuals, or supporting in-class teaching. ShinyGPAS is open source software and it is hosted online as a freely available web-based resource with an intuitive graphical user interface.
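As an illustration of what a deterministic accuracy formula looks like, the sketch below evaluates one widely cited form, usually attributed to Daetwyler and colleagues: r = sqrt(N h^2 / (N h^2 + Me)), where N is the training population size, h^2 the heritability and Me the number of independent chromosome segments. That this exact form is among those offered by ShinyGPAS is an assumption here, and the parameter values are hypothetical.

```python
import math

def expected_genomic_accuracy(n_train, heritability, n_segments):
    """Expected prediction accuracy r = sqrt(N*h2 / (N*h2 + Me)).

    n_train:      number of genotyped and phenotyped training individuals (N)
    heritability: trait heritability h2, between 0 and 1
    n_segments:   number of independent chromosome segments (Me)
    """
    signal = n_train * heritability
    return math.sqrt(signal / (signal + n_segments))

# Hypothetical scenario: how accuracy grows with training set size.
for n in (500, 1000, 5000, 20000):
    print(n, round(expected_genomic_accuracy(n, heritability=0.3, n_segments=1000), 3))
```

Tabulating the formula over a grid of N, h^2 and Me is the non-interactive equivalent of the scatter plots the abstract describes, and it requires no cross-validation.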
Villarreal, Miguel L.; van Riper, Charles; Petrakis, Roy E.
2013-01-01
Riparian vegetation provides important wildlife habitat in the Southwestern United States, but limited distributions and spatial complexity often lead to inaccurate representation in maps used to guide conservation. We test the use of data conflation and aggregation on multiple vegetation/land-cover maps to improve the accuracy of habitat models for the threatened western yellow-billed cuckoo (Coccyzus americanus occidentalis). We used species observations (n = 479) from a state-wide survey to develop habitat models from 1) three vegetation/land-cover maps produced at different geographic scales ranging from state to national, and 2) new aggregate maps defined by the spatial agreement of cover types, which were defined as high (agreement = all data sets), moderate (agreement ≥ 2), and low (no agreement required). Model accuracies, predicted habitat locations, and total area of predicted habitat varied considerably, illustrating the effects of input data quality on habitat predictions and resulting potential impacts on conservation planning. Habitat models based on aggregated and conflated data were more accurate and had higher model sensitivity than original vegetation/land-cover, but this accuracy came at the cost of reduced geographic extent of predicted habitat. Using the highest performing models, we assessed cuckoo habitat preference and distribution in Arizona and found that major watersheds containing high-probability habitat are fragmented by a wide swath of low-probability habitat. Focus on riparian restoration in these areas could provide more breeding habitat for the threatened cuckoo, offset potential future habitat losses in adjacent watersheds, and increase regional connectivity for other threatened vertebrates that also use riparian corridors.
Herrick, Ariane L; Peytrignet, Sebastien; Lunt, Mark; Pan, Xiaoyan; Hesselstrand, Roger; Mouthon, Luc; Silman, Alan J; Dinsdale, Graham; Brown, Edith; Czirják, László; Distler, Jörg H W; Distler, Oliver; Fligelstone, Kim; Gregory, William J; Ochiel, Rachel; Vonk, Madelon C; Ancuţa, Codrina; Ong, Voon H; Farge, Dominique; Hudson, Marie; Matucci-Cerinic, Marco; Balbir-Gurman, Alexandra; Midtvedt, Øyvind; Jobanputra, Paresh; Jordan, Alison C; Stevens, Wendy; Moinzadeh, Pia; Hall, Frances C; Agard, Christian; Anderson, Marina E; Diot, Elisabeth; Madhok, Rajan; Akil, Mohammed; Buch, Maya H; Chung, Lorinda; Damjanov, Nemanja S; Gunawardena, Harsha; Lanyon, Peter; Ahmad, Yasmeen; Chakravarty, Kuntal; Jacobsen, Søren; MacGregor, Alexander J; McHugh, Neil; Müller-Ladner, Ulf; Riemekasten, Gabriela; Becker, Michael; Roddy, Janet; Carreira, Patricia E; Fauchais, Anne Laure; Hachulla, Eric; Hamilton, Jennifer; İnanç, Murat; McLaren, John S; van Laar, Jacob M; Pathare, Sanjay; Proudman, Susanna M; Rudin, Anna; Sahhar, Joanne; Coppere, Brigitte; Serratrice, Christine; Sheeran, Tom; Veale, Douglas J; Grange, Claire; Trad, Georges-Selim; Denton, Christopher P
2018-01-01
Objectives: Our aim was to use the opportunity provided by the European Scleroderma Observational Study to (1) identify and describe those patients with early diffuse cutaneous systemic sclerosis (dcSSc) with progressive skin thickness, and (2) derive prediction models for progression over 12 months, to inform future randomised controlled trials (RCTs). Methods: The modified Rodnan skin score (mRSS) was recorded every 3 months in 326 patients. ‘Progressors’ were defined as those experiencing a 5-unit and 25% increase in mRSS score over 12 months (±3 months). Logistic models were fitted to predict progression and, using receiver operating characteristic (ROC) curves, were compared on the basis of the area under curve (AUC), accuracy and positive predictive value (PPV). Results: 66 patients (22.5%) progressed, 227 (77.5%) did not (33 could not have their status assessed due to insufficient data). Progressors had shorter disease duration (median 8.1 vs 12.6 months, P=0.001) and lower mRSS (median 19 vs 21 units, P=0.030) than non-progressors. Skin score was highest, and peaked earliest, in the anti-RNA polymerase III (Pol3+) subgroup (n=50). A first predictive model (including mRSS, duration of skin thickening and their interaction) had an accuracy of 60.9%, AUC of 0.666 and PPV of 33.8%. By adding a variable for Pol3 positivity, the model reached an accuracy of 71%, AUC of 0.711 and PPV of 41%. Conclusions: Two prediction models for progressive skin thickening were derived, for use both in clinical practice and for cohort enrichment in RCTs. These models will inform recruitment into the many clinical trials of dcSSc projected for the coming years. Trial registration number: NCT02339441. PMID:29306872
Short-term load forecasting of power system
NASA Astrophysics Data System (ADS)
Xu, Xiaobin
2017-05-01
To put the optimization of power systems on a sound footing, the accuracy of load forecasting must be improved. Power system load forecasting starts from accurate statistical and survey data on the history and current state of electricity consumption and uses a scientific method to predict the future trend of the power load and the way it changes. Short-term load forecasting is the basis of power system operation and analysis and is of great significance to unit commitment, economic dispatch and security checks. This paper therefore explains power system load forecasting in detail. First, we use data from 2012 to 2014 to build a partial least squares model for regression analysis of the relationship between daily maximum load, daily minimum load, daily average load and each meteorological factor. By inspecting the histogram of regression coefficients, we select the factors with the highest peaks, namely daily maximum temperature, daily minimum temperature and daily average temperature, as the meteorological factors for improving load forecasting accuracy. Second, for the case in which the climate impact is uncertain, we use a time series model to predict the load data for 2015: the 2009-2014 load data were collated, and the data from these six years were used to forecast the corresponding period in 2015. The criterion for prediction accuracy is the average of the standard deviations between the prediction results and the average load for the previous six years. Finally, taking the climate effect into account, we use a BP neural network model to predict the data for 2015 and refine the forecast results obtained from the time series model.
Statistical validation of a solar wind propagation model from 1 to 10 AU
NASA Astrophysics Data System (ADS)
Zieger, Bertalan; Hansen, Kenneth C.
2008-08-01
A one-dimensional (1-D) numerical magnetohydrodynamic (MHD) code is applied to propagate the solar wind from 1 AU through 10 AU, i.e., beyond the heliocentric distance of Saturn's orbit, in a non-rotating frame of reference. The time-varying boundary conditions at 1 AU are obtained from hourly solar wind data observed near the Earth. Although similar MHD simulations have been carried out and used by several authors, very little work has been done to validate the statistical accuracy of such solar wind predictions. In this paper, we present an extensive analysis of the prediction efficiency, using 12 selected years of solar wind data from the major heliospheric missions Pioneer, Voyager, and Ulysses. We map the numerical solution to each spacecraft in space and time, and validate the simulation, comparing the propagated solar wind parameters with in-situ observations. We do not restrict our statistical analysis to the times of spacecraft alignment, as most of the earlier case studies do. Our superposed epoch analysis suggests that the prediction efficiency is significantly higher during periods with high recurrence index of solar wind speed, typically in the late declining phase of the solar cycle. Among the solar wind variables, the solar wind speed can be predicted to the highest accuracy, with a linear correlation of 0.75 on average close to the time of opposition. We estimate the accuracy of shock arrival times to be as high as 10-15 hours within ±75 d from apparent opposition during years with high recurrence index. During solar activity maximum, there is a clear bias for the model to predict shocks arriving later than observed in the data, suggesting that during these periods, there is an additional acceleration mechanism in the solar wind that is not included in the model.
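The term "prediction efficiency" is commonly defined as one minus the ratio of the mean squared prediction error to the variance of the observations. The sketch below uses that common definition; that the paper uses exactly this form is an assumption, and the sample speeds are invented.

```python
def prediction_efficiency(observed, predicted):
    """Prediction efficiency PE = 1 - MSE / variance of the observations.

    PE = 1 for a perfect prediction, PE = 0 for a prediction no better than
    the observed mean, and PE < 0 for a prediction worse than the mean.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    variance = sum((o - mean_obs) ** 2 for o in observed) / n
    return 1.0 - mse / variance

# Hypothetical hourly solar wind speeds in km/s (illustrative only).
observed_speed = [380.0, 420.0, 510.0, 600.0, 450.0]
propagated_speed = [400.0, 410.0, 480.0, 570.0, 470.0]
print(prediction_efficiency(observed_speed, propagated_speed))
```

A linear correlation alone ignores constant offsets and scale errors, whereas this measure penalizes both, so the two statistics are complementary.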
Smoothing and Predicting Celestial Pole Offsets using a Kalman Filter and Smoother
NASA Astrophysics Data System (ADS)
Nastula, J.; Chin, T. M.; Gross, R. S.; Winska, M.; Winska, J.
2017-12-01
Since the early days of interplanetary spaceflight, accounting for changes in the Earth's rotation has been recognized as critical for accurate navigation. In the 1960s, tracking anomalies during the Ranger VII and VIII lunar missions were traced to errors in the Earth orientation parameters. As a result, Earth orientation calibration methods were improved to support the Mariner IV and V planetary missions. Today, accurate Earth orientation parameters are used to track and navigate every interplanetary spaceflight mission. The interplanetary spacecraft tracking and navigation teams at JPL require the UT1 and polar motion parameters, and these Earth orientation parameters are estimated by the use of a Kalman filter to combine past measurements of these parameters and predict their future evolution. A model was then used to provide the nutation/precession components of the Earth's orientation separately. As a result, variations caused by the free core nutation were not taken into account. But for the highest accuracy, these variations must be considered. So JPL recently developed an approach based upon the use of a Kalman filter and smoother to provide smoothed and predicted celestial pole offsets (CPOs) to the interplanetary spacecraft tracking and navigation teams. The approach used at JPL to do this and an evaluation of the accuracy of the predicted CPOs will be given here.
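The idea of combining past measurements with a Kalman filter and then predicting forward can be shown with a scalar toy example. The sketch below assumes a random-walk state model with made-up noise variances; it is not JPL's Earth orientation model and it omits the smoother stage entirely.

```python
def kalman_filter_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state.

    Returns the filtered estimates plus the final state and its variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but its uncertainty grows.
        p = p + process_var
        # Update: blend the prediction and the measurement by the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates, x, p

def predict_ahead(x, p, process_var, steps):
    """Forecast under the random-walk model: same mean, growing variance."""
    return x, p + steps * process_var

# Hypothetical noisy offsets in milliarcseconds (illustrative only).
noisy_offsets = [0.12, 0.18, 0.15, 0.22, 0.19]
filtered, x_last, p_last = kalman_filter_1d(noisy_offsets, meas_var=0.01, process_var=0.001)
print(filtered)
print(predict_ahead(x_last, p_last, process_var=0.001, steps=10))
```

The growing variance returned by `predict_ahead` shows why the accuracy of predicted values has to be evaluated separately from that of the smoothed values.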
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts
NASA Astrophysics Data System (ADS)
Balasubramanian, A.; Shamsuddin, R.; Prabhakaran, B.; Sawant, A.
2017-03-01
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the receiver operating characteristic (ROC) curve (AUC). The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields the lowest accuracies (50-60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive for real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98).
It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient look-ahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.
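The window-plus-lookahead setup described in this abstract can be sketched as a data preparation step: features come from the preceding window, and the label records whether a baseline shift occurs in the lookahead interval that follows. The three features (mean, range and net drift) and the per-sample shift flags are hypothetical choices made for this illustration; the abstract does not specify the features used.

```python
def make_windows(trajectory, shift_flags, sample_rate_hz, window_s=60, lookahead_s=5):
    """Build (features, label) pairs for baseline-shift prediction.

    trajectory:  tumor position samples (one value per time step)
    shift_flags: 0/1 per sample, 1 where a baseline shift is annotated
    Label is 1 if any shift is flagged within the lookahead interval
    immediately after the feature window.
    """
    w = int(window_s * sample_rate_hz)
    h = int(lookahead_s * sample_rate_hz)
    examples = []
    for end in range(w, len(trajectory) - h + 1):
        window = trajectory[end - w:end]
        features = (
            sum(window) / w,            # mean position
            max(window) - min(window),  # motion range
            window[-1] - window[0],     # net drift over the window
        )
        label = int(any(shift_flags[end:end + h]))
        examples.append((features, label))
    return examples

# Tiny synthetic trace at 1 Hz with shortened windows (illustrative only).
trace = [float(i) for i in range(10)]
flags = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
for features, label in make_windows(trace, flags, sample_rate_hz=1, window_s=4, lookahead_s=2):
    print(features, label)
```

The resulting pairs could be passed to any of the classifiers named in the abstract; positive labels are typically rare in such data, which is where the skew compensation mentioned above becomes relevant.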
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts
Balasubramanian, A; Shamsuddin, R; Prabhakaran, B; Sawant, A
2017-01-01
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble AdaBoost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the receiver operating characteristic curve (AUC). The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve highest prediction accuracies (90.5–91.4%); (ii) the predictive modeling yields lowest accuracies (50–60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive to real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96–0.98). 
It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors. PMID:28075331
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts.
Balasubramanian, A; Shamsuddin, R; Prabhakaran, B; Sawant, A
2017-03-07
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble AdaBoost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the receiver operating characteristic curve (AUC). The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields lowest accuracies (50-60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive to real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98). 
It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient look-ahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.
Kamalandua, Aubeline
2015-01-01
Age estimation from DNA methylation markers has seen an exponential growth of interest, not least from forensic scientists. Currently published assays, however, can still be improved by lowering the number of markers in the assay and by providing more accurate models to predict chronological age. From the published literature we selected 4 age-associated genes (ASPA, PDE4C, ELOVL2, and EDARADD) and determined CpG methylation levels from 206 blood samples of both deceased and living individuals (age range: 0–91 years). These data were subsequently used to compare prediction accuracy with both linear and non-linear regression models. A quadratic regression model in which the methylation levels of ELOVL2 were squared showed the highest accuracy, with a Mean Absolute Deviation (MAD) between chronological age and predicted age of 3.75 years and an adjusted R² of 0.95. No difference in accuracy was observed for samples obtained either from living and deceased individuals or between the 2 genders. In addition, 29 teeth from different individuals (age range: 19–70 years) were analyzed using the same set of markers, resulting in a MAD of 4.86 years and an adjusted R² of 0.74. Cross validation of the results obtained from blood samples demonstrated the robustness and reproducibility of the assay. In conclusion, the set of 4 CpG DNA methylation markers is capable of producing highly accurate age predictions for blood samples from deceased and living individuals. PMID:26280308
Gong, Yin-Xi; He, Cheng; Yan, Fei; Feng, Zhong-Ke; Cao, Meng-Lei; Gao, Yuan; Miao, Jie; Zhao, Jin-Long
2013-10-01
Multispectral remote sensing data containing rich site information are not fully used by the classic site quality evaluation system, as it merely adopts manual ground survey data. In order to establish a more effective site quality evaluation system, a neural network model which combined remote sensing spectra factors with site factors and site index relations was established and used to study the sublot site quality evaluation in the Wangyedian Forest Farm in Chifeng City, Inner Mongolia. Based on an improved back-propagation artificial neural network (BPANN), this model combined multispectral remote sensing data with sublot survey data, taking larch as an example. Through sensitivity analysis of the training data set, weak or irrelevant factors were excluded, the size of the neural network was reduced, and the efficiency of network training was improved. This optimal site index prediction model had an accuracy of up to 95.36%, which was 9.83% higher than that of the neural network model based on classic sublot survey data, showing that the larch site index prediction model built from multispectral remote sensing and sublot survey data has the highest predictive accuracy. The results fully indicate the effectiveness and superiority of this method.
Tkach, D C; Hargrove, L J
2013-01-01
Advances in battery and actuator technology have enabled clinical use of powered lower limb prostheses such as the BiOM Powered Ankle. To allow ambulation over various types of terrains, such devices rely on built-in mechanical sensors or manual actuation by the amputee to transition into an operational mode that is suitable for a given terrain. It is unclear if mechanical sensors alone can accurately modulate operational modes, while voluntary actuation prevents seamless, naturalistic gait. Ensuring that the prosthesis is ready to accommodate new terrain types at first step is critical for user safety. EMG signals from the patient's residual leg muscles may provide additional information to accurately choose the proper mode of prosthesis operation. Using a pattern recognition classifier, we compared the accuracy of predicting 8 different mode transitions based on (1) prosthesis mechanical sensor output, (2) EMG recorded from the residual limb, and (3) fusion of EMG and mechanical sensor data. Our findings indicate that the neuromechanical sensor fusion significantly decreases errors in predicting 10 mode transitions as compared to using either mechanical sensors or EMG alone (2.3±0.7% vs. 7.8±0.9% and 20.2±2.0% respectively).
Saatchi, Mahdi; McClure, Mathew C; McKay, Stephanie D; Rolf, Megan M; Kim, JaeWoo; Decker, Jared E; Taxis, Tasia M; Chapple, Richard H; Ramey, Holly R; Northcutt, Sally L; Bauck, Stewart; Woodward, Brent; Dekkers, Jack C M; Fernando, Rohan L; Schnabel, Robert D; Garrick, Dorian J; Taylor, Jeremy F
2011-11-28
Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. 
These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy.
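The cross-validation layout described above (five groups, each combination of four used for training and the remaining one for validation) can be enumerated as in the following sketch. The group labels `G1` to `G5` are hypothetical placeholders; the K-means clustering on pedigree relationships that produced the actual groups is not reproduced here.

```python
from itertools import combinations

# Hypothetical labels standing in for the five K-means groups of sires.
groups = ["G1", "G2", "G3", "G4", "G5"]

folds = []
for training in combinations(groups, 4):
    # The single group left out of training serves as the validation set.
    validation = [g for g in groups if g not in training][0]
    folds.append((training, validation))

for training, validation in folds:
    print("train on", training, "validate on", validation)
```

Each group is held out exactly once, so every animal receives a prediction from a model trained without its own cluster.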
2011-01-01
Background Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Methods Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Results Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. 
Conclusions These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy. PMID:22122853
Schroeck, Florian R; Patterson, Olga V; Alba, Patrick R; Pattison, Erik A; Seigne, John D; DuVall, Scott L; Robertson, Douglas J; Sirovich, Brenda; Goodney, Philip P
2017-12-01
To take the first step toward assembling population-based cohorts of patients with bladder cancer with longitudinal pathology data, we developed and validated a natural language processing (NLP) engine that abstracts pathology data from full-text pathology reports. Using 600 bladder pathology reports randomly selected from the Department of Veterans Affairs, we developed and validated an NLP engine to abstract data on histology, invasion (presence vs absence and depth), grade, the presence of muscularis propria, and the presence of carcinoma in situ. Our gold standard was based on an independent review of reports by 2 urologists, followed by adjudication. We assessed the NLP performance by calculating the accuracy, the positive predictive value, and the sensitivity. We subsequently applied the NLP engine to pathology reports from 10,725 patients with bladder cancer. When comparing the NLP output to the gold standard, NLP achieved the highest accuracy (0.98) for the presence vs the absence of carcinoma in situ. Accuracy for histology, invasion (presence vs absence), grade, and the presence of muscularis propria ranged from 0.83 to 0.96. The most challenging variable was depth of invasion (accuracy 0.68), with an acceptable positive predictive value for lamina propria (0.82) and for muscularis propria (0.87) invasion. The validated engine was capable of abstracting pathologic characteristics for 99% of the patients with bladder cancer. NLP had high accuracy for 5 of 6 variables and abstracted data for the vast majority of the patients. This now allows for the assembly of population-based cohorts with longitudinal pathology data. Published by Elsevier Inc.
Hettige, Nuwan C; Nguyen, Thai Binh; Yuan, Chen; Rajakulendran, Thanara; Baddour, Jermeen; Bhagwat, Nikhil; Bani-Fatemi, Ali; Voineskos, Aristotle N; Mallar Chakravarty, M; De Luca, Vincenzo
2017-07-01
Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability that a patient with schizophrenia will attempt suicide. We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms using regularized regression, random forest, elastic net, and support vector machine models with sociocultural and clinical variables as features to train the models. All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. The support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated an accuracy of 65% and an AUC of 0.71. Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts. Copyright © 2017 Elsevier Inc. All rights reserved.
Shutter, Lori; Tong, Karen A; Holshouser, Barbara A
2004-12-01
Proton magnetic resonance spectroscopy (MRS) is being used to evaluate individuals with acute traumatic brain injury, and several studies have shown that changes in certain brain metabolites (N-acetylaspartate, choline) are associated with poor neurologic outcomes. The majority of previous MRS studies have been performed relatively late after injury and none have examined the role of glutamate/glutamine (Glx). We conducted a prospective MRS study of 42 severely injured adults to measure quantitative metabolite changes early (7 days) after injury in normal appearing brain. We used these findings to predict long-term neurologic outcome and to determine if MRS data alone or in combination with clinical outcome variables provided better prediction of long-term outcomes. We found that glutamate/glutamine (Glx) and choline (Cho) were significantly elevated in occipital gray and parietal white matter early after injury in patients with poor long-term (6-12-month) outcomes. Glx and Cho ratios predicted long-term outcome with 94% accuracy and when combined with the motor Glasgow Coma Scale score provided the highest predictive accuracy (97%). Somatosensory evoked potentials were not as accurate as MRS data in predicting outcome. Elevated Glx and Cho are more sensitive indicators of injury and predictors of poor outcome when spectroscopy is done early after injury. This may be a reflection of early excitotoxic injury (i.e., elevated Glx) and of injury associated with membrane disruption (i.e., increased Cho) secondary to diffuse axonal injury.
Sensitivity analysis of gene ranking methods in phenotype prediction.
deAndrés-Galiana, Enrique J; Fernández-Martínez, Juan L; Sonis, Stephen T
2016-12-01
It has become clear that noise generated during the assay and analytical processes has the ability to disrupt accurate interpretation of genomic studies. Not only does such noise impact the scientific validity and costs of studies, but when assessed in the context of clinically translatable indications such as phenotype prediction, it can lead to inaccurate conclusions that could ultimately impact patients. We applied a sequence of ranking methods to damp noise associated with microarray outputs, and then tested the utility of the approach in three disease indications using publicly available datasets. This study was performed in three phases. We first theoretically analyzed the effect of noise in phenotype prediction problems showing that it can be expressed as a modeling error that partially falsifies the pathways. Secondly, via synthetic modeling, we performed the sensitivity analysis for the main gene ranking methods to different types of noise. Finally, we studied the predictive accuracy of the gene lists provided by these ranking methods in synthetic data and in three different datasets related to cancer, rare and neurodegenerative diseases to better understand the translational aspects of our findings. In the case of synthetic modeling, we showed that Fisher's Ratio (FR) was the most robust gene ranking method in terms of precision for all the types of noise at different levels. Significance Analysis of Microarrays (SAM) provided slightly lower performance and the rest of the methods (fold change, entropy and maximum percentile distance) were much less precise and accurate. The predictive accuracy of the smallest set of high discriminatory probes was similar for all the methods in the case of Gaussian and Log-Gaussian noise. In the case of class assignment noise, the predictive accuracy of SAM and FR is higher. 
Finally, for real datasets (Chronic Lymphocytic Leukemia, Inclusion Body Myositis and Amyotrophic Lateral Sclerosis) we found that FR and SAM provided the highest predictive accuracies with the smallest number of genes. Biological pathways were found with an expanded list of genes whose discriminatory power has been established via FR. We have shown that noise in expression data and class assignment partially falsifies the sets of discriminatory probes in phenotype prediction problems. FR and SAM better exploit the principle of parsimony and are able to find subsets with a smaller number of highly discriminatory genes. The predictive accuracy and the precision are two different metrics to select the important genes, since in the presence of noise the most predictive genes do not completely coincide with those that are related to the phenotype. Based on the synthetic results, FR and SAM are recommended to unravel the biological pathways that are involved in the disease development. Copyright © 2016 Elsevier Inc. All rights reserved.
[Accuracy of three methods for the rapid diagnosis of oral candidiasis].
Lyu, X; Zhao, C; Yan, Z M; Hua, H
2016-10-09
Objective: To explore a simple, rapid and efficient method for the diagnosis of oral candidiasis in clinical practice. Methods: A total of 124 consecutive patients with suspected oral candidiasis were enrolled from the Department of Oral Medicine, Peking University School and Hospital of Stomatology, Beijing, China. Exfoliated cells of oral mucosa and saliva (or concentrated oral rinse) obtained from all participants were tested by three rapid smear methods (10% KOH smear, Gram-stained smear, Congo red-stained smear). The diagnostic efficacy (sensitivity, specificity, Youden's index, likelihood ratio, consistency, predictive value and area under the curve [AUC]) of each of the three methods was assessed by comparing the results with the gold standard (combination of clinical diagnosis, laboratory diagnosis and expert opinion). Results: Gram-stained smear of saliva (or concentrated oral rinse) demonstrated the highest sensitivity (82.3%). The 10% KOH smear of exfoliated cells showed the highest specificity (93.5%). Congo red-stained smear of saliva (or concentrated oral rinse) displayed the highest diagnostic efficacy (79.0% sensitivity, 80.6% specificity, 0.60 Youden's index, 4.08 positive likelihood ratio, 0.26 negative likelihood ratio, 80% consistency, 80.3% positive predictive value, 79.4% negative predictive value and 0.80 AUC). Conclusions: Congo red-stained smear of saliva (or concentrated oral rinse) could be used as a point-of-care tool for the rapid diagnosis of oral candidiasis in clinical practice. Trial registration: Chinese Clinical Trial Registry, ChiCTR-DDD-16008118.
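Several of the figures reported for the Congo red smear follow from the sensitivity and specificity alone, using the standard definitions of Youden's index and the likelihood ratios. The short sketch below recomputes them from the rounded percentages in the abstract; the variable names are illustrative, and small differences from the published values (for example in the positive likelihood ratio) would be expected if the authors worked from unrounded counts.

```python
# Rounded values reported for the Congo red-stained smear.
sensitivity = 0.790
specificity = 0.806

# Youden's index: J = sensitivity + specificity - 1
youden = sensitivity + specificity - 1

# Positive likelihood ratio: LR+ = sensitivity / (1 - specificity)
lr_pos = sensitivity / (1 - specificity)

# Negative likelihood ratio: LR- = (1 - sensitivity) / specificity
lr_neg = (1 - sensitivity) / specificity

print(f"Youden's index: {youden:.2f}")
print(f"LR+: {lr_pos:.2f}")
print(f"LR-: {lr_neg:.2f}")
```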
Hadush, Marta Yemane; Berhe, Amanuel Hadgu; Medhanyie, Araya Abrha
2017-04-21
Low birth weight (Birth weight < 2500 g) is a leading cause of prenatal and neonatal deaths. The early identification of Low birth weight (LBW) neonates is essential for any comprehensive initiative to improve their chance of survival. However, a large proportion of births in developing countries take place at home and birth weight statistics are not available. Therefore, there is a need to develop simple, inexpensive and practical methods to identify low birth weight (LBW) neonates soon after birth. This is a hospital based cross sectional study. Four hundred twenty two (422) live born neonates were included and anthropometric measurements were carried out within 24 h of birth by three trained nurses. Birth weight was measured by digital scale. Head and chest circumference were measured by using non extendable measuring tape and foot length with hard transparent plastic ruler. Data was entered into SPSS version 20 for analysis. Characteristics of study participants were analyzed using descriptive statistics such as frequency and percentage for categorical data and mean and standard deviation for continuous data. Correlation with birth weight using Pearson's correlation coefficient and linear regression were used to identify the association between dependent and independent variables. Receiver operating characteristic (ROC) curve was used to evaluate accuracy of the anthropometric measurements to predict LBW. The prevalence of low birth weight was found to be 27%. All anthropometric measurements had a positive correlation with birth weight, chest circumference attaining the highest correlation with birth weight (r = 0.85) and foot length had the weakest correlation (r = 0.74). Head circumference had the highest predictive value for birth weight (AUC = 0.93) followed by Chest circumference (AUC = 0.91). A cut off point of chest circumference 30.15 cm had 84.2% sensitivity, 85.4% specificity and diagnostic accuracy (P < 0.001). 
A cut off point of head circumference 33.25 had the highest positive predictive value (77%). Chest circumference and head circumference were found to be better surrogate measurements to identify low birth weight neonates.
Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin
2016-01-01
In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. 
But for liver tumors, a correlation‐based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers. PACS numbers: 87.55.km, 87.56.Fc PMID:26894358
Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin
2016-01-08
In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. 
But for liver tumors, a correlation-based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we present several results about the prediction accuracies of ensemble methods for binary classification that have been missed or misinterpreted in the previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper- and lower-bound accuracies with random individual classifiers, so better algorithms need to be developed. PMID:21853162
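The claim that an ensemble can exceed 0.5 accuracy while every member is below 0.5 can be illustrated with a small hand-built example. The sketch assumes simple majority voting over three classifiers and uses made-up labels and predictions; it is not the construction or the experiments from the paper.

```python
from collections import Counter


def accuracy(pred, truth):
    """Fraction of positions where the prediction matches the true label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def majority_vote(predictions):
    """Per-sample majority label across an odd number of binary classifiers."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Made-up true labels and three weak classifiers, each correct on 2 of 5 samples.
truth = [1, 1, 1, 0, 0]
clf_a = [1, 1, 0, 1, 1]  # correct on samples 0 and 1
clf_b = [1, 0, 1, 1, 1]  # correct on samples 0 and 2
clf_c = [0, 1, 1, 1, 1]  # correct on samples 1 and 2

ensemble = majority_vote([clf_a, clf_b, clf_c])

individual = [accuracy(c, truth) for c in (clf_a, clf_b, clf_c)]
combined = accuracy(ensemble, truth)
print(individual, combined)
```

The effect depends on the correct votes being spread so that each of the first three samples receives exactly two of them; the same three classifiers arranged to agree on their errors would give an ensemble no better than its members.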
Julián-Jiménez, Agustín; González Del Castillo, Juan; Candel, Francisco Javier
2017-06-07
Among all patients treated in the Emergency Department (ED), 1.35% are diagnosed with community-acquired pneumonia (CAP). CAP is the main cause of death due to infectious disease (10-14%) and the most frequent cause of sepsis and septic shock in the ED. In the last decade, the search for objective tools to help establish an early diagnosis, bacterial aetiology, severity, suspicion of bacteremia and the prognosis of mortality has increased. Biomarkers have shown their usefulness in this matter. These include procalcitonin (which obtains the highest accuracy for CAP diagnosis, bacterial aetiology and the presence of bacteremia), lactate (a biomarker of hypoxia and tissue hypoperfusion) and proadrenomedullin (which has the greatest accuracy for predicting mortality and, in combination with the prognostic severity scales, obtains even better results). The aim of this review is to highlight recently published scientific evidence and to compare the utility and prognostic accuracy of the biomarkers in CAP patients treated in the ED. Copyright © 2017 Elsevier España, S.L.U. All rights reserved.
Connolly, Florian; Schreiber, Stephan J; Leithner, Christoph; Bohner, Georg; Vajkoczy, Peter; Valdueza, José M
2017-12-15
OBJECTIVE Transcranial color-coded duplex sonography (TCCS) is a reliable tool that is used to assess vasospasm in the M1 segment of the middle cerebral artery (MCA) after subarachnoid hemorrhage (SAH). A distinct increase in blood flow velocity (BFV) is the principal criterion for vasospasm. The MCA/internal carotid artery (ICA) index (Lindegaard Index) is also widely used to distinguish between vasospasm and cerebral hyperperfusion. However, extracranial ultrasonography assessment of the neck vessels might be difficult in an intensive care unit. Therefore, the authors evaluated whether the relationship of intracranial arterial to venous BFV might indicate vasospasm with similar or even better accuracy. METHODS Patients who presented between 2008 and 2015 with aneurysmal SAH were prospectively enrolled in the study. Digital subtraction angiography (DSA) and TCCS were performed within 24 hours of each other to assess vasospasm 8-10 days after SAH. The following different TCCS parameters were analyzed to assess vasospasm in the MCA and were compared with the gold-standard DSA parameters: 1) mean time-averaged maximum BFV (Vmean) of the MCA, 2) peak systolic velocity (PSV) of the MCA, 3) the Lindegaard Index using Vmean as well as PSV, and 4) a new arteriovenous index (AVI) between the MCA and the basal vein of Rosenthal using Vmean and PSV. The best cutoff values for these parameters to distinguish vasospasm from normal perfusion or hyperperfusion were calculated using receiver operating characteristic curve analysis. Sensitivity, specificity, positive predictive value, and negative predictive value as well as the overall accuracy for each cutoff value were analyzed. RESULTS A total of 102 patients (mean age 52 ± 12 years) were evaluated. Bilateral MCA assessment by TCCS was successful in all patients. In 6 cases (3%), the BFV of the basal vein of Rosenthal could not be analyzed. 
The Lindegaard Index could not be calculated in 50 of 204 cases (25%) because the insonation quality was very low in one of the ICAs. An AVI > 10 for Vmean and an AVI > 12 for systolic velocity provided the highest accuracies of 87% and 86%, respectively. Regarding the Lindegaard Index, the accuracy was highest using a threshold of > 3 for the mean BFV (84%) as well as systolic BFV (80%). BFVs in the MCA of ≥ 120 cm/sec (Vmean) and ≥ 200 cm/sec (PSV) predicted vasospasm with accuracies of 84% and 83%, respectively. A combined analysis of the MCA BFV and the AVI led to a slight increase in specificity (Vmean, 94%; PSV, 93%) and positive predictive value (Vmean, 88%; PSV, 86%) without further improvement in accuracy (Vmean, 88%; PSV, 84%). CONCLUSIONS The intracranial AVI is a reliable parameter that can be used to assess vasospasm after SAH. Its reliability for differentiating vasospasm and hyperperfusion is slightly higher than that for the established Lindegaard Index, and this method has the additional advantage of a remarkably lower failure rate.
Frequency of Respiratory Nursing Diagnoses and Accuracy of Clinical Indicators in Preterm Infants.
Avena, Marta José; Pedreira, Mavilde da Luz Gonçalves; Bassolli de Oliveira Alves, Lucas; Herdman, T Heather; de Gutiérrez, Maria Gaby Rivero
2018-03-05
To identify the frequency of the nursing diagnoses, ineffective breathing pattern, impaired gas exchange and impaired spontaneous ventilation in newborns; and, to analyze the accuracy of diagnostic indicators identified for each of these diagnoses. This was a cross-sectional study conducted with a nonprobability sample of 92 infants. Data collected were represented by demographic and clinical variables, clinical indicators of the three respiratory nursing diagnoses from NANDA International, and were analyzed according to frequency and agreement between pairs of expert nurses (Kappa). Ineffective breathing pattern was identified in 74.5% of infants; impaired gas exchange was noted in 31.5%; impaired spontaneous ventilation was found in 16.8% of subjects. Use of accessory muscles to breathe showed the highest sensitivity for ineffective breathing pattern; abnormal blood gases had the best predictive value for impaired gas exchange. Use of accessory muscles to breathe had the highest sensitivity for impaired spontaneous ventilation. Ineffective breathing pattern was the most frequently identified; use of accessory muscles, alteration in depth of breathing, abnormal breathing, and dyspnea were the most representative signs/symptoms. Early recognition of respiratory conditions can support safe interventions to ensure appropriate outcomes. © 2018 NANDA International, Inc.
The application of a geometric optical canopy reflectance model to semiarid shrub vegetation
NASA Technical Reports Server (NTRS)
Franklin, Janet; Turner, Debra L.
1992-01-01
Estimates are obtained of the average plant size and density of shrub vegetation on the basis of SPOT High Resolution Visible Multispectral imagery from Chihuahuan desert areas, using the Li and Strahler (1985) model. The aggregated predictions for a number of stands within a class were accurate to within one or two standard errors of the observed average value. Accuracy was highest for those classes of vegetation where the nonrandom scrub pattern was characterized for the class on the basis of the average coefficient of the determination of density.
Seyednasrollah, Fatemeh; Mäkelä, Johanna; Pitkänen, Niina; Juonala, Markus; Hutri-Kähönen, Nina; Lehtimäki, Terho; Viikari, Jorma; Kelly, Tanika; Li, Changwei; Bazzano, Lydia; Elo, Laura L; Raitakari, Olli T
2017-06-01
Obesity is a known risk factor for cardiovascular disease. Early prediction of obesity is essential for prevention. The aim of this study is to assess the use of childhood clinical factors and the genetic risk factors in predicting adulthood obesity using machine learning methods. A total of 2262 participants from the Cardiovascular Risk in YFS (Young Finns Study) were followed up from childhood (age 3-18 years) to adulthood for 31 years. The data were divided into training (n=1625) and validation (n=637) sets. The effect of known genetic risk factors (97 single-nucleotide polymorphisms) was investigated as a weighted genetic risk score of all 97 single-nucleotide polymorphisms (WGRS97) or a subset of the 19 most significant single-nucleotide polymorphisms (WGRS19) using a boosting machine learning technique. WGRS97 and WGRS19 were validated using external data (n=369) from BHS (Bogalusa Heart Study). WGRS19 improved the accuracy of predicting adulthood obesity in training (area under the curve [AUC]=0.787 versus AUC=0.744, P<0.0001) and validation data (AUC=0.769 versus AUC=0.747, P=0.026). WGRS97 improved the accuracy in training (AUC=0.782 versus AUC=0.744, P<0.0001) but not in validation data (AUC=0.749 versus AUC=0.747, P=0.785). Higher WGRS19 was associated with higher body mass index at 9 years and WGRS97 at 6 years. Replication in BHS confirmed our findings that WGRS19 and WGRS97 are associated with body mass index. WGRS19 improves prediction of adulthood obesity. Predictive accuracy is highest among young children (3-6 years), whereas among older children (9-18 years) the risk can be identified using childhood clinical factors. The model is helpful in screening children with high risk of developing obesity. © 2017 American Heart Association, Inc.
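A weighted genetic risk score of the kind named above is commonly computed as a weighted sum of risk-allele counts. The sketch below assumes that convention; the abstract does not state how the weights were derived, and the function name, allele counts and weights shown are hypothetical.

```python
# Minimal sketch of a weighted genetic risk score (WGRS), assuming the common
# convention: sum over SNPs of (risk-allele count) x (per-SNP weight).
# The weights and genotypes below are made up for illustration only.

def weighted_genetic_risk_score(allele_counts, weights):
    """Return the weighted sum of risk-allele counts (0, 1 or 2 per SNP)."""
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("each allele count must be 0, 1 or 2")
    return sum(count * weight for count, weight in zip(allele_counts, weights))

# Hypothetical participant genotyped at three SNPs.
example_counts = [2, 1, 0]
example_weights = [0.5, 0.25, 1.0]  # assumed per-SNP effect sizes
example_score = weighted_genetic_risk_score(example_counts, example_weights)
print(example_score)
```

Under this convention, WGRS19 and WGRS97 would differ only in which SNPs enter the sum (19 versus 97).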
Sun, Xin; Young, Jennifer; Liu, Jeng-Hung; Newman, David
2018-06-01
The objective of this project was to develop a computer vision system (CVS) for objective measurement of pork loin under industry speed requirements. Color images of pork loin samples were acquired using a CVS. Subjective color and marbling scores were determined according to the National Pork Board standards by a trained evaluator. Instrument color measurement and crude fat percentage were used as control measurements. Image features (18 color features; 1 marbling feature; 88 texture features) were extracted from whole pork loin color images. An artificial intelligence prediction model (support vector machine) was established for pork color and marbling quality grades. The results showed that the CVS with support vector machine modeling reached the highest prediction accuracy of 92.5% for measured pork color score and 75.0% for measured pork marbling score. This research shows that the proposed artificial intelligence prediction model with CVS can provide an effective tool for predicting color and marbling in the pork industry at online speeds. Copyright © 2018 Elsevier Ltd. All rights reserved.
CSmetaPred: a consensus method for prediction of catalytic residues.
Choudhary, Preeti; Kumar, Shailesh; Bachhawat, Anand Kumar; Pandit, Shashi Bhushan
2017-12-22
Knowledge of catalytic residues can play an essential role in elucidating mechanistic details of an enzyme. However, experimental identification of catalytic residues is a tedious and time-consuming task, which can be expedited by computational predictions. Despite significant development in active-site prediction methods, one of the remaining issues is ranked positions of putative catalytic residues among all ranked residues. In order to improve ranking of catalytic residues and their prediction accuracy, we have developed a meta-approach based method CSmetaPred. In this approach, residues are ranked based on the mean of normalized residue scores derived from four well-known catalytic residue predictors. The mean residue score of CSmetaPred is combined with predicted pocket information to improve prediction performance in meta-predictor, CSmetaPred_poc. Both meta-predictors are evaluated on two comprehensive benchmark datasets and three legacy datasets using Receiver Operating Characteristic (ROC) and Precision Recall (PR) curves. The visual and quantitative analysis of ROC and PR curves shows that meta-predictors outperform their constituent methods and CSmetaPred_poc is the best of evaluated methods. For instance, on CSAMAC dataset CSmetaPred_poc (CSmetaPred) achieves highest Mean Average Specificity (MAS), a scalar measure for ROC curve, of 0.97 (0.96). Importantly, median predicted rank of catalytic residues is the lowest (best) for CSmetaPred_poc. Considering residues ranked ≤20 classified as true positive in binary classification, CSmetaPred_poc achieves prediction accuracy of 0.94 on CSAMAC dataset. Moreover, on the same dataset CSmetaPred_poc predicts all catalytic residues within top 20 ranks for ~73% of enzymes. Furthermore, benchmarking of prediction on comparative modelled structures showed that models result in better prediction than only sequence based predictions. 
These analyses suggest that CSmetaPred_poc is able to rank putative catalytic residues at lower (better) ranked positions, which can facilitate and expedite their experimental characterization. The benchmarking studies showed that employing meta-approach in combining residue-level scores derived from well-known catalytic residue predictors can improve prediction accuracy as well as provide improved ranked positions of known catalytic residues. Hence, such predictions can assist experimentalist to prioritize residues for mutational studies in their efforts to characterize catalytic residues. Both meta-predictors are available as webserver at: http://14.139.227.206/csmetapred/ .
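The ranking step described above (mean of normalized residue scores from several predictors) can be sketched as follows. The abstract does not specify the normalization, so min-max scaling is an assumption here, and the predictor names and scores are hypothetical placeholders, not the four predictors used by CSmetaPred.

```python
# Sketch of a meta-predictor ranking: normalize each predictor's per-residue
# scores, average them per residue, then rank residues by the mean score.
# Min-max normalization is an assumption; the source only says "normalized".

def min_max_normalize(scores):
    """Scale scores linearly to [0, 1]; constant inputs map to all zeros."""
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        return [0.0 for _ in scores]
    return [(s - lowest) / (highest - lowest) for s in scores]

def rank_by_mean_score(predictor_scores):
    """Return (mean score per residue, residue indices ordered best first)."""
    normalized = [min_max_normalize(s) for s in predictor_scores.values()]
    n_residues = len(normalized[0])
    mean_scores = [
        sum(column[i] for column in normalized) / len(normalized)
        for i in range(n_residues)
    ]
    order = sorted(range(n_residues), key=lambda i: mean_scores[i], reverse=True)
    return mean_scores, order

# Hypothetical raw scores for four residues from four placeholder predictors.
raw_scores = {
    "predictor_a": [0.0, 10.0, 5.0, 2.5],
    "predictor_b": [1.0, 3.0, 2.0, 1.0],
    "predictor_c": [0.0, 4.0, 4.0, 0.0],
    "predictor_d": [2.0, 6.0, 4.0, 2.0],
}
mean_scores, order = rank_by_mean_score(raw_scores)
print(order)  # residue indices, most likely catalytic first
```

Averaging after normalization keeps any single predictor with a wide score range from dominating the combined ranking.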
Vorberg, Susann
2013-01-01
Biodegradability describes the capacity of substances to be mineralized by free‐living bacteria. It is a crucial property in estimating a compound’s long‐term impact on the environment. The ability to reliably predict biodegradability would reduce the need for laborious experimental testing. However, this endpoint is difficult to model due to unavailability or inconsistency of experimental data. Our approach makes use of the Online Chemical Modeling Environment (OCHEM) and its rich supply of machine learning methods and descriptor sets to build classification models for ready biodegradability. These models were analyzed to determine the relationship between characteristic structural properties and biodegradation activity. The distinguishing feature of the developed models is their ability to estimate the accuracy of prediction for each individual compound. The models developed using seven individual descriptor sets were combined in a consensus model, which provided the highest accuracy. The identified overrepresented structural fragments can be used by chemists to improve the biodegradability of new chemical compounds. The consensus model, the datasets used, and the calculated structural fragments are publicly available at http://ochem.eu/article/31660. PMID:27485201
Vijayakumar, Vishal; Case, Michelle; Shirinpour, Sina; He, Bin
2017-12-01
Effective pain assessment and management strategies are needed to better manage pain. In addition to self-report, an objective pain assessment system can provide a more complete picture of the neurophysiological basis for pain. In this study, a robust and accurate machine learning approach is developed to quantify tonic thermal pain across healthy subjects into a maximum of ten distinct classes. A random forest model was trained to predict pain scores using time-frequency wavelet representations of independent components obtained from electroencephalography (EEG) data, and the relative importance of each frequency band to pain quantification is assessed. The mean classification accuracy for predicting pain on an independent test subject for a range of 1-10 is 89.45%, highest among existing state of the art quantification algorithms for EEG. The gamma band is the most important to both intersubject and intrasubject classification accuracy. The robustness and generalizability of the classifier are demonstrated. Our results demonstrate the potential of this tool to be used clinically to help us to improve chronic pain treatment and establish spectral biomarkers for future pain-related studies using EEG.
Eigbefoh, J O; Isabu, P; Okpere, E; Abebe, J
2008-07-01
Untreated urinary tract infection can have devastating maternal and neonatal effects. Thus, routine screening for bacteriuria is advocated. This study was designed to evaluate the diagnostic accuracy of the rapid dipstick test to predict urinary tract infection in pregnancy with the gold standard of urine microscopy, culture and sensitivity acting as the control. The urine dipstick test uses the leucocyte esterase test, the nitrite test and the test for protein, singly and in combination. The result of the dipstick was compared with the gold standard, urine microscopy, culture and sensitivity, using confidence intervals for proportions. The reliability and validity of the urine dipstick were also evaluated. Overall, the urine dipstick test has a poor correlation with urine culture (p = 0.125, CI 95%). The same holds true for individual components of the dipstick test. The overall sensitivity of the urine dipstick test was poor at 2.3%. Individual sensitivity of the various components ranged from 9.1% for leucocyte esterase and the nitrite test to 56.8% for leucocyte esterase alone. The other components of the dipstick test, the test for nitrite, the test for protein and the combination of the tests (leucocyte esterase, nitrite and proteinuria), appear to decrease the sensitivity of the leucocyte esterase test alone. The ability of the urine dipstick test to correctly rule out urinary tract infection (specificity) was high. The positive predictive value for the dipstick test was high, with the leucocyte esterase test having the highest positive predictive value compared with the other components of the dipstick test. The negative predictive value (NPV) was expectedly highest for the leucocyte esterase test alone, with values higher than the other components of the urine dipstick test singly and in various combinations. Compared with the other parameters of the urine dipstick test, singly and in combination, leucocyte esterase appears to be the most accurate (90.25%). 
The dipstick test has limited use in screening for asymptomatic bacteriuria. The leucocyte esterase test component of the dipstick test appears to have the highest reliability and validity. The other parameters of the dipstick test decrease the reliability and validity of the leucocyte esterase test. A positive test merits empirical antibiotics, while a negative test is an indication for urine culture. The urine dipstick test, if positive, will also be useful in the follow-up of patients after treatment of urinary tract infection. This is useful in resource-poor settings, especially in the third world, where there is a dearth of trained personnel and equipment for urine culture.
Hassan, Eman M; Omran, Dalia A; El Beshlawey, Mohamad L; Abdo, Mahmoud; El Askary, Ahmad
2014-02-01
Gastroesophageal varices are present in approximately 50% of patients with liver cirrhosis. The aim of this study was to evaluate liver stiffness measurement (LSM), FIB-4, Forns Index and Lok Score as noninvasive predictors of esophageal varices (EV). This prospective study included 65 patients with HCV-related liver cirrhosis. All patients underwent routine laboratory tests, transient elastography (TE) and esophagogastroduodenoscopy. FIB-4, Forns Index and Lok Score were calculated. The diagnostic performances of these methods were assessed using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and receiver operating characteristic curves. All predictors (LSM, FIB-4, Forns Index and Lok Score) demonstrated statistically significant correlation with the presence and the grade of EV. TE could diagnose EV at a cutoff value of 18.2 kPa. FIB-4, Forns Index, and Lok Score could diagnose EV at cutoff values of 2.8, 6.61 and 0.63, respectively. For prediction of large varices (grade 2, 3), LSM showed the highest accuracy (80%) with a cutoff of 22.4 kPa and AUROC of 0.801. Its sensitivity was 84%, specificity 72%, PPV 84% and NPV 72%. The diagnostic accuracies of FIB-4, Forns Index and Lok Score were 70%, 70% and 76%, respectively, at cutoffs of 3.3, 6.9 and 0.7, respectively. For diagnosis of large esophageal varices, adding TE to each of the other diagnostic indices (serum fibrosis scores) increased their sensitivities with little decrease in their specificities. Moreover, this combination decreased the LR- in all tests. Noninvasive predictors can restrict endoscopic screening. This is very important as noninvasiveness is now a major goal in hepatology. Copyright © 2013 Elsevier España, S.L. and AEEH y AEG. All rights reserved.
NASA Astrophysics Data System (ADS)
Krasnoshchekov, Sergey V.; Schutski, Roman S.; Craig, Norman C.; Sibaev, Marat; Crittenden, Deborah L.
2018-02-01
Three dihalogenated methane derivatives (CH2F2, CH2FCl, and CH2Cl2) were used as model systems to compare and assess the accuracy of two different approaches for predicting observed fundamental frequencies: canonical operator Van Vleck vibrational perturbation theory (CVPT) and vibrational configuration interaction (VCI). For convenience and consistency, both methods employ the Watson Hamiltonian in rectilinear normal coordinates, expanding the potential energy surface (PES) as a Taylor series about equilibrium and constructing the wavefunction from a harmonic oscillator product basis. At the highest levels of theory considered here, fourth-order CVPT and VCI in a harmonic oscillator basis with up to 10 quanta of vibrational excitation in conjunction with a 4-mode representation sextic force field (SFF-4MR) computed at MP2/cc-pVTZ with replacement CCSD(T)/aug-cc-pVQZ harmonic force constants, the agreement between computed fundamentals is closer to 0.3 cm⁻¹ on average, with a maximum difference of 1.7 cm⁻¹. The major remaining accuracy-limiting factors are the accuracy of the underlying electronic structure model, followed by the incompleteness of the PES expansion. Nonetheless, computed and experimental fundamentals agree to within 5 cm⁻¹, with an average difference of 2 cm⁻¹, confirming the utility and accuracy of both theoretical models. One exception to this rule is the formally IR-inactive but weakly allowed through Coriolis-coupling H-C-H out-of-plane twisting mode of dichloromethane, whose spectrum we therefore revisit and reassign. We also investigate convergence with respect to order of CVPT, VCI excitation level, and order of PES expansion, concluding that premature truncation substantially decreases accuracy, although VCI(6)/SFF-4MR results are still of acceptable accuracy, and some error cancellation is observed with CVPT2 using a quartic force field.
Masaoka, T; Amano, K; Takedani, H; Suzuki, T; Otaki, M; Seita, I; Tateiwa, T; Shishido, T; Yamamoto, K; Fukutake, K
2017-03-01
Detecting signs of joint deterioration is important for early effective orthopaedic intervention in managing haemophilic arthropathy. We developed a simple, patient self-administered sheet to evaluate the joint condition, and assessed the predictive ability of this assessment sheet for the need for an orthopaedic intervention. This was a single-centre, cross-sectional study. The association between the score of each of the four items of the assessment sheet (bleeding, swelling, pain and physical impairment) and the results of radiological findings and physical examinations based on Haemophilia Joint Health Score 2.1 was assessed. An optimal scoring system was explored by the area under the curve (AUC). The cut-off value for the need for surgery or physiotherapy was determined using the receiver operating characteristic curve procedure. Forty-two patients were included. The 'physical impairment' item showed the highest correlation coefficient with the results of radiographic and physical examinations (range: 0.57-0.76). The AUC of the finally adjusted scoring indicated good ability to discriminate between patients with and without a need for orthopaedic intervention. The positive predictive value was highest at a cut-off value of 4 points for knees (63.0%) and ankles (70.0%) and at 5 points for elbows (66.7%), and the predictive accuracy was highest at a cut-off value of 4 points for all the joints. A linear trend in the need for an orthopaedic intervention was observed with an increasing score. The joint condition assessment sheet can help clinicians assess the need for orthopaedic intervention for haemophilic arthropathy in Japanese patients with haemophilia. © 2016 John Wiley & Sons Ltd.
Accuracy of genomic breeding values for meat tenderness in Polled Nellore cattle.
Magnabosco, C U; Lopes, F B; Fragoso, R C; Eifert, E C; Valente, B D; Rosa, G J M; Sainz, R D
2016-07-01
Zebu (Bos indicus) cattle, mostly of the Nellore breed, comprise more than 80% of the beef cattle in Brazil, given their tolerance of the tropical climate and high resistance to ectoparasites. Despite their advantages for production in tropical environments, zebu cattle tend to produce tougher meat than Bos taurus breeds. Traditional genetic selection to improve meat tenderness is constrained by the difficulty and cost of phenotypic evaluation for meat quality. Therefore, genomic selection may be the best strategy to improve meat quality traits. This study was performed to compare the accuracies of different Bayesian regression models in predicting molecular breeding values for meat tenderness in Polled Nellore cattle. The data set was composed of Warner-Bratzler shear force (WBSF) of longissimus muscle from 205, 141, and 81 animals slaughtered in 2005, 2010, and 2012, respectively, which were selected and mated so as to create extreme segregation for WBSF. The animals were genotyped with either the Illumina BovineHD (HD; 777,000 from 90 samples) chip or the GeneSeek Genomic Profiler (GGP Indicus HD; 77,000 from 337 samples). The quality controls of SNP were Hardy-Weinberg proportion P-value ≥ 0.1%, minor allele frequency > 1%, and call rate > 90%. The FImpute program was used for imputation from the GGP Indicus HD chip to the HD chip. The effect of each SNP was estimated using ridge regression, least absolute shrinkage and selection operator (LASSO), Bayes A, Bayes B, and Bayes Cπ methods. Different numbers of SNP were used, with 1, 2, 3, 4, 5, 7, 10, 20, 40, 60, 80, or 100% of the markers preselected based on their significance test (P-value from genomewide association studies [GWAS]) or randomly sampled. The prediction accuracy was assessed by the correlation between genomic breeding value and the observed WBSF phenotype, using a leave-one-out cross-validation methodology. 
The prediction accuracies using all markers were very similar for all models, ranging from 0.22 (Bayes Cπ) to 0.25 (Bayes B). When preselecting SNP based on GWAS results, the highest correlation (0.27) between WBSF and the genomic breeding value was achieved using the Bayesian LASSO model with 15,030 (3%) markers. Although this study used relatively few animals, the design of the segregating population ensured wide genetic variability for meat tenderness, which was important to achieve acceptable accuracy of genomic prediction. Although all models showed similar levels of prediction accuracy, some small advantages were observed with the Bayes B approach when higher numbers of markers were preselected based on their P-values resulting from a GWAS analysis.
Apostolova, Liana G.; Hwang, Kristy S.; Kohannim, Omid; Avila, David; Elashoff, David; Jack, Clifford R.; Shaw, Leslie; Trojanowski, John Q.; Weiner, Michael W.; Thompson, Paul M.
2014-01-01
Biomarkers are the only feasible way to detect and monitor presymptomatic Alzheimer's disease (AD). No single biomarker can predict future cognitive decline with an acceptable level of accuracy. In addition to designing powerful multimodal diagnostic platforms, a careful investigation of the major sources of disease heterogeneity and their influence on biomarker changes is needed. Here we investigated the accuracy of a novel multimodal biomarker classifier for differentiating cognitively normal (NC), mild cognitive impairment (MCI) and AD subjects with and without stratification by ApoE4 genotype. 111 NC, 182 MCI and 95 AD ADNI participants provided both structural MRI and CSF data at baseline. We used an automated machine-learning classifier to test the ability of hippocampal volume and CSF Aβ, t-tau and p-tau levels, both separately and in combination, to differentiate NC, MCI and AD subjects, and predict conversion. We hypothesized that the combined hippocampal/CSF biomarker classifier model would achieve the highest accuracy in differentiating between the three diagnostic groups and that ApoE4 genotype will affect both diagnostic accuracy and biomarker selection. The combined hippocampal/CSF classifier performed better than hippocampus-only classifier in differentiating NC from MCI and NC from AD. It also outperformed the CSF-only classifier in differentiating NC vs. AD. Our amyloid marker played a role in discriminating NC from MCI or AD but not for MCI vs. AD. Neurodegenerative markers contributed to accurate discrimination of AD from NC and MCI but not NC from MCI. Classifiers predicting MCI conversion performed well only after ApoE4 stratification. Hippocampal volume and sex achieved AUC = 0.68 for predicting conversion in the ApoE4-positive MCI, while CSF p-tau, education and sex achieved AUC = 0.89 for predicting conversion in ApoE4-negative MCI. 
These observations support the proposed biomarker trajectory in AD, which postulates that amyloid markers become abnormal early in the disease course while markers of neurodegeneration become abnormal later in the disease course and suggests that ApoE4 could be at least partially responsible for some of the observed disease heterogeneity. PMID:24634832
Machine learning models in breast cancer survival prediction.
Montazeri, Mitra; Montazeri, Mohadeseh; Montazeri, Mahdieh; Beigzadeh, Amin
2016-01-01
Breast cancer is one of the most common cancers with a high mortality rate among women. With early diagnosis, breast cancer survival will increase from 56% to more than 86%. Therefore, an accurate and reliable system is necessary for the early diagnosis of this cancer. The proposed model is the combination of rules and different machine learning techniques. Machine learning models can help physicians to reduce the number of false decisions. They try to exploit patterns and relationships among a large number of cases and predict the outcome of a disease using historical cases stored in datasets. The objective of this study is to propose a rule-based classification method with machine learning techniques for the prediction of different types of breast cancer survival. We use a dataset with eight attributes that includes the records of 900 patients, of whom 876 (97.3%) were female and 24 (2.7%) were male. Naive Bayes (NB), Trees Random Forest (TRF), 1-Nearest Neighbor (1NN), AdaBoost (AD), Support Vector Machine (SVM), RBF Network (RBFN), and Multilayer Perceptron (MLP) machine learning techniques with 10-fold cross-validation were used with the proposed model for the prediction of breast cancer survival. The performance of the machine learning techniques was evaluated with accuracy, precision, sensitivity, specificity, and area under the ROC curve. Out of 900 patients, 803 patients and 97 patients were alive and dead, respectively. In this study, the Trees Random Forest (TRF) technique showed better results in comparison to the other techniques (NB, 1NN, AD, SVM, RBFN and MLP). The accuracy, sensitivity and area under the ROC curve of TRF are 96%, 96% and 93%, respectively. However, the 1NN machine learning technique provided poor performance (accuracy 91%, sensitivity 91% and area under the ROC curve 78%). 
This study demonstrates that Trees Random Forest model (TRF) which is a rule-based classification model was the best model with the highest level of accuracy. Therefore, this model is recommended as a useful tool for breast cancer survival prediction as well as medical decision making.
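The 10-fold evaluation scheme mentioned above can be sketched as an index split. This is a generic illustration rather than the authors' pipeline: the function name, the random seed and the use of plain (unstratified) folds are assumptions, and only the sample count of 900 comes from the abstract.

```python
# Sketch of 10-fold cross-validation index splitting for 900 records.
# Each record lands in exactly one test fold; the other nine folds form the
# training set for that round. Stratification by outcome is not shown.
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(900, k=10))
for round_number, (train, test) in enumerate(splits, start=1):
    print(round_number, len(train), len(test))
```

With 803 survivors and 97 deaths in the data set, a stratified split that preserves the class ratio in every fold would probably be a safer choice than the plain split shown here.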
Lou, Wangchao; Wang, Xiaoqing; Chen, Fan; Chen, Yixiao; Jiang, Bo; Zhang, Hua
2014-01-01
Developing an efficient method for the determination of DNA-binding proteins, given their vital roles in gene regulation, is highly desirable since it would be invaluable for advancing our understanding of protein functions. In this study, we proposed a new method for the prediction of DNA-binding proteins, performing feature ranking with random forest and wrapper-based feature selection with a forward best-first search strategy. The features comprise information from primary sequence, predicted secondary structure, predicted relative solvent accessibility, and position specific scoring matrix. The proposed method, called DBPPred, used Gaussian naïve Bayes as the underlying classifier since it outperformed five other classifiers, including decision tree, logistic regression, k-nearest neighbor, support vector machine with polynomial kernel, and support vector machine with radial basis function. As a result, the proposed DBPPred yields the highest average accuracy of 0.791 and average MCC of 0.583 according to five-fold cross validation with ten runs on the training benchmark dataset PDB594. Subsequently, blind tests on the independent dataset PDB186 were performed by the proposed model trained on the entire PDB594 dataset and by five other existing methods (including iDNA-Prot, DNA-Prot, DNAbinder, DNABIND and DBD-Threader); the proposed DBPPred yielded the highest accuracy of 0.769, MCC of 0.538, and AUC of 0.790. The independent tests performed by the proposed DBPPred on a large non-DNA binding protein dataset and two RNA binding protein datasets also showed improved or comparable quality when compared with the relevant prediction methods. Moreover, we observed that the majority of the features selected by the proposed method differ statistically significantly in mean value between the DNA-binding and the non-DNA-binding proteins. 
All of the experimental results indicate that the proposed DBPPred can be an alternative perspective predictor for large-scale determination of DNA-binding proteins. PMID:24475169
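The accuracy and MCC figures quoted above are both derived from a binary confusion matrix. A minimal sketch of that calculation follows, assuming the standard definition of the Matthews correlation coefficient; the function name and the counts are hypothetical and are not taken from the study.

```python
from math import sqrt

def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc

# Hypothetical counts for illustration only; not data from the abstract.
acc, mcc = accuracy_and_mcc(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the matrix, which is one reason it is often reported alongside accuracy for classifiers of this kind.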
Cho, Sooyoung; Shin, Aesun; Song, Daesub; Park, Jae Kyung; Kim, Yeonjung; Choi, Ji-Yeob; Kang, Daehee; Lee, Jong-Koo
2017-10-01
To assess the validity of the cohort study participants' self-reported cancer history via data linkage to a cancer registry database. We included 143,965 participants from the Health Examinees (HEXA) study recruited between 2004 and 2013 who gave informed consent for record linkage to the Korean Central Cancer Registry (KCCR). The sensitivity and the positive predictive value of self-reported histories of cancer were calculated and 95% confidence intervals were estimated. A total of 4,860 participants who had at least one record in the KCCR were included in the calculation of sensitivity. In addition, 3,671 participants who reported a cancer history at enrollment were included in the calculation of positive predictive value. The overall sensitivity of self-reported cancer history was 72.0%. Breast cancer history among women showed the highest sensitivity (81.2%), whereas the lowest sensitivity was observed for liver cancer (53.7%) and cervical cancer (52.1%). The overall positive predictive value was 81.9%. The highest positive predictive value was observed for thyroid cancer (96.1%) and prostate cancer (96.1%), and the lowest was observed for cervical cancer (43.7%). The accuracy of self-reported cancer history varied by cancer site and may not be sufficient to ascertain cancer incidence, especially for cervical and bladder cancers. Copyright © 2017. Published by Elsevier Ltd.
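The abstract reports proportions with 95% confidence intervals but does not say how the intervals were computed. As a sketch only, the Wilson score interval is one common choice for a binomial proportion; using it here is an assumption, and the count of 3,499 is back-calculated from the reported 72.0% of 4,860 rather than stated in the abstract.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Approximate, back-calculated count (assumption): 72.0% of 4,860 registry cases.
sensitivity = 3499 / 4860
low, high = wilson_interval(3499, 4860)
print(f"sensitivity={sensitivity:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```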
Predictive values of thermal and electrical dental pulp tests: a clinical study.
Villa-Chávez, Carlos E; Patiño-Marín, Nuria; Loyola-Rodríguez, Juan P; Zavala-Alonso, Norma V; Martínez-Castañón, Gabriel A; Medina-Solís, Carlo E
2013-08-01
For a diagnostic test to be useful, it is necessary to determine the probability that the test will provide the correct diagnosis. Therefore, it is necessary to calculate the predictive value of diagnostics. The aim of the present study was to identify the sensitivity, specificity, positive and negative predictive values, accuracy, and reproducibility of thermal and electrical tests of pulp sensitivity. The thermal tests studied were the 1,1,1,2-tetrafluoroethane (cold) and hot gutta-percha (hot) tests. For the electrical test, the Analytic Technology Pulp Tester (Analytic Technology, Redmond, WA) was used. A total of 110 teeth were tested: 60 teeth with vital pulp and 50 teeth with necrotic pulps (disease prevalence of 45%). The ideal standard was established by direct pulp inspection. The sensitivities of the diagnostic tests were 0.88 for the cold test, 0.86 for the heat test, and 0.76 for the electrical test, and the specificity was 1.0 for all 3 tests. The negative predictive value was 0.90 for the cold test, 0.89 for the heat test, and 0.83 for the electrical test, and the positive predictive value was 1.0 for all 3 tests. The highest accuracy (0.94) and reproducibility (0.88) were observed for the cold test. The cold test was the most accurate method for diagnostic testing. Copyright © 2013 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
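The measures named in this abstract all follow from a 2x2 table of test result against the reference standard. The sketch below shows the arithmetic; the abstract does not give the cell counts, so the values used for the cold test (44 true positives, 6 false negatives, 60 true negatives, 0 false positives) are an assumption chosen to be roughly consistent with the reported 50 necrotic teeth, 60 vital teeth, sensitivity of 0.88 and specificity of 1.0.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard diagnostic accuracy measures from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased correctly identified
        "specificity": tn / (tn + fp),   # healthy correctly identified
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Assumed counts for the cold test (not stated in the abstract).
cold = diagnostic_metrics(tp=44, fn=6, tn=60, fp=0)
for name, value in cold.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the negative predictive value and accuracy come out close to the reported 0.90 and 0.94, which suggests the counts are plausible, although they remain an inference.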
Cha, Jongtae; Kim, Soyoung; Wang, Jiyoung; Yun, Mijin; Cho, Arthur
2018-02-01
The purpose of this study was to investigate the value of 18F-fluorodeoxyglucose positron emission tomography/computed tomography (FDG PET/CT) parameters in the detection of regional lymph node (LN) metastasis in patients with cutaneous melanoma. We evaluated patients with cutaneous melanoma who underwent FDG PET/CT for initial staging or recurrence evaluation. A total of 103 patients were enrolled, and 165 LNs were evaluated. LNs that were confirmed pathologically or by follow-up imaging were included in this study. PET parameters, including maximum standardized uptake value (SUVmax), total lesion glycolysis and tumour-to-liver ratio, were used to determine the presence of metastases, and the results were compared with CT-determined LN metastasis. Receiver operating characteristic (ROC) curve analysis was used to determine the optimal cut-off values of the FDG PET parameters. A total of 93 LNs were malignant, and 84 LNs were smaller than 10 mm. In all 165 LNs, an SUVmax of >2.51 showed a sensitivity of 73.1%, a specificity of 88.9%, and an accuracy of 80.0% in detecting metastatic LNs. CT showed a higher specificity (87.3%) and lower accuracy (65.5%). For non-enlarged regional LNs (<10 mm), an SUVmax cut-off value of 1.4 showed the highest negative predictive value (81.3%). For enlarged LNs (≥10 mm), an SUVmax cut-off value of 2.4 showed the highest sensitivity (90.7%) and accuracy (88.9%) in detecting metastatic LNs. In patients with cutaneous melanoma, an SUVmax of >2.4 showed a high sensitivity (91%) and accuracy (89%) in detecting metastasis in LNs ≥1 cm, and LNs <1 cm with an SUVmax <1.4 were likely to be benign.
Grimmer, K; Milanese, S; Beaton, K; Atlas, A
2014-01-01
The Hospital Admission Risk Profile (HARP) instrument is commonly used to assess risk of functional decline when older people are admitted to hospital. HARP has moderate diagnostic accuracy (65%) for downstream decreased scores in activities of daily living. This paper reports the diagnostic accuracy of HARP for downstream quality of life. It also tests whether adding other measures to HARP improves its diagnostic accuracy. One hundred and forty-eight independent community dwelling individuals aged 65 years or older were recruited in the emergency department of one large Australian hospital with a medical problem for which they were discharged without a hospital ward admission. Data, including age, sex, primary language, highest level of education, postcode, living status, requiring care for daily activities, using a gait aid, receiving formal community supports, instrumental activities of daily living in the last week, hospitalization and falls in the last 12 months, and mental state were collected at recruitment. HARP scores were derived from a formula that summed scores assigned to age, activities of daily living, and mental state categories. Physical and mental component scores of a quality of life measure were captured by telephone interview at 1 and 3 months after recruitment. HARP scores are moderately accurate at predicting downstream decline in physical quality of life, but did not predict downstream decline in mental quality of life. The addition of other variables to HARP did not improve its diagnostic accuracy for either measure of quality of life. HARP is a poor predictor of quality of life.
Radiogenomics to characterize regional genetic heterogeneity in glioblastoma
Hu, Leland S.; Ning, Shuluo; Eschbacher, Jennifer M.; Baxter, Leslie C.; Gaw, Nathan; Ranjbar, Sara; Plasencia, Jonathan; Dueck, Amylou C.; Peng, Sen; Smith, Kris A.; Nakaji, Peter; Karis, John P.; Quarles, C. Chad; Wu, Teresa; Loftus, Joseph C.; Jenkins, Robert B.; Sicotte, Hugues; Kollmeyer, Thomas M.; O'Neill, Brian P.; Elmquist, William; Hoxworth, Joseph M.; Frakes, David; Sarkaria, Jann; Swanson, Kristin R.; Tran, Nhan L.; Li, Jing; Mitchell, J. Ross
2017-01-01
Background Glioblastoma (GBM) exhibits profound intratumoral genetic heterogeneity. Each tumor comprises multiple genetically distinct clonal populations with different therapeutic sensitivities. This has implications for targeted therapy and genetically informed paradigms. Contrast-enhanced (CE)-MRI and conventional sampling techniques have failed to resolve this heterogeneity, particularly for nonenhancing tumor populations. This study explores the feasibility of using multiparametric MRI and texture analysis to characterize regional genetic heterogeneity throughout MRI-enhancing and nonenhancing tumor segments. Methods We collected multiple image-guided biopsies from primary GBM patients throughout regions of enhancement (ENH) and nonenhancing parenchyma (so called brain-around-tumor, [BAT]). For each biopsy, we analyzed DNA copy number variants for core GBM driver genes reported by The Cancer Genome Atlas. We co-registered biopsy locations with MRI and texture maps to correlate regional genetic status with spatially matched imaging measurements. We also built multivariate predictive decision-tree models for each GBM driver gene and validated accuracies using leave-one-out-cross-validation (LOOCV). Results We collected 48 biopsies (13 tumors) and identified significant imaging correlations (univariate analysis) for 6 driver genes: EGFR, PDGFRA, PTEN, CDKN2A, RB1, and TP53. Predictive model accuracies (on LOOCV) varied by driver gene of interest. Highest accuracies were observed for PDGFRA (77.1%), EGFR (75%), CDKN2A (87.5%), and RB1 (87.5%), while lowest accuracy was observed in TP53 (37.5%). Models for 4 driver genes (EGFR, RB1, CDKN2A, and PTEN) showed higher accuracy in BAT samples (n = 16) compared with those from ENH segments (n = 32). Conclusion MRI and texture analysis can help characterize regional genetic heterogeneity, which offers potential diagnostic value under the paradigm of individualized oncology. PMID:27502248
Accuracy of Zika virus disease case definition during simultaneous Dengue and Chikungunya epidemics.
Braga, José Ueleres; Bressan, Clarisse; Dalvi, Ana Paula Razal; Calvet, Guilherme Amaral; Daumas, Regina Paiva; Rodrigues, Nadia; Wakimoto, Mayumi; Nogueira, Rita Maria Ribeiro; Nielsen-Saines, Karin; Brito, Carlos; Bispo de Filippis, Ana Maria; Brasil, Patrícia
2017-01-01
Zika is a new disease in the American continent and its surveillance is of utmost importance, especially because of its ability to cause neurological manifestations such as Guillain-Barré syndrome and serious congenital malformations through vertical transmission. The detection of suspected cases by the surveillance system depends on the case definition adopted. As the laboratory diagnosis of Zika infection still relies on the use of expensive and complex molecular techniques with low sensitivity due to a narrow window of detection, most suspected cases are not confirmed by laboratory tests, which are mainly reserved for pregnant women and newborns. In this context, an accurate definition of a suspected Zika case is crucial in order for the surveillance system to gauge the magnitude of an epidemic. We evaluated the accuracy of various Zika case definitions in a scenario where Dengue and Chikungunya viruses co-circulate. Signs and symptoms that best discriminated PCR confirmed Zika from other laboratory confirmed febrile or exanthematic diseases were identified to propose and test predictive models for Zika infection based on these clinical features. Our derived score prediction model had the best performance because it demonstrated the highest sensitivity and specificity, 86·6% and 78·3%, respectively. This Zika case definition also had the highest values for auROC (0·903) and R2 (0·417), and the lowest Brier score (0·096). In areas where multiple arboviruses circulate, the presence of rash with pruritus or conjunctival hyperemia, without any other general clinical manifestations such as fever, petechia or anorexia, is the best Zika case definition.
Lee, Young Seok; Park, Sunghoon; Oh, Yeon-Mok; Lee, Sang-Do; Park, Sung-Woo; Kim, Young Sam; In, Kwang Ho; Jung, Bock Hyun; Lee, Kwan Ho; Ra, Seung Won; Hwang, Yong Il; Park, Yong-Bum
2013-01-01
This study was conducted to investigate the association between the chronic obstructive pulmonary disease (COPD) assessment test (CAT) and depression in COPD patients. The Korean versions of the CAT and patient health questionnaire-9 (PHQ-9) were used to assess COPD symptoms and depressive disorder, respectively. In total, 803 patients with COPD were enrolled from 32 hospitals and the prevalence of depression was 23.8%. The CAT score correlated well with the PHQ-9 score (r=0.631; P<0.001) and was significantly associated with the presence of depression (β±standard error, 0.452±0.020; P<0.001). There was a tendency toward increasing severity of depression in patients with higher CAT scores. By assessment groups based on the 2011 Global Initiative for Chronic Obstructive Lung Disease guidelines, the prevalence of depression was affected more by current symptoms than by airway limitation. The area under the receiver operating characteristic curve for the CAT was 0.849 for predicting depression, and CAT scores ≥21 had the highest accuracy rate (80.6%). Among the eight CAT items, energy score showed the best correlation and highest power of discrimination. CAT scores are significantly associated with the presence of depression and have good performance for predicting depression in COPD patients. PMID:23853488
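Reporting the cut-off with the highest accuracy, as done above for CAT scores ≥21, amounts to scanning candidate thresholds and scoring each one. The following is a minimal sketch of that scan; the function name, the tie-breaking rule (lowest cut-off wins) and the toy scores and labels are all hypothetical and are not study data.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, accuracy) maximizing accuracy for the rule score >= cutoff.

    labels are 1 for the condition present and 0 for absent.
    Ties are resolved in favour of the lowest cutoff.
    """
    best_c, best_acc = None, -1.0
    for c in sorted(set(scores)):
        correct = sum((s >= c) == bool(y) for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:
            best_c, best_acc = c, acc
    return best_c, best_acc

# Toy data for illustration only.
toy_scores = [5, 10, 15, 20, 25, 30, 35, 40]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
cutoff, acc_at_cutoff = best_cutoff(toy_scores, toy_labels)
print(cutoff, acc_at_cutoff)
```

A cut-off chosen this way is tuned to the sample it was derived from, so its accuracy would normally be checked again on independent data.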
Pearce, J; Ferrier, S; Scotts, D
2001-06-01
To use models of species distributions effectively in conservation planning, it is important to determine the predictive accuracy of such models. Extensive modelling of the distribution of vascular plant and vertebrate fauna species within north-east New South Wales has been undertaken by linking field survey data to environmental and geographical predictors using logistic regression. These models have been used in the development of a comprehensive and adequate reserve system within the region. We evaluate the predictive accuracy of models for 153 small reptile, arboreal marsupial, diurnal bird and vascular plant species for which independent evaluation data were available. The predictive performance of each model was evaluated using the relative operating characteristic curve to measure discrimination capacity. Good discrimination ability implies that a model's predictions provide an acceptable index of species occurrence. The discrimination capacity of 89% of the models was significantly better than random, with 70% of the models providing high levels of discrimination. Predictions generated by this type of modelling therefore provide a reasonably sound basis for regional conservation planning. The discrimination ability of models was highest for the less mobile biological groups, particularly the vascular plants and small reptiles. In the case of diurnal birds, poor performing models tended to be for species which occur mainly within specific habitats not well sampled by either the model development or evaluation data, highly mobile species, species that are locally nomadic or those that display very broad habitat requirements. Particular care needs to be exercised when employing models for these types of species in conservation planning.
Carlton, Joshua A; Maxwell, Adam W; Bauer, Lyndsey B; McElroy, Sara M; Layfield, Lester J; Ahsan, Humera; Agarwal, Ajay
2017-06-01
Background and purpose In patients with squamous cell carcinoma of the head and neck (HNSCC), extracapsular spread (ECS) of metastases in cervical lymph nodes affects prognosis and therapy. We assessed the accuracy of intravenous contrast-enhanced computed tomography (CT) and the utility of imaging criteria for preoperative detection of ECS in metastatic cervical lymph nodes in patients with HNSCC. Materials and methods Preoperative intravenous contrast-enhanced neck CT images of 93 patients with histopathological HNSCC metastatic nodes were retrospectively assessed by two neuroradiologists for ECS status and ECS imaging criteria. Radiological assessments were compared with histopathological assessments of neck dissection specimens, and interobserver agreement of ECS status and ECS imaging criteria were measured. Results Sensitivity, specificity, positive predictive value, and accuracy for overall ECS assessment were 57%, 81%, 82% and 67% for observer 1, and 66%, 76%, 80% and 70% for observer 2, respectively. Correlating three or more ECS imaging criteria with histopathological ECS increased specificity and positive predictive value, but decreased sensitivity and accuracy. Interobserver agreement for overall ECS assessment demonstrated a kappa of 0.59. Central necrosis had the highest kappa of 0.74. Conclusion CT has moderate specificity for ECS assessment in HNSCC metastatic cervical nodes. Identifying three or more ECS imaging criteria raises specificity and positive predictive value, therefore preoperative identification of multiple criteria may be clinically useful. Interobserver agreement is moderate for overall ECS assessment, substantial for central necrosis. Other ECS CT criteria had moderate agreement at best and therefore should not be used individually as criteria for detecting ECS by CT.
Systematic bias of correlation coefficient may explain negative accuracy of genomic prediction.
Zhou, Yao; Vales, M Isabel; Wang, Aoxue; Zhang, Zhiwu
2017-09-01
Accuracy of genomic prediction is commonly calculated as the Pearson correlation coefficient between the predicted and observed phenotypes in the inference population by using cross-validation analysis. More frequently than expected, significant negative accuracies of genomic prediction have been reported in genomic selection studies. These negative values are surprising, given that the minimum value for prediction accuracy should hover around zero when randomly permuted data sets are analyzed. We reviewed the two common approaches for calculating the Pearson correlation and hypothesized that these negative accuracy values reflect potential bias owing to artifacts caused by the mathematical formulas used to calculate prediction accuracy. The first approach, Instant accuracy, calculates correlations for each fold and reports prediction accuracy as the mean of correlations across folds. The other approach, Hold accuracy, predicts all phenotypes in all folds and calculates the correlation between the observed and predicted phenotypes at the end of the cross-validation process. Using simulated and real data, we demonstrated that our hypothesis is true. Both approaches are biased downward under certain conditions. The biases become larger when more folds are employed and when the expected accuracy is low. The bias of Instant accuracy can be corrected using a modified formula. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
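The two approaches described above differ only in where the correlation is taken: per fold and then averaged (Instant), or once over the pooled predictions (Hold). A minimal sketch is given below, assuming predictions and fold labels are already available; the function names and toy numbers are hypothetical, and the sketch does not implement the corrected formula mentioned in the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def instant_accuracy(observed, predicted, folds):
    """Mean of the per-fold correlations."""
    per_fold = []
    for f in sorted(set(folds)):
        idx = [i for i, g in enumerate(folds) if g == f]
        per_fold.append(pearson([observed[i] for i in idx],
                                [predicted[i] for i in idx]))
    return sum(per_fold) / len(per_fold)

def hold_accuracy(observed, predicted):
    """Single correlation over predictions pooled from all folds."""
    return pearson(observed, predicted)

# Toy example: fold 0 predictions carry a constant offset of +10.
obs = [1, 2, 3, 4, 5, 6]
pred = [11, 12, 13, 4, 5, 6]
fold_ids = [0, 0, 0, 1, 1, 1]
inst = instant_accuracy(obs, pred, fold_ids)
hold = hold_accuracy(obs, pred)
print(inst, hold)
```

In this toy case a fold-specific offset leaves every per-fold correlation untouched but pulls the pooled correlation down, which illustrates only that the two summaries can disagree, not the specific bias analyzed in the paper.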
SU-F-R-04: Radiomics for Survival Prediction in Glioblastoma (GBM)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, H; Molitoris, J; Bhooshan, N
Purpose: To develop a quantitative radiomics approach for survival prediction of glioblastoma (GBM) patients treated with chemoradiotherapy (CRT). Methods: 28 GBM patients who received CRT at our institution were retrospectively studied. 255 radiomic features were extracted from 3 gadolinium-enhanced T1 weighted MRIs for 2 regions of interest (ROIs) (the surgical cavity and its surrounding enhancement rim). The 3 MRIs were at pre-treatment, 1-month and 3-month post-CRT. The imaging features comprehensively quantified the intensity, spatial variation (texture), geometric property and their spatial-temporal changes for the 2 ROIs. 3 demographics features (age, race, gender) and 12 clinical parameters (KPS, extent of resection, whether concurrent temozolomide was adjusted/stopped and radiotherapy related information) were also included. 4 Machine learning models (logistic regression (LR), support vector machine (SVM), decision tree (DT), neural network (NN)) were applied to predict overall survival (OS) and progression-free survival (PFS). The number of cases and percentage of cases predicted correctly were collected and AUC (area under the receiver operating characteristic (ROC) curve) were determined after leave-one-out cross-validation. Results: From univariate analysis, 27 features (1 demographic, 1 clinical and 25 imaging) were statistically significant (p<0.05) for both OS and PFS. Two sets of features (each contained 24 features) were algorithmically selected from all features to predict OS and PFS. High prediction accuracy of OS was achieved by using NN (96%, 27 of 28 cases were correctly predicted, AUC = 0.99), LR (93%, 26 of 28 cases were correctly predicted, AUC = 0.95) and SVM (93%, 26 of 28 cases were correctly predicted, AUC = 0.90). When predicting PFS, NN obtained the highest prediction accuracy (89%, 25 of 28 cases were correctly predicted, AUC = 0.92). Conclusion: Radiomics approach combined with patients’ demographics and clinical parameters can accurately predict survival in GBM patients treated with CRT.
Model-based prediction of myelosuppression and recovery based on frequent neutrophil monitoring.
Netterberg, Ida; Nielsen, Elisabet I; Friberg, Lena E; Karlsson, Mats O
2017-08-01
To investigate whether a more frequent monitoring of the absolute neutrophil counts (ANC) during myelosuppressive chemotherapy, together with model-based predictions, can improve therapy management, compared to the limited clinical monitoring typically applied today. Daily ANC in chemotherapy-treated cancer patients were simulated from a previously published population model describing docetaxel-induced myelosuppression. The simulated values were used to generate predictions of the individual ANC time-courses, given the myelosuppression model. The accuracy of the predicted ANC was evaluated under a range of conditions with reduced amount of ANC measurements. The predictions were most accurate when more data were available for generating the predictions and when making short forecasts. The inaccuracy of ANC predictions was highest around nadir, although a high sensitivity (≥90%) was demonstrated to forecast Grade 4 neutropenia before it occurred. The time for a patient to recover to baseline could be well forecasted 6 days (±1 day) before the typical value occurred on day 17. Daily monitoring of the ANC, together with model-based predictions, could improve anticancer drug treatment by identifying patients at risk for severe neutropenia and predicting when the next cycle could be initiated.
Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?
2017-01-01
Assessing the accuracy of predictive models is critical because predictive models have been increasingly used across various disciplines and predictive accuracy determines the quality of resultant predictions. Pearson product-moment correlation coefficient (r) and the coefficient of determination (r2) are among the most widely used measures for assessing predictive models for numerical data, although they are argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate what is used in the calculation of r and r2, and simulations were used to demonstrate the behaviour of r and r2 and to compare three accuracy measures under various scenarios. Relevant confusion about r and r2 has been clarified. The calculation of r and r2 is not based on the differences between the predicted and observed values. The existing error measures suffer various limitations and are unable to tell the accuracy. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe’s efficiency (E1) is also an alternative accuracy measure. The r and r2 do not measure the accuracy and are incorrect accuracy measures. The existing error measures suffer limitations. VEcv and E1 are recommended for assessing the accuracy. The applications of these accuracy measures would encourage accuracy-improved predictive models to be developed to generate predictions for evidence-informed decision-making. PMID:28837692
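The abstract recommends VEcv and E1 without giving their formulas. The sketch below assumes the definitions in common use: VEcv as one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean, and E1 as the same ratio built from absolute values, both expressed as percentages. The function names and the toy values are hypothetical, and the predictions are assumed to come from cross-validation.

```python
def vecv(observed, predicted):
    """Variance explained (%), assuming predictions come from cross-validation."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return (1 - sse / sst) * 100

def e1(observed, predicted):
    """Legates and McCabe's efficiency (%), based on absolute differences."""
    mean_obs = sum(observed) / len(observed)
    sae = sum(abs(o - p) for o, p in zip(observed, predicted))
    sat = sum(abs(o - mean_obs) for o in observed)
    return (1 - sae / sat) * 100

# Toy values for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
print(vecv(y_obs, y_hat), e1(y_obs, y_hat))
```

Both measures are computed from the differences between predicted and observed values, which is the property the abstract says r and r2 lack: a model that always predicts the observed mean scores 0, and a perfect model scores 100.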
Anzalone, Nicoletta; Castellano, Antonella; Cadioli, Marcello; Conte, Gian Marco; Cuccarini, Valeria; Bizzi, Alberto; Grimaldi, Marco; Costa, Antonella; Grillea, Giovanni; Vitali, Paolo; Aquino, Domenico; Terreni, Maria Rosa; Torri, Valter; Erickson, Bradley J; Caulo, Massimo
2018-06-01
Purpose To evaluate the feasibility of a standardized protocol for acquisition and analysis of dynamic contrast material-enhanced (DCE) and dynamic susceptibility contrast (DSC) magnetic resonance (MR) imaging in a multicenter clinical setting and to verify its accuracy in predicting glioma grade according to the new World Health Organization 2016 classification. Materials and Methods The local research ethics committees of all centers approved the study, and informed consent was obtained from patients. One hundred patients with glioma were prospectively examined at 3.0 T in seven centers that performed the same preoperative MR imaging protocol, including DCE and DSC sequences. Two independent readers identified the perfusion hotspots on maps of volume transfer constant (Ktrans), plasma (vp) and extravascular-extracellular space (ve) volumes, initial area under the concentration curve, and relative cerebral blood volume (rCBV). Differences in parameters between grades and molecular subtypes were assessed by using Kruskal-Wallis and Mann-Whitney U tests. Diagnostic accuracy was evaluated by using receiver operating characteristic curve analysis. Results The whole protocol was tolerated in all patients. Perfusion maps were successfully obtained in 94 patients. An excellent interreader reproducibility of DSC- and DCE-derived measures was found. Among DCE-derived parameters, vp and ve had the highest accuracy (area under the receiver operating characteristic curve [Az] = 0.847 and 0.853) for glioma grading. DSC-derived rCBV had the highest accuracy (Az = 0.894), but the difference was not statistically significant (P > .05). Among lower-grade gliomas, a moderate increase in both vp and rCBV was evident in isocitrate dehydrogenase wild-type tumors, although this was not significant (P > .05). Conclusion A standardized multicenter acquisition and analysis protocol of DCE and DSC MR imaging is feasible and highly reproducible. Both techniques showed a comparable, high diagnostic accuracy for grading gliomas. © RSNA, 2018 Online supplemental material is available for this article.
Tracking on non-active collaborative objects from San Fernando Laser station
NASA Astrophysics Data System (ADS)
Catalán, Manuel; Quijano, Manuel; Cortina, Luis M.; Pazos, Antonio A.; Martín-Davila, José
2016-04-01
The Royal Observatory of the Spanish Navy (ROA) has worked on satellite geodesy since the early days of the space age, when its first artificial satellite tracking telescope, the Baker-Nunn camera, was installed in 1958. In 1975 a French satellite laser ranging (SLR) station was installed and operated at ROA. Since 1980, ROA has operated this instrument, which was upgraded to a third-generation system and is still continuously updated to reach the highest level of operability. Since then ROA has participated in different space geodesy campaigns through the International Laser Ranging Service (ILRS) or its European regional organization (EUROLAS), tracking a number of artificial satellite types: ERS, ENVISAT, LAGEOS, and TOPEX-POSEIDON, to name but a few. Recently we opened a new field of research: space debris tracking, which is receiving increasing importance and attention from international space agencies. The main problem is the relatively low accuracy of commonly used methods. It is clear that improving the predicted orbit accuracy is necessary to fulfill our aims (e.g., avoiding unnecessary anti-collision maneuvers). Following results obtained by colleagues elsewhere (Austria, China, USA, ...), we proposed to share our time schedule, using our satellite ranging station to obtain data that will make orbital element predictions far more accurate (sub-meter accuracy) while we continue our tracking routines for active satellites. In this communication we report the actions carried out to date.
Herrick, Ariane L; Peytrignet, Sebastien; Lunt, Mark; Pan, Xiaoyan; Hesselstrand, Roger; Mouthon, Luc; Silman, Alan J; Dinsdale, Graham; Brown, Edith; Czirják, László; Distler, Jörg H W; Distler, Oliver; Fligelstone, Kim; Gregory, William J; Ochiel, Rachel; Vonk, Madelon C; Ancuţa, Codrina; Ong, Voon H; Farge, Dominique; Hudson, Marie; Matucci-Cerinic, Marco; Balbir-Gurman, Alexandra; Midtvedt, Øyvind; Jobanputra, Paresh; Jordan, Alison C; Stevens, Wendy; Moinzadeh, Pia; Hall, Frances C; Agard, Christian; Anderson, Marina E; Diot, Elisabeth; Madhok, Rajan; Akil, Mohammed; Buch, Maya H; Chung, Lorinda; Damjanov, Nemanja S; Gunawardena, Harsha; Lanyon, Peter; Ahmad, Yasmeen; Chakravarty, Kuntal; Jacobsen, Søren; MacGregor, Alexander J; McHugh, Neil; Müller-Ladner, Ulf; Riemekasten, Gabriela; Becker, Michael; Roddy, Janet; Carreira, Patricia E; Fauchais, Anne Laure; Hachulla, Eric; Hamilton, Jennifer; İnanç, Murat; McLaren, John S; van Laar, Jacob M; Pathare, Sanjay; Proudman, Susanna M; Rudin, Anna; Sahhar, Joanne; Coppere, Brigitte; Serratrice, Christine; Sheeran, Tom; Veale, Douglas J; Grange, Claire; Trad, Georges-Selim; Denton, Christopher P
2018-04-01
Our aim was to use the opportunity provided by the European Scleroderma Observational Study to (1) identify and describe those patients with early diffuse cutaneous systemic sclerosis (dcSSc) with progressive skin thickness, and (2) derive prediction models for progression over 12 months, to inform future randomised controlled trials (RCTs). The modified Rodnan skin score (mRSS) was recorded every 3 months in 326 patients. 'Progressors' were defined as those experiencing a 5-unit and 25% increase in mRSS score over 12 months (±3 months). Logistic models were fitted to predict progression and, using receiver operating characteristic (ROC) curves, were compared on the basis of the area under curve (AUC), accuracy and positive predictive value (PPV). 66 patients (22.5%) progressed, 227 (77.5%) did not (33 could not have their status assessed due to insufficient data). Progressors had shorter disease duration (median 8.1 vs 12.6 months, P=0.001) and lower mRSS (median 19 vs 21 units, P=0.030) than non-progressors. Skin score was highest, and peaked earliest, in the anti-RNA polymerase III (Pol3+) subgroup (n=50). A first predictive model (including mRSS, duration of skin thickening and their interaction) had an accuracy of 60.9%, AUC of 0.666 and PPV of 33.8%. By adding a variable for Pol3 positivity, the model reached an accuracy of 71%, AUC of 0.711 and PPV of 41%. Two prediction models for progressive skin thickening were derived, for use both in clinical practice and for cohort enrichment in RCTs. These models will inform recruitment into the many clinical trials of dcSSc projected for the coming years. NCT02339441. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
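The progressor definition above combines an absolute and a relative criterion. A minimal sketch of that rule follows, assuming both thresholds are meant as "at least" (an increase of 5 or more units that is also 25% or more of the baseline score); the function name, that reading of the thresholds, and the example scores are assumptions, and the ±3 month follow-up window is not modelled.

```python
def is_progressor(baseline_mrss, followup_mrss, min_units=5, min_fraction=0.25):
    """True if mRSS rose by at least min_units AND at least min_fraction of baseline."""
    increase = followup_mrss - baseline_mrss
    return increase >= min_units and increase >= min_fraction * baseline_mrss

# Hypothetical scores for illustration only.
examples = [(20, 25), (30, 35), (10, 13), (8, 13)]
flags = [is_progressor(b, f) for b, f in examples]
print(flags)
```

Requiring both criteria means a 5-unit rise from a high baseline (30 to 35) does not count, because it falls short of 25% of the starting score.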
Prediction of Peaks of Seasonal Influenza in Military Health-Care Data
Buczak, Anna L.; Baugher, Benjamin; Guven, Erhan; Moniz, Linda; Babin, Steven M.; Chretien, Jean-Paul
2016-01-01
Influenza is a highly contagious disease that causes seasonal epidemics with significant morbidity and mortality. The ability to predict influenza peak several weeks in advance would allow for timely preventive public health planning and interventions to be used to mitigate these outbreaks. Because influenza may also impact the operational readiness of active duty personnel, the US military places a high priority on surveillance and preparedness for seasonal outbreaks. A method for creating models for predicting peak influenza visits per total health-care visits (ie, activity) weeks in advance has been developed using advanced data mining techniques on disparate epidemiological and environmental data. The model results are presented and compared with those of other popular data mining classifiers. By rigorously testing the model on data not used in its development, it is shown that this technique can predict the week of highest influenza activity for a specific region with overall better accuracy than other methods examined in this article. PMID:27127415
Triple Test in Carcinoma Breast
Sameer; Mukherjee, Arindam
2014-01-01
Introduction: The commonest clinical presentation in the majority of breast pathology is a lump. A definite diagnosis of a breast lump is very important for the surgeon to decide on the final course of treatment, and a definite preoperative diagnosis of a benign lesion also saves the patient from unnecessary physical, emotional and psychological trauma. The present study was done to evaluate the effectiveness and relevance of the “TRIPLE TEST” in the diagnosis of carcinoma breast in a rural labour-class population. Materials and Methods: The present study was a prospective study conducted on patients over 35 years of age having palpable breast lumps presenting in the outpatient department of general surgery, ESI Hospital, Basaidarapur, New Delhi, India. The duration of the study was from May 2007 to June 2009 and a total of 100 cases were studied. Each patient was subjected to a detailed history, clinical breast examination, diagnostic mammography and FNAC. In this study, the results of each modality were divided into three groups: benign, suspicious and malignant. The sensitivity, specificity, positive predictive value, negative predictive value and diagnostic accuracy of each test were calculated individually and in combination. Result: Out of 100 patients enrolled in this study, 60 cases were benign and 40 cases were of malignant breast disease. The age of patients with carcinoma breast in the series varied from 35 years to 70 years. The highest incidence of malignancy noted was 30% in 41-50 years age group (4th decade) followed by 27.5% in 51-60 years age group (5th decade). For clinical examination, the sensitivity was 75%, the specificity 83.3%, the positive predictive value (PPV) 75% and the diagnostic accuracy 80%. The sensitivity, specificity, positive predictive value and diagnostic accuracy of mammography were 94.9%, 90%, 86% and 92%, respectively.
The sensitivity, specificity, positive predictive value and diagnostic accuracy of FNAC were 94.7%, 98.3%, 97.3% and 96.6%, respectively. Of the 100 cases, the triple test was concordant (all three tests either benign or malignant) in 80; all the cases classified as benign by the triple test were benign on final biopsy, i.e. 100% specificity and 100% negative predictive value. Conclusion: TTS is an accurate and minimally invasive diagnostic test on the basis of which definitive treatment can be initiated. PMID:25478391
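The measures reported above all derive from a 2x2 table of test result against final diagnosis. The sketch below shows the standard formulas. The abstract does not report the underlying counts, so the counts used for clinical examination (30 true positives, 10 false negatives, 50 true negatives, 10 false positives) are an assumption, chosen only because they are consistent with 40 malignant and 60 benign cases and with the reported percentages.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),           # diseased cases correctly flagged
        "specificity": tn / (tn + fp),           # healthy cases correctly cleared
        "ppv": tp / (tp + fp),                   # positives that are truly diseased
        "npv": tn / (tn + fn),                   # negatives that are truly healthy
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Assumed counts for clinical examination (not given in the abstract).
clinical_exam = diagnostic_metrics(tp=30, fp=10, fn=10, tn=50)
# sensitivity 0.75, specificity ~0.833, PPV 0.75, accuracy 0.80
```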
Predictive performance of four frailty measures in an older Australian population
Widagdo, Imaina S.; Pratt, Nicole; Russell, Mary; Roughead, Elizabeth E.
2015-01-01
Background: there are several different frailty measures available for identifying the frail elderly. However, their predictive performance in an Australian population has not been examined. Objective: to examine the predictive performance of four internationally validated frailty measures in an older Australian population. Methods: a retrospective study in the Australian Longitudinal Study of Ageing (ALSA) with 2,087 participants. Frailty was measured at baseline using frailty phenotype (FP), simplified frailty phenotype (SFP), frailty index (FI) and prognostic frailty score (PFS). Odds ratios (OR) were calculated to measure the association between frailty and outcomes at Wave 3 including mortality, hospitalisation, nursing home admission, fall and a combination of all outcomes. Predictive performance was measured by assessing sensitivity, specificity, positive and negative predictive values (PPV and NPV) and likelihood ratio (LR). Area under the curve (AUC) of dichotomised and the multilevel or continuous model of the measures was examined. Results: prevalence of frailty varied from 2% up to 49% between the measures. Frailty was significantly associated with an increased risk of any outcome, OR (95% confidence interval) for FP: 1.9 (1.4–2.8), SFP: 3.6 (1.5–8.8), FI: 3.4 (2.7–4.3) and PFS: 2.3 (1.8–2.8). PFS had high sensitivity across all outcomes (sensitivity: 55.2–77.1%). The PPV for any outcome was highest for SFP and FI (70.8 and 69.7%, respectively). Only FI had acceptable accuracy in predicting outcomes, AUC: 0.59–0.70. Conclusions: being identified as frail by any of the four measures was associated with an increased risk of outcomes; however, their predictive accuracy varied. PMID:26504118
Predicted Deep-Sea Coral Habitat Suitability for the U.S. West Coast
Guinotte, John M.; Davies, Andrew J.
2014-01-01
Regional scale habitat suitability models provide finer scale resolution and more focused predictions of where organisms may occur. Previous modelling approaches have focused primarily on local and/or global scales, while regional scale models have been relatively few. In this study, regional scale predictive habitat models are presented for deep-sea corals for the U.S. West Coast (California, Oregon and Washington). Model results are intended to aid in future research or mapping efforts and to assess potential coral habitat suitability both within and outside existing bottom trawl closures (i.e. Essential Fish Habitat (EFH)) and identify suitable habitat within U.S. National Marine Sanctuaries (NMS). Deep-sea coral habitat suitability was modelled at 500 m × 500 m spatial resolution using a range of physical, chemical and environmental variables known or thought to influence the distribution of deep-sea corals. Using a spatial partitioning cross-validation approach, maximum entropy models identified slope, temperature, salinity and depth as important predictors for most deep-sea coral taxa. Large areas of highly suitable deep-sea coral habitat were predicted both within and outside of existing bottom trawl closures and NMS boundaries. Habitat suitability predictions at regional scales cannot currently identify coral areas with pinpoint accuracy and probably overpredict actual coral distribution due to model limitations and unincorporated variables (i.e. data on distribution of hard substrate) that are known to limit their distribution. Predicted habitat results should be used in conjunction with multibeam bathymetry, geological mapping and other tools to guide future research efforts to areas with the highest probability of harboring deep-sea corals. Field validation of predicted habitat is needed to quantify model accuracy, particularly in areas that have not been sampled. PMID:24759613
Hemida, Khalid; Shabana, Sherif Sadek; Said, Hani; Ali-Eldin, Fatma
2016-01-01
Introduction: Patients with chronic liver diseases are at great risk for both morbidity and mortality during the post-operative period due to the stress of surgery and the effects of general anaesthesia. Aim: The main aim of this study was to evaluate the value of the Model for End-stage Liver Disease (MELD) score, as compared to the Child-Turcotte-Pugh (CTP) score, for prediction of 30-day post-operative mortality in Egyptian patients with liver cirrhosis undergoing non-hepatic surgery under general anaesthesia. Materials and Methods: A total of 60 patients with hepatitis C virus (HCV)-related liver cirrhosis were included in this study. Sensitivity and specificity of MELD and CTP scores were evaluated for the prediction of post-operative mortality. A total of 20 patients who had no clinical, biochemical or radiological evidence of liver disease were included to serve as a control group. Results: The best combination of sensitivity and specificity for prediction of post-operative mortality was found at a MELD score of 13.5. CTP score had a sensitivity of 75%, a specificity of 96.4%, and an overall accuracy of 95% for prediction of post-operative mortality. By contrast, at a cut-off value of 13.5, MELD score had a sensitivity of 100%, a specificity of 64.0%, and an overall accuracy of 66.6% for prediction of post-operative mortality in patients with HCV-related liver cirrhosis. Conclusion: MELD score proved to be more sensitive but less specific than CTP score for prediction of post-operative mortality. CTP and MELD scores may be complementary rather than competitive in predicting post-operative mortality in patients with HCV-related liver cirrhosis. PMID:27891371
Launay, C P; Rivière, H; Kabeshova, A; Beauchet, O
2015-09-01
To examine performance criteria (i.e., sensitivity, specificity, positive predictive value [PPV], negative predictive value [NPV], likelihood ratios [LR], area under receiver operating characteristic curve [AUROC]) of a 10-item brief geriatric assessment (BGA) for the prediction of prolonged length of hospital stay (LHS) in older patients hospitalized in acute care wards after an emergency department (ED) visit, using artificial neural networks (ANNs); and to describe the contribution of each BGA item to the predictive accuracy using the AUROC value. A total of 993 geriatric ED users admitted to acute care wards were included in this prospective cohort study. Age >85 years, male gender, polypharmacy, non-use of formal and/or informal home-help services, history of falls, temporal disorientation, place of living, reasons and nature for ED admission, and use of psychoactive drugs made up the 10 items of the BGA and were recorded at ED admission. Prolonged LHS was defined as the top third of LHS. Two feed-forward ANNs were used (multilayer perceptron [MLP] and modified MLP). The best performance was reported with the modified MLP involving the 10 items (sensitivity=62.7%; specificity=96.6%; PPV=87.1; NPV=87.5; positive LR=18.2; AUC=90.5). In this model, presence of chronic conditions made the largest contribution (51.3%) to the AUROC value. The 10-item BGA appears to accurately predict prolonged LHS, using the ANN MLP method, with the best performance criteria reported to date. Presence of chronic conditions was the main contributor to the predictive accuracy. Copyright © 2015 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
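The likelihood ratios mentioned above follow directly from sensitivity and specificity. As a hedged illustration (the function name is hypothetical), the sketch below applies the standard definitions to the rounded values quoted for the modified MLP. It gives a positive LR of about 18.4, close to the reported 18.2; the small gap presumably reflects rounding of the published percentages.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (positive LR, negative LR) from proportions in the range 0-1."""
    lr_positive = sensitivity / (1.0 - specificity)   # true-positive rate / false-positive rate
    lr_negative = (1.0 - sensitivity) / specificity   # false-negative rate / true-negative rate
    return lr_positive, lr_negative


# Rounded values quoted in the abstract for the modified MLP.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.627, specificity=0.966)
```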
Biomarker kinetics in the prediction of VAP diagnosis: results from the BioVAP study.
Póvoa, Pedro; Martin-Loeches, Ignacio; Ramirez, Paula; Bos, Lieuwe D; Esperatti, Mariano; Silvestre, Joana; Gili, Gisela; Goma, Gema; Berlanga, Eugenio; Espasa, Mateu; Gonçalves, Elsa; Torres, Antoni; Artigas, Antonio
2016-12-01
Prediction of diagnosis of ventilator-associated pneumonia (VAP) remains difficult. Our aim was to assess the value of biomarker kinetics in VAP prediction. We performed a prospective, multicenter, observational study to evaluate predictive accuracy of biomarker kinetics, namely C-reactive protein (CRP), procalcitonin (PCT), mid-region fragment of pro-adrenomedullin (MR-proADM), for VAP management in 211 patients receiving mechanical ventilation for >72 h. For the present analysis, we assessed all (N = 138) mechanically ventilated patients without an infection at admission. The kinetics of each variable, from day 1 to day 6 of mechanical ventilation, was assessed with each variable's slopes (rate of biomarker change per day), highest level and maximum amplitude of variation (Δ (max)). A total of 35 patients (25.4 %) developed a VAP and were compared with 70 non-infected controls (50.7 %). We excluded 33 patients (23.9 %) who developed a non-VAP nosocomial infection. Among the studied biomarkers, CRP and CRP ratio showed the best performance in VAP prediction. The slope of CRP change over time (adjusted odds ratio [aOR] 1.624, confidence interval [CI]95% [1.206, 2.189], p = 0.001), the highest CRP ratio concentration (aOR 1.202, CI95% [1.061, 1.363], p = 0.004) and Δ (max) CRP (aOR 1.139, CI95% [1.039, 1.248], p = 0.006), during the first 6 days of mechanical ventilation, were all significantly associated with VAP development. Both PCT and MR-proADM showed a poor predictive performance as well as temperature and white cell count. Our results suggest that in patients under mechanical ventilation, daily CRP monitoring was useful in VAP prediction. Trial registration NCT02078999.
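The three kinetic summaries described above (slope per day, highest level, maximum amplitude of variation) can be sketched as follows. This is a minimal illustration, not the study's procedure: the slope is taken as an ordinary least-squares fit over the measurement days, Δ(max) is assumed to mean maximum minus minimum, and the CRP values are invented.

```python
def biomarker_kinetics(days, values):
    """Summarise a biomarker time course with slope, highest level and amplitude."""
    n = len(days)
    mean_day = sum(days) / n
    mean_value = sum(values) / n
    # Ordinary least-squares slope: covariance(day, value) / variance(day)
    sxx = sum((d - mean_day) ** 2 for d in days)
    sxy = sum((d - mean_day) * (v - mean_value) for d, v in zip(days, values))
    return {
        "slope_per_day": sxy / sxx,
        "highest": max(values),
        "delta_max": max(values) - min(values),  # assumed definition of the amplitude
    }


# Hypothetical CRP measurements (mg/dL) on days 1 to 6 of mechanical ventilation.
crp = biomarker_kinetics([1, 2, 3, 4, 5, 6], [5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
```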
Zarco-Perello, Salvador; Simões, Nuno
2017-01-01
Information about the distribution and abundance of the habitat-forming sessile organisms in marine ecosystems is of great importance for conservation and natural resource managers. Spatial interpolation methodologies can be useful to generate this information from in situ sampling points, especially in circumstances where remote sensing methodologies cannot be applied due to small-scale spatial variability of the natural communities and low light penetration in the water column. Interpolation methods are widely used in environmental sciences; however, published studies using these methodologies in coral reef science are scarce. We compared the accuracy of the two most commonly used interpolation methods in all disciplines, inverse distance weighting (IDW) and ordinary kriging (OK), to predict the distribution and abundance of hard corals, octocorals, macroalgae, sponges and zoantharians and identify hotspots of these habitat-forming organisms using data sampled at three different spatial scales (5, 10 and 20 m) in Madagascar reef, Gulf of Mexico. The deeper sandy environments of the leeward and windward regions of Madagascar reef were dominated by macroalgae and seconded by octocorals. However, the shallow rocky environments of the reef crest had the highest richness of habitat-forming groups of organisms; here, we registered high abundances of octocorals and macroalgae, with sponges, Millepora alcicornis and zoantharians dominating in some patches, creating high levels of habitat heterogeneity. IDW and OK generated similar maps of distribution for all the taxa; however, cross-validation tests showed that IDW outperformed OK in the prediction of their abundances. When the sampling distance was at 20 m, both interpolation techniques performed poorly, but as the sampling was done at shorter distances prediction accuracies increased, especially for IDW. 
OK had higher mean prediction errors and failed to correctly interpolate the highest abundance values measured in situ, except for macroalgae, whereas IDW had lower mean prediction errors and high correlations between predicted and measured values in all cases when sampling was every 5 m. The accurate spatial interpolations created using IDW allowed us to see the spatial variability of each taxa at a biological and spatial resolution that remote sensing would not have been able to produce. Our study sets the basis for further research projects and conservation management in Madagascar reef and encourages similar studies in the region and other parts of the world where remote sensing technologies are not suitable for use. PMID:29204321
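For readers unfamiliar with inverse distance weighting, the sketch below shows the basic estimator: the value at an unsampled point is a weighted mean of the sampled values, with weights proportional to 1/distance^p. The power p = 2 is a common default and an assumption here, and the sample data are invented; the abstract does not state the settings the authors used.

```python
import math


def idw_predict(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: iterable of (xi, yi, value) tuples from in situ sampling points.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the query point coincides with a sampled point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical cover values (%) at two stations 10 m apart.
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
midpoint_estimate = idw_predict(stations, 5.0, 0.0)
```

Because the weights decay with distance, estimates far from any station are pulled toward the mean of the nearest samples, which may be one reason accuracy dropped at the 20 m sampling distance.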
Beltrame, Anna; Guerriero, Massimo; Angheben, Andrea; Gobbi, Federico; Requena-Mendez, Ana; Zammarchi, Lorenzo; Formenti, Fabio; Perandin, Francesca; Bisoffi, Zeno
2017-01-01
Background: Schistosomiasis is a neglected infection affecting millions of people, mostly living in sub-Saharan Africa. Morbidity and mortality due to chronic infection are relevant, although schistosomiasis is often clinically silent. Different diagnostic tests have been implemented in order to improve screening and diagnosis, which traditionally rely on parasitological tests with low sensitivity. The aim of this study was to evaluate the accuracy of different tests for the screening of schistosomiasis in African migrants, in a non-endemic setting. Methodology/Principal findings: A retrospective study was conducted on 373 patients screened at the Centre for Tropical Diseases (CTD) in Negrar, Verona, Italy. Biological samples were tested with: stool/urine microscopy, Circulating Cathodic Antigen (CCA) dipstick test, ELISA, Western blot (WB), immunochromatographic test (ICT). Test accuracy and predictive values of the immunological tests were assessed primarily on the basis of the results of microscopy (primary reference standard): ICT and WB were the tests with the highest sensitivity (94% and 92%, respectively), with a high NPV (98%). CCA showed the highest specificity (93%), but low sensitivity (48%). The analysis was also conducted using a composite reference standard, CRS (patients classified as infected in case of positive microscopy and/or at least 2 concordant positive immunological tests) and Latent Class Analysis (LCA). The latter two models demonstrated excellent agreement (Cohen’s kappa: 0.92) for the classification of the results. In fact, they both confirmed ICT as the test with the highest sensitivity (96%) and NPV (97%); moreover, PPV was reasonably good (78% and 72% according to CRS and LCA, respectively). ELISA was the most specific immunological test (over 99%). The ICT appears to be a suitable screening test, even when used alone.
Conclusions: The rapid test ICT was the most sensitive test, with the potential of being used as a single screening test for African migrants. PMID:28582412
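Cohen's kappa, used above to compare the CRS and LCA classifications, corrects observed agreement for the agreement expected by chance. The sketch below implements the standard formula for two label sequences. The ten example classifications are invented for illustration and are unrelated to the study data.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Note: undefined (division by zero) when chance agreement equals 1,
    e.g. when both raters use a single category throughout.
    """
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(
        (list(labels_a).count(c) / n) * (list(labels_b).count(c) / n)
        for c in categories
    )
    return (observed - expected) / (1.0 - expected)


# Hypothetical infected (1) / not infected (0) calls from two reference models.
model_one = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model_two = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(model_one, model_two)
```

Here the two hypothetical models agree on 9 of 10 patients, yet kappa is noticeably lower than 0.9 because much of that agreement would be expected by chance.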
Evaluation of approaches for estimating the accuracy of genomic prediction in plant breeding.
Ould Estaghvirou, Sidi Boubacar; Ogutu, Joseph O; Schulz-Streeck, Torben; Knaak, Carsten; Ouzunova, Milena; Gordillo, Andres; Piepho, Hans-Peter
2013-12-06
In genomic prediction, an important measure of accuracy is the correlation between the predicted and the true breeding values. Direct computation of this quantity for real datasets is not possible, because the true breeding value is unknown. Instead, the correlation between the predicted breeding values and the observed phenotypic values, called predictive ability, is often computed. In order to indirectly estimate predictive accuracy, this latter correlation is usually divided by an estimate of the square root of heritability. In this study we use simulation to evaluate estimates of predictive accuracy for seven methods, four (1 to 4) of which use an estimate of heritability to divide predictive ability computed by cross-validation. Between them the seven methods cover balanced and unbalanced datasets as well as correlated and uncorrelated genotypes. We propose one new indirect method (4) and two direct methods (5 and 6) for estimating predictive accuracy and compare their performances and those of four other existing approaches (three indirect (1 to 3) and one direct (7)) with simulated true predictive accuracy as the benchmark and with each other. The size of the estimated genetic variance and hence heritability exerted the strongest influence on the variation in the estimated predictive accuracy. Increasing the number of genotypes considerably increases the time required to compute predictive accuracy by all the seven methods, most notably for the five methods that require cross-validation (Methods 1, 2, 3, 4 and 6). A new method that we propose (Method 5) and an existing method (Method 7) used in animal breeding programs were the fastest and gave the least biased, most precise and stable estimates of predictive accuracy. Of the methods that use cross-validation Methods 4 and 6 were often the best. The estimated genetic variance and the number of genotypes had the greatest influence on predictive accuracy. 
Methods 5 and 7 were the fastest and produced the least biased, the most precise, robust and stable estimates of predictive accuracy. These properties argue for routinely using Methods 5 and 7 to assess predictive accuracy in genomic selection studies. PMID:24314298
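The indirect estimate described above divides predictive ability (the correlation between predicted breeding values and observed phenotypes) by the square root of heritability. The sketch below illustrates only that one step. The function names are hypothetical, and the cross-validation and heritability estimation that the seven methods differ on are not shown.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def indirect_predictive_accuracy(predicted, phenotypes, heritability):
    """Predictive ability divided by the square root of heritability (h^2)."""
    predictive_ability = pearson_correlation(predicted, phenotypes)
    return predictive_ability / math.sqrt(heritability)
```

Because the divisor is an estimate, an underestimated heritability can push the result above 1, which is one way to see why the abstract reports that the estimated genetic variance strongly influences the estimated accuracy.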
Statistical algorithms improve accuracy of gene fusion detection
Hsieh, Gillian; Bierman, Rob; Szabo, Linda; Lee, Alex Gia; Freeman, Donald E.; Watson, Nathaniel; Sweet-Cordero, E. Alejandro
2017-01-01
Gene fusions are known to play critical roles in tumor pathogenesis. Yet, sensitive and specific algorithms to detect gene fusions in cancer do not currently exist. In this paper, we present a new statistical algorithm, MACHETE (Mismatched Alignment CHimEra Tracking Engine), which achieves highly sensitive and specific detection of gene fusions from RNA-Seq data, including the highest Positive Predictive Value (PPV) compared to the current state-of-the-art, as assessed in simulated data. We show that the best performing published algorithms either find large numbers of fusions in negative control data or suffer from low sensitivity detecting known driving fusions in gold standard settings, such as EWSR1-FLI1. As proof of principle that MACHETE discovers novel gene fusions with high accuracy in vivo, we mined public data to discover and subsequently PCR validate novel gene fusions missed by other algorithms in the ovarian cancer cell line OVCAR3. These results highlight the gains in accuracy achieved by introducing statistical models into fusion detection, and pave the way for unbiased discovery of potentially driving and druggable gene fusions in primary tumors. PMID:28541529
Quantitative prediction of shrimp disease incidence via the profiles of gut eukaryotic microbiota.
Xiong, Jinbo; Yu, Weina; Dai, Wenfang; Zhang, Jinjie; Qiu, Qiongfen; Ou, Changrong
2018-04-01
A common notion is emerging that gut eukaryotes are commensal or beneficial, rather than detrimental. To date, however, surprisingly few studies have been undertaken to discern the factors that govern the assembly of gut eukaryotes, despite growing interest in the relationship between gut microbiota dysbiosis and disease. Herein, we first explored how the gut eukaryotic microbiotas were assembled from the postlarval to adult stages of shrimp and over disease progression. The gut eukaryotic communities changed markedly as healthy shrimp aged, and converged toward an adult-microbiota configuration. However, the adult-like stability was distorted by disease exacerbation. A null model showed that the deterministic processes governing gut eukaryotic assembly tended to become more important over healthy shrimp development, whereas this trend was inverted as the disease progressed. After accounting for the age-related baseline of gut eukaryotes, we identified disease-discriminatory taxa (species level afforded the highest accuracy of prediction) that are characteristic of shrimp health status. The profiles of these taxa contributed an overall 92.4% accuracy in predicting shrimp health status. Notably, this model can accurately diagnose the onset of shrimp disease. Interspecies interaction analysis depicted how the disease-discriminatory taxa interacted with one another in sustaining shrimp health. Taken together, our findings offer novel insights into the underlying ecological processes that govern the assembly of gut eukaryotes from the postlarval to adult stages of shrimp and over disease progression. Intriguingly, the established model can quantitatively and accurately predict the incidence of shrimp disease.
[Forest lightning fire forecasting for Daxing'anling Mountains based on MAXENT model].
Sun, Yu; Shi, Ming-Chang; Peng, Huan; Zhu, Pei-Lin; Liu, Si-Lin; Wu, Shi-Lei; He, Cheng; Chen, Feng
2014-04-01
Daxing'anling Mountains is one of the areas with the highest occurrence of forest lightning fires in Heilongjiang Province, and developing a lightning fire forecast model to accurately predict the forest fires in this area is of importance. Based on the data of forest lightning fires and environmental variables, the MAXENT model was used to predict lightning fires in the Daxing'anling region. Firstly, we ran collinearity diagnostics on each environmental variable, evaluated the importance of the environmental variables using training gain and the Jackknife method, and then evaluated the prediction accuracy of the MAXENT model using the max Kappa value and the AUC value. The results showed that the variance inflation factor (VIF) values of lightning energy and neutralized charge were 5.012 and 6.230, respectively. They were collinear with the other variables, so they could not be used for model training. Daily rainfall, the number of cloud-to-ground lightning strikes, and the current intensity of cloud-to-ground lightning were the three most important factors affecting lightning fires in the forest, while the daily average wind speed and the slope were of less importance. With the increase of the proportion of test data, the max Kappa and AUC values increased. The max Kappa values were above 0.75 and the average value was 0.772, while all of the AUC values were above 0.5 and the average value was 0.859. With a moderate level of prediction accuracy being achieved, the MAXENT model could be used to predict forest lightning fires in Daxing'anling Mountains.
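The VIF values quoted above can be read through the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on all the others. The sketch below simply inverts that formula for the two reported values; the function names are hypothetical. A cut-off near 5 (or sometimes 10) is a common rule of thumb, and the abstract's wording suggests, without stating it, that a threshold of about 5 was applied.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """Variance inflation factor for a predictor with the given R^2 against the others."""
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif: float) -> float:
    """Share of a predictor's variance explained by the other predictors."""
    return 1.0 - 1.0 / vif


# Reported VIF values for lightning energy and neutralized charge.
r2_lightning_energy = r_squared_from_vif(5.012)    # about 0.80
r2_neutralized_charge = r_squared_from_vif(6.230)  # about 0.84
```

Read this way, roughly 80% or more of the variation in each of the two variables is already explained by the remaining predictors.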
NASA Astrophysics Data System (ADS)
Koehl, Patrice; Orland, Henri; Delarue, Marc
2011-08-01
We present an extension of the self-consistent mean field theory for protein side-chain modeling in which solvation effects are included based on the Poisson-Boltzmann (PB) theory. In this approach, the protein is represented with multiple copies of its side chains. Each copy is assigned a weight that is refined iteratively based on the mean field energy generated by the rest of the protein, until self-consistency is reached. At each cycle, the variational free energy of the multi-copy system is computed; this free energy includes the internal energy of the protein that accounts for van der Waals (vdW) and electrostatic interactions and a solvation free energy term that is computed using the PB equation. The method converges in only a few cycles and takes only minutes of central processing unit time on a commodity personal computer. The predicted conformation of each residue is then set to be its copy with the highest weight after convergence. We have tested this method on a database of one hundred highly refined NMR structures to circumvent the problems of crystal packing inherent to x-ray structures. The use of the PB-derived solvation free energy significantly improves prediction accuracy for surface side chains. For example, the prediction accuracies for χ1 for surface cysteine, serine, and threonine residues improve from 68%, 35%, and 43% to 80%, 53%, and 57%, respectively. A comparison with other side-chain prediction algorithms demonstrates that our approach is consistently better in predicting the conformations of exposed side chains.
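The iterative weight refinement described above can be sketched on a toy system. The code below is a generic mean-field update, not the authors' implementation: each copy's weight is set to a Boltzmann factor of its mean-field energy (its own energy plus weight-averaged pair energies with the copies of other residues). The PB solvation term is omitted entirely, and the energies, temperature factor kT, damping constant and tolerance are all invented for illustration.

```python
import math


def scmf_weights(self_energy, pair_energy, kT=0.6, mixing=0.5, tol=1e-6, max_cycles=200):
    """Refine side-chain copy weights until they are self-consistent.

    self_energy[i][k]       : energy of copy k of residue i on its own
    pair_energy[i][j][k][l] : interaction of copy k of residue i with copy l of residue j
    """
    n = len(self_energy)
    # Start from uniform weights over the copies of each residue.
    weights = [[1.0 / len(e)] * len(e) for e in self_energy]
    for _ in range(max_cycles):
        updated = []
        for i in range(n):
            fields = []
            for k in range(len(self_energy[i])):
                energy = self_energy[i][k]
                for j in range(n):
                    if j != i:
                        energy += sum(
                            w * pair_energy[i][j][k][l] for l, w in enumerate(weights[j])
                        )
                fields.append(energy)
            lowest = min(fields)  # shift energies for numerical stability
            boltzmann = [math.exp(-(e - lowest) / kT) for e in fields]
            total = sum(boltzmann)
            updated.append([b / total for b in boltzmann])
        change = max(
            abs(u - w) for row_u, row_w in zip(updated, weights) for u, w in zip(row_u, row_w)
        )
        # Damped update (assumed here) to avoid oscillation between cycles.
        weights = [
            [mixing * u + (1.0 - mixing) * w for u, w in zip(row_u, row_w)]
            for row_u, row_w in zip(updated, weights)
        ]
        if change < tol:
            break
    return weights


# Toy system: two residues with two copies each; copy 0 of each residue clashes.
self_energy = [[0.0, 1.0], [0.5, 0.0]]
clash = [[2.0, 0.0], [0.0, 0.0]]
pair_energy = [[None, clash], [clash, None]]
weights = scmf_weights(self_energy, pair_energy)
predicted = [row.index(max(row)) for row in weights]  # highest-weight copy per residue
```

In this toy case the second residue avoids the clash by favouring its copy 1, which in turn lets the first residue keep its lower-energy copy 0.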
Seymer, A; Keinrath, P; Holzmannhofer, J; Pirich, C; Hergan, K; Meissnitzer, M W
2015-01-01
Objective: To prospectively analyse the diagnostic value of semi-quantitative breast-specific gamma imaging (BSGI) in the work-up of suspicious breast lesions compared with that of mammography (MG), breast ultrasound and MRI of the breast. Methods: Within a 15-month period, 67 patients with 92 breast lesions rated as Category IV or V according to the breast imaging reporting and data system detected with MG and/or ultrasound were included into the study. After the injection of 740–1110 MBq of Technetium-99m (99mTc) SestaMIBI intravenously, scintigrams were obtained in two projections comparable to MG. The BSGI was analysed visually and semi-quantitatively by calculating a relative uptake factor (X). With the exception of two patients with cardiac pacemakers, all patients underwent 3-T breast MRI. Biopsy results were obtained as the reference standard in all patients. Sensitivity, specificity, positive- and negative-predictive values, accuracy and area under the curve were calculated for each modality. Results: Among the 92 lesions, 67 (72.8%) were malignant. 60 of the 67 cancers of any size were detected by BSGI with an overall sensitivity of 90%, only exceeded by ultrasound with a sensitivity of 99%. The sensitivity of BSGI for lesions <1 cm declined significantly to 60%. Overall specificity of ultrasound was only 20%. Specificity, accuracy and positive-predictive value were the highest for BSGI (56%, 80% and 85%, respectively). X was significantly higher for malignant lesions (mean, 4.27) and differed significantly between ductal types (mean, 4.53) and the other histopathological entities (mean, 3.12). Conclusion: Semi-quantitative BSGI with calculation of the relative uptake factor (X) can help to characterize breast lesions. BSGI negativity may obviate the need for biopsy of breast lesions >1 cm with low or intermediate prevalence for malignancy. 
Advances in knowledge: Compared with morphological imaging modalities, specificity, positive-predictive value for malignancy and accuracy were the highest for BSGI in our study. BSGI negativity may support the decision not to biopsy in selected lesions with a low or low-to-moderate pre-test probability for malignancy. PMID:25882690
Sea Ice Detection Based on an Improved Similarity Measurement Method Using Hyperspectral Data.
Han, Yanling; Li, Jue; Zhang, Yun; Hong, Zhonghua; Wang, Jing
2017-05-15
Hyperspectral remote sensing technology can acquire nearly continuous spectrum information and rich sea ice image information, thus providing an important means of sea ice detection. However, the correlation and redundancy among hyperspectral bands reduce the accuracy of traditional sea ice detection methods. Based on the spectral characteristics of sea ice, this study presents an improved similarity measurement method based on linear prediction (ISMLP) to detect sea ice. First, the first original band with a large amount of information is determined based on mutual information theory. Subsequently, a second original band with the least similarity is chosen by the spectral correlation measuring method. Finally, subsequent bands are selected through the linear prediction method, and a support vector machine classifier model is applied to classify sea ice. In experiments performed on images of Baffin Bay and Bohai Bay, comparative analyses were conducted to compare the proposed method and traditional sea ice detection methods. Our proposed ISMLP method achieved the highest classification accuracies (91.18% and 94.22%) in both experiments. From these results the ISMLP method exhibits better performance overall than other methods and can be effectively applied to hyperspectral sea ice detection.
A New Approach for Resolving Conflicts in Actionable Behavioral Rules
Zhu, Dan; Zeng, Daniel
2014-01-01
Knowledge is considered actionable if users can take direct actions based on such knowledge to their advantage. Among the most important and distinctive actionable knowledge are actionable behavioral rules that can directly and explicitly suggest specific actions to take to influence (restrain or encourage) the behavior in the users' best interest. However, in mining such rules, it often occurs that different rules may suggest the same actions with different expected utilities, which we call conflicting rules. A valid method for resolving such conflicts was proposed previously; however, inconsistency in its rule evaluation measure may hinder its performance. To overcome this problem, we develop a new method that uses a rule ranking procedure as the basis for selecting the rule with the highest utility prediction accuracy. More specifically, we propose an integrative measure, which combines the measures of the support and antecedent length, to evaluate the utility prediction accuracies of conflicting rules. We also introduce a tunable weight parameter to allow the flexibility of integration. We conduct several experiments to test our proposed approach and evaluate the sensitivity of the weight parameter. Empirical results indicate that our approach outperforms approaches from previous research. PMID:25162054
Sea Ice Detection Based on an Improved Similarity Measurement Method Using Hyperspectral Data
Han, Yanling; Li, Jue; Zhang, Yun; Hong, Zhonghua; Wang, Jing
2017-01-01
Hyperspectral remote sensing technology can acquire nearly continuous spectrum information and rich sea ice image information, thus providing an important means of sea ice detection. However, the correlation and redundancy among hyperspectral bands reduce the accuracy of traditional sea ice detection methods. Based on the spectral characteristics of sea ice, this study presents an improved similarity measurement method based on linear prediction (ISMLP) to detect sea ice. First, the first original band with a large amount of information is determined based on mutual information theory. Subsequently, a second original band with the least similarity is chosen by the spectral correlation measuring method. Finally, subsequent bands are selected through the linear prediction method, and a support vector machine classifier model is applied to classify sea ice. In experiments performed on images of Baffin Bay and Bohai Bay, comparative analyses were conducted to compare the proposed method and traditional sea ice detection methods. Our proposed ISMLP method achieved the highest classification accuracies (91.18% and 94.22%) in both experiments. From these results the ISMLP method exhibits better performance overall than other methods and can be effectively applied to hyperspectral sea ice detection. PMID:28505135
Cooperativity among Short Amyloid Stretches in Long Amyloidogenic Sequences
He, Zhisong; Shi, Xiaohe; Feng, Kaiyan; Ma, Buyong; Cai, Yu-Dong
2012-01-01
Amyloid fibrillar aggregates of polypeptides are associated with many neurodegenerative diseases. Short peptide segments in protein sequences may trigger aggregation. Identifying these stretches and examining their behavior in longer protein segments is critical for understanding these diseases and obtaining potential therapies. In this study, we combined machine learning and structure-based energy evaluation to examine and predict amyloidogenic segments. Our feature selection method discovered that windows consisting of long amino acid segments of ∼30 residues, instead of the commonly used short hexapeptides, provided the highest accuracy. Weighted contributions of an amino acid at each position in a 27 residue window revealed three cooperative regions of short stretches, resembling the β-strand-turn-β-strand motif in Aβ peptide amyloid and the β-solenoid structure of the HET-s(218–289) prion (C). Using an in-house energy evaluation algorithm, the interaction energy between two short stretches in a long segment is computed and incorporated as an additional feature. The algorithm successfully predicted and classified amyloid segments with an overall accuracy of 75%. Our study revealed that genome-wide amyloid segments are not only dependent on short high propensity stretches, but also on nearby residues. PMID:22761773
NASA Astrophysics Data System (ADS)
Wang, Dong
2016-03-01
Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertise to explain gear fault signatures, which is usually not easy for ordinary users to achieve. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. The previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance prediction accuracies of existing K-nearest neighbors based methods and extend identification of 3 different gear crack levels to identification of 5 different gear crack levels, redundant statistical features are constructed by using Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction of redundant statistical features is conducted to obtain new significant statistical features. Finally, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads.
Based on the new significant statistical features, some other popular statistical models including linear discriminant analysis, quadratic discriminant analysis, classification and regression tree and naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection of K-nearest neighbors are thoroughly investigated.
Young, Jonathan; Modat, Marc; Cardoso, Manuel J; Mendelson, Alex; Cash, Dave; Ourselin, Sebastien
2013-01-01
Accurately identifying the patients that have mild cognitive impairment (MCI) who will go on to develop Alzheimer's disease (AD) will become essential as new treatments will require identification of AD patients at earlier stages in the disease process. Most previous work in this area has centred around the same automated techniques used to diagnose AD patients from healthy controls, by coupling high dimensional brain image data or other relevant biomarker data to modern machine learning techniques. Such studies can now distinguish between AD patients and controls as accurately as an experienced clinician. Models trained on patients with AD and control subjects can also distinguish between MCI patients that will convert to AD within a given timeframe (MCI-c) and those that remain stable (MCI-s), although differences between these groups are smaller and thus, the corresponding accuracy is lower. The most common type of classifier used in these studies is the support vector machine, which gives categorical class decisions. In this paper, we introduce Gaussian process (GP) classification to the problem. This fully Bayesian method produces naturally probabilistic predictions, which we show correlate well with the actual chances of converting to AD within 3 years in a population of 96 MCI-s and 47 MCI-c subjects. Furthermore, we show that GPs can integrate multimodal data (in this study volumetric MRI, FDG-PET, cerebrospinal fluid, and APOE genotype) with the classification process through the use of a mixed kernel. The GP approach aids combination of different data sources by learning parameters automatically from training data via type-II maximum likelihood, which we compare to a more conventional method based on cross validation and an SVM classifier.
When the resulting probabilities from the GP are dichotomised to produce a binary classification, the results for predicting MCI conversion based on the combination of all three types of data show a balanced accuracy of 74%. This is a substantially higher accuracy than could be obtained using any individual modality or using a multikernel SVM, and is competitive with the highest accuracy yet achieved for predicting conversion within three years on the widely used ADNI dataset.
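The dichotomisation step described above can be illustrated with a short sketch. It assumes a fixed probability threshold of 0.5 and the usual definition of balanced accuracy as the mean of sensitivity and specificity. The function name and the example probabilities are hypothetical and are not taken from the study.

```python
def balanced_accuracy(probs, labels, threshold=0.5):
    """Dichotomise predicted probabilities at `threshold`, then return the
    mean of sensitivity and specificity. Assumes both classes are present."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if y == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical probabilities; 1 = converter (MCI-c), 0 = stable (MCI-s)
probs = [0.9, 0.6, 0.4, 0.2, 0.7, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(balanced_accuracy(probs, labels))
```

Balanced accuracy is a natural choice when class sizes differ, as with 96 MCI-s and 47 MCI-c subjects, because plain accuracy would reward always predicting the larger class.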
Palaiokostas, Christos; Ferraresso, Serena; Franch, Rafaella; Houston, Ross D.; Bargelloni, Luca
2016-01-01
Gilthead sea bream (Sparus aurata) is a species of paramount importance to the Mediterranean aquaculture industry, with an annual production exceeding 140,000 metric tons. Pasteurellosis due to the Gram-negative bacterium Photobacterium damselae subsp. piscicida (Phdp) causes significant mortality, especially during larval and juvenile stages, and poses a serious threat to bream production. Selective breeding for improved resistance to pasteurellosis is a promising avenue for disease control, and the use of genetic markers to predict breeding values can improve the accuracy of selection, and allow accurate calculation of estimated breeding values of nonchallenged animals. In the current study, a population of 825 sea bream juveniles, originating from a factorial cross between 67 broodfish (32 sires, 35 dams), were challenged by 30 min immersion with 1 × 10^5 CFU virulent Phdp. Mortalities and survivors were recorded and sampled for genotyping by sequencing. The restriction-site associated DNA sequencing approach, 2b-RAD, was used to generate genome-wide single nucleotide polymorphism (SNP) genotypes for all samples. A high-density linkage map containing 12,085 SNPs grouped into 24 linkage groups (consistent with the karyotype) was constructed. The heritability of surviving days (censored data) was 0.22 (95% highest density interval: 0.11–0.36) and 0.28 (95% highest density interval: 0.17–0.4) using the pedigree and the genomic relationship matrix respectively. A genome-wide association study did not reveal individual SNPs significantly associated with resistance at a genome-wide significance level. Genomic prediction approaches were tested to investigate the potential of the SNPs obtained by 2b-RAD for estimating breeding values for resistance. The accuracy of the genomic prediction models (r = 0.38–0.46) outperformed the traditional BLUP approach based on pedigree records (r = 0.30).
Overall results suggest that major quantitative trait loci affecting resistance to pasteurellosis were not present in this population, but highlight the effectiveness of 2b-RAD genotyping by sequencing for genomic selection in a mass spawning fish species. PMID:27652890
Ernst, Corinna; Hahnen, Eric; Engel, Christoph; Nothnagel, Michael; Weber, Jonas; Schmutzler, Rita K; Hauke, Jan
2018-03-27
The use of next-generation sequencing approaches in clinical diagnostics has led to a tremendous increase in data and a vast number of variants of uncertain significance that require interpretation. Therefore, prediction of the effects of missense mutations using in silico tools has become a frequently used approach. Aim of this study was to assess the reliability of in silico prediction as a basis for clinical decision making in the context of hereditary breast and/or ovarian cancer. We tested the performance of four prediction tools (Align-GVGD, SIFT, PolyPhen-2, MutationTaster2) using a set of 236 BRCA1/2 missense variants that had previously been classified by expert committees. However, a major pitfall in the creation of a reliable evaluation set for our purpose is the generally accepted classification of BRCA1/2 missense variants using the multifactorial likelihood model, which is partially based on Align-GVGD results. To overcome this drawback we identified 161 variants whose classification is independent of any previous in silico prediction. In addition to the performance as stand-alone tools we examined the sensitivity, specificity, accuracy and Matthews correlation coefficient (MCC) of combined approaches. PolyPhen-2 achieved the lowest sensitivity (0.67), specificity (0.67), accuracy (0.67) and MCC (0.39). Align-GVGD achieved the highest values of specificity (0.92), accuracy (0.92) and MCC (0.73), but was outperformed regarding its sensitivity (0.90) by SIFT (1.00) and MutationTaster2 (1.00). All tools suffered from poor specificities, resulting in an unacceptable proportion of false positive results in a clinical setting. This shortcoming could not be bypassed by combination of these tools. In the best case scenario, 138 families would be affected by the misclassification of neutral variants within the cohort of patients of the German Consortium for Hereditary Breast and Ovarian Cancer. 
We show that, due to low specificities, state-of-the-art in silico prediction tools are not suitable to predict pathogenicity of variants of uncertain significance in BRCA1/2. Thus, clinical consequences should never be based solely on in silico forecasts. However, our data suggest that SIFT and MutationTaster2 could be suitable to predict benignity, as both tools did not result in false negative predictions in our analysis.
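The four performance measures named in this abstract can all be derived from a single confusion matrix. The sketch below uses the standard definitions of sensitivity, specificity, accuracy and the Matthews correlation coefficient (MCC). The function name and the example counts are hypothetical and do not reproduce the study's data.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics for a binary classifier."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: pathogenic variants are the positive class
print(binary_metrics(tp=9, fp=2, tn=8, fn=1))
```

A sensitivity of 1.00 means no false negatives (fn = 0), which is the property the authors point to when suggesting that a tool may be usable for predicting benignity.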
Hastrup, Sidsel; Damgaard, Dorte; Johnsen, Søren Paaske; Andersen, Grethe
2016-07-01
We designed and validated a simple prehospital stroke scale to identify emergent large vessel occlusion (ELVO) in patients with acute ischemic stroke and compared the scale to other published scales for prediction of ELVO. A national historical test cohort of 3127 patients with information on intracranial vessel status (angiography) before reperfusion therapy was identified. National Institutes of Health Stroke Scale (NIHSS) items with the highest predictive value of occlusion of a large intracranial artery were identified, and the most optimal combination meeting predefined criteria to ensure usefulness in the prehospital phase was determined. The predictive performance of Prehospital Acute Stroke Severity (PASS) scale was compared with other published scales for ELVO. The PASS scale was composed of 3 NIHSS scores: level of consciousness (month/age), gaze palsy/deviation, and arm weakness. In derivation of PASS 2/3 of the test cohort was used and showed accuracy (area under the curve) of 0.76 for detecting large arterial occlusion. Optimal cut point ≥2 abnormal scores showed: sensitivity=0.66 (95% CI, 0.62-0.69), specificity=0.83 (0.81-0.85), and area under the curve=0.74 (0.72-0.76). Validation on 1/3 of the test cohort showed similar performance. Patients with a large artery occlusion on angiography with PASS ≥2 had a median NIHSS score of 17 (interquartile range=6) as opposed to PASS <2 with a median NIHSS score of 6 (interquartile range=5). The PASS scale showed equal performance although more simple when compared with other scales predicting ELVO. The PASS scale is simple and has promising accuracy for prediction of ELVO in the field. © 2016 American Heart Association, Inc.
Development of Anthropometry-Based Equations for the Estimation of the Total Body Water in Koreans
Lee, Seoung Woo; Kim, Gyeong A; Lim, Hee Jung; Lee, Sun Young; Park, Geun Ho; Song, Joon Ho
2005-01-01
To develop race-specific anthropometry-based total body water (TBW) equations, we measured TBW using bioelectrical impedance analysis (TBWBIA) in 2,943 healthy Korean adults. Among them, 2,223 were used as a reference group. Two equations (TBWK1 and TBWK2) were developed based on age, sex, height, and body weight. The adjusted R^2 was 0.908 for TBWK1 and 0.910 for TBWK2. The remaining 720 subjects were used for the validation of our results. Watson (TBWW) and Hume-Weyers (TBWH) formulas were also used. In men, TBWBIA showed the highest correlation with TBWH, followed by TBWK1, TBWK2 and TBWW. TBWK1 and TBWK2 showed lower root mean square errors (RMSE) and mean prediction errors (ME) than TBWW and TBWH. On the Bland-Altman plot, the correlations between the differences and means were smaller for TBWK2 than for TBWK1. In contrast, in females TBWBIA showed the highest correlation with TBWW, followed by TBWK2, TBWK1, and TBWH. RMSE was smallest in TBWW, followed by TBWK2, TBWK1 and TBWH. ME was closest to zero for TBWK2, followed by TBWK1, TBWW and TBWH. The correlation coefficients between the means and differences were highest in TBWW, and lowest in TBWK2. In conclusion, TBWK2 provides better accuracy with a smaller bias than TBWW or TBWH in males. TBWK2 shows a similar accuracy, but with a smaller bias than TBWW in females. PMID:15953867
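The two error measures used to compare the equations can be sketched as follows. This assumes the mean prediction error (ME) is the average of predicted minus reference values, a sign convention the abstract does not state, and that RMSE is the root of the mean squared difference. The function name and the example values (in litres) are hypothetical.

```python
import math


def prediction_errors(predicted, reference):
    """Return (ME, RMSE) for paired predictions and reference measurements.
    ME is taken as mean(predicted - reference); this sign is an assumption."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    me = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return me, rmse


# Hypothetical TBW values: equation output vs. impedance-based reference
predicted = [40.0, 42.0, 38.0]
reference = [41.0, 40.0, 38.0]
me, rmse = prediction_errors(predicted, reference)
print(f"ME={me:.3f} L, RMSE={rmse:.3f} L")
```

ME captures systematic bias (positive and negative errors cancel), whereas RMSE captures overall error magnitude, which is why an equation can have a similar RMSE but a smaller bias than another.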
Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions
Sükösd, Zsuzsanna; Swenson, M. Shel; Kjems, Jørgen; Heitsch, Christine E.
2013-01-01
Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence. PMID:23325843
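Base-pair level accuracy of a predicted secondary structure is commonly scored by comparing sets of paired positions. The sketch below assumes each structure is given as a collection of (i, j) index pairs and that a predicted pair counts as correct only on an exact match with the reference. The function name and the example pairs are hypothetical.

```python
def base_pair_accuracy(predicted, reference):
    """Return (PPV, sensitivity) for predicted vs. reference base pairs.
    Pairs are (i, j) tuples; only exact matches count as correct."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    ppv = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(reference) if reference else 0.0
    return ppv, sensitivity


# Hypothetical structures: two of four predicted pairs are in the reference
predicted = [(1, 20), (2, 19), (3, 18), (5, 15)]
reference = [(1, 20), (2, 19), (4, 17)]
print(base_pair_accuracy(predicted, reference))
```

Applying the same function to only the pairs shared by the MFE and data-directed predictions would give the positive predictive value of common base pairs discussed in the abstract.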
Comparison of Perfusion CT Software to Predict the Final Infarct Volume After Thrombectomy.
Austein, Friederike; Riedel, Christian; Kerby, Tina; Meyne, Johannes; Binder, Andreas; Lindner, Thomas; Huhndorf, Monika; Wodarg, Fritz; Jansen, Olav
2016-09-01
Computed tomographic perfusion represents an interesting physiological imaging modality to select patients for reperfusion therapy in acute ischemic stroke. The purpose of our study was to determine the accuracy of different commercial perfusion CT software packages (Philips (A), Siemens (B), and RAPID (C)) to predict the final infarct volume (FIV) after mechanical thrombectomy. Single-institutional computed tomographic perfusion data from 147 mechanically recanalized acute ischemic stroke patients were postprocessed. Ischemic core and FIV were compared about thrombolysis in cerebral infarction (TICI) score and time interval to reperfusion. FIV was measured at follow-up imaging between days 1 and 8 after stroke. In 118 successfully recanalized patients (TICI 2b/3), a moderately to strongly positive correlation was observed between ischemic core and FIV. The highest accuracy and best correlation are shown in early and fully recanalized patients (Pearson r for A=0.42, B=0.64, and C=0.83; P<0.001). Bland-Altman plots and boxplots demonstrate smaller ranges in package C than in A and B. Significant differences were found between the packages about over- and underestimation of the ischemic core. Package A, compared with B and C, estimated more than twice as many patients with a malignant stroke profile (P<0.001). Package C best predicted hypoperfusion volume in nonsuccessfully recanalized patients. Our study demonstrates best accuracy and approximation between the results of a fully automated software (RAPID) and FIV, especially in early and fully recanalized patients. Furthermore, this software package overestimated the FIV to a significantly lower degree and estimated a malignant mismatch profile less often than other software. © 2016 American Heart Association, Inc.
Prognostic accuracy of five simple scales in childhood bacterial meningitis.
Pelkonen, Tuula; Roine, Irmeli; Monteiro, Lurdes; Cruzeiro, Manuel Leite; Pitkäranta, Anne; Kataja, Matti; Peltola, Heikki
2012-08-01
In childhood acute bacterial meningitis, the level of consciousness, measured with the Glasgow coma scale (GCS) or the Blantyre coma scale (BCS), is the most important predictor of outcome. The Herson-Todd scale (HTS) was developed for Haemophilus influenzae meningitis. Our objective was to identify prognostic factors, to form a simple scale, and to compare the predictive accuracy of these scales. Seven hundred and twenty-three children with bacterial meningitis in Luanda were scored by GCS, BCS, and HTS. The simple Luanda scale (SLS), based on our entire database, comprised domestic electricity, days of illness, convulsions, consciousness, and dyspnoea at presentation. The Bayesian Luanda scale (BLS) added blood glucose concentration. The accuracy of the 5 scales was determined for 491 children without an underlying condition, against the outcomes of death, severe neurological sequelae or death, or a poor outcome (severe neurological sequelae, death, or deafness), at hospital discharge. The highest accuracy was achieved with the BLS, whose area under the curve (AUC) for death was 0.83, for severe neurological sequelae or death was 0.84, and for poor outcome was 0.82. Overall, the AUCs for SLS were ≥0.79, for GCS were ≥0.76, for BCS were ≥0.74, and for HTS were ≥0.68. Adding laboratory parameters to a simple scoring system, such as the SLS, improves the prognostic accuracy only little in bacterial meningitis.
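The area under the ROC curve used to compare the scales equals the probability that a randomly chosen patient with the outcome scores higher than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows, with ties counted as one half. The function name and the example scores are hypothetical, and higher scores are assumed to indicate higher risk.

```python
def auc(scores, outcomes):
    """AUC via pairwise comparison of cases (outcome 1) and non-cases (0).
    Ties contribute 0.5. Quadratic in sample size, fine for small cohorts."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical scale scores; 1 = death, 0 = survival
scores = [5, 4, 4, 2, 1]
outcomes = [1, 1, 0, 0, 0]
print(round(auc(scores, outcomes), 3))
```

For a scale where lower values indicate worse prognosis, such as a coma score, the scores would need to be negated before calling this function.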
Sex estimation by femur in modern Thai population.
Monum, T; Prasitwattanseree, S; Das, S; Siriphimolwat, P; Mahakkanukrauh, P
2017-01-01
Sex estimation is an important step in postmortem investigation, and the femur is a useful bone for sex estimation by metric analysis. Although a sex estimation method using the femur has been reported for Thais, the anthropological data need to be renewed because of temporal change. Thus, the aim of this study was to re-evaluate sex estimation by femur in Thais. 97 adult male and 103 female femora were randomly chosen from the Forensic Osteology Research Center, and 6 measurements were taken. Compared with previous Thai data, mid-shaft diameter tended to increase, whereas femoral head diameter and epicondylar breadth tended to remain stable; when the previous discriminant function based on vertical head diameter and epicondylar breadth was tested, the accuracy of prediction was lower than previously reported. In the new data, epicondylar breadth was the best variable for distinguishing males from females, with 88.7 percent accuracy, followed by transverse and vertical head diameter at 86.7 percent and femoral neck diameter at 81.7 percent. Multivariate discriminant analysis indicated that transverse head diameter and epicondylar breadth together gave the highest accuracy, at 89.7 percent. The accuracy of the femur was close to that previously reported for sex estimation by talus and calcaneus in the Thai population. Thus, the femur is useful for sex estimation, especially in cases of lower limb remains in which the pelvis is absent.
Nguyen, Huyen T; Jia, Guang; Shah, Zarine K; Pohar, Kamal; Mortazavi, Amir; Zynger, Debra L; Wei, Lai; Yang, Xiangyu; Clark, Daniel; Knopp, Michael V
2015-05-01
To apply k-means clustering of two pharmacokinetic parameters derived from 3T dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to predict the chemotherapeutic response in bladder cancer at the mid-cycle timepoint. With the predetermined number of three clusters, k-means clustering was performed on nondimensionalized Amp and kep estimates of each bladder tumor. Three cluster volume fractions (VFs) were calculated for each tumor at baseline and mid-cycle. The changes of three cluster VFs from baseline to mid-cycle were correlated with the tumor's chemotherapeutic response. Receiver-operating-characteristics curve analysis was used to evaluate the performance of each cluster VF change as a biomarker of chemotherapeutic response in bladder cancer. The k-means clustering partitioned each bladder tumor into cluster 1 (low kep and low Amp), cluster 2 (low kep and high Amp), cluster 3 (high kep and low Amp). The changes of all three cluster VFs were found to be associated with bladder tumor response to chemotherapy. The VF change of cluster 2 presented with the highest area-under-the-curve value (0.96) and the highest sensitivity/specificity/accuracy (96%/100%/97%) with a selected cutoff value. The k-means clustering of the two DCE-MRI pharmacokinetic parameters can characterize the complex microcirculatory changes within a bladder tumor to enable early prediction of the tumor's chemotherapeutic response. © 2014 Wiley Periodicals, Inc.
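The clustering and volume-fraction steps described above can be sketched in a few lines. This is an illustrative implementation of Lloyd's k-means algorithm on two-dimensional (kep, Amp) voxel values, assuming fixed starting centroids for reproducibility. The function names, the starting centroids and the voxel values are all hypothetical, and the study's own preprocessing is not reproduced.

```python
def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm on 2-D points, starting from the given centroids.
    Returns (labels, final_centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                            + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[k])  # keep empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


def volume_fractions(labels, n_clusters):
    """Fraction of voxels assigned to each cluster."""
    return [labels.count(k) / len(labels) for k in range(n_clusters)]


# Hypothetical nondimensionalized (kep, Amp) values for eight tumor voxels
voxels = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.9), (0.1, 0.8),
          (0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.2, 0.85)]
# Starting centroids: low/low, low kep with high Amp, high kep with low Amp
start = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
labels, centers = kmeans(voxels, start)
print(volume_fractions(labels, 3))
```

Running the same procedure on baseline and mid-cycle data and subtracting the resulting fractions would give the per-cluster volume fraction changes that the study evaluates as response biomarkers.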
Nguyen, Huyen T.; Jia, Guang; Shah, Zarine K.; Pohar, Kamal; Mortazavi, Amir; Zynger, Debra L.; Wei, Lai; Yang, Xiangyu; Clark, Daniel; Knopp, Michael V.
2015-01-01
Purpose To apply k-means clustering of two pharmacokinetic parameters derived from 3T DCE-MRI to predict chemotherapeutic response in bladder cancer at the mid-cycle time-point. Materials and Methods With the pre-determined number of 3 clusters, k-means clustering was performed on non-dimensionalized Amp and kep estimates of each bladder tumor. Three cluster volume fractions (VFs) were calculated for each tumor at baseline and mid-cycle. The changes of three cluster VFs from baseline to mid-cycle were correlated with the tumor’s chemotherapeutic response. Receiver-operating-characteristics curve analysis was used to evaluate the performance of each cluster VF change as a biomarker of chemotherapeutic response in bladder cancer. Results k-means clustering partitioned each bladder tumor into cluster 1 (low kep and low Amp), cluster 2 (low kep and high Amp), cluster 3 (high kep and low Amp). The changes of all three cluster VFs were found to be associated with bladder tumor response to chemotherapy. The VF change of cluster 2 presented with the highest area-under-the-curve value (0.96) and the highest sensitivity/specificity/accuracy (96%/100%/97%) with a selected cutoff value. Conclusion k-means clustering of the two DCE-MRI pharmacokinetic parameters can characterize the complex microcirculatory changes within a bladder tumor to enable early prediction of the tumor’s chemotherapeutic response. PMID:24943272
Performance of protein-structure predictions with the physics-based UNRES force field in CASP11.
Krupa, Paweł; Mozolewska, Magdalena A; Wiśniewska, Marta; Yin, Yanping; He, Yi; Sieradzan, Adam K; Ganzynkowicz, Robert; Lipska, Agnieszka G; Karczyńska, Agnieszka; Ślusarz, Magdalena; Ślusarz, Rafał; Giełdoń, Artur; Czaplewski, Cezary; Jagieła, Dawid; Zaborowski, Bartłomiej; Scheraga, Harold A; Liwo, Adam
2016-11-01
Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of simplified UNRES representation of polypeptide chains, use of enhanced sampling methods, code optimization and parallelization and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for the T0769, for which the CαRMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Freely available on the web at http://www.unres.pl/ CONTACT: has5@cornell.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. 
Magheli, Ahmed; Hinz, Stefan; Hege, Claudia; Stephan, Carsten; Jung, Klaus; Miller, Kurt; Lein, Michael
2010-01-01
We investigated the value of pretreatment prostate specific antigen density to predict Gleason score upgrading in light of significant changes in grading routine in the last 2 decades. Of 1,061 consecutive men who underwent radical prostatectomy between 1999 and 2004, 843 were eligible for study. Prostate specific antigen density was calculated and a cutoff for highest accuracy to predict Gleason upgrading was determined using ROC curve analysis. The predictive accuracy of prostate specific antigen and prostate specific antigen density to predict Gleason upgrading was evaluated using ROC curve analysis based on predicted probabilities from logistic regression models. Prostate specific antigen and prostate specific antigen density predicted Gleason upgrading on univariate analysis (as continuous variables OR 1.07 and 7.21, each p <0.001) and on multivariate analysis (as continuous variables with prostate specific antigen density adjusted for prostate specific antigen OR 1.07, p <0.001 and OR 4.89, p = 0.037, respectively). When prostate specific antigen density was added to the model including prostate specific antigen and other Gleason upgrading predictors, prostate specific antigen lost its predictive value (OR 1.02, p = 0.423), while prostate specific antigen density remained an independent predictor (OR 4.89, p = 0.037). Prostate specific antigen density was more accurate than prostate specific antigen to predict Gleason upgrading (AUC 0.61 vs 0.57, p = 0.030). Prostate specific antigen density is a significant independent predictor of Gleason upgrading even when accounting for prostate specific antigen. This could be especially important in patients with low risk prostate cancer who seek less invasive therapy such as active surveillance since potentially life threatening disease may be underestimated. Further studies are warranted to help evaluate the role of prostate specific antigen density in Gleason upgrading and its significance for biochemical outcome.
Feinstein, Wei P; Brylinski, Michal
2015-01-01
Computational approaches have emerged as an instrumental methodology in modern research. For example, virtual screening by molecular docking is routinely used in computer-aided drug discovery. One of the critical parameters for ligand docking is the size of a search space used to identify low-energy binding poses of drug candidates. Currently available docking packages often come with a default protocol for calculating the box size; however, many of these procedures have not been systematically evaluated. In this study, we investigate how the docking accuracy of AutoDock Vina is affected by the selection of a search space. We propose a new procedure for calculating the optimal docking box size that maximizes the accuracy of binding pose prediction against a non-redundant and representative dataset of 3,659 protein-ligand complexes selected from the Protein Data Bank. Subsequently, we use the Directory of Useful Decoys, Enhanced to demonstrate that the optimized docking box size also yields an improved ranking in virtual screening. Binding pockets in both datasets are derived from the experimental complex structures and, additionally, predicted by eFindSite. A systematic analysis of ligand binding poses generated by AutoDock Vina shows that the highest accuracy is achieved when the dimensions of the search space are 2.9 times larger than the radius of gyration of a docking compound. Subsequent virtual screening benchmarks demonstrate that this optimized docking box size also improves compound ranking. For instance, using predicted ligand binding sites, the average enrichment factor calculated for the top 1 % (10 %) of the screening library is 8.20 (3.28) for the optimized protocol, compared to 7.67 (3.19) for the default procedure. Depending on the evaluation metric, the optimal docking box size gives better ranking in virtual screening for about two-thirds of target proteins.
This fully automated procedure can be used to optimize docking protocols in order to improve the ranking accuracy in production virtual screening simulations. Importantly, the optimized search space systematically yields better results than the default method not only for experimental pockets, but also for those predicted from protein structures. A script for calculating the optimal docking box size is freely available at www.brylinski.org/content/docking-box-size. Graphical Abstract: We developed a procedure to optimize the box size in molecular docking calculations. Left panel shows the predicted binding pose of NADP (green sticks) compared to the experimental complex structure of human aldose reductase (blue sticks) using a default protocol. Right panel shows the docking accuracy using an optimized box size.
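The box-size rule reported in this abstract (search-space dimensions of 2.9 times the ligand's radius of gyration) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function names are hypothetical, the radius of gyration is computed without mass weighting, and a cubic box is assumed.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) atom coordinates.

    Assumption: all atoms count equally (no atomic masses).
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one atom")
    # Geometric centre of the ligand.
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    # Mean squared distance of the atoms from the centre.
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def docking_box_edge(coords, scale=2.9):
    """Edge length of a cubic search box: scale times the radius of gyration.

    The default scale of 2.9 is the ratio quoted in the abstract; the cubic
    shape is an assumption of this sketch.
    """
    return scale * radius_of_gyration(coords)
```

For a toy two-atom ligand at (1, 0, 0) and (-1, 0, 0), the radius of gyration is 1 and the box edge is 2.9, in whatever length unit the coordinates use.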
Thompson, Patrick C; Dalman, Ronald L; Harris, E John; Chandra, Venita; Lee, Jason T; Mell, Matthew W
2016-12-01
The clinical decision-making utility of scoring algorithms for predicting mortality after ruptured abdominal aortic aneurysms (rAAAs) remains unknown. We sought to determine the clinical utility of the algorithms compared with our clinical decision making and outcomes for management of rAAA during a 10-year period. Patients admitted with a diagnosis of rAAA at a large university hospital were identified from 2005 to 2014. The Glasgow Aneurysm Score, Hardman Index, Vancouver Score, Edinburgh Ruptured Aneurysm Score, University of Washington Ruptured Aneurysm Score, Vascular Study Group of New England rAAA Risk Score, and the Artificial Neural Network Score were analyzed for accuracy in predicting mortality. Among patients quantified into the highest-risk group (predicted mortality >80%-85%), we compared the predicted with the actual outcome to determine how well these scores predicted futility. The cohort comprised 64 patients. Of those, 24 (38%) underwent open repair, 36 (56%) underwent endovascular repair, and 4 (6%) received only comfort care. Overall mortality was 30% (open repair, 26%; endovascular repair, 24%; no repair, 100%). As assessed by the scoring systems, 5% to 35% of patients were categorized as high-mortality risk. Intersystem agreement was poor, with κ values ranging from 0.06 to 0.79. Actual mortality was lower than the predicted mortality (50%-70% vs 78%-100%) for all scoring systems, with each scoring system overestimating mortality by 10% to 50%. Mortality rates for patients not designated into the high-risk cohort were dramatically lower, ranging from 7% to 29%. Futility, defined as 100% mortality, was predicted in five of 63 patients with the Hardman Index and in two of 63 with the University of Washington score. Of these, surgery was not offered to one of five and one of two patients, respectively. If one of these two models were used to withhold operative intervention, the mortality of these patients would have been 100%.
The actual mortality for these patients was 60% and 50%, respectively. Clinical algorithms for predicting mortality after rAAA were not useful for predicting futility. Most patients with rAAA were not classified in the highest-risk group by the clinical decision models. Among patients identified as highest risk, predicted mortality was overestimated compared with actual mortality. The data from this study support the limited value to surgeons of the currently published algorithms. Copyright © 2016 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
Improvement of attention with amphetamine in low- and high-performing rats.
Turner, Karly M; Burne, Thomas H J
2016-09-01
Attentional deficits occur in a range of neuropsychiatric disorders, such as schizophrenia and attention deficit hyperactivity disorder. Psychostimulants are one of the main treatments for attentional deficits, yet there are limited reports of procognitive effects of amphetamine in preclinical studies. Therefore, task development may be needed to improve predictive validity when measuring attention in rodents. This study aimed to use a modified signal detection task (SDT) to determine if and at what doses amphetamine could improve attention in rats. Sprague-Dawley rats were trained on the SDT prior to amphetamine challenge (0.1, 0.25, 0.75 and 1.25 mg/kg). This dose range was predicted to enhance and disrupt cognition with the effect differing between individuals depending on baseline performance. Acute low dose amphetamine (0.1 and 0.25 mg/kg) improved accuracy, while the highest dose (1.25 mg/kg) significantly disrupted performance. The effects differed for low- and high-performing groups across these doses. The effect of amphetamine on accuracy was found to significantly correlate with baseline performance in rats. This study demonstrates that improvement in attentional performance with systemic amphetamine is dependent on baseline accuracy in rats. Indicative of the inverted U-shaped relationship between dopamine and cognition, there was a baseline-dependent shift in performance with increasing doses of amphetamine. The SDT may be a useful tool for investigating individual differences in attention and response to psychostimulants in rodents.
Sforza, Alfonso; Mancusi, Costantino; Carlino, Maria Viviana; Buonauro, Agostino; Barozzi, Marco; Romano, Giuseppe; Serra, Sossio; de Simone, Giovanni
2017-06-19
The availability of ultra-miniaturized pocket ultrasound devices (PUD) adds diagnostic power to the clinical examination. Information on the accuracy of ultrasound with handheld units in immediate differential diagnosis in the emergency department (ED) is scarce. The aim of this study is to test the usefulness and accuracy of lung ultrasound (LUS) alone or combined with ultrasound of the heart and inferior vena cava (IVC) using a PUD for the differential diagnosis of acute dyspnea (AD). We included 68 patients presenting to the ED of "Maurizio Bufalini" Hospital in Cesena (Italy) for AD. All patients underwent integrated ultrasound examination (IUE) of lung-heart-IVC, using PUD. The series was divided into patients with dyspnea of cardiac or non-cardiac origin. We used 2 × 2 contingency tables to analyze sensitivity, specificity, positive predictive value and negative predictive value of the three ultrasonic methods and their various combinations for the diagnosis of cardiogenic dyspnea (CD), comparing with the final diagnosis made by an independent emergency physician. LUS alone exhibited a good sensitivity (92.6%) and specificity (80.5%). The highest accuracy (90%) for the diagnosis of CD was obtained with the combination of LUS and one of the other two methods (heart or IVC). The IUE with PUD is a useful extension of the clinical examination, can be readily available at the bedside or in an ambulance, requires a few minutes and has a reliable diagnostic discriminant ability in the setting of AD.
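The 2 × 2 contingency-table measures named in this abstract follow directly from the four cell counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from a 2 x 2 contingency table.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives and true negatives against the reference diagnosis.
    Assumes every row and column total is non-zero.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly detected
        "specificity": tn / (tn + fp),  # non-diseased cases correctly excluded
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / total,  # all correct calls over all cases
    }


# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=8, fp=1, fn=2, tn=9)
```

With these illustrative counts the sensitivity is 8/10, the specificity 9/10 and the accuracy 17/20.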
A Novel Calibration-Minimum Method for Prediction of Mole Fraction in Non-Ideal Mixture.
Shibayama, Shojiro; Kaneko, Hiromasa; Funatsu, Kimito
2017-04-01
This article proposes a novel concentration prediction model that requires little training data and is useful for rapid process understanding. Process analytical technology is currently popular, especially in the pharmaceutical industry, for enhancement of process understanding and process control. A calibration-free method, iterative optimization technology (IOT), was proposed to predict pure component concentrations, because calibration methods such as partial least squares require a large number of training samples, leading to high costs. However, IOT cannot be applied to concentration prediction in non-ideal mixtures because its basic equation is derived from the Beer-Lambert law, which cannot be applied to non-ideal mixtures. We proposed a novel method that realizes prediction of pure component concentrations in mixtures from a small number of training samples, assuming that spectral changes arising from molecular interactions can be expressed as a function of concentration. The proposed method is named IOT with virtual molecular interaction spectra (IOT-VIS) because the method takes spectral change as a virtual spectrum $x_{\mathrm{nonlin},i}$ into account. It was confirmed through the two case studies that the predictive accuracy of IOT-VIS was the highest among existing IOT methods.
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
Memon, S; Lynch, A C; Bressel, M; Wise, A G; Heriot, A G
2015-09-01
Restaging imaging by MRI or endorectal ultrasound (ERUS) following neoadjuvant chemoradiotherapy is not routinely performed, but the assessment of response is becoming increasingly important to facilitate individualization of management. A search of the MEDLINE and Scopus databases was performed for studies that evaluated the accuracy of restaging of rectal cancer following neoadjuvant chemoradiotherapy with MRI or ERUS against the histopathological outcome. A systematic review of selected studies was performed. The methodological quality of studies that qualified for meta-analysis was critically assessed to identify studies suitable for inclusion in the meta-analysis. Sixty-three articles were included in the systematic review. Twelve restaging MRI studies and 18 restaging ERUS studies were eligible for meta-analysis of T-stage restaging accuracy. Overall, ERUS T-stage restaging accuracy (mean [95% CI]: 65% [56-72%]) was nonsignificantly higher than MRI T-stage accuracy (52% [44-59%]). Restaging MRI is accurate at excluding circumferential resection margin involvement. Restaging MRI and ERUS were equivalent for prediction of nodal status: the accuracy of both investigations was 72% with over-staging and under-staging occurring in 10-15%. The heterogeneity amongst restaging studies is high, limiting conclusive findings regarding their accuracies. The accuracy of restaging imaging is different for different pathological T stages and highest for T3 tumours. Morphological assessment of T- or N-stage by MRI or ERUS is currently not accurate or consistent enough for clinical application. Restaging MRI appears to have a role in excluding circumferential resection margin involvement. Colorectal Disease © 2015 The Association of Coloproctology of Great Britain and Ireland.
A uniform management approach to optimize outcome in fetal growth restriction.
Seravalli, Viola; Baschat, Ahmet A
2015-06-01
A uniform approach to the diagnosis and management of fetal growth restriction (FGR) consistently produces better outcome, prevention of unanticipated stillbirth, and appropriate timing of delivery. Early-onset and late-onset FGR represent two distinct clinical phenotypes of placental dysfunction. Management challenges in early-onset FGR revolve around prematurity and coexisting maternal hypertensive disease, whereas in late-onset disease failure of diagnosis or surveillance leading to unanticipated stillbirth is the primary issue. Identifying the surveillance tests that have the highest predictive accuracy for fetal acidemia and establishing the appropriate monitoring interval to detect fetal deterioration is a high priority. Copyright © 2015 Elsevier Inc. All rights reserved.
Mapping permafrost in the boreal forest with Thematic Mapper satellite data
NASA Technical Reports Server (NTRS)
Morrissey, L. A.; Strong, L. L.; Card, D. H.
1986-01-01
A geographic data base incorporating Landsat TM data was used to develop and evaluate logistic discriminant functions for predicting the distribution of permafrost in a boreal forest watershed. The data base included both satellite-derived information and ancillary map data. Five permafrost classifications were developed from a stratified random sample of the data base and evaluated by comparison with a photo-interpreted permafrost map using contingency table analysis and soil temperatures recorded at sites within the watershed. A classification using a TM thermal band and a TM-derived vegetation map as independent variables yielded the highest mapping accuracy for all permafrost categories.
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. 
Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
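The AUC used in this abstract to score the accuracy predictions can be computed as the probability that a correctly classified test pixel receives a higher predicted accuracy than a misclassified one. The sketch below is a generic pairwise (Mann-Whitney) implementation, not the authors' code; the function name and the label coding (1 for a correctly classified pixel, 0 otherwise) are assumptions.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted accuracy for each test pixel.
    labels: 1 if the pixel was correctly classified, 0 otherwise (assumed coding).
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The double loop is quadratic in the sample size, which is acceptable for a small test sample; a rank-based formulation would be preferable for large ones.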
Radiogenomics to characterize regional genetic heterogeneity in glioblastoma.
Hu, Leland S; Ning, Shuluo; Eschbacher, Jennifer M; Baxter, Leslie C; Gaw, Nathan; Ranjbar, Sara; Plasencia, Jonathan; Dueck, Amylou C; Peng, Sen; Smith, Kris A; Nakaji, Peter; Karis, John P; Quarles, C Chad; Wu, Teresa; Loftus, Joseph C; Jenkins, Robert B; Sicotte, Hugues; Kollmeyer, Thomas M; O'Neill, Brian P; Elmquist, William; Hoxworth, Joseph M; Frakes, David; Sarkaria, Jann; Swanson, Kristin R; Tran, Nhan L; Li, Jing; Mitchell, J Ross
2017-01-01
Glioblastoma (GBM) exhibits profound intratumoral genetic heterogeneity. Each tumor comprises multiple genetically distinct clonal populations with different therapeutic sensitivities. This has implications for targeted therapy and genetically informed paradigms. Contrast-enhanced (CE)-MRI and conventional sampling techniques have failed to resolve this heterogeneity, particularly for nonenhancing tumor populations. This study explores the feasibility of using multiparametric MRI and texture analysis to characterize regional genetic heterogeneity throughout MRI-enhancing and nonenhancing tumor segments. We collected multiple image-guided biopsies from primary GBM patients throughout regions of enhancement (ENH) and nonenhancing parenchyma (so called brain-around-tumor, [BAT]). For each biopsy, we analyzed DNA copy number variants for core GBM driver genes reported by The Cancer Genome Atlas. We co-registered biopsy locations with MRI and texture maps to correlate regional genetic status with spatially matched imaging measurements. We also built multivariate predictive decision-tree models for each GBM driver gene and validated accuracies using leave-one-out-cross-validation (LOOCV). We collected 48 biopsies (13 tumors) and identified significant imaging correlations (univariate analysis) for 6 driver genes: EGFR, PDGFRA, PTEN, CDKN2A, RB1, and TP53. Predictive model accuracies (on LOOCV) varied by driver gene of interest. Highest accuracies were observed for PDGFRA (77.1%), EGFR (75%), CDKN2A (87.5%), and RB1 (87.5%), while lowest accuracy was observed in TP53 (37.5%). Models for 4 driver genes (EGFR, RB1, CDKN2A, and PTEN) showed higher accuracy in BAT samples (n = 16) compared with those from ENH segments (n = 32). MRI and texture analysis can help characterize regional genetic heterogeneity, which offers potential diagnostic value under the paradigm of individualized oncology. © The Author(s) 2016. 
Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved.
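The leave-one-out cross-validation (LOOCV) mentioned in this abstract holds out each biopsy in turn, fits the model on the remainder, and scores the held-out prediction. The sketch below shows the loop only. The study used decision-tree models; the majority-class classifier here is a deliberately trivial stand-in, and all function names are hypothetical.

```python
from collections import Counter


def loocv_accuracy(features, labels, fit, predict):
    """Leave-one-out accuracy for any fit/predict pair.

    features: list of feature vectors; labels: list of class labels.
    fit(train_x, train_y) returns a model; predict(model, x) returns a label.
    """
    n = len(labels)
    correct = 0
    for i in range(n):
        # Train on every sample except the i-th one.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, features[i]) == labels[i]:
            correct += 1
    return correct / n


def fit_majority(train_x, train_y):
    """Stand-in model (not a decision tree): remember the most common label."""
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model, x):
    """Predict the remembered label regardless of the features."""
    return model
```

With n held-out samples, every LOOCV accuracy is a multiple of 1/n; with the 48 biopsies reported here, 42/48 gives 87.5% and 18/48 gives 37.5%.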
Sewe, Maquins Odhiambo; Tozan, Yesim; Ahlm, Clas; Rocklöv, Joacim
2017-06-01
Malaria surveillance data provide an opportunity to develop forecasting models. Seasonal variability in environmental factors correlates with malaria transmission; thus, the identification of transmission patterns is useful in developing prediction models. However, with changing seasonal transmission patterns, either due to interventions or shifting weather seasons, traditional modelling approaches may not yield adequate predictive skill. Two statistical models, a general additive model (GAM) and a GAMBOOST model with boosted regression, were contrasted by assessing their predictive accuracy in forecasting malaria admissions at lead times of one to three months. Monthly admission data for children under five years with confirmed malaria at the Siaya district hospital in Western Kenya for the period 2003 to 2013 were used together with satellite-derived data on rainfall, average temperature and evapotranspiration (ET). There was a total of 8,476 confirmed malaria admissions. The peak of the malaria season changed and malaria admissions reduced over time. The GAMBOOST model at 1-month lead time had the highest predictive skill during both the training and test periods and thus can be utilized in a malaria early warning system.
Konishi, Tsuyoshi; Shimada, Yoshifumi; Lee, Lik Hang; Cavalcanti, Marcela S; Hsu, Meier; Smith, Jesse Joshua; Nash, Garrett M; Temple, Larissa K; Guillem, José G; Paty, Philip B; Garcia-Aguilar, Julio; Vakiani, Efsevia; Gonen, Mithat; Shia, Jinru; Weiser, Martin R
2018-06-01
This study aimed to compare common histologic markers at the invasive front of colon adenocarcinoma in terms of prognostic accuracy and interobserver agreement. Consecutive patients who underwent curative resection for stages I to III colon adenocarcinoma at a single institution in 2007 to 2014 were identified. Poorly differentiated clusters (PDCs), tumor budding, perineural invasion, desmoplastic reaction, and Crohn-like lymphoid reaction at the invasive front, as well as the World Health Organization (WHO) grade of the entire tumor, were analyzed. Prognostic accuracies for recurrence-free survival (RFS) were compared, and interobserver agreement among 3 pathologists was assessed. The study cohort consisted of 851 patients. Although all the histologic markers except WHO grade were significantly associated with RFS (PDCs, tumor budding, perineural invasion, and desmoplastic reaction: P<0.001; Crohn-like lymphoid reaction: P=0.021), PDCs (grade 1 [G1]: n=581; G2: n=145; G3: n=125) showed the largest separation of 3-year RFS in the full cohort (G1: 94.1%; G3: 63.7%; hazard ratio [HR], 6.39; 95% confidence interval [CI], 4.11-9.95; P<0.001), stage II patients (G1: 94.0%; G3: 67.3%; HR, 4.15; 95% CI, 1.96-8.82; P<0.001), and stage III patients (G1: 89.0%; G3: 59.4%; HR, 4.50; 95% CI, 2.41-8.41; P<0.001). PDCs had the highest prognostic accuracy for RFS with the concordance probability estimate of 0.642, whereas WHO grade had the lowest. Interobserver agreement was the highest for PDCs, with a weighted kappa of 0.824. The risk of recurrence over time peaked earlier for worse PDCs grade. Our findings indicate that PDCs are the best invasive-front histologic marker in terms of prognostic accuracy and interobserver agreement. PDCs may replace WHO grade as a prognostic indicator.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sorensen, J; Duran, C; Stingo, F
Purpose: To characterize the effect of virtual monochromatic reconstructions on several commonly used texture analysis features in DECT of the chest. Further, to assess the effect of monochromatic energy levels on the ability of these textural features to identify tissue types. Methods: 20 consecutive patients underwent chest CTs for evaluation of lung nodules using Siemens Somatom Definition Flash DECT. Virtual monochromatic images were constructed at 10 keV intervals from 40–190 keV. For each patient, an ROI delineated the lesion under investigation, and cylindrical ROIs were placed within 5 different healthy tissues (blood, fat, muscle, lung, and liver). Several histogram- and Grey Level Cooccurrence Matrix (GLCM)-based texture features were then evaluated in each ROI at each energy level. As a means of validation, these feature values were then used in a random forest classifier to attempt to identify the tissue types present within each ROI. Their predictive accuracy at each energy level was recorded. Results: All textural features changed considerably with virtual monochromatic energy, particularly below 70 keV. Most features exhibited a global minimum or maximum around 80 keV, and while feature values changed with energy above this, patient ranking was generally unaffected. As expected, blood demonstrated the lowest inter-patient variability for all features, while lung lesions (encompassing many different pathologies) exhibited the highest. The accuracy of these features in identifying tissues (76% accuracy) was highest at 80 keV, but no clear relationship between energy and classification accuracy was found. Two common misclassifications (blood vs liver and muscle vs fat) accounted for the majority (24 of the 28) of the errors observed. Conclusion: All textural features were highly dependent on virtual monochromatic energy level, especially below 80 keV, and were more stable above this energy.
However, in a random forest model, these commonly used features were able to reliably differentiate between most tissue types regardless of energy level. Dr Godoy has received a dual-energy CT research grant from Siemens Healthcare. That grant did not directly fund this research.
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%).
Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
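The cross application described in this abstract amounts to evaluating the logistic function with coefficients fitted in one area on factor values taken from another. The sketch below shows only that evaluation step. The function name, factor names and coefficient values are hypothetical, and numeric factor values are assumed (categorical factors such as lithology or soil type would need an encoding that the abstract does not describe).

```python
import math


def landslide_probability(intercept, coefficients, factors):
    """Logistic-regression probability for one map cell.

    coefficients: factor name -> coefficient fitted in the source area.
    factors: factor name -> numeric value observed in the target area.
    """
    # Linear predictor: intercept plus the weighted sum of factor values.
    z = intercept + sum(coefficients[name] * factors[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients "fitted" in one area, applied to a cell of another.
coefficients_area_a = {"slope": 0.05, "ndvi": -1.0}
cell_in_area_b = {"slope": 30.0, "ndvi": 0.5}
p = landslide_probability(-1.0, coefficients_area_a, cell_in_area_b)
```

Here the linear predictor is -1.0 + 0.05 × 30 - 1.0 × 0.5 = 0, so the illustrative probability is 0.5.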
Spittle, Alicia J; Boyd, Roslyn N; Inder, Terrie E; Doyle, Lex W
2009-02-01
The objective of this study was to compare the predictive value of qualitative MRI of brain structure at term and general movements assessments at 1 and 3 months' corrected age for motor outcome at 1 year's corrected age in very preterm infants. Eighty-six very preterm infants (<30 weeks' gestation) underwent MRI at term-equivalent age, were evaluated for white matter abnormality, and had general movements assessed at 1 and 3 months' corrected age. Motor outcome at 1 year's corrected age was evaluated with the Alberta Infant Motor Scale, the Neuro-Sensory Motor Development Assessment, and the diagnosis of cerebral palsy by the child's pediatrician. At 1 year of age, the Alberta Infant Motor Scale categorized 30 (35%) infants as suspicious/abnormal; the Neuro-Sensory Motor Development Assessment categorized 16 (18%) infants with mild-to-severe motor dysfunction, and 5 (6%) infants were classified with cerebral palsy. White matter abnormality at term and general movements at 1 and 3 months significantly correlated with Alberta Infant Motor Scale and Neuro-Sensory Motor Development Assessment scores at 1 year. White matter abnormality and general movements at 3 months were the only assessments that correlated with cerebral palsy. All assessments had 100% sensitivity in predicting cerebral palsy. White matter abnormality demonstrated the greatest accuracy in predicting combined motor outcomes, with excellent levels of specificity (>90%); however, the sensitivity was low. On the other hand, general movements assessments at 1 month had the highest sensitivity (>80%); however, the overall accuracy was relatively low. Neuroimaging (MRI) and functional (general movements) examinations have important complementary roles in predicting motor development of very preterm infants.
Zhu, Xiaolei; Mitchell, Julie C
2011-09-01
Hot spots constitute a small fraction of protein-protein interface residues, yet they account for a large fraction of the binding affinity. Based on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. A number of improvements were made in developing these new methods. First, we created a training data set that contained a similar number of hot spot and non-hot spot residues. In addition, we generated 47 different features, and different numbers of features were used to train the models to avoid over-fitting. Finally, two feature combinations were selected: One (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. The two models were built using support vector machines (SVM). The two KFC2 models were then tested on a mixed independent test set, and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.85); however, the false positive rate was somewhat higher than for other models. KFC2b showed the best predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.62) among all methods other than KFC2a, and the False Positive Rate (FPR = 0.15) was comparable with other highly predictive methods. Copyright © 2011 Wiley-Liss, Inc.
Lee, Sunghee; Lee, Seung Ku; Kim, Jong Yeol; Cho, Namhan; Shin, Chol
2017-09-02
To examine whether the use of Sasang constitutional (SC) types, such as Tae-yang (TY), Tae-eum (TE), So-yang (SY), and So-eum (SE) types, increases the accuracy of risk prediction for metabolic syndrome. From 2001 to 2014, 3529 individuals aged 40 to 69 years participated in a longitudinal prospective cohort. The Cox proportional hazard model was utilized to predict the risk of developing metabolic syndrome. During the 14-year follow-up, 1591 incident events of metabolic syndrome were observed. Individuals with TE type had higher body mass indexes and waist circumferences than individuals with SY and SE types. The risk of developing metabolic syndrome was the highest among individuals with the TE type, followed by the SY type and the SE type. When the prediction risk models for incident metabolic syndrome were compared, the area under the curve for the model using SC types was significantly increased to 0.8173. Significant predictors for incident metabolic syndrome were different according to the SC types. For individuals with the TE type, the significant predictors were age, sex, body mass index (BMI), education, smoking, drinking, fasting glucose level, high-density lipoprotein (HDL) cholesterol level, systolic and diastolic blood pressure, and triglyceride level. For individuals with the SE type, the predictors were sex, smoking, fasting glucose, HDL cholesterol level, systolic and diastolic blood pressure, and triglyceride level, while the predictors in individuals with the SY type were age, sex, BMI, smoking, drinking, total cholesterol level, fasting glucose level, HDL cholesterol level, systolic and diastolic blood pressure, and triglyceride level. In this prospective cohort study among 3529 individuals, we observed that utilizing the SC types significantly increased the accuracy of the risk prediction for the development of metabolic syndrome.
NASA Astrophysics Data System (ADS)
Oroza, C.; Bales, R. C.; Zheng, Z.; Glaser, S. D.
2017-12-01
Predicting the spatial distribution of soil moisture in mountain environments is confounded by multiple factors, including complex topography, spatial variability of soil texture, sub-surface flow paths, and snow-soil interactions. While remote-sensing tools such as passive-microwave monitoring can measure spatial variability of soil moisture, they only capture near-surface soil layers. Large-scale sensor networks are increasingly providing soil-moisture measurements at high temporal resolution across a broader range of depths than are accessible from remote sensing. It may be possible to combine these in-situ measurements with high-resolution LIDAR topography and canopy cover to estimate the spatial distribution of soil moisture at high spatial resolution at multiple depths. We study the feasibility of this approach using six years (2009-2014) of daily volumetric water content measurements at 10-, 30-, and 60-cm depths from the Southern Sierra Critical Zone Observatory. A non-parametric, multivariate regression algorithm, Random Forest, was used to predict the spatial distribution of depth-integrated soil-water storage, based on the in-situ measurements and a combination of node attributes (topographic wetness, northness, elevation, soil texture, and location with respect to canopy cover). We observe predictable patterns of predictor accuracy and independent variable ranking during the six-year study period. Predictor accuracy is highest during the snow-cover and early recession periods but declines during the dry period. Soil texture has consistently high feature importance. Other landscape attributes exhibit seasonal trends: northness peaks during the wet-up period, and elevation and topographic-wetness index peak during the recession and dry period, respectively.
Left atrial strain predicts hemodynamic parameters in cardiovascular patients.
Hewing, Bernd; Theres, Lena; Spethmann, Sebastian; Stangl, Karl; Dreger, Henryk; Knebel, Fabian
2017-08-01
We aimed to evaluate the predictive value of left atrial (LA) reservoir, conduit, and contractile function parameters as assessed by speckle tracking echocardiography (STE) for invasively measured hemodynamic parameters in a patient cohort with myocardial and valvular diseases. Sixty-nine patients undergoing invasive hemodynamic assessment were enrolled into the study. Invasive hemodynamic parameters were obtained by left and right heart catheterization. Transthoracic echocardiography assessment of LA reservoir, conduit, and contractile function was performed by STE. Forty-nine patients had sinus rhythm (SR) and 20 patients had permanent atrial fibrillation (AF). AF patients had significantly reduced LA reservoir function compared to SR patients. In patients with SR, LA reservoir, conduit, and contractile function inversely correlated with pulmonary capillary wedge pressure (PCWP), left ventricular end-diastolic pressure, and mean pulmonary artery pressure (PAP), and showed a moderate association with cardiac index. In AF patients, there were no significant correlations between LA reservoir function and invasively obtained hemodynamic parameters. In SR patients, LA contractile function with a cutoff value of 16.0% had the highest diagnostic accuracy (area under the curve, AUC: 0.895) to predict PCWP ≥18 mm Hg compared to the weaker diagnostic accuracy of average E/E' ratio with an AUC of 0.786 at a cutoff value of 14.3. In multivariate analysis, LA contractile function remained significantly associated with PCWP ≥18 mm Hg. In a cohort of patients with a broad spectrum of cardiovascular diseases LA strain shows a valuable prediction of hemodynamic parameters, specifically LV filling pressures, in the presence of SR. © 2017, Wiley Periodicals, Inc.
A wavelet-based technique to predict treatment outcome for Major Depressive Disorder.
Mumtaz, Wajid; Xia, Likun; Mohd Yasin, Mohd Azhar; Azhar Ali, Syed Saad; Malik, Aamir Saeed
2017-01-01
Treatment management for Major Depressive Disorder (MDD) has been challenging. However, electroencephalogram (EEG)-based predictions of antidepressant treatment outcome may help during antidepressant selection and ultimately improve the quality of life for MDD patients. In this study, a machine learning (ML) method involving pretreatment EEG data was proposed to perform such predictions for selective serotonin reuptake inhibitors (SSRIs). For this purpose, the acquisition of experimental data involved 34 MDD patients and 30 healthy controls. A feature matrix, termed the EEG data matrix, was constructed from a time-frequency decomposition of the EEG data based on wavelet transform (WT) analysis. However, the resultant EEG data matrix had high dimensionality. Therefore, dimension reduction was performed using a rank-based feature selection method with the receiver operating characteristic (ROC) as the criterion. As a result, the most significant features were identified and further utilized during the training and testing of a classification model, i.e., the logistic regression (LR) classifier. Finally, the LR model was validated with 100 iterations of 10-fold cross-validation (10-CV). The classification results were compared with short-time Fourier transform (STFT) analysis and empirical mode decomposition (EMD). The wavelet features extracted from frontal and temporal EEG data were found to be statistically significant. In comparison with other time-frequency approaches such as the STFT and EMD, the WT analysis showed the highest classification accuracy, i.e., accuracy = 87.5%, sensitivity = 95%, and specificity = 80%. In conclusion, significant wavelet coefficients extracted from frontal and temporal pre-treatment EEG data involving delta and theta frequency bands may predict antidepressant treatment outcome for MDD patients.
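The three figures reported above are linked by a simple identity: overall accuracy is the prevalence-weighted average of sensitivity and specificity. The sketch below only illustrates that identity. The function name is hypothetical, and the 50/50 split between the two outcome classes is an assumption (the abstract does not state the class sizes); under that assumption the reported sensitivity and specificity reproduce the reported accuracy.

```python
def accuracy_from_rates(sensitivity, specificity, positive_fraction):
    """Overall accuracy as a prevalence-weighted mean of the two rates.

    positive_fraction is the share of samples in the positive class.
    """
    if not 0.0 <= positive_fraction <= 1.0:
        raise ValueError("positive_fraction must lie in [0, 1]")
    return sensitivity * positive_fraction + specificity * (1.0 - positive_fraction)


# Reported rates from the abstract; the balanced class split is an assumption.
balanced_accuracy = accuracy_from_rates(0.95, 0.80, positive_fraction=0.5)
print(f"accuracy with balanced classes: {balanced_accuracy:.3f}")
```

With unequal class sizes the same rates would give a different overall accuracy, which is why sensitivity and specificity are usually reported alongside it.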
Clinico-pathological nomogram for predicting BRAF mutational status of metastatic colorectal cancer.
Loupakis, Fotios; Moretto, Roberto; Aprile, Giuseppe; Muntoni, Marta; Cremolini, Chiara; Iacono, Donatella; Casagrande, Mariaelena; Ferrari, Laura; Salvatore, Lisa; Schirripa, Marta; Rossini, Daniele; De Maglio, Giovanna; Fasola, Gianpiero; Calvetti, Lorenzo; Pilotto, Sara; Carbognin, Luisa; Fontanini, Gabriella; Tortora, Giampaolo; Falcone, Alfredo; Sperduti, Isabella; Bria, Emilio
2016-01-12
In metastatic colorectal cancer (mCRC), BRAFV600E mutation has been variously associated with specific clinico-pathological features. Two large retrospective series of mCRC patients from two Italian Institutions were used as training-set (TS) and validation-set (VS) for developing a nomogram predictive of BRAFV600E status. The model was internally and externally validated. In the TS, data from 596 mCRC patients were gathered (RAS wild-type (wt) 281 (47.1%); BRAFV600E mutated 54 (9.1%)); RAS and BRAFV600E mutations were mutually exclusive. In the RAS-wt population, right-sided primary (odds ratio (OR): 7.80, 95% confidence interval (CI) 3.05-19.92), female gender (OR: 2.90, 95% CI 1.14-7.37) and mucinous histology (OR: 4.95, 95% CI 1.90-12.90) were independent predictors of BRAFV600E mutation, with high replication at internal validation (100%, 93% and 98%, respectively). A predictive nomogram was calculated: patients with the highest score (right-sided primary, female and mucinous) had an 81% chance to bear a BRAFV600E-mutant tumour; accuracy measures: AUC=0.812, SE:0.034, sensitivity:81.2%; specificity:72.1%. In the VS (508 pts, RAS wt: 262 (51.6%), BRAFV600E mutated: 49 (9.6%)), right-sided primary, female gender and mucinous histology were confirmed as independent predictors of BRAFV600E mutation with high accuracy. Three simple and easy-to-collect characteristics define a useful nomogram for predicting BRAF status in mCRC with high specificity and sensitivity.
Genomic Prediction of Gene Bank Wheat Landraces.
Crossa, José; Jarquín, Diego; Franco, Jorge; Pérez-Rodríguez, Paulino; Burgueño, Juan; Saint-Pierre, Carolina; Vikram, Prashant; Sansaloni, Carolina; Petroli, Cesar; Akdemir, Deniz; Sneller, Clay; Reynolds, Matthew; Tattaris, Maria; Payne, Thomas; Guzman, Carlos; Peña, Roberto J; Wenzl, Peter; Singh, Sukhwinder
2016-07-07
This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, "diversity" and "prediction", including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15-20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm into elite materials. 
Copyright © 2016 Crossa et al.
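The TRN20-TST80 strategy described above (20% of accessions used for training, 80% for testing, drawn at random) can be sketched as a repeated random partition. This is a minimal illustration and not the authors' pipeline: the function name, the number of repeats, and the seed are assumptions, since the abstract does not say how many random partitions were drawn. Only the 20/80 proportions and the collection size of 8416 Mexican accessions come from the text.

```python
import random


def trn_tst_partitions(n_accessions, train_fraction=0.2, n_repeats=5, seed=0):
    """Return n_repeats random (train, test) index partitions.

    Each partition places round(n_accessions * train_fraction) accessions
    in the training set and all remaining accessions in the testing set.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_train = round(n_accessions * train_fraction)
    indices = list(range(n_accessions))
    partitions = []
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        train = sorted(shuffled[:n_train])
        test = sorted(shuffled[n_train:])
        partitions.append((train, test))
    return partitions


# 8416 is the size of the Mexican collection reported in the abstract.
partitions = trn_tst_partitions(8416)
train, test = partitions[0]
print(len(train), len(test))
```

In a full analysis a prediction model would be fitted on each training set and scored on the matching testing set, with accuracies averaged over the partitions; that step is omitted here because the abstract does not specify it.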
Aslan, Kerim; Gunbey, Hediye Pinar; Tomak, Leman; Ozmen, Zafer; Incesu, Lutfi
The aim of this study was to investigate whether combining quantitative metrics (mamillopontine distance [MPD], pontomesencephalic angle, and mesencephalon anterior-posterior/medial-lateral diameter ratios) with qualitative signs (dural enhancement, subdural collections/hematoma, venous engorgement, pituitary gland enlargements, and tonsillar herniations) provides a more accurate diagnosis of intracranial hypotension (IH). The quantitative metrics and qualitative signs of 34 patients and 34 control subjects were assessed by 2 independent observers. Receiver operating characteristic (ROC) curves were used to evaluate the diagnostic performance of quantitative metrics and qualitative signs, and optimum cutoff values of the quantitative metrics for the diagnosis of IH were found with ROC analysis. Combined ROC curves were computed for combinations of quantitative metrics and qualitative signs to determine diagnostic accuracy; sensitivity, specificity, and positive and negative predictive values were calculated, and the best model combination was identified. Whereas MPD and pontomesencephalic angle were significantly lower in patients with IH when compared with the control group (P < 0.001), mesencephalon anterior-posterior/medial-lateral diameter ratio was significantly higher (P < 0.001). For qualitative signs, the highest individual distinctive power was dural enhancement with area under the ROC curve (AUC) of 0.838. For quantitative metrics, the highest individual distinctive power was MPD with AUC of 0.947. The best accuracy in the diagnosis of IH was obtained by combination of dural enhancement, venous engorgement, and MPD with an AUC of 1.00. This study showed that the combined use of dural enhancement, venous engorgement, and MPD had diagnostic accuracy of 100% for the diagnosis of IH. Therefore, a more accurate IH diagnosis can be provided by combining quantitative metrics with qualitative signs.
Genomic Prediction of Gene Bank Wheat Landraces
Crossa, José; Jarquín, Diego; Franco, Jorge; Pérez-Rodríguez, Paulino; Burgueño, Juan; Saint-Pierre, Carolina; Vikram, Prashant; Sansaloni, Carolina; Petroli, Cesar; Akdemir, Deniz; Sneller, Clay; Reynolds, Matthew; Tattaris, Maria; Payne, Thomas; Guzman, Carlos; Peña, Roberto J.; Wenzl, Peter; Singh, Sukhwinder
2016-01-01
This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, “diversity” and “prediction”, including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15–20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm into elite materials. PMID:27172218
Accuracy of Predicted Genomic Breeding Values in Purebred and Crossbred Pigs.
Hidalgo, André M; Bastiaansen, John W M; Lopes, Marcos S; Harlizius, Barbara; Groenen, Martien A M; de Koning, Dirk-Jan
2015-05-26
Genomic selection has been widely implemented in dairy cattle breeding when the aim is to improve performance of purebred animals. In pigs, however, the final product is a crossbred animal. This may affect the efficiency of methods that are currently implemented for dairy cattle. Therefore, the objective of this study was to determine the accuracy of predicted breeding values in crossbred pigs using purebred genomic and phenotypic data. A second objective was to compare the predictive ability of SNPs when training is done in either single or multiple populations for four traits: age at first insemination (AFI); total number of piglets born (TNB); litter birth weight (LBW); and litter variation (LVR). We performed marker-based and pedigree-based predictions. Within-population predictions for the four traits ranged from 0.21 to 0.72. Multi-population prediction yielded accuracies ranging from 0.18 to 0.67. Predictions across purebred populations as well as predicting genetic merit of crossbreds from their purebred parental lines for AFI performed poorly (not significantly different from zero). In contrast, accuracies of across-population predictions and accuracies of purebred to crossbred predictions for LBW and LVR ranged from 0.08 to 0.31 and 0.11 to 0.31, respectively. Accuracy for TNB was zero for across-population prediction, whereas for purebred to crossbred prediction it ranged from 0.08 to 0.22. In general, marker-based outperformed pedigree-based prediction across populations and traits. However, in some cases pedigree-based prediction performed similarly or outperformed marker-based prediction. There was predictive ability when purebred populations were used to predict crossbred genetic merit using an additive model in the populations studied. AFI was the only exception, indicating that predictive ability depends largely on the genetic correlation between PB and CB performance, which was 0.31 for AFI. 
Multi-population prediction was no better than within-population prediction for the purebred validation set. Accuracy of prediction was very trait-dependent. Copyright © 2015 Hidalgo et al.
A Comparison of Machine Learning Approaches for Corn Yield Estimation
NASA Astrophysics Data System (ADS)
Kim, N.; Lee, Y. W.
2017-12-01
Machine learning is an efficient empirical method for classification and prediction, and it offers another approach to crop yield estimation. The objective of this study is to estimate corn yield in the Midwestern United States by employing machine learning approaches such as the support vector machine (SVM), random forest (RF), and deep neural networks (DNN), and to perform a comprehensive comparison of their results. We constructed the database using satellite images from MODIS, climate data from the PRISM Climate Group, and GLDAS soil moisture data. In addition, to examine the seasonal sensitivities of corn yields, two period groups were set up: May to September (MJJAS) and July and August (JA). Overall, the DNN showed the highest accuracies in terms of the correlation coefficient for the two period groups. The differences between our predictions and USDA yield statistics were about 10-11%.
Mapping the Transmission Risk of Zika Virus using Machine Learning Models.
Jiang, Dong; Hao, Mengmeng; Ding, Fangyu; Fu, Jingying; Li, Meng
2018-06-19
Zika virus, which has been linked to severe congenital abnormalities, is exacerbating global public health problems with its rapid transnational expansion fueled by increased global travel and trade. Suitability mapping of the transmission risk of Zika virus is essential for drafting public health plans and disease control strategies, which are especially important in areas where medical resources are relatively scarce. Predicting the risk of Zika virus outbreak has been studied in recent years, but the published literature rarely includes multiple model comparisons or predictive uncertainty analysis. Here, three relatively popular machine learning models including backward propagation neural network (BPNN), gradient boosting machine (GBM) and random forest (RF) were adopted to map the probability of Zika epidemic outbreak at the global level, pairing high-dimensional multidisciplinary covariate layers with comprehensive location data on recorded Zika virus infection in humans. The results show that the predicted high-risk areas for Zika transmission are concentrated in four regions: Southeastern North America, Eastern South America, Central Africa and Eastern Asia. To evaluate the performance of machine learning models, the 50 modeling processes were conducted based on a training dataset. The BPNN model obtained the highest predictive accuracy with a 10-fold cross-validation area under the curve (AUC) of 0.966 [95% confidence interval (CI) 0.965-0.967], followed by the GBM model (10-fold cross-validation AUC = 0.964[0.963-0.965]) and the RF model (10-fold cross-validation AUC = 0.963[0.962-0.964]). Based on training samples, compared with the BPNN-based model, we find that significant differences (p = 0.0258* and p = 0.0001***, respectively) are observed for prediction accuracies achieved by the GBM and RF models. 
Importantly, the prediction uncertainty introduced by the selection of absence data was quantified and could provide more accurate fundamental and scientific information for further study on disease transmission prediction and risk assessment. Copyright © 2018. Published by Elsevier B.V.
Pearson, Amy C. S.; Subramanian, Arun; Schroeder, Darrell R.; Findlay, James Y.
2017-01-01
Background The surgical Apgar score (SAS) is a 10-point scale using the lowest heart rate, lowest mean arterial pressure, and estimated blood loss (EBL) during surgery to predict postoperative outcomes. The SAS has not yet been validated in liver transplantation patients, because typical blood loss usually exceeds the highest EBL category. Our primary aim was to develop a modified SAS for liver transplant (SAS-LT) by replacing the EBL parameter with volume of red cells transfused. We hypothesized that the SAS-LT would predict death or severe complication within 30 days of transplant with similar accuracy to current scoring systems. Methods A retrospective cohort of consecutive liver transplantations from July 2007 to November 2013 was used to develop the SAS-LT. The predictive ability of SAS-LT for early postoperative outcomes was compared with Model for End-stage Liver Disease, Sequential Organ Failure Assessment, and Acute Physiology and Chronic Health Evaluation III scores using multivariable logistic regression and receiver operating characteristic analysis. Results Of 628 transplants, death or serious perioperative morbidity occurred in 105 (16.7%). The SAS-LT (receiver operating characteristic area under the curve [AUC], 0.57) had similar predictive ability to Acute Physiology and Chronic Health Evaluation III, model for end-stage liver disease, and Sequential Organ Failure Assessment scores (0.57, 0.56, and 0.61, respectively). Seventy-nine (12.6%) patients were discharged from the ICU in 24 hours or less. These patients’ SAS-LT scores were significantly higher than those with a longer stay (7.0 vs 6.2, P < 0.01). The AUC on multivariable modeling remained predictive of early ICU discharge (AUC, 0.67). Conclusions The SAS-LT utilized simple intraoperative metrics to predict early morbidity and mortality after liver transplant with similar accuracy to other scoring systems at an earlier postoperative time point. PMID:29184910
NASA Astrophysics Data System (ADS)
Nakatsugawa, M.; Kobayashi, Y.; Okazaki, R.; Taniguchi, Y.
2017-12-01
This research aims to improve the accuracy of water level prediction calculations for more effective river management. In August 2016, Hokkaido was visited by four typhoons, whose heavy rainfall caused severe flooding. In the Tokoro river basin of Eastern Hokkaido, the water level (WL) at the Kamikawazoe gauging station, which is in the lower reaches, exceeded the design high-water level and the water rose to the highest level on record. To predict such flood conditions and mitigate disaster damage, it is necessary to improve the accuracy of prediction as well as to prolong the lead time (LT) required for disaster mitigation measures such as flood-fighting activities and evacuation actions by residents. There is a need to predict the river water level around the peak stage earlier and more accurately. Previous research dealing with WL prediction had proposed a method in which the WL at the lower reaches is estimated by the correlation with the WL at the upper reaches (hereinafter: "the water level correlation method"). Additionally, a runoff model-based method has been generally used in which the discharge is estimated by giving rainfall prediction data to a runoff model such as a storage function model and then the WL is estimated from that discharge by using a WL-discharge rating curve (H-Q curve). In this research, an attempt was made to predict WL by applying the Random Forest (RF) method, which is a machine learning method that can estimate the contribution of explanatory variables. Furthermore, from the practical point of view, we investigated the prediction of WL based on a multiple correlation (MC) method using explanatory variables with high contribution in the RF method, and we examined the proper selection of explanatory variables and the extension of LT.
The following results were found: 1) Based on the RF method tuned up by learning from previous floods, the WL for the abnormal flood case of August 2016 was properly predicted with a lead time of 6 h. 2) Based on the contribution of explanatory variables, factors were selected for the MC method. In this way, plausible prediction results were obtained.
Rabin, Laura A.; Paré, Nadia; Saykin, Andrew J.; Brown, Michael J.; Wishart, Heather A.; Flashman, Laura A.; Santulli, Robert B.
2011-01-01
Episodic memory is the first and most severely affected cognitive domain in Alzheimer's disease (AD), and it is also the key early marker in prodromal stages including amnestic mild cognitive impairment (MCI). The relative ability of memory tests to discriminate between MCI and normal aging has not been well characterized. We compared the classification value of widely used verbal memory tests in distinguishing healthy older adults (n = 51) from those with MCI (n = 38). Univariate logistic regression indicated that the total learning score from the California Verbal Learning Test-II (CVLT-II) ranked highest in terms of distinguishing MCI from normal aging (sensitivity = 90.2; specificity = 84.2). Inclusion of the delayed recall condition of a story memory task (i.e., WMS-III Logical Memory, Story A) enhanced the overall accuracy of classification (sensitivity = 92.2; specificity = 94.7). Combining Logical Memory recognition and CVLT-II long delay best predicted progression from MCI to AD over a 4-year period (accurate classification = 87.5%). Learning across multiple trials may provide the most sensitive index for initial diagnosis of MCI, but inclusion of additional variables may enhance overall accuracy and may represent the optimal strategy for identifying individuals most likely to progress to dementia. PMID:19353345
Clinical evaluation of near-infrared light transillumination in approximal dentin caries detection.
Ozkan, Gokhan; Guzel, Kadriye Gorkem Ulu
2017-08-01
The objective of this clinical study was to compare conventional caries detection techniques, pen-type laser fluorescence device, and near-infrared light transillumination method in approximal dentin caries lesions. The study included 157 patients, aged 12-18, without any cavity in the posterior teeth. Two calibrated examiners carried out the assessments of selected approximal caries sites independently. After the assessments, the unopened sites were excluded and a total of 161 approximal sites were included in the study. When both the examiners arrived at a consensus regarding the presence of dentin caries, the detected lesions were opened with a conical diamond burr, the cavity extent was examined and validated (gold standard). Sensitivity, specificity, negative predictive value, positive predictive value, accuracy, and area under the ROC curve (Az) values among the caries detection methods were calculated. Bitewing radiography and near-infrared (NIR) light transillumination methods showed the highest sensitivity (0.83-0.82) and accuracy (0.82-0.80) among the methods. Visual inspection showed the lowest sensitivity (0.54). Laser fluorescence device and visual inspection showed nearly equal performance. Near-infrared light transillumination can be used as an alternative method to approximal dentin caries detection. Visual inspection and laser fluorescence device alone should not be used for approximal dentin caries.
EVALUATING RISK-PREDICTION MODELS USING DATA FROM ELECTRONIC HEALTH RECORDS.
Wang, L E; Shaw, Pamela A; Mathelier, Hansie M; Kimmel, Stephen E; French, Benjamin
2016-03-01
The availability of data from electronic health records facilitates the development and evaluation of risk-prediction models, but estimation of prediction accuracy could be limited by outcome misclassification, which can arise if events are not captured. We evaluate the robustness of prediction accuracy summaries, obtained from receiver operating characteristic curves and risk-reclassification methods, if events are not captured (i.e., "false negatives"). We derive estimators for sensitivity and specificity if misclassification is independent of marker values. In simulation studies, we quantify the potential for bias in prediction accuracy summaries if misclassification depends on marker values. We compare the accuracy of alternative prognostic models for 30-day all-cause hospital readmission among 4548 patients discharged from the University of Pennsylvania Health System with a primary diagnosis of heart failure. Simulation studies indicate that if misclassification depends on marker values, then the estimated accuracy improvement is also biased, but the direction of the bias depends on the direction of the association between markers and the probability of misclassification. In our application, 29% of the 1143 readmitted patients were readmitted to a hospital elsewhere in Pennsylvania, which reduced prediction accuracy. Outcome misclassification can result in erroneous conclusions regarding the accuracy of risk-prediction models.
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Igglessi-Markopoulou, Olga; Kollias, George
2010-05-01
A novel QSAR workflow is constructed that combines MLR with LS-SVM classification techniques for the identification of quinazolinone analogs as "active" or "non-active" CXCR3 antagonists. The accuracy of the LS-SVM classification technique for the training set and test set was 100% and 90%, respectively. For the "active" analogs a validated MLR QSAR model estimates accurately their I-IP10 IC50 inhibition values. The accuracy of the QSAR model (R² = 0.80) is illustrated using various evaluation techniques, such as the leave-one-out procedure (R²_LOO = 0.67) and validation through an external test set (R²_pred = 0.78). The key conclusion of this study is that the selected molecular descriptors, Highest Occupied Molecular Orbital energy (HOMO), Principal Moment of Inertia along X and Y axes PMIX and PMIZ, Polar Surface Area (PSA), Presence of triple bond (PTrplBnd), and Kier shape descriptor (¹κ), demonstrate discriminatory and pharmacophore abilities.
Phonon-particle coupling effects in odd-even mass differences of semi-magic nuclei
NASA Astrophysics Data System (ADS)
Saperstein, E. E.; Baldo, M.; Pankratov, S. S.; Tolokonnikov, S. V.
2017-11-01
A method to evaluate the particle-phonon coupling (PC) corrections to the single-particle energies in semi-magic nuclei, based on directly solving the Dyson equation with a PC-corrected mass operator, is used to find the odd-even mass differences between 18 even Pb isotopes and their odd-proton neighbors. The Fayans energy density functional (EDF) DF3-a is used, which gives rather high accuracy of the predictions for these mass differences already at the mean-field level, with the average deviation from the existing experimental data equal to 0.389 MeV. This is only slightly worse than the corresponding value of 0.333 MeV for the Skyrme EDF HFB-17, which belongs to a family of Skyrme EDFs with the highest overall accuracy in describing the nuclear masses. Accounting for the PC corrections induced by the low-lying phonons 2₁⁺ and 3₁⁻ significantly reduces the deviation of the theory from the data, to 0.218 MeV.
Identification of misspelled words without a comprehensive dictionary using prevalence analysis.
Turchin, Alexander; Chu, Julia T; Shubina, Maria; Einbinder, Jonathan S
2007-10-11
Misspellings are common in medical documents and can be an obstacle to information retrieval. We evaluated an algorithm to identify misspelled words through analysis of their prevalence in a representative body of text. We evaluated the algorithm's accuracy of identifying misspellings of 200 anti-hypertensive medication names on 2,000 potentially misspelled words randomly selected from narrative medical documents. Prevalence ratios (the frequency of the potentially misspelled word divided by the frequency of the non-misspelled word) in physician notes were computed by the software for each of the words. The software results were compared to the manual assessment by an independent reviewer. Area under the ROC curve for identification of misspelled words was 0.96. Sensitivity, specificity, and positive predictive value were 99.25%, 89.72% and 82.9% for the prevalence ratio threshold (0.32768) with the highest F-measure (0.903). Prevalence analysis can be used to identify and correct misspellings with high accuracy.
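The prevalence ratio and the reported F-measure can be illustrated with a minimal sketch. The corpus, the word pair, and the helper names below are hypothetical; the direction of the decision rule (a ratio below the threshold flags a likely misspelling) is an assumption, since the abstract states only the threshold value. The F-measure is computed as the harmonic mean of sensitivity and positive predictive value.

```python
from collections import Counter

# Threshold reported in the abstract as giving the highest F-measure.
THRESHOLD = 0.32768


def prevalence_ratio(candidate, reference, counts):
    """Frequency of the candidate word divided by frequency of the reference word."""
    reference_count = counts[reference]
    if reference_count == 0:
        raise ValueError("reference word does not occur in the corpus")
    return counts[candidate] / reference_count


def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity (recall) and positive predictive value (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Hypothetical toy corpus standing in for physician notes.
tokens = ["lisinopril"] * 40 + ["lisinipril"] * 3 + ["atenolol"] * 20
counts = Counter(tokens)

ratio = prevalence_ratio("lisinipril", "lisinopril", counts)
# Assumed rule: a rare variant of a common word is treated as a misspelling.
flagged = ratio < THRESHOLD

# Sensitivity and PPV as reported in the abstract.
reported_f = f_measure(0.9925, 0.829)
print(ratio, flagged, round(reported_f, 3))
```

Plugging in the reported sensitivity (99.25%) and positive predictive value (82.9%) gives an F-measure that rounds to 0.903, which agrees with the figure quoted in the abstract.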
Improved method for predicting protein fold patterns with ensemble classifiers.
Chen, W; Liu, X; Huang, Y; Jiang, Y; Zou, Q; Lin, C
2012-01-27
Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physical-chemical property of proteins and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five cross-validations. The accuracy rose to 77% when we used a 20-dimensional feature vector. These methods were used on recent data, with 54.2% accuracy. Source codes and dataset, together with web server and software tools for prediction, are available at: http://datamining.xmu.edu.cn/main/~cwc/ProteinPredict.html.
Improved Short-Term Clock Prediction Method for Real-Time Positioning.
Lv, Yifei; Dai, Zhiqiang; Zhao, Qile; Yang, Sheng; Zhou, Jinning; Liu, Jingnan
2017-06-06
The application of real-time precise point positioning (PPP) requires real-time precise orbit and clock products that should be predicted within a short time to compensate for the communication delay or data gap. Unlike orbit correction, clock correction is difficult to model and predict. The widely used linear model hardly fits long periodic trends with a small data set and exhibits significant accuracy degradation in real-time prediction when a large data set is used. This study proposes a new prediction model for maintaining short-term satellite clocks to meet the high-precision requirements of real-time clocks and provide clock extrapolation without interrupting the real-time data stream. Fast Fourier transform (FFT) is used to analyze the linear prediction residuals of real-time clocks. The periodic terms obtained through FFT are adopted in the sliding window prediction to achieve a significant improvement in short-term prediction accuracy. This study also analyzes and compares the accuracy of short-term forecasts (less than 3 h) by using different length observations. Experimental results obtained from International GNSS Service (IGS) final products and our own real-time clocks show that the 3-h prediction accuracy is better than 0.85 ns. The new model can replace IGS ultra-rapid products in the application of real-time PPP. It is also found that there is a positive correlation between the prediction accuracy and the short-term stability of on-board clocks. Compared with the accuracy of the traditional linear model, the accuracy of the static PPP using the new model of the 2-h prediction clock in N, E, and U directions is improved by about 50%. Furthermore, the static PPP accuracy of 2-h clock products is better than 0.1 m. When an interruption occurs in the real-time model, the accuracy of the kinematic PPP solution using 1-h clock prediction product is better than 0.2 m, without significant accuracy degradation. 
This model is of practical significance because it solves the problems of interruption and delay in data broadcast in real-time clock estimation and can meet the requirements of real-time PPP.
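The idea of adding FFT-derived periodic terms to a linear clock model can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the sampling step is one unit, only the single strongest periodic term is kept, a plain discrete Fourier transform stands in for an FFT library, and no sliding window is used. All function names are hypothetical.

```python
import cmath
import math


def linear_fit(t, y):
    """Ordinary least-squares line through (t, y); returns (slope, intercept)."""
    n = len(t)
    mean_t, mean_y = sum(t) / n, sum(y) / n
    sxx = sum((a - mean_t) ** 2 for a in t)
    sxy = sum((a - mean_t) * (b - mean_y) for a, b in zip(t, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def dominant_periodic_term(residuals):
    """Strongest non-constant DFT component as (cycles_per_window, amplitude, phase)."""
    n = len(residuals)
    best_k, best_coef = 1, 0j
    for k in range(1, n // 2):
        coef = sum(r * cmath.exp(-2j * math.pi * k * i / n)
                   for i, r in enumerate(residuals))
        if abs(coef) > abs(best_coef):
            best_k, best_coef = k, coef
    return best_k, 2 * abs(best_coef) / n, cmath.phase(best_coef)


# Synthetic series (hypothetical): linear drift plus one periodic term.
n = 64
t = list(range(n))
y = [0.5 * ti + 2.0 + 0.3 * math.cos(2 * math.pi * 4 * ti / n) for ti in t]

slope, intercept = linear_fit(t, y)
residuals = [yi - (slope * ti + intercept) for ti, yi in zip(t, y)]
k, amp, phase = dominant_periodic_term(residuals)


def predict(t_new, with_periodic=True):
    """Extrapolate the linear model, optionally adding the periodic term."""
    value = slope * t_new + intercept
    if with_periodic:
        value += amp * math.cos(2 * math.pi * k * t_new / n + phase)
    return value


t_future = 70  # a point beyond the fitted window
truth = 0.5 * t_future + 2.0 + 0.3 * math.cos(2 * math.pi * 4 * t_future / n)
err_linear = abs(predict(t_future, with_periodic=False) - truth)
err_periodic = abs(predict(t_future) - truth)
print(k, err_linear, err_periodic)
```

On this synthetic series the linear-plus-periodic extrapolation lands much closer to the true value than the linear model alone, which is the qualitative effect the abstract describes.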
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Songhua; Tourassi, Georgia
2012-01-01
The majority of clinical content-based image retrieval (CBIR) studies disregard human perception subjectivity, aiming to duplicate the consensus expert assessment of the visual similarity on example cases. The purpose of our study is twofold: (i) discern better the extent of human perception subjectivity when assessing the visual similarity of two images with similar semantic content, and (ii) explore the feasibility of personalized predictive modeling of visual similarity. We conducted a human observer study in which five observers of various expertise were shown ninety-nine triplets of mammographic masses with similar BI-RADS descriptors and were asked to select the two masses with the highest visual relevance. Pairwise agreement ranged between poor and fair among the five observers, as assessed by the kappa statistic. The observers' self-consistency rate was remarkably low, based on repeated questions where either the orientation or the presentation order of a mass was changed. Various machine learning algorithms were explored to determine whether they can predict each observer's personalized selection using textural features. Many algorithms performed with accuracy that exceeded each observer's self-consistency rate, as determined using a cross-validation scheme. This accuracy was statistically significantly higher than would be expected by chance alone (two-tailed p-value ranged between 0.001 and 0.01 for all five personalized models). The study confirmed that human perception subjectivity should be taken into account when developing CBIR-based medical applications.
Number of Biopsies in Diagnosing Pulmonary Nodules
Wehrschuetz, M.; Wehrschuetz, E.; Portugaller, H. R.
2010-01-01
Purpose: To determine the number of specimens to be obtained from pulmonary lesions to get the highest possible accuracy in histological work-up. Materials and methods: A retrospective evaluation (January 1999 to April 2004) covered 260 patients with thoracic lesions who underwent computed tomography (CT)-guided core-cut biopsy in coaxial technique. All biopsies were performed using a 19 gauge introducer needle and a 20 gauge core-cut biopsy needle. In all, 669 usable biopsies were taken (1–5 biopsies in each setting). The specimens were marked sequentially and each biopsy was worked up histologically. The biopsy results were correlated to histology after surgery, clinical follow-up or autopsy. The number of biopsies necessary to achieve the highest possible accuracy in diagnosing pulmonary lesions was determined. Results: In 591 of 669 biopsies (88.3%), there were correct positive results. The overall accuracy was 87.4%. In 193 of 260 (74.2%) patients, a suspected malignancy was confirmed. In 50 of 260 (19.2%) patients, a benign lesion was correctly diagnosed. Seventeen (6.5%) patients were lost to follow-up. The first, second and third biopsies had cumulative accuracies of 63.6%, 89.2% and 91.5%, respectively (P < 0.02). Additional biopsies did not further improve accuracy. Conclusion: For the highest possible accuracy in diagnosing pulmonary lesions by CT-guided core-cut biopsy, at least three usable specimens are recommended to be taken. PMID:21157523
Wang, Jun; Kliks, Michael M; Jun, Soojin; Jackson, Mel; Li, Qing X
2010-03-01
Quantitative analysis of glucose, fructose, sucrose, and maltose in honey samples of different geographic origins from around the world was studied using Fourier transform infrared (FTIR) spectroscopy and chemometrics such as partial least squares (PLS) and principal component regression. The calibration series consisted of 45 standard mixtures, which were made up of glucose, fructose, sucrose, and maltose. There were distinct peak variations of all sugar mixtures in the spectral "fingerprint" region between 1500 and 800 cm⁻¹. The calibration model was successfully validated using 7 synthetic blend sets of sugars. The PLS 2nd-derivative model showed the highest degree of prediction accuracy, with a highest R² value of 0.999. Along with the canonical variate analysis, the calibration model, further validated by high-performance liquid chromatography measurements for commercial honey samples, demonstrates that FTIR can qualitatively and quantitatively determine the presence of glucose, fructose, sucrose, and maltose in multiple regional honey samples.
Outcome Prediction in Mathematical Models of Immune Response to Infection.
Mai, Manuel; Wang, Kun; Huber, Greg; Kirby, Michael; Shattuck, Mark D; O'Hern, Corey S
2015-01-01
Clinicians need to predict patient outcomes with high accuracy as early as possible after disease inception. In this manuscript, we show that patient-to-patient variability sets a fundamental limit on outcome prediction accuracy for a general class of mathematical models for the immune response to infection. However, accuracy can be increased at the expense of delayed prognosis. We investigate several systems of ordinary differential equations (ODEs) that model the host immune response to a pathogen load. Advantages of systems of ODEs for investigating the immune response to infection include the ability to collect data on large numbers of 'virtual patients', each with a given set of model parameters, and obtain many time points during the course of the infection. We implement patient-to-patient variability v in the ODE models by randomly selecting the model parameters from distributions with coefficients of variation v that are centered on physiological values. We use logistic regression with one-versus-all classification to predict the discrete steady-state outcomes of the system. We find that the prediction algorithm achieves near 100% accuracy for v = 0, and the accuracy decreases with increasing v for all ODE models studied. The fact that multiple steady-state outcomes can be obtained for a given initial condition, i.e. the basins of attraction overlap in the space of initial conditions, limits the prediction accuracy for v > 0. Increasing the elapsed time of the variables used to train and test the classifier increases the prediction accuracy, while adding explicit external noise to the ODE models decreases the prediction accuracy. Our results quantify the competition between early prognosis and high prediction accuracy that is frequently encountered by clinicians.
Adjusted Clinical Groups: Predictive Accuracy for Medicaid Enrollees in Three States
Adams, E. Kathleen; Bronstein, Janet M.; Raskind-Hood, Cheryl
2002-01-01
Actuarial split-sample methods were used to assess predictive accuracy of adjusted clinical groups (ACGs) for Medicaid enrollees in Georgia, Mississippi (lagging in managed care penetration), and California. Accuracy for two non-random groups (high-cost and located in urban poor areas) was assessed. Measures for random groups were derived with and without short-term enrollees to assess the effect of turnover on predictive accuracy. ACGs improved predictive accuracy for high-cost conditions in all States, but did so only for those in Georgia's poorest urban areas. Higher and more unpredictable expenses of short-term enrollees moderated the predictive power of ACGs. This limitation was significant in Mississippi due, in part, to that State's very high proportion of short-term enrollees. PMID:12545598
Gladstone, Emilie; Smolina, Kate; Morgan, Steven G.; Fernandes, Kimberly A.; Martins, Diana; Gomes, Tara
2016-01-01
Background: Comprehensive systems for surveilling prescription opioid–related harms provide clear evidence that deaths from prescription opioids have increased dramatically in the United States. However, these harms are not systematically monitored in Canada. In light of a growing public health crisis, accessible, nationwide data sources to examine prescription opioid–related harms in Canada are needed. We sought to examine the performance of 5 algorithms to identify prescription opioid–related deaths from vital statistics data against data abstracted from the Office of the Chief Coroner of Ontario as a gold standard. Methods: We identified all prescription opioid–related deaths from Ontario coroners’ data that occurred between Jan. 31, 2003, and Dec. 31, 2010. We then used 5 different algorithms to identify prescription opioid–related deaths from vital statistics death data in 2010. We selected the algorithm with the highest sensitivity and a positive predictive value of more than 80% as the optimal algorithm for identifying prescription opioid–related deaths. Results: Four of the 5 algorithms had positive predictive values of more than 80%. The algorithm with the highest sensitivity (75%) in 2010 improved slightly in its predictive performance from 2003 to 2010. Interpretation: In the absence of specific systems for monitoring prescription opioid–related deaths in Canada, readily available national vital statistics data can be used to study prescription opioid–related mortality with considerable accuracy. Despite some limitations, these data may facilitate the implementation of national surveillance and monitoring strategies. PMID:26622006
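The selection rule described in the Methods (highest sensitivity among algorithms whose positive predictive value exceeds 80%) can be sketched as below. The algorithm labels and all performance figures are hypothetical placeholders, not results from the study.

```python
def pick_optimal(results, min_ppv=0.80):
    """Return the name with the highest sensitivity among entries whose PPV exceeds min_ppv."""
    eligible = {name: m for name, m in results.items() if m["ppv"] > min_ppv}
    if not eligible:
        return None  # no algorithm meets the PPV requirement
    return max(eligible, key=lambda name: eligible[name]["sensitivity"])


# Hypothetical validation results against a gold standard.
results = {
    "algorithm_1": {"sensitivity": 0.60, "ppv": 0.90},
    "algorithm_2": {"sensitivity": 0.75, "ppv": 0.85},
    "algorithm_3": {"sensitivity": 0.82, "ppv": 0.70},  # most sensitive, but PPV too low
}

optimal = pick_optimal(results)
print(optimal)
```

Treating the PPV requirement as a hard filter before ranking by sensitivity means a highly sensitive but imprecise algorithm is never chosen.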
Gladstone, Emilie; Smolina, Kate; Morgan, Steven G; Fernandes, Kimberly A; Martins, Diana; Gomes, Tara
2016-03-01
Comprehensive systems for surveilling prescription opioid-related harms provide clear evidence that deaths from prescription opioids have increased dramatically in the United States. However, these harms are not systematically monitored in Canada. In light of a growing public health crisis, accessible, nationwide data sources to examine prescription opioid-related harms in Canada are needed. We sought to examine the performance of 5 algorithms to identify prescription opioid-related deaths from vital statistics data against data abstracted from the Office of the Chief Coroner of Ontario as a gold standard. We identified all prescription opioid-related deaths from Ontario coroners' data that occurred between Jan. 31, 2003, and Dec. 31, 2010. We then used 5 different algorithms to identify prescription opioid-related deaths from vital statistics death data in 2010. We selected the algorithm with the highest sensitivity and a positive predictive value of more than 80% as the optimal algorithm for identifying prescription opioid-related deaths. Four of the 5 algorithms had positive predictive values of more than 80%. The algorithm with the highest sensitivity (75%) in 2010 improved slightly in its predictive performance from 2003 to 2010. In the absence of specific systems for monitoring prescription opioid-related deaths in Canada, readily available national vital statistics data can be used to study prescription opioid-related mortality with considerable accuracy. Despite some limitations, these data may facilitate the implementation of national surveillance and monitoring strategies. © 2016 Canadian Medical Association or its licensors.
Comparison of methods for the implementation of genome-assisted evaluation of Spanish dairy cattle.
Jiménez-Montero, J A; González-Recio, O; Alenda, R
2013-01-01
The aim of this study was to evaluate methods for genomic evaluation of the Spanish Holstein population as an initial step toward the implementation of routine genomic evaluations. This study provides a description of the population structure of progeny tested bulls in Spain at the genomic level and compares different genomic evaluation methods with regard to accuracy and bias. Two Bayesian linear regression models, Bayes-A and Bayesian-LASSO (B-LASSO), as well as a machine learning algorithm, Random-Boosting (R-Boost), and BLUP using a realized genomic relationship matrix (G-BLUP), were compared. Five traits that are currently under selection in the Spanish Holstein population were used: milk yield, fat yield, protein yield, fat percentage, and udder depth. In total, genotypes from 1859 progeny tested bulls were used. The training sets were composed of bulls born before 2005, including 1601 bulls for production and 1574 bulls for type, whereas the testing sets contained 258 and 235 bulls born in 2005 or later for production and type, respectively. Deregressed proofs (DRP) from the January 2009 Interbull (Uppsala, Sweden) evaluation were used as the dependent variables for bulls in the training sets, whereas DRP from the December 2011 Interbull evaluation were used to compare genomic predictions with progeny test results for bulls in the testing set. Genomic predictions were more accurate than traditional pedigree indices for predicting future progeny test results of young bulls. The gain in accuracy due to inclusion of genomic data varied by trait and ranged from 0.04 to 0.42 Pearson correlation units. Results averaged across traits showed that B-LASSO had the highest accuracy with an advantage of 0.01, 0.03 and 0.03 points in Pearson correlation compared with R-Boost, Bayes-A, and G-BLUP, respectively. 
The B-LASSO predictions also showed the least bias (0.02, 0.03 and 0.10 SD units less than Bayes-A, R-Boost and G-BLUP, respectively) as measured by mean difference between genomic predictions and progeny test results. The R-Boosting algorithm provided genomic predictions with regression coefficients closer to unity, which is an alternative measure of bias, for 4 out of 5 traits and also resulted in mean squared errors estimates that were 2%, 10%, and 12% smaller than B-LASSO, Bayes-A, and G-BLUP, respectively. The observed prediction accuracy obtained with these methods was within the range of values expected for a population of similar size, suggesting that the prediction method and reference population described herein are appropriate for implementation of routine genome-assisted evaluations in Spanish dairy cattle. R-Boost is a competitive marker regression methodology in terms of predictive ability that can accommodate large data sets. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
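The three validation summaries named in the abstract (Pearson correlation as accuracy, mean difference as bias, and a regression coefficient near unity as an alternative bias measure) can be computed as in the sketch below. The data are hypothetical, and regressing observed values on predictions is an assumed convention; the abstract does not state the direction of the regression.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def slope_observed_on_predicted(predicted, observed):
    """Least-squares slope of observed values regressed on predictions."""
    n = len(predicted)
    mp, mo = sum(predicted) / n, sum(observed) / n
    spo = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    spp = sum((p - mp) ** 2 for p in predicted)
    return spo / spp


# Hypothetical genomic predictions and later progeny-test results.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.2, 1.9, 3.4, 3.8, 5.2]

accuracy = pearson(predicted, observed)
mean_bias = sum(p - o for p, o in zip(predicted, observed)) / len(predicted)
slope = slope_observed_on_predicted(predicted, observed)
print(accuracy, mean_bias, slope)
```

A slope close to 1 suggests predictions are neither inflated nor deflated, while the mean difference captures a constant offset; the two can disagree, which is presumably why the abstract reports both.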
Research on Improved Depth Belief Network-Based Prediction of Cardiovascular Diseases
Zhang, Hongpo
2018-01-01
Quantitative analysis and prediction can help to reduce the risk of cardiovascular disease. Quantitative prediction based on traditional models has low accuracy, and predictions based on shallow neural networks have larger variance. In this paper, a cardiovascular disease prediction model based on an improved deep belief network (DBN) is proposed. The network depth is determined independently using the reconstruction error, and unsupervised training and supervised optimization are combined. This ensures the accuracy of model prediction while guaranteeing stability. Thirty experiments were performed independently on the Statlog (Heart) and Heart Disease Database data sets in the UCI database. Experimental results showed that the mean prediction accuracy was 91.26% and 89.78%, respectively. The variance of prediction accuracy was 5.78 and 4.46, respectively. PMID:29854369
Aithal, Venkatesh; Kei, Joseph; Driscoll, Carlie; Murakoshi, Michio; Wada, Hiroshi
2018-02-01
Diagnosing conductive conditions in newborns is challenging for both audiologists and otolaryngologists. Although high-frequency tympanometry (HFT), acoustic stapedial reflex tests, and wideband absorbance measures are useful diagnostic tools, there is performance measure variability in their detection of middle ear conditions. Additional diagnostic sensitivity and specificity measures gained through new technology such as sweep frequency impedance (SFI) measures may assist in the diagnosis of middle ear dysfunction in newborns. The purpose of this study was to determine the test performance of SFI to predict the status of the outer and middle ear in newborns against commonly used reference standards. Automated auditory brainstem response (AABR), HFT (1000 Hz), transient evoked otoacoustic emission (TEOAE), distortion product otoacoustic emission (DPOAE), and SFI tests were administered to the study sample. A total of 188 neonates (98 males and 90 females) with a mean gestational age of 39.4 weeks were included in the sample. Mean age at the time of testing was 44.4 hr. Diagnostic accuracy of SFI was assessed in terms of its ability to identify conductive conditions in neonates when compared with nine different reference standards (including four single tests [AABR, HFT, TEOAE, and DPOAE] and five test batteries [HFT + DPOAE, HFT + TEOAE, DPOAE + TEOAE, DPOAE + AABR, and TEOAE + AABR]), using receiver operating characteristic (ROC) analysis and traditional test performance measures such as sensitivity and specificity. The test performance of SFI against the test battery reference standard of HFT + DPOAE and single reference standard of HFT was high with an area under the ROC curve (AROC) of 0.87 and 0.82, respectively. Although the HFT + DPOAE test battery reference standard performed better than the HFT reference standard in predicting middle ear conductive conditions in neonates, the difference in AROC was not significant. 
Further analysis revealed that the highest sensitivity and specificity for SFI (86% and 88%, respectively) were obtained when compared with the reference standard of HFT + DPOAE. Among the four single reference standards, SFI had the highest sensitivity and specificity (76% and 88%, respectively) when compared against the HFT reference standard. The high test performance of SFI against the HFT and HFT + DPOAE reference standards indicates that the SFI measure has appropriate diagnostic accuracy in detection of conductive conditions in newborns. Hence, the SFI test could be used as an adjunct tool to identify conductive conditions in universal newborn hearing screening programs, and can also be used in diagnostic follow-up assessments. American Academy of Audiology
Clinical utility of balloon expulsion test for functional defecation disorders
2016-01-01
Purpose: I investigated the diagnostic accuracy of balloon expulsion test (BET) with various techniques to find out the most appropriate method, and tried to confirm its clinical utility in diagnosing functional defecation disorders (FDD) in constipated patients. Methods: Eighty-seven patients constituted the study population. FDD was defined when patients had at least two positive findings in defecography, manometry, and electromyography. BET was done 4 times in each patient with 2 different positions and 2 different volumes. The positions were seated position (SP) and left lateral decubitus position (LDP). The volumes were fixed volume (FV) of 60 mL and individualized volume with which patient felt a constant desire to defecate (CDV). The results of BETs with 4 different settings (LDP-FV, LDP-CDV, SP-FV, and SP-CDV) were statistically compared and analyzed. Results: Of 87 patients, 23 patients (26.4%) had at least two positive findings in 3 tests and thus were diagnosed to have FDD. On receiver operating characteristic curve analysis, area under curve was highest in BET with SP-FV. With a cutoff value of 30 seconds, the specificity of BET with SP-FV was 86.0%, sensitivity was 73.9%, negative predictive value was 89.8%, positive predictive value was 65.4%, and accuracy rate was 82.8% for diagnosing FDD. Conclusion: SP-FV is the most appropriate method for BET. In this setting, BET has a diagnostic accuracy sufficient to identify constipated patients who do not have FDD. Patients with negative results in BET with SP-FV may not need other onerous tests to exclude FDD. PMID:26878016
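The five summary measures quoted in the Results all derive from a single 2x2 table of test result against reference diagnosis. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from a 2x2 table of counts."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients with a positive test
        "specificity": tn / (tn + fp),  # non-diseased patients with a negative test
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly non-diseased
        "accuracy": (tp + tn) / total,  # all correct classifications
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=30, fn=10, fp=5, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A high negative predictive value is the property that supports using a negative test result to rule a condition out, which is the use the Conclusion proposes for BET.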
Clinical utility of balloon expulsion test for functional defecation disorders.
Seong, Moo-Kyung
2016-02-01
I investigated the diagnostic accuracy of balloon expulsion test (BET) with various techniques to find out the most appropriate method, and tried to confirm its clinical utility in diagnosing functional defecation disorders (FDD) in constipated patients. Eighty-seven patients constituted the study population. FDD was defined when patients had at least two positive findings in defecography, manometry, and electromyography. BET was done 4 times in each patient with 2 different positions and 2 different volumes. The positions were seated position (SP) and left lateral decubitus position (LDP). The volumes were fixed volume (FV) of 60 mL and individualized volume with which patient felt a constant desire to defecate (CDV). The results of BETs with 4 different settings (LDP-FV, LDP-CDV, SP-FV, and SP-CDV) were statistically compared and analyzed. Of 87 patients, 23 patients (26.4%) had at least two positive findings in 3 tests and thus were diagnosed to have FDD. On receiver operating characteristic curve analysis, area under curve was highest in BET with SP-FV. With a cutoff value of 30 seconds, the specificity of BET with SP-FV was 86.0%, sensitivity was 73.9%, negative predictive value was 89.8%, positive predictive value was 65.4%, and accuracy rate was 82.8% for diagnosing FDD. SP-FV is the most appropriate method for BET. In this setting, BET has a diagnostic accuracy sufficient to identify constipated patients who do not have FDD. Patients with negative results in BET with SP-FV may not need other onerous tests to exclude FDD.
RDT accuracy based on age group in hypoendemic malaria
NASA Astrophysics Data System (ADS)
Siahaan, L.; Panggabean, M.; Panggabean, Y. C.
2018-03-01
Malaria is still one of the community health problems in Sumatera. This study was carried out to compare RDT accuracy across age groups in hypoendemic malaria. Microscopy was performed with 3% Giemsa staining and examined by a trained laboratory technician. RDT was carried out using Monotes Test Drive. The diagnostic accuracy of RDT was generally significant in all age groups, except in the group aged > 65 years (p=0.393). The highest sensitivity of RDT was in the group aged ≤ 5 years, and sensitivity decreased in the older age groups. In contrast, the lowest specificity was found in the group aged ≤ 5 years and the highest in the group aged 6-15 years. The highest PPV and NPV were found in the groups aged 16-65 years and ≤ 5 years, respectively. The highest parasite density was found in the group aged ≤ 5 years (644.4±494.5 parasites/μl) and the lowest in the group aged > 65 years (400±490.71 parasites/μl). The diagnostic accuracy of RDT decreases with increasing age.
An ensemble model of QSAR tools for regulatory risk assessment.
Pradeep, Prachi; Povinelli, Richard J; White, Shannon; Merrill, Stephen J
2016-01-01
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). 
Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa ( κ ): 0.63 and 0.62] for both the datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. This feature provides an additional control to the regulators in grading a chemical based on the severity of the toxic endpoint under study.
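Balanced accuracy and Cohen's kappa, the two agreement measures reported alongside plain accuracy, can be computed from a binary confusion matrix as in this minimal sketch. The counts are hypothetical and do not come from either dataset in the study.

```python
def balanced_accuracy(tp, fn, fp, tn):
    """Mean of sensitivity and specificity."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def cohen_kappa(tp, fn, fp, tn):
    """Chance-corrected agreement between predicted and true binary labels."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Expected agreement if predictions and truth were independent.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix for a carcinogenicity classifier.
tp, fn, fp, tn = 20, 5, 15, 60

bal_acc = balanced_accuracy(tp, fn, fp, tn)
kappa = cohen_kappa(tp, fn, fp, tn)
print(bal_acc, kappa)
```

Both measures discount performance that comes only from class imbalance, which is one reason they are often reported next to raw accuracy when active and inactive chemicals are unevenly represented.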
An ensemble model of QSAR tools for regulatory risk assessment
Pradeep, Prachi; Povinelli, Richard J.; White, Shannon; ...
2016-09-22
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). 
Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa (κ): 0.63 and 0.62] for both the datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. In conclusion, this feature provides an additional control to the regulators in grading a chemical based on the severity of the toxic endpoint under study.« less
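The cut-off idea described in this abstract can be illustrated with a small sketch. It is not the authors' model: it assumes a naive-Bayes combination of binary calls from four tools, and the per-tool sensitivity and specificity values, the prior, and the function names are all hypothetical.

```python
from math import prod

def posterior_positive(calls, sens, spec, prior=0.5):
    """Naive-Bayes posterior P(positive | tool calls), assuming independent tools.

    calls: list of 0/1 predictions, one per tool
    sens, spec: assumed per-tool sensitivity and specificity (hypothetical)
    """
    # Likelihood of the observed calls if the chemical is truly positive
    like_pos = prod(s if c else 1 - s for c, s in zip(calls, sens))
    # Likelihood of the observed calls if the chemical is truly negative
    like_neg = prod(1 - p if c else p for c, p in zip(calls, spec))
    num = prior * like_pos
    return num / (num + (1 - prior) * like_neg)

def classify(calls, sens, spec, cutoff=0.5, prior=0.5):
    """Flag the chemical as positive when the posterior reaches the cut-off."""
    return posterior_positive(calls, sens, spec, prior) >= cutoff

# Hypothetical operating characteristics for four tools
SENS = [0.80, 0.70, 0.60, 0.75]
SPEC = [0.70, 0.80, 0.85, 0.60]
calls = [1, 0, 0, 1]  # two tools say "carcinogen", two say "non-carcinogen"

print(posterior_positive(calls, SENS, SPEC))
print(classify(calls, SENS, SPEC, cutoff=0.5))  # stricter cut-off
print(classify(calls, SENS, SPEC, cutoff=0.3))  # lower cut-off favours sensitivity
```

Lowering the cut-off flags more chemicals as positive, which trades specificity for fewer false negatives; that is the kind of control the abstract attributes to the cut-off parameter.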
Proposed hybrid-classifier ensemble algorithm to map snow cover area
NASA Astrophysics Data System (ADS)
Nijhawan, Rahul; Raman, Balasubramanian; Das, Josodhir
2018-01-01
A metaclassification ensemble approach is known to improve the prediction performance of snow-covered area. The methodology adopted in this case is based on a neural network along with four state-of-the-art machine learning algorithms: support vector machine, artificial neural networks, spectral angle mapper, K-means clustering, and a snow index: normalized difference snow index. An AdaBoost ensemble algorithm related to decision tree for snow-cover mapping is also proposed. According to available literature, these methods have been rarely used for snow-cover mapping. Employing the above techniques, a study was conducted for Raktavarn and Chaturangi Bamak glaciers, Uttarakhand, Himalaya using a multispectral Landsat 7 ETM+ (enhanced thematic mapper) image. The study also compares the results with those obtained from statistical combination methods (majority rule and belief functions) and accuracies of individual classifiers. Accuracy assessment is performed by computing the quantity and allocation disagreement, analyzing statistical measures (accuracy, precision, specificity, AUC, and sensitivity) and receiver operating characteristic curves. A total of 225 combinations of parameters for individual classifiers were trained and tested on the dataset and results were compared with the proposed approach. It was observed that the proposed methodology produced the highest classification accuracy (95.21%), close to that (94.01%) produced by the proposed AdaBoost ensemble algorithm. From the sets of observations, it was concluded that the ensemble of classifiers produced better results compared to individual classifiers.
3D hand motion trajectory prediction from EEG mu and beta bandpower.
Korik, A; Sosnik, R; Siddique, N; Coyle, D
2016-01-01
A motion trajectory prediction (MTP)-based brain-computer interface (BCI) aims to reconstruct the three-dimensional (3D) trajectory of upper limb movement using electroencephalography (EEG). The most common MTP BCI employs a time series of bandpass-filtered EEG potentials (referred to here as the potential time-series, PTS, model) for reconstructing the trajectory of a 3D limb movement using multiple linear regression. These studies report the best accuracy when a 0.5-2Hz bandpass filter is applied to the EEG. In the present study, we show that spatiotemporal power distribution of theta (4-8Hz), mu (8-12Hz), and beta (12-28Hz) bands are more robust for movement trajectory decoding when the standard PTS approach is replaced with time-varying bandpower values of a specified EEG band, i.e., with a bandpower time-series (BTS) model. A comprehensive analysis comprising three subjects performing pointing movements with the dominant right arm toward six targets is presented. Our results show that the BTS model produces significantly higher MTP accuracy (R~0.45) compared to the standard PTS model (R~0.2). In the case of the BTS model, the highest accuracy was achieved across the three subjects typically in the mu (8-12Hz) and low-beta (12-18Hz) bands. Additionally, we highlight a limitation of the commonly used PTS model and illustrate how this model may be suboptimal for decoding motion trajectory relevant information. Although our results, showing that the mu and beta bands are prominent for MTP, are not in line with other MTP studies, they are consistent with the extensive literature on classical multiclass sensorimotor rhythm-based BCI studies (classification of limbs as opposed to motion trajectory prediction), which report the best accuracy of imagined limb movement classification using power values of mu and beta frequency bands. The methods proposed here provide a positive step toward noninvasive decoding of imagined 3D hand movements for movement-free BCIs.
© 2016 Elsevier B.V. All rights reserved.
The accuracy of new wheelchair users' predictions about their future wheelchair use.
Hoenig, Helen; Griffiths, Patricia; Ganesh, Shanti; Caves, Kevin; Harris, Frances
2012-06-01
This study examined the accuracy of new wheelchair user predictions about their future wheelchair use. This was a prospective cohort study of 84 community-dwelling veterans provided a new manual wheelchair. The association between predicted and actual wheelchair use was strong at 3 mos (ϕ coefficient = 0.56), with 90% of those who anticipated using the wheelchair at 3 mos still using it (i.e., positive predictive value = 0.96) and 60% of those who anticipated not using it indeed no longer using the wheelchair (i.e., negative predictive value = 0.60, overall accuracy = 0.92). Predictive accuracy diminished over time, with overall accuracy declining from 0.92 at 3 mos to 0.66 at 6 mos. At all time points, and for all types of use, patients better predicted use as opposed to disuse, with correspondingly higher positive than negative predictive values. Accuracy of prediction of use in specific indoor and outdoor locations varied according to location. This study demonstrates the importance of better understanding the potential mismatch between the anticipated and actual patterns of wheelchair use. The findings suggest that users can be relied upon to accurately predict their basic wheelchair-related needs in the short-term. Further exploration is needed to identify characteristics that will aid users and their providers in more accurately predicting mobility needs for the long-term.
How much articular displacement can be detected using fluoroscopy for tibial plateau fractures?
Haller, Justin M; O'Toole, Robert; Graves, Matthew; Barei, David; Gardner, Michael; Kubiak, Erik; Nascone, Jason; Nork, Sean; Presson, Angela P; Higgins, Thomas F
2015-11-01
While there is conflicting evidence regarding the importance of anatomic reduction for tibial plateau fractures, there are currently no studies that analyse our ability to grade reduction based on fluoroscopic imaging. The purpose of this study was to determine the accuracy of fluoroscopy in judging tibial plateau articular reduction. Ten embalmed human cadavers were selected. The lateral plateau was sagittally sectioned, and the joint was reduced under direct visualization. Lateral, anterior-posterior (AP), and joint line fluoroscopic views were obtained. The same fluoroscopic views were obtained with 2mm displacement and 5mm displacement. The images were randomised, and eight orthopaedic traumatologists were asked whether the plateau was reduced. Within each pair of conditions (view and displacement from 0mm to 5mm) sensitivity, specificity, and intraclass correlations (ICC) were evaluated. The AP-lateral view with 5mm displacement yielded the highest accuracy for detecting reduction at 90% (95% CI: 83-94%). For the other conditions, accuracy ranged from 37% to 83%. Sensitivity was highest for the reduced lateral view (79%, 95% CI: 57-91%). Specificity was highest in the AP-lateral view for 5mm step-off (98%, 95% CI: 93-99%). ICC was perfect for the AP-lateral view with 5mm displacement, but otherwise agreement ranged from poor to moderate at ICC=0.09-0.46. Finally, there was no additional benefit to including the joint-line view with the AP and lateral views. Using both AP and lateral views for 5mm displacement had the highest accuracy, specificity, and ICC. Outside of this scenario, agreement was poor to moderate and accuracy was low. Applying this clinically, direct visualization of the articular surface may be necessary to ensure malreduction less than 5mm. Copyright © 2015 Elsevier Ltd. All rights reserved.
Contrast-enhanced ultrasound in diagnosis of gallbladder adenoma.
Yuan, Hai-Xia; Cao, Jia-Ying; Kong, Wen-Tao; Xia, Han-Sheng; Wang, Xi; Wang, Wen-Ping
2015-04-01
Gallbladder adenoma is a pre-cancerous neoplasm and needs surgical resection. It is difficult to differentiate adenoma from other gallbladder polyps using imaging examinations. The study aimed to illustrate characteristics of contrast-enhanced ultrasound (CEUS) and its diagnostic value in gallbladder adenoma. Thirty-seven patients with 39 gallbladder adenomatoid lesions (maximal diameter ≥10 mm and without metastasis) were enrolled in this study. Lesion appearances in conventional ultrasound and CEUS were documented. The imaging features were compared individually among gallbladder cholesterol polyp, gallbladder adenoma and malignant lesion. Adenoma lesions showed iso-echogenicity in ultrasound, and an eccentric enhancement pattern, "fast-in and synchronous-out" contrast enhancement pattern and homogeneous at peak-time enhancement in CEUS. The homogenicity at peak-time enhancement showed the highest diagnostic ability in differentiating gallbladder adenoma from cholesterol polyps. The sensitivity, specificity, positive predictive value, negative predictive value, accuracy and Youden index were 100%, 90.9%, 92.9%, 100%, 95.8% and 0.91, respectively. The characteristic of continuous gallbladder wall shown by CEUS had the highest diagnostic ability in differentiating adenoma from malignant lesion (100%, 86.7%, 86.7%, 100%, 92.9% and 0.87, respectively). The characteristic of the eccentric enhancement pattern had the highest diagnostic ability in differentiating adenoma from cholesterol polyp and malignant lesion, with corresponding indices of 69.2%, 88.5%, 75.0%, 85.2%, 82.1% and 0.58, respectively. CEUS is valuable in differentiating gallbladder adenoma from other gallbladder polyps (≥10 mm in diameter). Homogeneous echogenicity on peak-time enhancement, a continuous gallbladder wall, and the eccentric enhancement pattern are important indicators of gallbladder adenoma on CEUS.
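The six indices reported in this abstract all follow from a 2x2 table, as the sketch below shows. The abstract does not state the underlying counts; the counts used here (13 true positives, 1 false positive, 10 true negatives, 0 false negatives) are an assumption chosen only because they are consistent with the percentages reported for adenoma versus cholesterol polyp.

```python
def diagnostic_indices(tp, fp, tn, fn):
    """Standard diagnostic accuracy indices from 2x2 table counts."""
    sens = tp / (tp + fn)  # true positive rate
    spec = tn / (tn + fp)  # true negative rate
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "youden": sens + spec - 1,  # Youden index J
    }

# Assumed counts (not given in the abstract), consistent with its percentages
idx = diagnostic_indices(tp=13, fp=1, tn=10, fn=0)
for name, value in idx.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the function returns 100% sensitivity, 90.9% specificity, 92.9% PPV, 100% NPV, 95.8% accuracy and a Youden index of 0.91.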
Vallejo, Roger L.; Leeds, Timothy D.; Fragomeni, Breno O.; Gao, Guangtu; Hernandez, Alvaro G.; Misztal, Ignacy; Welch, Timothy J.; Wiens, Gregory D.; Palti, Yniv
2016-01-01
Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic breeding values (GEBVs) for BCWD resistance in 10 families from the first generation of the NCCCWA BCWD resistance breeding line, compared the predictive ability (PA) of GEBVs to pedigree-based estimated breeding values (EBVs), and compared the impact of two SNP genotyping methods on the accuracy of GEBV predictions. The BCWD phenotypes survival days (DAYS) and survival status (STATUS) had been recorded in training fish (n = 583) subjected to experimental BCWD challenge. Training fish, and their full sibs without phenotypic data that were used as parents of the subsequent generation, were genotyped using two methods: restriction-site associated DNA (RAD) sequencing and the Rainbow Trout Axiom® 57 K SNP array (Chip). Animal-specific GEBVs were estimated using four GS models: BayesB, BayesC, single-step GBLUP (ssGBLUP), and weighted ssGBLUP (wssGBLUP). Family-specific EBVs were estimated using pedigree and phenotype data in the training fish only. The PA of EBVs and GEBVs was assessed by correlating mean progeny phenotype (MPP) with mid-parent EBV (family-specific) or GEBV (animal-specific). The best GEBV predictions were similar to EBV with PA values of 0.49 and 0.46 vs. 0.50 and 0.41 for DAYS and STATUS, respectively. Among the GEBV prediction methods, ssGBLUP consistently had the highest PA. The RAD genotyping platform had GEBVs with similar PA to those of GEBVs from the Chip platform. The PA of ssGBLUP and wssGBLUP methods was higher with the Chip, but for BayesB and BayesC methods it was higher with the RAD platform. The overall GEBV accuracy in this study was low to moderate, likely due to the small training sample used. 
This study explored the potential of GS for improving resistance to BCWD in rainbow trout using, for the first time, progeny testing data to assess the accuracy of GEBVs, and it provides the basis for further investigation on the implementation of GS in commercial rainbow trout populations. PMID:27303436
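Predictive ability (PA) is described above as the correlation between mean progeny phenotype (MPP) and mid-parent breeding value. A minimal sketch of that calculation, assuming a Pearson correlation and using made-up family values (the numbers and variable names are hypothetical, not data from the study):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values, one entry per progeny family
mid_parent_gebv = [1.0, 2.0, 3.0, 4.0, 5.0]       # mean of sire and dam GEBV
mean_progeny_phenotype = [2.0, 4.0, 5.0, 4.0, 5.0]  # e.g. mean survival DAYS

pa = pearson(mid_parent_gebv, mean_progeny_phenotype)
print(f"predictive ability: {pa:.3f}")
```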
Performance of genomic prediction within and across generations in maritime pine.
Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent
2016-08-11
Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.
Morgante, Fabio; Huang, Wen; Maltecca, Christian; Mackay, Trudy F C
2018-06-01
Predicting complex phenotypes from genomic data is a fundamental aim of animal and plant breeding, where we wish to predict genetic merits of selection candidates; and of human genetics, where we wish to predict disease risk. While genomic prediction models work well with populations of related individuals and high linkage disequilibrium (LD) (e.g., livestock), comparable models perform poorly for populations of unrelated individuals and low LD (e.g., humans). We hypothesized that low prediction accuracies in the latter situation may occur when the genetic architecture of the trait departs from the infinitesimal and additive architecture assumed by most prediction models. We used simulated data for 10,000 lines based on sequence data from a population of unrelated, inbred Drosophila melanogaster lines to evaluate this hypothesis. We show that, even in very simplified scenarios meant as a stress test of the commonly used Genomic Best Linear Unbiased Predictor (G-BLUP) method, using all common variants yields low prediction accuracy regardless of the trait genetic architecture. However, prediction accuracy increases when predictions are informed by the genetic architecture inferred from mapping the top variants affecting main effects and interactions in the training data, provided there is sufficient power for mapping. When the true genetic architecture is largely or partially due to epistatic interactions, the additive model may not perform well, while models that account explicitly for interactions generally increase prediction accuracy. Our results indicate that accounting for genetic architecture can improve prediction accuracy for quantitative traits.
Electrophysiological evidence for preserved primacy of lexical prediction in aging.
Dave, Shruti; Brothers, Trevor A; Traxler, Matthew J; Ferreira, Fernanda; Henderson, John M; Swaab, Tamara Y
2018-05-28
Young adults show consistent neural benefits of predictable contexts when processing upcoming words, but these benefits are less clear-cut in older adults. Here we disentangle the neural correlates of prediction accuracy and contextual support during word processing, in order to test current theories that suggest that neural mechanisms underlying predictive processing are specifically impaired in older adults. During a sentence comprehension task, older and younger readers were asked to predict passage-final words and report the accuracy of these predictions. Age-related reductions were observed for N250 and N400 effects of prediction accuracy, as well as for N400 effects of contextual support independent of prediction accuracy. Furthermore, temporal primacy of predictive processing (i.e., earlier facilitation for successful predictions) was preserved across the lifespan, suggesting that predictive mechanisms are unlikely to be uniquely impaired in older adults. In addition, older adults showed prediction effects on frontal post-N400 positivities (PNPs) that were similar in amplitude to PNPs in young adults. Previous research has shown correlations between verbal fluency and lexical prediction in older adult readers, suggesting that the production system may be linked to capacity for lexical prediction, especially in aging. The current study suggests that verbal fluency modulates PNP effects of contextual support, but not prediction accuracy. Taken together, our findings suggest that aging does not result in specific declines in lexical prediction. Copyright © 2018 Elsevier Ltd. All rights reserved.
MUSCLE: multiple sequence alignment with high accuracy and high throughput.
Edgar, Robert C
2004-01-01
We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
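Distance estimation by kmer counting, mentioned in this abstract, can be sketched without aligning the sequences at all. The function below uses a simple shared-kmer fraction; this is an illustrative measure and is not claimed to be the exact formula used by MUSCLE. The function names and the choice k=3 are assumptions.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=3):
    """Alignment-free distance: 1 minus the fraction of shared k-mers.

    Shared k-mers are counted with multiplicity (minimum count in each
    sequence) and normalised by the k-mer count of the shorter sequence.
    """
    denom = min(len(a), len(b)) - k + 1
    if denom <= 0:
        raise ValueError("both sequences must be at least k residues long")
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    return 1 - shared / denom

print(kmer_distance("ACDEFG", "ACDEFG"))  # identical sequences
print(kmer_distance("ACDEFG", "ACDXFG"))  # one substitution disrupts three 3-mers
```

Because no alignment is computed, the cost per pair is linear in sequence length, which is why this kind of estimate suits the first pass over thousands of sequences.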
Prediction of Fitness to Drive in Patients with Alzheimer's Dementia
Piersma, Dafne; Fuermaier, Anselm B. M.; de Waard, Dick; Davidse, Ragnhild J.; de Groot, Jolieke; Doumen, Michelle J. A.; Bredewoud, Ruud A.; Claesen, René; Lemstra, Afina W.; Vermeeren, Annemiek; Ponds, Rudolf; Verhey, Frans; Brouwer, Wiebo H.; Tucha, Oliver
2016-01-01
The number of patients with Alzheimer’s disease (AD) is increasing and so is the number of patients driving a car. To enable patients to retain their mobility while at the same time not endangering public safety, each patient should be assessed for fitness to drive. The aim of this study is to develop a method to assess fitness to drive in a clinical setting, using three types of assessments, i.e. clinical interviews, neuropsychological assessment and driving simulator rides. The goals are (1) to determine for each type of assessment which combination of measures is most predictive for on-road driving performance, (2) to compare the predictive value of clinical interviews, neuropsychological assessment and driving simulator evaluation and (3) to determine which combination of these assessments provides the best prediction of fitness to drive. Eighty-one patients with AD and 45 healthy individuals participated. All participated in a clinical interview, and were administered a neuropsychological test battery and a driving simulator ride (predictors). The criterion fitness to drive was determined in an on-road driving assessment by experts of the CBR Dutch driving test organisation according to their official protocol. The validity of the predictors to determine fitness to drive was explored by means of logistic regression analyses, discriminant function analyses, as well as receiver operating curve analyses. We found that all three types of assessments are predictive of on-road driving performance. Neuropsychological assessment had the highest classification accuracy followed by driving simulator rides and clinical interviews. However, combining all three types of assessments yielded the best prediction for fitness to drive in patients with AD with an overall accuracy of 92.7%, which makes this method highly valid for assessing fitness to drive in AD. This method may be used to advise patients with AD and their family members about fitness to drive. PMID:26910535
Artificial neural networks to predict activity type and energy expenditure in youth.
Trost, Stewart G; Wong, Weng-Keen; Pfeiffer, Karen A; Zheng, Yonglei
2012-09-01
Previous studies have demonstrated that pattern recognition approaches to accelerometer data reduction are feasible and moderately accurate in classifying activity type in children. Whether pattern recognition techniques can be used to provide valid estimates of physical activity (PA) energy expenditure in youth remains unexplored in the research literature. The objective of this study is to develop and test artificial neural networks (ANNs) to predict PA type and energy expenditure (PAEE) from processed accelerometer data collected in children and adolescents. One hundred participants between the ages of 5 and 15 yr completed 12 activity trials that were categorized into five PA types: sedentary, walking, running, light-intensity household activities or games, and moderate-to-vigorous-intensity games or sports. During each trial, participants wore an ActiGraph GT1M on the right hip, and VO2 was measured using the Oxycon Mobile (Viasys Healthcare, Yorba Linda, CA) portable metabolic system. ANNs to predict PA type and PAEE (METs) were developed using the following features: 10th, 25th, 50th, 75th, and 90th percentiles and the lag one autocorrelation. To determine the highest time resolution achievable, we extracted features from 10-, 15-, 20-, 30-, and 60-s windows. Accuracy was assessed by calculating the percentage of windows correctly classified and root mean square error (RMSE). As window size increased from 10 to 60 s, accuracy for the PA-type ANN increased from 81.3% to 88.4%. RMSE for the MET prediction ANN decreased from 1.1 METs to 0.9 METs. At any given window size, RMSE values for the MET prediction ANN were 30-40% lower than the conventional regression-based approaches. ANNs can be used to predict both PA type and PAEE in children and adolescents using count data from a single waist mounted accelerometer.
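The feature set named in this abstract (10th, 25th, 50th, 75th and 90th percentiles plus the lag-one autocorrelation per window) can be sketched as follows. The sketch assumes accelerometer counts stored at 1-s epochs, non-overlapping windows, and linearly interpolated percentiles; none of these details are confirmed by the abstract, and the function names are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in 0..100) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def lag1_autocorr(x):
    """Lag-one autocorrelation; defined as 0.0 for a constant window."""
    mean = sum(x) / len(x)
    den = sum((v - mean) ** 2 for v in x)
    if den == 0:
        return 0.0
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    return num / den

def window_features(counts, window):
    """Six features for each non-overlapping window of `window` samples."""
    feats = []
    for start in range(0, len(counts) - window + 1, window):
        w = counts[start:start + window]
        s = sorted(w)
        row = [percentile(s, q) for q in (10, 25, 50, 75, 90)]
        row.append(lag1_autocorr(w))
        feats.append(row)
    return feats

# Hypothetical 20 s of counts: a ramp followed by a constant stretch
counts = list(range(1, 11)) + [5] * 10
for row in window_features(counts, window=10):
    print([round(v, 2) for v in row])
```

Each row would then serve as one input vector to a classifier or regressor; longer windows give more stable percentiles at the cost of time resolution, which matches the trend the abstract reports.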
Bell, J J; Bauer, J D; Capra, S; Pulle, R C
2014-03-01
Differences in malnutrition diagnostic measures impact malnutrition prevalence and outcomes data in hip fracture. This study investigated the concurrent and predictive validity of commonly reported malnutrition diagnostic measures in patients admitted to a metropolitan hospital acute hip fracture unit. A prospective, consecutive level II diagnostic accuracy study (n=142; 8 exclusions) including the International Classification of Disease, 10th Revision, Australian Modification (ICD10-AM) protein-energy malnutrition criteria, a body mass index (BMI) <18.5 kg/m(2), the Mini-Nutrition Assessment Short-Form (MNA-SF), pre-operative albumin and geriatrician individualised assessment. Patients were predominantly elderly (median age 83.5, range 50-100 years), female (68%), multimorbid (median five comorbidities), with 15% 4-month mortality. Malnutrition prevalence was lowest when assessed by BMI (13%), followed by MNA-SF (27%), ICD10-AM (48%), albumin (53%) and geriatrician assessment (55%). Agreement between measures was highest between ICD10-AM and geriatrician assessment (κ=0.61) followed by ICD10-AM and MNA-SF measures (κ=0.34). ICD10-AM diagnosed malnutrition was the only measure associated with 48-h mobilisation (35.0 vs 55.3%; P=0.018). Reduced likelihood of home discharge was predicted by ICD-10-AM (20.6 vs 57.1%; P=0.001) and MNA-SF (18.8 vs 47.8%; P=0.035). Bivariate analysis demonstrated ICD10-AM (relative risk (RR)1.2; 1.05-1.42) and MNA-SF (RR1.2; 1.0-1.5) predicted 4-month mortality. When adjusted for age, usual place of residency, comorbidities and time to surgery only ICD-10AM criteria predicted mortality (odds ratio 3.59; 1.10-11.77). Albumin, BMI and geriatrician assessment demonstrated limited concurrent and predictive validity. Malnutrition prevalence in hip fracture varies substantially depending on the diagnostic measure applied. 
ICD-10AM criteria or the MNA-SF should be considered for the diagnosis of protein-energy malnutrition in frail, multi-morbid hip fracture inpatients.
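Agreement between two diagnostic measures, reported above as kappa, corrects raw agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for two binary measures follows; the counts in the example are hypothetical and are not taken from the study.

```python
def cohens_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two binary raters/measures from 2x2 agreement counts."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n          # raw agreement
    p_a = (both_pos + a_only) / n                 # measure A positive rate
    p_b = (both_pos + b_only) / n                 # measure B positive rate
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical counts for 100 patients assessed by two malnutrition measures
kappa = cohens_kappa(both_pos=40, a_only=10, b_only=5, both_neg=45)
print(f"kappa = {kappa:.2f}")
```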
A wavelet-based technique to predict treatment outcome for Major Depressive Disorder
Xia, Likun; Mohd Yasin, Mohd Azhar; Azhar Ali, Syed Saad
2017-01-01
Treatment management for Major Depressive Disorder (MDD) has been challenging. However, electroencephalogram (EEG)-based predictions of antidepressant treatment outcome may help during antidepressant selection and ultimately improve the quality of life for MDD patients. In this study, a machine learning (ML) method involving pretreatment EEG data was proposed to perform such predictions for Selective Serotonin Reuptake Inhibitors (SSRIs). For this purpose, the acquisition of experimental data involved 34 MDD patients and 30 healthy controls. Consequently, a feature matrix was constructed involving time-frequency decomposition of EEG data based on wavelet transform (WT) analysis, termed the EEG data matrix. However, the resultant EEG data matrix had high dimensionality. Therefore, dimension reduction was performed based on a rank-based feature selection method according to a criterion, i.e., receiver operating characteristic (ROC). As a result, the most significant features were identified and further utilized during the training and testing of a classification model, i.e., the logistic regression (LR) classifier. Finally, the LR model was validated with 100 iterations of 10-fold cross-validation (10-CV). The classification results were compared with short-time Fourier transform (STFT) analysis and empirical mode decomposition (EMD). The wavelet features extracted from frontal and temporal EEG data were found to be statistically significant. In comparison with other time-frequency approaches such as the STFT and EMD, the WT analysis showed the highest classification accuracy, i.e., accuracy = 87.5%, sensitivity = 95%, and specificity = 80%. In conclusion, significant wavelet coefficients extracted from frontal and temporal pre-treatment EEG data involving delta and theta frequency bands may predict antidepressant treatment outcome for MDD patients. PMID:28152063
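The rank-based feature selection step with an ROC criterion can be sketched as ranking each feature by its own area under the ROC curve (AUC). This is one plausible reading of the abstract, not the authors' implementation: the pairwise AUC estimate, the use of max(AUC, 1 - AUC) so that inversely related features also rank highly, and the toy data are all assumptions.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rank_features(X, y, top):
    """Return indices of the `top` features with the most discriminative AUC.

    X: list of samples, each a list of feature values
    y: list of 0/1 class labels
    """
    scored = []
    for j in range(len(X[0])):
        a = auc([row[j] for row in X], y)
        scored.append((max(a, 1 - a), j))  # direction of the effect is ignored
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scored[:top]]

# Toy data: 4 samples x 3 features, two classes (hypothetical values)
X = [[1, 5, 3], [2, 4, 9], [8, 6, 2], [9, 3, 8]]
y = [0, 0, 1, 1]
print(rank_features(X, y, top=2))
```

To avoid optimistic accuracy estimates, a ranking like this would normally be recomputed inside each cross-validation fold rather than once on the full dataset.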
Smart Extraction and Analysis System for Clinical Research.
Afzal, Muhammad; Hussain, Maqbool; Khan, Wajahat Ali; Ali, Taqdir; Jamshed, Arif; Lee, Sungyoung
2017-05-01
With the increasing use of electronic health records (EHRs), there is a growing need to expand the utilization of EHR data to support clinical research. The key challenge in achieving this goal is the unavailability of smart systems and methods to overcome the issue of data preparation, structuring, and sharing for smooth clinical research. We developed a robust analysis system called the smart extraction and analysis system (SEAS) that consists of two subsystems: (1) the information extraction system (IES), for extracting information from clinical documents, and (2) the survival analysis system (SAS), for a descriptive and predictive analysis to compile the survival statistics and predict the future chance of survivability. The IES subsystem is based on a novel permutation-based pattern recognition method that extracts information from unstructured clinical documents. Similarly, the SAS subsystem is based on a classification and regression tree (CART)-based prediction model for survival analysis. SEAS is evaluated and validated on a real-world case study of head and neck cancer. The overall information extraction accuracy of the system for semistructured text is recorded at 99%, while that for unstructured text is 97%. Furthermore, the automated, unstructured information extraction has reduced the average time spent on manual data entry by 75%, without compromising the accuracy of the system. Moreover, around 88% of patients are found in a terminal or dead state for the highest clinical stage of disease (level IV). Similarly, there is an ∼36% probability of a patient being alive if at least one of the lifestyle risk factors was positive. We presented our work on the development of SEAS to replace costly and time-consuming manual methods with smart automatic extraction of information and survival prediction methods. SEAS has reduced the time and energy of human resources spent unnecessarily on manual tasks.
Potential distribution of Panax ginseng and its predicted responses to climate change.
Zhao, Ze Fang; Wei, Hai Yan; Guo, Yan Long; Gu, Wei
2016-11-18
This study used Panax ginseng as the research object. Using the BioMod2 platform with species presence data and 22 climatic variables, the potential geographic distribution of P. ginseng under current conditions in northeast China was simulated with ten species distribution models. With receiver operating characteristic (ROC) curve values as weights, we then built an ensemble model that integrated the results of the 10 models. Using this ensemble model, the future distributions of P. ginseng were projected for the 2050s and 2070s under the RCP 8.5, RCP 6, RCP 4.5 and RCP 2.6 emission scenarios described in the Special Report on Emissions Scenarios (SRES) of the IPCC (Intergovernmental Panel on Climate Change). The results showed that, across the entire study area under present climatic conditions, 10.4% of the area was identified as suitable habitat, located mainly in the northeast Changbai Mountains area and the southeastern region of the Xiaoxing'an Mountains. The model simulations indicated that the suitable habitat would change substantially under the different climate change scenarios, and that its range would generally decrease to some degree. Meanwhile, the goodness-of-fit, predicted ranges, and weights of explanatory variables varied among models. According to goodness-of-fit, Maxent had the highest model performance, followed by GAM, RF and ANN, while SRE had the lowest prediction accuracy. In this study we established an ensemble model, which could improve the accuracy of existing species distribution models and optimize species distribution prediction results.
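The ROC-weighted ensemble described in this abstract can be pictured as a weighted mean of the per-model suitability scores. The following is a minimal sketch, assuming each model's weight is simply proportional to its ROC score; the weighting actually applied by BioMod2 may differ, and the function name and all numeric values are hypothetical.

```python
def weighted_ensemble(predictions, roc_scores):
    """Combine per-model habitat suitability scores for one grid cell.

    predictions: dict mapping model name -> suitability in [0, 1]
    roc_scores:  dict mapping model name -> ROC score used as the weight
    Assumption: weights are proportional to the ROC score.
    """
    total_weight = sum(roc_scores[name] for name in predictions)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(predictions[name] * roc_scores[name] for name in predictions)
    return weighted_sum / total_weight


# Hypothetical values for illustration only (not taken from the study).
roc = {"Maxent": 0.95, "GAM": 0.90, "RF": 0.90, "SRE": 0.60}
cell = {"Maxent": 0.80, "GAM": 0.70, "RF": 0.75, "SRE": 0.20}
print(round(weighted_ensemble(cell, roc), 3))
```

With this scheme a poorly performing model such as SRE still contributes, but with less influence than the better-fitting models.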
Zapatier, Jorge A; Gómez, Néstor A; Vargas, Paola E; Maya, Susana V
2007-06-01
Both Helicobacter pylori (H. pylori) infection and the diagnostic efficacy of serologic tests vary among geographic regions. The objective of the present work was to validate serological methods for the diagnosis of H. pylori infection locally and to determine the best cutoff value for the local population. Forty-eight patients were evaluated, 27 males and 21 females, with a mean age of 29.2 years. Three tests for H. pylori diagnosis were performed on each patient: IgG serology, IgA serology and histology. Efficacy parameters and the ROC curve were obtained for IgG and IgA serology using histology as the gold standard. The cutoff point with the highest efficacy for IgG serology was 16 U/ml (sensitivity 81%, specificity 65%, positive predictive value 81%, negative predictive value 65%, and accuracy 75%), and for IgA serology it was 17 U/ml (sensitivity 61%, specificity 53%, positive predictive value 70%, negative predictive value 43%, and accuracy 58%). The area under the curve was 67.1% (95% CI: 50 to 84.1) and 54.4% (95% CI: 38.3 to 72.5) for IgG and IgA, respectively. Serology is a valuable tool in our population, which has a high prevalence of H. pylori, especially because of its low cost and ease of use, but a reduction of the cutoff value was necessary to obtain higher sensitivity and a more adequate identification of true-positive cases.
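The efficacy parameters quoted above all derive from a 2x2 table of test result against the gold standard. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts used here are an assumption: they are one set of values for 48 patients that happens to reproduce the rounded IgG percentages.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard diagnostic accuracy measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all diseased
        "specificity": tn / (tn + fp),   # true negatives among all non-diseased
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts (not reported in the abstract), chosen to be
# consistent with the rounded IgG figures for 48 patients.
igg = diagnostic_metrics(tp=25, fp=6, fn=6, tn=11)
for name, value in igg.items():
    print(f"{name}: {value:.0%}")
```

Note that the predictive values depend on prevalence in the sample, so they would not transfer directly to a population with a different H. pylori prevalence.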
NASA Astrophysics Data System (ADS)
Pullanagari, R. R.; Kereszturi, Gábor; Yule, I. J.
2016-07-01
On-farm assessment of mixed pasture nutrient concentrations is important for animal production and pasture management. Hyperspectral imaging is recognized as a potential tool to quantify the nutrient content of vegetation. However, it is a great challenge to estimate macro and micro nutrients in heterogeneous mixed pastures. In this study, canopy reflectance data were measured using a high resolution airborne visible-to-shortwave infrared (Vis-SWIR) imaging spectrometer covering the wavelength region 380-2500 nm to predict the concentrations of nitrogen (N), phosphorus (P), potassium (K), sulfur (S), zinc (Zn), sodium (Na), manganese (Mn), copper (Cu) and magnesium (Mg) in heterogeneous mixed pastures across a sheep and beef farm in hill country in New Zealand. Prediction models were developed using four methods, namely partial least squares regression (PLSR), kernel PLSR, support vector regression (SVR) and random forest regression (RFR), and their performance was compared using the test data. The results revealed that RFR produced the highest accuracy (0.55 ⩽ R2CV ⩽ 0.78; 6.68% ⩽ nRMSECV ⩽ 26.47%) of all algorithms for the majority of nutrients (N, P, K, Zn, Na, Cu and Mg), while the remaining nutrients (S and Mn) were predicted with high accuracy (0.68 ⩽ R2CV ⩽ 0.86; 13.00% ⩽ nRMSECV ⩽ 14.64%) using SVR. The best training models were used to extrapolate over the whole farm in order to predict those pasture nutrients, which were expressed as pixel-based spatial maps. These spatially registered nutrient maps demonstrate the range and geographical location of often large differences in pasture nutrient values, which are normally not measured and therefore not included in decision making when considering more effective ways to utilize pasture.
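The two accuracy figures reported above, R2 and nRMSE, can be computed from paired observed and predicted values. This is a minimal sketch under two assumptions that the abstract does not confirm: R2 is taken as the coefficient of determination (some studies report the squared correlation instead), and RMSE is normalized by the mean of the observed values (normalizing by the range is also common). The sample values are made up.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE divided by the mean observed value, as a percentage (assumed convention)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


# Hypothetical nutrient concentrations (% dry matter), for illustration only.
obs = [2.0, 3.0, 4.0, 5.0]
pred = [2.5, 2.5, 4.5, 4.5]
print(r_squared(obs, pred), nrmse_percent(obs, pred))
```

In a cross-validated setting the `predicted` values would come from models fitted without the corresponding sample, which is what the CV subscript in the abstract indicates.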
Frea, Simone; Bovolo, Virginia; Pidello, Stefano; Canavosio, Federico G; Botta, Michela; Bergerone, Serena; Gaita, Fiorenzo
2015-09-15
Advanced heart failure is associated with end-organ damage. Recent literature suggested an intriguing crosstalk between failing heart, abdomen and kidneys. Venous ammonia, as a by-product of the gut, could be a marker of abdominal injury in heart failure patients. The aim of the study was to investigate the clinical and prognostic role of ammonia in patients with advanced decompensated heart failure (ADHF). 90 patients admitted with ADHF were prospectively studied. The prognostic role of ammonia at admission was evaluated. Primary end-points were a composite of cardiac death, urgent heart transplantation and mechanical circulatory support at 3 months, and need for renal replacement therapies (RRT). In the study cohort (age 59.0 ± 12.0 years, FE 21.6 ± 9.0%, INTERMACS profile 3.7 ± 0.9, creatinine 1.71 ± 0.95 mg/dl) 27 patients (30%) reached the cardiac composite endpoint, while 9 patients (10%) needed RRT. At ROC curve analysis ammonia ≥ 130 μg/dl (abdominal damage) showed the best diagnostic accuracy. At multivariate analysis abdominal damage predicted the cardiac composite endpoint. Abdominal damage further increased risk among patients with a cold profile at admission (HR 2.7, 95% CI 1.1-7.0, p = 0.046). At multivariate analysis abdominal damage also predicted need for RRT (OR 10.8, 95% CI 1.5-75.8, p = 0.017). The combined use of estimated right atrial pressure and ammonia showed the highest diagnostic accuracy and a very high specificity in prediction of need for RRT. In a selected population admitted for ADHF, ammonia, as a marker of abdominal derangement, predicted adverse cardiac events and need for RRT. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Nandi, Sutanu; Subramanian, Abhishek; Sarkar, Ram Rup
2017-07-25
Prediction of essential genes helps to identify a minimal set of genes that are absolutely required for the appropriate functioning and survival of a cell. The available machine learning techniques for essential gene prediction have inherent problems, like imbalanced provision of training datasets, biased choice of the best model for a given balanced dataset, choice of a complex machine learning algorithm, and data-based automated selection of biologically relevant features for classification. Here, we propose a simple support vector machine-based learning strategy for the prediction of essential genes in Escherichia coli K-12 MG1655 metabolism that integrates a non-conventional combination of an appropriate sample balanced training set, a unique organism-specific genotype, phenotype attributes that characterize essential genes, and optimal parameters of the learning algorithm to generate the best machine learning model (the model with the highest accuracy among all the models trained for different sample training sets). For the first time, we also introduce flux-coupled metabolic subnetwork-based features for enhancing the classification performance. Our strategy proves to be superior as compared to previous SVM-based strategies in obtaining a biologically relevant classification of genes with high sensitivity and specificity. This methodology was also trained with datasets of other recent supervised classification techniques for essential gene classification and tested using reported test datasets. The testing accuracy was always high as compared to the known techniques, proving that our method outperforms known methods. Observations from our study indicate that essential genes are conserved among homologous bacterial species, demonstrate high codon usage bias, GC content and gene expression, and predominantly possess a tendency to form physiological flux modules in metabolism.
Bettinger, Nicolas; Khalique, Omar K; Krepp, Joseph M; Hamid, Nadira B; Bae, David J; Pulerwitz, Todd C; Liao, Ming; Hahn, Rebecca T; Vahl, Torsten P; Nazif, Tamim M; George, Isaac; Leon, Martin B; Einstein, Andrew J; Kodali, Susheel K
The threshold for the optimal computed tomography (CT) number in Hounsfield Units (HU) to quantify aortic valvular calcium on contrast-enhanced scans has not been standardized. Our aim was to find the most accurate threshold to predict paravalvular regurgitation (PVR) after transcatheter aortic valve replacement (TAVR). 104 patients who underwent TAVR with the CoreValve prosthesis were studied retrospectively. Luminal attenuation (LA) in HU was measured at the level of the aortic annulus. Calcium volume score for the aortic valvular complex was measured using 6 threshold cutoffs (650 HU, 850 HU, LA × 1.25, LA × 1.5, LA+50, LA+100). Receiver-operating characteristic (ROC) analysis was performed to assess the predictive value for > mild PVR (n = 16). Multivariable analysis was performed to determine the accuracy to predict > mild PVR after adjustment for depth and perimeter oversizing. ROC analysis showed lower area under the curve (AUC) values for fixed threshold cutoffs (650 or 850 HU) compared to thresholds relative to LA. The LA+100 threshold had the highest AUC (0.81), and its AUC was higher than that of all other studied protocols, although for the LA × 1.25 and LA+50 protocols the difference only approached statistical significance (p = 0.05 and 0.068, respectively). Multivariable analysis showed calcium volume determined by the LA × 1.25, LA × 1.5, LA+50, and LA+100 HU protocols to independently predict PVR. Calcium volume scoring thresholds which are relative to LA are more predictive of PVR post-TAVR than those which use fixed cutoffs. A threshold of LA+100 HU had the highest predictive value. Copyright © 2017 Society of Cardiovascular Computed Tomography. Published by Elsevier Inc. All rights reserved.
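To make the six cutoffs and the AUC comparison concrete, the sketch below derives the thresholds from a patient's luminal attenuation and computes an AUC as the probability that a case with more than mild PVR has a higher calcium score than a case without. The function names and all numbers are hypothetical, and the pairwise AUC is a generic textbook formulation rather than the software used in the study.

```python
def calcium_thresholds(la_hu):
    """Return the six HU cutoffs named in the abstract for a given luminal attenuation."""
    return {
        "650 HU": 650.0,
        "850 HU": 850.0,
        "LA x 1.25": la_hu * 1.25,
        "LA x 1.5": la_hu * 1.5,
        "LA+50": la_hu + 50.0,
        "LA+100": la_hu + 100.0,
    }


def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical luminal attenuation and calcium volume scores, for illustration only.
print(calcium_thresholds(400.0))
print(roc_auc(scores_pos=[3, 5, 8], scores_neg=[1, 2, 5, 4]))
```

The fixed cutoffs ignore contrast enhancement, whereas the LA-relative cutoffs shift with each scan, which is the distinction the abstract tests.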
Waide, Emily H; Tuggle, Christopher K; Serão, Nick V L; Schroyen, Martine; Hess, Andrew; Rowland, Raymond R R; Lunney, Joan K; Plastow, Graham; Dekkers, Jack C M
2018-02-01
Genomic prediction of the pig's response to the porcine reproductive and respiratory syndrome (PRRS) virus (PRRSV) would be a useful tool in the swine industry. This study investigated the accuracy of genomic prediction based on porcine SNP60 Beadchip data using training and validation datasets from populations with different genetic backgrounds that were challenged with different PRRSV isolates. Genomic prediction accuracy averaged 0.34 for viral load (VL) and 0.23 for weight gain (WG) following experimental PRRSV challenge, which demonstrates that genomic selection could be used to improve response to PRRSV infection. Training on WG data during infection with a less virulent PRRSV, KS06, resulted in poor accuracy of prediction for WG during infection with a more virulent PRRSV, NVSL. Inclusion of single nucleotide polymorphisms (SNPs) that are in linkage disequilibrium with a major quantitative trait locus (QTL) on chromosome 4 was vital for accurate prediction of VL. Overall, SNPs that were significantly associated with either trait in single SNP genome-wide association analysis were unable to predict the phenotypes with an accuracy as high as that obtained by using all genotyped SNPs across the genome. Inclusion of data from close relatives into the training population increased whole genome prediction accuracy by 33% for VL and by 37% for WG but did not affect the accuracy of prediction when using only SNPs in the major QTL region. Results show that genomic prediction of response to PRRSV infection is moderately accurate and, when using all SNPs on the porcine SNP60 Beadchip, is not very sensitive to differences in virulence of the PRRSV in training and validation populations. Including close relatives in the training population increased prediction accuracy when using the whole genome or SNPs other than those near a major QTL.
Genomic Prediction of Seed Quality Traits Using Advanced Barley Breeding Lines
Nielsen, Nanna Hellum; Jahoor, Ahmed; Jensen, Jens Due; Orabi, Jihad; Cericola, Fabio; Edriss, Vahid; Jensen, Just
2016-01-01
Genomic selection was recently introduced in plant breeding. The objective of this study was to develop genomic prediction for important seed quality parameters in spring barley. The aim was to predict breeding values without expensive phenotyping of large sets of lines. A total number of 309 advanced spring barley lines tested at two locations each with three replicates were phenotyped and each line was genotyped by Illumina iSelect 9Kbarley chip. The population originated from two different breeding sets, which were phenotyped in two different years. Phenotypic measurements considered were: seed size, protein content, protein yield, test weight and ergosterol content. A leave-one-out cross-validation strategy revealed high prediction accuracies ranging between 0.40 and 0.83. Prediction across breeding sets resulted in reduced accuracies compared to the leave-one-out strategy. Furthermore, predicting across full and half-sib-families resulted in reduced prediction accuracies. Additionally, predictions were performed using reduced marker sets and reduced training population sets. In conclusion, using less than 200 lines in the training set can result in low prediction accuracy, and the accuracy will then be highly dependent on the family structure of the selected training set. However, the results also indicate that relatively small training sets (200 lines) are sufficient for genomic prediction in commercial barley breeding. In addition, our results indicate a minimum marker set of 1,000 to decrease the risk of low prediction accuracy for some traits or some families. PMID:27783639
Influence of outliers on accuracy estimation in genomic prediction in plant breeding.
Estaghvirou, Sidi Boubacar Ould; Ogutu, Joseph O; Piepho, Hans-Peter
2014-10-01
Outliers often pose problems in analyses of data in plant breeding, but their influence on the performance of methods for estimating predictive accuracy in genomic prediction studies has not yet been evaluated. Here, we evaluate the influence of outliers on the performance of methods for accuracy estimation in genomic prediction studies using simulation. We simulated 1000 datasets for each of 10 scenarios to evaluate the influence of outliers on the performance of seven methods for estimating accuracy. These scenarios are defined by the number of genotypes, marker effect variance, and magnitude of outliers. To mimic outliers, we added to one observation in each simulated dataset, in turn, 5-, 8-, and 10-times the error SD used to simulate small and large phenotypic datasets. The effect of outliers on accuracy estimation was evaluated by comparing deviations in the estimated and true accuracies for datasets with and without outliers. Outliers adversely influenced accuracy estimation, more so at small values of genetic variance or number of genotypes. A method for estimating heritability and predictive accuracy in plant breeding and another used to estimate accuracy in animal breeding were the most accurate and resistant to outliers across all scenarios and are therefore preferable for accuracy estimation in genomic prediction studies. The performances of the other five methods that use cross-validation were less consistent and varied widely across scenarios. The computing time for the methods increased as the size of outliers and sample size increased and the genetic variance decreased. Copyright © 2014 Ould Estaghvirou et al.
Salience Network–Based Classification and Prediction of Symptom Severity in Children With Autism
Uddin, Lucina Q.; Supekar, Kaustubh; Lynch, Charles J.; Khouzam, Amirah; Phillips, Jennifer; Feinstein, Carl; Ryali, Srikanth; Menon, Vinod
2014-01-01
IMPORTANCE Autism spectrum disorder (ASD) affects 1 in 88 children and is characterized by a complex phenotype, including social, communicative, and sensorimotor deficits. Autism spectrum disorder has been linked with atypical connectivity across multiple brain systems, yet the nature of these differences in young children with the disorder is not well understood. OBJECTIVES To examine connectivity of large-scale brain networks and determine whether specific networks can distinguish children with ASD from typically developing (TD) children and predict symptom severity in children with ASD. DESIGN, SETTING, AND PARTICIPANTS Case-control study performed at Stanford University School of Medicine of 20 children 7 to 12 years old with ASD and 20 age-, sex-, and IQ-matched TD children. MAIN OUTCOMES AND MEASURES Between-group differences in intrinsic functional connectivity of large-scale brain networks, performance of a classifier built to discriminate children with ASD from TD children based on specific brain networks, and correlations between brain networks and core symptoms of ASD. RESULTS We observed stronger functional connectivity within several large-scale brain networks in children with ASD compared with TD children. This hyperconnectivity in ASD encompassed salience, default mode, frontotemporal, motor, and visual networks. This hyperconnectivity result was replicated in an independent cohort obtained from publicly available databases. Using maps of each individual’s salience network, children with ASD could be discriminated from TD children with a classification accuracy of 78%, with 75% sensitivity and 80% specificity. The salience network showed the highest classification accuracy among all networks examined, and the blood oxygen–level dependent signal in this network predicted restricted and repetitive behavior scores. The classifier discriminated ASD from TD in the independent sample with 83% accuracy, 67% sensitivity, and 100% specificity. 
CONCLUSIONS AND RELEVANCE Salience network hyperconnectivity may be a distinguishing feature in children with ASD. Quantification of brain network connectivity is a step toward developing biomarkers for objectively identifying children with ASD. PMID:23803651
Dervin, Geoffrey F.; Stiell, Ian G.; Wells, George A.; Rody, Kelly; Grabowski, Jenny
2001-01-01
Objective To determine clinicians’ accuracy and reliability for the clinical diagnosis of unstable meniscus tears in patients with symptomatic osteoarthritis of the knee. Design A prospective cohort study. Setting A single tertiary care centre. Patients One hundred and fifty-two patients with symptomatic osteoarthritis of the knee refractory to conservative medical treatment were selected for prospective evaluation of arthroscopic débridement. Intervention Arthroscopic débridement of the knee, including meniscal tear and chondral flap resection, without abrasion arthroplasty. Outcome measures A standardized assessment protocol was administered to each patient by 2 independent observers. Arthroscopic determination of unstable meniscal tears was recorded by 1 observer who reviewed a video recording and was blinded to preoperative data. Those variables that had the highest interobserver agreement and the strongest association with meniscal tear by univariate methods were entered into logistic regression to model the best prediction of resectable tears. Results There were 92 meniscal tears (77 medial, 15 lateral). Interobserver agreement between clinical fellows and treating surgeons was poor to fair (κ < 0.4) for all clinical variables except radiographic measures, which were good. Fellows and surgeons predicted unstable meniscal tear preoperatively with equivalent accuracy of 60%. Logistic regression modelling revealed that a history of swelling and a ballottable effusion were negative predictors. A positive McMurray test was the only positive predictor of unstable meniscal tear. “Mechanical” symptoms were not reliable predictors in this prospective study. The model was 69% accurate for all patients and 76% for those with advanced medial compartment osteoarthritis defined by a joint space height of 2 mm or less. 
Conclusions This study underscored the difficulty in using clinical variables to predict unstable medial meniscal tears in patients with pre-existing osteoarthritis of the knee. The lack of interobserver agreement must be overcome to ensure that the findings can be generalized to other physician observers. PMID:11504260
Benchmarking dairy herd health status using routinely recorded herd summary data.
Parker Gaddis, K L; Cole, J B; Clay, J S; Maltecca, C
2016-02-01
Genetic improvement of dairy cattle health through the use of producer-recorded data has been determined to be feasible. Low estimated heritabilities indicate that genetic progress will be slow. Variation observed in lowly heritable traits can largely be attributed to nongenetic factors, such as the environment. More rapid improvement of dairy cattle health may be attainable if herd health programs incorporate environmental and managerial aspects. More than 1,100 herd characteristics are regularly recorded on farm test-days. We combined these data with producer-recorded health event data, and parametric and nonparametric models were used to benchmark herd and cow health status. Health events were grouped into 3 categories for analyses: mastitis, reproductive, and metabolic. Both herd incidence and individual incidence were used as dependent variables. Models implemented included stepwise logistic regression, support vector machines, and random forests. At both the herd and individual levels, random forest models attained the highest accuracy for predicting health status in all health event categories when evaluated with 10-fold cross-validation. Accuracy (SD) ranged from 0.61 (0.04) to 0.63 (0.04) when using random forest models at the herd level. Accuracy of prediction (SD) at the individual cow level ranged from 0.87 (0.06) to 0.93 (0.001) with random forest models. Highly significant variables and key words from logistic regression and random forest models were also investigated. All models identified several of the same key factors for each health event category, including movement out of the herd, size of the herd, and weather-related variables. We concluded that benchmarking health status using routinely collected herd data is feasible. Nonparametric models were better suited to handle this complex data with numerous variables. 
These data mining techniques were able to perform prediction of health status and could add evidence to personal experience in herd management. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
The effect of using genealogy-based haplotypes for genomic prediction
Edriss, Vahid; Fernando, Rohan L; Su, Guosheng; Lund, Mogens S; Guldbrandtsen, Bernt
2013-03-06
Background Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. Methods A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. Results About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. Conclusions Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy. PMID:23496971
Achamrah, Najate; Jésus, Pierre; Grigioni, Sébastien; Rimbert, Agnès; Petit, André; Déchelotte, Pierre; Folope, Vanessa; Coëffier, Moïse
2018-01-01
Predictive equations have been specifically developed for obese patients to estimate resting energy expenditure (REE). Body composition (BC) assessment is needed for some of these equations. We assessed the impact of BC methods on the accuracy of specific predictive equations developed in obese patients. REE was measured (mREE) by indirect calorimetry and BC assessed by bioelectrical impedance analysis (BIA) and dual-energy X-ray absorptiometry (DXA). Predicted REE was compared with mREE, and the percentage of accurate predictions (within ±10% of mREE) was determined for each equation. Predictive equations were studied in 2588 obese patients. Mean mREE was 1788 ± 6.3 kcal/24 h. Only the Müller (BIA) and Harris & Benedict (HB) equations provided REE with no difference from mREE. The Huang, Müller, Horie-Waitzberg, and HB formulas gave accurate predictions most often (>60% of cases). The use of BIA provided better predictions of REE than DXA for the Huang and Müller equations. Conversely, the Horie-Waitzberg and Lazzer formulas were more accurate using DXA. Accuracy decreased when the equations were applied to patients with BMI ≥ 40, except for the Horie-Waitzberg and Lazzer (DXA) formulas. Müller equations based on BIA markedly improved REE prediction accuracy compared with equations not based on BC. The value of BC for improving the accuracy of REE predictive equations in obese patients should be confirmed. PMID:29320432
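The accuracy criterion used in this abstract counts a prediction as accurate when it falls within ±10% of the measured value. A minimal sketch of that calculation follows; the function name and the kcal/24 h values are hypothetical and not taken from the study.

```python
def percent_accurate(measured, predicted, tolerance=0.10):
    """Percentage of predictions lying within +/- tolerance of the measured value."""
    hits = sum(
        1 for m, p in zip(measured, predicted)
        if abs(p - m) <= tolerance * m
    )
    return 100.0 * hits / len(measured)


# Hypothetical measured and predicted REE values in kcal/24 h.
mree = [1800, 2000, 1600, 2200]
pree = [1700, 2300, 1650, 2000]
print(percent_accurate(mree, pree))
```

Because the tolerance is relative to each patient's measured value, the same absolute error can be accurate for one patient and inaccurate for another.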
NASA Astrophysics Data System (ADS)
García-Arteaga, Juan D.; Corredor, Germán; Wang, Xiangxue; Velcheti, Vamsidhar; Madabhushi, Anant; Romero, Eduardo
2017-11-01
Tumor-infiltrating lymphocytes (TIL) occur when various classes of white blood cells migrate from the blood stream towards the tumor, infiltrating it. The presence of TIL is predictive of the response of the patient to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H&E histopathological images and the quantitative evaluation of the global lymphocyte configuration, evaluated through global features extracted from non-parametric graphs constructed from the lymphocytes' detected positions, can be correlated to the patient's outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. From the evaluated graphs, minimum spanning trees and K-nn showed the highest predictive ability, yielding F1 Scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve the decision making and treatment planning processes.
Beaulieu, Jean; Doerksen, Trevor K; MacKay, John; Rainville, André; Bousquet, Jean
2014-12-02
Genomic selection (GS) may improve selection response over conventional pedigree-based selection if markers capture more detailed information than pedigrees in recently domesticated tree species and/or make it more cost effective. Genomic prediction accuracies using 1748 trees and 6932 SNPs representative of as many distinct gene loci were determined for growth and wood traits in white spruce, within and between environments and breeding groups (BG), each with an effective size of Ne ≈ 20. Marker subsets were also tested. Model fits and/or cross-validation (CV) prediction accuracies for ridge regression (RR) and the least absolute shrinkage and selection operator models approached those of pedigree-based models. With strong relatedness between CV sets, prediction accuracies for RR within environment and BG were high for wood (r = 0.71-0.79) and moderately high for growth (r = 0.52-0.69) traits, in line with trends in heritabilities. For both classes of traits, these accuracies achieved between 83% and 92% of those obtained with phenotypes and pedigree information. Prediction into untested environments remained moderately high for wood (r ≥ 0.61) but dropped significantly for growth (r ≥ 0.24) traits, emphasizing the need to phenotype in all test environments and model genotype-by-environment interactions for growth traits. Removing relatedness between CV sets sharply decreased prediction accuracies for all traits and subpopulations, falling near zero between BGs with no known shared ancestry. For marker subsets, similar patterns were observed but with lower prediction accuracies. Given the need for high relatedness between CV sets to obtain good prediction accuracies, we recommend building GS models for prediction within the same breeding population only.
Breeding groups could be merged to build genomic prediction models as long as the total effective population size does not exceed 50 individuals in order to obtain high prediction accuracy such as that obtained in the present study. A number of markers limited to a few hundred would not negatively impact prediction accuracies, but these could decrease more rapidly over generations. The most promising short-term approach for genomic selection would likely be the selection of superior individuals within large full-sib families vegetatively propagated to implement multiclonal forestry.
Johansen, Kirsten L; Dalrymple, Lorien S; Delgado, Cynthia; Kaysen, George A; Kornak, John; Grimes, Barbara; Chertow, Glenn M
2014-10-01
A well-accepted definition of frailty includes measurements of physical performance, which may limit its clinical utility. In a cross-sectional study, we compared prevalence and patient characteristics based on a frailty definition that uses self-reported function to the classic performance-based definition and developed a modified self-report-based definition. Prevalent adult patients receiving hemodialysis in 14 centers around San Francisco and Atlanta in 2009-2011. Self-report-based frailty definition in which a score lower than 75 on the Physical Function scale of the 36-Item Short Form Health Survey (SF-36) was substituted for gait speed and grip strength in the classic definition; modified self-report definition with optimized Physical Function score cutoff points derived in a development (one-half) cohort and validated in the other half. Performance-based frailty defined as 3 of the following: weight loss, weakness, exhaustion, low physical activity, and slow gait speed. 387 (53%) patients were frail based on self-reported function, of whom 209 (29% of the cohort) met the performance-based definition. Only 23 (3%) met the performance-based definition of frailty only. The self-report definition had 90% sensitivity, 64% specificity, 54% positive predictive value, 93% negative predictive value, and 72.5% overall accuracy. Intracellular water per kilogram of body weight and serum albumin, prealbumin, and creatinine levels were highest among nonfrail individuals, intermediate among those who were frail by self-report, and lowest among those who also were frail by performance. Age, percentage of body fat, and C-reactive protein level followed an opposite pattern. The modified self-report definition had better accuracy (84%; 95% CI, 79%-89%) and superior specificity (88%) and positive predictive value (67%). Our study did not address prediction of outcomes. 
Patients who meet the self-report-based but not the performance-based definition of frailty may represent an intermediate phenotype. A modified self-report definition can improve the accuracy of a questionnaire-based method of defining frailty. Published by Elsevier Inc.
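The figures reported in the frailty abstract above can be cross-checked with a 2×2 table. The sketch below is only illustrative: the cohort size of 730 is an assumption inferred from "387 (53%)" and is not stated in the abstract, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic metrics, with the performance-based definition as reference."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# ASSUMPTION: total cohort size inferred from 387 being 53% of patients.
ASSUMED_COHORT_SIZE = 730

tp = 209                 # frail by self-report and by performance
fp = 387 - 209           # frail by self-report only
fn = 23                  # frail by performance only
tn = ASSUMED_COHORT_SIZE - tp - fp - fn  # frail by neither definition

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under that assumed cohort size, the computed values round to the reported 90% sensitivity, 64% specificity, 54% positive predictive value, 93% negative predictive value, and 72.5% accuracy.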
Uribe-Rivera, David E; Soto-Azat, Claudio; Valenzuela-Sánchez, Andrés; Bizama, Gustavo; Simonetti, Javier A; Pliscoff, Patricio
2017-07-01
Climate change is a major threat to biodiversity; the development of models that reliably predict its effects on species distributions is a priority for conservation biogeography. Two of the main issues for accurate temporal predictions from Species Distribution Models (SDM) are model extrapolation and unrealistic dispersal scenarios. We assessed the consequences of these issues on the accuracy of climate-driven SDM predictions for the dispersal-limited Darwin's frog Rhinoderma darwinii in South America. We calibrated models using historical data (1950-1975) and projected them across 40 yr to predict distribution under current climatic conditions, assessing predictive accuracy through the area under the ROC curve (AUC) and True Skill Statistics (TSS), contrasting binary model predictions against a temporally independent validation data set (i.e., current presences/absences). To assess the effects of incorporating dispersal processes, we compared the predictive accuracy of dispersal-constrained models with that of SDMs without dispersal limitation; and to assess the effects of model extrapolation on the predictive accuracy of SDMs, we compared accuracy between extrapolated and non-extrapolated areas. The incorporation of dispersal processes enhanced predictive accuracy, mainly due to a decrease in the false presence rate of model predictions, which is consistent with discrimination of suitable but inaccessible habitat. This also had consequences on range size changes over time, which is the most used proxy for extinction risk from climate change. The area of current climatic conditions that was absent in the baseline conditions (i.e., extrapolated areas) represents 39% of the study area, leading to a significant decrease in predictive accuracy of model predictions for those areas.
Our results highlight (1) incorporating dispersal processes can improve predictive accuracy of temporal transference of SDMs and reduce uncertainties of extinction risk assessments from global change; (2) as geographical areas subjected to novel climates are expected to arise, they must be reported as they show less accurate predictions under future climate scenarios. Consequently, environmental extrapolation and dispersal processes should be explicitly incorporated to report and reduce uncertainties in temporal predictions of SDMs, respectively. Doing so, we expect to improve the reliability of the information we provide for conservation decision makers under future climate change scenarios. © 2017 by the Ecological Society of America.
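The True Skill Statistic mentioned in the abstract above is commonly defined as sensitivity + specificity − 1 for binary presence/absence predictions. The following is a minimal sketch under that definition; the function name and the presence/absence vectors are hypothetical and not taken from the study.

```python
def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for binary presence (1) / absence (0) data."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # presence correctly predicted
    fn = sum(1 for o, p in pairs if o and not p)      # presence missed
    tn = sum(1 for o, p in pairs if not o and not p)  # absence correctly predicted
    fp = sum(1 for o, p in pairs if not o and p)      # false presence
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one observed presence and one observed absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


# Hypothetical validation data (illustrative only).
observed_presence = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_presence = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tss = true_skill_statistic(observed_presence, predicted_presence)
print(f"TSS = {tss:.3f}")
```

A lower false presence rate raises specificity and therefore TSS, which is the mechanism the abstract attributes to dispersal-constrained models.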
Analysis of Artificial Neural Network in Erosion Modeling: A Case Study of Serang Watershed
NASA Astrophysics Data System (ADS)
Arif, N.; Danoedoro, P.; Hartono
2017-12-01
Erosion modeling is an important tool for both land users and decision makers to evaluate land cultivation, so a model that represents actual conditions is needed. Erosion models are complex because of uncertain data from different sources and processing procedures. Artificial neural networks can be relied on for complex and non-linear data processing such as erosion data. The main difficulty in artificial neural network training is determining the value of each network input parameter, i.e. hidden layer, learning rate, momentum, and RMS. This study tested the capability of an artificial neural network in predicting erosion risk with several input parameters through multiple simulations to obtain good classification results. The model was implemented in Serang Watershed, Kulonprogo, Yogyakarta, which is one of the critical potential watersheds in Indonesia. The simulation results showed that the number of iterations had a significant effect on accuracy compared with other parameters. A small number of iterations can produce good accuracy if the combination of other parameters is right. In this case, one hidden layer was sufficient to produce good accuracy. The highest training accuracy achieved in this study was 99.32%, which occurred in the ANN 14 simulation with a combination of network input parameters of 1 HL; LR 0.01; M 0.5; RMS 0.0001, and 15000 iterations. The ANN training accuracy was not influenced by the number of channels, namely the input dataset (erosion factors), or by data dimensions; rather, it was determined by changes in network parameters.
The effect of MLC speed and acceleration on the plan delivery accuracy of VMAT
Park, J M; Wu, H-G; Kim, J H; Carlson, J N K
2015-01-01
Objective: To determine a new metric utilizing multileaf collimator (MLC) speeds and accelerations to predict plan delivery accuracy of volumetric modulated arc therapy (VMAT). Methods: To verify VMAT delivery accuracy, gamma evaluations, analysis of mechanical parameter difference between plans and log files, and analysis of changes in dose–volumetric parameters between plans and plans reconstructed with log files were performed with 40 VMAT plans. The average proportion of leaf speeds ranging from l to h cm s−1 (Sl–h, where l–h = 0–0.4, 0.4–0.8, 0.8–1.2, 1.2–1.6 and 1.6–2.0) and the mean and standard deviation of MLC speeds were calculated for each VMAT plan. The same was carried out for accelerations in centimetres per second squared (Al–h, where l–h = 0–4, 4–8, 8–12, 12–16 and 16–20). The correlations of those indicators to plan delivery accuracy were analysed with Spearman's correlation coefficient (rs). Results: The S1.2–1.6 and mean acceleration of MLCs showed generally higher correlations to plan delivery accuracy than did others. The highest rs values were observed between S1.2–1.6 and global 1%/2 mm (rs = −0.698 with p < 0.001) as well as mean acceleration and global 1%/2 mm (rs = −0.650 with p < 0.001). As the proportions of MLC speeds >0.4 cm s−1 and accelerations >4 cm s−2 increased, the plan delivery accuracy of VMAT decreased. Conclusion: The variations in MLC speeds and accelerations showed considerable correlations to VMAT delivery accuracy. Advances in knowledge: As the MLC speeds and accelerations increased, VMAT delivery accuracy reduced. PMID:25734490
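Spearman's correlation coefficient, used in the abstract above, is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with average ranks for ties; the function names and the per-plan values are hypothetical and are not the study's data.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-plan values: proportion of fast leaf speeds vs gamma passing rate (%).
fast_speed_proportion = [0.05, 0.10, 0.15, 0.20, 0.25]
gamma_passing_rate = [98.0, 97.5, 96.0, 96.5, 93.0]

rs = spearman_rs(fast_speed_proportion, gamma_passing_rate)
print(f"rs = {rs:.3f}")
```

A negative rs, as in this toy data, corresponds to the direction reported in the abstract: passing rates fall as the proportion of fast leaf motion rises.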
Genomic selection across multiple breeding cycles in applied bread wheat breeding.
Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann
2016-06-01
We evaluated genomic selection across five breeding cycles of bread wheat breeding. Bias of within-cycle cross-validation and methods for improving the prediction accuracy were assessed. The prospect of genomic selection has been frequently shown by cross-validation studies using the same genetic material across multiple environments, but studies investigating genomic selection across multiple breeding cycles in applied bread wheat breeding are lacking. We estimated the prediction accuracy of grain yield, protein content and protein yield of 659 inbred lines across five independent breeding cycles and assessed the bias of within-cycle cross-validation. We investigated the influence of outliers on the prediction accuracy and predicted protein yield by its component traits. A high average heritability was estimated for protein content, followed by grain yield and protein yield. The bias of the prediction accuracy using populations from individual cycles using fivefold cross-validation was accordingly substantial for protein yield (17-712 %) and less pronounced for protein content (8-86 %). Cross-validation using the cycles as folds aimed to avoid this bias and reached a maximum prediction accuracy of [Formula: see text] = 0.51 for protein content, [Formula: see text] = 0.38 for grain yield and [Formula: see text] = 0.16 for protein yield. Dropping outlier cycles increased the prediction accuracy of grain yield to [Formula: see text] = 0.41 as estimated by cross-validation, while dropping outlier environments did not have a significant effect on the prediction accuracy. Independent validation suggests, on the other hand, that careful consideration is necessary before an outlier correction is undertaken, which removes lines from the training population. Predicting protein yield by multiplying genomic estimated breeding values of grain yield and protein content raised the prediction accuracy to [Formula: see text] = 0.19 for this derived trait.
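"Cross-validation using the cycles as folds" amounts to holding out one whole breeding cycle at a time, so that no line from the test cycle appears in training. The sketch below shows only the splitting step; the function name and the cycle labels are hypothetical, and no genomic model is fitted.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical breeding-cycle label for each inbred line (illustrative only).
cycle_of_line = [2010, 2010, 2011, 2011, 2011, 2012]

splits = list(leave_one_group_out(cycle_of_line))
for cycle, train_idx, test_idx in splits:
    print(f"hold out {cycle}: train on {train_idx}, test on {test_idx}")
```

By contrast, within-cycle k-fold cross-validation draws training and test lines from the same cycle, which is the setting in which the abstract reports substantial bias.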
SU-F-R-14: PET Based Radiomics to Predict Outcomes in Patients with Hodgkin Lymphoma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, J; Aristophanous, M; Akhtari, M
Purpose: To identify PET-based radiomics features associated with high refractory/relapsed disease risk for Hodgkin lymphoma patients. Methods: A total of 251 Hodgkin lymphoma patients including 19 primary refractory and 9 relapsed patients were investigated. All patients underwent an initial pre-treatment diagnostic FDG PET/CT scan. All cancerous lymph node regions (ROIs) were delineated by an experienced physician based on thresholding each volume of disease in the anatomical regions to SUV>2.5. We extracted 122 image features and evaluated the effect of ROI selection (the largest ROI, the ROI with highest mean SUV, merged ROI, and a single anatomic region [e.g. mediastinum]) on classification accuracy. Random forest was used as a classifier and ROC analysis was used to assess the relationship between selected features and patient’s outcome status. Results: Each patient had between 1 and 9 separate ROIs, with much intra-patient variability in PET features. The best model, which used features from a single anatomic region (the mediastinal ROI, only volumes>5cc: 169 patients with 12 primary refractory) had a classification accuracy of 80.5% for primary refractory disease. The top five features, based on Gini index, consist of shape features (max 3D-diameter and volume) and texture features (correlation and information measure of correlation1&2). In the ROC analysis, sensitivity and specificity of the best model were 0.92 and 0.80, respectively. The area under the ROC (AUC) and the accuracy were 0.86 and 0.86, respectively. The classification accuracy was less than 60% for other ROI models or when ROIs less than 5cc were included. Conclusion: This study showed that PET-based radiomics features from the mediastinal lymph region are associated with primary refractory disease and therefore may play an important role in predicting outcomes in Hodgkin lymphoma patients.
These features could be additive beyond baseline tumor and clinical characteristics, and may warrant more aggressive treatment.
Accuracy of CNV Detection from GWAS Data.
Zhang, Dandan; Qian, Yudong; Akula, Nirmala; Alliey-Rodriguez, Ney; Tang, Jinsong; Gershon, Elliot S; Liu, Chunyu
2011-01-13
Several computer programs are available for detecting copy number variants (CNVs) using genome-wide SNP arrays. We evaluated the performance of four CNV detection software suites--Birdsuite, Partek, HelixTree, and PennCNV-Affy--in the identification of both rare and common CNVs. Each program's performance was assessed in two ways. The first was its recovery rate, i.e., its ability to call 893 CNVs previously identified in eight HapMap samples by paired-end sequencing of whole-genome fosmid clones, and 51,440 CNVs identified by array Comparative Genome Hybridization (aCGH) followed by validation procedures, in 90 HapMap CEU samples. The second evaluation was program performance calling rare and common CNVs in the Bipolar Genome Study (BiGS) data set (1001 bipolar cases and 1033 controls, all of European ancestry) as measured by the Affymetrix SNP 6.0 array. Accuracy in calling rare CNVs was assessed by positive predictive value, based on the proportion of rare CNVs validated by quantitative real-time PCR (qPCR), while accuracy in calling common CNVs was assessed by false positive/false negative rates based on qPCR validation results from a subset of common CNVs. Birdsuite recovered the highest percentages of known HapMap CNVs containing >20 markers in two reference CNV datasets. The recovery rate increased with decreased CNV frequency. In the tested rare CNV data, Birdsuite and Partek had higher positive predictive values than the other software suites. In a test of three common CNVs in the BiGS dataset, Birdsuite's call was 98.8% consistent with qPCR quantification in one CNV region, but the other two regions showed an unacceptable degree of accuracy. We found relatively poor consistency between the two "gold standards," the sequence data of Kidd et al., and aCGH data of Conrad et al. Algorithms for calling CNVs especially common ones need substantial improvement, and a "gold standard" for detection of CNVs remains to be established.
Quality and accuracy assessment of nutrition information on the Web for cancer prevention.
Shahar, Suzana; Shirley, Ng; Noah, Shahrul A
2013-01-01
This study aimed to assess the quality and accuracy of nutrition information about cancer prevention available on the Web. The keywords 'nutrition + diet + cancer + prevention' were submitted to the Google search engine. Out of 400 websites evaluated, 100 met the inclusion and exclusion criteria and were selected as the sample for the assessment of quality and accuracy. Overall, 54% of the studied websites had low quality, 48 and 57% had no author's name or information, respectively, 100% were not updated within 1 month during the study period and 86% did not have the Health on the Net seal. When the websites were assessed for readability using the Flesch Reading Ease test, nearly 44% of the websites were categorised as 'quite difficult'. With regard to accuracy, 91% of the websites did not precisely follow the latest WCRF/AICR 2007 recommendation. The quality scores correlated significantly with the accuracy scores (r = 0.250, p < 0.05). Professional websites (n = 22) had the highest mean quality scores, whereas government websites (n = 2) had the highest mean accuracy scores. The quality of the websites selected in this study was not satisfactory, and there is great concern about the accuracy of the information being disseminated.
Szalma, James L; Teo, Grace W L
2012-03-01
The goal for this study was to test assertions of the dynamic adaptability theory of stress, which proposes two fundamental task dimensions, information rate (temporal properties of a task) and information structure (spatial properties of a task). The theory predicts adaptive stability across stress magnitudes, with progressive and precipitous changes in adaptive response manifesting first as increases in perceived workload and stress and then as performance failure. Information structure was manipulated by varying the number of displays to be monitored (1, 2, 4 or 8 displays). Information rate was manipulated by varying stimulus presentation rate (8, 12, 16, or 20 events/min). A signal detection task was used in which critical signals were pairs of digits that differed by 0 or 1. Performance accuracy declined and workload and stress increased as a function of increased task demand, with a precipitous decline in accuracy at the highest demand levels. However, the form of performance change as well as the pattern of relationships between speed and accuracy and between performance and workload/stress indicates that some aspects of the theory need revision. Implications of the results for the theory and for future research are discussed. Copyright © 2011 Elsevier B.V. All rights reserved.
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
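The gap between resubstitution and cross-validation accuracy described above can be reproduced on toy data. This sketch deliberately swaps in a 1-nearest-neighbour classifier rather than the classification trees used in the study, because it makes the optimism of resubstitution easy to see; the data and function names are hypothetical.

```python
def nearest_label(point, data, skip=None):
    """Label of the training example closest to `point`, optionally ignoring index `skip`."""
    best_label, best_distance = None, float("inf")
    for i, (x, label) in enumerate(data):
        if i == skip:
            continue
        distance = abs(x - point)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


def resubstitution_accuracy(data):
    """Accuracy when the model is scored on the same examples it was built from."""
    return sum(nearest_label(x, data) == y for x, y in data) / len(data)


def leave_one_out_accuracy(data):
    """Accuracy when each example is predicted from all the other examples."""
    hits = sum(nearest_label(x, data, skip=i) == y for i, (x, y) in enumerate(data))
    return hits / len(data)


# Hypothetical one-predictor data set: (environmental value, species status).
toy_data = [
    (1.0, "absent"), (1.4, "absent"), (2.0, "present"),
    (2.3, "absent"), (3.0, "present"), (3.6, "present"),
]

resub = resubstitution_accuracy(toy_data)
loo = leave_one_out_accuracy(toy_data)
print(f"resubstitution: {resub:.2f}, leave-one-out: {loo:.2f}")
```

Here resubstitution is perfect because every point is its own nearest neighbour, while the leave-one-out estimate is lower, which mirrors the abstract's argument for reporting cross-validated rates.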
Feng, Gangyi; Ingvalson, Erin M; Grieco-Calub, Tina M; Roberts, Megan Y; Ryan, Maura E; Birmingham, Patrick; Burrowes, Delilah; Young, Nancy M; Wong, Patrick C M
2018-01-30
Although cochlear implantation enables some children to attain age-appropriate speech and language development, communicative delays persist in others, and outcomes are quite variable and difficult to predict, even for children implanted early in life. To understand the neurobiological basis of this variability, we used presurgical neural morphological data obtained from MRI of individual pediatric cochlear implant (CI) candidates implanted younger than 3.5 years to predict variability of their speech-perception improvement after surgery. We first compared neuroanatomical density and spatial pattern similarity of CI candidates to that of age-matched children with normal hearing, which allowed us to detail neuroanatomical networks that were either affected or unaffected by auditory deprivation. This information enables us to build machine-learning models to predict the individual children's speech development following CI. We found that regions of the brain that were unaffected by auditory deprivation, in particular the auditory association and cognitive brain regions, produced the highest accuracy, specificity, and sensitivity in patient classification and the most precise prediction results. These findings suggest that brain areas unaffected by auditory deprivation are critical to developing closer to typical speech outcomes. Moreover, the findings suggest that determination of the type of neural reorganization caused by auditory deprivation before implantation is valuable for predicting post-CI language outcomes for young children.
Plasma neutrophil gelatinase-associated lipocalin: a marker of acute pyelonephritis in children.
Kim, Byung Kwan; Yim, Hyung Eun; Yoo, Kee Hwan
2017-03-01
This study was designed to compare the diagnostic accuracy of plasma neutrophil gelatinase-associated lipocalin (NGAL) with procalcitonin (PCT), C-reactive protein (CRP), and white blood cells (WBCs) for predicting acute pyelonephritis (APN) in children with febrile urinary tract infections (UTIs). In total, 138 children with febrile UTIs (APN 59, lower UTI 79) were reviewed retrospectively. Levels of NGAL, PCT, CRP, and WBCs in blood were measured on admission. The diagnostic accuracy of the biomarkers was investigated. Independent predictors of APN were identified by multivariate logistic regression analysis. Receiver operating characteristic (ROC) curve analyses showed good diagnostic profiles of NGAL, PCT, CRP, and WBCs for identifying APN [area under the curve (AUC) 0.893, 0.855, 0.879, and 0.654, respectively]. However, multivariate analysis revealed only plasma NGAL level was an independent predictor of APN (P = 0.006). At the best cutoff values of all examined biomarkers for identifying APN, sensitivity (86 %), specificity (85 %), positive predictive value (81 %), and negative predictive value (89 %) of plasma NGAL levels were the highest. The optimal NGAL cutoff value was 117 ng/ml. The positive likelihood ratio [odds ratio (OR) 5.69, 95 % confidence interval (CI) 3.56-8.78], and negative likelihood ratio (OR 0.16, 95 % CI 0.08-0.29) of plasma NGAL for APN diagnosis also suggested that it was more accurate than serum PCT, CRP, and WBCs. Plasma NGAL can be more useful than serum PCT, CRP, and WBC levels for identifying APN in children with febrile UTIs.
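Likelihood ratios follow directly from sensitivity and specificity: LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below applies these standard formulas to the rounded NGAL values reported above; the function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) from sensitivity and specificity."""
    if not (0 < specificity < 1 and 0 < sensitivity <= 1):
        raise ValueError("sensitivity must be in (0, 1] and specificity in (0, 1)")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded values from the abstract: sensitivity 86 %, specificity 85 %.
lr_pos, lr_neg = likelihood_ratios(0.86, 0.85)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

The results are close to the reported 5.69 and 0.16; the small difference in LR+ is presumably because the abstract gives sensitivity and specificity rounded to whole percentages.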
Mapping Migratory Bird Prevalence Using Remote Sensing Data Fusion
Swatantran, Anu; Dubayah, Ralph; Goetz, Scott; Hofton, Michelle; Betts, Matthew G.; Sun, Mindy; Simard, Marc; Holmes, Richard
2012-01-01
Background: Improved maps of species distributions are important for effective management of wildlife under increasing anthropogenic pressures. Recent advances in lidar and radar remote sensing have shown considerable potential for mapping forest structure and habitat characteristics across landscapes. However, their relative efficacies and integrated use in habitat mapping remain largely unexplored. We evaluated the use of lidar, radar and multispectral remote sensing data in predicting multi-year bird detections or prevalence for 8 migratory songbird species in the unfragmented temperate deciduous forests of New Hampshire, USA. Methodology and Principal Findings: A set of 104 predictor variables describing vegetation vertical structure and variability from lidar, phenology from multispectral data and backscatter properties from radar data were derived. We tested the accuracies of these variables in predicting prevalence using Random Forests regression models. All data sets showed more than 30% predictive power with radar models having the lowest and multi-sensor synergy (“fusion”) models having highest accuracies. Fusion explained between 54% and 75% variance in prevalence for all the birds considered. Stem density from discrete return lidar and phenology from multispectral data were among the best predictors. Further analysis revealed different relationships between the remote sensing metrics and bird prevalence. Spatial maps of prevalence were consistent with known habitat preferences for the bird species. Conclusion and Significance: Our results highlight the potential of integrating multiple remote sensing data sets using machine-learning methods to improve habitat mapping. Multi-dimensional habitat structure maps such as those generated from this study can significantly advance forest management and ecological research by facilitating fine-scale studies at both stand and landscape level. PMID:22235254
Mapping soil particle-size fractions: A comparison of compositional kriging and log-ratio kriging
NASA Astrophysics Data System (ADS)
Wang, Zong; Shi, Wenjiao
2017-03-01
Soil particle-size fractions (psf), as basic physical variables, frequently need to be accurately predicted for regional hydrological, ecological, geological, agricultural and environmental studies. Some methods have been proposed to interpolate the spatial distributions of soil psf, but the performance of compositional kriging and different log-ratio kriging methods is still unclear. Four log-ratio transformations, including additive log-ratio (alr), centered log-ratio (clr), isometric log-ratio (ilr), and symmetry log-ratio (slr), combined with ordinary kriging (log-ratio kriging: alr_OK, clr_OK, ilr_OK and slr_OK) were selected to be compared with compositional kriging (CK) for the spatial prediction of soil psf in Tianlaochi of Heihe River Basin, China. Root mean squared error (RMSE), Aitchison's distance (AD), standardized residual sum of squares (STRESS) and right ratio of the predicted soil texture types (RR) were chosen to evaluate the accuracy for different interpolators. The results showed that CK had a better accuracy than the four log-ratio kriging methods. The RMSE (sand, 9.27%; silt, 7.67%; clay, 4.17%), AD (0.45) and STRESS (0.60) of CK were the lowest and the RR (58.65%) was the highest among the five interpolators. The clr_OK achieved relatively better performance than the other log-ratio kriging methods. In addition, CK presented reasonable and smooth transition on mapping soil psf according to the environmental factors. The study gives insights for mapping soil psf accurately by comparing different methods for compositional data interpolation. Further research on methods combined with ancillary variables is needed to improve the interpolation performance.
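Two of the log-ratio transformations named above, and Aitchison's distance, can be written in a few lines. This is a minimal sketch using the standard definitions (clr subtracts the mean log, alr divides by a chosen reference part, and Aitchison's distance is the Euclidean distance between clr vectors); the function names and the sand/silt/clay values are hypothetical, and all parts are assumed to be strictly positive.

```python
import math


def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alr(composition):
    """Additive log-ratio, using the last part as the reference denominator."""
    reference = composition[-1]
    return [math.log(part / reference) for part in composition[:-1]]


def aitchison_distance(a, b):
    """Euclidean distance between the clr-transformed compositions."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(clr(a), clr(b))))


# Hypothetical (sand, silt, clay) fractions at one validation site.
observed_psf = [0.40, 0.40, 0.20]
predicted_psf = [0.50, 0.35, 0.15]

distance = aitchison_distance(observed_psf, predicted_psf)
print(f"clr(observed) = {[round(v, 3) for v in clr(observed_psf)]}")
print(f"alr(observed) = {[round(v, 3) for v in alr(observed_psf)]}")
print(f"Aitchison distance = {distance:.3f}")
```

Because clr values depend only on ratios between parts, the distance is the same whether the fractions are expressed as proportions or as percentages.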
Salgar, Avinash Ramchandra; Singh, Shishir H; Podar, Rajesh S; Kulkarni, Gaurav P; Babel, Shashank N
2017-01-01
Pulp sensitivity testing, even with its limitations and shortcomings, has been and still remains a very helpful aid in endodontic diagnosis. Pulp sensitivity tests extrapolate pulpal health from the sensory response. The aim of the present study was to identify the sensitivity, specificity, positive and negative predictive values (NPVs) of thermal and electrical tests of pulp sensitivity. The pulp tests studied were two cold tests, two heat tests, and an electrical test. A total of 330 teeth were tested: 198 teeth with vital pulp and 132 teeth with necrotic pulps (disease prevalence of 40%). The ideal standard was established by observing bleeding within the pulp chamber. Sensitivity values of the diagnostic tests were 0.89 and 0.94 for the cold tests, 0.84 and 0.87 for the heat tests, and 0.75 for the electrical pulp test, and the specificity values were 0.91 and 0.93 for the cold tests, 0.86 and 0.84 for the heat tests, and 0.90 for the electrical pulp test. The NPVs were 0.91 and 0.96 for the cold tests, 0.89 and 0.91 for the heat tests, and 0.84 for the electrical pulp test. The positive predictive values were 0.89 and 0.90 for the cold tests, 0.80 and 0.79 for the heat tests and 0.88 for the electrical pulp test. The highest accuracy (0.9393) was observed with the cold test (icy spray). The cold test done with icy spray was the most accurate method for sensitivity testing.
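Predictive values depend on disease prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' rule to one set of values from the abstract above, assuming that the second figure of each reported pair (sensitivity 0.94, specificity 0.93) belongs to the same cold test and that the 40% prevalence refers to necrotic pulps (132 of 330 teeth); both pairings are assumptions, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


prevalence = 132 / 330  # necrotic teeth among all tested teeth
ppv, npv = predictive_values(0.94, 0.93, prevalence)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumptions the computed values round to 0.90 and 0.96, matching the second cold-test figures in the abstract. The same sensitivity and specificity would give a lower PPV in a population where necrotic pulps are rarer.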
Pesek, Todd; Abramiuk, Marc; Garagic, Denis; Fini, Nick; Meerman, Jan; Cal, Victor
2009-03-01
Ethnobotanical surveys were conducted to locate culturally important, regionally scarce, and disappearing medicinal plants via a novel participatory methodology which involves healer-expert knowledge in interactive spatial modeling to prioritize conservation efforts and thus facilitate health promotion via medicinal plant resource sustained availability. These surveys, conducted in the Maya Mountains, Belize, generate ethnobotanical, ecological, and geospatial data on species which are used by Q'eqchi' Maya healers in practice. Several of these mountainous species are regionally scarce and the healers are expressing difficulties in finding them for use in promotion of community health and wellness. Based on healers' input, zones of highest probability for locating regionally scarce, disappearing, and culturally important plants in their ecosystem niches can be facilitated by interactive modeling. In the present study, this is begun by choosing three representative species to train an interactive predictive model. Model accuracy was then assessed statistically by testing for independence between predicted occurrence and actual occurrence of medicinal plants. A high level of accuracy was achieved using a small set of exemplar data. This work demonstrates the potential of combining ethnobotany and botanical spatial information with indigenous ecosystems concepts and Q'eqchi' Maya healing knowledge via predictive modeling. Through this approach, we may identify regions where species are located and accordingly promote for prioritization and application of in situ and ex situ conservation strategies to protect them. This represents a significant step toward facilitating sustained culturally relative health promotion as well as overall enhanced ecological integrity to the region and the earth.
Mapping migratory bird prevalence using remote sensing data fusion.
Swatantran, Anu; Dubayah, Ralph; Goetz, Scott; Hofton, Michelle; Betts, Matthew G; Sun, Mindy; Simard, Marc; Holmes, Richard
2012-01-01
Improved maps of species distributions are important for effective management of wildlife under increasing anthropogenic pressures. Recent advances in lidar and radar remote sensing have shown considerable potential for mapping forest structure and habitat characteristics across landscapes. However, their relative efficacies and integrated use in habitat mapping remain largely unexplored. We evaluated the use of lidar, radar and multispectral remote sensing data in predicting multi-year bird detections or prevalence for 8 migratory songbird species in the unfragmented temperate deciduous forests of New Hampshire, USA. A set of 104 predictor variables describing vegetation vertical structure and variability from lidar, phenology from multispectral data and backscatter properties from radar data were derived. We tested the accuracies of these variables in predicting prevalence using Random Forests regression models. All data sets showed more than 30% predictive power with radar models having the lowest and multi-sensor synergy ("fusion") models having highest accuracies. Fusion explained between 54% and 75% variance in prevalence for all the birds considered. Stem density from discrete return lidar and phenology from multispectral data were among the best predictors. Further analysis revealed different relationships between the remote sensing metrics and bird prevalence. Spatial maps of prevalence were consistent with known habitat preferences for the bird species. Our results highlight the potential of integrating multiple remote sensing data sets using machine-learning methods to improve habitat mapping. Multi-dimensional habitat structure maps such as those generated from this study can significantly advance forest management and ecological research by facilitating fine-scale studies at both stand and landscape level.
Final Technical Report: Increasing Prediction Accuracy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
King, Bruce Hardison; Hansen, Clifford; Stein, Joshua
2015-12-01
PV performance models are used to quantify the value of PV plants in a given location. They combine the performance characteristics of the system, the measured or predicted irradiance and weather at a site, and the system configuration and design into a prediction of the amount of energy that will be produced by a PV system. These predictions must be as accurate as possible in order for finance charges to be minimized. Higher accuracy equals lower project risk. The Increasing Prediction Accuracy project at Sandia focuses on quantifying and reducing uncertainties in PV system performance models.
Correa, Katharina; Bangera, Rama; Figueroa, René; Lhorente, Jean P; Yáñez, José M
2017-01-31
Sea lice infestations caused by Caligus rogercresseyi are a main concern to the salmon farming industry due to associated economic losses. Resistance to this parasite was shown to have low to moderate genetic variation and its genetic architecture was suggested to be polygenic. The aim of this study was to compare accuracies of breeding value predictions obtained with pedigree-based best linear unbiased prediction (P-BLUP) methodology against different genomic prediction approaches: genomic BLUP (G-BLUP), Bayesian Lasso, and Bayes C. To achieve this, 2404 individuals from 118 families were measured for C. rogercresseyi count after a challenge and genotyped using 37 K single nucleotide polymorphisms. Accuracies were assessed using fivefold cross-validation and SNP densities of 0.5, 1, 5, 10, 25 and 37 K. Accuracy of genomic predictions increased with increasing SNP density and was higher than pedigree-based BLUP predictions by up to 22%. Both Bayesian and G-BLUP methods can predict breeding values with higher accuracies than pedigree-based BLUP, however, G-BLUP may be the preferred method because of reduced computation time and ease of implementation. A relatively low marker density (i.e. 10 K) is sufficient for maximal increase in accuracy when using G-BLUP or Bayesian methods for genomic prediction of C. rogercresseyi resistance in Atlantic salmon.
Wren, Christopher; Vogel, Melanie; Lord, Stephen; Abrams, Dominic; Bourke, John; Rees, Philip; Rosenthal, Eric
2012-02-01
The aim of this study was to examine the accuracy in predicting pathway location in children with Wolff-Parkinson-White syndrome for each of seven published algorithms. ECGs from 100 consecutive children with Wolff-Parkinson-White syndrome undergoing electrophysiological study were analysed by six investigators using seven published algorithms, six of which had been developed in adult patients. Accuracy and concordance of predictions were adjusted for the number of pathway locations. Accessory pathways were left-sided in 49, septal in 20 and right-sided in 31 children. Overall accuracy of prediction was 30-49% for the exact location and 61-68% including adjacent locations. Concordance between investigators varied between 41% and 86%. No algorithm was better at predicting septal pathways (accuracy 5-35%, improving to 40-78% including adjacent locations), but one was significantly worse. Predictive accuracy was 24-53% for the exact location of right-sided pathways (50-71% including adjacent locations) and 32-55% for the exact location of left-sided pathways (58-73% including adjacent locations). All algorithms were less accurate in our hands than in other authors' own assessment. None performed well in identifying midseptal or right anteroseptal accessory pathway locations.
Comparison of the biometric formulas used for applanation A-scan ultrasound biometry.
Özcura, Fatih; Aktaş, Serdar; Sağdık, Hacı Murat; Tetikoğlu, Mehmet
2016-10-01
The purpose of the study was to compare the accuracy of various biometric formulas for predicting postoperative refraction determined using applanation A-scan ultrasound. This retrospective comparative study included 485 eyes that underwent uneventful phacoemulsification with intraocular lens (IOL) implantation. Applanation A-scan ultrasound biometry and postoperative manifest refraction were obtained in all eyes. Biometric data were entered into each of the five IOL power calculation formulas: SRK-II, SRK/T, Holladay I, Hoffer Q, and Binkhorst II. All eyes were divided into three groups according to axial length: short (≤22.0 mm), average (22.0-25.0 mm), and long (≥25.0 mm) eyes. The postoperative spherical equivalent was calculated and compared with the predicted refractive error using each biometric formula. The results showed that all formulas had significantly lower mean absolute error (MAE) in comparison with Binkhorst II formula (P < 0.01). The lowest MAE was obtained with the SRK-II for average (0.49 ± 0.40 D) and short (0.67 ± 0.54 D) eyes and the SRK/T for long (0.61 ± 0.50 D) eyes. The highest postoperative hyperopic shift was seen with the SRK-II for average (46.8 %), short (28.1 %), and long (48.4 %) eyes. The highest postoperative myopic shift was seen with the Holladay I for average (66.4 %) and long (71.0 %) eyes and the SRK/T for short eyes (80.6 %). In conclusion, the SRK-II formula produced the lowest MAE in average and short eyes and the SRK/T formula produced the lowest MAE in long eyes. The SRK-II has the highest postoperative hyperopic shift in all eyes. The highest postoperative myopic shift is with the Holladay I for average and long eyes and SRK/T for short eyes.
Hellmuth, Christian; Weber, Martina; Koletzko, Berthold; Peissner, Wolfgang
2012-02-07
Despite their central importance for lipid metabolism, straightforward quantitative methods for determination of nonesterified fatty acid (NEFA) species are still missing. The protocol presented here provides unbiased quantitation of plasma NEFA species by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Simple deproteination of plasma in organic solvent solution yields high accuracy, including both the unbound and initially protein-bound fractions, while avoiding interferences from hydrolysis of esterified fatty acids from other lipid classes. Sample preparation is fast and nonexpensive, hence well suited for automation and high-throughput applications. Separation of isotopologic NEFA is achieved using ultrahigh-performance liquid chromatography (UPLC) coupled to triple quadrupole LC-MS/MS detection. In combination with automated liquid handling, total assay time per sample is less than 15 min. The analytical spectrum extends beyond readily available NEFA standard compounds by a regression model predicting all the relevant analytical parameters (retention time, ion path settings, and response factor) of NEFA species based on chain length and number of double bonds. Detection of 50 NEFA species and accurate quantification of 36 NEFA species in human plasma is described, the highest numbers ever reported for a LC-MS application. Accuracy and precision are within widely accepted limits. The use of qualifier ions supports unequivocal analyte verification. © 2012 American Chemical Society
Forecasting malaria in a highly endemic country using environmental and clinical predictors.
Zinszer, Kate; Kigozi, Ruth; Charland, Katia; Dorsey, Grant; Brewer, Timothy F; Brownstein, John S; Kamya, Moses R; Buckeridge, David L
2015-06-18
Malaria thrives in poor tropical and subtropical countries where local resources are limited. Accurate disease forecasts can provide public and clinical health services with the information needed to implement targeted approaches for malaria control that make effective use of limited resources. The objective of this study was to determine the relevance of environmental and clinical predictors of malaria across different settings in Uganda. Forecasting models were based on health facility data collected by the Uganda Malaria Surveillance Project and satellite-derived rainfall, temperature, and vegetation estimates from 2006 to 2013. Facility-specific forecasting models of confirmed malaria were developed using multivariate autoregressive integrated moving average models and produced weekly forecast horizons over a 52-week forecasting period. The model with the most accurate forecasts varied by site and by forecast horizon. Clinical predictors were retained in the models with the highest predictive power for all facility sites. The average error over the 52 forecasting horizons ranged from 26 to 128% whereas the cumulative burden forecast error ranged from 2 to 22%. Clinical data, such as drug treatment, could be used to improve the accuracy of malaria predictions in endemic settings when coupled with environmental predictors. Further exploration of malaria forecasting is necessary to improve its accuracy and value in practice, including examining other environmental and intervention predictors, including insecticide-treated nets.
Wang, Jing-Jing; Wu, Hai-Feng; Sun, Tao; Li, Xia; Wang, Wei; Tao, Li-Xin; Huo, Da; Lv, Ping-Xin; He, Wen; Guo, Xiu-Hua
2013-01-01
Lung cancer, one of the leading causes of cancer-related deaths, usually appears as solitary pulmonary nodules (SPNs) which are hard to diagnose using the naked eye. In this paper, curvelet-based textural features and clinical parameters are used with three prediction models [a multilevel model, a least absolute shrinkage and selection operator (LASSO) regression method, and a support vector machine (SVM)] to improve the diagnosis of benign and malignant SPNs. Dimensionality reduction of the original curvelet-based textural features was achieved using principal component analysis. In addition, non-conditional logistical regression was used to find clinical predictors among demographic parameters and morphological features. The results showed that, combined with 11 clinical predictors, the accuracy rates using 12 principal components were higher than those using the original curvelet-based textural features. To evaluate the models, 10-fold cross validation and back substitution were applied. The results obtained, respectively, were 0.8549 and 0.9221 for the LASSO method, 0.9443 and 0.9831 for SVM, and 0.8722 and 0.9722 for the multilevel model. All in all, it was found that using curvelet-based textural features after dimensionality reduction and using clinical predictors, the highest accuracy rate was achieved with SVM. The method may be used as an auxiliary tool to differentiate between benign and malignant SPNs in CT images.
Vorberg, Susann; Tetko, Igor V
2014-01-01
Biodegradability describes the capacity of substances to be mineralized by free-living bacteria. It is a crucial property in estimating a compound's long-term impact on the environment. The ability to reliably predict biodegradability would reduce the need for laborious experimental testing. However, this endpoint is difficult to model due to unavailability or inconsistency of experimental data. Our approach makes use of the Online Chemical Modeling Environment (OCHEM) and its rich supply of machine learning methods and descriptor sets to build classification models for ready biodegradability. These models were analyzed to determine the relationship between characteristic structural properties and biodegradation activity. The distinguishing feature of the developed models is their ability to estimate the accuracy of prediction for each individual compound. The models developed using seven individual descriptor sets were combined in a consensus model, which provided the highest accuracy. The identified overrepresented structural fragments can be used by chemists to improve the biodegradability of new chemical compounds. The consensus model, the datasets used, and the calculated structural fragments are publicly available at http://ochem.eu/article/31660. © 2014 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Colgan, Matthew S; Martin, Roberta E; Baldeck, Claire A; Asner, Gregory P
2015-01-01
Understanding the relative importance of environment and life history strategies in determining leaf chemical traits remains a key objective of plant ecology. We assessed 20 foliar chemical properties among 12 African savanna woody plant species and their relation to environmental variables (hillslope position, precipitation, geology) and two functional traits (thorn type and seed dispersal mechanism). We found that combinations of six leaf chemical traits (lignin, hemi-cellulose, zinc, boron, magnesium, and manganese) predicted the species with 91% accuracy. Hillslope position, precipitation, and geology accounted for only 12% of the total variance in these six chemical traits. However, thorn type and seed dispersal mechanism accounted for 46% of variance in these chemical traits. The physically defended species had the highest concentrations of hemi-cellulose and boron. Species without physical defense had the highest lignin content if dispersed by vertebrates, but threefold lower lignin content if dispersed by wind. One of the most abundant woody species in southern Africa, Colophospermum mopane, was found to have the highest foliar concentrations of zinc, phosphorus, and δ(13)C, suggesting that zinc chelation may be used by this species to bind metallic toxins and increase uptake of soil phosphorus. Across all studied species, taxonomy and physical traits accounted for the majority of variability in leaf chemistry.
Nishikawa, Hiroki; Nishijima, Norihiro; Enomoto, Hirayuki; Sakamoto, Azusa; Nasu, Akihiro; Komekado, Hideyuki; Nishimura, Takashi; Kita, Ryuichi; Kimura, Toru; Iijima, Hiroko; Nishiguchi, Shuhei; Osaki, Yukio
2017-01-01
To investigate variables before sorafenib therapy on the clinical outcomes in hepatocellular carcinoma (HCC) patients receiving sorafenib and to further assess and compare the predictive performance of continuous parameters using time-dependent receiver operating characteristics (ROC) analysis. A total of 225 HCC patients were analyzed. We retrospectively examined factors related to overall survival (OS) and progression free survival (PFS) using univariate and multivariate analyses. Subsequently, we performed time-dependent ROC analysis of continuous parameters which were significant in the multivariate analysis in terms of OS and PFS. Total sum of area under the ROC in all time points (defined as TAAT score) in each case was calculated. Our cohort included 175 male and 50 female patients (median age, 72 years) and included 158 Child-Pugh A and 67 Child-Pugh B patients. The median OS time was 0.68 years, while the median PFS time was 0.24 years. On multivariate analysis, gender, body mass index (BMI), Child-Pugh classification, extrahepatic metastases, tumor burden, aspartate aminotransferase (AST) and alpha-fetoprotein (AFP) were identified as significant predictors of OS and ECOG-performance status, Child-Pugh classification and extrahepatic metastases were identified as significant predictors of PFS. Among three continuous variables (i.e., BMI, AST and AFP), AFP had the highest TAAT score for the entire cohort. In subgroup analyses, AFP had the highest TAAT score except for Child-Pugh B and female among three continuous variables. In continuous variables, AFP could have higher predictive accuracy for survival in HCC patients undergoing sorafenib therapy.
Heidaritabar, M; Wolc, A; Arango, J; Zeng, J; Settar, P; Fulton, J E; O'Sullivan, N P; Bastiaansen, J W M; Fernando, R L; Garrick, D J; Dekkers, J C M
2016-10-01
Most genomic prediction studies fit only additive effects in models to estimate genomic breeding values (GEBV). However, if dominance genetic effects are an important source of variation for complex traits, accounting for them may improve the accuracy of GEBV. We investigated the effect of fitting dominance and additive effects on the accuracy of GEBV for eight egg production and quality traits in a purebred line of brown layers using pedigree or genomic information (42K single-nucleotide polymorphism (SNP) panel). Phenotypes were corrected for the effect of hatch date. Additive and dominance genetic variances were estimated using genomic-based [genomic best linear unbiased prediction (GBLUP)-REML and BayesC] and pedigree-based (PBLUP-REML) methods. Breeding values were predicted using a model that included both additive and dominance effects and a model that included only additive effects. The reference population consisted of approximately 1800 animals hatched between 2004 and 2009, while approximately 300 young animals hatched in 2010 were used for validation. Accuracy of prediction was computed as the correlation between phenotypes and estimated breeding values of the validation animals divided by the square root of the estimate of heritability in the whole population. The proportion of dominance variance to total phenotypic variance ranged from 0.03 to 0.22 with PBLUP-REML across traits, from 0 to 0.03 with GBLUP-REML and from 0.01 to 0.05 with BayesC. Accuracies of GEBV ranged from 0.28 to 0.60 across traits. Inclusion of dominance effects did not improve the accuracy of GEBV, and differences in their accuracies between genomic-based methods were small (0.01-0.05), with GBLUP-REML yielding higher prediction accuracies than BayesC for egg production, egg colour and yolk weight, while BayesC yielded higher accuracies than GBLUP-REML for the other traits. 
In conclusion, fitting dominance effects did not impact accuracy of genomic prediction of breeding values in this population. © 2016 Blackwell Verlag GmbH.
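The accuracy measure described in this abstract (correlation between phenotypes and estimated breeding values of the validation animals, divided by the square root of heritability) can be written as a small sketch. The data and the heritability value below are hypothetical, and the function names are illustrative rather than taken from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def prediction_accuracy(phenotypes, breeding_values, heritability):
    """Correlation of phenotype with EBV, scaled by sqrt(heritability)."""
    return pearson(phenotypes, breeding_values) / math.sqrt(heritability)

# Hypothetical corrected phenotypes and estimated breeding values.
phenotypes = [51.0, 48.5, 53.2, 49.9, 50.4, 52.1]
breeding_values = [0.4, -0.3, 0.6, 0.2, -0.1, 0.3]
print(prediction_accuracy(phenotypes, breeding_values, heritability=0.36))
```

Dividing by the square root of heritability rescales the correlation with the noisy phenotype toward a correlation with the unobserved true breeding value, which is why estimates can exceed the raw correlation.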
Hidalgo, A M; Bastiaansen, J W M; Lopes, M S; Veroneze, R; Groenen, M A M; de Koning, D-J
2015-07-01
Genomic selection is applied to dairy cattle breeding to improve the genetic progress of purebred (PB) animals, whereas in pigs and poultry the target is a crossbred (CB) animal for which a different strategy appears to be needed. The source of information used to estimate the breeding values, i.e., using phenotypes of CB or PB animals, may affect the accuracy of prediction. The objective of our study was to assess the direct genomic value (DGV) accuracy of CB and PB pigs using different sources of phenotypic information. Data used were from 3 populations: 2,078 Dutch Landrace-based, 2,301 Large White-based, and 497 crossbreds from an F1 cross between the 2 lines. Two female reproduction traits were analyzed: gestation length (GLE) and total number of piglets born (TNB). Phenotypes used in the analyses originated from offspring of genotyped individuals. Phenotypes collected on CB and PB animals were analyzed as separate traits using a single-trait model. Breeding values were estimated separately for each trait in a pedigree BLUP analysis and subsequently deregressed. Deregressed EBV for each trait originating from different sources (CB or PB offspring) were used to study the accuracy of genomic prediction. Accuracy of prediction was computed as the correlation between DGV and the DEBV of the validation population. Accuracy of prediction within PB populations ranged from 0.43 to 0.62 across GLE and TNB. Accuracies to predict genetic merit of CB animals with one PB population in the training set ranged from 0.12 to 0.28, with the exception of using the CB offspring phenotype of the Dutch Landrace that resulted in an accuracy estimate around 0 for both traits. Accuracies to predict genetic merit of CB animals with both parental PB populations in the training set ranged from 0.17 to 0.30. 
We conclude that prediction within population and trait had good predictive ability regardless of the trait being the PB or CB performance, whereas using PB population(s) to predict genetic merit of CB animals had zero to moderate predictive ability. We observed that the DGV accuracy of CB animals when training on PB data was greater than or equal to training on CB data. However, when results are corrected for the different levels of reliabilities in the PB and CB training data, we showed that training on CB data does outperform PB data for the prediction of CB genetic merit, indicating that more CB animals should be phenotyped to increase the reliability and, consequently, accuracy of DGV for CB genetic merit.
2012-01-01
Background: Previous studies on tumor classification based on gene expression profiles suggest that gene selection plays a key role in improving the classification performance. Moreover, finding important tumor-related genes with the highest accuracy is a very important task because these genes might serve as tumor biomarkers, which is of great benefit to not only tumor molecular diagnosis but also drug development. Results: This paper proposes a novel gene selection method with rich biomedical meaning based on Heuristic Breadth-first Search Algorithm (HBSA) to find as many optimal gene subsets as possible. Due to the curse of dimensionality, this type of method could suffer from over-fitting and selection bias problems. To address these potential problems, a HBSA-based ensemble classifier is constructed using majority voting strategy from individual classifiers constructed by the selected gene subsets, and a novel HBSA-based gene ranking method is designed to find important tumor-related genes by measuring the significance of genes using their occurrence frequencies in the selected gene subsets. The experimental results on nine tumor datasets including three pairs of cross-platform datasets indicate that the proposed method can not only obtain better generalization performance but also find many important tumor-related genes. Conclusions: It is found that the frequencies of the selected genes follow a power-law distribution, indicating that only a few top-ranked genes can be used as potential diagnosis biomarkers. Moreover, the top-ranked genes leading to very high prediction accuracy are closely related to specific tumor subtype and even hub genes. Compared with other related methods, the proposed method can achieve higher prediction accuracy with fewer genes. Moreover, they are further justified by analyzing the top-ranked genes in the context of individual gene function, biological pathway, and protein-protein interaction network. PMID:22830977
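Two ideas from this abstract, ranking genes by how often they occur in the selected subsets and combining individual classifiers by majority vote, can be sketched as follows. The gene names, subsets and class labels are hypothetical, and the HBSA search that would produce the subsets is not reproduced here.

```python
from collections import Counter

def rank_genes(gene_subsets):
    """Rank genes by the number of selected subsets in which they occur."""
    counts = Counter(gene for subset in gene_subsets for gene in set(subset))
    # Sort by descending frequency, then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def majority_vote(predictions):
    """Return the label predicted by the most individual classifiers."""
    # On a tie, the label that was seen first wins.
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical gene subsets, each assumed to come from one search run.
subsets = [["GENE_A", "GENE_B"], ["GENE_A", "GENE_C"], ["GENE_A", "GENE_B"]]
print(rank_genes(subsets))

# Hypothetical per-classifier predictions for a single sample.
print(majority_vote(["tumor", "normal", "tumor"]))
```

An odd number of voters avoids ties in the two-class case; with an even number, the tie-breaking rule noted in the comment applies.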
Bayesian spatiotemporal crash frequency models with mixture components for space-time interactions.
Cheng, Wen; Gill, Gurdiljot Singh; Zhang, Yongping; Cao, Zhong
2018-03-01
The traffic safety research has developed spatiotemporal models to explore the variations in the spatial pattern of crash risk over time. Many studies observed notable benefits associated with the inclusion of spatial and temporal correlation and their interactions. However, the safety literature lacks sufficient research for the comparison of different temporal treatments and their interaction with spatial component. This study developed four spatiotemporal models with varying complexity due to the different temporal treatments such as (I) linear time trend; (II) quadratic time trend; (III) Autoregressive-1 (AR-1); and (IV) time adjacency. Moreover, the study introduced a flexible two-component mixture for the space-time interaction which allows greater flexibility compared to the traditional linear space-time interaction. The mixture component allows the accommodation of global space-time interaction as well as the departures from the overall spatial and temporal risk patterns. This study performed a comprehensive assessment of mixture models based on the diverse criteria pertaining to goodness-of-fit, cross-validation and evaluation based on in-sample data for predictive accuracy of crash estimates. The assessment of model performance in terms of goodness-of-fit clearly established the superiority of the time-adjacency specification which was evidently more complex due to the addition of information borrowed from neighboring years, but this addition of parameters allowed significant advantage at posterior deviance which subsequently benefited overall fit to crash data. The Base models were also developed to study the comparison between the proposed mixture and traditional space-time components for each temporal model. The mixture models consistently outperformed the corresponding Base models due to the advantages of much lower deviance. 
For cross-validation comparison of predictive accuracy, linear time trend model was adjudged the best as it recorded the highest value of log pseudo marginal likelihood (LPML). Four other evaluation criteria were considered for typical validation using the same data for model development. Under each criterion, observed crash counts were compared with three types of data containing Bayesian estimated, normal predicted, and model replicated ones. The linear model again performed the best in most scenarios except one case of using model replicated data and two cases involving prediction without including random effects. These phenomena indicated the mediocre performance of linear trend when random effects were excluded for evaluation. This might be due to the flexible mixture space-time interaction which can efficiently absorb the residual variability escaping from the predictable part of the model. The comparison of Base and mixture models in terms of prediction accuracy further bolstered the superiority of the mixture models as the mixture ones generated more precise estimated crash counts across all four models, suggesting that the advantages associated with mixture component at model fit were transferable to prediction accuracy. Finally, the residual analysis demonstrated the consistently superior performance of random effect models which validates the importance of incorporating the correlation structures to account for unobserved heterogeneity. Copyright © 2017 Elsevier Ltd. All rights reserved.
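The log pseudo marginal likelihood (LPML) used above for cross-validation comparison is commonly estimated from posterior samples through the conditional predictive ordinate (CPO). The sketch below uses that standard Monte Carlo estimator; the abstract does not say how the authors computed LPML, so this is an assumption, and the likelihood values are hypothetical.

```python
import math

def lpml(likelihoods):
    """LPML from likelihoods[i][s] = p(y_i | theta_s) over posterior draws s.

    CPO_i is the harmonic mean of the likelihoods of observation i across
    the draws, and LPML is the sum of log(CPO_i). Higher values indicate
    better leave-one-out predictive performance.
    """
    total = 0.0
    for row in likelihoods:
        mean_inverse = sum(1.0 / value for value in row) / len(row)
        total += -math.log(mean_inverse)  # log(CPO_i)
    return total

# Hypothetical likelihoods: 3 observations, 4 posterior draws each.
sample = [
    [0.20, 0.25, 0.22, 0.18],
    [0.05, 0.07, 0.06, 0.04],
    [0.30, 0.28, 0.33, 0.31],
]
print(lpml(sample))
```

For long MCMC chains and very small likelihoods, working with log-likelihoods and a log-sum-exp step would be numerically safer than the direct reciprocals used here.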
ERIC Educational Resources Information Center
Kwon, Heekyung
2011-01-01
The objective of this study is to provide a systematic account of three typical phenomena surrounding absolute accuracy of metacomprehension assessments: (1) the absolute accuracy of predictions is typically quite low; (2) there exist individual differences in absolute accuracy of predictions as a function of reading skill; and (3) postdictions…
Sex estimation from sternal measurements using multidetector computed tomography.
Ekizoglu, Oguzhan; Hocaoglu, Elif; Inci, Ercan; Bilgili, Mustafa Gokhan; Solmaz, Dilek; Erdil, Irem; Can, Ismail Ozgur
2014-12-01
We aimed to show the utility and reliability of sternal morphometric analysis for sex estimation. Sex estimation is a very important step in forensic identification. Skeletal surveys are the main methods for sex estimation studies. Morphometric analysis of the sternum may provide highly accurate data for sex discrimination. In this study, morphometric analysis of the sternum was evaluated in 1 mm chest computed tomography scans for sex estimation. Four hundred forty-three subjects (202 female, 241 male, mean age: 44 ± 8.1 years [range: 30-60 years]) were included in the study. Manubrium length (ML), mesosternum length (MSL), Sternebra 1 width (S1W), and Sternebra 3 width (S3W) were measured, and the sternal index (SI) was calculated. Differences between the sexes were evaluated by Student's t-test. Predictive factors of sex were determined by discrimination analysis and receiver operating characteristic (ROC) analysis. Male sternal measurements were significantly higher than female measurements (P < 0.001), while SI was significantly lower in males (P < 0.001). In discrimination analysis, MSL had a high accuracy rate, 80.2% in females and 80.9% in males. MSL also had the best sensitivity (75.9%) and specificity (87.6%) values. Accuracy rates were above 80% in 3 stepwise discrimination analyses for both sexes. Stepwise 1 (ML, MSL, S1W, S3W) had the highest accuracy rate in stepwise discrimination analysis, with 86.1% in females and 83.8% in males. Our study showed that morphometric computed tomography analysis of the sternum might provide important information for sex estimation.
Iannetta, Danilo; Fontana, Federico Y; Maturana, Felipe Mattioni; Inglis, Erin Calaine; Pogliaghi, Silvia; Keir, Daniel A; Murias, Juan M
2018-05-23
The maximal lactate steady state (MLSS) represents the highest exercise intensity at which an elevated blood lactate concentration ([Lac]b) is stabilized above resting values. MLSS quantifies the boundary between the heavy-to-very-heavy intensity domains but its determination is not widely performed due to the number of trials required. This study aimed to: (i) develop a mathematical equation capable of predicting MLSS using variables measured during a single ramp-incremental cycling test and (ii) test the accuracy of the optimized mathematical equation. The predictive MLSS equation was determined by stepwise backward regression analysis of twelve independent variables measured in sixty individuals who had previously performed ramp-incremental exercise and in whom MLSS was known (MLSSobs). Next, twenty-nine different individuals were prospectively recruited to test the accuracy of the equation. These participants performed ramp-incremental exercise to exhaustion and two-to-three 30-min constant-power output cycling bouts with [Lac]b sampled at regular intervals for determination of MLSSobs. Predicted MLSS (MLSSpred) and MLSSobs in both phases of the study were compared by paired t-test, major-axis regression and Bland-Altman analysis. The predictor variables of MLSS were: respiratory compensation point (W·kg⁻¹), peak oxygen uptake (V̇O2peak) (ml·kg⁻¹·min⁻¹) and body mass (kg). MLSSpred was highly correlated with MLSSobs (r=0.93; p<0.01). When this equation was tested on the independent group, MLSSpred was not different from MLSSobs (234±43 vs. 234±44 W; SEE 4.8 W; r=0.99; p<0.01). These data support the validity of the predictive MLSS equation. We advocate its use as a time-efficient alternative to traditional MLSS testing in cycling. Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Inage, Terunaga; Nakajima, Takahiro; Itoga, Sakae; Ishige, Takayuki; Fujiwara, Taiki; Sakairi, Yuichi; Wada, Hironobu; Suzuki, Hidemi; Iwata, Takekazu; Chiyo, Masako; Yoshida, Shigetoshi; Matsushita, Kazuyuki; Yasufuku, Kazuhiro; Yoshino, Ichiro
2018-06-13
The limited negative predictive value of endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA) has often been discussed. The aim of this study was to identify a highly sensitive molecular biomarker for lymph node staging by EBUS-TBNA. Five microRNAs (miRNAs) (miR-200a, miR-200b, miR-200c, miR-141, and let-7e) were selected as biomarker candidates for the detection of nodal metastasis in a miRNA expression analysis. After having established a cutoff level of expression for each marker to differentiate malignant from benign lymph nodes among surgically dissected lymph nodes, the cutoff level was applied to snap-frozen EBUS-TBNA samples. Archived formalin-fixed paraffin-embedded (FFPE) samples rebiopsied by EBUS-TBNA after induction chemoradiotherapy were also analyzed. The expression of all candidate miRNAs was significantly higher in metastatic lymph nodes than in benign ones (p < 0.05) among the surgical samples. miR-200c showed the highest diagnostic yield, with a sensitivity of 95.4% and a specificity of 100%. When the cutoff value for miR-200c was applied to the snap-frozen EBUS-TBNA samples, the sensitivity, specificity, positive predictive value, negative predictive value, and diagnostic accuracy were 97.4, 81.8, 95.0, 90.0, and 94.0%, respectively. For restaging FFPE EBUS-TBNA samples, the sensitivity, specificity, positive predictive value, negative predictive value, and diagnostic accuracy were 100, 60.0, 80.0, 100, and 84.6%, respectively. Among the restaged samples, 4 malignant lymph nodes were false negative by EBUS-TBNA, but they were accurately identified by miR-200c. miR-200c can be used as a highly sensitive molecular staging biomarker that will enhance nodal staging of lung cancer. © 2018 S. Karger AG, Basel.
Accuracies of univariate and multivariate genomic prediction models in African cassava.
Okeke, Uche Godfrey; Akdemir, Deniz; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc
2017-12-04
Genomic selection (GS) promises to accelerate genetic gain in plant breeding programs especially for crop species such as cassava that have long breeding cycles. Practically, to implement GS in cassava breeding, it is necessary to evaluate different GS models and to develop suitable models for an optimized breeding pipeline. In this paper, we compared (1) prediction accuracies from a single-trait (uT) and a multi-trait (MT) mixed model for a single-environment genetic evaluation (Scenario 1), and (2) accuracies from a compound symmetric multi-environment model (uE) parameterized as a univariate multi-kernel model to a multivariate (ME) multi-environment mixed model that accounts for genotype-by-environment interaction for multi-environment genetic evaluation (Scenario 2). For these analyses, we used 16 years of public cassava breeding data for six target cassava traits and a fivefold cross-validation scheme with 10-repeat cycles to assess model prediction accuracies. In Scenario 1, the MT models had higher prediction accuracies than the uT models for all traits and locations analyzed, which amounted to on average a 40% improved prediction accuracy. For Scenario 2, we observed that the ME model had on average (across all locations and traits) a 12% improved prediction accuracy compared to the uE model. We recommend the use of multivariate mixed models (MT and ME) for cassava genetic evaluation. These models may be useful for other plant species.
The Use of Linear Programming for Prediction.
ERIC Educational Resources Information Center
Schnittjer, Carl J.
The purpose of the study was to develop a linear programming model to be used for prediction, test the accuracy of the predictions, and compare the accuracy with that produced by curvilinear multiple regression analysis. (Author)
Nenov, Artur; Mukamel, Shaul; Garavelli, Marco; Rivalta, Ivan
2015-08-11
First-principles simulations of two-dimensional electronic spectroscopy in the ultraviolet region (2DUV) require computationally demanding multiconfigurational approaches that can resolve doubly excited and charge transfer states, the spectroscopic fingerprints of coupled UV-active chromophores. Here, we propose an efficient approach to reduce the computational cost of accurate simulations of 2DUV spectra of benzene, phenol, and their dimer (i.e., the minimal models for studying electronic coupling of UV-chromophores in proteins). We first establish the multiconfigurational recipe with the highest accuracy by comparison with experimental data, providing reference gas-phase transition energies and dipole moments that can be used to construct exciton Hamiltonians involving high-lying excited states. We show that by reducing the active spaces and the number of configuration state functions within restricted active space schemes, the computational cost can be significantly decreased without loss of accuracy in predicting 2DUV spectra. The proposed recipe has been successfully tested on a realistic model proteic system in water. Accounting for line broadening due to thermal and solvent-induced fluctuations allows for direct comparison with experiments.
PPCM: Combing multiple classifiers to improve protein-protein interaction prediction
Yao, Jianzhuang; Guo, Hong; Yang, Xiaohan
2015-08-01
Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), and this method combines output from two PPI prediction tools, GO2PPI and Phyloprof, using the Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross-species PPCM could achieve competitive and even better prediction accuracy compared to the single-species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using the Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.
Weiner, Kevin S; Barnett, Michael A; Witthoft, Nathan; Golarai, Golijeh; Stigliani, Anthony; Kay, Kendrick N; Gomez, Jesse; Natu, Vaidehi S; Amunts, Katrin; Zilles, Karl; Grill-Spector, Kalanit
2018-04-15
The parahippocampal place area (PPA) is a widely studied high-level visual region in the human brain involved in place and scene processing. The goal of the present study was to identify the most probable location of place-selective voxels in medial ventral temporal cortex. To achieve this goal, we first used cortex-based alignment (CBA) to create a probabilistic place-selective region of interest (ROI) from one group of 12 participants. We then tested how well this ROI could predict place selectivity in each hemisphere within a new group of 12 participants. Our results reveal that a probabilistic ROI (pROI) generated from one group of 12 participants accurately predicts the location and functional selectivity in individual brains from a new group of 12 participants, despite between subject variability in the exact location of place-selective voxels relative to the folding of parahippocampal cortex. Additionally, the prediction accuracy of our pROI is significantly higher than that achieved by volume-based Talairach alignment. Comparing the location of the pROI of the PPA relative to published data from over 500 participants, including data from the Human Connectome Project, shows a striking convergence of the predicted location of the PPA and the cortical location of voxels exhibiting the highest place selectivity across studies using various methods and stimuli. Specifically, the most predictive anatomical location of voxels exhibiting the highest place selectivity in medial ventral temporal cortex is the junction of the collateral and anterior lingual sulci. Methodologically, we make this pROI freely available (vpnl.stanford.edu/PlaceSelectivity), which provides a means to accurately identify a functional region from anatomical MRI data when fMRI data are not available (for example, in patient populations). 
Theoretically, we consider different anatomical and functional factors that may contribute to the consistent anatomical location of place selectivity relative to the folding of high-level visual cortex. Copyright © 2017 Elsevier Inc. All rights reserved.
Petta, S; Wong, V W-S; Cammà, C; Hiriart, J-B; Wong, G L-H; Vergniol, J; Chan, A W-H; Di Marco, V; Merrouche, W; Chan, H L-Y; Marra, F; Le-Bail, B; Arena, U; Craxì, A; de Ledinghen, V
2017-09-01
The accuracy of available non-invasive tools for staging severe fibrosis in patients with nonalcoholic fatty liver disease (NAFLD) is still limited. To assess the diagnostic performance of paired or serial combination of non-invasive tools in NAFLD patients. We analysed data from 741 patients with a histological diagnosis of NAFLD. The GGT/PLT, APRI, AST/ALT, BARD, FIB-4, and NAFLD Fibrosis Score (NFS) scores were calculated according to published algorithms. Liver stiffness measurement (LSM) was performed by FibroScan. LSM, NFS and FIB-4 were the best non-invasive tools for staging F3-F4 fibrosis (AUC 0.863, 0.774, and 0.792, respectively), with LSM having the highest sensitivity (90%), and the highest NPV (94%), and NFS and FIB-4 the highest specificity (97% and 93%, respectively), and the highest PPV (73% and 79%, respectively). The paired combination of LSM or NFS with FIB-4 strongly reduced the likelihood of wrongly classified patients (ranging from 2.7% to 2.6%), at the price of a high uncertainty area (ranging from 54.1% to 58.2%), and of a low overall accuracy (ranging from 43% to 39.1%). The serial combination with the second test used in patients in the grey area of the first test and in those with high LSM values (>9.6 KPa) or low NFS or FIB-4 values (<-1.455 and <1.30, respectively) overall increased the diagnostic performance generating an accuracy ranging from 69.8% to 70.1%, an uncertainty area ranging from 18.9% to 20.4% and a rate of wrong classification ranging from 9.2% to 11.3%. The serial combination of LSM with FIB-4/NFS has a good diagnostic accuracy for the non-invasive diagnosis of severe fibrosis in NAFLD. © 2017 John Wiley & Sons Ltd.
An Optimal Current Observer for Predictive Current Controlled Buck DC-DC Converters
Min, Run; Chen, Chen; Zhang, Xiaodong; Zou, Xuecheng; Tong, Qiaoling; Zhang, Qiao
2014-01-01
In digital current mode controlled DC-DC converters, conventional current sensors might not provide isolation at a minimized price, power loss and size. Therefore, a current observer, which can be realized based on the digital circuit itself, is a possible substitute. However, the observed current may diverge due to the parasitic resistors and the forward conduction voltage of the diode. Moreover, the divergence of the observed current will cause steady state errors in the output voltage. In this paper, an optimal current observer is proposed. It achieves the highest observation accuracy by compensating for all the known parasitic parameters. By employing the optimal current observer-based predictive current controller, a buck converter is implemented. The converter has a convergently and accurately observed inductor current, and shows a better transient response than the conventional voltage mode controlled converter. In addition, cost, power loss and size are minimized since the strategy requires no additional hardware for current sensing. The effectiveness of the proposed optimal current observer is demonstrated experimentally. PMID:24854061
Wiegner, T N; Edens, C J; Abaya, L M; Carlson, K M; Lyon-Colbert, A; Molloy, S L
2017-01-30
Spatial and temporal patterns of coastal microbial pollution are not well documented. Our study examined these patterns through measurements of fecal indicator bacteria (FIB), nutrients, and physiochemical parameters in Hilo Bay, Hawai'i, during high and low river flow. >40% of samples tested positive for the human-associated Bacteroides marker, with highest percentages near rivers. Other FIB were also higher near rivers, but only Clostridium perfringens concentrations were related to discharge. During storms, FIB concentrations were three times to an order of magnitude higher, and increased with decreasing salinity and water temperature, and increasing turbidity. These relationships and high spatial resolution data for these parameters were used to create Enterococcus spp. and C. perfringens maps that predicted exceedances with 64% and 95% accuracy, respectively. Mapping microbial pollution patterns and predicting exceedances is a valuable tool that can improve water quality monitoring and aid in visualizing FIB hotspots for management actions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Blanche, Paul; Proust-Lima, Cécile; Loubère, Lucie; Berr, Claudine; Dartigues, Jean-François; Jacqmin-Gadda, Hélène
2015-03-01
Thanks to the growing interest in personalized medicine, joint modeling of longitudinal marker and time-to-event data has recently started to be used to derive dynamic individual risk predictions. Individual predictions are called dynamic because they are updated when information on the subject's health profile grows with time. We focus in this work on statistical methods for quantifying and comparing dynamic predictive accuracy of this kind of prognostic models, accounting for right censoring and possibly competing events. Dynamic area under the ROC curve (AUC) and Brier Score (BS) are used to quantify predictive accuracy. Nonparametric inverse probability of censoring weighting is used to estimate dynamic curves of AUC and BS as functions of the time at which predictions are made. Asymptotic results are established and both pointwise confidence intervals and simultaneous confidence bands are derived. Tests are also proposed to compare the dynamic prediction accuracy curves of two prognostic models. The finite sample behavior of the inference procedures is assessed via simulations. We apply the proposed methodology to compare various prediction models using repeated measures of two psychometric tests to predict dementia in the elderly, accounting for the competing risk of death. Models are estimated on the French Paquid cohort and predictive accuracies are evaluated and compared on the French Three-City cohort. © 2014, The International Biometric Society.
Kirschner, Andreas; Frishman, Dmitrij
2008-10-01
Prediction of beta-turns from amino acid sequences has long been recognized as an important problem in structural bioinformatics due to their frequent occurrence as well as their structural and functional significance. Because various structural features of proteins are intercorrelated, secondary structure information has often been employed as an additional input for machine learning algorithms while predicting beta-turns. Here we present a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN) capable of predicting multiple mutually dependent structural motifs and demonstrate its efficiency in recognizing three aspects of protein structure: beta-turns, beta-turn types, and secondary structure. The advantage of our method compared to other predictors is that it does not require any external input except for sequence profiles because interdependencies between different structural features are taken into account implicitly during the learning process. In a sevenfold cross-validation experiment on a standard test dataset our method exhibits a total prediction accuracy of 77.9% and a Matthews correlation coefficient of 0.45, the highest performance reported so far. It also outperforms other known methods in delineating individual turn types. We demonstrate how simultaneous prediction of multiple targets influences prediction performance on single targets. The MOLEBRNN presented here is a generic method applicable in a variety of research fields where multiple mutually dependent target classes need to be predicted. http://webclu.bio.wzw.tum.de/predator-web/.
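The Matthews correlation coefficient reported in the abstract above can be computed from a binary confusion matrix. The following is a minimal sketch of the standard formula, not the authors' evaluation code; the function name and the example counts are illustrative assumptions and do not come from the study.

```python
import math


def matthews_corrcoef(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here), because the coefficient is undefined in that case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only for illustration (not study data).
mcc = matthews_corrcoef(tp=70, fp=10, fn=30, tn=90)
print(f"MCC = {mcc:.3f}")
```

Unlike raw accuracy, the coefficient stays near zero for a predictor that simply guesses the majority class, which is why it is often quoted alongside accuracy for imbalanced targets such as beta-turns.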
Sakoda, Lori C; Henderson, Louise M; Caverly, Tanner J; Wernli, Karen J; Katki, Hormuzd A
2017-12-01
Risk prediction models may be useful for facilitating effective and high-quality decision-making at critical steps in the lung cancer screening process. This review provides a current overview of published lung cancer risk prediction models and their applications to lung cancer screening and highlights both challenges and strategies for improving their predictive performance and use in clinical practice. Since the 2011 publication of the National Lung Screening Trial results, numerous prediction models have been proposed to estimate the probability of developing or dying from lung cancer or the probability that a pulmonary nodule is malignant. Respective models appear to exhibit high discriminatory accuracy in identifying individuals at highest risk of lung cancer or differentiating malignant from benign pulmonary nodules. However, validation and critical comparison of the performance of these models in independent populations are limited. Little is also known about the extent to which risk prediction models are being applied in clinical practice and influencing decision-making processes and outcomes related to lung cancer screening. Current evidence is insufficient to determine which lung cancer risk prediction models are most clinically useful and how to best implement their use to optimize screening effectiveness and quality. To address these knowledge gaps, future research should be directed toward validating and enhancing existing risk prediction models for lung cancer and evaluating the application of model-based risk calculators and its corresponding impact on screening processes and outcomes.
Bio-knowledge based filters improve residue-residue contact prediction accuracy.
Wozniak, P P; Pelc, J; Skrzypecki, M; Vriend, G; Kotulska, M
2018-05-29
Residue-residue contact prediction through direct coupling analysis has reached impressive accuracy, but yet higher accuracy will be needed to allow for routine modelling of protein structures. One way to improve the prediction accuracy is to filter predicted contacts using knowledge about the particular protein of interest or knowledge about protein structures in general. We focus on the latter and discuss a set of filters that can be used to remove false positive contact predictions. Each filter depends on one or a few cut-off parameters for which the filter performance was investigated. Combining all filters while using default parameters resulted for a test-set of 851 protein domains in the removal of 29% of the predictions of which 92% were indeed false positives. All data and scripts are available from http://comprec-lin.iiar.pwr.edu.pl/FPfilter/. malgorzata.kotulska@pwr.edu.pl. Supplementary data are available at Bioinformatics online.
Duan, Jun; Han, Xiaoli; Bai, Linfu; Zhou, Lintong; Huang, Shicong
2017-02-01
To develop and validate a scale using variables easily obtained at the bedside for prediction of failure of noninvasive ventilation (NIV) in hypoxemic patients. The test cohort comprised 449 patients with hypoxemia who were receiving NIV. This cohort was used to develop a scale that considers heart rate, acidosis, consciousness, oxygenation, and respiratory rate (referred to as the HACOR scale) to predict NIV failure, defined as need for intubation after NIV intervention. The highest possible score was 25 points. To validate the scale, a separate group of 358 hypoxemic patients was enrolled in the validation cohort. The failure rate of NIV was 47.8 and 39.4% in the test and validation cohorts, respectively. In the test cohort, patients with NIV failure had higher HACOR scores at initiation and after 1, 12, 24, and 48 h of NIV than those with successful NIV. At 1 h of NIV the area under the receiver operating characteristic curve was 0.88, showing good predictive power for NIV failure. Using 5 points as the cutoff value, the sensitivity, specificity, positive predictive value, negative predictive value, and diagnostic accuracy for NIV failure were 72.6, 90.2, 87.2, 78.1, and 81.8%, respectively. These results were confirmed in the validation cohort. Moreover, the diagnostic accuracy for NIV failure exceeded 80% in subgroups classified by diagnosis, age, or disease severity and also at 1, 12, 24, and 48 h of NIV. Among patients with NIV failure with a HACOR score of >5 at 1 h of NIV, hospital mortality was lower in those who received intubation at ≤12 h of NIV than in those intubated later [58/88 (66%) vs. 138/175 (79%); p = 0.03]. The HACOR scale variables are easily obtained at the bedside. The scale appears to be an effective way of predicting NIV failure in hypoxemic patients. Early intubation in high-risk patients may reduce hospital mortality.
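The five diagnostic statistics quoted in the abstract above (sensitivity, specificity, positive and negative predictive value, and accuracy) all derive from a single 2x2 table of test result against outcome. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not the HACOR study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic statistics as fractions in [0, 1].

    Assumes every row and column total is non-zero; otherwise a
    ZeroDivisionError is raised.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all with the outcome
        "specificity": tn / (tn + fp),   # true negatives among all without the outcome
        "ppv": tp / (tp + fp),           # outcome present among positive tests
        "npv": tn / (tn + fn),           # outcome absent among negative tests
        "accuracy": (tp + tn) / total,   # all correct calls among all cases
    }


# Hypothetical counts, chosen only for illustration (not study data).
for name, value in diagnostic_metrics(tp=70, fp=10, fn=30, tn=90).items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are properties of the test at a given cutoff, whereas the predictive values also depend on how common the outcome is in the cohort, so they may not transfer between populations with different failure rates.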
Scoring systems for outcome prediction in patients with perforated peptic ulcer.
Thorsen, Kenneth; Søreide, Jon Arne; Søreide, Kjetil
2013-04-10
Patients with perforated peptic ulcer (PPU) often present with acute, severe illness that carries a high risk for morbidity and mortality. Mortality ranges from 3-40% and several prognostic scoring systems have been suggested. The aim of this study was to review the available scoring systems for PPU patients, and to assess whether there is evidence to prefer one to the other. We searched PubMed for the MeSH terms "perforated peptic ulcer", "scoring systems", "risk factors", "outcome prediction", "mortality", "morbidity" and the combinations of these terms. In addition to relevant scores introduced in the past (e.g. Boey score), we included recent studies (published between January 2000 and December 2012) that reported on scoring systems for prediction of morbidity and mortality in PPU patients. A total of ten different scoring systems used to predict outcome in PPU patients were identified: the Boey score, the Hacettepe score, the Jabalpur score, the peptic ulcer perforation (PULP) score, the ASA score, the Charlson comorbidity index, the sepsis score, the Mannheim Peritonitis Index (MPI), the Acute physiology and chronic health evaluation II (APACHE II), the simplified acute physiology score II (SAPS II), the Mortality probability models II (MPM II), the Physiological and Operative Severity Score for the enumeration of Mortality and Morbidity physical sub-score (POSSUM-phys score). Only four of the scores were specifically constructed for PPU patients. In five studies the accuracy of outcome prediction of different scoring systems was evaluated by receiver operating characteristics curve (ROC) analysis, and the corresponding area under the curve (AUC) among studies compared. Considerable variation in performance both between different scores and between different studies was found, with the lowest and highest AUC reported between 0.63 and 0.98, respectively. 
While the Boey score and the ASA score are most commonly used to predict outcome for PPU patients, considerable variations in accuracy for outcome prediction were shown. Other scoring systems are hampered by a lack of validation or by their complexity that precludes routine clinical use. While the PULP score seems promising it needs external validation before widespread use.
Protein contact prediction using patterns of correlation.
Hamilton, Nicholas; Burrage, Kevin; Ragan, Mark A; Huber, Thomas
2004-09-01
We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two "windows" of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. Copyright 2004 Wiley-Liss, Inc.
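The "best L/2" and "best L/10" accuracies in the abstract above are fractions of correct contacts among the top-ranked predictions for a protein of length L. A minimal sketch of that evaluation follows; the function name, data layout, and example scores are assumptions made for illustration and are not taken from the paper.

```python
def top_fraction_accuracy(scores: dict, true_contacts: set,
                          seq_len: int, divisor: int) -> float:
    """Fraction of true contacts among the best seq_len // divisor predictions.

    scores maps a residue pair (i, j) to a predicted contact score;
    higher scores are assumed to mean more confident predictions.
    """
    n_best = max(1, seq_len // divisor)
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_best]
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Hypothetical toy protein of length 10 (not data from the study).
scores = {(1, 5): 0.9, (2, 8): 0.8, (3, 9): 0.7,
          (1, 7): 0.6, (4, 10): 0.5, (2, 6): 0.4}
true_contacts = {(1, 5), (3, 9), (2, 6)}

print(top_fraction_accuracy(scores, true_contacts, seq_len=10, divisor=2))
print(top_fraction_accuracy(scores, true_contacts, seq_len=10, divisor=10))
```

Restricting the evaluation to fewer, higher-confidence predictions (L/10 rather than L/2) typically raises the measured accuracy, consistent with the 21.7% versus 30.7% figures reported above.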
Karzmark, Peter; Deutsch, Gayle K
2018-01-01
This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predictive power and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and comprised the Halstead Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base-rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
Correcting Memory Improves Accuracy of Predicted Task Duration
ERIC Educational Resources Information Center
Roy, Michael M.; Mitten, Scott T.; Christenfeld, Nicholas J. S.
2008-01-01
People are often inaccurate in predicting task duration. The memory bias explanation holds that this error is due to people having incorrect memories of how long previous tasks have taken, and these biased memories cause biased predictions. Therefore, the authors examined the effect on increasing predictive accuracy of correcting memory through…
Appuhamy, Jayasooriya A D R N; France, James; Kebreab, Ermias
2016-09-01
There are several models in the literature for predicting enteric methane (CH4) emissions. These models were often developed on region or country-specific data and may not be able to predict the emissions successfully in every region. The majority of extant models require dry matter intake (DMI) of individual animals, which is not routinely measured. The objectives of this study were to (i) evaluate performance of extant models in predicting enteric CH4 emissions from dairy cows in North America (NA), Europe (EU), and Australia and New Zealand (AUNZ) and (ii) explore the performance using estimated DMI. Forty extant models were challenged on 55, 105, and 52 enteric CH4 measurements (g per lactating cow per day) from NA, EU, and AUNZ, respectively. The models were ranked using root mean square prediction error as a percentage of the average observed value (RMSPE) and concordance correlation coefficient (CCC). A modified model of Nielsen et al. (Acta Agriculturae Scand Section A, 63, 2013, 126) using DMI, and dietary digestible neutral detergent fiber and fatty acid contents as predictor variables, was ranked highest in NA (RMSPE = 13.1% and CCC = 0.78). The gross energy intake-based model of Yan et al. (Livestock Production Science, 64, 2000, 253) and the updated IPCC Tier 2 model were ranked highest in EU (RMSPE = 11.0% and CCC = 0.66) and AUNZ (RMSPE = 15.6% and CCC = 0.75), respectively. DMI of cows in NA and EU was estimated satisfactorily with body weight and fat-corrected milk yield data (RMSPE < 12.0% and CCC > 0.60). Using estimated DMI, the Nielsen et al. (2013) (RMSPE = 12.7 and CCC = 0.79) and Yan et al. (2000) (RMSPE = 13.7 and CCC = 0.50) models still predicted emissions in respective regions well. Enteric CH4 emissions from dairy cows can be predicted successfully (i.e., RMSPE < 15%), if DMI can be estimated with reasonable accuracy (i.e., RMSPE < 10%). © 2016 John Wiley & Sons Ltd.
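The two ranking statistics used in the abstract above can be sketched as follows: RMSPE as the root mean square prediction error expressed as a percentage of the mean observed value, and the concordance correlation coefficient in Lin's standard form. The function names and the example emission values are illustrative assumptions, not the study's code or data.

```python
import math


def rmspe_percent(observed: list, predicted: list) -> float:
    """Root mean square prediction error as a percentage of the observed mean."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return 100.0 * math.sqrt(mse) / (sum(observed) / n)


def concordance_cc(observed: list, predicted: list) -> float:
    """Lin's concordance correlation coefficient (population moments, 1/n)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical CH4 emissions in g per cow per day (not study data).
observed = [400.0, 450.0, 500.0]
predicted = [410.0, 440.0, 510.0]
print(f"RMSPE = {rmspe_percent(observed, predicted):.2f}%")
print(f"CCC = {concordance_cc(observed, predicted):.3f}")
```

The two measures are complementary: RMSPE reports the typical size of the error relative to the mean, while CCC penalizes both poor correlation and systematic offset from the line of perfect agreement.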
NASA Astrophysics Data System (ADS)
Dyar, M. Darby; Giguere, Stephen; Carey, CJ; Boucher, Thomas
2016-12-01
This project examines the causes, effects, and optimization of continuum removal in laser-induced breakdown spectroscopy (LIBS) to produce the best possible prediction accuracy of elemental composition in geological samples. We compare prediction accuracy resulting from several different techniques for baseline removal, including asymmetric least squares (ALS), adaptive iteratively reweighted penalized least squares (Air-PLS), fully automatic baseline correction (FABC), continuous wavelet transformation, median filtering, polynomial fitting, the iterative thresholding Dietrich method, convex hull/rubber band techniques, and a newly-developed technique for Custom baseline removal (BLR). We assess the predictive performance of these methods using partial least-squares analysis for 13 elements of geological interest, expressed as the weight percentages of SiO2, Al2O3, TiO2, FeO, MgO, CaO, Na2O, K2O, and the parts per million concentrations of Ni, Cr, Zn, Mn, and Co. We find that previously published methods for baseline subtraction generally produce equivalent prediction accuracies for major elements. When those pre-existing methods are used, automated optimization of their adjustable parameters is always necessary to wring the best predictive accuracy out of a data set; ideally, it should be done for each individual variable. The new technique of Custom BLR produces significant improvements in prediction accuracy over existing methods across varying geological data sets, instruments, and varying analytical conditions. These results also demonstrate the dual objectives of the continuum removal problem: removing a smooth underlying signal to fit individual peaks (univariate analysis) versus using feature selection to select only those channels that contribute to best prediction accuracy for multivariate analyses. Overall, the current practice of using generalized, one-method-fits-all-spectra baseline removal results in poorer predictive performance for all methods. 
The extra steps needed to optimize baseline removal for each predicted variable and empower multivariate techniques with the best possible input data for optimal prediction accuracy are shown to be well worth the slight increase in necessary computations and complexity.
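As a rough illustration of one of the simpler techniques listed above (median filtering), the sketch below estimates the continuum as a running median and subtracts it. It is not the Custom BLR method described in the abstract; the function names, the window size, and the synthetic spectrum are all assumptions made for illustration.

```python
from statistics import median


def median_baseline(spectrum, half_window):
    """Estimate a smooth baseline as the running median of the spectrum."""
    n = len(spectrum)
    baseline = []
    for i in range(n):
        # Clip the window at the edges of the spectrum.
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline.append(median(spectrum[lo:hi]))
    return baseline


def remove_baseline(spectrum, half_window=5):
    """Subtract the running-median baseline from every channel."""
    baseline = median_baseline(spectrum, half_window)
    return [s - b for s, b in zip(spectrum, baseline)]


# Hypothetical spectrum: flat continuum of 10 counts with one narrow peak.
spectrum = [10.0] * 21
spectrum[10] = 60.0
corrected = remove_baseline(spectrum, half_window=3)
```

The window width plays the role of the adjustable parameter that the abstract says should be optimized per predicted variable: too narrow a window removes part of the peak, too wide a window leaves continuum behind.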
Jiang, Y; Zhao, Y; Rodemann, B; Plieske, J; Kollers, S; Korzun, V; Ebmeyer, E; Argillier, O; Hinze, M; Ling, J; Röder, M S; Ganal, M W; Mette, M F; Reif, J C
2015-03-01
Genome-wide mapping approaches in diverse populations are powerful tools to unravel the genetic architecture of complex traits. The main goals of our study were to investigate the potential and limits to unravel the genetic architecture and to identify the factors determining the accuracy of prediction of the genotypic variation of Fusarium head blight (FHB) resistance in wheat (Triticum aestivum L.) based on data collected with a diverse panel of 372 European varieties. The wheat lines were phenotyped in multi-location field trials for FHB resistance and genotyped with 782 simple sequence repeat (SSR) markers, and 9k and 90k single-nucleotide polymorphism (SNP) arrays. We applied genome-wide association mapping in combination with fivefold cross-validations and observed surprisingly high accuracies of prediction for marker-assisted selection based on the detected quantitative trait loci (QTLs). Using a random sample of markers not selected for marker-trait associations revealed only a slight decrease in prediction accuracy compared with marker-based selection exploiting the QTL information. The same picture was confirmed in a simulation study, suggesting that relatedness is a main driver of the accuracy of prediction in marker-assisted selection of FHB resistance. When the accuracy of prediction of three genomic selection models was contrasted for the three marker data sets, no significant differences in accuracies among marker platforms and genomic selection models were observed. Marker density impacted the accuracy of prediction only marginally. Consequently, genomic selection of FHB resistance can be implemented most cost-efficiently based on low- to medium-density SNP arrays.
Prediction algorithms for urban traffic control
DOT National Transportation Integrated Search
1979-02-01
The objectives of this study are to 1) review and assess the state-of-the-art of prediction algorithms for urban traffic control in terms of their accuracy and application, and 2) determine the prediction accuracy obtainable by examining the performa...
Medium- and Long-term Prediction of LOD Change by the Leap-step Autoregressive Model
NASA Astrophysics Data System (ADS)
Wang, Qijie
2015-08-01
The accuracy of medium- and long-term prediction of length of day (LOD) change based on the combined least-squares and autoregressive (LS+AR) model deteriorates gradually. The leap-step autoregressive (LSAR) model can significantly reduce the edge effect of the observation sequence. In particular, the LSAR model greatly improves the resolution of the signal's low-frequency components, and can therefore improve the efficiency of prediction. In this work, LSAR is used to forecast the LOD change. The LOD series from EOP 08 C04 provided by IERS is modeled by both the LSAR and AR models, and the results of the two models are analyzed and compared. When the prediction length is between 10 and 30 days, the accuracy improvement is less than 10%. When the prediction length exceeds 30 days, the accuracy improves markedly, with a maximum improvement of around 19%. The results show that the LSAR model has higher prediction accuracy and stability in medium- and long-term prediction.
Lee, S Hong; Clark, Sam; van der Werf, Julius H J
2017-01-01
Genomic prediction is emerging in a wide range of fields including animal and plant breeding, risk prediction in human precision medicine and forensics. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consists of information sources with varying degrees of relationship to the target individuals. A reference set can contain both close and distant relatives as well as 'unrelated' individuals from the wider population in the genomic prediction. The various sources of information were modeled as different populations with different effective population sizes (Ne). Both the effective number of chromosome segments (Me) and Ne are considered to be a function of the data used for prediction. We validate our theory with analyses of simulated as well as real data, and illustrate that the variation in genomic relationships with the target is a predictor of the information content of the reference set. With a similar amount of data available for each source, we show that close relatives can have a substantially larger effect on genomic prediction accuracy than lesser related individuals. We also illustrate that when prediction relies on closer relatives, there is less improvement in prediction accuracy with an increase in training data or marker panel density. We release software that can estimate the expected prediction accuracy and power when combining different reference sources with various degrees of relationship to the target, which is useful when planning genomic prediction (before or after collecting data) in animal, plant and human genetics.
On the accuracy of ERS-1 orbit predictions
NASA Technical Reports Server (NTRS)
Koenig, Rolf; Li, H.; Massmann, Franz-Heinrich; Raimondo, J. C.; Rajasenan, C.; Reigber, C.
1993-01-01
Since the launch of ERS-1, the D-PAF (German Processing and Archiving Facility) has regularly provided orbit predictions for the worldwide SLR (Satellite Laser Ranging) tracking network. The weekly distributed orbital elements are so-called tuned IRVs and tuned SAO elements. The tuning procedure, designed to improve the accuracy of the recovery of the orbit at the stations, is discussed based on numerical results. This shows that tuning of elements is essential for ERS-1 with the currently applied tracking procedures. The orbital elements are updated by daily distributed time bias functions. The generation of the time bias function is explained. Problems and numerical results are presented. The time bias function increases the prediction accuracy considerably. Finally, the quality assessment of ERS-1 orbit predictions is described. The accuracy is compiled for about 250 days since launch. The average accuracy lies in the range of 50-100 ms and has considerably improved.
Krendl, Anne C; Rule, Nicholas O; Ambady, Nalini
2014-09-01
Young adults can be surprisingly accurate at making inferences about people from their faces. Although these first impressions have important consequences for both the perceiver and the target, it remains an open question whether first impression accuracy is preserved with age. Specifically, could age differences in impressions toward others stem from age-related deficits in accurately detecting complex social cues? Research on aging and impression formation suggests that young and older adults show relative consensus in their first impressions, but it is unknown whether they differ in accuracy. It has been widely shown that aging disrupts emotion recognition accuracy, and that these impairments may predict deficits in other social judgments, such as detecting deceit. However, it is unclear whether general impression formation accuracy (e.g., emotion recognition accuracy, detecting complex social cues) relies on similar or distinct mechanisms. It is important to examine this question to evaluate how, if at all, aging might affect overall accuracy. Here, we examined whether aging impaired first impression accuracy in predicting real-world outcomes and categorizing social group membership. Specifically, we studied whether emotion recognition accuracy and age-related cognitive decline (which has been implicated in exacerbating deficits in emotion recognition) predict first impression accuracy. Our results revealed that emotion recognition accuracy did not predict first impression accuracy, nor did age-related cognitive decline impair it. These findings suggest that domains of social perception outside of emotion recognition may rely on mechanisms that are relatively unimpaired by aging. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Van den Bosch, T; Valentin, L; Van Schoubroeck, D; Luts, J; Bignardi, T; Condous, G; Epstein, E; Leone, F P; Testa, A C; Van Huffel, S; Bourne, T; Timmerman, D
2012-10-01
To estimate the diagnostic accuracy and interobserver agreement in predicting intracavitary uterine pathology at offline analysis of three-dimensional (3D) ultrasound volumes of the uterus. 3D volumes (unenhanced ultrasound and gel infusion sonography with and without power Doppler, i.e. four volumes per patient) of 75 women presenting with abnormal uterine bleeding at a 'bleeding clinic' were assessed offline by six examiners. The sonologists were asked to provide a tentative diagnosis. A histological diagnosis was obtained by hysteroscopy with biopsy or operative hysteroscopy. Proliferative, secretory or atrophic endometrium was classified as 'normal' histology; endometrial polyps, intracavitary myomas, endometrial hyperplasia and endometrial cancer were classified as 'abnormal' histology. The diagnostic accuracy of the six sonologists with regard to normal/abnormal histology and interobserver agreement were estimated. Intracavitary pathology was diagnosed at histology in 39% of patients. Agreement between the ultrasound diagnosis and the histological diagnosis (normal vs abnormal) ranged from 67 to 83% for the six sonologists. In 45% of cases all six examiners agreed with regard to the presence/absence of intracavitary pathology. The percentage agreement between any two examiners ranged from 65 to 91% (Cohen's κ, 0.31-0.81). The Schouten κ for all six examiners was 0.51 (95% CI, 0.40-0.62), while the highest Schouten κ for any three examiners was 0.69. When analyzing stored 3D ultrasound volumes, agreement between sonologists with regard to classifying the endometrium/uterine cavity as normal or abnormal as well as the diagnostic accuracy varied substantially. Possible actions to improve interobserver agreement and diagnostic accuracy include optimization of image quality and the use of a consistent technique for analyzing the 3D volumes. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
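Cohen's κ, quoted above for pairs of examiners, corrects the raw percentage agreement for the agreement expected by chance. A minimal sketch of the two-rater calculation follows; the ratings are invented for illustration and are not data from the study, and the Schouten κ for more than two raters is not covered.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of cases")
    # Observed agreement: fraction of cases given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of ten cases as abnormal ("A") or normal ("N").
examiner_1 = ["A", "A", "A", "A", "N", "N", "N", "N", "N", "N"]
examiner_2 = ["A", "A", "A", "N", "N", "N", "N", "N", "N", "A"]
kappa = cohens_kappa(examiner_1, examiner_2)
```

In this invented example the raters agree on 8 of 10 cases (80%), chance agreement is 0.52, and κ is about 0.58, which shows how a seemingly high percentage agreement can correspond to only moderate κ.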
A scalable method for computing quadruplet wave-wave interactions
NASA Astrophysics Data System (ADS)
Van Vledder, Gerbrant
2017-04-01
Non-linear four-wave interactions are a key physical process in the evolution of wind generated ocean waves. The present generation operational wave models use the Discrete Interaction Approximation (DIA), but its accuracy is poor. It is now generally acknowledged that the DIA should be replaced with a more accurate method to improve predicted spectral shapes and derived parameters. The search for such a method is challenging as one should find a balance between accuracy and computational requirements. Such a method is presented here in the form of a scalable and adaptive method that can mimic both the time consuming exact Snl4 approach and the fast but inaccurate DIA, and everything in between. The method provides an elegant approach to improve the DIA, not by including more arbitrarily shaped wave number configurations, but by a mathematically consistent reduction of an exact method, viz. the WRT method. The adaptiveness is to adapt the abscissa of the locus integrand in relation to the magnitude of the known terms. The adaptiveness is extended to the highest level of the WRT method to select interacting wavenumber configurations in a hierarchical way in relation to their importance. This adaptiveness results in a speed-up of one to three orders of magnitude depending on the measure of accuracy. This definition of accuracy should not be expressed in terms of the quality of the transfer integral for academic spectra but rather in terms of wave model performance in a dynamic run. This has consequences for the balance between the required accuracy and the computational workload for evaluating these interactions. The performance of the scalable method on different scales is illustrated with results from academic spectra, simple growth curves to more complicated field cases using a 3G-wave model.
Temperature-Dependent Refractive Index of Cleartran® ZnS to Cryogenic Temperatures
NASA Technical Reports Server (NTRS)
Leviton, Doug; Frey, Brad
2013-01-01
First, let's talk about the CHARMS facility at NASA's Goddard Space Flight Center: Cryogenic, High-Accuracy Refraction Measuring System (CHARMS); design features for highest accuracy and precision; technologies we rely on; data products and examples; optical materials for which we've measured cryogenic refractive index.
Posterior Predictive Checks for Conditional Independence between Response Time and Accuracy
ERIC Educational Resources Information Center
Bolsinova, Maria; Tijmstra, Jesper
2016-01-01
Conditional independence (CI) between response time and response accuracy is a fundamental assumption of many joint models for time and accuracy used in educational measurement. In this study, posterior predictive checks (PPCs) are proposed for testing this assumption. These PPCs are based on three discrepancy measures reflecting different…
The microcomputer scientific software series 4: testing prediction accuracy.
H. Michael Rauscher
1986-01-01
A computer program, ATEST, is described in this combination user's guide / programmer's manual. ATEST provides users with an efficient and convenient tool to test the accuracy of predictors. As input ATEST requires observed-predicted data pairs. The output reports the two components of accuracy, bias and precision.
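The abstract does not give ATEST's formulas, so the following is only a sketch under a common convention: bias is taken as the mean prediction error and precision as the sample standard deviation of the errors. The function name and the data pairs are hypothetical.

```python
from statistics import mean, stdev


def bias_and_precision(pairs):
    """Return (bias, precision) for (observed, predicted) pairs.

    Assumed definitions, not taken from ATEST itself:
    bias = mean of (predicted - observed),
    precision = sample standard deviation of those errors.
    """
    errors = [predicted - observed for observed, predicted in pairs]
    return mean(errors), stdev(errors)


# Hypothetical observed-predicted pairs.
pairs = [(10.0, 11.0), (12.0, 12.5), (9.0, 10.5), (14.0, 15.0)]
bias, precision = bias_and_precision(pairs)
```

Here the predictor overestimates by 1.0 on average, with a spread of roughly 0.41 around that systematic error; separating the two components shows whether a predictor needs recalibration (bias) or is simply noisy (poor precision).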
Belay, T K; Dagnachew, B S; Boison, S A; Ådnøy, T
2018-03-28
Milk infrared spectra are routinely used for phenotyping traits of interest through links developed between the traits and spectra. Predicted individual traits are then used in genetic analyses for estimated breeding value (EBV) or for phenotypic predictions using a single-trait mixed model; this approach is referred to as indirect prediction (IP). An alternative approach [direct prediction (DP)] is a direct genetic analysis of (a reduced dimension of) the spectra using a multitrait model to predict multivariate EBV of the spectral components and, ultimately, also to predict the univariate EBV or phenotype for the traits of interest. We simulated 3 traits under different genetic (low: 0.10 to high: 0.90) and residual (zero to high: ±0.90) correlation scenarios between the 3 traits and assumed the first trait is a linear combination of the other 2 traits. The aim was to compare the IP and DP approaches for predictions of EBV and phenotypes under the different correlation scenarios. We also evaluated relationships between performances of the 2 approaches and the accuracy of calibration equations. Moreover, the effect of using different regression coefficients estimated from simulated phenotypes (βp), true breeding values (βg), and residuals (βr) on performance of the 2 approaches were evaluated. The simulated data contained 2,100 parents (100 sires and 2,000 cows) and 8,000 offspring (4 offspring per cow). Of the 8,000 observations, 2,000 were randomly selected and used to develop links between the first and the other 2 traits using partial least square (PLS) regression analysis. The different PLS regression coefficients, such as βp, βg, and βr, were used in subsequent predictions following the IP and DP approaches. We used BLUP analyses for the remaining 6,000 observations using the true (co)variance components that had been used for the simulation. 
Accuracy of prediction (of EBV and phenotype) was calculated as a correlation between predicted and true values from the simulations. The results showed that accuracies of EBV prediction were higher in the DP than in the IP approach. The reverse was true for accuracy of phenotypic prediction when using βp but not when using βg and βr, where accuracy of phenotypic prediction in the DP was slightly higher than in the IP approach. Within the DP approach, accuracies of EBV when using βg were higher than when using βp only at the low genetic correlation scenario. However, we found no differences in EBV prediction accuracy between the βp and βg in the IP approach. Accuracy of the calibration models increased with an increase in genetic and residual correlations between the traits. Performance of both approaches increased with an increase in accuracy of the calibration models. In conclusion, the DP approach is a good strategy for EBV prediction but not for phenotypic prediction, where the classical PLS regression-based equations or the IP approach provided better results. The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
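Accuracy defined as the correlation between predicted and true values can be computed as a Pearson correlation coefficient. The sketch below assumes that interpretation; the values are made up and are not simulation output from the study.

```python
from math import sqrt


def pearson_correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical true and predicted breeding values for five animals.
true_values = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_values = [1.2, 1.9, 3.2, 3.8, 5.1]
accuracy = pearson_correlation(predicted_values, true_values)
```

Because a correlation ignores scale and offset, this measure rewards correct ranking of individuals, which is what matters for selection, but it does not detect a systematic over- or under-prediction.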
Neurocognitive and Behavioral Predictors of Math Performance in Children with and without ADHD
Antonini, Tanya N.; O’Brien, Kathleen M.; Narad, Megan E.; Langberg, Joshua M.; Tamm, Leanne; Epstein, Jeff N.
2014-01-01
Objective: This study examined neurocognitive and behavioral predictors of math performance in children with and without attention-deficit/hyperactivity disorder (ADHD). Method: Neurocognitive and behavioral variables were examined as predictors of 1) standardized mathematics achievement scores, 2) productivity on an analog math task, and 3) accuracy on an analog math task. Results: Children with ADHD had lower achievement scores but did not significantly differ from controls on math productivity or accuracy. N-back accuracy and parent-rated attention predicted math achievement. N-back accuracy and observed attention predicted math productivity. Alerting scores on the Attentional Network Task predicted math accuracy. Mediation analyses indicated that n-back accuracy significantly mediated the relationship between diagnostic group and math achievement. Conclusion: Neurocognition, rather than behavior, may account for the deficits in math achievement exhibited by many children with ADHD. PMID:24071774
Neurocognitive and Behavioral Predictors of Math Performance in Children With and Without ADHD.
Antonini, Tanya N; Kingery, Kathleen M; Narad, Megan E; Langberg, Joshua M; Tamm, Leanne; Epstein, Jeffery N
2016-02-01
This study examined neurocognitive and behavioral predictors of math performance in children with and without ADHD. Neurocognitive and behavioral variables were examined as predictors of (a) standardized mathematics achievement scores, (b) productivity on an analog math task, and (c) accuracy on an analog math task. Children with ADHD had lower achievement scores but did not significantly differ from controls on math productivity or accuracy. N-back accuracy and parent-rated attention predicted math achievement. N-back accuracy and observed attention predicted math productivity. Alerting scores on the attentional network task predicted math accuracy. Mediation analyses indicated that n-back accuracy significantly mediated the relationship between diagnostic group and math achievement. Neurocognition, rather than behavior, may account for the deficits in math achievement exhibited by many children with ADHD. © The Author(s) 2013.
Can Preoperative Magnetic Resonance Imaging Predict the Reparability of Massive Rotator Cuff Tears?
Kim, Jung Youn; Park, Ji Seon; Rhee, Yong Girl
2017-06-01
Numerous studies have shown preoperative fatty infiltration of rotator cuff muscles to be strongly negatively correlated with the successful repair of massive rotator cuff tears (RCTs). To assess the association between factors identified on preoperative magnetic resonance imaging (MRI), especially infraspinatus fatty infiltration, and the reparability of massive RCTs. Case-control study; Level of evidence, 3. We analyzed a total of 105 patients with massive RCTs for whom MRI was performed ≤6 months before arthroscopic procedures. The mean age of the patients was 62.7 years (range, 46-83 years), and 46 were men. Among them, complete repair was possible in 50 patients (48%) and not possible in 55 patients (52%). The tangent sign, fatty infiltration of the rotator cuff, and Patte classification were evaluated as predictors of reparability. Using the receiver operating characteristic curve and the area under the curve (AUC), the prediction accuracy of each variable and combinations of variables were measured. Reparability was associated with fatty infiltration of the supraspinatus (P = .0045) and infraspinatus (P < .001) muscles, the tangent sign (P = .0033), and the Patte classification (P < .001) but not with fatty infiltration of the subscapularis and teres minor (P = .425 and .132, respectively). The cut-off values for supraspinatus and infraspinatus fatty infiltration were grade >3 and grade >2, respectively. The examination of single variables revealed that infraspinatus fatty infiltration showed the highest AUC value (0.812; sensitivity: 0.86; specificity: 0.76), while the tangent sign showed the lowest AUC value (0.626; sensitivity: 0.38; specificity: 0.87). Among 2-variable combinations, the combination of infraspinatus fatty infiltration and the Patte classification showed the highest AUC value (0.874; sensitivity: 0.54; specificity: 0.96). 
The combination of 4 variables, that is, infraspinatus and supraspinatus fatty infiltration, the tangent sign, and the Patte classification, had an AUC of 0.866 (sensitivity: 0.28; specificity: 0.98), which was lower than the highest AUC value (0.874; sensitivity: 0.54; specificity: 0.96) among the 2-variable combinations. The tangent sign or Patte classification alone was not a predictive indicator of the reparability of massive RCTs. Among single variables, infraspinatus fatty infiltration was the most effective in predicting reparability, while the combination of Goutallier classification <3 of the infraspinatus and Patte classification ≤2 of the rotator cuff muscles was the most predictive among the combinations of variables. This information may help predict the reparability of massive RCTs.
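For a single ordinal predictor such as a fatty infiltration grade, the AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The sketch below uses that pairwise definition together with sensitivity and specificity at a cut-off; the grades are invented, and treating "not reparable" as the positive class is an assumption made for the example.

```python
def auc_pairwise(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positive_scores) * len(negative_scores))


def sensitivity_specificity(positive_scores, negative_scores, cutoff):
    """Classify a case as positive when its score is strictly above cutoff."""
    true_pos = sum(s > cutoff for s in positive_scores)
    true_neg = sum(s <= cutoff for s in negative_scores)
    return true_pos / len(positive_scores), true_neg / len(negative_scores)


# Hypothetical infiltration grades (0-4); positives = tears not reparable.
not_reparable = [3, 4, 2, 4]
reparable = [1, 2, 0, 3]
auc = auc_pairwise(not_reparable, reparable)
sens, spec = sensitivity_specificity(not_reparable, reparable, cutoff=2)
```

The AUC summarizes discrimination over all possible cut-offs, whereas sensitivity and specificity describe one chosen cut-off, which is why a combination of variables can trade sensitivity for specificity without improving the AUC.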
Artificial neural network prediction of ischemic tissue fate in acute stroke imaging
Huang, Shiliang; Shen, Qiang; Duong, Timothy Q
2010-01-01
Multimodal magnetic resonance imaging of acute stroke provides predictive value that can be used to guide stroke therapy. A flexible artificial neural network (ANN) algorithm was developed and applied to predict ischemic tissue fate on three stroke groups: 30-, 60-minute, and permanent middle cerebral artery occlusion in rats. Cerebral blood flow (CBF), apparent diffusion coefficient (ADC), and spin–spin relaxation time constant (T2) were acquired during the acute phase up to 3 hours and again at 24 hours followed by histology. Infarct was predicted on a pixel-by-pixel basis using only acute (30-minute) stroke data. In addition, neighboring pixel information and infarction incidence were also incorporated into the ANN model to improve prediction accuracy. Receiver-operating characteristic analysis was used to quantify prediction accuracy. The major findings were the following: (1) CBF alone poorly predicted the final infarct across three experimental groups; (2) ADC alone adequately predicted the infarct; (3) CBF+ADC improved the prediction accuracy; (4) inclusion of neighboring pixel information and infarction incidence further improved the prediction accuracy; and (5) prediction was more accurate for permanent occlusion, followed by 60- and 30-minute occlusion. The ANN predictive model could thus provide a flexible and objective framework for clinicians to evaluate stroke treatment options on an individual patient basis. PMID:20424631
Cross-validation of recent and longstanding resting metabolic rate prediction equations
USDA-ARS's Scientific Manuscript database
Resting metabolic rate (RMR) measurement is time consuming and requires specialized equipment. Prediction equations provide an easy method to estimate RMR; however, their accuracy likely varies across individuals. Understanding the factors that influence predicted RMR accuracy at the individual lev...
Prospects for Genomic Selection in Cassava Breeding.
Wolfe, Marnin D; Del Carpio, Dunia Pino; Alabi, Olumide; Ezenwaka, Lydia C; Ikeogu, Ugochukwu N; Kayondo, Ismail S; Lozano, Roberto; Okeke, Uche G; Ozimati, Alfred A; Williams, Esuma; Egesi, Chiedozie; Kawuki, Robert S; Kulakow, Peter; Rabbi, Ismail Y; Jannink, Jean-Luc
2017-11-01
Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on previous analyses by assessing the accuracy of seven prediction models for seven traits in three prediction scenarios: cross-validation within populations, cross-population prediction and cross-generation prediction. We also evaluated the impact of increasing the training population (TP) size by phenotyping progenies selected either at random or with a genetic algorithm. Cross-validation results were mostly consistent across programs, with nonadditive models predicting 10% better on average. Cross-population accuracy was generally low (mean = 0.18) but prediction of cassava mosaic disease increased up to 57% in one Nigerian population when data from another related population were combined. Accuracy across generations was poorer than within-generation accuracy, as expected, but accuracy for dry matter content and mosaic disease severity should be sufficient for rapid-cycling GS. Selection of a prediction model made some difference across generations, but increasing TP size was more important. With a genetic algorithm, selection of one-third of progeny could achieve an accuracy equivalent to phenotyping all progeny. We are in the early stages of GS for this crop but the results are promising for some traits. General guidelines that are emerging are that TPs need to continue to grow but phenotyping can be done on a cleverly selected subset of individuals, reducing the overall phenotyping burden. Copyright © 2017 Crop Science Society of America.
NASA Astrophysics Data System (ADS)
Fleischer, Christian; Waag, Wladislaw; Bai, Ziou; Sauer, Dirk Uwe
2013-12-01
The battery management system (BMS) of a battery-electric road vehicle must ensure an optimal operation of the electrochemical storage system to guarantee for durability and reliability. In particular, the BMS must provide precise information about the battery's state-of-functionality, i.e. how much dis-/charging power can the battery accept at current state and condition while at the same time preventing it from operating outside its safe operating area. These critical limits have to be calculated in a predictive manner, which serve as a significant input factor for the supervising vehicle energy management (VEM). The VEM must provide enough power to the vehicle's drivetrain for certain tasks and especially in critical driving situations. Therefore, this paper describes a new approach which can be used for state-of-available-power estimation with respect to lowest/highest cell voltage prediction using an adaptive neuro-fuzzy inference system (ANFIS). The estimated voltage for a given time frame in the future is directly compared with the actual voltage, verifying the effectiveness and accuracy of a relative voltage prediction error of less than 1%. Moreover, the real-time operating capability of the proposed algorithm was verified on a battery test bench while running on a real-time system performing voltage prediction.
Justice, AC; Modur, S; Tate, JP; Althoff, KN; Jacobson, LP; Gebo, K; Kitahata, M; Horberg, M; Brooks, J; Buchacz, K; Rourke, SB; Rachlis, A; Napravnik, S; Eron, J; Willig, H; Moore, R; Kirk, GD; Bosch, R; Rodriguez, B; Hogg, RS; Thorne, J; Goedert, JJ; Klein, M; Gill, MJ; Deeks, S; Sterling, TR; Anastos, K; Gange, SJ
2013-01-01
Background By supplementing an index composed of HIV biomarkers and age (Restricted Index) with measures of organ injury, the Veterans Aging Cohort Study (VACS) Index more completely reflects risk of mortality. We compare the accuracy of the VACS and Restricted Indices 1) among subjects outside the Veterans Healthcare System (VA), 2) over 1–5 years of prior exposure to antiretroviral therapy (ART), and 3) within important patient subgroups. Methods We used data from 13 cohorts in the North American AIDS Cohort Collaboration (NA-ACCORD, n=10,835) limiting analyses to HIV-infected subjects with at least 12 months exposure to ART. Variables included demographic, laboratory (CD4 count, HIV-1 RNA, hemoglobin, platelets, aspartate and alanine transaminase, creatinine and hepatitis C status), and survival. We used C statistic and net reclassification improvement (NRI) to test discrimination varying prior ART exposure from 1–5 years. We then combined VA (n=5,066) and NA-ACCORD data, fit a parametric survival model, and compared predicted to observed mortality by cohort, gender, age, race, and HIV-1 RNA level. Results Mean follow-up was 3.3 years (655 deaths). Compared with the Restricted Index, the VACS Index showed greater discrimination (C statistic: 0.77 vs. 0.74; NRI 12%; p<0.0001). NRI was highest among those with HIV-1 RNA<500 copies/ml (25%) and age ≥50 years (20%). Predictions were similar to observed mortality among all subgroups. Conclusion VACS Index scores discriminate risk and translate into accurate mortality estimates over 1–5 years of exposures to ART and for diverse patient subgroups from North American PMID:23187941
Cross-national comparison of screening mammography accuracy measures in U.S., Norway, and Spain.
Domingo, Laia; Hofvind, Solveig; Hubbard, Rebecca A; Román, Marta; Benkeser, David; Sala, Maria; Castells, Xavier
2016-08-01
To compare accuracy measures for mammographic screening in Norway, Spain, and the US. Information from women aged 50-69 years who underwent mammographic screening 1996-2009 in the US (898,418 women), Norway (527,464), and Spain (517,317) was included. Screen-detected cancer, interval cancer, and the false-positive rates, sensitivity, specificity, positive predictive value (PPV) for recalls (PPV-1), PPV for biopsies (PPV-2), 1/PPV-1 and 1/PPV-2 were computed for each country. Analyses were stratified by age, screening history, time since last screening, calendar year, and mammography modality. The rate of screen-detected cancers was 4.5, 5.5, and 4.0 per 1000 screening exams in the US, Norway, and Spain respectively. The highest sensitivity and lowest specificity were reported in the US (83.1 % and 91.3 %, respectively), followed by Spain (79.0 % and 96.2 %) and Norway (75.5 % and 97.1 %). In Norway, Spain and the US, PPV-1 was 16.4 %, 9.8 %, and 4.9 %, and PPV-2 was 39.4 %, 38.9 %, and 25.9 %, respectively. The number of women needed to recall to detect one cancer was 20.3, 6.1, and 10.2 in the US, Norway, and Spain, respectively. Differences were found across countries, suggesting that opportunistic screening may translate into higher sensitivity at the cost of lower specificity and PPV. • Positive predictive value is higher in population-based screening programmes in Spain and Norway. • Opportunistic mammography screening in the US has lower positive predictive value. • Screening settings in the US translate into higher sensitivity and lower specificity. • The clinical burden may be higher for women screened opportunistically.
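The "number of women needed to recall to detect one cancer" is the reciprocal of PPV-1. The short sketch below recomputes it from the rounded PPV-1 percentages quoted in the abstract; the variable names are illustrative.

```python
# PPV for recalls (PPV-1) as reported in the abstract, expressed as fractions.
ppv1 = {"US": 0.049, "Norway": 0.164, "Spain": 0.098}

# Recalls needed per detected cancer = 1 / PPV-1.
recalls_per_cancer = {country: 1.0 / value for country, value in ppv1.items()}

for country, recalls in recalls_per_cancer.items():
    print(f"{country}: {recalls:.1f} recalls per screen-detected cancer")
```

From the rounded inputs this gives about 6.1 for Norway and 10.2 for Spain, matching the abstract, and about 20.4 for the US, slightly above the reported 20.3, presumably because the published figure was derived from the unrounded PPV-1.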
SU-F-R-10: Selecting the Optimal Solution for Multi-Objective Radiomics Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Z; Folkert, M; Wang, J
2016-06-15
Purpose: To develop an evidential reasoning approach for selecting the optimal solution from a Pareto solution set obtained by a multi-objective radiomics model for predicting distant failure in lung SBRT. Methods: In the multi-objective radiomics model, sensitivity and specificity are both treated as objective functions simultaneously. The multi-objective optimization yields a Pareto solution set with many feasible solutions. In this work, an optimal solution Selection methodology for Multi-Objective radiomics Learning model using the Evidential Reasoning approach (SMOLER) was proposed to select the optimal solution from the Pareto solution set. The proposed SMOLER method used the evidential reasoning approach to calculate the utility of each solution based on pre-set optimal solution selection rules. The solution with the highest utility was chosen as the optimal solution. In SMOLER, an optimal learning model coupled with a clonal selection algorithm was used to optimize model parameters. In this study, PET and CT image features and clinical parameters were utilized for predicting distant failure in lung SBRT. Results: A total of 126 solution sets were generated by adjusting predictive model parameters. Each Pareto set contains 100 feasible solutions. The solution selected by SMOLER within each Pareto set was compared to the manually selected optimal solution. Five-fold cross-validation was used to evaluate the optimal solution selection accuracy of SMOLER. The selection accuracies for the five folds were 80.00%, 69.23%, 84.00%, 84.00%, and 80.00%, respectively. Conclusion: An optimal solution selection methodology for multi-objective radiomics learning model using the evidential reasoning approach (SMOLER) was proposed. Experimental results show that the optimal solution can be found in approximately 80% of cases.
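The "approximately 80%" figure in the conclusion can be checked against the five per-fold accuracies reported in the results. The sketch below takes a simple unweighted mean, which is an assumption: the abstract does not say how the folds were combined or whether they were equal in size.

```python
from statistics import mean

# Per-fold selection accuracies (%) as reported in the abstract.
fold_accuracies = [80.00, 69.23, 84.00, 84.00, 80.00]

# Unweighted mean across folds (assumes equally weighted folds).
mean_accuracy = mean(fold_accuracies)
spread = max(fold_accuracies) - min(fold_accuracies)
```

The unweighted mean is about 79.4%, consistent with the stated conclusion, while the spread of nearly 15 percentage points between the best and worst folds indicates noticeable fold-to-fold variability.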
Ben Bouallègue, Fayçal; Vauchot, Fabien; Mariano-Goulart, Denis; Payoux, Pierre
2018-02-09
We evaluated the performance of amyloid PET textural and shape features in discriminating normal and Alzheimer's disease (AD) subjects, and in predicting conversion to AD in subjects with mild cognitive impairment (MCI) or significant memory concern (SMC). Subjects from the Alzheimer's Disease Neuroimaging Initiative with available baseline 18F-florbetapir and T1-MRI scans were included. The cross-sectional cohort consisted of 181 controls and 148 AD subjects. The longitudinal cohort consisted of 431 SMC/MCI subjects, 85 of whom converted to AD during follow-up. PET images were normalized to MNI space and post-processed using in-house software. Relative retention indices (SUVr) were computed with respect to pontine, cerebellar, and composite reference regions. Several textural and shape features were extracted then combined using a support vector machine (SVM) to build a predictive model of AD conversion. Diagnostic and prognostic performance was evaluated using ROC analysis and survival analysis with the Cox proportional hazard model. The three SUVr and all the tested features effectively discriminated AD subjects in cross-sectional analysis (all p < 0.001). In longitudinal analysis, the variables with the highest prognostic value were composite SUVr (AUC 0.86; accuracy 81%), skewness (0.87; 83%), local minima (0.85; 79%), Geary's index (0.86; 81%), gradient norm maximal argument (0.83; 82%), and the SVM model (0.91; 86%). The adjusted hazard ratio for AD conversion was 5.5 for the SVM model, compared with 4.0, 2.6, and 3.8 for cerebellar, pontine and composite SUVr (all p < 0.001), indicating that appropriate amyloid textural and shape features predict conversion to AD with at least as good accuracy as classical SUVr.
Accuracy of diagnosis codes to identify febrile young infants using administrative data.
Aronson, Paul L; Williams, Derek J; Thurm, Cary; Tieder, Joel S; Alpern, Elizabeth R; Nigrovic, Lise E; Schondelmeyer, Amanda C; Balamuth, Fran; Myers, Angela L; McCulloh, Russell J; Alessandrini, Evaline A; Shah, Samir S; Browning, Whitney L; Hayes, Katie L; Feldman, Elana A; Neuman, Mark I
2015-12-01
Administrative data can be used to determine optimal management of febrile infants and aid clinical practice guideline development. Determine the most accurate International Classification of Diseases, Ninth Revision (ICD-9) diagnosis coding strategies for identification of febrile infants. Retrospective cross-sectional study. Eight emergency departments in the Pediatric Health Information System. Infants aged <90 days evaluated between July 1, 2012 and June 30, 2013 were randomly selected for medical record review from 1 of 4 ICD-9 diagnosis code groups: (1) discharge diagnosis of fever, (2) admission diagnosis of fever without discharge diagnosis of fever, (3) discharge diagnosis of serious infection without diagnosis of fever, and (4) no diagnosis of fever or serious infection. The ICD-9 diagnosis code groups were compared in 4 case-identification algorithms to a reference standard of fever ≥100.4°F documented in the medical record. Algorithm predictive accuracy was measured using sensitivity, specificity, and negative and positive predictive values. Among 1790 medical records reviewed, 766 (42.8%) infants had fever. Discharge diagnosis of fever demonstrated high specificity (98.2%, 95% confidence interval [CI]: 97.8-98.6) but low sensitivity (53.2%, 95% CI: 50.0-56.4). A case-identification algorithm of admission or discharge diagnosis of fever exhibited higher sensitivity (71.1%, 95% CI: 68.2-74.0), similar specificity (97.7%, 95% CI: 97.3-98.1), and the highest positive predictive value (86.9%, 95% CI: 84.5-89.3). A case-identification strategy that includes admission or discharge diagnosis of fever should be considered for febrile infant studies using administrative data, though underclassification of patients is a potential limitation. © 2015 Society of Hospital Medicine.
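The four measures named in this abstract all derive from a 2x2 table of algorithm result against reference standard. A minimal sketch follows; the counts are made up for illustration and are not reconstructed from the study, whose sampling across code groups means raw counts would not map directly onto the reported percentages.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard diagnostic accuracy measures from a 2x2 table.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives relative to the reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),   # flagged among those with the condition
        "specificity": tn / (tn + fp),   # not flagged among those without it
        "ppv": tp / (tp + fp),           # condition present among the flagged
        "npv": tn / (tn + fn),           # condition absent among the unflagged
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only.
m = diagnostic_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Predictive values depend on how common the condition is in the sample, which is why they are reported alongside sensitivity and specificity rather than in place of them.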
Accuracy of Diagnosis Codes to Identify Febrile Young Infants Using Administrative Data
Aronson, Paul L.; Williams, Derek J.; Thurm, Cary; Tieder, Joel S.; Alpern, Elizabeth R.; Nigrovic, Lise E.; Schondelmeyer, Amanda C.; Balamuth, Fran; Myers, Angela L.; McCulloh, Russell J.; Alessandrini, Evaline A.; Shah, Samir S.; Browning, Whitney L.; Hayes, Katie L.; Feldman, Elana A.; Neuman, Mark I.
2015-01-01
Background Administrative data can be used to determine optimal management of febrile infants and aid clinical practice guideline development. Objective Determine the most accurate International Classification of Diseases, 9th revision (ICD-9) diagnosis coding strategies for identification of febrile infants. Design Retrospective cross-sectional study. Setting Eight emergency departments in the Pediatric Health Information System. Patients Infants age < 90 days evaluated between July 1, 2012 and June 30, 2013 were randomly selected for medical record review from one of four ICD-9 diagnosis code groups: 1) discharge diagnosis of fever, 2) admission diagnosis of fever without discharge diagnosis of fever, 3) discharge diagnosis of serious infection without diagnosis of fever, and 4) no diagnosis of fever or serious infection. Exposure The ICD-9 diagnosis code groups were compared in four case-identification algorithms to a reference standard of fever ≥ 100.4°F documented in the medical record. Measurements Algorithm predictive accuracy was measured using sensitivity, specificity, negative and positive predictive values. Results Among 1790 medical records reviewed, 766 (42.8%) infants had fever. Discharge diagnosis of fever demonstrated high specificity (98.2%, 95% confidence interval [CI]: 97.8-98.6) but low sensitivity (53.2%, 95% CI: 50.0-56.4). A case-identification algorithm of admission or discharge diagnosis of fever exhibited higher sensitivity (71.1%, 95% CI: 68.2-74.0), similar specificity (97.7%, 95% CI: 97.3-98.1), and the highest positive predictive value (86.9%, 95% CI: 84.5-89.3). Conclusions A case-identification strategy that includes admission or discharge diagnosis of fever should be considered for febrile infant studies using administrative data, though under-classification of patients is a potential limitation. PMID:26248691
Maden, Orhan; Balci, Kevser Gülcihan; Selcuk, Mehmet Timur; Balci, Mustafa Mücahit; Açar, Burak; Unal, Sefa; Kara, Meryem; Selcuk, Hatice
2015-12-01
The aim of this study was to investigate the accuracy of three algorithms in predicting accessory pathway locations in adult patients with Wolff-Parkinson-White syndrome in Turkish population. A total of 207 adult patients with Wolff-Parkinson-White syndrome were retrospectively analyzed. The most preexcited 12-lead electrocardiogram in sinus rhythm was used for analysis. Two investigators blinded to the patient data used three algorithms for prediction of accessory pathway location. Among all locations, 48.5% were left-sided, 44% were right-sided, and 7.5% were located in the midseptum or anteroseptum. When only exact locations were accepted as match, predictive accuracy for Chiang was 71.5%, 72.4% for d'Avila, and 71.5% for Arruda. The percentage of predictive accuracy of all algorithms did not differ between the algorithms (p = 1.000; p = 0.875; p = 0.885, respectively). The best algorithm for prediction of right-sided, left-sided, and anteroseptal and midseptal accessory pathways was Arruda (p < 0.001). Arruda was significantly better than d'Avila in predicting adjacent sites (p = 0.035) and the percent of the contralateral site prediction was higher with d'Avila than Arruda (p = 0.013). All algorithms were similar in predicting accessory pathway location and the predicted accuracy was lower than previously reported by their authors. However, according to the accessory pathway site, the algorithm designed by Arruda et al. showed better predictions than the other algorithms and using this algorithm may provide advantages before a planned ablation.
Accuracy test for link prediction in terms of similarity index: The case of WS and BA models
NASA Astrophysics Data System (ADS)
Ahn, Min-Woo; Jung, Woo-Sung
2015-07-01
Link prediction is a technique that uses the topological information in a given network to infer the missing links in it. Since past research on link prediction has primarily focused on enhancing performance for given empirical systems, negligible attention has been devoted to link prediction with regard to network models. In this paper, we thus apply link prediction to two network models: The Watts-Strogatz (WS) model and Barabási-Albert (BA) model. We attempt to gain a better understanding of the relation between accuracy and each network parameter (mean degree, the number of nodes and the rewiring probability in the WS model) through network models. Six similarity indices are used, with precision and area under the ROC curve (AUC) value as the accuracy metrics. We observe a positive correlation between mean degree and accuracy, and size independence of the AUC value.
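A small sketch may help make the evaluation concrete. It scores node pairs with the common-neighbours similarity index (one widely used index; the abstract does not list which six were tested) and computes AUC as the chance that a held-out link scores higher than a nonexistent link, with ties counted as one half. The toy graph and function names are hypothetical.

```python
from itertools import combinations


def neighbours(edges, nodes):
    """Build an adjacency map from an undirected edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbours(adj, u, v):
    """Similarity score: number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def link_prediction_auc(train_edges, probe_edges, nodes):
    """AUC over all (held-out link, nonexistent link) pairs."""
    adj = neighbours(train_edges, nodes)
    known = {frozenset(e) for e in train_edges} | {frozenset(e) for e in probe_edges}
    non_edges = [p for p in combinations(nodes, 2) if frozenset(p) not in known]
    total = 0.0
    for pu, pv in probe_edges:
        probe_score = common_neighbours(adj, pu, pv)
        for nu, nv in non_edges:
            non_score = common_neighbours(adj, nu, nv)
            if probe_score > non_score:
                total += 1.0
            elif probe_score == non_score:
                total += 0.5
    return total / (len(probe_edges) * len(non_edges))


# Toy network: one link (1, 2) is hidden from training and must be recovered.
nodes = [0, 1, 2, 3, 4, 5]
train_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
probe_edges = [(1, 2)]
auc = link_prediction_auc(train_edges, probe_edges, nodes)
print(auc)
```

An AUC of 0.5 corresponds to random scoring, so values well above 0.5 indicate that the index ranks true missing links ahead of nonexistent ones.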
Effectiveness of link prediction for face-to-face behavioral networks.
Tsugawa, Sho; Ohsaki, Hiroyuki
2013-01-01
Research on link prediction for social networks has been actively pursued. In link prediction for a given social network obtained from time-windowed observation, new link formation in the network is predicted from the topology of the obtained network. In contrast, recent advances in sensing technology have made it possible to obtain face-to-face behavioral networks, which are social networks representing face-to-face interactions among people. However, the effectiveness of link prediction techniques for face-to-face behavioral networks has not yet been explored in depth. To clarify this point, here we investigate the accuracy of conventional link prediction techniques for networks obtained from the history of face-to-face interactions among participants at an academic conference. Our findings were (1) that conventional link prediction techniques predict new link formation with a precision of 0.30-0.45 and a recall of 0.10-0.20, (2) that prolonged observation of social networks often degrades the prediction accuracy, (3) that the proposed decaying weight method leads to higher prediction accuracy than can be achieved by observing all records of communication and simply using them unmodified, and (4) that the prediction accuracy for face-to-face behavioral networks is relatively high compared to that for non-social networks, but not as high as for other types of social networks.
The effect of concurrent hand movement on estimated time to contact in a prediction motion task.
Zheng, Ran; Maraj, Brian K V
2018-04-27
In many activities, we need to predict the arrival of an occluded object. This action is called prediction motion or motion extrapolation. Previous researchers have found that both eye tracking and the internal clocking model are involved in the prediction motion task. Additionally, it is reported that concurrent hand movement facilitates the eye tracking of an externally generated target in a tracking task, even if the target is occluded. The present study examined the effect of concurrent hand movement on the estimated time to contact in a prediction motion task. We found different (accurate/inaccurate) concurrent hand movements had the opposite effect on the eye tracking accuracy and estimated TTC in the prediction motion task. That is, the accurate concurrent hand tracking enhanced eye tracking accuracy and had the trend to increase the precision of estimated TTC, but the inaccurate concurrent hand tracking decreased eye tracking accuracy and disrupted estimated TTC. However, eye tracking accuracy does not determine the precision of estimated TTC.
ERIC Educational Resources Information Center
Hilton, N. Zoe; Harris, Grant T.
2009-01-01
Prediction effect sizes such as ROC area are important for demonstrating a risk assessment's generalizability and utility. How a study defines recidivism might affect predictive accuracy. Nonrecidivism is problematic when predicting specialized violence (e.g., domestic violence). The present study cross-validates the ability of the Ontario…
Predicting adherence of patients with HF through machine learning techniques.
Karanasiou, Georgia Spiridon; Tripoliti, Evanthia Eleftherios; Papadopoulos, Theofilos Grigorios; Kalatzis, Fanis Georgios; Goletsis, Yorgos; Naka, Katerina Kyriakos; Bechlioulis, Aris; Errachid, Abdelhamid; Fotiadis, Dimitrios Ioannis
2016-09-01
Heart failure (HF) is a chronic disease characterised by poor quality of life, recurrent hospitalisation and high mortality. Adherence of patient to treatment suggested by the experts has been proven a significant deterrent of the above-mentioned serious consequences. However, the non-adherence rates are significantly high; a fact that highlights the importance of predicting the adherence of the patient and enabling experts to adjust accordingly patient monitoring and management. The aim of this work is to predict the adherence of patients with HF, through the application of machine learning techniques. Specifically, it aims to classify a patient not only as medication adherent or not, but also as adherent or not in terms of medication, nutrition and physical activity (global adherent). Two classification problems are addressed: (i) if the patient is global adherent or not and (ii) if the patient is medication adherent or not. About 11 classification algorithms are employed and combined with feature selection and resampling techniques. The classifiers are evaluated on a dataset of 90 patients. The patients are characterised as medication and global adherent, based on clinician estimation. The highest detection accuracy is 82 and 91% for the first and the second classification problem, respectively.
A hybrid clustering and classification approach for predicting crash injury severity on rural roads.
Hasheminejad, Seyed Hessam-Allah; Zahedi, Mohsen; Hasheminejad, Seyed Mohammad Hossein
2018-03-01
As a threat to transportation systems, traffic crashes have a wide range of social consequences for governments. Traffic crashes are increasing in developing countries, and Iran, as a developing country, is not immune from this risk. Several studies in the literature predict traffic crash severity based on artificial neural networks (ANNs), support vector machines and decision trees. This paper investigates the crash injury severity of rural roads by using a hybrid clustering and classification approach to compare the performance of classification algorithms before and after applying the clustering. In this paper, a novel rule-based genetic algorithm (GA) is proposed to predict crash injury severity, which is evaluated by performance criteria in comparison with classification algorithms like ANN. The results obtained from analysis of 13,673 crashes (5600 property damage, 778 fatal crashes, 4690 slight injuries and 2605 severe injuries) on rural roads in Tehran Province of Iran during 2011-2013 revealed that the proposed GA method outperforms other classification algorithms based on classification metrics like precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method has the highest level of interpretation, is easy to understand and provides feedback to analysts.
Predicting Space Weather: Challenges for Research and Operations
NASA Astrophysics Data System (ADS)
Singer, H. J.; Onsager, T. G.; Rutledge, R.; Viereck, R. A.; Kunches, J.
2013-12-01
Society's growing dependence on technologies and infrastructure susceptible to the consequences of space weather has given rise to increased attention at the highest levels of government as well as inspired the need for both research and improved space weather services. In part, for these reasons, the number one goal of the recent National Research Council report on a Decadal Strategy for Solar and Space Physics is to 'Determine the origins of the Sun's activity and predict the variations in the space environment.' Prediction of conditions in our space environment is clearly a challenge for both research and operations, and we require the near-term development and validation of models that have sufficient accuracy and lead time to be useful to those impacted by space weather. In this presentation, we will provide new scientific results of space weather conditions that have challenged space weather forecasters, and identify specific areas of research that can lead to improved capabilities. In addition, we will examine examples of customer impacts and requirements as well as the challenges to the operations community to establish metrics that enable the selection and transition of models and observations that can provide the greatest economic and societal benefit.
Improving Fermi Orbit Determination and Prediction in an Uncertain Atmospheric Drag Environment
NASA Technical Reports Server (NTRS)
Vavrina, Matthew A.; Newman, Clark P.; Slojkowski, Steven E.; Carpenter, J. Russell
2014-01-01
Orbit determination and prediction of the Fermi Gamma-ray Space Telescope trajectory is strongly impacted by the unpredictability and variability of atmospheric density and the spacecraft's ballistic coefficient. Operationally, Global Positioning System point solutions are processed with an extended Kalman filter for orbit determination, and predictions are generated for conjunction assessment with secondary objects. When these predictions are compared to Joint Space Operations Center radar-based solutions, the close approach distance between the two predictions can greatly differ ahead of the conjunction. This work explores strategies for improving prediction accuracy and helps to explain the prediction disparities. Namely, a tuning analysis is performed to determine atmospheric drag modeling and filter parameters that can improve orbit determination as well as prediction accuracy. A 45% improvement in three-day prediction accuracy is realized by tuning the ballistic coefficient and atmospheric density stochastic models, measurement frequency, and other modeling and filter parameters.
Chan, Kuang-Lim; Rosli, Rozana; Tatarinova, Tatiana V; Hogan, Michael; Firdaus-Raih, Mohd; Low, Eng-Ti Leslie
2017-01-27
Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion. We present an automated gene prediction pipeline, Seqping that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 nucleotide structure). Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.
Uher, T; Vaneckova, M; Sormani, M P; Krasensky, J; Sobisek, L; Dusankova, J Blahova; Seidl, Z; Havrdova, E; Kalincik, T; Benedict, R H B; Horakova, D
2017-02-01
While impaired cognitive performance is common in multiple sclerosis (MS), it has been largely underdiagnosed. Here a magnetic resonance imaging (MRI) screening algorithm is proposed to identify patients at highest risk of cognitive impairment. The objective was to examine whether assessment of lesion burden together with whole brain atrophy on MRI improves our ability to identify cognitively impaired MS patients. Of the 1253 patients enrolled in the study, 1052 patients with all cognitive, volumetric MRI and clinical data available were included in the analysis. Brain MRI and neuropsychological assessment with the Brief International Cognitive Assessment for Multiple Sclerosis were performed. Multivariable logistic regression and individual prediction analysis were used to investigate the associations between MRI markers and cognitive impairment. The results of the primary analysis were validated at two subsequent time points (months 12 and 24). The prevalence of cognitive impairment was greater in patients with low brain parenchymal fraction (BPF) (<0.85) and high T2 lesion volume (T2-LV) (>3.5 ml) than in patients with high BPF (>0.85) and low T2-LV (<3.5 ml), with an odds ratio (OR) of 6.5 (95% CI 4.4-9.5). Low BPF together with high T2-LV identified in 270 (25.7%) patients predicted cognitive impairment with 83% specificity, 82% negative predictive value, 51% sensitivity and 75% overall accuracy. The risk of confirmed cognitive decline over the follow-up was greater in patients with high T2-LV (OR 2.1; 95% CI 1.1-3.8) and low BPF (OR 2.6; 95% CI 1.4-4.7). The integrated MRI assessment of lesion burden and brain atrophy may improve the stratification of MS patients who may benefit from cognitive assessment. © 2016 EAN.
Scoring systems for outcome prediction in patients with perforated peptic ulcer
2013-01-01
Background Patients with perforated peptic ulcer (PPU) often present with acute, severe illness that carries a high risk for morbidity and mortality. Mortality ranges from 3-40% and several prognostic scoring systems have been suggested. The aim of this study was to review the available scoring systems for PPU patients, and to assess whether there is evidence to prefer one to the other. Material and methods We searched PubMed for the MeSH terms "perforated peptic ulcer", "scoring systems", "risk factors", "outcome prediction", "mortality", "morbidity" and the combinations of these terms. In addition to relevant scores introduced in the past (e.g. Boey score), we included recent studies (published between January 2000 and December 2012) that reported on scoring systems for prediction of morbidity and mortality in PPU patients. Results A total of ten different scoring systems used to predict outcome in PPU patients were identified; the Boey score, the Hacettepe score, the Jabalpur score, the peptic ulcer perforation (PULP) score, the ASA score, the Charlson comorbidity index, the sepsis score, the Mannheim Peritonitis Index (MPI), the Acute physiology and chronic health evaluation II (APACHE II), the simplified acute physiology score II (SAPS II), the Mortality probability models II (MPM II), the Physiological and Operative Severity Score for the enumeration of Mortality and Morbidity physical sub-score (POSSUM-phys score). Only four of the scores were specifically constructed for PPU patients. In five studies the accuracy of outcome prediction of different scoring systems was evaluated by receiver operating characteristics curve (ROC) analysis, and the corresponding area under the curve (AUC) among studies compared. Considerable variation in performance both between different scores and between different studies was found, with the lowest and highest AUC reported between 0.63 and 0.98, respectively.
Conclusion While the Boey score and the ASA score are most commonly used to predict outcome for PPU patients, considerable variations in accuracy for outcome prediction were shown. Other scoring systems are hampered by a lack of validation or by their complexity that precludes routine clinical use. While the PULP score seems promising it needs external validation before widespread use. PMID:23574922
Modelling and Predicting Backstroke Start Performance Using Non-Linear and Linear Models.
de Jesus, Karla; Ayala, Helon V H; de Jesus, Kelly; Coelho, Leandro Dos S; Medeiros, Alexandre I A; Abraldes, José A; Vaz, Mário A P; Fernandes, Ricardo J; Vilas-Boas, João Paulo
2018-03-01
Our aim was to compare non-linear and linear mathematical model responses for backstroke start performance prediction. Ten swimmers randomly completed eight 15 m backstroke starts with feet over the wedge, four with hands on the highest horizontal and four on the vertical handgrip. Swimmers were videotaped using a dual media camera set-up, with the starts being performed over an instrumented block with four force plates. Artificial neural networks were applied to predict 5 m start time using kinematic and kinetic variables and to determine the accuracy of the mean absolute percentage error. Artificial neural networks predicted start time more robustly than the linear model with respect to changing training to the validation dataset for the vertical handgrip (3.95 ± 1.67 vs. 5.92 ± 3.27%). Artificial neural networks obtained a smaller mean absolute percentage error than the linear model in the horizontal (0.43 ± 0.19 vs. 0.98 ± 0.19%) and vertical handgrip (0.45 ± 0.19 vs. 1.38 ± 0.30%) using all input data. The best artificial neural network validation revealed a smaller mean absolute error than the linear model for the horizontal (0.007 vs. 0.04 s) and vertical handgrip (0.01 vs. 0.03 s). Artificial neural networks should be used for backstroke 5 m start time prediction due to the quite small differences among the elite level performances.
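The comparison metric in this abstract, mean absolute percentage error (MAPE), is the mean of |actual - predicted| / |actual| expressed as a percentage. A minimal sketch follows; the start times are hypothetical and do not come from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Raises ValueError on empty or mismatched inputs; actual values
    must be non-zero because each error is divided by the actual value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical measured vs. predicted 5 m start times in seconds.
measured = [2.0, 2.5, 4.0]
predicted = [2.1, 2.4, 4.0]
error_pct = mape(measured, predicted)
print(f"MAPE: {error_pct:.2f}%")
```

Because each error is scaled by the measured value, MAPE allows models to be compared across trials with different absolute start times.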
Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques.
Wang, Guanjin; Lam, Kin-Man; Deng, Zhaohong; Choi, Kup-Sze
2015-08-01
Bladder cancer is a common cancer in genitourinary malignancy. For muscle invasive bladder cancer, surgical removal of the bladder, i.e. radical cystectomy, is in general the definitive treatment which, unfortunately, carries significant morbidities and mortalities. Accurate prediction of the mortality of radical cystectomy is therefore needed. Statistical methods have conventionally been used for this purpose, despite the complex interactions of high-dimensional medical data. Machine learning has emerged as a promising technique for handling high-dimensional data, with increasing application in clinical decision support, e.g. cancer prediction and prognosis. Its ability to reveal the hidden nonlinear interactions and interpretable rules between dependent and independent variables is favorable for constructing models of effective generalization performance. In this paper, seven machine learning methods are utilized to predict the 5-year mortality of radical cystectomy, including back-propagation neural network (BPN), radial basis function (RBFN), extreme learning machine (ELM), regularized ELM (RELM), support vector machine (SVM), naive Bayes (NB) classifier and k-nearest neighbour (KNN), on a clinicopathological dataset of 117 patients of the urology unit of a hospital in Hong Kong. The experimental results indicate that RELM achieved the highest average prediction accuracy of 0.8 at a fast learning speed. The research findings demonstrate the potential of applying machine learning techniques to support clinical decision making. Copyright © 2015 Elsevier Ltd. All rights reserved.
Multivariate Models for Prediction of Human Skin Sensitization ...
One of the Interagency Coordinating Committee on the Validation of Alternative Methods' (ICCVAM) top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensitization suggests that no single alternative method will replace the currently accepted animal tests. ICCVAM is evaluating an integrated approach to testing and assessment based on the adverse outcome pathway for skin sensitization that uses machine learning approaches to predict human skin sensitization hazard. We combined data from three in chemico or in vitro assays (the direct peptide reactivity assay (DPRA), human cell line activation test (h-CLAT) and KeratinoSens(TM) assay), six physicochemical properties and an in silico read-across prediction of skin sensitization hazard into 12 variable groups. The variable groups were evaluated using two machine learning approaches, logistic regression and support vector machine, to predict human skin sensitization hazard. Models were trained on 72 substances and tested on an external set of 24 substances. The six models (three logistic regression and three support vector machine) with the highest accuracy (92%) used: (1) DPRA, h-CLAT and read-across; (2) DPRA, h-CLAT, read-across and KeratinoSens; or (3) DPRA, h-CLAT, read-across, KeratinoSens and log P. The models performed better at predicting human skin sensitization hazard than the murine
NASA Astrophysics Data System (ADS)
Wang, Qianxin; Hu, Chao; Xu, Tianhe; Chang, Guobin; Hernández Moraleda, Alberto
2017-12-01
Analysis centers (ACs) for global navigation satellite systems (GNSSs) cannot accurately obtain real-time Earth rotation parameters (ERPs). Thus, the prediction of ultra-rapid orbits in the international terrestrial reference system (ITRS) has to utilize the predicted ERPs issued by the International Earth Rotation and Reference Systems Service (IERS) or the International GNSS Service (IGS). In this study, the accuracy of ERPs predicted by IERS and IGS is analyzed. The error of the ERPs predicted for one day can reach 0.15 mas and 0.053 ms in polar motion and UT1-UTC direction, respectively. Then, the impact of ERP errors on ultra-rapid orbit prediction by GNSS is studied. The methods for orbit integration and frame transformation in orbit prediction with introduced ERP errors dominate the accuracy of the predicted orbit. Experimental results show that the transformation from the geocentric celestial reference system (GCRS) to ITRS exerts the strongest effect on the accuracy of the predicted ultra-rapid orbit. To obtain the most accurate predicted ultra-rapid orbit, a corresponding real-time orbit correction method is developed. First, orbits without ERP-related errors are predicted on the basis of the ITRS observed part of the ultra-rapid orbit for use as reference. Then, the corresponding predicted orbit is transformed from GCRS to ITRS to adjust for the predicted ERPs. Finally, the corrected ERPs with error slopes are re-introduced to correct the predicted orbit in ITRS. To validate the proposed method, three experimental schemes are designed: function extrapolation, simulation experiments, and experiments with predicted ultra-rapid orbits and international GNSS Monitoring and Assessment System (iGMAS) products. Experimental results show that using the proposed correction method with IERS products considerably improved the accuracy of ultra-rapid orbit prediction (except the geosynchronous BeiDou orbits).
The accuracy of orbit prediction is enhanced by at least 50% (error related to ERP) when a highly accurate observed orbit is used with the correction method. For iGMAS-predicted orbits, the accuracy improvement ranges from 8.5% for the inclined BeiDou orbits to 17.99% for the GPS orbits. This demonstrates that the correction method proposed by this study can optimize the ultra-rapid orbit prediction.
Labanca, Ludimila; Guimarães, Fernando Sales; Costa-Guarisco, Letícia Pimenta; Couto, Erica de Araújo Brandão; Gonçalves, Denise Utsch
2017-11-01
Given the high prevalence of presbycusis and its detrimental effect on quality of life, screening tests can be useful tools for detecting hearing loss in primary care settings. This study therefore aimed to determine the accuracy and reproducibility of the whispered voice test as a screening method for detecting hearing impairment in older people. This cross-sectional study was carried out with 210 older adults aged between 60 and 97 years who underwent the whispered voice test employing ten different phrases and using audiometry as a reference test. Sensitivity, specificity and positive and negative predictive values were calculated and accuracy was measured by calculating the area under the ROC curve. The test was repeated on 20% of the ears by a second examiner to assess inter-examiner reproducibility (IER). The words and phrases that showed the highest area under the curve (AUC) and IER values were: "shoe" (AUC = 0.918; IER = 0.877), "window" (AUC = 0.917; IER = 0.869), "it looks like it's going to rain" (AUC = 0.911; IER = 0.810), and "the bus is late" (AUC = 0.900; IER = 0.810), demonstrating that the whispered voice test is a useful screening tool for detecting hearing loss among older people. It is proposed that these words and phrases should be incorporated into the whispered voice test protocol.
Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier
NASA Astrophysics Data System (ADS)
Wang, Leilei; Cheng, Jinyong
2018-03-01
Protein secondary structure prediction is a core problem in bioinformatics and an important research area. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover the construction of the model, the classification parameters, and related steps. The data set is the standard CB513 protein data set. Accuracy is evaluated with 3-fold cross-validation and reported as Q3 accuracy. The results show that the autoencoder network improved the prediction accuracy of protein secondary structure.
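The evaluation described in this abstract (Q3 accuracy under 3-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the three states are assumed to be encoded as the letters H, E, and C, and the interleaved fold assignment is an assumption, since the abstract does not say how the folds were drawn.

```python
def q3_accuracy(true_ss: str, pred_ss: str) -> float:
    """Fraction of residues whose 3-state label (H, E, or C) is predicted correctly."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("sequences must have the same length")
    if not true_ss:
        raise ValueError("sequences must be non-empty")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return correct / len(true_ss)


def three_fold_indices(n_items: int):
    """Yield (train, test) index lists for a simple interleaved 3-fold split.

    Assumption: items are assigned to folds by index modulo 3.
    """
    for fold in range(3):
        test = [i for i in range(n_items) if i % 3 == fold]
        train = [i for i in range(n_items) if i % 3 != fold]
        yield train, test
```

In a full evaluation, a model would be trained on each `train` list, Q3 would be computed on the matching `test` list, and the three Q3 values would be averaged.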
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
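The formula quoted in this abstract can be written out as a short sketch. The 91-point maximum and the discharge formula come from the abstract; the function names are hypothetical, and `mfim_effectiveness` is simply the same formula rearranged to solve for effectiveness.

```python
MFIM_MAX = 91  # maximum motor FIM score used in the abstract's formula


def mfim_effectiveness(admission: float, discharge: float) -> float:
    """Share of the possible gain (91 - admission) that was actually achieved."""
    if admission >= MFIM_MAX:
        raise ValueError("admission score is already at the maximum")
    return (discharge - admission) / (MFIM_MAX - admission)


def predicted_mfim_at_discharge(effectiveness: float, admission: float) -> float:
    """mFIM at discharge = effectiveness x (91 - mFIM at admission) + mFIM at admission."""
    return effectiveness * (MFIM_MAX - admission) + admission
```

For example, a patient admitted with an mFIM of 41 and a predicted effectiveness of 0.5 would have a predicted discharge score of 0.5 × (91 - 41) + 41 = 66.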
Open data mining for Taiwan's dengue epidemic.
Wu, ChienHsing; Kao, Shu-Chen; Shih, Chia-Hung; Kan, Meng-Hsuan
2018-07-01
By using a quantitative approach, this study examines the applicability of data mining technique to discover knowledge from open data related to Taiwan's dengue epidemic. We compare results when Google trend data are included or excluded. Data sources are government open data, climate data, and Google trend data. Research findings from analysis of 70,914 cases are obtained. Location and time (month) in open data show the highest classification power followed by climate variables (temperature and humidity), whereas gender and age show the lowest values. Both prediction accuracy and simplicity decrease when Google trends are considered (respectively 0.94 and 0.37, compared to 0.96 and 0.46). The article demonstrates the value of open data mining in the context of public health care. Copyright © 2018 Elsevier B.V. All rights reserved.
An expert fitness diagnosis system based on elastic cloud computing.
Tseng, Kevin C; Wu, Chia-Chuan
2014-01-01
This paper presents an expert diagnosis system based on cloud computing. It classifies a user's fitness level based on supervised machine learning techniques. This system is able to learn and make customized diagnoses according to the user's physiological data, such as age, gender, and body mass index (BMI). In addition, an elastic algorithm based on Poisson distribution is presented to allocate computation resources dynamically. It predicts the required resources in the future according to the exponential moving average of past observations. The experimental results show that Naïve Bayes is the best classifier with the highest accuracy (90.8%) and that the elastic algorithm is able to capture tightly the trend of requests generated from the Internet and thus assign corresponding computation resources to ensure the quality of service.
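The resource-prediction idea in this abstract (an exponential moving average of past request counts combined with a Poisson assumption) might look roughly like the sketch below. The abstract does not give the actual algorithm, so everything here is an assumption made for illustration: the function names, the smoothing factor `alpha`, and the use of a Poisson quantile (`coverage`) to turn the forecast rate into a capacity figure.

```python
import math


def ema_forecast(observations, alpha=0.5):
    """Exponential moving average of past observations, used as the next-step forecast."""
    if not observations:
        raise ValueError("need at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ema = observations[0]
    for x in observations[1:]:
        ema = alpha * x + (1 - alpha) * ema
    return ema


def poisson_capacity(rate, coverage=0.95):
    """Smallest k such that P(X <= k) >= coverage for X ~ Poisson(rate).

    Hypothetical sizing rule: provision for k requests so that demand
    exceeds capacity with probability at most 1 - coverage.
    """
    if rate < 0 or not 0 < coverage < 1:
        raise ValueError("rate must be >= 0 and coverage in (0, 1)")
    k = 0
    p = math.exp(-rate)  # P(X = 0)
    cdf = p
    while cdf < coverage:
        k += 1
        p *= rate / k  # P(X = k) from P(X = k - 1)
        cdf += p
    return k
```

A caller could combine the two as `poisson_capacity(ema_forecast(recent_counts))`, recomputing whenever a new request count arrives.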
Kvavilashvili, Lia; Ford, Ruth M
2014-11-01
It is well documented that young children greatly overestimate their performance on tests of retrospective memory (RM), but the current investigation is the first to examine children's prediction accuracy for prospective memory (PM). Three studies were conducted, each testing a different group of 5-year-olds. In Study 1 (N=46), participants were asked to predict their success in a simple event-based PM task (remembering to convey a message to a toy mole if they encountered a particular picture during a picture-naming activity). Before naming the pictures, children listened to either a reminder story or a neutral story. Results showed that children were highly accurate in their PM predictions (78% accuracy) and that the reminder story appeared to benefit PM only in children who predicted they would remember the PM response. In Study 2 (N=80), children showed high PM prediction accuracy (69%) regardless of whether the cue was specific or general and despite typical overoptimism regarding their performance on a 10-item RM task using item-by-item prediction. Study 3 (N=35) showed that children were prone to overestimate RM even when asked about their ability to recall a single item-the mole's unusual name. In light of these findings, we consider possible reasons for children's impressive PM prediction accuracy, including the potential involvement of future thinking in performance predictions and PM. Copyright © 2014 Elsevier Inc. All rights reserved.
Erbe, M; Hayes, B J; Matukumalli, L K; Goswami, S; Bowman, P J; Reich, C M; Mason, B A; Goddard, M E
2012-07-01
Achieving accurate genomic estimated breeding values for dairy cattle requires a very large reference population of genotyped and phenotyped individuals. Assembling such reference populations has been achieved for breeds such as Holstein, but is challenging for breeds with fewer individuals. An alternative is to use a multi-breed reference population, such that smaller breeds gain some advantage in accuracy of genomic estimated breeding values (GEBV) from information from larger breeds. However, this requires that marker-quantitative trait loci associations persist across breeds. Here, we assessed the gain in accuracy of GEBV in Jersey cattle as a result of using a combined Holstein and Jersey reference population, with either 39,745 or 624,213 single nucleotide polymorphism (SNP) markers. The surrogate used for accuracy was the correlation of GEBV with daughter trait deviations in a validation population. Two methods were used to predict breeding values, either a genomic BLUP (GBLUP_mod), or a new method, BayesR, which used a mixture of normal distributions as the prior for SNP effects, including one distribution that set SNP effects to zero. The GBLUP_mod method scaled both the genomic relationship matrix and the additive relationship matrix to a base at the time the breeds diverged, and regressed the genomic relationship matrix to account for sampling errors in estimating relationship coefficients due to a finite number of markers, before combining the 2 matrices. Although these modifications did result in less biased breeding values for Jerseys compared with an unmodified genomic relationship matrix, BayesR gave the highest accuracies of GEBV for the 3 traits investigated (milk yield, fat yield, and protein yield), with an average increase in accuracy compared with GBLUP_mod across the 3 traits of 0.05 for both Jerseys and Holsteins. 
The advantage was limited for either Jerseys or Holsteins in using 624,213 SNP rather than 39,745 SNP (0.01 for Holsteins and 0.03 for Jerseys, averaged across traits). Even this limited and nonsignificant advantage was only observed when BayesR was used. An alternative panel, which extracted the SNP in the transcribed part of the bovine genome from the 624,213 SNP panel (to give 58,532 SNP), performed better, with an increase in accuracy of 0.03 for Jerseys across traits. This panel captures much of the increased genomic content of the 624,213 SNP panel, with the advantage of a greatly reduced number of SNP effects to estimate. Taken together, using this panel, a combined breed reference and using BayesR rather than GBLUP_mod increased the accuracy of GEBV in Jerseys from 0.43 to 0.52, averaged across the 3 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Auinger, Hans-Jürgen; Schönleben, Manfred; Lehermeier, Christina; Schmidt, Malthe; Korzun, Viktor; Geiger, Hartwig H; Piepho, Hans-Peter; Gordillo, Andres; Wilde, Peer; Bauer, Eva; Schön, Chris-Carolin
2016-11-01
Genomic prediction accuracy can be significantly increased by model calibration across multiple breeding cycles as long as selection cycles are connected by common ancestors. In hybrid rye breeding, application of genome-based prediction is expected to increase selection gain because of long selection cycles in population improvement and development of hybrid components. Essentially two prediction scenarios arise: (1) prediction of the genetic value of lines from the same breeding cycle in which model training is performed and (2) prediction of lines from subsequent cycles. It is the latter from which a reduction in cycle length and consequently the strongest impact on selection gain is expected. We empirically investigated genome-based prediction of grain yield, plant height and thousand kernel weight within and across four selection cycles of a hybrid rye breeding program. Prediction performance was assessed using genomic and pedigree-based best linear unbiased prediction (GBLUP and PBLUP). A total of 1040 S2 lines were genotyped with 16k SNPs and each year testcrosses of 260 S2 lines were phenotyped in seven or eight locations. The performance gap between GBLUP and PBLUP increased significantly for all traits when model calibration was performed on aggregated data from several cycles. Prediction accuracies obtained from cross-validation were in the order of 0.70 for all traits when data from all cycles (N_CS = 832) were used for model training and exceeded within-cycle accuracies in all cases. As long as selection cycles are connected by a sufficient number of common ancestors and prediction accuracy has not reached a plateau when increasing sample size, aggregating data from several preceding cycles is recommended for predicting genetic values in subsequent cycles despite decreasing relatedness over time.
Time series modelling to forecast prehospital EMS demand for diabetic emergencies.
Villani, Melanie; Earnest, Arul; Nanayakkara, Natalie; Smith, Karen; de Courten, Barbora; Zoungas, Sophia
2017-05-05
Acute diabetic emergencies are often managed by prehospital Emergency Medical Services (EMS). The projected growth in prevalence of diabetes is likely to result in rising demand for prehospital EMS that are already under pressure. The aims of this study were to model the temporal trends and provide forecasts of prehospital attendances for diabetic emergencies. A time series analysis on monthly cases of hypoglycemia and hyperglycemia was conducted using data from the Ambulance Victoria (AV) electronic database between 2009 and 2015. Using the seasonal autoregressive integrated moving average (SARIMA) modelling process, different models were evaluated. The most parsimonious model with the highest accuracy was selected. Forty-one thousand four hundred fifty-four prehospital diabetic emergencies were attended over a seven-year period with an increase in the annual median monthly caseload between 2009 (484.5) and 2015 (549.5). Hypoglycemia (70%) and people with type 1 diabetes (48%) accounted for most attendances. The SARIMA (0,1,0,12) model provided the best fit, with a MAPE of 4.2% and predicts a monthly caseload of approximately 740 by the end of 2017. Prehospital EMS demand for diabetic emergencies is increasing. SARIMA time series models are a valuable tool to allow forecasting of future caseload with high accuracy and predict increasing cases of prehospital diabetic emergencies into the future. The model generated by this study may be used by service providers to allow appropriate planning and resource allocation of EMS for diabetic emergencies.
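The accuracy measure reported in this abstract, the mean absolute percentage error (MAPE), can be computed as in the sketch below. The seasonal differencing helper is included only as background on how seasonal ARIMA models treat a 12-month cycle; the function names are hypothetical, and the abstract does not state which differencing steps the selected model applied.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def seasonal_difference(series, period=12):
    """Subtract the value one season earlier: y[t] - y[t - period]."""
    if period <= 0 or len(series) <= period:
        raise ValueError("series must be longer than the seasonal period")
    return [series[t] - series[t - period] for t in range(period, len(series))]
```

A MAPE of 4.2%, as reported above, would mean that the monthly forecasts were off by 4.2% of the observed caseload on average.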
Zhang, Shangwei; Adrian, Lorenz; Schüürmann, Gerrit
2018-02-20
The bacterium Dehalococcoides, strain CBDB1, transforms aromatic halides through reductive dehalogenation. So far, however, the structures of its vitamin B12-containing dehalogenases are unknown, hampering clarification of the catalytic mechanism and substrate specificity as basis for targeted remediation strategies. This study employs a quantum chemical donor-acceptor approach for the Co(I)-substrate electron transfer. Computational characterization of the substrate electron affinity at carbon-halogen bonds enables discriminating aromatic halides ready for dehalogenation by strain CBDB1 (active substrates) from nondehalogenated (inactive) counterparts with 92% accuracy, covering 86 of 93 bromobenzenes, chlorobenzenes, chlorophenols, chloroanilines, polychlorinated biphenyls, and dibenzo-p-dioxins. Moreover, experimental regioselectivity is predicted with 78% accuracy by a site-specific parameter encoding the overlap potential between the Co(I) HOMO (highest occupied molecular orbital) and the lowest-energy unoccupied sigma-symmetry substrate MO (σ*), and the observed dehalogenation pathways are rationalized with a success rate of 81%. Molecular orbital analysis reveals that the most reactive unoccupied sigma-symmetry orbital of carbon-attached halogen X, σ*(C-X), mediates its reductive cleavage. The discussion includes predictions for untested substrates, thus providing opportunities for targeted experimental investigations. Overall, the presently introduced orbital interaction model supports the view that with bacterial strain CBDB1, an inner-sphere electron transfer from the supernucleophile B12 Co(I) to the halogen substituent of the aromatic halide is likely to represent the rate-determining step of the reductive dehalogenation.
Demystifying the Clinical Diagnosis of Greater Trochanteric Pain Syndrome in Women.
Ganderton, Charlotte; Semciw, Adam; Cook, Jill; Pizzari, Tania
2017-06-01
To evaluate the diagnostic accuracy of 10 clinical tests that can be used in the diagnosis of greater trochanteric pain syndrome (GTPS) in women, and to compare these clinical tests to magnetic resonance imaging (MRI) findings. Twenty-eight participants with GTPS (49.5 ± 22.0 years) and 18 asymptomatic participants (mean age ± standard deviation [SD], 52.5 ± 22.8 years) were included. A blinded physiotherapist performed 10 pain provocation tests potentially diagnostic for GTPS: palpation of the greater trochanter, resisted external derotation test, modified resisted external derotation test, standard and modified Ober's tests, Patrick's or FABER test, resisted hip abduction, single-leg stance test, and the resisted hip internal rotation test. A sample of 16 symptomatic and 17 asymptomatic women undertook a hip MRI scan. Gluteal tendons were evaluated and categorized as no pathology, mild tendinosis, moderate tendinosis/partial tear, or full-thickness tear. Clinical test analyses show high specificity, high positive predictive value, low to moderate sensitivity, and negative predictive value for most clinical tests. All symptomatic and 88% of asymptomatic participants had pathological gluteal tendon changes on MRI, from mild tendinosis to full-thickness tear. The study found the Patrick's or FABER test, palpation of the greater trochanter, resisted hip abduction, and the resisted external derotation test to have the highest diagnostic test accuracy for GTPS. Tendon pathology on MRI is seen in both symptomatic and asymptomatic women.
Sex Estimation From Sternal Measurements Using Multidetector Computed Tomography
Ekizoglu, Oguzhan; Hocaoglu, Elif; Inci, Ercan; Bilgili, Mustafa Gokhan; Solmaz, Dilek; Erdil, Irem; Can, Ismail Ozgur
2014-01-01
We aimed to show the utility and reliability of sternal morphometric analysis for sex estimation. Sex estimation is a very important step in forensic identification. Skeletal surveys are the main methods used in sex estimation studies. Morphometric analysis of the sternum may provide highly accurate data for sex discrimination. In this study, morphometric analysis of the sternum was evaluated in 1 mm chest computed tomography scans for sex estimation. Four hundred forty-three subjects (202 female, 241 male; mean age: 44 ± 8.1 years [range: 30–60 years]) were included in the study. Manubrium length (ML), mesosternum length (MSL), and the widths of Sternebra 1 (S1W) and Sternebra 3 (S3W) were measured, and the sternal index (SI) was calculated. Differences between sexes were evaluated by Student's t-test. Predictive factors of sex were determined by discrimination analysis and receiver operating characteristic (ROC) analysis. Male sternal measurement values were significantly higher than those of females (P < 0.001), while SI was significantly lower in males (P < 0.001). In discrimination analysis, MSL had a high accuracy rate, with 80.2% in females and 80.9% in males. MSL also had the best sensitivity (75.9%) and specificity (87.6%) values. Accuracy rates were above 80% in 3 stepwise discrimination analyses for both sexes. Stepwise 1 (ML, MSL, S1W, S3W) had the highest accuracy rate in stepwise discrimination analysis, with 86.1% in females and 83.8% in males. Our study showed that morphometric computed tomography analysis of the sternum might provide important information for sex estimation. PMID:25501090
Weng, Ziqing; Wolc, Anna; Shen, Xia; Fernando, Rohan L; Dekkers, Jack C M; Arango, Jesus; Settar, Petek; Fulton, Janet E; O'Sullivan, Neil P; Garrick, Dorian J
2016-03-19
Genomic estimated breeding values (GEBV) based on single nucleotide polymorphism (SNP) genotypes are widely used in animal improvement programs. It is typically assumed that the larger the number of animals is in the training set, the higher is the prediction accuracy of GEBV. The aim of this study was to quantify genomic prediction accuracy depending on the number of ancestral generations included in the training set, and to determine the optimal number of training generations for different traits in an elite layer breeding line. Phenotypic records for 16 traits on 17,793 birds were used. All parents and some selection candidates from nine non-overlapping generations were genotyped for 23,098 segregating SNPs. An animal model with pedigree relationships (PBLUP) and the BayesB genomic prediction model were applied to predict EBV or GEBV at each validation generation (progeny of the most recent training generation) based on varying numbers of immediately preceding ancestral generations. Prediction accuracy of EBV or GEBV was assessed as the correlation between EBV and phenotypes adjusted for fixed effects, divided by the square root of trait heritability. The optimal number of training generations that resulted in the greatest prediction accuracy of GEBV was determined for each trait. The relationship between optimal number of training generations and heritability was investigated. On average, accuracies were higher with the BayesB model than with PBLUP. Prediction accuracies of GEBV increased as the number of closely-related ancestral generations included in the training set increased, but reached an asymptote or slightly decreased when distant ancestral generations were used in the training set. The optimal number of training generations was 4 or more for high heritability traits but less than that for low heritability traits. 
For less heritable traits, limiting the training datasets to individuals closely related to the validation population resulted in the best predictions. The effect of adding distant ancestral generations in the training set on prediction accuracy differed between traits and the optimal number of necessary training generations is associated with the heritability of traits.
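The accuracy measure defined in this abstract (the correlation between estimated breeding values and adjusted phenotypes, divided by the square root of trait heritability) can be sketched as follows. The function names are hypothetical, and the sketch assumes the phenotypes have already been adjusted for fixed effects, a step that is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def prediction_accuracy(ebv, adjusted_phenotypes, heritability):
    """corr(EBV, adjusted phenotype) / sqrt(h^2), as described in the abstract."""
    if not 0 < heritability <= 1:
        raise ValueError("heritability must be in (0, 1]")
    return pearson(ebv, adjusted_phenotypes) / math.sqrt(heritability)
```

Dividing by the square root of heritability rescales the correlation with noisy phenotypes towards the correlation with true breeding values, so with sampling error the estimate for a low-heritability trait can exceed 1.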
Agreement and accuracy using the FIGO, ACOG and NICE cardiotocography interpretation guidelines.
Santo, Susana; Ayres-de-Campos, Diogo; Costa-Santos, Cristina; Schnettler, William; Ugwumadu, Austin; Da Graça, Luís M
2017-02-01
One of the limitations reported with cardiotocography is the modest interobserver agreement observed in tracing interpretation. This study compared agreement, reliability and accuracy of cardiotocography interpretation using the International Federation of Gynecology and Obstetrics, American College of Obstetrics and Gynecology and National Institute for Health and Care Excellence guidelines. A total of 151 tracings were evaluated by 27 clinicians from three centers where International Federation of Gynecology and Obstetrics, American College of Obstetrics and Gynecology and National Institute for Health and Care Excellence guidelines were routinely used. Interobserver agreement was evaluated using the proportions of agreement and reliability with the κ statistic. The accuracy of tracings classified as "pathological/category III" was assessed for prediction of newborn acidemia. For all measures, 95% confidence intervals were calculated. Cardiotocography classifications were more distributed with International Federation of Gynecology and Obstetrics (9, 52, 39%) and National Institute for Health and Care Excellence (30, 33, 37%) than with American College of Obstetrics and Gynecology (13, 81, 6%). The category with the highest agreement was American College of Obstetrics and Gynecology category II (proportions of agreement = 0.73, 95% confidence interval 0.70-0.76), and the ones with the lowest agreement were American College of Obstetrics and Gynecology categories I and III. Reliability was significantly higher with International Federation of Gynecology and Obstetrics (κ = 0.37, 95% confidence interval 0.31-0.43), and National Institute for Health and Care Excellence (κ = 0.33, 95% confidence interval 0.28-0.39) than with American College of Obstetrics and Gynecology (κ = 0.15, 95% confidence interval 0.10-0.21); however, all represent only slight/fair reliability.
International Federation of Gynecology and Obstetrics and National Institute for Health and Care Excellence showed a trend towards higher sensitivities in prediction of newborn acidemia (89 and 97%, respectively) than American College of Obstetrics and Gynecology (32%), but the latter achieved a significantly higher specificity (95%). With American College of Obstetrics and Gynecology guidelines there is high agreement in category II, low reliability, low sensitivity and high specificity in prediction of acidemia. With International Federation of Gynecology and Obstetrics and National Institute for Health and Care Excellence guidelines there is higher reliability, a trend towards higher sensitivity, and lower specificity in prediction of acidemia. © 2016 Nordic Federation of Societies of Obstetrics and Gynecology.
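The two agreement measures named in this abstract, the proportion of agreement and the κ statistic, can be illustrated for the simplest case of two raters. This sketch uses Cohen's κ with hypothetical function names and category labels; the study pooled 27 clinicians, so its multi-rater computation is likely to differ in detail.

```python
from collections import Counter


def proportion_agreement(rater_a, rater_b):
    """Share of items on which the two raters chose the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement beyond chance, (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_observed = proportion_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)
```

The contrast reported above (agreement of 0.73 in one category alongside κ = 0.15 overall) is possible because κ discounts agreement that would be expected by chance when most tracings fall into a single category.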
Can nutrient status of four woody plant species be predicted using field spectrometry?
NASA Astrophysics Data System (ADS)
Ferwerda, Jelle G.; Skidmore, Andrew K.
This paper demonstrates the potential of hyperspectral remote sensing to predict the chemical composition (i.e., nitrogen, phosphorus, calcium, potassium, sodium, and magnesium) of three tree species (i.e., willow, mopane and olive) and one shrub species (i.e., heather). Reflectance spectra, derivative spectra and continuum-removed spectra were compared in terms of predictive power. Results showed that the best predictions for nitrogen, phosphorus, and magnesium occur when using derivative spectra, and the best predictions for sodium, potassium, and calcium occur when using continuum-removed data. To test whether a general model for multiple species is also valid for individual species, a bootstrapping routine was applied. Prediction accuracies for the individual species were lower than prediction accuracies obtained for the combined dataset for all except one element/species combination, indicating that indices with high prediction accuracies at the landscape scale are less appropriate to detect the chemical content of individual species.
Heidelberg Retina Tomograph 3 machine learning classifiers for glaucoma detection
Townsend, K A; Wollstein, G; Danks, D; Sung, K R; Ishikawa, H; Kagemann, L; Gabriele, M L; Schuman, J S
2010-01-01
Aims To assess performance of classifiers trained on Heidelberg Retina Tomograph 3 (HRT3) parameters for discriminating between healthy and glaucomatous eyes. Methods Classifiers were trained using HRT3 parameters from 60 healthy subjects and 140 glaucomatous subjects. The classifiers were trained on all 95 variables and smaller sets created with backward elimination. Seven types of classifiers, including Support Vector Machines with radial basis (SVM-radial), and Recursive Partitioning and Regression Trees (RPART), were trained on the parameters. The area under the ROC curve (AUC) was calculated for classifiers, individual parameters and HRT3 glaucoma probability scores (GPS). Classifier AUCs and leave-one-out accuracy were compared with the highest individual parameter and GPS AUCs and accuracies. Results The highest AUC and accuracy for an individual parameter were 0.848 and 0.79, for vertical cup/disc ratio (vC/D). For GPS, global GPS performed best with AUC 0.829 and accuracy 0.78. SVM-radial with all parameters showed significant improvement over global GPS and vC/D with AUC 0.916 and accuracy 0.85. RPART with all parameters provided significant improvement over global GPS with AUC 0.899 and significant improvement over global GPS and vC/D with accuracy 0.875. Conclusions Machine learning classifiers of HRT3 data provide significant enhancement over current methods for detection of glaucoma. PMID:18523087
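The two performance measures used in this abstract, AUC and leave-one-out accuracy, can be sketched on a single parameter as below. This is an illustration only: the function names are hypothetical, and the midpoint-threshold classifier is a stand-in chosen for brevity, not one of the seven classifier types the study trained.

```python
def auc(scores_positive, scores_negative):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC area).
    """
    if not scores_positive or not scores_negative:
        raise ValueError("need at least one score in each group")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


def leave_one_out_accuracy(values, labels):
    """Leave-one-out accuracy of a 1-D threshold classifier.

    Assumption: label 1 tends to have larger values; the threshold is the
    midpoint of the two class means, refitted without the held-out item.
    Requires at least two items per class.
    """
    correct = 0
    for i, (held_value, held_label) in enumerate(zip(values, labels)):
        pos = [v for j, (v, y) in enumerate(zip(values, labels)) if j != i and y == 1]
        neg = [v for j, (v, y) in enumerate(zip(values, labels)) if j != i and y == 0]
        threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        predicted = 1 if held_value > threshold else 0
        correct += predicted == held_label
    return correct / len(values)
```

Refitting the threshold on every pass keeps the held-out eye out of the model that classifies it, which is what makes leave-one-out accuracy an out-of-sample estimate.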
The Influence of Delaying Judgments of Learning on Metacognitive Accuracy: A Meta-Analytic Review
ERIC Educational Resources Information Center
Rhodes, Matthew G.; Tauber, Sarah K.
2011-01-01
Many studies have examined the accuracy of predictions of future memory performance solicited through judgments of learning (JOLs). Among the most robust findings in this literature is that delaying predictions serves to substantially increase the relative accuracy of JOLs compared with soliciting JOLs immediately after study, a finding termed the…
Diagnostic Accuracy of Fall Risk Assessment Tools in People With Diabetic Peripheral Neuropathy
Pohl, Patricia S.; Mahnken, Jonathan D.; Kluding, Patricia M.
2012-01-01
Background Diabetic peripheral neuropathy affects nearly half of individuals with diabetes and leads to increased fall risk. Evidence addressing fall risk assessment for these individuals is lacking. Objective The purpose of this study was to identify which of 4 functional mobility fall risk assessment tools best discriminates, in people with diabetic peripheral neuropathy, between recurrent “fallers” and those who are not recurrent fallers. Design A cross-sectional study was conducted. Setting The study was conducted in a medical research university setting. Participants The participants were a convenience sample of 36 individuals between 40 and 65 years of age with diabetic peripheral neuropathy. Measurements Fall history was assessed retrospectively and was the criterion standard. Fall risk was assessed using the Functional Reach Test, the Timed “Up & Go” Test, the Berg Balance Scale, and the Dynamic Gait Index. Sensitivity, specificity, positive and negative likelihood ratios, and overall diagnostic accuracy were calculated for each fall risk assessment tool. Receiver operating characteristic curves were used to estimate modified cutoff scores for each fall risk assessment tool; indexes then were recalculated. Results Ten of the 36 participants were classified as recurrent fallers. When traditional cutoff scores were used, the Dynamic Gait Index and Functional Reach Test demonstrated the highest sensitivity at only 30%; the Dynamic Gait Index also demonstrated the highest overall diagnostic accuracy. When modified cutoff scores were used, all tools demonstrated improved sensitivity (80% or 90%). Overall diagnostic accuracy improved for all tests except the Functional Reach Test; the Timed “Up & Go” Test demonstrated the highest diagnostic accuracy at 88.9%. Limitations The small sample size and retrospective fall history assessment were limitations of the study. 
Conclusions Modified cutoff scores improved diagnostic accuracy for 3 of 4 fall risk assessment tools when testing people with diabetic peripheral neuropathy. PMID:22836004
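The indexes listed in this abstract (sensitivity, specificity, likelihood ratios, and overall diagnostic accuracy) all follow from a 2 × 2 table of test result against criterion standard. The sketch below uses a hypothetical function name, and the counts in the usage note are invented for illustration; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic indexes from true/false positive/negative counts."""
    if min(tp, fp, fn, tn) < 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need non-negative counts with both conditions present")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Predictive values are undefined (None) if no test result of that kind occurred.
        "ppv": tp / (tp + fp) if (tp + fp) else None,
        "npv": tn / (tn + fn) if (tn + fn) else None,
        # Likelihood ratios become infinite when the denominator is zero.
        "lr_positive": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_negative": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

With invented counts of 8 true positives, 2 false negatives, 4 false positives, and 22 true negatives (36 people, 10 with the condition), `diagnostic_metrics(8, 4, 2, 22)` gives a sensitivity of 0.8 and an overall accuracy of 30/36.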
Magnetic resonance imaging-ultrasound fusion biopsy for prediction of final prostate pathology.
Le, Jesse D; Stephenson, Samuel; Brugger, Michelle; Lu, David Y; Lieu, Patricia; Sonn, Geoffrey A; Natarajan, Shyam; Dorey, Frederick J; Huang, Jiaoti; Margolis, Daniel J A; Reiter, Robert E; Marks, Leonard S
2014-11-01
We explored the impact of magnetic resonance imaging-ultrasound fusion prostate biopsy on the prediction of final surgical pathology. A total of 54 consecutive men undergoing radical prostatectomy at UCLA after fusion biopsy were included in this prospective, institutional review board approved pilot study. Using magnetic resonance imaging-ultrasound fusion, tissue was obtained from a 12-point systematic grid (mapping biopsy) and from regions of interest detected by multiparametric magnetic resonance imaging (targeted biopsy). A single radiologist read all magnetic resonance imaging, and a single pathologist independently rereviewed all biopsy and whole mount pathology, blinded to prior interpretation and matched specimen. Gleason score concordance between biopsy and prostatectomy was the primary end point. Mean patient age was 62 years and median prostate specific antigen was 6.2 ng/ml. Final Gleason score at prostatectomy was 6 (13%), 7 (70%) and 8-9 (17%). A tertiary pattern was detected in 17 (31%) men. Of 45 high suspicion (image grade 4-5) magnetic resonance imaging targets 32 (71%) contained prostate cancer. The per core cancer detection rate was 20% by systematic mapping biopsy and 42% by targeted biopsy. The highest Gleason pattern at prostatectomy was detected by systematic mapping biopsy in 54%, targeted biopsy in 54% and a combination in 81% of cases. Overall 17% of cases were upgraded from fusion biopsy to final pathology and 1 (2%) was downgraded. The combination of targeted biopsy and systematic mapping biopsy was needed to obtain the best predictive accuracy. In this pilot study magnetic resonance imaging-ultrasound fusion biopsy allowed for the prediction of final prostate pathology with greater accuracy than that reported previously using conventional methods (81% vs 40% to 65%). If confirmed, these results will have important clinical implications. Copyright © 2014 American Urological Association Education and Research, Inc. 
Published by Elsevier Inc. All rights reserved.
Ingle, Brandall L; Veber, Brandon C; Nichols, John W; Tornero-Velez, Rogelio
2016-11-28
The free fraction of a xenobiotic in plasma (Fub) is an important determinant of chemical adsorption, distribution, metabolism, elimination, and toxicity, yet experimental plasma protein binding data are scarce for environmentally relevant chemicals. The presented work explores the merit of utilizing available pharmaceutical data to predict Fub for environmentally relevant chemicals via machine learning techniques. Quantitative structure-activity relationship (QSAR) models were constructed with k nearest neighbors (kNN), support vector machines (SVM), and random forest (RF) machine learning algorithms from a training set of 1045 pharmaceuticals. The models were then evaluated with independent test sets of pharmaceuticals (200 compounds) and environmentally relevant ToxCast chemicals (406 total, in two groups of 238 and 168 compounds). The selection of a minimal feature set of 10-15 2D molecular descriptors allowed for both informative feature interpretation and practical applicability domain assessment via a bounded box of descriptor ranges and principal component analysis. The diverse pharmaceutical and environmental chemical sets exhibit similarities in terms of chemical space (99-82% overlap), as well as comparable bias and variance in constructed learning curves. All the models exhibit significant predictability with mean absolute errors (MAE) in the range of 0.10-0.18 Fub. The models performed best for highly bound chemicals (MAE 0.07-0.12), neutrals (MAE 0.11-0.14), and acids (MAE 0.14-0.17). A consensus model had the highest accuracy across both pharmaceuticals (MAE 0.151-0.155) and environmentally relevant chemicals (MAE 0.110-0.131). The inclusion of the majority of the ToxCast test sets within the AD of the consensus model, coupled with high prediction accuracy for these chemicals, indicates the model provides a QSAR for Fub that is broadly applicable to both pharmaceuticals and environmentally relevant chemicals.
Application of a Hybrid Model for Predicting the Incidence of Tuberculosis in Hubei, China
Zhang, Guoliang; Huang, Shuqiong; Duan, Qionghong; Shu, Wen; Hou, Yongchun; Zhu, Shiyu; Miao, Xiaoping; Nie, Shaofa; Wei, Sheng; Guo, Nan; Shan, Hua; Xu, Yihua
2013-01-01
Background A prediction model for tuberculosis incidence is needed in China which may be used as a decision-supportive tool for planning health interventions and allocating health resources. Methods The autoregressive integrated moving average (ARIMA) model was first constructed with the data of tuberculosis report rate in Hubei Province from Jan 2004 to Dec 2011. The data from Jan 2012 to Jun 2012 were used to validate the model. Then the generalized regression neural network (GRNN)-ARIMA combination model was established based on the constructed ARIMA model. Finally, the fitting and prediction accuracy of the two models was evaluated. Results A total of 465,960 cases were reported between Jan 2004 and Dec 2011 in Hubei Province. The report rate of tuberculosis was highest in 2005 (119.932 per 100,000 population) and lowest in 2010 (84.724 per 100,000 population). The time series of tuberculosis report rate shows a gradual secular decline and a striking seasonal variation. The ARIMA (2, 1, 0) × (0, 1, 1)12 model was selected from several plausible ARIMA models. The residual mean square errors of the GRNN-ARIMA model and ARIMA model were 0.4467 and 0.6521 in the training part, and 0.0958 and 0.1133 in the validation part, respectively. The mean absolute error and mean absolute percentage error of the hybrid model were also lower than those of the ARIMA model. Discussion and Conclusions The gradual decline in tuberculosis report rate may be attributed to the effect of intensive measures on tuberculosis. The striking seasonal variation may have resulted from several factors. We suppose that a delay in the surveillance system may also have contributed to the variation. According to the fitting and prediction accuracy, the hybrid model outperforms the traditional ARIMA model, which may facilitate the allocation of health resources in China. PMID:24223232
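The three error measures this abstract uses to compare the models (mean square error, mean absolute error, mean absolute percentage error) can be sketched as below. This is a minimal illustration, not the authors' code: the function names and the two-point series are hypothetical and are not the Hubei report rates.

```python
from typing import Sequence


def mean_square_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of absolute differences between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute error relative to the observed value, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and forecast values, for illustration only.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]

mse = mean_square_error(observed, forecast)
mae = mean_absolute_error(observed, forecast)
mape = mean_absolute_percentage_error(observed, forecast)
```

A model with lower values on the held-out months would be preferred, which is the sense in which the abstract says the hybrid model outperforms ARIMA.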
ICU scoring systems allow prediction of patient outcomes and comparison of ICU performance.
Becker, R B; Zimmerman, J E
1996-07-01
Too much time and effort are wasted in attempts to pass final judgment on whether systems for ICU prognostication are "good or bad" and whether they "do or do not" provide a simple answer to the complex and often unpredictable question of individual mortality in the ICU. A substantial amount of data supports the usefulness of general ICU prognostic systems in comparing ICU performance with respect to a wide variety of endpoints, including ICU and hospital mortality, duration of stay, and efficiency of resource use. Work in progress is analyzing both general resource use and specific therapeutic interventions. It also is time to fully acknowledge that statistics never can predict whether a patient will die with 100% accuracy. There always will be exceptions to the rule, and physicians frequently will have information that is not included in prognostic models. In addition, the values of both physicians and patients frequently lead to differences in how a probability is interpreted; for some, a 95% probability estimate means that death is near and, for others, this estimate represents a tangible 5% chance for survival. This means that physicians must learn how to integrate such estimates into their medical decisions. In doing so, it is our hope that prognostic systems are not viewed as oversimplifying or automating clinical decisions. Rather, such systems provide objective data on which physicians may ground a spectrum of decisions regarding either escalation or withdrawal of therapy in critically ill patients. These systems do not dehumanize our decision-making process but, rather, help eliminate physician reliance on emotional, heuristic, poorly calibrated, or overly pessimistic subjective estimates. No decision regarding patient care can be considered best if the facts upon which it is based are imprecise or biased.
Future research will improve the accuracy of individual patient predictions but, even with the highest degree of precision, such predictions are useful only in support of, and not as a substitute for, good clinical judgment.
Acoustics Research of Propulsion Systems
NASA Technical Reports Server (NTRS)
Gao, Ximing; Houston, Janice D.
2014-01-01
The liftoff phase induces some of the highest acoustic loading over a broad frequency range for a launch vehicle. These external acoustic environments are used in the prediction of the internal vibration responses of the vehicle and components. Thus, predicting these liftoff acoustic environments is critical to the design requirements of any launch vehicle, but there are challenges. Present liftoff vehicle acoustic environment prediction methods utilize stationary data from previously conducted hold-down tests, i.e. static firings conducted in the 1960s, to generate 1/3 octave band Sound Pressure Level (SPL) spectra. These data sets are used to predict the liftoff acoustic environments for launch vehicles. To improve the accuracy and quality of acoustic loading predictions at liftoff for future launch vehicles such as the Space Launch System (SLS), non-stationary flight data from the Ares I-X were processed in PC-Signal in two forms, which included a simulated hold-down phase and the entire launch phase. In conjunction, the Prediction of Acoustic Vehicle Environments (PAVE) program was developed in MATLAB to allow for efficient predictions of sound pressure levels (SPLs) as a function of station number along the vehicle using semiempirical methods. This consisted, initially, of generating the Dimensionless Spectrum Function (DSF) and Dimensionless Source Location (DSL) curves from the Ares I-X flight data. These are then used in the MATLAB program to generate the 1/3 octave band SPL spectra. Concluding results show major differences in SPLs between the hold-down test data and the processed Ares I-X flight data, making the Ares I-X flight data more practical for future vehicle acoustic environment predictions.
van Strien, Maarten J; Keller, Daniela; Holderegger, Rolf; Ghazoul, Jaboury; Kienast, Felix; Bolliger, Janine
2014-03-01
For conservation managers, it is important to know whether landscape changes lead to increasing or decreasing gene flow. Although the discipline of landscape genetics assesses the influence of landscape elements on gene flow, no studies have yet used landscape-genetic models to predict gene flow resulting from landscape change. A species that has already been severely affected by landscape change is the large marsh grasshopper (Stethophyma grossum), which inhabits moist areas in fragmented agricultural landscapes in Switzerland. From transects drawn between all population pairs within maximum dispersal distance (< 3 km), we calculated several measures of landscape composition as well as some measures of habitat configuration. Additionally, a complete sampling of all populations in our study area allowed incorporating measures of population topology. These measures together with the landscape metrics formed the predictor variables in linear models with gene flow as response variable (F(ST) and mean pairwise assignment probability). With a modified leave-one-out cross-validation approach, we selected the model with the highest predictive accuracy. With this model, we predicted gene flow under several landscape-change scenarios, which simulated construction, rezoning or restoration projects, and the establishment of a new population. For some landscape-change scenarios, significant increase or decrease in gene flow was predicted, while for others little change was forecast. Furthermore, we found that the measures of population topology strongly increase model fit in landscape genetic analysis. This study demonstrates the use of predictive landscape-genetic models in conservation and landscape planning.
Prediction Model for the Carbonation of Post-Repair Materials in Carbonated RC Structures
Lee, Hyung-Min; Lee, Han-Seung; Singh, Jitendra Kumar
2017-01-01
Concrete carbonation damages the passive film that surrounds reinforcement bars, resulting in their exposure to corrosion. Studies on the prediction of concrete carbonation are thus of great significance. The repair of pre-built reinforced concrete (RC) structures by methods such as remodeling was recently introduced. While many studies have been conducted on the progress of carbonation in newly constructed buildings and RC structures fitted with new repair materials, the prediction of post-repair carbonation has not been considered. In the present study, accelerated carbonation was carried out to investigate RC structures following surface layer repair, in order to determine the carbonation depth. To validate the obtained results, a second experiment was performed under the same conditions to determine the carbonation depth by the Finite Difference Method (FDM) and Finite Element Method (FEM). The accelerated carbonation experiment and the FDM and FEM analyses produced very similar results, thus confirming that the carbonation depth in an RC structure after surface layer repair can be predicted with accuracy. The specimen repaired using inhibiting surface coating (ISC) had the highest carbonation penetration, 19.81 mm, while the corrosion inhibiting mortar (IM) had the lowest, 13.39 mm. In addition, the carbonation depth predicted by using the carbonation prediction formula after repair indicated that the analytical and experimental values are almost identical if the initial concentration of Ca(OH)2 is assumed to be 52%. PMID:28772852
de Man-van Ginkel, Janneke M; Hafsteinsdóttir, Thóra B; Lindeman, Eline; Ettema, Roelof G A; Grobbee, Diederick E; Schuurmans, Marieke J
2013-09-01
The timely detection of post-stroke depression is complicated by a decreasing length of hospital stay. Therefore, the Post-stroke Depression Prediction Scale was developed and validated. The Post-stroke Depression Prediction Scale is a clinical prediction model for the early identification of stroke patients at increased risk for post-stroke depression. The study included 410 consecutive stroke patients who were able to communicate adequately. Predictors were collected within the first week after stroke. Between 6 to 8 weeks after stroke, major depressive disorder was diagnosed using the Composite International Diagnostic Interview. Multivariable logistic regression models were fitted. A bootstrap-backward selection process resulted in a reduced model. Performance of the model was expressed by discrimination, calibration, and accuracy. The model included a medical history of depression or other psychiatric disorders, hypertension, angina pectoris, and the Barthel Index item dressing. The model had acceptable discrimination, based on an area under the receiver operating characteristic curve of 0.78 (0.72-0.85), and calibration (P value of the U-statistic, 0.96). Transforming the model to an easy-to-use risk-assessment table, the lowest risk category (sum score, <-10) showed a 2% risk of depression, which increased to 82% in the highest category (sum score, >21). The clinical prediction model enables clinicians to estimate the degree of the depression risk for an individual patient within the first week after stroke.
Illias, Hazlee Azil; Chai, Xin Rui; Abu Bakar, Ab Halim; Mokhlis, Hazlie
2015-01-01
It is important to predict the incipient fault in transformer oil accurately so that the maintenance of transformer oil can be performed correctly, reducing the cost of maintenance and minimising error. Dissolved gas analysis (DGA) has been widely used to predict the incipient fault in power transformers. However, the existing DGA methods sometimes yield inaccurate predictions of the incipient fault in transformer oil because each method is only suitable for certain conditions. Many previous works have reported on the use of intelligence methods to predict transformer faults. However, it is believed that the accuracy of the previously proposed methods can still be improved. Since artificial neural network (ANN) and particle swarm optimisation (PSO) techniques have never been used in the previously reported work, this work proposes a combination of ANN and various PSO techniques to predict the transformer incipient fault. The advantages of PSO are simplicity and easy implementation. The effectiveness of various PSO techniques in combination with ANN is validated by comparison with the results from the actual fault diagnosis, an existing diagnosis method and ANN alone. Comparison of the results from the proposed methods with the previously reported work was also performed to show the improvement of the proposed methods. It was found that the proposed ANN-Evolutionary PSO method yields a higher percentage of correct identification of transformer fault type than the existing diagnosis method and previously reported works. PMID:26103634
Prediction of anaerobic power values from an abbreviated WAnT protocol.
Stickley, Christopher D; Hetzler, Ronald K; Kimura, Iris F
2008-05-01
The traditional 30-second Wingate anaerobic test (WAnT) is a widely used anaerobic power assessment protocol. An abbreviated protocol has been shown to decrease the mild to severe physical discomfort often associated with the WAnT. Therefore, the purpose of this study was to determine whether a 20-second WAnT protocol could be used to accurately predict power values of a standard 30-second WAnT. In 96 college females, anaerobic power variables were assessed using a standard 30-second WAnT protocol. Maximum power values as well as instantaneous power at 10, 15, and 20 seconds were recorded. Based on these results, stepwise regression analysis was performed to determine the accuracy with which mean power, minimum power, 30-second power, and percentage of fatigue for a standard 30-second WAnT could be predicted from values obtained during the first 20 seconds of testing. Mean power values showed the highest level of predictability (R2 = 0.99) from the 20-second values. Minimum power, 30-second power, and percentage of fatigue also showed high levels of predictability (R2 = 0.91, 0.84, and 0.84, respectively) using only values obtained during the first 20 seconds of the protocol. An abbreviated (20-second) WAnT protocol appears to effectively predict results of a standard 30-second WAnT in college-age females, allowing for comparison of data to published norms. A shortened test may allow for a decrease in unwanted side effects associated with the traditional WAnT protocol.
Kim Oanh, Nguyen Thi; Leelasakultum, Ketsiri
2011-05-01
This study investigated the main causes of haze episodes in northwestern Thailand to provide early warning and prediction. In the absence of the emission input data required for chemical transport modeling to predict the haze, a climatological approach in combination with statistical analysis was used. An automatic meteorological classification scheme was developed using 8 years (2001-2008) of regional meteorological station data, which classified the prevailing synoptic patterns over Northern Thailand into 4 patterns. Pattern 2, occurring with high frequency in March, was found to be associated with the highest levels of 24 h PM10 in Chiangmai, the largest city in Northern Thailand. Typical features of this pattern were the dominance of thermal lows over India, Western China and Northern Thailand, with hot, dry and stagnant air in Northern Thailand. March 2007, the month with the most severe haze episode in Chiangmai, was found to have a high frequency of occurrence of pattern 2 coupled with the highest emission intensities from biomass open burning. Backward trajectories showed that, on haze episode days, air masses passed over the region of dense biomass fire hotspots before arriving at Chiangmai. A stepwise regression model was developed to predict 24 h PM10 for days of meteorology pattern 2 using February-April data of 2007-2009 and tested with 2004-2010 data. The model performed satisfactorily for the model development dataset (R2 = 87%) and test dataset (R2 = 81%), which appeared to be superior to a simple persistence regression of 24 h PM10 (R2 = 76%). Our developed model had an accuracy over 90% for the categorical forecast of PM10 > 120 μg/m3. The episode warning procedure would identify synoptic pattern 2 and predict 24 h PM10 in Chiangmai 24 h in advance. This approach would be applicable for air pollution episode management in other areas with complex terrain where similar conditions exist. Copyright © 2011 Elsevier B.V. All rights reserved.
Ellens, Harma; Deng, Shibing; Coleman, JoAnn; Bentz, Joe; Taub, Mitchell E.; Ragueneau-Majlessi, Isabelle; Chung, Sophie P.; Herédi-Szabó, Krisztina; Neuhoff, Sibylle; Palm, Johan; Balimane, Praveen; Zhang, Lei; Jamei, Masoud; Hanna, Imad; O’Connor, Michael; Bednarczyk, Dallas; Forsgard, Malin; Chu, Xiaoyan; Funk, Christoph; Guo, Ailan; Hillgren, Kathleen M.; Li, LiBin; Pak, Anne Y.; Perloff, Elke S.; Rajaraman, Ganesh; Salphati, Laurent; Taur, Jan-Shiang; Weitz, Dietmar; Wortelboer, Heleen M.; Xia, Cindy Q.; Xiao, Guangqing; Yamagata, Tetsuo
2013-01-01
In the 2012 Food and Drug Administration (FDA) draft guidance on drug-drug interactions (DDIs), a new molecular entity that inhibits P-glycoprotein (P-gp) may need a clinical DDI study with a P-gp substrate such as digoxin when the maximum concentration of inhibitor at steady state divided by IC50 ([I1]/IC50) is ≥0.1 or the concentration of inhibitor based on the highest approved dose dissolved in 250 ml divided by IC50 ([I2]/IC50) is ≥10. In this article, refined criteria are presented, determined by receiver operating characteristic analysis, using IC50 values generated by 23 laboratories. P-gp probe substrates were digoxin for polarized cell-lines and N-methyl quinidine or vinblastine for P-gp overexpressed vesicles. Inhibition of probe substrate transport was evaluated using 15 known P-gp inhibitors. Importantly, the criteria derived in this article take into account variability in IC50 values. Moreover, they are statistically derived based on the highest degree of accuracy in predicting true positive and true negative digoxin DDI results. The refined criteria of [I1]/IC50 ≥ 0.03 and [I2]/IC50 ≥ 45 and FDA criteria were applied to a test set of 101 in vitro-in vivo digoxin DDI pairs collated from the literature. The percentages of false negatives (none predicted but DDI observed) were similar, 10 and 12%, whereas the percentage of false positives (DDI predicted but not observed) decreased substantially from 51 to 40%, relative to the FDA criteria. On the basis of estimated overall variability in IC50 values, a theoretical 95% confidence interval calculation was developed for single laboratory IC50 values, translating into a range of [I1]/IC50 and [I2]/IC50 values. The extent to which this range falls above the criteria is a measure of risk associated with the decision, attributable to variability in IC50 values. PMID:23620486
NASA Astrophysics Data System (ADS)
Sembiring, J.; Jones, F.
2018-03-01
Red cell distribution width (RDW) to platelet ratio (RPR) can predict liver fibrosis and cirrhosis in chronic hepatitis B with relatively high accuracy. RPR was superior to other non-invasive methods of predicting liver fibrosis, such as the AST to ALT ratio, the AST to platelet ratio index, and FIB-4. The aim of this study was to assess the diagnostic accuracy of the RDW to platelet ratio for liver fibrosis in chronic hepatitis B patients, compared with Fibroscan. This cross-sectional study was conducted at Adam Malik Hospital from January to June 2015. We examined 34 patients with chronic hepatitis B, measuring RDW, platelet count, and Fibroscan results. Data were statistically analyzed. With the ROC procedure, RPR had an accuracy of 72.3% (95% CI: 84.1% - 97%). In this study, the RPR had a moderate ability to predict fibrosis degree (p = 0.029 with AUC > 70%). The cutoff value for RPR was 0.0591; sensitivity and specificity were 71.4% and 60%, positive predictive value (PPV) was 55.6%, negative predictive value (NPV) was 75%, the positive likelihood ratio was 1.79, and the negative likelihood ratio was 0.48. RPR can predict the degree of liver fibrosis in chronic hepatitis B patients with moderate accuracy.
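The likelihood ratios reported above follow from the stated sensitivity and specificity by the standard definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below reproduces that arithmetic; the function name is illustrative and the code is not taken from the study.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) from sensitivity and specificity given as fractions."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


# Sensitivity 71.4% and specificity 60%, as reported in the abstract.
lr_pos, lr_neg = likelihood_ratios(0.714, 0.60)
print(f"LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}")
```

Both results agree with the reported 1.79 and 0.48 to within rounding.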
Effectiveness of Link Prediction for Face-to-Face Behavioral Networks
Tsugawa, Sho; Ohsaki, Hiroyuki
2013-01-01
Research on link prediction for social networks has been actively pursued. In link prediction for a given social network obtained from time-windowed observation, new link formation in the network is predicted from the topology of the obtained network. In contrast, recent advances in sensing technology have made it possible to obtain face-to-face behavioral networks, which are social networks representing face-to-face interactions among people. However, the effectiveness of link prediction techniques for face-to-face behavioral networks has not yet been explored in depth. To clarify this point, here we investigate the accuracy of conventional link prediction techniques for networks obtained from the history of face-to-face interactions among participants at an academic conference. Our findings were (1) that conventional link prediction techniques predict new link formation with a precision of 0.30–0.45 and a recall of 0.10–0.20, (2) that prolonged observation of social networks often degrades the prediction accuracy, (3) that the proposed decaying weight method leads to higher prediction accuracy than can be achieved by observing all records of communication and simply using them unmodified, and (4) that the prediction accuracy for face-to-face behavioral networks is relatively high compared to that for non-social networks, but not as high as for other types of social networks. PMID:24339956
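To make the precision and recall figures concrete, here is a minimal sketch of one conventional link prediction technique, common-neighbor scoring, evaluated against later-formed links. The abstract does not say which techniques were tested, so the choice of common neighbors is an assumption, and the toy network and the set of "new" links are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(adjacency: dict[str, set[str]]) -> dict[tuple[str, str], int]:
    """Score each currently unlinked node pair by its number of shared neighbors."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v not in adjacency[u]:
            scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores


def precision_recall(predicted: set, actual: set) -> tuple[float, float]:
    """Precision = hits / predictions made; recall = hits / links that actually formed."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical network observed during the time window (undirected adjacency sets).
observed = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
# Hypothetical links that formed after the observation window.
new_links = {("a", "d"), ("c", "e"), ("a", "e")}

scores = common_neighbor_scores(observed)
# Predict the two highest-scoring pairs; ties are broken by pair name.
ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
predicted = set(ranked[:2])
precision, recall = precision_recall(predicted, new_links)
```

Recall is low here for the same reason it can be low in practice: a link between nodes with no shared neighbors, such as ("a", "e"), receives a score of zero and is never predicted.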
Scully, S; Butler, S T; Kelly, A K; Evans, A C O; Lonergan, P; Crowe, M A
2014-01-01
The aim was to assess the ability of corpus luteum (CL) and uterine ultrasound characteristics on d 18 to 21 to predict pregnancy status in lactating dairy cows. Ultrasound examinations were carried out on cows (n = 164) on d 18 to 21 following artificial insemination (AI). Images of the uterus and CL were captured using a Voluson i ultrasound device (General Electric Healthcare Systems, Vienna, Austria) equipped with a 12-MHz, multi frequency, linear array probe. Serum concentrations of progesterone were determined from blood samples collected at each ultrasound examination. Images of the CL were captured and stored for calculation of CL tissue area and echotexture. Images of the CL and associated blood flow area were captured and stored for analysis of luteal blood flow ratio. Longitudinal B-mode images of the uterine horns were stored for analysis of echotexture. Diagnosis of pregnancy was made at each ultrasound examination based on CL blood flow, CL size, and uterine echotexture. Pregnancy was confirmed by ultrasonography on d 30 after AI. The relationship between ultrasound measures and pregnancy outcome, as well as the accuracy of the pregnancy diagnosis made at each ultrasound examination was assessed. Progesterone concentrations and CL tissue area were greater in pregnant compared with nonpregnant cows on all days. The CL blood flow ratio was higher in pregnant compared with nonpregnant cows on d 20 and 21 after AI. Echotexture measures of the CL and uterus were not different between pregnant and nonpregnant cows on any day of examination. The best logistic regression model to predict pregnancy included scores for CL blood flow, CL size, and uterine echotexture on d 21 following AI. Accuracy of pregnancy diagnosis was highest on d 21, with sensitivity and specificity being 97.6 and 97.5%, respectively. Uterine echotexture scores were similar for pregnant and nonpregnant cows from d 18 to 20. 
On d 21, pregnant cows had higher uterine echotexture scores compared with nonpregnant cows. The logistic regression equation most likely to provide a correct pregnancy diagnosis in lactating dairy cows included the visual score for CL blood flow, CL size, and uterine echotexture on d 21 after AI. In support of this finding, the diagnostic accuracy for visual scores of CL blood flow, CL size, and uterine echotexture were also highest on d 21. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Wang, Ming; Long, Qi
2016-09-01
Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
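For orientation, the standard concordance statistic that this work builds on can be sketched as follows. This is the plain Harrell-type estimator, which simply skips pairs made unusable by censoring; it is not the IPCW estimator the abstract proposes, which reweights usable pairs by the inverse probability of remaining uncensored. The function name and data are hypothetical.

```python
from typing import Sequence


def concordance_statistic(
    times: Sequence[float], events: Sequence[int], risk_scores: Sequence[float]
) -> float:
    """Fraction of usable pairs in which the subject who failed earlier has the higher risk score."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if subject i had an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half concordant
    if usable == 0:
        raise ValueError("no usable pairs: every subject is censored or all times are tied")
    return concordant / usable


# Hypothetical follow-up times, event indicators (1 = event, 0 = censored), and model risk scores.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_stat = concordance_statistic(times, events, risk)
```

With these values, five pairs are usable and four are concordant, giving c = 0.8. The censored subject at time 6 contributes only as the later member of a pair, which is the source of the bias that censoring-weighted estimators aim to correct.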
Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.
2007-01-01
To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.
The value of cows in reference populations for genomic selection of new functional traits.
Buch, L H; Kargo, M; Berg, P; Lassen, J; Sørensen, A C
2012-06-01
Today, almost all reference populations consist of progeny tested bulls. However, older progeny tested bulls do not have reliable estimated breeding values (EBV) for new traits. Thus, to be able to select for these new traits, it is necessary to build a reference population. We used a deterministic prediction model to test the hypothesis that the value of cows in reference populations depends on the availability of phenotypic records. To test the hypothesis, we investigated different strategies of building a reference population for a new functional trait over a 10-year period. The trait was either recorded on a large scale (30 000 cows per year) or on a small scale (2000 cows per year). For large-scale recording, we compared four scenarios where the reference population consisted of 30 sires; 30 sires and 170 test bulls; 30 sires and 2000 cows; or 30 sires, 2000 cows and 170 test bulls in the first year with measurements of the new functional trait. In addition to varying the make-up of the reference population, we also varied the heritability of the trait (h2 = 0.05 v. 0.15). The results showed that a reference population of test bulls, cows and sires results in the highest accuracy of the direct genomic values (DGV) for a new functional trait, regardless of its heritability. For small-scale recording, we compared two scenarios where the reference population consisted of the 2000 cows with phenotypic records or the 30 sires of these cows in the first year with measurements of the new functional trait. The results showed that a reference population of cows results in the highest accuracy of the DGV whether the heritability is 0.05 or 0.15, because variation is lost when phenotypic data on cows are summarized in EBV of their sires. 
The main conclusions from this study are: (i) the fewer the phenotypic records, the larger the effect of including cows in the reference population; (ii) for small-scale recording, the accuracy of the DGV will continue to increase for several years, whereas the increases in the accuracy of the DGV quickly decrease with large-scale recording; (iii) it is possible to achieve accuracies of the DGV that enable selection for new functional traits recorded on a large scale within 3 years from commencement of recording; and (iv) a higher heritability benefits a reference population of cows more than a reference population of bulls.
Accuracy assessment of seven global land cover datasets over China
NASA Astrophysics Data System (ADS)
Yang, Yongke; Xiao, Pengfeng; Feng, Xuezhi; Li, Haixing
2017-03-01
Land cover (LC) is a vital foundation of Earth science. To date, several global LC datasets have been produced through the efforts of many scientific communities. To provide guidelines for data usage over China, nine LC maps from seven global LC datasets (IGBP DISCover, UMD, GLC, MCD12Q1, GLCNMO, CCI-LC, and GlobeLand30) were evaluated in this study. First, we compared their similarities and discrepancies in both area and spatial patterns, and analysed their inherent relations to data sources and classification schemes and methods. Next, five sets of validation sample units (VSUs) were collected to calculate their accuracy quantitatively. Further, we built a spatial analysis model and depicted their spatial variation in accuracy based on the five sets of VSUs. The results show that there are evident discrepancies among these LC maps in both area and spatial patterns. For LC maps produced by different institutes, GLC 2000 and CCI-LC 2000 have the highest overall spatial agreement (53.8%). For LC maps produced by the same institutes, the overall spatial agreement of CCI-LC 2000 and 2010, and of MCD12Q1 2001 and 2010, reaches 99.8% and 73.2%, respectively. More effort is still needed before these LC maps can be used as time series data for model input, since both CCI-LC and MCD12Q1 fail to represent the rapidly changing trend of several key LC classes in the early 21st century, in particular urban and built-up, snow and ice, water bodies, and permanent wetlands. With the highest spatial resolution, the overall accuracy of GlobeLand30 2010 is 82.39%. For the other six LC datasets with coarse resolution, CCI-LC 2010/2000 has the highest overall accuracy, followed in turn by MCD12Q1 2010/2001, GLC 2000, GLCNMO 2008, IGBP DISCover, and UMD. Although all maps exhibit high accuracy in homogeneous regions, local accuracies in other regions differ considerably, particularly in the Farming-Pastoral Zone of North China, the mountains in Northeast China, and the Southeast Hills. 
Data users who are interested in these regions should pay special attention to these differences.
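The two headline quantities in the land cover abstract above, overall accuracy and overall spatial agreement, can be illustrated with a minimal sketch. The function names and the toy inputs are hypothetical, and the sketch assumes both maps have already been harmonized to a common legend and grid, which the abstract does not describe in detail.

```python
def overall_accuracy(confusion):
    """Share of validation samples on the diagonal of a square confusion matrix.

    confusion[i][j] is the number of samples with reference class i
    that the map labels as class j (assumed layout).
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def spatial_agreement(map_a, map_b):
    """Share of co-located pixels that two maps assign to the same class."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must cover the same pixels")
    same = sum(1 for a, b in zip(map_a, map_b) if a == b)
    return same / len(map_a)


# Toy example: 20 validation samples, 2 classes; 4 pixels, 2 maps.
print(overall_accuracy([[8, 2], [1, 9]]))
print(spatial_agreement(["forest", "water", "urban", "crop"],
                        ["forest", "water", "crop", "crop"]))
```

Expressed as percentages, these are the same kinds of figures as the 82.39% overall accuracy and the 53.8% agreement quoted in the abstract.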
Inci, Ercan; Ekizoglu, Oguzhan; Turkay, Rustu; Aksoy, Sema; Can, Ismail Ozgur; Solmaz, Dilek; Sayin, Ibrahim
2016-10-01
Morphometric analysis of the mandibular ramus (MR) provides highly accurate data to discriminate sex. The objective of this study was to demonstrate the utility and accuracy of MR morphometric analysis for sex identification in a Turkish population. Four hundred fifteen Turkish patients (18-60 y; 201 male and 214 female) who had previously had multidetector computed tomography scans of the cranium were included in the study. Multidetector computed tomography images were obtained using three-dimensional reconstructions and a volume-rendering technique, and 8 linear and 3 angular values were measured. Univariate, bivariate, and multivariate discriminant analyses were performed, and the accuracy rates for determining sex were calculated. Mandibular ramus values produced high accuracy rates of 51% to 95.6%. Upper ramus vertical height had the highest rate at 95.6%, and bivariate analysis showed 89.7% to 98.6% accuracy rates with the highest ratios of mandibular flexure upper border and maximum ramus breadth. Stepwise discrimination analysis gave a 99% accuracy rate for all MR variables. Our study showed that the MR, in particular morphometric measures of the upper part of the ramus, can provide valuable data to determine sex in a Turkish population. The method combines both anthropological and radiologic studies.
Accuracy of parameterized proton range models; A comparison
NASA Astrophysics Data System (ADS)
Pettersen, H. E. S.; Chaar, M.; Meric, I.; Odland, O. H.; Sølie, J. R.; Röhrich, D.
2018-03-01
An accurate calculation of proton ranges in phantoms or detector geometries is crucial for decision making in proton therapy and proton imaging. To this end, several parameterizations of the range-energy relationship exist, with different levels of complexity and accuracy. In this study we compare the accuracy of four different parameterization models for proton range in water: two analytical models derived from the Bethe equation, and two different interpolation schemes applied to range-energy tables. In conclusion, a spline interpolation scheme yields the highest reproduction accuracy, while the shape of the energy loss-curve is best reproduced with the differentiated Bragg-Kleeman equation.
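As a rough illustration of the Bragg-Kleeman parameterization named above, the sketch below evaluates the range-energy rule R = alpha * E^p and its differentiated form for the stopping power. The parameter values (alpha = 0.0022 cm/MeV^p, p = 1.77 for water) are commonly quoted approximations and are an assumption here; the abstract does not state which values the authors used.

```python
# Assumed Bragg-Kleeman parameters for protons in water (approximate,
# not taken from the abstract above).
ALPHA = 0.0022  # cm / MeV^P
P = 1.77        # dimensionless exponent


def bragg_kleeman_range(energy_mev, alpha=ALPHA, p=P):
    """Range in cm from the Bragg-Kleeman rule R = alpha * E^p."""
    return alpha * energy_mev ** p


def differentiated_bragg_kleeman(depth_cm, initial_energy_mev, alpha=ALPHA, p=P):
    """Stopping power -dE/dx in MeV/cm at a given depth.

    Obtained by differentiating E(x) = ((R0 - x) / alpha)^(1/p),
    where R0 is the range of the initial energy.
    """
    r0 = bragg_kleeman_range(initial_energy_mev, alpha, p)
    residual = r0 - depth_cm
    if residual <= 0:
        raise ValueError("depth is at or beyond the end of range")
    return residual ** (1.0 / p - 1.0) / (p * alpha ** (1.0 / p))


# A 100 MeV proton: range in water and entrance stopping power.
print(bragg_kleeman_range(100.0))
print(differentiated_bragg_kleeman(0.0, 100.0))
```

With these assumed parameters the 100 MeV range comes out between 7 and 8 cm, and the stopping power rises toward the end of range, which is the Bragg-peak shape the abstract refers to.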
Hydrometeorological model for streamflow prediction
Tangborn, Wendell V.
1979-01-01
The hydrometeorological model described in this manual was developed to predict seasonal streamflow from water in storage in a basin using streamflow and precipitation data. The model, as described, applies specifically to the Skokomish, Nisqually, and Cowlitz Rivers, in Washington State, and more generally to streams in other regions that derive seasonal runoff from melting snow. Thus the techniques demonstrated for these three drainage basins can be used as a guide for applying this method to other streams. Input to the computer program consists of daily averages of gaged runoff of these streams, and daily values of precipitation collected at Longmire, Kid Valley, and Cushman Dam. Predictions are based on estimates of the absolute storage of water, predominately as snow: storage is approximately equal to basin precipitation less observed runoff. A pre-forecast test season is used to revise the storage estimate and improve the prediction accuracy. To obtain maximum prediction accuracy for operational applications with this model, a systematic evaluation of several hydrologic and meteorologic variables is first necessary. Six input options to the computer program that control prediction accuracy are developed and demonstrated. Predictions of streamflow can be made at any time and for any length of season, although accuracy is usually poor for early-season predictions (before December 1) or for short seasons (less than 15 days). The coefficient of prediction (CP), the chief measure of accuracy used in this manual, approaches zero during the late autumn and early winter seasons and reaches a maximum of about 0.85 during the spring snowmelt season. (Kosco-USGS)
Protein docking prediction using predicted protein-protein interface.
Li, Bin; Kihara, Daisuke
2012-01-10
Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies from case to case, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by a second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Südmeyer, Martin; Antke, Christina; Zizek, Tanja; Beu, Markus; Nikolaus, Susanne; Wojtecki, Lars; Schnitzler, Alfons; Müller, Hans-Wilhelm
2011-05-01
In vivo molecular imaging of pre- and postsynaptic nigrostriatal neuronal degeneration and sympathetic cardiac innervation with SPECT is used to distinguish idiopathic Parkinson disease (PD) from atypical parkinsonian disorder (APD). However, the diagnostic accuracy of these imaging approaches as stand-alone procedures is often unsatisfying. The aim of this study was therefore to evaluate to which extent diagnostic accuracy can be increased by their combined use together with a multidimensional statistical algorithm. The SPECT radiotracers (123)I-(S)-2-hydroxy-3-iodo-6-methoxy-N-[1-ethyl-2-pyrrodinyl)-methyl]benzamide (IBZM), (123)I-N-ω-fluoropropyl-2β-carbomethoxy-3β-(4-iodophenyl)nortropan (FP-CIT), and meta-(123)I-iodobenzylguanidine (MIBG) were used to assess striatal postsynaptic D(2) receptor binding, striatal presynaptic dopamine transporter binding, and myocardial adrenergic innervation, respectively. Thirty-one PD and 17 APD patients were prospectively investigated. PD and APD diagnoses were established using consensus criteria and reevaluated after 37.4 ± 12.4 and 26 ± 11.6 mo in PD and APD, respectively. Test accuracy (TA) for PD-APD differentiation was computed for all logical (Boolean) combinations of imaging modalities by receiver-operating-characteristic analysis--that is, after multidimensional optimization of cutoff values. Analysis showed moderate TA for PD-APD differentiation using each molecular approach alone (IBZM, 79%; MIBG, 73%; and FP-CIT, 73%). For combined use, the highest TA resulted under the assumption that at least 2 of the 3 biologic markers had to be positive for APD using the following cutoff values: 1.46 or less for IBZM, less than 2.10 for FP-CIT, and greater than 1.43 for MIBG. This algorithm distinguished APD from PD with a sensitivity of 94%, specificity of 94% (TA, 94%), positive predictive value of 89%, and negative predictive value of 97%. 
Results suggest that the multidimensional combination of FP-CIT, IBZM, and MIBG scintigraphy is likely to significantly increase TA in differentiating PD from APD. The differential diagnosis of degenerative parkinsonism may thus be facilitated.
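The combined decision rule reported above (at least 2 of the 3 markers positive for APD) can be written down directly. The sketch below uses the cutoff values quoted in the abstract; the function name and the example ratios are hypothetical, and the sketch is illustrative only, not a clinical tool.

```python
def classify_parkinsonism(ibzm, fp_cit, mibg):
    """Return "APD" if at least 2 of the 3 markers are positive for APD, else "PD".

    Cutoffs are those quoted in the abstract:
      IBZM   positive for APD at 1.46 or less
      FP-CIT positive for APD below 2.10
      MIBG   positive for APD above 1.43
    """
    votes = [ibzm <= 1.46, fp_cit < 2.10, mibg > 1.43]
    return "APD" if sum(votes) >= 2 else "PD"


# Hypothetical binding ratios for two patients.
print(classify_parkinsonism(ibzm=1.40, fp_cit=2.00, mibg=1.20))  # two positive markers
print(classify_parkinsonism(ibzm=1.60, fp_cit=2.50, mibg=1.50))  # one positive marker
```

A majority vote of this kind tolerates one discordant marker, which is one plausible reason the combined test accuracy exceeds that of any single modality.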
Adnan, Adnan A.; Jibrin, Jibrin M.; Kamara, Alpha Y.; Abdulrahman, Bassam L.; Shaibu, Abdulwahab S.; Garba, Ismail I.
2017-01-01
Field trials were carried out in the Sudan Savanna of Nigeria to assess the usefulness of the CERES-Maize crop model as a decision support tool for optimizing maize production through manipulation of planting dates. The calibration experiments comprised 20 maize varieties planted during the dry and rainy seasons of 2014 and 2015 at Bayero University Kano and Audu Bako College of Agriculture Dambatta. The trials for model evaluation were conducted in 16 different farmer fields across the Sudan (Bunkure and Garun-Mallam) and Northern Guinea (Tudun-Wada and Lere) Savannas using two of the calibrated varieties under four different sowing dates. The model accurately predicted grain yield, harvest index, and biomass of both varieties with low RMSE-values (below 5% of mean), high d-index (above 0.8), and high r-square (above 0.9) for the calibration trials. The time series data (tops weight, stem and leaf dry weights) were also predicted with high accuracy (% RMSEn above 70%, d-index above 0.88). Similar results were also observed for the evaluation trials, where all variables were simulated with high accuracies. Estimation efficiency (EF) values above 0.8 were observed for all the evaluation parameters. Seasonal and sensitivity analyses on Typic Plinthiustalfs and Plinthic Kanhaplustults in the Sudan and Northern Guinea Savannas were conducted. Results showed that planting extra-early maize varieties in late July and early maize in mid-June leads to production of the highest grain yields in the Sudan Savanna. In the Northern Guinea Savanna, planting extra-early maize in mid-July and early maize in late July produced the highest grain yields. Delaying planting in both agro-ecologies until mid-August leads to lower yields. Delaying planting to mid-August led to grain yield reduction of 39.2% for extra-early maize and 74.4% for early maize in the Sudan Savanna. 
In the Northern Guinea Savanna however, delaying planting to mid-August resulted in yield reduction of 66.9 and 94.3% for extra-early and early maize, respectively. PMID:28702039
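The goodness-of-fit statistics cited in the crop model abstract above (RMSE relative to the observed mean, the d-index, and the estimation efficiency EF) have standard textbook definitions, sketched below. The abstract does not give the formulas, so the Willmott form of the d-index and the Nash-Sutcliffe form of EF are assumptions, and the yield figures are invented for illustration.

```python
from math import sqrt


def rmse(observed, simulated):
    """Root-mean-square error between paired observations and simulations."""
    n = len(observed)
    return sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)


def normalized_rmse_percent(observed, simulated):
    """RMSE expressed as a percentage of the observed mean."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, simulated) / mean_obs


def d_index(observed, simulated):
    """Index of agreement (Willmott form, assumed); 1 means perfect agreement."""
    mean_obs = sum(observed) / len(observed)
    num = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2
              for o, s in zip(observed, simulated))
    return 1.0 - num / den


def estimation_efficiency(observed, simulated):
    """EF (Nash-Sutcliffe form, assumed); 1 means perfect, 0 means no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


# Invented grain yields (t/ha): observed versus simulated.
obs = [2.0, 4.0, 6.0]
sim = [3.0, 4.0, 5.0]
print(normalized_rmse_percent(obs, sim), d_index(obs, sim), estimation_efficiency(obs, sim))
```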
Na, Youngjun; Li, Dong Hua; Choi, Yongjun; Kim, Kyoung Hoon; Lee, Sang Rak
2018-03-02
Two experiments were conducted to determine the effects of feeding level on nutrient digestibility and enteric methane (CH4) emissions in growing goats and Sika deer. Three growing male goats (initial BW of 22.4 ± 0.9 kg) and three growing male deer (initial BW of 20.2 ± 4.8 kg) were each allotted to a respiration-metabolism chamber for an adaptation period of 7 d and a data collection period of 3 d. An experimental diet was offered to each animal at one of three feeding levels (1.5, 2.0, and 2.5% of BW) in a 3 × 3 Latin square design. The chambers were used for measuring enteric CH4 emission. Nutrient digestibility decreased linearly in goats as feeding level increased, whereas Sika deer digestibility was not affected by feeding level. The enteric production of CH4 expressed as g/kg DMI, g/kg organic matter intake (OMI), and % of gross energy intake (GEI) decreased linearly with increased feeding level in goats; however, that of Sika deer was not affected by feeding level. Six equations were estimated for predicting the enteric CH4 emission from goats and Sika deer. For goat, equation 1 was found to be of the highest accuracy: CH4 (g/day) = 6.2 (± 14.1) + 10.2 (± 7.01) × DMI (kg/day) + 0.0048 (± 0.0275) × DMD (g/kg) - 0.0070 (± 0.0187) × neutral detergent fiber digestibility (NDFD; g/kg). For Sika deer, equation 4 was found to be of the highest accuracy: CH4 (g/day) = - 13.0 (± 30.8) + 29.4 (± 3.93) × DMI (kg/day) + 0.046 (± 0.094) × DMD (g/kg) - 0.0363 (± 0.0636) × NDFD (g/kg). Increasing the feeding level increased CH4 production in both goats and Sika deer, and predictive models of enteric CH4 production by goats and Sika deer were estimated.
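The two best-performing prediction equations quoted above can be evaluated with a short sketch. Only the point estimates of the coefficients are used; the standard errors given in parentheses in the abstract are ignored. The function names and the example intake and digestibility values are hypothetical.

```python
def goat_ch4_g_per_day(dmi_kg, dmd_g_per_kg, ndfd_g_per_kg):
    """Equation 1 for goats, point estimates only."""
    return 6.2 + 10.2 * dmi_kg + 0.0048 * dmd_g_per_kg - 0.0070 * ndfd_g_per_kg


def sika_deer_ch4_g_per_day(dmi_kg, dmd_g_per_kg, ndfd_g_per_kg):
    """Equation 4 for Sika deer, point estimates only."""
    return -13.0 + 29.4 * dmi_kg + 0.046 * dmd_g_per_kg - 0.0363 * ndfd_g_per_kg


# Hypothetical animal: 0.5 kg/day DMI, DMD 700 g/kg, NDFD 500 g/kg.
print(goat_ch4_g_per_day(0.5, 700.0, 500.0))
print(sika_deer_ch4_g_per_day(0.5, 700.0, 500.0))
```

Because several coefficients have standard errors larger than the estimates themselves, predictions from such a sketch carry wide uncertainty and are best read as indicative.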
Na, Youngjun; Li, Dong Hua; Lee, Sang Rak
2017-07-01
Two experiments were conducted to determine the effects of forage-to-concentrate (F:C) ratio on the nutrient digestibility and enteric methane (CH4) emission in growing goats and Sika deer. Three male growing goats (body weight [BW] = 19.0±0.7 kg) and three male growing deer (BW = 19.3±1.2 kg) were respectively allotted to a 3×3 Latin square design with an adaptation period of 7 d and a data collection period of 3 d. Respiration-metabolism chambers were used for measuring the enteric CH4 emission. Treatments of low (25:75), moderate (50:50), and high (73:27) F:C ratios were given to both goats and Sika deer. Dry matter (DM) and organic matter (OM) digestibility decreased linearly with increasing F:C ratio in both goats and Sika deer. In both goats and Sika deer, the CH4 emissions expressed as g/d, g/kg BW^0.75, % of gross energy intake, g/kg DM intake (DMI), and g/kg OM intake (OMI) decreased linearly as the F:C ratio increased; however, the CH4 emissions expressed as g/kg digested DMI and OMI were not affected by the F:C ratio. Eight equations were derived for predicting the enteric CH4 emission from goats and Sika deer. For goat, equation 1 was found to be of the highest accuracy: CH4 (g/d) = 3.36 + 4.71 × DMI (kg/d) - 0.0036 × neutral detergent fiber concentrate (NDFC, g/kg) + 0.01563 × dry matter digestibility (DMD, g/kg) - 0.0108 × neutral detergent fiber digestibility (NDFD, g/kg). For Sika deer, equation 5 was found to be of the highest accuracy: CH4 (g/d) = 66.3 + 27.7 × DMI (kg/d) - 5.91 × NDFC (g/kg) - 7.11 × DMD (g/kg) + 0.0809 × NDFD (g/kg). Digested nutrient intake could be considered when determining the CH4 generation factor in goats and Sika deer. Finally, the enteric CH4 prediction models for goats and Sika deer were estimated.
Gutierrez, Benjamin T.; Plant, Nathaniel G.; Pendleton, Elizabeth A.; Thieler, E. Robert
2014-01-01
Sea-level rise is an ongoing phenomenon that is expected to continue and is projected to have a wide range of effects on coastal environments and infrastructure during the 21st century and beyond. Consequently, there is a need to assemble relevant datasets and to develop modeling or other analytical approaches to evaluate the likelihood of particular sea-level rise impacts, such as coastal erosion, and to inform coastal management decisions with this information. This report builds on previous work that compiled oceanographic and geomorphic data as part of the U.S. Geological Survey’s Coastal Vulnerability Index (CVI) for the U.S. Atlantic coast, and developed a Bayesian Network to predict shoreline-change rates based on sea-level rise plus variables that describe the hydrodynamic and geologic setting. This report extends the previous analysis to include the Gulf and Pacific coasts of the continental United States and Alaska and Hawaii, which required using methods applied to the USGS CVI dataset to extract data for these regions. The Bayesian Network converts inputs that include observations of local rates of relative sea-level change, mean wave height, mean tide range, a geomorphic classification, coastal slope, and observed shoreline-change rates to calculate the probability of the shoreline-erosion rate exceeding a threshold level of 1 meter per year for the coasts of the United States. The calculated probabilities were compared to the historical observations of shoreline change to evaluate the hindcast success rate of the most likely probability of shoreline change. Highest accuracy was determined for the coast of Hawaii (98 percent success rate) and lowest accuracy was determined for the Gulf of Mexico (34 percent success rate). The minimum success rate rose to nearly 80 percent (Atlantic and Gulf coasts) when success included shoreline-change outcomes that were adjacent to the most likely outcome. 
Additionally, the probabilistic approach determines the confidence in calculated outcomes as the probability of the most likely outcome. The confidence was highest along the Pacific coast and it was lowest along the Alaskan coast.
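The hindcast evaluation described above (most likely outcome versus observation, a relaxed criterion that also accepts adjacent outcomes, and confidence as the probability of the most likely outcome) can be sketched as follows. The bin layout, function names, and toy probabilities are hypothetical; the sketch assumes the shoreline-change bins are ordered so that "adjacent" means an index difference of 1.

```python
def most_likely_bin(probabilities):
    """Index of the outcome bin with the highest predicted probability."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def hindcast_success_rate(predicted, observed_bins, tolerance=0):
    """Share of locations where the most likely bin is within `tolerance`
    bins of the observed bin (tolerance=1 also accepts adjacent outcomes)."""
    hits = sum(1 for probs, obs in zip(predicted, observed_bins)
               if abs(most_likely_bin(probs) - obs) <= tolerance)
    return hits / len(observed_bins)


def confidence(probabilities):
    """Confidence in a prediction, taken as the probability of its most likely bin."""
    return max(probabilities)


# Three coastal locations, three ordered shoreline-change bins (toy values).
predicted = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.5, 0.3]]
observed = [0, 1, 1]
print(hindcast_success_rate(predicted, observed))               # strict criterion
print(hindcast_success_rate(predicted, observed, tolerance=1))  # adjacent outcomes count
```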
Li, Jin; Tran, Maggie; Siwabessy, Justy
2016-01-01
Spatially continuous predictions of seabed hardness are important baseline environmental information for sustainable management of Australia’s marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy and can be inferred from underwater video footage at limited locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models to predict seabed hardness using random forest (RF) based on the point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods that are variable importance (VI), averaged variable importance (AVI), knowledge informed AVI (KIAVI), Boruta and regularized RF (RRF) were tested based on predictive accuracy. Effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach used to pre-select predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of the important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) instead of the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to ‘small p and large n’ problems in environmental sciences. 
Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency and caution should be taken when applying filter FS methods in selecting predictive models. PMID:26890307
Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C
2014-01-01
Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection. PMID:24518889
Utsumi, Takanobu; Oka, Ryo; Endo, Takumi; Yano, Masashi; Kamijima, Shuichi; Kamiya, Naoto; Fujimura, Masaaki; Sekita, Nobuyuki; Mikami, Kazuo; Hiruta, Nobuyuki; Suzuki, Hiroyoshi
2015-11-01
The aim of this study is to validate and compare the predictive accuracy of two nomograms predicting the probability of Gleason sum upgrading between biopsy and radical prostatectomy pathology among representative patients with prostate cancer. We previously developed a nomogram, as did Chun et al. In this validation study, patients originated from two centers: Toho University Sakura Medical Center (n = 214) and Chibaken Saiseikai Narashino Hospital (n = 216). We assessed predictive accuracy using area under the curve values and constructed calibration plots to grasp the tendency for each institution. Both nomograms showed a high predictive accuracy in each institution, although the constructed calibration plots of the two nomograms underestimated the actual probability in Toho University Sakura Medical Center. Clinicians need to use calibration plots for each institution to correctly understand the tendency of each nomogram for their patients, even if each nomogram has a good predictive accuracy. © The Author 2015. Published by Oxford University Press.
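The area under the curve used above as the measure of predictive accuracy can be computed without plotting a curve, as the probability that a randomly chosen patient with upgrading receives a higher predicted probability than one without. This is a minimal sketch with invented data; the function name is hypothetical and ties are counted as one half.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented nomogram probabilities and observed upgrading (1 = upgraded).
print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

A high AUC says nothing about calibration, which is why the abstract pairs it with calibration plots per institution.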
Wu, Cai; Li, Liang
2018-05-15
This paper focuses on quantifying and estimating the predictive accuracy of prognostic models for time-to-event outcomes with competing events. We consider the time-dependent discrimination and calibration metrics, including the receiver operating characteristics curve and the Brier score, in the context of competing risks. To address censoring, we propose a unified nonparametric estimation framework for both discrimination and calibration measures, by weighting the censored subjects with the conditional probability of the event of interest given the observed data. The proposed method can be extended to time-dependent predictive accuracy metrics constructed from a general class of loss functions. We apply the methodology to a data set from the African American Study of Kidney Disease and Hypertension to evaluate the predictive accuracy of a prognostic risk score in predicting end-stage renal disease, accounting for the competing risk of pre-end-stage renal disease death, and evaluate its numerical performance in extensive simulation studies. Copyright © 2018 John Wiley & Sons, Ltd.
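For orientation, the Brier score mentioned above is the mean squared difference between the predicted event probability at a time horizon and the observed event status. The sketch below shows only the plain, uncensored form; it deliberately omits the weighting of censored subjects that the paper proposes, and the function name and data are hypothetical.

```python
def brier_score(predicted_probs, event_status):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Assumes every subject's status at the horizon is known (no censoring),
    which is a simplification relative to the method in the abstract.
    """
    n = len(event_status)
    return sum((p - y) ** 2 for p, y in zip(predicted_probs, event_status)) / n


# Invented predictions for three subjects; 1 = event of interest by the horizon.
print(brier_score([0.9, 0.2, 0.6], [1, 0, 0]))
```

Lower is better, and a constant prediction of 0.5 scores 0.25 regardless of the outcomes.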
Bianchi, Lorenzo; Schiavina, Riccardo; Borghesi, Marco; Bianchi, Federico Mineo; Briganti, Alberto; Carini, Marco; Terrone, Carlo; Mottrie, Alex; Gacci, Mauro; Gontero, Paolo; Imbimbo, Ciro; Marchioro, Giansilvio; Milanese, Giulio; Mirone, Vincenzo; Montorsi, Francesco; Morgia, Giuseppe; Novara, Giacomo; Porreca, Angelo; Volpe, Alessandro; Brunocilla, Eugenio
2018-04-06
To assess the predictive accuracy and the clinical value of a recent nomogram predicting cancer-specific mortality-free survival after surgery in pN1 prostate cancer patients through an external validation. We evaluated 518 prostate cancer patients treated with radical prostatectomy and pelvic lymph node dissection with evidence of nodal metastases at final pathology, at 10 tertiary centers. External validation was carried out using regression coefficients of the previously published nomogram. The performance characteristics of the model were assessed by quantifying predictive accuracy, according to the area under the curve in the receiver operating characteristic curve and model calibration. Furthermore, we systematically analyzed the specificity, sensitivity, positive predictive value and negative predictive value for each nomogram-derived probability cut-off. Finally, we implemented decision curve analysis, in order to quantify the nomogram's clinical value in routine practice. External validation showed inferior predictive accuracy compared with the internal validation (65.8% vs 83.3%, respectively). The discrimination (area under the curve) of the multivariable model was 66.7% (95% CI 60.1-73.0%) by testing with receiver operating characteristic curve analysis. The calibration plot showed an overestimation throughout the range of predicted cancer-specific mortality-free survival probabilities. However, in decision curve analysis, the nomogram's use showed a net benefit when compared with the scenarios of treating all patients or none. In an external setting, the nomogram showed inferior predictive accuracy and suboptimal calibration characteristics as compared to that reported in the original population. However, decision curve analysis showed a clinical net benefit, suggesting a clinical implication to correctly manage pN1 prostate cancer patients after surgery. © 2018 The Japanese Urological Association.
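The per-cut-off diagnostic measures and the decision curve analysis described above can be illustrated with the standard net benefit formula, net benefit = TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability. This is a minimal sketch: it assumes the model output is expressed as the risk of the event (the nomogram itself predicts event-free survival), and all names and data are hypothetical.

```python
def confusion_counts(risks, events, cutoff):
    """Count TP, FP, TN, FN when patients with risk >= cutoff are flagged."""
    tp = fp = tn = fn = 0
    for risk, event in zip(risks, events):
        flagged = risk >= cutoff
        if flagged and event:
            tp += 1
        elif flagged:
            fp += 1
        elif event:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Safe division; returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_summary(risks, events, cutoff):
    """Sensitivity, specificity, PPV and NPV at one probability cut-off."""
    tp, fp, tn, fn = confusion_counts(risks, events, cutoff)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def net_benefit(risks, events, threshold):
    """Net benefit of acting on the model at a threshold probability."""
    tp, fp, _, _ = confusion_counts(risks, events, threshold)
    n = len(events)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(events, threshold):
    """Net benefit of treating everyone; treating no one has net benefit 0."""
    prevalence = sum(events) / len(events)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


risks = [0.9, 0.8, 0.3, 0.2]   # invented predicted risks
events = [1, 0, 1, 0]          # invented outcomes (1 = event)
print(diagnostic_summary(risks, events, 0.5))
print(net_benefit(risks, events, 0.25), net_benefit_treat_all(events, 0.25))
```

A model shows clinical value at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the comparison the abstract reports.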
Assessing and Ensuring GOES-R Magnetometer Accuracy
NASA Technical Reports Server (NTRS)
Kronenwetter, Jeffrey; Carter, Delano R.; Todirita, Monica; Chu, Donald
2016-01-01
The GOES-R magnetometer accuracy requirement is 1.7 nanoteslas (nT). During quiet times (100 nT), accuracy is defined as absolute mean plus 3 sigma. During storms (300 nT), accuracy is defined as absolute mean plus 2 sigma. To achieve this, the sensor itself has better than 1 nT accuracy. Because zero offset and scale factor drift over time, it is also necessary to perform annual calibration maneuvers. To predict performance, we used covariance analysis and attempted to corroborate it with simulations. Although not perfect, the two generally agree and show the expected behaviors. With the annual calibration regimen, these predictions suggest that the magnetometers will meet their accuracy requirements.
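The accuracy definition above (absolute mean plus 3 sigma in quiet conditions, absolute mean plus 2 sigma during storms, compared against 1.7 nT) can be sketched as a small check. The error samples are invented, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator is used.

```python
import statistics

REQUIREMENT_NT = 1.7  # accuracy requirement quoted in the abstract


def accuracy_metric_nt(errors_nt, storm=False):
    """|mean error| + k * sigma, with k = 3 in quiet times and k = 2 in storms."""
    k = 2 if storm else 3
    # Population standard deviation is assumed here.
    return abs(statistics.fmean(errors_nt)) + k * statistics.pstdev(errors_nt)


def meets_requirement(errors_nt, storm=False, limit_nt=REQUIREMENT_NT):
    """True if the accuracy metric is within the requirement."""
    return accuracy_metric_nt(errors_nt, storm) <= limit_nt


# Invented measurement errors in nT.
quiet_errors = [0.1, -0.1, 0.1, -0.1]
print(accuracy_metric_nt(quiet_errors), meets_requirement(quiet_errors))
```

The sketch makes the trade-off visible: a fixed bias and the random scatter both consume the same 1.7 nT budget, so drift in zero offset directly reduces the noise that can be tolerated.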
Elgendi, Mohamed; Norton, Ian; Brearley, Matt; Abbott, Derek; Schuurmans, Dale
2013-01-01
Photoplethysmogram (PPG) monitoring is not only essential for critically ill patients in hospitals or at home, but also for those undergoing exercise testing. However, processing PPG signals measured after exercise is challenging, especially if the environment is hot and humid. In this paper, we propose a novel algorithm that can detect systolic peaks under challenging conditions, as in the case of emergency responders in tropical conditions. Accurate systolic-peak detection is an important first step for the analysis of heart rate variability. Algorithms based on local maxima-minima, first-derivative, and slope sum are evaluated, and a new algorithm is introduced to improve the detection rate. With 40 healthy subjects, the new algorithm demonstrates the highest overall detection accuracy (99.84% sensitivity, 99.89% positive predictivity). Existing algorithms, such as Billauer's, Li's and Zong's, have comparable although lower accuracy. However, the proposed algorithm presents an advantage for real-time applications by avoiding human intervention in threshold determination. For best performance, we show that a combination of two event-related moving averages with an offset threshold has an advantage in detecting systolic peaks, even in heat-stressed PPG signals.
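The idea of two event-related moving averages with an offset threshold can be sketched as follows. A short moving average follows the systolic peak, a longer one follows the whole beat, and samples where the short average exceeds the long average plus an offset form blocks that each yield one peak. The window lengths (0.111 s and 0.667 s), the offset fraction (0.02), and the omission of any band-pass prefiltering are assumptions made for this sketch; the abstract does not state these details.

```python
import math


def moving_average(values, window):
    """Centered moving average; the window is truncated at the signal edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def detect_systolic_peaks(signal, fs, w_peak_s=0.111, w_beat_s=0.667, beta=0.02):
    """Return sample indices of detected systolic peaks.

    signal: PPG samples (assumed already filtered), fs: sampling rate in Hz.
    """
    # Emphasize the systolic upstroke: clip negatives, then square.
    clipped_sq = [max(x, 0.0) ** 2 for x in signal]
    w_peak = max(1, round(w_peak_s * fs))
    w_beat = max(1, round(w_beat_s * fs))
    ma_peak = moving_average(clipped_sq, w_peak)
    ma_beat = moving_average(clipped_sq, w_beat)
    # Offset threshold: a fraction of the mean of the squared signal.
    offset = beta * sum(clipped_sq) / len(clipped_sq)

    peaks = []
    start = None
    for i in range(len(signal) + 1):
        inside = i < len(signal) and ma_peak[i] > ma_beat[i] + offset
        if inside and start is None:
            start = i                      # a block of interest opens
        elif not inside and start is not None:
            if i - start >= w_peak:        # reject blocks narrower than the peak window
                peaks.append(max(range(start, i), key=lambda k: signal[k]))
            start = None
    return peaks


# Synthetic 1 Hz "pulse" sampled at 100 Hz for 5 s.
fs = 100
wave = [math.sin(2 * math.pi * k / fs) for k in range(5 * fs)]
print(detect_systolic_peaks(wave, fs))
```

Because both averages are computed from the signal itself, the threshold adapts to amplitude changes without a manually chosen level, which matches the advantage for real-time use claimed in the abstract.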
NASA Astrophysics Data System (ADS)
Kuo, Tang-Wei; Chang, Shengming
Results of three-dimensional steady flow calculations are compared with existing pressure and velocity measurements of two manifold-type junctions. The junctions consist of a main duct and a side branch, both with the same rectangular cross section, with the side branch joining the main duct at an angle of either 90 or 45 degrees. Both combining and dividing flow configurations are considered for different total mass flow rates and different side-branch-to-main-duct mass flow ratios. One objective of this investigation was to assess the effects of numerical differencing scheme and mesh refinement on solution accuracy, and both parameters showed strong influences on the computed results. It is shown that calculations should be made with the highest possible level of numerical accuracy and grid resolution in regions of flow recirculation. Comparisons of computed and measured velocities, static pressures, and flow loss coefficients are presented in this paper. For most cases considered, the model predictions are in good agreement with the measurements. Results can be used as input loss coefficients to an engine-simulation code, in addition to being used to evaluate a specific junction design.
Improving Remote Health Monitoring: A Low-Complexity ECG Compression Approach
Al-Ali, Abdulla; Mohamed, Amr; Ward, Rabab
2018-01-01
Recent advances in mobile technology have created a shift towards using battery-driven devices in remote monitoring settings and smart homes. Clinicians are carrying out diagnostic and screening procedures based on the electrocardiogram (ECG) signals collected remotely for outpatients who need continuous monitoring. High-speed transmission and analysis of large recorded ECG signals are essential, especially with the increased use of battery-powered devices. Exploring low-power alternative compression methodologies that have high efficiency and that enable ECG signal collection, transmission, and analysis in a smart home or remote location is required. Compression algorithms based on adaptive linear predictors and decimation by a factor B/K are evaluated based on compression ratio (CR), percentage root-mean-square difference (PRD), and heartbeat detection accuracy of the reconstructed ECG signal. With two databases (153 subjects), the new algorithm demonstrates the highest compression performance (CR=6 and PRD=1.88) and overall detection accuracy (99.90% sensitivity, 99.56% positive predictivity) over both databases. The proposed algorithm presents an advantage for the real-time transmission of ECG signals using a faster and more efficient method, which meets the growing demand for more efficient remote health monitoring. PMID:29337892
Improving Remote Health Monitoring: A Low-Complexity ECG Compression Approach.
Elgendi, Mohamed; Al-Ali, Abdulla; Mohamed, Amr; Ward, Rabab
2018-01-16
Recent advances in mobile technology have created a shift towards using battery-driven devices in remote monitoring settings and smart homes. Clinicians are carrying out diagnostic and screening procedures based on the electrocardiogram (ECG) signals collected remotely for outpatients who need continuous monitoring. High-speed transmission and analysis of large recorded ECG signals are essential, especially with the increased use of battery-powered devices. Exploring low-power alternative compression methodologies that have high efficiency and that enable ECG signal collection, transmission, and analysis in a smart home or remote location is required. Compression algorithms based on adaptive linear predictors and decimation by a factor B/K are evaluated based on compression ratio (CR), percentage root-mean-square difference (PRD), and heartbeat detection accuracy of the reconstructed ECG signal. With two databases (153 subjects), the new algorithm demonstrates the highest compression performance (CR = 6 and PRD = 1.88) and overall detection accuracy (99.90% sensitivity, 99.56% positive predictivity) over both databases. The proposed algorithm presents an advantage for the real-time transmission of ECG signals using a faster and more efficient method, which meets the growing demand for more efficient remote health monitoring.
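The two compression metrics named above can be computed as in the following sketch. It assumes the common definitions CR = original size / compressed size and PRD = 100 * sqrt(sum((x - y)^2) / sum(x^2)); the abstract does not state which PRD variant was used (some variants subtract the signal mean first), and the function names and sample values are hypothetical.

```python
import math

def compression_ratio(original_bits, compressed_bits):
    """CR: how many times smaller the compressed record is."""
    return original_bits / compressed_bits

def prd(original, reconstructed):
    """Percentage root-mean-square difference between two equal-length signals."""
    num = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    den = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(num / den)

# Hypothetical samples, invented for illustration only.
ecg = [0.0, 0.5, 1.2, 0.4, -0.3, 0.1]
ecg_reconstructed = [0.0, 0.5, 1.1, 0.4, -0.3, 0.1]
print(compression_ratio(6000, 1000), prd(ecg, ecg_reconstructed))
```

A higher CR with a lower PRD indicates stronger compression at smaller reconstruction error, which is the trade-off the abstract reports.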
Texture analysis of common renal masses in multiple MR sequences for prediction of pathology
NASA Astrophysics Data System (ADS)
Hoang, Uyen N.; Malayeri, Ashkan A.; Lay, Nathan S.; Summers, Ronald M.; Yao, Jianhua
2017-03-01
This pilot study performs texture analysis on multiple magnetic resonance (MR) images of common renal masses for differentiation of renal cell carcinoma (RCC). Bounding boxes are drawn around each mass on one axial slice in T1 delayed sequence to use for feature extraction and classification. All sequences (T1 delayed, venous, arterial, pre-contrast phases, T2, and T2 fat saturated sequences) are co-registered and texture features are extracted from each sequence simultaneously. Random forest is used to construct models to classify lesions on 96 normal regions, 87 clear cell RCCs, 8 papillary RCCs, and 21 renal oncocytomas; ground truths are verified through pathology reports. The highest performance is seen in random forest model when data from all sequences are used in conjunction, achieving an overall classification accuracy of 83.7%. When using data from one single sequence, the overall accuracies achieved for T1 delayed, venous, arterial, and pre-contrast phase, T2, and T2 fat saturated were 79.1%, 70.5%, 56.2%, 61.0%, 60.0%, and 44.8%, respectively. This demonstrates promising results of utilizing intensity information from multiple MR sequences for accurate classification of renal masses.
Marcilio, Izabel; Hajat, Shakoor; Gouveia, Nelson
2013-08-01
This study aimed to develop different models to forecast the daily number of patients seeking emergency department (ED) care in a general hospital according to calendar variables and ambient temperature readings and to compare the models in terms of forecasting accuracy. The authors developed and tested six different models of ED patient visits using total daily counts of patient visits to an ED in Sao Paulo, Brazil, from January 1, 2008, to December 31, 2010. The first 33 months of the data set were used to develop the ED patient visits forecasting models (the training set), leaving the last 3 months to measure each model's forecasting accuracy by the mean absolute percentage error (MAPE). Forecasting models were developed using three different time-series analysis methods: generalized linear models (GLM), generalized estimating equations (GEE), and seasonal autoregressive integrated moving average (SARIMA). For each method, models were explored with and without the effect of mean daily temperature as a predictive variable. The daily mean number of ED visits was 389, ranging from 166 to 613. Data showed a weekly seasonal distribution, with highest patient volumes on Mondays and lowest patient volumes on weekends. There was little variation in daily visits by month. GLM and GEE models showed better forecasting accuracy than SARIMA models. For instance, the MAPEs from GLM models and GEE models at the first month of forecasting (October 2010) were 11.5 and 10.8% (models with and without control for the temperature effect, respectively), while the MAPEs from SARIMA models were 12.8 and 11.7%. For all models, controlling for the effect of temperature resulted in worse or similar forecasting ability than models with calendar variables alone, and forecasting accuracy was better for the short-term horizon (7 days in advance) than for the longer term (30 days in advance).
This study indicates that time-series models can be developed to provide forecasts of daily ED patient visits, and forecasting ability was dependent on the type of model employed and the length of the time horizon being predicted. In this setting, GLM and GEE models showed better accuracy than SARIMA models. Including information about ambient temperature in the models did not improve forecasting accuracy. Forecasting models based on calendar variables alone did in general detect patterns of daily variability in ED volume and thus could be used for developing an automated system for better planning of personnel resources. © 2013 by the Society for Academic Emergency Medicine.
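The mean absolute percentage error (MAPE) used to compare the models can be written as a short function. This is a minimal sketch of the standard definition; the daily counts below are invented for illustration and are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical daily ED visit counts and forecasts.
observed = [400, 380, 350, 420]
predicted = [360, 399, 385, 378]
print(mape(observed, predicted))
```

Because each error is scaled by the observed count, MAPE lets forecasts for busy and quiet days be compared on the same percentage scale.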
Dissolved oxygen content prediction in crab culture using a hybrid intelligent method
Yu, Huihui; Chen, Yingyi; Hassan, ShahbazGul; Li, Daoliang
2016-01-01
A precise predictive model is needed to obtain a clear understanding of the changing dissolved oxygen content in outdoor crab ponds, to assess how to reduce risk and to optimize water quality management. The uncertainties in the data from multiple sensors are a significant factor when building a dissolved oxygen content prediction model. To increase prediction accuracy, a new hybrid dissolved oxygen content forecasting model based on the radial basis function neural networks (RBFNN) data fusion method and a least squares support vector machine (LSSVM) with an optimal improved particle swarm optimization (IPSO) is developed. In the modelling process, the RBFNN data fusion method is used to improve information accuracy and provide more trustworthy training samples for the IPSO-LSSVM prediction model. The LSSVM is a powerful tool for achieving nonlinear dissolved oxygen content forecasting. In addition, an improved particle swarm optimization algorithm is developed to determine the optimal parameters for the LSSVM with high accuracy and generalizability. In this study, the comparison of the prediction results of different traditional models validates the effectiveness and accuracy of the proposed hybrid RBFNN-IPSO-LSSVM model for dissolved oxygen content prediction in outdoor crab ponds. PMID:27270206
Dissolved oxygen content prediction in crab culture using a hybrid intelligent method.
Yu, Huihui; Chen, Yingyi; Hassan, ShahbazGul; Li, Daoliang
2016-06-08
A precise predictive model is needed to obtain a clear understanding of the changing dissolved oxygen content in outdoor crab ponds, to assess how to reduce risk and to optimize water quality management. The uncertainties in the data from multiple sensors are a significant factor when building a dissolved oxygen content prediction model. To increase prediction accuracy, a new hybrid dissolved oxygen content forecasting model based on the radial basis function neural networks (RBFNN) data fusion method and a least squares support vector machine (LSSVM) with an optimal improved particle swarm optimization (IPSO) is developed. In the modelling process, the RBFNN data fusion method is used to improve information accuracy and provide more trustworthy training samples for the IPSO-LSSVM prediction model. The LSSVM is a powerful tool for achieving nonlinear dissolved oxygen content forecasting. In addition, an improved particle swarm optimization algorithm is developed to determine the optimal parameters for the LSSVM with high accuracy and generalizability. In this study, the comparison of the prediction results of different traditional models validates the effectiveness and accuracy of the proposed hybrid RBFNN-IPSO-LSSVM model for dissolved oxygen content prediction in outdoor crab ponds.
Goo, Yeung-Ja James; Chi, Der-Jang; Shen, Zong-De
2016-01-01
The purpose of this study is to establish rigorous and reliable going concern doubt (GCD) prediction models. This study first uses the least absolute shrinkage and selection operator (LASSO) to select variables and then applies data mining techniques to establish prediction models, such as neural network (NN), classification and regression tree (CART), and support vector machine (SVM). The samples of this study include 48 GCD listed companies and 124 NGCD (non-GCD) listed companies from 2002 to 2013 in the TEJ database. We conduct fivefold cross validation in order to identify the prediction accuracy. According to the empirical results, the prediction accuracy of the LASSO-NN model is 88.96% (Type I error rate is 12.22%; Type II error rate is 7.50%), the prediction accuracy of the LASSO-CART model is 88.75% (Type I error rate is 13.61%; Type II error rate is 14.17%), and the prediction accuracy of the LASSO-SVM model is 89.79% (Type I error rate is 10.00%; Type II error rate is 15.83%).
Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences
2018-01-01
Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal. PMID:29682424
Exploring Mouse Protein Function via Multiple Approaches.
Huang, Guohua; Chu, Chen; Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning; Cai, Yu-Dong
2016-01-01
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologues are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality.
Exploring Mouse Protein Function via Multiple Approaches
Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning
2016-01-01
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologues are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality. PMID:27846315
Revealing how network structure affects accuracy of link prediction
NASA Astrophysics Data System (ADS)
Yang, Jin-Xuan; Zhang, Xiao-Dong
2017-08-01
Link prediction plays an important role in network reconstruction and network evolution. How network structure affects the accuracy of link prediction is an interesting problem. In this paper we use common neighbors and the Gini coefficient to reveal the relation between network structure and prediction accuracy, which can provide a good reference for the choice of a suitable link prediction algorithm according to the network structure. Moreover, the statistical analysis reveals correlation between the common neighbors index, the Gini coefficient index and other indices that describe the network structure, such as Laplacian eigenvalues, clustering coefficient, degree heterogeneity, and assortativity of the network. Furthermore, a new method to predict missing links is proposed. The experimental results show that the proposed algorithm yields better prediction accuracy and robustness to the network structure than currently used methods for a variety of real-world networks.
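The two quantities named in this abstract can be illustrated with a small sketch: the common neighbors score for each unconnected node pair, and a Gini coefficient. Applying the Gini coefficient to the degree sequence is an assumption made here for illustration; the abstract does not say which quantity it is computed over. The toy graph and function names are hypothetical.

```python
from itertools import combinations

def common_neighbor_scores(adjacency):
    """Score each non-adjacent node pair by its number of shared neighbors."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v not in adjacency[u]:
            scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores

def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Toy undirected graph given as node -> set of neighbors.
graph = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
print(common_neighbor_scores(graph))
print(gini([len(neighbors) for neighbors in graph.values()]))
```

Pairs with higher scores are the more likely candidates for missing links under the common neighbors heuristic, and a larger Gini value indicates a more unequal distribution of the measured quantity.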
Variation and Likeness in Ambient Artistic Portraiture.
Hayes, Susan; Rheinberger, Nick; Powley, Meagan; Rawnsley, Tricia; Brown, Linda; Brown, Malcolm; Butler, Karen; Clarke, Ann; Crichton, Stephen; Henderson, Maggie; McCosker, Helen; Musgrave, Ann; Wilcock, Joyce; Williams, Darren; Yeaman, Karin; Zaracostas, T S; Taylor, Adam C; Wallace, Gordon
2018-06-01
An artist-led exploration of portrait accuracy and likeness involved 12 Artists producing 12 portraits referencing a life-size 3D print of the same Sitter. The works were assessed during a public exhibition, and the resulting likeness assessments were compared to portrait accuracy as measured using geometric morphometrics (statistical shape analysis). Our results are that, independently of the assessors' prior familiarity with the Sitter's face, the likeness judgements tended to be higher for less morphologically accurate portraits. The two highest rated were the portrait that most exaggerated the Sitter's distinctive features, and a portrait that was a more accurate (but not the most accurate) depiction. In keeping with research showing photograph likeness assessments involve recognition, we found familiar assessors rated the two highest ranked portraits even higher than those with some or no familiarity. In contrast, those lacking prior familiarity with the Sitter's face showed greater favour for the portrait with the highest morphological accuracy, and therefore most likely engaged in face-matching with the exhibited 3D print. Furthermore, our research indicates that abstraction in portraiture may not enhance likeness, and we found that when our 12 highly diverse portraits were statistically averaged, this resulted in a portrait that is more morphologically accurate than any of the individual artworks comprising the average.
Analysis of energy-based algorithms for RNA secondary structure prediction
2012-01-01
Background: RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. Results: We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms.
Second, on our large datasets, the algorithm with best overall accuracy is a pseudo MEA-based algorithm of Hamada et al. that uses a generalized centroid estimator of base pairs. However, between MFE and other MEA-based methods, there is no clear winner in the sense that the relative accuracy of the MFE versus MEA-based algorithms changes depending on the underlying energy parameters. Third, of the four parameter sets we considered, the best accuracy for the MFE-, MEA-based, and pseudo-MEA-based methods is 0.686, 0.680, and 0.711, respectively (on a scale from 0 to 1 with 1 meaning perfect structure predictions) and is obtained with a thermodynamic parameter set obtained by Andronescu et al. called BL* (named after the Boltzmann likelihood method by which the parameters were derived). Conclusions: Large datasets should be used to obtain reliable measures of the accuracy of RNA structure prediction algorithms, and average accuracies on specific classes (such as Group I introns and Transfer RNAs) should be interpreted with caution, considering the relatively small size of currently available datasets for such classes. The accuracy of the MEA-based methods is significantly higher when using the BL* parameter set of Andronescu et al. than when using the parameters of Mathews and Turner, and there is no significant difference between the accuracy of MEA-based methods and MFE when using the BL* parameters. The pseudo-MEA-based method of Hamada et al. with the BL* parameter set significantly outperforms all other MFE and MEA-based algorithms on our large data sets. PMID:22296803
Analysis of energy-based algorithms for RNA secondary structure prediction.
Hajiaghayi, Monir; Condon, Anne; Hoos, Holger H
2012-02-01
RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms. 
Second, on our large datasets, the algorithm with best overall accuracy is a pseudo MEA-based algorithm of Hamada et al. that uses a generalized centroid estimator of base pairs. However, between MFE and other MEA-based methods, there is no clear winner in the sense that the relative accuracy of the MFE versus MEA-based algorithms changes depending on the underlying energy parameters. Third, of the four parameter sets we considered, the best accuracy for the MFE-, MEA-based, and pseudo-MEA-based methods is 0.686, 0.680, and 0.711, respectively (on a scale from 0 to 1 with 1 meaning perfect structure predictions) and is obtained with a thermodynamic parameter set obtained by Andronescu et al. called BL* (named after the Boltzmann likelihood method by which the parameters were derived). Large datasets should be used to obtain reliable measures of the accuracy of RNA structure prediction algorithms, and average accuracies on specific classes (such as Group I introns and Transfer RNAs) should be interpreted with caution, considering the relatively small size of currently available datasets for such classes. The accuracy of the MEA-based methods is significantly higher when using the BL* parameter set of Andronescu et al. than when using the parameters of Mathews and Turner, and there is no significant difference between the accuracy of MEA-based methods and MFE when using the BL* parameters. The pseudo-MEA-based method of Hamada et al. with the BL* parameter set significantly outperforms all other MFE and MEA-based algorithms on our large data sets.
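The F-measure and the bootstrap percentile method mentioned in this abstract can be sketched as follows. The per-molecule scores are invented for illustration, and the resample count, confidence level, seed, and function names are assumptions; the abstract does not give the settings the authors used.

```python
import random
import statistics

def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity and positive predictive value."""
    if sensitivity + ppv == 0:
        return 0.0
    return 2.0 * sensitivity * ppv / (sensitivity + ppv)

def bootstrap_percentile_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean of per-item scores."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=len(scores)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical per-RNA F-measures, not values from the study.
scores = [0.60, 0.70, 0.65, 0.72, 0.68, 0.71, 0.66, 0.69]
print(f_measure(0.8, 0.6), bootstrap_percentile_ci(scores))
```

The width of the interval shrinks as the number of molecules grows, which is the reasoning behind the abstract's caution about averages computed on small RNA classes.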
DOT National Transportation Integrated Search
2015-07-01
Implementing the recommendations of this study is expected to significantly improve the accuracy of camber measurements and predictions and to ultimately help reduce construction delays, improve bridge serviceability, and decrease costs.
Risk terrain modeling predicts child maltreatment.
Daley, Dyann; Bachmann, Michael; Bachmann, Brittany A; Pedigo, Christian; Bui, Minh-Thuy; Coffman, Jamye
2016-12-01
As indicated by research on the long-term effects of adverse childhood experiences (ACEs), maltreatment has far-reaching consequences for affected children. Effective prevention measures have been elusive, partly due to difficulty in identifying vulnerable children before they are harmed. This study employs Risk Terrain Modeling (RTM), an analysis of the cumulative effect of environmental factors thought to be conducive to child maltreatment, to create a highly accurate prediction model for future substantiated child maltreatment cases in the City of Fort Worth, Texas. The model is superior to commonly used hotspot predictions and more beneficial in aiding prevention efforts in a number of ways: 1) it identifies the highest risk areas for future instances of child maltreatment with improved precision and accuracy; 2) it aids the prioritization of risk-mitigating efforts by informing about the relative importance of the most significant contributing risk factors; 3) since predictions are modeled as a function of easily obtainable data, practitioners do not have to undergo the difficult process of obtaining official child maltreatment data to apply it; 4) the inclusion of a multitude of environmental risk factors creates a more robust model with higher predictive validity; and, 5) the model does not rely on a retrospective examination of past instances of child maltreatment, but adapts predictions to changing environmental conditions. The present study introduces and examines the predictive power of this new tool to aid prevention efforts seeking to improve the safety, health, and wellbeing of vulnerable children. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
A new method of power load prediction in electrification railway
NASA Astrophysics Data System (ADS)
Dun, Xiaohong
2018-04-01
Addressing the characteristics of electrified railways, this paper studies the problem of load prediction for electrified railways. After the data are preprocessed, similar days are separated on the basis of their statistical characteristics, and the accuracy of different methods is analyzed. The paper provides a new approach to prediction and a new method of judging accuracy for power system load prediction.
Hengartner, M P; Heekeren, K; Dvorsky, D; Walitza, S; Rössler, W; Theodoridou, A
2017-09-01
The aim of this study was to critically examine the prognostic validity of various clinical high-risk (CHR) criteria alone and in combination with additional clinical characteristics. A total of 188 CHR positive persons from the region of Zurich, Switzerland (mean age 20.5 years; 60.2% male), meeting ultra high-risk (UHR) and/or basic symptoms (BS) criteria, were followed over three years. The test battery included the Structured Interview for Prodromal Syndromes (SIPS), verbal IQ and many other screening tools. Conversion to psychosis was defined according to ICD-10 criteria for schizophrenia (F20) or brief psychotic disorder (F23). Altogether n=24 persons developed manifest psychosis within three years and according to Kaplan-Meier survival analysis, the projected conversion rate was 17.5%. The predictive accuracy of UHR was statistically significant but poor (area under the curve [AUC]=0.65, P<.05), whereas BS did not predict psychosis beyond mere chance (AUC=0.52, P=.730). Sensitivity and specificity were 0.83 and 0.47 for UHR, and 0.96 and 0.09 for BS. UHR plus BS achieved an AUC=0.66, with sensitivity and specificity of 0.75 and 0.56. In comparison, baseline antipsychotic medication yielded a predictive accuracy of AUC=0.62 (sensitivity=0.42; specificity=0.82). A multivariable prediction model comprising continuous measures of positive symptoms and verbal IQ achieved a substantially improved prognostic accuracy (AUC=0.85; sensitivity=0.86; specificity=0.85; positive predictive value=0.54; negative predictive value=0.97). We showed that BS have no predictive accuracy beyond chance, while UHR criteria poorly predict conversion to psychosis. Combining BS with UHR criteria did not improve the predictive accuracy of UHR alone. In contrast, dimensional measures of both positive symptoms and verbal IQ showed excellent prognostic validity. A critical re-thinking of binary at-risk criteria is necessary in order to improve the prognosis of psychotic disorders. 
Copyright © 2017 Elsevier Masson SAS. All rights reserved.
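The link between sensitivity, specificity, and the predictive values reported above can be worked through with Bayes' rule. The sketch below assumes the projected conversion rate of 17.5% serves as the prevalence; the abstract does not state how the authors derived their predictive values, so this is only a plausibility check, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Multivariable model figures from the abstract; prevalence is an assumption.
ppv, npv = predictive_values(sensitivity=0.86, specificity=0.85, prevalence=0.175)
print(round(ppv, 2), round(npv, 2))
```

Under this assumption the result is about 0.55 and 0.97, close to the reported 0.54 and 0.97; the small gap in PPV may reflect rounding of the published sensitivity and specificity.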