Training in Small Business Retailing: Testing Human Capital Theory.
ERIC Educational Resources Information Center
Barcala, Marta Fernandez; Perez, Maria Jose Sanzo; Gutierrez, Juan Antonio Trespalacios
1999-01-01
Looks at four models of training demand: (1) probability of attending training in the near future; (2) probability of having attended training in the past; (3) probability of being willing to follow multimedia and correspondence courses; and (4) probability of repeating the experience of attending another training course in the near future.…
Penaloza, Andrea; Mélot, Christian; Dochy, Emmanuelle; Blocklet, Didier; Gevenois, Pierre Alain; Wautrecht, Jean-Claude; Lheureux, Philippe; Motte, Serge
2007-01-01
Assessment of pretest probability should be the initial step in the investigation of patients with suspected pulmonary embolism (PE). In teaching hospitals, physicians in training are often the first physicians to evaluate patients. The aim was to evaluate the accuracy of pretest probability assessment of PE by physicians in training using the Wells clinical model and to assess the safety of a diagnostic strategy that includes pretest probability assessment. 291 consecutive outpatients with clinical suspicion of PE were categorized as having a low, moderate or high pretest probability of PE by physicians in training, who could seek supervising physicians' advice when they deemed it necessary. Patients were then managed according to a sequential diagnostic algorithm including D-dimer testing, lung scan, leg compression ultrasonography and helical computed tomography. Patients in whom PE was deemed absent were followed up for 3 months. 34 patients (18%) had PE. The prevalence of PE in the low, moderate and high pretest probability groups categorized by physicians in training alone was 3% (95% confidence interval (CI): 1% to 9%), 31% (95% CI: 22% to 42%) and 100% (95% CI: 61% to 100%), respectively. One of the 152 untreated patients (0.7%, 95% CI: 0.1% to 3.6%) developed a thromboembolic event during the 3-month follow-up period. Physicians in training can use the Wells clinical model to determine pretest probability of PE, and a diagnostic strategy combining this model with access to supervising physicians' advice appears to be safe.
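As a concrete illustration of the three-level pretest probability assignment used in this study, the sketch below scores a patient with the commonly published Wells items and cutoffs. The item weights and thresholds are assumptions taken from the widely cited version of the rule, not from this abstract, and the study's exact implementation may differ.

```python
# Illustrative sketch of three-level pretest probability assignment with the
# Wells rule for pulmonary embolism. Item weights and cutoffs follow the
# commonly published version of the score (an assumption, not taken from the
# abstract above).

WELLS_ITEMS = {
    "clinical_signs_of_dvt": 3.0,
    "pe_most_likely_diagnosis": 3.0,
    "heart_rate_over_100": 1.5,
    "immobilization_or_recent_surgery": 1.5,
    "previous_dvt_or_pe": 1.5,
    "hemoptysis": 1.0,
    "malignancy": 1.0,
}

def wells_score(findings):
    """Sum the weights of the findings that are present (dict of booleans)."""
    return sum(weight for item, weight in WELLS_ITEMS.items() if findings.get(item))

def pretest_category(score):
    """Map a Wells score to the low / moderate / high categories."""
    if score < 2.0:
        return "low"
    if score <= 6.0:
        return "moderate"
    return "high"

patient = {"heart_rate_over_100": True, "previous_dvt_or_pe": True}
print(pretest_category(wells_score(patient)))  # -> "moderate" (score 3.0)
```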
1998-05-01
Modeling Training Site Vegetation Coverage Probability with a Random Optimizing Procedure: An Artificial Neural Network Approach, by Biing T. Guan, George Z. Gertner, and Alan B. ... coverage based on past coverage. Approach: A literature survey was conducted to identify artificial neural network analysis techniques applicable for ...
Storkel, Holly L.; Bontempo, Daniel E.; Aschenbrenner, Andrew J.; Maekawa, Junko; Lee, Su-Yeon
2013-01-01
Purpose Phonotactic probability and neighborhood density have predominantly been defined using gross distinctions (i.e., low vs. high). The current studies examined the influence of finer changes in probability (Experiment 1) and density (Experiment 2) on word learning. Method The full range of probability or density was examined by sampling five nonwords from each of four quartiles. Three- and 5-year-old children received training on nonword-nonobject pairs. Learning was measured in a picture-naming task immediately following training and 1 week after training. Results were analyzed using multi-level modeling. Results A linear spline model best captured nonlinearities in phonotactic probability. Specifically, word learning improved as probability increased in the lowest quartile, worsened as probability increased in the mid-low quartile, and then remained stable and poor in the two highest quartiles. An ordinary linear model sufficiently described neighborhood density; here, word learning improved as density increased across all quartiles. Conclusion Given these different patterns, phonotactic probability and neighborhood density appear to influence different word learning processes. Specifically, phonotactic probability may affect recognition that a sound sequence is an acceptable word in the language and is a novel word for the child, whereas neighborhood density may influence creation of a new representation in long-term memory. PMID:23882005
Repetitive pulses and laser-induced retinal injury thresholds
NASA Astrophysics Data System (ADS)
Lund, David J.
2007-02-01
Experimental studies with repetitively pulsed lasers show that the ED50, expressed as energy per pulse, varies as the inverse fourth power of the number of pulses in the exposure, relatively independently of the wavelength, pulse duration, or pulse repetition frequency of the laser. Models based on a thermal damage mechanism cannot readily explain this result. Menendez et al. proposed a probability-summation model for predicting the threshold for a train of pulses based on the probit statistics for a single pulse. The model assumed that each pulse is an independent trial, unaffected by any other pulse in the train, and that the probability of damage for a single pulse is adequately described by the logistic curve. The requirement that the effect of each pulse in the pulse train be unaffected by the effects of other pulses in the train is a showstopper when the end effect is viewed as a thermal effect with each pulse in the train contributing to the end temperature of the target tissue. There is evidence that the induction of cell death by microcavitation bubbles around melanin granules heated by incident laser irradiation can satisfy the condition of pulse independence required by the probability-summation model. This paper summarizes the experimental data and discusses the relevance of the probability-summation model given microcavitation as a damage mechanism.
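The probability-summation model described above can be written compactly: if a single pulse of energy E causes damage with logistic probability p(E), then n independent pulses cause damage with probability 1 - (1 - p(E))^n, and the per-pulse ED50 of a train is the energy at which this reaches 50%. The sketch below illustrates the mechanics; the single-pulse ED50 and slope are hypothetical placeholders, not fitted retinal data.

```python
import numpy as np
from scipy.optimize import brentq

# Minimal sketch of the probability-summation model: each pulse is an
# independent trial, and the single-pulse damage probability follows a
# logistic curve in log dose. ED50_SINGLE and SLOPE are hypothetical values.

ED50_SINGLE = 1.0   # single-pulse ED50 (arbitrary energy units)
SLOPE = 0.1         # logistic slope in log10-dose units

def p_single(energy):
    """Probability that one pulse of the given energy produces a lesion."""
    return 1.0 / (1.0 + np.exp(-(np.log10(energy) - np.log10(ED50_SINGLE)) / SLOPE))

def p_train(energy_per_pulse, n_pulses):
    """Probability of damage from n independent pulses (probability summation)."""
    return 1.0 - (1.0 - p_single(energy_per_pulse)) ** n_pulses

def ed50_per_pulse(n_pulses):
    """Per-pulse energy at which an n-pulse train reaches 50% damage probability."""
    return brentq(lambda e: p_train(e, n_pulses) - 0.5, 1e-6, 10.0)

for n in (1, 10, 100, 1000):
    print(n, round(ed50_per_pulse(n), 4))   # per-pulse ED50 decreases with n
```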
Learn-as-you-go acceleration of cosmological parameter estimates
NASA Astrophysics Data System (ADS)
Aslanyan, Grigor; Easther, Richard; Price, Layne C.
2015-09-01
Cosmological analyses can be accelerated by approximating slow calculations using a training set, which is either precomputed or generated dynamically. However, this approach is only safe if the approximations are well understood and controlled. This paper surveys issues associated with the use of machine-learning based emulation strategies for accelerating cosmological parameter estimation. We describe a learn-as-you-go algorithm that is implemented in the Cosmo++ code and (1) trains the emulator while simultaneously estimating posterior probabilities; (2) identifies unreliable estimates, computing the exact numerical likelihoods if necessary; and (3) progressively learns and updates the error model as the calculation progresses. We explicitly describe and model the emulation error and show how this can be propagated into the posterior probabilities. We apply these techniques to the Planck likelihood and the calculation of ΛCDM posterior probabilities. The computation is significantly accelerated without a pre-defined training set and uncertainties in the posterior probabilities are subdominant to statistical fluctuations. We have obtained a speedup factor of 6.5 for Metropolis-Hastings and 3.5 for nested sampling. Finally, we discuss the general requirements for a credible error model and show how to update them on-the-fly.
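A minimal sketch of the accept-or-recompute logic behind a learn-as-you-go emulator is shown below. It is not the Cosmo++ implementation: the emulator here is a crude nearest-neighbour average, the error model is simply the spread of the neighbours, and exact_loglike is a hypothetical stand-in for the slow likelihood call.

```python
import numpy as np

# Sketch of a learn-as-you-go wrapper: trust the emulator when its error
# estimate is small, otherwise call the exact likelihood and add the result
# to the training set.

class LearnAsYouGo:
    def __init__(self, exact_loglike, err_tol=0.1, k=5):
        self.exact = exact_loglike
        self.err_tol = err_tol
        self.k = k
        self.thetas, self.logls = [], []

    def _emulate(self, theta):
        """Average of the k nearest stored evaluations, plus a spread-based error estimate."""
        d = np.linalg.norm(np.array(self.thetas) - theta, axis=1)
        idx = np.argsort(d)[: self.k]
        vals = np.array(self.logls)[idx]
        return vals.mean(), vals.std()

    def loglike(self, theta):
        theta = np.asarray(theta, float)
        if len(self.thetas) >= self.k:
            est, err = self._emulate(theta)
            if err < self.err_tol:        # trust the emulator
                return est
        val = self.exact(theta)           # fall back to the exact calculation
        self.thetas.append(theta)         # ...and grow the training set
        self.logls.append(val)
        return val

# Example with a cheap stand-in likelihood:
lag = LearnAsYouGo(lambda t: -0.5 * np.sum(t ** 2))
print(lag.loglike(np.array([0.1, 0.2])))
```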
Guan, Li; Hao, Bibo; Cheng, Qijin; Yip, Paul SF
2015-01-01
Background Traditional offline assessment of suicide probability is time-consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has an advantage in its efficiency and potential to reach out to hidden individuals, yet little research has focused on this specific field. Objective The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Methods Nine hundred and nine Chinese microblog users completed an Internet survey, and those scoring one SD above the mean of the total Suicide Probability Scale (SPS) score, as well as one SD above the mean in each of the four subscale scores, were labeled as high-risk individuals, respectively. Profile and linguistic features were fed into two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross-validation, in which both the training set and the test set were generated under a stratified random sampling rule from the whole sample. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, "Screening Efficiency", were adopted to evaluate model effectiveness. Results Classification performance was generally matched between SLR and RF. Given the best performance of the classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%. Conclusions Individuals in China with high suicide probability are recognizable by profile and text-based information from microblogs. Although there is still much room to improve the performance of the classification models, this study may shed light on preliminary screening of at-risk individuals via machine learning algorithms, which can work side by side with expert scrutiny to increase efficiency in large-scale surveillance of suicide probability from online social media. PMID:26543921
Guan, Li; Hao, Bibo; Cheng, Qijin; Yip, Paul Sf; Zhu, Tingshao
2015-01-01
Traditional offline assessment of suicide probability is time-consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has an advantage in its efficiency and potential to reach out to hidden individuals, yet little research has focused on this specific field. The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Nine hundred and nine Chinese microblog users completed an Internet survey, and those scoring one SD above the mean of the total Suicide Probability Scale (SPS) score, as well as one SD above the mean in each of the four subscale scores, were labeled as high-risk individuals, respectively. Profile and linguistic features were fed into two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross-validation, in which both the training set and the test set were generated under a stratified random sampling rule from the whole sample. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, "Screening Efficiency", were adopted to evaluate model effectiveness. Classification performance was generally matched between SLR and RF. Given the best performance of the classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%. Individuals in China with high suicide probability are recognizable by profile and text-based information from microblogs. Although there is still much room to improve the performance of the classification models, this study may shed light on preliminary screening of at-risk individuals via machine learning algorithms, which can work side by side with expert scrutiny to increase efficiency in large-scale surveillance of suicide probability from online social media.
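The labelling and evaluation scheme described above can be sketched as follows: users scoring more than one SD above the mean SPS score are labelled high-risk, and SLR and RF classifiers are scored with stratified 5-fold cross-validation. The feature matrix and SPS scores below are random placeholders, not the study data, and the feature set is only illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import precision_score, recall_score, f1_score

# Sketch of the high-risk labelling rule and the stratified 5-fold evaluation
# of the two classifiers. All data below are synthetic placeholders.

rng = np.random.default_rng(0)
X = rng.normal(size=(909, 20))            # profile + linguistic features
sps = rng.normal(60, 15, size=909)        # total Suicide Probability Scale scores
y = (sps > sps.mean() + sps.std()).astype(int)   # 1 = high-risk label

for name, clf in [("SLR", LogisticRegression(max_iter=1000)),
                  ("RF", RandomForestClassifier(n_estimators=200, random_state=0))]:
    p, r, f = [], [], []
    for tr, te in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
        clf.fit(X[tr], y[tr])
        pred = clf.predict(X[te])
        p.append(precision_score(y[te], pred, zero_division=0))
        r.append(recall_score(y[te], pred, zero_division=0))
        f.append(f1_score(y[te], pred, zero_division=0))
    print(name, round(np.mean(p), 2), round(np.mean(r), 2), round(np.mean(f), 2))
```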
Training models of anatomic shape variability
Merck, Derek; Tracton, Gregg; Saboo, Rohit; Levy, Joshua; Chaney, Edward; Pizer, Stephen; Joshi, Sarang
2008-01-01
Learning probability distributions of the shape of anatomic structures requires fitting shape representations to human expert segmentations from training sets of medical images. The quality of statistical segmentation and registration methods is directly related to the quality of this initial shape fitting, yet the subject is largely overlooked or described in an ad hoc way. This article presents a set of general principles to guide such training. Our novel method is to jointly estimate both the best geometric model for any given image and the shape distribution for the entire population of training images by iteratively relaxing purely geometric constraints in favor of the converging shape probabilities as the fitted objects converge to their target segmentations. The geometric constraints are carefully crafted both to obtain legal, non-self-interpenetrating shapes and to impose the model-to-model correspondences required for useful statistical analysis. The paper closes with example applications of the method to synthetic and real patient CT image sets, including same-patient male pelvis and head-and-neck images, and cross-patient kidney and brain images. Finally, we outline how this shape training serves as the basis for our approach to IGRT/ART. PMID:18777919
DOT National Transportation Integrated Search
2009-10-13
This paper describes a probabilistic approach to estimate the conditional probability of release of hazardous materials from railroad tank cars during train accidents. Monte Carlo methods are used in developing a probabilistic model to simulate head ...
Average BER and outage probability of the ground-to-train OWC link in turbulence with rain
NASA Astrophysics Data System (ADS)
Zhang, Yixin; Yang, Yanqiu; Hu, Beibei; Yu, Lin; Hu, Zheng-Da
2017-09-01
The bit-error rate (BER) and outage probability of an optical wireless communication (OWC) link for ground-to-train transmission along a curved track in turbulence with rain are evaluated. Considering the re-modulation effect of rain-induced fluctuation on an optical signal already modulated by turbulence, we set up models of average BER and outage probability in the presence of pointing errors, based on the double inverse Gaussian (IG) statistical distribution model. The numerical results indicate that, for the same covered track length, a larger curvature radius increases the outage probability and average BER. The performance of the OWC link in turbulence with rain is limited mainly by the rain rate and by pointing errors, which are induced by beam wander and train vibration. The effect of the rain rate on the performance of the link is more severe than that of the atmospheric turbulence, but the fluctuation owing to the atmospheric turbulence affects laser beam propagation more strongly than the skewness of the rain distribution. In addition, turbulence-induced beam wander has a more significant impact on the system in heavier rain. The size of the transmitting and receiving apertures can be chosen, and the shockproof performance of the tracks improved, to optimize the communication performance of the system.
Statistically Qualified Neuro-Analytic system and Method for Process Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vilim, Richard B.; Garcia, Humberto E.; Chen, Frederick W.
1998-11-04
An apparatus and method for monitoring a process involves development and application of a statistically qualified neuro-analytic (SQNA) model to accurately and reliably identify process change. The development of the SQNA model is accomplished in two steps: deterministic model adaptation and stochastic model adaptation. Deterministic model adaptation involves formulating an analytic model of the process representing known process characteristics, augmenting the analytic model with a neural network that captures unknown process characteristics, and training the resulting neuro-analytic model by adjusting the neural network weights according to a unique scaled equation error minimization technique. Stochastic model adaptation involves qualifying any remaining uncertainty in the trained neuro-analytic model by formulating a likelihood function, given an error propagation equation, for computing the probability that the neuro-analytic model generates measured process output. Preferably, the developed SQNA model is validated using known sequential probability ratio tests and applied to the process as an on-line monitoring system.
Liu, Xiang; Saat, Mohd Rapik; Barkan, Christopher P L
2014-07-15
Railroads play a key role in the transportation of hazardous materials in North America. Rail transport differs from highway transport in several aspects, an important one being that rail transport involves trains in which many railcars carrying hazardous materials travel together. By contrast to truck accidents, it is possible that a train accident may involve multiple hazardous materials cars derailing and releasing contents with consequently greater potential impact on human health, property and the environment. In this paper, a probabilistic model is developed to estimate the probability distribution of the number of tank cars releasing contents in a train derailment. Principal operational characteristics considered include train length, derailment speed, accident cause, position of the first car derailed, number and placement of tank cars in a train and tank car safety design. The effect of train speed, tank car safety design and tank car positions in a train were evaluated regarding the number of cars that release their contents in a derailment. This research provides insights regarding the circumstances affecting multiple-tank-car release incidents and potential strategies to reduce their occurrences. The model can be incorporated into a larger risk management framework to enable better local, regional and national safety management of hazardous materials transportation by rail. Copyright © 2014 Elsevier B.V. All rights reserved.
Model-based segmentation of abdominal aortic aneurysms in CTA images
NASA Astrophysics Data System (ADS)
de Bruijne, Marleen; van Ginneken, Bram; Niessen, Wiro J.; Loog, Marco; Viergever, Max A.
2003-05-01
Segmentation of thrombus in abdominal aortic aneurysms is complicated by regions of low boundary contrast and by the presence of many neighboring structures in close proximity to the aneurysm wall. We present an automated method that is similar to the well-known Active Shape Models (ASM), combining a three-dimensional shape model with a one-dimensional boundary appearance model. Our contribution is twofold: we developed a non-parametric appearance modeling scheme that effectively deals with a highly varying background, and we propose a way of generalizing models of curvilinear structures from small training sets. In contrast with the conventional ASM approach, the new appearance model trains on both true and false examples of boundary profiles. The probability that a given image profile belongs to the boundary is obtained using k nearest neighbor (kNN) probability density estimation. The performance of this scheme is compared to that of original ASMs, which minimize the Mahalanobis distance to the average true profile in the training set. The generalizability of the shape model is improved by modeling the object's axis deformation independently of its cross-sectional deformation. A leave-one-out experiment was performed on 23 datasets. Segmentation using the kNN appearance model significantly outperformed the original ASM scheme; average volume errors were 5.9% and 46%, respectively.
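The kNN appearance model described above can be sketched compactly: intensity profiles sampled at true boundary positions and at false (off-boundary) positions are pooled, and the boundary probability of a new profile is the fraction of its k nearest training profiles that are true boundary examples. The training profiles below are synthetic placeholders, not CTA data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Sketch of kNN probability density estimation for boundary appearance:
# P(boundary | profile) is estimated as k_true / k among the nearest neighbours.

rng = np.random.default_rng(1)
true_profiles = rng.normal(1.0, 0.3, size=(500, 9))    # profiles across a real edge
false_profiles = rng.normal(0.0, 0.3, size=(500, 9))   # profiles from background

X = np.vstack([true_profiles, false_profiles])
y = np.hstack([np.ones(500), np.zeros(500)])

knn = KNeighborsClassifier(n_neighbors=15).fit(X, y)

def boundary_probability(profile):
    """Fraction of the k nearest training profiles that are true boundary examples."""
    return knn.predict_proba(profile.reshape(1, -1))[0, 1]

print(round(boundary_probability(rng.normal(0.9, 0.3, size=9)), 2))
```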
Generative adversarial networks for brain lesion detection
NASA Astrophysics Data System (ADS)
Alex, Varghese; Safwan, K. P. Mohammed; Chennamsetty, Sai Saketh; Krishnamurthi, Ganapathy
2017-02-01
Manual segmentation of brain lesions from Magnetic Resonance Images (MRI) is cumbersome and introduces errors due to inter-rater variability. This paper introduces a semi-supervised technique for detection of brain lesions from MRI using Generative Adversarial Networks (GANs). A GAN comprises a Generator network and a Discriminator network which are trained simultaneously, with the objective of one bettering the other. The networks were trained using non-lesion patches (n = 13,000) from 4 different MR sequences. The network was trained on the BraTS dataset, and patches were extracted from regions excluding the tumor region. The Generator network generates data by modeling the underlying probability distribution of the training data, P(Data). The Discriminator learns the posterior probability P(Label | Data) by classifying training data and generated data as "Real" or "Fake", respectively. The Generator, upon learning the joint distribution, produces images/patches such that the performance of the Discriminator on them is random, i.e. P(Label | Data = Generated Data) = 0.5. During testing, the Discriminator assigns posterior probability values close to 0.5 to patches from non-lesion regions, while patches centered on a lesion arise from a different distribution, P(Lesion), and hence are assigned lower posterior probability values by the Discriminator. On the test set (n = 14), the proposed technique achieves a whole tumor Dice score of 0.69, sensitivity of 91% and specificity of 59%. Additionally, the Generator network was capable of generating non-lesion patches from various MR sequences.
Shaikh, Nader; Hoberman, Alejandro; Hum, Stephanie W; Alberty, Anastasia; Muniz, Gysella; Kurs-Lasky, Marcia; Landsittel, Douglas; Shope, Timothy
2018-06-01
Accurately estimating the probability of urinary tract infection (UTI) in febrile preverbal children is necessary to appropriately target testing and treatment. To develop and test a calculator (UTICalc) that can first estimate the probability of UTI based on clinical variables and then update that probability based on laboratory results. Review of electronic medical records of febrile children aged 2 to 23 months who were brought to the emergency department of Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania. An independent training database comprising 1686 patients brought to the emergency department between January 1, 2007, and April 30, 2013, and a validation database of 384 patients were created. Five multivariable logistic regression models for predicting risk of UTI were trained and tested. The clinical model included only clinical variables; the remaining models incorporated laboratory results. Data analysis was performed between June 18, 2013, and January 12, 2018. Documented temperature of 38°C or higher in children aged 2 months to less than 2 years. With the use of culture-confirmed UTI as the main outcome, cutoffs for high and low UTI risk were identified for each model. The resultant models were incorporated into a calculation tool, UTICalc, which was used to evaluate medical records. A total of 2070 children were included in the study. The training database comprised 1686 children, of whom 1216 (72.1%) were female and 1167 (69.2%) white. The validation database comprised 384 children, of whom 291 (75.8%) were female and 200 (52.1%) white. Compared with the American Academy of Pediatrics algorithm, the clinical model in UTICalc reduced testing by 8.1% (95% CI, 4.2%-12.0%) and decreased the number of UTIs that were missed from 3 cases to none. Compared with empirically treating all children with a leukocyte esterase test result of 1+ or higher, the dipstick model in UTICalc would have reduced the number of treatment delays by 10.6% (95% CI, 0.9%-20.4%). UTICalc estimates the probability of UTI by evaluating the risk factors present in the individual child. As a result, testing and treatment can be tailored, thereby improving outcomes for children with UTI.
A method of real-time fault diagnosis for power transformers based on vibration analysis
NASA Astrophysics Data System (ADS)
Hong, Kaixing; Huang, Hai; Zhou, Jianping; Shen, Yimin; Li, Yujie
2015-11-01
In this paper, a novel probability-based classification model is proposed for real-time fault detection of power transformers. First, the transformer vibration principle is introduced, and two effective feature extraction techniques are presented. Next, the details of the classification model based on support vector machine (SVM) are shown. The model also includes a binary decision tree (BDT) which divides transformers into different classes according to health state. The trained model produces posterior probabilities of membership to each predefined class for a tested vibration sample. During the experiments, the vibrations of transformers under different conditions are acquired, and the corresponding feature vectors are used to train the SVM classifiers. The effectiveness of this model is illustrated experimentally on typical in-service transformers. The consistency between the results of the proposed model and the actual condition of the test transformers indicates that the model can be used as a reliable method for transformer fault detection.
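The posterior-probability output described above can be illustrated with a small sketch: an SVM trained on vibration feature vectors returns class membership probabilities for a new sample (scikit-learn calibrates these with Platt scaling when probability=True). The feature vectors and the three health-state classes below are synthetic placeholders, not transformer data, and the binary decision tree stage of the paper is not reproduced.

```python
import numpy as np
from sklearn.svm import SVC

# Sketch of an SVM producing posterior class membership probabilities for
# vibration feature vectors. All data and class names are placeholders.

rng = np.random.default_rng(2)
classes = ["normal", "winding_fault", "core_fault"]
X = np.vstack([rng.normal(loc=i, scale=0.5, size=(100, 8)) for i in range(3)])
y = np.repeat(classes, 100)

clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)

sample = rng.normal(loc=1.0, scale=0.5, size=(1, 8))   # a new vibration feature vector
for label, prob in zip(clf.classes_, clf.predict_proba(sample)[0]):
    print(label, round(prob, 3))
```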
Statistically qualified neuro-analytic failure detection method and system
Vilim, Richard B.; Garcia, Humberto E.; Chen, Frederick W.
2002-03-02
An apparatus and method for monitoring a process involve development and application of a statistically qualified neuro-analytic (SQNA) model to accurately and reliably identify process change. The development of the SQNA model is accomplished in two stages: deterministic model adaptation and stochastic modification of the adapted deterministic model. Deterministic model adaptation involves formulating an analytic model of the process representing known process characteristics, augmenting the analytic model with a neural network that captures unknown process characteristics, and training the resulting neuro-analytic model by adjusting the neural network weights according to a unique scaled equation error minimization technique. Stochastic model modification involves qualifying any remaining uncertainty in the trained neuro-analytic model by formulating a likelihood function, given an error propagation equation, for computing the probability that the neuro-analytic model generates measured process output. Preferably, the developed SQNA model is validated using known sequential probability ratio tests and applied to the process as an on-line monitoring system. As an illustration of the method and apparatus, the method is applied to a peristaltic pump system.
Method for Automatic Selection of Parameters in Normal Tissue Complication Probability Modeling.
Christophides, Damianos; Appelt, Ane L; Gusnanto, Arief; Lilley, John; Sebag-Montefiore, David
2018-07-01
To present a fully automatic method to generate multiparameter normal tissue complication probability (NTCP) models and compare its results with those of a published model, using the same patient cohort. Data were analyzed from 345 rectal cancer patients treated with external radiation therapy to predict the risk of patients developing grade 1 or ≥2 cystitis. In total, 23 clinical factors were included in the analysis as candidate predictors of cystitis. Principal component analysis was used to decompose the bladder dose-volume histogram into 8 principal components, explaining more than 95% of the variance. The data set of clinical factors and principal components was divided into training (70%) and test (30%) data sets, with the training data set used by the algorithm to compute an NTCP model. The first step of the algorithm was to obtain a bootstrap sample, followed by multicollinearity reduction using the variance inflation factor and genetic algorithm optimization to determine an ordinal logistic regression model that minimizes the Bayesian information criterion. The process was repeated 100 times, and the model with the minimum Bayesian information criterion was recorded on each iteration. The most frequent model was selected as the final "automatically generated model" (AGM). The published model and AGM were fitted on the training data sets, and the risk of cystitis was calculated. The 2 models had no significant differences in predictive performance, both for the training and test data sets (P value > .05) and found similar clinical and dosimetric factors as predictors. Both models exhibited good explanatory performance on the training data set (P values > .44), which was reduced on the test data sets (P values < .05). The predictive value of the AGM is equivalent to that of the expert-derived published model. It demonstrates potential in saving time, tackling problems with a large number of parameters, and standardizing variable selection in NTCP modeling. Crown Copyright © 2018. Published by Elsevier Inc. All rights reserved.
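A heavily simplified sketch of the model-building loop described above is shown below: the dose-volume histograms are compressed with PCA, candidate predictor subsets are scored by the Bayesian information criterion (BIC), and the lowest-BIC model is kept. The paper uses ordinal logistic regression, variance-inflation-factor filtering, bootstrapping and a genetic algorithm; here an exhaustive search over small subsets and a binary logistic model stand in for those steps, and all data are random placeholders.

```python
import numpy as np
from itertools import combinations
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Simplified sketch: PCA on DVHs, then pick the predictor subset minimizing BIC.
rng = np.random.default_rng(3)
n = 345
dvh = np.cumsum(rng.random((n, 50)), axis=1)          # fake dose-volume histograms
clinical = rng.random((n, 4))                         # fake clinical factors
y = rng.integers(0, 2, size=n)                        # grade >= 2 cystitis yes/no

pcs = PCA(n_components=8).fit_transform(dvh)          # 8 components, as in the paper
X = np.hstack([clinical, pcs])

def bic(model, X, y):
    """Bayesian information criterion of a fitted binary logistic model."""
    p = np.clip(model.predict_proba(X)[:, 1], 1e-9, 1 - 1e-9)
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    k = X.shape[1] + 1
    return k * np.log(len(y)) - 2 * loglik

best = None
for size in (1, 2, 3):
    for cols in combinations(range(X.shape[1]), size):
        m = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
        score = bic(m, X[:, cols], y)
        if best is None or score < best[0]:
            best = (score, cols)

print("selected predictors:", best[1], "BIC:", round(best[0], 1))
```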
Analysis of multiple tank car releases in train accidents.
Liu, Xiang; Liu, Chang; Hong, Yili
2017-10-01
There are annually over two million carloads of hazardous materials transported by rail in the United States. The American railroads use large blocks of tank cars to transport petroleum crude oil and other flammable liquids from production to consumption sites. Being different from roadway transport of hazardous materials, a train accident can potentially result in the derailment and release of multiple tank cars, which may result in significant consequences. The prior literature predominantly assumes that the occurrence of multiple tank car releases in a train accident is a series of independent Bernoulli processes, and thus uses the binomial distribution to estimate the total number of tank car releases given the number of tank cars derailing or damaged. This paper shows that the traditional binomial model can incorrectly estimate multiple tank car release probability by magnitudes in certain circumstances, thereby significantly affecting railroad safety and risk analysis. To bridge this knowledge gap, this paper proposes a novel, alternative Correlated Binomial (CB) model that accounts for the possible correlations of multiple tank car releases in the same train. We test three distinct correlation structures in the CB model, and find that they all outperform the conventional binomial model based on empirical tank car accident data. The analysis shows that considering tank car release correlations would result in a significantly improved fit of the empirical data than otherwise. Consequently, it is prudent to consider alternative modeling techniques when analyzing the probability of multiple tank car releases in railroad accidents. Copyright © 2017 Elsevier Ltd. All rights reserved.
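To make the contrast concrete, the sketch below compares the independent-binomial assumption with one simple way of introducing positive correlation between tank cars in the same train, a beta-binomial mixture with matched mean and intra-class correlation. The paper tests its own Correlated Binomial structures, so the beta-binomial here is only a stand-in, and the numbers are hypothetical.

```python
import numpy as np
from scipy.stats import binom, betabinom

# Independent binomial vs. a correlated (beta-binomial) model for the number
# of tank cars releasing contents among those derailed. Parameters are
# hypothetical illustrations, not values from the paper.

n_derailed = 10     # tank cars derailed in the accident
p_release = 0.2     # marginal per-car release probability
rho = 0.3           # within-train correlation between cars

# Beta-binomial with the same mean and intra-class correlation rho:
a = p_release * (1 - rho) / rho
b = (1 - p_release) * (1 - rho) / rho

k = np.arange(n_derailed + 1)
print("k  independent  correlated")
for ki, p_ind, p_cor in zip(k, binom.pmf(k, n_derailed, p_release),
                            betabinom.pmf(k, n_derailed, a, b)):
    print(f"{ki:2d}  {p_ind:.4f}      {p_cor:.4f}")
```

The correlated model puts noticeably more mass on zero releases and on many simultaneous releases than the independent model, which is the qualitative point the abstract makes about multiple-tank-car incidents.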
Thoreson, Wallace B.; Van Hook, Matthew J.; Parmelee, Caitlyn; Curto, Carina
2015-01-01
Post-synaptic responses are a product of quantal amplitude (Q), size of the releasable vesicle pool (N), and release probability (P). Voltage-dependent changes in presynaptic Ca2+ entry alter post-synaptic responses primarily by changing P but have also been shown to influence N. With simultaneous whole cell recordings from cone photoreceptors and horizontal cells in tiger salamander retinal slices, we measured N and P at cone ribbon synapses by using a train of depolarizing pulses to stimulate release and deplete the pool. We developed an analytical model that calculates the total pool size contributing to release under different stimulus conditions by taking into account the prior history of release and empirically-determined properties of replenishment. The model provided a formula that calculates vesicle pool size from measurements of the initial post-synaptic response and limiting rate of release evoked by a train of pulses, the fraction of release sites available for replenishment, and the time constant for replenishment. Results of the model showed that weak and strong depolarizing stimuli evoked release with differing probabilities but the same size vesicle pool. Enhancing intraterminal Ca2+ spread by lowering Ca2+ buffering or applying BayK8644 did not increase PSCs evoked with strong test steps showing there is a fixed upper limit to pool size. Together, these results suggest that light-evoked changes in cone membrane potential alter synaptic release solely by changing release probability. PMID:26541100
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, D; Aryal, M; Samuels, S
Purpose: A previous study showed that large sub-volumes of tumor with low blood volume (BV) (poorly perfused) in head-and-neck (HN) cancers are significantly associated with local-regional failure (LRF) after chemoradiation therapy, and could be targeted with intensified radiation doses. This study aimed to develop an automated and scalable model to extract voxel-wise contrast-enhanced temporal features of dynamic contrast-enhanced (DCE) MRI in HN cancers for predicting LRF. Methods: Our model development consists of training and testing stages. The training stage includes preprocessing of individual-voxel DCE curves from tumors for intensity normalization and temporal alignment, temporal feature extraction from the curves, feature selection, and training classifiers. For feature extraction, multiresolution Haar discrete wavelet transformation is applied to each DCE curve to capture temporal contrast-enhanced features. The wavelet coefficients are selected as feature vectors. Support vector machine classifiers are trained to classify tumor voxels as having either low or high BV, for which a BV threshold of 7.6% was previously established and used as ground truth. The model is tested on a new dataset. The voxel-wise DCE curves for training and testing were from 14 and 8 patients, respectively. A posterior probability map of the low BV class was created to examine the tumor sub-volume classification. Voxel-wise classification accuracy was computed to evaluate performance of the model. Results: Average classification accuracies were 87.2% for training (10-fold cross-validation) and 82.5% for testing. The lowest and highest accuracies (patient-wise) were 68.7% and 96.4%, respectively. Posterior probability maps of the low BV class showed that the sub-volumes extracted by our model were similar to those defined by the BV maps, with most misclassifications occurring near the sub-volume boundaries. Conclusion: This model could be valuable to support adaptive clinical trials with further validation. The framework could be extendable and scalable to extract temporal contrast-enhanced features of DCE-MRI in other tumors. We would like to acknowledge NIH for funding support: UO1 CA183848.
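The feature-extraction and classification step described above can be sketched as follows: each voxel's DCE curve is decomposed with a multi-resolution Haar wavelet transform and the concatenated coefficients are fed to an SVM that separates low-BV from high-BV voxels. The curves and labels below are synthetic placeholders, and preprocessing (intensity normalization, temporal alignment) is omitted.

```python
import numpy as np
import pywt
from sklearn.svm import SVC

# Sketch of Haar wavelet feature extraction from DCE curves followed by SVM
# classification with posterior probability output. All data are synthetic.

rng = np.random.default_rng(4)

def haar_features(curve, level=3):
    """Concatenate approximation and detail coefficients of a Haar DWT."""
    coeffs = pywt.wavedec(curve, "haar", level=level)
    return np.concatenate(coeffs)

n_voxels, n_timepoints = 400, 64
t = np.linspace(0, 1, n_timepoints)
low_bv = np.array([0.3 * (1 - np.exp(-3 * t)) + rng.normal(0, 0.02, n_timepoints)
                   for _ in range(n_voxels // 2)])
high_bv = np.array([1.0 * (1 - np.exp(-6 * t)) + rng.normal(0, 0.02, n_timepoints)
                    for _ in range(n_voxels // 2)])

X = np.array([haar_features(c) for c in np.vstack([low_bv, high_bv])])
y = np.array([1] * (n_voxels // 2) + [0] * (n_voxels // 2))   # 1 = low BV

clf = SVC(probability=True).fit(X, y)
posterior_low_bv = clf.predict_proba(X)[:, list(clf.classes_).index(1)]
print("mean posterior P(low BV) for low-BV voxels:",
      round(posterior_low_bv[:n_voxels // 2].mean(), 2))
```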
van Stiphout, Ruud G P M; Valentini, Vincenzo; Buijsen, Jeroen; Lammering, Guido; Meldolesi, Elisa; van Soest, Johan; Leccisotti, Lucia; Giordano, Alessandro; Gambacorta, Maria A; Dekker, Andre; Lambin, Philippe
2014-11-01
The aim was to develop and externally validate a predictive model for pathologic complete response (pCR) in locally advanced rectal cancer (LARC) based on clinical features and early sequential (18)F-FDG PET-CT imaging. Prospective data (i.a. THUNDER trial) were used to train (N=112, MAASTRO Clinic) and validate (N=78, Università Cattolica del S. Cuore) the model for pCR (ypT0N0). All patients received long-course chemoradiotherapy (CRT) and surgery. Clinical parameters were age, gender, clinical tumour (cT) stage and clinical nodal (cN) stage. PET parameters were SUVmax, SUVmean, metabolic tumour volume (MTV) and maximal tumour diameter, for which response indices between the pre-treatment and intermediate scans were calculated. Using multivariate logistic regression, three probability groups for pCR were defined. The pCR rates were 21.4% (training) and 23.1% (validation). The selected predictive features for pCR were cT-stage, cN-stage, response index of SUVmean and maximal tumour diameter during treatment. The models' performances (AUC) were 0.78 (training) and 0.70 (validation). The high probability group for pCR resulted in 100% correct predictions for training and 67% for validation. The model is available on the website www.predictcancer.org. The developed predictive model for pCR is accurate and externally validated. This model may assist in treatment decisions during CRT to select complete responders for a wait-and-see policy, good responders for an extra RT boost and poor responders for additional chemotherapy. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
Prediction and visualization of redox conditions in the groundwater of Central Valley, California
NASA Astrophysics Data System (ADS)
Rosecrans, Celia Z.; Nolan, Bernard T.; Gronberg, JoAnn M.
2017-03-01
Regional-scale, three-dimensional continuous probability models were constructed for aspects of redox conditions in the groundwater system of the Central Valley, California. These models yield grids depicting the probability that groundwater in a particular location will have dissolved oxygen (DO) concentrations less than selected threshold values representing anoxic groundwater conditions, or will have dissolved manganese (Mn) concentrations greater than selected threshold values representing secondary drinking water-quality contaminant levels (SMCL) and health-based screening levels (HBSL). The probability models were constrained by the alluvial boundary of the Central Valley to a depth of approximately 300 m. Probability distribution grids can be extracted from the 3-D models at any desired depth, and are of interest to water-resource managers, water-quality researchers, and groundwater modelers concerned with the occurrence of natural and anthropogenic contaminants related to anoxic conditions. Models were constructed using a Boosted Regression Trees (BRT) machine learning technique that produces many trees as part of an additive model and has the ability to handle many variables, automatically incorporate interactions, and is resistant to collinearity. Machine learning methods for statistical prediction are becoming increasingly popular in that they do not require assumptions associated with traditional hypothesis testing. Models were constructed using measured dissolved oxygen and manganese concentrations sampled from 2767 wells within the alluvial boundary of the Central Valley, and over 60 explanatory variables representing regional-scale soil properties, soil chemistry, land use, aquifer textures, and aquifer hydrologic properties. Models were trained on a USGS dataset of 932 wells, and evaluated on an independent hold-out dataset of 1835 wells from the California Division of Drinking Water. We used cross-validation to assess the predictive performance of models of varying complexity, as a basis for selecting final models. Trained models were applied to cross-validation testing data and a separate hold-out dataset to evaluate model predictive performance, emphasizing three fit metrics: Kappa, accuracy, and the area under the receiver operating characteristic curve (ROC). The final trained models were used for mapping predictions at discrete depths to a depth of 304.8 m. Trained DO and Mn models had accuracies of 86-100%, Kappa values of 0.69-0.99, and ROC values of 0.92-1.0. Model accuracies for cross-validation testing datasets were 82-95% and ROC values were 0.87-0.91, indicating good predictive performance. Kappas for the cross-validation testing dataset were 0.30-0.69, indicating fair to substantial agreement between testing observations and model predictions. Hold-out data were available for the manganese model only and indicated accuracies of 89-97%, ROC values of 0.73-0.75, and Kappa values of 0.06-0.30. The predictive performance of both the DO and Mn models was reasonable, considering all three of these fit metrics and the low percentages of low-DO and high-Mn events in the data.
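The modelling and evaluation scheme described above can be sketched with a boosted-tree classifier predicting whether dissolved oxygen falls below a threshold, scored with the same fit metrics the paper reports (accuracy, Cohen's kappa, ROC area) under cross-validation. The wells, predictors and threshold below are random placeholders, not the Central Valley data, and scikit-learn's gradient boosting stands in for the BRT implementation used in the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score

# Sketch of a BRT-style probability model for anoxic conditions, evaluated
# with accuracy, kappa and ROC AUC. All data are synthetic placeholders.

rng = np.random.default_rng(5)
n_wells, n_predictors = 932, 60
X = rng.normal(size=(n_wells, n_predictors))        # soil, land-use, texture variables
do_mg_l = np.exp(rng.normal(0.5, 0.8, n_wells))     # fake dissolved-oxygen concentrations
y = (do_mg_l < 0.5).astype(int)                     # 1 = anoxic (DO below threshold)

brt = GradientBoostingClassifier(n_estimators=300, learning_rate=0.05, max_depth=3)
prob = cross_val_predict(brt, X, y, cv=5, method="predict_proba")[:, 1]
pred = (prob >= 0.5).astype(int)

print("accuracy:", round(accuracy_score(y, pred), 2))
print("kappa:   ", round(cohen_kappa_score(y, pred), 2))
print("ROC AUC: ", round(roc_auc_score(y, prob), 2))
```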
Calibration power of the Braden scale in predicting pressure ulcer development.
Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha
2016-11-02
Calibration is the degree of correspondence between the estimated probability produced by a model and the actual observed probability. The aim of this study was to investigate the calibration power of the Braden scale in predicting pressure ulcer (PU) development. A retrospective analysis was performed among consecutive patients in 2013. The patients were separated into a training group and a validation group. The predicted incidence was calculated using a logistic regression model in the training group, and the Hosmer-Lemeshow test was used to assess goodness of fit. In the validation cohort, the observed and predicted incidences were compared by the chi-square (χ2) goodness-of-fit test to assess calibration power. We included 2585 patients in the study, of whom 78 (3.0%) developed a PU. Differences in patient characteristics between the training and validation groups were non-significant (p > 0.05). In the training group, the logistic regression model for predicting pressure ulcer was Logit(P) = -0.433 * Braden score + 2.616. The Hosmer-Lemeshow test indicated a lack of fit (χ2 = 13.472; p = 0.019). In the validation group, the predicted pressure ulcer incidence also did not fit well with the observed incidence (χ2 = 42.154, p < 0.001 by Braden scores; and χ2 = 17.223, p = 0.001 by Braden scale risk classification). The Braden scale has low calibration power in predicting PU formation.
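The calibration check described above can be sketched directly: predicted PU probabilities are obtained from the fitted model Logit(P) = -0.433 * Braden score + 2.616 (the formula reported in the abstract), and a Hosmer-Lemeshow-style statistic compares observed and expected events across risk deciles. The Braden scores and outcomes below are simulated placeholders, not the study data.

```python
import numpy as np
from scipy.stats import chi2

# Sketch of the Hosmer-Lemeshow calibration test applied to Braden-scale
# predictions. Data are simulated from the published model for illustration.

rng = np.random.default_rng(6)
braden = rng.integers(6, 24, size=2585)                      # Braden scale scores
p_pred = 1.0 / (1.0 + np.exp(-(-0.433 * braden + 2.616)))    # model from the training group
ulcer = rng.binomial(1, p_pred)                              # simulated outcomes

def hosmer_lemeshow(p, y, groups=10):
    """Chi-square statistic and p-value over deciles of predicted risk."""
    order = np.argsort(p)
    stat = 0.0
    for grp in np.array_split(order, groups):
        obs, exp = y[grp].sum(), p[grp].sum()
        n = len(grp)
        stat += (obs - exp) ** 2 / (exp * (1 - exp / n) + 1e-12)
    return stat, chi2.sf(stat, groups - 2)

stat, p_value = hosmer_lemeshow(p_pred, ulcer)
print(f"HL chi-square = {stat:.2f}, p = {p_value:.3f}")
```

Because the simulated outcomes are drawn from the same model that generates the predictions, this sketch should typically show adequate fit; the study's point is that the real observed incidence did not.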
Interactive Model Visualization for NET-VISA
NASA Astrophysics Data System (ADS)
Kuzma, H. A.; Arora, N. S.
2013-12-01
NET-VISA is a probabilistic system developed for seismic network processing of data measured on the International Monitoring System (IMS) of the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO). NET-VISA is composed of a Generative Model (GM) and an Inference Algorithm (IA). The GM is an explicit mathematical description of the relationships between various factors in seismic network analysis. Some of the relationships inside the GM are deterministic and some are statistical. Statistical relationships are described by probability distributions, the exact parameters of which (such as mean and standard deviation) are found by training NET-VISA using recent data. The IA uses the GM to evaluate the probability of various events and associations, searching for the seismic bulletin which has the highest overall probability and is consistent with a given set of measured arrivals. An Interactive Model Visualization tool (IMV) has been developed which makes 'peeking into' the GM simple and intuitive through a web-based interface. For example, it is now possible to access the probability distributions for attributes of events and arrivals, such as the detection rate for each station for each of 14 phases. It also clarifies the assumptions and prior knowledge that are incorporated into NET-VISA's event determination. When NET-VISA is retrained, the IMV will be a visual tool for quality control, both as a means of testing that the training has been accomplished correctly and that the IMS network has not changed unexpectedly. A preview of the IMV will be shown at this poster presentation. (Figure: the IMV homepage, showing the current model file and a reference image.)
Allocating provider resources to diagnose and treat restless legs syndrome: a cost-utility analysis.
Padula, William V; Phelps, Charles E; Moran, Dane; Earley, Christopher
2017-10-01
Restless legs syndrome (RLS) is a neurological disorder that is frequently misdiagnosed, resulting in delays in proper treatment. The objective of this study was to analyze the cost-utility of training primary care providers (PCP) in early and accurate diagnosis of RLS. We used a Markov model to compare two strategies: one where PCPs received training to diagnose RLS (informed care) and one where PCPs did not receive training (standard care). This analysis was conducted from the US societal and health sector perspectives over one-year, five-year, and lifetime (50-year) horizons. Costs were adjusted to 2016 USD, utilities measured as quality-adjusted life-years (QALYs), and both measures were discounted annually at 3%. Cost, utilities, and probabilities for the model were obtained through a comprehensive review of literature. An incremental cost-effectiveness ratio (ICER) was calculated to interpret our findings at a willingness-to-pay threshold of $100,000/QALY. Univariate and multivariate analyses were conducted to test model uncertainty, in addition to calculating the expected value of perfect information. Providing training to PCPs to correctly diagnose RLS was cost-effective since it cost $2021 more and gained 0.44 QALYs per patient over the course of a lifetime, resulting in an ICER of $4593/QALY. The model was sensitive to the utility for treated and untreated RLS. The probabilistic sensitivity analysis revealed that at $100,000/QALY, informed care had a 65.5% probability of being cost-effective. A program to train PCPs to better diagnose RLS appears to be a cost-effective strategy for improving outcomes for RLS patients. Copyright © 2017 Elsevier B.V. All rights reserved.
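To illustrate the mechanics of the comparison described above, the sketch below runs a two-strategy Markov cohort model, discounts costs and QALYs at 3% per year, and computes the incremental cost-effectiveness ratio (ICER). The states, transition probabilities, costs and utilities are hypothetical placeholders, not the values used in the paper.

```python
import numpy as np

# Sketch of a two-strategy Markov cohort cost-utility comparison with 3%
# annual discounting and an ICER. All parameter values are hypothetical.

states = ["undiagnosed_RLS", "treated_RLS", "dead"]
utilities = np.array([0.70, 0.85, 0.0])          # QALY weight per year in each state
costs = np.array([1500.0, 2200.0, 0.0])          # annual cost per state (USD)

def run_cohort(p_diagnose_per_year, years=50, discount=0.03):
    # Undiagnosed patients move to "treated" with the strategy-specific
    # diagnosis probability; a small constant mortality applies to both alive states.
    p_die = 0.02
    T = np.array([
        [1 - p_diagnose_per_year - p_die, p_diagnose_per_year, p_die],
        [0.0, 1 - p_die, p_die],
        [0.0, 0.0, 1.0],
    ])
    dist = np.array([1.0, 0.0, 0.0])              # cohort starts undiagnosed
    total_cost = total_qaly = 0.0
    for year in range(years):
        d = 1.0 / (1.0 + discount) ** year
        total_cost += d * dist @ costs
        total_qaly += d * dist @ utilities
        dist = dist @ T
    return total_cost, total_qaly

cost_std, qaly_std = run_cohort(p_diagnose_per_year=0.10)   # standard care
cost_inf, qaly_inf = run_cohort(p_diagnose_per_year=0.30)   # informed care (trained PCPs)
icer = (cost_inf - cost_std) / (qaly_inf - qaly_std)
print(f"ICER = ${icer:,.0f} per QALY gained")
```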
Endoscopic third ventriculostomy in the treatment of childhood hydrocephalus.
Kulkarni, Abhaya V; Drake, James M; Mallucci, Conor L; Sgouros, Spyros; Roth, Jonathan; Constantini, Shlomi
2009-08-01
To develop a model to predict the probability of endoscopic third ventriculostomy (ETV) success in the treatment for hydrocephalus on the basis of a child's individual characteristics. We analyzed 618 ETVs performed consecutively on children at 12 international institutions to identify predictors of ETV success at 6 months. A multivariable logistic regression model was developed on 70% of the dataset (training set) and validated on 30% of the dataset (validation set). In the training set, 305/455 ETVs (67.0%) were successful. The regression model (containing patient age, cause of hydrocephalus, and previous cerebrospinal fluid shunt) demonstrated good fit (Hosmer-Lemeshow, P = .78) and discrimination (C statistic = 0.70). In the validation set, 105/163 ETVs (64.4%) were successful and the model maintained good fit (Hosmer-Lemeshow, P = .45), discrimination (C statistic = 0.68), and calibration (calibration slope = 0.88). A simplified ETV Success Score was devised that closely approximates the predicted probability of ETV success. Children most likely to succeed with ETV can now be accurately identified and spared the long-term complications of CSF shunting.
NASA Astrophysics Data System (ADS)
Matsunaga, Y.; Sugita, Y.
2018-06-01
A data-driven modeling scheme is proposed for conformational dynamics of biomolecules based on molecular dynamics (MD) simulations and experimental measurements. In this scheme, an initial Markov State Model (MSM) is constructed from MD simulation trajectories, and then, the MSM parameters are refined using experimental measurements through machine learning techniques. The second step can reduce the bias of MD simulation results due to inaccurate force-field parameters. Either time-series trajectories or ensemble-averaged data are available as a training data set in the scheme. Using a coarse-grained model of a dye-labeled polyproline-20, we compare the performance of machine learning estimations from the two types of training data sets. Machine learning from time-series data could provide the equilibrium populations of conformational states as well as their transition probabilities. It estimates hidden conformational states in more robust ways compared to that from ensemble-averaged data although there are limitations in estimating the transition probabilities between minor states. We discuss how to use the machine learning scheme for various experimental measurements including single-molecule time-series trajectories.
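The first step of the scheme described above can be sketched compactly: a Markov State Model is built from discretized trajectories by counting transitions at a chosen lag time and row-normalizing to obtain transition probabilities, from which equilibrium populations follow. The refinement against experimental measurements is not shown, and the trajectory below is a synthetic placeholder.

```python
import numpy as np

# Sketch of MSM construction from a discretized trajectory: transition counts
# at a lag time, row-normalized transition matrix, and equilibrium populations.

def estimate_msm(trajectory, n_states, lag=1):
    """Count transitions at the given lag and return the transition matrix."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(trajectory[:-lag], trajectory[lag:]):
        counts[i, j] += 1
    return counts / counts.sum(axis=1, keepdims=True)

rng = np.random.default_rng(7)
true_T = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.1],
                   [0.0, 0.2, 0.8]])
traj = [0]
for _ in range(20000):
    traj.append(rng.choice(3, p=true_T[traj[-1]]))

T = estimate_msm(np.array(traj), n_states=3, lag=1)
# Equilibrium populations: the left eigenvector of T with eigenvalue 1.
w, v = np.linalg.eig(T.T)
pi = np.real(v[:, np.argmax(np.real(w))])
print("transition matrix:\n", np.round(T, 2))
print("equilibrium populations:", np.round(pi / pi.sum(), 2))
```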
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier which deals with the imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of means and covariance matrices of the classes from the training data samples, and achieves promising performance. In this paper, we develop a novel yet critical extension training algorithm for BMPM that is based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivatives of the biased classification model, and reformulate it into an SOCP problem which could be efficiently solved with global optima guarantee. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme in comparison with traditional solutions on medical diagnosis tasks where the objectives are to focus on improving the sensitivity (the accuracy of the more important class, say "ill" samples) instead of the overall accuracy of the classification. Empirical results have shown that our method is more effective and robust to handle imbalanced classification problems than traditional classification approaches, and the original Fractional Programming-based BMPM (BMPMFP).
Pointwise probability reinforcements for robust statistical inference.
Frénay, Benoît; Verleysen, Michel
2014-02-01
Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include, e.g., outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR, and a regularisation allows controlling the amount of reinforcement that compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation. Copyright © 2013 Elsevier Ltd. All rights reserved.
Estimating synaptic parameters from mean, variance, and covariance in trains of synaptic responses.
Scheuss, V; Neher, E
2001-10-01
Fluctuation analysis of synaptic transmission using the variance-mean approach has been restricted in the past to steady-state responses. Here we extend this method to short repetitive trains of synaptic responses, during which the response amplitudes are not stationary. We consider intervals between trains, long enough so that the system is in the same average state at the beginning of each train. This allows analysis of ensemble means and variances for each response in a train separately. Thus, modifications in synaptic efficacy during short-term plasticity can be attributed to changes in synaptic parameters. In addition, we provide practical guidelines for the analysis of the covariance between successive responses in trains. Explicit algorithms to estimate synaptic parameters are derived and tested by Monte Carlo simulations on the basis of a binomial model of synaptic transmission, allowing for quantal variability, heterogeneity in the release probability, and postsynaptic receptor saturation and desensitization. We find that the combined analysis of variance and covariance is advantageous in yielding an estimate for the number of release sites, which is independent of heterogeneity in the release probability under certain conditions. Furthermore, it allows one to calculate the apparent quantal size for each response in a sequence of stimuli.
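The starting point of the method extended in this paper is the classic steady-state variance-mean relation: under a simple binomial release model with N sites, release probability P and quantal size Q, the variance of the response amplitude I is a parabola in the mean, Var = Q*I - I^2/N (quantal variability neglected), so fitting a parabola to responses recorded at different release probabilities recovers Q and N. The sketch below illustrates this baseline case with simulated data; the paper's extension to nonstationary trains and to covariances between successive responses is not reproduced.

```python
import numpy as np

# Sketch of steady-state variance-mean analysis under a binomial release
# model. True N and Q are chosen arbitrarily and recovered from the parabola fit.

rng = np.random.default_rng(8)
N_true, Q_true = 20, 5.0
release_probs = np.linspace(0.1, 0.9, 9)

means, variances = [], []
for P in release_probs:
    amplitudes = Q_true * rng.binomial(N_true, P, size=200)   # 200 trials per condition
    means.append(amplitudes.mean())
    variances.append(amplitudes.var(ddof=1))
means, variances = np.array(means), np.array(variances)

# Fit Var = Q*I - I^2/N (a parabola through the origin) by least squares.
A = np.column_stack([means, -means ** 2])
q_est, inv_n_est = np.linalg.lstsq(A, variances, rcond=None)[0]
print("estimated Q:", round(q_est, 2), " estimated N:", round(1.0 / inv_n_est, 1))
```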
NASA Astrophysics Data System (ADS)
Perreault Levasseur, Laurence; Hezaveh, Yashar D.; Wechsler, Risa H.
2017-11-01
In Hezaveh et al. we showed that deep learning can be used for model parameter estimation and trained convolutional neural networks to determine the parameters of strong gravitational-lensing systems. Here we demonstrate a method for obtaining the uncertainties of these parameters. We review the framework of variational inference to obtain approximate posteriors of Bayesian neural networks and apply it to a network trained to estimate the parameters of the Singular Isothermal Ellipsoid plus external shear and total flux magnification. We show that the method can capture the uncertainties due to different levels of noise in the input data, as well as training and architecture-related errors made by the network. To evaluate the accuracy of the resulting uncertainties, we calculate the coverage probabilities of marginalized distributions for each lensing parameter. By tuning a single variational parameter, the dropout rate, we obtain coverage probabilities approximately equal to the confidence levels for which they were calculated, resulting in accurate and precise uncertainty estimates. Our results suggest that the application of approximate Bayesian neural networks to astrophysical modeling problems can be a fast alternative to Markov chain Monte Carlo methods, allowing orders of magnitude improvement in speed.
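A compact sketch of how the coverage-probability check described above can be computed once posterior samples exist for each parameter (for example, repeated forward passes with dropout left on). The arrays below are synthetic stand-ins, not lensing posteriors.

```python
import numpy as np

def coverage(posterior_samples, truths, level=0.683):
    """Fraction of cases whose true value falls inside the central `level` interval."""
    lo = np.percentile(posterior_samples, 100 * (1 - level) / 2, axis=1)
    hi = np.percentile(posterior_samples, 100 * (1 + level) / 2, axis=1)
    return np.mean((truths >= lo) & (truths <= hi))

rng = np.random.default_rng(2)
truths = rng.normal(size=500)                                          # true parameter values (synthetic)
samples = truths[:, None] + rng.normal(scale=1.0, size=(500, 1000))    # e.g., MC-dropout draws per case
for level in (0.683, 0.954):
    print(f"nominal {level:.3f} -> empirical {coverage(samples, truths, level):.3f}")
```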
Training Teachers to Teach Probability
ERIC Educational Resources Information Center
Batanero, Carmen; Godino, Juan D.; Roa, Rafael
2004-01-01
In this paper we analyze the reasons why the teaching of probability is difficult for mathematics teachers, describe the contents needed in the didactical preparation of teachers to teach probability and analyze some examples of activities to carry out this training. These activities take into account the experience at the University of Granada,…
NASA Astrophysics Data System (ADS)
Mondini, Alessandro C.; Chang, Kang-Tsung; Chiang, Shou-Hao; Schlögel, Romy; Notarnicola, Claudia; Saito, Hitoshi
2017-12-01
We propose a framework to systematically generate event landslide inventory maps from satellite images in southern Taiwan, where landslides are frequent and abundant. The spectral information is used to assess the pixel land cover class membership probability through a Maximum Likelihood classifier trained with randomly generated synthetic land cover spectral fingerprints, which are obtained from an independent dataset of training images. Pixels are classified as landslides when the calculated landslide class membership probability, weighted by a susceptibility model, is higher than the membership probabilities of the other classes. We generated synthetic fingerprints from two FORMOSAT-2 images acquired in 2009 and tested the procedure on two other images, one acquired in 2005 and the other in 2009. We also obtained two landslide maps through manual interpretation. The agreement between the two sets of inventories is given by Cohen's kappa coefficients of 0.62 and 0.64, respectively. This procedure can now classify a new FORMOSAT-2 image automatically, facilitating the production of landslide inventory maps.
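A toy version of the decision rule described above, under invented class statistics: per-pixel class likelihoods from a Gaussian maximum-likelihood classifier are combined with a landslide-susceptibility prior before taking the argmax. Only the weighting step is illustrated; the synthetic-fingerprint training is not reproduced.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)

# Invented spectral "fingerprints" (mean vector, shared covariance) for three classes.
classes = ["landslide", "forest", "water"]
means = {"landslide": [0.35, 0.30], "forest": [0.10, 0.45], "water": [0.05, 0.05]}
cov = np.eye(2) * 0.01

pixels = rng.uniform(0, 0.5, size=(5, 2))      # 5 pixels, 2 spectral bands
susceptibility = rng.uniform(0, 1, size=5)     # prior landslide probability per pixel (placeholder)

for x, s in zip(pixels, susceptibility):
    like = {c: multivariate_normal(means[c], cov).pdf(x) for c in classes}
    # Simplification: weight only the landslide likelihood by the susceptibility model.
    weighted = {c: (like[c] * s if c == "landslide" else like[c]) for c in classes}
    label = max(weighted, key=weighted.get)
    print(label, {c: round(v, 4) for c, v in weighted.items()})
```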
Wang, Chen; Lu, Linjun; Lu, Jian; Wang, Tao
2016-01-01
In order to improve motorcycle safety, this article examines the correlation between crash avoidance maneuvers and injury severity sustained by motorcyclists, under multiple precrash conditions. Ten-year crash data for single-vehicle motorcycle crashes from the General Estimates Systems (GES) were analyzed, using partial proportional odds models (i.e., generalized ordered logit models). The modeling results show that "braking (no lock-up)" is associated with a higher probability of increased severity, whereas "braking (lock-up)" is associated with a higher probability of decreased severity, under all precrash conditions. "Steering" is associated with a higher probability of reduced injury severity when other vehicles are encroaching, whereas it is correlated with high injury severity under other conditions. "Braking and steering" is significantly associated with a higher probability of low severity under "animal encounter and object presence," whereas it is surprisingly correlated with high injury severity when motorcycles are traveling off the edge of the road. The results also show that a large number of motorcyclists did not perform any crash avoidance maneuvers or conducted crash avoidance maneuvers that are significantly associated with high injury severity. In general, this study suggests that precrash maneuvers are an important factor associated with motorcyclists' injury severity. To improve motorcycle safety, training/educational programs should be considered to improve safety awareness and adjust driving habits of motorcyclists. Antilock brakes and such systems are also promising, because they could effectively prevent brake lock-up and assist motorcyclists in maneuvering during critical conditions. This study also provides valuable information for the design of motorcycle training curriculum.
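For readers unfamiliar with generalized ordered (partial proportional odds) logit models, a small hand-coded illustration of how category probabilities are obtained: some coefficients are held constant across severity cut points while others vary by cut point. All coefficient values below are invented, not estimates from the GES data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gen_ordered_logit_probs(x, tau, beta, gamma):
    """P(severity = j) for ordered categories 0..J under a partial proportional odds model.
    P(Y > j) = sigmoid(tau_j + x.beta + x.gamma_j); beta is common, gamma_j varies by cut point."""
    J = len(tau)                                           # number of cut points (J + 1 categories)
    exceed = np.array([sigmoid(tau[j] + x @ beta + x @ gamma[j]) for j in range(J)])
    cum = np.concatenate(([1.0], exceed, [0.0]))           # P(Y > -1) = 1, ..., P(Y > J) = 0
    return -np.diff(cum)                                   # P(Y = j) = P(Y > j-1) - P(Y > j)

# Two invented precrash-maneuver covariates: [braking_no_lockup, steering]
x = np.array([1.0, 0.0])
tau = np.array([0.5, -0.4, -1.5])                          # 3 cut points -> 4 severity levels
beta = np.array([0.3, -0.2])                               # effects constrained equal across cut points
gamma = np.array([[0.0, 0.0], [0.2, -0.1], [0.4, 0.0]])    # cut-point-specific deviations
probs = gen_ordered_logit_probs(x, tau, beta, gamma)
print(probs, probs.sum())                                  # probabilities over the 4 severity levels
```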
LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.
Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu
2005-01-01
Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.
Real-time individual predictions of prostate cancer recurrence using joint models
Taylor, Jeremy M. G.; Park, Yongseok; Ankerst, Donna P.; Proust-Lima, Cecile; Williams, Scott; Kestin, Larry; Bae, Kyoungwha; Pickles, Tom; Sandler, Howard
2012-01-01
Summary Patients who were previously treated for prostate cancer with radiation therapy are monitored at regular intervals using a laboratory test called Prostate Specific Antigen (PSA). If the value of the PSA test starts to rise, this is an indication that the prostate cancer is more likely to recur, and the patient may wish to initiate new treatments. Such patients could be helped in making medical decisions by an accurate estimate of the probability of recurrence of the cancer in the next few years. In this paper, we describe the methodology for giving the probability of recurrence for a new patient, as implemented on a web-based calculator. The methods use a joint longitudinal survival model. The model is developed on a training dataset of 2,386 patients and tested on a dataset of 846 patients. Bayesian estimation methods are used with one Markov chain Monte Carlo (MCMC) algorithm developed for estimation of the parameters from the training dataset and a second quick MCMC developed for prediction of the risk of recurrence that uses the longitudinal PSA measures from a new patient. PMID:23379600
Quality assessment of a new surgical simulator for neuroendoscopic training.
Filho, Francisco Vaz Guimarães; Coelho, Giselle; Cavalheiro, Sergio; Lyra, Marcos; Zymberg, Samuel T
2011-04-01
Ideal surgical training models should be entirely reliable, atoxic, easy to handle, and, if possible, low cost. All available models have their advantages and disadvantages. The choice of one or another will depend on the type of surgery to be performed. The authors created an anatomical model called the S.I.M.O.N.T. (Sinus Model Oto-Rhino Neuro Trainer) Neurosurgical Endotrainer, which can provide reliable neuroendoscopic training. The aim in the present study was to assess both the quality of the model and the development of surgical skills by trainees. The S.I.M.O.N.T. is built of a synthetic thermoretractable, thermosensible rubber called Neoderma, which, combined with different polymers, produces more than 30 different formulas. Quality assessment of the model was based on qualitative and quantitative data obtained from training sessions with 9 experienced and 13 inexperienced neurosurgeons. The techniques used for evaluation were face validation, retest and interrater reliability, and construct validation. The experts considered the S.I.M.O.N.T. capable of reproducing surgical situations as if they were real and presenting great similarity with the human brain. Surgical results of serial training showed that the model could be considered precise. Finally, development and improvement in surgical skills by the trainees were observed and considered relevant to further training. It was also observed that the probability of any single error was dramatically decreased after each training session, with a mean reduction of 41.65% (range 38.7%-45.6%). Neuroendoscopic training has some specific requirements. A unique set of instruments is required, as is a model that can resemble real-life situations. The S.I.M.O.N.T. is a new alternative model specially designed for this purpose. Validation techniques followed by precision assessments attested to the model's feasibility.
Learning with imperfectly labeled patterns
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of learning in pattern recognition using imperfectly labeled patterns is considered. The performance of the Bayes and nearest neighbor classifiers with imperfect labels is discussed using a probabilistic model for the mislabeling of the training patterns. Schemes for training the classifier using both parametric and nonparametric techniques are presented. Methods for the correction of imperfect labels are developed. To gain an understanding of the learning process, expressions are derived for success probability as a function of training time for a one-dimensional increment error correction classifier with imperfect labels. Feature selection with imperfectly labeled patterns is described.
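A compact illustration of one standard way a probabilistic mislabeling model can be exploited, assuming the label-flip probabilities are known: the observed class frequencies are related to the true ones through the mislabeling matrix, so estimates can be corrected by inverting it. The flip rates and data are synthetic; this is not the scheme from the report itself.

```python
import numpy as np

rng = np.random.default_rng(4)

# Mislabeling model: T[i, j] = P(observed label = i | true label = j).
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])

true_labels = rng.integers(0, 2, size=10000)
observed = np.array([rng.choice(2, p=T[:, t]) for t in true_labels])   # imperfectly labeled data

# Observed class frequencies equal T @ (true frequencies); invert T to correct them.
obs_freq = np.bincount(observed, minlength=2) / observed.size
corrected = np.linalg.solve(T, obs_freq)
true_freq = np.bincount(true_labels, minlength=2) / true_labels.size
print("observed:", obs_freq, "corrected:", corrected, "true:", true_freq)
```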
a Cloud Boundary Detection Scheme Combined with Aslic and Cnn Using ZY-3, GF-1/2 Satellite Imagery
NASA Astrophysics Data System (ADS)
Guo, Z.; Li, C.; Wang, Z.; Kwok, E.; Wei, X.
2018-04-01
Remote sensing optical image cloud detection is one of the most important problems in remote sensing data processing. Aiming at the information loss caused by cloud cover, a cloud detection method based on a convolutional neural network (CNN) is presented in this paper. Firstly, a deep CNN is used to extract the multi-level feature generation model of cloud from the training samples. Secondly, the adaptive simple linear iterative clustering (ASLIC) method is used to divide the detected images into superpixels. Finally, the probability of each superpixel belonging to the cloud region is predicted by the trained network model, thereby generating a cloud probability map. Typical regions of GF-1/2 and ZY-3 imagery were selected to carry out the cloud detection test, and the method was compared with the traditional SLIC method. The experiment results show that the average accuracy of cloud detection is increased by more than 5%, and that both thin and thick clouds, as well as whole cloud boundaries, can be detected well on different imaging platforms.
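A simplified sketch of the last two steps, assuming scikit-image is available and using standard SLIC in place of the adaptive variant: a per-pixel cloud probability map (random numbers standing in for CNN output) is averaged within each superpixel and then thresholded.

```python
import numpy as np
from skimage.segmentation import slic

rng = np.random.default_rng(5)
image = rng.random((128, 128, 3)).astype(np.float32)   # stand-in for a satellite scene
pixel_cloud_prob = rng.random((128, 128))              # stand-in for CNN per-pixel cloud output

segments = slic(image, n_segments=200, compactness=10, start_label=0)

superpixel_prob = np.zeros_like(pixel_cloud_prob)
for label in np.unique(segments):
    mask = segments == label
    superpixel_prob[mask] = pixel_cloud_prob[mask].mean()   # cloud probability per superpixel

cloud_mask = superpixel_prob > 0.5
print("cloud fraction:", cloud_mask.mean())
```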
Behavior Knowledge Space-Based Fusion for Copy-Move Forgery Detection.
Ferreira, Anselmo; Felipussi, Siovani C; Alfaro, Carlos; Fonseca, Pablo; Vargas-Munoz, John E; Dos Santos, Jefersson A; Rocha, Anderson
2016-07-20
The detection of copy-move image tampering is of paramount importance nowadays, mainly due to its potential use for misleading the opinion-forming process of the general public. In this paper, we go beyond traditional forgery detectors and aim at combining different properties of copy-move detection approaches by modeling the problem on a multiscale behavior knowledge space, which encodes the output combinations of different techniques as a priori probabilities considering multiple scales of the training data. Afterwards, missing entries in the conditional probabilities are estimated through generative models applied to the existing training data. Finally, we propose different techniques that exploit the multi-directionality of the data to generate the final outcome detection map in a machine learning decision-making fashion. Experimental results on complex datasets, comparing the proposed techniques with a gamut of copy-move detection approaches and other fusion methodologies in the literature, show the effectiveness of the proposed method and its suitability for real-world applications.
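A minimal sketch of the behavior knowledge space idea in its simplest (single-scale, binary-detector) form: each combination of detector outputs seen in training indexes an empirical posterior probability of "forged", which is then looked up at test time. Detector outputs here are simulated, and the multiscale and generative-model refinements from the paper are omitted.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(6)
n_train = 5000
truth = rng.integers(0, 2, size=n_train)                   # 1 = forged region present

def noisy_detector(y, acc):
    """Simulated detector that is correct with probability `acc`."""
    return np.where(rng.random(y.size) < acc, y, 1 - y)

outputs = np.stack([noisy_detector(truth, a) for a in (0.85, 0.75, 0.65)], axis=1)

# Behavior knowledge space: empirical P(forged | combination of detector outputs).
counts = defaultdict(lambda: [0, 0])
for combo, y in zip(map(tuple, outputs), truth):
    counts[combo][y] += 1
bks = {combo: c[1] / (c[0] + c[1]) for combo, c in counts.items()}

test_combo = (1, 0, 1)
print("P(forged |", test_combo, ") =", round(bks.get(test_combo, 0.5), 3))
```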
Automated segmentation of dental CBCT image with prior-guided sequential random forests
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Li; Gao, Yaozong; Shi, Feng
Purpose: Cone-beam computed tomography (CBCT) is an increasingly utilized imaging modality for the diagnosis and treatment planning of the patients with craniomaxillofacial (CMF) deformities. Accurate segmentation of CBCT image is an essential step to generate 3D models for the diagnosis and treatment planning of the patients with CMF deformities. However, due to the image artifacts caused by beam hardening, imaging noise, inhomogeneity, truncation, and maximal intercuspation, it is difficult to segment the CBCT. Methods: In this paper, the authors present a new automatic segmentation method to address these problems. Specifically, the authors first employ a majority voting method to estimate the initial segmentation probability maps of both mandible and maxilla based on multiple aligned expert-segmented CBCT images. These probability maps provide an important prior guidance for CBCT segmentation. The authors then extract both the appearance features from CBCTs and the context features from the initial probability maps to train the first-layer of random forest classifier that can select discriminative features for segmentation. Based on the first-layer of trained classifier, the probability maps are updated, which will be employed to further train the next layer of random forest classifier. By iteratively training the subsequent random forest classifier using both the original CBCT features and the updated segmentation probability maps, a sequence of classifiers can be derived for accurate segmentation of CBCT images. Results: Segmentation results on CBCTs of 30 subjects were both quantitatively and qualitatively validated based on manually labeled ground truth. The average Dice ratios of mandible and maxilla by the authors’ method were 0.94 and 0.91, respectively, which are significantly better than the state-of-the-art method based on sparse representation (p-value < 0.001). Conclusions: The authors have developed and validated a novel fully automated method for CBCT segmentation.
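A toy version of the iterative (auto-context-style) idea described in the Methods, shown with synthetic feature vectors rather than CBCT patches: a first random forest is trained on appearance features, its cross-validated probability output is appended as a context feature, and a second forest is trained on the augmented features. Only two layers are shown, and the prior probability maps from majority voting are not reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 20))                        # stand-in for appearance features
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.8, size=2000)) > 0.5

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

X_tr_aug, X_te_aug = X_tr, X_te
for layer in range(2):
    rf = RandomForestClassifier(n_estimators=200, random_state=layer)
    # Out-of-fold probabilities act as the "context" feature for the next layer.
    ctx_tr = cross_val_predict(rf, X_tr_aug, y_tr, cv=5, method="predict_proba")[:, 1]
    rf.fit(X_tr_aug, y_tr)
    ctx_te = rf.predict_proba(X_te_aug)[:, 1]
    X_tr_aug = np.column_stack([X_tr_aug, ctx_tr])
    X_te_aug = np.column_stack([X_te_aug, ctx_te])
    print(f"layer {layer}: test accuracy = {accuracy_score(y_te, ctx_te > 0.5):.3f}")
```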
ERIC Educational Resources Information Center
Iyioke, Ifeoma Chika
2013-01-01
This dissertation describes a design for training, in accordance with probability judgment heuristics principles, for the Angoff standard setting method. The new training with instruction, practice, and feedback tailored to the probability judgment heuristics principles was called the Heuristic training and the prevailing Angoff method training…
NASA Astrophysics Data System (ADS)
Zhu, Zhe; Gallant, Alisa L.; Woodcock, Curtis E.; Pengra, Bruce; Olofsson, Pontus; Loveland, Thomas R.; Jin, Suming; Dahal, Devendra; Yang, Limin; Auch, Roger F.
2016-12-01
The U.S. Geological Survey's Land Change Monitoring, Assessment, and Projection (LCMAP) initiative is a new end-to-end capability to continuously track and characterize changes in land cover, use, and condition to better support research and applications relevant to resource management and environmental change. Among the LCMAP product suite are annual land cover maps that will be available to the public. This paper describes an approach to optimize the selection of training and auxiliary data for deriving the thematic land cover maps based on all available clear observations from Landsats 4-8. Training data were selected from map products of the U.S. Geological Survey's Land Cover Trends project. The Random Forest classifier was applied for different classification scenarios based on the Continuous Change Detection and Classification (CCDC) algorithm. We found that extracting training data proportionally to the occurrence of land cover classes was superior to an equal distribution of training data per class, and suggest using a total of 20,000 training pixels to classify an area about the size of a Landsat scene. The problem of unbalanced training data was alleviated by extracting a minimum of 600 training pixels and a maximum of 8000 training pixels per class. We additionally explored removing outliers contained within the training data based on their spectral and spatial criteria, but observed no significant improvement in classification results. We also tested the importance of different types of auxiliary data that were available for the conterminous United States, including: (a) five variables used by the National Land Cover Database, (b) three variables from the cloud screening "Function of mask" (Fmask) statistics, and (c) two variables from the change detection results of CCDC. We found that auxiliary variables such as a Digital Elevation Model and its derivatives (aspect, position index, and slope), potential wetland index, water probability, snow probability, and cloud probability improved the accuracy of land cover classification. Compared to the original strategy of the CCDC algorithm (500 pixels per class), the use of the optimal strategy improved the classification accuracies substantially (15-percentage point increase in overall accuracy and 4-percentage point increase in minimum accuracy).
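A small helper illustrating the sampling strategy the paper recommends: allocate a 20,000-pixel training budget proportionally to class occurrence, but clamp each class between 600 and 8,000 pixels. The land-cover proportions below are invented, and the clamped totals intentionally need not sum exactly to the budget.

```python
import numpy as np

def allocate_training_pixels(class_fractions, total=20000, min_per_class=600, max_per_class=8000):
    """Proportional allocation of training pixels with a per-class floor and ceiling."""
    raw = {c: f * total for c, f in class_fractions.items()}
    return {c: int(np.clip(n, min_per_class, max_per_class)) for c, n in raw.items()}

# Invented land-cover proportions for a Landsat-scene-sized area.
fractions = {"forest": 0.45, "cropland": 0.30, "grass": 0.12, "developed": 0.08,
             "water": 0.03, "wetland": 0.015, "barren": 0.005}
alloc = allocate_training_pixels(fractions)
print(alloc, "total =", sum(alloc.values()))
```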
Introductory Life Science Mathematics and Quantitative Neuroscience Courses
ERIC Educational Resources Information Center
Duffus, Dwight; Olifer, Andrei
2010-01-01
We describe two sets of courses designed to enhance the mathematical, statistical, and computational training of life science undergraduates at Emory College. The first course is an introductory sequence in differential and integral calculus, modeling with differential equations, probability, and inferential statistics. The second is an…
Troubleshooting Complex Equipment in the Military Services: Research and Prospects.
1979-12-01
the maintenance concept for a given prime * ,equipeent. The concept defines levels of maintenance, the arrangements for support by mobile and depot...Naval Training Bquipment Center, 1975. Ross, S. AUmlied probability models with otimization aplications . San Francisco: Holden-Day, 1970. use, W.B. Human
Predicting space telerobotic operator training performance from human spatial ability assessment
NASA Astrophysics Data System (ADS)
Liu, Andrew M.; Oman, Charles M.; Galvan, Raquel; Natapoff, Alan
2013-11-01
Our goal was to determine whether existing tests of spatial ability can predict an astronaut's qualification test performance after robotic training. Because training astronauts to be qualified robotics operators is so long and expensive, NASA is interested in tools that can predict robotics performance before training begins. Currently, the Astronaut Office does not have a validated tool to predict robotics ability as part of its astronaut selection or training process. Commonly used tests of human spatial ability may provide such a tool to predict robotics ability. We tested the spatial ability of 50 active astronauts who had completed at least one robotics training course, then used logistic regression models to analyze the correlation between spatial ability test scores and the astronauts' performance in their evaluation test at the end of the training course. The fit of the logistic function to our data is statistically significant for several spatial tests. However, the prediction performance of the logistic model depends on the criterion threshold assumed. To clarify the critical selection issues, we show how the probability of correct classification vs. misclassification varies as a function of the mental rotation test criterion level. Since the costs of misclassification are low, the logistic models of spatial ability and robotic performance are reliable enough only to be used to customize regular and remedial training. We suggest several changes in tracking performance throughout robotics training that could improve the range and reliability of predictive models.
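A hedged sketch of the kind of analysis described, using simulated scores and outcomes rather than the astronaut data: fit a logistic regression of pass/fail on a spatial-ability score, then show how correct classification and misclassification rates shift as the decision criterion moves.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
score = rng.normal(50, 10, size=50)                   # e.g., mental rotation test score (simulated)
pass_prob = 1 / (1 + np.exp(-(score - 50) / 5))
passed = rng.random(50) < pass_prob                   # simulated qualification outcome

model = LogisticRegression().fit(score.reshape(-1, 1), passed)
p = model.predict_proba(score.reshape(-1, 1))[:, 1]

for threshold in (0.3, 0.5, 0.7):
    pred = p >= threshold
    correct = np.mean(pred == passed)
    print(f"criterion {threshold}: correct = {correct:.2f}, misclassified = {1 - correct:.2f}")
```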
On splice site prediction using weight array models: a comparison of smoothing techniques
NASA Astrophysics Data System (ADS)
Taher, Leila; Meinicke, Peter; Morgenstern, Burkhard
2007-11-01
In most eukaryotic genes, protein-coding exons are separated by non-coding introns which are removed from the primary transcript by a process called "splicing". The positions where introns are cut and exons are spliced together are called "splice sites". Thus, computational prediction of splice sites is crucial for gene finding in eukaryotes. Weight array models are a powerful probabilistic approach to splice site detection. Parameters for these models are usually derived from m-tuple frequencies in trusted training data and subsequently smoothed to avoid zero probabilities. In this study we compare three different ways of parameter estimation for m-tuple frequencies, namely (a) non-smoothed probability estimation, (b) standard pseudo counts and (c) a Gaussian smoothing procedure that we recently developed.
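A minimal comparison of the first two estimators mentioned, raw relative frequencies of m-tuples versus standard (add-one) pseudocounts; the Gaussian smoothing procedure is specific to the paper and not reproduced. Sequences below are random stand-ins for aligned splice-site training windows.

```python
import numpy as np
from collections import Counter
from itertools import product

rng = np.random.default_rng(9)
alphabet = "ACGT"
m = 3
# Random stand-ins for aligned training windows around a splice site.
training = ["".join(rng.choice(list(alphabet), size=8)) for _ in range(100)]

# Count m-tuples at one fixed position of the window (here, the first m bases).
counts = Counter(seq[:m] for seq in training)
tuples = ["".join(t) for t in product(alphabet, repeat=m)]
total = sum(counts.values())

raw    = {t: counts[t] / total for t in tuples}                        # may contain zeros
pseudo = {t: (counts[t] + 1) / (total + len(tuples)) for t in tuples}  # add-one pseudocounts

zeros = [t for t in tuples if raw[t] == 0.0]
print(f"{len(zeros)} of {len(tuples)} m-tuples have zero raw probability")
if zeros:
    print(f"example: raw P({zeros[0]}) = 0.0, smoothed P({zeros[0]}) = {pseudo[zeros[0]]:.4f}")
```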
A collision model for safety evaluation of autonomous intelligent cruise control.
Touran, A; Brackstone, M A; McDonald, M
1999-09-01
This paper describes a general framework for safety evaluation of autonomous intelligent cruise control in rear-end collisions. Using data and specifications from prototype devices, two collision models are developed. One model considers a train of four cars, one of which is equipped with autonomous intelligent cruise control. This model considers the car in front and two cars following the equipped car. In the second model, none of the cars is equipped with the device. Each model can predict the possibility of rear-end collision between cars under various conditions by calculating the remaining distance between cars after the front car brakes. Comparing the two collision models allows one to evaluate the effectiveness of autonomous intelligent cruise control in preventing collisions. The models are then subjected to Monte Carlo simulation to calculate the probability of collision. Based on crash probabilities, an expected value is calculated for the number of cars involved in any collision. It is found that given the model assumptions, while equipping a car with autonomous intelligent cruise control can significantly reduce the probability of the collision with the car ahead, it may adversely affect the situation for the following cars.
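A stripped-down sketch of the Monte Carlo step for a two-car case: sample driver reaction times and decelerations, compare stopping distances of the lead and following car, and count the fraction of runs in which the remaining gap goes negative. The parameter distributions are invented and are not those of the paper's prototype AICC devices.

```python
import numpy as np

rng = np.random.default_rng(10)
n_runs = 100_000

v = 30.0                                                       # initial speed of both cars, m/s
gap = rng.normal(25.0, 5.0, n_runs)                            # initial headway, m
reaction = rng.lognormal(mean=0.0, sigma=0.3, size=n_runs)     # follower reaction time, s
a_lead = rng.uniform(6.0, 9.0, n_runs)                         # lead-car deceleration, m/s^2
a_follow = rng.uniform(5.0, 8.0, n_runs)                       # follower deceleration, m/s^2

stop_lead = v**2 / (2 * a_lead)
stop_follow = v * reaction + v**2 / (2 * a_follow)
remaining = gap + stop_lead - stop_follow                      # gap left after both cars stop

p_collision = np.mean(remaining < 0)                           # crude collision criterion
print(f"estimated collision probability: {p_collision:.3f}")
```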
2014-03-01
orofacial injuries.10 These and other efforts have been associated with reduced BCT injuries over time as shown in Figure 111 but injury incidence...to predict first episode of low back pain in Soldiers undergoing combat medic training. Moran et al30 reported an AUC of .765 for a pragmatic 5...Dugan JL, Robinson ME. Predictors of occurrence and severity of first time low back pain episodes: Findings from a military inception cohort. PLoS
Stochastic analysis of a pulse-type prey-predator model
NASA Astrophysics Data System (ADS)
Wu, Y.; Zhu, W. Q.
2008-04-01
A stochastic Lotka-Volterra model, a so-called pulse-type model, for the interaction between two species and their random natural environment is investigated. The effect of a random environment is modeled as random pulse trains in the birth rate of the prey and the death rate of the predator. The generalized cell mapping method is applied to calculate the probability distributions of the species populations at a state of statistical quasistationarity. The time evolution of the population densities is studied, and the probability of the near extinction time, from an initial state to a critical state, is obtained. The effects on the ecosystem behaviors of the prey self-competition term and of the pulse mean arrival rate are also discussed. Our results indicate that the proposed pulse-type model shows obviously distinguishable characteristics from a Gaussian-type model, and may confer a significant advantage for modeling the prey-predator system under discrete environmental fluctuations.
Probabilistic Open Set Recognition
NASA Astrophysics Data System (ADS)
Jain, Lalit Prithviraj
Real-world tasks in computer vision, pattern recognition and machine learning often touch upon the open set recognition problem: multi-class recognition with incomplete knowledge of the world and many unknown inputs. An obvious way to approach such problems is to develop a recognition system that thresholds probabilities to reject unknown classes. Traditional rejection techniques are not about the unknown; they are about the uncertain boundary and rejection around that boundary. Thus traditional techniques only represent the "known unknowns". However, a proper open set recognition algorithm is needed to reduce the risk from the "unknown unknowns". This dissertation examines this concept and finds existing probabilistic multi-class recognition approaches are ineffective for true open set recognition. We hypothesize the cause is due to weak ad hoc assumptions combined with closed-world assumptions made by existing calibration techniques. Intuitively, if we could accurately model just the positive data for any known class without overfitting, we could reject the large set of unknown classes even under this assumption of incomplete class knowledge. For this, we formulate the problem as one of modeling positive training data by invoking statistical extreme value theory (EVT) near the decision boundary of positive data with respect to negative data. We provide a new algorithm called the PI-SVM for estimating the unnormalized posterior probability of class inclusion. This dissertation also introduces a new open set recognition model called Compact Abating Probability (CAP), where the probability of class membership decreases in value (abates) as points move from known data toward open space. We show that CAP models improve open set recognition for multiple algorithms. Leveraging the CAP formulation, we go on to describe the novel Weibull-calibrated SVM (W-SVM) algorithm, which combines the useful properties of statistical EVT for score calibration with one-class and binary support vector machines. Building from the success of statistical EVT-based recognition methods such as PI-SVM and W-SVM on the open set problem, we present a new general supervised learning algorithm for multi-class classification and multi-class open set recognition called the Extreme Value Local Basis (EVLB). The design of this algorithm is motivated by the observation that extrema from known negative class distributions are the closest negative points to any positive sample during training, and thus should be used to define the parameters of a probabilistic decision model. In the EVLB, the kernel distribution for each positive training sample is estimated via an EVT distribution fit over the distances to the separating hyperplane between the positive training sample and the closest negative samples, with a subset of the overall positive training data retained to form a probabilistic decision boundary. Using this subset as a frame of reference, the probability of a sample at test time decreases as it moves away from the positive class. Possessing this property, the EVLB is well-suited to open set recognition problems where samples from unknown or novel classes are encountered at test time. Our experimental evaluation shows that the EVLB provides a substantial improvement in scalability compared to standard radial basis function kernel machines, as well as PI-SVM and W-SVM, with improved accuracy in many cases.
We evaluate our algorithm on open set variations of the standard visual learning benchmarks, as well as with an open subset of classes from Caltech 256 and ImageNet. Our experiments show that PI-SVM, W-SVM, and EVLB provide significant advances over the previous state-of-the-art solutions for the same tasks.
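An illustrative fragment of the general EVT-calibration idea (not the dissertation's exact estimator): fit a Weibull distribution to the low tail of positive-class decision scores, then use its CDF as an unnormalized probability of class inclusion for new scores. Scores are synthetic, and scipy's weibull_min stands in for the custom EVT fits used in the original work.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Synthetic decision scores (e.g., signed distances to a hyperplane) for positive samples.
pos_scores = rng.gamma(shape=2.0, scale=1.0, size=500)

# Fit an extreme-value model to the low tail of the positive scores (the region near the boundary).
tail = np.sort(pos_scores)[:50]
shape, loc, scale = stats.weibull_min.fit(tail, floc=0.0)

def inclusion_probability(score):
    """Unnormalized probability that a score belongs to the positive class."""
    return stats.weibull_min.cdf(score, shape, loc=loc, scale=scale)

for s in (0.05, 0.5, 2.0, 5.0):
    print(f"score {s}: P(inclusion) ~ {inclusion_probability(s):.3f}")
```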
Xu, Yifang; Collins, Leslie M
2004-04-01
The incorporation of low levels of noise into an electrical stimulus has been shown to improve auditory thresholds in some human subjects (Zeng et al., 2000). In this paper, thresholds for noise-modulated pulse-train stimuli are predicted utilizing a stochastic neural-behavioral model of ensemble fiber responses to bi-phasic stimuli. The neural refractory effect is described using a Markov model for a noise-free pulse-train stimulus and a closed-form solution for the steady-state neural response is provided. For noise-modulated pulse-train stimuli, a recursive method using the conditional probability is utilized to track the neural responses to each successive pulse. A neural spike count rule has been presented for both threshold and intensity discrimination under the assumption that auditory perception occurs via integration over a relatively long time period (Bruce et al., 1999). An alternative approach originates from the hypothesis of the multilook model (Viemeister and Wakefield, 1991), which argues that auditory perception is based on several shorter time integrations and may suggest an NofM model for prediction of pulse-train threshold. This motivates analyzing the neural response to each individual pulse within a pulse train, which is considered to be the brief look. A logarithmic rule is hypothesized for pulse-train threshold. Predictions from the multilook model are shown to match trends in psychophysical data for noise-free stimuli that are not always matched by the long-time integration rule. Theoretical predictions indicate that threshold decreases as noise variance increases. Theoretical models of the neural response to pulse-train stimuli not only reduce calculational overhead but also facilitate utilization of signal detection theory and are easily extended to multichannel psychophysical tasks.
Training frontline workforce on psychosis management: a prospective study of training effects.
Sørlie, Tore; Borg, Marit; Flage, Karin B; Kolbjørnsrud, Ole-Bjørn; Haugen, Gunnar B; Benth, Jūratė Šaltytė; Ruud, Torleif
2015-01-01
The care situation for persons experiencing severe mental illness is often complex and demands good coordination, communication, and interpersonal relationships among those involved from the primary and specialized mental health care systems. For 15 years, professional care providers from different service levels within the same geographical areas in Norway have been trained together in a 2-year local onsite training program with the aim of increasing skills, joint understanding, and collaboration in their work with individuals experiencing severe mental illness. The key aspects of competence addressed by the training program were measured at baseline, after 1 year, and at the end of the training period. Professional education and experience were also rated at baseline. Data were collected between 1999 and 2005 and were analyzed by estimating a linear mixed model. Results showed a significant increase in participants' experienced competence in all training goals, especially for the understanding of psychosis and relationship building. There was no significant variance at the program level, indicating consistent implementation of local programs. This prospective study indicates that the training program was successful in increasing perceived competence in the areas addressed, and training staff from different service levels together probably contributed to more collaboration. This training model still operates in Norway.
Sequential experimental design based generalised ANOVA
NASA Astrophysics Data System (ADS)
Chakraborty, Souvik; Chowdhury, Rajib
2016-07-01
Over the last decade, surrogate modelling technique has gained wide popularity in the field of uncertainty quantification, optimization, model exploration and sensitivity analysis. This approach relies on experimental design to generate training points and regression/interpolation for generating the surrogate. In this work, it is argued that conventional experimental design may render a surrogate model inefficient. In order to address this issue, this paper presents a novel distribution adaptive sequential experimental design (DA-SED). The proposed DA-SED has been coupled with a variant of generalised analysis of variance (G-ANOVA), developed by representing the component function using the generalised polynomial chaos expansion. Moreover, generalised analytical expressions for calculating the first two statistical moments of the response, which are utilized in predicting the probability of failure, have also been developed. The proposed approach has been utilized in predicting probability of failure of three structural mechanics problems. It is observed that the proposed approach yields accurate and computationally efficient estimate of the failure probability.
Sequential experimental design based generalised ANOVA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chakraborty, Souvik, E-mail: csouvik41@gmail.com; Chowdhury, Rajib, E-mail: rajibfce@iitr.ac.in
Over the last decade, surrogate modelling technique has gained wide popularity in the field of uncertainty quantification, optimization, model exploration and sensitivity analysis. This approach relies on experimental design to generate training points and regression/interpolation for generating the surrogate. In this work, it is argued that conventional experimental design may render a surrogate model inefficient. In order to address this issue, this paper presents a novel distribution adaptive sequential experimental design (DA-SED). The proposed DA-SED has been coupled with a variant of generalised analysis of variance (G-ANOVA), developed by representing the component function using the generalised polynomial chaos expansion. Moreover, generalised analytical expressions for calculating the first two statistical moments of the response, which are utilized in predicting the probability of failure, have also been developed. The proposed approach has been utilized in predicting probability of failure of three structural mechanics problems. It is observed that the proposed approach yields accurate and computationally efficient estimate of the failure probability.
Ye, Qing; Pan, Hao; Liu, Changhua
2015-01-01
This research proposes a novel framework for final drive simultaneous failure diagnosis comprising feature extraction, training of paired diagnostic models, generation of a decision threshold, and recognition of simultaneous failure modes. In the feature extraction module, wavelet packet transform and fuzzy entropy are adopted to reduce noise interference and extract representative features of each failure mode. Probability classifiers are constructed from single-failure samples using a paired sparse Bayesian extreme learning machine, which is trained only on single failure modes and inherits the high generalization and sparsity of the sparse Bayesian learning approach. To generate the optimal decision threshold that converts the probability outputs of the classifiers into final simultaneous failure modes, this research uses samples containing both single and simultaneous failure modes together with a grid search method, which is superior to traditional techniques in global optimization. Compared with other frequently used diagnostic approaches based on support vector machines and probabilistic neural networks, experimental results based on the F1-measure verify that the diagnostic accuracy and efficiency of the proposed framework, which are crucial for simultaneous failure diagnosis, are superior to those of the existing approaches. PMID:25722717
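A simplified sketch of the threshold step only: given probability outputs from per-mode classifiers (simulated here, not sparse Bayesian ELMs), a single decision threshold is chosen by grid search to maximize the F1-measure on samples that contain both single and simultaneous failure modes.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(12)
n_samples, n_modes = 300, 4
Y_true = rng.random((n_samples, n_modes)) < 0.3           # multi-hot failure modes (can co-occur)
# Simulated classifier probability outputs: informative but noisy.
P = np.clip(Y_true * 0.7 + rng.random((n_samples, n_modes)) * 0.5, 0, 1)

best_t, best_f1 = None, -1.0
for t in np.linspace(0.05, 0.95, 19):                     # grid search over the threshold
    f1 = f1_score(Y_true, P >= t, average="micro", zero_division=0)
    if f1 > best_f1:
        best_t, best_f1 = t, f1
print(f"best threshold = {best_t:.2f}, micro-F1 = {best_f1:.3f}")
```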
NASA Astrophysics Data System (ADS)
Qi, Wei; Liu, Junguo; Yang, Hong; Sweetapple, Chris
2018-03-01
Global precipitation products are very important datasets in flow simulations, especially in poorly gauged regions. Uncertainties resulting from precipitation products, hydrological models and their combinations vary with time and data magnitude, and undermine their application to flow simulations. However, previous studies have not quantified these uncertainties individually and explicitly. This study developed an ensemble-based dynamic Bayesian averaging approach (e-Bay) for deterministic discharge simulations using multiple global precipitation products and hydrological models. In this approach, the joint probability of precipitation products and hydrological models being correct is quantified based on uncertainties in maximum and mean estimation, posterior probability is quantified as functions of the magnitude and timing of discharges, and the law of total probability is implemented to calculate expected discharges. Six global fine-resolution precipitation products and two hydrological models of different complexities are included in an illustrative application. e-Bay can effectively quantify uncertainties and therefore generate better deterministic discharges than traditional approaches (weighted average methods with equal and varying weights and maximum likelihood approach). The mean Nash-Sutcliffe Efficiency values of e-Bay are up to 0.97 and 0.85 in training and validation periods respectively, which are at least 0.06 and 0.13 higher than traditional approaches. In addition, with increased training data, assessment criteria values of e-Bay show smaller fluctuations than traditional approaches and its performance becomes outstanding. The proposed e-Bay approach bridges the gap between global precipitation products and their pragmatic applications to discharge simulations, and is beneficial to water resources management in ungauged or poorly gauged regions across the world.
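The core bookkeeping of such an averaging scheme, in a heavily reduced form: each precipitation-product/hydrological-model combination gets a posterior weight (fixed numbers below standing in for the dynamically estimated probabilities), and the expected discharge follows from the law of total probability. The real e-Bay weights vary with discharge magnitude and timing, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(13)
t = np.arange(100)
truth = 50 + 20 * np.sin(t / 10)                  # synthetic "true" discharge series
# Simulated discharges from 3 precipitation-product / hydrological-model combinations.
members = np.stack([truth + rng.normal(0, s, size=t.size) for s in (3, 6, 12)])

# Posterior probabilities of each combination being "correct" (placeholders).
weights = np.array([0.5, 0.3, 0.2])

expected_discharge = weights @ members            # law of total probability: sum_i P(M_i) * Q_i
rmse = np.sqrt(np.mean((expected_discharge - truth) ** 2))
print(f"RMSE of weighted-average discharge: {rmse:.2f}")
```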
Modeling habitat for Marbled Murrelets on the Siuslaw National Forest, Oregon, using lidar data
Hagar, Joan C.; Aragon, Ramiro; Haggerty, Patricia; Hollenbeck, Jeff P.
2018-03-28
Habitat models using lidar-derived variables that quantify fine-scale variation in vegetation structure can improve the accuracy of occupancy estimates for canopy-dwelling species over models that use variables derived from other remote sensing techniques. However, the ability of models developed at such a fine spatial scale to maintain accuracy at regional or larger spatial scales has not been tested. We tested the transferability of a lidar-based habitat model for the threatened Marbled Murrelet (Brachyramphus marmoratus) between two management districts within a larger regional conservation zone in coastal western Oregon. We compared the performance of the transferred model against models developed with data from the application location. The transferred model had good discrimination (AUC = 0.73) at the application location, and model performance was further improved by fitting the original model with coefficients from the application location dataset (AUC = 0.79). However, the model selection procedure indicated that neither of these transferred models were considered competitive with a model trained on local data. The new model trained on data from the application location resulted in the selection of a slightly different set of lidar metrics from the original model, but both transferred and locally trained models consistently indicated positive relationships between the probability of occupancy and lidar measures of canopy structural complexity. We conclude that while the locally trained model had superior performance for local application, the transferred model could reasonably be applied to the entire conservation zone.
ERIC Educational Resources Information Center
Balfour, Danny L.; Neff, Donna M.
1993-01-01
A logistic regression model applied to data from 171 child service caseworkers identified variables determining job turnover during times of intense external criticism of the agency (length of service, professional commitment, level of education). A special training program did not significantly reduce the probability of turnover. (SK)
Multipath Very-Simplified Estimate of Adversary Sequence Interruption v. 2.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snell, Mark K.
2017-10-10
MP VEASI is a training tool that models physical protection systems for fixed sites using Adversary Sequence Diagrams (ASDs) and then uses the ASD to find the most vulnerable adversary paths through it. The identified paths have the lowest Probability of Interruption among all paths through the ASD.
A new biodegradation prediction model specific to petroleum hydrocarbons.
Howard, Philip; Meylan, William; Aronson, Dallas; Stiteler, William; Tunkel, Jay; Comber, Michael; Parkerton, Thomas F
2005-08-01
A new predictive model for determining quantitative primary biodegradation half-lives of individual petroleum hydrocarbons has been developed. This model uses a fragment-based approach similar to that of several other biodegradation models, such as those within the Biodegradation Probability Program (BIOWIN) estimation program. In the present study, a half-life in days is estimated using multiple linear regression against counts of 31 distinct molecular fragments. The model was developed using a data set consisting of 175 compounds with environmentally relevant experimental data that was divided into training and validation sets. The original fragments from the Ministry of International Trade and Industry BIOWIN model were used initially as structural descriptors and additional fragments were then added to better describe the ring systems found in petroleum hydrocarbons and to adjust for nonlinearity within the experimental data. The training and validation sets had r2 values of 0.91 and 0.81, respectively.
Economic Choices Reveal Probability Distortion in Macaque Monkeys
Lak, Armin; Bossaerts, Peter; Schultz, Wolfram
2015-01-01
Economic choices are largely determined by two principal elements, reward value (utility) and probability. Although nonlinear utility functions have been acknowledged for centuries, nonlinear probability weighting (probability distortion) was only recently recognized as a ubiquitous aspect of real-world choice behavior. Even when outcome probabilities are known and acknowledged, human decision makers often overweight low probability outcomes and underweight high probability outcomes. Whereas recent studies measured utility functions and their corresponding neural correlates in monkeys, it is not known whether monkeys distort probability in a manner similar to humans. Therefore, we investigated economic choices in macaque monkeys for evidence of probability distortion. We trained two monkeys to predict reward from probabilistic gambles with constant outcome values (0.5 ml or nothing). The probability of winning was conveyed using explicit visual cues (sector stimuli). Choices between the gambles revealed that the monkeys used the explicit probability information to make meaningful decisions. Using these cues, we measured probability distortion from choices between the gambles and safe rewards. Parametric modeling of the choices revealed classic probability weighting functions with inverted-S shape. Therefore, the animals overweighted low probability rewards and underweighted high probability rewards. Empirical investigation of the behavior verified that the choices were best explained by a combination of nonlinear value and nonlinear probability distortion. Together, these results suggest that probability distortion may reflect evolutionarily preserved neuronal processing. PMID:25698750
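The inverted-S weighting functions referred to above are commonly parameterized as in Tversky and Kahneman (1992); a one-line implementation shows the overweighting of small probabilities and underweighting of large ones. The gamma value is purely illustrative, not the value fitted to the monkeys' choices.

```python
import numpy as np

def weight(p, gamma=0.6):
    """Tversky-Kahneman (1992) probability weighting function (inverted-S for gamma < 1)."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

for p in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(f"p = {p:.2f} -> w(p) = {weight(p):.3f}")   # small p overweighted, large p underweighted
```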
Economic choices reveal probability distortion in macaque monkeys.
Stauffer, William R; Lak, Armin; Bossaerts, Peter; Schultz, Wolfram
2015-02-18
Economic choices are largely determined by two principal elements, reward value (utility) and probability. Although nonlinear utility functions have been acknowledged for centuries, nonlinear probability weighting (probability distortion) was only recently recognized as a ubiquitous aspect of real-world choice behavior. Even when outcome probabilities are known and acknowledged, human decision makers often overweight low probability outcomes and underweight high probability outcomes. Whereas recent studies measured utility functions and their corresponding neural correlates in monkeys, it is not known whether monkeys distort probability in a manner similar to humans. Therefore, we investigated economic choices in macaque monkeys for evidence of probability distortion. We trained two monkeys to predict reward from probabilistic gambles with constant outcome values (0.5 ml or nothing). The probability of winning was conveyed using explicit visual cues (sector stimuli). Choices between the gambles revealed that the monkeys used the explicit probability information to make meaningful decisions. Using these cues, we measured probability distortion from choices between the gambles and safe rewards. Parametric modeling of the choices revealed classic probability weighting functions with inverted-S shape. Therefore, the animals overweighted low probability rewards and underweighted high probability rewards. Empirical investigation of the behavior verified that the choices were best explained by a combination of nonlinear value and nonlinear probability distortion. Together, these results suggest that probability distortion may reflect evolutionarily preserved neuronal processing. Copyright © 2015 Stauffer et al.
Spaced-retrieval effects on name-face recognition in older adults with probable Alzheimer's disease.
Hawley, Karri S; Cherry, Katie E
2004-03-01
Six older adults with probable Alzheimer's disease (AD) were trained to recall a name-face association using the spaced-retrieval method. We administered six training sessions over a 2-week period. On each trial, participants selected the target photograph from among eight other photographs and stated the target name, at increasingly longer retention intervals. Results yielded a positive effect of spaced-retrieval training for name-face recognition. All participants were able to select the target photograph and state the target's name for longer periods of time within and across training sessions. A live-person transfer task was administered to determine whether the name-face association, trained by spaced-retrieval, would transfer to a live person. Half of the participants were able to call the live person by the correct name. These data provide initial evidence that spaced-retrieval training can aid older adults with probable AD in recall of a name-face association and in transfer of that association to an actual person.
Estimation of the probability of success in petroleum exploration
Davis, J.C.
1977-01-01
A probabilistic model for oil exploration can be developed by assessing the conditional relationship between perceived geologic variables and the subsequent discovery of petroleum. Such a model includes two probabilistic components, the first reflecting the association between a geologic condition (structural closure, for example) and the occurrence of oil, and the second reflecting the uncertainty associated with the estimation of geologic variables in areas of limited control. Estimates of the conditional relationship between geologic variables and subsequent production can be found by analyzing the exploration history of a "training area" judged to be geologically similar to the exploration area. The geologic variables are assessed over the training area using an historical subset of the available data, whose density corresponds to the present control density in the exploration area. The success or failure of wells drilled in the training area subsequent to the time corresponding to the historical subset provides empirical estimates of the probability of success conditional upon geology. Uncertainty in perception of geological conditions may be estimated from the distribution of errors made in geologic assessment using the historical subset of control wells. These errors may be expressed as a linear function of distance from available control. Alternatively, the uncertainty may be found by calculating the semivariogram of the geologic variables used in the analysis: the two procedures will yield approximately equivalent results. The empirical probability functions may then be transferred to the exploration area and used to estimate the likelihood of success of specific exploration plays. These estimates will reflect both the conditional relationship between the geological variables used to guide exploration and the uncertainty resulting from lack of control. The technique is illustrated with case histories from the mid-Continent area of the U.S.A. © 1977 Plenum Publishing Corp.
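In its simplest form, the first probabilistic component described above is just an empirical conditional probability estimated from the training area's drilling history; a sketch with invented well records follows.

```python
import numpy as np

rng = np.random.default_rng(14)
n_wells = 400
closure = rng.random(n_wells) < 0.4                     # geologic condition perceived at each well
# Invented outcome model: discoveries are more likely where structural closure is present.
discovery = rng.random(n_wells) < np.where(closure, 0.30, 0.08)

p_success_given_closure = discovery[closure].mean()
p_success_given_no_closure = discovery[~closure].mean()
print(f"P(discovery | closure)    ~ {p_success_given_closure:.2f}")
print(f"P(discovery | no closure) ~ {p_success_given_no_closure:.2f}")
```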
Nowakowska, Marzena
2017-04-01
The development of the Bayesian logistic regression model classifying the road accident severity is discussed. The already exploited informative priors (method of moments, maximum likelihood estimation, and two-stage Bayesian updating), along with the original idea of a Boot prior proposal, are investigated when no expert opinion has been available. In addition, two possible approaches to updating the priors, in the form of unbalanced and balanced training data sets, are presented. The obtained logistic Bayesian models are assessed on the basis of a deviance information criterion (DIC), highest probability density (HPD) intervals, and coefficients of variation estimated for the model parameters. The verification of the model accuracy has been based on sensitivity, specificity and the harmonic mean of sensitivity and specificity, all calculated from a test data set. The models obtained from the balanced training data set have a better classification quality than the ones obtained from the unbalanced training data set. The two-stage Bayesian updating prior model and the Boot prior model, both identified with the use of the balanced training data set, outperform the non-informative, method of moments, and maximum likelihood estimation prior models. It is important to note that one should be careful when interpreting the parameters since different priors can lead to different models. Copyright © 2017 Elsevier Ltd. All rights reserved.
Interleaved Training and Training-Based Transmission Design for Hybrid Massive Antenna Downlink
NASA Astrophysics Data System (ADS)
Zhang, Cheng; Jing, Yindi; Huang, Yongming; Yang, Luxi
2018-06-01
In this paper, we study the beam-based training design jointly with the transmission design for hybrid massive antenna single-user (SU) and multiple-user (MU) systems where outage probability is adopted as the performance measure. For SU systems, we propose an interleaved training design to concatenate the feedback and training procedures, thus making the training length adaptive to the channel realization. Exact analytical expressions are derived for the average training length and the outage probability of the proposed interleaved training. For MU systems, we propose a joint design for the beam-based interleaved training, beam assignment, and MU data transmissions. Two solutions for the beam assignment are provided with different complexity-performance tradeoff. Analytical results and simulations show that for both SU and MU systems, the proposed joint training and transmission designs achieve the same outage performance as the traditional full-training scheme but with significant saving in the training overhead.
Determination of sensation threshold from small pulse trains of 2.01 μm laser light
NASA Astrophysics Data System (ADS)
Dugan, Daniel C.; Johnson, Thomas E.
2009-02-01
The determination of sensation thresholds has applications ranging from uses in the medical community such as neural pathway mapping and the diagnosis of diabetic neuropathy, to potential uses in determining safety standards. This study sought to determine the sensation threshold, and the distribution of sensation probabilities, for pulse trains ranging from two 10 ms pulses to nine 10 ms pulses of 2.01 μm laser light incident on a human forearm and chest. Threshold was defined as the energy density that would elicit sensation 50% of the time (ED50). A method of levels approach was used in conjunction with a monovariate binary response model to determine the ED50. We determined the ED50 and also a distribution of threshold probabilities. Threshold was found to be largely dependent on total energy deposited for smaller pulse trains, and thus independent of the number of pulses. Total energy becomes less important as the number of pulses increases, however, and a decrease in threshold was measured for a nine-pulse train as compared to one- through four-pulse trains. Thus we have demonstrated that this method is a useful and easy way of determining sensation thresholds from a 2.01 μm laser for possible clinical use. We have also demonstrated that lower-power lasers, when pulsed, can elicit sensation at levels comparable to higher-power single-pulse lasers.
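A compact illustration of estimating an ED50 from yes/no sensation responses with a logistic (monovariate binary response) model: the dose at which the fitted probability crosses 0.5 is −intercept/slope. The data below are simulated, not the laser measurements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(15)
true_ed50, slope = 4.0, 2.0                                # illustrative units (e.g., J/cm^2)
dose = rng.uniform(1.0, 8.0, size=120)                     # energy densities presented
p_felt = 1 / (1 + np.exp(-slope * (dose - true_ed50)))
felt = rng.random(dose.size) < p_felt                      # binary sensation responses

model = LogisticRegression(C=1e6).fit(dose.reshape(-1, 1), felt)   # near-unpenalized logistic fit
ed50 = -model.intercept_[0] / model.coef_[0, 0]
print(f"estimated ED50 ~ {ed50:.2f} (true {true_ed50})")
```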
Mapping risk of plague in Qinghai-Tibetan Plateau, China.
Qian, Quan; Zhao, Jian; Fang, Liqun; Zhou, Hang; Zhang, Wenyi; Wei, Lan; Yang, Hong; Yin, Wenwu; Cao, Wuchun; Li, Qun
2014-07-10
The Qinghai-Tibetan Plateau of China is known to be a plague-endemic region where the marmot (Marmota himalayana) is the primary host. Human plague cases have relatively low incidence but high mortality, which presents unique surveillance and public health challenges, because early detection through surveillance may not always be feasible and infrequent clinical cases may be misdiagnosed. Based on plague surveillance data and environmental variables, Maxent was applied to model the presence probability of the plague host. 75% of the occurrence points were randomly selected for model training, and the remaining 25% were used for model testing and validation. Maxent model performance was measured as test gain and test AUC. The optimal probability cut-off value was chosen by maximizing training sensitivity and specificity simultaneously. We used field surveillance data in an ecological niche modeling (ENM) framework to depict the spatial distribution of natural foci of plague in the Qinghai-Tibetan Plateau. Most human-inhabited areas at risk of exposure to enzootic plague are distributed in the east and south of the Plateau. Elevation, land surface temperature, and normalized difference vegetation index play a large part in determining the distribution of the enzootic plague. This study provides a more detailed view of the spatial pattern of enzootic plague and human-inhabited areas at risk of plague. The maps could help public health authorities decide where to perform plague surveillance and take preventive measures in the Qinghai-Tibetan Plateau.
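The cut-off rule mentioned above (maximizing training sensitivity and specificity simultaneously) can be written as a small search over candidate probability thresholds; Maxent's suitability output is replaced here by synthetic presence probabilities.

```python
import numpy as np

rng = np.random.default_rng(16)
presence = rng.integers(0, 2, size=500).astype(bool)             # occurrence vs background points
prob = np.clip(presence * 0.4 + rng.random(500) * 0.6, 0, 1)     # stand-in for Maxent output

best_cut, best_sum = None, -1.0
for cut in np.linspace(0.05, 0.95, 91):
    pred = prob >= cut
    sens = np.mean(pred[presence])                               # true positive rate
    spec = np.mean(~pred[~presence])                             # true negative rate
    if sens + spec > best_sum:
        best_cut, best_sum = cut, sens + spec
print(f"optimal cut-off ~ {best_cut:.2f} (sensitivity + specificity = {best_sum:.2f})")
```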
Bayesian anomaly detection in monitoring data applying relevance vector machine
NASA Astrophysics Data System (ADS)
Saito, Tomoo
2011-04-01
A method for automatically classifying monitoring data into two categories, normal and anomalous, is developed in order to remove anomalous data from the enormous amount of monitoring data. The relevance vector machine (RVM) is applied to a probabilistic discriminative model with basis functions and weight parameters whose posterior PDF (probability density function), conditional on the learning data set, is given by Bayes' theorem. The proposed framework is applied to actual monitoring data sets containing some anomalous data collected at two buildings in Tokyo, Japan. The trained models discriminate anomalous data from normal data very clearly, assigning high probabilities of being normal to normal data and low probabilities of being normal to anomalous data.
Analytical performance evaluation of SAR ATR with inaccurate or estimated models
NASA Astrophysics Data System (ADS)
DeVore, Michael D.
2004-09-01
Hypothesis testing algorithms for automatic target recognition (ATR) are often formulated in terms of some assumed distribution family. The parameter values corresponding to a particular target class together with the distribution family constitute a model for the target's signature. In practice such models exhibit inaccuracy because of incorrect assumptions about the distribution family and/or because of errors in the assumed parameter values, which are often determined experimentally. Model inaccuracy can have a significant impact on performance predictions for target recognition systems. Such inaccuracy often causes model-based predictions that ignore the difference between assumed and actual distributions to be overly optimistic. This paper reports on research to quantify the effect of inaccurate models on performance prediction and to estimate the effect using only trained parameters. We demonstrate that for large observation vectors the class-conditional probabilities of error can be expressed as a simple function of the difference between two relative entropies. These relative entropies quantify the discrepancies between the actual and assumed distributions and can be used to express the difference between actual and predicted error rates. Focusing on the problem of ATR from synthetic aperture radar (SAR) imagery, we present estimators of the probabilities of error in both ideal and plug-in tests expressed in terms of the trained model parameters. These estimators are defined in terms of unbiased estimates for the first two moments of the sample statistic. We present an analytical treatment of these results and include demonstrations from simulated radar data.
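To give a flavor of why a difference of two relative entropies appears, consider a log-likelihood ratio test between assumed models p_0 and p_1 applied to data actually drawn i.i.d. from q. By the law of large numbers (this is a schematic statement, not the paper's exact expressions),

\[
\frac{1}{n}\sum_{i=1}^{n}\log\frac{p_1(x_i)}{p_0(x_i)}
\;\xrightarrow{\ \text{a.s.}\ }\;
\mathbb{E}_q\!\left[\log\frac{p_1(X)}{p_0(X)}\right]
= D(q\,\|\,p_0) - D(q\,\|\,p_1),
\]

so for long observation vectors the test statistic concentrates at the difference between two relative entropies, and any mismatch between the actual distribution q and the assumed (trained) models shifts this difference and hence the class-conditional error rates.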
[Prolonged mechanical ventilation probability model].
Añón, J M; Gómez-Tello, V; González-Higueras, E; Oñoro, J J; Córcoles, V; Quintana, M; López-Martínez, J; Marina, L; Choperena, G; García-Fernández, A M; Martín-Delgado, C; Gordo, F; Díaz-Alersi, R; Montejo, J C; Lorenzo, A García de; Pérez-Arriaga, M; Madero, R
2012-10-01
To design a probability model for prolonged mechanical ventilation (PMV) using variables obtained during the first 24 hours of the start of MV. An observational, prospective, multicenter cohort study. Thirteen Spanish medical-surgical intensive care units. Adult patients requiring mechanical ventilation for more than 24 hours. None. APACHE II, SOFA, demographic data, clinical data, reason for mechanical ventilation, comorbidity, and functional condition. A multivariate risk model was constructed. The model contemplated a dependent variable with three possible conditions: 1. Early mortality; 2. Early extubation; and 3. PMV. Of the 1661 included patients, 67.9% (n=1127) were men. Age: 62.1±16.2 years. APACHE II: 20.3±7.5. Total SOFA: 8.4±3.5. The APACHE II and SOFA scores were higher in patients ventilated for 7 or more days (p=0.04 and p=0.0001, respectively). Noninvasive ventilation failure was related to PMV (p=0.005). A multivariate model for the three outcomes described above was generated. The overall accuracy of the model in the training and validation samples was 0.763 (95% CI: 0.729-0.804) and 0.751 (95% CI: 0.672-0.816), respectively. The likelihood ratios (LRs) for early extubation, using a cutoff point of 0.65, in the training sample were LR(+): 2.37 (95% CI: 1.77-3.19) and LR(-): 0.47 (95% CI: 0.41-0.55). The LRs for the early mortality model, for a cutoff point of 0.73, in the training sample were LR(+): 2.64 (95% CI: 2.01-3.4) and LR(-): 0.39 (95% CI: 0.30-0.51). The proposed model could be a helpful tool in decision making. However, because of its moderate accuracy, it should be considered a first approach, and the results should be corroborated by further studies involving larger samples and the use of standardized criteria. Copyright © 2011 Elsevier España, S.L. y SEMICYUC. All rights reserved.
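For reference, the likelihood ratios quoted above are defined from the sensitivity and specificity obtained at the chosen probability cutoff and convert pre-test into post-test odds:

\[
\mathrm{LR}^{+}=\frac{\text{sensitivity}}{1-\text{specificity}},\qquad
\mathrm{LR}^{-}=\frac{1-\text{sensitivity}}{\text{specificity}},\qquad
\text{post-test odds}=\text{pre-test odds}\times\mathrm{LR}.
\]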
A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions.
Gao, Xiang; Lin, Huaiying; Dong, Qunfeng
2017-01-01
Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets based on maximum likelihood. The posterior probability of a microbiome sample belonging to a disease or healthy category is calculated based on Bayes' theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distribution, as well as a prior probability estimated from the training microbiome data set or previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than the only existing Bayesian microbiome classifier based on a Dirichlet-multinomial mixture model and the popular random forest method. The advantage of DMBC is its built-in automatic feature selection, capable of identifying a subset of microbial taxa with the best classification accuracy between different classes of samples based on cross-validation. This unique ability enables DMBC to maintain and even improve its accuracy at modeling species-level taxa. The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC. IMPORTANCE By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis.
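The sketch below illustrates the general Dirichlet-multinomial Bayes-classifier idea described above. It is not the DMBC R package: parameter estimation uses Minka's fixed-point approximation to the maximum likelihood, the taxa counts are made up, and a 50/50 prior stands in for an estimated disease prevalence.

```python
import numpy as np
from scipy.special import gammaln, digamma

def fit_dirichlet_multinomial(X, n_iter=200, tol=1e-8):
    """Estimate Dirichlet-multinomial parameters alpha from a count matrix X
    (samples x taxa) via Minka's fixed-point iteration (approximate MLE)."""
    X = np.asarray(X, dtype=float)
    p = (X.sum(axis=0) + 1.0) / (X.sum() + X.shape[1])   # crude moment-style start
    alpha = p * X.shape[1]
    N = X.sum(axis=1)
    for _ in range(n_iter):
        A = alpha.sum()
        num = digamma(X + alpha).sum(axis=0) - X.shape[0] * digamma(alpha)
        den = digamma(N + A).sum() - X.shape[0] * digamma(A)
        new_alpha = alpha * num / den
        if np.max(np.abs(new_alpha - alpha)) < tol:
            return new_alpha
        alpha = new_alpha
    return alpha

def dm_log_likelihood(x, alpha):
    """Dirichlet-multinomial log-likelihood of one count vector x
    (the multinomial coefficient is omitted: it is the same for every class)."""
    A, N = alpha.sum(), x.sum()
    return gammaln(A) - gammaln(N + A) + np.sum(gammaln(x + alpha) - gammaln(alpha))

def posterior_class_probabilities(x, alphas, priors):
    """Bayes' theorem: combine per-class DM likelihoods with class priors."""
    log_post = np.array([np.log(p) + dm_log_likelihood(x, a) for a, p in zip(alphas, priors)])
    log_post -= log_post.max()                 # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# toy usage with hypothetical counts (samples x taxa) for two classes
healthy = np.array([[30, 5, 10], [28, 7, 9], [33, 4, 12]])
disease = np.array([[10, 20, 15], [12, 18, 14], [9, 22, 16]])
alphas = [fit_dirichlet_multinomial(healthy), fit_dirichlet_multinomial(disease)]
print(posterior_class_probabilities(np.array([11, 19, 15]), alphas, priors=[0.5, 0.5]))
```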
Development and validation of a mortality risk model for pediatric sepsis.
Chen, Mengshi; Lu, Xiulan; Hu, Li; Liu, Pingping; Zhao, Wenjiao; Yan, Haipeng; Tang, Liang; Zhu, Yimin; Xiao, Zhenghui; Chen, Lizhang; Tan, Hongzhuan
2017-05-01
Pediatric sepsis is a burdensome public health problem. Assessing the mortality risk of pediatric sepsis patients, offering effective treatment guidance, and improving prognosis to reduce mortality rates are crucial. We extracted data derived from electronic medical records of pediatric sepsis patients that were collected during the first 24 hours after admission to the pediatric intensive care unit (PICU) of the Hunan Children's Hospital from January 2012 to June 2014. A total of 788 children were randomly divided into a training (592, 75%) and a validation group (196, 25%). The risk factors for mortality among these patients were identified by conducting multivariate logistic regression in the training group. Based on the established logistic regression equation, the logit probabilities for all patients (in both groups) were calculated to verify the model's internal and external validity. According to the training group, 6 variables (brain natriuretic peptide, albumin, total bilirubin, D-dimer, lactate levels, and mechanical ventilation in 24 hours) were included in the final logistic regression model. The areas under the curves of the model were 0.854 (0.826, 0.881) and 0.844 (0.816, 0.873) in the training and validation groups, respectively. The Mortality Risk Model for Pediatric Sepsis we established in this study showed acceptable accuracy to predict the mortality risk in pediatric sepsis patients.
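A minimal sketch of the workflow described above (75/25 split, multivariate logistic regression, AUC computed in both groups), using simulated data in place of the PICU records; scikit-learn is an assumed toolkit, not one named by the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# simulated stand-ins for the six predictors (BNP, albumin, bilirubin, D-dimer, lactate, ventilation)
X = rng.normal(size=(788, 6))
y = (X @ np.array([0.8, -0.6, 0.5, 0.7, 0.9, 1.1]) + rng.normal(size=788) > 0.5).astype(int)

X_tr, X_va, y_tr, y_va = train_test_split(X, y, train_size=0.75, random_state=1)
model = LogisticRegression().fit(X_tr, y_tr)
for name, Xs, ys in [("training", X_tr, y_tr), ("validation", X_va, y_va)]:
    p = model.predict_proba(Xs)[:, 1]          # logit-derived mortality probabilities
    print(name, "AUC = %.3f" % roc_auc_score(ys, p))
```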
Lee, Jong-Seok; Park, Cheol Hoon
2010-08-01
We propose a novel stochastic optimization algorithm, hybrid simulated annealing (SA), to train hidden Markov models (HMMs) for visual speech recognition. In our algorithm, SA is combined with a local optimization operator that substitutes a better solution for the current one to improve the convergence speed and the quality of solutions. We mathematically prove that the sequence of the objective values converges in probability to the global optimum in the algorithm. The algorithm is applied to train HMMs that are used as visual speech recognizers. While the popular training method of HMMs, the expectation-maximization algorithm, achieves only local optima in the parameter space, the proposed method can perform global optimization of the parameters of HMMs and thereby obtain solutions yielding improved recognition performance. The superiority of the proposed algorithm to the conventional ones is demonstrated via isolated word recognition experiments.
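The sketch below illustrates only the general pattern of simulated annealing combined with a local improvement operator, on a toy continuous objective; it is not the authors' HMM training procedure, and all parameter values are arbitrary.

```python
import numpy as np

def hybrid_simulated_annealing(f, x0, n_iter=5000, T0=1.0, cooling=0.999, step=0.5, rng=None):
    """Simulated annealing with a crude local-improvement step: each candidate is
    perturbed slightly and the perturbation is kept only if it lowers the objective."""
    rng = rng or np.random.default_rng(0)
    x, fx = np.asarray(x0, dtype=float), f(x0)
    best_x, best_f, T = x.copy(), fx, T0
    for _ in range(n_iter):
        cand = x + rng.normal(scale=step, size=x.shape)       # global random move
        nudge = cand + rng.normal(scale=0.05, size=x.shape)   # local improvement attempt
        if f(nudge) < f(cand):
            cand = nudge
        fc = f(cand)
        if fc < fx or rng.random() < np.exp(-(fc - fx) / T):  # Metropolis acceptance
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x.copy(), fx
        T *= cooling                                          # geometric cooling schedule
    return best_x, best_f

# toy usage: minimize a multimodal function of two variables
f = lambda z: np.sum(z ** 2) + 3.0 * np.sum(np.sin(3.0 * z) ** 2)
print(hybrid_simulated_annealing(f, x0=np.array([2.5, -1.5])))
```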
Hidden Markov models for fault detection in dynamic systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic J. (Inventor)
1995-01-01
The invention is a system failure monitoring method and apparatus which learns the symptom-fault mapping directly from training data. The invention first estimates the state of the system at discrete intervals in time. A feature vector x of dimension k is estimated from sets of successive windows of sensor data. A pattern recognition component then models the instantaneous estimate of the posterior class probability given the features, p(w_i | x), 1 ≤ i ≤ m. Finally, a hidden Markov model is used to take advantage of temporal context and estimate class probabilities conditioned on recent past history. In this hierarchical pattern of information flow, the time series data is transformed and mapped into a categorical representation (the fault classes) and integrated over time to enable robust decision-making.
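A minimal sketch of the hierarchical flow described above, in which instantaneous class posteriors from a pattern recognizer are integrated over time by an HMM forward recursion; the transition matrix, class priors, and instantaneous estimates are hypothetical stand-ins.

```python
import numpy as np

def hmm_forward_smoothing(inst_post, priors, A):
    """Turn instantaneous posteriors p(w_i | x_t) into class probabilities conditioned
    on recent history, using scaled likelihoods and an HMM forward recursion."""
    T, m = inst_post.shape
    lik = inst_post / priors                  # p(x_t | w_i) up to a constant factor
    alpha = np.zeros((T, m))
    alpha[0] = priors * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        alpha[t] /= alpha[t].sum()            # normalize to probabilities at each step
    return alpha

# toy usage: two classes (normal, fault) and noisy instantaneous estimates
A = np.array([[0.99, 0.01], [0.05, 0.95]])    # hypothetical transition matrix
priors = np.array([0.9, 0.1])
inst = np.array([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
print(hmm_forward_smoothing(inst, priors, A))
```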
Hidden Markov models for fault detection in dynamic systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic J. (Inventor)
1993-01-01
The invention is a system failure monitoring method and apparatus which learns the symptom-fault mapping directly from training data. The invention first estimates the state of the system at discrete intervals in time. A feature vector x of dimension k is estimated from sets of successive windows of sensor data. A pattern recognition component then models the instantaneous estimate of the posterior class probability given the features, p(w_i | x), 1 ≤ i ≤ m. Finally, a hidden Markov model is used to take advantage of temporal context and estimate class probabilities conditioned on recent past history. In this hierarchical pattern of information flow, the time series data is transformed and mapped into a categorical representation (the fault classes) and integrated over time to enable robust decision-making.
Memory effects on a resonate-and-fire neuron model subjected to Ornstein-Uhlenbeck noise
NASA Astrophysics Data System (ADS)
Paekivi, S.; Mankin, R.; Rekker, A.
2017-10-01
We consider a generalized Langevin equation with an exponentially decaying memory kernel as a model for the firing process of a resonate-and-fire neuron. The effect of temporally correlated random neuronal input is modeled as Ornstein-Uhlenbeck noise. In the noise-induced spiking regime of the neuron, we derive exact analytical formulas for the dependence of some statistical characteristics of the output spike train, such as the probability distribution of the interspike intervals (ISIs) and the survival probability, on the parameters of the input stimulus. In particular, on the basis of these exact expressions, we establish sufficient conditions for the occurrence of memory-time-induced transitions between unimodal and multimodal structures of the ISI density, and we identify a critical damping coefficient that marks a dynamical transition in the behavior of the system.
Decomposition of conditional probability for high-order symbolic Markov chains.
Melnik, S S; Usatenko, O V
2017-07-01
The main goal of this paper is to develop an estimate for the conditional probability function of random stationary ergodic symbolic sequences with elements belonging to a finite alphabet. We elaborate on a decomposition procedure for the conditional probability function of sequences considered to be high-order Markov chains. We represent the conditional probability function as the sum of multilinear memory function monomials of different orders (from zero up to the chain order). This allows us to introduce a family of Markov chain models and to construct artificial sequences via a method of successive iterations, taking into account at each step increasingly high correlations among random elements. At weak correlations, the memory functions are uniquely expressed in terms of the high-order symbolic correlation functions. The proposed method fills the gap between two approaches, namely the likelihood estimation and the additive Markov chains. The obtained results may have applications for sequential approximation of artificial neural network training.
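Schematically, and assuming the decomposition takes the additive form used in earlier work on additive Markov chains, the expansion in memory-function monomials of increasing order can be written as

\[
P\!\left(a_{N}=a \mid a_{N-L},\dots,a_{N-1}\right)
= p(a)
+ \sum_{r=1}^{L} F_{1}\!\left(a, a_{N-r}; r\right)
+ \sum_{1\le r_{1}<r_{2}\le L} F_{2}\!\left(a, a_{N-r_{1}}, a_{N-r_{2}}; r_{1}, r_{2}\right)
+ \cdots,
\]

where the zeroth-order term is the stationary single-symbol probability and, at weak correlations, the memory functions F_k are expressible through the k-th order symbolic correlation functions.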
NASA Astrophysics Data System (ADS)
Zeng, Zhi-Ping; Zhao, Yan-Gang; Xu, Wen-Tao; Yu, Zhi-Wu; Chen, Ling-Kun; Lou, Ping
2015-04-01
The frequent use of bridges in high-speed railway lines greatly increases the probability that trains are running on bridges when earthquakes occur. This paper investigates the random vibrations of a high-speed train traversing a slab track on a continuous girder bridge subjected to track irregularities and traveling seismic waves by the pseudo-excitation method (PEM). To derive the equations of motion of the train-slab track-bridge interaction system, the multibody dynamics and finite element method models are used for the train and the track and bridge, respectively. By assuming track irregularities to be fully coherent random excitations with time lags between different wheels and seismic accelerations to be uniformly modulated, non-stationary random excitations with time lags between different foundations, the random load vectors of the equations of motion are transformed into a series of deterministic pseudo-excitations based on PEM and the wheel-rail contact relationship. A computer code is developed to obtain the time-dependent random responses of the entire system. As a case study, the random vibration characteristics of an ICE-3 high-speed train traversing a seven-span continuous girder bridge simultaneously excited by track irregularities and traveling seismic waves are analyzed. The influence of train speed and seismic wave propagation velocity on the random vibration characteristics of the bridge and train are discussed.
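For context, the core identity of the pseudo-excitation method, stated schematically: a stationary random excitation with auto power spectral density S_{xx}(\omega) is replaced by the deterministic harmonic pseudo-excitation

\[
\tilde{x}(t)=\sqrt{S_{xx}(\omega)}\,e^{i\omega t}
\quad\Longrightarrow\quad
S_{yy}(\omega)=\tilde{y}^{*}(t)\,\tilde{y}(t),
\]

where \tilde{y}(t) is the response to \tilde{x}(t) computed from the deterministic equations of motion; for a uniformly modulated non-stationary excitation x(t)=g(t)u(t), the pseudo-excitation becomes \tilde{x}(t)=g(t)\sqrt{S_{uu}(\omega)}\,e^{i\omega t}.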
Elmenhorst, Eva-Maria; Pennig, Sibylle; Rolny, Vinzent; Quehl, Julia; Mueller, Uwe; Maaß, Hartmut; Basner, Mathias
2012-05-01
Traffic noise interferes with daytime activities and nighttime sleep, causing distress and adverse physiological reactions in large parts of the population. In surveys, railway noise has proved less annoying than aircraft noise, a finding that formed the basis for a so-called 5 dB railway bonus in noise protection regulations in many European countries. The present field study investigated railway noise-induced awakenings during sleep, nighttime annoyance, and the impact on performance the following day. Comparing these results with those from a field study on aircraft noise allowed for a ranking of traffic modes concerning physiological and psychological reactions. 33 participants (mean age 36.2 years ± 10.3 (SD); 22 females) living alongside railway tracks around Cologne/Bonn (Germany) were investigated polysomnographically. These data were pooled with data from a field study on aircraft noise (61 subjects), directly comparing the effects of railway and aircraft noise in one random-subject-effects logistic regression model. Annoyance was rated in the morning, evaluating the previous night. The probability of sleep stage changes to wake/S1 from railway noise increased significantly from 6.5% at 35 dB(A) to 20.5% at 80 dB(A) LAFmax. The rise time of noise events had a significant impact on awakening probability. Nocturnal railway noise led to significantly higher awakening probabilities than aircraft noise, partly explained by the different rise times, whereas the order was reversed for annoyance. Freight train noise proved to have a greater impact on awakening probability than passenger train noise. Nocturnal railway noise had no effect on psychomotor vigilance. Nocturnal freight train noise exposure in Germany was associated with increased awakening probabilities exceeding those for aircraft noise, contrasting with the findings of many annoyance surveys and with the annoyance ratings of our study. A nighttime bonus for railway noise therefore seems inappropriate. Copyright © 2012 Elsevier B.V. All rights reserved.
Haslinger, Robert; Pipa, Gordon; Brown, Emery
2010-10-01
One approach for understanding the encoding of information by spike trains is to fit statistical models and then test their goodness of fit. The time-rescaling theorem provides a goodness-of-fit test consistent with the point process nature of spike trains. The interspike intervals (ISIs) are rescaled (as a function of the model's spike probability) to be independent and exponentially distributed if the model is accurate. A Kolmogorov-Smirnov (KS) test between the rescaled ISIs and the exponential distribution is then used to check goodness of fit. This rescaling relies on assumptions of continuously defined time and instantaneous events. However, spikes have finite width, and statistical models of spike trains almost always discretize time into bins. Here we demonstrate that finite temporal resolution of discrete time models prevents their rescaled ISIs from being exponentially distributed. Poor goodness of fit may be erroneously indicated even if the model is exactly correct. We present two adaptations of the time-rescaling theorem to discrete time models. In the first we propose that instead of assuming the rescaled times to be exponential, the reference distribution be estimated through direct simulation by the fitted model. In the second, we prove a discrete time version of the time-rescaling theorem that analytically corrects for the effects of finite resolution. This allows us to define a rescaled time that is exponentially distributed, even at arbitrary temporal discretizations. We demonstrate the efficacy of both techniques by fitting generalized linear models to both simulated spike trains and spike trains recorded experimentally in monkey V1 cortex. Both techniques give nearly identical results, reducing the false-positive rate of the KS test and greatly increasing the reliability of model evaluation based on the time-rescaling theorem.
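For reference, the continuous-time time-rescaling transformation that underlies the goodness-of-fit test: with the fitted model's conditional intensity \lambda(t \mid H_t) and spike times t_k,

\[
\tau_{k}=\int_{t_{k-1}}^{t_{k}}\lambda(t\mid H_{t})\,dt,
\qquad
z_{k}=1-e^{-\tau_{k}} .
\]

If the model is correct, the \tau_k are independent exponential(1) variables and the z_k are independent uniform(0,1) variables, which is what the KS test checks; the point of the paper is that this property breaks down once time is discretized into bins.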
Hatten, James R.; Parsley, Michael; Barton, Gary; Batt, Thomas; Fosness, Ryan L.
2018-01-01
A study was conducted to identify habitat characteristics associated with age 0+ White Sturgeon (Acipenser transmontanus Richardson, 1863) recruitment in three reaches of the Columbia River Basin: Skamania reach (consistent recruitment), John Day reach (intermittent/inconsistent recruitment), and Kootenai reach (no recruitment). Our modeling approach involved numerous steps. First, we collected information about substrate, embeddedness, and hydrodynamics in each reach. Second, we developed a set of spatially explicit predictor variables. Third, we built two habitat (probability) models with Skamania reach training data where White Sturgeon recruitment was consistent. Fourth, we created spawning maps of each reach by populating the habitat models with in-reach physical metrics (substrate, embeddedness, and hydrodynamics). Fifth, we examined model accuracy by overlaying spawning locations in Skamania and Kootenai reaches with habitat predictions obtained from probability models. Sixth, we simulated how predicted habitat changed in each reach after manipulating physical conditions to more closely match Skamania reach. Model verification confirmed White Sturgeon generally spawned in locations with higher model probabilities in Skamania and Kootenai reaches, indicating the utility of extrapolating the models. Model simulations revealed significant gains in White Sturgeon habitat in all reaches when spring flow increased, gravel/cobble composition increased, or embeddedness decreased. The habitat models appear well suited to assist managers when identifying reach-specific factors limiting White Sturgeon recruitment in the Columbia River Basin or throughout its range.
Kawaguchi, Minato; Mino, Hiroyuki; Durand, Dominique M
2006-01-01
This article presents an analysis of the information transmission of periodic sub-threshold spike trains in a hippocampal CA1 neuron model in the presence of a homogeneous Poisson shot noise. In the computer simulation, periodic sub-threshold spike trains were presented repeatedly to the midpoint of the main apical branch, while the homogeneous Poisson shot noise was applied to the midpoint of a basal dendrite in the CA1 neuron model, consisting of the soma with one sodium, one calcium, and five potassium channels. From spike firing times recorded at the soma, the interspike intervals were generated, and the probability, p(T), of the interspike interval histogram corresponding to the period, T, of the periodic input spike trains was estimated as an index of information transmission. It is shown that at a specific amplitude of the homogeneous Poisson shot noise, p(T) is maximized, so that the ability to encode the periodic sub-threshold spike trains becomes greatest. This implies that setting the amplitude of the homogeneous Poisson shot noise to the specific values that maximize information transmission might contribute to efficiently encoding the periodic sub-threshold spike trains by exploiting stochastic resonance.
Task specificity of attention training: the case of probability cuing
Jiang, Yuhong V.; Swallow, Khena M.; Won, Bo-Yeong; Cistera, Julia D.; Rosenbaum, Gail M.
2014-01-01
Statistical regularities in our environment enhance perception and modulate the allocation of spatial attention. Surprisingly little is known about how learning-induced changes in spatial attention transfer across tasks. In this study, we investigated whether a spatial attentional bias learned in one task transfers to another. Most of the experiments began with a training phase in which a search target was more likely to be located in one quadrant of the screen than in the other quadrants. An attentional bias toward the high-probability quadrant developed during training (probability cuing). In a subsequent, testing phase, the target's location distribution became random. In addition, the training and testing phases were based on different tasks. Probability cuing did not transfer between visual search and a foraging-like task. However, it did transfer between various types of visual search tasks that differed in stimuli and difficulty. These data suggest that different visual search tasks share a common and transferrable learned attentional bias. However, this bias is not shared by high-level, decision-making tasks such as foraging. PMID:25113853
Data-driven, Interpretable Photometric Redshifts Trained on Heterogeneous and Unrepresentative Data
NASA Astrophysics Data System (ADS)
Leistedt, Boris; Hogg, David W.
2017-03-01
We present a new method for inferring photometric redshifts in deep galaxy and quasar surveys, based on a data-driven model of latent spectral energy distributions (SEDs) and a physical model of photometric fluxes as a function of redshift. This conceptually novel approach combines the advantages of both machine learning methods and template fitting methods by building template SEDs directly from the spectroscopic training data. This is made computationally tractable with Gaussian processes operating in flux-redshift space, encoding the physics of redshifts and the projection of galaxy SEDs onto photometric bandpasses. This method alleviates the need to acquire representative training data or to construct detailed galaxy SED models; it requires only that the photometric bandpasses and calibrations be known or have parameterized unknowns. The training data can consist of a combination of spectroscopic and deep many-band photometric data with reliable redshifts, which do not need to entirely spatially overlap with the target survey of interest or even involve the same photometric bands. We showcase the method on the I-magnitude-selected, spectroscopically confirmed galaxies in the COSMOS field. The model is trained on the deepest bands (from SUBARU and HST) and photometric redshifts are derived using the shallower SDSS optical bands only. We demonstrate that we obtain accurate redshift point estimates and probability distributions despite the training and target sets having very different redshift distributions, noise properties, and even photometric bands. Our model can also be used to predict missing photometric fluxes or to simulate populations of galaxies with realistic fluxes and redshifts, for example.
Prediction suppression and surprise enhancement in monkey inferotemporal cortex.
Ramachandran, Suchitra; Meyer, Travis; Olson, Carl R
2017-07-01
Exposing monkeys, over the course of days and weeks, to pairs of images presented in fixed sequence, so that each leading image becomes a predictor for the corresponding trailing image, affects neuronal visual responsiveness in area TE. At the end of the training period, neurons respond relatively weakly to a trailing image when it appears in a trained sequence and, thus, confirms prediction, whereas they respond relatively strongly to the same image when it appears in an untrained sequence and, thus, violates prediction. This effect could arise from prediction suppression (reduced firing in response to the occurrence of a probable event) or surprise enhancement (elevated firing in response to the omission of a probable event). To identify its cause, we compared firing under the prediction-confirming and prediction-violating conditions to firing under a prediction-neutral condition. The results provide strong evidence for prediction suppression and limited evidence for surprise enhancement. NEW & NOTEWORTHY In predictive coding models of the visual system, neurons carry signed prediction error signals. We show here that monkey inferotemporal neurons exhibit prediction-modulated firing, as posited by these models, but that the signal is unsigned. The response to a prediction-confirming image is suppressed, and the response to a prediction-violating image may be enhanced. These results are better explained by a model in which the visual system emphasizes unpredicted events than by a predictive coding model. Copyright © 2017 the American Physiological Society.
Adaptation of hidden Markov models for recognizing speech of reduced frame rate.
Lee, Lee-Min; Jean, Fu-Rong
2013-12-01
The frame rate of the observation sequence in distributed speech recognition applications may be reduced to suit a resource-limited front-end device. In order to use models trained on full-frame-rate data for the recognition of reduced frame-rate (RFR) data, we propose a method for adapting the transition probabilities of hidden Markov models (HMMs) to match the frame rate of the observation. Experiments on the recognition of clean and noisy connected digits are conducted to evaluate the proposed method. Experimental results show that the proposed method can effectively compensate for the frame-rate mismatch between the training and the test data. Using our adapted model to recognize the RFR speech data, one can significantly reduce the computation time and achieve the same level of accuracy as a method that restores the frame rate using data interpolation.
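One simple way to adapt transition probabilities to a lower frame rate, offered only as an illustrative sketch and not necessarily the authors' scheme, is to note that if the front end keeps every k-th frame, then k steps of the original Markov chain elapse between observations, so a natural adapted matrix is the k-th power of the original one.

```python
import numpy as np

def adapt_transitions_for_reduced_frame_rate(A, k):
    """Adapted transition matrix when only every k-th frame is observed: A**k,
    re-normalized to guard against floating-point round-off."""
    Ak = np.linalg.matrix_power(np.asarray(A, dtype=float), k)
    return Ak / Ak.sum(axis=1, keepdims=True)

# hypothetical left-to-right HMM transition matrix and a frame-rate reduction factor of 2
A = np.array([[0.7, 0.3, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]])
print(adapt_transitions_for_reduced_frame_rate(A, k=2))
```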
Dexter, Franklin; De Oliveira, Gildasio S; McCarthy, Robert J
2016-01-15
We surveyed anesthesiology residents to evaluate the predictive effect of prior residence on desired location for future practice opportunities. One thousand five hundred United States anesthesiology residents were invited to participate. One question asked whether they intended to enter academic practice when they graduated from their residency/fellowship training. The analysis categorized the responses into "surely yes" and "probably" versus "even," "probably not," and "surely no." Another question asked: "After finishing your residency/fellowship training, are you planning to look seriously (e.g., interview) at jobs located more than a 2-hour drive from a location where you or your family (e.g., spouse or partner/significant other) have lived previously?" Responses were categorized into "very probably" and "somewhat probably" versus "somewhat improbably" and "not probable." Other questions explored predictors of the relationships, quantified using the area under the receiver operating characteristic curve (AUC) ± its standard error. Among the 696 respondents, 36.9% (N = 256) would "probably" consider an academic practice. Fewer than half of those (P < 0.0001) would "very probably" consider a distant location (31.6%, 99% CI 24.4%-39.6%). Respondents with prior formal research training (e.g., PhD or Master's) had greater interest in academic practice at a distant location (AUC 0.63 ± 0.03, P = 0.0002). Except among respondents with formal research training, a good question to ask a job applicant is whether the applicant or the applicant's family has previously lived in the area.
NASA Astrophysics Data System (ADS)
Xu, Chong; Dai, Fuchu; Xu, Xiwei; Lee, Yuan Hsi
2012-04-01
Support vector machine (SVM) modeling is based on statistical learning theory. It involves a training phase with associated input and target output values. In recent years, the method has become increasingly popular. The main purpose of this study is to evaluate the mapping power of SVM modeling in earthquake triggered landslide-susceptibility mapping for a section of the Jianjiang River watershed using a Geographic Information System (GIS) software. The river was affected by the Wenchuan earthquake of May 12, 2008. Visual interpretation of colored aerial photographs of 1-m resolution and extensive field surveys provided a detailed landslide inventory map containing 3147 landslides related to the 2008 Wenchuan earthquake. Elevation, slope angle, slope aspect, distance from seismogenic faults, distance from drainages, and lithology were used as the controlling parameters. For modeling, three groups of positive and negative training samples were used in concert with four different kernel functions. Positive training samples include the centroids of 500 large landslides, those of all 3147 landslides, and 5000 randomly selected points in landslide polygons. Negative training samples include 500, 3147, and 5000 randomly selected points on slopes that remained stable during the Wenchuan earthquake. The four kernel functions are linear, polynomial, radial basis, and sigmoid. In total, 12 cases of landslide susceptibility were mapped. Comparative analyses of landslide-susceptibility probability and area relation curves show that both the polynomial and radial basis functions suitably classified the input data as either landslide positive or negative though the radial basis function was more successful. The 12 generated landslide-susceptibility maps were compared with known landslide centroid locations and landslide polygons to verify the success rate and predictive accuracy of each model. The 12 results were further validated using area-under-curve analysis. Group 3 with 5000 randomly selected points on the landslide polygons, and 5000 randomly selected points along stable slopes gave the best results with a success rate of 79.20% and predictive accuracy of 79.13% under the radial basis function. Of all the results, the sigmoid kernel function was the least skillful when used in concert with the centroid data of all 3147 landslides as positive training samples, and the negative training samples of 3147 randomly selected points in regions of stable slope (success rate = 54.95%; predictive accuracy = 61.85%). This paper also provides suggestions and reference data for selecting appropriate training samples and kernel function types for earthquake triggered landslide-susceptibility mapping using SVM modeling. Predictive landslide-susceptibility maps could be useful in hazard mitigation by helping planners understand the probability of landslides in different regions.
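The sketch below shows the general pattern of such SVM-based susceptibility mapping (positive and negative training points, a radial basis kernel, probability output per cell); the predictor values are simulated and scikit-learn is an assumed toolkit, so this is illustrative only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
# simulated predictors (elevation, slope, aspect, fault distance, drainage distance, lithology code)
X_pos = rng.normal(loc=1.0, size=(500, 6))    # stand-ins for landslide (positive) samples
X_neg = rng.normal(loc=0.0, size=(500, 6))    # stand-ins for stable-slope (negative) samples
X = np.vstack([X_pos, X_neg])
y = np.r_[np.ones(500), np.zeros(500)]

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
model.fit(X, y)
susceptibility = model.predict_proba(X)[:, 1]  # landslide-susceptibility probability per sample
print("training AUC = %.3f" % roc_auc_score(y, susceptibility))
```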
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Nan; Carmona, Ruben; Sirak, Igor
Purpose: To demonstrate an efficient method for training and validation of a knowledge-based planning (KBP) system as a radiation therapy clinical trial plan quality-control system. Methods and Materials: We analyzed 86 patients with stage IB through IVA cervical cancer treated with intensity modulated radiation therapy at 2 institutions according to the standards of the INTERTECC (International Evaluation of Radiotherapy Technology Effectiveness in Cervical Cancer, National Clinical Trials Network identifier: 01554397) protocol. The protocol used a planning target volume and 2 primary organs at risk: pelvic bone marrow (PBM) and bowel. Secondary organs at risk were rectum and bladder. Initial unfiltered dose-volume histogram (DVH) estimation models were trained using all 86 plans. Refined training sets were created by removing sub-optimal plans from the unfiltered sample, and DVH estimation models were constructed by identifying 30 of 86 plans emphasizing PBM sparing (comparing protocol-specified dosimetric cutpoints V10 (percentage volume of PBM receiving at least 10 Gy dose) and V20 (percentage volume of PBM receiving at least 20 Gy dose) with unfiltered predictions) and another 30 of 86 plans emphasizing bowel sparing (comparing V40 (absolute volume of bowel receiving at least 40 Gy dose) and V45 (absolute volume of bowel receiving at least 45 Gy dose), 9 in common with the PBM set). To obtain deliverable KBP plans, refined models must inform patient-specific optimization objectives and/or priorities (an auto-planning "routine"). Four candidate routines emphasizing different tradeoffs were composed, and a script was developed to automatically re-plan multiple patients with each routine. After selection of the routine that best met protocol objectives in the 51-patient training sample (KBP_FINAL), protocol-specific DVH metrics and normal tissue complication probability were compared for original versus KBP_FINAL plans across the 35-patient validation set. Paired t tests were used to test differences between planning sets. Results: KBP_FINAL plans outperformed manual planning across the validation set in all protocol-specific DVH cutpoints. The mean normal tissue complication probability for gastrointestinal toxicity was lower for KBP_FINAL versus validation-set plans (48.7% vs 53.8%, P<.001). Similarly, the estimated mean white blood cell count nadir was higher (2.77 vs 2.49 k/mL, P<.001) with KBP_FINAL plans, indicating lowered probability of hematologic toxicity. Conclusions: This work demonstrates that a KBP system can be efficiently trained and refined for use in radiation therapy clinical trials with minimal effort. This patient-specific plan quality control resulted in improvements on protocol-specific dosimetric endpoints.
Quantifying Uncertainty of Wind Power Production Through an Analog Ensemble
NASA Astrophysics Data System (ADS)
Shahriari, M.; Cervone, G.
2016-12-01
The Analog Ensemble (AnEn) method is used to generate probabilistic weather forecasts that quantify the uncertainty in power estimates at hypothetical wind farm locations. The data are from the NREL Eastern Wind Dataset, which includes more than 1,300 modeled wind farms. The AnEn model uses a two-dimensional grid to estimate the probability distribution of wind speed (the predictand) given the values of predictor variables such as temperature, pressure, geopotential height, and the U- and V-components of wind. The meteorological data are taken from the NCEP GFS, which is available on a 0.25 degree grid resolution. The methodology first divides the data into two periods: a training period and a verification period. The AnEn selects a point in the verification period and searches for the best matching estimates (analogs) in the training period. The predictand values at those analogs form the ensemble prediction for the point in the verification period. The model provides a grid of wind speed values and the uncertainty (probability index) associated with each estimate. Each wind farm is associated with a probability index, which quantifies the degree of difficulty of estimating wind power. Further, the uncertainty in estimation is related to other factors such as topography, land cover, and wind resources. This is achieved by using a GIS system to compute the correlation between the probability index and geographical characteristics. This study has significant applications for investors in the renewable energy sector, especially wind farm developers. A lower level of uncertainty facilitates the process of submitting bids into day-ahead and real-time electricity markets. Thus, building wind farms in regions with lower levels of uncertainty will reduce real-time operational risks and create a hedge against volatile real-time prices. Further, the links between wind estimate uncertainty and factors such as topography and wind resources provide wind farm developers with valuable information regarding wind farm siting.
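A minimal sketch of the analog-ensemble idea described above: for each verification time, the closest predictor-space matches in the training period are found, and their observed predictands form the ensemble; the predictors, observations, and distance metric here are simplified stand-ins.

```python
import numpy as np

def analog_ensemble(train_pred, train_obs, target_pred, n_analogs=20):
    """For each target time, return the observed predictands at the n_analogs training
    times whose (standardized) predictor vectors are closest in Euclidean distance."""
    mu, sd = train_pred.mean(axis=0), train_pred.std(axis=0) + 1e-12
    tp, qp = (train_pred - mu) / sd, (target_pred - mu) / sd
    ensembles = []
    for q in qp:
        d = np.linalg.norm(tp - q, axis=1)
        ensembles.append(train_obs[np.argsort(d)[:n_analogs]])
    return np.array(ensembles)

# toy usage: hypothetical predictors (T, p, geopotential, U, V) and wind-speed observations
rng = np.random.default_rng(1)
train_pred = rng.normal(size=(1000, 5))
train_obs = 3.0 * np.abs(train_pred[:, 3]) + rng.normal(scale=0.5, size=1000)
target_pred = rng.normal(size=(10, 5))
ens = analog_ensemble(train_pred, train_obs, target_pred)
print(np.percentile(ens, [10, 50, 90], axis=1).T)   # per-time uncertainty bounds
```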
Perreault Levasseur, Laurence; Hezaveh, Yashar D.; Wechsler, Risa H.
2017-11-15
In Hezaveh et al. (2017) we showed that deep learning can be used for model parameter estimation and trained convolutional neural networks to determine the parameters of strong gravitational lensing systems. Here we demonstrate a method for obtaining the uncertainties of these parameters. We review the framework of variational inference to obtain approximate posteriors of Bayesian neural networks and apply it to a network trained to estimate the parameters of the Singular Isothermal Ellipsoid plus external shear and total flux magnification. We show that the method can capture the uncertainties due to different levels of noise in the input data, as well as training and architecture-related errors made by the network. To evaluate the accuracy of the resulting uncertainties, we calculate the coverage probabilities of marginalized distributions for each lensing parameter. By tuning a single hyperparameter, the dropout rate, we obtain coverage probabilities approximately equal to the confidence levels for which they were calculated, resulting in accurate and precise uncertainty estimates. Our results suggest that neural networks can be a fast alternative to Monte Carlo Markov Chains for parameter uncertainty estimation in many practical applications, allowing more than seven orders of magnitude improvement in speed.
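A small numpy sketch of the Monte Carlo dropout idea behind these uncertainty estimates: dropout is kept active at prediction time and many stochastic forward passes are collected, with their spread serving as an uncertainty measure. The tiny fixed-weight network below is an arbitrary stand-in, not the authors' trained convolutional network.

```python
import numpy as np

def mc_dropout_predict(x, W1, b1, W2, b2, drop_rate=0.2, n_samples=200, rng=None):
    """Collect n_samples stochastic forward passes with dropout left on; return the
    predictive mean and standard deviation over those passes."""
    rng = rng or np.random.default_rng(0)
    preds = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1 + b1, 0.0)                 # hidden ReLU layer
        mask = rng.random(h.shape) >= drop_rate          # Bernoulli dropout mask
        h = h * mask / (1.0 - drop_rate)                 # inverted-dropout scaling
        preds.append(h @ W2 + b2)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# toy usage with a hypothetical pre-trained 4-16-2 network (weights are random stand-ins)
rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 2)), np.zeros(2)
mean, std = mc_dropout_predict(rng.normal(size=(1, 4)), W1, b1, W2, b2)
print(mean, std)
```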
Text line extraction in free style document
NASA Astrophysics Data System (ADS)
Shen, Xiaolu; Liu, Changsong; Ding, Xiaoqing; Zou, Yanming
2009-01-01
This paper addresses text line extraction in free-style documents, such as business cards, envelopes, and posters. In free-style documents, global properties such as character size and line direction can hardly be inferred, which reveals a serious limitation of traditional layout analysis. The 'line' is the most prominent and highest-level structure in our bottom-up method. First, we apply a novel intensity function based on gradient information to locate text areas in which the gradients within a window have large magnitudes and various directions, and we split such areas into text pieces. We build a probability model of lines consisting of text pieces via statistics on training data. For an input image, we group text pieces into lines using a simulated annealing algorithm with a cost function based on the probability model.
Hossain, Monir; Wright, Steven; Petersen, Laura A
2002-04-01
One way to monitor patient access to emergent health care services is to use patient characteristics to predict arrival time at the hospital after onset of symptoms. This predicted arrival time can then be compared with actual arrival time to allow monitoring of access to services. Predicted arrival time could also be used to estimate potential effects of changes in health care service availability, such as closure of an emergency department or an acute care hospital. Our goal was to determine the best statistical method for prediction of arrival intervals for patients with acute myocardial infarction (AMI) symptoms. We compared the performance of multinomial logistic regression (MLR) and discriminant analysis (DA) models. Models for MLR and DA were developed using a dataset of 3,566 male veterans hospitalized with AMI in 81 VA Medical Centers in 1994-1995 throughout the United States. The dataset was randomly divided into a training set (n = 1,846) and a test set (n = 1,720). Arrival times were grouped into three intervals on the basis of treatment considerations: <6 hours, 6-12 hours, and >12 hours. One model for MLR and two models for DA were developed using the training dataset. One DA model had equal prior probabilities, and one DA model had proportional prior probabilities. Predictive performance of the models was compared using the test (n = 1,720) dataset. Using the test dataset, the proportions of patients in the three arrival time groups were 60.9% for <6 hours, 10.3% for 6-12 hours, and 28.8% for >12 hours after symptom onset. Whereas the overall predictive performance by MLR and DA with proportional priors was higher, the DA models with equal priors performed much better in the smaller groups. Correct classifications were 62.6% by MLR, 62.4% by DA using proportional prior probabilities, and 48.1% using equal prior probabilities of the groups. The misclassifications by MLR for the three groups were 9.5%, 100.0%, 74.2% for each time interval, respectively. Misclassifications by DA models were 9.8%, 100.0%, and 74.4% for the model with proportional priors and 47.6%, 79.5%, and 51.0% for the model with equal priors. The choice of MLR or DA with proportional priors, or DA with equal priors for monitoring time intervals of predicted hospital arrival time for a population should depend on the consequences of misclassification errors.
The Effects of the Previous Outcome on Probabilistic Choice in Rats
Marshall, Andrew T.; Kirkpatrick, Kimberly
2014-01-01
This study examined the effects of previous outcomes on subsequent choices in a probabilistic-choice task. Twenty-four rats were trained to choose between a certain outcome (1 or 3 pellets) versus an uncertain outcome (3 or 9 pellets), delivered with a probability of .1, .33, .67, and .9 in different phases. Uncertain outcome choices increased with the probability of uncertain food. Additionally, uncertain choices increased with the probability of uncertain food following both certain-choice outcomes and unrewarded uncertain choices. However, following uncertain-choice food outcomes, there was a tendency to choose the uncertain outcome in all cases, indicating that the rats continued to “gamble” after successful uncertain choices, regardless of the overall probability or magnitude of food. A subsequent manipulation, in which the probability of uncertain food varied within each session as a function of the previous uncertain outcome, examined how the previous outcome and probability of uncertain food affected choice in a dynamic environment. Uncertain-choice behavior increased with the probability of uncertain food. The rats exhibited increased sensitivity to probability changes and a greater degree of win–stay/lose–shift behavior than in the static phase. Simulations of two sequential choice models were performed to explore the possible mechanisms of reward value computations. The simulation results supported an exponentially decaying value function that updated as a function of trial (rather than time). These results emphasize the importance of analyzing global and local factors in choice behavior and suggest avenues for the future development of sequential-choice models. PMID:23205915
Vehicle Steering control: A model of learning
NASA Technical Reports Server (NTRS)
Smiley, A.; Reid, L.; Fraser, M.
1978-01-01
A hierarchy of strategies was postulated to describe the process of learning steering control. Vehicle motion and steering control data were recorded for twelve novices who drove an instrumented car twice a week during and after a driver training course. Car-driver describing functions were calculated, the probable control structure was determined, and the driver-alone transfer function was modelled. The data suggested that the largest changes in steering control with learning were in the way the driver used the lateral position cue.
Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.
2016-01-01
Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic ("z-score") of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information into a "relative enrichment score" for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores, even when both are genome-wide significant (p < 5x10-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased the out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
Sub-daily Statistical Downscaling of Meteorological Variables Using Neural Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumar, Jitendra; Brooks, Bjørn-Gustaf J.; Thornton, Peter E
2012-01-01
A new open source neural network temporal downscaling model is described and tested using CRU-NCEP reanalysis and CCSM3 climate model output. We downscaled multiple meteorological variables in tandem from monthly to sub-daily time steps while also retaining consistent correlations between variables. We found that our feed-forward, error-backpropagation approach produced synthetic 6-hourly meteorology with biases no greater than 0.6% across all variables and variance that was accurate within 1% for all variables except atmospheric pressure, wind speed, and precipitation. Correlations between downscaled output and the expected (original) monthly means exceeded 0.99 for all variables, which indicates that this approach would work well for generating atmospheric forcing data consistent with mass- and energy-conserved GCM output. Our neural network approach performed well for variables that had correlations to other variables of about 0.3 and better, and its skill was increased by downscaling multiple correlated variables together. Poor replication of precipitation intensity, however, required further post-processing in order to obtain the expected probability distribution. The concurrence of precipitation events with expected changes in subordinate variables (e.g., less incident shortwave radiation during precipitation events) was nearly as consistent in the downscaled data as in the training data, with probabilities that differed by no more than 6%.
Identification of phreatophytic groundwater dependent ecosystems using geospatial technologies
NASA Astrophysics Data System (ADS)
Perez Hoyos, Isabel Cristina
The protection of groundwater dependent ecosystems (GDEs) is increasingly being recognized as an essential aspect of the sustainable management and allocation of water resources. Ecosystem services are crucial for human well-being and for a variety of flora and fauna. However, the conservation of GDEs is only possible if knowledge about their location and extent is available. Several studies have focused on the identification of GDEs at specific locations using ground-based measurements. However, recent progress in technologies such as remote sensing and their integration with geographic information systems (GIS) has provided alternative ways to map GDEs at much larger spatial extents. This study is concerned with the discovery of patterns in geospatial data sets using data mining techniques for mapping phreatophytic GDEs in the United States at 1 km spatial resolution. A methodology to identify the probability of an ecosystem being groundwater dependent is developed. Probabilities are obtained by modeling the relationship between the known locations of GDEs and the main factors influencing groundwater dependency, namely water table depth (WTD) and aridity index (AI). A methodology is proposed to predict WTD at 1 km spatial resolution using relevant geospatial data sets calibrated with WTD observations. An ensemble learning algorithm called random forest (RF) is used to model the distribution of groundwater in three study areas: Nevada, California, and Washington, as well as in the entire United States. RF regression performance is compared with a single regression tree (RT). The comparison is based on contrasting training error, true prediction error, and variable importance estimates of both methods. Additionally, remote sensing variables are omitted from the process of fitting the RF model to the data to evaluate the deterioration in model performance when these variables are not used as inputs. Research results suggest that although the prediction accuracy of a single RT is reduced in comparison with RFs, single trees can still be used to understand the interactions that might be taking place between predictor variables and the response variable. Regarding RF, there is great potential in using the power of an ensemble of trees for prediction of WTD. The superior capability of RF to accurately map water table position in Nevada, California, and Washington demonstrates that this technique can be applied at scales larger than regional levels. It is also shown that the removal of remote sensing variables from the RF training process degrades the performance of the model. Using the predicted WTD, the probability of an ecosystem being groundwater dependent (GDE probability) is estimated at 1 km spatial resolution. The modeling technique is evaluated in the state of Nevada, USA, to develop a systematic approach for the identification of GDEs and is then applied across the United States. The modeling approach selected for the development of the GDE probability map results from a comparison of the performance of classification trees (CT) and classification forests (CF). Predictive performance evaluation for the selection of the most accurate model is achieved using a threshold-independent technique, and the prediction accuracy of both models is assessed in greater detail using threshold-dependent measures.
The resulting GDE probability map can potentially be used for the definition of conservation areas since it can be translated into a binary classification map with two classes: GDE and NON-GDE. These maps are created by selecting a probability threshold. It is demonstrated that the choice of this threshold has dramatic effects on deterministic model performance measures.
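A hedged sketch of the two-stage workflow described above (random forest regression for WTD, then a classification forest for GDE probability), using scikit-learn on synthetic data; the predictor names, labels, and thresholds are hypothetical stand-ins, not the study's variables.

```python
# Minimal sketch of the two-stage idea: (1) RF regression predicts water table
# depth (WTD) from geospatial predictors, and (2) an RF classifier turns WTD and
# aridity index into a per-cell GDE probability and a binary GDE / NON-GDE map.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
X_geo = rng.normal(size=(n, 5))          # e.g. elevation, slope, NDVI, precip, temp
wtd = 10 + 3 * X_geo[:, 0] - 2 * X_geo[:, 3] + rng.normal(0, 1, n)   # synthetic WTD (m)
aridity = np.clip(0.5 + 0.2 * X_geo[:, 4], 0.05, None)
gde = (wtd < 9) & (aridity < 0.6)        # synthetic "known GDE" labels

# Stage 1: predict WTD from geospatial predictors.
Xtr, Xte, ytr, yte = train_test_split(X_geo, wtd, test_size=0.3, random_state=0)
rf_wtd = RandomForestRegressor(n_estimators=300, random_state=0).fit(Xtr, ytr)
print("WTD R^2 on held-out cells:", round(rf_wtd.score(Xte, yte), 3))
print("variable importances:", np.round(rf_wtd.feature_importances_, 3))

# Stage 2: GDE probability from predicted WTD and aridity index.
X_gde = np.column_stack([rf_wtd.predict(X_geo), aridity])
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_gde, gde)
p_gde = clf.predict_proba(X_gde)[:, 1]   # per-cell GDE probability
gde_map = p_gde >= 0.5                   # binary map at a chosen probability threshold
```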
Gruber, Susan; Logan, Roger W; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A
2015-01-15
Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. Copyright © 2014 John Wiley & Sons, Ltd.
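The following toy sketch illustrates stabilized inverse probability weights for a point treatment, with the denominator model fit either by logistic regression or by a simple single-split ensemble in the spirit of the EL approach described above; the covariates, data, and candidate-weighting rule are illustrative assumptions, not the CoRIS analysis.

```python
# Hedged sketch: stabilized weights sw = P(A=a) / P(A=a | L), with the
# denominator estimated two ways (logistic regression vs. a single-split ensemble).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

rng = np.random.default_rng(2)
n = 5000
L = rng.normal(size=(n, 3))                        # measured covariates (hypothetical)
p_treat = 1 / (1 + np.exp(-(0.5 * L[:, 0] - 0.8 * L[:, 1])))
A = rng.binomial(1, p_treat)                       # treatment initiation

# Numerator: marginal probability of the observed treatment.
p_A1 = A.mean()
numerator = np.where(A == 1, p_A1, 1 - p_A1)

# Denominator, option 1: logistic regression for P(A=1 | L).
lr = LogisticRegression().fit(L, A)
p_lr = lr.predict_proba(L)[:, 1]

# Denominator, option 2: single-split ensemble -- fit candidates on a training
# split, weight them by inverse validation log-loss, then average predictions.
Ltr, Lva, Atr, Ava = train_test_split(L, A, test_size=0.3, random_state=0)
cands = [LogisticRegression(), GradientBoostingClassifier(random_state=0)]
fits = [c.fit(Ltr, Atr) for c in cands]
losses = np.array([log_loss(Ava, f.predict_proba(Lva)[:, 1]) for f in fits])
w = (1 / losses) / (1 / losses).sum()
p_el = sum(wi * f.predict_proba(L)[:, 1] for wi, f in zip(w, fits))

for name, p in [("logistic", p_lr), ("ensemble", p_el)]:
    denominator = np.where(A == 1, p, 1 - p)
    sw = numerator / denominator                   # stabilized weights
    print(name, "mean weight:", round(sw.mean(), 3), "max:", round(sw.max(), 2))
```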
Prediction and control of neural responses to pulsatile electrical stimulation
NASA Astrophysics Data System (ADS)
Campbell, Luke J.; Sly, David James; O'Leary, Stephen John
2012-04-01
This paper aims to predict and control the probability of firing of a neuron in response to pulsatile electrical stimulation of the type delivered by neural prostheses such as the cochlear implant, bionic eye or in deep brain stimulation. Using the cochlear implant as a model, we developed an efficient computational model that predicts the responses of auditory nerve fibers to electrical stimulation and evaluated the model's accuracy by comparing the model output with pooled responses from a group of guinea pig auditory nerve fibers. It was found that the model accurately predicted the changes in neural firing probability over time to constant and variable amplitude electrical pulse trains, including speech-derived signals, delivered at rates up to 889 pulses s⁻¹. A simplified version of the model that did not incorporate adaptation was used to adaptively predict, within its limitations, the pulsatile electrical stimulus required to cause a desired response from neurons up to 250 pulses s⁻¹. Future stimulation strategies for cochlear implants and other neural prostheses may be enhanced using similar models that account for the way that neural responses are altered by previous stimulation.
ERIC Educational Resources Information Center
Cherry, Katie E.; Walvoord, Ashley A. G.; Hawley, Karri S.
2010-01-01
The authors trained 4 older adults with probable Alzheimer's disease to recall a name-face-occupation association using the spaced retrieval technique. Six training sessions were administered over a 2-week period. On each trial, participants selected a target photograph and stated the target name and occupation at increasingly longer retention…
Fenlon, Caroline; O'Grady, Luke; Butler, Stephen; Doherty, Michael L; Dunnion, John
2017-01-01
Herd fertility in pasture-based dairy farms is a key driver of farm economics. Models for predicting nulliparous reproductive outcomes are rare, but age, genetics, weight, and BCS have been identified as factors influencing heifer conception. The aim of this study was to create a simulation model of heifer conception to service with thorough evaluation. Artificial Insemination service records from two research herds and ten commercial herds were provided to build and evaluate the models. All were managed as spring-calving pasture-based systems. The factors studied were related to age, genetics, and time of service. The data were split into training and testing sets and bootstrapping was used to train the models. Logistic regression (with and without random effects) and generalised additive modelling were selected as the model-building techniques. Two types of evaluation were used to test the predictive ability of the models: discrimination and calibration. Discrimination, which includes sensitivity, specificity, accuracy and ROC analysis, measures a model's ability to distinguish between positive and negative outcomes. Calibration measures the accuracy of the predicted probabilities with the Hosmer-Lemeshow goodness-of-fit, calibration plot and calibration error. After data cleaning and the removal of services with missing values, 1396 services remained to train the models and 597 were left for testing. Age, breed, genetic predicted transmitting ability for calving interval, month and year were significant in the multivariate models. The regression models also included an interaction between age and month. Year within herd was a random effect in the mixed regression model. Overall prediction accuracy was between 77.1% and 78.9%. All three models had very high sensitivity, but low specificity. The two regression models were very well-calibrated. The mean absolute calibration errors were all below 4%. Because the models were not adept at identifying unsuccessful services, they are not suggested for use in predicting the outcome of individual heifer services. Instead, they are useful for the comparison of services with different covariate values or as sub-models in whole-farm simulations. The mixed regression model was identified as the best model for prediction, as the random effects can be ignored and the other variables can be easily obtained or simulated.
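A brief illustrative sketch of the two evaluation strategies described above, discrimination and calibration, applied to a logistic model on synthetic data; the covariates, sample sizes, and binning scheme are assumptions, not the study's variables.

```python
# Hedged sketch: evaluate a conception model by discrimination (sensitivity,
# specificity, AUC) and by binned calibration error on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(3)
n = 1400
X = np.column_stack([rng.normal(15, 1.5, n),             # age at service (months)
                     rng.integers(1, 13, n)])             # month of service
p_true = 1 / (1 + np.exp(-(-1.5 + 0.15 * (X[:, 0] - 15))))
y = rng.binomial(1, p_true)                               # conception (1) / not (0)

model = LogisticRegression().fit(X, y)
p_hat = model.predict_proba(X)[:, 1]

# Discrimination.
auc = roc_auc_score(y, p_hat)
tn, fp, fn, tp = confusion_matrix(y, (p_hat >= 0.5).astype(int)).ravel()
sens, spec = tp / (tp + fn), tn / (tn + fp)

# Calibration: group predictions into deciles and compare the mean predicted
# probability with the observed event rate in each bin.
bins = np.quantile(p_hat, np.linspace(0, 1, 11))
idx = np.clip(np.digitize(p_hat, bins[1:-1]), 0, 9)
cal_err = np.array([abs(p_hat[idx == b].mean() - y[idx == b].mean())
                    for b in range(10) if np.any(idx == b)])
print(f"AUC={auc:.2f} sens={sens:.2f} spec={spec:.2f} "
      f"mean |calibration error|={cal_err.mean():.3f}")
```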
NASA Astrophysics Data System (ADS)
Cronkite-Ratcliff, C.; Phelps, G. A.; Boucher, A.
2011-12-01
In many geologic settings, the pathways of groundwater flow are controlled by geologic heterogeneities which have complex geometries. Models of these geologic heterogeneities, and consequently, their effects on the simulated pathways of groundwater flow, are characterized by uncertainty. Multiple-point geostatistics, which uses a training image to represent complex geometric descriptions of geologic heterogeneity, provides a stochastic approach to the analysis of geologic uncertainty. Incorporating multiple-point geostatistics into numerical models provides a way to extend this analysis to the effects of geologic uncertainty on the results of flow simulations. We present two case studies to demonstrate the application of multiple-point geostatistics to numerical flow simulation in complex geologic settings with both static and dynamic conditioning data. Both cases involve the development of a training image from a complex geometric description of the geologic environment. Geologic heterogeneity is modeled stochastically by generating multiple equally-probable realizations, all consistent with the training image. Numerical flow simulation for each stochastic realization provides the basis for analyzing the effects of geologic uncertainty on simulated hydraulic response. The first case study is a hypothetical geologic scenario developed using data from the alluvial deposits in Yucca Flat, Nevada. The SNESIM algorithm is used to stochastically model geologic heterogeneity conditioned to the mapped surface geology as well as vertical drill-hole data. Numerical simulation of groundwater flow and contaminant transport through geologic models produces a distribution of hydraulic responses and contaminant concentration results. From this distribution of results, the probability of exceeding a given contaminant concentration threshold can be used as an indicator of uncertainty about the location of the contaminant plume boundary. The second case study considers a characteristic lava-flow aquifer system in Pahute Mesa, Nevada. A 3D training image is developed by using object-based simulation of parametric shapes to represent the key morphologic features of rhyolite lava flows embedded within ash-flow tuffs. In addition to vertical drill-hole data, transient pressure head data from aquifer tests can be used to constrain the stochastic model outcomes. The use of both static and dynamic conditioning data allows the identification of potential geologic structures that control hydraulic response. These case studies demonstrate the flexibility of the multiple-point geostatistics approach for considering multiple types of data and for developing sophisticated models of geologic heterogeneities that can be incorporated into numerical flow simulations.
Speech processing using conditional observable maximum likelihood continuity mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, John; Nix, David
A computer implemented method enables the recognition of speech and speech characteristics. Parameters are initialized of first probability density functions that map between the symbols in the vocabulary of one or more sequences of speech codes that represent speech sounds and a continuity map. Parameters are also initialized of second probability density functions that map between the elements in the vocabulary of one or more desired sequences of speech transcription symbols and the continuity map. The parameters of the probability density functions are then trained to maximize the probabilities of the desired sequences of speech-transcription symbols. A new sequence of speech codes is then input to the continuity map having the trained first and second probability function parameters. A smooth path is identified on the continuity map that has the maximum probability for the new sequence of speech codes. The probability of each speech transcription symbol for each input speech code can then be output.
ERIC Educational Resources Information Center
de Hosson, Cécile; Décamp, Nicolas
2014-01-01
A great amount of research has been carried out world-wide to promote history of science as a powerful science teaching tool. Because the ways of choosing and using historical elements depend on teachers' or researchers' educational purpose, any attempt to support a single model-to-use seems difficult and probably irrelevant. However,…
ERIC Educational Resources Information Center
Karmel, Tom; Mlotkowski, Peter
2010-01-01
The primary focus of this research is the impact of wages on the decision not to continue with an apprenticeship or traineeship. The approach taken is to model three wages relevant to apprentices and trainees: the wage during training; the expected wage in alternative employment; and, the expected wage on completion. The results of these models…
Growth of left ventricular mass with military basic training in army recruits.
Batterham, Alan M; George, Keith P; Birch, Karen M; Pennell, Dudley J; Myerson, Saul G
2011-07-01
Exercise-induced left ventricular hypertrophy is well documented, but whether this occurs merely in line with concomitant increases in lean body mass is unclear. Our aim was to model the extent of left ventricular hypertrophy associated with increased lean body mass attributable to an exercise training program. Cardiac and whole-body magnetic resonance imaging was performed before and after a 10-wk intensive British Army basic training program in a sample of 116 healthy Caucasian males (aged 17-28 yr). The within-subjects repeated-measures allometric relationship between lean body mass and left ventricular mass was modeled to allow the proper normalization of changes in left ventricular mass for attendant changes in lean body mass. To linearize the general allometric model (Y = aX^b), data were log-transformed before analysis; the resulting effects were therefore expressed as percent changes. We quantified the probability that the true population increase in normalized left ventricular mass was greater than a predefined minimum important difference of 0.2 SD, assigning a probabilistic descriptive anchor for magnitude-based inference. The absolute increase in left ventricular mass was 4.8% (90% confidence interval = 3.5%-6%), whereas lean body mass increased by 2.6% (2.1%-3.0%). The change in left ventricular mass adjusted for the change in lean body mass was 3.5% (1.9%-5.1%), equivalent to an increase of 0.25 SD (0.14-0.37). The probability that this effect size was greater than or equal to our predefined minimum important change of 0.2 SD was 0.78, i.e. likely to be important. After correction for allometric growth rates, left ventricular hypertrophy and lean body mass changes do not occur at the same magnitude in response to chronic exercise.
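As a worked toy example of the allometric adjustment: on the log scale the model Y = aX^b gives an adjusted change of exp(dlogLVM - b*dlogLBM) - 1. The exponent b below is a hypothetical value chosen for illustration, whereas the study estimates it from the within-subject repeated measures.

```python
# Toy worked example of allometric normalisation of the left ventricular mass
# change for the lean body mass change reported in the abstract above.
import math

lvm_change = 0.048   # +4.8% left ventricular mass (from the abstract)
lbm_change = 0.026   # +2.6% lean body mass (from the abstract)
b = 0.5              # illustrative allometric exponent (assumption, not the study's estimate)

adjusted = math.exp(math.log(1 + lvm_change) - b * math.log(1 + lbm_change)) - 1
print(f"LV mass change adjusted for lean body mass: {100 * adjusted:.1f}%")
# With b = 0.5 this gives roughly +3.5%, in line with the adjusted value reported above.
```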
Pillow, Jonathan W; Ahmadian, Yashar; Paninski, Liam
2011-01-01
One of the central problems in systems neuroscience is to understand how neural spike trains convey sensory information. Decoding methods, which provide an explicit means for reading out the information contained in neural spike responses, offer a powerful set of tools for studying the neural coding problem. Here we develop several decoding methods based on point-process neural encoding models, or forward models that predict spike responses to stimuli. These models have concave log-likelihood functions, which allow efficient maximum-likelihood model fitting and stimulus decoding. We present several applications of the encoding model framework to the problem of decoding stimulus information from population spike responses: (1) a tractable algorithm for computing the maximum a posteriori (MAP) estimate of the stimulus, the most probable stimulus to have generated an observed single- or multiple-neuron spike train response, given some prior distribution over the stimulus; (2) a gaussian approximation to the posterior stimulus distribution that can be used to quantify the fidelity with which various stimulus features are encoded; (3) an efficient method for estimating the mutual information between the stimulus and the spike trains emitted by a neural population; and (4) a framework for the detection of change-point times (the time at which the stimulus undergoes a change in mean or variance) by marginalizing over the posterior stimulus distribution. We provide several examples illustrating the performance of these estimators with simulated and real neural data.
Linking quality of care and training costs: cost-effectiveness in health professions education.
Tolsgaard, Martin G; Tabor, Ann; Madsen, Mette E; Wulff, Camilla B; Dyre, Liv; Ringsted, Charlotte; Nørgaard, Lone N
2015-12-01
To provide a model for conducting cost-effectiveness analyses in medical education. The model was based on a randomised trial examining the effects of training midwives to perform cervical length measurement (CLM), as compared with obstetricians, on patients' waiting times. The model included four steps: (i) gathering data on training outcomes, (ii) assessing total costs and effects, (iii) calculating the incremental cost-effectiveness ratio (ICER) and (iv) estimating cost-effectiveness probability for different willingness to pay (WTP) values. To provide a model example, we conducted a randomised cost-effectiveness trial. Midwives were randomised to CLM training (midwife-performed CLMs) or no training (initial management by midwife, and CLM performed by obstetrician). Intervention-group participants underwent simulation-based and clinical training until they were proficient. During the following 6 months, waiting times from arrival to admission or discharge were recorded for women who presented with symptoms of pre-term labour. Outcomes for women managed by intervention and control-group participants were compared. These data were then used for the remaining steps of the cost-effectiveness model. Intervention-group participants needed a mean 268.2 (95% confidence interval [CI], 140.2-392.2) minutes of simulator training and a mean 7.3 (95% CI, 4.4-10.3) supervised scans to attain proficiency. Women who were scanned by intervention-group participants had significantly reduced waiting time compared with those managed by the control group (n = 65; mean difference, 36.6 [95% CI 7.3-65.8] minutes; p = 0.008), which corresponded to an ICER of EUR 0.45 minute⁻¹. For WTP values less than EUR 0.26 minute⁻¹, obstetrician-performed CLM was the most cost-effective strategy, whereas midwife-performed CLM was cost-effective for WTP values above EUR 0.73 minute⁻¹. Cost-effectiveness models can be used to link quality of care to training costs. The example used in the present study demonstrated that different training strategies could be recommended as the most cost-effective depending on administrators' willingness to pay per unit of the outcome variable. © 2015 Medical Education Published by John Wiley & Sons Ltd.
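A toy numerical sketch of steps (iii) and (iv) of the model, computing the ICER and the probability of cost-effectiveness over bootstrap replicates for several willingness-to-pay values; all numbers other than the 36.6-minute mean difference quoted above are invented for illustration.

```python
# Hedged sketch: ICER = delta cost / delta effect, plus the probability of
# cost-effectiveness across bootstrap replicates at a range of WTP values.
import numpy as np

rng = np.random.default_rng(4)
n_boot = 5000
# Bootstrap replicates of incremental cost (EUR) and incremental effect
# (minutes of waiting time saved); both distributions are hypothetical.
d_cost = rng.normal(16.5, 4.0, n_boot)      # hypothetical extra training cost per woman
d_effect = rng.normal(36.6, 15.0, n_boot)   # minutes saved (mean taken from the abstract)

icer = d_cost.mean() / d_effect.mean()
print(f"ICER ≈ {icer:.2f} EUR per minute saved")

for wtp in [0.25, 0.45, 0.75]:              # EUR per minute of waiting time avoided
    # Cost-effective when the net monetary benefit WTP*effect - cost is positive.
    p_ce = np.mean(wtp * d_effect - d_cost > 0)
    print(f"P(cost-effective | WTP = {wtp:.2f} EUR/min) = {p_ce:.2f}")
```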
NASA Astrophysics Data System (ADS)
Jarabo-Amores, María-Pilar; la Mata-Moya, David de; Gil-Pita, Roberto; Rosa-Zurera, Manuel
2013-12-01
The application of supervised learning machines trained to minimize the Cross-Entropy error to radar detection is explored in this article. The detector is implemented with a learning machine that implements a discriminant function, whose output is compared to a threshold selected to fix a desired probability of false alarm. The study is based on the calculation of the function that the learning machine approximates during training, and the application of a sufficient condition for a discriminant function to be used to approximate the optimum Neyman-Pearson (NP) detector. In this article, the function that a supervised learning machine approximates after being trained to minimize the Cross-Entropy error is obtained. This discriminant function can be used to implement the NP detector, which maximizes the probability of detection while maintaining the probability of false alarm below or equal to a predefined value. Some experiments on signal detection using neural networks are also presented to test the validity of the study.
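A small sketch of the thresholding step described above: with a trained discriminant in hand, the detection threshold can be set as the (1 - P_FA) quantile of its output on noise-only data. The stand-in "discriminant" below is only a placeholder, not the article's trained network.

```python
# Hedged sketch: fix the probability of false alarm empirically from H0 scores,
# then measure the resulting probability of detection on H1 data.
import numpy as np

rng = np.random.default_rng(5)

def discriminant(x):
    # Stand-in score: energy of the observation vector (illustrative only).
    return np.sum(x**2, axis=-1)

n, dim, pfa_target = 100_000, 8, 1e-3
h0_scores = discriminant(rng.normal(0, 1, size=(n, dim)))            # noise only
threshold = np.quantile(h0_scores, 1 - pfa_target)                   # fixes P_FA

h1_scores = discriminant(rng.normal(0, 1, size=(n, dim)) + 1.0)      # signal + noise
pd = np.mean(h1_scores > threshold)                                  # prob. of detection
pfa = np.mean(h0_scores > threshold)
print(f"threshold={threshold:.2f}  P_FA≈{pfa:.4f}  P_D≈{pd:.3f}")
```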
ERIC Educational Resources Information Center
Cherry, Katie E.; Hawley, Karri S.; Jackson, Erin M.; Boudreaux, Emily O.
2009-01-01
Six older adults with probable Alzheimer's disease (AD) were trained to recall a name-face association using the spaced retrieval technique. In this study, we retested these persons in a 6-month follow-up program. For half of the participants, three booster sessions were administered at 6, 12, and 18 weeks after original training to promote…
Is Weight Training Safe during Pregnancy?
ERIC Educational Resources Information Center
Work, Janis A.
1989-01-01
Examines the opinions of several experts on the safety of weight training during pregnancy, noting that no definitive research on weight training alone has been done. Experts agree that low-intensity weight training probably poses no harm for mother or fetus; exercise programs should be individualized. (SM)
49 CFR 382.603 - Training for supervisors.
Code of Federal Regulations, 2010 CFR
2010-10-01
... minutes of training on controlled substances use. The training will be used by the supervisors to determine whether reasonable suspicion exists to require a driver to undergo testing under § 382.307. The training shall include the physical, behavioral, speech, and performance indicators of probable alcohol...
NASA Astrophysics Data System (ADS)
Wang, Xiaohui; Song, Yingxiong
2018-02-01
By exploiting the non-Kolmogorov model and Rytov approximation theory, a propagation model of Bessel-Gaussian vortex beams (BGVB) propagating in a subway tunnel is derived. Based on the propagation model, a model of orbital angular momentum (OAM) mode probability distribution is established to evaluate the propagation performance when the beam propagates along both the longitudinal and transverse directions in the subway tunnel. Through numerical simulations and experimental verification, the influences of the various parameters of the BGVB and of turbulence on the OAM mode probability distribution are evaluated, and the simulation results are consistent with the experimental statistics. The results verify that the middle area of the turbulence is more beneficial for vortex beam propagation than the edge; when the BGVB propagates along the longitudinal direction in the subway tunnel, the effects of turbulence on the OAM mode probability distribution can be decreased by selecting a larger anisotropy parameter, smaller coherence length, larger non-Kolmogorov power spectrum coefficient, smaller topological charge number, deeper subway tunnel, lower train speed, and longer wavelength. When the BGVB propagates along the transverse direction, the influences can also be mitigated by adopting a larger topological charge number, smaller non-Kolmogorov power spectrum coefficient, smaller refractive structure index, shorter wavelength, and shorter propagation distance.
Training in two-tier labor markets: The role of job match quality.
Akgündüz, Yusuf Emre; van Huizen, Thomas
2015-07-01
This study examines training investments in two-tier labor markets, focusing on the role of job match quality. Temporary workers are in general more likely than permanent workers to leave their employer and therefore are less likely to receive employer-funded training. However, as firms prefer to continue productive job matches, we hypothesize that the negative effect of holding a temporary contract on the probability to be trained diminishes with the quality of the job match. Using a recent longitudinal survey from the Netherlands, we find that temporary workers indeed participate less frequently in firm-sponsored training. However, this effect is fully driven by mismatches: holding a temporary contract does not significantly decrease the probability to receive training for workers in good job matches. Depending on match quality, a temporary job can either be a stepping stone or a dead-end. Copyright © 2015 Elsevier Inc. All rights reserved.
Seismic waveform inversion using neural networks
NASA Astrophysics Data System (ADS)
De Wit, R. W.; Trampert, J.
2012-12-01
Full waveform tomography aims to extract all available information on Earth structure and seismic sources from seismograms. The strongly non-linear nature of this inverse problem is often addressed through simplifying assumptions for the physical theory or data selection, thus potentially neglecting valuable information. Furthermore, the assessment of the quality of the inferred model is often lacking. This calls for the development of methods that fully appreciate the non-linear nature of the inverse problem, whilst providing a quantification of the uncertainties in the final model. We propose to invert seismic waveforms in a fully non-linear way by using artificial neural networks. Neural networks can be viewed as powerful and flexible non-linear filters. They are very common in speech, handwriting and pattern recognition. Mixture Density Networks (MDN) allow us to obtain marginal posterior probability density functions (pdfs) of all model parameters, conditioned on the data. An MDN can approximate an arbitrary conditional pdf as a linear combination of Gaussian kernels. Seismograms serve as input, Earth structure parameters are the so-called targets and network training aims to learn the relationship between input and targets. The network is trained on a large synthetic data set, which we construct by drawing many random Earth models from a prior model pdf and solving the forward problem for each of these models, thus generating synthetic seismograms. As a first step, we aim to construct a 1D Earth model. Training sets are constructed using the Mineos package, which computes synthetic seismograms in a spherically symmetric non-rotating Earth by summing normal modes. We train a network on the body waveforms present in these seismograms. Once the network has been trained, it can be presented with new unseen input data, in our case the body waves in real seismograms. We thus obtain the posterior pdf which represents our final state of knowledge given the information in the training set and the real data.
Nevada Test and Training Range Depleted Uranium Target Disposal Environmental Assessment
2005-03-01
to establish the probability and scope of such transport. Long-Term Fate of Depleted Uranium at Aberdeen and Yuma Proving Grounds Phase II: Human...1990. Long-Term Fate of Depleted Uranium at Aberdeen and Yuma Proving Grounds Final Report, Phase 1: Geochemical Transport and Modeling. Los...of Depleted Uranium at Aberdeen and Yuma Proving Grounds , Phase II: Human Health and Ecological Risk Assessments. Los Alamos National Laboratory
Workload Transition: Implications for Individual and Team Performance
1993-01-01
…a half second) if they are motivated to do so, and they will choose one strategy or the other on the basis of the probability that it will best serve… Contents include: arousal; mediating effects; coping with stress; design solutions; strategies; training; team models and their implications; cognitive, strategy, and task switching; and implications for workload transition.
Techniques for the computation in demographic projections of health manpower.
Horbach, L
1979-01-01
Some basic principles and algorithms are presented which can be used for projective calculations of medical staff on the basis of demographic data. The effects of modifications of the input data, such as by health policy measures concerning training capacity, can be demonstrated by repeated calculations under alternative assumptions. Such models give a variety of results and may highlight the probable future balance between health manpower supply and requirements.
Spatiotemporal Dynamics and Reliable Computations in Recurrent Spiking Neural Networks
NASA Astrophysics Data System (ADS)
Pyle, Ryan; Rosenbaum, Robert
2017-01-01
Randomly connected networks of excitatory and inhibitory spiking neurons provide a parsimonious model of neural variability, but are notoriously unreliable for performing computations. We show that this difficulty is overcome by incorporating the well-documented dependence of connection probability on distance. Spatially extended spiking networks exhibit symmetry-breaking bifurcations and generate spatiotemporal patterns that can be trained to perform dynamical computations under a reservoir computing framework.
Viterbi sparse spike detection and a compositional origin to ultralow-velocity zones
NASA Astrophysics Data System (ADS)
Brown, Samuel Paul
Accurate interpretation of seismic travel times and amplitudes at both the exploration and global scales is complicated by the band-limited nature of seismic data. We present a stochastic method, Viterbi sparse spike detection (VSSD), to reduce a seismic waveform into a most probable constituent spike train. Model waveforms are constructed from a set of candidate spike trains convolved with a source wavelet estimate. For each model waveform, a profile hidden Markov model (HMM) is constructed to represent the waveform as a stochastic generative model with a linear topology corresponding to a sequence of samples. The Viterbi algorithm is employed to simultaneously find the optimal nonlinear alignment between a model waveform and the seismic data, and to assign a score to each candidate spike train. The most probable travel times and amplitudes are inferred from the alignments of the highest scoring models. Our analyses show that the method can resolve closely spaced arrivals below traditional resolution limits and that travel time estimates are robust in the presence of random noise and source wavelet errors. We applied the VSSD method to constrain the elastic properties of an ultralow-velocity zone (ULVZ) at the core-mantle boundary beneath the Coral Sea. We analyzed vertical component short period ScP waveforms for 16 earthquakes occurring in the Tonga-Fiji trench recorded at the Alice Springs Array (ASAR) in central Australia. These waveforms show strong precursory and postcursory seismic arrivals consistent with ULVZ layering. We used the VSSD method to measure differential travel times and amplitudes of the postcursor arrival ScSP and the precursor arrival SPcP relative to ScP. We compare our measurements to a database of approximately 340,000 synthetic seismograms, finding that these data are best fit by a ULVZ model with an S-wave velocity reduction of 24%, a P-wave velocity reduction of 23%, a thickness of 8.5 km, and a density increase of 6%. We simultaneously constrain both P- and S-wave velocity reductions as a 1:1 ratio inside this ULVZ. This 1:1 ratio is not consistent with a partial melt origin to ULVZs. Rather, we demonstrate that a compositional origin is more likely.
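For readers unfamiliar with the dynamic-programming step used by VSSD, the following generic Viterbi decoder for a small discrete HMM illustrates the idea of finding a most probable state path; it is a textbook sketch, not the profile-HMM alignment used in the study.

```python
# Generic Viterbi decoding for a discrete HMM (illustrative only).
import numpy as np

def viterbi(obs, log_init, log_trans, log_emit):
    """obs: observation indices; returns the most probable state path and its log probability."""
    n_states = log_init.shape[0]
    T = len(obs)
    delta = np.full((T, n_states), -np.inf)   # best log prob of any path ending in each state
    back = np.zeros((T, n_states), dtype=int)
    delta[0] = log_init + log_emit[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans            # (from_state, to_state)
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(n_states)] + log_emit[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path, delta[-1].max()

# Toy 2-state, 2-symbol example.
log_init = np.log([0.6, 0.4])
log_trans = np.log([[0.7, 0.3], [0.4, 0.6]])
log_emit = np.log([[0.9, 0.1], [0.2, 0.8]])   # rows: states, cols: symbols
path, logp = viterbi([0, 0, 1, 1, 0], log_init, log_trans, log_emit)
print(path, round(logp, 3))
```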
Let your fingers do the walking: A simple spectral signature model for "remote" fossil prospecting.
Conroy, Glenn C; Emerson, Charles W; Anemone, Robert L; Townsend, K E Beth
2012-07-01
Even with the most meticulous planning, and utilizing the most experienced fossil-hunters, fossil prospecting in remote and/or extensive areas can be time-consuming, expensive, logistically challenging, and often hit-or-miss. While nothing can predict or guarantee with 100% assurance that fossils will be found in any particular location, any procedures or techniques that might increase the odds of success would be a major benefit to the field. Here we describe, and test, one such technique that we feel has great potential for increasing the probability of finding fossiliferous sediments - a relatively simple spectral signature model using the spatial analysis and image classification functions of ArcGIS® 10 that creates interactive thematic land cover maps that can be used for "remote" fossil prospecting. Our test case is the extensive Eocene sediments of the Uinta Basin, Utah - a fossil prospecting area encompassing ∼1200 square kilometers. Using Landsat 7 ETM+ satellite imagery, we "trained" the spatial analysis and image classification algorithms using the spectral signatures of known fossil localities discovered in the Uinta Basin prior to 2005 and then created interactive probability models highlighting other regions in the Basin having a high probability of containing fossiliferous sediments based on their spectral signatures. A fortuitous "post-hoc" validation of our model presented itself. Our model identified several paleontological "hotspots", regions that, while not producing any fossil localities prior to 2005, had high probabilities of being fossiliferous based on the similarities of their spectral signatures to those of previously known fossil localities. Subsequent fieldwork found fossils in all the regions predicted by the model. Copyright © 2012 Elsevier Ltd. All rights reserved.
Adapting Active Shape Models for 3D segmentation of tubular structures in medical images.
de Bruijne, Marleen; van Ginneken, Bram; Viergever, Max A; Niessen, Wiro J
2003-07-01
Active Shape Models (ASM) have proven to be an effective approach for image segmentation. In some applications, however, the linear model of gray level appearance around a contour that is used in ASM is not sufficient for accurate boundary localization. Furthermore, the statistical shape model may be too restricted if the training set is limited. This paper describes modifications to both the shape and the appearance model of the original ASM formulation. Shape model flexibility is increased, for tubular objects, by modeling the axis deformation independent of the cross-sectional deformation, and by adding supplementary cylindrical deformation modes. Furthermore, a novel appearance modeling scheme that effectively deals with a highly varying background is developed. In contrast with the conventional ASM approach, the new appearance model is trained on both boundary and non-boundary points, and the probability that a given point belongs to the boundary is estimated non-parametrically. The methods are evaluated on the complex task of segmenting thrombus in abdominal aortic aneurysms (AAA). Shape approximation errors were successfully reduced using the two shape model extensions. Segmentation using the new appearance model significantly outperformed the original ASM scheme; average volume errors are 5.1% and 45% respectively.
NASA Astrophysics Data System (ADS)
Idris, N. H.; Salim, N. A.; Othman, M. M.; Yasin, Z. M.
2018-03-01
This paper presents an Evolutionary Programming (EP) approach proposed to optimize the training parameters of an Artificial Neural Network (ANN) for predicting cascading collapse occurrence due to the effect of protection system hidden failure. The data have been collected from simulations of the probability of hidden failure model based on historical data. The training parameters of a multilayer feed-forward network with backpropagation have been optimized with the objective of minimizing the Mean Square Error (MSE). The optimal training parameters consist of the momentum rate, the learning rate, and the numbers of neurons in the first and second hidden layers, as selected by EP-ANN. The IEEE 14-bus system has been tested as a case study to validate the proposed technique. The results show reliable prediction performance, validated through the MSE and the Correlation Coefficient (R).
Nagelkerke, Nico; Fidler, Vaclav
2015-01-01
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
A multi-agent intelligent environment for medical knowledge.
Vicari, Rosa M; Flores, Cecilia D; Silvestre, André M; Seixas, Louise J; Ladeira, Marcelo; Coelho, Helder
2003-03-01
AMPLIA is a multi-agent intelligent learning environment designed to support training of diagnostic reasoning and modelling of domains with complex and uncertain knowledge. AMPLIA focuses on the medical area. It is a system that deals with uncertainty under the Bayesian network approach, where learner-modelling tasks will consist of creating a Bayesian network for a problem the system will present. The construction of a network involves qualitative and quantitative aspects. The qualitative part concerns the network topology, that is, causal relations among the domain variables. After it is ready, the quantitative part is specified. It is composed of the distribution of conditional probability of the variables represented. A negotiation process (managed by an intelligent MediatorAgent) will treat the differences of topology and probability distribution between the model the learner built and the one built-in in the system. That negotiation process occurs between the agents that represent the expert knowledge domain (DomainAgent) and the agent that represents the learner knowledge (LearnerAgent).
NASA Astrophysics Data System (ADS)
Al-Mudhafar, W. J.
2013-12-01
Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution in order to build an accurate reservoir model for optimal future reservoir performance. In this paper, facies estimation has been carried out through multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. Firstly, the Robust Sequential Imputation Algorithm has been used to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies and the core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables using a dot product. A beta distribution of facies has been considered as prior knowledge, and the predicted probability (posterior) has been estimated from the MLR based on Bayes' theorem, which represents the relationship between the predicted probability (posterior), the conditional probability, and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap is carried out to estimate the extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each sample has the same size as the original training set, and this can be repeated N times to produce N bootstrap datasets and re-fit the model accordingly, decreasing the squared difference between the estimated and observed categorical variables (facies) and thereby decreasing the degree of uncertainty.
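A minimal scikit-learn sketch of the facies-probability and bootstrap steps described above, on synthetic data with hypothetical predictors; it is not the author's workflow (which also includes the imputation step and a beta prior on facies).

```python
# Hedged sketch: multinomial logistic regression for facies probabilities, with
# a bootstrap estimate of out-of-sample misclassification error.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(6)
n = 600
X = rng.normal(size=(n, 4))                  # e.g. gamma ray, density, porosity, Sw
facies = rng.integers(0, 3, n)               # synthetic facies labels (3 classes)
X[facies == 1, 0] += 1.5                     # give the classes some separation
X[facies == 2, 1] -= 1.5

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, facies)
probs = model.predict_proba(X)               # per-sample facies probabilities

# Bootstrap estimate of prediction error: refit on resampled data, score on the
# observations left out of each bootstrap sample.
errs = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    oob = np.setdiff1d(np.arange(n), idx)
    m = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X[idx], facies[idx])
    errs.append(1 - m.score(X[oob], facies[oob]))
print("bootstrap (out-of-bag) misclassification rate ≈", round(float(np.mean(errs)), 3))
```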
NASA Astrophysics Data System (ADS)
Khodabakhshi, M.; Jafarpour, B.
2013-12-01
Characterization of complex geologic patterns that create preferential flow paths in certain reservoir systems requires higher-order geostatistical modeling techniques. Multipoint statistics (MPS) provides a flexible grid-based approach for simulating such complex geologic patterns from a conceptual prior model known as a training image (TI). In this approach, a stationary TI that encodes the higher-order spatial statistics of the expected geologic patterns is used to represent the shape and connectivity of the underlying lithofacies. While MPS is quite powerful for describing complex geologic facies connectivity, the nonlinear and complex relation between the flow data and the facies distribution makes flow data conditioning quite challenging. We propose an adaptive technique for conditioning facies simulation from a prior TI to nonlinear flow data. Non-adaptive strategies for conditioning facies simulation to flow data can involve many forward flow model solutions and can be computationally very demanding. To improve the conditioning efficiency, we develop an adaptive sampling approach with a data feedback mechanism based on the sampling history. In this approach, after a short sampling burn-in period in which unconditional samples are generated and passed through an acceptance/rejection test, an ensemble of accepted samples is identified and used to generate a facies probability map. This facies probability map contains the common features of the accepted samples and provides conditioning information about facies occurrence in each grid block, which is used to guide the conditional facies simulation process. As the sampling progresses, the initial probability map is updated according to the collective information about the facies distribution in the chain of accepted samples to increase the acceptance rate and efficiency of the conditioning. This conditioning process can be viewed as an optimization approach in which each new sample is proposed based on the sampling history to improve the data mismatch objective function. We extend the application of this adaptive conditioning approach to the case where multiple training images are proposed to describe the geologic scenario in a given formation. We discuss the advantages and limitations of the proposed adaptive conditioning scheme and use numerical experiments from fluvial channel formations to demonstrate its applicability and performance compared to non-adaptive conditioning techniques.
van Vugt, Marieke Karlijn; Hitchcock, Peter; Shahar, Ben; Britton, Willoughby
2012-01-01
Converging research suggests that mindfulness training exerts its therapeutic effects on depression by reducing rumination. Theoretically, rumination is a multifaceted construct that aggregates multiple neurocognitive aspects of depression, including poor executive control, negative and overgeneral memory bias, and persistence or stickiness of negative mind states. Current measures of rumination, most often self-reports, do not capture these different aspects of ruminative tendencies, and therefore are limited in providing detailed information about the mechanisms of mindfulness. We developed new insight into the potential mechanisms of rumination, based on three model-based metrics of free recall dynamics. These three measures reflect the patterns of memory retrieval of valenced information: the probability of first recall (Pstart), which represents initial affective bias; the probability of staying with the same valence category rather than switching, which indicates the strength of positive or negative association networks (Pstay); and the probability of stopping (Pstop), or ending recall within a given valence, which indicates persistence or stickiness of a mind state. We investigated the effects of Mindfulness-Based Cognitive Therapy (MBCT; N = 29) vs. wait-list control (N = 23) on these recall dynamics in a randomized controlled trial in individuals with recurrent depression. Participants completed a standard laboratory stressor, the Trier Social Stress Test, to induce negative mood and activate ruminative tendencies. Following that, participants completed a free recall task consisting of three word lists. This assessment was conducted both before and after treatment or wait-list. While MBCT participants' Pstart remained relatively stable, controls showed multiple indications of depression-related deterioration toward more negative and less positive bias. Following the intervention, MBCT participants decreased in their tendency to sustain trains of negative words and increased their tendency to sustain trains of positive words. Conversely, controls showed the opposite tendency: controls stayed in trains of negative words for longer, and stayed in trains of positive words for less time, relative to pre-intervention scores. MBCT participants tended to stop recall less often with negative words, which indicates less persistence or stickiness of negatively valenced mental context. MBCT participants showed a decrease in patterns that may perpetuate rumination on all three types of recall dynamics (Pstart, Pstay, and Pstop), compared to controls. MBCT may weaken the strength of the self-perpetuating negative association networks that are responsible for the persistent and "sticky" negative mind states observed in depression, and increase the positive associations that are lacking in depression. This study also offers a novel, objective method of measuring several indices of ruminative tendencies indicative of the underlying mechanisms of rumination.
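The three recall-dynamics measures can be illustrated with a short sketch that tabulates Pstart, Pstay, and Pstop directly from valence-labelled recall sequences; the sequences below are hypothetical, and the study's estimates are model-based rather than raw tabulations.

```python
# Illustration of the definitions of Pstart, Pstay, and Pstop from hypothetical
# recall sequences (one list of recalled-word valences per trial).
from collections import Counter

recalls = [                        # hypothetical recall sequences ('neg', 'neu', 'pos')
    ["neg", "neg", "neu", "pos"],
    ["pos", "pos", "neu", "neg", "neg"],
    ["neu", "pos", "pos"],
]

valences = ("neg", "neu", "pos")
starts = Counter(seq[0] for seq in recalls if seq)
stops = Counter(seq[-1] for seq in recalls if seq)
stay = Counter()
trans_total = Counter()
for seq in recalls:
    for a, b in zip(seq, seq[1:]):
        trans_total[a] += 1
        if a == b:
            stay[a] += 1

n_lists = len(recalls)
for v in valences:
    p_start = starts[v] / n_lists                          # probability of first recall
    p_stay = stay[v] / trans_total[v] if trans_total[v] else float("nan")
    p_stop = stops[v] / n_lists                            # probability recall ends in v
    print(f"{v}: Pstart={p_start:.2f} Pstay={p_stay:.2f} Pstop={p_stop:.2f}")
```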
Convergence analyses on on-line weight noise injection-based training algorithms for MLPs.
Sum, John; Leung, Chi-Sing; Ho, Kevin
2012-11-01
Injecting weight noise during training is a simple technique that has been proposed for almost two decades. However, little is known about its convergence behavior. This paper studies the convergence of two weight noise injection-based training algorithms, multiplicative weight noise injection with weight decay and additive weight noise injection with weight decay. We consider their application to multilayer perceptrons with either linear or sigmoid output nodes. Let w(t) be the weight vector, let V(w) be the corresponding objective function of the training algorithm, let α > 0 be the weight decay constant, and let μ(t) be the step size. We show that if μ(t) → 0, then with probability one E[||w(t)||_2^2] is bounded and lim_{t→∞} ||w(t)||_2 exists. Based on these two properties, we show that if μ(t) → 0, Σ_t μ(t) = ∞, and Σ_t μ(t)^2 < ∞, then with probability one these algorithms converge. Moreover, w(t) converges with probability one to a point where ∇_w V(w) = 0.
Improving galaxy morphologies for SDSS with Deep Learning
NASA Astrophysics Data System (ADS)
Domínguez Sánchez, H.; Huertas-Company, M.; Bernardi, M.; Tuccillo, D.; Fischer, J. L.
2018-05-01
We present a morphological catalogue for ˜670 000 galaxies in the Sloan Digital Sky Survey in two flavours: T-type, related to the Hubble sequence, and the Galaxy Zoo 2 (GZ2 hereafter) classification scheme. By combining accurate existing visual classification catalogues with machine learning, we provide the largest and most accurate morphological catalogue to date. The classifications are obtained with Deep Learning algorithms using Convolutional Neural Networks (CNNs). We use two visual classification catalogues, GZ2 and Nair & Abraham (2010), for training CNNs with colour images in order to obtain T-types and a series of GZ2-type questions (disc/features, edge-on galaxies, bar signature, bulge prominence, roundness, and mergers). We also provide an additional probability enabling a separation of pure elliptical (E) galaxies from S0, where the T-type model is not so efficient. For the T-type, our results show smaller offset and scatter than previous models trained with support vector machines. For the GZ2-type questions, our models have high accuracy (>97 per cent) and precision and recall values (>90 per cent) when applied to a test sample with the same characteristics as the one used for training. The catalogue is publicly released with the paper.
Real-time flood forecasts & risk assessment using a possibility-theory based fuzzy neural network
NASA Astrophysics Data System (ADS)
Khan, U. T.
2016-12-01
Globally, floods are one of the most devastating natural disasters, and improved flood forecasting methods are essential for better flood protection in urban areas. Given the availability of high-resolution real-time datasets for flood variables (e.g. streamflow and precipitation) in many urban areas, data-driven models have been effectively used to predict peak flow rates in rivers; however, the selection of input parameters for these types of models is often subjective. Additionally, the inherent uncertainty associated with data-driven models, along with errors in extreme event observations, means that uncertainty quantification is essential. Addressing these concerns will enable improved flood forecasting methods and provide more accurate flood risk assessments. In this research, a new type of data-driven model, a quasi-real-time updating fuzzy neural network, is developed to predict peak flow rates in urban riverine watersheds. A possibility-to-probability transformation is first used to convert observed data into fuzzy numbers. A possibility-theory-based training regime is then used to construct the fuzzy parameters and the outputs. A new entropy-based optimisation criterion is used to train the network. Two existing methods to select the optimum input parameters are modified to account for fuzzy number inputs and compared. These methods are: Entropy-Wavelet-based Artificial Neural Network (EWANN) and Combined Neural Pathway Strength Analysis (CNPSA). Finally, an automated algorithm designed to select the optimum structure of the neural network is implemented. The overall impact of each component of training this network is to replace the traditional ad hoc network configuration methods with ones based on objective criteria. Ten years of data from the Bow River in Calgary, Canada (including two major floods in 2005 and 2013) are used to calibrate and test the network. The EWANN method selected lagged peak flow as a candidate input, whereas the CNPSA method selected lagged precipitation and lagged mean daily flow as candidate inputs. Model performance metrics show that the CNPSA method had higher performance (with an efficiency of 0.76). Model output was used to assess the risk of extreme peak flows for a given day using an inverse possibility-to-probability transformation.
Effect of multimedia information sequencing on educational outcome in orthodontic training.
Aly, Medhat; Willems, Guy; Van Den Noortgate, Wim; Elen, Jan
2012-08-01
The aim of this research was to compare the effectiveness of hierarchical sequencing (HS) versus elaboration sequencing (ES) models in improving the educational outcome of clinical knowledge when using instructional multimedia programs in postgraduate orthodontic training. Twenty-four postgraduate and 24 undergraduate dental students participated in this study. The postgraduates were following an orthodontic speciality training programme. The undergraduates were fourth- and fifth-year dental students. Twelve instructional multimedia modules were developed, six of them logically sequenced (LS), discussing six different orthodontic topics. Another six modules on identical topics were sequenced according to one macro-sequencing (MS) model. The implemented MS model was either HS or ES. The only difference between the LS and MS modules was the adopted sequencing model. All participants were assigned to consistent pairs of students and were randomly divided into a test and a control group. In each pair, one student studied the LS module (control group) while the other studied the MS version (test group). Pre- and post-evaluation tests of each pair of participants were performed to measure the knowledge, understanding and application of each participant with regard to the discussed topic. A multilevel analysis was conducted to assess the estimated effect of the different sequencing models. The level of significance was set at 0.05. At baseline, no significant differences (P > 0.05) were found in pre-test scores between groups. The HS model showed a significant effect on the scores achieved (P = 0.05). The test group showed a significantly higher estimated probability of correct answers to the questions (P = 0.003) when applying the HS model. The HS model may improve educational outcome when using instructional multimedia programs in postgraduate orthodontic training.
Predicting the probability of mortality of gastric cancer patients using decision tree.
Mohammadzadeh, F; Noorkojuri, H; Pourhoseingholi, M A; Saadat, S; Baghestani, A R
2015-06-01
Gastric cancer is the fourth most common cancer worldwide. This motivated us to investigate gastric cancer risk factors utilizing statistical methods. The aim of this study was to identify the most important factors influencing the mortality of patients who suffer from gastric cancer and to introduce a classification approach based on a decision tree model for predicting the probability of mortality from this disease. Data on 216 patients with gastric cancer, who were registered in Taleghani hospital in Tehran, Iran, were analyzed. At first, patients were divided into two groups: dead and alive. Then, to fit the decision tree model to our data, we randomly selected 20% of the dataset as the test sample, and the remaining dataset was considered the training sample. Finally, the validity of the model was examined with sensitivity, specificity, diagnostic accuracy and the area under the receiver operating characteristic curve. CART version 6.0 and SPSS version 19.0 were used for the analysis of the data. Diabetes, ethnicity, tobacco, tumor size, surgery, pathologic stage, age at diagnosis, exposure to chemical weapons and alcohol consumption were identified as factors affecting mortality from gastric cancer. The sensitivity, specificity and accuracy of the decision tree were 0.72, 0.75 and 0.74, respectively. These indices show that the decision tree model has acceptable accuracy for predicting the probability of mortality in gastric cancer patients. A simple decision tree consisting of factors affecting mortality from gastric cancer may therefore help clinicians as a reliable and practical tool to predict the probability of mortality in these patients.
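A hedged sketch of the modelling steps described above (an 80/20 split, a classification tree, and evaluation by sensitivity, specificity, accuracy and AUC) using scikit-learn; the data and effect sizes below are synthetic stand-ins, not the hospital records.

```python
# Hedged sketch: train/test split, decision tree classifier, and the evaluation
# metrics named in the abstract, on synthetic data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(7)
n = 216
X = np.column_stack([
    rng.integers(30, 85, n),        # age at diagnosis
    rng.integers(1, 5, n),          # pathologic stage
    rng.binomial(1, 0.3, n),        # diabetes
    rng.binomial(1, 0.4, n),        # tobacco use
])
logit = -4 + 0.03 * X[:, 0] + 0.6 * X[:, 1] + 0.5 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))            # death (1) / alive (0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

pred = tree.predict(X_te)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
sens = tp / (tp + fn)
spec = tn / (tn + fp)
acc = (tp + tn) / len(y_te)
auc = roc_auc_score(y_te, tree.predict_proba(X_te)[:, 1])
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f} AUC={auc:.2f}")
```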
Sensitivity Analysis of the Bone Fracture Risk Model
NASA Technical Reports Server (NTRS)
Lewandowski, Beth; Myers, Jerry; Sibonga, Jean Diane
2017-01-01
Introduction: The probability of bone fracture during and after spaceflight is quantified to aid in mission planning, to determine required astronaut fitness standards and training requirements, and to inform countermeasure research and design. Probability is quantified with a probabilistic modeling approach where distributions of model parameter values, instead of single deterministic values, capture the parameter variability within the astronaut population, and fracture predictions are probability distributions with a mean value and an associated uncertainty. Because of this uncertainty, the model in its current state cannot discern an effect of countermeasures on fracture probability, for example between use and non-use of bisphosphonates or between spaceflight exercise performed with the Advanced Resistive Exercise Device (ARED) or on devices prior to installation of ARED on the International Space Station. This is thought to be due to the inability to measure key contributors to bone strength, for example, geometry and volumetric distributions of bone mass, with areal bone mineral density (BMD) measurement techniques. To further the applicability of the model, we performed a parameter sensitivity study aimed at identifying the parameter uncertainties that most affect the model forecasts, in order to determine which areas of the model need enhancement to reduce uncertainty. Methods: The bone fracture risk model (BFxRM), originally published by Nelson et al., is a probabilistic model that can assess the risk of astronaut bone fracture. This is accomplished by utilizing biomechanical models to assess the applied loads; utilizing models of spaceflight BMD loss in at-risk skeletal locations; quantifying bone strength through a relationship between areal BMD and bone failure load; and relating fracture risk index (FRI), the ratio of applied load to bone strength, to fracture probability. There are many factors associated with these calculations, including environmental factors, factors associated with the fall event, mass and anthropometric values of the astronaut, BMD characteristics, characteristics of the relationship between BMD and bone strength, and bone fracture characteristics. The uncertainty in these factors is captured through the use of parameter distributions, and the fracture predictions are probability distributions with a mean value and an associated uncertainty. To determine parameter sensitivity, a correlation coefficient is found between the sample set of each model parameter and the calculated fracture probabilities. Each parameter's contribution to the variance is found by squaring the correlation coefficients, dividing by the sum of the squared correlation coefficients, and multiplying by 100. Results: Sensitivity analyses of BFxRM simulations of preflight, 0 days post-flight and 365 days post-flight falls onto the hip revealed a subset of the twelve factors within the model that cause the most variation in the fracture predictions. These factors include the spring constant used in the hip biomechanical model, the midpoint FRI parameter within the equation used to convert FRI to fracture probability, and preflight BMD values. Future work: Plans are underway to update the BFxRM by incorporating bone strength information from finite element models (FEM) into the bone strength portion of the BFxRM. FEM bone strength information, along with fracture outcome data, will also be incorporated into the FRI-to-fracture-probability relationship.
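The contribution-to-variance calculation described in the Methods can be sketched directly; the sample data below are synthetic and the parameter labels are illustrative.

```python
# Sketch of the correlation-based sensitivity measure described above:
# each parameter's contribution to variance is its squared correlation with
# the fracture-probability output, normalized to sum to 100%.
import numpy as np

def contribution_to_variance(samples, fracture_prob):
    """samples: (n_runs, n_params) Monte Carlo draws; fracture_prob: (n_runs,)."""
    r = np.array([np.corrcoef(samples[:, j], fracture_prob)[0, 1]
                  for j in range(samples.shape[1])])
    r2 = r ** 2
    return 100.0 * r2 / r2.sum()

rng = np.random.default_rng(0)
# e.g. columns: hip spring constant, midpoint FRI parameter, preflight BMD (synthetic)
samples = rng.normal(size=(10_000, 3))
fracture_prob = 0.5 * samples[:, 0] + 0.3 * samples[:, 1] + rng.normal(scale=0.2, size=10_000)
print(contribution_to_variance(samples, fracture_prob))
```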
Combining Multiple Knowledge Sources for Continuous Speech Recognition
1989-08-01
derived by estimating probabilities from a training set, or a linguistically-based model that uses syntactic and semantic information explicitly. The...into a hierarchical set of rules that would cover a much larger percentage of new sentences than the original sentence patterns. We applied this tool...statistical grammars typically used by the use of linguistic knowledge. In particular, we group the different words in the vocabulary into classes, under the
Fink, Günther; Maloney, Kathleen; Berg, Katrina; Jordan, Matthew; Svoronos, Theodore; Aber, Flavia; Dickens, William
2015-01-01
Abstract Objective To evaluate the impact – on diagnosis and treatment of malaria – of introducing rapid diagnostic tests to drug shops in eastern Uganda. Methods Overall, 2193 households in 79 study villages with at least one licensed drug shop were enrolled and monitored for 12 months. After 3 months of monitoring, drug shop vendors in 67 villages randomly selected for the intervention were offered training in the use of malaria rapid diagnostic tests and – if trained – offered access to such tests at a subsidized price. The remaining 12 study villages served as controls. A difference-in-differences regression model was used to estimate the impact of the intervention. Findings Vendors from 92 drug shops successfully completed training and 50 actively stocked and performed the rapid tests. Over 9 months, trained vendors did an average of 146 tests per shop. Households reported 22 697 episodes of febrile illness. The availability of rapid tests at local drug shops significantly increased the probability of any febrile illness being tested for malaria by 23.15% (P = 0.015) and being treated with an antimalarial drug by 8.84% (P = 0.056). The probability that artemisinin combination therapy was bought increased by a statistically insignificant 5.48% (P = 0.574). Conclusion In our study area, testing for malaria was increased by training drug shop vendors in the use of rapid tests and providing them access to such tests at a subsidized price. Additional interventions may be needed to achieve a higher coverage of testing and a higher rate of appropriate responses to test results.
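A minimal sketch of the difference-in-differences idea used above, assuming a hypothetical episode-level table with `tested`, `intervention`, `post`, and `village_id` columns; this is not the authors' exact specification.

```python
# Illustrative difference-in-differences regression: the impact estimate is the
# coefficient on the interaction of intervention village and post-rollout period.
import pandas as pd
import statsmodels.formula.api as smf

episodes = pd.read_csv("febrile_episodes.csv")   # hypothetical file
m = smf.ols("tested ~ intervention + post + intervention:post", data=episodes).fit(
    cov_type="cluster", cov_kwds={"groups": episodes["village_id"]})
print(m.params["intervention:post"])             # estimated change in testing probability
```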
Computer Modeling to Evaluate the Impact of Technology Changes on Resident Procedural Volume.
Grenda, Tyler R; Ballard, Tiffany N S; Obi, Andrea T; Pozehl, William; Seagull, F Jacob; Chen, Ryan; Cohn, Amy M; Daskin, Mark S; Reddy, Rishindra M
2016-12-01
As resident "index" procedures change in volume due to advances in technology or reliance on simulation, it may be difficult to ensure trainees meet case requirements. Training programs are in need of metrics to determine how many residents their institutional volume can support. As a case study of how such metrics can be applied, we evaluated a case distribution simulation model to examine program-level mediastinoscopy and endobronchial ultrasound (EBUS) volumes needed to train thoracic surgery residents. A computer model was created to simulate case distribution based on annual case volume, number of trainees, and rotation length. Single institutional case volume data (2011-2013) were applied, and 10 000 simulation years were run to predict the likelihood (95% confidence interval) of all residents (4 trainees) achieving board requirements for operative volume during a 2-year program. The mean annual mediastinoscopy volume was 43. In a simulation of pre-2012 board requirements (thoracic pathway, 25; cardiac pathway, 10), there was a 6% probability of all 4 residents meeting requirements. Under post-2012 requirements (thoracic, 15; cardiac, 10), however, the likelihood increased to 88%. When EBUS volume (mean 19 cases per year) was concurrently evaluated in the post-2012 era (thoracic, 10; cardiac, 0), the likelihood of all 4 residents meeting case requirements was only 23%. This model provides a metric to predict the probability of residents meeting case requirements in an era of changing volume by accounting for unpredictable and inequitable case distribution. It could be applied across operations, procedures, or disease diagnoses and may be particularly useful in developing resident curricula and schedules.
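A minimal sketch of this kind of case-distribution simulation, under the simplifying assumption that each case is assigned uniformly at random to one of the trainees (the published model's assignment rules may differ).

```python
# Monte Carlo sketch: distribute each year's cases at random among trainees and
# estimate the probability that every trainee reaches the requirement over a
# 2-year program. Volumes and requirements below are illustrative.
import numpy as np

def prob_all_meet(annual_volume, n_trainees, requirement, years=2,
                  n_sims=10_000, seed=0):
    rng = np.random.default_rng(seed)
    successes = 0
    for _ in range(n_sims):
        counts = np.zeros(n_trainees, dtype=int)
        for _ in range(years):
            assignments = rng.integers(0, n_trainees, size=annual_volume)
            counts += np.bincount(assignments, minlength=n_trainees)
        successes += np.all(counts >= requirement)
    return successes / n_sims

print(prob_all_meet(annual_volume=43, n_trainees=4, requirement=15))
```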
Kaitani, Toshiko; Nakagami, Gojiro; Iizaka, Shinji; Fukuda, Takashi; Oe, Makoto; Igarashi, Ataru; Mori, Taketoshi; Takemura, Yukie; Mizokami, Yuko; Sugama, Junko; Sanada, Hiromi
2015-01-01
The high prevalence of severe pressure ulcers (PUs) is an important issue that needs to be highlighted in Japan. In a previous study, we devised an advanced PU management protocol to enable early detection of and intervention for deep tissue injury and critical colonization. This protocol was effective for preventing more severe PUs. The present study aimed to compare the cost-effectiveness of the care provided using an advanced PU management protocol, from a medical provider's perspective, implemented by trained wound, ostomy, and continence nurses (WOCNs), with that of conventional care provided by a control group of WOCNs. A Markov model was constructed for a 1-year time horizon to determine the incremental cost-effectiveness ratio of advanced PU management compared with conventional care. The number of quality-adjusted life-years gained and the cost in Japanese yen (¥) ($US1 = ¥120; 2015) were used as the outcomes. Model inputs for clinical probabilities and related costs were based on our previous clinical trial results. Univariate sensitivity analyses were performed. Furthermore, a Bayesian multivariate probability sensitivity analysis was performed using Monte Carlo simulations with advanced PU management. Two different models were created for the initial cohort distribution. For both models, the expected effectiveness for the intervention group using advanced PU management techniques was high, with a low expected cost value. The sensitivity analyses suggested that the results were robust. Intervention by WOCNs using advanced PU management techniques was more effective and cost-effective than conventional care. © 2015 by the Wound Healing Society.
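For readers unfamiliar with Markov cohort models, the toy sketch below shows how an incremental cost-effectiveness ratio (ICER) is obtained; the states, transition probabilities, costs and utilities are illustrative and are not the study's inputs. Note that when an intervention is both cheaper and more effective, as reported here, it simply dominates and the ICER alone is not informative.

```python
# Toy Markov cohort sketch with weekly cycles over a 1-year horizon.
import numpy as np

def run_cohort(P, cost, utility, cycles=52, start=0):
    dist = np.zeros(P.shape[0]); dist[start] = 1.0
    total_cost = total_qaly = 0.0
    for _ in range(cycles):
        total_cost += dist @ cost
        total_qaly += dist @ utility / 52       # annual utility spread over weekly cycles
        dist = dist @ P                         # advance the cohort one cycle
    return total_cost, total_qaly

# states: no_PU, PU, severe_PU (rows must sum to 1); all numbers illustrative
P_control      = np.array([[0.96, 0.03, 0.01], [0.10, 0.80, 0.10], [0.02, 0.08, 0.90]])
P_intervention = np.array([[0.97, 0.025, 0.005], [0.15, 0.80, 0.05], [0.03, 0.10, 0.87]])
cost    = np.array([1000.0, 6000.0, 20000.0])   # yen per cycle, illustrative
utility = np.array([0.85, 0.60, 0.40])

c0, q0 = run_cohort(P_control, cost, utility)
c1, q1 = run_cohort(P_intervention, cost, utility)
print("ICER (yen per QALY gained):", (c1 - c0) / (q1 - q0))
```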
van Wilgen, Nicola J; Richardson, David M
2012-04-01
We developed a method to predict the potential of non-native reptiles and amphibians (herpetofauna) to establish populations. This method may inform efforts to prevent the introduction of invasive non-native species. We used boosted regression trees to determine whether nine variables influence establishment success of introduced herpetofauna in California and Florida. We used an independent data set to assess model performance. Propagule pressure was the variable most strongly associated with establishment success. Species with short juvenile periods and species with phylogenetically more distant relatives in regional biotas were more likely to establish than species that start breeding later and those that have close relatives. Average climate match (the similarity of climate between native and non-native range) and life form were also important. Frogs and lizards were the taxonomic groups most likely to establish, whereas a much lower proportion of snakes and turtles established. We used results from our best model to compile a spreadsheet-based model for easy use and interpretation. Probability scores obtained from the spreadsheet model were strongly correlated with establishment success as were probabilities predicted for independent data by the boosted regression tree model. However, the error rate for predictions made with independent data was much higher than with cross validation using training data. This difference in predictive power does not preclude use of the model to assess the probability of establishment of herpetofauna because (1) the independent data had no information for two variables (meaning the full predictive capacity of the model could not be realized) and (2) the model structure is consistent with the recent literature on the primary determinants of establishment success for herpetofauna. It may still be difficult to predict the establishment probability of poorly studied taxa, but it is clear that non-native species (especially lizards and frogs) that mature early and come from environments similar to that of the introduction region have the highest probability of establishment. ©2012 Society for Conservation Biology.
Dabbour, Essam; Easa, Said; Haider, Murtaza
2017-10-01
This study attempts to identify significant factors that affect the severity of drivers' injuries when colliding with trains at railroad grade crossings by analyzing the individual-specific heterogeneity related to those factors over a period of 15 years. Both fixed-parameter and random-parameter ordered regression models were used to analyze records of all vehicle-train collisions that occurred in the United States from January 1, 2001 to December 31, 2015. For the fixed-parameter ordered models, both probit and negative log-log link functions were used; the latter function accounts for the fact that lower injury severity levels are more probable than higher ones. Separate models were developed for heavy and light-duty vehicles. Higher train and vehicle speeds, female drivers, and young drivers (below the age of 21 years) were found to be consistently associated with higher severity of drivers' injuries for both heavy and light-duty vehicles. Furthermore, favorable weather, light-duty trucks (including pickup trucks, panel trucks, mini-vans, vans, and sports-utility vehicles), and senior drivers (above the age of 65 years) were found to be consistently associated with higher severity of drivers' injuries for light-duty vehicles only. All other factors (e.g., air temperature, the type of warning devices, darkness conditions, and highway pavement type) were found to be temporally unstable, which may explain the conflicting findings of previous studies related to those factors. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kohonen and counterpropagation neural networks applied for mapping and interpretation of IR spectra.
Novic, Marjana
2008-01-01
The principles of learning strategy of Kohonen and counterpropagation neural networks are introduced. The advantages of unsupervised learning are discussed. The self-organizing maps produced in both methods are suitable for a wide range of applications. Here, we present an example of Kohonen and counterpropagation neural networks used for mapping, interpretation, and simulation of infrared (IR) spectra. The artificial neural network models were trained for prediction of structural fragments of an unknown compound from its infrared spectrum. The training set contained over 3,200 IR spectra of diverse compounds of known chemical structure. The structure-spectra relationship was encompassed by the counterpropagation neural network, which assigned structural fragments to individual compounds within certain probability limits, assessed from the predictions of test compounds. The counterpropagation neural network model for prediction of fragments of chemical structure is reversible, which means that, for a given structural domain, limited to the training data set in the study, it can be used to simulate the IR spectrum of a chemical defined with a set of structural fragments.
Image aesthetic quality evaluation using convolution neural network embedded learning
NASA Astrophysics Data System (ADS)
Li, Yu-xin; Pu, Yuan-yuan; Xu, Dan; Qian, Wen-hua; Wang, Li-peng
2017-11-01
In this paper, an embedded-learning convolution neural network (ELCNN) based on image content is proposed to evaluate image aesthetic quality. Our approach can not only cope with small-scale data but also score image aesthetic quality. First, we compared AlexNet and VGG_S to confirm which is more suitable for this image aesthetic quality evaluation task. Second, to further boost aesthetic quality classification performance, we use image content to train the aesthetic quality classification models; however, this makes the training samples smaller, and a single round of fine-tuning cannot make full use of the small-scale data set. Third, to solve the problem of the second step, we propose two successive rounds of fine-tuning based on the aesthetic quality label and the content label, respectively, and the classification probability of the trained CNN models is used to evaluate image aesthetic quality. The experiments are carried out on the small-scale Photo Quality data set. The results show that the classification accuracy of our approach is higher than that of existing image aesthetic quality evaluation approaches.
NASA Astrophysics Data System (ADS)
Miwa, Shotaro; Kage, Hiroshi; Hirai, Takashi; Sumi, Kazuhiko
We propose a probabilistic face recognition algorithm for access control systems (ACSs). Compared with existing ACSs using low-cost IC cards, face recognition has advantages in usability and security: it does not require people to hold cards over scanners and does not accept impostors carrying authorized cards. Face recognition therefore attracts more interest in security markets than IC cards. But in security markets where low-cost ACSs exist, price competition is important, and there is a limit to the quality of available cameras and image control. ACSs using face recognition are therefore required to handle much lower quality images, such as defocused and poorly gain-controlled images, than high-security systems such as immigration control. To tackle such image quality problems, we developed a face recognition algorithm based on a probabilistic model which combines a variety of image-difference features trained by Real AdaBoost with their prior probability distributions. This makes it possible to evaluate and use only the reliable features among the trained ones during each authentication, and to achieve high recognition performance. A field evaluation using a pseudo access control system installed in our office shows that the proposed system achieves a constantly high recognition rate independent of face image quality, with an EER (equal error rate) about four times lower, under a variety of image conditions, than that of a system without any prior probability distributions. In contrast, using image-difference features without any prior probabilities is sensitive to image quality. We also evaluated PCA, which has worse but constant performance because of its general optimization over all the data. Compared with PCA, Real AdaBoost without any prior distribution performs twice as well under good image conditions, but degrades to a performance comparable to PCA under poor image conditions.
A method for modeling bias in a person's estimates of likelihoods of events
NASA Technical Reports Server (NTRS)
Nygren, Thomas E.; Morera, Osvaldo
1988-01-01
It is of practical importance in decision situations involving risk to train individuals to transform uncertainties into subjective probability estimates that are both accurate and unbiased. We have found that in decision situations involving risk, people often introduce subjective bias in their estimation of the likelihoods of events depending on whether the possible outcomes are perceived as being good or bad. Until now, however, the successful measurement of individual differences in the magnitude of such biases has not been attempted. In this paper we illustrate a modification of a procedure originally outlined by Davidson, Suppes, and Siegel (3) to allow for a quantitatively-based methodology for simultaneously estimating an individual's subjective utility and subjective probability functions. The procedure is now an interactive computer-based algorithm, DSS, that allows for the measurement of biases in probability estimation by obtaining independent measures of two subjective probability functions (S+ and S-) for winning (i.e., good outcomes) and for losing (i.e., bad outcomes) respectively for each individual, and for different experimental conditions within individuals. The algorithm and some recent empirical data are described.
Task Training Emphasis for Determining Training Priority.
1987-08-01
the relative time spent on tasks performed in their current jobs. Supervisors also rated the tasks on several different task factors, including Task Difficulty, Probable Consequences of Inadequate Performance, Task Delay Tolerance, and Recommended Training Emphasis...
Efficient detection of wound-bed and peripheral skin with statistical colour models.
Veredas, Francisco J; Mesa, Héctor; Morente, Laura
2015-04-01
A pressure ulcer is a clinical pathology of localised damage to the skin and underlying tissue caused by pressure, shear or friction. Reliable diagnosis supported by precise wound evaluation is crucial for successful treatment decisions. This paper presents a computer-vision approach to wound-area detection based on statistical colour models. Starting with a training set consisting of 113 real wound images, colour histogram models are created for four different tissue types. Back-projections of colour pixels onto those histogram models are used, from a Bayesian perspective, to estimate the posterior probability that a pixel belongs to each of those tissue classes. Performance measures obtained from contingency tables based on a gold standard of segmented images supplied by experts were used for model selection. The resulting fitted model was then validated on a set of 322 wound images manually segmented and labelled by expert clinicians. The final fitted segmentation model shows robustness and gives high mean performance rates [AUC: .9426 (SD .0563); accuracy: .8777 (SD .0799); F-score: .7389 (SD .1550); Cohen's kappa: .6585 (SD .1787)] when segmenting significant wound areas that include healing tissues.
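The Bayesian back-projection step can be sketched as follows; the histogram binning and priors here are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: per-class colour histograms act as likelihoods P(colour | tissue);
# back-projection combined with priors gives the posterior P(tissue | colour).
import numpy as np

def class_histogram(pixels, bins=32):
    """pixels: (n, 3) uint8 RGB values for one tissue class."""
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    return hist / hist.sum()                      # normalized likelihood table

def posterior(pixel, histograms, priors, bins=32):
    idx = tuple((np.asarray(pixel) * bins // 256).astype(int))
    likelihoods = np.array([h[idx] for h in histograms])
    joint = likelihoods * priors
    return joint / joint.sum() if joint.sum() > 0 else priors

# histograms = [class_histogram(px) for px in training_pixels_by_class]
# posterior((120, 80, 60), histograms, priors=np.array([0.4, 0.3, 0.2, 0.1]))
```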
Predicting the accuracy of ligand overlay methods with Random Forest models.
Nandigam, Ravi K; Evans, David A; Erickson, Jon A; Kim, Sangtae; Sutherland, Jeffrey J
2008-12-01
The accuracy of binding mode prediction using standard molecular overlay methods (ROCS, FlexS, Phase, and FieldCompare) is studied. Previous work has shown that simple decision tree modeling can be used to improve accuracy by selection of the best overlay template. This concept is extended to the use of Random Forest (RF) modeling for template and algorithm selection. An extensive data set of 815 ligand-bound X-ray structures representing 5 gene families was used to generate ca. 70,000 overlays using the four programs. RF models, trained using standard measures of ligand and protein similarity and Lipinski-related descriptors, are used for automatically selecting the reference ligand and overlay method that maximize the probability of reproducing the overlay deduced from X-ray structures (i.e., using RMSD ≤ 2 Å as the criterion for success). RF model scores are highly predictive of overlay accuracy, and their use in template and method selection produces correct overlays in 57% of cases for 349 overlay ligands not used for training the RF models. The inclusion of protein sequence similarity in the models enables the use of templates bound to related protein structures, yielding useful results even for proteins having no available X-ray structures.
TORNADO-WARNING PERFORMANCE IN THE PAST AND FUTURE: A Perspective from Signal Detection Theory.
NASA Astrophysics Data System (ADS)
Brooks, Harold E.
2004-06-01
Changes over the years in tornado-warning performance in the United States can be modeled from the perspective of signal detection theory. From this view, it can be seen that there have been distinct periods of change in performance, most likely associated with deployment of radars, and changes in scientific understanding and training. The model also makes it clear that improvements in the false alarm ratio can only occur at the cost of large decreases in the probability of detection, or with large improvements in the overall quality of the warning system.
Annual Historical Report - AMEDD Activities, Calendar Year 1986
1987-01-01
experimentation. A method was devised to determine the heat transfer properties of the head by use of a copper model which is unique and allows independent ...4500 m). Soldiers with less than a 20 torr increase on the test have only a 40-50% probability of acute mountain sickness. Therefore, CPT at sea level may be used
Introductory life science mathematics and quantitative neuroscience courses.
Duffus, Dwight; Olifer, Andrei
2010-01-01
We describe two sets of courses designed to enhance the mathematical, statistical, and computational training of life science undergraduates at Emory College. The first course is an introductory sequence in differential and integral calculus, modeling with differential equations, probability, and inferential statistics. The second is an upper-division course in computational neuroscience. We provide a description of each course, detailed syllabi, examples of content, and a brief discussion of the main issues encountered in developing and offering the courses.
Serial Spike Time Correlations Affect Probability Distribution of Joint Spike Events.
Shahi, Mina; van Vreeswijk, Carl; Pipa, Gordon
2016-01-01
Detecting the existence of temporally coordinated spiking activity, and its role in information processing in the cortex, has remained a major challenge for neuroscience research. Different methods and approaches have been suggested to test whether the observed synchronized events are significantly different from those expected by chance. To analyze the simultaneous spike trains for precise spike correlation, these methods typically model the spike trains as a Poisson process implying that the generation of each spike is independent of all the other spikes. However, studies have shown that neural spike trains exhibit dependence among spike sequences, such as the absolute and relative refractory periods which govern the spike probability of the oncoming action potential based on the time of the last spike, or the bursting behavior, which is characterized by short epochs of rapid action potentials, followed by longer episodes of silence. Here we investigate non-renewal processes with the inter-spike interval distribution model that incorporates spike-history dependence of individual neurons. For that, we use the Monte Carlo method to estimate the full shape of the coincidence count distribution and to generate false positives for coincidence detection. The results show that compared to the distributions based on homogeneous Poisson processes, and also non-Poisson processes, the width of the distribution of joint spike events changes. Non-renewal processes can lead to both heavy tailed or narrow coincidence distribution. We conclude that small differences in the exact autostructure of the point process can cause large differences in the width of a coincidence distribution. Therefore, manipulations of the autostructure for the estimation of significance of joint spike events seem to be inadequate.
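A Monte Carlo estimate of a coincidence-count distribution can be sketched as below; for brevity a renewal gamma process stands in for the history-dependent (non-renewal) interval models discussed above, and all parameters are illustrative.

```python
# Monte Carlo sketch: generate surrogate spike-train pairs with non-Poisson
# (gamma) inter-spike intervals and estimate the width of the distribution of
# joint spike events within a coincidence window.
import numpy as np

rng = np.random.default_rng(0)

def spike_train(rate_hz, duration_s, shape=2.0):
    # gamma-distributed inter-spike intervals with the requested mean rate
    isi = rng.gamma(shape, 1.0 / (shape * rate_hz), size=int(rate_hz * duration_s * 2))
    t = np.cumsum(isi)
    return t[t < duration_s]

def coincidences(t1, t2, window_s=0.005):
    return sum(np.any(np.abs(t2 - t) <= window_s) for t in t1)

counts = [coincidences(spike_train(20, 10), spike_train(20, 10))
          for _ in range(2000)]
print("mean:", np.mean(counts), "std:", np.std(counts))   # width of the distribution
```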
Recruit Fitness as a Predictor of Police Academy Graduation.
Shusko, M; Benedetti, L; Korre, M; Eshleman, E J; Farioli, A; Christophi, C A; Kales, S N
2017-10-01
Suboptimal recruit fitness may be a risk factor for poor performance, injury, illness, and lost time during police academy training. To assess the probability of successful completion and graduation from a police academy as a function of recruits' baseline fitness levels at the time of academy entry. A retrospective study in which all available records from recruit training courses held (2006-2012) at all Massachusetts municipal police academies were reviewed and analysed. Entry fitness levels were quantified from the following measures, as recorded at the start of each training class: body composition, push-ups, sit-ups, sit-and-reach, and 1.5-mile run time. The primary outcome of interest was the odds of not successfully graduating from an academy. We used generalized linear mixed models in order to fit logistic regression models with random intercepts for assessing the probability of not graduating, based on entry-level fitness. The primary analyses were restricted to recruits with complete entry-level fitness data. The fitness measures most strongly associated with academy failure were fewer push-ups completed (odds ratio [OR] = 5.2, 95% confidence interval [CI] 2.3-11.7, for 20 versus 41-60 push-ups) and slower run times (OR = 3.8, 95% CI 1.8-7.8, [1.5-mile run time of ≥15'20″] versus [12'33″ to 10'37″]). Baseline push-ups and 1.5-mile run time showed the best ability to predict successful academy graduation, especially when considered together. Future research should include prospective validation of entry-level fitness as a predictor of subsequent police academy success. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine.
Stargate GTM: Bridging Descriptor and Activity Spaces.
Gaspar, Héléna A; Baskin, Igor I; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre
2015-11-23
Predicting the activity profile of a molecule or discovering structures possessing a specific activity profile are two important goals in chemoinformatics, which could be achieved by bridging activity and molecular descriptor spaces. In this paper, we introduce the "Stargate" version of the Generative Topographic Mapping approach (S-GTM) in which two different multidimensional spaces (e.g., structural descriptor space and activity space) are linked through a common 2D latent space. In the S-GTM algorithm, the manifolds are trained simultaneously in two initial spaces using the probabilities in the 2D latent space calculated as a weighted geometric mean of probability distributions in both spaces. S-GTM has the following interesting features: (1) activities are involved during the training procedure; therefore, the method is supervised, unlike conventional GTM; (2) using molecular descriptors of a given compound as input, the model predicts a whole activity profile, and (3) using an activity profile as input, areas populated by relevant chemical structures can be detected. To assess the performance of S-GTM prediction models, a descriptor space (ISIDA descriptors) of a set of 1325 GPCR ligands was related to a B-dimensional (B = 1 or 8) activity space corresponding to pKi values for eight different targets. S-GTM outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
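In our notation (not necessarily the paper's), the weighted geometric mean used to combine the two spaces during training can be written as a joint responsibility of latent node k for compound n:

```latex
% Notation sketch (ours): R_{kn} combines the descriptor-space likelihood
% p(x_n | z_k) and the activity-space likelihood p(a_n | z_k) as a weighted
% geometric mean, with weights w_1 + w_2 = 1, normalized over latent nodes.
R_{kn} \;=\;
  \frac{\bigl[p(\mathbf{x}_n \mid \mathbf{z}_k)\bigr]^{w_1}\,
        \bigl[p(\mathbf{a}_n \mid \mathbf{z}_k)\bigr]^{w_2}}
       {\sum_{k'} \bigl[p(\mathbf{x}_n \mid \mathbf{z}_{k'})\bigr]^{w_1}\,
                  \bigl[p(\mathbf{a}_n \mid \mathbf{z}_{k'})\bigr]^{w_2}}
```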
Neumann, Steffen; Schmitt-Kopplin, Philippe
2017-01-01
Lipid identification is a major bottleneck in high-throughput lipidomics studies. However, tools for the analysis of lipid tandem MS spectra are rather limited. While the comparison against spectra in reference libraries is one of the preferred methods, these libraries are far from being complete. In order to improve identification rates, the in silico fragmentation tool MetFrag was combined with Lipid Maps and lipid-class specific classifiers which calculate probabilities for lipid class assignments. The resulting LipidFrag workflow was trained and evaluated on different commercially available lipid standard materials, measured with data dependent UPLC-Q-ToF-MS/MS acquisition. The automatic analysis was compared against manual MS/MS spectra interpretation. With the lipid class specific models, identification of the true positives was improved especially for cases where candidate lipids from different lipid classes had similar MetFrag scores by removing up to 56% of false positive results. This LipidFrag approach was then applied to MS/MS spectra of lipid extracts of the nematode Caenorhabditis elegans. Fragments explained by LipidFrag match known fragmentation pathways, e.g., neutral losses of lipid headgroups and fatty acid side chain fragments. Based on prediction models trained on standard lipid materials, high probabilities for correct annotations were achieved, which makes LipidFrag a good choice for automated lipid data analysis and reliability testing of lipid identifications. PMID:28278196
Learning and Visualizing Modulation Discriminative Radio Signal Features
2016-09-01
implemented as a mapping of a sequence of in-phase quadrature ( IQ ) measurements generated by a software-defined radio to a probability distri- bution...over modulation classes. 3.1 TRAINING SNR EVALUATION Training CNNs on RF data raises the unique question of determining an optimal training SNR, that
Transition from School-Based Training in VET
ERIC Educational Resources Information Center
Daehlen, Marianne
2017-01-01
Purpose: This paper assesses the drop-out rate among disadvantaged students within vocational education and training. The purpose of this paper is to examine the probability of dropping out after school-based training for child welfare clients--a particularly disadvantaged group of youth. Child welfare clients' drop-out rate is compared with…
Asher, Lucy; Harvey, Naomi D.; Green, Martin; England, Gary C. W.
2017-01-01
Epidemiology is the study of patterns of health-related states or events in populations. Statistical models developed for epidemiology could be usefully applied to behavioral states or events. The aim of this study is to present the application of epidemiological statistics to understanding animal behavior where discrete outcomes are of interest, using data from guide dogs to illustrate. Specifically, survival analysis and multistate modeling are applied to data on guide dogs, comparing dogs that completed training and qualified as a guide dog with those that were withdrawn from the training program. Survival analysis estimates the time to (or between) binary events and the probability of an event occurring at or beyond a specified time point. Survival analysis, using a Cox proportional hazards model, was used to examine the time taken to withdraw a dog from training. Sex, breed, and other factors affected time to withdrawal. Bitches were withdrawn faster than dogs; Labradors were withdrawn faster, and Labrador × Golden Retrievers slower, than Golden Retriever × Labradors; and dogs not bred by Guide Dogs were withdrawn faster than those bred by Guide Dogs. Multistate modeling (MSM) can be used as an extension of survival analysis to incorporate more than two discrete events or states. Multistate models were used to investigate transitions from training to qualification as a guide dog or behavioral withdrawal, and from qualification as a guide dog to behavioral withdrawal. Sex, breed (with purebred Labradors and Golden Retrievers differing from F1 crosses), and whether or not a dog was bred by Guide Dogs affected transitions between states. We postulate that survival analysis and MSM could be applied to a wide range of behavioral data and key examples are provided. PMID:28804710
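A minimal sketch of the survival-analysis step, assuming a hypothetical table with a duration column, a withdrawal indicator and dummy-coded covariates; this is not the study's full model.

```python
# Cox proportional-hazards sketch for time to withdrawal from training.
import pandas as pd
from lifelines import CoxPHFitter

dogs = pd.read_csv("guide_dogs.csv")   # hypothetical file
# duration: days in training; withdrawn: 1 if withdrawn, 0 if qualified (censored)
cph = CoxPHFitter()
cph.fit(dogs[["duration", "withdrawn", "sex_female", "breed_labrador",
              "bred_by_guide_dogs"]],
        duration_col="duration", event_col="withdrawn")
cph.print_summary()                    # hazard ratios for sex, breed, breeding source
```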
Xu, Yifang; Collins, Leslie M
2005-06-01
This work investigates dynamic range and intensity discrimination for electrical pulse-train stimuli that are modulated by noise using a stochastic auditory nerve model. Based on a hypothesized monotonic relationship between loudness and the number of spikes elicited by a stimulus, theoretical prediction of the uncomfortable level has previously been determined by comparing spike counts to a fixed threshold, Nucl. However, no specific rule for determining Nucl has been suggested. Our work determines the uncomfortable level based on the excitation pattern of the neural response in a normal ear. The number of fibers corresponding to the portion of the basilar membrane driven by a stimulus at an uncomfortable level in a normal ear is related to Nucl at an uncomfortable level of the electrical stimulus. Intensity discrimination limens are predicted using signal detection theory via the probability mass function of the neural response and via experimental simulations. The results show that the uncomfortable level for pulse-train stimuli increases slightly as noise level increases. Combining this with our previous threshold predictions, we hypothesize that the dynamic range for noise-modulated pulse-train stimuli should increase with additive noise. However, since our predictions indicate that intensity discrimination under noise degrades, overall intensity coding performance may not improve significantly.
Designing optimal stimuli to control neuronal spike timing
Packer, Adam M.; Yuste, Rafael; Paninski, Liam
2011-01-01
Recent advances in experimental stimulation methods have raised the following important computational question: how can we choose a stimulus that will drive a neuron to output a target spike train with optimal precision, given physiological constraints? Here we adopt an approach based on models that describe how a stimulating agent (such as an injected electrical current or a laser light interacting with caged neurotransmitters or photosensitive ion channels) affects the spiking activity of neurons. Based on these models, we solve the reverse problem of finding the best time-dependent modulation of the input, subject to hardware limitations as well as physiologically inspired safety measures, that causes the neuron to emit a spike train that with highest probability will be close to a target spike train. We adopt fast convex constrained optimization methods to solve this problem. Our methods can potentially be implemented in real time and may also be generalized to the case of many cells, suitable for neural prosthesis applications. With the use of biologically sensible parameters and constraints, our method finds stimulation patterns that generate very precise spike trains in simulated experiments. We also tested the intracellular current injection method on pyramidal cells in mouse cortical slices, quantifying the dependence of spiking reliability and timing precision on constraints imposed on the applied currents. PMID:21511704
Feedback Valence Affects Auditory Perceptual Learning Independently of Feedback Probability
Amitay, Sygal; Moore, David R.; Molloy, Katharine; Halliday, Lorna F.
2015-01-01
Previous studies have suggested that negative feedback is more effective in driving learning than positive feedback. We investigated the effect on learning of providing varying amounts of negative and positive feedback while listeners attempted to discriminate between three identical tones; an impossible task that nevertheless produces robust learning. Four feedback conditions were compared during training: 90% positive feedback or 10% negative feedback informed the participants that they were doing equally well, while 10% positive or 90% negative feedback informed them they were doing equally badly. In all conditions the feedback was random in relation to the listeners’ responses (because the task was to discriminate three identical tones), yet both the valence (negative vs. positive) and the probability of feedback (10% vs. 90%) affected learning. Feedback that informed listeners they were doing badly resulted in better post-training performance than feedback that informed them they were doing well, independent of valence. In addition, positive feedback during training resulted in better post-training performance than negative feedback, but only positive feedback indicating listeners were doing badly on the task resulted in learning. As we have previously speculated, feedback that better reflected the difficulty of the task was more effective in driving learning than feedback that suggested performance was better than it should have been given perceived task difficulty. But contrary to expectations, positive feedback was more effective than negative feedback in driving learning. Feedback thus had two separable effects on learning: feedback valence affected motivation on a subjectively difficult task, and learning occurred only when feedback probability reflected the subjective difficulty. To optimize learning, training programs need to take into consideration both feedback valence and probability. PMID:25946173
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carson, K.S.
The presence of overpopulation or unsustainable population growth may place pressure on the food and water supplies of countries in sensitive areas of the world. Severe air or water pollution may place additional pressure on these resources. These pressures may generate both internal and international conflict in these areas as nations struggle to provide for their citizens. Such conflicts may result in United States intervention, either unilaterally or through the United Nations. Therefore, it is in the interests of the United States to identify potential areas of conflict in order to properly train and allocate forces. The purpose of this research is to forecast the probability of conflict in a nation as a function of its environmental conditions. Probit, logit and ordered probit models are employed to forecast the probability of a given level of conflict. Data from 95 countries are used to estimate the models. Probability forecasts are generated for these 95 nations. Out-of-sample forecasts are generated for an additional 22 nations. These probabilities are then used to rank nations from highest probability of conflict to lowest. The results indicate that the dependence of a nation's economy on agriculture, the rate of deforestation, and the population density are important variables in forecasting the probability and level of conflict. These results indicate that environmental variables do play a role in generating or exacerbating conflict. It is unclear that the United States military has any direct role in mitigating the environmental conditions that may generate conflict. A more important role for the military is to aid in data gathering to generate better forecasts so that the troops are adequately prepared when conflict arises.
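An illustrative probit specification of this kind (with hypothetical variable names, not the report's data) could look like:

```python
# Probit sketch: forecast the probability of conflict from environmental
# covariates and rank countries by predicted risk.
import pandas as pd
import statsmodels.api as sm

nations = pd.read_csv("nations.csv")   # hypothetical file
X = sm.add_constant(nations[["agriculture_share_gdp", "deforestation_rate",
                             "population_density"]])
probit = sm.Probit(nations["conflict"], X).fit()
nations["p_conflict"] = probit.predict(X)
print(nations.sort_values("p_conflict", ascending=False)[["country", "p_conflict"]].head())
```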
A test of geographic assignment using isotope tracers in feathers of known origin
Wunder, Michael B.; Kester, C.L.; Knopf, F.L.; Rye, R.O.
2005-01-01
We used feathers of known origin collected from across the breeding range of a migratory shorebird to test the use of isotope tracers for assigning breeding origins. We analyzed δD, δ13C, and δ15N in feathers from 75 mountain plover (Charadrius montanus) chicks sampled in 2001 and from 119 chicks sampled in 2002. We estimated parameters for continuous-response inverse regression models and for discrete-response Bayesian probability models from data for each year independently. We evaluated model predictions with both the training data and by using the alternate year as an independent test dataset. Our results provide weak support for modeling latitude and isotope values as monotonic functions of one another, especially when data are pooled over known sources of variation such as sample year or location. We were unable to make even qualitative statements, such as north versus south, about the likely origin of birds using both δD and δ13C in inverse regression models; results were no better than random assignment. Probability models provided better results and a more natural framework for the problem. Correct assignment rates were highest when considering all three isotopes in the probability framework, but the use of even a single isotope was better than random assignment. The method appears relatively robust to temporal effects and is most sensitive to the isotope discrimination gradients over which samples are taken. We offer that the problem of using isotope tracers to infer geographic origin is best framed as one of assignment, rather than prediction.
[Resistance training is an underutilized therapy in obesity and advanced age].
Sundell, Jan
2011-01-01
The prevalence and costs of obesity, type 2 diabetes and frailty syndrome will increase dramatically. Resistance training not only decreases fat mass and central obesity, but also enhances insulin sensitivity. Resistance training is probably the most effective measure to prevent and treat sarcopenia. Many studies have shown that resistance training can maintain or even increase bone mineral density. Optimal nutrition enhances the anabolic effect of resistance training. Resistance training should be a central component of public health promotion programs along with aerobic exercise.
Event probabilities and impact zones for hazardous materials accidents on railroads
DOT National Transportation Integrated Search
1983-11-01
Procedures are presented for evaluating the probability and impacts of hazardous material accidents in rail transportation. The significance of track class for accident frequencies and of train speed for accident severity is quantified. Special atten...
Development of a Bayesian Estimator for Audio-Visual Integration: A Neurocomputational Study
Ursino, Mauro; Crisafulli, Andrea; di Pellegrino, Giuseppe; Magosso, Elisa; Cuppini, Cristiano
2017-01-01
The brain integrates information from different sensory modalities to generate a coherent and accurate percept of external events. Several experimental studies suggest that this integration follows the principle of Bayesian estimation. However, the neural mechanisms responsible for this behavior, and its development in a multisensory environment, are still insufficiently understood. We recently presented a neural network model of audio-visual integration (Neural Computation, 2017) to investigate how a Bayesian estimator can spontaneously develop from the statistics of external stimuli. The model assumes the presence of two topologically organized unimodal areas (auditory and visual). Neurons in each area receive an input from the external environment, computed as the inner product of the sensory-specific stimulus and the receptive field synapses, and a cross-modal input from neurons of the other modality. Based on sensory experience, synapses were trained via Hebbian potentiation and a decay term. The aim of this work is to improve the previous model by including a more realistic distribution of visual stimuli: visual stimuli have a higher spatial accuracy at the central azimuthal coordinate and a lower accuracy at the periphery. Moreover, their prior probability is higher at the center and decreases toward the periphery. Simulations show that, after training, the receptive fields of visual and auditory neurons shrink to reproduce the accuracy of the input (both at the center and at the periphery in the visual case), thus realizing the likelihood estimate of unimodal spatial position. Moreover, the preferred positions of visual neurons contract toward the center, thus encoding the prior probability of the visual input. Finally, a prior probability of the co-occurrence of audio-visual stimuli is encoded in the cross-modal synapses. The model is able to simulate the main properties of a Bayesian estimator and to reproduce behavioral data in all conditions examined. In particular, in unisensory conditions the visual estimates exhibit a bias toward the fovea, which increases with the level of noise. In cross-modal conditions, the SD of the estimates decreases when using congruent audio-visual stimuli, and a ventriloquism effect becomes evident in the case of spatially disparate stimuli. Moreover, the ventriloquism effect decreases with eccentricity. PMID:29046631
ERIC Educational Resources Information Center
Arntzen, Erik; Grondahl, Terje; Eilifsen, Christoffer
2010-01-01
Previous studies comparing groups of subjects have indicated differential probabilities of stimulus equivalence outcome as a function of training structures. One-to-Many (OTM) and Many-to-One (MTO) training structures seem to produce positive outcomes on tests for stimulus equivalence more often than a Linear Series (LS) training structure does.…
NASA Astrophysics Data System (ADS)
Ksoll, Victor F.; Gouliermis, Dimitrios A.; Klessen, Ralf S.; Grebel, Eva K.; Sabbi, Elena; Anderson, Jay; Lennon, Daniel J.; Cignoni, Michele; de Marchi, Guido; Smith, Linda J.; Tosi, Monica; van der Marel, Roeland P.
2018-05-01
The Hubble Tarantula Treasury Project (HTTP) has provided an unprecedented photometric coverage of the entire star-burst region of 30 Doradus down to the half Solar mass limit. We use the deep stellar catalogue of HTTP to identify all the pre-main-sequence (PMS) stars of the region, i.e., stars that have not started their lives on the main-sequence yet. The photometric distinction of these stars from the more evolved populations is not a trivial task due to several factors that alter their colour-magnitude diagram positions. The identification of PMS stars requires, thus, sophisticated statistical methods. We employ Machine Learning Classification techniques on the HTTP survey of more than 800,000 sources to identify the PMS stellar content of the observed field. Our methodology consists of 1) carefully selecting the most probable low-mass PMS stellar population of the star-forming cluster NGC2070, 2) using this sample to train classification algorithms to build a predictive model for PMS stars, and 3) applying this model in order to identify the most probable PMS content across the entire Tarantula Nebula. We employ Decision Tree, Random Forest and Support Vector Machine classifiers to categorise the stars as PMS and Non-PMS. The Random Forest and Support Vector Machine provided the most accurate models, predicting about 20,000 sources with a candidateship probability higher than 50 percent, and almost 10,000 PMS candidates with a probability higher than 95 percent. This is the richest and most accurate photometric catalogue of extragalactic PMS candidates across the extent of a whole star-forming complex.
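A sketch of the classification step, with illustrative hyperparameters, file names and photometric features (the survey's actual feature set and tuning are not reproduced here):

```python
# Random Forest sketch: train on the NGC 2070 sample, predict PMS candidacy
# probabilities for the full catalogue, and apply probability thresholds.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("ngc2070_training.csv")      # hypothetical files
full  = pd.read_csv("httpp_catalogue.csv")
features = ["m_555", "m_775", "colour_555_775"]  # illustrative photometric features

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(train[features], train["is_pms"])

full["p_pms"] = rf.predict_proba(full[features])[:, 1]
print((full["p_pms"] > 0.50).sum(), "candidates above 50%")
print((full["p_pms"] > 0.95).sum(), "candidates above 95%")
```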
Rokosik, Sandra L; Napier, T Celeste
2012-05-01
The dopamine agonist pramipexole (PPX) can increase impulsiveness, and PPX therapy for neurological diseases (Parkinson's disease (PD) and restless leg syndrome) is associated with impulse control disorders (ICDs) in subpopulations of treated patients. A commonly reported ICD is pathological gambling of which risk taking is a prominent feature. Probability discounting is a measurable aspect of risk taking. We recently developed a probability discounting paradigm wherein intracranial self-stimulation (ICSS) serves as the positive reinforcer. Here we used this paradigm to determine the effects of PPX on discounting. We included assessments of a rodent model of PD, wherein 6-OHDA was injected into the dorsolateral striatum of both hemispheres, which produced persistent PD-like deficits in posture adjustment. Rats were trained to perform ICSS-mediated probability discounting, in which PD-like and control groups exhibited similar profiles. Rats were treated twice daily for 2 weeks with 2 mg/kg (±)PPX (ie, 1 mg/kg of the active form), a dose that improved lesion-induced motor deficits. In both groups, (±)PPX increased discounting; preference for the large reinforcer was enhanced 30-45% at the most uncertain probabilities. Tolerance did not develop with repeated treatments. Increased discounting subsided within 2 weeks of (±)PPX cessation, and re-exposure to (±)PPX reinstated heightened discounting. Such findings emulate the clinical scenario; therefore, ICSS for discounting assessments in rats exhibited high face validity. This model should prove useful in medication development where assessment of the propensity of a putative therapy to induce risk-taking behaviors is of interest.
Deep phenotyping to predict live birth outcomes in in vitro fertilization
Banerjee, Prajna; Choi, Bokyung; Shahine, Lora K.; Jun, Sunny H.; O’Leary, Kathleen; Lathi, Ruth B.; Westphal, Lynn M.; Wong, Wing H.; Yao, Mylene W. M.
2010-01-01
Nearly 75% of in vitro fertilization (IVF) treatments do not result in live births and patients are largely guided by a generalized age-based prognostic stratification. We sought to provide personalized and validated prognosis by using available clinical and embryo data from prior, failed treatments to predict live birth probabilities in the subsequent treatment. We generated a boosted tree model, IVFBT, by training it with IVF outcomes data from 1,676 first cycles (C1s) from 2003–2006, followed by external validation with 634 cycles from 2007–2008, respectively. We tested whether this model could predict the probability of having a live birth in the subsequent treatment (C2). By using nondeterministic methods to identify prognostic factors and their relative nonredundant contribution, we generated a prediction model, IVFBT, that was superior to the age-based control by providing over 1,000-fold improvement to fit new data (p < 0.05), and increased discrimination by receiver–operative characteristic analysis (area-under-the-curve, 0.80 vs. 0.68 for C1, 0.68 vs. 0.58 for C2). IVFBT provided predictions that were more accurate for ∼83% of C1 and ∼60% of C2 cycles that were out of the range predicted by age. Over half of those patients were reclassified to have higher live birth probabilities. We showed that data from a prior cycle could be used effectively to provide personalized and validated live birth probabilities in a subsequent cycle. Our approach may be replicated and further validated in other IVF clinics. PMID:20643955
Occupational Propensity for Training in a Late Industrial Society: Evidence from Russia
ERIC Educational Resources Information Center
Anikin, Vasiliy A.
2017-01-01
What factors best explain the low incidence of skills training in a late industrial society like Russia? This research undertakes a multilevel analysis of the role of occupational structure in the probability of training. The explanatory power of occupation-specific determinants and skills polarization are evaluated, using a representative 2012…
Does Employer-Financed General Training Pay? Evidence from the US Navy.
ERIC Educational Resources Information Center
Garcia, Federico; Arkes, Jeremy; Trost, Robert
2002-01-01
Examines whether the US Navy's Voluntary Education program leads to lower personnel turnover. Finds that program participation is associated with an 11-percentage-point increase in the probability of continuing in the Navy for 6 years. Findings seem to support the theory that general training safeguards employer investments in specific training by…
Kimura, Satoko; Akamatsu, Tomonari; Li, Songhai; Dong, Shouyue; Dong, Lijun; Wang, Kexiong; Wang, Ding; Arai, Nobuaki
2010-09-01
A method is presented to estimate the density of finless porpoises using stationed passive acoustic monitoring. The number of click trains detected by stereo acoustic data loggers (A-tag) was converted to an estimate of the density of porpoises. First, an automated off-line filter was developed to detect a click train among noise, and the detection and false-alarm rates were calculated. Second, a density estimation model was proposed. The cue-production rate was measured by biologging experiments. The probability of detecting a cue and the area size were calculated from the source level, beam patterns, and a sound-propagation model. The effect of group size on the cue-detection rate was examined. Third, the proposed model was applied to estimate the density of finless porpoises at four locations from the Yangtze River to the inside of Poyang Lake. The estimated mean daily density of porpoises decreased from the main stream to the lake. Long-term monitoring over 466 days from June 2007 to May 2009 showed variation in the density from 0 to 4.79 porpoises/km(2). However, the density was less than 1 porpoise/km(2) during 94% of the period. These results suggest a potential gap and seasonal migration of the population in the bottleneck of Poyang Lake.
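In our notation (a standard cue-counting form, not necessarily the paper's exact expression), the conversion from detected click trains to density is:

```latex
% Notation sketch (ours): n = number of click trains detected during monitoring
% time T, r = cue (click-train) production rate per animal, \hat{P} = probability
% of detecting a cue within the monitored area A.
\hat{D} \;=\; \frac{n}{\hat{P}\, A \, r \, T}
```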
van Vugt, Marieke Karlijn; Hitchcock, Peter; Shahar, Ben; Britton, Willoughby
2012-01-01
Objectives: Converging research suggests that mindfulness training exerts its therapeutic effects on depression by reducing rumination. Theoretically, rumination is a multifaceted construct that aggregates multiple neurocognitive aspects of depression, including poor executive control, negative and overgeneral memory bias, and persistence or stickiness of negative mind states. Current measures of rumination, most often self-reports, do not capture these different aspects of ruminative tendencies and are therefore limited in providing detailed information about the mechanisms of mindfulness. Methods: We developed new insight into the potential mechanisms of rumination based on three model-based metrics of free recall dynamics. These three measures reflect patterns of memory retrieval of valenced information: the probability of first recall (Pstart), which represents initial affective bias; the probability of staying within the same valence category rather than switching (Pstay), which indicates the strength of positive or negative association networks; and the probability of stopping or ending recall within a given valence (Pstop), which indicates the persistence or stickiness of a mind state. We investigated the effects of Mindfulness-Based Cognitive Therapy (MBCT; N = 29) vs. wait-list control (N = 23) on these recall dynamics in a randomized controlled trial in individuals with recurrent depression. Participants completed a standard laboratory stressor, the Trier Social Stress Test, to induce negative mood and activate ruminative tendencies. Following that, participants completed a free recall task consisting of three word lists. This assessment was conducted both before and after treatment or wait-list. Results: While MBCT participants' Pstart remained relatively stable, controls showed multiple indications of depression-related deterioration toward more negative and less positive bias. Following the intervention, MBCT participants decreased in their tendency to sustain trains of negative words and increased their tendency to sustain trains of positive words. Controls showed the opposite pattern: they stayed in trains of negative words for longer, and in trains of positive words for less time, relative to pre-intervention scores. MBCT participants also tended to stop recall less often with negative words, which indicates less persistence or stickiness of negatively valenced mental context. Conclusion: MBCT participants showed a decrease in patterns that may perpetuate rumination on all three types of recall dynamics (Pstart, Pstay, and Pstop), compared to controls. MBCT may weaken the strength of the self-perpetuating negative association networks that are responsible for the persistent and “sticky” negative mind states observed in depression, and increase the positive associations that are lacking in depression. This study also offers a novel, objective method of measuring several indices of ruminative tendencies indicative of the underlying mechanisms of rumination. PMID:23049507
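A minimal sketch, under simple assumptions, of how the three recall-dynamics measures could be computed from valence-labelled recall sequences; the study's actual model-based estimation is more involved, so treat these definitions as illustrative.

```python
# Sketch: computing Pstart, Pstay and Pstop for one valence category from
# valence-labelled recall sequences ("pos"/"neg"/"neu"). Illustrative only.
def recall_dynamics(recalls, valence="neg"):
    starts = stays = transitions = stops = sequences = 0
    for seq in recalls:
        if not seq:
            continue
        sequences += 1
        starts += seq[0] == valence           # first recall in this valence
        stops += seq[-1] == valence           # recall ended in this valence
        for a, b in zip(seq, seq[1:]):
            if a == valence:
                transitions += 1
                stays += b == valence         # stayed rather than switched
    return {
        "Pstart": starts / sequences,
        "Pstay": stays / transitions if transitions else float("nan"),
        "Pstop": stops / sequences,
    }

lists = [["neg", "neg", "pos", "neu"], ["pos", "neg", "neg"], ["pos", "pos", "neu"]]
print(recall_dynamics(lists, "neg"))
```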
Soft context clustering for F0 modeling in HMM-based speech synthesis
NASA Astrophysics Data System (ADS)
Khorram, Soheil; Sameti, Hossein; King, Simon
2015-12-01
This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional `hard' decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in synthesizing natural-sounding high-quality speech. Conventionally, hard decision tree-clustered hidden Markov models (HMMs) are used, in which each model parameter is assigned to a single leaf node. However, this `divide-and-conquer' approach leads to data sparsity, with the consequence that it suffers from poor generalization, meaning that it is unable to accurately predict parameters for models of unseen contexts: the hard decision tree is a weak function approximator. To alleviate this, we propose the soft decision tree, which is a binary decision tree with soft decisions at the internal nodes. In this soft clustering method, internal nodes select both their children with certain membership degrees; therefore, each node can be viewed as a fuzzy set with a context-dependent membership function. The soft decision tree improves model generalization and provides a superior function approximator because it is able to assign each context to several overlapped leaves. In order to use such a soft decision tree to predict the parameters of the HMM output probability distribution, we derive the smoothest (maximum entropy) distribution which captures all partial first-order moments and a global second-order moment of the training samples. Employing such a soft decision tree architecture with maximum entropy distributions, a novel speech synthesis system is trained using maximum likelihood (ML) parameter re-estimation and synthesis is achieved via maximum output probability parameter generation. In addition, a soft decision tree construction algorithm optimizing a log-likelihood measure is developed. Both subjective and objective evaluations were conducted and indicate a considerable improvement over the conventional method.
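A minimal sketch of the soft-gating idea: each internal node computes a sigmoid membership from a context feature, a sample's leaf memberships are products of the gate probabilities along each path, and the prediction is the membership-weighted combination of leaf values. The tree structure, features, and parameters below are invented for illustration; they are not the paper's trained trees or its maximum-entropy leaf distributions.

```python
# Sketch of a soft (fuzzy) decision tree: internal nodes split softly via a
# sigmoid membership function, and each leaf receives a membership equal to the
# product of gate probabilities along its path. Illustrative parameters only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SoftNode:
    def __init__(self, feature=None, weight=1.0, bias=0.0,
                 left=None, right=None, leaf_value=None):
        self.feature, self.weight, self.bias = feature, weight, bias
        self.left, self.right, self.leaf_value = left, right, leaf_value

    def leaf_memberships(self, x, membership=1.0):
        if self.leaf_value is not None:                      # leaf node
            return [(membership, self.leaf_value)]
        gate = sigmoid(self.weight * x[self.feature] + self.bias)
        return (self.left.leaf_memberships(x, membership * gate) +
                self.right.leaf_memberships(x, membership * (1.0 - gate)))

    def predict(self, x):
        pairs = self.leaf_memberships(x)
        return sum(m * v for m, v in pairs) / sum(m for m, _ in pairs)

# Two-level tree over two context features, predicting (say) a mean log-F0 value.
tree = SoftNode(feature=0, weight=4.0, bias=-2.0,
                left=SoftNode(leaf_value=5.0),
                right=SoftNode(feature=1, weight=2.0, bias=0.0,
                               left=SoftNode(leaf_value=5.3),
                               right=SoftNode(leaf_value=4.8)))
print(tree.predict(np.array([0.7, -0.2])))
```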
The Integrated Medical Model: Statistical Forecasting of Risks to Crew Health and Mission Success
NASA Technical Reports Server (NTRS)
Fitts, M. A.; Kerstman, E.; Butler, D. J.; Walton, M. E.; Minard, C. G.; Saile, L. G.; Toy, S.; Myers, J.
2008-01-01
The Integrated Medical Model (IMM) helps capture and use organizational knowledge across the space medicine, training, operations, engineering, and research domains. The IMM uses this domain knowledge in the context of a mission and crew profile to forecast crew health and mission success risks. The IMM is most helpful in comparing the risk of two or more mission profiles, not as a tool for predicting absolute risk. The process of building the IMM adheres to the Probabilistic Risk Assessment (PRA) techniques described in NASA Procedural Requirement (NPR) 8705.5, and uses current evidence-based information to establish a defensible position for making decisions that help ensure crew health and mission success. The IMM quantitatively describes the following input parameters: 1) medical conditions and their likelihoods, 2) mission duration, 3) vehicle environment, 4) crew attributes (e.g., age, sex), 5) crew activities (e.g., EVAs, lunar excursions), 6) diagnosis and treatment protocols (e.g., medical equipment, consumables, pharmaceuticals), and 7) Crew Medical Officer (CMO) training effectiveness. It is worth reiterating that the IMM uses the data sets above as inputs. Many other risk management efforts stop at determining likelihood only. The IMM is unique in that it models not only likelihood but also risk mitigations, as well as the subsequent clinical outcomes based on those mitigations. Once the mathematical relationships among the above parameters are established, the IMM uses a Monte Carlo simulation technique (a random sampling of the inputs as described by their statistical distributions) to determine the probable outcomes. Because the IMM is a stochastic model (i.e., the input parameters are represented by various statistical distributions depending on the data type), when the mission is simulated 10-50,000 times with a given set of medical capabilities (risk mitigations), a prediction of the most probable outcomes can be generated. For each mission, the IMM tracks which conditions occurred and decrements the pharmaceuticals and supplies required to diagnose and treat these medical conditions. If supplies are depleted, the medical condition goes untreated, and crew and mission risk increase. The IMM currently models approximately 30 medical conditions. By the end of FY2008, the IMM will be modeling over 100 medical conditions, approximately 60 of which have been recorded as occurring during short and long space missions.
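A toy sketch of the Monte Carlo bookkeeping described above: each simulated mission samples which conditions occur, decrements the supplies needed to treat them, and flags conditions that go untreated. The condition names, occurrence probabilities, and kit contents are invented placeholders, not IMM data.

```python
# Toy Monte Carlo sketch of the simulation strategy described above.
# Condition names, rates and kit sizes are invented for illustration.
import random

CONDITIONS = {              # per-mission occurrence probability, supply item needed
    "headache":        (0.60, "analgesic"),
    "skin infection":  (0.20, "antibiotic"),
    "kidney stone":    (0.05, "analgesic"),
}
MEDICAL_KIT = {"analgesic": 1, "antibiotic": 1}

def simulate_missions(n_runs=10_000, seed=1):
    rng = random.Random(seed)
    untreated_runs = 0
    for _ in range(n_runs):
        kit = dict(MEDICAL_KIT)
        untreated = False
        for prob, item in CONDITIONS.values():
            if rng.random() < prob:          # condition occurs this mission
                if kit[item] > 0:
                    kit[item] -= 1           # condition treated, supply consumed
                else:
                    untreated = True         # supplies exhausted: risk increases
        untreated_runs += untreated
    return untreated_runs / n_runs

print("P(at least one untreated condition):", simulate_missions())
```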
Reinforcement Probability Modulates Temporal Memory Selection and Integration Processes
Matell, Matthew S.; Kurti, Allison N.
2013-01-01
We have previously shown that rats trained in a mixed-interval peak procedure (tone = 4s, light = 12s) respond in a scalar manner at a time in between the trained peak times when presented with the stimulus compound (Swanton & Matell, 2011). In our previous work, the two component cues were reinforced with different probabilities (short = 20%, long = 80%) to equate response rates, and we found that the compound peak time was biased toward the cue with the higher reinforcement probability. Here, we examined the influence that different reinforcement probabilities have on the temporal location and shape of the compound response function. We found that the time of peak responding shifted as a function of the relative reinforcement probability of the component cues, becoming earlier as the relative likelihood of reinforcement associated with the short cue increased. However, as the relative probabilities of the component cues grew dissimilar, the compound peak became non-scalar, suggesting that the temporal control of behavior shifted from a process of integration to one of selection. As our previous work has utilized durations and reinforcement probabilities more discrepant than those used here, these data suggest that the processes underlying the integration/selection decision for time are based on cue value. PMID:23896560
NASA Astrophysics Data System (ADS)
Hashemi, Seyyedhossein; Javaherian, Abdolrahim; Ataee-pour, Majid; Tahmasebi, Pejman; Khoshdel, Hossein
2014-12-01
In facies modeling, the ideal objective is to integrate different sources of data to generate a model that is as consistent as possible with reality with respect to geological shapes and their facies architectures. Multiple-point (geo)statistics (MPS) is a tool that offers the opportunity of reaching this goal by defining a training image (TI). A facies modeling workflow was conducted on a carbonate reservoir located in southwest Iran. Through a sequence stratigraphic correlation among the wells, it was revealed that the interval under modeling was deposited in a tidal flat environment. The Bahamas tidal flat environment, one of the most thoroughly studied modern carbonate tidal flats, was considered to be the source of the information required for modeling a TI. In parallel, a neural network probability cube was generated based on a set of attributes derived from the 3D seismic cube and applied in the MPS algorithm as soft conditioning data. Moreover, extracted channel bodies and drilled well log facies entered the modeling as hard data. The combination of these constraints resulted in a facies model that was highly consistent with the geological scenarios. This study showed how an analogy with modern occurrences can be set as the foundation for generating a training image. Channel morphology and the facies types currently being deposited, which are crucial for modeling a training image, were inferred from modern occurrences. However, there were some practical considerations concerning the MPS algorithm used for facies simulation. The main limitation was the large amount of RAM and CPU time needed to perform the simulations.
Automatic land cover classification of geo-tagged field photos using deep learning method
NASA Astrophysics Data System (ADS)
Xu, G.; Zhu, X.; Fu, D.; Dong, J.; Xiao, X.
2016-12-01
With the popularity of smartphones, more and more crowdsourced geo-tagged field photos are being shared by the public online. They are becoming a potentially valuable information source for environmental studies. However, labelling and recognizing these photos is time-consuming. To exploit such information, this research proposes a land cover type recognition model for geo-tagged field photos based on deep learning. The model combines a pre-trained convolutional neural network (CNN) as the image feature extractor with a softmax regression model as the feature classifier. The pre-trained CNN model Inception-v3 is used in this study. Previously labelled field photos from the Global Geo-Referenced Field Photo Library (http://eomf.ou.edu/photos) were chosen for model training and validation. The results indicate that our field photo recognition model achieves an acceptable accuracy of land cover classification (50.34% for top-1 prediction and 78.20% for top-3 prediction). More importantly, the model provides probabilities for its predictions as a self-assessment of uncertainty. After filtering out predictions with a certainty of less than 75%, the overall accuracy increases to 80.14%, which implies that the model is aware of its prediction uncertainty and can quantitatively assess it. By demonstrating the feasibility of this type of research, we hope that other similar studies can be conducted, such as geological and atmospheric information extraction from field photos. This research could serve as an exploration of how artificial intelligence and crowdsourced data can help earth studies.
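A minimal sketch of the uncertainty-filtering step: keep only photos whose top softmax probability reaches 0.75 and recompute accuracy on that subset. The probabilities and labels below are synthetic stand-ins for the CNN's actual outputs, constructed only so the effect of filtering is visible.

```python
# Sketch of confidence filtering of softmax predictions. Probabilities and
# labels are synthetic placeholders, not the study's CNN outputs.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_photos = 6, 1000
probs = rng.dirichlet(np.ones(n_classes) * 0.5, size=n_photos)   # softmax outputs
pred = probs.argmax(axis=1)
confidence = probs.max(axis=1)
# Make the synthetic "ground truth" agree with the prediction more often when the
# model is confident, so filtering by confidence visibly improves accuracy.
labels = np.where(rng.random(n_photos) < confidence, pred,
                  rng.integers(0, n_classes, n_photos))

overall_acc = (pred == labels).mean()
keep = confidence >= 0.75
filtered_acc = (pred[keep] == labels[keep]).mean()

print(f"overall accuracy: {overall_acc:.2%}")
print(f"accuracy on confident predictions ({keep.mean():.0%} kept): {filtered_acc:.2%}")
```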
2003-04-01
"action orientation". Tasks concerned pre-flight safety assessments for military combat aircraft and were performed by Army Cobra aviators. Dependent...evaluations are vital during future assessments of team performance and especially for modeling purposes, as the literature lacks empirical...a similar scale, and then assign probabilities to likelihoods for these in the future. Once completed, one can multiply expected feature values of…
Introductory Life Science Mathematics and Quantitative Neuroscience Courses
Olifer, Andrei
2010-01-01
We describe two sets of courses designed to enhance the mathematical, statistical, and computational training of life science undergraduates at Emory College. The first course is an introductory sequence in differential and integral calculus, modeling with differential equations, probability, and inferential statistics. The second is an upper-division course in computational neuroscience. We provide a description of each course, detailed syllabi, examples of content, and a brief discussion of the main issues encountered in developing and offering the courses. PMID:20810971
Boudreaux, Emily O; Cherry, Katie E; Elliott, Emily M; Hicks, Jason L
2011-05-01
Eight participants with probable Alzheimer's disease (AD) were trained to recall names of countries using the spaced-retrieval memory intervention. Six training sessions were administered on alternate days over a 2-week period. Half of the participants studied a target country alone and the other half studied a target country along with eight distractor countries. Training stimuli appeared in text-only format in half of the sessions and text with a color photograph of the country in the other sessions. On each trial, participants selected the target at increasingly longer retention intervals, contingent upon successful recall. Results indicated that the mean proportion of correct trials and longest duration achieved increased across training sessions, confirming the success of the spaced-retrieval intervention. Pictorial illustrations enhanced explicit memory for target country names. Implications of these data for current views on memory remediation in cognitively impaired older adults are discussed.
A Fine-Grained API Link Prediction Approach Supporting CMDA Mashup Recommendation
NASA Astrophysics Data System (ADS)
Zhang, J.; Bao, Q.; Lee, T. J.; Ramachandran, R.; Lee, S.; Pan, L.; Gatlin, P. N.; Maskey, M.
2017-12-01
Service (API) discovery and recommendation is key to the wide adoption of service-oriented architecture and service-oriented software engineering. Service recommendation typically relies on service linkage prediction calculated from the semantic distances (or similarities) among services based on their collections of inherent attributes. Given a specific context (mashup goal), however, different attributes may contribute differently to a service linkage. In this work, instead of training a single model over all attributes as a whole, a novel approach is presented that simultaneously trains separate models for individual attributes. Our contributions are threefold. First, we have developed an attribute-level data model featuring scalability and extensibility. We have extended the Multiplicative Attribute Graph (MAG) model to represent node profiles featuring rich categorical attributes, while relaxing its constraint of requiring a priori knowledge of predefined attributes. LDA is leveraged to dynamically identify attributes based on attribute modeling, and a multiple Gaussian fit is applied to find globally optimal values. Second, we have seamlessly integrated the latent relationships between API attributes with the observed network structure based on historical API usage data. Such a layered information model enables us to predict the probability of a link between two APIs based on their attribute link affinities, which carry a variety of information including metadata, semantic data, historical usage data, and crowdsourced user comments and annotations. Third, we have developed a fine-grained, context-aware mashup-API recommendation technique. On top of the individual models trained for separate attributes, a dedicated layer is trained to represent the latent attribute distribution for a given mashup purpose, i.e., the sensitivity of attributes to context. Thus, given the description of an intended mashup, the attributes sensitive to that goal are identified, and the corresponding attribute models are exploited to compute the likelihood of API linkages under that context. Such a layered model increases search accuracy.
2018-01-01
Background Many studies have tried to develop predictors for return-to-work (RTW). However, since complex factors have been demonstrated to predict RTW, it is difficult to use them practically. This study investigated whether factors used in previous studies could predict whether an individual had returned to his/her original work by four years after termination of the worker's recovery period. Methods An initial logistic regression analysis of 1,567 participants of the fourth Panel Study of Worker's Compensation Insurance yielded odds ratios. The participants were divided into two subsets, a training dataset and a test dataset. Using the training dataset, logistic regression, decision tree, random forest, and support vector machine models were established, and important variables of each model were identified. The predictive abilities of the different models were compared. Results The analysis showed that only earned income and company-related factors significantly affected return-to-original-work (RTOW). The random forest model showed the best accuracy among the tested machine learning models; however, the difference was not prominent. Conclusion It is possible to predict a worker's probability of RTOW using machine learning techniques with moderate accuracy. PMID:29736160
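A minimal sketch of this kind of model comparison with scikit-learn on synthetic data; the two predictors (standardized earned income and a company-size category) are hypothetical stand-ins for the survey variables, and the outcome is simulated.

```python
# Sketch: comparing logistic regression, random forest and SVM classifiers for a
# return-to-original-work style outcome. Predictors and data are hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1567
income = rng.normal(0, 1, n)                  # standardized earned income
company = rng.integers(0, 3, n)               # 0=small, 1=medium, 2=large employer
p_rtow = 1 / (1 + np.exp(-(0.8 * income + 0.5 * (company == 2) - 0.2)))
y = rng.binomial(1, p_rtow)                    # 1 = returned to original work
X = np.column_stack([income, company])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for name, model in [("logistic regression", LogisticRegression()),
                    ("random forest", RandomForestClassifier(random_state=0)),
                    ("support vector machine", SVC())]:
    model.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, model.predict(X_te)))
```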
Sedai, Suman; Garnavi, Rahil; Roy, Pallab; Xi Liang
2015-08-01
Multi-atlas segmentation first registers each atlas image to the target image and transfers the labels of the atlas image to the coordinate system of the target image. The transferred labels are then combined using a label fusion algorithm. In this paper, we propose a novel label fusion method that aggregates discriminative learning and generative modeling for the segmentation of cardiac MR images. First, a probabilistic random forest classifier is trained as a discriminative model to obtain the prior probability of a label at a given voxel of the target image. Then, a probability distribution of image patches is modeled using a Gaussian mixture model for each label, providing the likelihood of the voxel belonging to that label. The final label posterior is obtained by combining the classification score and the likelihood score under Bayes' rule. A comparative study performed on the MICCAI 2013 SATA Segmentation Challenge demonstrates that our proposed hybrid label fusion algorithm is more accurate than five other state-of-the-art label fusion methods. The proposed method obtains Dice similarity coefficients of 0.94 and 0.92 in segmenting the epicardium and endocardium, respectively. Moreover, our label fusion method achieves more accurate segmentation results compared to four other label fusion methods.
How many employees receive safety training during their first year of a new job?
Smith, Peter M; Mustard, Cameron A
2007-02-01
To describe the provision of safety training to Canadian employees, specifically those in their first year of employment with a new employer. Three repeated national Canadian cross-sectional surveys. 59 159 respondents from Statistics Canada's Workplace and Employee Surveys (1999, 2001 and 2003), 5671 who were in their first year of employment. Receiving occupational health and safety training, orientation training or office or non-office equipment training in either a classroom or on-the-job in the previous 12 months. Only 12% of women and 16% of men reported receiving safety training in the previous 12 months. Employees in their first 12 months of employment were more likely to receive safety training than employees with >5 years of job tenure. However, still only one in five new employees had received any safety training while with their current employer. In a fully adjusted regression model, employees who had access to family and support programs, women in medium-sized workplaces and in manufacturing, and men in large workplaces and in part-time employment all had an increased probability of receiving safety training. No increased likelihood of safety training was found in younger workers or those in jobs with higher physical demands, both of which are associated with increased injury risk. From our results, it would appear that only one in five Canadian employees in their first year of a new job received safety training. Further, the provision of safety training does not appear to be more prevalent among workers or in occupations with increased risk of injuries.
Wang, Yunfeng; Ma, Zhimin; Xu, Chaonan; Wang, ZiKun; Yang, Xinghua
2018-05-15
This study aimed to identify the rules of transition between normotension, prehypertension and hypertension and to establish a prediction model for the incidence of prehypertension and hypertension. Data from the China Health and Nutrition Survey from 1991 to 2009 were used as training data to develop the model. Data from the year 2011 were used for model validation. The multistate Markov model was developed using the msm package in R. A total of 5265 participants were included at baseline, with an average follow-up of 8.05 ± 5.27 years and 17 640 observations. The ratio of men to women was 1 : 1.17, and the mean age was 37.54 ± 13.80 years. Within 10 years, for men starting from normotension, the average probabilities of transitioning to prehypertension and hypertension were 34.5 and 35.25%, respectively; starting from prehypertension, the average probabilities of recovering to normotension and progressing to hypertension were 17.78 and 43.85%, respectively. In women, the corresponding average probabilities were 27.49, 28.09, 29.11 and 39.05%. Increasing fat consumption was found to be a protective factor, with a 4.5% lower rate of transition from normotension to prehypertension per quarter-percentage increase. The model showed very good predictive ability within 10 years and provided good prediction of blood pressure in the 2011 cohort (χ = 0.781, P = 0.676). The multistate Markov model can be a useful tool to identify the rules of transition among multiple states of blood pressure and to predict the prevalence of normotension, prehypertension and hypertension in cohort populations.
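The study fitted its model with the msm package in R; as a language-agnostic sketch of the underlying calculation, the snippet below (Python, with illustrative transition intensities rather than the fitted values) converts a transition intensity matrix over the three blood-pressure states into 10-year transition probabilities via the matrix exponential.

```python
# Sketch of the multistate Markov calculation: given a transition intensity matrix
# Q over (normotension, prehypertension, hypertension), the transition probability
# matrix over t years is expm(Q * t). Intensities are illustrative placeholders.
import numpy as np
from scipy.linalg import expm

# Rows sum to zero; off-diagonals are instantaneous transition rates per year.
Q = np.array([
    [-0.10,  0.08,  0.02],   # from normotension
    [ 0.03, -0.12,  0.09],   # from prehypertension (can recover or progress)
    [ 0.00,  0.00,  0.00],   # hypertension treated as absorbing in this sketch
])

P10 = expm(Q * 10.0)          # 10-year transition probabilities
states = ["normotension", "prehypertension", "hypertension"]
for i, row in enumerate(P10):
    print(states[i], np.round(row, 3))
```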
NASA Astrophysics Data System (ADS)
Bilalic, Rusmir
A novel application of support vector machines (SVMs), artificial neural networks (ANNs), and Gaussian processes (GPs) for machine learning (GPML) to model microcontroller unit (MCU) upset due to intentional electromagnetic interference (IEMI) is presented. In this approach, an MCU performs a counting operation (0-7) while electromagnetic interference in the form of a radio frequency (RF) pulse is directly injected into the MCU clock line. Injection times with respect to the clock signal are the clock-low, clock rising-edge, clock-high, and clock falling-edge periods of the clock window during which the MCU is performing initialization and executing the counting procedure. The intent is to cause disruption of the counting operation and to model the probability of effect (PoE) using machine learning tools. Five experiments were executed as part of this research, each of which contained a set of 38,300 training points and 38,300 test points, for a total of 383,000 points, with the following experiment variables: injection time with respect to the clock signal, injected RF power, injected RF pulse width, and injected RF frequency. For the 191,500 training points, the average training error was 12.47%, while for the 191,500 test points the average test error was 14.85%, meaning that on average the machine was able to predict MCU upset with 85.15% accuracy. Leaving out the results for the worst-performing model (SVM with a linear kernel), the test prediction accuracy for the remaining machines is almost 89%. All three machine learning methods (ANNs, SVMs, and GPML) showed excellent and consistent results in their ability to model and predict the PoE on an MCU due to IEMI. The GP approach performed best during training with a 7.43% average training error, while the ANN technique was most accurate during testing with a 10.80% error.
Guidelines for Enhancement of Visual Conspicuity of Trains at Grade Crossings
DOT National Transportation Integrated Search
1975-05-01
This report summarizes a comprehensive study of potential means of reducing the probability of train-motor vehicle collisions at railroad-highway grade crossings through enhancement of the visual conspicuity of locomotives. Passive techniques are rev...
Probability of Equivalence Formation: Familiar Stimuli and Training Sequence
ERIC Educational Resources Information Center
Arntzen, Erik
2004-01-01
The present study was conducted to show how responding in accord with equivalence relations changes as a function of position of familiar stimuli, pictures, and with the use of nonsense syllables in an MTO-training structure. Fifty college students were tested for responding in accord with equivalence in an AB, CB, DB, and EB training structure.…
ERIC Educational Resources Information Center
Majchrzak, Ann
A study was conducted of the training programs used by plants with computer-aided design/computer-aided manufacturing (CAD/CAM) to help their employees adapt to automated manufacturing. The study sought to determine the relative priorities of manufacturing establishments for training certain workers in certain skills; the status of…
Workplace Training Programs: Instruments for Human Capital Improvements or Screening Devices?
ERIC Educational Resources Information Center
Brunetti, Irene; Corsini, Lorenzo
2017-01-01
Purpose: The purpose of this paper is to analyse the effect of an Italian training program on the re-employment probability of young unemployed workers. The program consists exclusively of workplace training and is coordinated by employment centers, even if it is fully implemented by firms. Design/Methodology/Approach: The authors develop a…
Random Forests to Predict Rectal Toxicity Following Prostate Cancer Radiation Therapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ospina, Juan D.; INSERM, U1099, Rennes; Escuela de Estadística, Universidad Nacional de Colombia Sede Medellín, Medellín
2014-08-01
Purpose: To propose a random forest normal tissue complication probability (RF-NTCP) model to predict late rectal toxicity following prostate cancer radiation therapy, and to compare its performance to that of classic NTCP models. Methods and Materials: Clinical data and dose-volume histograms (DVH) were collected from 261 patients who received 3-dimensional conformal radiation therapy for prostate cancer with at least 5 years of follow-up. The series was split 1000 times into training and validation cohorts. An RF was trained to predict the risk of 5-year overall rectal toxicity and bleeding. Parameters of the Lyman-Kutcher-Burman (LKB) model were identified and a logistic regression model was fit. The performance of all the models was assessed by computing the area under the receiver operating characteristic curve (AUC). Results: The 5-year grade ≥2 overall rectal toxicity and grade ≥1 and grade ≥2 rectal bleeding rates were 16%, 25%, and 10%, respectively. Predictive capabilities were obtained using the RF-NTCP model for all 3 toxicity endpoints, in both the training and validation cohorts. Age and use of anticoagulants were found to be predictors of rectal bleeding. The AUC for RF-NTCP ranged from 0.66 to 0.76, depending on the toxicity endpoint. The AUC values for the LKB-NTCP were statistically significantly inferior, ranging from 0.62 to 0.69. Conclusions: The RF-NTCP model may be a useful new tool in predicting late rectal toxicity, including variables other than DVH, and thus appears to be a strong competitor to classic NTCP models.
Abbott-Anderson, Kristen; Gilmore-Bykovskyi, Andrea; Lyles, Annmarie A
The ability to successfully mentor others is an essential skill necessary for building and strengthening an infrastructure of well-prepared nurse faculty to accelerate advancements in nursing science. Mentoring is a fundamental part of the nurse faculty role, but new faculty are often unprepared to take on mentoring roles early in their academic career. Applied training in research mentoring initiated during doctor of philosophy (PhD) programs may better prepare future faculty to manage teaching and mentoring responsibilities earlier and with greater confidence. The unique opportunity exists for PhD students to engage in research mentoring with undergraduate nursing students, with probable benefits for both the mentor and the mentee. This manuscript uses Kram's temporal mentoring model as a guide to examine the training experiences of 3 PhD students mentoring undergraduate nursing students and discusses the benefits and challenges associated with these mentoring relationships. Collectively, these experiences provide preliminary support and guidance for the development and adoption of formal PhD mentor training programs to better prepare future PhD nursing faculty for their mentoring responsibilities. Copyright © 2016 Elsevier Inc. All rights reserved.
Close-To-Practice Assessment Of Meat Freshness With Metal Oxide Sensor Microarray Electronic Nose
DOE Office of Scientific and Technical Information (OSTI.GOV)
Musatov, V. Yu.; Sysoev, V. V.; Sommer, M.
In this report we estimate the ability of the KAMINA e-nose, based on a metal oxide sensor (MOS) microarray and Linear Discriminant Analysis (LDA) pattern recognition, to evaluate meat freshness. The results show that: 1) one or two exposures of standard meat samples to the e-nose are enough for the instrument to recognize fresh meat prepared by the same supplier with 100% probability; 2) meat samples of two kinds, stored at 4 °C and 25 °C, are mutually recognized at early stages of decay with the help of LDA models built independently during e-nose training on each kind of meat; 3) three to four training cycles of exposure to meat from different suppliers are necessary for the e-nose to build a reliable LDA model accounting for the supplier factor. This study confirms that the MOS e-nose is ready to be utilised in the food industry for evaluating product freshness. The e-nose performance is characterized by low training cost, confident recognition of various product decay conditions, and easy adjustment to changing conditions.
Summation and subtraction using a modified autoshaping procedure in pigeons.
Ploog, Bertram O
2008-06-01
A modified autoshaping paradigm (significantly different from those previously reported in the summation literature) was employed to allow for the simultaneous assessment of stimulus summation and subtraction in pigeons. The response requirements and the probability of food delivery were adjusted such that towards the end of training 12 of 48 trials ended in food delivery, the same proportion as under testing. Stimuli (outlines of squares of three sizes and colors: A, B, and C) were used that could be presented separately or in any combination of two or three stimuli. Twelve of the pigeons (summation groups) were trained with either A, B, and C or with AB, BC, and CA, and tested with ABC. The remaining 12 pigeons (subtraction groups) received training with ABC but were tested with A, B, and C or with AB, BC, and CA. These groups were further subdivided according to whether stimulus elements were presented either in a concentric or dispersed manner. Summation did not occur; subtraction occurred in the two concentric groups. For interpretation of the results, configural theory, the Rescorla-Wagner model, and the composite-stimulus control model were considered. The results suggest different mechanisms responsible for summation and subtraction.
Mixture EMOS model for calibrating ensemble forecasts of wind speed.
Baran, S; Lerch, S
2016-03-01
Ensemble model output statistics (EMOS) is a statistical tool for post-processing forecast ensembles of weather variables obtained from multiple runs of numerical weather prediction models in order to produce calibrated predictive probability density functions. The EMOS predictive probability density function is given by a parametric distribution with parameters depending on the ensemble forecasts. We propose an EMOS model for calibrating wind speed forecasts based on weighted mixtures of truncated normal (TN) and log-normal (LN) distributions, where the model parameters and component weights are estimated by optimizing the values of proper scoring rules over a rolling training period. The new model is tested on wind speed forecasts of the 50-member European Centre for Medium-Range Weather Forecasts ensemble, the 11-member Aire Limitée Adaptation dynamique Développement International-Hungary Ensemble Prediction System ensemble of the Hungarian Meteorological Service, and the eight-member University of Washington mesoscale ensemble, and its predictive performance is compared with that of various benchmark EMOS models based on single parametric families and combinations thereof. The results indicate improved calibration of probabilistic forecasts and improved accuracy of point forecasts in comparison with the raw ensemble and climatological forecasts. The mixture EMOS model significantly outperforms the TN and LN EMOS methods; moreover, it provides better calibrated forecasts than the TN-LN combination model and offers increased flexibility while avoiding covariate selection problems. © 2016 The Authors. Environmetrics published by John Wiley & Sons Ltd.
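A minimal sketch of what such a mixture predictive density looks like: a weighted combination of a truncated normal and a log-normal whose locations are linked to the ensemble mean. The coefficients below are illustrative stand-ins; in EMOS they would be estimated by optimizing a proper scoring rule such as the CRPS over a rolling training period.

```python
# Sketch of a two-component EMOS-style predictive density for wind speed.
# Coefficients are illustrative, not estimated from any ensemble data.
import numpy as np
from scipy.stats import truncnorm, lognorm

def mixture_pdf(y, ens_mean, w=0.6, a0=0.5, a1=0.9, sigma=1.5, s=0.4):
    """Weighted TN + LN predictive density, location linked to the ensemble mean."""
    mu = a0 + a1 * ens_mean
    # Truncated normal supported on [0, inf) with mean parameter mu.
    tn = truncnorm.pdf(y, a=-mu / sigma, b=np.inf, loc=mu, scale=sigma)
    # Log-normal whose median equals mu (scale parameter is the median).
    ln = lognorm.pdf(y, s=s, scale=mu if mu > 0 else 1e-6)
    return w * tn + (1 - w) * ln

y_grid = np.linspace(0.1, 20, 5)      # wind speeds in m/s at which to evaluate
print(mixture_pdf(y_grid, ens_mean=6.0))
```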
Dendritic sidebranching in the three-dimensional symmetric model in the presence of noise
NASA Technical Reports Server (NTRS)
Langer, J. S.
1987-01-01
The time-dependent behavior of sidebranching deformations in the three-dimensional symmetric model of dendritic solidification is studied within a WKB approximation. Localized wave packets generated by pulses in the neighborhood of the tip are found to grow in amplitude and to spread and stretch as they move down the sides of the dendrite. This behavior is shown to imply that noise in the solidifying medium is selectively amplified in such a way as to produce a fluctuating train of sidebranches in qualitative agreement with experimental observations. A rough estimate indicates that purely thermal noise is probably not quite strong enough to fit the data.
Surveillance system and method having an adaptive sequential probability fault detection test
NASA Technical Reports Server (NTRS)
Herzog, James P. (Inventor); Bickford, Randall L. (Inventor)
2005-01-01
System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.
Surveillance system and method having an adaptive sequential probability fault detection test
NASA Technical Reports Server (NTRS)
Bickford, Randall L. (Inventor); Herzog, James P. (Inventor)
2006-01-01
System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.
Surveillance System and Method having an Adaptive Sequential Probability Fault Detection Test
NASA Technical Reports Server (NTRS)
Bickford, Randall L. (Inventor); Herzog, James P. (Inventor)
2008-01-01
System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.
Trujillano, Javier; March, Jaume; Sorribas, Albert
2004-01-01
In clinical practice, there is increasing interest in obtaining adequate prediction models. Among the available alternatives, artificial neural networks (ANNs) are being used progressively more often. In this review we first introduce the ANN methodology, describing the most common type of ANN, the multilayer perceptron trained with the backpropagation algorithm (MLP). Then we compare the MLP with logistic regression (LR). Finally, we show a practical scheme for building an ANN-based application by means of an example with real data. The main advantage of ANNs is their capacity to incorporate nonlinear effects and interactions between the variables of the model without the need to include them a priori. Their main disadvantages are that their parameters are difficult to interpret and that their construction and training process is largely empirical. ANNs are useful for computing the probability of a given outcome based on a set of predictor variables. Furthermore, in some cases, they obtain better results than LR. Both methodologies, ANN and LR, are complementary and help us to obtain more valid models.
Ramachandran, Suchitra; Meyer, Travis; Olson, Carl R
2016-01-01
When monkeys view two images in fixed sequence repeatedly over days and weeks, neurons in area TE of the inferotemporal cortex come to exhibit prediction suppression. The trailing image elicits only a weak response when presented following the leading image that preceded it during training. Induction of prediction suppression might depend either on the contiguity of the images, as determined by their co-occurrence and captured in the measure of joint probability P(A,B), or on their contingency, as determined by their correlation and as captured in the measures of conditional probability P(A|B) and P(B|A). To distinguish between these possibilities, we measured prediction suppression after imposing training regimens that held P(A,B) constant but varied P(A|B) and P(B|A). We found that reducing either P(A|B) or P(B|A) during training attenuated prediction suppression as measured during subsequent testing. We conclude that prediction suppression depends on contingency, as embodied in the predictive relations between the images, and not just on contiguity, as embodied in their co-occurrence. Copyright © 2016 the American Physiological Society.
NASA Astrophysics Data System (ADS)
Morales-Esteban, A.; Martínez-Álvarez, F.; Reyes, J.
2013-05-01
A method to predict earthquakes in two of the seismogenic areas of the Iberian Peninsula, based on Artificial Neural Networks (ANNs), is presented in this paper. ANNs have been widely used in many fields, but only very few and very recent studies have been conducted on earthquake prediction. Two kinds of predictions are provided in this study: a) the probability of an earthquake of magnitude equal to or larger than a preset threshold occurring within the next 7 days; b) the probability of an earthquake within a limited magnitude interval occurring during the next 7 days. First, the physical fundamentals related to earthquake occurrence are explained. Second, the mathematical model underlying ANNs is explained and the chosen configuration is justified. Then, the ANNs were trained in both areas: the Alborán Sea and the Western Azores-Gibraltar fault. Later, the ANNs were tested in both areas for a period of time immediately subsequent to the training period. Statistical tests are provided showing meaningful results. Finally, the ANNs were compared to other well-known classifiers, showing quantitatively and qualitatively better results. The authors expect that the results obtained will encourage researchers to conduct further research on this topic. Highlights: development of a system capable of predicting earthquakes for the next seven days; application of ANNs is particularly reliable for earthquake prediction; use of geophysical information modeling the soil behavior as the ANNs' input data; successful analysis of one region with large seismic activity.
Zago, Myrka; Bosco, Gianfranco; Maffei, Vincenzo; Iosa, Marco; Ivanenko, Yuri P; Lacquaniti, Francesco
2005-02-01
We studied how subjects learn to deal with two conflicting sensory environments as a function of the probability of each environment and the temporal distance between repeated events. Subjects were asked to intercept a visual target moving downward on a screen with randomized laws of motion. We compared five protocols that differed in the probability of constant speed (0g) targets and accelerated (1g) targets. Probability ranged from 9 to 100%, and the time interval between consecutive repetitions of the same target ranged from about 1 to 20 min. We found that subjects systematically timed their responses consistent with the assumption of gravity effects, for both 1 and 0g trials. With training, subjects rapidly adapted to 0g targets by shifting the time of motor activation. Surprisingly, the adaptation rate was independent of both the probability of 0g targets and their temporal distance. Very few 0g trials sporadically interspersed as catch trials during immersive practice with 1g trials were sufficient for learning and consolidation in long-term memory, as verified by retesting after 24 h. We argue that the memory store for adapted states of the internal gravity model is triggered by individual events and can be sustained for prolonged periods of time separating sporadic repetitions. This form of event-related learning could depend on multiple-stage memory, with exponential rise and decay in the initial stages followed by a sample-and-hold module.
Optimizing one-shot learning with binary synapses.
Romani, Sandro; Amit, Daniel J; Amit, Yali
2008-08-01
A network of excitatory synapses trained with a conservative version of Hebbian learning is used as a model for recognizing the familiarity of thousands of once-seen stimuli from those never seen before. Such networks were initially proposed for modeling memory retrieval (selective delay activity). We show that the same framework allows the incorporation of both familiarity recognition and memory retrieval, and estimate the network's capacity. In the case of binary neurons, we extend the analysis of Amit and Fusi (1994) to obtain capacity limits based on computations of signal-to-noise ratio of the field difference between selective and non-selective neurons of learned signals. We show that with fast learning (potentiation probability approximately 1), the most recently learned patterns can be retrieved in working memory (selective delay activity). A much higher number of once-seen learned patterns elicit a realistic familiarity signal in the presence of an external field. With potentiation probability much less than 1 (slow learning), memory retrieval disappears, whereas familiarity recognition capacity is maintained at a similarly high level. This analysis is corroborated in simulations. For analog neurons, where such analysis is more difficult, we simplify the capacity analysis by studying the excess number of potentiated synapses above the steady-state distribution. In this framework, we derive the optimal constraint between potentiation and depression probabilities that maximizes the capacity.
Zhang, M M; Zheng, Y D; Liang, Y H
2018-02-18
To present a prognostic model for evaluating the outcome of root canal treatment in teeth with pulpitis or apical periodontitis 2 years after treatment. The implementation of this study was based on a retrospective study of the 2-year outcome of root canal treatment. A cohort of 360 teeth that received treatment and review was chosen to build up the total sample; 143 teeth with vital pulp and 217 teeth with apical periodontitis were included. About 67% of the samples were selected randomly to derive a training data set for modeling, and the rest were used as a validation data set for testing. Logistic regression models were used to produce the prognostic models. The dependent variable was defined as absence of periapical lesion or reduction of periapical lesion. The predictive ability of the models was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC). Four predictors were included in model one (absence of apical lesion): pre-operative periapical radiolucency, canal curvature, and density and apical extent of root fillings. The AUC was 0.802 (95%CI: 0.744-0.859), and the AUC for the testing data was 0.688. Only the density and apical extent of root fillings were included in model two (reduction of apical lesion). The AUCs for the training and testing data were 0.734 (95%CI: 0.612-0.856) and 0.681, respectively. As predicted by model one, the probability of absence of periapical lesion 2 years after endodontic treatment was 90% in pulpitis teeth with severe root canal curvature and adequate root canal fillings, but 51% in teeth with apical periodontitis. Using prognostic model two, in teeth with apical periodontitis the probability of detecting lesion reduction 2 years after treatment was 95% with adequate root fillings and 39% with inadequate root fillings. The pre-operative periapical status, canal curvature and quality of root canal treatment could be used to predict the 2-year outcome of root canal treatment.
Cerebral vessels segmentation for light-sheet microscopy image using convolutional neural networks
NASA Astrophysics Data System (ADS)
Hu, Chaoen; Hui, Hui; Wang, Shuo; Dong, Di; Liu, Xia; Yang, Xin; Tian, Jie
2017-03-01
Cerebral vessel segmentation is an important step in image analysis for brain function and brain disease studies. To extract all cerebrovascular patterns, including arteries and capillaries, filter-based methods are often used to segment vessels. However, the design of accurate and robust vessel segmentation algorithms is still challenging, due to the variety and complexity of images, especially in cerebral blood vessel segmentation. In this work, we addressed the problem of automatic and robust segmentation of cerebral micro-vessel structures in cerebrovascular images of mouse brain acquired with a light-sheet microscope. To segment micro-vessels in large-scale image data, we proposed a convolutional neural network (CNN) architecture trained on 1.58 million manually labeled pixels. Three convolutional layers and one fully connected layer were used in the CNN model. We extracted patches of 32x32 pixels from each acquired brain vessel image as the training data set to feed into the CNN for classification. The network was trained to output the probability that the center pixel of an input patch belongs to a vessel structure. To build the CNN architecture, a series of mouse brain vascular images acquired from a commercial light-sheet fluorescence microscopy (LSFM) system were used for training the model. The experimental results demonstrate that our approach is a promising method for effectively segmenting micro-vessel structures in cerebrovascular images with vessel-dense, non-uniform gray-level and long-scale contrast regions.
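A minimal sketch, in tf.keras, of a patch classifier in the spirit of the architecture described above: three convolutional layers plus one fully connected layer mapping a 32x32 patch to a vessel probability. The layer widths and training settings are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a 32x32 patch classifier with three conv layers and one dense layer,
# outputting P(center pixel belongs to a vessel). Hyperparameters are illustrative.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(center pixel is vessel)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# Training would then use labelled patches, e.g.:
# model.fit(patches, labels, batch_size=256, epochs=10, validation_split=0.1)
```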
Using beta binomials to estimate classification uncertainty for ensemble models.
Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin
2014-01-01
Quantitative structure-activity relationship (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprising logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the numbers of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
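A minimal sketch of how fitted distributions of this kind could be used: model the vote tallies of correct and erroneous ensemble classifications as two beta binomials and apply Bayes' rule to estimate the probability of error at each consensus level. The parameters below are invented for illustration, not fitted to the logP, Ames, or CYP2D6 data sets.

```python
# Sketch: estimating P(error | vote tally) from two beta-binomial distributions,
# one for correct and one for erroneous ensemble classifications. Parameters
# are illustrative placeholders, not fitted to any training pool.
import numpy as np
from scipy.stats import betabinom

n_networks = 25
correct = betabinom(n_networks, a=6.0, b=1.5)   # vote tallies of correct calls
error = betabinom(n_networks, a=2.0, b=2.0)     # vote tallies of erroneous calls
p_error_overall = 0.12                          # base misclassification rate

votes = np.arange(n_networks + 1)
# Bayes' rule: P(error | k positive votes)
num = p_error_overall * error.pmf(votes)
den = num + (1 - p_error_overall) * correct.pmf(votes)
p_error_given_votes = num / den

for k in (5, 13, 20, 25):
    print(f"{k:2d}/{n_networks} positive votes -> P(error) ~ {p_error_given_votes[k]:.2f}")
```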
Physiological Effects of Strength Training and Various Strength Training Devices.
ERIC Educational Resources Information Center
Wilmore, Jack H.
Current knowledge in the area of muscle physiology is a basis for a discussion on strength training programs. It is now recognized that the expression of strength is related to, but not dependent upon, the size of the muscle and is probably more related to the ability to recruit more muscle fibers in the contraction, or to better synchronize their…
Activity interference and noise annoyance
NASA Astrophysics Data System (ADS)
Hall, F. L.; Taylor, S. M.; Birnie, S. E.
1985-11-01
Debate continues over differences in the dose-response functions used to predict the annoyance at different sources of transportation noise. This debate reflects the lack of an accepted model of noise annoyance in residential communities. In this paper a model is proposed which is focussed on activity interference as a central component mediating the relationship between noise exposure and annoyance. This model represents a departure from earlier models in two important respects. First, single event noise levels (e.g., maximum levels, sound exposure level) constitute the noise exposure variables in place of long-term energy equivalent measures (e.g., 24-hour Leq or Ldn). Second, the relationships within the model are expressed as probabilistic rather than deterministic equations. The model has been tested by using acoustical and social survey data collected at 57 sites in the Toronto region exposed to aircraft, road traffic or train noise. Logit analysis was used to estimate two sets of equations. The first predicts the probability of activity interference as a function of event noise level. Four types of interference are included: indoor speech, outdoor speech, difficulty getting to sleep and awakening. The second set predicts the probability of annoyance as a function of the combination of activity interferences. From the first set of equations, it was possible to estimate a function for indoor speech interference only. In this case, the maximum event level was the strongest predictor. The lack of significant results for the other types of interference is explained by the limitations of the data. The same function predicts indoor speech interference for all three sources—road, rail and aircraft noise. The results for the second set of equations show strong relationships between activity interference and the probability of annoyance. Again, the parameters of the logit equations are similar for the three sources. A trial application of the model predicts a higher probability of annoyance for aircraft than for road traffic situations with the same 24-hour Leq. This result suggests that the model may account for previously reported source differences in annoyance.
Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris
2013-10-08
Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fan, J; Fan, J; Hu, W
Purpose: To develop a fast automatic algorithm based on the two dimensional kernel density estimation (2D KDE) to predict the dose-volume histogram (DVH) which can be employed for the investigation of radiotherapy quality assurance and automatic treatment planning. Methods: We propose a machine learning method that uses previous treatment plans to predict the DVH. The key to the approach is the framing of the DVH in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of the dose and the predictive features. The joint distribution provides an estimation of the conditional probability of the dose given the values of the predictive features. For the new patient, the prediction consists of estimating the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimation of the DVH. The 2D KDE is implemented to predict the joint probability distribution of the training set and the distribution of the predictive features for the new patient. Two variables, the signed minimal distance from each OAR (organ at risk) voxel to the target boundary and its opening angle with respect to the origin of the voxel coordinate system, are considered as the predictive features to represent the OAR-target spatial relationship. The feasibility of our method has been demonstrated with rectum, breast and head-and-neck cancer cases by comparing the predicted DVHs with the planned ones. Results: Consistent results were found between these two DVHs for each cancer type, and the average of the relative point-wise differences is about 5%, within the clinically acceptable extent. Conclusion: According to the results of this study, our method can be used to predict a clinically acceptable DVH and has the ability to evaluate the quality and consistency of treatment planning.
Quantum-assisted learning of graphical models with arbitrary pairwise connectivity
NASA Astrophysics Data System (ADS)
Realpe-Gómez, John; Benedetti, Marcello; Biswas, Rupak; Perdomo-Ortiz, Alejandro
Mainstream machine learning techniques rely heavily on sampling from generally intractable probability distributions. There is increasing interest in the potential advantages of using quantum computing technologies as sampling engines to speed up these tasks. However, some pressing challenges in state-of-the-art quantum annealers have to be overcome before we can assess their actual performance. The sparse connectivity, resulting from the local interaction between quantum bits in physical hardware implementations, is considered the most severe limitation on the quality of the machine learning models that can be constructed. Here we show how to surpass this 'curse of limited connectivity' bottleneck and illustrate our findings by training probabilistic generative models with arbitrary pairwise connectivity on a real dataset of handwritten digits and two synthetic datasets in experiments with up to 940 quantum bits. Our model can be trained in quantum hardware without full knowledge of the effective parameters specifying the corresponding Boltzmann-like distribution. Therefore, the need to infer the effective temperature at each iteration is avoided, speeding up learning, and the effect of noise in the control parameters is mitigated, improving accuracy. This work was supported in part by NASA, AFRL, ODNI, and IARPA.
Predicting Seagrass Occurrence in a Changing Climate Using Random Forests
NASA Astrophysics Data System (ADS)
Aydin, O.; Butler, K. A.
2017-12-01
Seagrasses are marine plants that can quickly sequester vast amounts of carbon (up to 100 times more, and 12 times faster, than tropical forests). In this work, we present an integrated GIS and machine learning approach to build a data-driven model of seagrass presence-absence. We outline a random forest approach that avoids the prevalence bias in many ecological presence-absence models. One of our goals is to predict global seagrass occurrence from a spatially limited training sample. In addition, we conduct a sensitivity study that investigates the vulnerability of seagrass to changing climate conditions. We integrate multiple data sources, including fine-scale seagrass data from MarineCadastre.gov and the recently released, globally extensive and publicly available Ecological Marine Units (EMU) dataset. These data are used to train a model for seagrass occurrence along the U.S. coast. In situ ocean data are interpolated using Empirical Bayesian Kriging (EBK) to produce globally extensive prediction variables. A neural network is used to estimate probable future values of prediction variables, such as ocean temperature, to assess the impact of a warming climate on seagrass occurrence. The proposed workflow can be generalized to many presence-absence models.
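A rough sketch of the prevalence-robust random forest idea in scikit-learn follows; the predictor variables, prevalence level, and the class_weight setting are illustrative assumptions, not details taken from the study.

```python
# Minimal sketch of a presence-absence random forest that guards against
# prevalence bias by balancing the two classes (assumed workflow, not the
# authors' actual code; all data below are synthetic).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical predictors: temperature, salinity, depth, light attenuation.
X = rng.normal(size=(5000, 4))
# Hypothetical presence labels with low prevalence (~10% presence).
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=5000) > 1.6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced_subsample" re-weights each bootstrap sample so the
# rare "presence" class is not swamped by the abundant "absence" class.
rf = RandomForestClassifier(n_estimators=300,
                            class_weight="balanced_subsample",
                            random_state=0)
rf.fit(X_train, y_train)

# Probability of seagrass presence for new locations.
p_presence = rf.predict_proba(X_test)[:, 1]
print("mean predicted presence probability:", p_presence.mean())
```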
Semantic Image Segmentation with Contextual Hierarchical Models.
Seyedhosseini, Mojtaba; Tasdizen, Tolga
2016-05-01
Semantic segmentation is the problem of assigning an object label to each pixel. It unifies the image segmentation and object recognition problems. The importance of using contextual information in semantic segmentation frameworks has been widely realized in the field. We propose a contextual framework, called the contextual hierarchical model (CHM), which learns contextual information in a hierarchical framework for semantic segmentation. At each level of the hierarchy, a classifier is trained based on downsampled input images and outputs of previous levels. Our model then incorporates the resulting multi-resolution contextual information into a classifier to segment the input image at the original resolution. This training strategy allows for optimization of a joint posterior probability at multiple resolutions through the hierarchy. The contextual hierarchical model is purely based on the input image patches and does not make use of any fragments or shape examples. Hence, it is applicable to a variety of problems such as object segmentation and edge detection. We demonstrate that CHM performs on par with the state of the art on the Stanford background and Weizmann horse datasets. It also outperforms state-of-the-art edge detection methods on the NYU depth dataset and achieves state-of-the-art results on the Berkeley segmentation dataset (BSDS 500).
Counseling Psychology Doctoral Trainees' Satisfaction with Clinical Methods Training
ERIC Educational Resources Information Center
Menke, Kristen Ann
2015-01-01
Counseling psychology doctoral trainees' satisfaction with their clinical methods training is an important predictor of their self-efficacy as counselors, persistence in graduate programs, and probability of practicing psychotherapy in their careers (Fernando & Hulse-Killacky, 2005; Hadjipavlou & Ogrodniczuk, 2007; Morton & Worthley,…
Wheeler, David C.; Burstyn, Igor; Vermeulen, Roel; Yu, Kai; Shortreed, Susan M.; Pronk, Anjoeka; Stewart, Patricia A.; Colt, Joanne S.; Baris, Dalsu; Karagas, Margaret R.; Schwenn, Molly; Johnson, Alison; Silverman, Debra T.; Friesen, Melissa C.
2014-01-01
Objectives Evaluating occupational exposures in population-based case-control studies often requires exposure assessors to review each study participant's reported occupational information job by job to derive exposure estimates. Although such assessments likely have underlying decision rules, they usually lack transparency, are time-consuming, and have uncertain reliability and validity. We aimed to identify the underlying rules to enable documentation, review, and future use of these expert-based exposure decisions. Methods Classification and regression trees (CART, predictions from a single tree) and random forests (predictions from many trees) were used to identify the underlying rules from the questionnaire responses and an expert's exposure assignments for occupational diesel exhaust exposure for several metrics: binary exposure probability and ordinal exposure probability, intensity, and frequency. Data were split into training (n=10,488 jobs), testing (n=2,247), and validation (n=2,248) data sets. Results The CART and random forest models' predictions agreed with 92–94% of the expert's binary probability assignments. For the ordinal probability, intensity, and frequency metrics, the two models extracted decision rules more successfully for unexposed and highly exposed jobs (86–90% and 57–85%, respectively) than for low or medium exposed jobs (7–71%). Conclusions CART and random forest models extracted decision rules and accurately predicted an expert's exposure decisions for the majority of jobs, and identified questionnaire response patterns that would require further expert review if the rules were applied to other jobs in the same or a different study. This approach makes the exposure assessment process in case-control studies more transparent and creates a mechanism to efficiently replicate exposure decisions in future studies. PMID:23155187
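The following minimal sketch shows how a single tree (CART) and a random forest could be fit to an expert's binary exposure assignments and how the tree can be printed as explicit, documentable rules; the questionnaire features, the stand-in "expert rule", and the split sizes are assumptions for illustration only.

```python
# Sketch: learn an expert's binary exposure decisions from questionnaire
# responses with a single tree (CART) and a random forest (assumed setup).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Hypothetical coded questionnaire responses (e.g., job title group,
# self-reported diesel use, industry code) and the expert's 0/1 assignment.
X = rng.integers(0, 5, size=(14983, 6)).astype(float)
expert = ((X[:, 1] >= 3) & (X[:, 4] >= 2)).astype(int)   # stand-in "expert rule"

X_trn, X_tmp, y_trn, y_tmp = train_test_split(X, expert, test_size=0.3, random_state=1)
X_tst, X_val, y_tst, y_val = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=1)

cart = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X_trn, y_trn)
forest = RandomForestClassifier(n_estimators=500, random_state=1).fit(X_trn, y_trn)

print("CART agreement with expert:", cart.score(X_tst, y_tst))
print("Forest agreement with expert:", forest.score(X_tst, y_tst))
# The single tree can be printed as explicit decision rules for review.
print(export_text(cart, feature_names=[f"q{i}" for i in range(6)]))
```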
Dopaminergic modulation of reward-guided decision making in alcohol-preferring AA rats.
Oinio, Ville; Bäckström, Pia; Uhari-Väänänen, Johanna; Raasmaja, Atso; Piepponen, Petteri; Kiianmaa, Kalervo
2017-05-30
Results from animal gambling models have highlighted the importance of dopaminergic neurotransmission in modulating decision making when large sucrose rewards are combined with uncertainty. The majority of these models use food restriction as a tool to motivate animals to accomplish operant behavioral tasks in which sucrose is used as a reward. As enhanced motivation to obtain sucrose due to hunger may affect its reward-seeking effect, we wanted to examine the decision-making behavior of rats in a situation where rats were fed ad libitum. For this purpose, we chose alcohol-preferring AA (Alko Alcohol) rats, as these rats have been shown to have a high preference for sweet agents. In the present study, AA rats were trained to self-administer sucrose pellet rewards in a two-lever choice task (one pellet vs. three pellets). Once rational choice behavior had been established, the probability of gaining three pellets was decreased over time (50%, 33%, 25%, then 20%). The effect of d-amphetamine on decision making was studied at every probability level, as well as the effect of the dopamine D1 receptor agonist SKF-81297 and the D2 agonist quinpirole at probability levels of 100% and 25%. d-Amphetamine increased unprofitable choices in a dose-dependent manner at the two lowest probability levels. Quinpirole increased the frequency of unprofitable decisions at the 25% probability level, and SKF-81297 did not affect choice behavior. These results mirror the findings of probabilistic discounting studies using food-restricted rats. Based on this, the use of AA rats provides a new approach for studies on reward-guided decision making. Copyright © 2017 Elsevier B.V. All rights reserved.
How many employees receive safety training during their first year of a new job?
Smith, Peter M; Mustard, Cameron A
2007-01-01
Objective To describe the provision of safety training to Canadian employees, specifically those in their first year of employment with a new employer. Design Three repeated national Canadian cross‐sectional surveys. Subjects 59 159 respondents from Statistics Canada's Workplace and Employee Surveys (1999, 2001 and 2003), 5671 who were in their first year of employment. Main outcome Receiving occupational health and safety training, orientation training or office or non‐office equipment training in either a classroom or on‐the‐job in the previous 12 months. Results Only 12% of women and 16% of men reported receiving safety training in the previous 12 months. Employees in their first 12 months of employment were more likely to receive safety training than employees with >5 years of job tenure. However, still only one in five new employees had received any safety training while with their current employer. In a fully adjusted regression model, employees who had access to family and support programs, women in medium‐sized workplaces and in manufacturing, and men in large workplaces and in part‐time employment all had an increased probability of receiving safety training. No increased likelihood of safety training was found in younger workers or those in jobs with higher physical demands, both of which are associated with increased injury risk. Conclusions From our results, it would appear that only one in five Canadian employees in their first year of a new job received safety training. Further, the provision of safety training does not appear to be more prevalent among workers or in occupations with increased risk of injuries. PMID:17296687
Pérez, Omar D; Aitken, Michael R F; Zhukovsky, Peter; Soto, Fabián A; Urcelay, Gonzalo P; Dickinson, Anthony
2016-12-15
Associative learning theories regard the probability of reinforcement as the critical factor determining responding. However, the role of this factor in instrumental conditioning is not completely clear. In fact, free-operant experiments show that participants respond at a higher rate on variable ratio than on variable interval schedules even though the reinforcement probability is matched between the schedules. This difference has been attributed to the differential reinforcement of long inter-response times (IRTs) by interval schedules, which acts to slow responding. In the present study, we used a novel experimental design to investigate human responding under random ratio (RR) and regulated probability interval (RPI) schedules, a type of interval schedule that sets a reinforcement probability independently of the IRT duration. Participants responded on each type of schedule before a final choice test in which they distributed responding between two schedules similar to those experienced during training. Although response rates did not differ during training, the participants responded at a lower rate on the RPI schedule than on the matched RR schedule during the choice test. This preference cannot be attributed to a higher probability of reinforcement for long IRTs and questions the idea that similar associative processes underlie classical and instrumental conditioning.
Assessing the potential for improving S2S forecast skill through multimodel ensembling
NASA Astrophysics Data System (ADS)
Vigaud, N.; Robertson, A. W.; Tippett, M. K.; Wang, L.; Bell, M. J.
2016-12-01
Non-linear logistic regression is well suited to probability forecasting and has been successfully applied in the past to ensemble weather and climate predictions, providing access to the full probability distribution without any Gaussian assumption. However, little work has been done at sub-monthly lead times, where relatively small re-forecast ensemble sizes and lengths represent new challenges for which post-processing avenues have yet to be investigated. A promising approach consists in extending the definition of non-linear logistic regression by including the quantile of the forecast distribution as one of the predictors. So-called Extended Logistic Regression (ELR), which enables mutually consistent individual threshold probabilities, is here applied to ECMWF, CFSv2 and CMA re-forecasts from the S2S database in order to produce rainfall probabilities at weekly resolution. The ELR model is trained on seasonally-varying tercile categories computed for lead times of 1 to 4 weeks. It is then tested in a cross-validated manner, i.e. allowing real-time predictability applications, to produce rainfall tercile probabilities from individual weekly hindcasts that are finally combined by equal pooling. Results will be discussed over a broader North American region, where individual and MME forecasts generated out to 4 weeks lead time are characterized by good probabilistic reliability but low sharpness, exhibiting systematically more skill in winter than in summer.
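A minimal sketch of the extended logistic regression idea follows, in which the threshold itself enters as a predictor so that a single fitted model gives mutually consistent probabilities for any tercile; the synthetic rainfall data, the square-root transform, and the predictor choice are assumptions, not the S2S configuration.

```python
# Sketch of extended logistic regression (ELR): the tercile threshold itself
# enters as a predictor, so one fitted model yields mutually consistent
# probabilities for any threshold (synthetic data, not S2S re-forecasts; the
# square-root transform is a common but assumed choice).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

n_weeks = 2000
ens_mean = rng.gamma(shape=2.0, scale=5.0, size=n_weeks)      # ensemble-mean rainfall
obs = np.clip(ens_mean + rng.normal(scale=4.0, size=n_weeks), 0.0, None)
terciles = np.quantile(obs, [1 / 3, 2 / 3])                   # climatological terciles

# One training row per (forecast, threshold) pair; label = 1 if obs <= threshold.
rows, labels = [], []
for q in terciles:
    rows.append(np.column_stack([np.sqrt(ens_mean), np.full(n_weeks, np.sqrt(q))]))
    labels.append((obs <= q).astype(int))
X = np.vstack(rows)
y = np.concatenate(labels)

elr = LogisticRegression().fit(X, y)

# Tercile probabilities for a new forecast with ensemble mean 12 mm/week.
f_new = np.sqrt(12.0)
p_le = [elr.predict_proba([[f_new, np.sqrt(q)]])[0, 1] for q in terciles]
print("below:", p_le[0], "middle:", p_le[1] - p_le[0], "above:", 1 - p_le[1])
```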
Supervised Detection of Anomalous Light Curves in Massive Astronomical Catalogs
NASA Astrophysics Data System (ADS)
Nun, Isadora; Pichara, Karim; Protopapas, Pavlos; Kim, Dae-Won
2014-09-01
The development of synoptic sky surveys has led to a massive amount of data for which the resources needed for analysis are beyond human capabilities. To process this information and to extract all possible knowledge, machine learning techniques become necessary. Here we present a new methodology to automatically discover unknown variable objects in large astronomical catalogs. With the aim of taking full advantage of all the information we have about known objects, our method is based on a supervised algorithm. In particular, we train a random forest classifier using known variability classes of objects and obtain votes for each of the objects in the training set. We then model this voting distribution with a Bayesian network and obtain the joint voting distribution among the training objects. Consequently, an unknown object is considered an outlier insofar as it has a low joint probability. By leaving out one of the classes in the training set, we perform a validity test and show that when the random forest classifier attempts to classify unknown light curves (the class left out), it votes with an unusual distribution among the classes. This rare voting is detected by the Bayesian network and expressed as a low joint probability. Our method is suitable for exploring massive data sets given that the training process is performed offline. We tested our algorithm on 20 million light curves from the MACHO catalog and generated a list of anomalous candidates. After analysis, we divided the candidates into two main classes of outliers: artifacts and intrinsic outliers. Artifacts were principally due to air mass variation, seasonal variation, bad calibration, or instrumental errors and were consequently removed from our outlier list and added to the training set. After retraining, we selected about 4000 objects, which we passed to a post-analysis stage by performing a cross-match with all publicly available catalogs. Within these candidates we identified certain known but rare objects such as eclipsing Cepheids, blue variables, cataclysmic variables, and X-ray sources. For some outliers there was no additional information. Among them we identified three unknown variability types and a few individual outliers that will be followed up in order to perform a deeper analysis.
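A compact sketch of the vote-distribution idea follows; a Gaussian mixture stands in for the Bayesian network used by the authors, and the light-curve features are synthetic.

```python
# Sketch of the vote-distribution idea: train a random forest on known classes,
# model the distribution of its per-class vote fractions on the training set,
# and flag objects whose votes have low joint probability (a Gaussian mixture
# stands in for the paper's Bayesian network; all data are synthetic).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture

X, y = make_classification(n_samples=3000, n_features=20, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=3)

rf = RandomForestClassifier(n_estimators=400, random_state=3).fit(X, y)

votes_train = rf.predict_proba(X)              # per-class vote fractions
density = GaussianMixture(n_components=6, random_state=3).fit(votes_train)

# A new object: score its votes; unusually low log-density suggests a class
# the forest has never seen (candidate anomaly).
x_new = np.random.default_rng(3).normal(size=(1, 20)) * 4.0
log_p = density.score_samples(rf.predict_proba(x_new))
threshold = np.quantile(density.score_samples(votes_train), 0.01)
print("anomaly" if log_p[0] < threshold else "known-class object")
```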
Refinement of a Method for Identifying Probable Archaeological Sites from Remotely Sensed Data
NASA Technical Reports Server (NTRS)
Tilton, James C.; Comer, Douglas C.; Priebe, Carey E.; Sussman, Daniel; Chen, Li
2012-01-01
To facilitate locating archaeological sites before they are compromised or destroyed, we are developing approaches for generating maps of probable archaeological sites, through detecting subtle anomalies in vegetative cover, soil chemistry, and soil moisture by analyzing remotely sensed data from multiple sources. We previously reported some success in this effort with a statistical analysis of slope, radar, and Ikonos data (including tasseled cap and NDVI transforms) with Student's t-test. We report here on new developments in our work, performing an analysis of 8-band multispectral WorldView-2 data. The WorldView-2 analysis begins by computing medians and median absolute deviations for the pixels in various annuli around each site of interest on the 28 band-difference ratios. We then use principal component analysis followed by linear discriminant analysis to train a classifier which assigns a posterior probability that a location is an archaeological site. We tested the procedure using leave-one-out cross validation, with a second leave-one-out step to choose parameters, on a 9,859x23,000 subset of the WorldView-2 data over the western portion of Ft. Irwin, CA, USA. We used 100 known non-sites and trained one classifier for lithic sites (n=33) and one classifier for habitation sites (n=16). We then analyzed convex combinations of the scores from the Archaeological Predictive Model (APM) and our scores. We found that the combined scores had a higher area under the ROC curve than either individual method, indicating that including WorldView-2 data in the analysis improved the predictive power of the provided APM.
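A small sketch of the classification step described above (PCA followed by linear discriminant analysis, with leave-one-out cross-validated posterior probabilities); the synthetic features merely stand in for the band-difference-ratio statistics, and the class sizes follow the lithic-site example.

```python
# Sketch of the site-classification step: PCA for dimensionality reduction
# followed by linear discriminant analysis, whose predict_proba output serves
# as the posterior probability that a location is an archaeological site
# (synthetic features; the band-ratio extraction is not reproduced here).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(4)

# 100 known non-sites and 33 known lithic sites, each described by 28
# hypothetical band-difference-ratio statistics.
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 28)),
               rng.normal(0.6, 1.0, size=(33, 28))])
y = np.array([0] * 100 + [1] * 33)

clf = make_pipeline(PCA(n_components=10),
                    LinearDiscriminantAnalysis())

# Leave-one-out cross-validated posterior probabilities of "site".
posterior = cross_val_predict(clf, X, y, cv=LeaveOneOut(), method="predict_proba")[:, 1]
print("mean posterior for true sites:", posterior[y == 1].mean())
print("mean posterior for non-sites:", posterior[y == 0].mean())
```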
Bayesian Estimation of Small Effects in Exercise and Sports Science.
Mengersen, Kerrie L; Drovandi, Christopher C; Robert, Christian P; Pyne, David B; Gore, Christopher J
2016-01-01
The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and, in a case study example, provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL) and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and to be able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy, and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a placebo. The conclusions are consistent with those obtained using a 'magnitude-based inference' approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.
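As a toy illustration of the kind of probabilistic statement reported above, a conjugate normal model gives the posterior probability that a treatment difference exceeds a smallest worthwhile change; the numbers and the prior below are invented, and the study itself used MCMC rather than this closed-form shortcut.

```python
# Toy conjugate-normal calculation of P(effect > smallest worthwhile change);
# the difference, its standard error, the prior, and the SWC are all invented.
import numpy as np
from scipy import stats

diff_obs, se_obs = 3.0, 1.5        # hypothetical LHTL-minus-IHE difference and SE (%)
prior_mean, prior_sd = 0.0, 10.0   # weakly informative prior on the true difference

post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / se_obs**2)
post_mean = post_var * (prior_mean / prior_sd**2 + diff_obs / se_obs**2)
post_sd = np.sqrt(post_var)

swc = 1.0                          # smallest worthwhile change (%)
p_substantial = 1.0 - stats.norm.cdf(swc, loc=post_mean, scale=post_sd)
print(f"P(effect > {swc}%) = {p_substantial:.2f}")
```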
NASA Astrophysics Data System (ADS)
Nicolae, Doina; Talianu, Camelia; Vasilescu, Jeni; Nicolae, Victor; Stachlewska, Iwona S.
2018-04-01
A Python code was developed to automatically retrieve the aerosol type (and its predominant component in the mixture) from EARLINET's 3 backscatter and 2 extinction data. The typing relies on Artificial Neural Networks which are trained to identify the most probable aerosol type from a set of mean-layer intensive optical parameters. This paper presents the use and limitations of the code with respect to the quality of the input lidar profiles, as well as the assumptions made in the aerosol model.
Dose-volume histogram prediction using density estimation.
Skarpman Munter, Johanna; Sjölund, Jens
2015-09-07
Knowledge of what dose-volume histograms can be expected for a previously unseen patient could increase consistency and quality in radiotherapy treatment planning. We propose a machine learning method that uses previous treatment plans to predict such dose-volume histograms. The key to the approach is the framing of dose-volume histograms in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of some predictive features and the dose. The joint distribution immediately provides an estimate of the conditional probability of the dose given the values of the predictive features. The prediction consists of estimating, from the new patient, the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimate of the dose-volume histogram. To illustrate how the proposed method relates to previously proposed methods, we use the signed distance to the target boundary as a single predictive feature. As a proof-of-concept, we predicted dose-volume histograms for the brainstems of 22 acoustic schwannoma patients treated with stereotactic radiosurgery, and for the lungs of 9 lung cancer patients treated with stereotactic body radiation therapy. Comparing with two previous attempts at dose-volume histogram prediction we find that, given the same input data, the predictions are similar. In summary, we propose a method for dose-volume histogram prediction that exploits the intrinsic probabilistic properties of dose-volume histograms. We argue that the proposed method makes up for some deficiencies in previously proposed methods, thereby potentially increasing ease of use, flexibility and ability to perform well with small amounts of training data.
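A minimal sketch of this workflow follows, assuming a single predictive feature (signed distance to the target boundary) and a scipy Gaussian KDE; the grid sizes, the synthetic dose fall-off, and the binning choices are assumptions made only for illustration.

```python
# Minimal sketch of the density-estimation route to a DVH: fit a 2-D KDE of
# (distance, dose) on training-patient voxels, form the conditional dose
# density given distance, marginalize over the new patient's distance
# distribution, and integrate to a cumulative DVH (all data are synthetic).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

# Training voxels pooled over previous patients: (signed distance [mm], dose [Gy]).
dist_train = rng.uniform(0, 40, size=5000)
dose_train = np.clip(60 * np.exp(-dist_train / 12) + rng.normal(0, 3, 5000), 0, None)
kde = gaussian_kde(np.vstack([dist_train, dose_train]))

dose_grid = np.linspace(0, 70, 141)
dist_grid = np.linspace(0, 40, 81)

# p(dose | distance) on a grid, obtained from the joint KDE.
joint = kde(np.array([np.repeat(dist_grid, dose_grid.size),
                      np.tile(dose_grid, dist_grid.size)])).reshape(dist_grid.size,
                                                                    dose_grid.size)
cond = joint / joint.sum(axis=1, keepdims=True)

# New patient: only the OAR geometry (its distance distribution) is known.
dist_new = rng.uniform(0, 40, size=5000)
hist, _ = np.histogram(dist_new, bins=dist_grid.size, range=(0, 40))
w = hist / hist.sum()

dose_density = w @ cond                  # marginal predicted dose density
dvh = 1.0 - np.cumsum(dose_density)      # fraction of volume receiving >= dose
print("predicted fraction of volume above 20 Gy:",
      dvh[np.searchsorted(dose_grid, 20)])
```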
Brain-wave Dynamics Related to Cognitive Tasks and Neurofeedback Information Flow
NASA Astrophysics Data System (ADS)
Pop-Jordanova, Nada; Pop-Jordanov, Jordan; Dimitrovski, Darko; Markovska, Natasa
2003-08-01
Synchronization of oscillating neuronal discharges has recently been correlated to the moment of perception and the ensuing motor response, with the transition between these two cognitive acts occurring "through cellular mechanisms that remain to be established" [1]. Last year, using genetic strategies, it was found that switching off persistent electric activity in the brain blocks memory recall [2]. On the other hand, analyzing the mental-neural information flow, the Nobel laureate Eccles formulated the fundamental hypothesis that mental events may change the probability of quantum vesicular emissions of transmitters analogously to the probability functions of quantum mechanics [3]. Applying advanced quantum modeling to molecular rotational states exposed to electric activity in brain cells, we found that the probability of transitions does not depend on the field amplitude, suggesting the electric field frequency as the possible information-bearing physical quantity [4]. In this paper, an attempt is made to inter-correlate the above results on the frequency aspects of neural transitions induced by cognitive tasks. Furthermore, considering the consecutive steps of the mental-neural information flow during biofeedback training to normalize EEG frequencies, the rationales for neurofeedback efficiency have been deduced.
NASA Astrophysics Data System (ADS)
Piretzidis, Dimitrios; Sra, Gurveer; Karantaidis, George; Sideris, Michael G.
2017-04-01
A new method for identifying correlated errors in Gravity Recovery and Climate Experiment (GRACE) monthly harmonic coefficients has been developed and tested. Correlated errors are present in the differences between monthly GRACE solutions, and can be suppressed using a de-correlation filter. In principle, the de-correlation filter should be implemented only on coefficient series with correlated errors to avoid losing useful geophysical information. In previous studies, two main methods of implementing the de-correlation filter have been utilized. In the first one, the de-correlation filter is implemented starting from a specific minimum order up to the maximum order of the monthly solution examined. In the second one, the de-correlation filter is implemented only on specific coefficient series, the selection of which is based on statistical testing. The method proposed in the present study exploits the capabilities of supervised machine learning algorithms such as neural networks and support vector machines (SVMs). The pattern of correlated errors can be described by several numerical and geometric features of the harmonic coefficient series. The features of extreme cases of both correlated and uncorrelated coefficients are extracted and used for the training of the machine learning algorithms. The trained machine learning algorithms are later used to identify correlated errors and provide the probability of a coefficient series being correlated. Regarding the SVM algorithms, an extensive study is performed with various kernel functions in order to find the optimal training model for prediction. The selection of the optimal training model is based on the classification accuracy of the trained SVM algorithm on the same samples used for training. Results show excellent performance of all algorithms, with a classification accuracy of 97%-100% on a pre-selected set of training samples, both in the validation stage of the training procedure and in the subsequent use of the trained algorithms to classify independent coefficients. This accuracy is also confirmed by the external validation of the trained algorithms using the hydrology model GLDAS NOAH. The proposed method meets the requirement of identifying and de-correlating only coefficients with correlated errors. Also, there is no need to apply statistical testing or other techniques that require prior de-correlation of the harmonic coefficients.
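A small sketch of the coefficient-series classification step follows; the features, the synthetic series, and the kernel-selection criterion (training-set accuracy, as described above) are simplified stand-ins rather than the actual GRACE processing.

```python
# Sketch of the series-classification step: extract simple numeric features
# from each coefficient series, train SVMs with several kernels, pick the best
# by training-set accuracy, and use predict_proba as the probability that a
# series contains correlated errors (features and data are synthetic).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(6)

def features(series):
    """Hypothetical numeric descriptors of a coefficient series."""
    diffs = np.diff(series)
    return [series.std(), np.abs(diffs).mean(),
            np.corrcoef(series[:-1], series[1:])[0, 1]]

# Synthetic "correlated" (random-walk-like) vs "uncorrelated" (white) series.
correlated = [np.cumsum(rng.normal(size=60)) for _ in range(200)]
uncorrelated = [rng.normal(size=60) for _ in range(200)]
X = np.array([features(s) for s in correlated + uncorrelated])
y = np.array([1] * 200 + [0] * 200)

best = max((SVC(kernel=k, probability=True, random_state=6).fit(X, y)
            for k in ("linear", "rbf", "poly")),
           key=lambda m: m.score(X, y))
print("chosen kernel:", best.kernel)
print("P(correlated) for a white-noise series:",
      best.predict_proba([features(rng.normal(size=60))])[0, 1])
```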
Generative adversarial network based telecom fraud detection at the receiving bank.
Zheng, Yu-Jun; Zhou, Xiao-Han; Sheng, Wei-Guo; Xue, Yu; Chen, Sheng-Yong
2018-06-01
Recently telecom fraud has become a serious problem, especially in developing countries such as China. At present, it can be very difficult to coordinate different agencies to prevent fraud completely. In this paper we study how to detect, at the receiving bank, large transfers that are sent from victims deceived by fraudsters. We propose a new generative adversarial network (GAN) based model to calculate for each large transfer a probability that it is fraudulent, such that the bank can take appropriate measures to prevent potential fraudsters from taking the money if the probability exceeds a threshold. The inference model uses a deep denoising autoencoder to effectively learn the complex probabilistic relationship among the input features, and employs adversarial training that establishes a minimax game between a discriminator and a generator to accurately discriminate between positive samples and negative samples in the data distribution. We show that the model outperforms a set of well-known classification methods in experiments, and its applications in two commercial banks have reduced losses of about 10 million RMB in twelve weeks and significantly improved their business reputation. Copyright © 2018 Elsevier Ltd. All rights reserved.
Deep Flare Net (DeFN) Model for Solar Flare Prediction
NASA Astrophysics Data System (ADS)
Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Ishii, M.
2018-05-01
We developed a solar flare prediction model using a deep neural network (DNN) named Deep Flare Net (DeFN). This model can calculate the probability of flares occurring in the following 24 hr in each active region, which is used to determine the most likely maximum classes of flares via a binary classification (e.g., ≥M class versus <M class).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Su Jing; Chen Shaohao; Jaron-Becker, Agnieszka
We theoretically study the control of two-photon excitation to bound and dissociative states in a molecule induced by trains of laser pulses, which are equivalent to certain sets of spectral phase modulated pulses. To this end, we solve the time-dependent Schroedinger equation for the interaction of molecular model systems with an external intense laser field. Our numerical results for the temporal evolution of the population in the excited states show that, in the case of an excited dissociative state, control schemes, previously validated for the atomic case, fail due to the coupling of electronic and nuclear motion. In contrast, for excitation to bound states the two-photon excitation probability is controlled via the time delay and the carrier-envelope phase difference between two consecutive pulses in the train.
Khatun, Jainab; Hamlett, Eric; Giddings, Morgan C
2008-03-01
The identification of peptides by tandem mass spectrometry (MS/MS) is a central method of proteomics research, but due to the complexity of MS/MS data and the large databases searched, the accuracy of peptide identification algorithms remains limited. To improve the accuracy of identification we applied a machine-learning approach using a hidden Markov model (HMM) to capture the complex and often subtle links between a peptide sequence and its MS/MS spectrum. Our model, HMM_Score, represents ion types as HMM states and calculates the maximum joint probability for a peptide/spectrum pair using emission probabilities from three factors: the amino acids adjacent to each fragmentation site, the mass dependence of ion types and the intensity dependence of ion types. The Viterbi algorithm is used to calculate the most probable assignment between ion types in a spectrum and a peptide sequence, then a correction factor is added to account for the propensity of the model to favor longer peptides. An expectation value is calculated based on the model score to assess the significance of each peptide/spectrum match. We trained and tested HMM_Score on three data sets generated by two different mass spectrometer types. For a reference data set recently reported in the literature and validated using seven identification algorithms, HMM_Score produced 43% more positive identification results at a 1% false positive rate than the best of two other commonly used algorithms, Mascot and X!Tandem. HMM_Score is a highly accurate platform for peptide identification that works well for a variety of mass spectrometer and biological sample types. The program is freely available on ProteomeCommons via an OpenSource license. See http://bioinfo.unc.edu/downloads/ for the download link.
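A generic Viterbi routine of the kind HMM_Score relies on is sketched below; states play the role of ion types and observations the role of discretized peaks, with toy probabilities rather than the published emission model.

```python
# Generic Viterbi sketch of the most-probable-assignment computation used by
# HMM-based peptide scoring: states ~ ion types, observations ~ spectrum peaks
# (toy transition/emission probabilities, not the HMM_Score model).
import numpy as np

def viterbi(log_start, log_trans, log_emit, obs):
    """Return the most probable state path and its joint log-probability."""
    n_states = log_start.size
    T = len(obs)
    score = np.full((T, n_states), -np.inf)
    back = np.zeros((T, n_states), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_trans      # rows: previous state
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1], float(score[-1].max())

# Two toy "ion type" states and three discretized peak-intensity symbols.
log_start = np.log([0.6, 0.4])
log_trans = np.log([[0.7, 0.3], [0.4, 0.6]])
log_emit = np.log([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
states, logp = viterbi(log_start, log_trans, log_emit, obs=[0, 1, 2, 2, 0])
print(states, logp)
```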
NASA Astrophysics Data System (ADS)
Mallepudi, Sri Abhishikth; Calix, Ricardo A.; Knapp, Gerald M.
2011-02-01
In recent years there has been a rapid increase in the size of video and image databases. Effective searching and retrieving of images from these databases is a significant current research area. In particular, there is a growing interest in query capabilities based on semantic image features such as objects, locations, and materials, known as content-based image retrieval. This study investigated mechanisms for identifying materials present in an image. These capabilities provide additional information impacting conditional probabilities about images (e.g. objects made of steel are more likely to be buildings). These capabilities are useful in Building Information Modeling (BIM) and in automatic enrichment of images. I2T methodologies are a way to enrich an image by generating text descriptions based on image analysis. In this work, a learning model is trained to detect certain materials in images. To train the model, an image dataset was constructed containing single material images of bricks, cloth, grass, sand, stones, and wood. For generalization purposes, an additional set of 50 images containing multiple materials (some not used in training) was constructed. Two different supervised learning classification models were investigated: a single multi-class SVM classifier, and multiple binary SVM classifiers (one per material). Image features included Gabor filter parameters for texture, and color histogram data for RGB components. All classification accuracy scores using the SVM-based method were above 85%. The second model helped in gathering more information from the images since it assigned multiple classes to the images. A framework for the I2T methodology is presented.
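A minimal sketch of the two classifier designs compared above (one multi-class SVM versus one binary SVM per material); the synthetic images and the simple color-histogram and gradient-energy features are crude stand-ins for the Gabor and RGB-histogram features described in the study.

```python
# Sketch: a single multi-class SVM versus per-material binary SVMs, both fed
# color-histogram plus texture-energy features (assumed stand-in features;
# the "images" below are synthetic).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(7)
MATERIALS = ["brick", "cloth", "grass", "sand", "stone", "wood"]

def describe(img):
    """Color histogram plus a gradient-energy texture statistic."""
    hist = np.concatenate([np.histogram(img[..., c], bins=8, range=(0, 1))[0]
                           for c in range(3)]).astype(float)
    texture = np.mean(np.abs(np.diff(img.mean(axis=2), axis=0)))
    return np.append(hist / hist.sum(), texture)

# Synthetic single-material training images (one dominant color per material).
X, y = [], []
for label, mat in enumerate(MATERIALS):
    base = rng.uniform(0.2, 0.8, size=3)
    for _ in range(40):
        img = np.clip(base + rng.normal(0, 0.08, size=(32, 32, 3)), 0, 1)
        X.append(describe(img))
        y.append(label)
X, y = np.array(X), np.array(y)

multi = SVC(kernel="rbf").fit(X, y)                            # single multi-class SVM
binaries = {mat: SVC(kernel="rbf", probability=True).fit(X, (y == k).astype(int))
            for k, mat in enumerate(MATERIALS)}                # one detector per material

test_img = np.clip(rng.uniform(0.2, 0.8, 3) + rng.normal(0, 0.08, (32, 32, 3)), 0, 1)
f = describe(test_img)
print("multi-class label:", MATERIALS[multi.predict([f])[0]])
print("materials flagged by binary SVMs:",
      [m for m, clf in binaries.items() if clf.predict_proba([f])[0, 1] > 0.5])
```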
Trainable structure-activity relationship model for virtual screening of CYP3A4 inhibition.
Didziapetris, Remigijus; Dapkunas, Justas; Sazonovas, Andrius; Japertas, Pranas
2010-11-01
A new structure-activity relationship model predicting the probability for a compound to inhibit human cytochrome P450 3A4 has been developed using data for >800 compounds from various literature sources and tested on PubChem screening data. The novel GALAS (Global, Adjusted Locally According to Similarity) modeling methodology has been used, which is a combination of a baseline global QSAR model and local similarity-based corrections. The GALAS modeling method allows forecasting the reliability of prediction, thus defining the model applicability domain. For compounds within this domain the statistical results of the final model approach the data consistency between experimental data from the literature and PubChem datasets, with an overall accuracy of 89%. However, the original model is applicable to less than a half of the PubChem database. Since the similarity correction procedure of the GALAS modeling method allows straightforward model training, the possibility to expand the applicability domain has been investigated. Experimental data from the PubChem dataset served as an example of in-house high-throughput screening data. The model successfully adapted itself to data classified using both the same and a different IC₅₀ threshold compared with the training set. In addition, adjustment of the CYP3A4 inhibition model to compounds with a novel chemical scaffold has been demonstrated. The reported GALAS model is proposed as a useful tool for virtual screening of compounds for possible drug-drug interactions even prior to the actual synthesis.
An integrated logit model for contamination event detection in water distribution systems.
Housh, Mashor; Ostfeld, Avi
2015-05-15
The problem of contamination event detection in water distribution systems has become one of the most challenging research topics in water distribution systems analysis. Current attempts at event detection utilize a variety of approaches including statistical, heuristic, machine learning, and optimization methods. Several existing event detection systems share a common feature in which alarms are obtained separately for each of the water quality indicators. Unifying those single alarms from different indicators is usually performed by means of simple heuristics. A salient feature of the approach developed here is the use of a statistically oriented model for discrete choice prediction, estimated using the maximum likelihood method, for integrating the single alarms. The discrete choice model is jointly calibrated with the other components of the event detection system framework on a training data set using genetic algorithms. The process of fusing the individual indicator probabilities, which is left out of focus in many existing event detection system models, is confirmed to be a crucial part of the system and can be modelled with a discrete choice model to improve performance. The developed methodology is tested on real water quality data, showing improved performance in decreasing the number of false positive alarms and in its ability to detect events with higher probabilities, compared to previous studies. Copyright © 2015 Elsevier Ltd. All rights reserved.
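A toy sketch of the fusion idea follows: rather than OR-ing separate alarms, the per-indicator event probabilities feed a single logistic (discrete choice) model; the indicator simulation is invented, and the paper calibrates this fusion jointly with the rest of the detection system rather than in isolation.

```python
# Sketch of probability fusion: each water-quality indicator contributes an
# event probability, and a logistic model combines them into one fused event
# probability (synthetic indicator data; not the paper's calibrated system).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)

n = 4000
event = rng.random(n) < 0.05
# Per-indicator event probabilities (e.g., chlorine, turbidity, pH detectors):
# informative but noisy, and individually prone to false alarms.
p_indicators = np.column_stack([
    np.clip(0.15 + 0.6 * event + rng.normal(0, 0.2, n), 0, 1) for _ in range(3)
])

fusion = LogisticRegression().fit(p_indicators, event.astype(int))

# Fused probability for a new time step where only one indicator is alarming.
print("fused event probability:", fusion.predict_proba([[0.9, 0.2, 0.1]])[0, 1])
```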
Predicting and understanding law-making with word vectors and an ensemble model.
Nay, John J
2017-01-01
Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses, predicted all bills in the current Congress, and repeated until the 113th Congress served as the test. For prediction we scored each sentence of a bill with a language model that embeds legislative vocabulary into a high-dimensional, semantic-laden vector space. This language representation enables our investigation into which words increase the probability of enactment for any topic. To test the relative importance of text and context, we compared the text model to a context-only model that uses variables such as whether the bill's sponsor is in the majority party. To test the effect of changes to bills after their introduction on our ability to predict their final outcome, we compared using the bill text and meta-data available at the time of introduction with using the most recent data. At the time of introduction context-only predictions outperform text-only, and with the newest data text-only outperforms context-only. Combining text and context always performs best. We conducted a global sensitivity analysis on the combined model to determine important variables predicting enactment.
Sequential and simultaneous choices: testing the diet selection and sequential choice models.
Freidin, Esteban; Aw, Justine; Kacelnik, Alex
2009-03-01
We investigate simultaneous and sequential choices in starlings, using Charnov's Diet Choice Model (DCM) and Shapiro, Siller and Kacelnik's Sequential Choice Model (SCM) to integrate function and mechanism. During a training phase, starlings encountered one food-related option per trial (A, B or R) in random sequence and with equal probability. A and B delivered food rewards after programmed delays (shorter for A), while R ('rejection') moved directly to the next trial without reward. In this phase we measured latencies to respond. In a later, choice, phase, birds encountered the pairs A-B, A-R and B-R, the first implementing a simultaneous choice and the second and third sequential choices. The DCM predicts when R should be chosen to maximize intake rate, and SCM uses latencies of the training phase to predict choices between any pair of options in the choice phase. The predictions of both models coincided, and both successfully predicted the birds' preferences. The DCM does not deal with partial preferences, while the SCM does, and experimental results were strongly correlated to this model's predictions. We believe that the SCM may expose a very general mechanism of animal choice, and that its wider domain of success reflects the greater ecological significance of sequential over simultaneous choices.
Yu, J S; Xue, A Y; Redei, E E; Bagheri, N
2016-01-01
Major depressive disorder (MDD) is a critical cause of morbidity and disability with an economic cost of hundreds of billions of dollars each year, necessitating more effective treatment strategies and novel approaches to translational research. A notable barrier in addressing this public health threat involves reliable identification of the disorder, as many affected individuals remain undiagnosed or misdiagnosed. An objective blood-based diagnostic test using transcript levels of a panel of markers would provide an invaluable tool for MDD, as the infrastructure (including equipment, trained personnel, billing, and governmental approval) for similar tests is well established in clinics worldwide. Here we present a supervised classification model utilizing support vector machines (SVMs) for the analysis of transcriptomic data readily obtained from a peripheral blood specimen. The model was trained on data from subjects with MDD (n=32) and age- and gender-matched controls (n=32). This SVM model provides a cross-validated sensitivity and specificity of 90.6% for the diagnosis of MDD using a panel of 10 transcripts. We applied a logistic equation to the SVM model output and quantified a likelihood of depression score. This score gives the probability of an MDD diagnosis and allows the tuning of specificity and sensitivity for individual patients to bring personalized medicine closer in psychiatry. PMID:27779627
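A minimal sketch of the classifier-plus-logistic-score idea using scikit-learn, where Platt scaling plays the role of the logistic equation applied to the SVM output; the transcript data below are synthetic and the sample sizes simply mirror the abstract.

```python
# Sketch: SVM on transcript levels, with Platt scaling (a logistic fit on SVM
# decision values) providing a 0-1 "likelihood of depression" score
# (synthetic marker data; not the study's transcripts or exact model).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(9)

# 32 cases and 32 controls, 10 transcript markers each.
X = np.vstack([rng.normal(0.5, 1.0, size=(32, 10)),    # MDD
               rng.normal(0.0, 1.0, size=(32, 10))])   # controls
y = np.array([1] * 32 + [0] * 32)

svm = SVC(kernel="linear", probability=True, random_state=9).fit(X, y)
print("cross-validated accuracy:",
      cross_val_score(SVC(kernel="linear"), X, y, cv=8).mean())

# probability=True applies Platt scaling internally, so predict_proba already
# returns a probability-like likelihood-of-depression score.
new_subject = rng.normal(0.4, 1.0, size=(1, 10))
print("likelihood-of-depression score:", svm.predict_proba(new_subject)[0, 1])
```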
Risk, Reward, and Decision-Making in a Rodent Model of Cognitive Aging
Gilbert, Ryan J.; Mitchell, Marci R.; Simon, Nicholas W.; Bañuelos, Cristina; Setlow, Barry; Bizon, Jennifer L.
2011-01-01
Impaired decision-making in aging can directly impact factors (financial security, health care) that are critical to maintaining quality of life and independence at advanced ages. Naturalistic rodent models mimic human aging in other cognitive domains, and afford the opportunity to parse the effects of age on discrete aspects of decision-making in a manner relatively uncontaminated by experiential factors. Young adult (5–7 months) and aged (23–25 months) male F344 rats were trained on a probability discounting task in which they made discrete-trial choices between a small certain reward (one food pellet) and a large but uncertain reward (two food pellets with varying probabilities of delivery ranging from 100 to 0%). Young rats chose the large reward when it was associated with a high probability of delivery and shifted to the small but certain reward as probability of the large reward decreased. As a group, aged rats performed comparably to young, but there was significantly greater variance among aged rats. One subgroup of aged rats showed strong preference for the small certain reward. This preference was maintained under conditions in which large reward delivery was also certain, suggesting decreased sensitivity to reward magnitude. In contrast, another subgroup of aged rats showed strong preference for the large reward at low probabilities of delivery. Interestingly, this subgroup also showed elevated preference for probabilistic rewards when reward magnitudes were equalized. Previous findings using this same aged study population described strongly attenuated discounting of delayed rewards with age, together suggesting that a subgroup of aged rats may have deficits associated with accounting for reward costs (i.e., delay or probability). These deficits in cost-accounting were dissociable from the age-related differences in sensitivity to reward magnitude, suggesting that aging influences multiple, distinct mechanisms that can impact cost–benefit decision-making. PMID:22319463
NASA Technical Reports Server (NTRS)
Badler, N. I.; Lee, P.; Wong, S.
1985-01-01
Strength modeling is a complex and multi-dimensional issue. There are numerous parameters to the problem of characterizing human strength, most notably: (1) position and orientation of body joints; (2) isometric versus dynamic strength; (3) effector force versus joint torque; (4) instantaneous versus steady force; (5) active force versus reactive force; (6) presence or absence of gravity; (7) body somatotype and composition; (8) body (segment) masses; (9) muscle group involvement; (10) muscle size; (11) fatigue; and (12) practice (training) or familiarity. In surveying the available literature on strength measurement and modeling, an attempt was made to examine as many of these parameters as possible. The conclusions reached thus far point toward the feasibility of implementing computationally reasonable human strength models. The assessment of the accuracy of any model against a specific individual, however, will probably not be possible on any realistic scale. Taken statistically, strength modeling may be an effective tool for general questions of task feasibility and strength requirements.
Boosting with Averaged Weight Vectors
NASA Technical Reports Server (NTRS)
Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)
2002-01-01
AdaBoost is a well-known ensemble learning algorithm that constructs its constituent or base models in sequence. A key step in AdaBoost is constructing a distribution over the training examples to create each base model. This distribution, represented as a vector, is constructed to be orthogonal to the vector of mistakes made by the previous base model in the sequence. The idea is to make the next base model's errors uncorrelated with those of the previous model. Some researchers have pointed out the intuition that it is probably better to construct a distribution that is orthogonal to the mistake vectors of all the previous base models, but that this is not always possible. We present an algorithm that attempts to come as close as possible to this goal in an efficient manner. We present experimental results demonstrating significant improvement over AdaBoost and the Totally Corrective boosting algorithm, which also attempts to satisfy this goal.
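The orthogonality property discussed above can be checked numerically: after AdaBoost's reweighting step, the previous model's signed mistake vector has zero inner product with the new distribution, while earlier models' mistake vectors generally do not. The mistake patterns below are toy stand-ins for real base models.

```python
# Numerical sketch: AdaBoost's weight update makes the new example
# distribution orthogonal to the previous model's signed mistake vector
# (its weighted error becomes exactly 1/2), but not to earlier models'
# mistake vectors (toy mistake patterns, not trained base learners).
import numpy as np

rng = np.random.default_rng(10)
n = 12
D = np.full(n, 1.0 / n)                       # initial uniform distribution

def adaboost_step(D, mistakes):
    """One AdaBoost reweighting step given a 0/1 mistake indicator vector."""
    eps = float(D @ mistakes)                 # weighted error of current model
    alpha = 0.5 * np.log((1 - eps) / eps)
    D_new = D * np.exp(alpha * np.where(mistakes == 1, 1.0, -1.0))
    return D_new / D_new.sum(), alpha

mistake_vectors = []
for _ in range(3):
    m = rng.integers(0, 2, size=n)
    m[0], m[1] = 1, 0                         # guarantee a non-degenerate error
    mistake_vectors.append(m)

for t, m in enumerate(mistake_vectors):
    D, alpha = adaboost_step(D, m)
    signed_prev = np.where(m == 1, 1.0, -1.0)
    print(f"after model {t}: D . m_{t} =", round(float(D @ signed_prev), 6))
    for s, m_old in enumerate(mistake_vectors[:t]):
        signed_old = np.where(m_old == 1, 1.0, -1.0)
        print(f"  D . m_{s} =", round(float(D @ signed_old), 6))
```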
Voermans, N C; Snoeck, M; Jungbluth, H
2016-10-01
Mutations in the skeletal muscle ryanodine receptor (RYR1) gene are associated with a wide spectrum of inherited myopathies presenting throughout life. Malignant hyperthermia susceptibility (MHS)-related RYR1 mutations have emerged as a common cause of exertional rhabdomyolysis, accounting for up to 30% of rhabdomyolysis episodes in otherwise healthy individuals. Common triggers are exercise and heat and, less frequently, viral infections, alcohol and drugs. Most subjects are normally strong and have no personal or family history of malignant hyperthermia. Heat intolerance and cold-induced muscle stiffness may be a feature. Recognition of this (probably not uncommon) rhabdomyolysis cause is vital for effective counselling, to identify potentially malignant hyperthermia-susceptible individuals and to adapt training regimes. Studies in various animal models provide insights regarding possible pathophysiological mechanisms and offer therapeutic perspectives. Copyright © 2016. Published by Elsevier Masson SAS.
2006-09-01
education. LCMS allow subject matter experts with little technology skill to develop curriculum, deliver courses, and monitor e-learning. Distance...occurring in year seven and therefore it has zero probability of occurring in the first five years. The probability of Ev-2 occurring between years 7 and
Predicting Mouse Liver Microsomal Stability with “Pruned” Machine Learning Models and Public Data
Perryman, Alexander L.; Stratton, Thomas P.; Ekins, Sean; Freundlich, Joel S.
2015-01-01
Purpose Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. Methods Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). Results “Pruning” out the moderately unstable/moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 hour. Conclusions Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources. PMID:26415647
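A minimal sketch of the pruning idea follows, with a Gaussian naive Bayes classifier standing in for the fingerprint-based Bayesian models; the descriptors, half-lives, and thresholds are synthetic and chosen only to illustrate the workflow.

```python
# Sketch of "pruning": drop compounds with intermediate half-lives from the
# training set and fit a Bayesian-style classifier on the clearly stable vs.
# clearly unstable remainder (synthetic descriptors; Gaussian naive Bayes is
# an assumed stand-in for the paper's Bayesian models).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(11)

# Hypothetical descriptors and half-lives (minutes) for 894 training compounds.
X = rng.normal(size=(894, 16))
half_life = np.exp(1.5 + X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 894)) * 5

stable = half_life >= 60          # half-life >= 1 h
unstable = half_life < 30         # clearly unstable
pruned = stable | unstable        # drop the moderately stable middle band

model_all = GaussianNB().fit(X, stable.astype(int))
model_pruned = GaussianNB().fit(X[pruned], stable[pruned].astype(int))

# Independent synthetic "validation" compounds.
X_val = rng.normal(size=(571, 16))
hl_val = np.exp(1.5 + X_val[:, 0] + 0.5 * X_val[:, 1] + rng.normal(0, 0.5, 571)) * 5
y_val = (hl_val >= 60).astype(int)
print("accuracy, all compounds in training:", model_all.score(X_val, y_val))
print("accuracy, pruned training set:      ", model_pruned.score(X_val, y_val))
```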
Predicting Mouse Liver Microsomal Stability with "Pruned" Machine Learning Models and Public Data.
Perryman, Alexander L; Stratton, Thomas P; Ekins, Sean; Freundlich, Joel S
2016-02-01
Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). "Pruning" out the moderately unstable / moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 h. Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources.
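The pruning strategy can be illustrated with a short, hypothetical sketch; the half-life cut-offs (0.5 h and 1 h) and the descriptor matrix below are placeholders rather than the values used in the study.

    import numpy as np

    def prune_training_set(features, half_life_hr,
                           unstable_max=0.5, stable_min=1.0):
        """Drop 'moderate' compounds and binarise the remainder.

        Compounds with half-life below `unstable_max` hours are kept as
        class 0 (unstable), those at or above `stable_min` hours as
        class 1 (stable); everything in between is pruned away.
        """
        unstable = half_life_hr < unstable_max
        stable = half_life_hr >= stable_min
        keep = unstable | stable
        X = features[keep]
        y = stable[keep].astype(int)
        return X, y

    # Toy usage with random descriptors and half-lives.
    rng = np.random.default_rng(1)
    X_all = rng.normal(size=(894, 16))
    t_half = rng.exponential(scale=0.8, size=894)
    X, y = prune_training_set(X_all, t_half)
    print(X.shape, y.mean())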
Maximizing lipocalin prediction through balanced and diversified training set and decision fusion.
Nath, Abhigyan; Subbiah, Karthikeyan
2015-12-01
Lipocalins are short in sequence length and perform several important biological functions. These proteins have less than 20% sequence similarity among paralogs. Identifying them experimentally is an expensive and time-consuming process. Computational methods based on sequence similarity for allocating putative members to this family also remain elusive because of the low sequence similarity among family members. Consequently, machine learning methods become a viable alternative for their prediction, using underlying sequence- or structure-derived features as input. Ideally, any machine-learning-based prediction method must be trained with all possible variations in the input feature vector (all the sub-class input patterns) to achieve perfect learning. Near-perfect learning can be achieved by training the model with diverse types of input instances belonging to different regions of the entire input space. Furthermore, prediction performance can be improved by balancing the training set, as imbalanced data sets tend to bias predictions towards the majority class and its sub-classes. This paper aims to achieve (i) high generalization ability, without classification bias, through diversified and balanced training sets, and (ii) enhanced prediction accuracy by combining the results of individual classifiers with an appropriate fusion scheme. Instead of creating the training set randomly, we first used the unsupervised K-means clustering algorithm to create diversified clusters of input patterns, and built the diversified and balanced training set by selecting an equal number of patterns from each of these clusters. Finally, a probability-based classifier fusion scheme was applied to a boosted random forest algorithm (which produced greater sensitivity) and a K-nearest-neighbour algorithm (which produced greater specificity) to achieve better predictive performance than either base classifier alone. The performance of models trained on the K-means-preprocessed training set was far better than that of models trained on randomly generated training sets. The proposed method achieved a sensitivity of 90.6%, specificity of 91.4% and accuracy of 91.0% on the first test set, and a sensitivity of 92.9%, specificity of 96.2% and accuracy of 94.7% on the second blind test set. These results establish that diversifying the training set improves the performance of predictive models through superior generalization ability, and that balancing the training set improves prediction accuracy. For smaller data sets, unsupervised K-means-based sampling can be more effective at increasing generalization than the usual random splitting method. Copyright © 2015 Elsevier Ltd. All rights reserved.
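A minimal sketch of the K-means-based selection step, assuming scikit-learn is available; the cluster count and per-cluster sample size are illustrative placeholders, and the decision-fusion stage is not shown.

    import numpy as np
    from sklearn.cluster import KMeans

    def diversified_training_set(X, n_clusters=8, per_cluster=20, seed=0):
        """Select an equal number of patterns from each K-means cluster."""
        rng = np.random.default_rng(seed)
        labels = KMeans(n_clusters=n_clusters, random_state=seed).fit_predict(X)
        chosen = []
        for c in range(n_clusters):
            members = np.flatnonzero(labels == c)
            take = min(per_cluster, members.size)
            chosen.append(rng.choice(members, size=take, replace=False))
        return np.concatenate(chosen)

    X = np.random.default_rng(2).normal(size=(500, 10))
    idx = diversified_training_set(X)
    print(idx.shape)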
ERIC Educational Resources Information Center
López-Torrijo, Manuel; Mengual-Andrés, Santiago
2015-01-01
Inclusive education is hard to implement in secondary schools. One of the determining factors probably lies in teachers' initial training, which shapes their attitude, identity and professional practice. This research analyses the initial teacher education programmes for Secondary Education, Higher Secondary Education, called…
Abbo, C; Okello, E S; Nakku, J
2013-03-01
The Global Assessment of Functioning (GAF) is the standard method and an essential tool for representing a clinician's judgment of a patient's overall level of psychological, social and occupational functioning. As such, it is probably the single most widely used method for assessing impairment among patients with psychiatric illnesses. To assess the effects of one hour of training on application of the GAF by Psychiatric Clinical Officers in a Ugandan setting. Five psychiatrists and five Psychiatric Clinical Officers (PCOs), or Assistant Medical Officers who hold a 2-year diploma in Clinical Psychiatry, were randomly selected to independently rate a video-recorded psychiatric interview according to the DSM-IV-TR. The PCOs were then offered one hour of training on how to rate the GAF scale and asked to rate the video case interview again. All ratings were assigned on the basis of past-year, at-admission and current functioning. Intraclass correlations (ICC) were computed using two-way mixed models. The ICCs between the psychiatrists and the PCOs before training for past-year, at-admission and current functioning were +0.48, +0.51 and +0.59 respectively. After training, the ICC coefficients were +0.60, +0.82 and +0.83. Brief training given to PCOs improved their ratings on the GAF scale to acceptable levels. There is a need for formal training of this cadre of psychiatric practitioners in the use of the GAF.
Chen, Ching-Tai; Peng, Hung-Pin; Jian, Jhih-Wei; Tsai, Keng-Chang; Chang, Jeng-Yih; Yang, Ei-Wen; Chen, Jun-Bo; Ho, Shinn-Ying; Hsu, Wen-Lian; Yang, An-Suei
2012-01-01
Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, and specificity were 0.753, 0.519, 0.677, and 0.779, respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with the physicochemical complementarity features based on the non-covalent interaction data derived from protein interiors. PMID:22701576
Error Discounting in Probabilistic Category Learning
Craig, Stewart; Lewandowsky, Stephan; Little, Daniel R.
2011-01-01
Some current theories of probabilistic categorization assume that people gradually attenuate their learning in response to unavoidable error. However, existing evidence for this error discounting is sparse and open to alternative interpretations. We report two probabilistic-categorization experiments that investigated error discounting by shifting feedback probabilities to new values after different amounts of training. In both experiments, responding gradually became less sensitive to errors, and learning was slowed for some time after the feedback shift. Both results are indicative of error discounting. Quantitative modeling of the data revealed that adding a mechanism for error discounting significantly improved the fits of an exemplar-based and a rule-based associative learning model, as well as of a recency-based model of categorization. We conclude that error discounting is an important component of probabilistic learning. PMID:21355666
Shape-driven 3D segmentation using spherical wavelets.
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen
2006-01-01
This paper presents a novel active surface segmentation algorithm using a multiscale shape representation and prior. We define a parametric model of a surface using spherical wavelet functions and learn a prior probability distribution over the wavelet coefficients to model shape variations at different scales and spatial locations in a training set. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior in the segmentation framework. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to the segmentation of brain caudate nucleus, of interest in the study of schizophrenia. Our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm by capturing finer shape details.
A Markovian event-based framework for stochastic spiking neural networks.
Touboul, Jonathan D; Faugeras, Olivier D
2011-11-01
In spiking neural networks, information is conveyed by the spike times, which depend on the intrinsic dynamics of each neuron, the input it receives and the connections between neurons. In this article we study the Markovian nature of the sequence of spike times in stochastic neural networks, and in particular the ability to deduce the next spike time from a spike train, and therefore to describe the network activity based only on spike times, regardless of the membrane potential process. To study this question rigorously, we introduce and study an event-based description of networks of noisy integrate-and-fire neurons, that is, one based on the computation of spike times. We show that the firing times of the neurons in the network constitute a Markov chain, whose transition probability is related to the probability distribution of the interspike interval of the neurons in the network. In the cases where the Markovian model can be developed, the transition probability is derived explicitly for such classical neural network cases as linear integrate-and-fire neuron models with excitatory and inhibitory interactions, for different types of synapses, possibly featuring noisy synaptic integration, transmission delays and absolute and relative refractory periods. This covers most of the cases that have been investigated in the event-based description of spiking deterministic neural networks.
Shirley, Matthew H.; Dorazio, Robert M.; Abassery, Ekramy; Elhady, Amr A.; Mekki, Mohammed S.; Asran, Hosni H.
2012-01-01
As part of the development of a management program for Nile crocodiles in Lake Nasser, Egypt, we used a dependent double-observer sampling protocol with multiple observers to compute estimates of population size. To analyze the data, we developed a hierarchical model that allowed us to assess variation in detection probabilities among observers and survey dates, as well as account for variation in crocodile abundance among sites and habitats. We conducted surveys from July 2008 to June 2009 in 15 areas of Lake Nasser that were representative of 3 main habitat categories. During these surveys, we sampled 1,086 km of lake shore, along which we detected 386 crocodiles. Analysis of the data revealed significant variability in both inter- and intra-observer detection probabilities. Our raw encounter rate was 0.355 crocodiles/km. When we accounted for observer effects and habitat, we estimated a surface population abundance of 2,581 (2,239-2,987, 95% credible intervals) crocodiles in Lake Nasser. Our results underscore the importance of well-trained, experienced monitoring personnel in order to decrease heterogeneity in intra-observer detection probability and to better detect changes in the population based on survey indices. This study will assist the Egyptian government in establishing a monitoring program as an integral part of future crocodile harvest activities in Lake Nasser.
Rivas, Elena; Lang, Raymond; Eddy, Sean R
2012-02-01
The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.
Rivas, Elena; Lang, Raymond; Eddy, Sean R.
2012-01-01
The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases. PMID:22194308
NASA Astrophysics Data System (ADS)
Chen, Chaochao; Vachtsevanos, George; Orchard, Marcos E.
2012-04-01
Machine prognosis can be considered as the generation of long-term predictions that describe the evolution in time of a fault indicator, with the purpose of estimating the remaining useful life (RUL) of a failing component/subsystem so that timely maintenance can be performed to avoid catastrophic failures. This paper proposes an integrated RUL prediction method using adaptive neuro-fuzzy inference systems (ANFIS) and high-order particle filtering, which forecasts the time evolution of the fault indicator and estimates the probability density function (pdf) of RUL. The ANFIS is trained and integrated in a high-order particle filter as a model describing the fault progression. The high-order particle filter is used to estimate the current state and carry out p-step-ahead predictions via a set of particles. These predictions are used to estimate the RUL pdf. The performance of the proposed method is evaluated via the real-world data from a seeded fault test for a UH-60 helicopter planetary gear plate. The results demonstrate that it outperforms both the conventional ANFIS predictor and the particle-filter-based predictor where the fault growth model is a first-order model that is trained via the ANFIS.
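The p-step-ahead prediction idea can be sketched with a plain Monte Carlo particle propagation; the toy growth function below stands in for the trained ANFIS fault-progression model, and all parameters are illustrative.

    import numpy as np

    def rul_distribution(particles, step_model, threshold, max_steps=200):
        """Propagate particles ahead until each crosses the failure threshold.

        Returns one remaining-useful-life sample per particle, from which
        an empirical RUL pdf can be formed (uncrossed particles keep max_steps).
        """
        rng = np.random.default_rng(0)
        rul = np.full(particles.shape[0], max_steps)
        state = particles.copy()
        for k in range(1, max_steps + 1):
            state = step_model(state, rng)
            newly_failed = (state >= threshold) & (rul == max_steps)
            rul[newly_failed] = k
        return rul

    # Toy fault-growth model standing in for the trained predictor.
    def toy_growth(x, rng):
        return x * 1.03 + rng.normal(0.0, 0.01, size=x.shape)

    particles = np.random.default_rng(1).normal(0.5, 0.05, size=1000)
    rul = rul_distribution(particles, toy_growth, threshold=1.0)
    print(np.percentile(rul, [5, 50, 95]))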
Ni, Yepeng; Liu, Jianbo; Liu, Shan; Bai, Yaxin
2016-01-01
With the rapid development of smartphones and wireless networks, indoor location-based services have become more and more prevalent. Due to the sophisticated propagation of radio signals, the Received Signal Strength Indicator (RSSI) shows a significant variation during pedestrian walking, which introduces critical errors in deterministic indoor positioning. To solve this problem, we present a novel method to improve the indoor pedestrian positioning accuracy by embedding a fuzzy pattern recognition algorithm into a Hidden Markov Model. The fuzzy pattern recognition algorithm follows the rule that the RSSI fading has a positive correlation to the distance between the measuring point and the AP location even during a dynamic positioning measurement. Through this algorithm, we use the RSSI variation trend to replace the specific RSSI value to achieve a fuzzy positioning. The transition probability of the Hidden Markov Model is trained by the fuzzy pattern recognition algorithm with pedestrian trajectories. Using the Viterbi algorithm with the trained model, we can obtain a set of hidden location states. In our experiments, we demonstrate that, compared with the deterministic pattern matching algorithm, our method can greatly improve the positioning accuracy and shows robust environmental adaptability. PMID:27618053
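A minimal log-space Viterbi sketch, assuming the transition matrix has already been trained as described and the fuzzy pattern-recognition step has supplied per-step emission scores; all arrays below are random placeholders.

    import numpy as np

    def viterbi(log_trans, log_emit, log_init):
        """Most likely hidden location sequence.

        log_trans : (S, S) log transition probabilities (trained).
        log_emit  : (T, S) per-step log emission scores from the fuzzy matcher.
        log_init  : (S,)   log initial-state probabilities.
        """
        T, S = log_emit.shape
        delta = log_init + log_emit[0]
        back = np.zeros((T, S), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + log_trans          # (from, to)
            back[t] = np.argmax(scores, axis=0)
            delta = scores[back[t], np.arange(S)] + log_emit[t]
        path = np.empty(T, dtype=int)
        path[-1] = int(np.argmax(delta))
        for t in range(T - 1, 0, -1):
            path[t - 1] = back[t, path[t]]
        return path

    rng = np.random.default_rng(0)
    S, T = 4, 6
    A = rng.dirichlet(np.ones(S), size=S)                # placeholder transitions
    pi = rng.dirichlet(np.ones(S))                       # placeholder initial probs
    E = rng.dirichlet(np.ones(S), size=T)                # placeholder emissions
    print(viterbi(np.log(A), np.log(E), np.log(pi)))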
A coarse-to-fine approach for pericardial effusion localization and segmentation in chest CT scans
NASA Astrophysics Data System (ADS)
Liu, Jiamin; Chellamuthu, Karthik; Lu, Le; Bagheri, Mohammadhadi; Summers, Ronald M.
2018-02-01
Pericardial effusion on CT scans demonstrates very high shape and volume variability and very low contrast to adjacent structures. This inhibits traditional automated segmentation methods from achieving high accuracy. Deep neural networks have been widely used for image segmentation in CT scans. In this work, we present a two-stage method for pericardial effusion localization and segmentation. In the first step, we localize the pericardial area within the entire CT volume, providing a reliable bounding box for the more refined segmentation step. A coarse-scaled holistically-nested convolutional network (HNN) model is trained on the entire CT volume. The resulting HNN per-pixel probability maps are then thresholded to produce a bounding box covering the pericardial area. In the second step, a fine-scaled HNN model is trained only on the bounding box region for effusion segmentation, to reduce background distraction. Quantitative evaluation is performed on a dataset of 25 patient CT scans (1,206 images) with pericardial effusion. The segmentation accuracy of our two-stage method, measured by the Dice Similarity Coefficient (DSC), is 75.59+/-12.04%, which is significantly better than the segmentation accuracy (62.74+/-15.20%) obtained using only the coarse-scaled HNN model.
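The first-stage step of thresholding the coarse probability map into a bounding box can be sketched as follows; the threshold and margin values are illustrative, not those used in the paper.

    import numpy as np

    def probability_map_to_bbox(prob_map, threshold=0.5, margin=8):
        """Bounding box (r0, r1, c0, c1) around pixels above `threshold`."""
        mask = prob_map >= threshold
        if not mask.any():
            return None
        rows = np.flatnonzero(mask.any(axis=1))
        cols = np.flatnonzero(mask.any(axis=0))
        r0 = max(rows[0] - margin, 0)
        r1 = min(rows[-1] + margin, prob_map.shape[0] - 1)
        c0 = max(cols[0] - margin, 0)
        c1 = min(cols[-1] + margin, prob_map.shape[1] - 1)
        return r0, r1, c0, c1

    coarse_prob = np.zeros((512, 512))
    coarse_prob[200:300, 150:260] = 0.9                  # stand-in for HNN output
    bbox = probability_map_to_bbox(coarse_prob)
    r0, r1, c0, c1 = bbox
    fine_input = coarse_prob[r0:r1 + 1, c0:c1 + 1]       # crop fed to the fine model
    print(bbox, fine_input.shape)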
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guo, Boyun; Duguid, Andrew; Nygaard, Ronar
The objective of this project is to develop a computerized statistical model with the Integrated Neural-Genetic Algorithm (INGA) for predicting the probability of long-term leak of wells in CO2 sequestration operations. This objective has been accomplished by conducting research in three phases: 1) data mining of CO2-exposed wells, 2) INGA computer model development, and 3) evaluation of the predictive performance of the computer model with data from field tests. Data mining was conducted for 510 wells in two CO2 sequestration projects in the Texas Gulf Coast region: the Hasting West field and the Oyster Bayou field in southern Texas. Missing wellbore integrity data were estimated using an analytical and Finite Element Method (FEM) model. The INGA was first tested for convergence and computing efficiency with the obtained high-dimensional data set. It was concluded that the INGA can handle the gathered data set with good accuracy and reasonable computing time after a reduction of dimension with a grouping mechanism. A computerized statistical model with the INGA was then developed based on data pre-processing and grouping. Comprehensive training and testing of the model were carried out to ensure that the model is accurate and efficient enough for predicting the probability of long-term leak of wells in CO2 sequestration operations. The Cranfield site in southern Mississippi was selected as the test site. Observation wells CFU31F2 and CFU31F3 were used for pressure testing, formation logging, and cement sampling. Tools run in the wells included the Isolation Scanner, Slim Cement Mapping Tool (SCMT), Cased Hole Formation Dynamics Tester (CHDT), and Mechanical Sidewall Coring Tool (MSCT). Analyses of the obtained data indicate no leak of CO2 across the cap zone, while it is evident that the well cement sheath was invaded by CO2 from the storage zone. This observation is consistent with the result predicted by the INGA model, which indicates the well has a CO2 leak-safe probability of 72%. This comparison implies that the developed INGA model is valid for future use in predicting well leak probability.
Paninski, Liam; Haith, Adrian; Szirtes, Gabor
2008-02-01
We recently introduced likelihood-based methods for fitting stochastic integrate-and-fire models to spike train data. The key component of this method involves the likelihood that the model will emit a spike at a given time t. Computing this likelihood is equivalent to computing a Markov first passage time density (the probability that the model voltage crosses threshold for the first time at time t). Here we detail an improved method for computing this likelihood, based on solving a certain integral equation. This integral equation method has several advantages over the techniques discussed in our previous work: in particular, the new method has fewer free parameters and is easily differentiable (for gradient computations). The new method is also easily adaptable for the case in which the model conductance, not just the input current, is time-varying. Finally, we describe how to incorporate large deviations approximations to very small likelihoods.
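As a sanity check on such likelihood computations, a first-passage-time density can also be estimated by brute-force Monte Carlo simulation of a noisy leaky integrator; this sketch is illustrative and is not the integral-equation method of the paper. All parameters are placeholders.

    import numpy as np

    def first_passage_density(v0, threshold, tau, mu, sigma,
                              dt=1e-3, t_max=0.5, n_paths=20000, seed=0):
        """Histogram estimate of the first time V(t) crosses `threshold`.

        dV = (-(V - mu) / tau) dt + sigma dW, starting from v0.
        """
        rng = np.random.default_rng(seed)
        n_steps = int(t_max / dt)
        v = np.full(n_paths, float(v0))
        crossed = np.full(n_paths, np.nan)
        for k in range(1, n_steps + 1):
            noise = rng.normal(0.0, np.sqrt(dt), size=n_paths)
            v = v + (-(v - mu) / tau) * dt + sigma * noise
            hit = np.isnan(crossed) & (v >= threshold)
            crossed[hit] = k * dt
        times = crossed[~np.isnan(crossed)]
        hist, edges = np.histogram(times, bins=50, range=(0, t_max), density=True)
        return hist, edges

    hist, edges = first_passage_density(v0=0.0, threshold=1.0,
                                        tau=0.02, mu=1.2, sigma=0.3)
    print(hist[:5])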
A new concept in seismic landslide hazard analysis for practical application
NASA Astrophysics Data System (ADS)
Lee, Chyi-Tyi
2017-04-01
A seismic landslide hazard model can be constructed using a deterministic approach (Jibson et al., 2000) or a statistical approach (Lee, 2014). Both approaches yield the landslide spatial probability under a given return-period earthquake. In the statistical approach, our recent study found that there are common patterns among different landslide susceptibility models of the same region. The common susceptibility reflects the relative stability of slopes in a region; higher susceptibility indicates lower stability. Using the common susceptibility together with an earthquake event landslide inventory and a map of topographically corrected Arias intensity, we can build the relationship among probability of failure, Arias intensity and susceptibility. This relationship can immediately be used to construct a seismic landslide hazard map for the region where the empirical relationship was built. If the common susceptibility model is further normalized and the empirical relationship is built with normalized susceptibility, then the relationship may be applied in practice to other regions with similar tectonic environments and climate conditions. This is useful when a region has no existing earthquake-induced landslide data with which to train the susceptibility model and build the relationship. It is worth mentioning that a rain-induced landslide susceptibility model shows a common pattern similar to that of earthquake-induced landslide susceptibility in the same region, and can be used to build the relationship with an earthquake event landslide inventory and a map of Arias intensity. These points will be introduced with examples in the meeting.
Prior probability modulates anticipatory activity in category-specific areas.
Trapp, Sabrina; Lepsien, Jöran; Kotz, Sonja A; Bar, Moshe
2016-02-01
Bayesian models are currently a dominant framework for describing human information processing. However, it is not clear yet how major tenets of this framework can be translated to brain processes. In this study, we addressed the neural underpinning of prior probability and its effect on anticipatory activity in category-specific areas. Before fMRI scanning, participants were trained in two behavioral sessions to learn the prior probability and correct order of visual events within a sequence. The events of each sequence included two different presentations of a geometric shape and one picture of either a house or a face, which appeared with either a high or a low likelihood. Each sequence was preceded by a cue that gave participants probabilistic information about which items to expect next. This allowed examining cue-related anticipatory modulation of activity as a function of prior probability in category-specific areas (fusiform face area and parahippocampal place area). Our findings show that activity in the fusiform face area was higher when faces had a higher prior probability. The finding of a difference between levels of expectations is consistent with graded, probabilistically modulated activity, but the data do not rule out the alternative explanation of a categorical neural response. Importantly, these differences were only visible during anticipation, and vanished at the time of stimulus presentation, calling for a functional distinction when considering the effects of prior probability. Finally, there were no anticipatory effects for houses in the parahippocampal place area, suggesting sensitivity to stimulus material when looking at effects of prediction.
Visual Confirmation of Voice Takeoff Clearance (VICON) Operational Evaluation. Volume I.
1981-02-01
...preferred the Mimic panel over the others and stated that it was easiest to use... probable language problem in a second, a crew member's misinterpretation of a... references were made to the language difficulties which occurred at Tenerife, but do not occur here. Safety is negatively affected because... recommendation, and two included language. One controller mentioned better overall controller training. Improve pilot training. Better training and
Singh, A S; Shah, A; Brockmann, A
2018-02-01
In honey bees, continuous foraging at an artificial feeder induced a sustained upregulation of the immediate early genes early growth response protein 1 (Egr-1) and hormone receptor 38 (Hr38). This gene expression response was accompanied by an upregulation of several Egr-1 candidate downstream genes: ecdysone receptor (EcR), dopamine/ecdysteroid receptor (DopEcR), dopamine decarboxylase and dopamine receptor 2. Hr38, EcR and DopEcR are components of the ecdysteroid signalling pathway, which is very probably involved in learning and memory processes in honey bees and other insects. Time-trained foragers still showed an upregulation of Egr-1 when the feeder was presented at an earlier time of day, suggesting that the genomic response depends more on the food reward than on training time. However, presentation of the feeder at the training time without food was still capable of inducing a transient increase in Egr-1 expression. Thus, learnt feeder cues, or even training time, probably affect Egr-1 expression. In contrast, whole-brain Egr-1 expression changes did not differ between dancing and nondancing foragers. On the basis of our results, we propose that food-reward-induced continuous foraging ultimately elicits a genomic response involving Egr-1 and Hr38 and their downstream genes, and that this genomic response is very probably involved in foraging-related learning and memory responses. © 2017 The Royal Entomological Society.
Crime Solving Techniques: Training Bulletin.
ERIC Educational Resources Information Center
Sands, Jack M.
The document is a training bulletin for criminal investigators, explaining the use of probability, logic, lateral thinking, group problem solving, and psychological profiles as methods of solving crimes. One chapter of several pages is devoted to each of the five methods. The use of each method is explained; problems are presented for the user to…
Musical training, neuroplasticity and cognition.
Rodrigues, Ana Carolina; Loureiro, Maurício Alves; Caramelli, Paulo
2010-01-01
The influence of music on the human brain has been investigated in numerous recent studies. Several investigations have shown that structural and functional cerebral neuroplastic processes emerge as a result of long-term musical training, which in turn may produce cognitive differences between musicians and non-musicians. Musicians can be considered ideal cases for studies on brain adaptation, due to their unique and intensive training experiences. This article presents a review of recent findings showing positive effects of musical training on non-musical cognitive abilities, which probably reflect plastic changes in the brains of musicians.
PSF estimation for defocus blurred image based on quantum back-propagation neural network
NASA Astrophysics Data System (ADS)
Gao, Kun; Zhang, Yan; Shao, Xiao-guang; Liu, Ying-hui; Ni, Guoqiang
2010-11-01
Images obtained by an aberration-free system are defocus-blurred due to motion in depth and/or zooming. The precondition for restoring the degraded image is to estimate the point spread function (PSF) of the imaging system as precisely as possible. However, it is difficult to identify the analytic model of the PSF precisely because of the complexity of the degradation process. Inspired by the similarity between the quantum process and the imaging process in the fields of probability and statistics, a reformed multilayer quantum neural network (QNN) is proposed to estimate the PSF of a defocus-blurred image. Different from a conventional artificial neural network (ANN), an improved quantum neuron model is used in the hidden layer, which introduces a 2-bit controlled-NOT quantum gate to control the output and adopts 2 texture and edge features as the input vectors. The supervised back-propagation learning rule is adopted to train the network on training sets drawn from historical images. Test results show that this method has high precision and strong generalization ability.
Human gait recognition by pyramid of HOG feature on silhouette images
NASA Astrophysics Data System (ADS)
Yang, Guang; Yin, Yafeng; Park, Jeanrok; Man, Hong
2013-03-01
As an uncommon biometric modality, human gait recognition has the great advantage of identifying people at a distance without high-resolution images. It has attracted much attention in recent years, especially in the fields of computer vision and remote sensing. In this paper, we propose a human gait recognition framework that consists of a reliable background subtraction method, followed by pyramid of Histogram of Oriented Gradient (pHOG) feature extraction on the silhouette image, and a Hidden Markov Model (HMM) based classifier. Through background subtraction, the silhouette of the human gait in each frame is extracted and normalized from the raw video sequence. After removing shadow and noise in each region of interest (ROI), the pHOG feature is computed on the silhouette images. The pHOG features of each gait class are then used to train a corresponding HMM. In the test stage, the pHOG feature is extracted from each test sequence and used to calculate the posterior probability for each trained HMM. Experimental results on the CASIA Gait Dataset B1 demonstrate that our proposed method achieves a very competitive recognition rate.
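A sketch of the HMM classification stage, assuming the hmmlearn package: one Gaussian HMM is trained per gait class on pHOG feature sequences, and a test sequence is assigned to the class whose model yields the highest log-likelihood. The random arrays below merely stand in for real pHOG features.

    import numpy as np
    from hmmlearn import hmm

    def train_class_hmms(sequences_by_class, n_states=3):
        """Fit one Gaussian HMM per gait class on pHOG feature sequences."""
        models = {}
        for label, seqs in sequences_by_class.items():
            X = np.vstack(seqs)                          # stacked frames
            lengths = [len(s) for s in seqs]             # per-sequence lengths
            m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                                n_iter=50, random_state=0)
            m.fit(X, lengths)
            models[label] = m
        return models

    def classify(models, test_seq):
        """Pick the class whose HMM assigns the highest log-likelihood."""
        return max(models, key=lambda label: models[label].score(test_seq))

    # Placeholder pHOG sequences: 2 classes, 3 sequences each, 40 frames x 8 dims.
    rng = np.random.default_rng(0)
    data = {c: [rng.normal(c, 1.0, size=(40, 8)) for _ in range(3)] for c in (0, 1)}
    models = train_class_hmms(data)
    print(classify(models, rng.normal(1, 1.0, size=(40, 8))))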
DeWeber, Jefferson Tyrell; Wagner, Tyler
2015-01-01
The Brook Trout Salvelinus fontinalis is an important species of conservation concern in the eastern USA. We developed a model to predict Brook Trout population status within individual stream reaches throughout the species’ native range in the eastern USA. We utilized hierarchical logistic regression with Bayesian estimation to predict Brook Trout occurrence probability, and we allowed slopes and intercepts to vary among ecological drainage units (EDUs). Model performance was similar for 7,327 training samples and 1,832 validation samples based on the area under the receiver operating curve (∼0.78) and Cohen's kappa statistic (0.44). Predicted water temperature had a strong negative effect on Brook Trout occurrence probability at the stream reach scale and was also negatively associated with the EDU average probability of Brook Trout occurrence (i.e., EDU-specific intercepts). The effect of soil permeability was positive but decreased as EDU mean soil permeability increased. Brook Trout were less likely to occur in stream reaches surrounded by agricultural or developed land cover, and an interaction suggested that agricultural land cover also resulted in an increased sensitivity to water temperature. Our model provides a further understanding of how Brook Trout are shaped by habitat characteristics in the region and yields maps of stream-reach-scale predictions, which together can be used to support ongoing conservation and management efforts. These decision support tools can be used to identify the extent of potentially suitable habitat, estimate historic habitat losses, and prioritize conservation efforts by selecting suitable stream reaches for a given action. Future work could extend the model to account for additional landscape or habitat characteristics, include biotic interactions, or estimate potential Brook Trout responses to climate and land use changes.
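A sketch of how reach-level predictions from a varying-intercept, varying-slope logistic model could be computed once EDU-level coefficients have been estimated; the coefficient values below are invented for illustration only.

    import numpy as np

    def occurrence_probability(edu_idx, water_temp, soil_perm,
                               intercepts, temp_slopes, perm_slopes):
        """Brook Trout occurrence probability for stream reaches.

        Each reach uses the intercept and slopes of its ecological
        drainage unit (EDU), as in a varying-intercept / varying-slope
        logistic regression.
        """
        eta = (intercepts[edu_idx]
               + temp_slopes[edu_idx] * water_temp
               + perm_slopes[edu_idx] * soil_perm)
        return 1.0 / (1.0 + np.exp(-eta))

    # Illustrative coefficients for 3 EDUs (standardised covariates).
    intercepts = np.array([0.2, -0.5, 0.1])
    temp_slopes = np.array([-1.1, -0.8, -1.4])   # warmer water -> less likely
    perm_slopes = np.array([0.6, 0.3, 0.5])      # permeable soils -> more likely
    edu = np.array([0, 1, 2, 2])
    temp = np.array([-0.5, 0.3, 1.2, -1.0])
    perm = np.array([0.1, -0.2, 0.4, 0.8])
    print(occurrence_probability(edu, temp, perm,
                                 intercepts, temp_slopes, perm_slopes))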
The development and functional control of reading-comprehension behavior.
Rosenbaum, M S; Breiling, J
1976-01-01
Reading comprehension, indicated by motor behavior and multiple-choice picture selection called for in written instructions, was taught to an autistic child using verbal prompts, modelling, and physical guidance. The child was rewarded for correct behaviors to training items; nonrewarded probes were used to assess generalization. Probable maintaining events were assessed through their sequential removal in a reversal design. Results showed: (a) following acquisition, performance was maintained at a near-100% level when candy, praise, attention, and training were removed, (b) absence of other persons was correlated with a marked decrease in performance, whereas their presence was associated with performance at near 100%, and (c) performance generalized to probes and across experimenters. Rewards, which may have been reinforcing during acquisition, did not appear necessary to maintain later performance. Instead, presence of others (a setting event) was demonstrated to have control over maintained performance.
Abdulghani, Hamza Mohammad; Shaik, Shaffi Ahamed; Khamis, Nehal; Al-Drees, Abdulmajeed Abdulrahman; Irshad, Mohammad; Khalil, Mahmoud Salah; Alhaqwi, Ali Ibrahim; Isnani, Arthur
2014-04-01
Qualitative and quantitative evaluation of academic programs can enhance the development, effectiveness, and dissemination of comparative quality reports as well as quality improvement efforts. To evaluate five research methodology workshops by assessing participants' satisfaction, knowledge and skills gain, and impact on practices, using the Kirkpatrick evaluation model. The four-level Kirkpatrick model was applied for the evaluation. Training feedback questionnaires, pre- and post-tests, learner development plan reports and behavioral surveys were used to evaluate the effectiveness of the workshop programs. Of the 116 participants, 28 (24.1%) liked the programs with appreciation, 62 (53.4%) liked them with suggestions and 26 (22.4%) disliked them. Pre- and post-test MCQ mean scores showed a significant improvement of 17.67% in relevant basic knowledge and cognitive skills (p ≤ 0.005). Pre- and post-test scores on workshop sub-topics improved for manuscript writing (p ≤ 0.031) but not significantly for proposal writing (p ≤ 0.834). As for impact, 56.9% of participants started research, and 6.9% published their studies. The results from participants' performance revealed overall positive feedback, and 79% of participants reported transfer of training skills at their workplace. The achievement of course outcomes and the suggestions given for improvement offer insight into the program, and were encouraging and very useful. Encouraging a "research culture" and work-based learning are probably the most powerful determinants for research promotion. These findings therefore encourage the faculty development unit to continue its training and development in research methodology.
Standard plane localization in ultrasound by radial component model and selective search.
Ni, Dong; Yang, Xin; Chen, Xin; Chin, Chien-Ting; Chen, Siping; Heng, Pheng Ann; Li, Shengli; Qin, Jing; Wang, Tianfu
2014-11-01
Acquisition of the standard plane is crucial for medical ultrasound diagnosis. However, this process requires substantial experience and a thorough knowledge of human anatomy; it is therefore very challenging for novices and even time consuming for experienced examiners. We propose a hierarchical, supervised learning framework for automatically detecting the standard plane from consecutive 2-D ultrasound images. We tested this technique by developing a system that localizes the fetal abdominal standard plane from ultrasound video by detecting three key anatomical structures: the stomach bubble, umbilical vein and spine. We first proposed a novel radial component-based model to describe the geometric constraints of these key anatomical structures. We then introduced a novel selective search method which exploits the vessel probability algorithm to produce probable locations for the spine and umbilical vein. Next, using component classifiers trained by random forests, we detected the key anatomical structures at their probable locations within the regions constrained by the radial component-based model. Finally, a second-level classifier combined the results from the component detection to identify an ultrasound image as either a "fetal abdominal standard plane" or a "non-fetal abdominal standard plane." Experimental results on 223 fetal abdomen videos showed that the detection accuracy of our method was as high as 85.6% and significantly outperformed both the full abdomen and the separate anatomy detection methods without geometric constraints. These experimental results demonstrate that our system shows great promise for application to clinical practice. Copyright © 2014 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Detection of Sea Ice and Open Water from RADARSAT-2 Images for Data Assimilation
NASA Astrophysics Data System (ADS)
Komarov, A.; Buehner, M.
2016-12-01
Automated detection of sea ice and open water from SAR data is very important for assimilation into coupled ocean-sea ice-atmosphere numerical models, such as the Regional Ice-Ocean Prediction System being implemented at Environment and Climate Change Canada. Conventional classification approaches based on various learning techniques are limited by the fact that they typically do not indicate the level of confidence of ice and water retrievals, whereas only retrievals with a very high level of confidence should be assimilated into the sea ice model to avoid propagating and magnifying errors in the numerical prediction system. In this study we developed a new technique for ice and water detection from dual-polarization RADARSAT-2 HH-HV images which provides the probability of ice/water at a given location. We collected many hundreds of thousands of SAR signatures over various sea ice types (i.e. new, grey, first-year, and multi-year ice) and open water from all available RADARSAT-2 images and the corresponding Canadian Ice Service Image Analysis products over the period from November 2010 to May 2016. Our analysis of the dataset revealed that ice/water separation can be performed effectively in the space of SAR-based variables that are independent of the incidence angle and noise floor (such as texture measures) and auxiliary Global Environmental Multiscale Model parameters (such as surface wind speed). The choice of parameters will be discussed specifically in the presentation. An empirical ice probability model as a function of the selected predictors was built in the form of a logistic regression, based on the training dataset from 2012 to 2016. The developed ice probability model showed very good performance on the independent testing subset (year 2011). With an ice/water probability threshold of 0.95, reflecting a very high level of confidence, 79% of the testing ice and water samples were classified, with an accuracy of 99%. These results are particularly important in light of the upcoming RADARSAT Constellation mission, which will drastically increase the amount of SAR data over the Arctic region.
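The retrieval-and-threshold workflow can be sketched with scikit-learn's logistic regression; the three predictors and the simulated labels below are placeholders for the SAR texture measures and wind speed described above.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    # Placeholder predictors: two texture measures and model wind speed.
    X = rng.normal(size=(n, 3))
    true_logit = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2]
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(int)   # 1 = ice

    model = LogisticRegression().fit(X[:4000], y[:4000])
    p_ice = model.predict_proba(X[4000:])[:, 1]

    confident_ice = p_ice >= 0.95
    confident_water = p_ice <= 0.05
    kept = confident_ice | confident_water       # only these would be assimilated
    labels = np.where(confident_ice, 1, 0)[kept]
    accuracy = np.mean(labels == y[4000:][kept])
    print(kept.mean(), accuracy)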
Learning abstract visual concepts via probabilistic program induction in a Language of Thought.
Overlan, Matthew C; Jacobs, Robert A; Piantadosi, Steven T
2017-11-01
The ability to learn abstract concepts is a powerful component of human cognition. It has been argued that variable binding is the key element enabling this ability, but the computational aspects of variable binding remain poorly understood. Here, we address this shortcoming by formalizing the Hierarchical Language of Thought (HLOT) model of rule learning. Given a set of data items, the model uses Bayesian inference to infer a probability distribution over stochastic programs that implement variable binding. Because the model makes use of symbolic variables as well as Bayesian inference and programs with stochastic primitives, it combines many of the advantages of both symbolic and statistical approaches to cognitive modeling. To evaluate the model, we conducted an experiment in which human subjects viewed training items and then judged which test items belong to the same concept as the training items. We found that the HLOT model provides a close match to human generalization patterns, significantly outperforming two variants of the Generalized Context Model, one variant based on string similarity and the other based on visual similarity using features from a deep convolutional neural network. Additional results suggest that variable binding happens automatically, implying that binding operations do not add complexity to people's hypothesized rules. Overall, this work demonstrates that a cognitive model combining symbolic variables with Bayesian inference and stochastic program primitives provides a new perspective for understanding people's patterns of generalization. Copyright © 2017 Elsevier B.V. All rights reserved.
1976-05-01
subjective in nature, it provides a practical method for analyzing a mass of data, including data which can be utilized to predict probable future... nature and administered when the individual student is unable to maintain acceptable performance during the training cycle. Service-wide remedial...are directly related to the curriculum topics of recruit training. Others are of a broader nature related to general Navy problems which present a
A token centric part-of-speech tagger for biomedical text.
Barrett, Neil; Weber-Jahnke, Jens
2014-05-01
A difficulty with part-of-speech (POS) tagging of biomedical text is accessing and annotating appropriate training corpora. These difficulties may result in POS taggers being trained on corpora that differ from the tagger's target biomedical text (cross-domain tagging). In such cases, where training and target corpora differ, tagging accuracy decreases. This paper presents a POS tagger for cross-domain tagging called TcT. TcT estimates a tag's likelihood for a given token by combining token collocation probabilities with the token's tag probabilities calculated using a Naive Bayes classifier. We compared TcT to three POS taggers used in the biomedical domain (mxpost, Brill and TnT). We trained each tagger on a non-biomedical corpus and evaluated it on biomedical corpora. TcT was more accurate in cross-domain tagging than mxpost, Brill and TnT (respective averages 83.9, 81.0, 79.5 and 78.8). Our analysis of tagger performance suggests that lexical differences between corpora have more effect on tagging accuracy than previous research has considered. Biomedical POS tagging algorithms may be modified to improve their cross-domain tagging accuracy without requiring extra training or large training data sets. Future work should re-examine POS tagging methods for biomedical text. This differs from the work to date, which has focused on retraining existing POS taggers. Copyright © 2014 Elsevier B.V. All rights reserved.
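A simplified sketch of the combination idea (using add-alpha smoothed counts rather than TcT's actual estimators): a tag's score multiplies a collocation probability conditioned on the previous tag by a per-token tag probability. The tiny corpus is a placeholder.

    import math
    from collections import defaultdict

    def train_counts(tagged_sentences):
        """Bigram tag-collocation counts and per-token tag counts."""
        trans = defaultdict(lambda: defaultdict(int))
        emit = defaultdict(lambda: defaultdict(int))
        for sent in tagged_sentences:
            prev = "<s>"
            for token, tag in sent:
                trans[prev][tag] += 1
                emit[token.lower()][tag] += 1
                prev = tag
        return trans, emit

    def tag_score(trans, emit, prev_tag, token, tag, alpha=0.1):
        """log P(tag | prev_tag) + log P(tag | token), both add-alpha smoothed."""
        t_row = trans[prev_tag]
        e_row = emit[token.lower()]
        t_total, e_total = sum(t_row.values()), sum(e_row.values())
        n_t, n_e = max(len(t_row), 1), max(len(e_row), 1)
        p_coll = (t_row[tag] + alpha) / (t_total + alpha * n_t)
        p_tok = (e_row[tag] + alpha) / (e_total + alpha * n_e)
        return math.log(p_coll) + math.log(p_tok)

    corpus = [[("the", "DT"), ("protein", "NN"), ("binds", "VBZ")],
              [("the", "DT"), ("cell", "NN"), ("divides", "VBZ")]]
    trans, emit = train_counts(corpus)
    print(tag_score(trans, emit, "DT", "protein", "NN"))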
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhong, Bin-Yan; He, Shi-Cheng; Zhu, Hai-Dong
Purpose: We aim to determine the predictors of new adjacent vertebral compression fractures (AVCFs) after percutaneous vertebroplasty (PVP) in patients with osteoporotic vertebral compression fractures (OVCFs) and to construct a risk prediction score to estimate the 2-year new AVCF risk by risk factor condition. Materials and Methods: Patients with OVCFs who underwent their first PVP between December 2006 and December 2013 at Hospital A (training cohort) and Hospital B (validation cohort) were included in this study. In the training cohort, we assessed the independent risk predictors and developed the probability of new adjacent OVCFs (PNAV) score system using Cox proportional hazard regression analysis. The accuracy of this system was then validated in both the training and validation cohorts by the concordance (c) statistic. Results: 421 patients (training cohort: n = 256; validation cohort: n = 165) were included in this study. In the training cohort, new AVCFs after the first PVP treatment occurred in 33 (12.9%) patients. The independent risk factors were intradiscal cement leakage and preexisting old vertebral compression fracture(s). The estimated 2-year absolute risk of new AVCFs ranged from less than 4% in patients with neither independent risk factor to more than 45% in individuals with both factors. Conclusions: The PNAV score is an objective and easy approach to predicting the risk of new AVCFs.
High-order statistics of weber local descriptors for image representation.
Han, Xian-Hua; Chen, Yen-Wei; Xu, Gang
2015-06-01
Highly discriminant visual features play a key role in different image classification applications. This study aims to realize a method for extracting highly discriminant features from images by exploring a robust local descriptor inspired by Weber's law. The investigated local descriptor is based on the fact that human perception of a pattern depends not only on the absolute intensity of the stimulus but also on the relative variation of the stimulus. Therefore, we first transform the original stimulus (the images in our study) into a differential-excitation domain according to Weber's law, and then explore a local patch, called a micro-Texton, in the transformed domain as the Weber local descriptor (WLD). Furthermore, we propose to model the Weber local descriptors with a parametric probability process and to extract higher-order statistics of the model parameters for image representation. The proposed strategy can adaptively characterize the WLD space using a generative probability model and then learn the parameters to better fit the training space, leading to a more discriminant representation of the images. To validate the efficiency of the proposed strategy, we apply it to three different image classification applications: texture, food image and HEp-2 cell pattern recognition. The results validate that our proposed strategy has advantages over the state-of-the-art approaches.
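A NumPy sketch of the first step only, the differential-excitation transform of the Weber local descriptor; the higher-order statistical modelling of the descriptors is not shown, and the toy image is a placeholder.

    import numpy as np

    def differential_excitation(image, eps=1e-6):
        """Weber differential excitation: arctan(sum_i (x_i - x_c) / x_c)."""
        img = image.astype(float)
        padded = np.pad(img, 1, mode="edge")
        diff_sum = np.zeros_like(img)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                neighbour = padded[1 + dr:1 + dr + img.shape[0],
                                   1 + dc:1 + dc + img.shape[1]]
                diff_sum += neighbour - img
        return np.arctan(diff_sum / (img + eps))

    toy = np.random.default_rng(0).integers(0, 256, size=(64, 64))
    xi = differential_excitation(toy)
    print(xi.min(), xi.max())          # values lie in (-pi/2, pi/2)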
Can we (control) Engineer the degree learning process?
NASA Astrophysics Data System (ADS)
White, A. S.; Censlive, M.; Neilsen, D.
2014-07-01
This paper investigates how control theory could be applied to learning processes in engineering education. The starting point for the analysis is White's Double Loop learning model of human automation control, modified for the education process, where a set of governing principles is chosen, probably by the course designer. After initial training the student unknowingly settles on a mental map or model. After observing how the real world behaves, a strategy to achieve the governing variables is chosen and a set of actions selected. This may not be a conscious operation; it may be completely instinctive. These actions will have consequences, but only after a certain time delay. The current model is compared with the work of Hollenbeck on goal setting, Nelson's model of self-regulation and that of Abdulwahed, Nagy and Blanchard at Loughborough, who investigated control methods applied to the learning process.
Spatial Uncertainty Modeling of Fuzzy Information in Images for Pattern Classification
Pham, Tuan D.
2014-01-01
The modeling of the spatial distribution of image properties is important for many pattern recognition problems in science and engineering. Mathematical methods are needed to quantify the variability of this spatial distribution based on which a decision of classification can be made in an optimal sense. However, image properties are often subject to uncertainty due to both incomplete and imprecise information. This paper presents an integrated approach for estimating the spatial uncertainty of vagueness in images using the theory of geostatistics and the calculus of probability measures of fuzzy events. Such a model for the quantification of spatial uncertainty is utilized as a new image feature extraction method, based on which classifiers can be trained to perform the task of pattern recognition. Applications of the proposed algorithm to the classification of various types of image data suggest the usefulness of the proposed uncertainty modeling technique for texture feature extraction. PMID:25157744
Probability and Statistics in Aerospace Engineering
NASA Technical Reports Server (NTRS)
Rheinfurth, M. H.; Howell, L. W.
1998-01-01
This monograph was prepared to give the practicing engineer a clear understanding of probability and statistics with special consideration to problems frequently encountered in aerospace engineering. It is conceived to be both a desktop reference and a refresher for aerospace engineers in government and industry. It could also be used as a supplement to standard texts for in-house training courses on the subject.
NASA Astrophysics Data System (ADS)
Mirkamali, M. S.; Keshavarz FK, N.; Bakhtiari, M. R.
2013-02-01
Faults, as main pathways for fluids, play a critical role in creating regions of high porosity and permeability, in breaching cap rock and in the migration of hydrocarbons into the reservoir. Therefore, accurate identification of fault zones is very important for maximizing production from petroleum traps. Image processing and modern visualization techniques provide better mapping of objects of interest. In this study, the application of fault mapping to the identification of fault zones within the Mishan and Aghajari formations, above the Guri base unconformity surface in the eastern part of the Persian Gulf, is investigated. Seismic single- and multi-trace attribute analyses are employed separately to determine faults in a vertical section, but different kinds of geological objects cannot be identified using individual attributes alone. A mapping model is utilized to improve the identification of the faults, giving more accurate results. This method combines all relevant individual attributes using a neural network to create combined attributes, which gives an optimal view of the object of interest. First, a set of relevant attributes was calculated separately on the vertical section. Then, at interpreted positions, example training locations were manually selected in each fault and non-fault class by an interpreter. A neural network was trained on combinations of the attributes extracted at the example training locations to generate an optimized fault cube. Finally, fault and non-fault probability cubes were estimated by applying the trained neural network to the entire data set. The fault probability cube was obtained with higher mapping accuracy, greater contrast and fewer disturbances than individual attributes. The results of this study can support better understanding of the data, providing fault zone mapping with reliable results.
Whitty, Jennifer A; Crosland, Paul; Hewson, Kaye; Narula, Rajan; Nathan, Timothy R; Campbell, Peter A; Keller, Andrew; Scuffham, Paul A
2014-03-01
To compare the costs of photoselective vaporisation (PVP) and transurethral resection of the prostate (TURP) for management of symptomatic benign prostatic hyperplasia (BPH) from the perspective of a Queensland public hospital provider. A decision-analytic model was used to compare the costs of PVP and TURP. Cost inputs were sourced from an audit of patients undergoing PVP or TURP across three hospitals. The probability of re-intervention was obtained from secondary literature sources. Probabilistic and multi-way sensitivity analyses were used to account for uncertainty and test the impact of varying key assumptions. In the base case analysis, which included equipment, training and re-intervention costs, PVP was AU$ 739 (95% credible interval [CrI] -12 187 to 14 516) more costly per patient than TURP. The estimate was most sensitive to changes in procedural costs, fibre costs and the probability of re-intervention. Sensitivity analyses based on data from the most favourable site or excluding equipment and training costs reduced the point estimate to favour PVP (incremental cost AU$ -684, 95% CrI -8319 to 5796 and AU$ -100, 95% CrI -13 026 to 13 678, respectively). However, CrIs were wide for all analyses. In this cost minimisation analysis, there was no significant cost difference between PVP and TURP, after accounting for equipment, training and re-intervention costs. However, PVP was associated with a shorter length of stay and lower procedural costs during audit, indicating PVP potentially provides comparatively good value for money once the technology is established. © 2013 The Authors. BJU International © 2013 BJU International.
Recognizing human actions by learning and matching shape-motion prototype trees.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2012-03-01
A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering and each training sequence is represented as a labeled prototype sequence; then a look-up table of prototype-to-prototype distances is generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set.
Prendergast, Geoffrey P; Staff, Michael
2017-01-01
This study examines the use of the number of night-time sleep disturbances as a health-based metric to assess the cost effectiveness of rail noise mitigation strategies for situations in which high-intensity noise events, such as freight train pass-bys and wheel squeal, dominate. Twenty residential properties adjacent to the existing and proposed rail tracks in a noise catchment area of the Epping to Thornleigh Third Track project were used as a case study. Awakening probabilities were calculated for individuals awakening 1, 3 and 5 times a night when subjected to 10 independent freight train pass-by noise events, using internal maximum sound pressure levels (LAFmax). Awakenings were predicted using a random-intercept multivariate logistic regression model. With source mitigation in place, the majority of residents were still predicted to be awoken at least once per night (median 88.0%), although substantial reductions in the median probabilities of awakening three and five times per night, from 50.9 to 29.4% and from 9.2 to 2.7% respectively, were predicted. This corresponded to a cost-effectiveness estimate of 7.6-8.8 fewer people awoken at least three times per night per A$1 million spent on noise barriers. The study demonstrates that an easily understood metric can readily assist decisions on noise mitigation for large-scale transport projects.
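For intuition only, the following sketch computes the probability of being awoken at least k times during 10 pass-by events from a single per-event awakening probability, treating events as independent; the study itself used a random-intercept logistic model rather than this simplified binomial view, and the per-event probability used here is made up.

from math import comb

def p_awaken_at_least(k, n_events, p_event):
    # Binomial tail: probability of being awoken at least k times in n_events
    # independent pass-bys, each awakening the sleeper with probability p_event.
    return sum(comb(n_events, j) * p_event ** j * (1 - p_event) ** (n_events - j)
               for j in range(k, n_events + 1))

# Example with an assumed per-event awakening probability of 0.19 over 10 pass-bys.
for k in (1, 3, 5):
    print(k, round(p_awaken_at_least(k, 10, 0.19), 3))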
Context Switch Effects on Acquisition and Extinction in Human Predictive Learning
ERIC Educational Resources Information Center
Rosas, Juan M.; Callejas-Aguilera, Jose E.
2006-01-01
Four experiments tested context switch effects on acquisition and extinction in human predictive learning. A context switch impaired probability judgments about a cue-outcome relationship when the cue was trained in a context in which a different cue underwent extinction. The context switch also impaired judgments about a cue trained in a context…
ERIC Educational Resources Information Center
Collier, Daniel A.; Rosch, David M.; Houston, Derek A.
2017-01-01
International student enrollment has experienced dramatic increases on U.S. campuses. Using a national dataset, the study explores and compares international and domestic students' incoming and post-training levels of motivation to lead, leadership self-efficacy, and leadership skill using inverse-probability weighting of propensity scores to…
The Training of Science Teachers in Papua New Guinea
ERIC Educational Resources Information Center
Palmer, W. P.
1987-01-01
The Currie Report (1964) pointed out that "So far as any one strand in the 'seamless web' of education can be picked out as of more fundamental importance than another it is the training of teachers; probably no other activity of the Administration is quite so important as this." Klassen (1982) has also rated teacher education…
The Use of Psychological Tests in Predicting Vocational Success of Disadvantaged Adults.
ERIC Educational Resources Information Center
Stanley, Charlton S.
A study examined the relationship between certain test scores and probable training and vocational success. Three major training areas were examined: power sewing machine, nurse aide, and clerical office work. Six tests were evaluated for their ability to predict success: the WAIS Revised Beta; Purdue Pegboard; English, California Surveys of…
NASA Technical Reports Server (NTRS)
Troudet, Terry; Merrill, Walter C.
1990-01-01
The ability of feed-forward neural network architectures to learn continuous valued mappings in the presence of noise was demonstrated in relation to parameter identification and real-time adaptive control applications. An error function was introduced to help optimize parameter values such as the number of training iterations, observation time, sampling rate, and scaling of the control signal. The learning performance depended essentially on the degree of embodiment of the control law in the training data set and on the degree of uniformity of the probability distribution function of the data presented to the net during the training sequence. When a control law was corrupted by noise, the fluctuations of the training data biased the probability distribution function of the training data sequence. Only if the noise contamination is minimized and the degree of embodiment of the control law is maximized can a neural net develop a good representation of the mapping and be used as a neurocontroller. A multilayer net was trained with back-error-propagation to control a cart-pole system for linear and nonlinear control laws in the presence of data processing noise and measurement noise. The neurocontroller exhibited noise-filtering properties and was found to operate more smoothly than the teacher in the presence of measurement noise.
Osmanski-Zenk, Katrin; Finze, Susanne; Lenz, Robert; Bader, Rainer; Mittelmeier, Wolfram
2018-06-26
The study aims to evaluate whether the postoperative outcome and the probability of complications of patients with total hip arthroplasty differ significantly when surgeons in training are in charge, assisted by a high-volume surgeon, compared to a highly experienced orthopaedic surgeon, within the context of a high-volume hospital certified to EndoCert. 192 patients with a primary hip arthroplasty were included. To assess the outcome, the Harris Hip Score, WOMAC, SF-36 and EuroQol-5D were surveyed preoperatively and 12 months postoperatively. As complications, we considered the quality indicators defined by EndoCert. We found significant improvements in the postoperative score values irrespective of the qualification of the surgeon in charge, whether a high-volume surgeon or a surgeon in training was responsible. If a surgeon in training is assisted by a highly experienced surgeon, the risk of complications does not increase, although the operating time was significantly longer. Both the surgeon in training and the arthroplasty patient benefit from implementing the EndoCert system, because the postoperative outcome and the complication probability are independent of the qualification of the operating orthopaedic surgeon performing total hip arthroplasty when assisted by an experienced surgeon. Georg Thieme Verlag KG Stuttgart · New York.
Prediction and uncertainty in human Pavlovian to instrumental transfer.
Trick, Leanne; Hogarth, Lee; Duka, Theodora
2011-05-01
Attentional capture and behavioral control by conditioned stimuli have been dissociated in animals. The current study assessed this dissociation in humans. Participants were trained on a Pavlovian schedule in which 3 visual stimuli, A, B, and C, predicted the occurrence of an aversive noise with 90%, 50%, or 10% probability, respectively. Participants then went on to separate instrumental training in which a key-press response canceled the aversive noise with a .5 probability on a variable interval schedule. Finally, in the transfer phase, the 3 Pavlovian stimuli were presented in this instrumental schedule and were no longer differentially predictive of the outcome. Observing times and gaze dwell time indexed attention to these stimuli in both training and transfer. Aware participants acquired veridical outcome expectancies in training--that is, A > B > C, and these expectancies persisted into transfer. Most important, the transfer effect accorded with these expectancies, A > B > C. By contrast, observing times accorded with uncertainty--that is, they showed B > A = C during training, and B < A = C in the transfer phase. Dwell time bias supported this association between attention and uncertainty, although these data showed a slightly more complicated pattern. Overall, the study suggests that transfer is linked to outcome prediction and is dissociated from attention to conditioned stimuli, which is linked to outcome uncertainty.
Optimizing Chemical Reactions with Deep Reinforcement Learning.
Zhou, Zhenpeng; Li, Xiaocheng; Zare, Richard N
2017-12-27
Deep reinforcement learning was employed to optimize chemical reactions. Our model iteratively records the results of a chemical reaction and chooses new experimental conditions to improve the reaction outcome. This model outperformed a state-of-the-art black-box optimization algorithm by using 71% fewer steps on both simulations and real reactions. Furthermore, we introduced an efficient exploration strategy by drawing the reaction conditions from certain probability distributions, which resulted in an improvement in regret from 0.062 to 0.039 compared with a deterministic policy. Combining the efficient exploration policy with accelerated microdroplet reactions, optimal reaction conditions were determined in 30 min for the four reactions considered, and a better understanding of the factors that control microdroplet reactions was reached. Moreover, our model showed better performance after training on reactions with similar or even dissimilar underlying mechanisms, which demonstrates its learning ability.
Shape-Driven 3D Segmentation Using Spherical Wavelets
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen
2013-01-01
This paper presents a novel active surface segmentation algorithm using a multiscale shape representation and prior. We define a parametric model of a surface using spherical wavelet functions and learn a prior probability distribution over the wavelet coefficients to model shape variations at different scales and spatial locations in a training set. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior in the segmentation framework. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to the segmentation of brain caudate nucleus, of interest in the study of schizophrenia. Our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm by capturing finer shape details. PMID:17354875
Mahalingam, Rajasekaran; Peng, Hung-Pin; Yang, An-Suei
2014-08-01
Protein-fatty acid interaction is vital for many cellular processes and understanding this interaction is important for functional annotation as well as drug discovery. In this work, we present a method for predicting the fatty acid (FA)-binding residues by using three-dimensional probability density distributions of interacting atoms of FAs on protein surfaces which are derived from the known protein-FA complex structures. A machine learning algorithm was established to learn the characteristic patterns of the probability density maps specific to the FA-binding sites. The predictor was trained with five-fold cross validation on a non-redundant training set and then evaluated with an independent test set as well as on holo-apo pair's dataset. The results showed good accuracy in predicting the FA-binding residues. Further, the predictor developed in this study is implemented as an online server which is freely accessible at the following website, http://ismblab.genomics.sinica.edu.tw/. Copyright © 2014 Elsevier B.V. All rights reserved.
Interactive vs. Non-Interactive Ensembles for Weather Prediction and Climate Projection
NASA Astrophysics Data System (ADS)
Duane, Gregory
2013-04-01
If the members of an ensemble of different models are allowed to interact with one another in run time, predictive skill can be improved compared to that of any individual model or any average of individual model outputs. Inter-model connections in such an interactive ensemble can be trained, using historical data, so that the resulting "supermodel" synchronizes with reality when used in weather-prediction mode, where the individual models perform data assimilation from each other (with trainable inter-model "observation error") as well as from real observations. In climate-projection mode, parameters of the individual models are changed, as might occur from an increase in GHG levels, and one obtains relevant statistical properties of the new supermodel attractor. In simple cases, it has been shown that training of the inter-model connections with the old parameter values gives a supermodel that is still predictive when the parameter values are changed. Here we inquire as to the circumstances under which supermodel performance can be expected to exceed that of the customary weighted average of model outputs. We consider a supermodel formed from quasigeostrophic channel models with different forcing coefficients, and introduce an effective training scheme for the inter-model connections. We show that the blocked-zonal index cycle is reproduced better by the supermodel than by any non-interactive ensemble in the extreme case where the forcing coefficients of the different models are very large or very small. With realistic differences in forcing coefficients, as would be representative of actual differences among IPCC-class models, the usual linearity assumption is justified and a weighted average of model outputs is adequate. It is therefore hypothesized that supermodeling is likely to be useful in situations where there are qualitative model differences, arising from sub-gridscale parameterizations, that affect overall model behavior. Otherwise the usual ex post facto averaging will probably suffice. Previous results from an ENSO-prediction supermodel [Kirtman et al.] are re-examined in light of the hypothesis about the importance of qualitative inter-model differences.
Moshtagh-Khorasani, Majid; Akbarzadeh-T, Mohammad-R; Jahangiri, Nader; Khoobdel, Mehdi
2009-01-01
BACKGROUND: Aphasia diagnosis is particularly challenging due to linguistic uncertainty and vagueness, inconsistencies in the definition of aphasic syndromes, a large number of imprecise measurements, and natural diversity and subjectivity in test subjects as well as in the opinions of experts who diagnose the disease. METHODS: Fuzzy probability is proposed here as the basic framework for handling the uncertainties in medical diagnosis and particularly aphasia diagnosis. To efficiently construct this fuzzy probabilistic mapping, statistical analysis is performed to construct input membership functions and to determine an effective set of input features. RESULTS: Considering the high sensitivity of performance measures to different distributions of testing/training sets, a statistical t-test of significance is applied to compare the fuzzy approach results with NN results as well as with the authors' earlier work using fuzzy logic. The proposed fuzzy probability estimator approach clearly provides better diagnosis for both classes of data sets. Specifically, for the first and second types of fuzzy probability classifiers, i.e. the spontaneous speech and comprehensive models, P-values are 2.24E-08 and 0.0059, respectively, strongly rejecting the null hypothesis. CONCLUSIONS: The technique is applied and compared on both comprehensive and spontaneous speech test data for diagnosis of four aphasia types: Anomic, Broca, Global and Wernicke. Statistical analysis confirms that the proposed approach can significantly improve accuracy using fewer aphasia features. PMID:21772867
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wiita, Joanne
The Alaska Native Weatherization Training and Jobs Project expanded weatherization services for tribal members' homes in southeast Alaska while providing weatherization training and on-the-job training (OJT) for tribal citizens, leading to jobs and most probably careers in weatherization-related occupations. The program resulted in: (a) 80 Alaska Native citizens provided with skills training in five weatherization training units, delivered in cooperation with the University of Alaska Southeast in accordance with the U.S. Department of Energy Core Competencies for Weatherization Training, that prepared participants for employment in three weatherization-related occupations: Installer, Crew Chief, and Auditor; (b) 25 paid OJT training opportunities for trainees who successfully completed the training course; and (c) trained personnel employed to begin weatherization rehabilitation work on over 1,000 housing units.
Pitfalls in statistical landslide susceptibility modelling
NASA Astrophysics Data System (ADS)
Schröder, Boris; Vorpahl, Peter; Märker, Michael; Elsenbeer, Helmut
2010-05-01
The use of statistical methods is a well-established approach to predict landslide occurrence probabilities and to assess landslide susceptibility. This is achieved by applying statistical methods relating historical landslide inventories to topographic indices as predictor variables. In our contribution, we compare several new and powerful methods developed in machine learning and well-established in landscape ecology and macroecology for predicting the distribution of shallow landslides in tropical mountain rainforests in southern Ecuador (among others: boosted regression trees, multivariate adaptive regression splines, maximum entropy). Although these methods are powerful, we think it is necessary to follow a basic set of guidelines to avoid pitfalls regarding data sampling, predictor selection, and model quality assessment, especially if a comparison of different models is contemplated. We therefore suggest applying a novel toolbox to evaluate approaches to the statistical modelling of landslide susceptibility. Additionally, we propose some methods to open the "black box" inherent in machine learning methods in order to achieve further explanatory insights into the preparatory factors that control landslides. Sampling of training data should be guided by hypotheses regarding the processes that lead to slope failure, taking into account their respective spatial scales. This approach leads to the selection of a set of candidate predictor variables considered on adequate spatial scales. This set should be checked for multicollinearity in order to facilitate interpretation of the model response curves. Model quality assessment evaluates how well a model is able to reproduce independent observations of its response variable. This includes criteria to evaluate different aspects of model performance, i.e. model discrimination, model calibration, and model refinement. In order to assess a possible violation of the assumption of independence in the training samples or a possible lack of explanatory information in the chosen set of predictor variables, the model residuals need to be checked for spatial autocorrelation; we therefore calculate spline correlograms. In addition, we investigate partial dependency plots and bivariate interaction plots, considering possible interactions between predictors, to improve model interpretation. In presenting this toolbox for model quality assessment, we investigate the influence of strategies for constructing training datasets for statistical models on model quality.
A Hybrid Semi-supervised Classification Scheme for Mining Multisource Geospatial Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vatsavai, Raju; Bhaduri, Budhendra L
2011-01-01
Supervised learning methods such as Maximum Likelihood (ML) are often used in land cover (thematic) classification of remote sensing imagery. The ML classifier relies exclusively on the spectral characteristics of thematic classes, whose statistical distributions (class conditional probability densities) are often overlapping. The spectral response distributions of thematic classes depend on many factors including elevation, soil types, and ecological zones. A second problem with statistical classifiers is the requirement for a large number of accurate training samples (10 to 30 times the number of dimensions), which are often costly and time consuming to acquire over large geographic regions. With the increasing availability of geospatial databases, it is possible to exploit the knowledge derived from these ancillary datasets to improve classification accuracies even when the class distributions are highly overlapping. Likewise, newer semi-supervised techniques can be adopted to improve the parameter estimates of the statistical model by utilizing a large number of easily available unlabeled training samples. Unfortunately there is no convenient multivariate statistical model that can be employed for multisource geospatial databases. In this paper we present a hybrid semi-supervised learning algorithm that effectively exploits freely available unlabeled training samples from multispectral remote sensing images and also incorporates ancillary geospatial databases. We have conducted several experiments on real datasets, and our new hybrid approach shows a 25 to 35% improvement in overall classification accuracy over conventional classification schemes.
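A minimal sketch of the semi-supervised idea (not the paper's multisource algorithm): class-conditional Gaussians are initialised from the few labeled samples and then refined with EM over the unlabeled samples; diagonal covariances and all variable names are simplifying assumptions.

import numpy as np
from scipy.stats import multivariate_normal as mvn

def semi_supervised_gaussians(Xl, yl, Xu, n_iter=20):
    # Labeled pixels keep hard class responsibilities; unlabeled pixels get soft
    # responsibilities that are updated by EM (diagonal covariances for simplicity).
    classes = np.unique(yl)
    K = len(classes)
    pi = np.array([(yl == c).mean() for c in classes])
    mu = np.array([Xl[yl == c].mean(axis=0) for c in classes])
    var = np.array([Xl[yl == c].var(axis=0) + 1e-6 for c in classes])
    X = np.vstack([Xl, Xu])
    for _ in range(n_iter):
        like = np.stack([pi[k] * mvn.pdf(Xu, mu[k], np.diag(var[k]))
                         for k in range(K)], axis=1)
        resp_u = like / like.sum(axis=1, keepdims=True)      # E-step on unlabeled data
        for k, c in enumerate(classes):                      # M-step on all data
            r = np.concatenate([(yl == c).astype(float), resp_u[:, k]])
            w = r.sum()
            pi[k] = w / len(r)
            mu[k] = (r[:, None] * X).sum(axis=0) / w
            var[k] = (r[:, None] * (X - mu[k]) ** 2).sum(axis=0) / w + 1e-6
    return pi, mu, var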
Mechanisms of Neurofeedback: A Computation-theoretic Approach.
Davelaar, Eddy J
2018-05-15
Neurofeedback training is a form of brain training in which information about a neural measure is fed back to the trainee who is instructed to increase or decrease the value of that particular measure. This paper focuses on electroencephalography (EEG) neurofeedback in which the neural measures of interest are the brain oscillations. To date, the neural mechanisms that underlie successful neurofeedback training are still unexplained. Such an understanding would benefit researchers, funding agencies, clinicians, regulatory bodies, and insurance firms. Based on recent empirical work, an emerging theory couched firmly within computational neuroscience is proposed that advocates a critical role of the striatum in modulating EEG frequencies. The theory is implemented as a computer simulation of peak alpha upregulation, but in principle any frequency band at one or more electrode sites could be addressed. The simulation successfully learns to increase its peak alpha frequency and demonstrates the influence of threshold setting - the threshold that determines whether positive or negative feedback is provided. Analyses of the model suggest that neurofeedback can be likened to a search process that uses importance sampling to estimate the posterior probability distribution over striatal representational space, with each representation being associated with a distribution of values of the target EEG band. The model provides an important proof of concept to address pertinent methodological questions about how to understand and improve EEG neurofeedback success. Copyright © 2017 IBRO. Published by Elsevier Ltd. All rights reserved.
Knowledge based word-concept model estimation and refinement for biomedical text mining.
Jimeno Yepes, Antonio; Berlanga, Rafael
2015-02-01
Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining related tasks. Unfortunately, KBs have not been devised for text mining tasks but for human interpretation, so the performance of KB-based methods is usually lower when compared to supervised machine learning methods. The disadvantage of supervised methods, though, is that they require labeled training data and are therefore not useful for large scale biomedical text mining systems. KB-based methods do not have this limitation. In this paper, we describe a novel method to generate word-concept probabilities from a KB, which can serve as a basis for several text mining tasks. This method takes into account not only the underlying patterns within the descriptions contained in the KB but also those in texts available from large unlabeled corpora such as MEDLINE. The parameters of the model have been estimated without training data. Patterns from MEDLINE have been built using MetaMap for entity recognition and related using co-occurrences. The word-concept probabilities were evaluated on the task of word sense disambiguation (WSD). The results showed that our method obtained a higher degree of accuracy than other state-of-the-art approaches when evaluated on the MSH WSD data set. We also evaluated our method on the task of document ranking using MEDLINE citations. These results also showed an increase in performance over existing baseline retrieval approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
Quasar probabilities and redshifts from WISE mid-IR through GALEX UV photometry
NASA Astrophysics Data System (ADS)
DiPompeo, M. A.; Bovy, J.; Myers, A. D.; Lang, D.
2015-09-01
Extreme deconvolution (XD) of broad-band photometric data can both separate stars from quasars and generate probability density functions for quasar redshifts, while incorporating flux uncertainties and missing data. Mid-infrared photometric colours are now widely used to identify hot dust intrinsic to quasars, and the release of all-sky WISE data has led to a dramatic increase in the number of IR-selected quasars. Using forced photometry on public WISE data at the locations of Sloan Digital Sky Survey (SDSS) point sources, we incorporate this all-sky data into the training of the XDQSOz models originally developed to select quasars from optical photometry. The combination of WISE and SDSS information is far more powerful than SDSS alone, particularly at z > 2. The use of SDSS+WISE photometry is comparable to the use of SDSS+ultraviolet+near-IR data. We release a new public catalogue of 5 537 436 (total; 3 874 639 weighted by probability) potential quasars with probability P_QSO > 0.2. The catalogue includes redshift probabilities for all objects. We also release an updated version of the publicly available set of codes to calculate quasar and redshift probabilities for various combinations of data. Finally, we demonstrate that this method of selecting quasars using WISE data is both more complete and efficient than simple WISE colour-cuts, especially at high redshift. Our fits verify that above z ~ 3 WISE colours become bluer than the standard cuts applied to select quasars. Currently, the analysis is limited to quasars with optical counterparts, and thus cannot be used to find highly obscured quasars that WISE colour-cuts identify in significant numbers.
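The following sketch conveys the classification step only: fit separate density models to quasar and star training colours and convert them to a posterior quasar probability. Plain Gaussian mixtures stand in for extreme deconvolution (which additionally models per-object flux uncertainties and missing bands), and the colours and priors below are invented.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
colors_qso = rng.normal([0.2, 0.8], 0.3, size=(500, 2))    # toy training colours
colors_star = rng.normal([1.0, 0.1], 0.3, size=(2000, 2))

gmm_qso = GaussianMixture(n_components=3, random_state=0).fit(colors_qso)
gmm_star = GaussianMixture(n_components=3, random_state=0).fit(colors_star)
prior_qso = len(colors_qso) / (len(colors_qso) + len(colors_star))

def p_qso(x):
    # Posterior probability that an object with colours x is a quasar.
    lq = np.exp(gmm_qso.score_samples(x)) * prior_qso
    ls = np.exp(gmm_star.score_samples(x)) * (1 - prior_qso)
    return lq / (lq + ls)

print(p_qso(np.array([[0.3, 0.7]])))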
NASA Astrophysics Data System (ADS)
DiFranco, Matthew D.; Reynolds, Hayley M.; Mitchell, Catherine; Williams, Scott; Allan, Prue; Haworth, Annette
2015-03-01
Reliable automated prostate tumor detection and characterization in whole-mount histology images is sought in many applications, including post-resection tumor staging and as ground-truth data for multi-parametric MRI interpretation. In this study, an ensemble-based supervised classification algorithm for high-resolution histology images was trained on tile-based image features including histogram and gray-level co-occurrence statistics. The algorithm was assessed using different combinations of H and E prostate slides from two separate medical centers and at two different magnifications (400x and 200x), with the aim of applying tumor classification models to new data. Slides from both datasets were annotated by expert pathologists in order to identify homogeneous cancerous and non-cancerous tissue regions of interest, which were then categorized as (1) low-grade tumor (LG-PCa), including Gleason 3 and high-grade prostatic intraepithelial neoplasia (HG-PIN), (2) high-grade tumor (HG-PCa), including various Gleason 4 and 5 patterns, or (3) non-cancerous, including benign stroma and benign prostatic hyperplasia (BPH). Classification models for both LG-PCa and HG-PCa were separately trained using a support vector machine (SVM) approach, and per-tile tumor prediction maps were generated from the resulting ensembles. Results showed high sensitivity for predicting HG-PCa, with an AUC up to 0.822 using training data from both medical centres, while LG-PCa showed a lower sensitivity of 0.763 with the same training data. Visual inspection of cancer probability heatmaps from 9 patients showed that 17/19 tumors were detected, and HG-PCa generally produced fewer false positives than LG-PCa.
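A compact sketch of the per-tile classification and heatmap step under stated assumptions: the tile features, labels, slide grid size and scikit-learn SVM settings below are placeholders, not values from the study.

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Illustrative tile features (e.g. histogram and grey-level co-occurrence statistics).
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 16))
y = (X[:, 0] - X[:, 3] > 0).astype(int)          # 1 = tumour tile, 0 = benign tile

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X, y)

# Per-tile cancer probabilities reshaped into a heatmap over a 20 x 30 slide grid.
tiles = rng.normal(size=(20 * 30, 16))
heatmap = clf.predict_proba(tiles)[:, 1].reshape(20, 30)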
NASA Astrophysics Data System (ADS)
Xue, Zhaohui; Du, Peijun; Li, Jun; Su, Hongjun
2017-02-01
The generally limited availability of training data relative to the usually high data dimension poses a great challenge to accurate classification of hyperspectral imagery, especially for identifying crops characterized by highly correlated spectra. Moreover, traditional parametric classification models are problematic due to the need for non-singular class-specific covariance matrices. In this research, a novel sparse graph regularization (SGR) method is presented, aiming at robust crop mapping using hyperspectral imagery with very few in situ data. The core of SGR lies in propagating labels from known data to unknown, which is triggered by: (1) the fraction matrix generated for the large unknown data by using an effective sparse representation algorithm with respect to the few training data serving as the dictionary; and (2) the prediction function estimated for the few training data by formulating a regularization model based on a sparse graph. The labels of the large unknown data can then be obtained by maximizing the posterior probability distribution based on these two ingredients. SGR is more discriminative, data-adaptive, robust to noise, and efficient, which makes it unique with regard to previously proposed approaches and gives it high potential in discriminating crops, especially when facing insufficient training data and a high-dimensional spectral space. The study area is located in the Zhangye basin in the middle reaches of the Heihe watershed, Gansu, China, where eight crop types were mapped with Compact Airborne Spectrographic Imager (CASI) and Shortwave Infrared Airborne Spectrographic Imager (SASI) hyperspectral data. Experimental results demonstrate that the proposed method significantly outperforms other traditional and state-of-the-art methods.
Levine, M W
1991-01-01
Simulated neural impulse trains were generated by a digital realization of the integrate-and-fire model. The variability in these impulse trains had as its origin a random noise of specified distribution. Three different distributions were used: the normal (Gaussian) distribution (no skew, normokurtic), a first-order gamma distribution (positive skew, leptokurtic), and a uniform distribution (no skew, platykurtic). Despite these differences in the distribution of the variability, the distributions of the intervals between impulses were nearly indistinguishable. These inter-impulse distributions were better fit with a hyperbolic gamma distribution than a hyperbolic normal distribution, although one might expect a better approximation for normally distributed inverse intervals. Consideration of why the inter-impulse distribution is independent of the distribution of the causative noise suggests two putative interval distributions that do not depend on the assumed noise distribution: the log normal distribution, which is predicated on the assumption that long intervals occur with the joint probability of small input values, and the random walk equation, which is the diffusion equation applied to a random walk model of the impulse generating process. Either of these equations provides a more satisfactory fit to the simulated impulse trains than the hyperbolic normal or hyperbolic gamma distributions. These equations also provide better fits to impulse trains derived from the maintained discharges of ganglion cells in the retinae of cats or goldfish. It is noted that both equations are free from the constraint that the coefficient of variation (CV) have a maximum of unity.(ABSTRACT TRUNCATED AT 250 WORDS)
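A small simulation in the spirit of the experiment above, assuming a simple discrete integrate-and-fire rule and three noise distributions matched in variance; the threshold, drive and step count are arbitrary choices for illustration.

import numpy as np

def integrate_and_fire(noise, threshold=30.0, drive=1.0, n_steps=50_000):
    # Accumulate drive plus noise each step; emit a spike and reset at threshold.
    rng = np.random.default_rng(0)
    v, last, intervals = 0.0, 0, []
    for t in range(n_steps):
        v += drive + noise(rng)
        if v >= threshold:
            intervals.append(t - last)
            last, v = t, 0.0
    return np.array(intervals)

# Unit-variance noise with different shapes: Gaussian, gamma (positive skew), uniform.
dists = {
    "normal":  lambda r: r.normal(0.0, 1.0),
    "gamma":   lambda r: r.gamma(1.0, 1.0) - 1.0,
    "uniform": lambda r: r.uniform(-np.sqrt(3), np.sqrt(3)),
}
for name, f in dists.items():
    isi = integrate_and_fire(f)
    print(name, round(isi.mean(), 1), round(isi.std() / isi.mean(), 3))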
Yates, Justin R; Breitenstein, Kerry A; Gunkel, Benjamin T; Hughes, Mallory N; Johnson, Anthony B; Rogers, Katherine K; Shape, Sara M
Risky decision making can be measured using a probability-discounting procedure, in which animals choose between a small, certain reinforcer and a large, uncertain reinforcer. Recent evidence has identified glutamate as a mediator of risky decision making, as blocking the N-methyl-d-aspartate (NMDA) receptor with MK-801 increases preference for a large, uncertain reinforcer. Because the order in which probabilities associated with the large reinforcer can modulate the effects of drugs on choice, the current study determined if NMDA receptor ligands alter probability discounting using ascending and descending schedules. Sixteen rats were trained in a probability-discounting procedure in which the odds against obtaining the large reinforcer increased (n=8) or decreased (n=8) across blocks of trials. Following behavioral training, rats received treatments of the NMDA receptor ligands MK-801 (uncompetitive antagonist; 0, 0.003, 0.01, or 0.03mg/kg), ketamine (uncompetitive antagonist; 0, 1.0, 5.0, or 10.0mg/kg), and ifenprodil (NR2B-selective non-competitive antagonist; 0, 1.0, 3.0, or 10.0mg/kg). Results showed discounting was steeper (indicating increased risk aversion) for rats on an ascending schedule relative to rats on the descending schedule. Furthermore, the effects of MK-801, ketamine, and ifenprodil on discounting were dependent on the schedule used. Specifically, the highest dose of each drug decreased risk taking in rats in the descending schedule, but only MK-801 (0.03mg/kg) increased risk taking in rats on an ascending schedule. These results show that probability presentation order modulates the effects of NMDA receptor ligands on risky decision making. Copyright © 2016 Elsevier Inc. All rights reserved.
From image captioning to video summary using deep recurrent networks and unsupervised segmentation
NASA Astrophysics Data System (ADS)
Morosanu, Bogdan-Andrei; Lemnaru, Camelia
2018-04-01
Automatic captioning systems based on recurrent neural networks have been tremendously successful at providing realistic natural language captions for complex and varied image data. We explore methods for adapting existing models trained on large image caption data sets to a similar problem, that of summarising videos using natural language descriptions and frame selection. These architectures create internal high level representations of the input image that can be used to define probability distributions and distance metrics on these distributions. Specifically, we interpret each hidden unit inside a layer of the caption model as representing the un-normalised log probability of some unknown image feature of interest for the caption generation process. We can then apply well understood statistical divergence measures to express the difference between images and create an unsupervised segmentation of video frames, classifying consecutive images of low divergence as belonging to the same context, and those of high divergence as belonging to different contexts. To provide a final summary of the video, we provide a group of selected frames and a text description accompanying them, allowing a user to perform a quick exploration of large unlabeled video databases.
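A minimal sketch of the segmentation step as described: hidden activations are treated as unnormalised log probabilities, turned into distributions with a softmax, and consecutive frames are split where a symmetric Kullback-Leibler divergence exceeds a threshold. The function names and the threshold value are assumptions.

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sym_kl(p, q, eps=1e-12):
    p, q = p + eps, q + eps
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def segment(hidden_states, threshold=0.5):
    # hidden_states[t]: hidden-layer activation of the caption model for frame t,
    # interpreted as unnormalised log probabilities of latent image features.
    dists = [softmax(h) for h in hidden_states]
    return [t for t in range(1, len(dists))
            if sym_kl(dists[t - 1], dists[t]) > threshold]   # context boundaries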
NASA Astrophysics Data System (ADS)
Qi, D.; Majda, A.
2017-12-01
A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in principal model directions with largest variability in high-dimensional turbulent system and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve the optimal model performance. The idea in the reduced-order method is from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to display the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. Besides, the reduced-order models are also used to capture crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities like the tracer spectrum and fat-tails in the tracer probability density functions in the most important large scales can be captured efficiently with accuracy using the reduced-order tracer model in various dynamical regimes of the flow field with distinct statistical structures.
Lisiecki, R S; Voigt, H F
1995-08-01
A 2-channel action-potential generator system was designed for use in testing neurophysiologic data acquisition/analysis systems. The system consists of a personal computer controlling an external hardware unit. This system is capable of generating 2 channels of simulated action potential (AP) waveshapes. The AP waveforms are generated from the linear combination of 2 principal-component template functions. Each channel generates randomly occurring APs with a specified rate ranging from 1 to 200 events per second. The 2 trains may be independent of one another or the second channel may be made to be excited or inhibited by the events from the first channel with user-specified probabilities. A third internal channel may be made to excite or inhibit events in both of the 2 output channels with user-specified rate parameters and probabilities. The system produces voltage waveforms that may be used to test neurophysiologic data acquisition systems for recording from 2 spike trains simultaneously and for testing multispike-train analysis (e.g., cross-correlation) software.
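For illustration, a stripped-down software analogue of the two-channel generator: Bernoulli-approximated Poisson event trains in which each channel-1 event additionally excites channel 2 with a user-specified probability. The rates, time step and excitation mechanism are simplified assumptions rather than the hardware unit's actual design.

import numpy as np

def two_channel_trains(rate1=20.0, rate2=20.0, p_excite=0.3, dt=1e-3, t_max=10.0):
    # Two independent Bernoulli spike trains; each channel-1 spike additionally
    # triggers a channel-2 spike with probability p_excite (inhibition could be
    # modelled analogously by deleting coincident channel-2 spikes).
    rng = np.random.default_rng(0)
    n = int(t_max / dt)
    ch1 = rng.random(n) < rate1 * dt
    ch2 = rng.random(n) < rate2 * dt
    ch2 |= ch1 & (rng.random(n) < p_excite)
    return ch1, ch2

ch1, ch2 = two_channel_trains()
print(ch1.sum(), ch2.sum())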
Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J
2007-08-01
Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.
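A schematic of the automated-training idea under stated assumptions: only voxels where the registered atlas gives a very confident tissue label are kept as kNN training samples. The feature layout, confidence threshold and k below are placeholders, not the study's settings.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def train_knn_from_atlas(intensities, atlas_probs, k=45, keep=0.99):
    # intensities: (n_voxels, n_channels) MR intensities; atlas_probs: (n_voxels, 3)
    # registered atlas probabilities for CSF, GM and WM.
    labels = atlas_probs.argmax(axis=1)
    confident = atlas_probs.max(axis=1) >= keep          # keep most reliable samples
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(intensities[confident], labels[confident])
    return knn

# knn.predict(new_intensities) would then label every voxel of a new scan.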
NASA Astrophysics Data System (ADS)
Pérez, B.; Brower, R.; Beckers, J.; Paradis, D.; Balseiro, C.; Lyons, K.; Cure, M.; Sotillo, M. G.; Hacket, B.; Verlaan, M.; Alvarez Fanjul, E.
2011-04-01
ENSURF (Ensemble SURge Forecast) is a multi-model application for sea level forecasting that makes use of existing storm surge and circulation models currently operational in Europe, as well as near-real-time tide gauge data in the region, with the following main goals: providing easy access to existing forecasts, as well as to their performance and model validation, by means of an adequate visualization tool; and generating better forecasts of sea level, including confidence intervals, by means of the Bayesian Model Averaging (BMA) technique. The system was developed and implemented within the ECOOP (C. No. 036355) European project for the NOOS and IBIROOS regions, based on the MATROOS visualization tool developed by Deltares. Both systems are currently operational at Deltares and Puertos del Estado, respectively. The Bayesian Model Averaging technique generates an overall forecast probability density function (PDF) by making a weighted average of the individual forecast PDFs; the weights represent the probability that a model will give the correct forecast PDF and are determined and updated operationally based on the performance of the models during a recent training period. This implies that the technique requires sea level data from tide gauges in near-real time. Results of the validation of the different models and of the BMA implementation for the main harbours will be presented for the IBIROOS and Western Mediterranean regions, where this kind of activity is performed for the first time. The work has proved useful to detect problems in some of the circulation models not previously well calibrated with sea level data, to identify the differences between baroclinic and barotropic models for sea level applications, and to confirm the general improvement of the BMA forecasts.
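A toy version of the BMA combination step, assuming Gaussian member PDFs; the forecasts, weights and spreads below are invented numbers, and in the operational system the weights are re-estimated from recent tide gauge observations.

import numpy as np

def bma_pdf(y, forecasts, weights, sigmas):
    # Weighted sum of Gaussian member PDFs centred on each model's forecast.
    y = np.atleast_1d(y)[:, None]
    comps = np.exp(-0.5 * ((y - forecasts) / sigmas) ** 2) / (sigmas * np.sqrt(2 * np.pi))
    return comps @ weights

forecasts = np.array([0.42, 0.55, 0.48])     # three surge models, sea level in metres
weights = np.array([0.5, 0.2, 0.3])          # from performance over the training period
sigmas = np.array([0.06, 0.09, 0.07])
grid = np.linspace(0.2, 0.8, 121)
pdf = bma_pdf(grid, forecasts, weights, sigmas)
print(grid[pdf.argmax()])                    # mode of the combined forecast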
Revell, Andrew D; Wang, Dechao; Perez-Elias, Maria-Jesus; Wood, Robin; Cogill, Dolphina; Tempelman, Hugo; Hamers, Raph L; Reiss, Peter; van Sighem, Ard I; Rehm, Catherine A; Pozniak, Anton; Montaner, Julio S G; Lane, H Clifford; Larder, Brendan A
2018-06-08
Optimizing antiretroviral drug combination on an individual basis can be challenging, particularly in settings with limited access to drugs and genotypic resistance testing. Here we describe our latest computational models to predict treatment responses, with or without a genotype, and compare their predictive accuracy with that of genotyping. Random forest models were trained to predict the probability of virological response to a new therapy introduced following virological failure using up to 50 000 treatment change episodes (TCEs) without a genotype and 18 000 TCEs including genotypes. Independent data sets were used to evaluate the models. This study tested the effects on model accuracy of relaxing the baseline data timing windows, the use of a new filter to exclude probable non-adherent cases and the addition of maraviroc, tipranavir and elvitegravir to the system. The no-genotype models achieved area under the receiver operator characteristic curve (AUC) values of 0.82 and 0.81 using the standard and relaxed baseline data windows, respectively. The genotype models achieved AUC values of 0.86 with the new non-adherence filter and 0.84 without. Both sets of models were significantly more accurate than genotyping with rules-based interpretation, which achieved AUC values of only 0.55-0.63, and were marginally more accurate than previous models. The models were able to identify alternative regimens that were predicted to be effective for the vast majority of cases in which the new regimen prescribed in the clinic failed. These latest global models predict treatment responses accurately even without a genotype and have the potential to help optimize therapy, particularly in resource-limited settings.
ERIC Educational Resources Information Center
Kyslenko, Dmytro
2017-01-01
The paper discusses the use of information technologies in professional training of future security specialists in the United States, Great Britain, Poland and Israel. The probable use of computer-based techniques being available within the integrated Web-sites have been systematized. It has been suggested that the presented scheme may be of great…
ERIC Educational Resources Information Center
Gunby, Kristin V.; Rapp, John T.
2014-01-01
We examined the effects of behavioral skills training with in situ feedback on safe responding by children with autism to abduction lures that were presented after a high-probability (high-p) request sequence. This sequence was intended to simulate a grooming or recruitment process. Results show that all 3 participants ultimately acquired the…
Predicting thermally stressful events in rivers with a strategy to evaluate management alternatives
Maloney, K.O.; Cole, J.C.; Schmid, M.
2016-01-01
Water temperature is an important factor in river ecology. Numerous models have been developed to predict river temperature. However, many were not designed to predict thermally stressful periods. Because such events are rare, traditionally applied analyses are inappropriate. Here, we developed two logistic regression models to predict thermally stressful events in the Delaware River at the US Geological Survey gage near Lordville, New York. One model predicted the probability of an event >20.0 °C, and a second predicted an event >22.2 °C. Both models were strong (independent test data sensitivity 0.94 and 1.00, specificity 0.96 and 0.96), predicting 63 of 67 events in the >20.0 °C model and all 15 events in the >22.2 °C model. Both showed negative relationships with released volume from the upstream Cannonsville Reservoir and positive relationships with the difference between air temperature and the previous day's water temperature at Lordville. We further predicted how increasing release volumes from Cannonsville Reservoir affected the probabilities of correctly predicted events. For the >20.0 °C model, an increase of 0.5 to a proportionally adjusted release (that accounts for other sources) resulted in 35.9% of events in the training data falling below cutoffs; increasing this adjustment by 1.0 resulted in 81.7% falling below cutoffs. For the >22.2 °C model, these adjustments resulted in 71.1% and 100.0% of events falling below cutoffs. Results from these analyses can help managers make informed decisions on alternative release scenarios.
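For illustration of how such a model is applied, the sketch below evaluates a logistic exceedance probability as a function of adjusted release and air-water temperature difference; the coefficients are invented, not the fitted values from the study.

import numpy as np

def p_exceed(release, temp_diff, b0=-1.0, b_release=-2.5, b_diff=0.8):
    # Logistic model: probability of a >20.0 degC event given the proportionally
    # adjusted release and (air minus previous day's water) temperature difference.
    z = b0 + b_release * release + b_diff * temp_diff
    return 1.0 / (1.0 + np.exp(-z))

# Raising the adjusted release lowers the predicted exceedance probability.
for release in (0.5, 1.0, 1.5):
    print(release, round(p_exceed(release, temp_diff=6.0), 3))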
Quantum-Assisted Learning of Hardware-Embedded Probabilistic Graphical Models
NASA Astrophysics Data System (ADS)
Benedetti, Marcello; Realpe-Gómez, John; Biswas, Rupak; Perdomo-Ortiz, Alejandro
2017-10-01
Mainstream machine-learning techniques such as deep learning and probabilistic programming rely heavily on sampling from generally intractable probability distributions. There is increasing interest in the potential advantages of using quantum computing technologies as sampling engines to speed up these tasks or to make them more effective. However, some pressing challenges in state-of-the-art quantum annealers have to be overcome before we can assess their actual performance. The sparse connectivity, resulting from the local interaction between quantum bits in physical hardware implementations, is considered the most severe limitation to the quality of constructing powerful generative unsupervised machine-learning models. Here, we use embedding techniques to add redundancy to data sets, allowing us to increase the modeling capacity of quantum annealers. We illustrate our findings by training hardware-embedded graphical models on a binarized data set of handwritten digits and two synthetic data sets in experiments with up to 940 quantum bits. Our model can be trained in quantum hardware without full knowledge of the effective parameters specifying the corresponding quantum Gibbs-like distribution; therefore, this approach avoids the need to infer the effective temperature at each iteration, speeding up learning; it also mitigates the effect of noise in the control parameters, making it robust to deviations from the reference Gibbs distribution. Our approach demonstrates the feasibility of using quantum annealers for implementing generative models, and it provides a suitable framework for benchmarking these quantum technologies on machine-learning-related tasks.
NASA Astrophysics Data System (ADS)
Skilling, John
2005-11-01
This tutorial gives a basic overview of Bayesian methodology, from its axiomatic foundation through the conventional development of data analysis and model selection to its rôle in quantum mechanics, ending with some comments on inference in general human affairs. The central theme is that probability calculus is the unique language within which we can develop models of our surroundings that have predictive capability. These models are patterns of belief; there is no need to claim external reality. Contents: 1. Logic and probability; 2. Probability and inference; 3. Probability and model selection; 4. Prior probabilities; 5. Probability and frequency; 6. Probability and quantum mechanics; 7. Probability and fundamentalism; 8. Probability and deception; 9. Prediction and truth.
Martin, Lisa; Watanabe, Sharon; Fainsinger, Robin; Lau, Francis; Ghosh, Sunita; Quan, Hue; Atkins, Marlis; Fassbender, Konrad; Downing, G Michael; Baracos, Vickie
2010-10-01
To determine whether elements of a standard nutritional screening assessment are independently prognostic of survival in patients with advanced cancer. A prospective nested cohort of patients with metastatic cancer was accrued from different units of a Regional Palliative Care Program. Patients completed a nutritional screen on admission. Data included age, sex, cancer site, height, weight history, dietary intake, 13 nutrition impact symptoms, and patient- and physician-reported performance status (PS). Univariate and multivariate survival analyses were conducted. Concordance statistics (c-statistics) were used to test the predictive accuracy of models based on training and validation sets; a c-statistic of 0.5 indicates the model predicts the outcome as well as chance, whereas perfect prediction has a c-statistic of 1.0. A training set of patients in palliative home care (n = 1,164) was used to identify prognostic variables. Primary disease site, PS, short-term weight change (either gain or loss), dietary intake, and dysphagia predicted survival in multivariate analysis (P < .05). A model including only disease site and PS showed high c-statistics between predicted and observed survival in the training set (0.90) and validation set (0.88; n = 603). The addition of weight change, dietary intake, and dysphagia did not further improve the c-statistic of the model. The c-statistic was also not altered by substituting physician-rated palliative PS for patient-reported PS. We demonstrate a high probability of concordance between predicted and observed survival for patients in distinct palliative care settings (home care, tertiary inpatient, ambulatory outpatient) based on patient-reported information.
Ali, Anjum A; Dale, Anders M; Badea, Alexandra; Johnson, G Allan
2005-08-15
We present the automated segmentation of magnetic resonance microscopy (MRM) images of the C57BL/6J mouse brain into 21 neuroanatomical structures, including the ventricular system, corpus callosum, hippocampus, caudate putamen, inferior colliculus, internal capsule, globus pallidus, and substantia nigra. The segmentation algorithm operates on multispectral, three-dimensional (3D) MR data acquired at 90-microm isotropic resolution. Probabilistic information used in the segmentation is extracted from training datasets of T2-weighted, proton density-weighted, and diffusion-weighted acquisitions. Spatial information is employed in the form of prior probabilities of occurrence of a structure at a location (location priors) and the pairwise probabilities between structures (contextual priors). Validation using standard morphometry indices shows good consistency between automatically segmented and manually traced data. Results achieved in the mouse brain are comparable with those achieved in human brain studies using similar techniques. The segmentation algorithm shows excellent potential for routine morphological phenotyping of mouse models.
NASA Astrophysics Data System (ADS)
Kim, Kyungmin; Harry, Ian W.; Hodge, Kari A.; Kim, Young-Min; Lee, Chang-Hwan; Lee, Hyun Kyu; Oh, John J.; Oh, Sang Hoon; Son, Edwin J.
2015-12-01
We apply a machine learning algorithm, the artificial neural network, to the search for gravitational-wave signals associated with short gamma-ray bursts (GRBs). Multi-dimensional samples, consisting of statistical and physical quantities from the coherent search pipeline, are fed into the artificial neural network to distinguish simulated gravitational-wave signals from background noise artifacts. Our results show that the data classification efficiency at a fixed false alarm probability (FAP) is improved by the artificial neural network in comparison to the conventional detection statistic. Specifically, the distance at 50% detection probability at a fixed false positive rate is increased by about 8%-14% for the considered waveform models. We also evaluate a few seconds of the gravitational-wave data segment using the trained networks and obtain the FAP. We suggest that the artificial neural network can be a complementary method to the conventional detection statistic for identifying gravitational-wave signals related to short GRBs.
Quantization and training of object detection networks with low-precision weights and activations
NASA Astrophysics Data System (ADS)
Yang, Bo; Liu, Jian; Zhou, Li; Wang, Yun; Chen, Jie
2018-01-01
As convolutional neural networks have demonstrated state-of-the-art performance in object recognition and detection, there is a growing need for deploying these systems on resource-constrained mobile platforms. However, the computational burden and energy consumption of inference for these networks are significantly higher than what most low-power devices can afford. To address these limitations, this paper proposes a method to train object detection networks with low-precision weights and activations. The probability density functions of the weights and activations of each layer are first directly estimated using piecewise Gaussian models. Then, the optimal quantization intervals and step sizes for each convolution layer are adaptively determined according to the distribution of the weights and activations. As the most computationally expensive convolutions can be replaced by effective fixed-point operations, the proposed method can drastically reduce computation complexity and memory footprint. Evaluated on the tiny you-only-look-once (YOLO) and YOLO architectures, the proposed method achieves accuracy comparable to their 32-bit counterparts. As an illustration, the proposed 4-bit and 8-bit quantized versions of the YOLO model achieve a mean average precision (mAP) of 62.6% and 63.9%, respectively, on the Pascal visual object classes 2012 test dataset. The mAP of the 32-bit full-precision baseline model is 64.0%.
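A simplified sketch of the per-layer quantization step: a single Gaussian (rather than the piecewise Gaussian model of the paper) is fitted to a layer's weights, a clipping range covering roughly 99.5% of that density is chosen, and the weights are quantized uniformly to 2^n levels; the coverage factor and bit width are illustrative assumptions.

import numpy as np

def quantize_layer(weights, n_bits=4):
    # Fit a Gaussian to the weights, clip at about +/- 2.81 sigma (~99.5% coverage),
    # and quantize uniformly to 2**n_bits levels.
    mu, sigma = weights.mean(), weights.std()
    lo, hi = mu - 2.81 * sigma, mu + 2.81 * sigma
    levels = 2 ** n_bits
    step = (hi - lo) / (levels - 1)
    q = np.clip(np.round((weights - lo) / step), 0, levels - 1)
    return lo + q * step, step               # de-quantized weights and the step size

w = np.random.default_rng(0).normal(0.0, 0.05, size=10_000)
w_q, step = quantize_layer(w)
print(step, float(np.abs(w - w_q).mean()))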
Nonlinear Spatial Inversion Without Monte Carlo Sampling
NASA Astrophysics Data System (ADS)
Curtis, A.; Nawaz, A.
2017-12-01
High-dimensional, nonlinear inverse or inference problems usually have non-unique solutions. The distribution of solutions is described by probability distributions, and these are usually found using Monte Carlo (MC) sampling methods. These take pseudo-random samples of models in parameter space, calculate the probability of each sample given available data and other information, and thus map out high or low probability values of model parameters. However, such methods would converge to the solution only as the number of samples tends to infinity; in practice, MC is found to be slow to converge, convergence is not guaranteed to be achieved in finite time, and detection of convergence requires the use of subjective criteria. We propose a method for Bayesian inversion of categorical variables such as geological facies or rock types in spatial problems, which requires no sampling at all. The method uses a 2-D Hidden Markov Model over a grid of cells, where observations represent localized data constraining the model in each cell. The data in our example application are seismic properties such as P- and S-wave impedances or rock density; our model parameters are the hidden states and represent the geological rock types in each cell. The observations at each location are assumed to depend on the facies at that location only - an assumption referred to as `localized likelihoods'. However, the facies at a location cannot be determined solely by the observation at that location as it also depends on prior information concerning its correlation with the spatial distribution of facies elsewhere. Such prior information is included in the inversion in the form of a training image which represents a conceptual depiction of the distribution of local geologies that might be expected, but other forms of prior information can be used in the method as desired. The method provides direct (pseudo-analytic) estimates of posterior marginal probability distributions over each variable, so these do not need to be estimated from samples as is required in MC methods. On a 2-D test example the method is shown to outperform previous methods significantly, and at a fraction of the computational cost. In many foreseeable applications there are therefore no serious impediments to extending the method to 3-D spatial models.
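To make the "marginals without sampling" point concrete, the following sketch runs the forward-backward recursion on a one-dimensional analogue of the facies problem: hidden states are rock types, observations are localized impedance classes, and posterior marginals are obtained analytically. The transition and emission matrices are illustrative assumptions, not values from the study, and the actual method operates on a 2-D grid with a training-image prior.

```python
import numpy as np

# Toy 1-D analogue of the facies inversion with "localized likelihoods".
P_trans = np.array([[0.9, 0.1],    # facies tend to persist between neighbouring cells
                    [0.1, 0.9]])
P_emit = np.array([[0.8, 0.2],     # P(observation class | facies)
                   [0.3, 0.7]])
prior = np.array([0.5, 0.5])
obs = [0, 0, 1, 1, 0]              # observed impedance classes per cell

n, k = len(obs), 2
alpha = np.zeros((n, k)); beta = np.ones((n, k))
alpha[0] = prior * P_emit[:, obs[0]]
for t in range(1, n):              # forward pass
    alpha[t] = (alpha[t - 1] @ P_trans) * P_emit[:, obs[t]]
for t in range(n - 2, -1, -1):     # backward pass
    beta[t] = P_trans @ (P_emit[:, obs[t + 1]] * beta[t + 1])
posterior = alpha * beta
posterior /= posterior.sum(axis=1, keepdims=True)
print(posterior)                   # marginal P(facies | all data) per cell, no sampling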
Dynamic Capacity and Surface Fatigue Life for Spur and Helical Gears
NASA Technical Reports Server (NTRS)
Coy, J. J.; Townsend, D. P.; Zaretsky, E. V.
1975-01-01
A mathematical model for the surface fatigue life of a gear, a pinion, or an entire meshing gear train is given. The theory is based on a previous statistical approach for rolling-element bearings. Equations are presented which give the dynamic capacity of the gear set. The dynamic capacity is the transmitted tangential load which gives a 90 percent probability of survival of the gear set for one million pinion revolutions. The analytical results are compared with test data for a set of AISI 9310 spur gears operating at a maximum Hertz stress of 1.71 billion N/sq m and 10,000 rpm. The theoretical life predictions are shown to be good when material constants obtained from rolling-element bearing tests are used in the gear life model.
Automated reference-free detection of motion artifacts in magnetic resonance images.
Küstner, Thomas; Liebgott, Annika; Mauch, Lukas; Martirosian, Petros; Bamberg, Fabian; Nikolaou, Konstantin; Yang, Bin; Schick, Fritz; Gatidis, Sergios
2018-04-01
Our objectives were to provide an automated method for spatially resolved detection and quantification of motion artifacts in MR images of the head and abdomen as well as a quality control of the trained architecture. T1-weighted MR images of the head and the upper abdomen were acquired in 16 healthy volunteers under rest and under motion. Images were divided into overlapping patches of different sizes achieving spatial separation. Using these patches as input data, a convolutional neural network (CNN) was trained to derive probability maps for the presence of motion artifacts. A deep visualization offers a human-interpretable quality control of the trained CNN. Results were visually assessed on probability maps and as classification accuracy on a per-patch, per-slice and per-volunteer basis. On visual assessment, a clear difference of probability maps was observed between data sets with and without motion. The overall accuracy of motion detection on a per-patch/per-volunteer basis reached 97%/100% in the head and 75%/100% in the abdomen, respectively. Automated detection of motion artifacts in MRI is feasible with good accuracy in the head and abdomen. The proposed method provides quantification and localization of artifacts as well as a visualization of the learned content. It may be extended to other anatomic areas and used for quality assurance of MR images.
Spontaneous action potentials and neural coding in unmyelinated axons.
O'Donnell, Cian; van Rossum, Mark C W
2015-04-01
The voltage-gated Na and K channels in neurons are responsible for action potential generation. Because ion channels open and close in a stochastic fashion, spontaneous (ectopic) action potentials can result even in the absence of stimulation. While spontaneous action potentials have been studied in detail in single-compartment models, studies on spatially extended processes have been limited. The simulations and analysis presented here show that spontaneous rate in unmyelinated axon depends nonmonotonically on the length of the axon, that the spontaneous activity has sub-Poisson statistics, and that neural coding can be hampered by the spontaneous spikes by reducing the probability of transmitting the first spike in a train.
Devi, Suma Priya Sudarsana; Howe, James R.
2016-01-01
Key points: Purkinje cells of the cerebellum receive ∼180,000 parallel fibre synapses, which have often been viewed as a homogeneous synaptic population and studied using single action potentials. Many parallel fibre synapses might be silent, however, and granule cells in vivo fire in bursts. Here, we used trains of stimuli to study parallel fibre inputs to Purkinje cells in rat cerebellar slices. Analysis of train EPSCs revealed two synaptic components, phase 1 and 2. Phase 1 is initially large and saturates rapidly, whereas phase 2 is initially small and facilitates throughout the train. The two components have a heterogeneous distribution at dendritic sites and different pharmacological profiles. The differential sensitivity of phase 1 and phase 2 to inhibition by pentobarbital and NBQX mirrors the differential sensitivity of AMPA receptors associated with the transmembrane AMPA receptor regulatory protein, γ-2, gating in the low- and high-open probability modes, respectively. Abstract: Cerebellar granule cells fire in bursts, and their parallel fibre axons (PFs) form ∼180,000 excitatory synapses onto the dendritic tree of a Purkinje cell. As many as 85% of these synapses have been proposed to be silent, but most are labelled for AMPA receptors. Here, we studied PF to Purkinje cell synapses using trains of 100 Hz stimulation in rat cerebellar slices. The PF train EPSC consisted of two components that were present in variable proportions at different dendritic sites: one, with large initial EPSC amplitude, saturated after three stimuli and dominated the early phase of the train EPSC; and the other, with small initial amplitude, increased steadily throughout the train of 10 stimuli and dominated the late phase of the train EPSC. The two phases also displayed different pharmacological profiles. Phase 2 was less sensitive to inhibition by NBQX but more sensitive to block by pentobarbital than phase 1. Comparison of synaptic results with fast glutamate applications to recombinant receptors suggests that the high-open-probability gating mode of AMPA receptors containing the auxiliary subunit transmembrane AMPA receptor regulatory protein γ-2 makes a substantial contribution to phase 2. We argue that the two synaptic components arise from AMPA receptors with different functional signatures and synaptic distributions. Comparisons of voltage- and current-clamp responses obtained from the same Purkinje cells indicate that phase 1 of the EPSC arises from synapses ideally suited to transmit short bursts of action potentials, whereas phase 2 is likely to arise from low-release-probability or 'silent' synapses that are recruited during longer bursts. PMID:27094216
Bahouth, George; Graygo, Jill; Digges, Kennerly; Schulman, Carl; Baur, Peter
2014-01-01
The objectives of this study are to (1) characterize the population of crashes meeting the Centers for Disease Control and Prevention (CDC)-recommended 20% risk of Injury Severity Score (ISS)>15 injury and (2) explore the positive and negative effects of an advanced automatic crash notification (AACN) system whose threshold for high-risk indications is 10% versus 20%. Binary logistic regression analysis was performed to predict the occurrence of motor vehicle crash injuries at both the ISS>15 and Maximum Abbreviated Injury Scale (MAIS) 3+ level. Models were trained using crash characteristics recommended by the CDC Committee on Advanced Automatic Collision Notification and Triage of the Injured Patient. Each model was used to assign the probability of severe injury (defined as MAIS 3+ or ISS>15 injury) to a subset of NASS-CDS cases based on crash attributes. Subsequently, actual AIS and ISS levels were compared with the predicted probability of injury to determine the extent to which the seriously injured had corresponding probabilities exceeding the 10% and 20% risk thresholds. Models were developed using an 80% sample of NASS-CDS data from 2002 to 2012 and evaluations were performed using the remaining 20% of cases from the same period. Within the population of seriously injured (i.e., those having one or more AIS 3 or higher injuries), the number of occupants whose injury risk did not exceed the 10% and 20% thresholds was estimated to be 11,700 and 18,600, respectively, each year using the MAIS 3+ injury model. For the ISS>15 model, 8,100 and 11,000 occupants sustained ISS>15 injuries yet their injury probability did not reach the 10% and 20% probability for severe injury, respectively. Conversely, model predictions suggested that, at the 10% and 20% thresholds, 207,700 and 55,400 drivers respectively would be incorrectly flagged as injured when their injuries had not reached the AIS 3 level. For the ISS>15 model, 87,300 and 41,900 drivers would be incorrectly flagged as injured when injury severity had not reached the ISS>15 injury level. This article provides important information comparing the expected positive and negative effects of an AACN system with thresholds at the 10% and 20% levels using two outcome metrics. Overall, results suggest that the 20% risk threshold would not provide a useful notification to improve the quality of care for a large number of seriously injured crash victims. Alternatively, a lower threshold may increase the over-triage rate. Based on the vehicle damage observed for crashes reaching and exceeding the 10% risk threshold, we anticipate that rescue services would have been deployed based on current Public Safety Answering Point (PSAP) practices.
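The core of the threshold comparison is a simple bookkeeping exercise on predicted probabilities: count the seriously injured who fall below the threshold (under-triage) and the uninjured who exceed it (over-triage). The sketch below, with synthetic stand-ins for the NASS-CDS cases and the fitted risk model, illustrates that calculation only; the study's actual counts come from weighted national data.

```python
import numpy as np

# Synthetic stand-ins: p_severe mimics model-predicted risk of MAIS 3+/ISS>15,
# y_severe mimics the true outcome for each case.
rng = np.random.default_rng(0)
p_severe = rng.beta(1, 15, size=100_000)
y_severe = rng.random(100_000) < p_severe

for thr in (0.10, 0.20):
    flagged = p_severe >= thr
    missed = np.sum(y_severe & ~flagged)   # injured but not flagged (under-triage)
    over = np.sum(~y_severe & flagged)     # flagged but not injured (over-triage)
    print(f"threshold {thr:.0%}: missed={missed}, over-triaged={over}")
```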
Neural Effects of Short-Term Training on Working Memory
Buschkuehl, Martin; Garcia, Luis Hernandez; Jaeggi, Susanne M.; Bernard, Jessica A.; Jonides, John
2014-01-01
Working memory training has been the focus of intense research interest. Despite accumulating behavioral work, knowledge about the neural mechanisms underlying training effects is scarce. Here we show that seven days of training on an n-back task leads to substantial performance improvements in the trained task; furthermore, the experimental group shows cross-modal transfer as compared to an active control group. In addition, two neural effects emerged as a function of training: first, increased perfusion during task performance in selected regions, reflecting a neural response to cope with high task demand; second, increased blood flow at rest in regions where training effects were apparent. We also found that perfusion at rest was correlated with task proficiency, probably reflecting an improved neural readiness to perform. Our findings are discussed within the context of the available neuroimaging literature on n-back training. PMID:24496717
NASA Astrophysics Data System (ADS)
Rings, Joerg; Vrugt, Jasper A.; Schoups, Gerrit; Huisman, Johan A.; Vereecken, Harry
2012-05-01
Bayesian model averaging (BMA) is a standard method for combining predictive distributions from different models. In recent years, this method has enjoyed widespread application and use in many fields of study to improve the spread-skill relationship of forecast ensembles. The BMA predictive probability density function (pdf) of any quantity of interest is a weighted average of pdfs centered around the individual (possibly bias-corrected) forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts, and reflect the individual models' skill over a training (calibration) period. The original BMA approach presented by Raftery et al. (2005) assumes that the conditional pdf of each individual model is adequately described with a rather standard Gaussian or Gamma statistical distribution, possibly with a heteroscedastic variance. Here we analyze the advantages of using BMA with a flexible representation of the conditional pdf. A joint particle filtering and Gaussian mixture modeling framework is presented to derive analytically, as closely and consistently as possible, the evolving forecast density (conditional pdf) of each constituent ensemble member. The median forecasts and evolving conditional pdfs of the constituent models are subsequently combined using BMA to derive one overall predictive distribution. This paper introduces the theory and concepts of this new ensemble postprocessing method, and demonstrates its usefulness and applicability by numerical simulation of the rainfall-runoff transformation using discharge data from three different catchments in the contiguous United States. The revised BMA method achieves significantly lower prediction errors than the original default BMA method (due to filtering), with predictive uncertainty intervals that are substantially smaller but still statistically coherent (due to the use of a time-variant conditional pdf).
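For readers unfamiliar with the BMA predictive pdf, the minimal sketch below forms it as a weighted mixture of member forecast densities. The weights, member forecasts, and spreads are illustrative; in practice the weights and spreads are estimated over the calibration period, and the cited method replaces the fixed Gaussians with filtered, time-variant mixtures.

```python
import numpy as np
from scipy.stats import norm

forecasts = np.array([12.0, 15.0, 9.5])   # member forecasts of, e.g., discharge
sigmas = np.array([2.0, 3.0, 1.5])        # member predictive spreads (assumed)
weights = np.array([0.5, 0.3, 0.2])       # posterior model probabilities (sum to 1)

x = np.linspace(0, 25, 500)
bma_pdf = sum(w * norm.pdf(x, m, s) for w, m, s in zip(weights, forecasts, sigmas))
bma_mean = np.sum(weights * forecasts)
print(bma_mean, np.trapz(bma_pdf, x))     # mixture mean; density integrates to ~1
```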
Adaptive partially hidden Markov models with application to bilevel image coding.
Forchhammer, S; Rasmussen, T S
1999-01-01
Partially hidden Markov models (PHMMs) have previously been introduced. The transition and emission/output probabilities from hidden states, as known from the HMMs, are conditioned on the past. This way, the HMM may be applied to images introducing the dependencies of the second dimension by conditioning. In this paper, the PHMM is extended to multiple sequences with a multiple token version and adaptive versions of PHMM coding are presented. The different versions of the PHMM are applied to lossless bilevel image coding. To reduce and optimize the model cost and size, the contexts are organized in trees and effective quantization of the parameters is introduced. The new coding methods achieve results that are better than the JBIG standard on selected test images, although at the cost of increased complexity. By the minimum description length principle, the methods presented for optimizing the code length may apply as guidance for training (P)HMMs for, e.g., segmentation or recognition purposes. Thereby, the PHMM models provide a new approach to image modeling.
Disjunctive Normal Shape and Appearance Priors with Applications to Image Segmentation.
Mesadi, Fitsum; Cetin, Mujdat; Tasdizen, Tolga
2015-10-01
The use of appearance and shape priors in image segmentation is known to improve accuracy; however, existing techniques have several drawbacks. Active shape and appearance models require landmark points and assume unimodal shape and appearance distributions. Level set based shape priors are limited to global shape similarity. In this paper, we present a novel shape and appearance priors for image segmentation based on an implicit parametric shape representation called disjunctive normal shape model (DNSM). DNSM is formed by disjunction of conjunctions of half-spaces defined by discriminants. We learn shape and appearance statistics at varying spatial scales using nonparametric density estimation. Our method can generate a rich set of shape variations by locally combining training shapes. Additionally, by studying the intensity and texture statistics around each discriminant of our shape model, we construct a local appearance probability map. Experiments carried out on both medical and natural image datasets show the potential of the proposed method.
CFD Assessment of Aerodynamic Degradation of a Subsonic Transport Due to Airframe Damage
NASA Technical Reports Server (NTRS)
Frink, Neal T.; Pirzadeh, Shahyar Z.; Atkins, Harold L.; Viken, Sally A.; Morrison, Joseph H.
2010-01-01
A computational study is presented to assess the utility of two NASA unstructured Navier-Stokes flow solvers for capturing the degradation in static stability and aerodynamic performance of a NASA General Transport Model (GTM) due to airframe damage. The approach is to correlate computational results with a substantial subset of experimental data for the GTM undergoing progressive losses to the wing, vertical tail, and horizontal tail components. The ultimate goal is to advance the probability of inserting computational data into the creation of advanced flight simulation models of damaged subsonic aircraft in order to improve pilot training. Results presented in this paper demonstrate good correlations with slope-derived quantities, such as pitch static margin and static directional stability, and incremental rolling moment due to wing damage. This study further demonstrates that high fidelity Navier-Stokes flow solvers could augment flight simulation models with additional aerodynamic data for various airframe damage scenarios.
Optimizing Chemical Reactions with Deep Reinforcement Learning
2017-01-01
Deep reinforcement learning was employed to optimize chemical reactions. Our model iteratively records the results of a chemical reaction and chooses new experimental conditions to improve the reaction outcome. This model outperformed a state-of-the-art blackbox optimization algorithm by using 71% fewer steps on both simulations and real reactions. Furthermore, we introduced an efficient exploration strategy by drawing the reaction conditions from certain probability distributions, which resulted in an improvement on regret from 0.062 to 0.039 compared with a deterministic policy. Combining the efficient exploration policy with accelerated microdroplet reactions, optimal reaction conditions were determined in 30 min for the four reactions considered, and a better understanding of the factors that control microdroplet reactions was reached. Moreover, our model showed a better performance after training on reactions with similar or even dissimilar underlying mechanisms, which demonstrates its learning ability. PMID:29296675
Timing in a Variable Interval Procedure: Evidence for a Memory Singularity
Matell, Matthew S.; Kim, Jung S.; Hartshorne, Loryn
2013-01-01
Rats were trained in either a 30s peak-interval procedure, or a 15–45s variable interval peak procedure with a uniform distribution (Exp 1) or a ramping probability distribution (Exp 2). Rats in all groups showed peak shaped response functions centered around 30s, with the uniform group having an earlier and broader peak response function and rats in the ramping group having a later peak function as compared to the single duration group. The changes in these mean functions, as well as the statistics from single trial analyses, can be better captured by a model of timing in which memory is represented by a single, average, delay to reinforcement compared to one in which all durations are stored as a distribution, such as the complete memory model of Scalar Expectancy Theory or a simple associative model. PMID:24012783
da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa
2016-01-01
The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at the Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from Telecentros.BR dataset and from the socioeconomics and telecommunications infrastructure indicators of the participants’ municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar characteristics, presented high probability of certification, associated with the highest degree in social networking platform. PMID:26727472
Di Marca, Salvatore; Cilia, Chiara; Campagna, Andrea; D'Arrigo, Graziella; Abd ElHafeez, Samar; Tripepi, Giovanni; Puccia, Giuseppe; Pisano, Marcella; Mastrosimone, Gianluca; Terranova, Valentina; Cardella, Antonella; Buonacera, Agata; Stancanelli, Benedetta; Zoccali, Carmine; Malatino, Lorenzo
2015-06-01
To assess and compare the diagnostic power for pulmonary embolism (PE) of Wells and revised Geneva scores in two independent cohorts (training and validation groups) of elderly adults hospitalized in a non-emergency department. Prospective clinical study, January 2011 to January 2013. Unit of Internal Medicine inpatients, University of Catania, Italy. Elderly adults (mean age 76 ± 12), presenting with dyspnea or chest pain and with high clinical probability of PE or D-dimer values greater than 500 ng/mL (N = 203), were enrolled and consecutively assigned to a training (n = 101) or a validation (n = 102) group. The clinical probability of PE was assessed using Wells and revised Geneva scores. Clinical examination, D-dimer test, and multidetector computed angiotomography were performed in all participants. The accuracy of the scores was assessed using receiver operating characteristic analyses. PE was confirmed in 46 participants (23%) (24 training group, 22 validation group). In the training group, the area under the receiver operating characteristic curve was 0.91 (95% confidence interval (CI) = 0.85-0.98) for the Wells score and 0.69 (95% CI = 0.56-0.82) for the revised Geneva score (P < .001). These results were confirmed in the validation group (P < .05). The positive (LR+) and negative likelihood ratios (LR-) (two indices combining sensitivity and specificity) of the Wells score were superior to those of the revised Geneva score in the training (LR+, 7.90 vs 1.34; LR-, 0.23 vs 0.66) and validation (LR+, 13.5 vs 1.46; LR-, 0.47 vs 0.54) groups. In high-risk elderly hospitalized adults, the Wells score is more accurate than the revised Geneva score for diagnosing PE. © 2015, Copyright the Authors Journal compilation © 2015, The American Geriatrics Society.
NASA Astrophysics Data System (ADS)
Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.
2016-03-01
Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from abnormal behaviour that occurs in the case of a novelty. This will be evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determining whether our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest points. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >= 4 to indicate abnormal behaviour of the classifier, which has a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings, predominantly vessels, the cerebral falx, brain mask errors, and the choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
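A minimal sketch of the behaviour measure follows: model normal behaviour as the mean distance to the k nearest trained points, then flag test samples whose Z-score against that distribution reaches 4. The feature vectors below are synthetic stand-ins for voxel features, and the self-distance of training points is kept for simplicity; both are assumptions of this illustration rather than details of the study.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
train = rng.normal(0, 1, size=(2000, 3))            # "normal tissue" features
test = np.vstack([rng.normal(0, 1, size=(50, 3)),
                  rng.normal(6, 1, size=(5, 3))])    # last 5 mimic untrained pathology

k = 10
nn = NearestNeighbors(n_neighbors=k).fit(train)
d_train, _ = nn.kneighbors(train)                   # behaviour on trained data
d_test, _ = nn.kneighbors(test)
mean_train = d_train.mean(axis=1)
mu, sd = mean_train.mean(), mean_train.std()
z = (d_test.mean(axis=1) - mu) / sd                 # Z-score of the test behaviour
print(np.where(z >= 4)[0])                          # indices flagged as novelties
```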
Isaranuwatchai, Wanrudee; Brydges, Ryan; Carnahan, Heather; Backstein, David; Dubrowski, Adam
2014-05-01
While the ultimate goal of simulation training is to enhance learning, cost-effectiveness is a critical factor. Research that compares simulation training in terms of educational- and cost-effectiveness will lead to better-informed curricular decisions. Using previously published data, we conducted a cost-effectiveness analysis of three simulation-based programs. Medical students (n = 15 per group) practiced in one of three 2-h intravenous catheterization skills training programs: low-fidelity (virtual reality), high-fidelity (mannequin), or progressive (consisting of virtual reality, task trainer, and mannequin simulator). One week later, all performed a transfer test on a hybrid simulation (standardized patient with a task trainer). We used a net benefit regression model to identify the most cost-effective training program via paired comparisons. We also created a cost-effectiveness acceptability curve to visually represent the probability that one program is more cost-effective when compared to its comparator at various 'willingness-to-pay' values. We conducted separate analyses for implementation and total costs. The results showed that the progressive program had the highest total cost (p < 0.001) whereas the high-fidelity program had the highest implementation cost (p < 0.001). While the most cost-effective program depended on the decision makers' willingness-to-pay value, the progressive training program was generally most educationally- and cost-effective. Our analyses suggest that a progressive program that strategically combines simulation modalities provides a cost-effective solution. More generally, we have introduced how a cost-effectiveness analysis may be applied to simulation training; a method that medical educators may use to inform investment decisions (e.g., purchasing cost-effective and educationally sound simulators).
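The cost-effectiveness acceptability curve logic can be illustrated briefly: for each willingness-to-pay value, net benefit is willingness-to-pay times effect minus cost, and the curve plots the probability that one program has the higher net benefit. The sketch below uses synthetic costs and effects and a simple bootstrap over participants; it is an illustration of the framing, not the study's net benefit regression.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 15
cost_a, eff_a = rng.normal(300, 40, n), rng.normal(0.70, 0.10, n)   # e.g. progressive
cost_b, eff_b = rng.normal(180, 30, n), rng.normal(0.55, 0.12, n)   # e.g. low-fidelity

for wtp in (100, 500, 1000):                      # willingness-to-pay values
    wins = 0
    for _ in range(2000):                          # bootstrap resamples
        ia = rng.integers(0, n, n); ib = rng.integers(0, n, n)
        nb_a = wtp * eff_a[ia].mean() - cost_a[ia].mean()
        nb_b = wtp * eff_b[ib].mean() - cost_b[ib].mean()
        wins += nb_a > nb_b
    print(f"WTP={wtp}: P(A more cost-effective) ~ {wins / 2000:.2f}")
```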
Probability based models for estimation of wildfire risk
Haiganoush Preisler; D. R. Brillinger; R. E. Burgan; John Benoit
2004-01-01
We present a probability-based model for estimating fire risk. Risk is defined using three probabilities: the probability of fire occurrence; the conditional probability of a large fire given ignition; and the unconditional probability of a large fire. The model is based on grouped data at the 1 km²-day cell level. We fit a spatially and temporally explicit non-...
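The risk decomposition described in this abstract can be written out directly: the unconditional probability of a large fire in a cell-day is the product of the probability of ignition and the conditional probability of a large fire given ignition. The values below are illustrative placeholders, not estimates from the study.

```python
# Per 1 km^2-day cell, with illustrative probabilities:
p_ignition = 0.002              # P(fire occurrence)
p_large_given_ignition = 0.05   # P(large fire | ignition)
p_large = p_ignition * p_large_given_ignition
print(p_large)                  # unconditional probability of a large fire
```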
Army Training Study: Training Effectiveness Analysis (TEA) Summary. Volume 1. Armor.
1978-08-08
[Garbled OCR excerpt from the report; recoverable fragments read "...marginally above 50%, however, probably is not" and reference Table 10, "Tank Crew Qualification Performance on Task Standards".]
Multistage classification of multispectral Earth observational data: The design approach
NASA Technical Reports Server (NTRS)
Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.
1981-01-01
An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm, as a function of dimensionality, with experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.
Willcox, Michelle; Harrison, Heather; Asiedu, Amos; Nelson, Allyson; Gomez, Patricia; LeFevre, Amnesty
2017-12-06
Low-dose, high-frequency (LDHF) training is a new approach to promoting best practices that aims to improve clinical knowledge, build and retain competency, and transfer skills into practice after training. LDHF training in Ghana is an opportunity to build health workforce capacity in critical areas of maternal and newborn health and translate improved capacity into better health outcomes. This study examined the costs of an LDHF training approach for basic emergency obstetric and newborn care and calculated the incremental cost-effectiveness of the LDHF training program for health outcomes of newborn survival, compared to the status quo alternative of no training. The costs of LDHF were compared to costs of traditional workshop-based training per provider trained. Retrospective program cost analysis with activity-based costing was used to measure all resources of the LDHF training program over a 3-year analytic time horizon. Economic costs were estimated from financial records, informant interviews, and regional market prices. Health effects from the program's impact evaluation were used to model lives saved and disability-adjusted life years (DALYs) averted. Uncertainty analysis included one-way and probabilistic sensitivity analysis to explore incremental cost-effectiveness results when fluctuating key parameters. For the 40 health facilities included in the evaluation, the total LDHF training cost was $823,134. During the follow-up period after the first LDHF training (1 year at each participating facility), approximately 544 lives were saved. With a deterministic calculation, these findings translate to $1497.77 per life saved or $53.07 per DALY averted. Probabilistic sensitivity analysis, with a mean incremental cost-effectiveness ratio of $54.79 per DALY averted ($24.42-$107.01), suggests that the LDHF training program, as compared to no training, has a 100% probability of being cost-effective above a willingness-to-pay threshold of $1480, Ghana's gross national income per capita in 2015. This study provides insight into the investment required for LDHF training and the value for money of this approach to training in-service providers on basic emergency obstetric and newborn care. The LDHF training approach should be considered for expansion in Ghana and integrated into existing in-service training programs and health system organizational structures for lower cost and more efficiency at scale.
Separation of Sperm Whale Click-Trains for Multipath Rejection and Localization
2010-03-05
[Table-of-contents fragments: Correlation; 3.3 Multipath Elimination Rules; 4 Localization; 4.1 Localization Approach; 4.2 Inter-Sensor Time-Delay Estimation Approach.] Using Bayes' rule, \(\Lambda_{kj} = \log \frac{p(H_1 \mid z_{kj})}{p(H_0 \mid z_{kj})} = \log \frac{p(H_1)\, p(z_{kj} \mid H_1)}{p(H_0)\, p(z_{kj} \mid H_0)}\) (2), where p(H_0) and p(H_1) are the a priori probabilities ... overlapping clicks.) 3.3 Multipath Elimination Rules: Multipath click-trains are eliminated if the individual clicks within the click-train are
Tsai, Alexander C.; Tomlinson, Mark; Dewing, Sarah; le Roux, Ingrid M.; Harwood, Jessica M.; Chopra, Mickey; Rotheram-Borus, Mary Jane
2014-01-01
Purpose: Randomised controlled trials conducted in resource-limited settings have shown that once women with depressed mood are evaluated by specialists and referred for treatment, lay health workers can be trained to effectively administer psychological treatments. We sought to determine the extent to which community health workers could also be trained to conduct case finding using short and ultra-short screening instruments programmed into mobile phones. Methods: Pregnant, Xhosa-speaking women were recruited independently in two cross-sectional studies (N=1,144 and N=361) conducted in Khayelitsha, South Africa and assessed for antenatal depression. In the smaller study, community health workers with no training in human subjects research were trained to administer the Edinburgh Postnatal Depression Scale (EPDS) during the routine course of their community-based outreach. We compared the operating characteristics of 4 short and ultra-short versions of the EPDS with the criterion standard of probable depression, defined as an EPDS-10 ≥13. Results: The prevalence of probable depression (475/1144 [42%] and 165/361 [46%]) was consistent across both samples. The 2-item subscale demonstrated poor internal consistency (Cronbach's α ranged from 0.55-0.58). All 4 subscales demonstrated excellent discrimination, with area under the receiver operating characteristic curve (AUC) values ranging from 0.91-0.99. Maximal discrimination was observed for the 7-item depressive symptoms subscale: at the conventional screening threshold of ≥10, it had 0.97 sensitivity and 0.76 specificity for detecting probable antenatal depression. Conclusions: The comparability of the findings across the two studies suggests that it is feasible to use community health workers to conduct case finding for antenatal depression. PMID:24682529
Deep convolutional networks for pancreas segmentation in CT imaging
NASA Astrophysics Data System (ADS)
Roth, Holger R.; Farag, Amal; Lu, Le; Turkbey, Evrim B.; Summers, Ronald M.
2015-03-01
Automatic organ segmentation is an important prerequisite for many computer-aided diagnosis systems. The high anatomical variability of organs in the abdomen, such as the pancreas, prevents many segmentation methods from achieving high accuracies when compared to state-of-the-art segmentation of organs like the liver, heart or kidneys. Recently, the availability of large annotated training sets and the accessibility of affordable parallel computing resources via GPUs have made it feasible for "deep learning" methods such as convolutional networks (ConvNets) to succeed in image classification tasks. These methods have the advantage that the classification features used are learned directly from the imaging data. We present a fully-automated bottom-up method for pancreas segmentation in computed tomography (CT) images of the abdomen. The method is based on hierarchical coarse-to-fine classification of local image regions (superpixels). Superpixels are extracted from the abdominal region using Simple Linear Iterative Clustering (SLIC). An initial probability response map is generated, using patch-level confidences and a two-level cascade of random forest classifiers, from which superpixel regions with probabilities larger than 0.5 are retained. These retained superpixels serve as a highly sensitive initial input of the pancreas and its surroundings to a ConvNet that samples a bounding box around each superpixel at different scales (and random non-rigid deformations at training time) in order to assign a more distinct probability of each superpixel region being pancreas or not. We evaluate our method on CT images of 82 patients (60 for training, 2 for validation, and 20 for testing). Using ConvNets, we achieve maximum Dice scores averaging 68% +/- 10% (range, 43-80%) in testing. This shows promise for accurate pancreas segmentation using a deep learning approach and compares favorably to state-of-the-art methods.
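The coarse filtering stage described above can be sketched in a few lines: extract SLIC superpixels and keep those whose mean probability from an earlier patch-level classifier exceeds 0.5. The image and probability map below are synthetic placeholders, and the threshold-on-mean rule is an assumption of this illustration; the retained superpixels would then feed the ConvNet stage.

```python
import numpy as np
from skimage.segmentation import slic

rng = np.random.default_rng(3)
image = rng.random((128, 128, 3))          # stand-in for a CT slice
prob_map = rng.random((128, 128))          # stand-in patch-level probability map

labels = slic(image, n_segments=200, compactness=10.0)   # SLIC superpixels
retained = [lab for lab in np.unique(labels)
            if prob_map[labels == lab].mean() > 0.5]      # keep probability > 0.5
print(len(retained), "superpixels retained")
```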
Sport-Specific Assessment of the Effectiveness of Neuromuscular Training in Young Athletes
Zemková, Erika; Hamar, Dušan
2018-01-01
Neuromuscular training in young athletes improves performance and decreases the risk of injuries during sports activities. These effects are primarily ascribed to the enhancement of muscle strength and power but also balance, speed and agility. However, most studies have failed to demonstrate significant improvement in these abilities. This is probably due to the fact that traditional tests do not reflect training methods (e.g., plyometric training vs. isometric or isokinetic strength testing, dynamic balance training vs. static balance testing). The protocols utilized in laboratories only partially fulfill the current needs for testing under sport-specific conditions. Moreover, laboratory testing usually requires skilled staff and a well-equipped and costly infrastructure. Nevertheless, experience demonstrates that high-technology and expensive testing is not the only way to proceed. A number of physical fitness field tests are available today. However, the low reliability and limited number of parameters retrieved from simple equipment used also limit their application in competitive sports. Thus, there is a need to develop and validate a functional assessment platform based on portable computerized systems. Variables obtained should be directly linked to specific features of particular sports and capture their complexity. This is essential for revealing weak and strong components of athlete performance and design of individually-tailored exercise programs. Therefore, identifying the drawbacks associated with the assessment of athlete performance under sport-specific conditions would provide a basis for the formation of an innovative approach to their long-term systematic testing. This study aims (i) to review the testing methods used for the evaluation of the effect of neuromuscular training on sport-specific performance in young athletes, (ii) to introduce stages within the Sport Longlife Diagnostic Model, and (iii) to propose future research in this topic. Analysis of the literature identified gaps in the current standard testing methods in terms of their low sensitivity in discriminating between athletes of varied ages and performance levels, insufficient tailoring to athlete performance level and individual needs, a lack of specificity to the requirements of particular sports and also in revealing the effect of training. In order to partly fill in these gaps, the Sport Longlife Diagnostic Model was proposed. PMID:29695970
Bayesian molecular design with a chemical language model
NASA Astrophysics Data System (ADS)
Ikebata, Hisaki; Hongo, Kenta; Isomura, Tetsu; Maezono, Ryo; Yoshida, Ryo
2017-04-01
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating the material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction; the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned by a desired property requirement. Exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can computationally be created. One major difficulty in the computational creation of molecules is the exclusion of the occurrence of chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with the property requirements on HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
Modeling of surface dust concentrations using neural networks and kriging
NASA Astrophysics Data System (ADS)
Buevich, Alexander G.; Medvedev, Alexander N.; Sergeev, Alexander P.; Tarasov, Dmitry A.; Shichkin, Andrey V.; Sergeeva, Marina V.; Atanasova, T. B.
2016-12-01
Creating models that can accurately predict the distribution of pollutants from a limited set of input data is an important task in environmental studies. In this paper, two neural approaches (multilayer perceptron (MLP) and generalized regression neural network (GRNN)) and two geostatistical approaches (kriging and cokriging) are used for modeling and forecasting dust concentrations in snow cover. The study area is under the influence of dust emissions from a copper quarry and several industrial companies. A comparison of the two types of approaches is conducted. Three indices are used as indicators of the models' accuracy: the mean absolute error (MAE), the root mean square error (RMSE), and the relative root mean square error (RRMSE). Models based on artificial neural networks (ANNs) showed better accuracy. Considering all indices, the most precise model was the GRNN, which uses the coordinates of the sampling points and the distance to the probable emission source as input parameters. The results of this work confirm that trained ANNs may be a more suitable tool for modeling dust concentrations in snow cover.
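For reference, the three accuracy indices named above can be computed as in the sketch below. RRMSE is taken here as RMSE divided by the mean observed value, which is one common normalization; the study may use a different one, so treat that detail as an assumption.

```python
import numpy as np

def mae(obs, pred):
    return np.mean(np.abs(obs - pred))

def rmse(obs, pred):
    return np.sqrt(np.mean((obs - pred) ** 2))

def rrmse(obs, pred):
    return rmse(obs, pred) / np.mean(obs)   # normalized by mean observation (assumed)

obs = np.array([120.0, 80.0, 200.0, 150.0])   # e.g. measured dust concentrations
pred = np.array([110.0, 95.0, 180.0, 160.0])  # model estimates
print(mae(obs, pred), rmse(obs, pred), rrmse(obs, pred))
```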
Prendergast, Geoffrey P.; Staff, Michael
2017-01-01
Introduction: This study examines the use of the number of night-time sleep disturbances as a health-based metric to assess the cost effectiveness of rail noise mitigation strategies for situations wherein high-intensity noises, such as freight train pass-bys and wheel squeal, dominate. Materials and Methods: Twenty residential properties adjacent to the existing and proposed rail tracks in a noise catchment area of the Epping to Thornleigh Third Track project were used as a case study. Awakening probabilities were calculated for individuals awakening 1, 3 and 5 times a night when subjected to 10 independent freight train pass-by noise events using internal maximum sound pressure levels (LAFmax). Results: Awakenings were predicted using a random intercept multivariate logistic regression model. With source mitigation in place, the majority of the residents were still predicted to be awoken at least once per night (median 88.0%), although substantial reductions in the median probabilities of awakening three and five times per night from 50.9 to 29.4% and 9.2 to 2.7%, respectively, were predicted. This resulted in a cost-effectiveness estimate of 7.6–8.8 fewer people being awoken at least three times per night per A$1 million spent on noise barriers. Conclusion: The study demonstrates that an easily understood metric can be readily used to assist in making decisions related to noise mitigation for large-scale transport projects. PMID:29192613
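A simplified version of the night-level calculation helps make the metric tangible: with a fixed per-event awakening probability for each of N independent pass-by events, the chance of being awoken at least k times is a binomial tail. The per-event probability below is a placeholder; the study derives per-event probabilities from a random-intercept logistic model of internal LAFmax, which this fixed-probability sketch does not capture.

```python
from math import comb

def p_awaken_at_least(k, n_events=10, p=0.20):
    # P(at least k awakenings in n_events independent noise events)
    return sum(comb(n_events, j) * p**j * (1 - p)**(n_events - j)
               for j in range(k, n_events + 1))

for k in (1, 3, 5):
    print(k, round(p_awaken_at_least(k), 3))
```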
Stopping decisions: information order effects on nonfocal evaluations.
Yu, Michael; Gonzalez, Cleotilde
2013-08-01
We investigated how the order in which information is presented affects when a person decides to stop performing a task. A stopping decision is a decision to stop performing a task on the basis of a sequence of cues. Previous order-effects models do not account for how these contexts limit available working memory for making such decisions. Participants decided how long to perform a task known as the Work Hazard Game that began by rewarding points but later cost points if work continued after an unannounced "emergency." An additive sequence of cues indicated the probability of an emergency. Study 1 involved a three-group design with cue sequences that indicated the same risk at each decision point but whose final cue presented a high, medium, or low probability. Study 2 had a 2 x 2 design with high or low final cues and an easy or a challenging task. In Study 1, participants stopped sooner when the most recent cue presented a high rather than low probability (p = .09), despite the same emergency risk. In Study 2, participants stopped sooner when the most recent cue presented a high rather than low probability for the challenging task but not for the easy task (p = .08). Stopping decisions appear sensitive to the most recent cue observed while experiencing task load. Participants responded to the same risks differently only on the basis of a change in presentation. Findings may be relevant for research and training for hazardous jobs, such as subsurface coal mining, fishing, and trucking.
Parameter estimation in Cox models with missing failure indicators and the OPPERA study.
Brownstein, Naomi C; Cai, Jianwen; Slade, Gary D; Bair, Eric
2015-12-30
In a prospective cohort study, examining all participants for incidence of the condition of interest may be prohibitively expensive. For example, the "gold standard" for diagnosing temporomandibular disorder (TMD) is a physical examination by a trained clinician. In large studies, examining all participants in this manner is infeasible. Instead, it is common to use questionnaires to screen for incidence of TMD and perform the "gold standard" examination only on participants who screen positively. Unfortunately, some participants may leave the study before receiving the "gold standard" examination. Within the framework of survival analysis, this results in missing failure indicators. Motivated by the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study, a large cohort study of TMD, we propose a method for parameter estimation in survival models with missing failure indicators. We estimate the probability of being an incident case for those lacking a "gold standard" examination using logistic regression. These estimated probabilities are used to generate multiple imputations of case status for each missing examination that are combined with observed data in appropriate regression models. The variance introduced by the procedure is estimated using multiple imputation. The method can be used to estimate both regression coefficients in Cox proportional hazard models as well as incidence rates using Poisson regression. We simulate data with missing failure indicators and show that our method performs as well as or better than competing methods. Finally, we apply the proposed method to data from the OPPERA study. Copyright © 2015 John Wiley & Sons, Ltd.
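The imputation step at the heart of this approach can be sketched briefly: for participants who lack the "gold standard" examination, case status is drawn as a Bernoulli variable from a logistic-regression-estimated probability, and the analysis is repeated across several imputed datasets. The probabilities and the number of imputations below are illustrative; the subsequent Cox or Poisson fits and Rubin's-rules combination are only indicated in comments.

```python
import numpy as np

rng = np.random.default_rng(2)
p_case = np.array([0.15, 0.60, 0.35, 0.80])   # estimated P(incident case) for missing exams
M = 20                                         # number of imputations (assumed)

imputed = rng.random((M, p_case.size)) < p_case   # M sets of imputed failure indicators
print(imputed.mean(axis=0))                       # per-person imputed case frequency

# Each imputed row would be merged with the observed data and the Cox (or Poisson)
# model refit; point estimates are averaged and variances combined via Rubin's rules.
```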
A geomorphic approach to 100-year floodplain mapping for the Conterminous United States
NASA Astrophysics Data System (ADS)
Jafarzadegan, Keighobad; Merwade, Venkatesh; Saksena, Siddharth
2018-06-01
Floodplain mapping using hydrodynamic models is difficult in data-scarce regions. Additionally, using hydrodynamic models to map floodplains over a large stream network can be computationally challenging. Some of these limitations of floodplain mapping using hydrodynamic modeling can be overcome by developing computationally efficient statistical methods to identify floodplains in large and ungauged watersheds using publicly available data. This paper proposes a geomorphic model to generate probabilistic 100-year floodplain maps for the Conterminous United States (CONUS). The proposed model first categorizes the watersheds in the CONUS into three classes based on the height of the water surface corresponding to the 100-year flood from the streambed. Next, the probability that any watershed in the CONUS belongs to one of these three classes is computed through supervised classification using watershed characteristics related to topography, hydrography, land use and climate. The result of this classification is then fed into a probabilistic threshold binary classifier (PTBC) to generate the probabilistic 100-year floodplain maps. The supervised classification algorithm is trained using the 100-year Flood Insurance Rate Maps (FIRM) from the U.S. Federal Emergency Management Agency (FEMA). FEMA FIRMs are also used to validate the performance of the proposed model in areas not included in the training. Additionally, HEC-RAS model generated flood inundation extents are used to validate the model performance at fifteen sites that lack FEMA maps. Validation results show that the probabilistic 100-year floodplain maps, generated by the proposed model, match well with both FEMA and HEC-RAS generated maps. On average, the error of predicted flood extents is around 14% across the CONUS. The high accuracy of the validation results shows the reliability of the geomorphic model as an alternative approach for fast and cost-effective delineation of 100-year floodplains for the CONUS.
Murphy, F Gregory; Hada, Ethan A; Doolette, David J; Howle, Laurens E
2017-07-01
Decompression sickness (DCS) is a disease caused by gas bubbles forming in body tissues following a reduction in ambient pressure, such as occurs in scuba diving. Probabilistic models for quantifying the risk of DCS are typically composed of a collection of independent, perfusion-limited theoretical tissue compartments which describe gas content or bubble volume within these compartments. It has been previously shown that 'pharmacokinetic' gas content models, with compartments coupled in series, show promise as predictors of the incidence of DCS. The mechanism of coupling can be through perfusion or diffusion. This work examines the application of five novel pharmacokinetic structures with compartments coupled by perfusion to the prediction of the probability and time of onset of DCS in humans. We optimize these models against a training set of human dive trial data consisting of 4335 exposures with 223 DCS cases. Further, we examine the extrapolation quality of the models on an additional set of human dive trial data consisting of 3140 exposures with 147 DCS cases. We find that pharmacokinetic models describe the incidence of DCS for single air bounce dives better than a single-compartment, perfusion-limited model. We further find the U.S. Navy LEM-NMRI98 is a better predictor of DCS risk for the entire training set than any of our pharmacokinetic models. However, one of the pharmacokinetic models we consider, the CS2T3 model, is a better predictor of DCS risk for single air bounce dives and oxygen decompression dives. Additionally, we find that LEM-NMRI98 outperforms CS2T3 on the extrapolation data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Interactive vs. Non-Interactive Multi-Model Ensembles
NASA Astrophysics Data System (ADS)
Duane, G. S.
2013-12-01
If the members of an ensemble of different models are allowed to interact with one another at run time, predictive skill can be improved as compared to that of any individual model or any average of individual model outputs. Inter-model connections in such an interactive ensemble can be trained, using historical data, so that the resulting 'supermodel' synchronizes with reality when used in weather-prediction mode, where the individual models perform data assimilation from each other (with trainable inter-model 'observation error') as well as from real observations. In climate-projection mode, parameters of the individual models are changed, as might occur from an increase in GHG levels, and one obtains relevant statistical properties of the new supermodel attractor. In simple cases, it has been shown that training of the inter-model connections with the old parameter values gives a supermodel that is still predictive when the parameter values are changed. Here we inquire as to the circumstances under which supermodel performance can be expected to exceed that of the customary weighted average of model outputs. We consider a supermodel formed from quasigeostrophic (QG) channel models with different forcing coefficients, and introduce an effective training scheme for the inter-model connections. We show that the blocked-zonal index cycle is reproduced better by the supermodel than by any non-interactive ensemble in the extreme case where the forcing coefficients of the different models are very large or very small. With realistic differences in forcing coefficients, as would be representative of actual differences among IPCC-class models, the usual linearity assumption is justified and a weighted average of model outputs is adequate. It is therefore hypothesized that supermodeling is likely to be useful in situations where there are qualitative model differences, as arising from sub-gridscale parameterizations, that affect overall model behavior. Otherwise the usual ex post facto averaging will probably suffice. The advantage of supermodeling is seen in statistics such as anticorrelation between blocking activity in the Atlantic and Pacific sectors, in the case of the QG channel model, rather than in overall blocking frequency. Likewise in climate models, the advantage of supermodeling is typically manifest in higher-order statistics rather than in quantities such as mean temperature.
Probabilistic estimation of dune retreat on the Gold Coast, Australia
Palmsten, Margaret L.; Splinter, Kristen D.; Plant, Nathaniel G.; Stockdon, Hilary F.
2014-01-01
Sand dunes are an important natural buffer between storm impacts and development backing the beach on the Gold Coast of Queensland, Australia. The ability to forecast dune erosion at a prediction horizon of days to a week would allow efficient and timely response to dune erosion in this highly populated area. Towards this goal, we modified an existing probabilistic dune erosion model for use on the Gold Coast. The original model was trained using observations of dune response from Hurricane Ivan on Santa Rosa Island, Florida, USA (Plant and Stockdon 2012. Probabilistic prediction of barrier-island response to hurricanes, Journal of Geophysical Research, 117(F3), F03015). The model relates dune position change to pre-storm dune elevations, dune widths, and beach widths, along with storm surge and run-up using a Bayesian network. The Bayesian approach captures the uncertainty of inputs and predictions through the conditional probabilities between variables. Three versions of the barrier island response Bayesian network were tested for use on the Gold Coast. One network has the same structure as the original and was trained with the Santa Rosa Island data. The second network has a modified design and was trained using only pre- and post-storm data from 1988-2009 for the Gold Coast. The third version of the network has the same design as the second version of the network and was trained with the combined data from the Gold Coast and Santa Rosa Island. The two networks modified for use on the Gold Coast hindcast dune retreat with equal accuracy. Both networks explained 60% of the observed dune retreat variance, which is comparable to the skill observed by Plant and Stockdon (2012) in the initial Bayesian network application at Santa Rosa Island. The new networks improved predictions relative to application of the original network on the Gold Coast. Dune width was the most important morphologic variable in hindcasting dune retreat, while the hydrodynamic variables, surge and run-up elevation, were also important.
NASA Astrophysics Data System (ADS)
Shin, Seulki; Moon, Yong-Jae; Chu, Hyoungseok
2017-08-01
As deep-learning methods have been applied successfully in various fields, they have high potential for application to space weather forecasting. The convolutional neural network, one such deep-learning method, is specialized for image recognition. In this study, we apply the AlexNet architecture, winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012, to the forecast of daily solar flare occurrence using the MatConvNet software for MATLAB. Our input images are SOHO/MDI, EIT 195Å, and 304Å from January 1996 to December 2010, and the output is a binary indicator of flare occurrence (yes or no). We select the training dataset from Jan 1996 to Dec 2000 and from Jan 2003 to Dec 2008. The testing dataset is chosen from Jan 2001 to Dec 2002 and from Jan 2009 to Dec 2010 in order to account for the solar cycle effect. Within the training dataset, we randomly select one fifth of the data as a validation dataset to avoid overfitting. Our model successfully forecasts flare occurrence with a probability of detection (POD) of about 0.90 for common flares (C-, M-, and X-class). While the POD for major flare (M- and X-class) forecasting is 0.96, the false alarm rate (FAR) is also relatively high (0.60). We also present several statistical parameters such as the critical success index (CSI) and true skill statistic (TSS). Our model can immediately be applied to an automatic forecasting service when image data are available.
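The verification measures quoted above come from a standard 2x2 contingency table; the sketch below shows those conventional definitions with placeholder counts (hits, misses, false alarms, correct negatives are illustrative numbers, not the study's values).

```python
def skill_scores(hits, misses, false_alarms, correct_negs):
    """Categorical verification scores from a 2x2 forecast contingency table."""
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio/rate as used in flare forecasting
    csi = hits / (hits + misses + false_alarms)     # critical success index
    pofd = false_alarms / (false_alarms + correct_negs)
    tss = pod - pofd                                # true skill statistic
    return pod, far, csi, tss

# Placeholder counts for illustration only
print(skill_scores(hits=90, misses=10, false_alarms=60, correct_negs=240))
```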
Riihimaki, Laura D.; Comstock, Jennifer M.; Anderson, Kevin K.; ...
2016-06-10
Knowledge of cloud phase (liquid, ice, mixed, etc.) is necessary to describe the radiative impact of clouds and their lifetimes, but is a property that is difficult to simulate correctly in climate models. One step towards improving those simulations is to make observations of cloud phase with sufficient accuracy to help constrain model representations of cloud processes. In this study, we outline a methodology using a basic Bayesian classifier to estimate the probabilities of cloud-phase class from Atmospheric Radiation Measurement (ARM) vertically pointing active remote sensors. The advantage of this method over previous ones is that it provides uncertainty information on the phase classification. We also test the value of including higher moments of the cloud radar Doppler spectrum than are traditionally used operationally. Using training data of known phase from the Mixed-Phase Arctic Cloud Experiment (M-PACE) field campaign, we demonstrate a proof of concept for how the method can be used to train an algorithm that identifies ice, liquid, mixed phase, and snow. Over 95 % of data are identified correctly for pure ice and liquid cases used in this study. Mixed-phase and snow cases are more problematic to identify correctly. When lidar data are not available, including additional information from the Doppler spectrum provides substantial improvement to the algorithm. As a result, this is a first step towards an operational algorithm and can be expanded to include additional categories such as drizzle with additional training data.
NASA Astrophysics Data System (ADS)
Riihimaki, Laura D.; Comstock, Jennifer M.; Anderson, Kevin K.; Holmes, Aimee; Luke, Edward
2016-06-01
Knowledge of cloud phase (liquid, ice, mixed, etc.) is necessary to describe the radiative impact of clouds and their lifetimes, but is a property that is difficult to simulate correctly in climate models. One step towards improving those simulations is to make observations of cloud phase with sufficient accuracy to help constrain model representations of cloud processes. In this study, we outline a methodology using a basic Bayesian classifier to estimate the probabilities of cloud-phase class from Atmospheric Radiation Measurement (ARM) vertically pointing active remote sensors. The advantage of this method over previous ones is that it provides uncertainty information on the phase classification. We also test the value of including higher moments of the cloud radar Doppler spectrum than are traditionally used operationally. Using training data of known phase from the Mixed-Phase Arctic Cloud Experiment (M-PACE) field campaign, we demonstrate a proof of concept for how the method can be used to train an algorithm that identifies ice, liquid, mixed phase, and snow. Over 95 % of data are identified correctly for pure ice and liquid cases used in this study. Mixed-phase and snow cases are more problematic to identify correctly. When lidar data are not available, including additional information from the Doppler spectrum provides substantial improvement to the algorithm. This is a first step towards an operational algorithm and can be expanded to include additional categories such as drizzle with additional training data.
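A minimal sketch of a Bayesian (naive Bayes) phase classifier of the kind described above, assuming synthetic features in place of the real radar/lidar observables; it returns class probabilities rather than a single label, which is the uncertainty information the abstract emphasizes.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
phases = ["ice", "liquid", "mixed", "snow"]

# Hypothetical training features standing in for reflectivity, Doppler velocity,
# spectrum width, and a higher Doppler-spectrum moment (e.g. skewness).
X_train = rng.normal(size=(400, 4)) + np.repeat(np.arange(4)[:, None], 100, axis=0)
y_train = np.repeat(phases, 100)

clf = GaussianNB().fit(X_train, y_train)

# Probabilistic output for one new range-gate observation
x_new = rng.normal(size=(1, 4)) + 2.0
for phase, prob in zip(clf.classes_, clf.predict_proba(x_new)[0]):
    print(f"P({phase}) = {prob:.2f}")
```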
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts
NASA Astrophysics Data System (ADS)
Balasubramanian, A.; Shamsuddin, R.; Prabhakaran, B.; Sawant, A.
2017-03-01
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of the different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the curve (AUC) for the receiver operating characteristic (ROC) curve. The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields the lowest accuracies (50-60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive for real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98). It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.
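For the binary (baseline shift vs. no shift) analysis, the following is a minimal sketch using one of the classifier families named above (ensemble AdaBoost) and the reported metrics; the trajectory-window features and labels here are synthetic placeholders, not the clinical data.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

# Hypothetical features of a 1-min trajectory window (mean, drift, variance, ...)
X = rng.normal(size=(1000, 6))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=1000) > 0.8).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]
print("accuracy :", accuracy_score(y_te, y_pred))
print("precision:", precision_score(y_te, y_pred))
print("recall   :", recall_score(y_te, y_pred))
print("AUC      :", roc_auc_score(y_te, y_prob))
```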
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts
Balasubramanian, A; Shamsuddin, R; Prabhakaran, B; Sawant, A
2017-01-01
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of the different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the curve (AUC) for the receiver operating characteristic (ROC) curve. The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5–91.4%); (ii) the predictive modeling yields the lowest accuracies (50–60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive for real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96–0.98). It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors. PMID:28075331
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts.
Balasubramanian, A; Shamsuddin, R; Prabhakaran, B; Sawant, A
2017-03-07
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of the different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the curve (AUC) for the receiver operating characteristic (ROC) curve. The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields the lowest accuracies (50-60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive for real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98). It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.
Nemet, Dan; Eliakim, Alon
2010-01-01
Physical activity plays an important role in tissue anabolism, growth and development, but the mechanisms that link patterns of exercise with tissue anabolism are not completely understood. The effectiveness of physical training depends on the training load and on the individual ability to tolerate it, and an imbalance between the two may lead to under- or over-training. Therefore, many efforts have been made to find objective parameters to quantify the balance between training load and the athlete's tolerance. One of the unique features of exercise is that it leads to a simultaneous increase of antagonistic mediators. On the one hand, exercise stimulates anabolic components of the growth hormone (GH) → IGF-1 (insulin-like growth factor-1) axis. On the other hand, exercise elevates catabolic pro-inflammatory cytokines such as interleukin-6 (IL-6), IL-1 and tumor necrosis factor-α (TNF-α). This probably emphasizes the importance of optimal adaptation to exercise, particularly during adolescence. The very fine balance between the anabolic and inflammatory/catabolic response to exercise will determine the effectiveness of exercise training and the health consequences of exercise. If the anabolic response is stronger, exercise will probably ultimately lead to increased muscle mass and improved fitness. A greater catabolic response, in particular if it persists for a long duration, may lead to overtraining. Therefore, changes in the anabolic-catabolic hormonal balance and in circulating inflammatory cytokines can be used by adolescent athletes and/or their coaches to gauge the training intensity in individual and team sports. Copyright © 2010 S. Karger AG, Basel.
Learning About Climate and Atmospheric Models Through Machine Learning
NASA Astrophysics Data System (ADS)
Lucas, D. D.
2017-12-01
From the analysis of ensemble variability to improving simulation performance, machine learning algorithms can play a powerful role in understanding the behavior of atmospheric and climate models. To learn about model behavior, we create training and testing data sets through ensemble techniques that sample different model configurations and values of input parameters, and then use supervised machine learning to map the relationships between the inputs and outputs. Following this procedure, we have used support vector machines, random forests, gradient boosting and other methods to investigate a variety of atmospheric and climate model phenomena. We have used machine learning to predict simulation crashes, estimate the probability density function of climate sensitivity, optimize simulations of the Madden Julian oscillation, assess the impacts of weather and emissions uncertainty on atmospheric dispersion, and quantify the effects of model resolution changes on precipitation. This presentation highlights recent examples of our applications of machine learning to improve the understanding of climate and atmospheric models. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
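A minimal sketch of the input-output mapping described above: sample model input parameters, evaluate (here, a stand-in function for) the simulation diagnostic, and fit a supervised surrogate whose feature importances indicate which inputs drive the output. The toy response function and parameter count are purely illustrative, not the laboratory's actual ensembles.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

# Ensemble of hypothetical model configurations: 5 input parameters per run
params = rng.uniform(0.0, 1.0, size=(300, 5))

# Stand-in for an expensive climate/atmospheric model output diagnostic
output = 3.0 * params[:, 0] + np.sin(4 * params[:, 2]) + 0.1 * rng.normal(size=300)

surrogate = RandomForestRegressor(n_estimators=200, random_state=0)
surrogate.fit(params, output)

print("feature importances:", np.round(surrogate.feature_importances_, 3))
```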
A template-finding algorithm and a comprehensive benchmark for homology modeling of proteins
Vallat, Brinda Kizhakke; Pillardy, Jaroslaw; Elber, Ron
2010-01-01
The first step in homology modeling is to identify a template protein for the target sequence. The template structure is used in later phases of the calculation to construct an atomically detailed model for the target. We have built from the Protein Data Bank a large-scale learning set that includes tens of millions of pair matches that can be either a true template or a false one. Discriminatory learning (learning from positive and negative examples) is employed to train a decision tree. Each branch of the tree is a mathematical programming model. The decision tree is tested on an independent set from PDB entries and on the sequences of CASP7. It provides significant enrichment of true templates (between 50-100 percent) when compared to PSI-BLAST. The model is further verified by building atomically detailed structures for each of the tentative true templates with MODELLER. The probability that a true match does not yield an acceptable structural model (within 6Å RMSD from the native structure) decays linearly as a function of the TM structural-alignment score. PMID:18300226
ERIC Educational Resources Information Center
Plowman, Sharon Ann
A review of previous research was completed to determine (a) the response of the cardiac time components of the left ventricle to varying types and intensities of training programs, (b) the probable physiological explanations for these responses, and (c) the significance of the changes which did or did not occur. It was found that, at rest,…
Probability Theory Plus Noise: Descriptive Estimation and Inferential Judgment.
Costello, Fintan; Watts, Paul
2018-01-01
We describe a computational model of two central aspects of people's probabilistic reasoning: descriptive probability estimation and inferential probability judgment. This model assumes that people's reasoning follows standard frequentist probability theory, but it is subject to random noise. This random noise has a regressive effect in descriptive probability estimation, moving probability estimates away from normative probabilities and toward the center of the probability scale. This random noise has an anti-regressive effect in inferential judgment, however. These regressive and anti-regressive effects explain various reliable and systematic biases seen in people's descriptive probability estimation and inferential probability judgment. This model predicts that these contrary effects will tend to cancel out in tasks that involve both descriptive estimation and inferential judgment, leading to unbiased responses in those tasks. We test this model by applying it to one such task, described by Gallistel et al. Participants' median responses in this task were unbiased, agreeing with normative probability theory over the full range of responses. Our model captures the pattern of unbiased responses in this task, while simultaneously explaining systematic biases away from normatively correct probabilities seen in other tasks. Copyright © 2018 Cognitive Science Society, Inc.
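A sketch of the regressive effect described above, under the common formalization (an assumption made here for illustration, not a restatement of the authors' equations) that each remembered instance is misread with probability d, so the mean descriptive estimate becomes (1 - 2d)p + d and is pulled toward 0.5.

```python
import numpy as np

def noisy_estimate(p_true: float, d: float, n_samples: int, rng) -> float:
    """Mean descriptive estimate under a probability-plus-noise assumption:
    each of n_samples remembered instances is flipped with probability d."""
    events = rng.random(n_samples) < p_true      # true occurrences
    flipped = rng.random(n_samples) < d          # random read errors
    recalled = np.where(flipped, ~events, events)
    return recalled.mean()

rng = np.random.default_rng(4)
d = 0.2
for p in (0.1, 0.5, 0.9):
    est = np.mean([noisy_estimate(p, d, 100, rng) for _ in range(2000)])
    print(f"p = {p:.1f}  simulated mean = {est:.3f}  analytic (1-2d)p + d = {(1 - 2*d)*p + d:.3f}")
```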
NASA Astrophysics Data System (ADS)
Jenuwine, Natalia M.; Mahesh, Sunny N.; Furst, Jacob D.; Raicu, Daniela S.
2018-02-01
Early detection of lung nodules from CT scans is key to improving lung cancer treatment, but poses a significant challenge for radiologists due to the high throughput required of them. Computer-Aided Detection (CADe) systems aim to automatically detect these nodules with computer algorithms, thus improving diagnosis. These systems typically use a candidate selection step, which identifies all objects that resemble nodules, followed by a machine learning classifier which separates true nodules from false positives. We create a CADe system that uses a 3D convolutional neural network (CNN) to detect nodules in CT scans without a candidate selection step. Using data from the LIDC database, we train a 3D CNN to analyze subvolumes from anywhere within a CT scan and output the probability that each subvolume contains a nodule. Once trained, we apply our CNN to detect nodules from entire scans, by systematically dividing the scan into overlapping subvolumes which we input into the CNN to obtain the corresponding probabilities. By enabling our network to process an entire scan, we expect to streamline the detection process while maintaining its effectiveness. Our results imply that with continued training using an iterative training scheme, the one-step approach has the potential to be highly effective.
Statistical appearance models based on probabilistic correspondences.
Krüger, Julia; Ehrhardt, Jan; Handels, Heinz
2017-04-01
Model-based image analysis is indispensable in medical image processing. One key aspect of building statistical shape and appearance models is the determination of one-to-one correspondences in the training data set. At the same time, the identification of these correspondences is the most challenging part of such methods. In our earlier work, we developed an alternative method using correspondence probabilities instead of exact one-to-one correspondences for a statistical shape model (Hufnagel et al., 2008). In this work, a new approach for statistical appearance models without one-to-one correspondences is proposed. A sparse image representation is used to build a model that combines point position and appearance information at the same time. Probabilistic correspondences between the derived multi-dimensional feature vectors are used to omit the need for extensive preprocessing of finding landmarks and correspondences as well as to reduce the dependence of the generated model on the landmark positions. Model generation and model fitting can now be expressed by optimizing a single global criterion derived from a maximum a-posteriori (MAP) approach with respect to model parameters that directly affect both shape and appearance of the considered objects inside the images. The proposed approach describes statistical appearance modeling in a concise and flexible mathematical framework. Besides eliminating the demand for costly correspondence determination, the method allows for additional constraints as topological regularity in the modeling process. In the evaluation the model was applied for segmentation and landmark identification in hand X-ray images. The results demonstrate the feasibility of the model to detect hand contours as well as the positions of the joints between finger bones for unseen test images. Further, we evaluated the model on brain data of stroke patients to show the ability of the proposed model to handle partially corrupted data and to demonstrate a possible employment of the correspondence probabilities to indicate these corrupted/pathological areas. Copyright © 2017 Elsevier B.V. All rights reserved.
Kim, Yeonho; Nabili, Marjan; Acharya, Priyanka; Lopez, Asis; Myers, Matthew R
2017-01-01
Safety analyses of transcranial therapeutic ultrasound procedures require knowledge of the dependence of the rupture probability and rupture time upon sonication parameters. As previous vessel-rupture studies have concentrated on a specific set of exposure conditions, there is a need for more comprehensive parametric studies. Probability of rupture and rupture times were measured by exposing the large blood vessel of a live earthworm to high-intensity focused ultrasound pulse trains of various characteristics. Pressures generated by the ultrasound transducers were estimated through numerical solutions to the KZK (Khokhlov-Zabolotskaya-Kuznetsov) equation. Three ultrasound frequencies (1.1, 2.5, and 3.3 MHz) were considered, as were three pulse repetition frequencies (1, 3, and 10 Hz), and two duty factors (0.0001, 0.001). The pressures produced ranged from 4 to 18 MPa. Exposures of up to 10 min in duration were employed. Trials were repeated an average of 11 times. No trends as a function of pulse repetition rate were identifiable, for either probability of rupture or rupture time. Rupture time was found to be a strong function of duty factor at the lower pressures; at 1.1 MHz the rupture time was an order of magnitude lower for the 0.001 duty factor than the 0.0001. At moderate pressures, the difference between the duty factors was less, and there was essentially no difference between duty factors at the highest pressure. Probability of rupture was not found to be a strong function of duty factor. Rupture thresholds were about 4 MPa for the 1.1 MHz frequency, 7 MPa at 3.3 MHz, and 11 MPa for the 2.5 MHz, though the pressure value at 2.5 MHz frequency will likely be reduced when steep-angle corrections are accounted for in the KZK model used to estimate pressures. Mechanical index provided a better collapse of the data (less separation of the curves pertaining to the different frequencies) than peak negative pressure, for both probability of rupture and rupture time. The results provide a database with which investigations in more complex animal models can be compared, potentially establishing trends by which bioeffects in human vessels can be estimated.
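The mechanical index referenced above is conventionally the (derated) peak negative pressure in megapascals divided by the square root of the center frequency in megahertz; the small sketch below applies that convention to the approximate rupture-threshold pressures quoted in the abstract, keeping in mind the abstract's caveat that the 2.5 MHz pressure is likely overestimated.

```python
from math import sqrt

def mechanical_index(p_neg_mpa: float, freq_mhz: float) -> float:
    """Conventional mechanical index: peak negative pressure (MPa) / sqrt(f in MHz)."""
    return p_neg_mpa / sqrt(freq_mhz)

# Approximate rupture-threshold pressures quoted in the abstract (MHz -> MPa)
thresholds = {1.1: 4.0, 3.3: 7.0, 2.5: 11.0}
for f, p in thresholds.items():
    print(f"{f} MHz: threshold {p} MPa -> MI ≈ {mechanical_index(p, f):.1f}")
```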
Generating Seismograms with Deep Neural Networks
NASA Astrophysics Data System (ADS)
Krischer, L.; Fichtner, A.
2017-12-01
The recent surge of successful uses of deep neural networks in computer vision, speech recognition, and natural language processing, mainly enabled by the availability of fast GPUs and extremely large data sets, is starting to see many applications across all natural sciences. In seismology these are largely confined to classification and discrimination tasks. In this contribution we explore the use of deep neural networks for another class of problems: so-called generative models. Generative modelling is a branch of statistics concerned with generating new observed data samples, usually by drawing from some underlying probability distribution. Samples with specific attributes can be generated by conditioning on input variables. In this work we condition on seismic source (mechanism and location) and receiver (location) parameters to generate multi-component seismograms. The deep neural networks are trained on synthetic data calculated with Instaseis (http://instaseis.net, van Driel et al. (2015)) and waveforms from the global ShakeMovie project (http://global.shakemovie.princeton.edu, Tromp et al. (2010)). The underlying radially symmetric or smoothly three dimensional Earth structures result in comparatively small waveform differences from similar events or at close receivers and the networks learn to interpolate between training data samples. Of particular importance is the chosen misfit functional. Generative adversarial networks (Goodfellow et al. (2014)) implement a system in which two networks compete: the generator network creates samples and the discriminator network distinguishes these from the true training examples. Both are trained in an adversarial fashion until the discriminator can no longer distinguish between generated and real samples. We show how this can be applied to seismograms and in particular how it compares to networks trained with more conventional misfit metrics. Last but not least we attempt to shed some light on the black-box nature of neural networks by estimating the quality and uncertainties of the generated seismograms.
Sensitivity study of Space Station Freedom operations cost and selected user resources
NASA Technical Reports Server (NTRS)
Accola, Anne; Fincannon, H. J.; Williams, Gregory J.; Meier, R. Timothy
1990-01-01
The results of sensitivity studies performed to estimate probable ranges for four key Space Station parameters using the Space Station Freedom's Model for Estimating Space Station Operations Cost (MESSOC) are discussed. The variables examined are grouped into five main categories: logistics, crew, design, space transportation system, and training. The modification of these variables implies programmatic decisions in areas such as orbital replacement unit (ORU) design, investment in repair capabilities, and crew operations policies. The model utilizes a wide range of algorithms and an extensive trial logistics data base to represent Space Station operations. The trial logistics data base consists largely of a collection of the ORUs that comprise the mature station, and their characteristics based on current engineering understanding of the Space Station. A nondimensional approach is used to examine the relative importance of variables on parameters.
UQ for Decision Making: How (at least five) Kinds of Probability Might Come Into Play
NASA Astrophysics Data System (ADS)
Smith, L. A.
2013-12-01
In 1959 IJ Good published the discussion "Kinds of Probability" in Science. Good identified (at least) five kinds. The need for (at least) a sixth kind of probability when quantifying uncertainty in the context of climate science is discussed. This discussion brings out the differences between weather-like forecasting tasks and climate-like tasks, with a focus on the effective use both of science and of modelling in support of decision making. Good also introduced the idea of a "Dynamic probability," a probability one expects to change without any additional empirical evidence; the probabilities assigned by a chess playing program when it is only halfway through its analysis being an example. This case is contrasted with the case of "Mature probabilities," where a forecast algorithm (or model) has converged on its asymptotic probabilities and the question hinges on whether or not those probabilities are expected to change significantly before the event in question occurs, even in the absence of new empirical evidence. If so, how might one rationally report and deploy such immature probabilities in scientific support of decision-making? Mature probability is suggested as a useful sixth kind; although Good would doubtless argue that we can get by with just one, effective communication with decision makers may be enhanced by speaking as if the others existed. This again highlights the distinction between weather-like contexts and climate-like contexts. In the former context one has access to a relevant climatology (a relevant, arguably informative distribution prior to any model simulations); in the latter context that information is not available, although one can fall back on the scientific basis upon which the model itself rests, and estimate the probability that the model output is in fact misinformative. This subjective "probability of a big surprise" is one way to communicate the probability of model-based information holding in practice, i.e., the probability that the information the model-based probability is conditioned on holds. It is argued that no model-based climate-like probability forecast is complete without a quantitative estimate of its own irrelevance, and that the clear identification of model-based probability forecasts as mature or immature is a critical element for maintaining the credibility of science-based decision support, and can shape uncertainty quantification more widely.
Storm-based Cloud-to-Ground Lightning Probabilities and Warnings
NASA Astrophysics Data System (ADS)
Calhoun, K. M.; Meyer, T.; Kingfield, D.
2017-12-01
A new cloud-to-ground (CG) lightning probability algorithm has been developed using machine-learning methods. With storm-based inputs of Earth Networks' in-cloud lightning, Vaisala's CG lightning, multi-radar/multi-sensor (MRMS) radar derived products including the Maximum Expected Size of Hail (MESH) and Vertically Integrated Liquid (VIL), and near storm environmental data including lapse rate and CAPE, a random forest algorithm was trained to produce probabilities of CG lightning up to one-hour in advance. As part of the Prototype Probabilistic Hazard Information experiment in the Hazardous Weather Testbed in 2016 and 2017, National Weather Service forecasters were asked to use this CG lightning probability guidance to create rapidly updating probability grids and warnings for the threat of CG lightning for 0-60 minutes. The output from forecasters was shared with end-users, including emergency managers and broadcast meteorologists, as part of an integrated warning team.
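A minimal sketch of the kind of storm-based random-forest probability guidance described above, using synthetic stand-ins for the named predictors (in-cloud flash rate, MESH, VIL, lapse rate, CAPE); feature values, labels, and the trained model are illustrative only and do not reproduce the operational algorithm or its training data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)

# Synthetic storm-based predictors: [IC flash rate, MESH, VIL, lapse rate, CAPE]
X = rng.uniform(size=(2000, 5)) * [50, 30, 40, 9, 3000]
y = ((X[:, 0] > 15) & (X[:, 2] > 10)).astype(int)    # toy "CG within 1 h" label

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

new_storm = np.array([[25.0, 12.0, 18.0, 7.5, 1800.0]])
print("P(CG lightning in next hour) ≈", model.predict_proba(new_storm)[0, 1])
```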
Takemura, Kazuhisa; Murakami, Hajime
2016-01-01
A probability weighting function (w(p)) is considered to be a nonlinear function of probability (p) in behavioral decision theory. This study proposes a psychophysical model of probability weighting functions derived from a hyperbolic time discounting model and a geometric distribution. The aim of the study is to show probability weighting functions from the point of view of waiting time for a decision maker. Since the expected value of a geometrically distributed random variable X is 1/p, we formulated the probability weighting function of the expected value model for hyperbolic time discounting as w(p) = (1 - k log p)^(-1). Moreover, the probability weighting function is derived from Loewenstein and Prelec's (1992) generalized hyperbolic time discounting model. The latter model is proved to be equivalent to the hyperbolic-logarithmic weighting function considered by Prelec (1998) and Luce (2001). In this study, we derive a model from the generalized hyperbolic time discounting model assuming Fechner's (1860) psychophysical law of time and a geometric distribution of trials. In addition, we develop median models of hyperbolic time discounting and generalized hyperbolic time discounting. To illustrate the fit of each model, a psychological experiment was conducted to assess the probability weighting and value functions at the level of the individual participant. The participants were 50 university students. The results of individual analysis indicated that the expected value model of generalized hyperbolic discounting fitted better than previous probability weighting decision-making models. The theoretical implications of this finding are discussed.
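A short sketch of the expected-value weighting function stated in the abstract, w(p) = (1 - k log p)^(-1), evaluated over a grid of probabilities for an illustrative discount parameter k (the value k = 0.5 is arbitrary, not an estimate from the experiment).

```python
import numpy as np

def w_hyperbolic(p: np.ndarray, k: float) -> np.ndarray:
    """Probability weighting function from hyperbolic discounting of waiting time:
    w(p) = 1 / (1 - k * ln p)."""
    return 1.0 / (1.0 - k * np.log(p))

p = np.linspace(0.01, 1.0, 9)
for pi, wi in zip(p, w_hyperbolic(p, k=0.5)):
    print(f"p = {pi:.2f}  w(p) = {wi:.3f}")
```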
Alzheimer's Disease Diagnosis in Individual Subjects using Structural MR Images: Validation Studies
Vemuri, Prashanthi; Gunter, Jeffrey L.; Senjem, Matthew L.; Whitwell, Jennifer L.; Kantarci, Kejal; Knopman, David S.; Boeve, Bradley F.; Petersen, Ronald C.; Jack, Clifford R.
2008-01-01
OBJECTIVE To develop and validate a tool for Alzheimer's disease (AD) diagnosis in individual subjects using support vector machine (SVM) based classification of structural MR (sMR) images. BACKGROUND Libraries of sMR scans of clinically well characterized subjects can be harnessed for the purpose of diagnosing new incoming subjects. METHODS 190 patients with probable AD were age- and gender-matched with 190 cognitively normal (CN) subjects. Three different classification models were implemented: Model I uses tissue densities obtained from sMR scans to give a STructural Abnormality iNDex (STAND) score; and Models II and III use tissue densities as well as covariates (demographics and Apolipoprotein E genotype) to give an adjusted-STAND (aSTAND) score. Data from 140 AD and 140 CN were used for training. The SVM parameter optimization and training was done by four-fold cross validation. The remaining independent sample of 50 AD and 50 CN was used to obtain a minimally biased estimate of the generalization error of the algorithm. RESULTS The CV accuracies of the Model II and Model III aSTAND-scores were 88.5% and 89.3%, respectively, and the developed models generalized well on the independent test datasets. Anatomic patterns best differentiating the groups were consistent with the known distribution of neurofibrillary AD pathology. CONCLUSIONS This paper presents preliminary evidence that application of SVM-based classification of an individual sMR scan relative to a library of scans can provide useful information in individual subjects for diagnosis of AD. Including demographic and genetic information in the classification algorithm slightly improves diagnostic accuracy. PMID:18054253
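A minimal sketch of the SVM-with-cross-validation workflow the abstract describes: tune the classifier by four-fold cross-validation on a training set and report accuracy on an independent held-out set. The synthetic "tissue density" features and group sizes stand in for the real MRI-derived inputs, which are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(6)

# Synthetic stand-in for tissue-density features (AD = 1, cognitively normal = 0)
X = rng.normal(size=(380, 20))
y = np.r_[np.ones(190), np.zeros(190)].astype(int)
X[y == 1, :5] += 0.8                     # toy group difference

# Hold out an independent test set; tune the SVM by four-fold cross-validation
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=100, stratify=y, random_state=0)
grid = GridSearchCV(SVC(kernel="linear"), {"C": [0.01, 0.1, 1, 10]}, cv=4)
grid.fit(X_tr, y_tr)

print("CV accuracy  :", grid.best_score_)
print("test accuracy:", grid.score(X_te, y_te))
```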
Du, Bo; Zhang, Yuxiang; Zhang, Liangpei; Tao, Dacheng
2016-08-18
Hyperspectral images provide great potential for target detection; however, they also introduce new challenges, meaning that hyperspectral target detection should be treated as a new problem and modeled differently. Many classical detectors are proposed based on the linear mixing model and the sparsity model. However, the former type of model cannot deal well with spectral variability in limited endmembers, and the latter type of model usually treats the target detection as a simple classification problem and pays less attention to the low target probability. In this case, can we find an efficient way to utilize both the high-dimension features behind hyperspectral images and the limited target information to extract small targets? This paper proposes a novel sparsity-based detector named the hybrid sparsity and statistics detector (HSSD) for target detection in hyperspectral imagery, which can effectively deal with the above two problems. The proposed algorithm designs a hypothesis-specific dictionary based on the prior hypotheses for the test pixel, which can avoid the imbalanced number of training samples for a class-specific dictionary. Then, a purification process is employed for the background training samples in order to construct an effective competition between the two hypotheses. Next, a sparse representation based binary hypothesis model merged with additive Gaussian noise is proposed to represent the image. Finally, a generalized likelihood ratio test is performed to obtain a more robust detection decision than the reconstruction residual based detection methods. Extensive experimental results with three hyperspectral datasets confirm that the proposed HSSD algorithm clearly outperforms the state-of-the-art target detectors.
NASA Astrophysics Data System (ADS)
Zhang, Jun; Cain, Elizabeth Hope; Saha, Ashirbani; Zhu, Zhe; Mazurowski, Maciej A.
2018-02-01
Breast mass detection in mammography and digital breast tomosynthesis (DBT) is an essential step in computerized breast cancer analysis. Deep learning-based methods incorporate feature extraction and model learning into a unified framework and have achieved impressive performance in various medical applications (e.g., disease diagnosis, tumor detection, and landmark detection). However, these methods require large-scale accurately annotated data. Unfortunately, it is challenging to get precise annotations of breast masses. To address this issue, we propose a fully convolutional network (FCN) based heatmap regression method for breast mass detection, using only weakly annotated mass regions in mammography images. Specifically, we first generate heat maps of masses based on human-annotated rough regions for breast masses. We then develop an FCN model for end-to-end heatmap regression with an F-score loss function, where the mammography images are regarded as the input and heatmaps for breast masses are used as the output. Finally, the probability map of mass locations can be estimated with the trained model. Experimental results on a mammography dataset with 439 subjects demonstrate the effectiveness of our method. Furthermore, we evaluate whether we can use mammography data to improve detection models for DBT, since mammography shares similar structure with tomosynthesis. We propose a transfer learning strategy by fine-tuning the learned FCN model from mammography images. We test this approach on a small tomosynthesis dataset with only 40 subjects, and we show an improvement in the detection performance as compared to training the model from scratch.
Bivariate categorical data analysis using normal linear conditional multinomial probability model.
Sun, Bingrui; Sutradhar, Brajendra
2015-02-10
Bivariate multinomial data such as the left and right eyes retinopathy status data are analyzed either by using a joint bivariate probability model or by exploiting certain odds ratio-based association models. However, the joint bivariate probability model yields marginal probabilities, which are complicated functions of marginal and association parameters for both variables, and the odds ratio-based association model treats the odds ratios involved in the joint probabilities as 'working' parameters, which are consequently estimated through certain arbitrary 'working' regression models. Also, this latter odds ratio-based model does not provide any easy interpretation of the correlations between the two categorical variables. On the basis of pre-specified marginal probabilities, in this paper, we develop a bivariate normal type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi-likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well-known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.
Psychotherapy in Mexico: practice, training, and regulation.
Sanchez-Sosa, Juan Jose
2007-08-01
Psychotherapy conducted by psychologists in Mexico has a long history and shows promising developments but offers a relatively limited choice for health care recipients, especially in public facilities. Psychotherapy by psychologists occurs mainly in private practice, although it is spreading to public institutions such as hospitals and outpatient clinics. Most clinical psychologists in Mexico are trained in some type of psychodynamic approach, although the use of cognitive-behavioral treatments is spreading quickly. The probability that a patient will actually be seen by a psychologist depends mainly on such characteristics of the patient as socioeconomic status, place of residence, and insurance coverage, if any. These and other attributes of psychotherapy in Mexico are illustrated by the probable treatment of Mrs. A. Psychotherapy in Mexico continues to evolve toward both multidisciplinary work and evidence-based practices. (c) 2007 Wiley Periodicals, Inc.
NASA Technical Reports Server (NTRS)
Watson, Clifford
2010-01-01
Traditional hazard analysis techniques utilize a two-dimensional representation of the results determined by relative likelihood and severity of the residual risk. These matrices present a quick-look at the Likelihood (Y-axis) and Severity (X-axis) of the probable outcome of a hazardous event. A three-dimensional method, described herein, utilizes the traditional X and Y axes, while adding a new, third dimension, shown as the Z-axis, and referred to as the Level of Control. The elements of the Z-axis are modifications of the Hazard Elimination and Control steps (also known as the Hazard Reduction Precedence Sequence). These steps are: 1. Eliminate risk through design. 2. Substitute less risky materials for more hazardous materials. 3. Install safety devices. 4. Install caution and warning devices. 5. Develop administrative controls (to include special procedures and training.) 6. Provide protective clothing and equipment. When added to the two-dimensional models, the level of control adds a visual representation of the risk associated with the hazardous condition, creating a tall-pole for the least-well-controlled failure while establishing the relative likelihood and severity of all causes and effects for an identified hazard. Computer modeling of the analytical results, using spreadsheets and three-dimensional charting, gives a visual confirmation of the relationship between causes and their controls.
NASA Technical Reports Server (NTRS)
Watson, Clifford C.
2011-01-01
Traditional hazard analysis techniques utilize a two-dimensional representation of the results determined by relative likelihood and severity of the residual risk. These matrices present a quick-look at the Likelihood (Y-axis) and Severity (X-axis) of the probable outcome of a hazardous event. A three-dimensional method, described herein, utilizes the traditional X and Y axes, while adding a new, third dimension, shown as the Z-axis, and referred to as the Level of Control. The elements of the Z-axis are modifications of the Hazard Elimination and Control steps (also known as the Hazard Reduction Precedence Sequence). These steps are: 1. Eliminate risk through design. 2. Substitute less risky materials for more hazardous materials. 3. Install safety devices. 4. Install caution and warning devices. 5. Develop administrative controls (to include special procedures and training.) 6. Provide protective clothing and equipment. When added to the two-dimensional models, the level of control adds a visual representation of the risk associated with the hazardous condition, creating a tall-pole for the least-well-controlled failure while establishing the relative likelihood and severity of all causes and effects for an identified hazard. Computer modeling of the analytical results, using spreadsheets and three-dimensional charting gives a visual confirmation of the relationship between causes and their controls.
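A small sketch of how the three-dimensional representation could be scored, assuming ordinal scales for likelihood and severity and mapping the Z-axis from the hazard reduction precedence sequence listed above; the composite scoring rule itself is illustrative, not a prescribed NASA standard.

```python
# Illustrative 3-D hazard scoring: likelihood (1-5), severity (1-4), and a
# level-of-control value taken from the hazard reduction precedence sequence
# (1 = eliminated by design ... 6 = protective equipment only).
CONTROL_LEVELS = {
    "design elimination": 1,
    "material substitution": 2,
    "safety device": 3,
    "caution/warning device": 4,
    "administrative control": 5,
    "protective equipment": 6,
}

def hazard_score(likelihood: int, severity: int, control: str) -> int:
    """Toy composite score: larger means less well controlled, higher residual risk."""
    return likelihood * severity * CONTROL_LEVELS[control]

print(hazard_score(4, 3, "administrative control"))   # the "tall pole"
print(hazard_score(4, 3, "design elimination"))        # same event, well controlled
```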
Esfahani, Mohammad Shahrokh; Dougherty, Edward R
2015-01-01
Phenotype classification via genomic data is hampered by small sample sizes that negatively impact classifier design. Utilization of prior biological knowledge in conjunction with training data can improve both classifier design and error estimation via the construction of the optimal Bayesian classifier. In the genomic setting, gene/protein signaling pathways provide a key source of biological knowledge. Although these pathways are neither complete nor regulatory, with no timing associated with them, they are capable of constraining the set of possible models representing the underlying interaction between molecules. The aim of this paper is to provide a framework and the mathematical tools to transform signaling pathways to prior probabilities governing uncertainty classes of feature-label distributions used in classifier design. Structural motifs extracted from the signaling pathways are mapped to a set of constraints on a prior probability on a multinomial distribution. Since the Dirichlet distribution is the conjugate prior for the multinomial distribution, we propose optimization paradigms to estimate its parameters in the Bayesian setting. The performance of the proposed methods is tested on two widely studied pathways: the mammalian cell cycle and a p53 pathway model.
Post-processing of multi-model ensemble river discharge forecasts using censored EMOS
NASA Astrophysics Data System (ADS)
Hemri, Stephan; Lisniak, Dmytro; Klein, Bastian
2014-05-01
When forecasting water levels and river discharge, ensemble weather forecasts are used as meteorological input to hydrologic process models. As hydrologic models are imperfect and the input ensembles tend to be biased and underdispersed, the output ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, statistical post-processing is required in order to achieve calibrated and sharp predictions. Standard post-processing methods such as Ensemble Model Output Statistics (EMOS) that have their origins in meteorological forecasting are now increasingly being used in hydrologic applications. Here we consider two sub-catchments of River Rhine, for which the forecasting system of the Federal Institute of Hydrology (BfG) uses runoff data that are censored below predefined thresholds. To address this methodological challenge, we develop a censored EMOS method that is tailored to such data. The censored EMOS forecast distribution can be understood as a mixture of a point mass at the censoring threshold and a continuous part based on a truncated normal distribution. Parameter estimates of the censored EMOS model are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over the training dataset. Model fitting on Box-Cox transformed data allows us to take account of the positive skewness of river discharge distributions. In order to achieve realistic forecast scenarios over an entire range of lead-times, there is a need for multivariate extensions. To this end, we smooth the marginal parameter estimates over lead-times. In order to obtain realistic scenarios of discharge evolution over time, the marginal distributions have to be linked with each other. To this end, the multivariate dependence structure can either be adopted from the raw ensemble like in Ensemble Copula Coupling (ECC), or be estimated from observations in a training period. The censored EMOS model has been applied to multi-model ensemble forecasts issued on a daily basis over a period of three years. For the two catchments considered, this resulted in well calibrated and sharp forecast distributions over all lead-times from 1 to 114 h. Training observations tended to be better indicators for the dependence structure than the raw ensemble.
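A minimal sketch of the censored-EMOS predictive distribution described above: a point mass at the censoring threshold plus a continuous part based on a normal distribution, with location and scale taken as affine functions of the raw-ensemble mean and spread. The coefficient values are placeholders, and the CRPS-based fitting, Box-Cox transformation, and multivariate coupling are omitted.

```python
import numpy as np
from scipy.stats import norm

def censored_emos_cdf(y, ens_mean, ens_std, a, b, c, d, threshold):
    """Predictive CDF with mean a + b*ens_mean and std c + d*ens_std,
    left-censored at `threshold` (all mass below the threshold sits on it)."""
    mu = a + b * ens_mean
    sigma = c + d * ens_std
    y = np.asarray(y, dtype=float)
    cdf = norm.cdf(y, loc=mu, scale=sigma)
    return np.where(y < threshold, 0.0, cdf)

# Placeholder coefficients; in practice a, b, c, d would be found by minimizing
# the CRPS over a training period, on Box-Cox transformed discharge.
ens_mean, ens_std, threshold = 120.0, 15.0, 50.0
for q in (60.0, 120.0, 180.0):
    print(q, censored_emos_cdf(q, ens_mean, ens_std, a=5.0, b=0.95, c=2.0, d=1.1,
                               threshold=threshold))
```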
Does formal research training lead to academic success in otolaryngology?
Bobian, Michael R; Shah, Noor; Svider, Peter F; Hong, Robert S; Shkoukani, Mahdi A; Folbe, Adam J; Eloy, Jean Anderson
2017-01-01
To evaluate whether formalized research training is associated with higher researcher productivity, academic rank, and acquisition of National Institutes of Health (NIH) grants within academic otolaryngology departments. Each of the 100 civilian otolaryngology programs' departmental websites was analyzed to obtain a comprehensive list of faculty members' credentials and characteristics, including academic rank, completion of a clinical fellowship, completion of a formal research fellowship, and attainment of a doctor of philosophy (PhD) degree. We also recorded measures of scholarly impact and successful acquisition of NIH funding. A total of 1,495 academic physicians were included in our study. Of these, 14.1% had formal research training. Bivariate associations showed that formal research training was associated with a greater h-index, increased probability of acquiring NIH funding, and higher academic rank. Using a linear regression model, we found that otolaryngologists possessing a PhD had an h-index 1.8 points higher, and those who completed a formal research fellowship had an h-index 1.6 points higher. A PhD degree or completion of a research fellowship was not associated with a higher academic rank; however, a higher h-index and previous acquisition of an NIH grant were associated with a higher academic rank. The attainment of NIH funding was three times more likely for those with a formal research fellowship and 8.6 times more likely for otolaryngologists with a PhD degree. Formalized research training is associated with academic success in otolaryngology. Such dedicated research training accompanies greater scholarly impact, acquisition of NIH funding, and a higher academic rank. Laryngoscope, 127:E15-E21, 2017. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Hodgetts, T J
1999-06-01
The South Africans are keen to adopt the Major Incident Medical Management and Support course, developed in the UK. South Africa can provide an excellent training ground for military personnel in the triage, resuscitation and surgical management of the patient with penetrating trauma. Johannesburg General Hospital has a high quality training system under the direction of Dr Ken Boffard. The nearby Baragwaneth Hospital is the closest to military surgical practice one can probably get in a civilian setting. There is an unprecedented opportunity for clinical skills training, and a wealth of research opportunities.
Chikalov, Igor; Yao, Peggy; Moshkov, Mikhail; Latombe, Jean-Claude
2011-02-15
Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. They form and break while a protein deforms, for instance during the transition from a non-functional to a functional state. The intrinsic strength of an individual H-bond has been studied from an energetic viewpoint, but energy alone may not be a very good predictor. This paper describes inductive learning methods to train protein-independent probabilistic models of H-bond stability from molecular dynamics (MD) simulation trajectories of various proteins. The training data contains 32 input attributes (predictors) that describe an H-bond and its local environment in a conformation c and the output attribute is the probability that the H-bond will be present in an arbitrary conformation of this protein achievable from c within a time duration Δ. We model dependence of the output variable on the predictors by a regression tree. Several models are built using 6 MD simulation trajectories containing over 4000 distinct H-bonds (millions of occurrences). Experimental results demonstrate that such models can predict H-bond stability quite well. They perform roughly 20% better than models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a conformation. In most tests, about 80% of the 10% H-bonds predicted as the least stable are actually among the 10% truly least stable. The important attributes identified during the tree construction are consistent with previous findings. We use inductive learning methods to build protein-independent probabilistic models to study H-bond stability, and demonstrate that the models perform better than H-bond energy alone.
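As a hedged sketch of the modeling setup (not the authors' code), the 32 H-bond descriptors could be mapped to an occupancy probability with a regression tree along the following lines; the feature matrix and targets are random placeholders:

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 5000 H-bond occurrences, 32 descriptors each, and a
# target in [0, 1] giving the probability the bond persists within the window.
X = rng.normal(size=(5000, 32))
y = rng.uniform(size=5000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# A depth-limited regression tree keeps predictions interpretable.
tree = DecisionTreeRegressor(max_depth=6, min_samples_leaf=50)
tree.fit(X_tr, y_tr)

pred = tree.predict(X_te)                            # predicted stability probabilities
least_stable = np.argsort(pred)[: len(pred) // 10]   # flag the 10% predicted least stable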
Reconstructing liver shape and position from MR image slices using an active shape model
NASA Astrophysics Data System (ADS)
Fenchel, Matthias; Thesen, Stefan; Schilling, Andreas
2008-03-01
We present an algorithm for fully automatic reconstruction of the 3D position, orientation and shape of the human liver from a sparsely covering set of n 2D MR slice images. Reconstructing the shape of an organ from slice images can be used for scan planning, for surgical planning or other purposes where 3D anatomical knowledge has to be inferred from sparse slices. The algorithm is based on adapting an active shape model of the liver surface to a given set of slice images. The active shape model is created from a training set of liver segmentations from a group of volunteers. The training set is built from semi-manual segmentations of T1-weighted volumetric MR images. Searching for the optimal shape model that best fits the image data is done by maximizing a similarity measure based on local appearance at the surface. Two different algorithms for the active shape model search are proposed and compared: both algorithms seek to maximize the a posteriori probability of the grey level appearance around the surface while constraining the surface to the space of valid shapes. The first algorithm uses grey value profile statistics in the normal direction. The second algorithm uses average and variance images to calculate the local surface appearance on the fly. Both algorithms are validated by fitting the active shape model to abdominal 2D slice images and comparing the reconstructed shapes to the manual segmentations and to the results of active shape model searches from 3D image data. The results are promising and competitive with active shape model segmentations from 3D data.
Climate sensitivity estimated from temperature reconstructions of the Last Glacial Maximum
NASA Astrophysics Data System (ADS)
Schmittner, A.; Urban, N.; Shakun, J. D.; Mahowald, N. M.; Clark, P. U.; Bartlein, P. J.; Mix, A. C.; Rosell-Melé, A.
2011-12-01
In 1959 I. J. Good published the discussion "Kinds of Probability" in Science, identifying (at least) five kinds. The need for (at least) a sixth kind of probability when quantifying uncertainty in the context of climate science is discussed. This discussion brings out the differences between weather-like forecasting tasks and climate-like tasks, with a focus on the effective use both of science and of modelling in support of decision making. Good also introduced the idea of a "Dynamic probability": a probability one expects to change without any additional empirical evidence; the probabilities assigned by a chess-playing program when it is only halfway through its analysis are an example. This case is contrasted with "Mature probabilities", where a forecast algorithm (or model) has converged on its asymptotic probabilities and the question hinges on whether or not those probabilities are expected to change significantly before the event in question occurs, even in the absence of new empirical evidence. If so, how might one report and deploy such immature probabilities rationally in scientific support of decision making? Mature probability is suggested as a useful sixth kind; although Good would doubtless argue that we can get by with just one, effective communication with decision makers may be enhanced by speaking as if the others existed. This again highlights the distinction between weather-like and climate-like contexts. In the former, one has access to a relevant climatology (a relevant, arguably informative distribution prior to any model simulations); in the latter, that information is not available, although one can fall back on the scientific basis upon which the model itself rests and estimate the probability that the model output is in fact misinformative. This subjective "probability of a big surprise" is one way to communicate the probability that model-based information holds in practice, i.e., the probability that the information the model-based probability is conditioned on holds. It is argued that no model-based climate-like probability forecast is complete without a quantitative estimate of its own irrelevance, and that clearly identifying model-based probability forecasts as mature or immature is a critical element for maintaining the credibility of science-based decision support and can shape uncertainty quantification more widely.
Global Quantitative Modeling of Chromatin Factor Interactions
Zhou, Jian; Troyanskaya, Olga G.
2014-01-01
Chromatin is the driver of gene regulation, yet understanding the molecular interactions underlying chromatin factor combinatorial patterns (or the “chromatin codes”) remains a fundamental challenge in chromatin biology. Here we developed a global modeling framework that leverages chromatin profiling data to produce a systems-level view of the macromolecular complex of chromatin. Our model utilizes maximum entropy modeling with regularization-based structure learning to statistically dissect dependencies between chromatin factors and produce an accurate probability distribution of the chromatin code. Our unsupervised quantitative model, trained on genome-wide chromatin profiles of 73 histone marks and chromatin proteins from modENCODE, enabled a variety of data-driven inferences about chromatin profiles and interactions. We provided a highly accurate predictor of chromatin factor pairwise interactions validated by known experimental evidence, and for the first time enabled higher-order interaction prediction. Our predictions can thus help guide future experimental studies. The model can also serve as an inference engine for predicting unknown chromatin profiles — we demonstrated that with this approach we can leverage data from well-characterized cell types to help understand less-studied cell types or conditions. PMID:24675896
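The abstract does not spell out the maximum entropy model with regularization-based structure learning; one common stand-in for binary chromatin-factor profiles is neighborhood selection, i.e., fitting an L1-regularized logistic regression of each factor on all others and reading pairwise dependencies off the nonzero coefficients. The sketch below illustrates that substitute approach on placeholder data, not the authors' implementation:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Placeholder binary matrix: rows = genomic bins, columns = 73 chromatin factors.
X = (rng.uniform(size=(2000, 73)) < 0.3).astype(int)

coupling = np.zeros((73, 73))
for j in range(73):
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    # The L1 penalty performs the structure learning: most couplings shrink to zero.
    model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    model.fit(others, y)
    coupling[j, np.arange(73) != j] = model.coef_[0]

# Symmetrize to obtain a single pairwise interaction estimate per factor pair.
pairwise = (coupling + coupling.T) / 2.0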
Asynchronous threat awareness by observer trials using crowd simulation
NASA Astrophysics Data System (ADS)
Dunau, Patrick; Huber, Samuel; Stein, Karin U.; Wellig, Peter
2016-10-01
The last few years have shown that there is a high risk of asynchronous threats in everyday life. Especially in large crowds, the probability of asynchronous attacks is high, so strong observational abilities to detect threats are desirable and highly trained security and observation personnel are needed. This paper evaluates the effectiveness of a training methodology to enhance the performance of observation personnel engaging in a specific target identification task. For this purpose a crowd simulation video is utilized. The study first measures base performance before the training sessions; a training procedure is then performed, and base performance is compared to post-training performance in order to look for a training effect. A thorough evaluation of both the training sessions and the overall performance is provided, using a specific hypothesis-based metric. Results are discussed in order to provide guidelines for the design of training for observational tasks.
NASA Astrophysics Data System (ADS)
Zhang, Jiaxin; Shields, Michael D.
2018-01-01
This paper addresses the problem of uncertainty quantification and propagation when data for characterizing probability distributions are scarce. We propose a methodology wherein the full uncertainty associated with probability model form and parameter estimation is retained and efficiently propagated. This is achieved by applying the information-theoretic multimodel inference method to identify plausible candidate probability densities and the associated probability that each candidate is the best model in the Kullback-Leibler sense. The joint parameter densities for each plausible model are then estimated using Bayes' rule. We then propagate this full set of probability models by estimating an optimal importance sampling density that is representative of all plausible models, propagating this density, and reweighting the samples according to each of the candidate probability models. This is in contrast with conventional methods that try to identify a single probability model that encapsulates the full uncertainty caused by lack of data and consequently underestimate uncertainty. The result is a complete probabilistic description of both aleatory and epistemic uncertainty achieved with several orders of magnitude reduction in computational cost. It is shown how the model can be updated to adaptively accommodate added data and added candidate probability models. The method is applied to uncertainty analysis of plate buckling strength, where it is demonstrated how dataset size affects the confidence (or lack thereof) we can place in statistical estimates of response when data are lacking.
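A stripped-down sketch of the propagation idea, under placeholder assumptions (two candidate distribution families, simple Akaike weights standing in for the full information-theoretic model probabilities, and a fixed mixture importance density), might look like this:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.gamma(shape=4.0, scale=2.0, size=25)      # scarce placeholder data

# Candidate probability models fit by maximum likelihood (loc fixed at 0).
candidates = {
    "lognorm": stats.lognorm(*stats.lognorm.fit(data, floc=0)),
    "gamma":   stats.gamma(*stats.gamma.fit(data, floc=0)),
}

# Akaike weights as a simple stand-in for model probabilities (k = 2 free parameters each).
aic = {name: 2 * 2 - 2 * np.sum(d.logpdf(data)) for name, d in candidates.items()}
amin = min(aic.values())
raw = {name: np.exp(-0.5 * (a - amin)) for name, a in aic.items()}
model_prob = {name: v / sum(raw.values()) for name, v in raw.items()}

# Importance density: the mixture of candidates weighted by model probability.
samples = np.concatenate([d.rvs(int(10000 * model_prob[n])) for n, d in candidates.items()])
mix_pdf = sum(model_prob[n] * candidates[n].pdf(samples) for n in candidates)

# Reweight the single sample set under each candidate model separately.
means = {n: np.average(samples, weights=candidates[n].pdf(samples) / mix_pdf)
         for n in candidates}
print(model_prob, means)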
Kennicutt, A R; Morkowchuk, L; Krein, M; Breneman, C M; Kilduff, J E
2016-08-01
A quantitative structure-activity relationship was developed to predict the efficacy of carbon adsorption as a control technology for endocrine-disrupting compounds, pharmaceuticals, and components of personal care products, as a tool for water quality professionals to protect public health. Here, we expand previous work to investigate a broad spectrum of molecular descriptors including subdivided surface areas, adjacency and distance matrix descriptors, electrostatic partial charges, potential energy descriptors, conformation-dependent charge descriptors, and Transferable Atom Equivalent (TAE) descriptors that characterize the regional electronic properties of molecules. We compare the efficacy of linear (Partial Least Squares) and non-linear (Support Vector Machine) machine learning methods to describe a broad chemical space and produce a user-friendly model. We employ cross-validation, y-scrambling, and external validation for quality control. The recommended Support Vector Machine model trained on 95 compounds having 23 descriptors offered a good balance between good performance statistics, low error, and low probability of over-fitting while describing a wide range of chemical features. The cross-validated model using a log-uptake (qe) response calculated at an aqueous equilibrium concentration (Ce) of 1 μM described the training dataset with an r(2) of 0.932, had a cross-validated r(2) of 0.833, and an average residual of 0.14 log units.
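A minimal sketch of the modeling step (not the published model; the descriptor matrix and uptake values below are random placeholders matching only the reported dimensions of 95 compounds and 23 descriptors) using a support vector machine with cross-validation:

import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(95, 23))   # placeholder: 95 compounds x 23 descriptors
y = rng.normal(size=95)         # placeholder log-uptake (qe) at Ce = 1 uM

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(scores.mean())            # cross-validated r^2 on the placeholder data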
Instructional control of reinforcement learning: A behavioral and neurocomputational investigation
Doll, Bradley B.; Jacobs, W. Jake; Sanfey, Alan G.; Frank, Michael J.
2011-01-01
Humans learn how to behave directly through environmental experience and indirectly through rules and instructions. Behavior analytic research has shown that instructions can control behavior, even when such behavior leads to sub-optimal outcomes (Hayes, S. (Ed.). 1989. Rule-governed behavior: cognition, contingencies, and instructional control. Plenum Press.). Here we examine the control of behavior through instructions in a reinforcement learning task known to depend on striatal dopaminergic function. Participants selected between probabilistically reinforced stimuli, and were (incorrectly) told that a specific stimulus had the highest (or lowest) reinforcement probability. Despite experience to the contrary, instructions drove choice behavior. We present neural network simulations that capture the interactions between instruction-driven and reinforcement-driven behavior via two potential neural circuits: one in which the striatum is inaccurately trained by instruction representations coming from prefrontal cortex/hippocampus (PFC/HC), and another in which the striatum learns the environmentally based reinforcement contingencies, but is “overridden” at decision output. Both models capture the core behavioral phenomena but, because they differ fundamentally on what is learned, make distinct predictions for subsequent behavioral and neuroimaging experiments. Finally, we attempt to distinguish between the proposed computational mechanisms governing instructed behavior by fitting a series of abstract “Q-learning” and Bayesian models to subject data. The best-fitting model supports one of the neural models, suggesting the existence of a “confirmation bias” in which the PFC/HC system trains the reinforcement system by amplifying outcomes that are consistent with instructions while diminishing inconsistent outcomes. PMID:19595993
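The confirmation-bias account can be illustrated with a hedged Q-learning sketch in which outcomes consistent with the (false) instruction are amplified and inconsistent ones diminished; the task structure, parameter values, and the specific form of the bias below are simplified placeholders rather than the fitted models from the paper:

import numpy as np

rng = np.random.default_rng(4)

p_reward = {"A": 0.3, "B": 0.7}        # true reinforcement probabilities
instructed_best = "A"                   # (incorrect) instruction given to the agent
Q = {"A": 0.5, "B": 0.5}
alpha, bias, beta = 0.1, 3.0, 5.0       # learning rate, bias gain, softmax inverse temperature

for _ in range(500):
    # Softmax choice between the two stimuli.
    p_a = 1.0 / (1.0 + np.exp(-beta * (Q["A"] - Q["B"])))
    choice = "A" if rng.uniform() < p_a else "B"
    reward = float(rng.uniform() < p_reward[choice])

    # Confirmation bias: amplify instruction-consistent outcomes,
    # diminish instruction-inconsistent ones.
    consistent = (choice == instructed_best and reward == 1.0) or \
                 (choice != instructed_best and reward == 0.0)
    gain = bias if consistent else 1.0 / bias
    Q[choice] += alpha * gain * (reward - Q[choice])

print(Q)   # Q["A"] typically ends up inflated despite the poorer true payoff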
Haleem, Kirolos
2016-10-01
Private highway-railroad grade crossings (HRGCs) are intersections of highways and railroads on roadways that are not maintained by a public authority. Since no public authority maintains private HRGCs, fatal and injury crashes at these locations are of concern. However, no study has been conducted at private HRGCs to identify the safety issues that might exist and how to alleviate them. This study identifies the significant predictors of traffic casualties (including both injuries and fatalities) at private HRGCs in the U.S. using six years of nationwide crashes from 2009 to 2014. Two levels of injury severity were considered, injury (including fatalities and injuries) and no injury. The study investigates multiple predictors, e.g., temporal crash characteristics, geometry, railroad, traffic, vehicle, and environment. The study applies both the mixed logit and binary logit models. The mixed logit model was found to outperform the binary logit model. The mixed logit model revealed that drivers who did not stop, railroad equipment that struck highway users, higher train speeds, non-presence of advance warning signs, concrete road surface type, and cloudy weather were associated with an increase in injuries and fatalities. For example, a one-mile-per-hour higher train speed increases the probability of fatality by 22%. On the contrary, male drivers, PM peak periods, and presence of warning devices at both approaches were associated with a fatality reduction. Potential strategies are recommended to alleviate injuries and fatalities at private HRGCs. Copyright © 2016 Elsevier Ltd. All rights reserved.
Forensic Seismology and the 1995 Oklahoma City Terrorist Bombing
NASA Astrophysics Data System (ADS)
Holzer, T. L.
2002-05-01
The terrorist bombing of the Alfred P. Murrah Federal Building in Oklahoma City, Oklahoma, on April 19, 1995, was recorded on 2 permanent seismographs, 7 and 26 km away. The more distant seismograph recorded 2 low-frequency wave trains separated by about 10 s. Militia groups speculated that the 2 wave trains were caused by separate explosions and hinted at a government cover up. Preliminary statements by the scientific community also contributed to the uncertainty. A public science organization issued a press release that stated "the location and source of the second surface wave-recording is unknown. Detailed investigations at the building site may offer an explanation as to the cause and origin of the second event." A prominent professional newsletter reported that the "first event was caused by energy from the explosion and the second from the fall of the building." To understand the seismic phases in the April 19 seismograms, the USGS monitored the demolition of the damaged building on May 23, 1995, with a portable seismic array. The array recorded the same 2 wave trains during the demolition and indicated the wave trains were a propagation effect and not the result of multiple sources. Modeling of the waveforms indicated that the 2 wave trains probably resulted from propagation of seismic energy in a near-surface zone with a strong velocity gradient. The first phase appeared to be a packet of scattered body waves and the second was the fundamental-mode Rayleigh wave. Timely resolution of the ambiguity of the seismogram and publication of results in a refereed publication, EOS, discouraged a conspiracy defense by the terrorists.
Wright, C.; Gallant, Alisa L.
2007-01-01
The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.
NASA Astrophysics Data System (ADS)
Oza, Nikunj
2012-03-01
A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. A set of training examples—examples with known output values—is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate’s measurements. The generalization performance of a learned model (how closely the target outputs and the model’s predicted outputs agree for patterns that have not been presented to the learning algorithm) would provide an indication of how well the model has learned the desired mapping. More formally, a classification learning algorithm L takes a training set T as its input. The training set consists of |T| examples or instances. It is assumed that there is a probability distribution D from which all training examples are drawn independently—that is, all the training examples are independently and identically distributed (i.i.d.). The ith training example is of the form (x_i, y_i), where x_i is a vector of values of several features and y_i represents the class to be predicted. In the sunspot classification example given above, each training example would represent one sunspot’s classification (y_i) and the corresponding set of measurements (x_i). The output of a supervised learning algorithm is a model h that approximates the unknown mapping from the inputs to the outputs. In our example, h would map from the sunspot measurements to the type of sunspot. We may have a test set S—a set of examples not used in training that we use to test how well the model h predicts the outputs on new examples. Just as with the examples in T, the examples in S are assumed to be independent and identically distributed (i.i.d.) draws from the distribution D. We measure the error of h on the test set as the proportion of test cases that h misclassifies: (1/|S|) Σ_{(x, y) ∈ S} I(h(x) ≠ y), where I(v) is the indicator function—it returns 1 if v is true and 0 otherwise. In our sunspot classification example, we would identify additional examples of sunspots that were not used in generating the model, and use these to determine how accurate the model is—the fraction of the test samples that the model classifies correctly. An example of a classification model is the decision tree shown in Figure 23.1. We will discuss the decision tree learning algorithm in more detail later—for now, we assume that, given a training set with examples of sunspots, this decision tree is derived. This can be used to classify previously unseen examples of sunspots.
For example, if a new sunspot’s inputs indicate that its "Group Length" is in the range 10-15, then the decision tree would classify the sunspot as being of type “E,” whereas if the "Group Length" is "NULL," the "Magnetic Type" is "bipolar," and the "Penumbra" is "rudimentary," then it would be classified as type "C." In this chapter, we will add to the above description of classification problems. We will discuss decision trees and several other classification models. In particular, we will discuss the learning algorithms that generate these classification models, how to use them to classify new examples, and the strengths and weaknesses of these models. We will end with pointers to further reading on classification methods applied to astronomy data.
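A small, self-contained illustration of the training/test procedure described above (the sunspot attributes and labels are synthetic stand-ins, not real data, and the fitted tree is not the one in Figure 23.1):

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)

# Placeholder feature vectors x_i (e.g., group length, magnetic type encoded
# numerically, penumbra size) and class labels y_i (sunspot types 0..3).
X = rng.normal(size=(300, 3))
y = rng.integers(0, 4, size=300)

# T is the training set, S the held-out test set, both i.i.d. draws from D.
X_T, X_S, y_T, y_S = train_test_split(X, y, test_size=0.3, random_state=0)

h = DecisionTreeClassifier(max_depth=4).fit(X_T, y_T)     # learned model h

# Test error: (1/|S|) * sum over (x, y) in S of I(h(x) != y).
test_error = np.mean(h.predict(X_S) != y_S)
print(test_error)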
Regular rehearsal helps in consolidation of long term memory.
Parle, Milind; Singh, Nirmal; Vasudevan, Mani
2006-01-01
Memory, one of the most complex functions of the brain, comprises multiple components such as perception, registration, consolidation, storage, retrieval and decay. The present study was undertaken to evaluate the impact of different training sessions on the retention capacity of rats. The capacity to retain a learnt task was measured using exteroceptive behavioral models such as the hexagonal swimming pool apparatus, the Hebb-Williams maze and the elevated plus-maze. A total of 150 rats divided into fifteen groups were employed in the present study. The animals were subjected to different training sessions during the first three days. The ability to retain the learned task was tested after single, sub-acute, acute, sub-chronic and chronic exposure to the above exteroceptive memory models in separate groups of animals. The memory score of all animals was recorded 72 h, 192 h and 432 h after their last training trial. Rats of the single-exposure group did not show any effect on memory. Sub-acute training group animals showed improved memory up to 72 h only, whereas in the acute and sub-chronic training groups this memory improvement extended up to 192 h. The rats that were subjected to chronic exposures showed a significant improvement in retention capacity that lasted up to eighteen days. These observations suggest that repeated rehearsals at regular intervals are probably necessary for consolidation of long-term memory. Sub-acute, acute and sub-chronic exposures improved the retrieval ability of rats, but this memory-improving effect was short lived. Thus, rehearsal or training plays a crucial role in enhancing one's capacity to retain learnt information. Key points: The present study underlines the importance of regular rehearsals in enhancing one's capacity of retaining the learnt information. Sub-acute, acute and sub-chronic rehearsals result in storing of information for a limited period of time. Quick decay of information, or forgetting, is a natural, continuously active process designed to wipe out unnecessary and useless information. The capacities of grasping, understanding and memory are all crucial for career growth. A single exposure to a new environment is not sufficient to form a permanent memory trace in the brain.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323
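The abstract does not give the specific weighting function used; one widely used member of the (inverted) S-shaped family is the Prelec form, shown here purely as an illustrative stand-in with placeholder parameter values:

import numpy as np

def prelec_weight(p, gamma=0.65, delta=1.0):
    """Prelec probability weighting: w(p) = exp(-delta * (-ln p)**gamma).
    For gamma < 1 this overweights small and underweights large probabilities,
    i.e., an inverted S-shape; gamma = delta = 1 recovers w(p) = p."""
    p = np.asarray(p, dtype=float)
    return np.exp(-delta * (-np.log(p)) ** gamma)

print(prelec_weight([0.05, 0.25, 0.5, 0.75, 0.95]))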
p-adic stochastic hidden variable model
NASA Astrophysics Data System (ADS)
Khrennikov, Andrew
1998-03-01
We propose a stochastic hidden variable model in which the hidden variables have a p-adic probability distribution ρ(λ) while the conditional probability distributions P(U,λ), U=A,A',B,B', are ordinary probabilities defined on the basis of the Kolmogorov measure-theoretical axiomatics. The frequency definition of p-adic probability is quite similar to the ordinary frequency definition of probability: p-adic frequency probability is defined as the limit of relative frequencies ν_n, but in the p-adic metric. We study a model with p-adic stochastics at the level of the hidden-variable description. Responses of macroapparatuses, of course, have to be described by ordinary stochastics. Thus our model describes a mixture of p-adic stochastics of the microworld and ordinary stochastics of macroapparatuses. In this model the probabilities for physical observables are ordinary probabilities. At the same time Bell's inequality is violated.
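Written out (with n denoting the number of trials and n_A the number of occurrences of an event A, labels introduced here only for clarity), the frequency definition mentioned above reads:

\nu_n = \frac{n_A}{n}, \qquad P_p(A) = \lim_{n \to \infty} \nu_n ,

where the limit is taken with respect to the p-adic metric |·|_p rather than the usual real metric.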
NASA Astrophysics Data System (ADS)
Hermans, Thomas; Nguyen, Frédéric; Caers, Jef
2015-07-01
In inverse problems, investigating uncertainty in the posterior distribution of model parameters is as important as matching data. In recent years, most efforts have focused on techniques to sample the posterior distribution with reasonable computational costs. Within a Bayesian context, this posterior depends on the prior distribution. However, most studies ignore modeling the prior with realistic geological uncertainty. In this paper, we propose a workflow inspired by a Popper-Bayes philosophy that data should first be used to falsify models and only then be considered for matching. The workflow consists of three steps: (1) in defining the prior, we interpret multiple alternative geological scenarios from literature (architecture of facies) and site-specific data (proportions of facies). Prior spatial uncertainty is modeled using multiple-point geostatistics, where each scenario is defined using a training image. (2) We validate these prior geological scenarios by simulating electrical resistivity tomography (ERT) data on realizations of each scenario and comparing them to field ERT in a lower dimensional space. In this second step, the idea is to probabilistically falsify scenarios with ERT, meaning that scenarios which are incompatible receive an updated probability of zero while compatible scenarios receive a nonzero updated belief. (3) We constrain the hydrogeological model with hydraulic head and ERT using a stochastic search method. The workflow is applied to a synthetic and a field case study in an alluvial aquifer. This study highlights the importance of considering and estimating prior uncertainty (without data) through a process of probabilistic falsification.
Selective Genomic Copy Number Imbalances and Probability of Recurrence in Early-Stage Breast Cancer
Thompson, Patricia A.; Brewster, Abenaa M.; Kim-Anh, Do; Baladandayuthapani, Veerabhadran; Broom, Bradley M.; Edgerton, Mary E.; Hahn, Karin M.; Murray, James L.; Sahin, Aysegul; Tsavachidis, Spyros; Wang, Yuker; Zhang, Li; Hortobagyi, Gabriel N.; Mills, Gordon B.; Bondy, Melissa L.
2011-01-01
A number of studies of copy number imbalances (CNIs) in breast tumors support associations between individual CNIs and patient outcomes. However, no pattern or signature of CNIs has emerged for clinical use. We determined copy number (CN) gains and losses using high-density molecular inversion probe (MIP) arrays for 971 stage I/II breast tumors and applied a boosting strategy to fit hazards models for CN and recurrence, treating chromosomal segments in a dose-specific fashion (-1 [loss], 0 [no change] and +1 [gain]). The concordance index (C-Index) was used to compare prognostic accuracy between a training (n = 728) and test (n = 243) set and across models. Twelve novel prognostic CNIs were identified: losses at 1p12, 12q13.13, 13q12.3, 22q11, and Xp21, and gains at 2p11.1, 3q13.12, 10p11.21, 10q23.1, 11p15, 14q13.2-q13.3, and 17q21.33. In addition, seven CNIs previously implicated as prognostic markers were selected: losses at 8p22 and 16p11.2 and gains at 10p13, 11q13.5, 12p13, 20q13, and Xq28. For all breast cancers combined, the final full model including 19 CNIs, clinical covariates, and tumor marker-approximated subtypes (estrogen receptor [ER], progesterone receptor, ERBB2 amplification, and Ki67) significantly outperformed a model containing only clinical covariates and tumor subtypes (C-Index full model, train[test] = 0.72[0.71] ± 0.02 vs. C-Index clinical + subtype model, train[test] = 0.62[0.62] ± 0.02; p<10−6). In addition, the full model containing 19 CNIs significantly improved prognostication separately for ER–, HER2+, luminal B, and triple negative tumors over clinical variables alone. In summary, we show that a set of 19 CNIs discriminates risk of recurrence among early-stage breast tumors, independent of ER status. Further, our data suggest the presence of specific CNIs that promote and, in some cases, limit tumor spread. PMID:21858162
Poisson process stimulation of an excitable membrane cable model.
Goldfinger, M D
1986-01-01
The convergence of multiple inputs within a single-neuronal substrate is a common design feature of both peripheral and central nervous systems. Typically, the result of such convergence impinges upon an intracellularly contiguous axon, where it is encoded into a train of action potentials. The simplest representation of the result of convergence of multiple inputs is a Poisson process; a general representation of axonal excitability is the Hodgkin-Huxley/cable theory formalism. The present work addressed multiple input convergence upon an axon by applying Poisson process stimulation to the Hodgkin-Huxley axonal cable. The results showed that both absolute and relative refractory periods rendered the axonal output a random but non-Poisson process. While smaller-amplitude stimuli elicited a type of short-interval conditioning, larger-amplitude stimuli elicited impulse trains approaching Poisson criteria except for the effects of refractoriness. These results were obtained for stimulus trains consisting of pulses of constant amplitude and constant or variable durations. By contrast, with or without stimulus pulse shape variability, the post-impulse conditional probability for impulse initiation in the steady-state was a Poisson-like process. For stimulus variability consisting of randomly smaller amplitudes or randomly longer durations, mean impulse frequency was attenuated or potentiated, respectively. Limitations and implications of these computations are discussed. PMID:3730505
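A toy version of the stimulation protocol (placeholder rate and refractory period; the full Hodgkin-Huxley cable dynamics are not simulated) illustrates how an absolute refractory period turns a Poisson input train into a non-Poisson output train:

import numpy as np

rng = np.random.default_rng(6)

rate = 200.0           # input Poisson rate (events per second), placeholder
t_refractory = 0.004   # absolute refractory period in seconds, placeholder
duration = 10.0

# Poisson process: exponential inter-stimulus intervals.
isi = rng.exponential(1.0 / rate, size=int(rate * duration * 2))
stim_times = np.cumsum(isi)
stim_times = stim_times[stim_times < duration]

# Output: a stimulus elicits an impulse only if it falls outside the
# absolute refractory period following the previous impulse.
impulses = []
last = -np.inf
for t in stim_times:
    if t - last >= t_refractory:
        impulses.append(t)
        last = t

out_isi = np.diff(impulses)
# The output inter-impulse intervals are never shorter than the refractory
# period, so the output train is random but no longer Poisson.
print(len(stim_times), len(impulses), out_isi.min())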
Distribution-Preserving Stratified Sampling for Learning Problems.
Cervellera, Cristiano; Maccio, Danilo
2017-06-09
The need for extracting a small sample from a large amount of real data, possibly streaming, arises routinely in learning problems, e.g., for storage, to cope with computational limitations, obtain good training/test/validation sets, and select minibatches for stochastic gradient neural network training. Unless we have reasons to select the samples in an active way dictated by the specific task and/or model at hand, it is important that the distribution of the selected points is as similar as possible to the original data. This is obvious for unsupervised learning problems, where the goal is to gain insights on the distribution of the data, but it is also relevant for supervised problems, where the theory explains how the training set distribution influences the generalization error. In this paper, we analyze the technique of stratified sampling from the point of view of distances between probabilities. This allows us to introduce an algorithm, based on recursive binary partition of the input space, aimed at obtaining samples that are distributed as much as possible as the original data. A theoretical analysis is proposed, proving the (greedy) optimality of the procedure together with explicit error bounds. An adaptive version of the algorithm is also introduced to cope with streaming data. Simulation tests on various data sets and different learning tasks are also provided.
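A simplified sketch of the core idea (following the general recipe in the abstract, not the authors' exact algorithm or its error bounds): recursively bisect the input space along the widest dimension and draw from each cell a number of points proportional to how many of the original points it contains.

import numpy as np

def stratified_sample(X, n_out, depth=6, rng=None):
    """Recursive binary partition sampling: returns indices of ~n_out points
    whose empirical distribution tracks that of X."""
    rng = rng or np.random.default_rng()

    def recurse(idx, n, d):
        if d == 0 or n <= 1 or len(idx) <= n:
            k = min(n, len(idx))
            return list(rng.choice(idx, size=k, replace=False))
        pts = X[idx]
        dim = np.argmax(pts.max(axis=0) - pts.min(axis=0))   # widest dimension
        cut = np.median(pts[:, dim])
        left, right = idx[pts[:, dim] <= cut], idx[pts[:, dim] > cut]
        if len(left) == 0 or len(right) == 0:
            return list(rng.choice(idx, size=min(n, len(idx)), replace=False))
        n_left = int(round(n * len(left) / len(idx)))        # proportional allocation
        return recurse(left, n_left, d - 1) + recurse(right, n - n_left, d - 1)

    return np.array(recurse(np.arange(len(X)), n_out, depth))

X = np.random.default_rng(7).normal(size=(10000, 3))
sample_idx = stratified_sample(X, n_out=200)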
A sequence-dependent rigid-base model of DNA
NASA Astrophysics Data System (ADS)
Gonzalez, O.; Petkevičiutė, D.; Maddocks, J. H.
2013-02-01
A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. For example, due to the presence of frustration, the model can successfully predict the nonlocal changes in the minimum energy configuration of an oligomer that are consequent upon a local change of sequence at the level of a single point mutation.
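For the Gaussian version of the model mentioned above, the configuration-space equilibrium density has the standard multivariate Gaussian form; the symbols below are introduced here for illustration (w for the stacked rigid-base coordinates of the oligomer, ŵ for the sequence-dependent ground-state configuration, K for the stiffness matrix assembled from the dimer-level parameter set) and are not necessarily the paper's notation:

\rho(w) \;\propto\; \exp\!\left(-\tfrac{1}{2}\,(w - \hat{w})^{\mathsf{T}} K\,(w - \hat{w})\right).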
Tosun, Tuğçe; Gür, Ezgi; Balcı, Fuat
2016-01-01
Animals can shape their timed behaviors based on experienced probabilistic relations in a nearly optimal fashion. On the other hand, it is not clear if they adopt these timed decisions by making computations based on previously learnt task parameters (time intervals, locations, and probabilities) or if they gradually develop their decisions based on trial and error. To address this question, we tested mice in the timed-switching task, which required them to anticipate when (after a short or long delay) and at which of the two delay locations a reward would be presented. The probability of short trials differed between test groups in two experiments. Critically, we first trained mice on relevant task parameters by signaling the active trial with a discriminative stimulus and delivered the corresponding reward after the associated delay without any response requirement (without inducing switching behavior). During the test phase, both options were presented simultaneously to characterize the emergence and temporal characteristics of the switching behavior. Mice exhibited timed-switching behavior starting from the first few test trials, and their performance remained stable throughout testing in the majority of the conditions. Furthermore, as the probability of the short trial increased, mice waited longer before switching from the short to long location (experiment 1). These behavioral adjustments were in directions predicted by reward maximization. These results suggest that rather than gradually adjusting their time-dependent choice behavior, mice abruptly adopted temporal decision strategies by directly integrating their previous knowledge of task parameters into their timed behavior, supporting the model-based representational account of temporal risk assessment. PMID:26733674
Intrusion Detection System Using Deep Neural Network for In-Vehicle Network Security.
Kang, Min-Joo; Kang, Je-Won
2016-01-01
A novel intrusion detection system (IDS) using a deep neural network (DNN) is proposed to enhance the security of the in-vehicle network. The parameters defining the DNN structure are trained with probability-based feature vectors extracted from the in-vehicle network packets. For a given packet, the DNN provides the probability of each class, discriminating normal and attack packets, and thus the sensor can identify any malicious attack on the vehicle. Compared to a traditional artificial neural network applied to an IDS, the proposed technique adopts recent advances in deep learning, such as initializing the parameters through unsupervised pre-training of deep belief networks (DBN), thereby improving the detection accuracy. Experimental results demonstrate that the proposed technique can provide a real-time response to an attack with a significantly improved detection ratio on the controller area network (CAN) bus.
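A rough sketch of the classification stage is shown below, using a plain feed-forward network in scikit-learn rather than the DBN pre-training described above, and random placeholder feature vectors instead of real CAN packets:

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)

# Placeholder probability-based feature vectors extracted from packets,
# labeled 0 = normal traffic, 1 = attack.
X = rng.uniform(size=(5000, 64))
y = rng.integers(0, 2, size=5000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

dnn = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300)
dnn.fit(X_tr, y_tr)

# The network outputs a class probability for each packet; a threshold on the
# attack probability triggers the intrusion alarm.
attack_prob = dnn.predict_proba(X_te)[:, 1]
alerts = attack_prob > 0.5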
Cycle training induces muscle hypertrophy and strength gain: strategies and mechanisms.
Ozaki, Hayao; Loenneke, J P; Thiebaud, R S; Abe, T
2015-03-01
Cycle training is widely performed as a major part of any exercise program seeking to improve aerobic capacity and cardiovascular health. However, the effect of cycle training on muscle size and strength gain still requires further insight, even though it is known that professional cyclists display larger muscle size compared to controls. Therefore, the purpose of this review is to discuss the effects of cycle training on muscle size and strength of the lower extremity and the possible mechanisms for increasing muscle size with cycle training. It is plausible that cycle training requires a longer period to significantly increase muscle size compared to typical resistance training due to a much slower hypertrophy rate. Cycle training induces muscle hypertrophy similarly between young and older age groups, while strength gain seems to favor older adults, which suggests that the probability of improving muscle quality is higher in older adults than in young adults. For young adults, higher-intensity intermittent cycling may be required to achieve strength gains. It also appears that muscle hypertrophy induced by cycle training results from the positive changes in muscle protein net balance.
Ma, Po-Lun; Rasch, Philip J.; Wang, Minghuai; ...
2015-06-23
We report that the Community Atmosphere Model version 5 is run at horizontal grid spacings of 2, 1, 0.5, and 0.25°, with the meteorology nudged toward the Year of Tropical Convection analysis, and that cloud simulators and collocated A-Train satellite observations are used to explore the resolution dependence of aerosol-cloud interactions. The higher-resolution model produces results that agree better with observations, showing an increase in the susceptibility of cloud droplet size, indicating a stronger first aerosol indirect forcing (AIF), and a decrease in the susceptibility of precipitation probability, suggesting a weaker second AIF. The resolution sensitivities of AIF are attributed to those of the droplet nucleation and precipitation parameterizations. Finally, the annual average AIF in the Northern Hemisphere midlatitudes (where most anthropogenic emissions occur) in the 0.25° model is reduced by about 1 W m^-2 (-30%) compared to the 2° model, leading to a 0.26 W m^-2 reduction (-15%) in the global annual average AIF.
Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav
2018-04-01
A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR-dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In the current effort, multiple i-vector algorithms were used, greatly improving both processing throughput and equal-error-rate classification accuracy. Using identical approaches in the same noisy environment, the performance of SID, language identification, gender identification, and diarization was significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
Viira, Birgit; Gendron, Thibault; Lanfranchi, Don Antoine; Cojean, Sandrine; Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre; Maes, Louis; Maran, Uko; Loiseau, Philippe M; Davioud-Charvet, Elisabeth
2016-06-29
Malaria is a parasitic tropical disease that kills around 600,000 patients every year. The emergence of Plasmodium falciparum parasites resistant to artemisinin-based combination therapies (ACTs) represents a significant public health threat, indicating the urgent need for new effective compounds to reverse ACT resistance and cure the disease. For this, extensive curation and homogenization of experimental anti-Plasmodium screening data from both in-house and ChEMBL sources were conducted. As a result, a coherent strategy was established that allowed compiling training sets that associate compound structures with the respective antimalarial activity measurements. Seventeen of these training sets led to the successful generation of classification models discriminating whether a compound has a significant probability of being active under the specific conditions of the antimalarial test associated with each set. These models were used in consensus prediction of the most likely active compounds from a series of curcuminoids available in-house. Compounds predicted to be active, together with a few predicted to be inactive, were then submitted to experimental in vitro antimalarial testing. A large majority of the compounds predicted to be active showed antimalarial activity, whereas those predicted to be inactive did not, experimentally validating the in silico screening approach. The herein proposed consensus machine learning approach showed its potential to reduce the cost and duration of antimalarial drug discovery.
Korjus, Kristjan; Hebart, Martin N.; Vicente, Raul
2016-01-01
Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier’s generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term “Cross-validation and cross-testing” improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do. PMID:27564393
MRI-alone radiation therapy planning for prostate cancer: Automatic fiducial marker detection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghose, Soumya, E-mail: soumya.ghose@case.edu; Mitra, Jhimli; Rivest-Hénault, David
Purpose: The feasibility of radiation therapy treatment planning using substitute computed tomography (sCT) generated from magnetic resonance images (MRIs) has been demonstrated by a number of research groups. One challenge with an MRI-alone workflow is the accurate identification of intraprostatic gold fiducial markers, which are frequently used for prostate localization prior to each dose delivery fraction. This paper investigates a template-matching approach for the detection of these seeds in MRI. Methods: Two different gradient echo T1 and T2* weighted MRI sequences were acquired from fifteen prostate cancer patients and evaluated for seed detection. For training, seed templates from manual contours were selected in a spectral clustering manifold learning framework. This aids in clustering “similar” gold fiducial markers together. The marker with the minimum distance to a cluster centroid was selected as the representative template of that cluster during training. During testing, Gaussian mixture modeling followed by a Markovian model was used in automatic detection of the probable candidates. The probable candidates were rigidly registered to the templates identified from spectral clustering, and a similarity metric was computed for ranking and detection. Results: A fiducial detection accuracy of 95% was obtained compared to manual observations. Expert radiation therapist observers were able to correctly identify all three implanted seeds on 11 of the 15 scans (the proposed method correctly identified all seeds on 10 of the 15). Conclusions: A novel automatic framework for gold fiducial marker detection in MRI is proposed and evaluated with detection accuracies comparable to manual detection. When radiation therapists are unable to determine the seed location in MRI, they refer back to the planning CT (only available in the existing clinical framework); similarly, an automatic quality control is built into the automatic software to ensure that all gold seeds are either correctly detected or a warning is raised for further manual intervention.
MRI-alone radiation therapy planning for prostate cancer: Automatic fiducial marker detection.
Ghose, Soumya; Mitra, Jhimli; Rivest-Hénault, David; Fazlollahi, Amir; Stanwell, Peter; Pichler, Peter; Sun, Jidi; Fripp, Jurgen; Greer, Peter B; Dowling, Jason A
2016-05-01
The feasibility of radiation therapy treatment planning using substitute computed tomography (sCT) generated from magnetic resonance images (MRIs) has been demonstrated by a number of research groups. One challenge with an MRI-alone workflow is the accurate identification of intraprostatic gold fiducial markers, which are frequently used for prostate localization prior to each dose delivery fraction. This paper investigates a template-matching approach for the detection of these seeds in MRI. Two different gradient echo T1 and T2* weighted MRI sequences were acquired from fifteen prostate cancer patients and evaluated for seed detection. For training, seed templates from manual contours were selected in a spectral clustering manifold learning framework. This aids in clustering "similar" gold fiducial markers together. The marker with the minimum distance to a cluster centroid was selected as the representative template of that cluster during training. During testing, Gaussian mixture modeling followed by a Markovian model was used in automatic detection of the probable candidates. The probable candidates were rigidly registered to the templates identified from spectral clustering, and a similarity metric was computed for ranking and detection. A fiducial detection accuracy of 95% was obtained compared to manual observations. Expert radiation therapist observers were able to correctly identify all three implanted seeds on 11 of the 15 scans (the proposed method correctly identified all seeds on 10 of the 15). A novel automatic framework for gold fiducial marker detection in MRI is proposed and evaluated with detection accuracies comparable to manual detection. When radiation therapists are unable to determine the seed location in MRI, they refer back to the planning CT (only available in the existing clinical framework); similarly, an automatic quality control is built into the automatic software to ensure that all gold seeds are either correctly detected or a warning is raised for further manual intervention.
Dawson, Michael R W; Dupuis, Brian; Spetch, Marcia L; Kelly, Debbie M
2009-08-01
The matching law (Herrnstein 1961) states that response rates become proportional to reinforcement rates; this is related to the empirical phenomenon called probability matching (Vulkan 2000). Here, we show that a simple artificial neural network generates responses consistent with probability matching. This behavior was then used to create an operant procedure for network learning. We use the multiarmed bandit (Gittins 1989), a classic problem of choice behavior, to illustrate that operant training balances exploiting the bandit arm expected to pay off most frequently with exploring other arms. Perceptrons provide a medium for relating results from neural networks, genetic algorithms, animal learning, contingency theory, reinforcement learning, and theories of choice.
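As an illustration of the link between reward-driven weight updates and probability matching, the sketch below trains a single softmax response unit on a two-armed bandit; with payoff probabilities that sum to 1, the fixed point of this particular update rule is probability matching. The payoff probabilities, learning rate and update rule are assumptions for illustration, not the authors' perceptron simulations.

```python
# Minimal sketch (illustrative, not the authors' simulations): a reward-trained
# softmax unit choosing between two bandit arms. At this rule's fixed point the
# choice probabilities equal the arms' payoff probabilities (matching).
import numpy as np

rng = np.random.default_rng(0)
p_reward = np.array([0.7, 0.3])   # assumed arm payoff probabilities (sum to 1)
w = np.zeros(2)                   # one "response strength" per arm
lr = 0.05
choices = []

for t in range(5000):
    probs = np.exp(w) / np.exp(w).sum()            # softmax response rule
    arm = rng.choice(2, p=probs)
    reward = float(rng.random() < p_reward[arm])
    w[arm] += lr * (reward - probs[arm])           # reward-prediction style update
    choices.append(arm)

print("proportion of choices to arm 0:", 1 - np.mean(choices[-1000:]))
```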
Arrizabalaga, Pilar; Abellana, Rosa; Viñas, Odette; Merino, Anna; Ascaso, Carlos
2015-03-29
The feminization of medicine has risen dramatically over the past decades. The aim of this article was to compare the advance of women with that of men and determine the differences between hierarchical status and professional recognition achieved by women in medicine. A retrospective study was carried out in the Hospital Clinic Barcelona, Spain, of the period from 1996 to 2008. Data relating to temporary and permanent positions, hierarchy and career promotion achieved, specialty, age and the sex of the participants were analysed with the ANOVA test and logistic regression using the generalized estimating equation. After completion of specialist training, fewer women than men doctors obtained permanent positions. The ratios between the proportions of women and men remained 1.2 for permanent non-hierarchical medical positions and below 0.2 for higher hierarchical levels. Fewer women than men held hierarchical positions, and fewer women than men achieved the rank of consultant. Promotion to consultant and senior consultant was lower than that to senior specialist, and was higher in specialties with gender parity and in masculinised specialties. On comparing the two genders using a statistical model, the probability of continuous promotion decreased with the year of the application and the age of the applicant, except in women. Despite the number of women training as specialists having increased to 50%, women remained in temporary positions twice as long as men. Compared to women, men showed significantly greater representation in hierarchical medical positions, and women showed a lower adjusted probability of internal professional promotion throughout the study period.
SPACE WARPS - I. Crowdsourcing the discovery of gravitational lenses
NASA Astrophysics Data System (ADS)
Marshall, Philip J.; Verma, Aprajita; More, Anupreeta; Davis, Christopher P.; More, Surhud; Kapadia, Amit; Parrish, Michael; Snyder, Chris; Wilcox, Julianne; Baeten, Elisabeth; Macmillan, Christine; Cornen, Claude; Baumer, Michael; Simpson, Edwin; Lintott, Chris J.; Miller, David; Paget, Edward; Simpson, Robert; Smith, Arfon M.; Küng, Rafael; Saha, Prasenjit; Collett, Thomas E.
2016-01-01
We describe SPACE WARPS, a novel gravitational lens discovery service that yields samples of high purity and completeness through crowdsourced visual inspection. Carefully produced colour composite images are displayed to volunteers via a web-based classification interface, which records their estimates of the positions of candidate lensed features. Images of simulated lenses, as well as real images which lack lenses, are inserted into the image stream at random intervals; this training set is used to give the volunteers instantaneous feedback on their performance, as well as to calibrate a model of the system that provides dynamical updates to the probability that a classified image contains a lens. Low-probability systems are retired from the site periodically, concentrating the sample towards a set of lens candidates. Having divided 160 deg² of Canada-France-Hawaii Telescope Legacy Survey imaging into some 430 000 overlapping 82 by 82 arcsec tiles and displaying them on the site, we were joined by around 37 000 volunteers who contributed 11 million image classifications over the course of eight months. This stage 1 search reduced the sample to 3381 images containing candidates; these were then refined in stage 2 to yield a sample that we expect to be over 90 per cent complete and 30 per cent pure, based on our analysis of the volunteers' performance on training images. We comment on the scalability of the SPACE WARPS system to the wide field survey era, based on our projection that searches of 10⁵ images could be performed by a crowd of 10⁵ volunteers in 6 d.
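The dynamical probability updates mentioned above can be illustrated with a simple Bayesian update in which each volunteer is described by confusion-matrix skills learned from training images. This is a hedged sketch in the spirit of the abstract, not the actual SPACE WARPS analysis software; the prior, skill values and retirement threshold are invented.

```python
# Hedged sketch of a crowd-classification probability update: each volunteer
# has skills P(says "lens" | lens) and P(says "not" | not lens) estimated from
# training images; each vote updates the posterior that an image contains a
# lens. All numbers are illustrative.
def update(p_lens, said_lens, p_true_pos, p_true_neg):
    """One Bayes update of P(lens) given a single volunteer's vote."""
    if said_lens:
        like_lens, like_dud = p_true_pos, 1.0 - p_true_neg
    else:
        like_lens, like_dud = 1.0 - p_true_pos, p_true_neg
    num = like_lens * p_lens
    return num / (num + like_dud * (1.0 - p_lens))

p = 2e-4  # assumed prior that a random tile contains a lens
votes = [(True, 0.8, 0.9), (True, 0.6, 0.7), (False, 0.9, 0.95)]
for said_lens, tp, tn in votes:
    p = update(p, said_lens, tp, tn)
    print(round(p, 6))
# Images whose probability drops below a retirement threshold would be
# withdrawn from the site, concentrating effort on plausible candidates.
```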
Akhtar, Kashif; Sugand, Kapil; Sperrin, Matthew; Cobb, Justin; Standfield, Nigel; Gupte, Chinmay
2015-01-01
Virtual-reality (VR) simulation in orthopedic training is still in its infancy, and much of the work has been focused on arthroscopy. We evaluated the construct validity of a new VR trauma simulator for performing dynamic hip screw (DHS) fixation of a trochanteric femoral fracture. 30 volunteers were divided into 3 groups according to the number of postgraduate (PG) years and the amount of clinical experience: novice (1-4 PG years; less than 10 DHS procedures); intermediate (5-12 PG years; 10-100 procedures); expert (> 12 PG years; > 100 procedures). Each participant performed a DHS procedure and objective performance metrics were recorded. These data were analyzed with each performance metric taken as the dependent variable in 3 regression models. There were statistically significant differences in performance between groups for (1) number of attempts at guide-wire insertion, (2) total fluoroscopy time, (3) tip-apex distance, (4) probability of screw cutout, and (5) overall simulator score. The intermediate group performed the procedure most quickly, with the lowest fluoroscopy time, the lowest tip-apex distance, the lowest probability of cutout, and the highest simulator score, which correlated with their frequency of exposure to running the trauma lists for hip fracture surgery. This study demonstrates the construct validity of a haptic VR trauma simulator with surgeons undertaking the procedure most frequently performing best on the simulator. VR simulation may be a means of addressing restrictions on working hours and allows trainees to practice technical tasks without putting patients at risk. The VR DHS simulator evaluated in this study may provide valid assessment of technical skill.
Energetic costs of performance in trained and untrained Anolis carolinensis lizards.
Lailvaux, Simon P; Wang, Andrew Z; Husak, Jerry F
2018-04-23
The energetic costs of performance constitute a non-trivial component of animals' daily energetic budgets. However, we currently lack an understanding of how those costs are partitioned among the various stages of performance development, maintenance and production. We manipulated individual investment in performance by training Anolis carolinensis lizards for endurance or sprinting ability. We then measured energetic expenditure both at rest and immediately following exercise to test whether such training alters the maintenance and production costs of performance. Trained lizards had lower resting metabolic rates than controls, suggestive of a maintenance saving associated with enhanced performance as opposed to a cost. Production costs also differed, with sprint-trained lizards incurring a larger energetic performance cost and experiencing longer recovery times compared with endurance-trained and control animals. Although performance training modifies metabolism, production costs are probably the key drivers of trade-offs between performance and other life-history traits in this species. © 2018. Published by The Company of Biologists Ltd.
Field evaluation of distance-estimation error during wetland-dependent bird surveys
Nadeau, Christopher P.; Conway, Courtney J.
2012-01-01
Context: The most common methods to estimate detection probability during avian point-count surveys involve recording a distance between the survey point and individual birds detected during the survey period. Accurately measuring or estimating distance is an important assumption of these methods; however, this assumption is rarely tested in the context of aural avian point-count surveys. Aims: We expand on recent bird-simulation studies to document the error associated with estimating distance to calling birds in a wetland ecosystem. Methods: We used two approaches to estimate the error associated with five surveyors' distance estimates between the survey point and calling birds, and to determine the factors that affect a surveyor's ability to estimate distance. Key results: We observed biased and imprecise distance estimates when estimating distance to simulated birds in a point-count scenario (x̄_error = -9 m, s.d._error = 47 m) and when estimating distances to real birds during field trials (x̄_error = 39 m, s.d._error = 79 m). The amount of bias and precision in distance estimates differed among surveyors; surveyors with more training and experience were less biased and more precise when estimating distance to both real and simulated birds. Three environmental factors were important in explaining the error associated with distance estimates, including the measured distance from the bird to the surveyor, the volume of the call and the species of bird. Surveyors tended to make large overestimations to birds close to the survey point, which is an especially serious error in distance sampling. Conclusions: Our results suggest that distance-estimation error is prevalent, but surveyor training may be the easiest way to reduce distance-estimation error. Implications: The present study has demonstrated how relatively simple field trials can be used to estimate the error associated with distance estimates used to estimate detection probability during avian point-count surveys. Evaluating distance-estimation errors will allow investigators to better evaluate the accuracy of avian density and trend estimates. Moreover, investigators who evaluate distance-estimation errors could employ recently developed models to incorporate distance-estimation error into analyses. We encourage further development of such models, including the inclusion of such models into distance-analysis software.
Arterial tree tracking from anatomical landmarks in magnetic resonance angiography scans
NASA Astrophysics Data System (ADS)
O'Neil, Alison; Beveridge, Erin; Houston, Graeme; McCormick, Lynne; Poole, Ian
2014-03-01
This paper reports on arterial tree tracking in fourteen Contrast Enhanced MRA volumetric scans, given the positions of a predefined set of vascular landmarks, by using the A* algorithm to find the optimal path for each vessel based on voxel intensity and a learnt vascular probability atlas. The algorithm is intended for use in conjunction with an automatic landmark detection step, to enable fully automatic arterial tree tracking. The scan is filtered to give two further images using the top-hat transform with 4mm and 8mm cubic structuring elements. Vessels are then tracked independently on the scan in which the vessel of interest is best enhanced, as determined from knowledge of typical vessel diameter and surrounding structures. A vascular probability atlas modelling expected vessel location and orientation is constructed by non-rigidly registering the training scans to the test scan using a 3D thin plate spline to match landmark correspondences, and employing kernel density estimation with the ground truth center line points to form a probability density distribution. Threshold estimation by histogram analysis is used to segment background from vessel intensities. The A* algorithm is run using a linear cost function constructed from the threshold and the vascular atlas prior. Tracking results are presented for all major arteries excluding those in the upper limbs. An improvement was observed when tracking was informed by contextual information, with particular benefit for peripheral vessels.
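A compact sketch of the tracking step described above: A* search over a toy 2D image in which the per-pixel cost linearly combines an intensity term with a vascular-atlas prior. The image, the prior, the mixing weight and the heuristic are all illustrative assumptions, not the paper's implementation.

```python
# Illustrative A* tracker on a toy 2D image: the per-pixel step cost mixes an
# intensity term (dark = expensive) and a "vascular atlas" prior (low prior
# probability = expensive), roughly as the abstract describes. Toy values only.
import heapq
import itertools
import numpy as np

def a_star(cost, start, goal):
    """A* over a 2D grid; stepping into a cell pays cost[cell] (all costs > 0)."""
    hmin = float(cost.min())
    heur = lambda p: hmin * (abs(p[0] - goal[0]) + abs(p[1] - goal[1]))  # admissible
    tie = itertools.count()
    best = {start: 0.0}
    parents = {}
    queue = [(heur(start), next(tie), start, None)]
    while queue:
        _, _, node, parent = heapq.heappop(queue)
        if node in parents:
            continue
        parents[node] = parent
        if node == goal:
            break
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (node[0] + dr, node[1] + dc)
            if not (0 <= nb[0] < cost.shape[0] and 0 <= nb[1] < cost.shape[1]):
                continue
            g = best[node] + cost[nb]
            if g < best.get(nb, np.inf):
                best[nb] = g
                heapq.heappush(queue, (g + heur(nb), next(tie), nb, node))
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]

rng = np.random.default_rng(1)
intensity = rng.random((32, 32))            # toy "MRA" slice, bright = vessel-like
atlas_prior = np.full((32, 32), 0.1)        # toy vascular probability atlas
atlas_prior[:, 16] = 0.9                    # prior says the vessel runs down column 16
alpha = 0.5                                 # illustrative mixing weight
cost = alpha * (1.0 - intensity) + (1.0 - alpha) * (1.0 - atlas_prior) + 1e-3
print(a_star(cost, (0, 16), (31, 16))[:5])  # first few points of the tracked path
```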
Ryan, K; Williams, D Gareth; Balding, David J
2016-11-01
Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source licence, to calculate LRs using the method presented in this paper. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Belief state representation in the dopamine system.
Babayan, Benedicte M; Uchida, Naoshige; Gershman, Samuel J
2018-05-14
Learning to predict future outcomes is critical for driving appropriate behaviors. Reinforcement learning (RL) models have successfully accounted for such learning, relying on reward prediction errors (RPEs) signaled by midbrain dopamine neurons. It has been proposed that when sensory data provide only ambiguous information about which state an animal is in, it can predict reward based on a set of probabilities assigned to hypothetical states (called the belief state). Here we examine how dopamine RPEs and subsequent learning are regulated under state uncertainty. Mice are first trained in a task with two potential states defined by different reward amounts. During testing, intermediate-sized rewards are given in rare trials. Dopamine activity is a non-monotonic function of reward size, consistent with RL models operating on belief states. Furthermore, the magnitude of dopamine responses quantitatively predicts changes in behavior. These results establish the critical role of state inference in RL.
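A toy sketch of the belief-state idea: instead of knowing its state, the agent holds a probability distribution over a small-reward and a large-reward state, makes its value prediction as the belief-weighted average, and computes a reward prediction error against it. Reward sizes, values and the likelihood model are illustrative; in the full model the belief inferred from ambiguous cues shapes the prediction and yields the non-monotonic dopamine responses reported.

```python
# Toy sketch of a belief-state RPE (not the paper's model fits): the agent is
# uncertain whether it is in a "small reward" or "big reward" state, predicts
# value as the belief-weighted average, and updates its belief from the reward.
import numpy as np
from scipy.stats import norm

state_means = np.array([2.0, 8.0])   # expected reward in the small and big state
values = state_means.copy()          # assume learned values match those means
prior = np.array([0.5, 0.5])         # belief before the reward is observed

for reward in (2.0, 5.0, 8.0):       # small, rare intermediate, large reward
    v_pred = prior @ values          # belief-weighted value prediction
    rpe = reward - v_pred            # reward prediction error against it
    like = norm.pdf(reward, loc=state_means, scale=1.5)
    posterior = like * prior / (like * prior).sum()   # belief after the reward
    print(f"reward={reward:.1f}  RPE={rpe:+.2f}  P(big state | reward)={posterior[1]:.2f}")
```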
Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting.
Wöllmer, Martin; Marchi, Erik; Squartini, Stefano; Schuller, Björn
2011-09-01
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
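A small sketch of feature-space histogram equalization as described: each noisy feature dimension is mapped through its empirical ranks onto the quantiles of a clean reference distribution, aligning the full shape of the distributions rather than only mean and variance. The arrays below are synthetic stand-ins for the speech features.

```python
# Sketch of feature-space histogram equalization (quantile mapping): noisy test
# features are mapped onto the empirical distribution of clean training
# features, dimension by dimension. Data here are synthetic placeholders.
import numpy as np

def histogram_equalize(noisy, clean_ref):
    """Map each column of `noisy` onto the quantiles of `clean_ref`."""
    out = np.empty_like(noisy, dtype=float)
    for d in range(noisy.shape[1]):
        ranks = np.argsort(np.argsort(noisy[:, d]))          # 0..N-1 ranks
        quantiles = (ranks + 0.5) / noisy.shape[0]
        out[:, d] = np.quantile(clean_ref[:, d], quantiles)  # inverse clean CDF
    return out

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(1000, 13))        # e.g. clean MFCC-like features
noisy = 0.5 * rng.normal(0.3, 1.8, size=(200, 13))   # mismatched noisy features
eq = histogram_equalize(noisy, clean)
print(clean.mean(), noisy.mean(), eq.mean())
```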
Research on artistic gymnastics training guidance model
NASA Astrophysics Data System (ADS)
Luo, Lin; Sun, Xianzhong
2017-04-01
A rhythmic gymnastics training guidance model, which takes into consideration the features of artistic gymnastics training, is put forward to help gymnasts identify deficiencies and unskilled technical movements and improve their training effects. The model is built on the foundation of both a physical quality indicator model and an artistic gymnastics training indicator model. The physical quality indicator model, composed of a bodily factor, a flexibility-strength factor and a speed-dexterity factor, delivers an objective evaluation with reference to basic sport testing data. The training indicator model, based on the physical quality indicators, helps analyse technical movements and reveals the impact of each bodily factor on them. The training guidance model, combining actual training data and comparing them with the data in the training indicator model, helps identify problems in training and thus improve the training effect. Used in combination and compared with historical model data, these three models can check and verify the improvement in training effect over a given period of time.
Lee, R S C; Hermens, D F; Scott, J; O'Dea, B; Glozier, N; Scott, E M; Hickie, I B
2017-09-01
Optimizing functional recovery in young individuals with severe mental illness constitutes a major healthcare priority. The current study sought to quantify the cognitive and clinical factors underpinning academic and vocational engagement in a transdiagnostic and prospective youth mental health cohort. The primary outcome measure was 'not in education, employment or training' ('NEET') status. A clinical sample of psychiatric out-patients aged 15-25 years (n = 163) was assessed at two time points, on average, 24 months apart. Functional status, and clinical and neuropsychological data were collected. Bayesian structural equation modelling was used to confirm the factor structure of predictors and cross-lagged effects at follow-up. Individually, NEET status, cognitive dysfunction and negative symptoms at baseline were predictive of NEET status at follow-up (p < 0.05). Baseline cognitive functioning was the only predictor of follow-up NEET status in the multivariate Bayesian model, while controlling for baseline NEET status. For every 1 s.d. deficit in cognition, the probability of being disengaged at follow-up increased by 40% (95% credible interval 19-58%). Baseline NEET status predicted poorer negative symptoms at follow-up (β = 0.24, 95% credible interval 0.04-0.43). Disengagement with education, employment or training (i.e. being NEET) was reported in about one in four members of this cohort. The initial level of cognitive functioning was the strongest determinant of future NEET status, whereas being academically or vocationally engaged had an impact on future negative symptomatology. If replicated, these findings support the need to develop early interventions that target cognitive phenotypes transdiagnostically.
Inverse stochastic-dynamic models for high-resolution Greenland ice core records
NASA Astrophysics Data System (ADS)
Boers, Niklas; Chekroun, Mickael D.; Liu, Honghu; Kondrashov, Dmitri; Rousseau, Denis-Didier; Svensson, Anders; Bigler, Matthias; Ghil, Michael
2017-12-01
Proxy records from Greenland ice cores have been studied for several decades, yet many open questions remain regarding the climate variability encoded therein. Here, we use a Bayesian framework for inferring inverse, stochastic-dynamic models from δ18O and dust records of unprecedented, subdecadal temporal resolution. The records stem from the North Greenland Ice Core Project (NGRIP), and we focus on the time interval 59-22 ka b2k. Our model reproduces the dynamical characteristics of both the δ18O and dust proxy records, including the millennial-scale Dansgaard-Oeschger variability, as well as statistical properties such as probability density functions, waiting times and power spectra, with no need for any external forcing. The crucial ingredients for capturing these properties are (i) high-resolution training data, (ii) cubic drift terms, (iii) nonlinear coupling terms between the δ18O and dust time series, and (iv) non-Markovian contributions that represent short-term memory effects.
Hu, Peijun; Wu, Fa; Peng, Jialin; Bao, Yuanyuan; Chen, Feng; Kong, Dexing
2017-03-01
Multi-organ segmentation from CT images is an essential step for computer-aided diagnosis and surgery planning. However, manual delineation of the organs by radiologists is tedious, time-consuming and poorly reproducible. Therefore, we propose a fully automatic method for the segmentation of multiple organs from three-dimensional abdominal CT images. The proposed method employs deep fully convolutional neural networks (CNNs) for organ detection and segmentation, which is further refined by a time-implicit multi-phase evolution method. Firstly, a 3D CNN is trained to automatically localize and delineate the organs of interest with a probability prediction map. The learned probability map provides both subject-specific spatial priors and initialization for subsequent fine segmentation. Then, for the refinement of the multi-organ segmentation, image intensity models, probability priors as well as a disjoint region constraint are incorporated into a unified energy functional. Finally, a novel time-implicit multi-phase level-set algorithm is utilized to efficiently optimize the proposed energy functional model. Our method has been evaluated on 140 abdominal CT scans for the segmentation of four organs (liver, spleen and both kidneys). With respect to the ground truth, average Dice overlap ratios for the liver, spleen and both kidneys are 96.0, 94.2 and 95.4%, respectively, and average symmetric surface distance is less than 1.3 mm for all the segmented organs. The computation time for a CT volume is 125 s on average. The achieved accuracy compares well to state-of-the-art methods with much higher efficiency. A fully automatic method for multi-organ segmentation from abdominal CT images was developed and evaluated. The results demonstrated its potential in clinical usage with high effectiveness, robustness and efficiency.
NASA Astrophysics Data System (ADS)
Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.
2012-03-01
This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADs, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R2= 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.
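A hedged sketch of the comparison protocol: Gaussian Naïve Bayes and logistic regression evaluated by leave-one-out cross-validated AUC, with and without an information-gain style feature selection. Synthetic data stand in for the 266 ultrasound feature vectors; the feature selector and parameter choices are illustrative.

```python
# Sketch of the comparison protocol described (synthetic data in place of the
# ultrasound features): Gaussian Naive Bayes vs. logistic regression, with and
# without mutual-information feature selection, scored by leave-one-out AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=266, n_features=34, n_informative=6, random_state=0)

def loo_auc(model):
    probs = cross_val_predict(model, X, y, cv=LeaveOneOut(), method="predict_proba")[:, 1]
    return roc_auc_score(y, probs)

for name, clf in [("naive Bayes", GaussianNB()),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    full = loo_auc(clf)
    selected = loo_auc(make_pipeline(SelectKBest(mutual_info_classif, k=6), clf))
    print(f"{name}: Az all features = {full:.3f}, Az top-6 features = {selected:.3f}")
```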
Automated separation of merged Langerhans islets
NASA Astrophysics Data System (ADS)
Švihlík, Jan; Kybic, Jan; Habart, David
2016-03-01
This paper deals with separation of merged Langerhans islets in segmentations in order to evaluate correct histogram of islet diameters. A distribution of islet diameters is useful for determining the feasibility of islet transplantation in diabetes. First, the merged islets at training segmentations are manually separated by medical experts. Based on the single islets, the merged islets are identified and the SVM classifier is trained on both classes (merged/single islets). The testing segmentations were over-segmented using watershed transform and the most probable back merging of islets were found using trained SVM classifier. Finally, the optimized segmentation is compared with ground truth segmentation (correctly separated islets).
An evaluation of open set recognition for FLIR images
NASA Astrophysics Data System (ADS)
Scherreik, Matthew; Rigling, Brian
2015-05-01
Typical supervised classification algorithms label inputs according to what was learned in a training phase. Thus, test inputs that were not seen in training are always given incorrect labels. Open set recognition algorithms address this issue by accounting for inputs that are not present in training and providing the classifier with an option to "reject" unknown samples. A number of such techniques have been developed in the literature, many of which are based on support vector machines (SVMs). One approach, the 1-vs-set machine, constructs a "slab" in feature space using the SVM hyperplane. Inputs falling on one side of the slab or within the slab belong to a training class, while inputs falling on the far side of the slab are rejected. We note that rejection of unknown inputs can be achieved by thresholding class posterior probabilities. Another recently developed approach, the Probabilistic Open Set SVM (POS-SVM), empirically determines good probability thresholds. We apply the 1-vs-set machine, POS-SVM, and closed set SVMs to FLIR images taken from the Comanche SIG dataset. Vehicles in the dataset are divided into three general classes: wheeled, armored personnel carrier (APC), and tank. For each class, a coarse pose estimate (front, rear, left, right) is taken. In a closed set sense, we analyze these algorithms for prediction of vehicle class and pose. To test open set performance, one or more vehicle classes are held out from training. By considering closed and open set performance separately, we may closely analyze both inter-class discrimination and threshold effectiveness.
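The posterior-thresholding idea noted above can be sketched as follows: train a Platt-calibrated SVM on the known classes, then label a test input only if its maximum class posterior exceeds a threshold, otherwise reject it as unknown. This is an illustration of the thresholding idea, not the 1-vs-set machine or POS-SVM; the data, classes and threshold are invented.

```python
# Sketch of open-set rejection by thresholding class posteriors (illustrative;
# this is neither the 1-vs-set machine nor POS-SVM). Train a Platt-calibrated
# SVM on known classes, then reject test inputs whose best posterior is low.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)
known = y < 2                                   # hold class 2 out of training
clf = SVC(probability=True).fit(X[known], y[known])

threshold = 0.8                                 # assumed rejection threshold
post = clf.predict_proba(X)
pred = np.where(post.max(axis=1) >= threshold, post.argmax(axis=1), -1)  # -1 = "unknown"
print("fraction of held-out class rejected:", np.mean(pred[~known] == -1))
```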
49 CFR 219.203 - Responsibilities of railroads and employees.
Code of Federal Regulations, 2010 CFR
2010-10-01
... train which is in proper condition to continue to the next station or its destination after an accident... by § 219.201) indicates a clear probability that the employee played a major role in the cause or...
Careers for the 70's in Diesel Mechanics
ERIC Educational Resources Information Center
Osborne, Barbara
1974-01-01
Increased employment outlook for diesel mechanics is probably due to the fact that most industries using diesel engines in large numbers are expected to expand their activities. The training and workday of one diesel mechanic is described. (MS)
de Klerk, Helen M; Gilbertson, Jason; Lück-Vogel, Melanie; Kemp, Jaco; Munch, Zahn
2016-11-01
Traditionally, to map environmental features using remote sensing, practitioners will use training data to develop models on various satellite data sets using a number of classification approaches and use test data to select a single 'best performer' from which the final map is made. We use a combination of an omission/commission plot to evaluate various results and compile a probability map based on consistently strong performing models across a range of standard accuracy measures. We suggest that this easy-to-use approach can be applied in any study using remote sensing to map natural features for management action. We demonstrate this approach using optical remote sensing products of different spatial and spectral resolution to map the endemic and threatened flora of quartz patches in the Knersvlakte, South Africa. Quartz patches can be mapped using either SPOT 5 (used due to its relatively fine spatial resolution) or Landsat8 imagery (used because it is freely accessible and has higher spectral resolution). Of the variety of classification algorithms available, we tested maximum likelihood and support vector machine, and applied these to raw spectral data, the first three PCA summaries of the data, and the standard normalised difference vegetation index. We found that there is no 'one size fits all' solution to the choice of a 'best fit' model (i.e. combination of classification algorithm or data sets), which is in agreement with the literature that classifier performance will vary with data properties. We feel this lends support to our suggestion that rather than the identification of a 'single best' model and a map based on this result alone, a probability map based on the range of consistently top performing models provides a rigorous solution to environmental mapping. Copyright © 2016 Elsevier Ltd. All rights reserved.
Probability theory for 3-layer remote sensing radiative transfer model: univariate case.
Ben-David, Avishai; Davidson, Charles E
2012-04-23
A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of line of sight) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive infrared remote sensing. The probability model is described by the Johnson family of distributions that are used as a fit for theoretically computed moments of the radiative transfer model. From the Johnson family we use the SU distribution that can address a wide range of skewness and kurtosis values (in addition to addressing the first two moments, mean and variance). In the limit, SU can also describe lognormal and normal distributions. With the probability model one can evaluate the potential for detecting a target (vapor cloud layer), the probability of observing thermal contrast, and evaluate performance (receiver operating characteristics curves) in clutter-noise limited scenarios. This is (to our knowledge) the first probability model for the 3-layer remote sensing geometry that treats all parameters as random variables and includes higher-order statistics. © 2012 Optical Society of America
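A brief sketch of describing a skewed radiance-like quantity with the Johnson SU family. The paper fits SU to theoretically computed moments of the radiative transfer model; as a stand-in, the snippet fits scipy's johnsonsu to simulated samples and compares the first four moments. The simulated data are arbitrary.

```python
# Sketch: describing a skewed "observed radiance" quantity with a Johnson SU
# distribution. As a stand-in for the paper's moment-matching fit, we fit
# scipy's johnsonsu to simulated samples and compare moments.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
radiance = np.exp(rng.normal(0.0, 0.4, size=5000)) + 0.2 * rng.normal(size=5000)  # toy skewed data

a, b, loc, scale = stats.johnsonsu.fit(radiance)
fitted = stats.johnsonsu(a, b, loc=loc, scale=scale)

print("sample mean/var/skew/kurtosis:",
      radiance.mean(), radiance.var(), stats.skew(radiance), stats.kurtosis(radiance))
print("fitted mean/var/skew/kurtosis:", fitted.stats(moments="mvsk"))
```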
Bonet, Isis; Franco-Montero, Pedro; Rivero, Virginia; Teijeira, Marta; Borges, Fernanda; Uriarte, Eugenio; Morales Helguera, Aliuska
2013-12-23
A(2B) adenosine receptor antagonists may be beneficial in treating diseases like asthma, diabetes, diabetic retinopathy, and certain cancers. This has stimulated research for the development of potent ligands for this subtype, based on quantitative structure-affinity relationships. In this work, a new ensemble machine learning algorithm is proposed for classification and prediction of the ligand-binding affinity of A(2B) adenosine receptor antagonists. This algorithm is based on the training of different classifier models with multiple training sets (composed of the same compounds but represented by diverse features). The k-nearest neighbor, decision trees, neural networks, and support vector machines were used as single classifiers. To select the base classifiers for combining into the ensemble, several diversity measures were employed. The final multiclassifier prediction results were computed from the output obtained by using a combination of selected base classifiers output, by utilizing different mathematical functions including the following: majority vote, maximum and average probability. In this work, 10-fold cross- and external validation were used. The strategy led to the following results: i) the single classifiers, together with previous features selections, resulted in good overall accuracy, ii) a comparison between single classifiers, and their combinations in the multiclassifier model, showed that using our ensemble gave a better performance than the single classifier model, and iii) our multiclassifier model performed better than the most widely used multiclassifier models in the literature. The results and statistical analysis demonstrated the supremacy of our multiclassifier approach for predicting the affinity of A(2B) adenosine receptor antagonists, and it can be used to develop other QSAR models.
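The three combination rules named in the abstract can be sketched directly on dummy per-classifier probability outputs for a binary active/inactive decision; the probabilities and the 0.5 decision point are illustrative only.

```python
# Sketch of the combination rules named in the abstract (majority vote, maximum
# probability, average probability) applied to dummy per-classifier outputs.
import numpy as np

# probs[c, i] = P(active) from base classifier c for compound i (dummy values)
probs = np.array([[0.9, 0.4, 0.2],
                  [0.7, 0.6, 0.1],
                  [0.6, 0.3, 0.4]])

majority = (np.mean(probs > 0.5, axis=0) > 0.5).astype(int)   # majority vote of hard labels
avg_prob = (probs.mean(axis=0) > 0.5).astype(int)             # average probability rule
max_active = probs.max(axis=0)                                # most confident "active" output
max_inactive = (1.0 - probs).max(axis=0)                      # most confident "inactive" output
max_prob = (max_active > max_inactive).astype(int)            # maximum probability rule
print("majority:", majority, " average:", avg_prob, " maximum:", max_prob)
```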
Models based on value and probability in health improve shared decision making.
Ortendahl, Monica
2008-10-01
Diagnostic reasoning and treatment decisions are a key competence of doctors. A model based on values and probability provides a conceptual framework for clinical judgments and decisions, and also facilitates the integration of clinical and biomedical knowledge into a diagnostic decision. Both value and probability are usually estimated values in clinical decision making. Therefore, model assumptions and parameter estimates should be continually assessed against data, and models should be revised accordingly. Introducing parameter estimates for both value and probability, which usually pertain in clinical work, gives the model labelled subjective expected utility. Estimated values and probabilities are involved sequentially for every step in the decision-making process. Introducing decision-analytic modelling gives a more complete picture of variables that influence the decisions carried out by the doctor and the patient. A model revised for perceived values and probabilities by both the doctor and the patient could be used as a tool for engaging in a mutual and shared decision-making process in clinical work.
Cook, D A
2006-04-01
Models that estimate the probability of death of intensive care unit patients can be used to stratify patients according to the severity of their condition and to control for casemix and severity of illness. These models have been used for risk adjustment in quality monitoring, administration, management and research and as an aid to clinical decision making. Models such as the Mortality Prediction Model family, SAPS II, APACHE II, APACHE III and the organ system failure models provide estimates of the probability of in-hospital death of ICU patients. This review examines methods to assess the performance of these models. The key attributes of a model are discrimination (the accuracy of the ranking in order of probability of death) and calibration (the extent to which the model's prediction of probability of death reflects the true risk of death). These attributes should be assessed in existing models that predict the probability of patient mortality, and in any subsequent model that is developed for the purposes of estimating these probabilities. The literature contains a range of approaches for assessment which are reviewed and a survey of the methodologies used in studies of intensive care mortality models is presented. The systematic approach used by Standards for Reporting Diagnostic Accuracy provides a framework to incorporate these theoretical considerations of model assessment and recommendations are made for evaluation and presentation of the performance of models that estimate the probability of death of intensive care patients.
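A short sketch of the two attributes discussed: discrimination assessed by the area under the ROC curve, and calibration assessed by comparing predicted and observed mortality within risk deciles (a Hosmer-Lemeshow style table). The predicted probabilities and outcomes below are simulated, not drawn from any of the models named.

```python
# Sketch of assessing a mortality model's discrimination (AUC) and calibration
# (observed vs. predicted deaths by risk decile). Simulated predictions only.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
p_pred = rng.beta(2, 8, size=2000)            # simulated predicted death probabilities
died = rng.random(2000) < p_pred              # simulated outcomes consistent with them

print("discrimination (AUC):", roc_auc_score(died, p_pred))

deciles = np.quantile(p_pred, np.linspace(0, 1, 11))
for lo, hi in zip(deciles[:-1], deciles[1:]):
    band = (p_pred >= lo) & (p_pred <= hi)
    print(f"risk {lo:.2f}-{hi:.2f}: predicted {p_pred[band].mean():.3f}, "
          f"observed {died[band].mean():.3f}")
```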
Site occupancy models with heterogeneous detection probabilities
Royle, J. Andrew
2006-01-01
Models for estimating the probability of occurrence of a species in the presence of imperfect detection are important in many ecological disciplines. In these 'site occupancy' models, the possibility of heterogeneity in detection probabilities among sites must be considered because variation in abundance (and other factors) among sampled sites induces variation in detection probability (p). In this article, I develop occurrence probability models that allow for heterogeneous detection probabilities by considering several common classes of mixture distributions for p. For any mixing distribution, the likelihood has the general form of a zero-inflated binomial mixture for which inference based upon integrated likelihood is straightforward. A recent paper by Link (2003, Biometrics 59, 1123-1130) demonstrates that in closed population models used for estimating population size, different classes of mixture distributions are indistinguishable from data, yet can produce very different inferences about population size. I demonstrate that this problem can also arise in models for estimating site occupancy in the presence of heterogeneous detection probabilities. The implications of this are discussed in the context of an application to avian survey data and the development of animal monitoring programs.
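A brief sketch of the likelihood structure described: with occupancy probability psi and detection probability p drawn from a finite mixture over site types, a site's detection count follows a zero-inflated binomial mixture. The two-component mixture and all parameter values are arbitrary illustrations, not the article's fitted models.

```python
# Sketch of the zero-inflated binomial mixture likelihood for site occupancy
# with heterogeneous detection: p is drawn from a finite mixture (two site
# types here), and unoccupied sites contribute the zero-inflation term.
import numpy as np
from scipy.stats import binom

def site_likelihood(detections, visits, psi, mix_weights, mix_p):
    """P(observed detection count at one site | psi, mixture over p)."""
    p_given_occupied = np.sum(mix_weights * binom.pmf(detections, visits, mix_p))
    if detections == 0:
        return (1.0 - psi) + psi * p_given_occupied   # the site may simply be unoccupied
    return psi * p_given_occupied

psi = 0.6                                  # occupancy probability
mix_weights = np.array([0.7, 0.3])         # proportions of low- and high-detectability sites
mix_p = np.array([0.2, 0.6])               # detection probability for each site type

for k in range(6):
    print(k, site_likelihood(k, visits=5, psi=psi, mix_weights=mix_weights, mix_p=mix_p))
```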
Probabilistic multi-resolution human classification
NASA Astrophysics Data System (ADS)
Tu, Jun; Ran, H.
2006-02-01
Recently there has been some interest in using infrared cameras for human detection because of the sharply decreasing prices of infrared cameras. The training data used in our work for developing the probabilistic template consist of images known to contain humans in different poses and orientations but having the same height. Multiresolution templates are constructed, based on contours and edges. This is done so that the model does not learn the intensity variations among the background pixels and intensity variations among the foreground pixels. Each template at every level is then translated so that the centroid of the non-zero pixels matches the geometrical center of the image. After this normalization step, for each pixel of the template, the probability of it being pedestrian is calculated based on how frequently it appears as 1 in the training data. We also use gait periodicity to verify the pedestrian for the whole blob in a probabilistic, Bayesian manner. The videos had considerable variation in scenes, sizes of people, amount of occlusion and background clutter. Preliminary experiments show the robustness of the approach.
Automatic threshold selection for multi-class open set recognition
NASA Astrophysics Data System (ADS)
Scherreik, Matthew; Rigling, Brian
2017-05-01
Multi-class open set recognition is the problem of supervised classification with additional unknown classes encountered after a model has been trained. An open set classifier often has two core components. The first component is a base classifier which estimates the most likely class of a given example. The second component consists of open set logic which estimates whether the example is truly a member of the candidate class. Such a system is operated in a feed-forward fashion. That is, a candidate label is first estimated by the base classifier, and the true membership of the example to the candidate class is estimated afterward. Previous works have developed an iterative threshold selection algorithm for rejecting examples from classes which were not present at training time. In those studies, a Platt-calibrated SVM was used as the base classifier, and the thresholds were applied to class posterior probabilities for rejection. In this work, we investigate the effectiveness of other base classifiers when paired with the threshold selection algorithm and compare their performance with the original SVM solution.
Lazarus, Jeffrey E; Klein, Susan K
2010-01-01
This case series examines the practicality of using a standardized method of training children in self-hypnosis (SH) methods to explore its efficiency and short-term efficacy in treating tics in patients with Tourette syndrome. The files of 37 children and adolescents with Tourette syndrome referred for SH training were reviewed, yielding 33 patients for analysis. As part of a protocol for SH training, all viewed a videotape series of a boy undergoing SH training for tic control. Improvement in tic control was abstracted from subjective patient report. Seventy-nine percent of the patients trained in this technique experienced short-term clinical response, defined as control over the average 6-week follow-up period. Of the responders, 46% achieved tic control with SH after only 2 sessions and 96% after 3 visits. One patient required 4 visits. Instruction in SH, aided by the use of videotape training, augments a protocol and probably shortens the time of training in this technique. If SH is made more accessible in this way, it will be a valuable addition to multi-disciplinary management of tic disorders in Tourette syndrome.
Sri Lankan FRAX model and country-specific intervention thresholds.
Lekamwasam, Sarath
2013-01-01
There is a wide variation in fracture probabilities estimated by Asian FRAX models, although the outputs of South Asian models are concordant. Clinicians can choose either fixed or age-specific intervention thresholds when making treatment decisions in postmenopausal women. The cost-effectiveness of such an approach, however, needs to be addressed. This study examined suitable fracture probability intervention thresholds (ITs) for Sri Lanka, based on the Sri Lankan FRAX model. Fracture probabilities were estimated using all Asian FRAX models for a postmenopausal woman who has a BMI of 25 kg/m² and no clinical risk factors apart from a fragility fracture, and they were compared. Age-specific ITs were estimated based on the Sri Lankan FRAX model using the method followed by the National Osteoporosis Guideline Group in the UK. Using the age-specific ITs as the reference standard, suitable fixed ITs were also estimated. Fracture probabilities estimated by different Asian FRAX models varied widely. The Japanese and Taiwanese models showed higher fracture probabilities, while the Chinese, Philippine, and Indonesian models gave lower fracture probabilities. The outputs of the remaining FRAX models were generally similar. Age-specific ITs of major osteoporotic fracture probabilities (MOFP) based on the Sri Lankan FRAX model varied from 2.6 to 18% between 50 and 90 years. ITs of hip fracture probabilities (HFP) varied from 0.4 to 6.5% between 50 and 90 years. In finding fixed ITs, a MOFP of 11% and a HFP of 3.5% gave the lowest misclassification and highest agreement. The Sri Lankan FRAX model behaves similarly to other Asian FRAX models such as the Indian, Singapore-Indian, Thai, and South Korean models. Clinicians may use either the fixed or age-specific ITs in making therapeutic decisions in postmenopausal women. The economic aspects of such decisions, however, need to be considered.
Formation of the predicted training parameters in the form of a discrete information stream
NASA Astrophysics Data System (ADS)
Smolentseva, T. E.; Sumin, V. I.; Zolnikov, V. K.; Lavlinsky, V. V.
2018-03-01
In this work, the training process is considered as a discrete information stream. At each stage of the process, the portions of training information and the quality of their assimilation are analysed. The trainee's individual characteristics and reactions to every portion of information in the corresponding sections are defined. A training control algorithm with a predicted number of control checks is considered, which allows the appropriate controlling influence for the trainee to be determined. On the basis of this algorithm, a vector of probabilities that elements of the training information remain unlearned is obtained. As a result of the conducted research, an algorithm for forming the predicted training parameters is developed. The duration of training obtained experimentally is compared with the predicted duration, and a conclusion is drawn about the efficiency of forming the predicted training parameters. A software package is developed which, on the basis of the individual parameter values obtained from experiments on each trainee, calculates individual characteristics, forms a rating and monitors the change in training parameters over time.
Enhancing Flood Prediction Reliability Using Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Liu, Z.; Merwade, V.
2017-12-01
Uncertainty analysis is an indispensable part of modeling the hydrology and hydrodynamics of non-idealized environmental systems. Compared to reliance on prediction from one model simulation, using an ensemble of predictions that considers uncertainty from different sources is more reliable. In this study, Bayesian model averaging (BMA) is applied to the Black River watershed in Arkansas and Missouri by combining multi-model simulations to obtain reliable deterministic water stage and probabilistic inundation extent predictions. The simulation ensemble is generated from 81 LISFLOOD-FP subgrid model configurations that include uncertainty from channel shape, channel width, channel roughness and discharge. Model simulation outputs are trained with observed water stage data during one flood event, and BMA prediction ability is validated for another flood event. Results from this study indicate that BMA does not always outperform all members in the ensemble, but it provides relatively robust deterministic flood stage predictions across the basin. Station-based BMA (BMA_S) water stage prediction performs better than global BMA (BMA_G) prediction, which in turn is superior to the ensemble mean prediction. Additionally, the high-frequency flood inundation extent (probability greater than 60%) in the BMA_G probabilistic map is more accurate than the probabilistic flood inundation extent based on equal weights.
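A compact sketch of the BMA combination step: member weights (and a shared error variance) are estimated by EM against training observations under a Gaussian error assumption, and the deterministic BMA forecast is the weighted mean of the members. The ensemble, observations and the single-variance simplification are illustrative, not the study's LISFLOOD-FP configuration.

```python
# Sketch of Bayesian model averaging over an ensemble of stage predictions
# (illustrative EM with a shared Gaussian error variance; not the study's exact
# setup). Rows of `ens` are ensemble members; `obs` are training observations.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
truth = np.linspace(2.0, 5.0, 50)
ens = truth + rng.normal(0, [[0.2], [0.5], [1.0]], size=(3, 50))  # 3 toy members
obs = truth + rng.normal(0, 0.1, size=50)

w = np.full(3, 1 / 3)
sigma = 1.0
for _ in range(200):                                   # EM iterations
    lik = norm.pdf(obs, loc=ens, scale=sigma)          # shape (3, 50)
    resp = w[:, None] * lik
    resp /= resp.sum(axis=0, keepdims=True)            # E-step responsibilities
    w = resp.mean(axis=1)                              # M-step: member weights
    sigma = np.sqrt(np.sum(resp * (obs - ens) ** 2) / obs.size)

bma_mean = w @ ens                                     # deterministic BMA forecast
print("weights:", np.round(w, 3), " RMSE:", np.sqrt(np.mean((bma_mean - obs) ** 2)))
```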
Reconstruction and Applications of Collective Storylines from Web Photo Collections
2013-09-01
a random surfer model as follows: $\alpha = \min\left(\frac{\pi_G(s^*)\,q(s^*, s_{t-1})}{\pi_G(s_{t-1})\,q(s_{t-1}, s^*)},\ 1\right)$, where $q(i,j) = \lambda\tilde{w}_{ij} + (1-\lambda)\,\pi_G(j)$ (Eq. 3.3). In Eq. (3.3), the ... probability $\alpha$ in Eq. (3.3), where $\tilde{w}_{ij}$ is the element $(i,j)$ of $\tilde{G}$. We repeat this process until the desired number of training samples is selected. For ... exponential of a linear summation of the functions $f^l_j$ of the covariates $x_j$ with a parameter vector $\theta^l = (\theta^l_1, \dots, \theta^l_J)$: $\log \lambda_l(t_i \mid \theta^l) = \sum_{j=1}^{J} \theta^l_j f^l_j$
Emotional Sentence Annotation Helps Predict Fiction Genre.
Samothrakis, Spyridon; Fasli, Maria
2015-01-01
Fiction, a prime form of entertainment, has evolved into multiple genres which one can broadly attribute to different forms of stories. In this paper, we examine the hypothesis that works of fiction can be characterised by the emotions they portray. To investigate this hypothesis, we use works of fiction from Project Gutenberg and attribute basic emotional content to each individual sentence using Ekman's model. A time-smoothed version of the emotional content for each basic emotion is used to train extremely randomized trees. We show through 10-fold cross-validation that the emotional content of each work of fiction can help identify its genre with significantly higher probability than random. We also show that the most important differentiator between genre novels is fear.
Cigrang, J A; Todd, S L; Carbone, E G
2000-01-01
A significant proportion of people entering the military are discharged within the first 6 months of enlistment. Mental health related problems are often cited as the cause of discharge. This study evaluated the utility of stress inoculation training in helping reduce the attrition of a sample of Air Force trainees at risk for discharge from basic military training. Participants were 178 trainees referred for a psychological evaluation from basic training. Participants were randomly assigned to a 2-session stress management group or a usual-care control condition. Compared with past studies that used less rigorous methodology, this study did not find that exposure to stress management information increased the probability of graduating basic military training. Results are discussed in terms of possible reasons for the lack of treatment effects and directions for future research.
The CBT Advisor: An Expert System Program for Making Decisions about CBT.
ERIC Educational Resources Information Center
Kearsley, Greg
1985-01-01
Discusses structure, credibility, and use of the Computer Based Training (CBT) Advisor, an expert system designed to help managers make judgements about course selection, system selection, cost/benefits, development effort, and probable success of CBT projects. (MBR)
Cumplido-Hernández, Gustavo; Campos-Arciniega, María Faustina; Chávez-López, Arturo
2007-01-01
Medical specialty training courses have peculiar characteristics that probably influence the learning process of the residents. These training courses take place in large hospitals; the residents are subjected to a rigorous selection process, and at the same time they are affiliated employees of the institution. They work long shifts and are immersed in complex academic and occupational relationships. This study aims to ascertain the significance that these future specialists give to the environment where the training course takes place in relation with their learning process. We used the social anthropology narrative analysis method. A theoretical social perspective was used to emphasize on the context to explain the reality in which the residents live. Discipline, workload, conflictive relationships and strength of family ties were the most significant elements.
NASA Technical Reports Server (NTRS)
Garner, Gregory G.; Thompson, Anne M.
2013-01-01
An ensemble statistical post-processor (ESP) is developed for the National Air Quality Forecast Capability (NAQFC) to address the unique challenges of forecasting surface ozone in Baltimore, MD. Air quality and meteorological data were collected from the eight monitors that constitute the Baltimore forecast region. These data were used to build the ESP using a moving-block bootstrap, regression tree models, and extreme-value theory. The ESP was evaluated using a 10-fold cross-validation to avoid evaluation with the same data used in the development process. Results indicate that the ESP is conditionally biased, likely due to slight overfitting while training the regression tree models. When viewed from the perspective of a decision-maker, the ESP provides a wealth of additional information previously not available through the NAQFC alone. The user is provided the freedom to tailor the forecast to the decision at hand by using decision-specific probability thresholds that define a forecast for an ozone exceedance. Taking advantage of the ESP, the user not only receives an increase in value over the NAQFC, but also receives value for…
Aldars-García, Laila; Berman, María; Ortiz, Jordi; Ramos, Antonio J; Marín, Sonia
2018-06-01
The probability of growth and aflatoxin B1 (AFB1) production of 20 isolates of Aspergillus flavus were studied using a full factorial design with eight water activity levels (0.84-0.98 a_w) and six temperature levels (15-40 °C). Binary data obtained from growth studies were modelled using linear logistic regression analysis as a function of temperature, water activity and time for each isolate. In parallel, AFB1 was extracted at different times from newly formed colonies (up to 20 mm in diameter). Although a total of 950 AFB1 values over time for all conditions studied were recorded, they were not considered to be enough to build probability models over time, and therefore, only models at 30 days were built. The confidence intervals of the regression coefficients of the probability of growth models showed some differences among the 20 growth models. Further, to assess the growth/no growth and AFB1/no-AFB1 production boundaries, 0.05 and 0.5 probabilities were plotted at 30 days for all of the isolates. The boundaries for growth and AFB1 showed that, in general, the conditions for growth were wider than those for AFB1 production. The probability of growth and AFB1 production seemed to be less variable among isolates than AFB1 accumulation. Apart from the AFB1 production probability models, using growth probability models for AFB1 probability predictions could be, although conservative, a suitable alternative. Predictive mycology should include a number of isolates to generate data to build predictive models and take into account the genetic diversity of the species and thus make predictions as similar as possible to real fungal food contamination. Copyright © 2017 Elsevier Ltd. All rights reserved.
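A brief sketch of the kind of linear logistic growth/no-growth model described, fitted to binary growth outcomes as a function of temperature and water activity (time and isolate omitted for brevity); the 0.5 and 0.05 contours of the fitted probability surface correspond to the boundaries discussed. The data below are synthetic.

```python
# Sketch of a growth/no-growth boundary model of the type described: linear
# logistic regression of binary growth outcomes on temperature and water
# activity (synthetic data; a real model would also include time and isolate).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
temp = rng.uniform(15, 40, size=600)                 # degrees C
aw = rng.uniform(0.84, 0.98, size=600)               # water activity
logit_true = -180 + 190 * aw + 0.1 * temp            # synthetic "true" growth surface
grew = rng.random(600) < 1.0 / (1.0 + np.exp(-logit_true))

X = np.column_stack([aw, temp])
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, grew)

# Probability of growth along an a_w transect at a few temperatures; the 0.5
# and 0.05 contours of this surface are the growth/no-growth boundaries.
for t in (20, 30, 40):
    grid = np.column_stack([np.linspace(0.84, 0.98, 5), np.full(5, t)])
    print(f"T={t} C, a_w 0.84..0.98:", np.round(model.predict_proba(grid)[:, 1], 2))
```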
Kramer, Andrew A; Higgins, Thomas L; Zimmerman, Jack E
2014-03-01
To examine the accuracy of the original Mortality Probability Admission Model III, ICU Outcomes Model/National Quality Forum modification of Mortality Probability Admission Model III, and Acute Physiology and Chronic Health Evaluation IVa models for comparing observed and risk-adjusted hospital mortality predictions. Retrospective paired analyses of day 1 hospital mortality predictions using three prognostic models. Fifty-five ICUs at 38 U.S. hospitals from January 2008 to December 2012. Among 174,001 intensive care admissions, 109,926 met model inclusion criteria and 55,304 had data for mortality prediction using all three models. None. We compared patient exclusions and the discrimination, calibration, and accuracy for each model. Acute Physiology and Chronic Health Evaluation IVa excluded 10.7% of all patients, ICU Outcomes Model/National Quality Forum 20.1%, and Mortality Probability Admission Model III 24.1%. Discrimination of Acute Physiology and Chronic Health Evaluation IVa was superior with area under receiver operating curve (0.88) compared with Mortality Probability Admission Model III (0.81) and ICU Outcomes Model/National Quality Forum (0.80). Acute Physiology and Chronic Health Evaluation IVa was better calibrated (lowest Hosmer-Lemeshow statistic). The accuracy of Acute Physiology and Chronic Health Evaluation IVa was superior (adjusted Brier score = 31.0%) to that for Mortality Probability Admission Model III (16.1%) and ICU Outcomes Model/National Quality Forum (17.8%). Compared with observed mortality, Acute Physiology and Chronic Health Evaluation IVa overpredicted mortality by 1.5% and Mortality Probability Admission Model III by 3.1%; ICU Outcomes Model/National Quality Forum underpredicted mortality by 1.2%. Calibration curves showed that Acute Physiology and Chronic Health Evaluation performed well over the entire risk range, unlike the Mortality Probability Admission Model and ICU Outcomes Model/National Quality Forum models. Acute Physiology and Chronic Health Evaluation IVa had better accuracy within patient subgroups and for specific admission diagnoses. Acute Physiology and Chronic Health Evaluation IVa offered the best discrimination and calibration on a large common dataset and excluded fewer patients than Mortality Probability Admission Model III or ICU Outcomes Model/National Quality Forum. The choice of ICU performance benchmarks should be based on a comparison of model accuracy using data for identical patients.
Laursen, Jannie
2014-01-01
Background. When implementing a new surgical technique, the best method for didactic learning has not been settled. There are basically two scenarios: the trainee goes to the teacher's clinic and learns the new technique hands-on, or the teacher goes to the trainee's clinic and performs the teaching there. Methods. An informal literature review was conducted to provide a basis for discussing pros and cons. We also wanted to discuss how many surgeons can be trained in a day, the importance of demand for a new surgical procedure in ensuring a high adoption rate, and, finally, to apply these issues to a discussion of barriers to adoption of the new ONSTEP technique for inguinal hernia repair after initial training. Results and Conclusions. The optimal training method would include moving the teacher to the trainee's department to obtain team-training effects alongside surgical technical training of the trainee surgeon. The training should also include a theoretical presentation and discussion along with the practical training. Importantly, the training visit should probably be followed by a scheduled visit to clear up misunderstandings and fine-tune the technique after an initial self-learning period. PMID:25506078
A path integral approach to the Hodgkin-Huxley model
NASA Astrophysics Data System (ADS)
Baravalle, Roman; Rosso, Osvaldo A.; Montani, Fernando
2017-11-01
To understand how single neurons process sensory information, it is necessary to develop suitable stochastic models to describe the response variability of the recorded spike trains. Spikes in a given neuron are produced by the synergistic action of voltage-dependent sodium and potassium channels, whose gates open and close. The Hodgkin and Huxley (HH) equations describe the ionic mechanisms underlying the initiation and propagation of action potentials through a set of nonlinear ordinary differential equations that approximate the electrical characteristics of the excitable cell. The path integral formalism provides an adequate approach for computing quantities such as transition probabilities, and any stochastic system can be expressed in terms of this methodology. We use the technique of path integrals to determine the analytical solution driven by a non-Gaussian colored noise when considering the HH equations as a stochastic system. The different neuronal dynamics are investigated by estimating the path integral solutions driven by a non-Gaussian colored noise q. More specifically, we take into account the correlational structures of the complex neuronal signals not just by estimating the transition probability associated with the Gaussian approach to the stochastic HH equations, but by considering much more subtle processes accounting for the non-Gaussian noise that could be induced by the surrounding neural network and by feedforward correlations. This allows us to investigate the underlying dynamics of the neural system when different scenarios of noise correlations are considered.
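As a numerical complement to the analytical treatment described above, the sketch below integrates the deterministic HH equations with a simple Euler scheme and adds an Ornstein-Uhlenbeck (Gaussian colored) noise current. It is a stand-in for intuition only: the non-Gaussian (q) noise and the path-integral solution of the paper are not reproduced.

```python
# Minimal sketch: Hodgkin-Huxley membrane equation integrated with a simple
# Euler scheme, driven by an Ornstein-Uhlenbeck (colored) noise current.
import numpy as np

# Classic HH parameters (mV, mS/cm^2, uF/cm^2)
C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3
ENa, EK, EL = 50.0, -77.0, -54.4

def alpha_beta(V):
    an = 0.01 * (V + 55) / (1 - np.exp(-(V + 55) / 10))
    bn = 0.125 * np.exp(-(V + 65) / 80)
    am = 0.1 * (V + 40) / (1 - np.exp(-(V + 40) / 10))
    bm = 4.0 * np.exp(-(V + 65) / 18)
    ah = 0.07 * np.exp(-(V + 65) / 20)
    bh = 1.0 / (1 + np.exp(-(V + 35) / 10))
    return an, bn, am, bm, ah, bh

dt, T = 0.01, 200.0                      # time step and duration in ms
steps = int(T / dt)
tau_eta, sigma = 5.0, 3.0                # OU correlation time and amplitude
rng = np.random.default_rng(2)

V, n, m, h, eta = -65.0, 0.32, 0.05, 0.6, 0.0
spikes = 0
for _ in range(steps):
    an, bn, am, bm, ah, bh = alpha_beta(V)
    I_ion = gNa * m**3 * h * (V - ENa) + gK * n**4 * (V - EK) + gL * (V - EL)
    I_ext = 7.0 + eta                    # constant drive plus colored noise
    V_new = V + dt * (I_ext - I_ion) / C
    n += dt * (an * (1 - n) - bn * n)
    m += dt * (am * (1 - m) - bm * m)
    h += dt * (ah * (1 - h) - bh * h)
    # Ornstein-Uhlenbeck update (Gaussian colored noise; the q-noise of the
    # abstract would replace the increment distribution)
    eta += dt * (-eta / tau_eta) + sigma * np.sqrt(dt) * rng.standard_normal()
    if V < 0 <= V_new:                   # crude spike detection at 0 mV
        spikes += 1
    V = V_new

print(f"spikes in {T:.0f} ms: {spikes}")
```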
Liu, Bo; Cheng, H D; Huang, Jianhua; Tian, Jiawei; Liu, Jiafeng; Tang, Xianglong
2009-08-01
Because of its complicated structure, low signal/noise ratio, low contrast and blurry boundaries, fully automated segmentation of a breast ultrasound (BUS) image is a difficult task. In this paper, a novel segmentation method for BUS images without human intervention is proposed. Unlike most published approaches, the proposed method handles the segmentation problem by using a two-step strategy: ROI generation and ROI segmentation. First, a well-trained texture classifier categorizes the tissues into different classes, and the background knowledge rules are used for selecting the regions of interest (ROIs) from them. Second, a novel probability distance-based active contour model is applied for segmenting the ROIs and finding the accurate positions of the breast tumors. The active contour model combines both global statistical information and local edge information, using a level set approach. The proposed segmentation method was performed on 103 BUS images (48 benign and 55 malignant). To validate the performance, the results were compared with the corresponding tumor regions marked by an experienced radiologist. Three error metrics, true-positive ratio (TP), false-negative ratio (FN) and false-positive ratio (FP) were used for measuring the performance of the proposed method. The final results (TP = 91.31%, FN = 8.69% and FP = 7.26%) demonstrate that the proposed method can segment BUS images efficiently, quickly and automatically.
Time‐dependent renewal‐model probabilities when date of last earthquake is unknown
Field, Edward H.; Jordan, Thomas H.
2015-01-01
We derive time-dependent, renewal-model earthquake probabilities for the case in which the date of the last event is completely unknown, and compare these with the time-independent Poisson probabilities that are customarily used as an approximation in this situation. For typical parameter values, the renewal-model probabilities exceed Poisson results by more than 10% when the forecast duration exceeds ~20% of the mean recurrence interval. We also derive probabilities for the case in which the last event is further constrained to have occurred before historical record keeping began (the historic open interval), which can only serve to increase earthquake probabilities for typically applied renewal models. We conclude that accounting for the historic open interval can improve long-term earthquake rupture forecasts for California and elsewhere.
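A minimal numerical version of this comparison is sketched below: the conditional renewal probability is averaged over the unknown time since the last event using its stationary density S(t)/mu, and compared with the Poisson probability. The lognormal recurrence distribution and the parameter values are illustrative, not those used in the paper.

```python
# Sketch: renewal-model probability of an event in the next DT years when the
# date of the last event is unknown, versus the Poisson approximation.
# The elapsed time since the last event is averaged over its stationary
# density S(t)/mu, where S is the survival function of the recurrence distribution.
import numpy as np
from scipy import stats, integrate

mu, aperiodicity = 200.0, 0.5            # mean recurrence (yr), coefficient of variation
sigma = np.sqrt(np.log(1 + aperiodicity**2))
scale = mu / np.exp(sigma**2 / 2)        # lognormal scale giving mean = mu
dist = stats.lognorm(s=sigma, scale=scale)

def renewal_prob_unknown_last(DT):
    # P = (1/mu) * integral_0^inf [F(t + DT) - F(t)] dt
    integrand = lambda t: dist.cdf(t + DT) - dist.cdf(t)
    val, _ = integrate.quad(integrand, 0, 20 * mu, limit=200)
    return val / mu

for DT in (10, 30, 50, 100):             # forecast durations in years
    p_renewal = renewal_prob_unknown_last(DT)
    p_poisson = 1 - np.exp(-DT / mu)
    print(f"DT={DT:>3} yr  renewal={p_renewal:.3f}  Poisson={p_poisson:.3f}")
```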
Hurford, Amy; Hebblewhite, Mark; Lewis, Mark A
2006-11-01
A reduced probability of finding mates at low densities is a frequently hypothesized mechanism for a component Allee effect. At low densities dispersers are less likely to find mates and establish new breeding units. However, many mathematical models for an Allee effect do not make a distinction between breeding group establishment and subsequent population growth. Our objective is to derive a spatially explicit mathematical model, where dispersers have a reduced probability of finding mates at low densities, and parameterize the model for wolf recolonization in the Greater Yellowstone Ecosystem (GYE). In this model, only the probability of establishing new breeding units is influenced by the reduced probability of finding mates at low densities. We analytically and numerically solve the model to determine the effect of a decreased probability in finding mates at low densities on population spread rate and density. Our results suggest that a reduced probability of finding mates at low densities may slow recolonization rate.
Modeling loosely annotated images using both given and imagined annotations
NASA Astrophysics Data System (ADS)
Tang, Hong; Boujemaa, Nozha; Chen, Yunhao; Deng, Lei
2011-12-01
In this paper, we present an approach to learning latent semantic analysis models from loosely annotated images for automatic image annotation and indexing. The given annotation in training images is loose for two reasons: 1. ambiguous correspondences between visual features and annotated keywords; 2. incomplete lists of annotated keywords. The second reason motivates us to enrich the incomplete annotation in a simple way before learning a topic model. In particular, some "imagined" keywords are added to the incomplete annotation by measuring similarity between keywords in terms of their co-occurrence. Then, both given and imagined annotations are employed to learn probabilistic topic models for automatically annotating new images. We conduct experiments on two image databases (Corel and ESP) coupled with their loose annotations, and compare the proposed method with state-of-the-art discrete annotation methods. The proposed method improves word-driven probabilistic latent semantic analysis (PLSA-words) to a performance comparable with the best discrete annotation method, while retaining a merit of PLSA-words: a wider semantic range.
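The enrichment step lends itself to a short sketch: "imagined" keywords are added to an incomplete annotation when their co-occurrence similarity to the given keywords is high. The similarity measure and threshold below are illustrative choices, not necessarily the authors' exact formulation.

```python
# Sketch: enrich an incomplete annotation with "imagined" keywords chosen by
# co-occurrence similarity across the training annotations.
import numpy as np

train_annotations = [
    {"sky", "clouds", "plane"},
    {"sky", "sun", "sea"},
    {"sea", "boat", "sky"},
    {"boat", "harbor", "sea"},
]
vocab = sorted(set().union(*train_annotations))
idx = {w: i for i, w in enumerate(vocab)}

# Keyword-by-image incidence matrix and keyword-keyword co-occurrence counts
X = np.zeros((len(vocab), len(train_annotations)))
for j, ann in enumerate(train_annotations):
    for w in ann:
        X[idx[w], j] = 1
C = X @ X.T

def similarity(w1, w2):
    # normalised co-occurrence (cosine of the incidence vectors)
    return C[idx[w1], idx[w2]] / np.sqrt(C[idx[w1], idx[w1]] * C[idx[w2], idx[w2]])

def enrich(annotation, threshold=0.5):
    imagined = set()
    for cand in vocab:
        if cand not in annotation and max(similarity(cand, w) for w in annotation) >= threshold:
            imagined.add(cand)
    return annotation | imagined

print(enrich({"boat"}))   # adds co-occurring keywords such as "sea" before topic modelling
```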
Classification Model for Forest Fire Hotspot Occurrences Prediction Using ANFIS Algorithm
NASA Astrophysics Data System (ADS)
Wijayanto, A. K.; Sani, O.; Kartika, N. D.; Herdiyeni, Y.
2017-01-01
This study proposed the application of a data mining technique, the Adaptive Neuro-Fuzzy Inference System (ANFIS), to forest fire hotspot data to develop classification models for hotspot occurrence in Central Kalimantan. A hotspot is a point indicated as the location of a fire. In this study, hotspot distribution is categorized as true alarm or false alarm. ANFIS is a soft computing method in which a given input-output data set is expressed in a fuzzy inference system (FIS). The FIS implements a nonlinear mapping from its input space to the output space. The method classified hotspots as target objects by correlating spatial attribute data, using three folds with the ANFIS algorithm to obtain the best model. The best result, obtained from the 3rd fold, provided low training error (error = 0.0093676) and low testing error (error = 0.0093676). Distance to road was the most influential attribute on the probability of true versus false alarms, as the level of human activity is higher near roads. This classification model can be used to develop an early warning system for forest fires.
Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V; Robles, Montserrat; Aparici, F; Martí-Bonmatí, L; García-Gómez, Juan M
2015-01-01
Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. As non-structured algorithms, we evaluated K-means, Fuzzy K-means and the Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated the Gaussian Hidden Markov Random Field (GHMRF). An automated post-processing step based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after segmentation. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.
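A minimal sketch of the GMM variant of such a pipeline is given below: voxels are clustered without labels and each component is then assigned a class using a (here synthetic) tissue probability map. It omits preprocessing, spatial regularization and the GHMRF variant, and all data are invented.

```python
# Sketch: unsupervised voxel clustering with a Gaussian Mixture Model on
# multi-sequence MR intensities, then assigning each component to a class by
# majority agreement with a tissue probability map. Data are synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
n_vox = 5000
# Synthetic 2-channel intensities (e.g. T1 and FLAIR) for 3 "tissues"
means = np.array([[0.2, 0.3], [0.6, 0.4], [0.8, 0.9]])
true = rng.integers(0, 3, n_vox)
feats = means[true] + 0.05 * rng.standard_normal((n_vox, 2))

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
labels = gmm.fit_predict(feats)

# Hypothetical prior probability map P(class | voxel), a noisy stand-in for an atlas
prior = np.full((n_vox, 3), 0.1)
prior[np.arange(n_vox), true] = 0.8

# Relabel each GMM component with the prior class it most often overlaps
component_to_class = {k: prior[labels == k].sum(axis=0).argmax() for k in range(3)}
seg = np.array([component_to_class[k] for k in labels])
print("agreement with synthetic truth:", (seg == true).mean())
```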
Bozkurt, Selen; Bostanci, Asli; Turhan, Murat
2017-08-11
The goal of this study is to evaluate the results of machine learning methods for classifying the OSA severity of patients with suspected sleep-disordered breathing as normal, mild, moderate or severe, based on non-polysomnographic variables: 1) clinical data, 2) symptoms and 3) physical examination. To produce classification models for OSA severity, five machine learning methods (Bayesian network, Decision Tree, Random Forest, Neural Networks and Logistic Regression) were trained while relevant variables and their relationships were derived empirically from observed data. Each model was trained and evaluated using 10-fold cross-validation, and classification performance was assessed using the true positive rate (TPR), false positive rate (FPR), positive predictive value (PPV), F-measure and area under the receiver operating characteristic curve (ROC-AUC). Results of 10-fold cross-validated tests with different variable settings indicated that the OSA severity of suspected OSA patients can be classified using non-polysomnographic features, with a true positive rate of up to 0.71 and a false positive rate as low as 0.15. Moreover, the results for different variable settings revealed that the accuracy of the classification models improved significantly when physical examination variables were added to the model. The study results showed that machine learning methods can be used to estimate the probabilities of no, mild, moderate, and severe obstructive sleep apnea, and such approaches may improve initial OSA screening and help refer only patients with suspected moderate or severe OSA to sleep laboratories for the expensive tests.
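A generic version of this model comparison can be sketched as follows, using 10-fold cross-validation on synthetic tabular data; the estimators, features and class balance are stand-ins, not the clinical variables of the study.

```python
# Sketch: 10-fold cross-validated comparison of several classifiers for a
# 4-class severity label (normal/mild/moderate/severe) from tabular features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, confusion_matrix

X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "logistic regr.": LogisticRegression(max_iter=2000),
}
for name, clf in models.items():
    pred = cross_val_predict(clf, X, y, cv=10)
    cm = confusion_matrix(y, pred)
    tpr = np.diag(cm) / cm.sum(axis=1)               # per-class recall
    print(f"{name:15s} macro-F1={f1_score(y, pred, average='macro'):.2f} "
          f"mean TPR={tpr.mean():.2f}")
```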
Combining population and patient-specific characteristics for prostate segmentation on 3D CT images
NASA Astrophysics Data System (ADS)
Ma, Ling; Guo, Rongrong; Tian, Zhiqiang; Venkataraman, Rajesh; Sarkar, Saradwata; Liu, Xiabi; Tade, Funmilayo; Schuster, David M.; Fei, Baowei
2016-03-01
Prostate segmentation on CT images is a challenging task. In this paper, we explore population and patient-specific characteristics for the segmentation of the prostate on CT images. Because population learning does not consider inter-patient variations and patient-specific learning may not perform well across patients, we combine population and patient-specific information to improve segmentation performance. Specifically, we train a population model based on the population data and a patient-specific model based on the manual segmentation of three slices of the new patient. We compute the similarity between the two models to measure how much of the population knowledge applies to the specific patient. By combining the patient-specific knowledge with this similarity, we capture both population and patient-specific characteristics to calculate the probability of a pixel belonging to the prostate. Finally, we smooth the prostate surface according to the prostate-density values of the pixels in the distance transform image. We conducted leave-one-out validation experiments on a set of CT volumes from 15 patients. Manual segmentation results from a radiologist serve as the gold standard for the evaluation. Experimental results show that our method achieved an average DSC of 85.1% compared with the manual segmentation gold standard, outperforming both the population learning method and the patient-specific learning approach alone. The CT segmentation method can have various applications in prostate cancer diagnosis and therapy.
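One plausible reading of the combination step is sketched below: the patient-specific probability map is weighted by a similarity score between the two models before thresholding. The weighting scheme is an assumption made for illustration, not the authors' exact formulation, and the probability maps are random stand-ins.

```python
# Sketch: combine a population model's and a patient-specific model's
# per-pixel prostate probabilities, weighting the patient-specific term by a
# similarity score between the two models (illustrative weighting scheme).
import numpy as np

rng = np.random.default_rng(4)
h, w = 64, 64
p_population = rng.uniform(0, 1, (h, w))       # stand-in probability maps
p_patient = np.clip(p_population + 0.1 * rng.standard_normal((h, w)), 0, 1)

# Model similarity: correlation between the two probability maps
similarity = np.corrcoef(p_population.ravel(), p_patient.ravel())[0, 1]
alpha = 0.5 * (1 + similarity)                  # weight on the patient-specific model
p_combined = alpha * p_patient + (1 - alpha) * p_population

mask = p_combined >= 0.5                        # pixel belongs to the prostate
print(f"similarity={similarity:.2f}, alpha={alpha:.2f}, prostate pixels={mask.sum()}")
```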
Factors influencing reporting and harvest probabilities in North American geese
Zimmerman, G.S.; Moser, T.J.; Kendall, W.L.; Doherty, P.F.; White, Gary C.; Caswell, D.F.
2009-01-01
We assessed variation in reporting probabilities of standard bands among species, populations, harvest locations, and size classes of North American geese to enable estimation of unbiased harvest probabilities. We included reward (US$10, $20, $30, $50, or $100) and control ($0) banded geese from 16 recognized goose populations of 4 species: Canada (Branta canadensis), cackling (B. hutchinsii), Ross's (Chen rossii), and snow geese (C. caerulescens). We incorporated spatially explicit direct recoveries and live recaptures into a multinomial model to estimate reporting, harvest, and band-retention probabilities. We compared various models for estimating harvest probabilities at country (United States vs. Canada), flyway (5 administrative regions), and harvest area (i.e., flyways divided into northern and southern sections) scales. Mean reporting probability of standard bands was 0.73 (95% CI = 0.69-0.77). Point estimates of reporting probabilities for goose populations or spatial units varied from 0.52 to 0.93, but confidence intervals for individual estimates overlapped and model selection indicated that models with species, population, or spatial effects were less parsimonious than those without these effects. Our estimates were similar to recently reported estimates for mallards (Anas platyrhynchos). We provide current harvest probability estimates for these populations using our direct measures of reporting probability, improving the accuracy of previous estimates obtained from recovery probabilities alone. Goose managers and researchers throughout North America can use our reporting probabilities to correct recovery probabilities estimated from standard banding operations for deriving spatially explicit harvest probabilities.
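The core idea behind reward banding can be illustrated with a much simpler estimator than the multinomial model used in the study: if the highest-reward bands are assumed to be reported with probability near one, the reporting probability of standard bands is roughly the ratio of direct recovery rates. The counts below are invented.

```python
# Back-of-the-envelope sketch of the reward-band logic: standard-band reporting
# probability estimated as the ratio of direct recovery rates, assuming the
# largest-reward bands are reported with probability ~1. This is a simplification
# of the study's multinomial model; the counts are illustrative only.
released = {"$0": 3000, "$100": 500}       # bands released per reward level
recovered = {"$0": 220, "$100": 50}        # direct recoveries reported by hunters

recovery_rate = {k: recovered[k] / released[k] for k in released}
reporting_prob = recovery_rate["$0"] / recovery_rate["$100"]   # the study's overall mean was 0.73
harvest_prob = recovery_rate["$0"] / reporting_prob            # corrected harvest probability
print(f"reporting ~ {reporting_prob:.2f}, harvest ~ {harvest_prob:.3f}")
```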
Developing a probability-based model of aquifer vulnerability in an agricultural region
NASA Astrophysics Data System (ADS)
Chen, Shih-Kai; Jang, Cheng-Shin; Peng, Yi-Huei
2013-04-01
Hydrogeological settings of aquifers strongly influence regional groundwater movement and pollution processes. Establishing a map of aquifer vulnerability is therefore critical for planning groundwater quality protection. This study developed a novel probability-based DRASTIC model of aquifer vulnerability in the Choushui River alluvial fan, Taiwan, using indicator kriging, and determined various risk categories of contamination potential based on the estimated vulnerability indexes. Categories and ratings of the six parameters in the probability-based DRASTIC model were probabilistically characterized using two parameter classification methods: selecting the maximum estimation probability and calculating an expected value. Moreover, the probability-based estimation and assessment provided insight into propagating parameter uncertainty arising from limited observation data. To examine the model's capacity for predicting pollution, the medium, high, and very high risk categories of contamination potential were compared with observed nitrate-N concentrations exceeding 0.5 mg/L, which indicate anthropogenic groundwater pollution. The results reveal that the developed probability-based DRASTIC model is capable of predicting high nitrate-N groundwater pollution and characterizing parameter uncertainty via the probability estimation processes.
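The two parameter-classification options mentioned above can be illustrated with a short sketch that converts per-category estimation probabilities into DRASTIC ratings, either by taking the most probable category or a probability-weighted expected rating, and then forms the weighted vulnerability index. The probabilities and candidate ratings below are invented; the weights follow the standard DRASTIC scheme.

```python
# Sketch: turning per-category estimation probabilities (e.g. from indicator
# kriging) into DRASTIC ratings at one grid cell, then into a vulnerability index.
import numpy as np

# Standard DRASTIC weights: Depth, Recharge, Aquifer media, Soil, Topography,
# Impact of vadose zone (hydraulic Conductivity omitted in a 6-parameter model)
weights = np.array([5, 4, 3, 2, 1, 5])

ratings = np.array([2, 5, 9])                     # candidate rating values per class
# Rows = parameters, columns = estimation probabilities of each rating class
probs = np.array([
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])

max_prob_rating = ratings[probs.argmax(axis=1)]      # most probable category
expected_rating = probs @ ratings                    # probability-weighted value

print("index (max-probability):", int(weights @ max_prob_rating))
print("index (expected value) :", round(float(weights @ expected_rating), 1))
```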
Probability Modeling and Thinking: What Can We Learn from Practice?
ERIC Educational Resources Information Center
Pfannkuch, Maxine; Budgett, Stephanie; Fewster, Rachel; Fitch, Marie; Pattenwise, Simeon; Wild, Chris; Ziedins, Ilze
2016-01-01
Because new learning technologies are enabling students to build and explore probability models, we believe that there is a need to determine the big enduring ideas that underpin probabilistic thinking and modeling. By uncovering the elements of the thinking modes of expert users of probability models we aim to provide a base for the setting of…
Mapping the Transmission Risk of Zika Virus using Machine Learning Models.
Jiang, Dong; Hao, Mengmeng; Ding, Fangyu; Fu, Jingying; Li, Meng
2018-06-19
Zika virus, which has been linked to severe congenital abnormalities, is exacerbating global public health problems as its transnational expansion is fueled by increased global travel and trade. Suitability mapping of the transmission risk of Zika virus is essential for drafting public health plans and disease control strategies, which are especially important in areas where medical resources are relatively scarce. Predicting the risk of Zika virus outbreaks has been studied in recent years, but the published literature rarely includes multiple model comparisons or predictive uncertainty analysis. Here, three relatively popular machine learning models, a backward propagation neural network (BPNN), a gradient boosting machine (GBM) and a random forest (RF), were adopted to map the probability of Zika epidemic outbreak at the global level, pairing high-dimensional multidisciplinary covariate layers with comprehensive location data on recorded Zika virus infections in humans. The results show that the predicted high-risk areas for Zika transmission are concentrated in four regions: southeastern North America, eastern South America, Central Africa and eastern Asia. To evaluate the performance of the machine learning models, 50 modeling runs were conducted on a training dataset. The BPNN model obtained the highest predictive accuracy, with a 10-fold cross-validation area under the curve (AUC) of 0.966 [95% confidence interval (CI) 0.965-0.967], followed by the GBM model (10-fold cross-validation AUC = 0.964 [0.963-0.965]) and the RF model (10-fold cross-validation AUC = 0.963 [0.962-0.964]). Compared with the BPNN-based model, significant differences in prediction accuracy were observed for the GBM and RF models (p = 0.0258* and p = 0.0001***, respectively). Importantly, the prediction uncertainty introduced by the selection of absence data was quantified, providing more accurate fundamental and scientific information for further studies of disease transmission prediction and risk assessment.
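A generic version of this model comparison is sketched below with scikit-learn stand-ins for the BPNN, GBM and RF, 10-fold cross-validated AUC, and a paired test on the per-fold scores; the data, settings and choice of test are illustrative rather than those of the paper.

```python
# Sketch: 10-fold cross-validated AUC for three presence/absence models on
# synthetic covariates, with a paired comparison of the per-fold scores.
import numpy as np
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1500, n_features=20, n_informative=8,
                           weights=[0.7, 0.3], random_state=0)

models = {
    "MLP (BPNN stand-in)": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                                         random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=300, random_state=0),
}
fold_aucs = {}
for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
    fold_aucs[name] = scores
    print(f"{name:20s} AUC = {scores.mean():.3f} +/- {scores.std():.3f}")

# Paired comparison of per-fold AUCs against the MLP (analogous to the paper's tests)
for name in ("GBM", "RF"):
    stat, p = ttest_rel(fold_aucs["MLP (BPNN stand-in)"], fold_aucs[name])
    print(f"MLP vs {name}: p = {p:.3f}")
```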
A simplified model for the assessment of the impact probability of fragments.
Gubinelli, Gianfilippo; Zanelli, Severino; Cozzani, Valerio
2004-12-31
A model was developed for the assessment of fragment impact probability on a target vessel, following the collapse and fragmentation of a primary vessel due to internal pressure. The model provides the probability of impact of a fragment with defined shape, mass and initial velocity on a target of a known shape and at a given position with respect to the source point. The model is based on the ballistic analysis of the fragment trajectory and on the determination of impact probabilities by the analysis of initial direction of fragment flight. The model was validated using available literature data.
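The same idea can be prototyped numerically by sampling initial flight directions and following drag-free ballistic trajectories to a target footprint, as in the sketch below; the geometry, fragment speed and uniform-direction assumption are illustrative and not taken from the paper, which treats the initial-direction analysis analytically.

```python
# Sketch: impact probability of a fragment on a target footprint, estimated by
# sampling the initial flight direction and following a drag-free ballistic
# trajectory launched and landing at grade. Parameters are illustrative only.
import numpy as np

rng = np.random.default_rng(5)
g = 9.81
v0 = 30.0                                    # initial fragment speed (m/s)
n = 200_000

# Uniformly random launch directions over the upper hemisphere
elev = np.arcsin(rng.uniform(0, 1, n))       # elevation angle
azim = rng.uniform(0, 2 * np.pi, n)

# Horizontal range of a drag-free projectile
horiz_range = v0**2 * np.sin(2 * elev) / g
x = horiz_range * np.cos(azim)
y = horiz_range * np.sin(azim)

# Target vessel footprint: a 10 m x 6 m rectangle centred 40 m east of the source
hit = (np.abs(x - 40.0) <= 5.0) & (np.abs(y) <= 3.0)
print(f"impact probability ~= {hit.mean():.4f}")
```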
Hwang, Cheng-An
2009-05-01
The objectives of this study were to examine and model the probability of growth of Listeria monocytogenes in cooked salmon containing salt and a smoke (phenol) compound and stored at various temperatures. A growth probability model was developed and compared with a model developed from tryptic soy broth (TSB) to assess the possibility of using TSB as a substitute for salmon. A 6-strain mixture of L. monocytogenes was inoculated into minced cooked salmon and TSB containing 0-10% NaCl and 0-34 ppm phenol to levels of 10^2-10^3 cfu/g, and the samples were vacuum-packed and stored at 0-25 degrees C for up to 42 days. A total of 32 treatments, each with 16 samples, selected by central composite designs were tested. Logistic regression was used to model the probability of growth of L. monocytogenes as a function of salt concentration, phenol concentration and storage temperature. The resulting models showed that the probability of growth of L. monocytogenes in both salmon and TSB decreased as the salt and/or phenol concentrations increased and at lower storage temperatures. In general, the growth probabilities of L. monocytogenes were affected more profoundly by salt and storage temperature than by phenol. The growth probabilities of L. monocytogenes estimated by the TSB model were higher than those from the salmon model at the same salt/phenol concentrations and storage temperatures. The growth probabilities predicted by the salmon and TSB models were comparable at higher storage temperatures, indicating that using TSB as a model system to substitute for salmon in studying the growth behavior of L. monocytogenes may only be suitable when the temperatures of interest are relatively high (e.g., >12 degrees C). The model for salmon demonstrated the effects of salt, phenol, and storage temperature and their interactions on the growth probability of L. monocytogenes, and may be used to determine the growth probability of L. monocytogenes in smoked seafood.
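As an illustration of this kind of growth-probability model, the sketch below fits a logistic model with interaction terms for salt, phenol and temperature on synthetic growth/no-growth data and queries it for a few storage scenarios; all numbers are placeholders, not the study's data.

```python
# Sketch: logistic model for P(growth) of L. monocytogenes as a function of
# salt, phenol and storage temperature, with quadratic/interaction terms.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(6)
n = 512
salt = rng.uniform(0, 10, n)           # % NaCl
phenol = rng.uniform(0, 34, n)         # ppm
temp = rng.uniform(0, 25, n)           # storage temperature, degrees C
# Hypothetical process generating growth/no-growth outcomes
logit = -2.0 - 0.9 * salt - 0.08 * phenol + 0.45 * temp
grew = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([salt, phenol, temp])
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      StandardScaler(),
                      LogisticRegression(max_iter=2000))
model.fit(X, grew)

# Predicted growth probability for a few storage scenarios
query = np.array([[3.0, 10.0, 4.0],    # refrigerated
                  [3.0, 10.0, 15.0],   # mild temperature abuse
                  [6.0, 20.0, 15.0]])  # saltier, more smoke compound
print(model.predict_proba(query)[:, 1].round(3))
```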
The New Planned Giving Officer.
ERIC Educational Resources Information Center
Jordan, Ronald R.; Quynn, Katelyn L.
1994-01-01
A planned giving officer is seen as an asset to college/university development for technical expertise, credibility, and connections. Attorneys, certified public accountants, bank trust officers, financial planners, investment advisers, life insurance agents, and real estate brokers may be qualified but probably also need training. (MSE)
Factors related to nonuse of seat belts in Michigan.
DOT National Transportation Integrated Search
1987-09-01
This study combined direct observation of seat belt use with interview methods to identify factors related to seat belt use in a state with a mandatory seat belt use law. Trained observers recorded restraint use for a probability sample of motori...
NASA Astrophysics Data System (ADS)
Smith, Leonard A.
2010-05-01
This contribution concerns "deep" or "second-order" uncertainty, such as the uncertainty in our probability forecasts themselves. It asks the question: "Is it rational to take (or offer) bets using model-based probabilities as if they were objective probabilities?" If not, what alternative approaches for determining odds, perhaps non-probabilistic odds, might prove useful in practice, given that we know our models are imperfect? We consider the case where the aim is to provide sustainable odds: not to produce a profit but merely to rationally expect to break even in the long run, in other words, to run a quantified risk of ruin that is relatively small. Thus the cooperative insurance schemes of coastal villages provide a more appropriate parallel than a casino. A "better" probability forecast would lead to lower premiums charged and less volatile fluctuations in the cash reserves of the village. Note that the Bayesian paradigm does not constrain one to interpret model distributions as subjective probabilities unless one believes the model to be empirically adequate for the task at hand; in geophysics, this is rarely the case. When a probability forecast is interpreted as the objective probability of an event, the odds on that event can easily be computed as one divided by the probability of the event, and one need not favour taking either side of the wager. (Here we are using "odds-for" not "odds-to", the difference being whether or not the stake is returned; odds of one to one are equivalent to odds of two for one.) The critical question is how to compute sustainable odds based on information from imperfect models. We suggest that this breaks the symmetry between the odds on an event and the odds against it. While a probability distribution can always be translated into odds, interpreting the odds on a set of events might result in "implied probabilities" that sum to more than one, and/or the set of odds may be incomplete, not covering all events. We ask whether or not probabilities based on imperfect models can be expected to yield probabilistic odds which are sustainable. Evidence is provided that suggests this is not the case: even with very good models (good in a root-mean-square sense), the risk of ruin of probabilistic odds is significantly higher than might be expected. Methods for constructing model-based non-probabilistic odds which are sustainable are discussed. The aim here is to be relevant to real-world decision support, so unrealistic assumptions of equal knowledge, equal computing power, or equal access to information are avoided. Finally, the use of non-probabilistic odds as a method for communicating deep uncertainty (uncertainty in a probability forecast itself) is discussed in the context of other methods, such as stating one's subjective probability that the models will prove inadequate in each particular instance (that is, the probability of a "Big Surprise").
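The odds arithmetic described here is easy to make concrete. The short example below converts a probability forecast to "odds-for", then shortens the payout on both sides as a safety margin against model inadequacy, which makes the implied probabilities sum to more than one; the margin value is arbitrary.

```python
# Worked example: "odds-for" are 1 / probability, so a forecast probability of
# 0.25 corresponds to odds of 4-for-1. Shortening the odds offered on both an
# event and its complement yields implied probabilities that sum to more than one.
p_event = 0.25
odds_for_event = 1 / p_event                 # 4.0
odds_for_complement = 1 / (1 - p_event)      # ~1.33
print("fair odds:", odds_for_event, round(odds_for_complement, 2))
print("implied probabilities sum:", 1 / odds_for_event + 1 / odds_for_complement)

margin = 0.9                                 # offer only 90% of the fair payout
offered_event = 1 + margin * (odds_for_event - 1)
offered_complement = 1 + margin * (odds_for_complement - 1)
implied = 1 / offered_event + 1 / offered_complement
print("offered odds:", round(offered_event, 2), round(offered_complement, 2))
print("implied probabilities sum:", round(implied, 3))   # > 1, the margin retained
```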
Balagué, Natàlia; González, Jacob; Javierre, Casimiro; Hristovski, Robert; Aragonés, Daniel; Álamo, Juan; Niño, Oscar; Ventura, Josep L.
2016-01-01
Our purpose was to study the effects of different training modalities and detraining on cardiorespiratory coordination (CRC). Thirty-two young males were randomly assigned to four training groups: aerobic (AT), resistance (RT), aerobic plus resistance (AT + RT), and control (C). They were assessed before training, after training (6 weeks) and after detraining (3 weeks) by means of a graded maximal test. A principal component (PC) analysis of selected cardiovascular and cardiorespiratory variables was performed to evaluate CRC. The first PC (PC1) coefficient of congruence in the three conditions (before training, after training and after detraining) was compared between groups. Two PCs were identified in 81% of participants before the training period. After this period the number of PCs and the projection of the selected variables onto them changed only in the groups subject to a training programme. The PC1 coefficient of congruence was significantly lower in the training groups compared with the C group [H(3, N=32) = 11.28; p = 0.01]. In conclusion, training produced changes in CRC, reflected by the change in the number of PCs and the congruence values of PC1. These changes may be more sensitive than the usually explored cardiorespiratory reserve, and they probably precede it. PMID:26903884